Citation: Yao Zhang, Chaoxu Mu, Yong Zhang, Yanghe Feng. Heuristic dynamic programming-based learning control for discrete-time disturbed multi-agent systems[J]. Control Theory and Technology, 2021, 19(3): 339-353.

Heuristic dynamic programming-based learning control for discrete-time disturbed multi-agent systems
Yao Zhang, Chaoxu Mu, Yong Zhang, Yanghe Feng
(School of Electrical and Information Engineering, Tianjin University, Tianjin 300072, China; College of Systems Engineering, National University of Defense Technology, Changsha 410073, Hunan, China)

DOI: https://doi.org/10.1007/s11768-021-00049-9

Funding: This work was supported by the Tianjin Natural Science Foundation under Grant 20JCYBJC00880, the Beijing Key Laboratory Open Fund of Long-Life Technology of Precise Rotation and Transmission Mechanisms, and the Guangdong Provincial Key Laboratory of Intelligent Decision and Cooperative Control.

Abstract:

Synchronization in multi-agent systems has been widely investigated owing to its applications in many fields. Synchronization means that, under the designed control policy, the output or state of each agent becomes consistent with that of the leader. This paper investigates heuristic dynamic programming (HDP)-based learning tracking control for discrete-time multi-agent systems, with the aim of achieving synchronization in the presence of disturbances. Because the coupled Hamilton–Jacobi–Bellman equation is difficult to solve analytically, an improved HDP learning control algorithm, executed by an action-critic neural network, is proposed to synchronize all following agents with the leader. With the help of an auxiliary action network, the action and critic neural networks learn the optimal control policy and the cost function, respectively. Finally, two numerical examples and a practical application to mobile robots demonstrate the control performance of the HDP-based learning control algorithm.

Key words: Multi-agent systems · Heuristic dynamic programming (HDP) · Learning control · Neural network · Synchronization