Cite this article: ZHU Zhi-bin, WANG Fu-yong, YIN Yan-hui, LIU Zhong-xin, CHEN Zeng-qiang. Consensus of discrete-time multi-agent system based on Q-learning[J]. Control Theory & Applications, 2021, 38(7): 997-1005.
Consensus of discrete-time multi-agent system based on Q-learning
Received: 2020-08-12  Revised: 2021-02-03
DOI  10.7641/CTA.2021.00533
  2021,38(7):997-1005
Keywords  multi-agent systems  consensus  discrete-time  Q-learning
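Funding  Supported by the Natural Science Foundation of Tianjin (20JCYBJC01060, 20JCQNJC01450), the National Natural Science Foundation of China (61973175), and the Fundamental Research Funds for the Central Universities, Nankai University (63201196).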
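Author  Affiliation  E-mail
ZHU Zhi-bin  College of Artificial Intelligence, Nankai University  657707375@qq.com
WANG Fu-yong  College of Artificial Intelligence, Nankai University
YIN Yan-hui  College of Artificial Intelligence, Nankai University
LIU Zhong-xin*  College of Artificial Intelligence, Nankai University  lzhx@nankai.edu.cn
CHEN Zeng-qiang  College of Artificial Intelligence, Nankai University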
Abstract
      For a class of discrete-time multi-agent systems with unknown models, this paper proposes a Q-learning method for consensus control. The method does not depend on the system model: it uses measured system data to iteratively solve for the control law that minimizes a given objective function, so that the states of all agents reach consensus. Using the data generated by each agent, policy iteration updates the control law of the multi-agent system online. Convergence and stability of the proposed Q-learning method are analyzed. Finally, a computer simulation verifies the effectiveness of the proposed method.
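The idea of model-free policy iteration on a Q-function can be illustrated with a minimal sketch. It is not the algorithm of the paper: it assumes a single scalar linear system x(k+1) = a x(k) + b u(k) standing in for one agent, a quadratic stage cost q x^2 + r u^2, and a quadratic Q-function Q(x, u) = h11 x^2 + 2 h12 x u + h22 u^2. The names `step`, `evaluate_policy`, `q_learning_policy_iteration`, the plant parameters, and the sample grid are all hypothetical. The learner never reads a or b; it only observes the next state returned by `step`.

```python
def step(x, u, a=1.1, b=1.0):
    """Hypothetical plant. The learner uses only its output, never a or b."""
    return a * x + b * u


def solve_linear(A, y):
    """Solve a small square system by Gaussian elimination with partial pivoting."""
    n = len(y)
    M = [row[:] + [y[i]] for i, row in enumerate(A)]
    for c in range(n):
        pivot = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[pivot] = M[pivot], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(M[i][k] * x[k] for k in range(i + 1, n))
        x[i] = (M[i][n] - tail) / M[i][i]
    return x


def evaluate_policy(K, samples, q=1.0, r=1.0):
    """Policy evaluation: fit (h11, h12, h22) for the policy u = -K x.

    Uses the Bellman equation Q(x, u) = cost(x, u) + Q(x', -K x'),
    which is linear in the Q-function parameters.
    """
    rows, costs = [], []
    for x, u in samples:
        x_next = step(x, u)          # measured data
        u_next = -K * x_next         # action the current policy would take
        phi = (x * x, 2 * x * u, u * u)
        phi_next = (x_next * x_next, 2 * x_next * u_next, u_next * u_next)
        rows.append([p - pn for p, pn in zip(phi, phi_next)])
        costs.append(q * x * x + r * u * u)
    # Least squares via the 3x3 normal equations.
    AtA = [[sum(row[i] * row[j] for row in rows) for j in range(3)] for i in range(3)]
    Aty = [sum(row[i] * c for row, c in zip(rows, costs)) for i in range(3)]
    return solve_linear(AtA, Aty)


def q_learning_policy_iteration(K0, samples, iterations=10):
    """Alternate policy evaluation and greedy improvement, starting from a stabilizing K0."""
    K = K0
    for _ in range(iterations):
        h11, h12, h22 = evaluate_policy(K, samples)
        K = h12 / h22                # minimizer of Q(x, u) over u is u = -(h12 / h22) x
    return K


# Exploratory state-action pairs (an assumed, fixed grid).
samples = [(x, u) for x in (-1.0, 0.5, 2.0) for u in (-1.0, 0.3, 1.5)]
K_learned = q_learning_policy_iteration(0.5, samples)
print("learned feedback gain:", K_learned)
```

In this sketch the initial gain must stabilize the system (here |1.1 - 0.5| < 1), and the sample pairs must be varied enough for the least-squares problem to have a unique solution. In a multi-agent setting the scalar state would typically be replaced by a local consensus error built from neighbors' states, which this sketch does not attempt to model.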