Cite this article: ZOU Qi-jie, TANG Yu, GAO Bing, ZHAO Xi-ling, ZHANG Zhe-jie. The thinking communication network with semi-multiple communication cycles under the multi-agent deep reinforcement learning[J]. Control Theory & Applications, 2025, 42(3): 553-562.
The thinking communication network with semi-multiple communication cycles under the multi-agent deep reinforcement learning
Received: 2023-01-20  Revised: 2025-01-02
DOI  10.7641/CTA.2023.30028
Issue  2025, 42(3): 553-562
Keywords  multi-agent systems  cooperative environment  deep reinforcement learning  communication network
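Funding  Supported by the National Natural Science Foundation of China (61673084) and the 2021 Liaoning Provincial Department of Education Project (LJKZ1180).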
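Author          Affiliation                                             Postal code
ZOU Qi-jie      College of Information Engineering, Dalian University   116622
TANG Yu         College of Information Engineering, Dalian University
GAO Bing*       College of Information Engineering, Dalian University   116622
ZHAO Xi-ling    College of Information Engineering, Dalian University
ZHANG Zhe-jie   College of Information Engineering, Dalian University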
Abstract
      To address the limited communication content and sparse information of multi-agent systems in cooperative environments, this paper proposes a thinking multi-agent communication network (TMACN) based on multi-agent deep reinforcement learning. First, each agent accounts for the differences between information sources during interaction: it fuses the communication it receives with its own historical experience to form inference information, and sends this inference information as its new message, which diversifies the communication content. Second, the model builds a semi-multi-round communication strategy on top of a soft attention mechanism, which raises information saturation and thereby improves the efficiency of communication among agents. Experiments in three simulated environments (cooperative navigation, a hunting task and a traffic junction) show that TMACN improves the accuracy and stability of the system compared with other methods.
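The fusion step described in the abstract can be illustrated with a minimal sketch. It is not the TMACN architecture: the abstract does not specify the network, so the scaled dot-product scoring, the convex mixing weight `alpha`, and the function names `attend` and `fuse` are all assumptions introduced here for illustration. The semi-multi-round schedule is not reproduced, since the abstract does not define it.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def attend(query, messages):
    """Soft attention (assumed scaled dot-product form).

    Every received message gets a non-zero weight, so nothing is
    discarded outright; similar messages simply count for more.
    """
    scale = math.sqrt(len(query))
    weights = softmax([dot(query, m) / scale for m in messages])
    dim = len(messages[0])
    context = [sum(w * m[i] for w, m in zip(weights, messages))
               for i in range(dim)]
    return weights, context


def fuse(history, context, alpha=0.5):
    """Hypothetical fusion: convex mix of the agent's own history
    vector and the attended context, used as the next outgoing message."""
    return [alpha * h + (1.0 - alpha) * c for h, c in zip(history, context)]


# Toy example: one agent with a 2-d history vector and two received messages.
history = [1.0, 0.0]
received = [[1.0, 0.0], [0.0, 1.0]]
weights, context = attend(history, received)
outgoing = fuse(history, context)
```

In this toy setting the message that resembles the agent's own history receives the larger attention weight, and the outgoing message blends that attended context with the history rather than echoing either one unchanged.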