Cite this article: SONG Mei-ping, GU Guo-chang, ZHANG Guo-yin, LIU Hai-bo. Multi-agent learning in cooperative general-sum games [J]. Control Theory & Applications, 2007, 24(2): 317-321.
Multi-agent learning in cooperative general-sum games
Received: 2005-07-11    Revised: 2006-05-18
DOI: 10.7641/j.issn.1000-8152.2007.2.029
2007, 24(2): 317-321
Keywords (Chinese): multi-agent learning; general-sum stochastic games; Nash equilibrium; Pareto dominance; Q-learning
Keywords (English): multi-agent learning; general-sum game; Nash equilibrium; Pareto optimum; Q-learning
Funding:
Author affiliation:
SONG Mei-ping, GU Guo-chang, ZHANG Guo-yin, LIU Hai-bo (College of Computer Science and Technology, Harbin Engineering University, Harbin 150001, Heilongjiang, China)
Abstract (Chinese)
      Rationality and convergence are the goals pursued in multi-agent learning research. For a system of rational, cooperative agents, this paper proposes learning with Pareto-dominant solutions in place of non-cooperative Nash equilibrium solutions, which makes the agents more rational. In addition, social conventions are introduced to initiate and constrain the agents' reasoning and to unify the decisions of all agents in the system, thereby guaranteeing the convergence of learning. Several algorithms are validated on a two-player grid game, and a comparison of success rates shows that the proposed algorithm achieves better learning performance.
Abstract (English)
      Rationality and convergence are two central goals in research on multi-agent learning. A new method called Pareto-Q is proposed based on the concept of Pareto optimality, which is more rational than Nash equilibrium for cooperative systems. At the same time, social conventions are introduced to guarantee the convergence of learning. When tested on a two-person grid game, the algorithm performs better than single-agent Q-learning and Nash-Q learning.
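The abstracts name the mechanism (Pareto-dominant joint actions plus a social convention) but do not show it; a minimal Python sketch is given below. The toy state and action spaces, the Q-table layout, and the lexicographic tie-break standing in for the social convention are illustrative assumptions, not the paper's actual grid-game implementation.

```python
# Minimal sketch of a Pareto-Q style update for a two-agent stochastic game.
# The state/action spaces and the lexicographic tie-break are illustrative
# assumptions; the environment (rewards, transitions) is left to the caller.
import itertools
import random

N_AGENTS = 2
ACTIONS = [0, 1, 2, 3]                 # e.g. up / down / left / right
STATES = list(range(9))                # toy 3x3 grid, flattened
ALPHA, GAMMA = 0.1, 0.9

JOINT_ACTIONS = list(itertools.product(ACTIONS, repeat=N_AGENTS))
# One Q-table per agent, indexed by (state, joint_action).
Q = [{(s, ja): 0.0 for s in STATES for ja in JOINT_ACTIONS}
     for _ in range(N_AGENTS)]

def pareto_dominant_joint_actions(state):
    """Joint actions that no other joint action Pareto-dominates,
    judged by the agents' current Q estimates."""
    def dominated(ja):
        va = [Q[i][(state, ja)] for i in range(N_AGENTS)]
        for other in JOINT_ACTIONS:
            vo = [Q[i][(state, other)] for i in range(N_AGENTS)]
            if all(x >= y for x, y in zip(vo, va)) and vo != va:
                return True
        return False
    return [ja for ja in JOINT_ACTIONS if not dominated(ja)]

def select_joint_action(state, epsilon=0.1):
    """Choose a Pareto-dominant joint action; a fixed lexicographic order
    plays the role of the social convention so both agents agree."""
    if random.random() < epsilon:                      # occasional joint exploration
        return random.choice(JOINT_ACTIONS)
    return min(pareto_dominant_joint_actions(state))   # convention: smallest tuple

def pareto_q_update(state, joint_action, rewards, next_state):
    """Each agent backs up the value of the conventionally selected
    Pareto-dominant joint action in the next state."""
    next_ja = min(pareto_dominant_joint_actions(next_state))
    for i in range(N_AGENTS):
        old = Q[i][(state, joint_action)]
        target = rewards[i] + GAMMA * Q[i][(next_state, next_ja)]
        Q[i][(state, joint_action)] = old + ALPHA * (target - old)
```

In the two-person grid game, each step of an episode would call select_joint_action(state), execute both agents' moves, observe the individual rewards and the next state, and then call pareto_q_update(state, joint_action, rewards, next_state).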