Cite this article: SONG Mei-ping, GU Guo-chang, ZHANG Guo-yin, LIU Hai-bo. Multi-agent learning in cooperative general-sum games[J]. Control Theory & Applications (控制理论与应用), 2007, 24(2): 317-321.

Multi-agent learning in cooperative general-sum games
Received: 2005-07-11; Revised: 2006-05-18
DOI: 10.7641/j.issn.1000-8152.2007.2.029
Issue: 2007, 24(2): 317-321
Keywords: multi-agent learning; general-sum stochastic game; Nash equilibrium; Pareto optimum; Q-learning

Abstract
Rationality and convergence are the two goals pursued in multi-agent learning research. For cooperative systems of rational agents, this paper proposes a method called Pareto-Q, which learns toward Pareto-optimal solutions instead of non-cooperative Nash equilibrium solutions, making the agents more rational. In addition, social conventions are introduced to initiate and constrain the agents' reasoning and to unify the decisions of all agents in the system, which guarantees that learning converges. Several algorithms are tested on a two-player grid game; a comparison of success rates shows that the proposed algorithm performs better than single-agent Q-learning and Nash-Q learning.
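
The abstract does not state the Pareto-Q update rule, so the following is only a minimal Python sketch of the two ideas it names: choosing among Pareto-optimal joint actions of a two-agent stage game, and using a shared social convention to pick the same one. The function names, the payoff tables `q1` and `q2`, and the lexicographic tie-breaking rule are all illustrative assumptions, not taken from the paper.

```python
from itertools import product


def pareto_optimal_joint_actions(q1, q2):
    """Return the joint actions (a1, a2) whose payoff pair is not Pareto-dominated.

    q1[a1][a2] and q2[a1][a2] are the (hypothetical) Q-values of agent 1 and
    agent 2 for the joint action (a1, a2) in a single state.
    """
    joint = list(product(range(len(q1)), range(len(q1[0]))))

    def payoff(a):
        return (q1[a[0]][a[1]], q2[a[0]][a[1]])

    def dominates(x, y):
        # x dominates y if it is at least as good for both agents
        # and the payoff pairs are not identical.
        px, py = payoff(x), payoff(y)
        return px[0] >= py[0] and px[1] >= py[1] and px != py

    return [a for a in joint if not any(dominates(b, a) for b in joint)]


def select_by_convention(candidates):
    """Assumed social convention: all agents pick the lexicographically
    smallest joint action, so they agree without communicating."""
    return min(candidates)


# A coordination game with two equally good Pareto-optimal outcomes.
q1 = [[4, 0], [0, 4]]
q2 = [[4, 0], [0, 4]]
candidates = pareto_optimal_joint_actions(q1, q2)
choice = select_by_convention(candidates)
print(candidates, choice)
```

In this example both `(0, 0)` and `(1, 1)` are Pareto-optimal, so agents reasoning independently could miscoordinate; a fixed ordering shared by all agents is one simple way to make their decisions agree, which is the role the abstract assigns to social conventions.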
|
|
|
|
|