Online solution of two-player zero-sum games for linear systems with unknown dynamics

FU Yue; CHAI Tian-you

Cite this article:	FU Yue, CHAI Tian-you. Online solution of two-player zero-sum games for linear systems with unknown dynamics[J]. Control Theory & Applications, 2015, 32(2): 196-201.


Received: 2014-01-09; Revised: 2014-10-18


DOI: 10.7641/CTA.2015.14005

2015,32(2):196-201

Keywords: two-player zero-sum game; policy iteration; game algebraic Riccati equation

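Funding: Supported by the National Natural Science Foundation of China (61374042) and the Fundamental Research Funds for the Central Universities (N130408003, N130108001).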

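Author	Affiliation	E-mail
FU Yue	State Key Laboratory of Synthetical Automation for Process Industries, Northeastern University	fuyue@mail.neu.edu.cn
CHAI Tian-you	State Key Laboratory of Synthetical Automation for Process Industries, Northeastern University	tychai@mail.neu.edu.cn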

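Abstract (translated from the Chinese)

For the two-player zero-sum game problem of linear systems with unknown dynamics, this paper proposes a new online learning scheme based on a single-loop iteration method. A new analysis method is given to guarantee the convergence of the single-loop iteration. With the system matrix A, the control input matrix B, and the disturbance input matrix D all unknown, an online iterative policy simultaneously yields an approximate solution of the game algebraic Riccati equation together with the control and disturbance policies. Simulation results demonstrate the effectiveness of the proposed method.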
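
For orientation, the problem setting can be written in the standard linear-quadratic zero-sum form below. This is a sketch under assumptions: the abstract names only A, B, and D, so the state weight Q, the control weight R, the attenuation level \gamma, and the exact form of the cost are the usual textbook choices and may differ from the notation used in the paper.

```latex
\[
\begin{aligned}
\dot{x} &= A x + B u + D w, \\
J(x_0, u, w) &= \int_0^{\infty} \left( x^{\top} Q x + u^{\top} R u - \gamma^{2} w^{\top} w \right) dt, \\
0 &= A^{\top} P + P A + Q - P B R^{-1} B^{\top} P + \gamma^{-2} P D D^{\top} P, \\
u^{*} &= -R^{-1} B^{\top} P x, \qquad w^{*} = \gamma^{-2} D^{\top} P x .
\end{aligned}
\]
```

Here the control u minimizes J while the disturbance w maximizes it, and the third line is the game algebraic Riccati equation whose solution P defines both saddle-point policies.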

Abstract (English)

For two-player zero-sum games of continuous-time linear systems with unknown dynamics, we present an online adaptive learning algorithm based on a single-loop policy iteration (PI) scheme. A new analytical method is given to prove the convergence of the PI scheme. An approximate solution to the generalized game algebraic Riccati equation is obtained without a priori knowledge of the system matrices. Simulation results illustrate the effectiveness of the proposed method.
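
To make the idea of a single-loop iteration concrete, the following is a minimal model-based sketch, not the paper's algorithm: it uses A, B, and D explicitly, whereas the scheme described in the abstract learns online without them. Both policies are updated together in one loop (a Newton-type update on the game algebraic Riccati equation). The weights `Q`, `R`, the level `gamma`, the example matrices, and all function names are hypothetical choices made for illustration.

```python
import numpy as np

def solve_lyapunov(Ac, M):
    """Solve Ac^T P + P Ac + M = 0 for symmetric P by Kronecker vectorization."""
    n = Ac.shape[0]
    I = np.eye(n)
    # Column-major vec: vec(Ac^T P) = kron(I, Ac^T) vec(P), vec(P Ac) = kron(Ac^T, I) vec(P)
    lhs = np.kron(I, Ac.T) + np.kron(Ac.T, I)
    p = np.linalg.solve(lhs, -M.reshape(-1, order="F"))
    P = p.reshape(n, n, order="F")
    return 0.5 * (P + P.T)  # symmetrize against round-off

def single_loop_pi(A, B, D, Q, R, gamma, iters=50, tol=1e-12):
    """Update control gain K and disturbance gain L together in a single loop.

    Assumption: A is Hurwitz, so the zero initial policies (P = 0) are stabilizing.
    """
    P = np.zeros_like(A, dtype=float)
    for _ in range(iters):
        K = np.linalg.solve(R, B.T @ P)       # control policy u = -K x
        L = (D.T @ P) / gamma**2              # disturbance policy w = L x
        Ac = A - B @ K + D @ L                # closed loop under both policies
        M = Q + K.T @ R @ K - gamma**2 * (L.T @ L)
        P_next = solve_lyapunov(Ac, M)        # policy evaluation for the pair (K, L)
        converged = np.linalg.norm(P_next - P) < tol
        P = P_next
        if converged:
            break
    K = np.linalg.solve(R, B.T @ P)
    L = (D.T @ P) / gamma**2
    return P, K, L

def gare_residual(A, B, D, Q, R, gamma, P):
    """Frobenius norm of the game algebraic Riccati equation residual at P."""
    res = (A.T @ P + P @ A + Q
           - P @ B @ np.linalg.solve(R, B.T @ P)
           + (P @ D @ D.T @ P) / gamma**2)
    return np.linalg.norm(res)

# Hypothetical example system (illustrative values only, not from the paper)
A = np.array([[-1.0, 0.5], [0.0, -2.0]])
B = np.array([[0.0], [1.0]])
D = np.array([[1.0], [0.0]])
Q = np.eye(2)
R = np.array([[1.0]])
gamma = 5.0

P, K, L = single_loop_pi(A, B, D, Q, R, gamma)
res = gare_residual(A, B, D, Q, R, gamma, P)
print("GARE residual:", res)
```

The sketch starts from zero gains, which is only valid because the example A is stable; a model-free variant would replace the Lyapunov solve with an estimate built from measured state and input data.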