Cite this article: Zhu Guo-zheng, Zhang Mao-guang, He Shu-ping. Policy iteration-based non-zero sum differential feedback Nash control for continuous-time Markov jump linear systems[J]. Control Theory & Applications, 2020, 37(8): 1749-1756.
Policy iteration-based non-zero sum differential feedback Nash control for continuous-time Markov jump linear systems
Received: 2019-07-23    Revised: 2020-01-20
DOI: 10.7641/CTA.2020.90603
2020, 37(8): 1749-1756
Keywords: policy iteration; Markov jump linear systems; non-zero sum; differential feedback Nash strategy
Funding: Supported by the National Natural Science Foundation of China (61673001), the Natural Science Foundation of Anhui Province for Distinguished Young Scholars (1608085J05), and the Key Program for Outstanding Young Talents in Higher Education Institutions of Anhui Province (gxydZD2017001).
|
Abstract
In this paper, a new policy iteration algorithm is proposed to solve the non-zero sum differential feedback Nash control problem for a class of continuous-time Markov jump linear systems. The Nash equilibrium solution of a double-layer non-zero sum differential strategy with linear dynamics and infinite-horizon quadratic costs is obtained by solving the coupled equations through numerical iteration. At each policy layer, a policy iteration algorithm is used to compute the minimal infinite-horizon value function associated with each given set of feedback control policies. The Markov jump linear system is then decomposed into N parallel subsystems by subsystem decomposition, and the algorithm is applied to the jump system. The proposed policy iteration algorithm readily solves the coupled algebraic Riccati equations corresponding to the non-zero sum differential policies, and it remains effective for high-dimensional systems. Finally, a simulation example demonstrates the effectiveness and feasibility of the proposed design method.
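
To make the evaluation-and-improvement structure described above concrete, the sketch below shows one way such a policy iteration could look in Python for a two-player non-zero sum LQ game on a Markov jump linear system. It is a minimal illustration under stated assumptions, not the authors' implementation: the function names (coupled_lyapunov, nash_policy_iteration), the availability of initial jointly stabilizing gains, and the fixed-point treatment of the mode-coupling terms are all choices made here for the example.

import numpy as np
from scipy.linalg import solve_continuous_lyapunov

def coupled_lyapunov(A_cl, Q_eff, Pi, tol=1e-9, max_iter=1000):
    # Policy evaluation: for each mode i, solve the coupled Lyapunov equations
    #   0 = A_i^T P_i + P_i A_i + sum_j Pi[i, j] P_j + Q_i
    # by folding the diagonal rate Pi[i, i] into the drift matrix and
    # fixed-point iterating on the off-diagonal coupling terms.
    N, n = len(A_cl), A_cl[0].shape[0]
    P = [np.zeros((n, n)) for _ in range(N)]
    for _ in range(max_iter):
        P_new = []
        for i in range(N):
            Ai = A_cl[i] + 0.5 * Pi[i, i] * np.eye(n)
            C = Q_eff[i] + sum(Pi[i, j] * P[j] for j in range(N) if j != i)
            # scipy solves a X + X a^T = q; we need Ai^T X + X Ai = -C
            P_new.append(solve_continuous_lyapunov(Ai.T, -C))
        if max(np.abs(Pn - Po).max() for Pn, Po in zip(P_new, P)) < tol:
            return P_new
        P = P_new
    return P

def nash_policy_iteration(A, B1, B2, Q1, Q2, R1, R2, Pi, K1, K2, n_iter=30):
    # A, B1, B2, Q1, Q2, R1, R2: lists of per-mode matrices; Pi: transition
    # rate matrix of the jump process; K1, K2: initial mode-dependent gains
    # assumed to make the closed loop stochastically stable (u_k = -K_k[i] x
    # in mode i), as standard policy iteration requires.
    N = len(A)
    for _ in range(n_iter):
        A_cl = [A[i] - B1[i] @ K1[i] - B2[i] @ K2[i] for i in range(N)]
        # Evaluate each player's cost under the current pair of policies.
        P1 = coupled_lyapunov(A_cl, [Q1[i] + K1[i].T @ R1[i] @ K1[i] for i in range(N)], Pi)
        P2 = coupled_lyapunov(A_cl, [Q2[i] + K2[i].T @ R2[i] @ K2[i] for i in range(N)], Pi)
        # Policy improvement: each player's one-sided LQ gain update, mode by mode.
        K1 = [np.linalg.solve(R1[i], B1[i].T @ P1[i]) for i in range(N)]
        K2 = [np.linalg.solve(R2[i], B2[i].T @ P2[i]) for i in range(N)]
    return P1, P2, K1, K2

At a fixed point of this scheme, each P_k[i] satisfies the corresponding coupled algebraic Riccati equation. Because every step only solves standard Lyapunov equations mode by mode, the per-iteration cost grows with the number of modes rather than with one large stacked system, mirroring the decomposition into N parallel subsystems described in the abstract.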