Cite this article: CUI Li-li, ZHANG Yong, ZHANG Xin. Event-triggered adaptive dynamic programming algorithm for the nonlinear zero-sum differential games[J]. Control Theory & Applications, 2018, 35(5): 610-618.
Event-triggered adaptive dynamic programming algorithm for the nonlinear zero-sum differential games
Received: 2017-09-15    Revised: 2017-12-27
DOI: 10.7641/CTA.2017.70674
2018, 35(5): 610-618
Keywords: adaptive dynamic programming; nonlinear zero-sum differential games; event-triggered; neural networks; optimal control
Funding: Supported by the National Natural Science Foundation of China (61703289), the Natural Science Foundation of Shandong Province (BX2015DX009), the Special Fund for Basic Scientific Research Projects of Liaoning Provincial Universities (LQN201720, LQN201702), and the Science and Technology Project of Shenyang Normal University (L201510).
Author    Affiliation    E-mail
CUI Li-li*    Shenyang Normal University    cuilili8396@163.com
ZHANG Yong    Shenyang Normal University
ZHANG Xin    China University of Petroleum (East China)
Abstract
      In this paper, an event-triggered adaptive dynamic programming (ET-ADP) algorithm is proposed to solve online for the saddle point of a class of nonlinear zero-sum differential games. First, a new adaptive event-triggering condition is proposed. Then, a neural network (the critic network), whose input is the sampled state, is used to approximate the optimal value function, and a new weight-updating law is designed so that the value function, the control strategy and the disturbance strategy are updated synchronously only at the event-triggering instants. Furthermore, Lyapunov stability theory is used to prove that the proposed algorithm obtains the saddle point of the nonlinear zero-sum differential game online without exhibiting Zeno behavior. Because the ET-ADP algorithm updates the value function, the control strategy and the disturbance strategy only when the triggering condition is satisfied, it effectively reduces both the computational burden and the network load. Finally, two simulation examples verify the effectiveness of the proposed ET-ADP algorithm.
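For context, the class of problems referred to above is usually posed in the following standard form; the paper's exact dynamics, weighting matrices Q and R, and attenuation level \gamma are not given on this page, so the formulation below is an assumption rather than the authors' own statement:

\dot{x} = f(x) + g(x)u + k(x)w, \qquad
V^{*}(x_{0}) = \min_{u}\,\max_{w} \int_{0}^{\infty} \big( x^{\top}Qx + u^{\top}Ru - \gamma^{2}w^{\top}w \big)\,dt,

with the saddle point (u^{*}, w^{*}) characterized by V(u^{*}, w) \le V(u^{*}, w^{*}) \le V(u, w^{*}).

A minimal Python sketch of the event-triggered critic loop described in the abstract follows. It is illustrative only: the quadratic critic basis, the toy plant, the normalized gradient update and, in particular, the fixed trigger threshold (the paper proposes an adaptive triggering condition) are all assumptions, not the authors' equations.

import numpy as np

# Illustrative sketch only: linear-in-parameters critic V(x) ~ W^T phi(x),
# a toy plant, a fixed trigger threshold and a normalized gradient update.
# The paper's actual (adaptive) trigger condition and weight-tuning law are
# not reproduced here.

def phi(x):
    """Quadratic critic basis: phi(x) = [x1^2, x1*x2, x2^2]."""
    return np.array([x[0]**2, x[0]*x[1], x[1]**2])

def dphi(x):
    """Jacobian of phi with respect to x (3 x 2)."""
    return np.array([[2*x[0], 0.0],
                     [x[1],   x[0]],
                     [0.0,    2*x[1]]])

g = np.array([0.0, 1.0])   # (assumed) input gain of the control player
k = np.array([0.0, 0.5])   # (assumed) input gain of the disturbance player

def dynamics(x, u, w):
    """Toy plant dx/dt = f(x) + g(x)u + k(x)w (illustrative only)."""
    f = np.array([-x[0] + x[1], -0.5*(x[0] + x[1])])
    return f + g*u + k*w

def policies(x, W, R=1.0, gamma=1.0):
    """Control/disturbance policies induced by the current critic weights."""
    grad_V = dphi(x).T @ W                 # gradient of the approximate value function
    u = -0.5/R * (g @ grad_V)              # minimizing player
    w = 0.5/gamma**2 * (k @ grad_V)        # maximizing player
    return u, w

# --- event-triggered simulation loop ---
dt, T = 0.01, 10.0
x = np.array([1.0, -1.0])
x_hat = x.copy()                   # last sampled (held) state
W = np.zeros(3)                    # critic weights
alpha, threshold = 0.5, 0.05       # learning rate, fixed trigger threshold (simplification)
u, w = policies(x_hat, W)
events = 0

for _ in range(int(T/dt)):
    # trigger when the gap between the current and the sampled state is too large
    if np.linalg.norm(x - x_hat) > threshold:
        x_hat = x.copy()
        x_dot = dynamics(x_hat, u, w)
        sigma = dphi(x_hat) @ x_dot                       # regressor of the Bellman residual
        delta = x_hat @ x_hat + u**2 - w**2 + W @ sigma   # HJI/Bellman residual (Q=R=gamma=1)
        W -= alpha * delta * sigma / (1.0 + sigma @ sigma)**2  # normalized gradient step
        u, w = policies(x_hat, W)      # policies change only at triggering instants
        events += 1
    x = x + dt*dynamics(x, u, w)       # plant evolves under the held policies

print("triggering events:", events)
print("final state:", x, " estimated V(x0):", float(W @ phi(np.array([1.0, -1.0]))))

The point the sketch illustrates is the structure claimed in the abstract: the critic weights and both players' policies change only at triggering instants, while the plant evolves continuously under the held policies between events.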