Cite this article: | REN Jian, LIU Jian-wei, YANG Pu. Fault-tolerant tracking control for continuous flight control system based on reinforcement learning algorithm with incremental strategy[J]. Control Theory & Applications, 2020, 37(7): 1429-1438. |
|
Fault-tolerant tracking control for continuous flight control system based on reinforcement learning algorithm with incremental strategy |
Abstract views: 3081 Full-text views: 1055 Received: 2019-05-25 Revised: 2019-12-26 |
DOI 10.7641/CTA.2020.90380 |
2020,37(7):1429-1438 |
Keywords: flight control systems; fault diagnosis; fault tolerance; reinforcement learning; Q-learning algorithm; incremental strategy; state transition prediction network |
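Funding: Supported by the Fund of the Key Laboratory of Civil Aircraft Health Monitoring and Intelligent Maintenance (NJ2018012), the Key Laboratory of Advanced Aircraft Navigation, Control and Health Management of the Ministry of Industry and Information Technology (Nanjing University of Aeronautics and Astronautics), the Fundamental Research Funds for the Central Universities (NS2017017), and the National Natural Science Foundation of China (61533008, 61490703). |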
|
Abstract |
For flight control systems subject to faults, a reinforcement learning fault-tolerant method based on an
incremental strategy is proposed. The method uses the system state values obtained from sensors and, guided by a
pre-set reward function, makes the optimal decision for the current condition of the control system while continuously
updating the value network. This transforms the fault-tolerant control process of the system into a sequential
decision-making process of the reinforcement learning agent, and an improved incremental strategy gradually
approaches the correct compensation for the current fault. In addition, a state transition prediction network is
proposed for continuous control systems to obtain the next state value. Finally, the effectiveness of the proposed
method is verified on the aircraft fault diagnosis experimental platform of the Key Laboratory of Advanced Aircraft
Navigation, Control and Health Management (Ministry of Industry and Information Technology) at Nanjing University
of Aeronautics and Astronautics. |
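The incremental idea described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: it swaps the value network for a plain tabular Q-learning table, uses the true plant where the paper would use its state transition prediction network, and assumes a hypothetical scalar plant with a constant actuator bias. Every name and number below (`A`, `B`, `FAULT`, `STEP`, `train`, and so on) is an assumption chosen for illustration.

```python
import random

# All numbers below are illustrative assumptions, not values from the paper.
A, B = 0.8, 0.5          # hypothetical scalar plant: x_next = A*x + B*(u - FAULT)
FAULT = 0.6              # unknown constant actuator bias to be compensated
REFERENCE = 1.0          # constant tracking target
STEP = 0.05              # size of one compensation increment
MAX_INDEX = 24           # compensation is limited to [0, MAX_INDEX * STEP]
ACTIONS = (-1, 0, 1)     # decrease, hold, or increase the compensation by STEP
ALPHA, GAMMA, EPSILON = 0.2, 0.9, 0.1


def plant_step(x, u):
    """Faulty plant: the actuator delivers u - FAULT instead of u."""
    return A * x + B * (u - FAULT)


def nominal_control(x):
    """Fault-free control that would reach REFERENCE in one step."""
    return (REFERENCE - A * x) / B


def error_state(error):
    """Discretise the tracking error into an integer Q-table state."""
    return round(error / (B * STEP))


def train(steps=20000, seed=0):
    """Tabular Q-learning over incremental compensation actions."""
    rng = random.Random(seed)
    q = {}                               # (state, action index) -> value
    x, index = 0.0, 0                    # plant state, compensation index
    state = error_state(REFERENCE - x)
    for _ in range(steps):
        if rng.random() < EPSILON:       # explore
            a = rng.randrange(len(ACTIONS))
        else:                            # exploit; unseen pairs default to 0.0
            a = max(range(len(ACTIONS)), key=lambda i: q.get((state, i), 0.0))
        # Incremental strategy: nudge the compensation instead of setting it.
        index = min(MAX_INDEX, max(0, index + ACTIONS[a]))
        x = plant_step(x, nominal_control(x) + index * STEP)
        error = REFERENCE - x
        reward = -abs(error)             # pre-set reward: penalise tracking error
        next_state = error_state(error)
        best_next = max(q.get((next_state, i), 0.0) for i in range(len(ACTIONS)))
        old = q.get((state, a), 0.0)
        q[(state, a)] = old + ALPHA * (reward + GAMMA * best_next - old)
        state = next_state
    return q, index * STEP
```

Calling `q, compensation = train()` returns the learned table and the final compensation value. In this toy setting the tracking error is zero only when the compensation equals `FAULT`, so the agent is rewarded for stepping toward the fault value in small increments rather than estimating it in one shot.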