| Cite this article: | LEI Yuan-long (雷元龙), XIE Peng (谢鹏), LIU Ye-hua (刘业华), CHEN Hong-zheng (陈翃正), ZHU Jing-si (朱静思), SHENG Shou-zhao (盛守照). EP-DDPG-guided carrier landing control system[J]. Control Theory & Applications (控制理论与应用), 2025, 42(10): 1904-1913. |
|
| EP-DDPG-guided carrier landing control system (EP-DDPG引导的着舰控制系统) |
| Abstract views: 388; Full-text views: 60; Received: 2023-07-14; Revised: 2025-06-21 |
| DOI: 10.7641/CTA.2020.90085 |
| 2025,42(10):1904-1913 |
| Keywords: reinforcement learning; deep deterministic policy gradient algorithm; MAGIC CARPET; actor-critic; BP neural network |
| Funding: Supported by the Aeronautical Science Foundation of China (20220058052002). |
|
| Abstract |
| To improve the control accuracy of carrier-based aircraft in the longitudinal channel, this paper aims to ensure that
the aircraft lands along the desired glide path with a reasonable attitude and speed. Taking the deep deterministic
policy gradient algorithm as the basic optimization framework, it proposes an adaptive controller-parameter tuning
strategy based on the expert policy-deep deterministic policy gradient (EP-DDPG) algorithm. First, a MAGIC CARPET
carrier landing control system is built as the underlying architecture. Second, to improve the adaptability and
robustness of the controller, a DDPG algorithm is designed on the actor-critic framework to adjust the controller
parameters online. Finally, to address the low efficiency and poor performance of conventional reinforcement learning
algorithms in early training, an expert policy is constructed from a back-propagation (BP) neural network to guide the
training of the agent, and a guidance-exploration coordination module is designed to make the policy decisions, ensuring
both reasonable action policies and an efficient algorithm. Simulation results show that, compared with conventional
controllers, the proposed algorithm greatly improves control accuracy and robustness. |
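
The abstract does not say how the guidance-exploration coordination module arbitrates between the BP-network expert policy and the DDPG actor. The following is only a minimal sketch of one plausible scheme, not the authors' method: it assumes the agent follows the expert with a probability that decays linearly over training episodes, and otherwise uses the actor output with Gaussian exploration noise. The function names, the decay horizon `decay_episodes`, the noise level `noise_std`, and the normalized gain bounds `low`/`high` are all hypothetical.

```python
import random


def guidance_probability(episode, decay_episodes=200):
    """Chance of following the expert; decays linearly from 1.0 to 0.0 (assumed schedule)."""
    return max(0.0, 1.0 - episode / decay_episodes)


def coordinate_action(expert_action, actor_action, episode, rng,
                      noise_std=0.05, low=0.0, high=1.0, decay_episodes=200):
    """Return the controller-parameter action for this step.

    Early in training the expert suggestion dominates; later the actor
    output plus Gaussian exploration noise is used. All values are clipped
    to the assumed normalized parameter range [low, high].
    """
    if rng.random() < guidance_probability(episode, decay_episodes):
        chosen = list(expert_action)          # guided step: copy the expert's proposal
    else:
        chosen = [a + rng.gauss(0.0, noise_std) for a in actor_action]  # exploratory step
    return [min(high, max(low, a)) for a in chosen]


if __name__ == "__main__":
    rng = random.Random(0)
    # Hypothetical two-gain controller: expert and actor each propose normalized gains.
    print(coordinate_action([0.2, 0.8], [0.5, 0.5], episode=0, rng=rng))
    print(coordinate_action([0.2, 0.8], [0.5, 0.5], episode=500, rng=rng))
```

With this schedule the first call always returns the expert proposal, because the guidance probability is 1.0 at episode 0, while the second call always returns a clipped, noisy actor proposal, because the probability has decayed to 0.0.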