Cite this article: Meng Yi-Zhen, Huang Jing, Zhou Shao-hui, Zhou Bin, Zhu Kang-wu. Reinforcement learning control of collision avoidance for ultra-close formation of spacecraft with input constraints[J]. Control Theory & Applications, 2025, 42(4): 659-668.
Reinforcement learning control of collision avoidance for ultra-close formation of spacecraft with input constraints
Received: 2022-09-08  Revised: 2025-03-03
DOI  10.7641/CTA.2023.20792
  2025,42(4):659-668
Keywords  spacecraft formation  collision avoidance  reinforcement learning control  dead-zone effect  fixed-time constraint
Funding  National Key R&D Program of China (2022YFB3902700, 2022YFB3902702); supported by the National Key Laboratory of Space Target Awareness.
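Author  Affiliation  E-mail
Meng Yi-Zhen  Shanghai Aerospace Control Technology Institute  15151830168@163.com
Huang Jing*  Shanghai Aerospace Control Technology Institute  huangjing04415@163.com
Zhou Shao-hui  Shanghai Aerospace Control Technology Institute
Zhou Bin  Shanghai Aerospace Control Technology Institute
Zhu Kang-wu  Shanghai Aerospace Control Technology Institute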
Abstract
      This paper addresses reconfiguration control of an ultra-close spacecraft formation in low Earth orbit subject to external disturbances, collision-avoidance constraints, and fixed-time constraints, and proposes a robust formation control method that accounts for the actuator dead-zone effect under these multiple constraints. First, the complete nonlinear relative-position dynamics of the formation spacecraft in low Earth orbit are established, together with a dynamic response model of the actuator dead zone. Second, a relative-position constraint mechanism for the formation is designed from the state constraints, and a robust control law that enforces the collision-avoidance and fixed-time constraints is derived from backstepping combined with a reinforcement learning actor-critic network. To handle the dead-zone effect of the electric-thruster actuators, the actor network of the reinforcement learning scheme approximates the dead-zone characteristics, and the critic network's cost function is minimized to reduce the effect of the dead zone on control accuracy. The Lyapunov stability theorem is then used to prove that the closed-loop system is uniformly bounded. Finally, simulations on the MATLAB/Simulink platform demonstrate the effectiveness of the proposed method.
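For readers unfamiliar with the actuator dead-zone effect named in the abstract, the following is a minimal Python sketch of a generic piecewise-linear dead zone, the textbook form in which a commanded input inside a band produces no output. It is not the dead-zone dynamic response model of the paper, which the abstract does not specify. The function name `dead_zone`, the breakpoints `b_left` and `b_right`, the slopes `m_left` and `m_right`, and all numeric values are assumptions chosen for illustration only.

```python
def dead_zone(u, b_left=-0.02, b_right=0.02, m_left=1.0, m_right=1.0):
    """Generic piecewise-linear dead zone (illustrative, not the paper's model).

    u        : commanded actuator input (hypothetical units)
    b_left   : lower breakpoint of the dead band (assumed value)
    b_right  : upper breakpoint of the dead band (assumed value)
    m_left   : slope of the response below the band (assumed value)
    m_right  : slope of the response above the band (assumed value)
    """
    if u >= b_right:
        # Above the band: output grows linearly from zero at the breakpoint.
        return m_right * (u - b_right)
    if u <= b_left:
        # Below the band: output grows linearly (negative) from zero.
        return m_left * (u - b_left)
    # Inside the band the actuator does not respond at all.
    return 0.0


# Small commands are swallowed by the band; larger ones lose a constant offset.
commands = [-0.05, -0.01, 0.0, 0.01, 0.05]
thrusts = [dead_zone(u) for u in commands]
```

The lost offset is what degrades fine position control in a very close formation: commands small enough to fall inside the band produce no thrust. The abstract states that the paper compensates for this by approximating the dead-zone characteristics with the actor network rather than by inverting a fixed model like the one above.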