Cite this article: LIU Cheng-yan, HU Jian, YAO Jian-yong, TAN Tian-le, LIU Yu, LIN Jia-wei. Dual time scale combination control of flexible manipulator driven by reinforcement learning[J]. Control Theory & Applications (控制理论与应用), 2025, 42(3): 541-552.
Dual time scale combination control of flexible manipulator driven by reinforcement learning
Received: 2023-05-22  Revised: 2024-12-19
DOI  10.7641/CTA.2023.30349
  2025,42(3):541-552
Keywords  flexible manipulator  trajectory tracking  vibration suppression  neural network  disturbance observer  reinforcement learning
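Funding  Supported by the National Natural Science Foundation of China (51975294) and the Fundamental Research Funds for the Central Universities (30922010706, 2023101001).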
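Author  Affiliation  E-mail
LIU Cheng-yan  Nanjing University of Science and Technology  1349945093@qq.com
HU Jian*  Nanjing University of Science and Technology  hujiannjust@163.com
YAO Jian-yong  Nanjing University of Science and Technology
TAN Tian-le  Shanghai Aerospace Control Technology Institute
LIU Yu  Shanghai Aerospace Control Technology Institute
LIN Jia-wei  Nanjing University of Science and Technology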
Abstract
      This paper studies trajectory tracking and vibration suppression for flexible manipulators. First, a dynamic model of the flexible manipulator is established using the Lagrange method and the assumed mode method. Then, singular perturbation theory is used to decompose the model into a slow time-scale subsystem describing the rigid-body motion and a fast time-scale subsystem describing the flexible vibration. For the slow subsystem, a rigid composite controller is built on sliding mode variable structure control, with a radial basis function neural network estimating the model parameters and a disturbance observer estimating disturbances and model uncertainty. For the fast subsystem, a flexible controller is built on optimal control, with reinforcement learning used to obtain the optimal control solution online. Stability at each time scale is proved by Lyapunov stability theory, and the controllers designed at the two time scales are superimposed to obtain a combined controller for the original system. Finally, comparative experimental results show that the proposed combined control method is superior to the methods it is compared against.
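For readers unfamiliar with the two-time-scale decomposition mentioned in the abstract, the block below gives the generic textbook form of a singularly perturbed system with a composite control law. It is only a sketch under stated assumptions: the symbols x, z, ε, f, g, u_s, u_f and z̄ are illustrative and are not taken from the paper, whose specific model and controllers are not reproduced here.

```latex
% Generic singularly perturbed form (illustrative; not the paper's model).
% x: slow (rigid-body) states, z: fast (flexible) states, 0 < \varepsilon \ll 1.
\begin{aligned}
\dot{x} &= f(x, z, u), \\
\varepsilon\,\dot{z} &= g(x, z, u), \\
u &= u_s(x) + u_f(x, z_f), \qquad z_f = z - \bar{z}(x, u_s),
\end{aligned}
% \bar{z} is the quasi-steady state solving 0 = g(x, \bar{z}, u_s).
% The slow controller u_s is designed on the reduced model (\varepsilon = 0);
% the fast controller u_f is designed in the stretched time \tau = t/\varepsilon
% and satisfies u_f = 0 whenever z_f = 0, so it does not alter the slow model.
```

In this reading, the sliding mode controller with neural network and disturbance observer would play the role of u_s, and the reinforcement-learning-based optimal controller the role of u_f.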