Hovering control of an underwater vehicle-manipulator system propelled by undulatory fins via reinforcement learning

MA Rui-chen; BAI Xue-jian; WANG Yu; WANG Rui; WANG Shuo

Cite this article:	MA Rui-chen, BAI Xue-jian, WANG Yu, WANG Rui, WANG Shuo. Hovering control of an underwater vehicle-manipulator system propelled by undulatory fins via reinforcement learning[J]. Control Theory & Applications, 2022, 39(11): 2092-2099.

Received: 2021-11-01  Revised: 2022-09-13

DOI: 10.7641/CTA.2022.11054

2022,39(11):2092-2099

Keywords: underwater vehicle-manipulator system; hovering control; undulatory fin; neural network; reinforcement learning

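Funding: Supported by the National Natural Science Foundation of China (62122087, 62073316, U1806204, 62033013, U1713222) and the Key Project of International Cooperation of the Chinese Academy of Sciences (173211KYSB20200020).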

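Author	Affiliation	Postal code
MA Rui-chen	Institute of Automation, Chinese Academy of Sciences	100190
BAI Xue-jian^*	Institute of Automation, Chinese Academy of Sciences	100190
WANG Yu	Institute of Automation, Chinese Academy of Sciences
WANG Rui	Institute of Automation, Chinese Academy of Sciences
WANG Shuo	Institute of Automation, Chinese Academy of Sciences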

Abstract

This paper addresses the hovering control of an underwater vehicle-manipulator system (UVMS) propelled by undulatory fins. First, the kinematic and dynamic models of the UVMS are presented, together with a model that maps the control parameters of the undulatory fins to the driving force on the UVMS, and a training framework for hovering control is built on a Markov decision process (MDP). Next, using this model structure and the associated training strategy, the controller network is trained by reinforcement learning to obtain the best-performing hovering controller. Finally, hovering control experiments with the UVMS are carried out in an indoor pool, and the results verify the effectiveness of the proposed method.