Cite this article: WANG Dong-shu, GAO Xu-lin. Motivated developmental learning for flexible behavioral decision-making of mobile robots in unknown environment[J]. Control Theory & Applications, 2023, 40(4): 641-652.
Motivated developmental learning for flexible behavioral decision-making of mobile robots in unknown environment
Received: 2021-10-12  Revised: 2022-02-07
DOI  10.7641/CTA.2022.10963
  2023,40(4):641-652
Keywords  mobile robot  behavioral decision-making  locus coeruleus  curiosity  exploration-exploitation trade-off
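Funding  National Natural Science Foundation of China (62173309, 61873245), Natural Science Foundation of Henan Province (202300410483), Henan Province Science and Technology Research Project (192102210256)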
Author  Affiliation  E-mail
WANG Dong-shu*  Zhengzhou University  wangdongshu@zzu.edu.cn
GAO Xu-lin  Zhengzhou University  gaoxulin@163.com
Abstract
      Flexible behavioral decision-making in unknown environments is a prerequisite for mobile robots to complete their tasks. Current robot behavioral decision-making methods are not flexible enough in dynamically changing environments, which makes it difficult for robots to acquire a continuous and stable learning ability. In our previous work, we integrated the supervised learning of the cerebellum with the reinforcement learning of the basal ganglia to achieve flexible behavioral decision-making for mobile robots in dynamic environments, but that algorithm adapted to dynamic environments only to a limited extent. Building on that work, this paper designs a more biologically meaningful curiosity index to replace the original vigilance index. By simulating the dynamic switching of locus coeruleus activity between the tonic mode and the phasic mode, the robot adaptively regulates its balance between exploring and exploiting the environment. In addition, an adaptive adjustment factor that varies with the external environment is designed, so that flexible behavioral decision-making based on cerebellar supervised learning and basal ganglia reinforcement learning is achieved in dynamic environments and the robot acquires a continuous and stable learning ability. Experimental results in dynamic and real environments verify the effectiveness of the proposed algorithm.
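
The abstract names two mechanisms without giving their formulas: a curiosity-driven switch between locus coeruleus modes, and an adaptive factor that weighs supervised against reinforcement learning. The Python sketch below is only one plausible reading, not the authors' algorithm. The threshold rule, the mapping of the tonic mode to exploration and the phasic mode to exploitation (borrowed from the adaptive gain account of locus coeruleus function), the convex blend, and every name and parameter (`select_mode`, `choose_action`, `blend`, `threshold`, `factor`) are assumptions made for illustration.

```python
import random


def select_mode(curiosity, threshold=0.5):
    """Pick a locus coeruleus mode from a curiosity value in [0, 1].

    Assumption: high curiosity maps to the tonic (exploratory) mode,
    low curiosity to the phasic (exploitative) mode.
    """
    return "tonic" if curiosity > threshold else "phasic"


def choose_action(action_values, curiosity, rng, threshold=0.5):
    """Return (action_index, mode) for one decision step.

    Tonic mode explores with a uniformly random action; phasic mode
    exploits by taking the highest-valued action.
    """
    mode = select_mode(curiosity, threshold)
    if mode == "tonic":
        return rng.randrange(len(action_values)), mode
    best = max(range(len(action_values)), key=lambda a: action_values[a])
    return best, mode


def blend(supervised, reinforced, factor):
    """Convex combination of two action-preference vectors.

    Assumption: `factor` in [0, 1] is the weight on the supervised
    (cerebellum-like) signal; 1 - factor goes to the reinforcement
    (basal-ganglia-like) signal.
    """
    if not 0.0 <= factor <= 1.0:
        raise ValueError("factor must lie in [0, 1]")
    return [factor * s + (1.0 - factor) * r
            for s, r in zip(supervised, reinforced)]


if __name__ == "__main__":
    rng = random.Random(0)
    values = blend([1.0, 0.0, 0.0], [0.0, 0.2, 0.8], factor=0.25)
    print(choose_action(values, curiosity=0.1, rng=rng))
    print(choose_action(values, curiosity=0.9, rng=rng))
```

In this reading, a familiar scene (low curiosity) leads the robot to take its best-known action, while a novel scene (high curiosity) triggers exploration. How the paper actually computes curiosity and the adjustment factor is not stated in the abstract.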