Cite this article: | TANG Hao, DING Lijie, CHENG Wenjuan, ZHOU Lei. The cerebellar-model-articulation-controller Q-learning for the task assignment of a handling system[J]. Control Theory & Applications, 2009, 26(8): 884-888. |
|
The cerebellar-model-articulation-controller Q-learning for the task assignment of a handling system |
Received: 2008-05-25 Revised: 2008-11-16 |
DOI 10.7641/j.issn.1000-8152.2009.8.CCTA080522 |
2009,26(8):884-888 |
Keywords: task assignment; Markov decision process (MDP); Q-learning; CMAC |
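Funding: National Natural Science Foundation of China (60404009); Natural Science Foundation of Anhui Province (090412046, 070416242); Key Project of Provincial Natural Science Research of Anhui Higher Education Institutions (KJ2007A063, KJ2008A058); Scientific Research Foundation for Returned Overseas Chinese Scholars, Ministry of Education. |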
|
Abstract |
This paper studies the task assignment problem of a high-speed handling system with two robots. In the underlying Markov decision process (MDP) model, the state variable mixes continuous and discrete components, so the state space is complex and suffers from the curse of dimensionality, which makes traditional numerical optimization impractical. Because the cerebellar model articulation controller (CMAC) converges quickly and adapts well, it is used to approximate the Q-value function. Combining this approximator with Q-learning and the concept of performance potential yields a CMAC-Q learning optimization algorithm that applies to both the average and the discounted performance criteria. Simulation results show that, compared with conventional Q-learning, this neuro-dynamic programming approach requires less memory, achieves higher optimization accuracy, and learns faster. |
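
The central idea of the abstract, a CMAC standing in for the Q-table, can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the algorithm from the paper: it uses the plain discounted Q-learning update and leaves out the performance-potential term, and the class `CMAC`, the function `q_learning_step`, the state layout (one continuous value in [0, 1] plus one discrete index), and every parameter value are hypothetical.

```python
class CMAC:
    """Tile-coding (CMAC) approximator for Q(s, a).

    Assumed state layout (hypothetical): one continuous component x in
    [0, 1] and one discrete component d in {0, ..., n_discrete - 1}.
    """

    def __init__(self, n_tilings=8, n_tiles=10, n_discrete=2, n_actions=2):
        self.n_tilings = n_tilings
        self.n_tiles = n_tiles
        # One weight table per tiling. Each tiling has n_tiles + 1 cells
        # along x so that the shifted tilings still cover [0, 1].
        self.w = [[[[0.0] * n_actions for _ in range(n_discrete)]
                   for _ in range(n_tiles + 1)]
                  for _ in range(n_tilings)]

    def _cells(self, x):
        # Each tiling is shifted by a different fraction of one tile width,
        # so nearby values of x share some (not all) active cells.
        for t in range(self.n_tilings):
            offset = t / self.n_tilings
            yield t, int(x * self.n_tiles + offset)

    def q(self, x, d, a):
        # The Q-value is the sum of the weights of the active cells.
        return sum(self.w[t][c][d][a] for t, c in self._cells(x))

    def update(self, x, d, a, target, lr):
        # Spread the correction evenly over the active cells.
        step = lr * (target - self.q(x, d, a)) / self.n_tilings
        for t, c in self._cells(x):
            self.w[t][c][d][a] += step


def q_learning_step(cmac, state, action, reward, next_state, n_actions,
                    gamma=0.9, lr=0.1):
    """One discounted Q-learning update using the CMAC as approximator."""
    x, d = state
    nx, nd = next_state
    best_next = max(cmac.q(nx, nd, a) for a in range(n_actions))
    cmac.update(x, d, action, reward + gamma * best_next, lr)
```

With these assumed settings the approximator stores 8 × 11 × 2 × 2 weights regardless of how finely the continuous component is resolved, which is the kind of memory saving over a tabular Q-function that the abstract refers to. Overlapping tilings also let an update at one value of x generalize to nearby values.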
|
|
|
|
|