Cite this article: | ZHOU Lei, KONG Feng, TANG Hao, ZHANG Jian-jun. Application of cerebellar model articulation controller network to learning optimization control in conveyor-serviced production station[J]. Control Theory & Applications (控制理论与应用), 2011, 28(11): 1665-1670. |
|
Application of cerebellar model articulation controller network to learning optimization control in conveyor-serviced production station |
Received: 2010-07-20 Revised: 2011-01-18 |
DOI 10.7641/j.issn.1000-8152.2011.11.CCTA100836 |
2011,28(11):1665-1670 |
Keywords: conveyor-serviced production station; cerebellar model articulation controller; Q-learning; online policy iteration |
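Funding: National Natural Science Foundation of China (60873003, 61174186); Scientific Research Foundation for Returned Overseas Scholars, Ministry of Education of China (教外司留2008890); Natural Science Foundation of Anhui Province (090412046); Key Natural Science Research Project of Anhui Provincial Universities (KJ2008A058, KJ2011A230); China-Japan International Science and Technology Cooperation Project (2011FA10440). |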
|
Abstract
This paper studies the optimal control of the look-ahead distance in a single-station conveyor-serviced production station (CSPS) system, with the aim of improving the system's operating efficiency. The CSPS optimal control problem is modeled as a semi-Markov decision process (SMDP). Because standard Q-learning has difficulty directly handling a control problem in which the look-ahead distance is a continuous variable, Q-value function approximation by a cerebellar model articulation controller (CMAC) network is combined with online learning techniques, yielding an online Q-learning algorithm and a model-free online policy iteration algorithm. Simulation results show that the proposed algorithms improve both the learning speed and the precision of the optimization. |
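The idea of approximating Q-values over a continuous look-ahead distance with a CMAC can be illustrated with a minimal sketch. It is not the paper's algorithm: the class `CMAC`, the function `q_learning_step`, the tiling sizes, the learning rate, and the fixed discount factor `gamma` are all assumptions made for illustration, and the update shown is plain discounted Q-learning rather than the SMDP formulation used in the paper.

```python
import numpy as np


class CMAC:
    """1-D CMAC (tile coding): several offset tilings over [low, high]."""

    def __init__(self, low, high, n_tilings=8, n_tiles=10):
        self.low, self.high = low, high
        self.n_tilings, self.n_tiles = n_tilings, n_tiles
        self.tile_width = (high - low) / n_tiles
        # One extra tile per tiling absorbs the shift caused by the offsets.
        self.weights = np.zeros((n_tilings, n_tiles + 1))

    def active_tiles(self, x):
        """Index of the single active tile in each tiling for input x."""
        x = min(max(x, self.low), self.high)
        offsets = np.arange(self.n_tilings) * self.tile_width / self.n_tilings
        idx = np.floor((x - self.low + offsets) / self.tile_width).astype(int)
        return np.minimum(idx, self.n_tiles)

    def predict(self, x):
        """Approximated value: sum of the weights of the active tiles."""
        rows = np.arange(self.n_tilings)
        return self.weights[rows, self.active_tiles(x)].sum()

    def update(self, x, target, alpha=0.1):
        """Move the prediction at x toward target; return the error before the update."""
        error = target - self.predict(x)
        rows = np.arange(self.n_tilings)
        self.weights[rows, self.active_tiles(x)] += alpha * error / self.n_tilings
        return error


def q_learning_step(q, state, action, reward, next_state,
                    candidate_actions, alpha=0.1, gamma=0.95):
    """One Q-learning update; q maps each discrete state to a CMAC over actions.

    The maximum over next actions is taken on a finite set of candidate
    look-ahead distances (an assumption of this sketch).
    """
    best_next = max(q[next_state].predict(a) for a in candidate_actions)
    target = reward + gamma * best_next
    return q[state].update(action, target, alpha)
```

Because neighbouring inputs share most of their active tiles, an update at one look-ahead distance also shifts the estimate at nearby distances, which is what lets a table-like learner generalize over a continuous variable.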