Cite this article: | CHEN Xue-song, YANG Yi-min. Adaptive PID control based on Actor-Critic learning[J]. 控制理论与应用, 2011, 28(8): 1187~1192. |
CHEN Xue-song, YANG Yi-min. A novel adaptive PID controller based on Actor-Critic learning[J]. Control Theory and Technology, 2011, 28(8): 1187~1192. |
|
Adaptive PID control based on Actor-Critic learning |
A novel adaptive PID controller based on Actor-Critic learning |
Received: 2010-05-26 Revised: 2010-10-27 |
DOI 10.7641/j.issn.1000-8152.2011.8.CCTA100618 |
2011,28(8):1187-1192 |
Keywords: reinforcement learning; Actor-Critic; adaptive PID control |
Funding: National Natural Science Foundation of China (60974019); Natural Science Foundation of Guangdong Province (9451009001002686). |
|
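Chinese abstract (translated) |
To address the inability of conventional PID controllers to self-tune their parameters online, an adaptive PID controller structure and learning algorithm based on Actor-Critic (AC) learning is proposed. The controller uses AC learning to tune the PID parameters adaptively, with a single radial basis function (RBF) network approximating both the Actor's policy function and the Critic's value function. The inputs to the RBF network are the system error and its first-order and second-order differences. The Actor maps the system state to the PID parameters, while the Critic evaluates the Actor's output and generates the temporal difference (TD) error signal. Based on the AC learning architecture and the TD error performance index, a flow chart of the controller design steps is given. Two simulation experiments show that the AC-learning-based PID controller outperforms a conventional PID controller in response speed and adaptability. |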
English abstract |
Because typical PID (T-PID) controllers cannot self-tune their parameters online, a self-tuning PID control strategy based on Actor-Critic learning (AC-PID) is proposed. Actor-Critic learning tunes the PID parameters of the controller adaptively. The policy function of the Actor and the value function of the Critic are both approximated by a single radial basis function (RBF) neural network. The inputs to the RBF network are the system error and its first-order and second-order differences. The Actor realizes the mapping from the system state to the PID parameters, while the Critic evaluates the Actor's output and generates the temporal difference (TD) error signal. Based on the Actor-Critic learning structure and the TD error performance index, a flow chart of the controller design procedure is given. Two simulation experiments show that the proposed controller responds faster and adapts better than the typical PID controller. |
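
The structure described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract gives no equations, so the Gaussian RBF form, the incremental PID law, the TD(0) critic update, the noise-weighted actor update, the discount factor, and all learning rates and function names below are assumptions chosen for illustration. Only the overall layout (a shared RBF layer fed by the error and its first and second differences, an Actor head producing the PID parameters, a Critic head producing the value used for the TD error) comes from the abstract.

```python
import math

GAMMA = 0.95  # discount factor (assumed value, not from the paper)


def state_vector(e, e_prev, e_prev2):
    """State = [error, first difference, second difference] of the error."""
    return [e, e - e_prev, e - 2.0 * e_prev + e_prev2]


def rbf_features(x, centers, width):
    """Gaussian RBF hidden layer shared by Actor and Critic (assumed form)."""
    return [
        math.exp(-sum((xi - ci) ** 2 for xi, ci in zip(x, c)) / (2.0 * width ** 2))
        for c in centers
    ]


def actor_gains(phi, w_actor):
    """Actor head: linear map from RBF features to (Kp, Ki, Kd)."""
    return [sum(w * p for w, p in zip(row, phi)) for row in w_actor]


def critic_value(phi, w_critic):
    """Critic head: linear map from RBF features to a scalar state value."""
    return sum(w * p for w, p in zip(w_critic, phi))


def td_error(reward, phi, phi_next, w_critic, gamma=GAMMA):
    """Standard TD error: r + gamma * V(x') - V(x)."""
    return reward + gamma * critic_value(phi_next, w_critic) - critic_value(phi, w_critic)


def update_weights(phi, delta, noise, w_actor, w_critic,
                   lr_actor=0.01, lr_critic=0.05):
    """In-place update (assumed rule): TD(0) step for the Critic and a
    TD-error-weighted step along the exploration noise for the Actor."""
    for j, p in enumerate(phi):
        w_critic[j] += lr_critic * delta * p
        for k in range(3):
            w_actor[k][j] += lr_actor * delta * noise[k] * p


def pid_increment(gains, x):
    """Incremental PID law (assumed): du = Kp*de + Ki*e + Kd*d2e."""
    kp, ki, kd = gains
    e, de, d2e = x
    return kp * de + ki * e + kd * d2e


if __name__ == "__main__":
    centers = [[0.0, 0.0, 0.0], [1.0, 0.0, 0.0]]     # hypothetical RBF centers
    w_actor = [[0.5, 0.5], [0.1, 0.1], [0.0, 0.0]]   # rows: Kp, Ki, Kd
    w_critic = [0.0, 0.0]

    x = state_vector(1.0, 1.0, 1.0)                  # constant error of 1
    x_next = state_vector(0.8, 1.0, 1.0)             # error starts to shrink
    phi = rbf_features(x, centers, width=1.0)
    phi_next = rbf_features(x_next, centers, width=1.0)

    gains = actor_gains(phi, w_actor)
    du = pid_increment(gains, x)
    reward = -x_next[0] ** 2                         # assumed reward: -e^2
    delta = td_error(reward, phi, phi_next, w_critic)
    update_weights(phi, delta, [0.0, 0.0, 0.0], w_actor, w_critic)
    print("gains:", gains, "du:", du, "TD error:", delta)
```

In a closed loop, each control step would compute the state from the latest errors, let the Actor (plus exploration noise) supply the gains, apply the PID increment to the plant, and then use the Critic's TD error to update both sets of output weights. Sharing one hidden layer keeps the number of RBF evaluations per step the same as for a single network.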