Citation: Yazhou Hu, Fengzhen Tang, Jun Chen, Wenxue Wang. Quantum-enhanced reinforcement learning for control: a preliminary study [J]. Control Theory and Technology, 2021, 19(4): 455-464.
Quantum-enhanced reinforcement learning for control: a preliminary study
Yazhou Hu, Fengzhen Tang, Jun Chen, Wenxue Wang
1 College of Mechanical and Electronic Engineering, Northwest A&F University, Yangling, Shaanxi 712100, China
2 The State Key Laboratory of Robotics, Shenyang Institute of Automation, Chinese Academy of Sciences, Shenyang, Liaoning 110016, China
3 Institutes for Robotics and Intelligent Manufacturing, Chinese Academy of Sciences, Shenyang, Liaoning 110169, China
Abstract:
Reinforcement learning is one of the fastest-growing areas of machine learning and has achieved notable results in
biomedicine, the Internet of Things (IoT), logistics, robotic control, and other fields. However, engineering applications
still face many challenges, such as how to speed up the learning process and how to balance the trade-off between
exploration and exploitation. Quantum technology, which can solve certain complex problems faster than classical
methods, including those run on supercomputers, provides a new paradigm for overcoming these challenges in
reinforcement learning. In this paper, a quantum-enhanced reinforcement learning algorithm is proposed for optimal
control. In this algorithm, the states and actions of reinforcement learning are quantized using quantum technology.
A probability amplification method is then presented, which uses this quantization to effectively avoid the trade-off
between exploration and exploitation. Finally, the optimal control policy is learned during the reinforcement learning
process. The performance of the quantized algorithm is demonstrated in the MountainCar and CartPole environments,
two of the classic control reinforcement learning environments in OpenAI Gym. The preliminary results indicate that,
compared with Q-learning, the quantized reinforcement learning method achieves better control performance without
having to consider the trade-off between exploration and exploitation. The learning performance of the new algorithm
is stable across learning rates from 0.01 to 0.10, which suggests that it is promising for systems with unknown dynamics.
Keywords: Quantum theory · Reinforcement learning · Quantum computation · State superposition · Optimal control
DOI: https://doi.org/10.1007/s11768-021-00063-x
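The abstract describes action selection by measuring a quantized action state, followed by probability amplification of rewarded actions. The sketch below is a purely classical simulation of that idea and is not the authors' implementation: the toy corridor environment, the function names `measure`, `amplify`, and `train`, the gain constant `k`, and the reweight-and-renormalize step (used here in place of real amplitude amplification) are all assumptions made for illustration. The learning rate `alpha=0.1` is simply taken from the range quoted in the abstract.

```python
import math
import random


def measure(amplitudes, rng):
    """Collapse the action superposition: pick index i with probability amplitudes[i]**2."""
    weights = [a * a for a in amplitudes]
    return rng.choices(range(len(amplitudes)), weights=weights, k=1)[0]


def amplify(amplitudes, action, gain):
    """Classical stand-in for amplitude amplification (an assumption, not Grover iterations):
    scale the probability of `action` by (1 + gain), then renormalize."""
    probs = [a * a for a in amplitudes]
    probs[action] *= 1.0 + max(gain, 0.0)
    total = sum(probs)
    return [math.sqrt(p / total) for p in probs]


def train(episodes=200, n_states=5, alpha=0.1, gamma=0.9, k=1.0, seed=0):
    """Hypothetical toy corridor: start at state 0, reward 1 on reaching the last state."""
    rng = random.Random(seed)
    values = [0.0] * n_states
    # Equal superposition over the two actions (0 = left, 1 = right) in every state.
    amps = [[math.sqrt(0.5), math.sqrt(0.5)] for _ in range(n_states)]
    goal = n_states - 1
    for _ in range(episodes):
        state = 0
        for _ in range(100):  # step cap per episode
            action = measure(amps[state], rng)
            nxt = min(state + 1, goal) if action == 1 else max(state - 1, 0)
            reward = 1.0 if nxt == goal else 0.0
            # TD(0) update of the state value.
            values[state] += alpha * (reward + gamma * values[nxt] - values[state])
            # Amplify the action just taken in proportion to reward plus next-state value.
            amps[state] = amplify(amps[state], action, k * (reward + values[nxt]))
            state = nxt
            if state == goal:
                break
    return values, amps


if __name__ == "__main__":
    values, amps = train()
    for s, row in enumerate(amps):
        print(s, round(values[s], 3), [round(a * a, 3) for a in row])
```

In this sketch there is no separate exploration parameter such as an epsilon: exploration comes from the measurement itself, and it narrows as amplification concentrates probability on actions that led to reward. Whether this matches the behaviour reported in the paper cannot be judged from the abstract alone.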