Citation: Jian SUN, Feng LIU, Jennie SI, Shengwei MEI. Direct heuristic dynamic programming based on an improved PID neural network[J]. Control Theory and Technology, 2012, 10(4): 497~503.

Received: May 11, 2010; Revised: March 28, 2011
Funding: This work was supported by the National Natural Science Foundation of China under Cooperative Research Funds (No. 50828701); the third author is also supported by the U.S. National Science Foundation (No. ECCS-0702057).
Direct heuristic dynamic programming based on an improved PID neural network
Jian SUN, Feng LIU, Jennie SI, Shengwei MEI
(Department of Electrical Engineering, Tsinghua University; Department of Electrical Engineering, Arizona State University)
Abstract:
In this paper, an improved PID neural network (IPIDNN) structure is proposed and applied to the critic and action networks of direct heuristic dynamic programming (DHDP). As one of the online learning algorithms of approximate dynamic programming (ADP), DHDP has demonstrated its applicability to problems with large state and control spaces. Theoretically, the DHDP algorithm requires access to full state feedback in order to obtain solutions to the Bellman optimality equation. Unfortunately, it is not always possible to access all the states in a real system. This paper proposes a solution: an IPIDNN configuration is used to construct the critic and action networks so as to achieve output feedback control. Since this structure can estimate the integrals and derivatives of measurable outputs, more system states are utilized and thus better control performance is expected. Compared with the traditional PIDNN, this configuration is flexible and easy to expand. Based on this structure, a gradient descent algorithm for the IPIDNN-based DHDP is presented. Convergence issues are addressed both within a single learning time step and for the entire learning process. Some important insights are provided to guide the implementation of the algorithm. The proposed learning controller has been applied to a cart-pole system to validate the effectiveness of the structure and the algorithm.
Key words: Approximate dynamic programming (ADP); Direct heuristic dynamic programming (DHDP); Improved PID neural network (IPIDNN)
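
The abstract states that the IPIDNN structure can estimate the integrals and derivatives of measurable outputs so that output feedback can stand in for full state feedback. The short sketch below only illustrates that general idea and is not the network described in the paper: it assumes a uniformly sampled scalar output, a rectangular-rule integral, and a backward-difference derivative. The function name `pid_features` and the sample period `dt` are hypothetical.

```python
def pid_features(y, dt):
    """Build (proportional, integral, derivative) triples from sampled outputs.

    Illustrative assumption: rectangular-rule integration and a
    backward-difference derivative; the first derivative is set to 0.0
    because no earlier sample exists.
    """
    integral = 0.0
    prev = None
    rows = []
    for sample in y:
        integral += sample * dt  # running integral estimate
        derivative = 0.0 if prev is None else (sample - prev) / dt
        rows.append((sample, integral, derivative))
        prev = sample
    return rows


# Hypothetical measured output, e.g. a pole angle sampled every 0.5 s.
features = pid_features([0.0, 1.0, 2.0, 3.0], dt=0.5)
for row in features:
    print(row)
```

In a learning controller of the kind the abstract describes, triples like these could serve as inputs to the critic and action networks in place of unmeasured states; how the paper actually wires and trains those networks is not specified on this page.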