Cite this article: BAO Tao, LI Hao-fei, YU Tao, ZHANG Xiao-shun. Mixed game reinforcement learning of supply-demand interaction in power system dispatch on electricity market[J]. Control Theory & Applications, 2020, 37(4): 907-917.
Mixed game reinforcement learning of supply-demand interaction in power system dispatch on electricity market
Received: 2018-10-22  Revised: 2019-07-08
DOI  10.7641/CTA.2019.80814
  2020,37(4):907-917
Keywords  mixed game reinforcement learning  supply and demand interaction  Stackelberg game  evolutionary game  complex network
Funding  National Natural Science Foundation of China
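Author  Affiliation  E-mail
BAO Tao  Guangzhou Power Supply Bureau Co., Ltd.  baotaowork@foxmail.com
LI Hao-fei  School of Electric Power, South China University of Technology
YU Tao*  School of Electric Power, South China University of Technology  taoyu1@scut.edu.cn
ZHANG Xiao-shun  College of Engineering, Shantou University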
Abstract
      To model supply-demand interaction in a power system under an electricity market more accurately, and to better match the changing relationships and complex communication topologies expected among demand-side load aggregators in future markets, this paper combines a Stackelberg game of supply-demand interaction with an evolutionary game on a complex network that captures the interaction among demand-side load aggregators, yielding a mixed game model of supply-demand interaction that accounts for market factors. A mixed game reinforcement learning algorithm is proposed to solve this non-convex, discontinuous model. Built on Q-learning and drawing on ideas from game theory and graph theory, the algorithm combines block cooperation with evolutionary game methods and makes full use of the knowledge matrix information formed by the interactive game relationships between players, so that non-convex optimization problems of multi-agent systems on complex networks can be solved with high quality. Simulation results on four types of 3-generator, 6-load systems built from complex network theory and on the power grid of a first-tier city in southern China show that the algorithm achieves better optimization performance than most centralized intelligent algorithms and maintains good results across different networks, indicating strong adaptability and stability.
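The abstract names Q-learning as the basis of the proposed algorithm. As a rough illustration of that building block only, and not of the authors' mixed game method, a standard tabular Q-learning step might be sketched as follows. The table size, learning rate `alpha`, discount factor `gamma`, and the helper names are all assumptions chosen for the example; none of them come from the paper.

```python
import random


def q_update(q, state, action, reward, next_state, alpha=0.1, gamma=0.9):
    """Apply one standard tabular Q-learning step and return the new value.

    alpha (learning rate) and gamma (discount factor) are illustrative
    defaults, not values taken from the paper.
    """
    best_next = max(q[next_state])  # greedy value estimate of the next state
    q[state][action] += alpha * (reward + gamma * best_next - q[state][action])
    return q[state][action]


def epsilon_greedy(q, state, epsilon=0.1, rng=random):
    """Pick a random action with probability epsilon, else the greedy one."""
    if rng.random() < epsilon:
        return rng.randrange(len(q[state]))
    row = q[state]
    return row.index(max(row))


# Hypothetical toy problem: 2 states, 2 actions, all values start at zero.
q_table = [[0.0, 0.0], [0.0, 0.0]]
updated = q_update(q_table, state=0, action=1, reward=1.0, next_state=1)
greedy_action = epsilon_greedy(q_table, state=0, epsilon=0.0)
```

In the paper's setting each player would presumably hold its own knowledge matrix and exchange information with neighbours on the network; that coupling is not shown here.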