A tutorial on event-based optimization with application in energy Internet

JIA Qing-Shan; YANG Yu; XIA Li; GUAN Xiao-Hong

Cite this article:	JIA Qing-Shan, YANG Yu, XIA Li, GUAN Xiao-Hong. A tutorial on event-based optimization with application in energy Internet [J]. Control Theory & Applications (控制理论与应用), 2018, 35(1): 32-40.

Received: 2017-02-06; Revised: 2018-01-25

DOI: 10.7641/CTA.2018.70064

2018,35(1):32-40

Keywords: event-based; performance potential; event-based Q-factors; performance difference; simulation-based optimization; energy Internet

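Funding: Supported by the National Key Research and Development Program of China (2016YFB0901900) and the National Natural Science Foundation of China (61673229, 61174072, 61222302, 91224008, 61221063, U1301254).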

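Author	Affiliation	E-mail
JIA Qing-Shan*	Tsinghua University	jiaqs@tsinghua.edu.cn
YANG Yu	Tsinghua University
XIA Li	Tsinghua University
GUAN Xiao-Hong	Tsinghua University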

Abstract

Many practical systems are event-driven: the evolution of the system state is triggered by a sequence of discrete events. Such systems are called discrete event dynamic systems (DEDSs). For the performance optimization of these systems, this paper introduces an optimization framework called event-based optimization (EBO). The defining feature of EBO is that decisions are made based on events rather than on states, as they are in a Markov decision process (MDP). This brings two main advantages. First, an event usually corresponds to a set of state transitions that share some common property, and the number of events that require decisions is often much smaller than the number of states. EBO can therefore exploit the event structure of the system to aggregate performance potentials, which alleviates the curse of dimensionality. Second, many practical problems require actions only when certain events occur. Such problems do not fit the standard MDP formulation well, because an MDP treats the decisions made in different states as independent, whereas a single event typically corresponds to many different states that can all share the same action. Building on the basic theory of MDPs, this paper covers three topics. First, we briefly review the basic ideas of EBO and the development of its theory and applications. Second, we introduce simulation-based policy iteration methods for EBO based on performance potentials or event-based Q-factors. Third, we present a case study on coordinating electric vehicle charging with the distributed wind power generation of a building microgrid, which sheds some light on the application prospects of EBO in the energy Internet.
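
The claim that an event groups many state transitions, so that there are far fewer events than states, can be illustrated with a toy model. The sketch below is not taken from the paper: the multi-queue admission setting, the function names `enumerate_states` and `arrival_events`, and the choice of "an arrival that finds n customers in the system" as the event are all assumptions made only for illustration.

```python
from itertools import product


def enumerate_states(num_queues, capacity):
    """All queue-length vectors whose total population is at most `capacity`."""
    return [s for s in product(range(capacity + 1), repeat=num_queues)
            if sum(s) <= capacity]


def arrival_events(states):
    """Group arrival transitions by the total population seen on arrival.

    Hypothetical event definition: event n is the set of transitions
    (s, s') in which one queue grows by one customer and sum(s) == n.
    """
    state_set = set(states)
    events = {}
    for s in states:
        for i in range(len(s)):
            nxt = s[:i] + (s[i] + 1,) + s[i + 1:]
            if nxt in state_set:  # skip arrivals that would exceed capacity
                events.setdefault(sum(s), set()).add((s, nxt))
    return events


states = enumerate_states(num_queues=3, capacity=4)
events = arrival_events(states)

# An event-based policy needs one action per event, not one per state.
# This threshold rule is an arbitrary illustrative choice.
policy = {n: ("accept" if n < 3 else "reject") for n in events}

print(len(states), "states,", len(events), "events")
```

In this toy setting a state-based policy table would need one entry per state, while the event-based table needs one entry per event, which is the kind of aggregation the abstract refers to.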