Model-based reinforcement learning for active ventilated tiles control in data centers

WEN Jian-wei; ZHANG Li; DUAN Yan-duo; LI Lei-xiao

Cite this article:	WEN Jian-wei, ZHANG Li, DUAN Yan-duo, LI Lei-xiao. Model-based reinforcement learning for active ventilated tiles control in data centers[J]. Control Theory & Applications, 2022, 39(6): 1051-1056.

Received: 2021-07-27; Revised: 2022-03-23

DOI: 10.7641/CTA.2021.10682

2022,39(6):1051-1056

Keywords: data center; active ventilation tile; reinforcement learning; performance evaluation

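Funding: Supported by the National Natural Science Foundation of China (61862048), the Inner Mongolia Autonomous Region Major Science and Technology Special Project (2019ZD015, 2019ZD016), the Inner Mongolia Autonomous Region Key Technology Research Program (2019GG273, 2020GG0094), and the Inner Mongolia Autonomous Region Special Fund for the Transformation of Scientific and Technological Achievements (2020CG0073, 2021CG0033).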

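Author	Affiliation	E-mail
WEN Jian-wei	Inner Mongolia Autonomous Region Meteorological Information Center	duanyanduo@163.com
ZHANG Li*	Inner Mongolia Autonomous Region Meteorological Information Center	duanyanduo@163.com
DUAN Yan-duo	Inner Mongolia Autonomous Region Engineering Technology Research Center for Big Data-Based Software Services
LI Lei-xiao	Inner Mongolia Autonomous Region Engineering Technology Research Center for Big Data-Based Software Services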

Abstract

Eliminating local hotspots is one of the key problems facing the data center industry. This work uses active ventilation tiles (AVTs) to suppress local rack hotspots. The AVT control problem is formulated as a Markov decision process (MDP), and an optimal control policy based on deep reinforcement learning (DRL) is designed. The policy follows the model-based reinforcement learning (MBRL) paradigm, which overcomes the low sample efficiency of traditional model-free DRL methods. Extensive simulation results show that, compared with the classical model-free proximal policy optimization (PPO) algorithm, the proposed method converges to the optimal control policy more quickly and effectively suppresses rack hotspots.
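The abstract does not specify the state, action, reward, or dynamics model used in the paper. The sketch below is therefore not the authors' algorithm. It only illustrates the general model-based idea on a hypothetical one-rack thermal plant: collect a small batch of transitions, fit a dynamics model to them, then pick tile openings by short-horizon planning with the fitted model. Every constant (`A_TRUE`, `B_TRUE`, `C_TRUE`, `T_HOT`, `FAN_COST`), the linear plant, the reward, and all function names are assumptions made for illustration, and a linear least-squares fit stands in for the deep model a real MBRL method would learn.

```python
import itertools
import random

# Hypothetical toy plant (NOT from the paper): rack inlet temperature in deg C.
# T_next = A*T + B*u + C, where u in [0, 1] is the tile fan opening.
A_TRUE, B_TRUE, C_TRUE = 0.9, -1.0, 3.5
ACTIONS = (0.0, 0.25, 0.5, 0.75, 1.0)
T_HOT = 27.0      # assumed hotspot threshold
FAN_COST = 0.05   # assumed weight on fan effort


def step(temp, u, rng):
    """True plant, unknown to the controller; small Gaussian noise is added."""
    return A_TRUE * temp + B_TRUE * u + C_TRUE + rng.gauss(0.0, 0.05)


def reward(temp, u):
    """Penalize temperature above the threshold and fan effort."""
    return -max(0.0, temp - T_HOT) ** 2 - FAN_COST * u


def solve_linear(m, v):
    """Solve m @ x = v by Gaussian elimination with partial pivoting."""
    n = len(v)
    aug = [row[:] + [v[i]] for i, row in enumerate(m)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    sol = [0.0] * n
    for r in reversed(range(n)):
        rest = sum(aug[r][c] * sol[c] for c in range(r + 1, n))
        sol[r] = (aug[r][n] - rest) / aug[r][r]
    return sol


def fit_linear_model(transitions):
    """Least-squares fit of T_next ~ a*T + b*u + c via the normal equations."""
    xtx = [[0.0] * 3 for _ in range(3)]
    xty = [0.0] * 3
    for temp, u, temp_next in transitions:
        x = (temp, u, 1.0)
        for i in range(3):
            xty[i] += x[i] * temp_next
            for j in range(3):
                xtx[i][j] += x[i] * x[j]
    return solve_linear(xtx, xty)


def plan(model, temp, horizon=3):
    """Return the first action of the best sequence under the learned model."""
    a, b, c = model
    best_u, best_ret = ACTIONS[0], float("-inf")
    for seq in itertools.product(ACTIONS, repeat=horizon):
        t, ret = temp, 0.0
        for u in seq:
            t = a * t + b * u + c   # predicted next temperature
            ret += reward(t, u)
        if ret > best_ret:
            best_u, best_ret = seq[0], ret
    return best_u


def run_demo(seed=0, n_samples=200, n_steps=40):
    rng = random.Random(seed)
    # 1) Collect a small batch of transitions with random tile openings.
    temp, data = 30.0, []
    for _ in range(n_samples):
        u = rng.choice(ACTIONS)
        nxt = step(temp, u, rng)
        data.append((temp, u, nxt))
        temp = nxt
    # 2) Fit the dynamics model from that batch.
    model = fit_linear_model(data)
    # 3) Control the plant, replanning with the model at every step.
    temp = 34.0
    for _ in range(n_steps):
        temp = step(temp, plan(model, temp), rng)
    return model, temp


if __name__ == "__main__":
    model, final_temp = run_demo()
    print("learned (a, b, c):", [round(w, 3) for w in model])
    print("final temperature:", round(final_temp, 2))
```

In this toy setting the controller interacts with the plant only during the initial data-collection phase; all trial and error afterwards happens inside the fitted model. That is the sense in which model-based methods are usually described as more sample efficient than model-free ones such as PPO.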