Cite this article: | WEN Jian-wei, ZHANG Li, DUAN Yan-duo, LI Lei-xiao. Model-based reinforcement learning for active ventilated tiles control in data centers[J]. Control Theory & Applications (控制理论与应用), 2022, 39(6): 1051-1056. |
|
Model-based reinforcement learning for active ventilated tiles control in data centers |
Received: 2021-07-27 Revised: 2022-03-23 |
DOI 10.7641/CTA.2021.10682 |
2022,39(6):1051-1056 |
Keywords data center; active ventilation tile; reinforcement learning; performance evaluation |
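Funding Supported by the National Natural Science Foundation of China (61862048), the Inner Mongolia Autonomous Region Science and Technology Major Project (2019ZD015, 2019ZD016), the Inner Mongolia Autonomous Region Key Technology Research Program (2019GG273, 2020GG0094), and the Inner Mongolia Autonomous Region Special Fund for the Transformation of Scientific and Technological Achievements (2020CG0073, 2021CG0033). |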
|
Abstract |
Eliminating local hotspots is one of the key problems facing the data center industry. This work uses
active ventilation tiles (AVTs) to suppress local rack hotspots. The AVT control problem is formulated as
a Markov decision process (MDP), and an optimal control policy based on deep reinforcement learning (DRL)
is designed. The proposed approach adopts the model-based reinforcement learning (MBRL) paradigm, which
overcomes the low sample efficiency of traditional model-free DRL methods. Extensive simulation results
show that, compared with the classical model-free proximal policy optimization (PPO) algorithm, the
proposed method converges to the optimal control policy more quickly and effectively suppresses rack
hotspots. |
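The model-based idea summarized in the abstract can be illustrated with a minimal sketch. It is not the authors' algorithm or data center model: the one-rack linear plant, its coefficients, the 25 °C hotspot limit, the reward weights, and all function names are assumptions chosen only for illustration. The sketch collects a few transitions, fits a dynamics model by least squares, and then picks the tile opening by planning with the learned model instead of learning a policy from many real interactions.

```python
import random

# Hypothetical toy plant (an assumption, not the paper's model):
# one rack inlet temperature T (deg C), one tile opening a in [0, 1].
A_TRUE, B_TRUE, C_TRUE = 0.8, -3.0, 6.0   # T' = A*T + B*a + C
LIMIT = 25.0                               # assumed hotspot threshold


def plant_step(temp, opening):
    """True (normally unknown) thermal dynamics of the toy plant."""
    return A_TRUE * temp + B_TRUE * opening + C_TRUE


def reward(temp, opening, fan_cost=0.1):
    """Penalise temperature above the limit plus a small airflow cost."""
    return -max(0.0, temp - LIMIT) - fan_cost * opening


def collect(n, seed=0):
    """Gather (T, a, T') transitions with random tile openings."""
    rng = random.Random(seed)
    data, temp = [], 30.0
    for _ in range(n):
        a = rng.random()
        nxt = plant_step(temp, a)
        data.append((temp, a, nxt))
        temp = nxt
    return data


def fit_linear_model(data):
    """Least-squares fit of T' ~ w0*T + w1*a + w2 via the normal equations."""
    rows = [(t, a, 1.0) for t, a, _ in data]
    ys = [nxt for _, _, nxt in data]
    m = [[sum(r[i] * r[j] for r in rows) for j in range(3)]
         + [sum(r[i] * y for r, y in zip(rows, ys))] for i in range(3)]
    for col in range(3):  # Gauss-Jordan elimination with partial pivoting
        piv = max(range(col, 3), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        for r in range(3):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [x - f * y for x, y in zip(m[r], m[col])]
    return [m[i][3] / m[i][i] for i in range(3)]


def plan(model, temp, horizon=5, grid=21):
    """Pick the constant opening with the best predicted return under the model."""
    w0, w1, w2 = model
    best_a, best_ret = 0.0, float("-inf")
    for k in range(grid):
        a = k / (grid - 1)
        t, ret = temp, 0.0
        for _ in range(horizon):
            t = w0 * t + w1 * a + w2      # roll out the learned model, not the plant
            ret += reward(t, a)
        if ret > best_ret:
            best_a, best_ret = a, ret
    return best_a


def run_controller(model, temp=30.0, steps=30):
    """Closed loop: re-plan at every step, apply the action to the true plant."""
    for _ in range(steps):
        temp = plant_step(temp, plan(model, temp))
    return temp


if __name__ == "__main__":
    learned = fit_linear_model(collect(200))
    print("learned model:", learned)
    print("final temperature:", run_controller(learned))
```

Here a grid search over constant openings stands in for the planner, and a linear least-squares fit stands in for a neural dynamics model; both are simplifications picked to keep the example dependency-free.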