The University of Southampton
University of Southampton Institutional Repository

Intra- and inter-server smart task scheduling for profit and energy optimization of HPC data centers

Ashraf Mamun, Sayed, Gilday, Alexander, Singh, Amit Kumar, Ganguly, Amlan, Merrett, Geoff, Wang, Xiaohang and Al-Hashimi, Bashir (2020) Intra- and inter-server smart task scheduling for profit and energy optimization of HPC data centers. Journal of Low Power Electronics and Applications, 10 (4), [32]. (doi:10.3390/jlpea10040032).

Record type: Article

Abstract

Servers in a data center are underutilized due to over-provisioning, which contributes heavily toward the high power consumption of data centers. Recent research on optimizing the energy consumption of High Performance Computing (HPC) data centers mostly focuses on consolidation of Virtual Machines (VMs) and on dynamic voltage and frequency scaling (DVFS). These approaches are inherently hardware-based, are frequently unique to individual systems, and often use simulation due to lack of access to HPC data centers. Other approaches require profiling information on the jobs in the HPC system to be available before run-time. In this paper, we propose a reinforcement-learning-based approach that jointly optimizes profit and energy in the allocation of jobs to available resources, without the need for such prior information. The approach is implemented in a software scheduler used to allocate real applications from the Princeton Application Repository for Shared-Memory Computers (PARSEC) benchmark suite to a number of hardware nodes realized with Odroid-XU3 boards. Experiments show that the proposed approach increases the profit earned by 40% while simultaneously reducing energy consumption by 20% when compared to a heuristic-based approach. We also present a network-aware server consolidation algorithm for HPC data centers, called Bandwidth-Constrained Consolidation (BCC), which can address the under-utilization problem of the servers. Our experiments show that the BCC consolidation technique can reduce the power consumption of a data center by up to 37%.

Text
Intra- and Inter-Server Smart Task - Version of Record
Available under License Creative Commons Attribution.

More information

Submitted date: 13 August 2020
Accepted/In Press date: 29 September 2020
Published date: 14 October 2020
Keywords: data centers, energy consumption, high performance computing, machine learning, reinforcement learning, resource allocation, server consolidation

Identifiers

Local EPrints ID: 444813
URI: http://eprints.soton.ac.uk/id/eprint/444813
ISSN: 2079-9268
PURE UUID: ea938dc4-1a17-4114-8a96-607593068dc0
ORCID for Geoff Merrett: orcid.org/0000-0003-4980-3894

Catalogue record

Date deposited: 05 Nov 2020 17:32
Last modified: 05 Nov 2020 17:32

Contributors

Author: Sayed Ashraf Mamun
Author: Alexander Gilday
Author: Amit Kumar Singh
Author: Amlan Ganguly
Author: Geoff Merrett
Author: Xiaohang Wang
Author: Bashir Al-Hashimi
