A Value-Oriented Job Scheduling Approach for Power-Constrained and Oversubscribed HPC Systems

Academic Article

Abstract

  • © 1990-2012 IEEE. In this article, we investigate the limitations of traditional value-based algorithms for a power-constrained HPC system and evaluate their impact on HPC productivity. We expose the trade-off between allocating the system-wide power budget uniformly and allocating it greedily under different system-wide power constraints in an oversubscribed system. We experimentally demonstrate that, under the tightest power constraint, the mean productivity of the greedy allocation is 38 percent higher than that of the uniform allocation, whereas under the intermediate power constraint, the mean productivity of the uniform allocation is 6 percent higher than that of the greedy allocation. We then propose a new algorithm that adapts its behavior to deliver the combined benefits of the two allocation strategies. We design a methodology with online retraining capability to create application-specific power-execution time models for a class of HPC applications. The power-aware algorithms use these models to predict the execution time of an application on the resources available at the time a scheduling decision is made. We evaluate the proposed algorithm in emulation and simulation environments, and show that our adaptive strategy improves HPC resource utilization while delivering a mean productivity that is almost the same as that of the best-performing algorithm across various system-wide power constraints.
  • Authors: Kumbhare N; Marathe A; Akoglu A; Siegel HJ; Abdulla G; Hariri S
  • Start Page: 1419
  • End Page: 1433
  • Volume: 31
  • Issue: 6
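
The abstract contrasts uniform and greedy allocation of a system-wide power budget. The sketch below is only an illustration of that contrast, not the article's algorithms: the `Job` fields (`value`, `p_min`, `p_max`), the rule that a job cannot run below `p_min`, and both function names are assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass
class Job:
    name: str
    value: float  # hypothetical value earned if the job runs
    p_min: float  # hypothetical minimum power (W) the job needs to run
    p_max: float  # hypothetical power (W) beyond which extra power is unused


def uniform_allocation(jobs, budget):
    """Split the budget equally among all jobs (illustrative assumption).

    Each share is capped at p_max; a job whose share is below p_min gets 0.
    """
    if not jobs:
        return {}
    share = budget / len(jobs)
    return {
        j.name: (min(share, j.p_max) if share >= j.p_min else 0.0)
        for j in jobs
    }


def greedy_allocation(jobs, budget):
    """Serve jobs in decreasing value order (illustrative assumption).

    Each job gets up to p_max from what remains; if the remainder is
    below its p_min, the job gets 0 and the next job is considered.
    """
    alloc = {}
    remaining = budget
    for j in sorted(jobs, key=lambda job: job.value, reverse=True):
        grant = min(j.p_max, remaining)
        if grant < j.p_min:
            grant = 0.0
        alloc[j.name] = grant
        remaining -= grant
    return alloc


# Hypothetical workload: three jobs with identical power limits.
jobs = [
    Job("a", value=10.0, p_min=50.0, p_max=100.0),
    Job("b", value=5.0, p_min=50.0, p_max=100.0),
    Job("c", value=1.0, p_min=50.0, p_max=100.0),
]

tight_uniform = uniform_allocation(jobs, budget=120.0)
tight_greedy = greedy_allocation(jobs, budget=120.0)
loose_uniform = uniform_allocation(jobs, budget=240.0)
loose_greedy = greedy_allocation(jobs, budget=240.0)
```

Under these made-up numbers, the tight budget leaves every uniform share below `p_min`, while the greedy rule still powers the highest-value job. With the looser budget, the uniform split runs all three jobs and the greedy rule starves the last one. This mirrors the direction of the trade-off the abstract reports, without reproducing its measurements.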