Performance and energy aware scheduling simulator for HPC: evaluating different resource selection methods

Published: 10 December 2015

Abstract

In today's energy-aware society, job scheduling is becoming an important task for computer engineers and system analysts, one that may lead to a performance-per-watt trade-off in computing infrastructures. New algorithms and a simulator of computing environments can therefore help information and communications technology and data center managers make decisions on a solid experimental basis. Several simulators address performance and, to some extent, estimate energy consumption, but none bases its energy model on benchmark data countersigned by an independent body such as the Standard Performance Evaluation Corporation (SPEC). For this reason, we have implemented a performance and energy-aware scheduling (PEAS) simulator for high-performance computing. To evaluate the simulator, we propose an implementation of the non-dominated sorting genetic algorithm II (NSGA-II), a fast and elitist multiobjective genetic algorithm, for resource selection. Using the PEAS simulator, we have studied whether an intelligent job allocation policy can save energy and time without compromising performance. The results of our simulations show a great improvement in response time and power consumption. In most cases, NSGA-II performs better than other 'intelligent' algorithms such as multiobjective heterogeneous earliest finish time, and it clearly outperforms the first-fit algorithm. We demonstrate the usefulness of the simulator for this type of study and conclude that the superior behavior of multiobjective algorithms makes them a recommended choice for modern scheduling systems. Copyright © 2015 John Wiley & Sons, Ltd.
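The abstract compares allocations on two objectives at once, response time and energy. As a minimal sketch of the idea underlying NSGA-II's non-dominated sorting (not the authors' implementation), the Python below tests Pareto dominance and extracts the first non-dominated front. The function names, the `(response_time, energy)` encoding, and the candidate scores are all hypothetical and chosen only for illustration.

```python
from typing import List, Tuple

# Hypothetical encoding: each candidate allocation is scored as
# (response_time_seconds, energy_joules); lower is better for both.
Objectives = Tuple[float, float]


def dominates(a: Objectives, b: Objectives) -> bool:
    """True if a is no worse than b in every objective and strictly better in one."""
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better


def non_dominated_front(points: List[Objectives]) -> List[Objectives]:
    """Return the points that no other point dominates (the first Pareto front)."""
    # A point never dominates itself, so no self-exclusion is needed.
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Illustrative (made-up) scores for four candidate allocations.
candidates = [(120.0, 900.0), (100.0, 1100.0), (130.0, 950.0), (100.0, 1000.0)]
front = non_dominated_front(candidates)
```

Here `(100.0, 1100.0)` is dominated by `(100.0, 1000.0)` (same time, less energy) and `(130.0, 950.0)` is dominated by `(120.0, 900.0)`, so only two trade-off points remain. A full NSGA-II would repeat this ranking over successive fronts and add crowding-distance selection, crossover, and mutation, none of which is shown here.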

Published In

Concurrency and Computation: Practice & Experience, Volume 27, Issue 17
December 2015
981 pages

Publisher

John Wiley and Sons Ltd., United Kingdom

Author Tags

1. energy awareness
2. high-performance computing
3. job scheduling
4. multiobjective optimization
5. performance evaluation
6. simulator
