Virtual-machine-based emulation of future generation high-performance computing systems

Published: 01 May 2012

Abstract

This paper describes the design of a system to enable research, development, and testing of new software stacks and hardware features for future high-end computing systems. Motivating uses include both small-scale research and development on simulated individual nodes of proposed high-performance computing systems, and large scaling studies that emulate a sizeable fraction of a future supercomputing system. The proposed architecture combines system virtualization, architectural simulation, time dilation, and slack simulation to provide scalable emulation of hypothetical systems. Virtualization-based full-system measurement and monitoring tools are also included to aid in using the proposed system for co-design of high-performance computing system software and architectural features for future systems. Finally, this paper provides a description of the implementation strategy and status of the system.

Published In

International Journal of High Performance Computing Applications, Volume 26, Issue 2
May 2012
93 pages

Publisher

Sage Publications, Inc.

United States

Author Tags

  1. emulation
  2. exascale systems
  3. operating systems
  4. testbeds
  5. virtualization

Qualifiers

  • Research-article

Cited By

  • (2018) Non-clairvoyant online scheduling of synchronized jobs on virtual clusters. The Journal of Supercomputing 74:6 (2353-2384). DOI: 10.1007/s11227-018-2262-4. Online publication date: 1-Jun-2018
  • (2017) Reducing Load Imbalance of Virtual Clusters via Reconfiguration and Adaptive Job Scheduling. Proceedings of the 17th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (992-999). DOI: 10.1109/CCGRID.2017.60. Online publication date: 14-May-2017
  • (2017) Scheduling of online compute-intensive synchronized jobs on high performance virtual clusters. Journal of Computer and System Sciences 85:C (1-17). DOI: 10.1016/j.jcss.2016.10.009. Online publication date: 1-May-2017
  • (2015) High-performance emulation of heterogeneous systems using adaptive time dilation. International Journal of High Performance Computing Applications 29:2 (166-183). DOI: 10.1177/1094342014554789. Online publication date: 1-May-2015
  • (2013) Using unreliable virtual hardware to inject errors in extreme-scale systems. Proceedings of the 3rd Workshop on Fault-tolerance for HPC at extreme scale (21-26). DOI: 10.1145/2465813.2465820. Online publication date: 18-Jun-2013
