DOI: 10.1145/2491661.2481427

Hobbes: composition and virtualization as the foundations of an extreme-scale OS/R

Published: 10 June 2013

Abstract

This paper describes our vision for Hobbes, an operating system and runtime (OS/R) framework for extreme-scale systems. The Hobbes design explicitly supports application composition, which is emerging as a key approach for addressing the scalability and power concerns anticipated with coming extreme-scale architectures. We make use of virtualization technologies to provide the flexibility to support application components' requirements for different node-level operating systems and runtimes, as well as different mappings of the components onto the hardware. We describe the architecture of the Hobbes OS/R and how we will address the cross-cutting concerns of power/energy, scheduling at massive levels of parallelism, and resilience. We also outline how the "users" of the OS/R (programming models, applications, and tools) influence the design.
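The composition model sketched in the abstract — components that declare the node-level OS/R they need, with the framework mapping them onto virtualized partitions — can be illustrated with a minimal sketch. Everything below (the `Component` type, the `osr` field, the one-enclave-per-OS/R grouping policy) is a hypothetical illustration, not the Hobbes interface:

```python
# Hypothetical illustration (not the Hobbes API): each application component
# declares the node-level OS/R it requires, and a composition step groups the
# components into virtualized enclaves, one enclave per required OS/R.
from collections import defaultdict
from dataclasses import dataclass

@dataclass(frozen=True)
class Component:
    name: str    # e.g. a simulation or an in-situ analytics kernel
    osr: str     # node-level OS/R this component requires (assumed label)
    nodes: int   # number of nodes requested for the component

def compose(components):
    """Group components by required OS/R; each group becomes one enclave."""
    enclaves = defaultdict(list)
    for c in components:
        enclaves[c.osr].append(c)
    return dict(enclaves)

# A composed application: a large simulation on a lightweight kernel,
# with coupled analytics and visualization components on a Linux stack.
app = [
    Component("simulation", osr="lightweight-kernel", nodes=1024),
    Component("analytics",  osr="linux",              nodes=64),
    Component("viz",        osr="linux",              nodes=16),
]
enclaves = compose(app)
```

The point of the sketch is only the separation of concerns: components state requirements, and the OS/R layer — here a trivial grouping function — decides the mapping onto hardware partitions.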




Published In

ROSS '13: Proceedings of the 3rd International Workshop on Runtime and Operating Systems for Supercomputers
June 2013
75 pages
ISBN:9781450321464
DOI:10.1145/2491661

Publisher

Association for Computing Machinery

New York, NY, United States


Author Tags

  1. application composition
  2. operating system
  3. supercomputing
  4. virtualization

Qualifiers

  • Research-article

Conference

ICS'13

Acceptance Rates

ROSS '13 Paper Acceptance Rate: 9 of 18 submissions, 50%
Overall Acceptance Rate: 58 of 169 submissions, 34%

Article Metrics

  • Downloads (Last 12 months): 6
  • Downloads (Last 6 weeks): 0
Reflects downloads up to 19 Nov 2024

Cited By

  • (2021) A Performance-Stable NUMA Management Scheme for Linux-Based HPC Systems. IEEE Access, 9:52987-53002. DOI: 10.1109/ACCESS.2021.3069991. Online publication date: 2021.
  • (2020) Priority research directions for in situ data management: Enabling scientific discovery from diverse data sources. The International Journal of High Performance Computing Applications. DOI: 10.1177/1094342020913628. Online publication date: 27-Mar-2020.
  • (2019) A New Age: An Overview of Multi-kernels. In Operating Systems for Supercomputers and High Performance Computing, pages 223-226. DOI: 10.1007/978-981-13-6624-6_13. Online publication date: 16-Oct-2019.
  • (2018) Performance & Energy Tradeoffs for Dependent Distributed Applications Under System-wide Power Caps. Proceedings of the 47th International Conference on Parallel Processing, pages 1-11. DOI: 10.1145/3225058.3225098. Online publication date: 13-Aug-2018.
  • (2018) PicoDriver. Proceedings of the 27th International Symposium on High-Performance Parallel and Distributed Computing, pages 2-13. DOI: 10.1145/3208040.3208060. Online publication date: 11-Jun-2018.
  • (2018) Performance and Scalability of Lightweight Multi-kernel Based Operating Systems. 2018 IEEE International Parallel and Distributed Processing Symposium (IPDPS), pages 116-125. DOI: 10.1109/IPDPS.2018.00022. Online publication date: May 2018.
  • (2018) Non-clairvoyant online scheduling of synchronized jobs on virtual clusters. The Journal of Supercomputing, 74(6):2353-2384. DOI: 10.1007/s11227-018-2262-4. Online publication date: 1-Jun-2018.
  • (2017) Toward Full Specialization of the HPC Software Stack. Proceedings of the 7th International Workshop on Runtime and Operating Systems for Supercomputers (ROSS 2017), pages 1-8. DOI: 10.1145/3095770.3095777. Online publication date: 27-Jun-2017.
  • (2017) UNITY. Proceedings of the 7th International Workshop on Runtime and Operating Systems for Supercomputers (ROSS 2017), pages 1-8. DOI: 10.1145/3095770.3095776. Online publication date: 27-Jun-2017.
  • (2017) Enabling Diverse Software Stacks on Supercomputers Using High Performance Virtual Clusters. 2017 IEEE International Conference on Cluster Computing (CLUSTER), pages 310-321. DOI: 10.1109/CLUSTER.2017.92. Online publication date: Sep 2017.
