Abstract
The German research project FFMK aims to build a new HPC operating system platform that addresses hardware and software challenges posed by future exascale systems. These challenges include massively increased parallelism (e.g., nodes and cores), performance variability, and most likely higher failure rates due to significantly increased component counts. We also expect more complex applications and the need to manage system resources more dynamically than on contemporary HPC platforms, which assign resources to applications statically. The project combines and adapts existing system-software building blocks that have already matured and proven themselves in other areas. At the lowest level, the architecture is based on a microkernel that provides an extremely lightweight and fast execution environment and leaves as many resources as possible to applications. An instance of the microkernel controls each compute node, complemented by a virtualized Linux kernel that provides device drivers, compatibility with existing HPC infrastructure, and rich support for programming models and HPC runtimes such as MPI. Above the level of individual nodes, the system architecture includes distributed performance and health monitoring services as well as fault-tolerant information dissemination algorithms that enable failure handling and dynamic load management. This chapter gives an overview of the overall architecture of the FFMK operating system platform. The focus, however, is on the microkernel and how it integrates with Linux to form a multi-kernel operating system architecture.
Notes
1. Deutsche Forschungsgemeinschaft (DFG).
References
Andersen, E. (2010). μClibc. https://uclibc.org.
Barak, A., Drezner, Z., Levy, E., Lieber, M., & Shiloh, A. (2015). Resilient gossip algorithms for collecting online management information in exascale clusters. Concurrency and Computation: Practice and Experience, 27(17), 4797–4818.
Beckman, P. et al. (2015). Argo: An exascale operating system. http://www.argo-osr.org/. Accessed 20 Nov 2015.
Döbel, B., & Härtig, H. (2014). Can we put concurrency back into redundant multithreading? Proceedings of the 14th International Conference on Embedded Software, EMSOFT 2014 (pp. 19:1–19:10). USA: ACM.
Döbel, B., Härtig, H., & Engel, M. (2012). Operating system support for redundant multithreading. Proceedings of the Tenth ACM International Conference on Embedded Software EMSOFT 2012 (pp. 83–92). USA: ACM.
FFMK. FFMK Project Website. https://ffmk.tudos.org. Accessed 01 Feb 2018.
Fu, H., Liao, J., Yang, J., Wang, L., Song, Z., Huang, X., et al. (2016). The Sunway TaihuLight supercomputer: system and applications. Science China Information Sciences, 59(7), 072001.
Gerofi, B., Takagi, M., Hori, A., Nakamura, G., Shirasawa, T., & Ishikawa, Y. (2016). On the scalability, performance isolation and device driver transparency of the IHK/McKernel hybrid lightweight kernel. 2016 IEEE International Parallel and Distributed Processing Symposium (IPDPS) (pp. 1041–1050).
Graham, R. L., Woodall, T. S., & Squyres, J. M. (2005). Open MPI: A flexible high performance MPI. Proceedings, 6th Annual International Conference on Parallel Processing and Applied Mathematics. Poland: Poznan.
Härtig, H., & Roitzsch, M. (2006). Ten Years of Research on L4-Based Real-Time. Proceedings of the Eighth Real-Time Linux Workshop. China: Lanzhou.
Härtig, H., Hohmuth, M., Liedtke, J., Schönberg, S., & Wolter, J. (1997). The performance of μ-kernel-based systems. SOSP 1997: Proceedings of the sixteenth ACM symposium on Operating systems principles (pp. 66–77). USA: ACM Press.
Hoefler, T., Schneider, T., & Lumsdaine, A. (2010). Characterizing the influence of system noise on large-scale applications by simulation. Proceedings of the 2010 ACM/IEEE International Conference for High Performance Computing, Networking, Storage and Analysis, SC 2010. USA: IEEE Computer Society.
Lackorzynski, A., & Warg, A. (2009). Taming subsystems: capabilities as universal resource access control in L4. IIES 2009: Proceedings of the Second Workshop on Isolation and Integration in Embedded Systems (pp. 25–30). USA: ACM.
Lackorzynski, A., Weinhold, C., & Härtig, H. (2016a). Combining predictable execution with full-featured commodity systems. Proceedings of OSPERT2016, the 12th Annual Workshop on Operating Systems Platforms for Embedded Real-Time Applications OSPERT 2016 (pp. 31–36).
Lackorzynski, A., Weinhold, C., & Härtig, H. (2016b). Decoupled: Low-effort noise-free execution on commodity system. Proceedings of the 6th International Workshop on Runtime and Operating Systems for Supercomputers ROSS 2016. USA: ACM.
Lackorzynski, A., Weinhold, C., & Härtig, H. (2017). Predictable low-latency interrupt response with general-purpose systems. Proceedings of OSPERT2017, the 13th Annual Workshop on Operating Systems Platforms for Embedded Real-Time Applications OSPERT 2017 (pp. 19–24).
Lawrence Livermore National Laboratory. The FTQ/FWQ Benchmark.
Levy, E., Barak, A., Shiloh, A., Lieber, M., Weinhold, C., & Härtig, H. (2014). Overhead of a decentralized gossip algorithm on the performance of HPC applications. Proceedings of the ROSS 2014 (pp. 10:1–10:7). New York: ACM.
Lieber, M., Grützun, V., Wolke, R., Müller, M. S., & Nagel, W. E. (2012). Highly scalable dynamic load balancing in the atmospheric modeling system COSMO-SPECS+FD4. Proceedings of the PARA 2010 (Vol. 7133, pp. 131–141). Berlin: Springer.
Liedtke, J. (1995). On micro-kernel construction. SOSP 1995: Proceedings of the fifteenth ACM symposium on Operating systems principles (pp. 237–250). USA: ACM Press.
microHPC. microHPC Project Website. https://microhpc.tudos.org. Accessed 01 Feb 2018.
mvapichweb. MVAPICH: MPI over InfiniBand. http://mvapich.cse.ohio-state.edu/. Accessed 29 Jan 2017.
Reussner, R., Sanders, P., & Larsson Träff, J. (2002). SKaMPI: a comprehensive benchmark for public benchmarking of MPI (pp. 10:55–10:65).
Seelam, S., Fong, L., Tantawi, A., Lewars, J., Divirgilio, J., & Gildea, K. (2010). Extreme scale computing: Modeling the impact of system noise in multicore clustered systems. 2010 IEEE International Symposium on Parallel Distributed Processing (IPDPS).
Singaravelu, L., Pu, C., Härtig, H., & Helmuth, C. (2006). Reducing TCB complexity for security-sensitive applications: three case studies. Proceedings of the 1st ACM SIGOPS/EuroSys European Conference on Computer Systems 2006, EuroSys 2006 (pp. 161–174). USA: ACM.
The CP2K Developers Group. Open source molecular dynamics. http://www.cp2k.org/. Accessed 20 Nov 2015.
Weinhold, C., & Härtig, H. (2011). jVPFS: adding robustness to a secure stacked file system with untrusted local storage components. Proceedings of the 2011 USENIX Conference on USENIX Annual Technical Conference, USENIXATC 2011 (p. 32). USA: USENIX Association.
Weinhold, C., & Härtig, H. (2008). VPFS: building a virtual private file system with a small trusted computing base. Proceedings of the 3rd ACM SIGOPS/EuroSys European Conference on Computer Systems 2008, Eurosys 2008 (pp. 81–93). USA: ACM.
Weinhold, C., Lackorzynski, A., Bierbaum, J., Küttler, M., Planeta, M., Härtig, H., et al. (2016). FFMK: A fast and fault-tolerant microkernel-based system for exascale computing. Software for Exascale Computing - SPPEXA 2013–2015 (Vol. 113, pp. 405–426).
XtreemFS. XtreemFS - a cloud file system. http://www.xtreemfs.org. Accessed 16 May 2018.
Acknowledgements
We would like to thank the German priority program 1648 “Software for Exascale Computing” for supporting the project FFMK (FFMK 2019), the ESF-funded project microHPC (microHPC 2019), and the cluster of excellence “Center for Advancing Electronics Dresden” (cfaed). We also acknowledge the Jülich Supercomputing Centre, the Gauss Centre for Supercomputing, and the John von Neumann Institute for Computing for providing compute time on the JUQUEEN and JURECA supercomputers. We would also like to deeply thank TU Dresden’s ZIH for allowing us bare-metal access to nodes of their Taurus system, as well as all our fellow researchers in the FFMK project for their advice, contributions, and friendly collaboration.
Copyright information
© 2019 Springer Nature Singapore Pte Ltd.
About this chapter
Cite this chapter
Weinhold, C., Lackorzynski, A., Härtig, H. (2019). FFMK: An HPC OS Based on the L4Re Microkernel. In: Gerofi, B., Ishikawa, Y., Riesen, R., Wisniewski, R.W. (eds) Operating Systems for Supercomputers and High Performance Computing. High-Performance Computing Series, vol 1. Springer, Singapore. https://doi.org/10.1007/978-981-13-6624-6_19
DOI: https://doi.org/10.1007/978-981-13-6624-6_19
Publisher Name: Springer, Singapore
Print ISBN: 978-981-13-6623-9
Online ISBN: 978-981-13-6624-6
eBook Packages: Computer Science (R0)