On the Design and Implementation of an Efficient Lock-Free Scheduler

Florian Negele, Felix Friedrich, Suwon Oh & Bernhard Egger

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 10353)

Abstract

Schedulers for symmetric multiprocessing (SMP) machines use sophisticated algorithms to schedule processes onto the available processor cores. Hardware-dependent code and the use of locks to protect shared data structures from simultaneous access lead to poor portability, difficulty in proving correctness, and a myriad of locking-related problems such as limited parallelism, deadlocks, starvation, and complications in interrupt handling. In this work we explore what can be achieved in terms of portability and simplicity in an SMP scheduler whose performance is similar to that of state-of-the-art schedulers. Because we strictly limit ourselves to lock-free data structures in the scheduler, the problems associated with locking vanish altogether. We show that employing implicit cooperative scheduling provides additional guarantees that allow novel and very efficient implementations of memory-efficient unbounded lock-free queues. Cooperative multitasking has the additional benefit of providing extensive hardware independence. It even allows the scheduler to be used as a runtime library for applications running on top of standard operating systems. In a comparison against Windows Server and Linux running on up to 64 cores, we analyze the performance of the lock-free scheduler and show that it matches or even outperforms these two state-of-the-art schedulers in a variety of benchmarks.
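
To make the notion of cooperative multitasking concrete, the following is a minimal single-threaded sketch and not the scheduler described in the paper. It assumes explicit `yield` statements as switch points, whereas the abstract speaks of implicit cooperative scheduling, and the names `worker`, `run`, and `log` are hypothetical. It illustrates only the property the abstract relies on: a task can lose the processor at well-defined points and nowhere else.

```python
from collections import deque


def worker(name, steps, log):
    """Hypothetical task: performs `steps` units of work, yielding after each."""
    for i in range(steps):
        log.append((name, i))  # one unit of work that cannot be interrupted
        yield                  # the only point at which a task switch can occur


def run(tasks):
    """Round-robin over generator-based tasks until every task has finished."""
    ready = deque(tasks)       # ready queue of runnable tasks
    while ready:
        task = ready.popleft()
        try:
            next(task)         # resume the task until its next yield
        except StopIteration:
            continue           # task finished, do not re-queue it
        ready.append(task)     # task yielded, put it at the back of the queue


log = []
run([worker("A", 2, log), worker("B", 3, log)])
print(log)
```

Because switches happen only at the yield points, code between two yields never observes a partially completed step of another task on the same processor. The sketch does not model multiple cores or lock-free queues.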

Acknowledgments

This work was supported, in part, by grant IZKSZ2_162084 from the Swiss National Science Foundation, by BK21 Plus for Pioneers in Innovative Computing (Dept. of Computer Science and Engineering, SNU) funded by the National Research Foundation (NRF) of Korea (Grant 21A20151113068), the Basic Science Research Program through NRF funded by the Ministry of Science, ICT & Future Planning (Grant NRF-2015K1A3A1A14021288), and by the Promising-Pioneering Researcher Program through Seoul National University in 2015. ICT at Seoul National University provided research facilities for this study.

Author information

Authors and Affiliations

Department of Computer Science, ETH Zürich, Zürich, Switzerland
Florian Negele & Felix Friedrich
Department of Computer Science and Engineering, Seoul National University, Seoul, Korea
Suwon Oh & Bernhard Egger

Corresponding author

Correspondence to Bernhard Egger.

Editor information

Editors and Affiliations

Google, Seattle, USA
Narayan Desai
Google, Mountain View, USA
Walfredo Cirne

About this paper

Cite this paper

Negele, F., Friedrich, F., Oh, S., Egger, B. (2017). On the Design and Implementation of an Efficient Lock-Free Scheduler. In: Desai, N., Cirne, W. (eds) Job Scheduling Strategies for Parallel Processing. JSSPP 2015, JSSPP 2016. Lecture Notes in Computer Science, vol 10353. Springer, Cham. https://doi.org/10.1007/978-3-319-61756-5_2

DOI: https://doi.org/10.1007/978-3-319-61756-5_2
Published: 12 July 2017
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-61755-8
Online ISBN: 978-3-319-61756-5
eBook Packages: Computer Science, Computer Science (R0)
