Nothing Special   »   [go: up one dir, main page]

skip to main content
10.1145/3503221.3508432acmconferencesArticle/Chapter ViewAbstractPublication PagesppoppConference Proceedingsconference-collections
research-article

Multi-queues can be state-of-the-art priority schedulers

Published: 28 March 2022 Publication History

Abstract

Designing and implementing efficient parallel priority schedulers is an active research area. An intriguing proposed design is the Multi-Queue: given n threads and mn distinct priority queues, task insertions are performed uniformly at random, while, to delete, a thread picks two queues uniformly at random, and removes the observed task of higher priority. This approach scales well, and has probabilistic rank guarantees: roughly, the rank of each task removed, relative to remaining tasks in all other queues, is O (m) in expectation. Yet, the performance of this pattern is below that of well-engineered schedulers, which eschew theoretical guarantees for practical efficiency.
We investigate whether it is possible to design and implement a Multi-Queue-based task scheduler that is both highly-efficient and has analytical guarantees. We propose a new variant called the Stealing Multi-Queue (SMQ), a cache-efficient variant of the Multi-Queue, which leverages both queue affinity---each thread has a local queue, from which tasks are usually removed; but, with some probability, threads also attempt to steal higher-priority tasks from the other queues---and task batching, that is, the processing of several tasks in a single insert / remove step. These ideas are well-known for task scheduling without priorities; our theoretical contribution is showing that, despite relaxations, this design can still provide rank guarantees, which in turn implies bounds on total work performed. We provide a general SMQ implementation which can surpass state-of-the-art schedulers such as OBIM and PMOD in terms of performance on popular graph-processing benchmarks. Notably, the performance improvement comes mainly from the superior rank guarantees provided by our scheduler, confirming that analytically-reasoned approaches can still provide performance improvements for priority task scheduling.
The full version of this paper is available in [24].

References

[1]
The artifact of this paper, which includes the source code, scripts to run the experiments, and the pre-configured docker images for reader convenience., January 2022. Available at
[2]
Maleen Abeydeera, Suvinay Subramanian, Mark C Jeffrey, Joel Emer, and Daniel Sanchez. Sam: Optimizing multithreaded cores for speculative parallelism. In 2017 26th International Conference on Parallel Architectures and Compilation Techniques (PACT), pages 64--78. IEEE, 2017.
[3]
Vitalii Aksenov, Dan Alistarh, and Janne H Korhonen. Scalable belief propagation via relaxed scheduling. Advances in Neural Information Processing Systems, 33, 2020.
[4]
Dan Alistarh, Trevor Brown, Justin Kopinsky, Jerry Z Li, and Giorgi Nadiradze. Distributionally linearizable data structures. In Proceedings of the 30th on Symposium on Parallelism in Algorithms and Architectures, pages 133--142, 2018.
[5]
Dan Alistarh, Trevor Brown, Justin Kopinsky, and Giorgi Nadiradze. Relaxed schedulers can efficiently parallelize iterative algorithms. In Proceedings of the 2018 ACM Symposium on Principles of Distributed Computing, pages 377--386, 2018.
[6]
Dan Alistarh, Justin Kopinsky, Jerry Li, and Giorgi Nadiradze. The power of choice in priority scheduling. arXiv preprint arXiv:1706.04178, 2017. In Proceedings of PODC 2017.
[7]
Dan Alistarh, Justin Kopinsky, Jerry Li, and Nir Shavit. The spraylist: A scalable relaxed priority queue. In 20th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, PPoPP 2015, San Francisco, CA, USA, 2015. ACM.
[8]
Dan Alistarh, Giorgi Nadiradze, and Nikita Koval. Efficiency guarantees for parallel incremental algorithms under relaxed schedulers. In The 31st ACM Symposium on Parallelism in Algorithms and Architectures, pages 145--154, 2019.
[9]
Dan Alistarh, Thomas Sauerwald, and Milan Vojnović. Lock-free algorithms under stochastic schedulers. In Proceedings of the 2015 ACM Symposium on Principles of Distributed Computing, pages 251--260, 2015.
[10]
Robert D Blumofe, Christopher F Joerg, Bradley C Kuszmaul, Charles E Leiserson, Keith H Randall, and Yuli Zhou. Cilk: An efficient multithreaded runtime system. ACM SigPlan Notices, 30(8):207--216, 1995.
[11]
Robert D Blumofe and Charles E Leiserson. Scheduling multithreaded computations by work stealing. Journal of the ACM (JACM), 46(5):720--748, 1999.
[12]
Laxman Dhulipala, Guy Blelloch, and Julian Shun. Julienne: A framework for parallel graph algorithms using work-efficient bucketing. In Proceedings of the 29th ACM Symposium on Parallelism in Algorithms and Architectures, pages 293--304, 2017.
[13]
Laxman Dhulipala, Guy E Blelloch, and Julian Shun. Theoretically efficient parallel graph algorithms can be fast and scalable. In Proceedings of the 30th on Symposium on Parallelism in Algorithms and Architectures, pages 393--404, 2018.
[14]
Joseph E Gonzalez, Yucheng Low, Haijie Gu, Danny Bickson, and Carlos Guestrin. Powergraph: Distributed graph-parallel computation on natural graphs. In OSDI, volume 12, page 2, 2012.
[15]
Mark C Jeffrey, Suvinay Subramanian, Cong Yan, Joel Emer, and Daniel Sanchez. A scalable architecture for ordered parallelism. In Proceedings of the 48th International Symposium on Microarchitecture, pages 228--241, 2015.
[16]
Mark C Jeffrey, Suvinay Subramanian, Cong Yan, Joel Emer, and Daniel Sanchez. Unlocking ordered parallelism with the swarm architecture. IEEE Micro, 36(3):105--117, 2016.
[17]
R. M. Karp and Y. Zhang. Parallel algorithms for backtrack search and branch-and-bound. Journal of the ACM, 40(3):765--789, 1993.
[18]
A Lenharth, D Nguyen, and K Pingali. Concurrent priority queues are not good priority schedulers. In Proceedings of the 21th International European Conference on Parallel and Distributed Computing, pages 209--221, 2015.
[19]
Yucheng Low, Joseph E Gonzalez, Aapo Kyrola, Danny Bickson, Carlos E Guestrin, and Joseph Hellerstein. Graphlab: A new framework for parallel machine learning. arXiv preprint arXiv:1408.2041, 2014.
[20]
Donald Nguyen, Andrew Lenharth, and Keshav Pingali. A lightweight infrastructure for graph analytics. In Proceedings of the Twenty-Fourth ACM Symposium on Operating Systems Principles, SOSP '13, pages 456--471, New York, NY, USA, 2013. ACM.
[21]
Donald Nguyen, Andrew Lenharth, and Keshav Pingali. A lightweight infrastructure for graph analytics. In Proceedings of the Twenty-Fourth ACM Symposium on Operating Systems Principles, SOSP '13, pages 456--471, New York, NY, USA, 2013. ACM.
[22]
Yuval Peres, Kunal Talwar, and Udi Wieder. Graphical balanced allocations and the 1 + beta-choice process. Random Struct. Algorithms, 47(4):760--775, December 2015.
[23]
Keshav Pingali, Donald Nguyen, Milind Kulkarni, Martin Burtscher, M Amber Hassaan, Rashid Kaleem, Tsung-Hsien Lee, Andrew Lenharth, Roman Manevich, Mario Méndez-Lojo, et al. The tao of parallelism in algorithms. In Proceedings of the 32nd ACM SIGPLAN conference on Programming language design and implementation, pages 12--25, 2011.
[24]
Anastasiia Postnikova, Nikita Koval, Giorgi Nadiradze, and Dan Alistarh. Multi-queues can be state-of-the-art priority schedulers. arXiv preprint arXiv:2109.00657, 2021.
[25]
Hamza Rihani, Peter Sanders, and Roman Dementiev. Brief announcement: Multiqueues: Simple relaxed concurrent priority queues. In Proceedings of the 27th ACM Symposium on Parallelism in Algorithms and Architectures, SPAA '15, pages 80--82, New York, NY, USA, 2015. ACM.
[26]
Konstantinos Sagonas and Kjell Winblad. The contention avoiding concurrent priority queue. In International Workshop on Languages and Compilers for Parallel Computing, pages 314--330. Springer, 2016.
[27]
Julian Shun and Guy E Blelloch. Ligra: a lightweight graph processing framework for shared memory. In Proceedings of the 18th ACM SIGPLAN symposium on Principles and practice of parallel programming, pages 135--146, 2013.
[28]
Martin Wimmer, Jakob Gruber, Jesper Larsson Träff, and Philippas Tsigas. The lock-free k-lsm relaxed priority queue. In ACM SIGPLAN Notices, volume 50, pages 277--278. ACM, 2015.
[29]
Serif Yesil, Azin Heidarshenas, Adam Morrison, and Josep Torrellas. Understanding priority-based scheduling of graph algorithms on a shared-memory platform. In Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis, pages 1--14, 2019.

Cited By

View all
  • (2024)When Is Parallelism Fearless and Zero-Cost with Rust?Proceedings of the 36th ACM Symposium on Parallelism in Algorithms and Architectures10.1145/3626183.3659966(27-40)Online publication date: 17-Jun-2024
  • (2024)Multi Bucket Queues: Efficient Concurrent Priority SchedulingProceedings of the 36th ACM Symposium on Parallelism in Algorithms and Architectures10.1145/3626183.3659962(113-124)Online publication date: 17-Jun-2024
  • (2024)How to Relax Instantly: Elastic Relaxation of Concurrent Data StructuresEuro-Par 2024: Parallel Processing10.1007/978-3-031-69583-4_9(119-133)Online publication date: 26-Aug-2024
  • Show More Cited By

Recommendations

Comments

Please enable JavaScript to view thecomments powered by Disqus.

Information & Contributors

Information

Published In

cover image ACM Conferences
PPoPP '22: Proceedings of the 27th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming
April 2022
495 pages
ISBN:9781450392044
DOI:10.1145/3503221
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Sponsors

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 28 March 2022

Permissions

Request permissions for this article.

Check for updates

Badges

Author Tags

  1. concurrency
  2. parallel graph processing
  3. priority scheduling
  4. relaxed algorithms

Qualifiers

  • Research-article

Funding Sources

  • European Research Council (ERC)

Conference

PPoPP '22

Acceptance Rates

Overall Acceptance Rate 230 of 1,014 submissions, 23%

Contributors

Other Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (Last 12 months)284
  • Downloads (Last 6 weeks)19
Reflects downloads up to 13 Nov 2024

Other Metrics

Citations

Cited By

View all
  • (2024)When Is Parallelism Fearless and Zero-Cost with Rust?Proceedings of the 36th ACM Symposium on Parallelism in Algorithms and Architectures10.1145/3626183.3659966(27-40)Online publication date: 17-Jun-2024
  • (2024)Multi Bucket Queues: Efficient Concurrent Priority SchedulingProceedings of the 36th ACM Symposium on Parallelism in Algorithms and Architectures10.1145/3626183.3659962(113-124)Online publication date: 17-Jun-2024
  • (2024)How to Relax Instantly: Elastic Relaxation of Concurrent Data StructuresEuro-Par 2024: Parallel Processing10.1007/978-3-031-69583-4_9(119-133)Online publication date: 26-Aug-2024
  • (2023)Conflict-free Replicated Priority Queue: Design, Verification and EvaluationProceedings of the 14th Asia-Pacific Symposium on Internetware10.1145/3609437.3609452(302-312)Online publication date: 4-Aug-2023
  • (2023)Provably-Efficient and Internally-Deterministic Parallel Union-FindProceedings of the 35th ACM Symposium on Parallelism in Algorithms and Architectures10.1145/3558481.3591082(261-271)Online publication date: 17-Jun-2023
  • (2022)A scalable architecture for reprioritizing ordered parallelismProceedings of the 49th Annual International Symposium on Computer Architecture10.1145/3470496.3527387(437-453)Online publication date: 18-Jun-2022

View Options

Get Access

Login options

View options

PDF

View or Download as a PDF file.

PDF

eReader

View online with eReader.

eReader

Media

Figures

Other

Tables

Share

Share

Share this Publication link

Share on social media