Nothing Special   »   [go: up one dir, main page]

skip to main content
10.1145/2933057.2933087acmconferencesArticle/Chapter ViewAbstractPublication PagespodcConference Proceedingsconference-collections
research-article

Recoverable Mutual Exclusion: [Extended Abstract]

Published: 25 July 2016 Publication History

Abstract

Mutex locks have traditionally been the most common mechanism for protecting shared data structures in parallel programs. However, the robustness of such locks against process failures has not been studied thoroughly. Most (user-level) mutex algorithms are designed around the assumption that processes are reliable, meaning that a process may not fail while executing the lock acquisition and release code, or while inside the critical section.
If such a failure does occur, then the liveness properties of a conventional mutex lock may cease to hold until the application or operating system intervenes by cleaning up the internal structure of the lock. For example, a process that is attempting to acquire an otherwise starvation-free mutex may be blocked forever waiting for a failed process to release the critical section. Adding to the difficulty, if the failed process recovers and attempts to acquire the same mutex again without appropriate cleanup, then the mutex may become corrupted to the point where it loses safety, notably the mutual exclusion property. We address this challenge by formalizing the problem of recoverable mutual exclusion, and proposing several solutions that vary both in their assumptions regarding hardware support for synchronization, and in their time complexity. Compared to known solutions, our algorithms are more robust as they do not restrict where or when a process may crash, and provide stricter guarantees in terms of time complexity, which we define in terms of remote memory references.

References

[1]
Y. Afek, D. S. Greenberg, M. Merritt, and G. Taubenfeld. Computing with faulty shared memory. In Proc. of the 11th ACM Symposium on Principles of Distributed Computing (PODC), pages 47--58, 1992.
[2]
J. Anderson and Y.-J. Kim. A new fast-path mechanism for mutual exclusion. Distributed Computing, 14(1):17--29, 2001.
[3]
J. Anderson and Y.-J. Kim. An improved lower bound for the time complexity of mutual exclusion. Distributed Computing, 15(4):221--253, 2002.
[4]
J. Anderson, Y.-J. Kim, and T. Herman. Shared-memory mutual exclusion: Major research trends since 1986. Distributed Computing, 16(2--3):75--110, 2003.
[5]
T. Anderson. The performance of spin lock alternatives for shared-memory multiprocessors.break IEEE Transactions on Parallel and Distributed Systems, 1(1):6--16, 1990.
[6]
H. Attiya, D. Hendler, and P. Woelfel. Tight RMR lower bounds for mutual exclusion and other problems. In Proc. of the 40th ACM Symposium on Theory of Computing (STOC), pages 217--226, 2008.
[7]
M. A. Bender and S. Gilbert. Mutual Exclusion with O(łog2 łog n) Amortized Work. In Proc. of the 52nd Symposium on Foundations of Computer Science (FOCS), pages 728--737, 2011.
[8]
P. Bohannon, D. F. Lieuwen, and A. Silberschatz. Recovering scalable spin locks. In Proc. of the 8th IEEE Symposium on Parallel and Distributed Processing (SPDP), pages 314--322, 1996.
[9]
P. Bohannon, D. F. Lieuwen, A. Silberschatz, S. Sudarshan, and J. Gava. Recoverable user-level mutual exclusion. In Proc. of the 7th IEEE Symposium on Parallel and Distributed Processing (SPDP), pages 293--301, 1995.
[10]
R. Cypher. The communication requirements of mutual exclusion. In Proc. of the 7th ACM Symposium on Parallel Algorithms and Architectures (SPAA), pages 147--156, 1995.
[11]
E. W. Dijkstra. Solution of a problem in concurrent programming control. Communications of the ACM, 8(9):569, 1965.
[12]
E. W. Dijkstra. Self-stabilizing systems in spite of distributed control. Communications of the ACM, 17(11):643--644, 1974.
[13]
R. Fan and N. Lynch. An Ω(n łog n) lower bound on the cost of mutual exclusion. In Proc. of the 25th ACM Symposium on Principles of Distributed Computing (PODC), pages 275--284, 2006.
[14]
G. Giakkoupis and P. Woelfel. A tight RMR lower bound for randomized mutual exclusion. In Proc. of the 44th Symposium on Theory of Computing (STOC), pages 983--1002, 2012.
[15]
G. Giakkoupis and P. Woelfel. Randomized Mutual Exclusion with Constant Amortized RMR Complexity on the DSM. In Proc. of the 55th Symposium on Foundations of Computer Science (FOCS), pages 504--513, 2014.
[16]
W. Golab, V. Hadzilacos, D. Hendler, and P. Woelfel. RMR-efficient implementations of comparison primitives using read and write operations. Distributed Computing, 25(2):109--162, 2012.
[17]
G. Graunke and S. Thakkar. Synchronization algorithms for shared-memory multiprocessors. IEEE Computer, 23(6):60--69, 1990.
[18]
J. Gray and A. Reuter. Transaction Processing: Concepts and Techniques. Morgan Kaufmann, 1993.
[19]
D. Hendler and P. Woelfel. Randomized mutual exclusion with sub-logarithmic RMR-complexity. Distributed Computing, 24(1):3--19, 2011.
[20]
M. Herlihy. Wait-free synchronization. ACM Transactions on Programming Languages and Systems, 13(1):124--149, 1991.
[21]
J.-H. Hoepman, M. Papatriantafilou, and P. Tsigas. Self-stabilization of wait-free shared memory objects. In Proc. of the 9th International Workshop on Distributed Algorithms (WDAG), pages 273--287, 1995.
[22]
Intel Corporation. Single-chip cloud computer. http://www.intel.com/content/dam/www/public/us/en/documents/technology-briefs/intel-labs-single-chip-cloud-overview-paper.pdf.
[23]
P. Jayanti, T. Chandra, and S. Toueg. Fault-tolerant wait-free shared objects. In Proc. of the 33rd Symposium on Foundations of Computer Science (FOCS), pages 157--166, 1992.
[24]
C. Johnen and L. Higham. Fault-tolerant implementations of regular registers by safe registers with applications to networks. In Proc. of 10th International Conference of Distributed Computing and Networking (ICDCN), pages 337--348, 2009.
[25]
J. Kessels. Arbitration without common modifiable variables. Acta Informatica, 17:135--141, 1982.
[26]
L. Lamport. A new solution of Dijkstra's concurrent programming problem. Communications of the ACM, 17(8):453--455, 1974.
[27]
L. Lamport. The mutual exclusion problem: part I -- a theory of interprocess communication. Journal of the ACM, 33(2):313--326, 1986.
[28]
L. Lamport. The mutual exclusion problem: part II -- statement and solutions. Journal of the ACM, 33(2):327--348, 1986.
[29]
L. Lamport. A fast mutual exclusion algorithm. ACM Transactions on Computer Systems, 5(1):1--11, 1987.
[30]
P. Magnusson, A. Landin, and E. Hagersten. Queue locks on cache coherent multiprocessors. In Proc. of the 8th International Parallel Processing Symposium (IPPS), pages 165--171, 1994.
[31]
J. Mellor-Crummey and M. Scott. Algorithms for scalable synchronization on shared-memory multiprocessors. ACM Transactions on Computer Systems, 9(1):21--65, 1991.
[32]
M. Michael and Y. Kim. Fault tolerant mutual exclusion locks for shared memory systems. US Patent 7,493,618, 2009.
[33]
D. Narayanan and O. Hodson. Whole-system persistence. In Proc. of the 17th International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS), pages 401--410, 2012.
[34]
A. Ramaraju. RGLock: Recoverable mutual exclusion for non-volatile main memory systems. Master's thesis, University of Waterloo, 2015. https://uwspace.uwaterloo.ca/handle/10012/9473.
[35]
M. Raynal. Algorithms for Mutual Exclusion. MIT Press, 1986.
[36]
M. Scott and W. Scherer. Scalable queue-based spin locks with timeout. In Proc. of the 8th ACM SIGPLAN symposium on Principles and Practices of Parallel Programming (PPoPP), pages 44--52, 2001.
[37]
G. Taubenfeld. Synchronization Algorithms and Concurrent Programming. Prentice Hall, 2006.
[38]
J.-H. Yang and J. Anderson. A fast, scalable mutual exclusion algorithm. Distributed Computing, 9(1):51--60, 1995.

Cited By

View all
  • (2024)Determining Recoverable Consensus NumbersProceedings of the 43rd ACM Symposium on Principles of Distributed Computing10.1145/3662158.3662775(3-13)Online publication date: 17-Jun-2024
  • (2024)An efficient failure-resilient mutual exclusion algorithm for distributed systems leveraging a novel zero-message overlay structureComputer Communications10.1016/j.comcom.2024.02.007218(1-13)Online publication date: Mar-2024
  • (2023)Brief Announcement: On Solving Recoverable Mutual Exclusion Under System-Wide FailuresProceedings of the 35th ACM Symposium on Parallelism in Algorithms and Architectures10.1145/3558481.3591317(287-290)Online publication date: 17-Jun-2023
  • Show More Cited By

Recommendations

Comments

Please enable JavaScript to view thecomments powered by Disqus.

Information & Contributors

Information

Published In

cover image ACM Conferences
PODC '16: Proceedings of the 2016 ACM Symposium on Principles of Distributed Computing
July 2016
508 pages
ISBN:9781450339643
DOI:10.1145/2933057
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Sponsors

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 25 July 2016

Permissions

Request permissions for this article.

Check for updates

Author Tags

  1. concurrency
  2. fault tolerance
  3. multi-core algorithms
  4. mutual exclusion
  5. non-volatile main memory
  6. persistent data structures
  7. recovery
  8. shared memory

Qualifiers

  • Research-article

Funding Sources

  • Natural Sciences and Engineering Research Council (NSERC) of Canada

Conference

PODC '16
Sponsor:

Acceptance Rates

PODC '16 Paper Acceptance Rate 40 of 149 submissions, 27%;
Overall Acceptance Rate 740 of 2,477 submissions, 30%

Contributors

Other Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (Last 12 months)31
  • Downloads (Last 6 weeks)7
Reflects downloads up to 12 Nov 2024

Other Metrics

Citations

Cited By

View all
  • (2024)Determining Recoverable Consensus NumbersProceedings of the 43rd ACM Symposium on Principles of Distributed Computing10.1145/3662158.3662775(3-13)Online publication date: 17-Jun-2024
  • (2024)An efficient failure-resilient mutual exclusion algorithm for distributed systems leveraging a novel zero-message overlay structureComputer Communications10.1016/j.comcom.2024.02.007218(1-13)Online publication date: Mar-2024
  • (2023)Brief Announcement: On Solving Recoverable Mutual Exclusion Under System-Wide FailuresProceedings of the 35th ACM Symposium on Parallelism in Algorithms and Architectures10.1145/3558481.3591317(287-290)Online publication date: 17-Jun-2023
  • (2023)Constant RMR System-wide Failure Resilient Durable Locks with Dynamic JoiningProceedings of the 35th ACM Symposium on Parallelism in Algorithms and Architectures10.1145/3558481.3591100(227-237)Online publication date: 17-Jun-2023
  • (2023)Abortable Recoverable Mutual ExclusionRecoverable Mutual Exclusion10.1007/978-3-031-20002-1_9(93-106)Online publication date: 18-Apr-2023
  • (2023)Adaptive AlgorithmsRecoverable Mutual Exclusion10.1007/978-3-031-20002-1_7(63-85)Online publication date: 18-Apr-2023
  • (2023)Sublogarithmic AlgorithmsRecoverable Mutual Exclusion10.1007/978-3-031-20002-1_6(45-61)Online publication date: 18-Apr-2023
  • (2023)Load and Store Based AlgorithmsRecoverable Mutual Exclusion10.1007/978-3-031-20002-1_5(39-44)Online publication date: 18-Apr-2023
  • (2023)Problem FormulationRecoverable Mutual Exclusion10.1007/978-3-031-20002-1_4(17-37)Online publication date: 18-Apr-2023
  • (2023)Prior WorkRecoverable Mutual Exclusion10.1007/978-3-031-20002-1_3(11-15)Online publication date: 18-Apr-2023
  • Show More Cited By

View Options

Get Access

Login options

View options

PDF

View or Download as a PDF file.

PDF

eReader

View online with eReader.

eReader

Media

Figures

Other

Tables

Share

Share

Share this Publication link

Share on social media