Nonblocking synchronization and system design

January 1999

Author: Michael Barry Greenwald
Adviser: David R. Cheriton

Publisher:

Stanford University
408 Panama Mall, Suite 217
Stanford
CA
United States

ISBN: 978-0-599-61500-7

Order Number: AAI9958109

Pages: 241


Abstract

Non-blocking synchronization (NBS) has significant advantages over blocking synchronization: the same code can run on uniprocessors, in asynchronous handlers, and on shared-memory multiprocessors. NBS is deadlock-free, aids fault-tolerance, eliminates interference between synchronization and the scheduler, and can increase total system throughput. These advantages are becoming even more important with the increased use of parallelism and multiprocessors, and as the cost of a delay increases relative to processor speed. This thesis demonstrates that non-blocking synchronization is practical as the sole coordination mechanism in systems in three ways: by showing that careful design and implementation of operating system software makes implementing efficient non-blocking synchronization far easier, by demonstrating that DCAS (Double-Compare-and-Swap) is the necessary and sufficient primitive for implementing NBS, and by demonstrating that efficient hardware DCAS is practical for RISC processors. This thesis presents non-blocking implementations of common data structures sufficient to implement an operating system kernel. These outperform all non-blocking implementations of the same data structures and are comparable to spin-locks under no contention. They exploit properties of well-designed systems and depend on DCAS. I present an O(n) non-blocking implementation of CASn with extensions that support multi-objects; a contention-reduction technique based on DCAS that is fault-tolerant and OS-independent yet performs as well as the best previously published techniques; and two implementations of dynamic software transactional memory (STM) that support multi-object updates and have O(w) overhead cost (for w writes in an update) in the absence of preemption.
Finally, I demonstrate that the proposed OS implementation of DCAS is inefficient, and present a design of an efficient hardware implementation of DCAS that is specific to the R4000 processor; however, the observations that make the implementation practical are generally applicable. In short, the incremental cost of adding binary atomic synchronization primitives is very low, given that designers have already implemented unary atomic synchronization primitives.
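
For readers unfamiliar with the primitive, the semantics usually meant by DCAS can be sketched as follows. This is only an illustrative model, not code from the thesis: the `Cell` class, the `dcas` signature, and the `transfer` helper are all hypothetical names, and the global lock merely stands in for the atomicity that hardware DCAS would provide (so the sketch itself is blocking).

```python
import threading


class Cell:
    """One mutable memory word (hypothetical helper for this sketch)."""

    def __init__(self, value):
        self.value = value


# Assumption: a single global lock models the all-or-nothing behavior of a
# hardware DCAS instruction. Real DCAS would need no lock at all.
_atomic = threading.Lock()


def dcas(a, b, old_a, old_b, new_a, new_b):
    """If a holds old_a AND b holds old_b, store new_a and new_b.

    Returns True when both words were updated, False when neither was.
    """
    with _atomic:
        if a.value == old_a and b.value == old_b:
            a.value = new_a
            b.value = new_b
            return True
        return False


def transfer(src, dst, amount):
    """Move `amount` from src to dst using the usual read/compute/retry loop."""
    while True:
        s, d = src.value, dst.value      # snapshot both words
        if s < amount:
            return False                 # not enough to move
        if dcas(src, dst, s, d, s - amount, d + amount):
            return True                  # both words changed together
        # Another thread changed a word in between: retry with fresh values.
```

The `transfer` helper shows the typical usage pattern: read both locations, compute the new values, and retry if either location changed before the update was applied.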


Contributors

David Ross Cheriton
Stanford University
Michael Barry Greenwald
University of Pennsylvania


