Nothing Special   »   [go: up one dir, main page]

skip to main content
10.1145/1294261.1294267acmconferencesArticle/Chapter ViewAbstractPublication PagessospConference Proceedingsconference-collections
Article

Zyzzyva: speculative byzantine fault tolerance

Published: 14 October 2007 Publication History

Abstract

We present Zyzzyva, a protocol that uses speculation to reduce the cost and simplify the design of Byzantine fault tolerant state machine replication. In Zyzzyva, replicas respond to a client's request without first running an expensive three-phase commit protocol to reach agreement on the order in which the request must be processed. Instead, they optimistically adopt the order proposed by the primary and respond immediately to the client. Replicas can thus become temporarily inconsistent with one another, but clients detect inconsistencies, help correct replicas converge on a single total ordering of requests, and only rely on responses that are consistent with this total order. This approach allows Zyzzyva to reduce replication overheads to near their theoretical minimal.

Supplementary Material

JPG File (1294267.jpg)
index.html (index.html)
Slides from the presentation
ZIP File (p45-slides.zip)
Supplemental material for Zyzzyva: speculative byzantine fault tolerance
Audio only (1294267.mp3)
Video (1294267.mp4)

References

[1]
OpenSSL. http://www.openssl.org/.
[2]
US secret service report on insider attacks. http://www.sei.cmu.edu/about/press/insider-2005.html.
[3]
M. Abd-El-Malek, G. Ganger, G. Goodson, M. Reiter, and J. Wylie. Fault-scalable byzantine fault-tolerant services. In Proc. SOSP, Oct. 2005.
[4]
A. Avizienis and L. Chen. On the implementation of n-version programming for software fault tolerance during execution. In Proc. IEEE COMPSAC, pages 149--155, Nov. 1977.
[5]
W. Bartlett and L. Spainhower. Commercial fault tolerance: A tale of two systems. IEEE TODSC, 1(1):87--96, 2004.
[6]
L. Bassham and W. Polk. Threat assessment of malicious code and human threats. Technical report, NIST, Computer Security Division, 1994.
[7]
M. Bellare and D. Micciancio. A new paradigm for collision-free hashing: Incrementally at reduced cost. In EUROCRYPT97, 1997.
[8]
M. Castro and B. Liskov. Practical byzantine fault tolerance. In Proc. OSDI, February 1999.
[9]
M. Castro and B. Liskov. Practical Byzantine fault tolerance and proactive recovery. ACM TOCS, Nov. 2002.
[10]
J. Cowling, D. Myers, B. Liskov, R. Rodrigues, and L. Shrira. HQ replication: A hybrid quorum protocol for Byzantine fault tolerance. In Proc. OSDI, Nov. 2006.
[11]
P. Dutta, R. Guerraoui, and M. Vukolić. Best-case complexity of asynchronous Byzantine consensus. Technical Report EPFL/IC/200499, EPFL, Feb. 2005.
[12]
C. Dwork, N. Lynch, and L. Stockmeyer. Consensus in the presence of partial synchrony. J. ACM, 1988.
[13]
M. Herlihy and J. Wing. Linearizability: A correctness condition for concurrent objects. ACM Trans. Prog. Lang. Sys., 12(3), 1990.
[14]
S. King and P. Chen. Backtracking intrusions. In Proc. SOSP, 2003.
[15]
J. C. Knight and N. G. Leveson. An experimental evaluation of the assumption of independence in multi-version programming. Software Engineering, 12(1):96--109, Jan. 1986.
[16]
R. Kotla, L. Alvisi, and M. Dahlin. SafeStore: A Durable and Practical Storage System. In USENIX Annual Technical Conference, Monterey, CA, June 2007.
[17]
R. Kotla, L. Alvisi, M. Dahlin, A. Clement, and E. Wong. Zyzzyva: Speculative byzantine fault tolerance. University of Texas at Austin, Technical Report: UTCS-TR-07-40, 2007.
[18]
R. Kotla and M. Dahlin. High-throughput byzantine fault tolerance. In DSN, June 2004.
[19]
L. Lamport. Time, clocks, and the ordering of events in a distributed system. Comm. ACM, 21(7):558--565, 1978.
[20]
L. Lamport. The part-time parliament. ACM TOCS, 16(2), 1998.
[21]
L. Lamport. Lower bounds for asynchronous consensus. In Proc. FUDICO, pages 22--23, June 2003.
[22]
J. Li and D. Mazières. Beyond one-third faulty replicas in Byzantine fault tolerant services. In NSDI, 2007.
[23]
B. Liskov, S. Ghemawat, R. Gruber, P. Johnson, and L. Shrira. Replication in the harp file system. In Proc. SOSP, 1991.
[24]
D. Malkhi and M. Reiter. Byzantine quorum systems. Distributed Computing, 11(4), 1998.
[25]
J.-P. Martin and L. Alvisi. Fast Byzantine consensus. IEEE TODSC, 3(3):202--215, July 2006.
[26]
S. Microsystems. NFS: Network file system protocol specification. InternetRFC1094, 1989.
[27]
S. S. Mukherjee, J. S. Emer, and S. K. Reinhardt. The soft error problem: An architectural perspective. In HPCA, 2005.
[28]
E. Nightingale, K. Veeraraghavan, P. Chen, and J. Flinn. Rethink the sync. In Proc. OSDI, 2006.
[29]
E. B. Nightingale, P. Chen, and J. Flinn. Speculative execution in a distributed file system. In Proc. SOSP, Oct. 2005.
[30]
D. Openheimer, A. Ganapathi, and D. Patterson. Why do internet systems fail, and what can be done about it. In Proc. USITS, Seattle,WA, March 2003.
[31]
M. Pease, R. Shostak, and L. Lamport. Reaching agreement in the presence of faults. Journal of the ACM, 27(2), April 1980.
[32]
V. Prabhakaran, L. Bairavasundaram, N. Agrawal, H. G. A. Arpaci-Dusseau, and R. Arpaci-Dusseau. IRON file systems. In Proc. SOSP, 2005.
[33]
R. Rodrigues, M. Castro, and B. Liskov. BASE : Using abstraction to improve fault tolerance. In Proc. SOSP, October 2001.
[34]
D. S. Santry, M. J. Feeley, N. C. Hutchinson, A. C. Veitch, R. W. Carton, and J. Ofir. Deciding when to forget in the Elephant file system. In Proc. OSDI, December 1999.
[35]
F. B. Schneider. Implementing fault-tolerant services using the state machine approach: a tutorial. ACM Computing Surveys, 22(4), 1990.
[36]
P. Shivakumar, M. Kistler, S. W. Keckler, D. Burger, and L. Alvisi. Modeling the effect of technology trends on the soft error rate of combinational logic. In Proc. DSN, 2002.
[37]
A. Singh, P. Maniatis, P. Druschel, and T. Roscoe. Conflict-free quorum-based bft protocols. Technical Report 2007-1, Max Planck Institute for Software Systems, August 2007.
[38]
M. Wiesmann, F. Pedone, A. Schiper, B. Kemme, and G. Alonso. Understanding replication in databases and distributed systems. In Proc. ICDCS, 2000.
[39]
J. Yang, C. Sar, and D. Engler. Explode: A lightweight, general system for finding serious storage system errors. In Proc. OSDI, 2006.
[40]
J. Yang, P. Twohey, D. Engler, and M. Musuvathi. Using Model Checking to Find Serious File System Errors. In Proc. OSDI, 2004.
[41]
J. Yin, J.-P. Martin, A. Venkataramani, L. Alvisi, and M. Dahlin. Separating agreement from execution for byzantine fault tolerant services. In Proc. SOSP, October 2003.

Cited By

View all

Recommendations

Comments

Please enable JavaScript to view thecomments powered by Disqus.

Information & Contributors

Information

Published In

cover image ACM Conferences
SOSP '07: Proceedings of twenty-first ACM SIGOPS symposium on Operating systems principles
October 2007
378 pages
ISBN:9781595935915
DOI:10.1145/1294261
  • cover image ACM SIGOPS Operating Systems Review
    ACM SIGOPS Operating Systems Review  Volume 41, Issue 6
    SOSP '07
    December 2007
    363 pages
    ISSN:0163-5980
    DOI:10.1145/1323293
    Issue’s Table of Contents
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Sponsors

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 14 October 2007

Permissions

Request permissions for this article.

Check for updates

Author Tags

  1. byzantine fault tolerance
  2. output commit
  3. replication
  4. speculative execution

Qualifiers

  • Article

Conference

SOSP07
Sponsor:
SOSP07: ACM SIGOPS 21st Symposium on Operating Systems Principles 2007
October 14 - 17, 2007
Washington, Stevenson, USA

Acceptance Rates

Overall Acceptance Rate 174 of 961 submissions, 18%

Upcoming Conference

SOSP '25
ACM SIGOPS 31st Symposium on Operating Systems Principles
October 13 - 16, 2025
Seoul , Republic of Korea

Contributors

Other Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (Last 12 months)111
  • Downloads (Last 6 weeks)11
Reflects downloads up to 25 Nov 2024

Other Metrics

Citations

Cited By

View all
  • (2024)IONIAProceedings of the 22nd USENIX Conference on File and Storage Technologies10.5555/3650697.3650711(225-242)Online publication date: 27-Feb-2024
  • (2024)Dynamic Byzantine Fault-Tolerant Consensus Algorithm with Supervised Feedback MechanismsMathematics10.3390/math1217264312:17(2643)Online publication date: 26-Aug-2024
  • (2024)A Survey of Consortium Blockchain and Its ApplicationsCryptography10.3390/cryptography80200128:2(12)Online publication date: 22-Mar-2024
  • (2024)DARE to Agree: Byzantine Agreement With Optimal Resilience and Adaptive CommunicationProceedings of the 43rd ACM Symposium on Principles of Distributed Computing10.1145/3662158.3662792(145-156)Online publication date: 17-Jun-2024
  • (2024)Lumiere: Making Optimal BFT for Partial Synchrony PracticalProceedings of the 43rd ACM Symposium on Principles of Distributed Computing10.1145/3662158.3662787(135-144)Online publication date: 17-Jun-2024
  • (2024)All Byzantine Agreement Problems Are ExpensiveProceedings of the 43rd ACM Symposium on Principles of Distributed Computing10.1145/3662158.3662780(157-169)Online publication date: 17-Jun-2024
  • (2024)Banyan: Fast Rotating Leader BFTProceedings of the 25th International Middleware Conference10.1145/3652892.3700788(494-507)Online publication date: 2-Dec-2024
  • (2024)TriSAS: Toward Dependable Inter-SAS Coordination with AuditabilityProceedings of the 19th ACM Asia Conference on Computer and Communications Security10.1145/3634737.3645005(399-411)Online publication date: 1-Jul-2024
  • (2024)MIS: A Multi-Identifier Management and Resolution System in the MetaverseACM Transactions on Multimedia Computing, Communications, and Applications10.1145/359764120:7(1-25)Online publication date: 27-Mar-2024
  • (2024)BitFT: An Understandable, Performant and Resource-Efficient Blockchain ConsensusIEEE Transactions on Sustainable Computing10.1109/TSUSC.2023.33414409:3(522-534)Online publication date: May-2024
  • Show More Cited By

View Options

Login options

View options

PDF

View or Download as a PDF file.

PDF

eReader

View online with eReader.

eReader

Media

Figures

Other

Tables

Share

Share

Share this Publication link

Share on social media