Article, DOI: 10.1145/1294261.1294281

Dynamo: Amazon's highly available key-value store

Published: 14 October 2007

Abstract

Reliability at massive scale is one of the biggest challenges we face at Amazon.com, one of the largest e-commerce operations in the world; even the slightest outage has significant financial consequences and impacts customer trust. The Amazon.com platform, which provides services for many web sites worldwide, is implemented on top of an infrastructure of tens of thousands of servers and network components located in many datacenters around the world. At this scale, small and large components fail continuously and the way persistent state is managed in the face of these failures drives the reliability and scalability of the software systems.
This paper presents the design and implementation of Dynamo, a highly available key-value storage system that some of Amazon's core services use to provide an "always-on" experience. To achieve this level of availability, Dynamo sacrifices consistency under certain failure scenarios. It makes extensive use of object versioning and application-assisted conflict resolution in a manner that provides a novel interface for developers to use.
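
The abstract names object versioning and application-assisted conflict resolution without showing the mechanism. The following is a minimal sketch of the idea, not the system's actual implementation. It assumes vector clocks as the versioning scheme (the scheme the full paper describes) and a shopping-cart style value that the application merges by set union. The names `Version`, `descends`, `prune`, and `merge_carts` are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Version:
    """One stored version of an object: a value plus its vector clock."""
    value: frozenset
    clock: tuple  # sorted tuple of (node, counter) pairs


def make_clock(counters):
    """Build a hashable vector clock from a {node: counter} dict."""
    return tuple(sorted(counters.items()))


def descends(a, b):
    """True if clock a has seen every event recorded in clock b."""
    seen = dict(a)
    return all(seen.get(node, 0) >= count for node, count in b)


def prune(versions):
    """Drop versions superseded by another; survivors are concurrent siblings."""
    unique = list(dict.fromkeys(versions))  # de-duplicate, keep order
    return [
        v for v in unique
        if not any(o.clock != v.clock and descends(o.clock, v.clock) for o in unique)
    ]


def merge_carts(siblings, node):
    """Application-supplied resolver (assumed): union the carts, merge the clocks."""
    merged_value = frozenset().union(*(s.value for s in siblings))
    counters = {}
    for s in siblings:
        for n, c in s.clock:
            counters[n] = max(counters.get(n, 0), c)
    counters[node] = counters.get(node, 0) + 1  # the resolving write is a new event
    return Version(merged_value, make_clock(counters))


# Two clients update the same cart concurrently through different nodes.
base = Version(frozenset({"book"}), make_clock({"A": 1}))
left = Version(frozenset({"book", "pen"}), make_clock({"A": 2}))
right = Version(frozenset({"book", "lamp"}), make_clock({"A": 1, "B": 1}))

siblings = prune([base, left, right])   # base is superseded; left and right conflict
merged = merge_carts(siblings, node="A")

print(sorted(merged.value))  # ['book', 'lamp', 'pen']
print(merged.clock)          # (('A', 3), ('B', 1))
```

In this sketch the store only detects that `left` and `right` are concurrent; deciding how to combine them is left to the application. A union merge never loses an added item, although an item removed on one branch can reappear after the merge.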

Published In

SOSP '07: Proceedings of twenty-first ACM SIGOPS symposium on Operating systems principles
October 2007
378 pages
ISBN: 9781595935915
DOI: 10.1145/1294261

ACM SIGOPS Operating Systems Review, Volume 41, Issue 6
SOSP '07
December 2007
363 pages
ISSN: 0163-5980
DOI: 10.1145/1323293

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. performance
  2. reliability
  3. scalability

Qualifiers

  • Article

Conference

SOSP07: ACM SIGOPS 21st Symposium on Operating Systems Principles 2007
October 14 - 17, 2007
Stevenson, Washington, USA

Acceptance Rates

Overall Acceptance Rate 174 of 961 submissions, 18%
