Nothing Special   »   [go: up one dir, main page]

skip to main content
10.1145/1188455.1188582acmconferencesArticle/Chapter ViewAbstractPublication PagesscConference Proceedingsconference-collections
Article

CRUSH: controlled, scalable, decentralized placement of replicated data

Published: 11 November 2006 Publication History

Abstract

Emerging large-scale distributed storage systems are faced with the task of distributing petabytes of data among tens or hundreds of thousands of storage devices. Such systems must evenly distribute data and workload to efficiently utilize available resources and maximize system performance, while facilitating system growth and managing hardware failures. We have developed CRUSH, a scalable pseudorandom data distribution function designed for distributed object-based storage systems that efficiently maps data objects to storage devices without relying on a central directory. Because large systems are inherently dynamic, CRUSH is designed to facilitate the addition and removal of storage while minimizing unnecessary data movement. The algorithm accommodates a wide variety of data replication and reliability mechanisms and distributes data in terms of user-defined policies that enforce separation of replicas across failure domains.

References

[1]
Anderson, E., Hall, J., Hartline, J., Hobbs, M., Karlin, A. R., Saia, J., Swaminathan, R., and Wilkes, J. 2001. An experimental study of data migration algorithms. In Proceedings of the 5th International Workshop on Algorithm Engineering, Springer-Verlag, London, UK, 145--158.]]
[2]
Anderson, E., Hobbs, M., Keeton, K., Spence, S., Uysal, M., and Veitch, A. 2002. Hippodrome: running circles around storage administration. In Proceedings of the 2002 Conference on File and Storage Technologies (FAST).]]
[3]
Azagury, A., Dreizin, V., Factor, M., Henis, E., Naor, D., Rinetzky, N., Rodeh, O., Satran, J., Tavory, A., and Yerushalmi, L. 2003. Towards an object store. In Proceedings of the 20th IEEE / 11th NASA Goddard Conference on Mass Storage Systems and Technologies, 165--176.]]
[4]
Braam, P. J. 2004. The Lustre storage architecture. http://www.lustre.org/documentation.html, Cluster File Systems, Inc., Aug.]]
[5]
Brinkmann, A., Salzwedel, K., and Scheideler, C. 2000. Efficient, distributed data placement strategies for storage area networks. In Proceedings of the 12th ACM Symposium on Parallel Algorithms and Architectures (SPAA), ACM Press, 119--128. Extended Abstract.]]
[6]
Choy, D. M., Fagin, R., and Stockmeyer, L. 1996. Efficiently extendible mappings for balanced data distribution. Algorithmica 16, 215--232.]]
[7]
Ghemawat, S., Gobioff, H., and Leung, S.-T. 2003. The Google file system. In Proceedings of the 19th ACM Symposium on Operating Systems Principles (SOSP '03), ACM.]]
[8]
Gobioff, H., Gibson, G., and Tygar, D. 1997. Security for network attached storage devices. Tech. Rep. TR CMU-CS-97-185, Carniege Mellon, Oct.]]
[9]
Goel, A., Shahabi, C., Yao, D. S.-Y., and Zimmerman, R. 2002. SCADDAR: An efficient randomized technique to reorganize continuous media blocks. In Proceedings of the 18th International Conference on Data Engineering (ICDE '02), 473--482.]]
[10]
Granville, A. 1993. On elementary proofs of the Prime Number Theorem for Arithmetic Progressions, without characters. In Proceedings of the 1993 Amalfi Conference on Analytic Number Theory, 157--194.]]
[11]
Honicky, R. J., and Miller, E. L. 2004. Replication under scalable hashing: A family of algorithms for scalable decentralized data distribution. In Proceedings of the 18th International Parallel & Distributed Processing Symposium (IPDPS 2004), IEEE.]]
[12]
Jenkins, R. J., 1997. Hash functions for hash table lookup. http://burtleburtle.net/bob/hash/evahash.html.]]
[13]
Karger, D., Lehman, E., Leighton, T., Levine, M., Lewin, D., and Panigrahy, R. 1997. Consistent hashing and random trees: Distributed caching protocols for relieving hot spots on the World Wide Web. In ACM Symposium on Theory of Computing, 654--663.]]
[14]
Lumb, C. R., Ganger, G. R., and Golding, R. 2004. D-SPTF: Decentralized request distribution in brick-based storage systems. In Proceedings of the 11th International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS), 37--47.]]
[15]
Nagle, D., Serenyi, D., and Matthews, A. 2004. The Panasas ActiveScale storage cluster---delivering scalable high bandwidth storage. In Proceedings of the 2004 ACM/IEEE Conference on Supercomputing (SC '04).]]
[16]
Rodeh, O., and Teperman, A. 2003. zFS---a scalable distributed file system using object disks. In Proceedings of the 20th IEEE / 11th NASA Goddard Conference on Mass Storage Systems and Technologies, 207--218.]]
[17]
Saito, Y., Frølund, S., Veitch, A., Merchant, A., and Spence, S. 2004. FAB: Building distributed enterprise disk arrays from commodity components. In Proceedings of the 11th International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS), 48--58.]]
[18]
Santos, J. R., Muntz, R. R., and Ribeiro-Neto, B. 2000. Comparing random data allocation and data striping in multimedia servers. In Proceedings of the 2000 SIGMETRICS Conference on Measurement and Modeling of Computer Systems, ACM Press, Santa Clara, CA, 44--55.]]
[19]
Schmuck, F., and Haskin, R. 2002. GPFS: A shared-disk file system for large computing clusters. In Proceedings of the 2002 Conference on File and Storage Technologies (FAST), USENIX, 231--244.]]
[20]
Tang, H., Gulbeden, A., Zhou, J., Strathearn, W., Yang, T., and Chu, L. 2004. A self-organizing storage cluster for parallel data-intensive applications. In Proceedings of the 2004 ACM/IEEE Conference on Super-computing (SC '04).]]
[21]
Weil, S. A., Brandt, S. A., Miller, E. L., Long, D. D. E., and Maltzahn, C. 2006. Ceph: A scalable, high-performance distributed file system. In Proceedings of the 7th Symposium on Operating Systems Design and Implementation (OSDI).]]
[22]
Xin, Q., Miller, E. L., and Schwarz, T. J. E. 2004. Evaluation of distributed recovery in large-scale storage systems. In Proceedings of the 13th IEEE International Symposium on High Performance Distributed Computing (HPDC), 172--181.]]

Cited By

View all
  • (2024)Olsync: Object-level tiering and coordination in tiered storage systems based on software-defined networkFuture Generation Computer Systems10.1016/j.future.2024.107521(107521)Online publication date: Sep-2024
  • (2024)FlimmFuture Generation Computer Systems10.1016/j.future.2024.05.052160:C(140-153)Online publication date: 1-Nov-2024
  • (2023)The Open-source DeLiBA2 Hardware/Software Framework for Distributed Storage AcceleratorsACM Transactions on Reconfigurable Technology and Systems10.1145/362448217:2(1-32)Online publication date: 14-Sep-2023
  • Show More Cited By

Recommendations

Comments

Please enable JavaScript to view thecomments powered by Disqus.

Information & Contributors

Information

Published In

cover image ACM Conferences
SC '06: Proceedings of the 2006 ACM/IEEE conference on Supercomputing
November 2006
746 pages
ISBN:0769527000
DOI:10.1145/1188455
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Sponsors

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 11 November 2006

Permissions

Request permissions for this article.

Check for updates

Qualifiers

  • Article

Conference

SC '06
Sponsor:

Acceptance Rates

SC '06 Paper Acceptance Rate 54 of 239 submissions, 23%;
Overall Acceptance Rate 1,516 of 6,373 submissions, 24%

Upcoming Conference

Contributors

Other Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (Last 12 months)60
  • Downloads (Last 6 weeks)6
Reflects downloads up to 09 Nov 2024

Other Metrics

Citations

Cited By

View all
  • (2024)Olsync: Object-level tiering and coordination in tiered storage systems based on software-defined networkFuture Generation Computer Systems10.1016/j.future.2024.107521(107521)Online publication date: Sep-2024
  • (2024)FlimmFuture Generation Computer Systems10.1016/j.future.2024.05.052160:C(140-153)Online publication date: 1-Nov-2024
  • (2023)The Open-source DeLiBA2 Hardware/Software Framework for Distributed Storage AcceleratorsACM Transactions on Reconfigurable Technology and Systems10.1145/362448217:2(1-32)Online publication date: 14-Sep-2023
  • (2023)Unifying Token- and Span-level Supervisions for Few-shot Sequence LabelingACM Transactions on Information Systems10.1145/361040342:1(1-27)Online publication date: 21-Aug-2023
  • (2023)Where Are the (Cellular) Data?ACM Computing Surveys10.1145/361040256:2(1-25)Online publication date: 15-Sep-2023
  • (2023)Forensic Examination of CephDigital Threats: Research and Practice10.1145/36098624:3(1-18)Online publication date: 6-Oct-2023
  • (2023)Service Caching and Computation Reuse Strategies at the Edge: A SurveyACM Computing Surveys10.1145/360950456:2(1-38)Online publication date: 20-Jul-2023
  • (2023)LibCOS: Enabling Converged HPC and Cloud Data Stores with MPIProceedings of the International Conference on High Performance Computing in Asia-Pacific Region10.1145/3578178.3578236(106-116)Online publication date: 27-Feb-2023
  • (2023)Provably Good Randomized Strategies for Data Placement in Distributed Key-Value StoresProceedings of the 28th ACM SIGPLAN Annual Symposium on Principles and Practice of Parallel Programming10.1145/3572848.3577501(27-38)Online publication date: 25-Feb-2023
  • (2023)Near-optimal Lower Bounds on Quantifier Depth and Weisfeiler–Leman Refinement StepsJournal of the ACM10.1145/319525770:5(1-32)Online publication date: 11-Oct-2023
  • Show More Cited By

View Options

Get Access

Login options

View options

PDF

View or Download as a PDF file.

PDF

eReader

View online with eReader.

eReader

HTML Format

View this article in HTML Format.

HTML Format

Media

Figures

Other

Tables

Share

Share

Share this Publication link

Share on social media