Nothing Special   »   [go: up one dir, main page]

Skip to main content
Log in

Self-repairing codes

Local repairability for cheap and fast maintenance of erasure coded data

  • Published:
Computing Aims and scope Submit manuscript

Abstract

Networked distributed data storage systems are essential to deal with the needs of storing massive volumes of data. Dependability of such a system relies on its fault tolerance (data should be available in case of node failures) as well as its maintainability (its ability to repair lost data to ensure redundancy replenishment over time). Erasure codes provide a storage efficient alternative to replication based redundancy in storage systems, ensuring the same fault tolerance at a lower storage overhead cost. Traditional erasure codes however have the drawback of entailing high communication overhead for maintenance, when encoded fragments are lost due to storage device failures, and need to be replenished in new nodes. We propose a new family of erasure codes called self-repairing codes (SRC) taking into account the peculiarities of distributed storage systems, specifically to improve its maintainability by ‘localizing’ the repairs. SRC have the property that encoded fragments can be repaired directly from other small subsets of (typically 2 or 3) encoded fragments. These code properties allow bandwidth efficient and fast recovery even in the presence of multiple failures, in turn translating into better system robustness. A concrete family of such locally repairable codes, namely, homomorphic SRC are proposed and various aspects and properties of the same are studied in detail and compared—quantitatively or qualitatively (as may be suitable) with respect to other codes including traditional erasure codes as well as some recent representative codes designed specifically for storage applications.

This is a preview of subscription content, log in via an institution to check access.

Access this article

Subscribe and save

Springer+ Basic
$34.99 /Month
  • Get 10 units per month
  • Download Article/Chapter or eBook
  • 1 Unit = 1 Article or 1 Chapter
  • Cancel anytime
Subscribe now

Buy Now

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

Fig. 1
Fig. 2
Fig. 3

Similar content being viewed by others

Explore related subjects

Discover the latest articles, news and stories from top researchers in related subjects.

Notes

  1. In this paper, we use the terms ‘fragment’ and ‘block’ interchangeably. Depending on the context, the term ‘data’ is used to mean either fragment(s) or object(s).

  2. Linearized polynomials are also called additive polynomials.

  3. \(\rho _x\) is zero for \(x<k\) for \(\mathrm{HSRC}\) also.

  4. Note than in fact, often fewer than \(k\) fragments will be adequate to reconstruct a specific fragment.

  5. A storage system’s vulnerability to further failures, as well as spiky bandwidth usage are known problems of lazy repair strategies [11].

References

  1. Buttyán L, Czap L, Vajda I (2011) Detection and recovery from pollution attacks in coding-based distributed storage schemes. In: IEEE transactions on dependable and secure computing, vol 8, issue no 6

  2. Bhagwan R, Tati K, Cheng Y, Savage S, Voelker G (2004) Total recall: system support for automated availability management. In: Networked systems design and implementation (NSDI), 2004

  3. http://www.cleversafe.org/dispersed-storage/configurations. Accessed 11 Aug 2013

  4. Dimakis AG, Brighten Godfrey P, Wu Y, Wainwright MO, Ramchandran K (2010) Network coding for distributed storage systems. IEEE Trans Inf Theory 56(9):4539–4551

  5. Datta A, Aberer K (2006) Internet-scale storage systems under churn—a study of the steady-state using Markov models. In: Peer-to-peer computing (P2P)

  6. Duminuco A, Biersack E (2008) Hierarchical codes: how to make erasure codes attractive for peer-to-peer storage systems. In: Peer-to-peer computing (P2P)

  7. Duminuco A, Biersack EW (2009) A practical study of regenerating codes for peer-to-peer backup systems. In: International conference on distributed computing systems (ICDCS)

  8. El Rouayheb S, Ramchandran K (2010) Fractional repetition codes for repair in distributed storage systems. In: Allerton conference on communication control and computing

  9. Gaonkar S, Keeton K, Merchant A, Sanders WH (2010) Designing dependable storage solutions for shared application environments. IEEE Trans Dependable Secur Comput 7(4)

  10. Kermarrec A-M, Le Scouarnec N, Straub G (2010) Beyong regenerating codes. In: Technical report, 10 Sept 2010

  11. Liu X, Datta A (2009) Redundancy maintenance and garbage collection strategies in peer-to-peer storage systems. In: International symposium on stabilization, safety, and security of distributed systems (SSS)

  12. Gopalan P, Huang C, Simitchi H, Yekhanin S (2012) On the locality of codeword symbols. IEEE Trans Inf Theory 58(11):6925–6934

    Article  Google Scholar 

  13. Grolimund D (2007) Wuala—a distributed file system. Google Tech Talk http://www.youtube.com/watch?v=3xKZ4KGkQY8. Accessed 11 Aug 2013

  14. Haytaoglu E, Dalkilic ME (2013) Homomorphic minimum bandwidth repairing codes. In: Information sciences and systems 2013. Lecture notes in electrical engineering, vol 264, pp 339–348

  15. Hollmann H (2013) Storage codes—coding rate and repair locality. In: ICNC 2013

  16. Huang C, Chen M, Li J (2007) Pyramid codes: flexible schemes to trade space for access efficiency in reliable data storage systems. In: NCA 2007

  17. Kamath GM, Prakash N, Lalitha V, Vijay Kumar P (2013) Codes with Local Regeneration. In: The proceedings of the workshop on information theory and applications (ITA), 2013. Available at arXiv:1211.1932. Accessed Nov 2012

  18. Oggier F, Datta A (2011) Self-repairing homomorphic codes for distributed storage systems. In: INFOCOM 2011

  19. Oggier F, Datta A (2011) Self-repairing codes for distributed storage—a projective geometric construction. In: ITW 2011

  20. Papailiopoulos DS, Dimakis AG (2012) Locally repairable codes. In: ISIT 2012

  21. Rashmi KV, Shah NB, Kumar PV, Ramchandran K (2009) Explicit construction of optimal exact regenerating codes for distributed storage. In: Allerton conference on control, computing and communication 2009

  22. Rashmi KV, Shah NB, Kumar PV (2011) Optimal exact-regenerating codes for distributed storage at the MSR and MBR points via a product-matrix construction. IEEE Trans Inf Theory 57(8):5227–5239

  23. Rashmi KV, Shah NB, Kumar PV (2011) Enabling node repair in any erasure code for distributed storage. In: IEEE International Symposium on Information Theory, 2011. IEEE, New York

  24. Reed IS, Solomon G (1960) Polynomial Codes Over Certain Finite Fields. J Soc Ind Appl Math 8(2):300–304

  25. Rodrigues R, Liskov B (2005) High availability in DHTs: erasure coding vs. replication. In: Workshop on peer-to-peer systems (IPTPS)

  26. Shum KW (2011) Cooperative regenerating codes for distributed storage systems. ICC 2011

  27. Shum KW, Hu Y (2013) Cooperative regenerating codes. arXiv:1207.6762

  28. Silberstein N, Rawat AS, Koyluoglu O, Vishwanath S (2012) Optimal locally repairable codes via rank-metric codes. arXiv:1301.6331

  29. Tamo I, Wang Z, Bruck J (2011) MDS array codes with optimal rebuilding. In: IEEE ISIT 2011

Download references

Acknowledgments

This work has been supported by the MoE Tier-2 grant MOE2013-T2-1-068 “eCODE: Erasure Codes for Datacenter Environments”.

Author information

Authors and Affiliations

Authors

Corresponding author

Correspondence to Anwitaman Datta.

Rights and permissions

Reprints and permissions

About this article

Check for updates. Verify currency and authenticity via CrossMark

Cite this article

Oggier, F., Datta, A. Self-repairing codes. Computing 97, 171–201 (2015). https://doi.org/10.1007/s00607-014-0426-5

Download citation

  • Received:

  • Accepted:

  • Published:

  • Issue Date:

  • DOI: https://doi.org/10.1007/s00607-014-0426-5

Keywords

Mathematics Subject Classification

Navigation