A Non-Forced-Write Atomic Commit Protocol for Cluster File Systems

Bing-Qing Shao¹, Jun-Wei Zhang¹, Cai-Ping Zheng¹, Hao Zhang¹,², Zhen-Jun Liu¹ & Lu Xu¹

Abstract

Distributed metadata consistency is a critical issue for metadata clusters in distributed file systems. Existing methods for maintaining metadata consistency generally require several forced log writes. Because synchronous disk I/O is very inefficient, these writes greatly increase the average response time of metadata operations. This paper presents an asynchronous atomic commit protocol (ACP) named Dual-Log (DL) that requires no forced log writes. Optimized for distributed metadata operations that involve only two metadata servers, DL has the two servers record each other's redo logs, transferring them over the low-latency network. A crashed metadata server can then redo the metadata operation from the redundant redo log. Because network latency is much lower than disk I/O latency, DL significantly improves the performance of the distributed metadata service. A prototype of DL is implemented on top of a local journal. Its performance is compared with that of two widely used protocols, EP and S2PC-MP. The results show that DL reduces the average response time of distributed metadata operations by about 40% to 60%, and that recovery takes only 1 second with 10,000 uncompleted distributed metadata operations.
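
The abstract gives only the outline of DL, so the following is a toy sketch of the general idea rather than the authors' protocol: two metadata servers apply their halves of an operation in memory, each keeps a copy of the other's redo record, and a crashed server replays the records its counterpart holds. All names (`RedoRecord`, `MetadataServer`, `distributed_op`, `recover_from`) are hypothetical, and the network transfer is modeled as a plain list append.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class RedoRecord:
    """One redo entry: re-applying it sets `key` to `value` on server `owner`."""
    txn_id: int
    owner: str
    key: str
    value: str


@dataclass
class MetadataServer:
    name: str
    metadata: dict = field(default_factory=dict)  # in-memory state, lost on crash
    peer_log: list = field(default_factory=list)  # redo records kept for the counterpart

    def apply(self, rec: RedoRecord) -> None:
        # Plain assignment is idempotent, so replaying a record twice is harmless.
        self.metadata[rec.key] = rec.value

    def crash(self) -> None:
        # Assumption of this toy model: nothing was forced to disk,
        # so all volatile state disappears.
        self.metadata.clear()
        self.peer_log.clear()

    def recover_from(self, peer: "MetadataServer") -> int:
        # Replay, in transaction order, the redo records the counterpart holds for us.
        records = sorted(
            (r for r in peer.peer_log if r.owner == self.name),
            key=lambda r: r.txn_id,
        )
        for rec in records:
            self.apply(rec)
        return len(records)


def distributed_op(a, b, txn_id, update_a, update_b):
    """Apply a two-server operation; each update is a (key, value) pair."""
    rec_a = RedoRecord(txn_id, a.name, *update_a)
    rec_b = RedoRecord(txn_id, b.name, *update_b)
    a.apply(rec_a)
    b.apply(rec_b)
    # Each server stores the other's redo record (a stand-in for the
    # network transfer); no synchronous disk write happens anywhere.
    a.peer_log.append(rec_b)
    b.peer_log.append(rec_a)


mds_a = MetadataServer("mds_a")
mds_b = MetadataServer("mds_b")
distributed_op(mds_a, mds_b, 1, ("/dir", "entry f1"), ("inode f1", "nlink=1"))

mds_a.crash()                         # mds_a loses its in-memory update
replayed = mds_a.recover_from(mds_b)  # mds_b still holds mds_a's redo record
```

The sketch leaves out asynchronous log flushing, log truncation, and the case where both servers crash at once, none of which are described in the abstract.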

Author information

Authors and Affiliations

Data Storage and Management Technology Research Center, Institute of Computing Technology, Chinese Academy of Sciences, Beijing, 100190, China
Bing-Qing Shao, Jun-Wei Zhang, Cai-Ping Zheng, Hao Zhang, Zhen-Jun Liu & Lu Xu
University of Chinese Academy of Sciences, Beijing, 100049, China
Hao Zhang

Corresponding author

Correspondence to Bing-Qing Shao.

Additional information

This work was supported by the National Basic Research 973 Program of China under Grant No. 2011CB302304, the National High Technology Research and Development 863 Program of China under Grant Nos. 2011AA01A102 and 2013AA013205, the Strategic Priority Research Program of the Chinese Academy of Sciences under Grant No. XDA06010401, and the Chinese Academy of Sciences Key Deployment Project under Grant No. KGZD-EW-103-5(7).

Electronic supplementary material

Below is the link to the electronic supplementary material.

ESM 1 (DOC 28 kb)

About this article

Cite this article

Shao, BQ., Zhang, JW., Zheng, CP. et al. A Non-Forced-Write Atomic Commit Protocol for Cluster File Systems. J. Comput. Sci. Technol. 29, 303–315 (2014). https://doi.org/10.1007/s11390-014-1432-y

Received: 17 November 2013
Revised: 09 January 2014
Published: 23 March 2014
Issue Date: March 2014
DOI: https://doi.org/10.1007/s11390-014-1432-y
