Abstract
Distributed metadata consistency is a critical issue for metadata clusters in distributed file systems. Existing methods for maintaining metadata consistency generally require several forced log writes. Because synchronous disk I/O is very inefficient, these writes greatly increase the average response time of metadata operations. This paper presents an asynchronous atomic commit protocol (ACP) named Dual-Log (DL) that requires no forced log writes. Optimized for distributed metadata operations that involve only two metadata servers, DL has each server record its redo log on the counterpart metadata server, transferring it over the low-latency network. A crashed metadata server can then redo the metadata operation from the redundant redo log. Since network latency is much lower than disk I/O latency, DL can significantly improve the performance of the distributed metadata service. A prototype of DL is implemented on top of a local journal. Its performance is compared with that of two widely used protocols, EP and S2PC-MP; the results show that the average response time of distributed metadata operations is reduced by about 40% to 60%, and that recovery takes only 1 second with 10,000 uncompleted distributed metadata operations.
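The mutual redo-log idea summarized above can be illustrated with a small in-memory sketch. This is not the DL implementation: the class and function names (`RedoRecord`, `MetadataServer`, `distributed_op`), the key/value form of a metadata update, and the modelling of the network as a list append are all assumptions made for illustration. The sketch also leaves out the asynchronous local journal, log truncation, and simultaneous failure of both servers.

```python
from dataclasses import dataclass, field


@dataclass
class RedoRecord:
    op_id: int
    owner: str     # name of the server whose sub-operation this record describes
    key: str
    value: object  # None means "delete key"


@dataclass
class MetadataServer:
    name: str
    state: dict = field(default_factory=dict)      # volatile metadata, lost on crash
    peer_redo: list = field(default_factory=list)  # redo records held for the counterpart

    def apply(self, rec):
        # Apply one redo record to the in-memory metadata.
        if rec.value is None:
            self.state.pop(rec.key, None)
        else:
            self.state[rec.key] = rec.value

    def crash(self):
        # Hypothetical crash: all volatile metadata is lost.
        self.state = {}

    def recover_from(self, peer):
        # Replay, in order, the redo records the counterpart holds for this server.
        for rec in peer.peer_redo:
            if rec.owner == self.name:
                self.apply(rec)


def distributed_op(op_id, a, a_update, b, b_update):
    # Each server's redo record is first stored on the counterpart
    # (standing in for a network transfer), then applied locally.
    rec_a = RedoRecord(op_id, a.name, *a_update)
    rec_b = RedoRecord(op_id, b.name, *b_update)
    b.peer_redo.append(rec_a)
    a.peer_redo.append(rec_b)
    a.apply(rec_a)
    b.apply(rec_b)


mds1 = MetadataServer("mds1")
mds2 = MetadataServer("mds2")

# A two-server operation: a directory entry on mds1, an inode on mds2.
distributed_op(1, mds1, ("/dir/file", "inode-7"), mds2, ("inode-7", {"nlink": 1}))

mds1.crash()
mds1.recover_from(mds2)
```

In this toy model no step waits on a disk write: the only copy of `mds1`'s redo record that survives the crash is the one held in `mds2`'s memory, which is enough for `mds1` to redo its half of the operation.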
Additional information
This work was supported by the National Basic Research 973 Program of China under Grant No. 2011CB302304, the National High Technology Research and Development 863 Program of China under Grant Nos. 2011AA01A102 and 2013AA013205, the Strategic Priority Research Program of the Chinese Academy of Sciences under Grant No. XDA06010401, and the Chinese Academy of Sciences Key Deployment Project under Grant No. KGZD-EW-103-5(7).
Cite this article
Shao, BQ., Zhang, JW., Zheng, CP. et al. A Non-Forced-Write Atomic Commit Protocol for Cluster File Systems. J. Comput. Sci. Technol. 29, 303–315 (2014). https://doi.org/10.1007/s11390-014-1432-y
DOI: https://doi.org/10.1007/s11390-014-1432-y