Abstract
Cluster computing has become an unavoidable part of data processing as the volume of data produced by sources such as online social media, IoT, mobile data, sensor data, and black box data grows exponentially. A distributed file system defines the methods used to distribute, read, and delete files across the nodes of a computing cluster. Popular distributed file systems such as Google File System and Hadoop Distributed File System store metadata centrally. This creates a Single Point of Failure, which gives rise to the need for backups and alternative mechanisms to recover the metadata when the metadata server fails. In addition, the name node server is built on expensive, reliable hardware, which is not cost effective for small and medium clusters. Cheap commodity hardware can take over the name node functionality, but it is prone to hardware failure. This paper proposes a novel distributed file system that distributes files over a cluster of machines connected in a Peer-to-Peer network. Its most significant feature is that the metadata itself is distributed, using distributed consensus and hash values. Although the distributed metadata is publicly visible, the methodology ensures that it is immutable and irrefutable. The proposed file system has been successfully tested on the Google Cloud Platform. The basic read, write, and delete operations on the distributed file system with distributed metadata are also compared with those of Hadoop Distributed File System, based on distribution time on the same cluster setup. In this comparison, the novel distributed file system gives better results than the existing methodologies.
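The abstract does not give the metadata format, so the following is only a minimal sketch of the general idea of tamper-evident, hash-linked metadata, not the authors' implementation. The class name `MetadataChain`, the record fields, and the payload keys (`file`, `nodes`) are hypothetical and chosen for illustration; peer-to-peer replication and the consensus step are deliberately left out.

```python
import hashlib
import json


def record_hash(index, prev_hash, payload):
    """SHA-256 over a canonical JSON encoding of one metadata record."""
    body = json.dumps(
        {"index": index, "prev_hash": prev_hash, "payload": payload},
        sort_keys=True,
    )
    return hashlib.sha256(body.encode("utf-8")).hexdigest()


class MetadataChain:
    """Hypothetical append-only list of metadata records linked by hashes."""

    GENESIS = "0" * 64  # placeholder predecessor hash for the first record

    def __init__(self):
        self.records = []

    def append(self, payload):
        """Add a metadata record that commits to the previous record's hash."""
        index = len(self.records)
        prev_hash = self.records[-1]["hash"] if self.records else self.GENESIS
        record = {
            "index": index,
            "prev_hash": prev_hash,
            "payload": payload,
            "hash": record_hash(index, prev_hash, payload),
        }
        self.records.append(record)
        return record

    def is_valid(self):
        """Recompute every hash and link; any edit to a record breaks the chain."""
        prev_hash = self.GENESIS
        for i, record in enumerate(self.records):
            if record["index"] != i or record["prev_hash"] != prev_hash:
                return False
            expected = record_hash(i, record["prev_hash"], record["payload"])
            if record["hash"] != expected:
                return False
            prev_hash = record["hash"]
        return True


if __name__ == "__main__":
    chain = MetadataChain()
    chain.append({"file": "a.txt", "nodes": ["peer1", "peer2"]})
    chain.append({"file": "b.txt", "nodes": ["peer3"]})
    print(chain.is_valid())
    # Tampering with an earlier record is detected on re-validation.
    chain.records[0]["payload"]["nodes"] = ["attacker"]
    print(chain.is_valid())
```

In a design of this kind, each peer could hold a copy of the chain and re-run the validity check before accepting new records, which is one way publicly visible metadata can still resist undetected modification.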
Data Availability
Data sharing is not applicable, as no new data were generated.
Code Availability
We used our own data and code.
Funding
No funding.
Author information
Contributions
All authors contributed to this work.
Ethics declarations
Conflict of interest
No conflicts of interest to disclose.
Human and Animal Rights
No humans or animals were involved in this research.
About this article
Cite this article
Kumar, D.S., Dija, S., Sumithra, M.D. et al. A Novel Distributed File System Using Blockchain Metadata. Wireless Pers Commun 129, 501–520 (2023). https://doi.org/10.1007/s11277-022-10108-2
DOI: https://doi.org/10.1007/s11277-022-10108-2