research-article
Open access

A Closer Look into IPFS: Accessibility, Content, and Performance

Published: 29 May 2024

Abstract

The InterPlanetary File System (IPFS) has recently gained considerable attention. While prior research has focused on characterizing its performance and application support, it remains unclear (1) what kinds of files are stored in IPFS, (2) who provides these files, (3) whether these files are always accessible, and (4) what affects file access performance.
To answer these questions, we measure and analyze over 4 million files associated with CIDs (content IDs) that appeared in publicly available IPFS datasets. Our results reveal the following key findings: (1) Mixed file accessibility: while IPFS is not designed for permanent storage, accessing a non-trivial portion of files, such as those of NFTs (non-fungible tokens) and video streams, often requires multiple retrieval attempts, potentially blocking NFT transactions and degrading the user experience. (2) Dominance of NFT and video files: about 50% of stored files are NFT-related, followed by a large portion of video files, about half of which are pirated movies and adult content. (3) Centralization of content providers: a small number of peers (the top 50), mostly cloud nodes hosted by tech companies, serve a large portion (95%) of files, deviating from IPFS's intended design goal. (4) High variation in download throughput and lookup time: large file retrievals experience lower average throughput because of the added overhead of resolving file chunk CIDs, and looking up files hosted by non-cloud nodes takes longer. We hope that our findings offer valuable insights for (1) IPFS application developers, who can take these characteristics into account when building applications on top of IPFS, and (2) IPFS system developers, who can use them to improve IPFS and similar systems being developed for Web3.
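To make the centralization finding concrete, the sketch below shows one way a "top-k provider share" could be computed from a mapping of CIDs to the peers that provide them. The abstract does not state the exact methodology behind the 95% figure, so the function name `top_k_share`, the input layout, and the toy data are all assumptions made purely for illustration.

```python
from collections import Counter


def top_k_share(providers_by_cid, k):
    """Return the fraction of files provided by at least one of the
    k peers that provide the most files.

    providers_by_cid: dict mapping a CID string to a list of peer IDs
    (a hypothetical layout, not the format of the paper's dataset).
    """
    if not providers_by_cid:
        return 0.0
    # Count, for each peer, how many distinct files it provides.
    counts = Counter(
        peer for peers in providers_by_cid.values() for peer in set(peers)
    )
    top_peers = {peer for peer, _ in counts.most_common(k)}
    # A file counts as served if any of its providers is a top-k peer.
    served = sum(
        1 for peers in providers_by_cid.values() if top_peers & set(peers)
    )
    return served / len(providers_by_cid)


# Toy input (illustrative only, not measured data).
toy = {
    "cid-a": ["peer-1"],
    "cid-b": ["peer-1", "peer-2"],
    "cid-c": ["peer-1"],
    "cid-d": ["peer-3"],
}
print(top_k_share(toy, 1))
```

Under this definition, a value close to 1 for a small k indicates that a handful of peers can serve nearly every file, which is the pattern finding (3) describes.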

Published In

Proceedings of the ACM on Measurement and Analysis of Computing Systems  Volume 8, Issue 2
POMACS
June 2024
344 pages
EISSN:2476-1249
DOI:10.1145/3669944
This work is licensed under a Creative Commons Attribution International 4.0 License.

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. IPFS
  2. network measurement
  3. peer-to-peer network
