Abstract
Object storage systems are a flexible class of storage in which data are organized and stored as units called objects. Because objects may hold structured, semi-structured or unstructured data, object storage is well suited to big data in cloud environments. It was first introduced for academic purposes at Carnegie Mellon University and the University of California at Berkeley. Until then, large-scale file systems were used for their search engine. The earliest instance of object storage used in an organization was EMC's Centera platform (2001). This article aims to help industry and research understand the storage systems available for supporting big data features. Our analysis of object-based storage systems rests on a classification of the storage systems and a comparative study of their architectures. It helps to distinguish the basic characteristics of object-based storage systems with respect to their implementation. Furthermore, it discusses future perspectives, research challenges and limitations.
Notes
https://cloudian.com/blog/object-storage-vs-file-storage/
https://github.com/minio/minio
https://docs.ceph.com/docs/jewel/rados/configuration/filesystem-recommendations/
https://docs.ceph.com/docs/master/rados/operations/erasure-code/
https://docs.openstack.org/swift/latest/overview_erasure_code.html
http://www.freeraidrecovery.com/library/what-is-raid.aspx
https://docs.ceph.com/docs/master/dev/rados-client-protocol/
https://www.t10.org/drafts.htm#OSD_Family
https://www.snia.org/
http://www.snia.org/cdmi
https://www.ctera.com/partners/detail/emc-alliance-partner/
https://docs.ceph.com/docs/master/radosgw/
https://www.backblaze.com/b2/cloud-storage.html
Funding
The authors declare that no funds, grants or other support were received during the preparation of this manuscript.
Author information
Contributions
All authors contributed equally to the study and documentation. The first draft of the manuscript was written by Anindita Sarkar Mondal and Madhupa Sanyal, and all authors commented on and revised previous versions of the manuscript. All authors read and approved the final manuscript.
Ethics declarations
Conflict of interest
The authors have no relevant financial or non-financial interests to disclose.
About this article
Cite this article
Mondal, A.S., Sanyal, M., Barua, H.B. et al. Comparative Analysis of Object-Based Big Data Storage Systems on Architectures and Services: A Recent Survey. J. Inst. Eng. India Ser. B 105, 685–700 (2024). https://doi.org/10.1007/s40031-023-00983-z
DOI: https://doi.org/10.1007/s40031-023-00983-z