Hierarchical data replication strategy to improve performance in cloud computing

Published: 01 April 2021

Abstract

Cloud computing environments are attracting growing interest as a new trend in data management. Data replication has been widely applied to improve data access in distributed systems such as Grid and Cloud. However, because each site has finite storage capacity, copies that would be useful for future jobs can be wastefully deleted and replaced with less valuable ones. It is therefore important to have an appropriate replication strategy that can dynamically store replicas while satisfying quality of service (QoS) requirements and storage capacity constraints. In this paper, we present a dynamic replication algorithm named hierarchical data replication strategy (HDRS). HDRS consists of three parts: replica creation, which adaptively increases the number of replicas based on an exponential growth or decay rate; replica placement, which is driven by access load and a labeling technique; and replica replacement, which is based on the expected future value of a file. We evaluate different dynamic data replication methods in CloudSim simulations. Experiments demonstrate that HDRS reduces response time and bandwidth usage compared with other algorithms. This indicates that HDRS can identify a popular file and replicate it to the best site. The method avoids useless replications and decreases access latency by balancing the load across sites.
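The abstract does not give the formulas behind replica creation, so the following is only a minimal sketch of how an exponential growth or decay rate could drive that decision. It assumes access counts are collected per fixed time interval and that the next interval's count is projected as the last count times one plus the average per-interval rate. The function names, the averaging rule, and the threshold are illustrative assumptions, not the definitions used by HDRS.

```python
from typing import Sequence


def average_growth_rate(counts: Sequence[int]) -> float:
    """Mean per-interval rate r, where counts[i + 1] = counts[i] * (1 + r_i).

    A positive result means accesses are growing, a negative one means decay.
    (Assumed model; the abstract does not state the paper's exact formula.)
    """
    rates = [
        (curr - prev) / prev
        for prev, curr in zip(counts, counts[1:])
        if prev > 0  # skip intervals with no baseline to compare against
    ]
    return sum(rates) / len(rates) if rates else 0.0


def predict_next_accesses(counts: Sequence[int]) -> float:
    """Project the next interval's access count from the most recent one."""
    if not counts:
        return 0.0
    return counts[-1] * (1.0 + average_growth_rate(counts))


def should_replicate(counts: Sequence[int], threshold: float) -> bool:
    """Hypothetical creation rule: replicate when projected demand exceeds a threshold."""
    return predict_next_accesses(counts) > threshold


if __name__ == "__main__":
    growing = [10, 20, 40]   # accesses double every interval
    decaying = [40, 20, 10]  # accesses halve every interval
    print(predict_next_accesses(growing), should_replicate(growing, 50))
    print(predict_next_accesses(decaying), should_replicate(decaying, 50))
```

Under this assumed rule, a file whose accesses are growing is projected above the threshold and becomes a candidate for a new replica, while a file whose accesses are decaying is not. The same projected value could in principle also serve as the "future value" used for replacement, but that link is a guess and is not taken from the paper.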

Cited By

  • (2024) Data Replication Methods in Cloud, Fog, and Edge Computing: A Systematic Literature Review. Wireless Personal Communications: An International Journal 135:1 (531-561). DOI: 10.1007/s11277-024-11082-7. Online publication date: 1-Mar-2024
  • (2024) File block multi-replica management technology in cloud storage. Cluster Computing 27:1 (457-476). DOI: 10.1007/s10586-022-03952-1. Online publication date: 1-Feb-2024

Information

Published In

Frontiers of Computer Science: Selected Publications from Chinese Universities, Volume 15, Issue 2
Apr 2021
190 pages
ISSN: 2095-2228
EISSN: 2095-2236

Publisher

Springer-Verlag

Berlin, Heidelberg

Publication History

Published: 01 April 2021
Accepted: 06 November 2019
Received: 20 March 2019

Author Tags

  1. cloud computing
  2. data replication
  3. multi-tier architecture
  4. simulation
  5. load balance

Qualifiers

  • Research-article
