An on-line replication strategy to increase availability in Data Grids

Published: 01 February 2008

Abstract

Data is typically replicated in a Data Grid to improve job response time and data availability. Strategies for data replication in a Data Grid have previously been proposed, but they typically assume unlimited storage for replicas. In this paper, we address the system-wide data availability problem assuming limited replica storage. We describe two new metrics to evaluate the reliability of the system, and propose an on-line optimizer algorithm that Minimizes the Data Missing Rate (MinDmr) in order to maximize data availability. Based on MinDmr, we develop four optimizers, each associated with a different file access prediction function. Simulation results obtained with OptorSim show that, as measured by the two new metrics, our MinDmr strategies achieve better overall data availability than other strategies.


Published In

Future Generation Computer Systems, Volume 24, Issue 2
February 2008
95 pages

Publisher

Elsevier Science Publishers B. V.

Netherlands

Author Tags

  1. Data Grid
  2. Data availability
  3. Data missing rate
  4. Limited storage
  5. Replica strategy

Qualifiers

  • Article

Cited By

  • (2018) Storage tier-aware replicative data reorganization with prioritization for efficient workload processing. Future Generation Computer Systems 79:P2 (618-629). DOI: 10.1016/j.future.2017.04.010. Online publication date: 1-Feb-2018
  • (2018) Evaluation of site availability exploitation towards performance optimization in data grids. Cluster Computing 21:4 (1967-1980). DOI: 10.1007/s10586-018-2836-1. Online publication date: 1-Dec-2018
  • (2018) Evaluation Through Realistic Simulations of File Replication Strategies for Large Heterogeneous Distributed Systems. Euro-Par 2018: Parallel Processing Workshops (409-420). DOI: 10.1007/978-3-030-10549-5_32. Online publication date: 27-Aug-2018
  • (2017) Dynamic Data Replication Based on Tasks scheduling for Cloud Computing Environment. International Journal of Strategic Information Technology and Applications 8:4 (40-51). DOI: 10.4018/IJSITA.2017100104. Online publication date: 1-Oct-2017
  • (2017) Data Replication in Cloud Systems. International Journal of Information Systems and Social Change 8:3 (17-33). DOI: 10.4018/IJISSC.2017070102. Online publication date: 1-Jul-2017
  • (2016) Data popularity measurements in distributed systems. Journal of Network and Computer Applications 72:C (150-161). DOI: 10.1016/j.jnca.2016.06.002. Online publication date: 1-Sep-2016
  • (2016) A comprehensive review of the data replication techniques in the cloud environments. Journal of Network and Computer Applications 64:C (229-238). DOI: 10.1016/j.jnca.2016.02.005. Online publication date: 1-Apr-2016
  • (2016) A dynamic, cost-aware, optimized data replication strategy for heterogeneous cloud data centers. Future Generation Computer Systems 65:C (10-32). DOI: 10.1016/j.future.2016.05.016. Online publication date: 1-Dec-2016
  • (2015) Data replication strategies with performance objective in data grid systems. International Journal of Grid and Utility Computing 6:1 (30-46). DOI: 10.1504/IJGUC.2015.066395. Online publication date: 1-Dec-2015
  • (2015) Dynamic replication strategies in data grid systems. The Journal of Supercomputing 71:11 (4116-4140). DOI: 10.1007/s11227-015-1508-7. Online publication date: 1-Nov-2015
