Abstract
Replication is a technique used in Data Grid environments to reduce access latency and network bandwidth utilization. It also increases data availability, thereby enhancing system reliability. This research addresses the problem of replication in Data Grid environments by investigating a set of highly decentralized dynamic replica placement algorithms. The replica placement algorithms are based on heuristics that consider both network latency and user requests to select the best candidate sites for placing replicas. Because of the dynamic nature of the Grid, the sites that currently hold replicas may not be the best sites from which to fetch replicas in subsequent periods. Therefore, a replica maintenance algorithm is proposed that relocates replicas to different sites if the performance metric degrades significantly. We study our replica placement algorithms using a model of the EU Data Grid Testbed 1 [Bell et al., Int. J. High Perform. Comput. Appl. 17(4), 2003] sites and their associated network geometry. We validate the algorithms using total file transfer times, the number of local file accesses, and the number of remote file accesses.
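The general idea of latency- and request-aware placement with threshold-triggered relocation can be illustrated with a minimal sketch. This is not the algorithm evaluated in the article: the greedy selection rule, the request-weighted latency cost, the function names, and the 20% relocation threshold are all assumptions chosen only to make the idea concrete.

```python
from typing import Dict, List

Latency = Dict[str, Dict[str, float]]  # latency[a][b]: assumed latency between sites a and b
Requests = Dict[str, int]              # requests[s]: assumed number of requests issued at site s


def total_cost(latency: Latency, requests: Requests, replicas: List[str]) -> float:
    """Request-weighted latency when every site reads from its nearest replica."""
    return sum(
        requests[site] * min(latency[site][r] for r in replicas)
        for site in requests
    )


def greedy_placement(latency: Latency, requests: Requests, k: int) -> List[str]:
    """Choose k replica sites one at a time (hypothetical greedy heuristic).

    Each round adds the candidate site that gives the lowest total cost
    when combined with the sites already chosen.
    """
    chosen: List[str] = []
    for _ in range(k):
        remaining = [s for s in sorted(requests) if s not in chosen]
        best = min(remaining, key=lambda c: total_cost(latency, requests, chosen + [c]))
        chosen.append(best)
    return chosen


def maintain(latency: Latency, requests: Requests, current: List[str],
             threshold: float = 0.2) -> List[str]:
    """Relocate replicas only if a fresh placement improves cost by more than `threshold`."""
    proposed = greedy_placement(latency, requests, len(current))
    old = total_cost(latency, requests, current)
    new = total_cost(latency, requests, proposed)
    if old > 0 and (old - new) / old > threshold:
        return proposed
    return current


# Toy three-site example; the numbers are made up for illustration.
latency = {
    "A": {"A": 0, "B": 10, "C": 20},
    "B": {"A": 10, "B": 0, "C": 10},
    "C": {"A": 20, "B": 10, "C": 0},
}
period1 = {"A": 100, "B": 10, "C": 10}  # most requests come from site A
period2 = {"A": 10, "B": 10, "C": 100}  # demand later shifts to site C

placement = greedy_placement(latency, period1, k=1)
print(placement)                              # ['A']
print(maintain(latency, period2, placement))  # ['C']
```

In the toy run, the single replica is first placed at the site that issues most requests; when demand shifts, the cost of the old placement rises enough to exceed the assumed threshold and the replica is relocated. The threshold keeps small fluctuations in demand from causing repeated relocations.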
References
Allcock, B., Bester, J., Bresnahan, J., Chervenak, A.L., Foster, I., Kesselman, C., Meder, S., Nefedova, V., Quesnel, D., Tuecke, S.: Secure, efficient data transport and replica management for high performance data-intensive computing. IEEE Mass Storage Conference (2001)
Allcock, B., Bester, J., Bresnahan, J., Chervenak, A., Foster, I., Kesselman, C., Meder, S., Nefedova, V., Quesnel, D., Tuecke, S.: Data management and transfer in high performance computational Grid environments. Parallel Comput. J. 28(5), 749–771 (2002)
Buyya, R., Abramson, D., Giddy, J.: Nimrod/G: An architecture of a resource management and scheduling system in a global computational Grid, HPC Asia 2000, May 14–17, 2000, pp 283–289, Beijing, China
Bell, W., Cameron, D.G., Capozza, L., Millar, A.P., Stockinger, K., Zini, F.: OptorSim – a Grid simulator for studying dynamic data replication strategies. Int. J. High Perform. Comput. Appl. 17(4) (2003)
Cai, M., Chervenak, A., Frank, M.: A peer-to-peer replica location service based on a distributed hash table. Proceedings of the Supercomputing Conference, pp. 56–68 (2004)
Cohon, J.L.: Multiobjective Programming and Planning. Academic, New York (1978)
Drezner, Z., Hamacher, H.W.: Facility Location Application and Theory. Springer, Berlin (2002)
Daskin, M.S.: Network and Discrete Location Models: Algorithms and Applications. Wiley, New York (1995)
Foster, I.: Internet Computing and the Emerging Grid, Nature Web Matters (2000)
Fisher, M.L.: The Lagrangian relaxation method for solving integer programming problems. Manag. Sci. 27, 1–18 (1981)
Foster, I., Kesselman, C.: Globus: A metacomputing infrastructure toolkit. Int. J. Supercomput. Appl. 11(2), 115–128 (1997)
Foster, I., Kesselman, C.: Globus: A toolkit-based Grid architecture. In: Foster, I., Kesselman, C. (eds.) The Grid: Blueprint for a New Computing Infrastructure, pp. 259–278. Morgan Kaufmann, San Mateo, CA (1999)
Foster, I., Kesselman, C., Tuecke, S.: The Anatomy of the Grid: Enabling Scalable Virtual Organizations. Int. J. Supercomput. Appl. 15(3) (2001)
Huffman, B.T., et al.: The CDF/D0 UK GridPP Project. CDF Internal Note CDF/DOC/COMP_UPG/5858, February 2002
Hakimi, S.L.: Optimum locations of switching centers and the absolute centers and medians of a graph. Oper. Res. 12, 450–459 (1964)
High Energy Physics Experiment Website, http://www.hep.net
Howes, T.A., Smith, M.: A scalable, deployable directory service framework for the internet. Technical report, Center for Information Technology Integration, University of Michigan
Kavitha, R., Foster, I.: Design and Evaluation of Replication Strategies for a High Performance Data Grid, in Computing and High Energy and Nuclear Physics 2001 (CHEP’01) Conference
Kavitha, R., Foster, I.: Decoupling Computation and Data Scheduling in Distributed Data-Intensive Applications. Proceedings of 11th. IEEE International Symposium on High Performance Distributed Computing Edinburgh, Scotland, July 2002
Kavitha, R., Iamnitchi, A., Foster, I.: Improving Data Availability through Dynamic Model Driven Replication in Large Peer-to-Peer Communities. Proceedings of Global and Peer-to-Peer Computing on Large Scale Distributed Systems Workshop, Berlin, Germany, May 2002
The GriPhyN Project, http://www.griphyn.org
Reeves, C.R. (ed.): Modern Heuristic Techniques for Combinatorial Problems. Blackwell Scientific Publications, Oxford, UK (1993)
Rahman, R.M., Barker, K., Alhajj, R.: Replica Placement Design with Static Optimality and Dynamic Maintainability, Proceedings of the IEEE/ACM International Symposium on Cluster Computing and Grid (CCGrid 06), Singapore, May, 2006
Rahman, R.M., Barker, K., Alhajj, R.: Replica Placement on Data Grid: Considering Utility and Risk. IEEE International Conference on Coding and Computing (ITCC), April, 2005
Toregas, C., Swain, R., Revelle, C., Bergman, L.: The location of emergency service facilities. Oper. Res. 19, 1363–1373 (1971)
Wesolowsky, G., Truscott, W.: The multiperiod location-allocation problem with relocation of facilities. Manag. Sci. 22, 57–64 (1975)
Cite this article
Rahman, R.M., Barker, K. & Alhajj, R. Replica Placement Strategies in Data Grid. J Grid Computing 6, 103–123 (2008). https://doi.org/10.1007/s10723-007-9090-8