Abstract
A fundamental problem of grid computing is communication overhead. One cause of this overhead is access to remotely stored data, which caching read-only data can alleviate. In grid computing, caching can be optimized further by using allocation schemes that take the contents of the caches into account. Possible ways to achieve such an allocation in a grid are the topic of this paper. The paper proposes allocation schemes that prefer resources already holding the required data in their caches. This increases the cache hit rate and, as a consequence, reduces both the average response time of jobs and the network load. Two new allocation approaches are discussed and compared with classical allocation schemes. The performance and costs of the schemes, when applied to large grids, are evaluated in a simulation environment.
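The abstract does not specify the two proposed schemes, so the following is only a minimal sketch of the general idea of preferring resources that already cache the required data. The `Resource` class, the `cached_fraction` score, the tie-breaking rule, and the file names are all assumptions made for illustration, not the authors' method.

```python
from dataclasses import dataclass, field


@dataclass
class Resource:
    """A grid node with a set of cached read-only data items (hypothetical model)."""
    name: str
    cache: set = field(default_factory=set)
    busy: bool = False


def cached_fraction(resource, required):
    """Fraction of the job's required data items already in the resource's cache."""
    if not required:
        return 0.0
    return len(resource.cache & required) / len(required)


def allocate(resources, required):
    """Pick the idle resource whose cache covers most of the required data.

    Ties (including the case where no cache holds anything) go to the
    first idle resource in list order. Returns None if none is idle.
    """
    idle = [r for r in resources if not r.busy]
    if not idle:
        return None
    # max() returns the first maximal element, so list order breaks ties.
    return max(idle, key=lambda r: cached_fraction(r, required))


# Illustrative data only: three nodes with different cache contents.
resources = [
    Resource("node-a", {"genome.db"}),
    Resource("node-b", {"genome.db", "proteins.db"}),
    Resource("node-c"),
]
chosen = allocate(resources, {"genome.db", "proteins.db"})
print(chosen.name)
```

A classical scheme would ignore `cache` and take any idle resource; here the job goes to the node that would have to fetch the least data remotely. This sketch scans every resource, which is one of the costs that would need to be weighed in a large grid.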
© 2005 IFIP International Federation for Information Processing
Cite this paper
Opitz, A., Koenig, H. (2005). Optimizing the Access to Read-Only Data in Grid Computing. In: Kutvonen, L., Alonistioti, N. (eds) Distributed Applications and Interoperable Systems. DAIS 2005. Lecture Notes in Computer Science, vol 3543. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11498094_19
DOI: https://doi.org/10.1007/11498094_19
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-26262-6
Online ISBN: 978-3-540-31582-7
eBook Packages: Computer Science, Computer Science (R0)