Abstract
The goal of Grid computing is to integrate the usage of computer resources from cooperating partners in the form of Virtual Organizations (VOs). One of its key functions is to match jobs to execution resources efficiently. For interoperability between VOs, this matching operation occurs in resource brokering middleware, commonly referred to as the meta-scheduler or meta-broker. In this paper, we present a meta-scheduler architecture that combines hierarchical and peer-to-peer models for flexibility and extensibility. Interoperability is further promoted through a set of protocols that allow meta-schedulers to maintain sessions and exchange job and resource state using Web Services. Our architecture also incorporates a resource model that enables efficient resource matching across multiple Virtual Organizations, especially where the compute resources and their state are dynamic. Experiments demonstrate these new functional features across three distributed organizations (BSC, FIU, and IBM) that internally use different job scheduling technologies, computing infrastructures, and security mechanisms. Performance evaluations through actual system measurements and simulations provide insights into the architecture’s effectiveness and scalability.
Cite this article
Rodero, I., Villegas, D., Bobroff, N. et al. Enabling Interoperability among Grid Meta-Schedulers. J Grid Computing 11, 311–336 (2013). https://doi.org/10.1007/s10723-013-9252-9
DOI: https://doi.org/10.1007/s10723-013-9252-9