Nothing Special   »   [go: up one dir, main page]

Skip to main content

Evolving Toward the Perfect Schedule: Co-scheduling Job Assignments and Data Replication in Wide-Area Systems Using a Genetic Algorithm

  • Conference paper
Job Scheduling Strategies for Parallel Processing (JSSPP 2005)

Part of the book series: Lecture Notes in Computer Science ((LNTCS,volume 3834))

Included in the following conference series:

Abstract

Traditional job schedulers for grid or cluster systems are responsible for assigning incoming jobs to compute nodes in such a way that some evaluative condition is met. Such systems generally take into consideration the availability of compute cycles, queue lengths, and expected job execution times, but they typically do not account directly for data staging and thus miss significant associated opportunities for optimisation. Intuitively, a tighter integration of job scheduling and automated data replication can yield significant advantages due to the potential for optimised, faster access to data and decreased overall execution time. In this paper we consider data placement as a first-class citizen in scheduling and use an optimisation heuristic for generating schedules. We make the following two contributions. First, we identify the necessity for co-scheduling job dispatching and data replication assignments and posit that simultaneously scheduling both is critical for achieving good makespans. Second, we show that deploying a genetic search algorithm to solve the optimal allocation problem has the potential to achieve significant speed-up results versus traditional allocation mechanisms. Through simulation, we show that our algorithm provides on average an approximately 20-45% faster makespan than greedy schedulers.

This is a preview of subscription content, log in via an institution to check access.

Access this chapter

Subscribe and save

Springer+ Basic
$34.99 /Month
  • Get 10 units per month
  • Download Article/Chapter or eBook
  • 1 Unit = 1 Article or 1 Chapter
  • Cancel anytime
Subscribe now

Buy Now

Chapter
USD 29.95
Price excludes VAT (USA)
  • Available as PDF
  • Read on any device
  • Instant download
  • Own it forever
eBook
USD 39.99
Price excludes VAT (USA)
  • Available as PDF
  • Read on any device
  • Instant download
  • Own it forever
Softcover Book
USD 54.99
Price excludes VAT (USA)
  • Compact, lightweight edition
  • Dispatched in 3 to 5 business days
  • Free shipping worldwide - see info

Tax calculation will be finalised at checkout

Purchases are for personal use only

Institutional subscriptions

Preview

Unable to display preview. Download preview PDF.

Unable to display preview. Download preview PDF.

Similar content being viewed by others

References

  1. Adamic, L.: Zipf, Power-laws, and Pareto – a ranking tutorial, http://www.hpl.hp.com/research/idl/papers/ranking/ranking.html

  2. Baeck, T., Fogel, D., Michalewicz, Z. (eds.): Evolutionary Computation 1: Basic Algorithms and Operators. Institute of Physics Publishing (2000)

    Google Scholar 

  3. Braun, T., Siegel, H., Beck, N., Boloni, L., Maheswaran, M., Reuther, A., Robertson, J., Theys, M., Yao, B., Hengsen, D., Freund, R.: A Comparison of Eleven Static Heuristics for Mapping a Class of Independent Tasks onto Heterogeneous Distributed Computing Systems. Journal of Parallel and Distributed Computing 61(6) (June 2001)

    Google Scholar 

  4. Casanova, H., Legrand, A., Zagorodnov, D., Berman, F.: Heuristics for Scheduling Parameter Sweep Applications in Grid Environments. In: Proceedings of the 9th Heterogeneous Computing Workshop (May 2000)

    Google Scholar 

  5. Chakrabarti, A., Dheepak, R.A., Sengupta, S.: Integration of Scheduling and Replication in Data Grids. In: Bougé, L., Prasanna, V.K. (eds.) HiPC 2004. LNCS, vol. 3296, pp. 375–385. Springer, Heidelberg (2004)

    Chapter  Google Scholar 

  6. Davis, L.: Job Shop Scheduling with Genetic Algorithms. In: Proceedings of the International Conference on High Performance Computing (1985)

    Google Scholar 

  7. Deelman, E., Kosar, T., Kesselman, C., Livny, M.: What Makes Workflows Work in an Opportunistic Environment? In: Concurrency and Computation: Practice and Experience (2004)

    Google Scholar 

  8. Feitelson, D.: A Survey of Scheduling in Multiprogrammed Parallel Systems. IBM Research Report RC 19790 (87657) (1994)

    Google Scholar 

  9. Feitelson, D., Rudolph, L., Schwiegelshohn, U.: Parallel Job Scheduling – A Status Report. In: Feitelson, D.G., Rudolph, L., Schwiegelshohn, U. (eds.) JSSPP 2004. LNCS, vol. 3277, pp. 1–16. Springer, Heidelberg (2005)

    Chapter  Google Scholar 

  10. The Grid Physics Project, http://www.griphyn.org

  11. Holtman, K.: CMS Requirements for the Grid. In: Proceedings of the International Conference on Computing in High Energy and Nuclear Physics (2001)

    Google Scholar 

  12. Hu, N., Li, L., Mao, Z., Steenkiste, P., Wang, J.: Locating Internet Bottlenecks: Algorithms, Measurements, and Implications. In: Proceedings of SIGCOMM (2004)

    Google Scholar 

  13. Kosar, T., Livny, M.: Stork: Making Data Placement a First Class Citizen in the Grid. In: Proceedings of IEEE International Conference on Distributed Computing Systems (2004)

    Google Scholar 

  14. Lifka, D.: The ANL/IBM SP Scheduling System. In: Feitelson, D.G., Rudolph, L. (eds.) IPPS-WS 1995 and JSSPP 1995. LNCS, vol. 949. Springer, Heidelberg (1995)

    Google Scholar 

  15. Michaelewicz, Z., Fogel, D.: How to Solve It: Modern Heuristics. Springer, Heidelberg (2000)

    Google Scholar 

  16. Mohamed, H., Epema, D.: An Evaluation of the Close-to-Files Processor and Data Co-Allocation Policy in Multiclusters. In: Proceedings of the IEEE International Conference on Cluster Computing (2004)

    Google Scholar 

  17. Mu’alem, A., Feitelson, D.: Utilization, Predictability, Workloads,and User Runtime Estimates in Scheduling the IBM SP2 with Backfilling. IEEE Transactions on Parallel and Distributed Systems (June 2001)

    Google Scholar 

  18. The Particle Physics Data Grid, http://www.ppdg.net

  19. Ranganathan, K., Foster, I.: Computation Scheduling and Data Replication Algorithms for Data Grids. In: Nabrzyski, J., Schopf, J., Weglarz, J. (eds.) Grid Resource Management: State of the Art and Future Trends. Kluwer Academic Publishers, Dordrecht (2003)

    Google Scholar 

  20. Ribeiro, V., Riedi, R., Baraniuk, R.: Locating Available Bandwidth Bottlenecks. IEEE Internet Computing (September-October 2004)

    Google Scholar 

  21. Santos-Neto, E., Cirne, W., Brasileiro, F., Lima, A.: Exploiting Replication and Data Reuse to Efficiently Schedule Data-Intensive Applications on Grids. In: Feitelson, D.G., Rudolph, L., Schwiegelshohn, U. (eds.) JSSPP 2004. LNCS, vol. 3277, pp. 210–232. Springer, Heidelberg (2005)

    Chapter  Google Scholar 

  22. Schmueli, E., Feitelson, D.: Backfilling with Lookahead to Optimize the Packing of Parallel Jobs. In: Feitelson, D.G., Rudolph, L., Schwiegelshohn, U. (eds.) JSSPP 2003. LNCS, vol. 2862, pp. 228–251. Springer, Heidelberg (2003)

    Chapter  Google Scholar 

  23. Stockinger, H., Samar, A., Allcock, B., Foster, I., Holtman, K., Tierney, B.: File and Object Replication in Data Grids. In: Proceedings of the 10th International Symposium on High Performance Distributed Computing (2001)

    Google Scholar 

  24. Thain, D., Bent, J., Arpaci-Dusseau, A., Arpaci-Dusseau, R., Livny, M.: Gathering at the Well: Creating Communities for Grid I/O. In: Proceedings of Supercomputing (2001)

    Google Scholar 

Download references

Author information

Authors and Affiliations

Authors

Editor information

Editors and Affiliations

Rights and permissions

Reprints and permissions

Copyright information

© 2005 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Phan, T., Ranganathan, K., Sion, R. (2005). Evolving Toward the Perfect Schedule: Co-scheduling Job Assignments and Data Replication in Wide-Area Systems Using a Genetic Algorithm. In: Feitelson, D., Frachtenberg, E., Rudolph, L., Schwiegelshohn, U. (eds) Job Scheduling Strategies for Parallel Processing. JSSPP 2005. Lecture Notes in Computer Science, vol 3834. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11605300_9

Download citation

  • DOI: https://doi.org/10.1007/11605300_9

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-31024-2

  • Online ISBN: 978-3-540-31617-6

  • eBook Packages: Computer ScienceComputer Science (R0)

Publish with us

Policies and ethics