Abstract
BlueGene/L is a massively parallel cellular architecture system with a toroidal interconnect. Cellular architectures with a toroidal interconnect are effective at producing highly scalable computing systems, but typically require job partitions to be both rectangular and contiguous. These restrictions introduce fragmentation issues that affect the utilization of the system and the wait time and slowdown of queued jobs. We propose to solve these problems for the BlueGene/L system through scheduling algorithms that augment a baseline first-come, first-served (FCFS) scheduler. Restricting ourselves to space-sharing techniques, which constitute a simpler solution to the requirements of cellular computing, we present simulation results for migration and backfilling techniques on BlueGene/L. These techniques are explored individually and jointly to determine their impact on the system. Our results demonstrate that migration can be effective for a pure FCFS scheduler but that backfilling produces even more benefits. We also show that migration can be combined with backfilling to create more opportunities to better utilize a parallel machine.
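The fragmentation problem described above can be illustrated with a small sketch. It is not the authors' scheduler or simulator: it assumes a 2D torus rather than the machine's actual topology, uses single-node jobs, and the function name `find_partition` and the grids are hypothetical. It shows only why a rectangular, contiguous partition can be unavailable even when enough nodes are free, and how moving running jobs (the idea behind migration) can make one available.

```python
from itertools import product

def find_partition(free, w, h):
    """Return the (row, col) origin of a free h-by-w rectangle on a 2D torus, or None.

    `free` is a list of rows of booleans. Indices wrap around in both
    dimensions, which models the toroidal links (an illustrative assumption).
    """
    rows, cols = len(free), len(free[0])
    if h > rows or w > cols:
        return None
    for r, c in product(range(rows), range(cols)):
        if all(free[(r + dr) % rows][(c + dc) % cols]
               for dr in range(h) for dc in range(w)):
            return (r, c)
    return None

# 4x4 torus: eight single-node jobs in a checkerboard leave eight free nodes.
fragmented = [[(r + c) % 2 == 1 for c in range(4)] for r in range(4)]
free_count = sum(map(sum, fragmented))

# The same eight jobs after being packed into the top two rows.
compacted = [[r >= 2 for c in range(4)] for r in range(4)]

print(free_count, find_partition(fragmented, 2, 2))  # 8 None
print(find_partition(compacted, 2, 2))               # (2, 0)
```

In the fragmented layout a 4-node job cannot start although 8 nodes are idle. After compaction the same job fits at once.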
Copyright information
© 2002 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Krevat, E., Castaños, J.G., Moreira, J.E. (2002). Job Scheduling for the BlueGene/L System. In: Feitelson, D.G., Rudolph, L., Schwiegelshohn, U. (eds) Job Scheduling Strategies for Parallel Processing. JSSPP 2002. Lecture Notes in Computer Science, vol 2537. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-36180-4_3
DOI: https://doi.org/10.1007/3-540-36180-4_3
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-00172-0
Online ISBN: 978-3-540-36180-0
eBook Packages: Springer Book Archive