DOI: 10.5555/646380.689537

Benchmarks and Standards for the Evaluation of Parallel Job Schedulers

Published: 16 April 1999


The evaluation of parallel job schedulers hinges on the workloads used. It is suggested that these workloads be standardized, in terms of both format and content, so as to ease the evaluation and comparison of different systems. The question remains whether such a standard can encompass both traditional parallel systems and metacomputing systems.
This paper is based on a panel on this subject that was held at the workshop, and the ensuing discussion; its authors are both the panel members and participants from the audience. Naturally, not all of us agree with all the opinions expressed here...




Published In

IPPS/SPDP '99/JSSPP '99: Proceedings of the Job Scheduling Strategies for Parallel Processing
April 1999
235 pages



Berlin, Heidelberg

