DOI: 10.1007/11605300_3

ScoPred–scalable user-directed performance prediction using complexity modeling and historical data

Published: 19 June 2005

Abstract

Using historical information to predict future runs of parallel jobs has been shown to be valuable in job scheduling. Trends toward more flexible job-scheduling techniques, such as adaptive resource allocation, and toward the expansion of scheduling to grids make runtime predictions even more important. We present a technique that combines a user's knowledge of their parallel application with historical application-run data to derive accurate and scalable predictions for future runs. These scalable predictions apply to runtime characteristics for different numbers of nodes (processor scalability) and different problem sizes (problem-size scalability). We employ multiple linear regression and show that good prediction accuracy can be obtained when the complexity models are reasonably accurate.
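
As a rough illustration of the regression step mentioned in the abstract, the sketch below fits the coefficients of a user-supplied complexity model to historical runs using ordinary least squares, then predicts the runtime for an unseen problem size and node count. The model form T(n, p) = c0 + c1·n³/p + c2·n², the function names, and the synthetic run history are all assumptions made for illustration; none of them are taken from the paper.

```python
import numpy as np

def model_terms(n, p):
    """Build the regressor matrix for a hypothetical complexity model.

    Assumed terms (illustrative only): a constant, a parallel compute
    term n^3 / p, and a term n^2 that does not shrink with more nodes.
    """
    n = np.asarray(n, dtype=float)
    p = np.asarray(p, dtype=float)
    return np.column_stack([np.ones_like(n), n**3 / p, n**2])

def fit_runtime_model(n, p, runtime):
    """Fit the term coefficients to historical runtimes by least squares."""
    design = model_terms(n, p)
    target = np.asarray(runtime, dtype=float)
    coeffs, *_ = np.linalg.lstsq(design, target, rcond=None)
    return coeffs

def predict_runtime(coeffs, n, p):
    """Evaluate the fitted model for new problem sizes and node counts."""
    return model_terms(n, p) @ coeffs

# Synthetic "historical" runs generated from known coefficients,
# so the fit can be checked against ground truth.
true_coeffs = np.array([2.0, 1e-6, 3e-5])
hist_n = np.array([500, 500, 1000, 1000, 2000, 2000])
hist_p = np.array([2, 4, 2, 8, 4, 16])
hist_t = model_terms(hist_n, hist_p) @ true_coeffs

coeffs = fit_runtime_model(hist_n, hist_p, hist_t)

# Extrapolate to a larger problem on more nodes than any historical run.
predicted = predict_runtime(coeffs, [4000], [32])[0]
```

Because the unknowns enter linearly, any model that is a weighted sum of known functions of problem size and node count can be fitted this way. The quality of an extrapolated prediction depends on how well the chosen terms match the application's real behavior.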

Published In

JSSPP'05: Proceedings of the 11th international conference on Job Scheduling Strategies for Parallel Processing
June 2005
283 pages
ISBN: 354031024X
Editors: Dror Feitelson, Eitan Frachtenberg, Larry Rudolph, Uwe Schwiegelshohn

Publisher

Springer-Verlag

Berlin, Heidelberg

Qualifiers

  • Article

Cited By

  • (2011) On/off-line prediction applied to job scheduling on non-dedicated NOWs. Journal of Computer Science and Technology 26:1 (99-116). DOI: 10.5555/1991836.1991846. Online publication date: 1-Jan-2011.
  • (2008) Time and space adaptation for computational grids with the ATOP-Grid middleware. Future Generation Computer Systems 24:6 (561-581). DOI: 10.1016/j.future.2007.08.004. Online publication date: 1-Jun-2008.
  • (2008) Enhancing Prediction on Non-dedicated Clusters. Proceedings of the 14th international Euro-Par conference on Parallel Processing (233-242). DOI: 10.1007/978-3-540-85451-7_26. Online publication date: 26-Aug-2008.
  • (2007) Performance problems of using system-predicted runtimes for parallel job scheduling. Proceedings of the 19th IASTED International Conference on Parallel and Distributed Computing and Systems (369-374). DOI: 10.5555/1647539.1647607. Online publication date: 6-Nov-2007.
  • (2007) Adaptive performance control for distributed scientific coupled models. Proceedings of the 21st annual international conference on Supercomputing (274-283). DOI: 10.1145/1274971.1275009. Online publication date: 17-Jun-2007.
  • (2006) Time vs. space adaptation with ATOP-grid. Proceedings of the 5th workshop on Adaptive and reflective middleware (ARM '06). DOI: 10.1145/1175855.1175861. Online publication date: 27-Nov-2006.
  • (2006) Using on-the-fly simulation for estimating the turnaround time on non-dedicated clusters. Proceedings of the 12th international conference on Parallel Processing (177-187). DOI: 10.1007/11823285_19. Online publication date: 28-Aug-2006.
