research-article
Open access

Probabilistic modeling for job symbiosis scheduling on SMT processors

Published: 15 June 2012

Abstract

Symbiotic job scheduling improves simultaneous multithreading (SMT) processor performance by coscheduling jobs that have “compatible” demands on the processor's shared resources. Existing approaches, however, require a sampling phase, evaluate only a limited number of possible coschedules, use heuristics to gauge symbiosis, are rigid in their optimization target, and do not preserve system-level priorities/shares.
This article proposes probabilistic job symbiosis modeling, which predicts whether jobs will create positive or negative symbiosis when coscheduled without requiring the coschedule to be evaluated. The model, which uses per-thread cycle stacks computed through a previously proposed cycle accounting architecture, is simple enough to be used in system software. Probabilistic job symbiosis modeling provides six key innovations over prior work in symbiotic job scheduling: (i) it does not require a sampling phase, (ii) it readjusts the job coschedule continuously, (iii) it evaluates a large number of possible coschedules at very low overhead, (iv) it is not driven by heuristics, (v) it can optimize a performance target of interest (e.g., system throughput or job turnaround time), and (vi) it preserves system-level priorities/shares. These innovations make symbiotic job scheduling both practical and effective.
Our experimental evaluation, which assumes a realistic scenario in which jobs come and go, reports an average 16% (and up to 35%) reduction in job turnaround time compared to the previously proposed SOS (sample, optimize, symbios) approach for a two-thread SMT processor, and an average 19% (and up to 45%) reduction in job turnaround time for a four-thread SMT processor.
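
To make the idea of evaluating coschedules from a model rather than by sampling more concrete, the following is a minimal sketch and not the article's model. It assumes hypothetical two-component cycle stacks (`base` and `memory` fractions of single-threaded execution time), a toy interference formula invented purely for illustration, and system throughput (STP) defined as the sum of each job's normalized progress. The job names, numbers, and function names are all assumptions.

```python
from itertools import combinations

# Hypothetical single-threaded cycle stacks: the fraction of cycles each job
# spends doing base (dispatch) work versus waiting on memory. Made-up values.
cycle_stacks = {
    "A": {"base": 0.9, "memory": 0.1},
    "B": {"base": 0.3, "memory": 0.7},
    "C": {"base": 0.85, "memory": 0.15},
    "D": {"base": 0.2, "memory": 0.8},
}


def predicted_progress(job, partner, stacks):
    """Toy stand-in for a symbiosis model (NOT the article's model).

    Assumes each cycle component is stretched in proportion to how much the
    partner uses the same component. Returns normalized progress: the job's
    single-threaded time (1.0) divided by its predicted coscheduled time.
    """
    own, other = stacks[job], stacks[partner]
    coscheduled_time = sum(own[c] * (1.0 + other[c]) for c in own)
    return 1.0 / coscheduled_time


def predicted_stp(pair, stacks):
    """System throughput of a two-job coschedule: sum of normalized progress."""
    first, second = pair
    return (predicted_progress(first, second, stacks)
            + predicted_progress(second, first, stacks))


def best_coschedule(stacks):
    """Score every two-job coschedule with the model and return the best one."""
    return max(combinations(stacks, 2), key=lambda p: predicted_stp(p, stacks))


if __name__ == "__main__":
    for pair in combinations(cycle_stacks, 2):
        print(pair, round(predicted_stp(pair, cycle_stacks), 3))
    print("best:", best_coschedule(cycle_stacks))
```

Because every candidate is scored from the stacks alone, no coschedule has to be run in order to be ranked. Optimizing a different target, such as job turnaround time, would amount to swapping out `predicted_stp` for another objective.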



Published In

ACM Transactions on Architecture and Code Optimization  Volume 9, Issue 2
June 2012
177 pages
ISSN:1544-3566
EISSN:1544-3973
DOI:10.1145/2207222
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 15 June 2012
Accepted: 01 December 2011
Revised: 01 December 2011
Received: 01 June 2011
Published in TACO Volume 9, Issue 2

Author Tags

  1. Simultaneous multithreading (SMT)
  2. performance modeling
  3. symbiotic job scheduling

Qualifiers

  • Research-article
  • Research
  • Refereed

Article Metrics

  • Downloads (last 12 months): 46
  • Downloads (last 6 weeks): 8
Reflects downloads up to 29 Nov 2024

Cited By

  • (2016) SMT-Aware Instantaneous Footprint Optimization. Proceedings of the 25th ACM International Symposium on High-Performance Parallel and Distributed Computing, 267-279. DOI: 10.1145/2907294.2907308. Online publication date: 31-May-2016
  • (2015) Making the Most of SMT in HPC. ACM Transactions on Architecture and Code Optimization 11:4, 1-26. DOI: 10.1145/2687651. Online publication date: 9-Jan-2015
  • (2014) Adaptive SMT control for more responsive web applications. 2014 IEEE International Symposium on Workload Characterization (IISWC), 41-50. DOI: 10.1109/IISWC.2014.6983038. Online publication date: Oct-2014
  • (2014) History-Based Predictive Instruction Window Weighting for SMT Processors. Proceedings of the 29th International Conference on Supercomputing - Volume 8488, 187-198. DOI: 10.1007/978-3-319-07518-1_12. Online publication date: 22-Jun-2014
