research-article

An approach to resource-aware co-scheduling for CMPs

Authors:

Major Bhadauria,

Sally A. McKee

ICS '10: Proceedings of the 24th ACM International Conference on Supercomputing

Pages 189 - 199

https://doi.org/10.1145/1810085.1810113

Published: 02 June 2010

Abstract

We develop real-time scheduling techniques for improving performance and energy for multiprogrammed workloads that scale non-uniformly with increasing thread counts. Multithreaded programs generally deliver higher throughput than single-threaded programs on chip multiprocessors, but performance gains from increasing threads decrease when there is contention for shared resources. We use analytic metrics to derive local search heuristics for creating efficient multiprogrammed, multithreaded workload schedules. Programs are allocated fewer cores than requested, and scheduled to space-share the CMP to improve global throughput. Our holistic approach attempts to co-schedule programs that complement each other with respect to shared resource consumption. We find application co-scheduling for performance and energy in a resource-aware manner achieves better results than solely targeting total throughput or concurrently co-scheduling all programs. Our schedulers improve overall energy delay (E*D) by a factor of 1.5 over time-multiplexed gang scheduling.
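The abstract's central ideas (space-sharing the CMP among programs with complementary resource demands, and judging a schedule by its energy-delay product) can be illustrated with a small sketch. This is not the paper's scheduler: the `Program` fields, the bandwidth-contention model in `group_time`, the power constants, and the greedy `local_search` are all hypothetical stand-ins chosen only to make the idea concrete.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Program:
    name: str
    solo_time: float   # assumed run time (s) on its allotted cores with no contention
    bandwidth: float   # assumed share of memory bandwidth it consumes (0..1)
    cores: int         # cores allotted (possibly fewer than requested)

TOTAL_CORES = 8        # assumed CMP size
STATIC_POWER = 50.0    # assumed watts drawn regardless of load
POWER_PER_CORE = 10.0  # assumed watts per busy core

def group_time(group):
    """Toy contention model: oversubscribed bandwidth slows the whole group."""
    demand = sum(p.bandwidth for p in group)
    return max(p.solo_time for p in group) * max(1.0, demand)

def energy_delay(schedule):
    """E*D for a schedule whose groups run one after another."""
    delay = energy = 0.0
    for group in schedule:
        t = group_time(group)
        power = STATIC_POWER + POWER_PER_CORE * sum(p.cores for p in group)
        delay += t
        energy += power * t
    return energy * delay

def neighbors(schedule):
    """Yield every schedule reachable by moving one program to another group."""
    for i, src in enumerate(schedule):
        for p in src:
            for j, dst in enumerate(schedule):
                if i == j or sum(q.cores for q in dst) + p.cores > TOTAL_CORES:
                    continue  # same group, or the move would exceed the core count
                moved = [list(g) for g in schedule]
                moved[i].remove(p)
                moved[j].append(p)
                yield [g for g in moved if g]  # drop groups left empty

def local_search(schedule):
    """Greedy descent: take the best single move until none lowers E*D."""
    best, best_cost = schedule, energy_delay(schedule)
    while True:
        candidate = min(neighbors(best), key=energy_delay, default=None)
        if candidate is None or energy_delay(candidate) >= best_cost:
            return best, best_cost
        best, best_cost = candidate, energy_delay(candidate)

# Start from a time-multiplexed baseline: each program runs alone in its slot.
programs = [
    Program("A", 10.0, 0.7, 4),  # memory-bound
    Program("B", 10.0, 0.6, 4),  # memory-bound
    Program("C", 10.0, 0.2, 4),  # compute-bound
    Program("D", 10.0, 0.1, 4),  # compute-bound
]
start = [[p] for p in programs]
final, cost = local_search(start)
print([[p.name for p in g] for g in final], energy_delay(start), cost)
```

In this toy setup the search pairs each memory-bound program with a compute-bound one, so neither group oversubscribes bandwidth. A greedy search of this kind can stop at a local optimum, and the numbers it prints reflect only the made-up model above, not the paper's measurements.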

Index Terms

An approach to resource-aware co-scheduling for CMPs
1. Software and its engineering
  1. Software organization and properties
    1. Contextual software domains
      1. Operating systems
        Process management

Published In

ICS '10: Proceedings of the 24th ACM International Conference on Supercomputing

June 2010

365 pages

ISBN:9781450300186

DOI:10.1145/1810085

General Chair: Taisuke Boku (University of Tsukuba)

Program Chairs: Hiroshi Nakashima (Kyoto University), Avi Mendelson (Microsoft)

Copyright © 2010 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Sponsors

SIGARCH: ACM Special Interest Group on Computer Architecture

Publisher

Association for Computing Machinery

New York, NY, United States

Conference

ICS'10: International Conference on Supercomputing

June 2 - 4, 2010

Tsukuba, Ibaraki, Japan

Acceptance Rates

Overall Acceptance Rate 629 of 2,180 submissions, 29%
