
DOI: 10.1109/IPDPS.2012.54
Article

Understanding Cache Hierarchy Contention in CMPs to Improve Job Scheduling

Published: 21 May 2012

Abstract

In order to improve CMP performance, recent research has focused on scheduling to mitigate contention produced by the limited memory bandwidth. Nowadays, commercial CMPs implement multi-level cache hierarchies where last-level caches are shared by at least two cache structures located at the immediately lower cache level. In turn, these caches can be shared by several multithreaded cores. In this microprocessor design, contention points may appear along the whole memory hierarchy. Moreover, this problem is expected to worsen in future technologies, since the number of cores and hardware threads, and consequently the size of the shared caches, increases with each microprocessor generation. In this paper we characterize the performance impact of the different contention points that appear along the memory subsystem. Then, we propose a generic scheduling strategy for CMPs that takes into account the available bandwidth at each level of the cache hierarchy. The proposed strategy selects the processes to be co-scheduled and allocates them to cores in order to minimize contention effects. The proposal has been implemented and evaluated on a commercial single-threaded quad-core processor with a relatively small two-level cache hierarchy. Although this platform presents fewer potential contention points than recent processor designs, the proposal achieves performance improvements of up to 9% over the Linux scheduler, whereas the benefits (across the studied benchmark mixes) are always below 6% for a memory-aware scheduler that does not take the cache hierarchy into account. Moreover, in some cases the proposal doubles the speedup achieved by the memory-aware scheduler.
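To make the idea concrete, the following is a minimal sketch (not the paper's actual algorithm) of a bandwidth-aware co-scheduling heuristic: jobs with high shared-cache bandwidth demand are paired with low-demand jobs on cores that share a cache, so that the aggregate demand on each shared structure stays balanced. The job names and demand figures are illustrative assumptions, not data from the paper.

```python
# Hedged sketch of bandwidth-aware co-scheduling: greedily pair the
# heaviest remaining job with the lightest ones, one group per shared
# cache, to balance aggregate bandwidth demand across shared caches.

def coschedule(jobs, cores_per_cache=2):
    """Group jobs onto cores that share a cache.

    jobs: dict mapping job name -> estimated bandwidth demand on the
          shared cache (e.g. misses per kilo-instruction, measured with
          hardware performance counters).
    Returns a list of job groups, one group per shared cache.
    """
    ordered = sorted(jobs, key=jobs.get, reverse=True)  # heaviest first
    groups = []
    while ordered:
        group = [ordered.pop(0)]            # take the heaviest remaining job
        while len(group) < cores_per_cache and ordered:
            group.append(ordered.pop())     # pair it with the lightest
        groups.append(group)
    return groups

# Hypothetical demand estimates for four SPEC-like workloads:
demands = {"mcf": 9.0, "lbm": 7.5, "povray": 0.4, "namd": 0.8}
print(coschedule(demands))  # → [['mcf', 'povray'], ['lbm', 'namd']]
```

A real scheduler would refresh the demand estimates online from performance counters and would extend the balancing to every level of the hierarchy with a shared structure, which is the dimension the paper's proposal adds over schedulers that only consider main-memory bandwidth.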




    Published In

    IPDPS '12: Proceedings of the 2012 IEEE 26th International Parallel and Distributed Processing Symposium
    May 2012
    1402 pages
    ISBN:9780769546759

    Publisher

    IEEE Computer Society

    United States


    Author Tags

    1. cache hierarchy
    2. contention-points
    3. memory-aware scheduling
    4. shared caches

    Qualifiers

    • Article


    Bibliometrics & Citations

    Bibliometrics

    Article Metrics

    • Downloads (Last 12 months)0
    • Downloads (Last 6 weeks)0
    Reflects downloads up to 14 Nov 2024


    Citations

    Cited By

    • (2021)Online Thread and Data Mapping Using a Sharing-Aware Memory Management UnitACM Transactions on Modeling and Performance Evaluation of Computing Systems10.1145/34336875:4(1-28)Online publication date: 21-Jan-2021
    • (2018)GASWorkshop Proceedings of the 47th International Conference on Parallel Processing10.1145/3229710.3229758(1-9)Online publication date: 13-Aug-2018
    • (2016)Hardware-Assisted Thread and Data Mapping in Hierarchical Multicore ArchitecturesACM Transactions on Architecture and Code Optimization10.1145/297558713:3(1-28)Online publication date: 17-Sep-2016
    • (2014)kMAFProceedings of the 23rd international conference on Parallel architectures and compilation10.1145/2628071.2628085(277-288)Online publication date: 24-Aug-2014
    • (2013)L1-bandwidth aware thread allocation in multicore SMT processorsProceedings of the 22nd international conference on Parallel architectures and compilation techniques10.5555/2523721.2523741(123-132)Online publication date: 7-Oct-2013
