
DOI: 10.1109/IPDPS.2012.54
Article

Understanding Cache Hierarchy Contention in CMPs to Improve Job Scheduling

Published: 21 May 2012

Abstract

In order to improve CMP performance, recent research has focused on scheduling to mitigate contention produced by the limited memory bandwidth. Nowadays, commercial CMPs implement multi-level cache hierarchies where last-level caches are shared by at least two cache structures located at the immediately lower cache level. In turn, these caches can be shared by several multithreaded cores. In this microprocessor design, contention points may appear along the whole memory hierarchy. Moreover, this problem is expected to worsen in future technologies, since the number of cores and hardware threads, and consequently the size of the shared caches, increases with each microprocessor generation. In this paper we characterize the performance impact of the different contention points that appear along the memory subsystem. Then, we propose a generic scheduling strategy for CMPs that takes into account the available bandwidth at each level of the cache hierarchy. The proposed strategy selects the processes to be co-scheduled and allocates them to cores in order to minimize contention effects. The proposal has been implemented and evaluated on a commercial single-threaded quad-core processor with a relatively small two-level cache hierarchy. Although this platform presents fewer potential contention points than recent processor designs, the proposal achieves performance improvements of up to 9% over the Linux scheduler, whereas the benefits (across the studied benchmark mixes) are always below 6% for a memory-aware scheduler that does not take the cache hierarchy into account. Moreover, in some cases the proposal doubles the speedup achieved by the memory-aware scheduler.
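To make the idea concrete, the following is a minimal sketch (not the paper's actual algorithm) of a bandwidth-aware co-scheduling heuristic: jobs with high shared-cache bandwidth demand are paired with low-demand jobs on cores that share a cache, so that the aggregate demand on each shared structure stays balanced. The job names and demand figures are illustrative assumptions, not data from the paper.

```python
# Hedged sketch of bandwidth-aware co-scheduling: greedily pair the
# heaviest remaining job with the lightest ones, one group per shared
# cache, to balance aggregate bandwidth demand across shared caches.

def coschedule(jobs, cores_per_cache=2):
    """Group jobs onto cores that share a cache.

    jobs: dict mapping job name -> estimated bandwidth demand on the
          shared cache (e.g. misses per kilo-instruction, measured with
          hardware performance counters).
    Returns a list of job groups, one group per shared cache.
    """
    ordered = sorted(jobs, key=jobs.get, reverse=True)  # heaviest first
    groups = []
    while ordered:
        group = [ordered.pop(0)]            # take the heaviest remaining job
        while len(group) < cores_per_cache and ordered:
            group.append(ordered.pop())     # pair it with the lightest
        groups.append(group)
    return groups

# Hypothetical demand estimates for four SPEC-like workloads:
demands = {"mcf": 9.0, "lbm": 7.5, "povray": 0.4, "namd": 0.8}
print(coschedule(demands))  # → [['mcf', 'povray'], ['lbm', 'namd']]
```

A real scheduler would refresh the demand estimates online from performance counters and would extend the balancing to every level of the hierarchy with a shared structure, which is the dimension the paper's proposal adds over schedulers that only consider main-memory bandwidth.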




    Published In

    IPDPS '12: Proceedings of the 2012 IEEE 26th International Parallel and Distributed Processing Symposium
    May 2012
    1402 pages
    ISBN:9780769546759

    Publisher

    IEEE Computer Society

    United States


    Author Tags

    1. cache hierarchy
    2. contention-points
    3. memory-aware scheduling
    4. shared caches

    Qualifiers

    • Article


    Bibliometrics & Citations

    Bibliometrics

    Article Metrics

    • Downloads (Last 12 months)0
    • Downloads (Last 6 weeks)0
    Reflects downloads up to 14 Nov 2024


    Citations

    Cited By

    • (2021)Online Thread and Data Mapping Using a Sharing-Aware Memory Management UnitACM Transactions on Modeling and Performance Evaluation of Computing Systems10.1145/34336875:4(1-28)Online publication date: 21-Jan-2021
    • (2018)GASWorkshop Proceedings of the 47th International Conference on Parallel Processing10.1145/3229710.3229758(1-9)Online publication date: 13-Aug-2018
    • (2016)Hardware-Assisted Thread and Data Mapping in Hierarchical Multicore ArchitecturesACM Transactions on Architecture and Code Optimization10.1145/297558713:3(1-28)Online publication date: 17-Sep-2016
    • (2014)kMAFProceedings of the 23rd international conference on Parallel architectures and compilation10.1145/2628071.2628085(277-288)Online publication date: 24-Aug-2014
    • (2013)L1-bandwidth aware thread allocation in multicore SMT processorsProceedings of the 22nd international conference on Parallel architectures and compilation techniques10.5555/2523721.2523741(123-132)Online publication date: 7-Oct-2013
