DOI: 10.5555/3539845.3540167
research-article

Thermal- and cache-aware resource management based on ML-driven cache contention prediction

Published: 31 May 2022

Abstract

While on-chip many-core systems enable a large number of applications to run in parallel, the increased overall performance may come at the cost of making the performance constraints of individual applications harder to meet, because applications contend for shared resources. For instance, competition for the last-level cache among concurrently running applications may slow down execution and potentially violate individual performance constraints. Clustered many-cores reduce cache contention at chip level by sharing caches only at cluster level. To reduce cache contention within a cluster, state-of-the-art techniques aim to co-map a memory-intensive application with a compute-intensive application onto one cluster. However, compute-intensive applications typically consume high power, and therefore executing another application on nearby cores may lead to high temperatures. Hence, there is a trade-off between cache contention and temperature. This paper is the first to consider this trade-off, through a novel thermal- and cache-aware resource management technique. We build a neural network (NN)-based model to predict the slowdown of application execution induced by cache contention. This prediction feeds our resource management technique, which optimizes the application mapping and selects the voltage/frequency levels of the clusters to compensate for the potential contention-induced slowdown. The technique thereby meets the performance constraints while minimizing temperature. Compared to the state of the art, our technique reduces the temperature by 30% on average while satisfying the performance constraints of all individual applications.
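
The flow described in the abstract (predict the contention-induced slowdown, then choose a mapping and per-cluster frequency that still meets each performance constraint) can be illustrated with a minimal sketch. This is not the paper's implementation. Everything here is an assumption made for illustration: the `predict_slowdown` formula is a hand-written stand-in for the NN model, performance is assumed to scale linearly with frequency, each cluster is assumed to host exactly two applications, and the cubic frequency cost is only a crude proxy for temperature that ignores where heat is generated on the chip.

```python
# Hypothetical frequency levels (GHz) available to every cluster.
FREQ_LEVELS_GHZ = [1.0, 1.5, 2.0, 2.5, 3.0]


def predict_slowdown(app, co_runner):
    """Stand-in for the NN predictor (assumed formula, not from the paper).

    Returns a factor >= 1 that grows with the memory intensity of both apps.
    """
    return 1.0 + 0.5 * app["mem_intensity"] * co_runner["mem_intensity"]


def min_frequency(app, slowdown):
    """Lowest level that meets the app's constraint despite the slowdown.

    Assumes performance scales linearly with frequency.
    """
    f_max = FREQ_LEVELS_GHZ[-1]
    for f in FREQ_LEVELS_GHZ:
        if app["perf_at_fmax"] * (f / f_max) / slowdown >= app["perf_required"]:
            return f
    return None  # constraint cannot be met at any level


def cluster_frequency(a, b):
    """Both apps share one cluster frequency, so take the larger requirement."""
    fa = min_frequency(a, predict_slowdown(a, b))
    fb = min_frequency(b, predict_slowdown(b, a))
    if fa is None or fb is None:
        return None
    return max(fa, fb)


def pairings(apps):
    """Yield every way to split an even-length list into unordered pairs."""
    if not apps:
        yield []
        return
    first, rest = apps[0], apps[1:]
    for i, partner in enumerate(rest):
        remaining = rest[:i] + rest[i + 1:]
        for tail in pairings(remaining):
            yield [(first, partner)] + tail


def best_mapping(apps):
    """Exhaustively pick the feasible pairing with the lowest proxy cost."""
    best = None
    for mapping in pairings(apps):
        freqs = [cluster_frequency(a, b) for a, b in mapping]
        if any(f is None for f in freqs):
            continue
        cost = sum(f ** 3 for f in freqs)  # crude proxy, not a thermal model
        if best is None or cost < best[0]:
            best = (cost, mapping, freqs)
    return best


apps = [
    {"name": "m1", "mem_intensity": 1.0, "perf_at_fmax": 1.0, "perf_required": 0.5},
    {"name": "m2", "mem_intensity": 1.0, "perf_at_fmax": 1.0, "perf_required": 0.5},
    {"name": "c1", "mem_intensity": 0.1, "perf_at_fmax": 1.0, "perf_required": 0.5},
    {"name": "c2", "mem_intensity": 0.1, "perf_at_fmax": 1.0, "perf_required": 0.5},
]
cost, mapping, freqs = best_mapping(apps)
print([(a["name"], b["name"]) for a, b in mapping], freqs)
```

With these made-up numbers, pairing the two memory-intensive applications would force their cluster to a higher frequency, so the search prefers mixing one memory-intensive and one compute-intensive application per cluster. Because the proxy does not model the extra heat of a compute-intensive neighbor, it shows only the contention side of the trade-off; a real thermal model would have to replace the cost function to capture the other side.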

Published In

DATE '22: Proceedings of the 2022 Conference & Exhibition on Design, Automation & Test in Europe
March 2022
1637 pages
ISBN: 9783981926361

Sponsors

In-Cooperation

  • EDAA: European Design Automation Association
  • IEEE SSCS Shanghai Chapter
  • ESDA: Electronic System Design Alliance
  • IEEE CEDA
  • IEEE CS
  • IEEE-RAS: Robotics and Automation

Publisher

European Design and Automation Association

Leuven, Belgium

Author Tags

  1. DVFS
  2. application mapping
  3. cache contention
  4. machine learning
  5. resource management
  6. thermal optimization

Qualifiers

  • Research-article

Conference

DATE '22: Design, Automation and Test in Europe
March 14 - 23, 2022
Antwerp, Belgium

Acceptance Rates

Overall Acceptance Rate 518 of 1,794 submissions, 29%

Bibliometrics

Article Metrics

  • Total Citations: 0
  • Total Downloads: 30
  • Downloads (Last 12 months): 7
  • Downloads (Last 6 weeks): 0

Reflects downloads up to 17 Feb 2025
