DOI: 10.1109/ETFA.2017.8247615
Research article

Memory interference characterization between CPU cores and integrated GPUs in mixed-criticality platforms

Published: 12 September 2017

Abstract

Most of today's mixed-criticality platforms feature Systems on Chip (SoCs) in which a multi-core CPU complex (the host) competes with an integrated Graphics Processing Unit (iGPU, the device) for access to main memory. The multi-core host and the iGPU share the same memory controller, which has to arbitrate data accesses from both clients through mechanisms that are often undisclosed or not priority-driven. This becomes critical when the iGPU is a high-performance, massively parallel computing complex potentially able to saturate the available DRAM bandwidth of the SoC. The contribution of this paper is to qualitatively analyze and characterize the conflicts caused by parallel accesses to main memory by both the CPU cores and the iGPU, so as to motivate the need for novel paradigms for memory-centric scheduling mechanisms. We analyzed several well-known, commercially available platforms in order to estimate variations in throughput and latency under various memory access patterns, on both the host and the device side.


Cited By

  • (2024) MemPol: polling-based microsecond-scale per-core memory bandwidth regulation. Real-Time Systems 60:3 (369-412). DOI: 10.1007/s11241-024-09422-8. Online publication date: 1-Sep-2024
  • (2022) GPU Devices for Safety-Critical Systems: A Survey. ACM Computing Surveys 55:7 (1-37). DOI: 10.1145/3549526. Online publication date: 15-Dec-2022
  • (2022) Evaluating Controlled Memory Request Injection for Efficient Bandwidth Utilization and Predictable Execution in Heterogeneous SoCs. ACM Transactions on Embedded Computing Systems 22:1 (1-25). DOI: 10.1145/3548773. Online publication date: 13-Dec-2022


Published In

2017 22nd IEEE International Conference on Emerging Technologies and Factory Automation (ETFA)
Sep 2017
1377 pages

Publisher

IEEE Press
