
3DNN-Xplorer: A Machine Learning Framework for Design Space Exploration of Heterogeneous 3-D DNN Accelerators

Published: 01 February 2025

Abstract

This article presents 3DNN-Xplorer, the first machine learning (ML)-based framework for predicting the performance of heterogeneous 3-D deep neural network (DNN) accelerators. Our ML framework facilitates the design space exploration (DSE) of heterogeneous 3-D accelerators with a two-tier compute-on-memory (CoM) configuration, considering 3-D physical design factors. Our design space encompasses four distinct heterogeneous 3-D integration styles, combining 28- and 16-nm technology nodes for both compute and memory tiers. Using extrapolation techniques with ML models trained on 10-to-256 processing element (PE) accelerator configurations, we estimate the performance of systems featuring 75–16384 PEs, achieving a maximum absolute error of 13.9% (the number of PEs is not continuous and varies based on the accelerator architecture). To ensure balanced tier areas in the design, our framework assumes the same number of PEs or on-chip memory capacity across the four integration styles, accounting for area imbalance resulting from different technology nodes. Our analysis reveals that the heterogeneous 3-D style with 28-nm compute and 16-nm memory is energy-efficient and offers notable energy savings of up to 50% and an 8.8% reduction in runtime compared to other 3-D integration styles with the same number of PEs. Similarly, the heterogeneous 3-D style with 16-nm compute and 28-nm memory is area-efficient and shows up to 8.3% runtime reduction compared to other 3-D styles with the same on-chip memory capacity.
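The abstract's core idea — train ML models on small accelerator configurations (10–256 PEs) and extrapolate to much larger ones (up to 16384 PEs) — can be sketched as follows. This is a minimal illustration only: the PE/runtime numbers are synthetic, and a simple power-law fit in log-log space stands in for the paper's actual models, chosen here because a log-linear model extrapolates naturally beyond its training range.

```python
import math

# Hypothetical training data: (PE count, runtime) pairs for small
# configurations. Values are synthetic, for illustration only.
train = [(10, 95.0), (32, 31.5), (64, 16.8), (128, 9.1), (256, 5.0)]

# Fit runtime ~= a * PEs^b by least squares in log-log space.
xs = [math.log(p) for p, _ in train]
ys = [math.log(t) for _, t in train]
n = len(train)
mx, my = sum(xs) / n, sum(ys) / n
b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum((x - mx) ** 2 for x in xs)
a = math.exp(my - b * mx)

def predict_runtime(pes: int) -> float:
    """Extrapolated runtime estimate for a given PE count."""
    return a * pes ** b

# Extrapolate to configurations far beyond the training range,
# analogous to the framework's 75-to-16384-PE estimates.
for pes in (1024, 4096, 16384):
    print(pes, round(predict_runtime(pes), 3))
```

The fitted exponent `b` is negative here (runtime shrinks as PEs grow), so predictions stay monotone outside the training range — one reason extrapolation-friendly model forms matter for this kind of DSE, where simulating every large configuration is too expensive.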



Published In

IEEE Transactions on Very Large Scale Integration (VLSI) Systems, Volume 33, Issue 2, Feb. 2025, 300 pages

Publisher: IEEE Educational Activities Department, United States