DOI: 10.1145/3240765.3240767

3DICT: A Reliable and QoS Capable Mobile Process-In-Memory Architecture for Lookup-based CNNs in 3D XPoint ReRAMs

Published: 05 November 2018

Abstract

It is extremely challenging to deploy computing-intensive convolutional neural networks (CNNs) with rich parameters on mobile devices because of their limited computing resources and low power budgets. Although prior works build fast and energy-efficient CNN accelerators by greatly sacrificing test accuracy, mobile devices have to guarantee high CNN test accuracy for critical applications, e.g., unlocking phones via face recognition. In this paper, we propose a 3D XPoint ReRAM-based process-in-memory architecture, 3DICT, that provides varied test accuracies to applications with different priorities through lookup-based CNN tests that dynamically exploit the trade-off between test accuracy and latency. Compared to state-of-the-art accelerators, on average, 3DICT improves CNN test performance per Watt by 13% to 61× and guarantees 9-year endurance under various CNN test accuracy requirements.
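The lookup-based CNN tests the abstract relies on follow the general idea of replacing dense convolutions with lookups into a small dictionary of shared weight vectors: each filter is encoded as a sparse combination of dictionary entries, so per-patch dot products against the dictionary can be computed once and reused. The sketch below is only an illustration of that general technique, not the paper's implementation; all names (`D`, `idx`, `coef`) and sizes are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative sizes: k dictionary vectors of length d (a flattened
# receptive field), num_filters output filters, s lookups per filter.
k, d, num_filters, s = 16, 27, 8, 3
D = rng.standard_normal((k, d))                  # shared dictionary

# Each filter is encoded as s dictionary indices plus s coefficients.
idx = rng.integers(0, k, size=(num_filters, s))
coef = rng.standard_normal((num_filters, s))

def direct_conv(patch):
    # Dense baseline: reconstruct every filter, then take dot products.
    W = np.einsum('fs,fsd->fd', coef, D[idx])    # (num_filters, d)
    return W @ patch

def lookup_conv(patch):
    # Lookup-based path: compute the k dictionary responses once per
    # patch, then each filter output is a cheap sparse combination.
    responses = D @ patch                        # k shared dot products
    return (coef * responses[idx]).sum(axis=1)   # s mul/adds per filter

patch = rng.standard_normal(d)
assert np.allclose(direct_conv(patch), lookup_conv(patch))
```

Because the `k` dictionary responses are shared across all filters, the per-filter cost drops from `d` multiplies to `s` lookups and multiplies, which is the kind of accuracy/latency knob (varying `s`) that a QoS-aware design can turn per application.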




Published In

2018 IEEE/ACM International Conference on Computer-Aided Design (ICCAD), November 2018, 939 pages. Publisher: IEEE Press.


            Cited By

  • (2023) "An Energy-Efficient Inference Engine for a Configurable ReRAM-Based Neural Network Accelerator," IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, 42(3):740-753, March 2023. DOI: 10.1109/TCAD.2022.3184464
  • (2023) "Endurance-Aware Deep Neural Network Real-Time Scheduling on ReRAM Accelerators," 2023 International Conference on Computational Science and Computational Intelligence (CSCI), pp. 404-410, December 2023. DOI: 10.1109/CSCI62032.2023.00072
  • (2022) "Resistive-RAM-Based In-Memory Computing for Neural Network: A Review," Electronics, 11(22):3667, November 2022. DOI: 10.3390/electronics11223667
  • (2021) "Exploring the Feasibility of Using 3-D XPoint as an In-Memory Computing Accelerator," IEEE Journal on Exploratory Solid-State Computational Devices and Circuits, 7(2):88-96, December 2021. DOI: 10.1109/JXCDC.2021.3112238
  • (2020) "An Efficient Wear-level Architecture using Self-adaptive Wear Leveling," Proceedings of the 49th International Conference on Parallel Processing (ICPP), pp. 1-11, August 2020. DOI: 10.1145/3404397.3404405
  • (2020) "AccPar: Tensor Partitioning for Heterogeneous Deep Learning Accelerators," 2020 IEEE International Symposium on High Performance Computer Architecture (HPCA), pp. 342-355, February 2020. DOI: 10.1109/HPCA47549.2020.00036
  • (2019) "ZARA," Proceedings of the 56th Annual Design Automation Conference (DAC), pp. 1-6, June 2019. DOI: 10.1145/3316781.3317936
  • (2019) "HyPar: Towards Hybrid Parallelism for Deep Learning Accelerator Array," 2019 IEEE International Symposium on High Performance Computer Architecture (HPCA), pp. 56-68, February 2019. DOI: 10.1109/HPCA.2019.00027
