DOI: 10.1145/3400302.3416344 · ICCAD Conference Proceedings · Invited talk

Fundamental limits on the precision of in-memory architectures

Published: 17 December 2020

Abstract

This paper obtains fundamental limits on the computational precision of in-memory computing (IMC) architectures. Various compute SNR metrics for IMCs are defined and their interrelationships analyzed to show that the accuracy of an IMC is fundamentally limited by the compute SNR (SNRa) of its analog core, and that the activation, weight, and output precisions need to be assigned appropriately for the final output SNR (SNRT) to approach SNRa. The minimum precision criterion (MPC) is proposed to minimize the output precision and hence the column analog-to-digital converter (ADC) precision. The charge-summing (QS) compute model and its associated IMC architecture, QS-Arch, are studied to obtain analytical models for its compute SNR, minimum ADC precision, energy, and latency. The compute SNR models of QS-Arch are validated via Monte Carlo simulations in a 65 nm CMOS process. Employing these models, upper bounds on the SNRa of a QS-Arch-based IMC employing a 512-row SRAM array are obtained, and it is shown that QS-Arch's energy cost reduces by 3.3× for every 6 dB drop in SNRa, and that the maximum achievable SNRa reduces with technology scaling while the energy cost at a fixed SNRa increases. These models also indicate the existence of an upper bound on the dot-product dimension N due to voltage-headroom clipping; this bound doubles for every 3 dB drop in SNRa.
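The two scaling relations quoted in the abstract lend themselves to a quick back-of-the-envelope check. The sketch below is illustrative only — the function names `energy_scale` and `max_dot_product_dim` are chosen here, not taken from the paper — and simply encodes the stated 3.3×-per-6 dB energy trade-off and the 2×-per-3 dB relaxation of the dot-product dimension bound:

```python
def energy_scale(delta_snr_db: float) -> float:
    """Relative energy cost after lowering SNRa by delta_snr_db dB.

    Per the abstract, QS-Arch's energy cost reduces by 3.3x for every
    6 dB drop in SNRa, i.e. E/E0 = 3.3 ** (-delta_snr_db / 6).
    """
    return 3.3 ** (-delta_snr_db / 6.0)


def max_dot_product_dim(n0: float, delta_snr_db: float) -> float:
    """Upper bound on the dot-product dimension N after an SNRa drop.

    Per the abstract, the bound on N (set by voltage-headroom clipping)
    doubles for every 3 dB drop in SNRa.
    """
    return n0 * 2.0 ** (delta_snr_db / 3.0)


if __name__ == "__main__":
    # Trading away 12 dB of compute SNR cuts energy by 3.3**2, about 10.9x.
    print(f"energy ratio at -12 dB: {energy_scale(12.0):.3f}")   # ≈ 0.092
    # A 6 dB SNRa drop quadruples the admissible dimension: 512 -> 2048.
    print(f"N bound at -6 dB (N0=512): {max_dot_product_dim(512, 6.0):.0f}")
```

Taken together, the two relations quantify the accuracy-for-efficiency trade that the paper formalizes: each 6 dB of SNRa surrendered buys roughly one order-of-magnitude energy reduction every two steps, while simultaneously easing the headroom-imposed limit on N.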





Published In

ICCAD '20: Proceedings of the 39th International Conference on Computer-Aided Design
November 2020
1396 pages
ISBN:9781450380263
DOI:10.1145/3400302
  • General Chair:
  • Yuan Xie
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]


In-Cooperation

  • IEEE CAS
  • IEEE CEDA
  • IEEE CS

Publisher

Association for Computing Machinery

New York, NY, United States


Author Tags

  1. accelerator
  2. compute in-memory
  3. in-memory accuracy
  4. in-memory computing
  5. in-memory noise
  6. in-memory precision
  7. machine learning
  8. taxonomy of in-memory

Qualifiers

  • Invited-talk

Conference

ICCAD '20

Acceptance Rates

Overall acceptance rate: 457 of 1,762 submissions (26%)


Article Metrics

  • Downloads (last 12 months): 111
  • Downloads (last 6 weeks): 23

Reflects downloads up to 22 Nov 2024


Cited By

  • (2024) 34.5 A 818-4094TOPS/W Capacitor-Reconfigured CIM Macro for Unified Acceleration of CNNs and Transformers. 2024 IEEE International Solid-State Circuits Conference (ISSCC), 574-576. DOI: 10.1109/ISSCC49657.2024.10454489. Published 18 Feb 2024.
  • (2024) ZEBRA: A Zero-Bit Robust-Accumulation Compute-In-Memory Approach for Neural Network Acceleration Utilizing Different Bitwise Patterns. 2024 29th Asia and South Pacific Design Automation Conference (ASP-DAC), 153-158. DOI: 10.1109/ASP-DAC58780.2024.10473851. Published 22 Jan 2024.
  • (2023) Built-in Self-Test and Built-in Self-Repair Strategies Without Golden Signature for Computing in Memory. 2023 Design, Automation & Test in Europe Conference & Exhibition (DATE), 1-6. DOI: 10.23919/DATE56975.2023.10137074. Published Apr 2023.
  • (2023) RAELLA: Reforming the Arithmetic for Efficient, Low-Resolution, and Low-Loss Analog PIM: No Retraining Required! Proceedings of the 50th Annual International Symposium on Computer Architecture, 1-16. DOI: 10.1145/3579371.3589062. Published 17 Jun 2023.
  • (2023) The Impact of Analog-to-Digital Converter Architecture and Variability on Analog Neural Network Accuracy. IEEE Journal on Exploratory Solid-State Computational Devices and Circuits 9(2), 176-184. DOI: 10.1109/JXCDC.2023.3315134. Published Dec 2023.
  • (2023) DIANA: An End-to-End Hybrid DIgital and ANAlog Neural Network SoC for the Edge. IEEE Journal of Solid-State Circuits 58(1), 203-215. DOI: 10.1109/JSSC.2022.3214064. Published Jan 2023.
  • (2023) Accelerating Polynomial Modular Multiplication with Crossbar-Based Compute-in-Memory. 2023 IEEE/ACM International Conference on Computer Aided Design (ICCAD), 1-9. DOI: 10.1109/ICCAD57390.2023.10323790. Published 28 Oct 2023.
  • (2023) DIANA: DIgital and ANAlog Heterogeneous Multi-core System-on-Chip. In Towards Heterogeneous Multi-core Systems-on-Chip for Edge Machine Learning, 119-141. DOI: 10.1007/978-3-031-38230-7_7. Published 3 Jul 2023.
  • (2022) Towards ADC-Less Compute-In-Memory Accelerators for Energy Efficient Deep Learning. 2022 Design, Automation & Test in Europe Conference & Exhibition (DATE), 624-627. DOI: 10.23919/DATE54114.2022.9774573. Published 14 Mar 2022.
  • (2022) A cross-layer approach to cognitive computing. Proceedings of the 59th ACM/IEEE Design Automation Conference, 1327-1330. DOI: 10.1145/3489517.3530642. Published 10 Jul 2022.
