
Research article

Enhancing the Accuracy of 6T SRAM-Based In-Memory Architecture via Maximum Likelihood Detection

Published: 29 April 2024

Abstract

This paper presents a statistical signal processing-based algorithmic approach to enhance the compute signal-to-noise ratio (compute SNR) of six-transistor (6T) SRAM-based analog in-memory computing (IMC) architectures. Such architectures have recently emerged as an attractive alternative to mainstream digital accelerators for machine learning workloads due to their superior energy efficiency and compute density. Today, however, the compute SNR of analog IMCs is limited by device parameter variations and noise. To overcome this limitation, we propose a maximum likelihood (ML)-based statistical error compensation (MLEC) technique to improve the accuracy of binary dot products (DPs) realized in 6T SRAM-based analog IMC architectures. MLEC exploits the symmetric structure of the 6T SRAM bitcell to efficiently extract multiple observations of the same DP and employs them for detection. MLEC methods using two (MLEC-2) and four (MLEC-4) observations are proposed, along with efficient architectures to realize them in hardware, e.g., distribution-aware and energy-aware approximations of MLEC-4. Simulations in a commercial 28 nm CMOS process demonstrate that the proposed methods increase the compute SNR of the commonly used 144-dimensional DP by 5 dB to 12 dB. This improvement in bank-level compute SNR leads to a network-level accuracy improvement of up to 11% when a ResNet-20 network (CIFAR-10 dataset) is implemented on the IMC. Employing energy models of the IMC, the energy overhead of MLEC is estimated to lie between 3% and 10%, resulting in up to 45.6% and 18% increases in energy efficiency (1b-TOPS/W) for a target SNR of 20 dB and a ResNet-20 accuracy of 90% on CIFAR-10, respectively, compared to a conventional (uncompensated) 6T SRAM-based IMC.
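To make the detection idea concrete, the sketch below simulates ML detection of a 144-dimensional binary DP from K noisy analog readouts, with K = 1 corresponding to a conventional (uncompensated) readout and K = 2 and K = 4 loosely mirroring the observation counts of MLEC-2 and MLEC-4. This is a minimal sketch under an assumed i.i.d. additive Gaussian noise model with a hypothetical standard deviation SIGMA; the paper's actual bitcell noise statistics, observation-extraction circuitry, and detector approximations are more involved. Under this Gaussian assumption, the ML estimate reduces to averaging the K observations and snapping to the nearest valid DP level.

```python
import numpy as np

N_DIM = 144    # DP dimensionality used in the paper
SIGMA = 6.0    # hypothetical per-observation noise std. dev. (illustrative only)

# A binary (+/-1) DP of dimension 144 can only take even values in [-144, 144].
LEVELS = np.arange(-N_DIM, N_DIM + 1, 2)

def ml_detect(observations: np.ndarray) -> int:
    """ML detection under i.i.d. Gaussian noise: average the observations
    and snap to the nearest valid DP level."""
    avg = observations.mean()
    return int(LEVELS[np.argmin(np.abs(LEVELS - avg))])

rng = np.random.default_rng(0)
trials = 10_000
for k in (1, 2, 4):  # 1 = uncompensated; 2, 4 ~ MLEC-2 / MLEC-4 observation counts
    errors = 0
    for _ in range(trials):
        y_true = int(rng.choice(LEVELS))               # random true DP value
        obs = y_true + SIGMA * rng.standard_normal(k)  # K noisy readouts
        errors += ml_detect(obs) != y_true
    print(f"K={k}: detection error rate = {errors / trials:.3f}")
```

In this simplified model, averaging K observations shrinks the effective noise standard deviation by a factor of the square root of K, which is the basic mechanism behind the compute-SNR gains reported above.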

Published In

IEEE Transactions on Signal Processing, Volume 72, 2024, 4446 pages

Publisher

IEEE Press
