Abstract
Advances in CMOS logic circuits have outpaced improvements in memory access, creating significant “memory wall” bottlenecks, particularly in artificial intelligence applications. Compute-in-memory (CIM) has emerged as a promising way to address this challenge and to improve the performance, area efficiency, and energy efficiency of computing systems. By enabling memory cells to perform computations in parallel, CIM improves data reuse and minimizes data movement between the memory and the processor. This paper comprehensively reviews the various domains of SRAM-based CIM macros and their associated computing paradigms, and surveys recent SRAM-CIM macros with a focus on their key challenges and design tradeoffs. It then identifies potential future trends in SRAM-CIM macro-level design, including hybrid computing, precision enhancement, and operator reconfiguration, which aim to resolve the tradeoffs among computational accuracy, energy efficiency, and support for diverse operators. At the microarchitecture level, two possible solutions to these tradeoffs are proposed: chiplet integration and sparsity optimization. Finally, research perspectives for future development are presented.
Acknowledgements
This work was supported by the National Key R&D Program of China (Grant No. 2022ZD0118902) and the National Natural Science Foundation of China (Grant Nos. 92264203, 62204036).
Cite this article
Zhang, Z., Chen, J., Chen, X. et al. From macro to microarchitecture: reviews and trends of SRAM-based compute-in-memory circuits. Sci. China Inf. Sci. 66, 200403 (2023). https://doi.org/10.1007/s11432-023-3800-9
DOI: https://doi.org/10.1007/s11432-023-3800-9