DOI: https://doi.org/10.1145/3453688.3461528

Energy-Efficient Hybrid-RAM with Hybrid Bit-Serial based VMM Support

Published: 22 June 2021

Abstract

This work presents HRAM, an SRAM-based hybrid memory bit-cell for energy-efficient in-memory computing. The HRAM bit-cell consists of a conventional 6T-SRAM cell for static data storage, plus one additional access transistor and a capacitor that temporarily cache data so that computation can be carried out within the HRAM array. Since vector-matrix multiplication (VMM) is the dominant operation in neural network inference, performing VMM in a bit-serial fashion has become a popular approach in recent work. Bit-serial VMM comes in two variants, digital VMM (DVMM) and analog VMM (AVMM), each suited to different network topologies (e.g., ResNet and MobileNet, respectively). By designing a reconfigurable sensing module and peripherals, HRAM can be configured to perform both DVMM and AVMM efficiently. In 65 nm technology, cross-layer simulation indicates that the HRAM-based in-memory computing accelerator outperforms the state-of-the-art CSRAM and MBC designs in energy efficiency by 1.94×/1.81× and 1.95×/11×, respectively, for ResNet-50/MobileNet-V2.
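To make the bit-serial idea concrete, the sketch below emulates a digital bit-serial VMM in software: the activation vector is streamed one bit-plane per cycle, each bit-plane is multiplied against the weight matrix held "in the array", and the partial sums are shift-accumulated by bit significance. This is a minimal illustration of the general technique, not the paper's circuit or its DVMM dataflow; the bit width and matrix shapes are arbitrary assumptions.

```python
import numpy as np

def bit_serial_vmm(x, W, x_bits=8):
    """Compute y = W @ x by streaming the bit-planes of the activation vector x.

    x : non-negative integer activations, shape (n,)
    W : integer weight matrix, shape (m, n)
    """
    y = np.zeros(W.shape[0], dtype=np.int64)
    for b in range(x_bits):              # one "cycle" per activation bit
        x_plane = (x >> b) & 1           # current 0/1 bit-plane of x
        partial = W @ x_plane            # stands in for the in-array multiply-accumulate
        y += partial << b                # shift-accumulate by bit significance
    return y

# Sanity check against the ordinary dense product.
rng = np.random.default_rng(0)
x = rng.integers(0, 256, size=16).astype(np.int64)   # 8-bit unsigned activations
W = rng.integers(-8, 8, size=(4, 16)).astype(np.int64)
assert np.array_equal(bit_serial_vmm(x, W), W @ x)
```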

Supplemental Material

MP4 File
We propose Hybrid-RAM (HRAM), a low-power processing-in-memory design that supports both analog and digital in-memory computing, making it efficient yet flexible. The analog and digital modes each excel at different applications, so neither alone is a general solution; by switching between them, HRAM can obtain the best performance in every case. Compared with the state-of-the-art MBC and CSRAM designs, HRAM achieves decent performance in most cases and can adaptively reach the best performance in any case.
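As a purely hypothetical software-level analogue of the adaptive switching described above, the sketch below picks DVMM or AVMM per layer from user-supplied energy estimates. The layer fields, cost models, and constants are illustrative assumptions, not figures from the paper or its video.

```python
from typing import Callable, Dict, List

Layer = Dict[str, int]   # e.g. {"in_channels": 64, "kernel_size": 3}

def plan_modes(layers: List[Layer],
               dvmm_energy: Callable[[Layer], float],
               avmm_energy: Callable[[Layer], float]) -> List[str]:
    """Pick, for each layer, whichever mode the supplied cost models deem cheaper."""
    return ["DVMM" if dvmm_energy(l) <= avmm_energy(l) else "AVMM" for l in layers]

# Toy usage with made-up cost models (arbitrary coefficients, not measured values).
macs = lambda l: l["in_channels"] * l["kernel_size"] ** 2
dvmm = lambda l: 1.0 * macs(l) + 40.0   # placeholder: per-MAC cost plus fixed bit-serial overhead
avmm = lambda l: 1.2 * macs(l) + 2.0    # placeholder: slightly higher per-MAC analog cost
layers = [{"in_channels": 64, "kernel_size": 3},   # dense conv (ResNet-like)
          {"in_channels": 1, "kernel_size": 3}]    # depthwise conv (MobileNet-like)
print(plan_modes(layers, dvmm, avmm))   # -> ['DVMM', 'AVMM']
```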




Published In

GLSVLSI '21: Proceedings of the 2021 Great Lakes Symposium on VLSI
June 2021
504 pages
ISBN:9781450383936
DOI:10.1145/3453688
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 22 June 2021


Author Tags

  1. bit-serial
  2. in-memory computing
  3. machine learning accelerator
  4. neural networks
  5. sram
  6. vector-matrix multiplication

Qualifiers

  • Research-article

Data Availability

Supplemental video (described under Supplemental Material above): https://dl.acm.org/doi/10.1145/3453688.3461528#meeting_02.mp4

Conference

GLSVLSI '21
Sponsor:
GLSVLSI '21: Great Lakes Symposium on VLSI 2021
June 22 - 25, 2021
Virtual Event, USA

Acceptance Rates

Overall Acceptance Rate 312 of 1,156 submissions, 27%


