Review Article
Published: 20 May 2024

Neural architecture search for in-memory computing-based deep learning accelerators

Nature Reviews Electrical Engineering volume 1, pages 374–390 (2024)Cite this article

5009 Accesses
15 Altmetric
Metrics details

Subjects

Abstract

The rapid growth of artificial intelligence and the increasing complexity of neural network models are driving demand for efficient hardware architectures that can address power-constrained and resource-constrained deployments. In this context, the emergence of in-memory computing (IMC) stands out as a promising technology. For this purpose, several IMC devices, circuits and architectures have been developed. However, the intricate nature of designing, implementing and deploying such architectures necessitates a well-orchestrated toolchain for hardware–software co-design. This toolchain must allow IMC-aware optimizations across the entire stack, encompassing devices, circuits, chips, compilers, software and neural network design. The complexity and sheer size of the design space involved renders manual optimizations impractical. To mitigate these challenges, hardware-aware neural architecture search (HW-NAS) has emerged as a promising approach to accelerate the design of streamlined neural networks tailored for efficient deployment on IMC hardware. This Review illustrates the application of HW-NAS to the specific features of IMC hardware and compares existing optimization frameworks. Ongoing research and unresolved issues are discussed. A roadmap for the evolution of HW-NAS for IMC architectures is proposed.

Key points

Hardware-aware neural architecture search (HW-NAS) is an efficient tool in hardware–software co-design, and it can be combined with other architecture-level and system-level optimization techniques to design efficient in-memory computing (IMC) hardware for deep learning accelerators.
HW-NAS for IMC can be used for optimizing deep learning models for a specific IMC hardware, and co-optimizing a model and hardware design searching for the most efficient implementation.
In HW-NAS, it is important to define a search space, select an appropriate problem formulation technique, and consider the trade-off between performance, search speed, computation demands and scalability when selecting a search strategy and a hardware evaluation technique.
In addition to neural network model hyperparameters and quantization and pruning policies, HW-NAS for IMC can include the circuit-level and architecture-level hardware parameters in the search.
The main challenges in HW-NAS for IMC include a lack of unified framework to support different types of neural network models and different IMC hardware architectures, HW-NAS benchmarks and efficient software–hardware co-design techniques and tools.
Fully automated NAS methods capable of constructing new deep learning operations and algorithms suitable for IMC with minimal human design are needed.

Access through your institution

Buy or subscribe

This is a preview of subscription content, access via your institution

Access options

Access through your institution

Buy this article

Purchase on SpringerLink
Instant access to full article PDF

Buy now

Prices may be subject to local taxes which are calculated during checkout

**Fig. 1: Fundamentals of hardware-aware neural architecture search.**

**Fig. 2: Problem formulation methods in hardware-aware neural architecture search.**

**Fig. 3: Search strategies in hardware-aware neural architecture search.**

**Fig. 4: Hardware cost estimation methods for hardware-aware neural architecture search.**

**Fig. 5: Roadmap for hardware-aware neural architecture search for in-memory computing.**

**Fig. 6: Place of hardware-aware neural architecture search in hardware–software co-design.**

Hardware-aware training for large-scale and diverse deep learning inference workloads using in-memory computing-based accelerators

Article Open access 30 August 2023

A full-stack memristor-based computation-in-memory system with software-hardware co-development

Article Open access 03 March 2025

Memory devices and applications for in-memory computing

Article 30 March 2020

References

Bengio, Y., Lecun, Y. & Hinton, G. Deep learning for AI. Commun. ACM 64, 58–65 (2021). This work provides an overview of deep learning methods for artificial intelligence applications and related future directions.
Article Google Scholar
Krestinskaya, O., James, A. P. & Chua, L. O. Neuromemristive circuits for edge computing: a review. IEEE Trans. Neural Netw. Learn. Syst. 31, 4–23 (2019).
Article MathSciNet Google Scholar
Ielmini, D. & Wong, H.-S. P. In-memory computing with resistive switching devices. Nat. Electron. 1, 333–343 (2018).
Article Google Scholar
Sebastian, A., Le Gallo, M., Khaddam-Aljameh, R. & Eleftheriou, E. Memory devices and applications for in-memory computing. Nat. Nanotechnol. 15, 529–544 (2020). This work explains the importance of and highlights the application landscape of in-memory computing, and also includes an overview of in-memory computing devices.
Article Google Scholar
Xia, Q. & Yang, J. J. Memristive crossbar arrays for brain-inspired computing. Nat. Mater. 18, 309–323 (2019).
Article Google Scholar
Yang, J. J., Strukov, D. B. & Stewart, D. R. Memristive devices for computing. Nat. Nanotechnol. 8, 13–24 (2013).
Article Google Scholar
Zhang, W. et al. Neuro-inspired computing chips. Nat. Electron. 3, 371–382 (2020). This work benchmarks in-memory computing architectures, presents the requirements for device metrics based on different applications and provides an in-memory computing roadmap.
Article Google Scholar
Wang, T. et al. Apq: joint search for network architecture, pruning and quantization policy. In Proc. IEEE/CVF Conference on Computer Vision and Pattern Recognition 2078–2087 (IEEE/CVF, 2020).
Benmeziane, H. et al. A comprehensive survey on hardware-aware neural architecture search. Preprint at https://doi.org/10.48550/arXiv.2101.09336 (2021).
Chitty-Venkata, K. T. & Somani, A. K. Neural architecture search survey: a hardware perspective. ACM Comput. Surv. 55, 1–36 (2022).
Article Google Scholar
Benmeziane, H. et al. Hardware-aware neural architecture search: survey and taxonomy. In Proc. Thirtieth International Joint Conference on Artificial Intelligence 4322–4329 (IJCAI, 2021).
Benmeziane, H. et al. AnalogNAS: a neural network design framework for accurate inference with analog in-memory computing. In 2023 IEEE International Conference on Edge Computing and Communications (EDGE) 233–244 (IEEE, 2023).
Yuan, Z. et al. NAS4RRAM: neural network architecture search for inference on RRAM-based accelerators. Sci. China Inf. Sci. 64, 160407 (2021).
Article Google Scholar
Guan, Z. et al. A hardware-aware neural architecture search Pareto front exploration for in-memory computing. In 2022 IEEE 16th International Conference on Solid-State & Integrated Circuit Technology (ICSICT) 1–4 (IEEE, 2022).
Negi, S., Chakraborty, I., Ankit, A. & Roy, K. NAX: neural architecture and memristive xbar based accelerator co-design. In Proc. 59th ACM/IEEE Design Automation Conference 451–456 (IEEE, 2022).
Sun, H. et al. Gibbon: efficient co-exploration of NN model and processing-in-memory architecture. In 2022 Design, Automation and Test in Europe Conference and Exhibition (DATE) 867–872 (IEEE, 2022).
Jiang, W. et al. Device-circuit-architecture co-exploration for computing-in-memory neural accelerators. IEEE Trans. Comput. 70, 595–605 (2020).
Article MathSciNet Google Scholar
Elsken, T., Metzen, J. H. & Hutter, F. Neural architecture search: a survey. J. Mach. Learn. Res. 20, 1997–2017 (2019).
MathSciNet Google Scholar
Ren, P. et al. A comprehensive survey of neural architecture search: challenges and solutions. ACM Comput. Surv. 54, 1–34 (2021). This work provides a survey on neural architecture search from the software, algorithms and frameworks perspective.
Google Scholar
Sekanina, L. Neural architecture search and hardware accelerator co-search: a survey. IEEE Access 9, 151337–151362 (2021).
Article Google Scholar
Zhang, X., Jiang, W., Shi, Y. & Hu, J. When neural architecture search meets hardware implementation: from hardware awareness to co-design. In 2019 IEEE Computer Society Annual Symposium on VLSI (ISVLSI) 25–30 (IEEE, 2019).
Efnusheva, D., Cholakoska, A. & Tentov, A. A survey of different approaches for overcoming the processor-memory bottleneck. Int. J. Comput. Sci. Inf. Technol. 9, 151–163 (2017).
Google Scholar
Liu, B. et. al. Hardware acceleration for neuromorphic computing: An evolving view. In 15th Non-Volatile Memory Technology Symposium (NVMTS) 1–4 (IEEE, 2015).
Yantır, H. E., Eltawil, A. M. & Salama, K. N. IMCA: an efficient in-memory convolution accelerator. IEEE Trans. Very Large Scale Integr. Syst. 29, 447–460 (2021).
Article Google Scholar
Fouda, M. E., Yantır, H. E., Eltawil, A. M. & Kurdahi, F. In-memory associative processors: tutorial, potential, and challenges. IEEE Trans. Circuits Syst. II Express Briefs 69, 2641–2647 (2022).
Google Scholar
Yantır, H. E., Eltawil, A. M. & Salama, K. N. A hardware/software co-design methodology for in-memory processors. J. Parallel Distrib. Comput. 161, 63–71 (2022).
Article Google Scholar
Lotfi-Kamran, P. et al. Scale-out processors. ACM SIGARCH Comput. Archit. N. 40, 500–511 (2012).
Article Google Scholar
Ali, M. et al. Compute-in-memory technologies and architectures for deep learning workloads. IEEE Trans. Very Large Scale Integr. Syst. 30, 1615–1630 (2022).
Article Google Scholar
Ielmini, D. & Pedretti, G. Device and circuit architectures for in‐memory computing. Adv. Intell. Syst. 2, 2000040 (2020). This work provides an extensive overview of in-memory computing devices.
Article Google Scholar
Lanza, M. et al. Memristive technologies for data storage, computation, encryption, and radio-frequency communication. Science 376, eabj9979 (2022). This work reviews in-memory computing devices, related computations and their applications.
Article Google Scholar
Mannocci, P. et al. In-memory computing with emerging memory devices: status and outlook. APL Mach. Learn. 1, 010902 (2023).
Article Google Scholar
Sun, Z. et al. A full spectrum of computing-in-memory technologies. Nat. Electron. https://doi.org/10.1038/s41928-023-01053-4 (2023).
Smagulova, K., Fouda, M. E., Kurdahi, F., Salama, K. N. & Eltawil, A. Resistive neural hardware accelerators. Proc. IEEE 111, 500–527 (2023). This work provides an overview of in-memory computing-based deep learning accelerators.
Article Google Scholar
Rasch, M. Neural network accelerator design with resistive crossbars: opportunities and challenges. IBM J. Res. Dev. 63, 10:11–10:13 (2019).
Google Scholar
Ankit, A., Chakraborty, I., Agrawal, A., Ali, M. & Roy, K. Circuits and architectures for in-memory computing-based machine learning accelerators. IEEE Micro 40, 8–22 (2020).
Article Google Scholar
Gebregiorgis, A. et al. A survey on memory-centric computer architectures. ACM J. Emerg. Technol. Comput. Syst. 18, 1–50 (2022).
Article Google Scholar
Aguirre, F. et al. Hardware implementation of memristor-based artificial neural networks. Nat. Commun. 15, 1974 (2024).
Article Google Scholar
Rasch, M. J. et al. Hardware-aware training for large-scale and diverse deep learning inference workloads using in-memory computing-based accelerators. Nat. Commun. 14, 5282 (2023).
Article Google Scholar
Le Gallo, M. et al. A 64-core mixed-signal in-memory compute chip based on phase-change memory for deep neural network inference. Nat. Electron. 6, 680–693 (2023).
Article Google Scholar
Fick, L., Skrzyniarz, S., Parikh, M., Henry, M. B. & Fick, D. Analog matrix processor for edge AI real-time video analytics. In 2022 IEEE International Solid-State Circuits Conference (ISSCC) 260–262 (IEEE, 2022).
Shafiee, A. et al. ISAAC: a convolutional neural network accelerator with in-situ analog arithmetic in crossbars. ACM SIGARCH Comput. Archit. Netw. 44, 14–26 (2016).
Article Google Scholar
Krishnan, G. et al. SIAM: chiplet-based scalable in-memory acceleration with mesh for deep neural networks. ACM Trans. Embedded Comput. Syst. 20, 1–24 (2021). This work provides an overview of the hierarchical system-level design of in-memory computing accelerators for deep neural networks.
Article Google Scholar
Ankit, A. et al. Panther: a programmable architecture for neural network training harnessing energy-efficient reram. IEEE Trans. Comput. 69, 1128–1142 (2020).
Article Google Scholar
Krishnan, G. et al. Impact of on-chip interconnect on in-memory acceleration of deep neural networks. ACM J. Emerg. Technol. Comput. Syst. 18, 1–22 (2021).
Article Google Scholar
Ankit, A. et al. PUMA: a programmable ultra-efficient memristor-based accelerator for machine learning inference. In Proc 24th International Conference on Architectural Support for Programming Languages and Operating Systems 715–731 (ACM, 2019).
Li, W. et al. TIMELY: pushing data movements and interfaces in PIM accelerators towards local and in time domain. In 2020 ACM/IEEE 47th Annual International Symposium on Computer Architecture (ISCA) 832–845 (IEEE, 2020).
Chen, P., Wu, M., Ma, Y., Ye, L. & Huang, R. RIMAC: an array-level ADC/DAC-free ReRAM-based in-memory DNN processor with analog cache and computation. In Proc. 28th Asia and South Pacific Design Automation Conference 228–233 (ACM, 2023).
Zhang, B. et al. PIMCA: a programmable in-memory computing accelerator for energy-efficient dnn inference. IEEE J. Solid-State Circ. 58, 1436–1449 (2022).
Article Google Scholar
Kim, D. E., Ankit, A., Wang, C. & Roy, K. SAMBA: sparsity aware in-memory computing based machine learning accelerator. IEEE Trans. Comput. 72, 2615–2627 (2023).
Jain, S. et al. A heterogeneous and programmable compute-in-memory accelerator architecture for analog-AI using dense 2-D mesh. IEEE Trans. Very Large Scale Integr. Syst. 31, 114–127 (2022).
Article Google Scholar
Wang, X. et al. TAICHI: a tiled architecture for in-memory computing and heterogeneous integration. IEEE Trans. Circ. Syst. II Express Briefs 69, 559–563 (2021).
Google Scholar
Wan, W. et al. A compute-in-memory chip based on resistive random-access memory. Nature 608, 504–512 (2022).
Article Google Scholar
Hung, J.-M. et al. A four-megabit compute-in-memory macro with eight-bit precision based on CMOS and resistive random-access memory for AI edge devices. Nat. Electron. 4, 921–930 (2021).
Article Google Scholar
Xue, C.-X. et al. A 22nm 4Mb 8b-precision ReRAM computing-in-memory macro with 11.91 to 195.7 TOPS/W for tiny AI edge devices. In IEEE International Solid-State Circuits Conference (ISSCC) 245–247 (IEEE, 2021).
Khwa, W.-S. et al. A 40-nm, 2M-cell, 8b-precision, hybrid SLC-MLC PCM computing-in-memory macro with 20.5-65.0 TOPS/W for tiny-Al edge devices. In IEEE International Solid-State Circuits Conference (ISSCC) 1–3 (IEEE, 2022).
Jia, H. et al. Scalable and programmable neural network inference accelerator based on in-memory computing. IEEE J. Solid State Circuits 57, 198–211 (2021).
Article Google Scholar
Jung, S. et al. A crossbar array of magnetoresistive memory devices for in-memory computing. Nature 601, 211–216 (2022).
Article Google Scholar
Krestinskaya, O., Zhang, L. & Salama, K. N. Towards efficient RRAM-based quantized neural networks hardware: state-of-the-art and open issues. In IEEE 22nd International Conference on Nanotechnology (NANO) 465–468 (IEEE, 2022).
Joshi, V. et al. Accurate deep neural network inference using computational phase-change memory. Nat. Commun. 11, 2473 (2020).
Article Google Scholar
Cao, T. et al. A non-idealities aware software–hardware co-design framework for edge-ai deep neural network implemented on memristive crossbar. IEEE J. Emerg. Sel. Top. Circuits Syst. 12, 934–943 (2022).
Article Google Scholar
Wen, W., Wu, C., Wang, Y., Chen, Y. & Li, H. Learning structured sparsity in deep neural networks. Adv. Neural Inf. Proc. Syst. 29, 2074–2082 (2016).
Google Scholar
Peng, J. et al. CMQ: crossbar-aware neural network mixed-precision quantization via differentiable architecture search. IEEE Trans. Comput. Des. Integr. Circuits Syst. 41, 4124–4133 (2022).
Article Google Scholar
Huang, S. et al. Mixed precision quantization for ReRAM-based DNN inference accelerators. In Proc. 26th Asia and South Pacific Design Automation Conference 372–377 (ACM, 2021).
Meng, F.-H., Wang, X., Wang, Z., Lee, E. Y.-J. & Lu, W. D. Exploring compute-in-memory architecture granularity for structured pruning of neural networks. IEEE J. Emerg. Sel. Top. Circuits Syst. 12, 858–866 (2022).
Article Google Scholar
Krestinskaya, O., Zhang, L. & Salama, K. N. Towards efficient in-memory computing hardware for quantized neural networks: state-of-the-art, open challenges and perspectives. IEEE Trans. Nanotechnol. 22, 377–386 (2023).
Article Google Scholar
Li, Y., Dong, X. & Wang, W. Additive powers-of-two quantization: an efficient non-uniform discretization for neural networks. In International Conference on Learning Representations (ICRL) (ICRL, 2020).
Karimzadeh, F., Yoon, J.-H. & Raychowdhury, A. Bits-net: bit-sparse deep neural network for energy-efficient RRAM-based compute-in-memory. IEEE Trans. Circuits Syst. I: Regul. Pap. 69, 1952–1961 (2022).
Article Google Scholar
Yang, H., Duan, L., Chen, Y. & Li, H. BSQ: exploring bit-level sparsity for mixed-precision neural network quantization. In International Conference on Learning Representations (ICRL) (ICRL, 2020).
Qu, S. et al. RaQu: an automatic high-utilization CNN quantization and mapping framework for general-purpose RRAM accelerator. In 2020 57th ACM/IEEE Design Automation Conference (DAC) 1–6 (IEEE, 2020).
Kang, B. et al. Genetic algorithm-based energy-aware CNN quantization for processing-in-memory architecture. IEEE J. Emerg. Sel. Top. Circuits Syst. 11, 649–662 (2021).
Article Google Scholar
Li, S., Hanson, E., Li, H. & Chen, Y. Penni: pruned kernel sharing for efficient CNN inference. In International Conference on Machine Learning 5863–5873 (PMLR, 2020).
Yang, S. et al. AUTO-PRUNE: automated DNN pruning and mapping for ReRAM-based accelerator. In Proc. ACM International Conference on Supercomputing 304–315 (ACM, 2021).
Zhang, T. et al. Autoshrink: a topology-aware NAS for discovering efficient neural architecture. In Proc. AAAI Conference on Artificial Intelligence 6829–6836 (AAAI, 2020).
Cheng, H.-P. et al. NASGEM: neural architecture search via graph embedding method. in Proc. AAAI Conference on Artificial Intelligence 7090–7098 (AAAI, 2021).
Lammie, C. et al. Design space exploration of dense and sparse mapping schemes for RRAM architectures. In 2022 IEEE International Symposium on Circuits and Systems (ISCAS) 1107–1111 (IEEE, 2022).
Fiacco, A. V. & McCormick, G. P. Nonlinear Programming: Sequential Unconstrained Minimization Techniques (SIAM, 1990).
Lu, Z. et al. NSGA-Net: neural architecture search using multi-objective genetic algorithm. In. Proc. Genetic and Evolutionary Computation Conference 419–427 (ACM, 2019).
Guo, Y. et al. Pareto-aware neural architecture generation for diverse computational budgets. In Proc. IEEE/CVF Conference on Computer Vision and Pattern Recognition 2247–2257 (IEEE, 2023).
Qu, S., Li, B., Wang, Y. & Zhang, L. ASBP: automatic structured bit-pruning for RRAM-based NN accelerator. In 2021 58th ACM/IEEE Design Automation Conference (DAC) 745–750 (IEEE, 2021).
Krestinskaya, O., Salama, K. & James, A. P. Towards hardware optimal neural network selection with multi-objective genetic search. In 2020 IEEE International Symposium on Circuits and Systems (ISCAS) 1–5 (IEEE, 2020).
Zhang, T. et al. NASRec: weight sharing neural architecture search for recommender systems. In Proc. ACM Web Conference 1199–1207 (ACM, 2023).
Stolle, K., Vogel, S., van der Sommen, F. & Sanberg, W. in Joint European Conference on Machine Learning and Knowledge Discovery in Databases 463–479 (Springer).
Li, L. & Talwalkar, A. Random search and reproducibility for neural architecture search. In Uncertainty in Artificial Intelligence 367–377 (PMLR, 2020).
Huang, H., Ma, X., Erfani, S. M. & Bailey, J. Neural architecture search via combinatorial multi-armed bandit. In 2021 International Joint Conference on Neural Networks (IJCNN) 1–8 (IEEE, 2021).
Liu, C.-H. et al. FOX-NAS: fast, on-device and explainable neural architecture search. In Proc. IEEE/CVF International Conference on Computer Vision 789–797 (IEEE, 2021).
Li, C. et al. HW-NAS-Bench: hardware-aware neural architecture search benchmark. In 2021 International Conference on Learning Representations (ICRL, 2021).
Peng, X., Huang, S., Luo, Y., Sun, X. & Yu, S. DNN+ NeuroSim: an end-to-end benchmarking framework for compute-in-memory accelerators with versatile device technologies. In 2019 IEEE International Electron Devices Meeting (IEDM) 32.35.31–32.35.34 (IEEE, 2019).
Xia, L. et al. MNSIM: Simulation platform for memristor-based neuromorphic computing system. IEEE Trans. Comput. Des. Integr. Circuits Syst. 37, 1009–1022 (2017).
Google Scholar
Zhu, Z. et al. MNSIM 2.0: a behavior-level modeling tool for memristor-based neuromorphic computing systems. In Proc. 2020 on Great Lakes Symposium on VLSI 83–88 (ACM, 2020).
Rasch, M. J. et al. A flexible and fast PyTorch toolkit for simulating training and inference on analog crossbar arrays. In 2021 IEEE 3rd International Conference on Artificial Intelligence Circuits and Systems (AICAS) 1–4 (IEEE, 2021).
Lee, H., Lee, S., Chong, S. & Hwang, S. J. Hardware-adaptive efficient latency prediction for nas via meta-learning. Adv. Neural Inf. Process. Syst. 34, 27016–27028 (2021).
Google Scholar
Laube, K. A., Mutschler, M. & Zell, A. What to expect of hardware metric predictors in NAS. In International Conference on Automated Machine Learning 13/11–13/15 (PMLR, 2022).
Hu, Y., Shen, C., Yang, L., Wu, Z. & Liu, Y. A novel predictor with optimized sampling method for hardware-aware NAS. In 2022 26th International Conference on Pattern Recognition (ICPR) 2114–2120 (IEEE, 2022).
Guo, Z. et al. Single path one-shot neural architecture search with uniform sampling. In Proc. Computer Vision — ECCV 2020: 16th European Conference Part XVI 16, 544–560 (Springer, 2020).
Shu, Y., Chen, Y., Dai, Z. & Low, B. K. H. Neural ensemble search via Bayesian sampling. In 38th Conference on Uncertainty in Artificial Intelligence (UAI) 1803-1812 (PMLR, 2022).
Wang, D., Li, M., Gong, C. & Chandra, V. AttentiveNAS: improving neural architecture search via attentive sampling. In Proc. IEEE/CVF Conference on Computer Vision and Pattern Recognition 6418–6427 (IEEE, 2021).
Yang, Z. & Sun, Q. Efficient resource-aware neural architecture search with dynamic adaptive network sampling. In IEEE International Symposium on Circuits and Systems (ISCAS) 1–5 (IEEE, 2021).
Lyu, B. & Wen, S. TND-NAS: towards non-differentiable objectives in differentiable neural architecture search. In Proc. 3rd International Symposium on Automation, Information and Computing (INSTICC, 2022).
Li, G., Mandal, S. K., Ogras, U. Y. & Marculescu, R. FLASH: fast neural architecture search with hardware optimization. ACM Trans. Embedded Comput. Syst. 20, 1–26 (2021).
Article Google Scholar
Krestinskaya, O., Salama, K. N. & James, A. P. Automating analogue AI chip design with genetic search. Adv. Intell. Syst. 2, 2000075 (2020).
Article Google Scholar
Yan, Z., Juan, D.-C., Hu, X. S. & Shi, Y. Uncertainty modeling of emerging device based computing-in-memory neural accelerators with application to neural architecture search. In Proc. 26th Asia and South Pacific Design Automation Conference 859–864 (ACM, 2021).
Yang, X. et al. Multi-objective optimization of ReRAM crossbars for robust DNN inferencing under stochastic noise. In 2021 IEEE/ACM International Conference On Computer Aided Design (ICCAD) 1–9 (IEEE, 2021).
Chitty-Venkata, K. T., Emani, M., Vishwanath, V. & Somani, A. K. Neural architecture search for transformers: a survey. IEEE Access 10, 108374–108412 (2022).
Article Google Scholar
Oloulade, B. M., Gao, J., Chen, J., Lyu, T. & Al-Sabri, R. Graph neural architecture search: a survey. Tsinghua Sci. Technol. 27, 692–708 (2021).
Article Google Scholar
Al-Sabri, R., Gao, J., Chen, J., Oloulade, B. M. & Lyu, T. Multi-view graph neural architecture search for biomedical entity and relation extraction. IEEE/ACM Trans. Comput. Biol. Bioinform. 20, 1221–1233 (2022).
Article Google Scholar
Klyuchnikov, N. et al. NAS-Bench-NLP: neural architecture search benchmark for natural language processing. IEEE Access 10, 45736–45747 (2022).
Article Google Scholar
Chitty-Venkata, K. T., Emani, M., Vishwanath, V. & Somani, A. K. Neural architecture search benchmarks: insights and survey. IEEE Access 11, 25217–25236 (2023).
Article Google Scholar
Ying, C. et al. NAS-Bench-101: towards reproducible neural architecture search. In International Conference on Machine Learning 7105–7114 (PMLR, 2019).
Dong, X. & Yang, Y. NAS-Bench-201: extending the scope of reproducible neural architecture search. In 2020 International Conference on Learning Representations (ICLR) (ICLR, 2020).
Tu, R. et al. NAS-Bench-360: benchmarking neural architecture search on diverse tasks. Adv. Neural Inf. Process. Syst. 35, 12380–12394 (2022).
Google Scholar
Real, E., Liang, C., So, D. & Le, Q. AutoML-Zero: evolving machine learning algorithms from scratch. In International Conference on Machine Learning 8007–8019 (PMLR, 2020).
Chakraborty, I., Ali, M. F., Kim, D. E., Ankit, A. & Roy, K. GENIEx: a generalized approach to emulating non-ideality in memristive xbars using neural networks. In 2020 57th ACM/IEEE Design Automation Conference (DAC) 1–6 (IEEE, 2020).
Houshmand, P. et al. Assessment and optimization of analog-in-memory-compute architectures for DNN processing. In IEEE International Electron Devices Meeting (IEEE, 2020).
Zhou, C. et al. ML-HW co-design of noise-robust tinyml models and always-on analog compute-in-memory edge accelerator. IEEE Micro 42, 76–87 (2022).
Article Google Scholar
Mei, L., Houshmand, P., Jain, V., Giraldo, S. & Verhelst, M. ZigZag: enlarging joint architecture-mapping design space exploration for DNN accelerators. IEEE Trans. Comput. 70, 1160–1174 (2021).
Article Google Scholar
Ghose, S., Boroumand, A., Kim, J. S., Gómez-Luna, J. & Mutlu, O. Processing-in-memory: a workload-driven perspective. IBM J. Res. Dev. 63, 1–3 (2019).
Article Google Scholar
Liu, R. et al. FeCrypto: instruction set architecture for cryptographic algorithms based on FeFET-based in-memory computing. IEEE Trans. Comput. Aided Des. Integr. Circuits Syst. 42, 2889–2902 (2023).
Article Google Scholar
Mambu, K., Charles, H.-P. & Kooli, M. Dedicated instruction set for pattern-based data transfers: an experimental validation on systems containing in-memory computing units. IEEE Trans. Comput. Aided Design Integr. Circuits Syst. 42, 3757–3767 (2023).
Article Google Scholar
Jiang, N. et al. A detailed and flexible cycle-accurate network-on-chip simulator. In 2013 IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS) 86–96 (IEEE, 2013).
Jiang, H., Huang, S., Peng, X. & Yu, S. MINT: mixed-precision RRAM-based in-memory training architecture. In 2020 IEEE International Symposium on Circuits and Systems (ISCAS) 1–5 (IEEE, 2020).

Download references

Acknowledgements

This work was supported by the King Abdullah University of Science and Technology through the Competitive Research Grant program under grant URF/1/4704-01-01.

Author information

Authors and Affiliations

Computer, Electrical and Mathematical Sciences and Engineering Division, King Abdullah University of Science and Technology, Thuwal, Saudi Arabia
Olga Krestinskaya, Suhaib A. Fahmy, Ahmed Eltawil & Khaled N. Salama
Rain Neuromorphics, San Francisco, CA, USA
Mohammed E. Fouda
IBM Research Europe, Ruschlikon, Switzerland
Hadjer Benmeziane & Abu Sebastian
IBM T. J. Watson Research Center, Yorktown Heights, NY, USA
Kaoutar El Maghraoui
Electrical Engineering and Computer Science Department, University of Michigan, Ann Arbor, MI, USA
Wei D. Lu
Physical Science and Engineering Division, King Abdullah University of Science and Technology, Thuwal, Saudi Arabia
Mario Lanza
Electrical and Computer Engineering Department, Duke University, Durham, NC, USA
Hai Li
Electrical Engineering and Computer Science Department, University of California at Irvine, Irvine, CA, USA
Fadi Kurdahi

Authors

Olga Krestinskaya
View author publications
You can also search for this author in PubMed Google Scholar
Mohammed E. Fouda
View author publications
You can also search for this author in PubMed Google Scholar
Hadjer Benmeziane
View author publications
You can also search for this author in PubMed Google Scholar
Kaoutar El Maghraoui
View author publications
You can also search for this author in PubMed Google Scholar
Abu Sebastian
View author publications
You can also search for this author in PubMed Google Scholar
Wei D. Lu
View author publications
You can also search for this author in PubMed Google Scholar
Mario Lanza
View author publications
You can also search for this author in PubMed Google Scholar
Hai Li
View author publications
You can also search for this author in PubMed Google Scholar
Fadi Kurdahi
View author publications
You can also search for this author in PubMed Google Scholar
Suhaib A. Fahmy
View author publications
You can also search for this author in PubMed Google Scholar
Ahmed Eltawil
View author publications
You can also search for this author in PubMed Google Scholar
Khaled N. Salama
View author publications
You can also search for this author in PubMed Google Scholar

Contributions

O.K. researched data and wrote the article. O.K., M.E.F., K.E.M., A.S., A.M.E. and K.N.S. contributed substantially to discussion of the content. O.K., M.E.F., H.B., K.E.M., A.S., W.D.L., M.L., H.L., F.K., S.A.F., A.M.E. and K.N.S. reviewed and/or edited the manuscript before submission.

Corresponding authors

Correspondence to Olga Krestinskaya or Khaled N. Salama.

Ethics declarations

Competing interests

The authors declare no competing interests.

Peer review

Peer review information

Nature Reviews Electrical Engineering thanks Arun Somani and Zheyu Yan for their contribution to the peer review of this work.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary information

Supplementary Information

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

Reprints and permissions

About this article

Cite this article

Krestinskaya, O., Fouda, M.E., Benmeziane, H. et al. Neural architecture search for in-memory computing-based deep learning accelerators. Nat Rev Electr Eng 1, 374–390 (2024). https://doi.org/10.1038/s44287-024-00052-7

Download citation

Accepted: 23 April 2024
Published: 20 May 2024
Issue Date: June 2024
DOI: https://doi.org/10.1038/s44287-024-00052-7

Neural architecture search for in-memory computing-based deep learning accelerators

Subjects

Abstract

Key points

Access options

Similar content being viewed by others

Hardware-aware training for large-scale and diverse deep learning inference workloads using in-memory computing-based accelerators

A full-stack memristor-based computation-in-memory system with software-hardware co-development

Memory devices and applications for in-memory computing

References

Acknowledgements

Author information

Authors and Affiliations

Contributions

Corresponding authors

Ethics declarations

Competing interests

Peer review

Peer review information

Additional information

Supplementary information

Supplementary Information

Rights and permissions

About this article

Cite this article

Search

Quick links

Subjects

Abstract

Key points

Access options

Similar content being viewed by others

Hardware-aware training for large-scale and diverse deep learning inference workloads using in-memory computing-based accelerators

A full-stack memristor-based computation-in-memory system with software-hardware co-development

Memory devices and applications for in-memory computing

References

Acknowledgements

Author information

Authors and Affiliations

Contributions

Corresponding authors

Ethics declarations

Competing interests

Peer review

Peer review information

Additional information

Supplementary information

Supplementary Information

Rights and permissions

About this article

Cite this article

Share this article

Search

Quick links