DOI: 10.1145/3660395.3660438 · AIBDF Conference Proceedings · Research article

FPGA implementation and verification of efficient and reconfigurable CNN-LSTM accelerator design

Published: 01 June 2024

Abstract

Verilog is used to complete the RTL modeling of a high-efficiency LSTM accelerator and a reconfigurable CNN-LSTM accelerator on an FPGA. The functional correctness of the designed accelerators is confirmed by comparing hardware results against software reference computations. Experimental results show that the proposed high-efficiency LSTM accelerator achieves a 16x speedup over the CPU at only 19.12% of the GPU's power consumption, with a throughput of 85.68 GOPS and an energy efficiency of 22.4 GOPS/W, outperforming other LSTM accelerator designs of the same type. The proposed reconfigurable CNN-LSTM accelerator achieves a 12x speedup over the CPU while consuming only about 10.02% of the GPU's power; its throughput reaches 77.5 GOPS and its energy efficiency ratio is 42.9 GOPS/W. In the same application setting, compared with the efficient LSTM accelerator, it reduces on-chip resource consumption while cutting the time needed to process one set of data by 65%.
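Although the implementation details are not in this excerpt, the quoted figures are internally consistent: energy efficiency (GOPS/W) is throughput divided by power, so each accelerator's implied board power can be back-calculated from the abstract's numbers. A minimal sketch (the function and variable names are ours, for illustration only):

```python
# Hedged sketch: back-calculate the power draw implied by the abstract's
# throughput (GOPS) and energy-efficiency (GOPS/W) figures.
# power (W) = throughput (GOPS) / efficiency (GOPS/W)

def implied_power_w(throughput_gops: float, efficiency_gops_per_w: float) -> float:
    """Return the power in watts implied by a throughput and an energy efficiency."""
    return throughput_gops / efficiency_gops_per_w

# Figures quoted in the abstract:
lstm_power = implied_power_w(85.68, 22.4)     # high-efficiency LSTM accelerator
cnn_lstm_power = implied_power_w(77.5, 42.9)  # reconfigurable CNN-LSTM accelerator

print(f"LSTM accelerator implied power:     ~{lstm_power:.2f} W")
print(f"CNN-LSTM accelerator implied power: ~{cnn_lstm_power:.2f} W")
```

The back-calculated values (roughly 3.8 W and 1.8 W) are in the range typical of embedded FPGA accelerators, which is consistent with the abstract's claim of consuming a small fraction of GPU power.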



Published In

AIBDF '23: Proceedings of the 2023 3rd Guangdong-Hong Kong-Macao Greater Bay Area Artificial Intelligence and Big Data Forum
September 2023 · 577 pages
ISBN: 9798400716362
DOI: 10.1145/3660395

Publisher

Association for Computing Machinery, New York, NY, United States

Conference

AIBDF 2023 · Research article · Refereed limited
