DOI: 10.1145/3660395.3660438 · AIBDF Conference Proceedings · Research article

FPGA implementation and verification of efficient and reconfigurable CNN-LSTM accelerator design

Published: 01 June 2024

Abstract

Verilog is used to complete the RTL modeling of a high-efficiency LSTM accelerator and a reconfigurable CNN-LSTM accelerator on an FPGA. The functional correctness of the designed accelerators is confirmed by comparing hardware results against software reference computations. Experimental results show that the proposed high-efficiency LSTM accelerator achieves a 16x speedup over the CPU at only 19.12% of the GPU's power consumption, with a throughput of 85.68 GOPS and an energy efficiency of 22.4 GOPS/W, outperforming other LSTM accelerator designs of the same type. The proposed reconfigurable CNN-LSTM accelerator achieves a 12x speedup over the CPU while consuming only about 10.02% of the GPU's power; its throughput reaches 77.5 GOPS and its energy efficiency ratio is 42.9 GOPS/W. In the same application setting, compared with the efficient LSTM accelerator, it reduces on-chip resource consumption while cutting the time needed to process one set of data by 65%.
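Although the implementation details are not in this excerpt, the quoted figures are internally consistent: energy efficiency (GOPS/W) is throughput divided by power, so each accelerator's implied board power can be back-calculated from the abstract's numbers. A minimal sketch (the function and variable names are ours, for illustration only):

```python
# Hedged sketch: back-calculate the power draw implied by the abstract's
# throughput (GOPS) and energy-efficiency (GOPS/W) figures.
# power (W) = throughput (GOPS) / efficiency (GOPS/W)

def implied_power_w(throughput_gops: float, efficiency_gops_per_w: float) -> float:
    """Return the power in watts implied by a throughput and an energy efficiency."""
    return throughput_gops / efficiency_gops_per_w

# Figures quoted in the abstract:
lstm_power = implied_power_w(85.68, 22.4)     # high-efficiency LSTM accelerator
cnn_lstm_power = implied_power_w(77.5, 42.9)  # reconfigurable CNN-LSTM accelerator

print(f"LSTM accelerator implied power:     ~{lstm_power:.2f} W")
print(f"CNN-LSTM accelerator implied power: ~{cnn_lstm_power:.2f} W")
```

The back-calculated values (roughly 3.8 W and 1.8 W) are in the range typical of embedded FPGA accelerators, which is consistent with the abstract's claim of consuming a small fraction of GPU power.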



Published In

AIBDF '23: Proceedings of the 2023 3rd Guangdong-Hong Kong-Macao Greater Bay Area Artificial Intelligence and Big Data Forum
September 2023 · 577 pages
ISBN: 9798400716362
DOI: 10.1145/3660395

Publisher

Association for Computing Machinery, New York, NY, United States

Conference

AIBDF 2023 · Research article · Refereed limited
