Research article, DOI: 10.1145/3177540.3177559

Exploration and Tradeoffs of different Kernels in FPGA Deep Learning Applications

Published: 25 March 2018

Abstract

In deep learning, efficient computational hardware has become central to the large-scale implementation and deployment of many applications. In designing such hardware, various characteristics of hardware platforms have been studied to best serve the high computational demand, high memory bandwidth, and flexibility that networks require. Beyond design space exploration of individual kernels, kernel design must be considered in the context of full system architectures, or of the combination of deep learning with other types of applications, whether video encoding/decoding or analytics, speech recognition, or the many potential applications that combine deep learning kernels with tightly integrated coprocessor architectures. Kernel sizes, on-chip and off-chip memories, numeric datatypes, and efficient compute architectures must all be merged into design choices that deliver both maximum computational efficiency and programmable flexibility.
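
To make the interplay between kernel size, memory traffic, and numeric datatype concrete, the following is a minimal Python sketch, not the paper's method. It counts multiply-accumulate operations (MACs) for a single convolution layer and estimates how many bytes would move if every weight and activation crossed the off-chip boundary exactly once. The `ConvLayer` class, the stride-1, no-padding assumption, and the single-pass traffic model are all hypothetical simplifications introduced here for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConvLayer:
    """Hypothetical convolution layer: square kernel, stride 1, no padding."""
    in_channels: int
    out_channels: int
    kernel: int       # kernel is kernel x kernel
    in_height: int
    in_width: int

    @property
    def out_height(self) -> int:
        return self.in_height - self.kernel + 1

    @property
    def out_width(self) -> int:
        return self.in_width - self.kernel + 1

    def macs(self) -> int:
        """Multiply-accumulates needed to produce every output element."""
        return (self.out_height * self.out_width * self.out_channels
                * self.in_channels * self.kernel * self.kernel)

    def weight_count(self) -> int:
        return self.out_channels * self.in_channels * self.kernel ** 2

    def activation_count(self) -> int:
        """Input plus output feature-map elements."""
        return (self.in_channels * self.in_height * self.in_width
                + self.out_channels * self.out_height * self.out_width)


def bytes_moved(layer: ConvLayer, bits: int) -> int:
    """Bytes transferred, assuming each value is moved once at `bits` per value."""
    total_bits = (layer.weight_count() + layer.activation_count()) * bits
    return (total_bits + 7) // 8  # round up to whole bytes


def arithmetic_intensity(layer: ConvLayer, bits: int) -> float:
    """MACs per byte moved; higher values mean less pressure on memory bandwidth."""
    return layer.macs() / bytes_moved(layer, bits)


if __name__ == "__main__":
    # Layer dimensions chosen arbitrarily for illustration.
    layer = ConvLayer(in_channels=64, out_channels=128, kernel=3,
                      in_height=56, in_width=56)
    for bits in (32, 16, 8):
        print(f"{bits:2d}-bit: {bytes_moved(layer, bits)} bytes, "
              f"{arithmetic_intensity(layer, bits):.1f} MACs/byte")
```

Under this simplified model, narrowing the datatype from 32 bits to 8 bits cuts traffic by a factor of four for the same MAC count. That is one reason datatype choice interacts so strongly with memory bandwidth; a real design would also have to account for on-chip reuse, tiling, and accuracy loss from quantization.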


    Published In

    ISPD '18: Proceedings of the 2018 International Symposium on Physical Design
    March 2018
    178 pages
    ISBN: 9781450356268
    DOI: 10.1145/3177540
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. FPGA
    2. deep learning
    3. quantization
    4. systolic array

    Qualifiers

    • Research-article

    Conference

    ISPD '18: International Symposium on Physical Design
    March 25 - 28, 2018
    Monterey, California, USA

    Acceptance Rates

    Overall Acceptance Rate 62 of 172 submissions, 36%
