DOI: 10.1145/2744769.2744788
Research article

Accelerating real-time embedded scene labeling with convolutional networks

Published: 07 June 2015

Abstract

There is a clear trend towards deploying advanced computer vision (CV) systems in a growing number of application scenarios with strong real-time and power constraints. Brain-inspired algorithms capable of achieving record-breaking results, combined with embedded vision systems, are the best candidates for the future of CV and video systems due to their flexibility and high accuracy in image understanding. In this paper, we present an optimized convolutional network implementation suitable for real-time scene labeling on embedded platforms. We show that our algorithm can achieve up to 96 GOp/s running on the Nvidia Tegra K1 embedded SoC. We present experimental results, compare them to the state of the art, and demonstrate that for scene labeling our approach achieves a 1.5x improvement in throughput compared to a modern desktop CPU at a power budget of only 11 W.
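
For readers unfamiliar with the GOp/s metric quoted above, the following minimal sketch shows one common way to derive such a figure for a single convolutional layer: count each multiply and each add as one operation, then divide by the measured run time. The abstract does not state the counting convention or any layer dimensions, so the function names, layer sizes, and timing below are hypothetical and purely illustrative.

```python
def conv_layer_ops(n_in, n_out, k_h, k_w, out_h, out_w):
    """Operation count for one convolutional layer.

    Assumption: every multiply-accumulate counts as 2 operations
    (one multiply, one add). Bias and activation are ignored.
    """
    macs = n_in * n_out * k_h * k_w * out_h * out_w
    return 2 * macs


def throughput_gops(total_ops, seconds):
    """Throughput in GOp/s, i.e. 10^9 operations per second."""
    return total_ops / seconds / 1e9


if __name__ == "__main__":
    # Hypothetical layer, not taken from the paper:
    # 3 input maps, 16 output maps, 7x7 kernels, 240x320 output pixels.
    ops = conv_layer_ops(n_in=3, n_out=16, k_h=7, k_w=7, out_h=240, out_w=320)
    # Hypothetical measured run time of 10 ms for this layer.
    print(f"{ops} operations, {throughput_gops(ops, 0.010):.2f} GOp/s")
```

Summing `conv_layer_ops` over all layers of a network and dividing by the time per frame gives a whole-network throughput figure comparable in kind to the one reported.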


Published In

DAC '15: Proceedings of the 52nd Annual Design Automation Conference
June 2015
1204 pages
ISBN: 9781450335201
DOI: 10.1145/2744769
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. accelerator
  2. convolutional networks
  3. scene labeling

Qualifiers

  • Research-article

Funding Sources

  • armasuisse Science & Technology

Conference

DAC '15: The 52nd Annual Design Automation Conference 2015
June 7 - 11, 2015
San Francisco, California

Acceptance Rates

Overall Acceptance Rate 1,770 of 5,499 submissions, 32%

Article Metrics

  • Downloads (last 12 months): 19
  • Downloads (last 6 weeks): 1

Reflects downloads up to 29 Sep 2024

Cited By

  • (2024) ALPRI-FI: A Framework for Early Assessment of Hardware Fault Resiliency of DNN Accelerators. Electronics 13:16 (3243). DOI: 10.3390/electronics13163243. Online publication date: 15-Aug-2024
  • (2024) Design of a Convolutional Neural Network Accelerator Based on On-Chip Data Reordering. Electronics 13:5 (975). DOI: 10.3390/electronics13050975. Online publication date: 4-Mar-2024
  • (2022) Reconfigurable Bit-Serial Operation Using Toggle SOT-MRAM for High-Performance Computing in Memory Architecture. IEEE Transactions on Circuits and Systems I: Regular Papers 69:11 (4535-4545). DOI: 10.1109/TCSI.2022.3192165. Online publication date: Nov-2022
  • (2022) CUTIE: Beyond PetaOp/s/W Ternary DNN Inference Acceleration With Better-Than-Binary Energy Efficiency. IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems 41:4 (1020-1033). DOI: 10.1109/TCAD.2021.3075420. Online publication date: Apr-2022
  • (2022) BackboneAnalysis: Structured Insights into Compute Platforms from CNN Inference Latency. 2022 IEEE Intelligent Vehicles Symposium (IV) (1801-1809). DOI: 10.1109/IV51971.2022.9827260. Online publication date: 5-Jun-2022
  • (2022) Spatial Data Dependence Graph Based Pre-RTL Simulator for Convolutional Neural Network Dataflows. IEEE Access 10 (11382-11403). DOI: 10.1109/ACCESS.2022.3146413. Online publication date: 2022
  • (2022) Enabling Edge Computing Using Emerging Memory Technologies: From Device to Architecture. Frontiers of Quality Electronic Design (QED) (415-464). DOI: 10.1007/978-3-031-16344-9_11. Online publication date: 6-Sep-2022
  • (2021) Dynamic Temperature Management of Near-Sensor Processing for Energy-Efficient High-Fidelity Imaging. Sensors 21:3 (926). DOI: 10.3390/s21030926. Online publication date: 30-Jan-2021
  • (2021) Low-Power Implementation Techniques for Convolutional Neural Networks Using Precise and Active Skipping Methods. IEICE Transactions on Electronics E104.C:7 (330-337). DOI: 10.1587/transele.2020CDP0003. Online publication date: 1-Jul-2021
  • (2021) All-optical neuromorphic binary convolution with a spiking VCSEL neuron for image gradient magnitudes. Photonics Research 9:5 (B201). DOI: 10.1364/PRJ.412141. Online publication date: 14-Apr-2021
