Recursive context propagation network for semantic scene labeling

Authors:

Abhishek Sharma,

Ming-Yu Liu

NIPS'14: Proceedings of the 28th International Conference on Neural Information Processing Systems - Volume 2

Pages 2447 - 2455

Published: 08 December 2014

Abstract

We propose a deep feed-forward neural network architecture for pixel-wise semantic scene labeling. It uses a novel recursive neural network architecture for context propagation, referred to as rCPN. It first maps the local visual features into a semantic space followed by a bottom-up aggregation of local information into a global representation of the entire image. Then a top-down propagation of the aggregated information takes place that enhances the contextual information of each local feature. Therefore, the information from every location in the image is propagated to every other location. Experimental results on Stanford background and SIFT Flow datasets show that the proposed method outperforms previous approaches. It is also orders of magnitude faster than previous methods and takes only 0.07 seconds on a GPU for pixel-wise labeling of a 256 x 256 image starting from raw RGB pixel values, given the super-pixel mask that takes an additional 0.3 seconds using an off-the-shelf implementation.
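
The two passes described in the abstract can be illustrated with a small NumPy sketch. It is not the authors' implementation. The layer sizes, the function names (`to_semantic`, `combine`, `decombine`, `label_regions`), the single-layer tanh maps, the untrained random weights, and the simple pairwise merge order used to build the binary tree are all assumptions made for illustration. The sketch only shows how one bottom-up pass followed by one top-down pass lets every region receive information from every other region.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes chosen for the sketch; they are not taken from the paper.
D_FEAT, D_SEM, N_CLASSES = 16, 8, 4


def layer(d_in, d_out):
    """Return an untrained single-layer map x -> tanh(W x + b)."""
    W = rng.standard_normal((d_out, d_in)) * 0.1
    b = np.zeros(d_out)
    return lambda x: np.tanh(W @ x + b)


to_semantic = layer(D_FEAT, D_SEM)    # local visual feature -> semantic space
combine = layer(2 * D_SEM, D_SEM)     # two children -> parent (bottom-up)
decombine = layer(2 * D_SEM, D_SEM)   # (node, parent context) -> node context (top-down)
# Linear classifier on the concatenation [semantic vector, context vector].
W_cls = rng.standard_normal((N_CLASSES, 2 * D_SEM)) * 0.1


def label_regions(features):
    """Label each region after one bottom-up and one top-down tree pass."""
    n = len(features)
    sem = [to_semantic(f) for f in features]

    # Bottom-up: merge adjacent nodes pairwise until a single root remains.
    nodes = list(sem)        # node id -> vector in semantic space
    children = {}            # parent id -> (left child id, right child id)
    frontier = list(range(n))
    while len(frontier) > 1:
        nxt = []
        for i in range(0, len(frontier) - 1, 2):
            left, right = frontier[i], frontier[i + 1]
            nodes.append(combine(np.concatenate([nodes[left], nodes[right]])))
            children[len(nodes) - 1] = (left, right)
            nxt.append(len(nodes) - 1)
        if len(frontier) % 2:  # an unpaired node is carried up unchanged
            nxt.append(frontier[-1])
        frontier = nxt
    root = frontier[0]

    # Top-down: push the aggregated context from the root back to every leaf.
    ctx = {root: nodes[root]}
    stack = [root]
    while stack:
        parent = stack.pop()
        for child in children.get(parent, ()):
            ctx[child] = decombine(np.concatenate([nodes[child], ctx[parent]]))
            stack.append(child)

    # Classify each leaf from its own semantic vector plus its context.
    scores = np.stack(
        [W_cls @ np.concatenate([sem[i], ctx[i]]) for i in range(n)]
    )
    return scores.argmax(axis=1), ctx


features = np.random.default_rng(1).standard_normal((5, D_FEAT))
labels, context = label_regions(features)
print(labels.shape)  # one label per region
```

Because the root depends on every leaf and every leaf's context depends on the root, perturbing the feature of one region changes the context vector of every other region, which is the propagation property the abstract describes.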

Index Terms

Recursive context propagation network for semantic scene labeling
1. Computer systems organization
  1. Architectures
    1. Other architectures
2. Computing methodologies
  1. Artificial intelligence
    1. Computer vision
      1. Computer vision problems
        Object detection
      2. Computer vision representations
        Image representations
  2. Machine learning
    1. Machine learning approaches
      1. Neural networks

Index terms have been assigned to the content through auto-classification.

Information & Contributors

Information

Published In

NIPS'14: Proceedings of the 28th International Conference on Neural Information Processing Systems - Volume 2

December 2014

3697 pages

Publisher

MIT Press

Cambridge, MA, United States

Qualifiers

Article
