DOI: 10.5555/2969033.2969100
Article

Recursive context propagation network for semantic scene labeling

Published: 08 December 2014

Abstract

We propose a deep feed-forward neural network architecture for pixel-wise semantic scene labeling. It uses a novel recursive neural network for context propagation, referred to as rCPN. The network first maps local visual features into a semantic space, then aggregates this local information bottom-up into a global representation of the entire image. A top-down pass then propagates the aggregated information back, enriching each local feature with context. As a result, information from every location in the image reaches every other location. Experimental results on the Stanford Background and SIFT Flow datasets show that the proposed method outperforms previous approaches. It is also orders of magnitude faster than previous methods: pixel-wise labeling of a 256 × 256 image, starting from raw RGB pixel values, takes only 0.07 seconds on a GPU, given a super-pixel mask whose computation takes an additional 0.3 seconds with an off-the-shelf implementation.
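
The bottom-up and top-down passes described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: it assumes a binary tree whose leaves hold super-pixel features, and it replaces the learned networks with fixed element-wise averages. The names `Node`, `leaf`, `combine`, `decombine`, and `propagate` are hypothetical, chosen only for this illustration.

```python
from dataclasses import dataclass
from typing import List, Optional

Vector = List[float]


@dataclass
class Node:
    """Node of a binary tree over super-pixels (hypothetical structure)."""
    feature: Optional[Vector] = None  # given for leaves, computed for internal nodes
    left: Optional["Node"] = None
    right: Optional["Node"] = None
    context: Optional[Vector] = None  # filled in by the top-down pass


def leaf(values: Vector) -> Node:
    return Node(feature=list(values))


def combine(a: Vector, b: Vector) -> Vector:
    # ASSUMPTION: stand-in for a learned aggregation network (element-wise mean).
    return [(x + y) / 2.0 for x, y in zip(a, b)]


def decombine(parent_context: Vector, own: Vector) -> Vector:
    # ASSUMPTION: stand-in for a learned propagation network; it mixes the
    # parent's context with the node's own bottom-up feature.
    return [(p + o) / 2.0 for p, o in zip(parent_context, own)]


def bottom_up(node: Node) -> Vector:
    """Aggregate leaf features into every internal node, up to the root."""
    if node.left is None or node.right is None:
        return node.feature
    node.feature = combine(bottom_up(node.left), bottom_up(node.right))
    return node.feature


def top_down(node: Node, parent_context: Optional[Vector] = None) -> None:
    """Push the aggregated information from the root back down to the leaves."""
    if parent_context is None:
        node.context = list(node.feature)  # the root sees the whole image
    else:
        node.context = decombine(parent_context, node.feature)
    if node.left is not None and node.right is not None:
        top_down(node.left, node.context)
        top_down(node.right, node.context)


def leaves(node: Node) -> List[Node]:
    if node.left is None or node.right is None:
        return [node]
    return leaves(node.left) + leaves(node.right)


def propagate(root: Node) -> List[Vector]:
    """Return each leaf's local feature concatenated with its context."""
    bottom_up(root)
    top_down(root)
    return [lf.feature + lf.context for lf in leaves(root)]


if __name__ == "__main__":
    # Four super-pixels with 1-D features; only the last one is non-zero.
    a, b, c, d = leaf([0.0]), leaf([0.0]), leaf([0.0]), leaf([8.0])
    root = Node(left=Node(left=a, right=b), right=Node(left=c, right=d))
    print(propagate(root))
```

In this toy run the first leaf ends up with a non-zero context even though its own feature is zero, because the fourth leaf's value reaches it through the root. That is the sense in which information from every location reaches every other location; a per-leaf classifier would then label each super-pixel from the concatenated vector.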


Published In

NIPS'14: Proceedings of the 28th International Conference on Neural Information Processing Systems - Volume 2
December 2014
3697 pages

Publisher

MIT Press

Cambridge, MA, United States

Qualifiers

  • Article

Cited By

  • (2021) Pedestrian-Aware Panoramic Video Stitching Based on a Structured Camera Array. ACM Transactions on Multimedia Computing, Communications, and Applications 17:4 (1-24). DOI: 10.1145/3460511. Online publication date: 12-Nov-2021
  • (2019) Hierarchical Scene Parsing by Weakly Supervised Learning with Image Descriptions. IEEE Transactions on Pattern Analysis and Machine Intelligence 41:3 (596-610). DOI: 10.1109/TPAMI.2018.2799846. Online publication date: 1-Mar-2019
  • (2018) Submodular field grammars. Proceedings of the 32nd International Conference on Neural Information Processing Systems (4312-4322). DOI: 10.5555/3327144.3327343. Online publication date: 3-Dec-2018
  • (2018) Improving the expressiveness of deep learning frameworks with recursion. Proceedings of the Thirteenth EuroSys Conference (1-13). DOI: 10.1145/3190508.3190530. Online publication date: 23-Apr-2018
  • (2017) Multi-path feedback recurrent neural networks for scene parsing. Proceedings of the Thirty-First AAAI Conference on Artificial Intelligence (4096-4102). DOI: 10.5555/3298023.3298163. Online publication date: 4-Feb-2017
  • (2017) Stacked Learning to Search for Scene Labeling. IEEE Transactions on Image Processing 26:4 (1887-1898). DOI: 10.1109/TIP.2017.2668218. Online publication date: 1-Apr-2017
