Incorporating scene priors to dense monocular mapping

Alejo Concha¹, Wajahat Hussain¹, Luis Montano¹ & Javier Civera¹

Abstract

This paper presents a dense monocular mapping algorithm that improves on the accuracy of state-of-the-art variational and multiview stereo methods by incorporating scene priors into its formulation. Most of the improvement occurs in low-textured image regions and under low-parallax camera motions, two typical failure cases of multiview mapping. The specific priors we model are the planarity of homogeneous color regions, the repeating geometric primitives of the scene (which can be learned from data), and the Manhattan structure of indoor rooms. We evaluate our method on our own sequences and on the publicly available NYU dataset, emphasizing its strengths and weaknesses in different cases.

Acknowledgments

This research was funded by the Spanish government with the Projects DPI2012-32168, DPI2012-32100 and IPT-2012-1309-430000. We would like to thank Marta Salas for her help with the NYU dataset.

Author information

Authors and Affiliations

Universidad de Zaragoza, C/María de Luna 1, Ada Byron Building, 50018, Zaragoza, Spain
Alejo Concha, Wajahat Hussain, Luis Montano & Javier Civera

Corresponding author

Correspondence to Alejo Concha.

Additional information

This is one of several papers published in Autonomous Robots comprising the “Special Issue on Robotics Science and Systems”.

About this article

Cite this article

Concha, A., Hussain, W., Montano, L. et al. Incorporating scene priors to dense monocular mapping. Auton Robot 39, 279–292 (2015). https://doi.org/10.1007/s10514-015-9465-9

Received: 21 November 2014
Accepted: 07 July 2015
Published: 29 July 2015
Issue Date: October 2015
DOI: https://doi.org/10.1007/s10514-015-9465-9
