Casual 3D photography

Published: 20 November 2017

Abstract

We present an algorithm that enables casual 3D photography. Given a set of input photos captured with a hand-held cell phone or DSLR camera, our algorithm reconstructs a 3D photo: a central panoramic, textured, normal-mapped, multi-layered geometric mesh representation. 3D photos can be stored compactly and are optimized for rendering from viewpoints near the capture viewpoints. They can be rendered using a standard rasterization pipeline to produce perspective views with motion parallax. When viewed in VR, 3D photos provide geometrically consistent views for both eyes. Our geometric representation also allows interacting with the scene using 3D geometry-aware effects, such as adding new objects to the scene and applying artistic lighting effects.
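
To make the idea of a central panorama with depth more concrete, the following is a minimal sketch of how a panorama pixel and its depth could be turned into a 3D point, and why nearby surfaces show more motion parallax than distant ones when the viewpoint shifts. The equirectangular mapping, the function names `unproject` and `bearing_from`, and all numeric values are assumptions made for illustration; the abstract does not state which panoramic projection or data layout the authors use.

```python
import math

def unproject(u, v, depth, width, height):
    """Map an equirectangular pixel (u, v) with a depth value to a 3D point.

    Assumption: longitude spans [-pi, pi) across the width and latitude spans
    [pi/2, -pi/2] down the height, with the panorama center at the origin.
    """
    lon = (u / width) * 2.0 * math.pi - math.pi
    lat = math.pi / 2.0 - (v / height) * math.pi
    x = depth * math.cos(lat) * math.sin(lon)
    y = depth * math.sin(lat)
    z = depth * math.cos(lat) * math.cos(lon)
    return (x, y, z)

def bearing_from(eye, point):
    """Horizontal viewing angle (radians) of a point seen from a shifted eye."""
    dx = point[0] - eye[0]
    dz = point[2] - eye[2]
    return math.atan2(dx, dz)

# Two hypothetical surfaces along the same panorama direction (straight ahead),
# one close to the capture center and one far away.
near = unproject(512, 256, 1.0, 1024, 512)
far = unproject(512, 256, 10.0, 1024, 512)

# Move the eye 10 cm to the right of the panorama center and measure how far
# each surface appears to swing away from straight ahead.
eye = (0.1, 0.0, 0.0)
near_shift = abs(bearing_from(eye, near))
far_shift = abs(bearing_from(eye, far))
```

In this toy setup `near_shift` is larger than `far_shift`, which is the parallax cue that a flat panorama without depth cannot reproduce.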
Our 3D photo reconstruction algorithm starts with a standard structure-from-motion and multi-view stereo reconstruction of the scene. The dense stereo reconstruction is made robust to imperfect capture conditions using a novel near-envelope cost volume prior that discards erroneous near-depth hypotheses. We propose a novel parallax-tolerant stitching algorithm that warps the depth maps into the central panorama and stitches two color-and-depth panoramas for the front and back scene surfaces. The two panoramas are fused into a single non-redundant, well-connected geometric mesh. We provide videos demonstrating users interactively viewing and manipulating our 3D photos.
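
As a rough illustration of what a near-envelope prior on a cost volume might look like, the sketch below penalizes depth hypotheses that fall in front of a per-pixel envelope before picking the lowest-cost depth. This is not the authors' formulation: the abstract does not say how the envelope is estimated or what form the prior takes, so the hard infinite penalty, the function names `apply_near_envelope` and `best_depth`, and the sample costs are all assumptions chosen to keep the example small.

```python
def apply_near_envelope(costs, depths, near_envelope, penalty=float("inf")):
    """Penalize hypotheses closer to the camera than the near envelope.

    costs[i] is the matching cost of depth hypothesis depths[i] for one pixel.
    Assumption: the prior is a hard cutoff; a soft penalty would also fit here.
    """
    return [c + penalty if d < near_envelope else c
            for c, d in zip(costs, depths)]

def best_depth(costs, depths):
    """Winner-take-all selection: return the depth with the lowest cost."""
    i = min(range(len(costs)), key=costs.__getitem__)
    return depths[i]

# Hypothetical cost slice for one pixel: the spurious minimum at 0.5 m
# stands in for an erroneous near-depth match.
depths = [0.5, 1.0, 2.0, 4.0]
costs = [0.10, 0.40, 0.15, 0.60]

unconstrained = best_depth(costs, depths)
constrained = best_depth(
    apply_near_envelope(costs, depths, near_envelope=1.0), depths
)
```

Without the prior the pixel snaps to the spurious 0.5 m hypothesis; with the envelope set at 1.0 m that hypothesis is discarded and the 2.0 m surface wins instead.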

Published In

ACM Transactions on Graphics, Volume 36, Issue 6
December 2017
973 pages
ISSN:0730-0301
EISSN:1557-7368
DOI:10.1145/3130800
Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. 3D reconstruction
  2. image-based rendering
  3. virtual reality

Qualifiers

  • Research-article
