DOI: 10.1145/3664647.3681507

Learning Geometry Consistent Neural Radiance Fields from Sparse and Unposed Views

Published: 28 October 2024

Abstract

Recent progress in novel view synthesis can be attributed to the Neural Radiance Field (NeRF), which requires densely sampled images with precise camera poses. However, collecting dense input images with accurate camera poses is costly in many real-world scenarios. In this paper, we propose to learn a Geometry Consistent Neural Radiance Field (GC-NeRF) that tackles this challenge by jointly optimizing a NeRF and its corresponding camera poses from sparse (as few as 2) and unposed views. First, GC-NeRF establishes image-level geometric consistency by deriving photometric constraints from inter- and intra-view relations, which update the NeRF and the camera poses in a fine-grained manner. Then, we adopt geometry projection with camera extrinsic parameters to provide additional region-level consistency supervision, constructing pseudo-pixel labels that capture critical matching correlations. Moreover, we present an adaptive high-frequency mapping function to augment the geometry and texture information of the 3D scene. Extensive experiments on multiple challenging real-world datasets validate the effectiveness of GC-NeRF, which sets a new state of the art for learning NeRFs from sparse and unposed views.
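The geometry projection step mentioned in the abstract can be illustrated with a standard pinhole reprojection. The sketch below is not the authors' implementation; it only shows how a pixel in one view, given a depth value and the relative camera extrinsics, maps to a pixel in a second view. The function name `reproject_pixel`, the shared intrinsic matrix `K`, and the example pose are assumptions made for illustration.

```python
import numpy as np

def reproject_pixel(uv, depth, K, R_ab, t_ab):
    """Map pixel `uv` in view A (with known depth) to pixel coordinates in view B.

    K is an assumed shared 3x3 intrinsic matrix; (R_ab, t_ab) are the
    hypothetical relative extrinsics taking camera-A coordinates to camera-B.
    """
    u, v = uv
    # Back-project the pixel to a 3D point in camera A's frame.
    p_a = depth * (np.linalg.inv(K) @ np.array([u, v, 1.0]))
    # Move the point into camera B's frame using the relative extrinsics.
    p_b = R_ab @ p_a + t_ab
    # Perspective projection onto image B.
    q = K @ p_b
    return q[:2] / q[2]

# Illustrative values only: a 640x480 camera translated 0.1 units along x.
K = np.array([[500.0, 0.0, 320.0],
              [0.0, 500.0, 240.0],
              [0.0, 0.0, 1.0]])
R = np.eye(3)
t = np.array([0.1, 0.0, 0.0])

print(reproject_pixel((320.0, 240.0), 2.0, K, R, t))  # -> [345. 240.]
```

In a joint optimization setting, the color found at the reprojected location could serve as a pseudo label for the source pixel, so that errors in the estimated pose or depth show up as a photometric mismatch.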

    Published In

    MM '24: Proceedings of the 32nd ACM International Conference on Multimedia
    October 2024
    11719 pages
    ISBN:9798400706868
    DOI:10.1145/3664647
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. geometric consistency
    2. neural radiance fields
    3. sparse and unposed views
    4. volume rendering

    Qualifiers

    • Research-article

    Funding Sources

    • National Key R&D Program of China
    • Natural Science Foundation of China

    Conference

    MM '24
    MM '24: The 32nd ACM International Conference on Multimedia
    October 28 - November 1, 2024
    Melbourne VIC, Australia

    Acceptance Rates

    MM '24 Paper Acceptance Rate 1,150 of 4,385 submissions, 26%;
    Overall Acceptance Rate 2,145 of 8,556 submissions, 25%
