Abstract
Neural Radiance Fields (NeRF) [31] and its successors are impressive at representing scenes and synthesizing high-quality novel views. However, most previous works fail to preserve texture details and suffer from slow training. A recent method, SNeRG [11], demonstrates that baking a trained NeRF into a Sparse Neural Radiance Grid enables real-time view synthesis at a slight sacrifice in rendering quality. In this paper, we dig into the Radiance Grid representation and present a set of improvements that together boost performance in terms of both speed and quality. First, we propose a HieRarchical Sparse Radiance Grid (HrSRG) representation that allocates higher voxel resolution to informative regions and fewer voxels elsewhere. HrSRG leverages a hierarchical voxel grid building process inspired by [30, 55] and can describe a scene at high resolution without an excessive memory footprint. Furthermore, we show that directly optimizing the voxel grid yields surprisingly good texture details in rendered images. This direct optimization is memory-friendly and requires orders of magnitude less training time than conventional NeRFs because it involves only a tiny MLP. Finally, we find that a critical factor preventing the restoration of fine details is 2D pixel misalignment across images caused by camera pose errors. We propose to use a perceptual loss to tolerate such misalignments, which improves the visual quality of the rendered images.
J. Zhang, J. Huang and B. Cai—These authors contributed equally to this work.
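To make the last point of the abstract concrete, below is a minimal sketch of a VGG-feature perceptual loss in the spirit of Johnson et al.: comparing deep feature maps rather than raw pixels penalizes small spatial shifts between a rendered patch and its ground-truth patch far less than a per-pixel loss does. The PyTorch/torchvision dependency, the layer cutoff, and the L1 distance on features are illustrative assumptions, not the paper's exact implementation.

import torch
import torch.nn.functional as F
import torchvision

class PerceptualLoss(torch.nn.Module):
    """Sketch of a frozen-VGG perceptual loss (illustrative, not the authors' code)."""

    def __init__(self, num_layers=16):
        super().__init__()
        # Keep VGG16 layers up to relu3_3 and freeze them; only the radiance
        # grid / tiny MLP being trained receives gradients through this loss.
        vgg = torchvision.models.vgg16(pretrained=True).features[:num_layers]
        for p in vgg.parameters():
            p.requires_grad = False
        self.vgg = vgg.eval()
        # ImageNet statistics expected by the pretrained VGG.
        self.register_buffer("mean", torch.tensor([0.485, 0.456, 0.406]).view(1, 3, 1, 1))
        self.register_buffer("std", torch.tensor([0.229, 0.224, 0.225]).view(1, 3, 1, 1))

    def forward(self, rendered, target):
        # rendered, target: (B, 3, H, W) image patches in [0, 1].
        feat_r = self.vgg((rendered - self.mean) / self.std)
        feat_t = self.vgg((target - self.mean) / self.std)
        # Feature-space comparison tolerates a few pixels of misalignment
        # that would dominate a per-pixel L1/L2 loss.
        return F.l1_loss(feat_r, feat_t)

In practice such a loss would typically be evaluated on randomly cropped patches of rendered and ground-truth images and combined with a standard photometric term; the weighting between the two is a tuning choice.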
References
Arandjelović, R., Zisserman, A.: Nerf in detail: learning to sample for view synthesis. arXiv preprint arXiv:2106.05264 (2021)
Barron, J.T., Mildenhall, B., Tancik, M., Hedman, P., Martin-Brualla, R., Srinivasan, P.P.: Mip-nerf: a multiscale representation for anti-aliasing neural radiance fields. In: ICCV (2021)
Bentley, J.L.: K-d trees for semidynamic point sets. In: Proceedings of the Sixth Annual Symposium on Computational Geometry, pp. 187–197 (1990)
Chen, A., et al.: Mvsnerf: fast generalizable radiance field reconstruction from multi-view stereo. In: ICCV (2021)
Davis, A., Levoy, M., Durand, F.: Unstructured light fields. In: Computer Graphics Forum, vol. 31, pp. 305–314. Wiley Online Library (2012)
Deng, B., Barron, J.T., Srinivasan, P.P.: JaxNeRF: an efficient JAX implementation of NeRF (2020). https://github.com/google-research/google-research/tree/master/jaxnerf
Gafni, G., Thies, J., Zollhofer, M., Nießner, M.: Dynamic neural radiance fields for monocular 4d facial avatar reconstruction. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 8649–8658 (2021)
Garbin, S.J., Kowalski, M., Johnson, M., Shotton, J., Valentin, J.: Fastnerf: high-fidelity neural rendering at 200fps. arXiv preprint arXiv:2103.10380 (2021)
Gortler, S.J., Grzeszczuk, R., Szeliski, R., Cohen, M.F.: The lumigraph. In: Proceedings of the 23rd Annual Conference on Computer Graphics and Interactive Techniques, pp. 43–54 (1996)
Hedman, P., Philip, J., Price, T., Frahm, J.M., Drettakis, G., Brostow, G.: Deep blending for free-viewpoint image-based rendering. ACM Trans. Graph. (TOG) 37(6), 1–15 (2018)
Hedman, P., Srinivasan, P.P., Mildenhall, B., Barron, J.T., Debevec, P.: Baking neural radiance fields for real-time view synthesis. In: ICCV (2021)
Huang, J., et al.: Adversarial texture optimization from rgb-d scans. In: CVPR, pp. 1559–1568 (2020)
Ichnowski, J., Avigal, Y., Kerr, J., Goldberg, K.: Dex-nerf: using a neural radiance field to grasp transparent objects. arXiv preprint arXiv:2110.14217 (2021)
Isola, P., Zhu, J.Y., Zhou, T., Efros, A.A.: Image-to-image translation with conditional adversarial networks. In: CVPR, pp. 1125–1134 (2017)
Johnson, J., Alahi, A., Fei-Fei, L.: Perceptual losses for real-time style transfer and super-resolution. In: Leibe, B., Matas, J., Sebe, N., Welling, M. (eds.) ECCV 2016. LNCS, vol. 9906, pp. 694–711. Springer, Cham (2016). https://doi.org/10.1007/978-3-319-46475-6_43
Kar, A., Häne, C., Malik, J.: Learning a multi-view stereo machine. arXiv preprint arXiv:1708.05375 (2017)
Kingma, D.P., Ba, J.: Adam: a method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014)
Knapitsch, A., Park, J., Zhou, Q.Y., Koltun, V.: Tanks and temples: benchmarking large-scale scene reconstruction. ACM Trans. Graph. (ToG) 36(4), 1–13 (2017)
Kutulakos, K.N., Seitz, S.M.: A theory of shape by space carving. Int. J. Comput. Vis. 38(3), 199–218 (2000). https://doi.org/10.1023/A:1008191222954
Larsen, A.B.L., Sønderby, S.K., Larochelle, H., Winther, O.: Autoencoding beyond pixels using a learned similarity metric. In: ICML, pp. 1558–1566. PMLR (2016)
Ledig, C., et al.: Photo-realistic single image super-resolution using a generative adversarial network. In: CVPR, pp. 4681–4690 (2017)
Levoy, M., Hanrahan, P.: Light field rendering. In: Proceedings of the 23rd Annual Conference on Computer Graphics and Interactive Techniques, pp. 31–42 (1996)
Li, Z., Niklaus, S., Snavely, N., Wang, O.: Neural scene flow fields for space-time view synthesis of dynamic scenes. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 6498–6508 (2021)
Lin, C.H., Ma, W.C., Torralba, A., Lucey, S.: Barf: bundle-adjusting neural radiance fields. In: IEEE International Conference on Computer Vision (ICCV) (2021)
Lindell, D.B., Martel, J.N., Wetzstein, G.: Autoint: automatic integration for fast neural volume rendering. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 14556–14565 (2021)
Liu, L., Gu, J., Lin, K.Z., Chua, T.S., Theobalt, C.: Neural sparse voxel fields. In: NeurIPS (2020)
Lombardi, S., Simon, T., Saragih, J., Schwartz, G., Lehrmann, A., Sheikh, Y.: Neural volumes: learning dynamic renderable volumes from images. arXiv preprint arXiv:1906.07751 (2019)
Max, N.: Optical models for direct volume rendering. IEEE TVCG 1(2), 99–108 (1995)
Meagher, D.: Geometric modeling using octree encoding. Comput. Graphics Image Process. 19(2), 129–147 (1982)
Mescheder, L., Oechsle, M., Niemeyer, M., Nowozin, S., Geiger, A.: Occupancy networks: learning 3d reconstruction in function space. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 4460–4470 (2019)
Mildenhall, B., Srinivasan, P.P., Tancik, M., Barron, J.T., Ramamoorthi, R., Ng, R.: Nerf: representing scenes as neural radiance fields for view synthesis. In: ECCV, pp. 405–421. Springer (2020). https://doi.org/10.1145/3503250
Müller, T., Evans, A., Schied, C., Keller, A.: Instant neural graphics primitives with a multiresolution hash encoding. arXiv preprint arXiv:2201.05989 (2022)
Neff, T., et al.: Donerf: towards real-time rendering of compact neural radiance fields using depth oracle networks. arXiv preprint arXiv:2103.03231 (2021)
Ost, J., Mannan, F., Thuerey, N., Knodt, J., Heide, F.: Neural scene graphs for dynamic scenes. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 2856–2865 (2021)
Park, K., et al.: Deformable neural radiance fields. arXiv preprint arXiv:2011.12948 (2020)
Rebain, D., Jiang, W., Yazdani, S., Li, K., Yi, K.M., Tagliasacchi, A.: Derf: decomposed radiance fields. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 14153–14161 (2021)
Reiser, C., Peng, S., Liao, Y., Geiger, A.: Kilonerf: speeding up neural radiance fields with thousands of tiny mlps. arXiv preprint arXiv:2103.13744 (2021)
Schönberger, J.L., Frahm, J.M.: Structure-from-motion revisited. In: Conference on Computer Vision and Pattern Recognition (CVPR) (2016)
Schönberger, J.L., Zheng, E., Frahm, J.-M., Pollefeys, M.: Pixelwise view selection for unstructured multi-view stereo. In: Leibe, B., Matas, J., Sebe, N., Welling, M. (eds.) ECCV 2016. LNCS, vol. 9907, pp. 501–518. Springer, Cham (2016). https://doi.org/10.1007/978-3-319-46487-9_31
Seitz, S.M., Dyer, C.R.: Photorealistic scene reconstruction by voxel coloring. Int. J. Comput. Vis. 35(2), 151–173 (1999). https://doi.org/10.1023/A:1008176507526
Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. In: ICLR (2015)
Sitzmann, V., Chan, E.R., Tucker, R., Snavely, N., Wetzstein, G.: Metasdf: meta-learning signed distance functions. arXiv preprint arXiv:2006.09662 (2020)
Sitzmann, V., Thies, J., Heide, F., Nießner, M., Wetzstein, G., Zollhofer, M.: Deepvoxels: learning persistent 3d feature embeddings. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 2437–2446 (2019)
Sun, C., Sun, M., Chen, H.T.: Direct voxel grid optimization: super-fast convergence for radiance fields reconstruction. arXiv preprint arXiv:2111.11215 (2021)
Tancik, M., et al.: Learned initializations for optimizing coordinate-based neural representations. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 2846–2855 (2021)
Thies, J., Zollhöfer, M., Nießner, M.: Deferred neural rendering: image synthesis using neural textures. ACM Trans. Graph. (TOG) 38(4), 1–12 (2019)
Trevithick, A., Yang, B.: Grf: learning a general radiance field for 3d scene representation and rendering. arXiv:2010.04595 (2020)
Tulsiani, S., Zhou, T., Efros, A.A., Malik, J.: Multi-view supervision for single-view reconstruction via differentiable ray consistency. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 2626–2634 (2017)
Verbin, D., Hedman, P., Mildenhall, B., Zickler, T., Barron, J.T., Srinivasan, P.P.: Ref-nerf: structured view-dependent appearance for neural radiance fields. arXiv preprint arXiv:2112.03907 (2021)
Wang, Q., et al.: Ibrnet: learning multi-view image-based rendering. In: CVPR, pp. 4690–4699 (2021)
Wang, Z., Bovik, A.C., Sheikh, H.R., Simoncelli, E.P.: Image quality assessment: from error visibility to structural similarity. IEEE Trans. Image Process. 13(4), 600–612 (2004)
Wang, Z., Wu, S., Xie, W., Chen, M., Prisacariu, V.A.: NeRF\(-\): neural radiance fields without known camera parameters. arXiv preprint arXiv:2102.07064 (2021)
Wei, Y., Liu, S., Rao, Y., Zhao, W., Lu, J., Zhou, J.: Nerfingmvs: guided optimization of neural radiance fields for indoor multi-view stereo. In: Proceedings of the IEEE/CVF International Conference on Computer Vision, pp. 5610–5619 (2021)
Yu, A., Fridovich-Keil, S., Tancik, M., Chen, Q., Recht, B., Kanazawa, A.: Plenoxels: radiance fields without neural networks. arXiv preprint arXiv:2112.05131 (2021)
Yu, A., Li, R., Tancik, M., Li, H., Ng, R., Kanazawa, A.: PlenOctrees for real-time rendering of neural radiance fields. In: ICCV (2021)
Yu, A., Ye, V., Tancik, M., Kanazawa, A.: pixelnerf: neural radiance fields from one or few images (2020)
Zhang, R., Isola, P., Efros, A.A., Shechtman, E., Wang, O.: The unreasonable effectiveness of deep features as a perceptual metric. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 586–595 (2018)
Copyright information
© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Zhang, J. et al. (2022). Digging into Radiance Grid for Real-Time View Synthesis with Detail Preservation. In: Avidan, S., Brostow, G., Cissé, M., Farinella, G.M., Hassner, T. (eds) Computer Vision – ECCV 2022. ECCV 2022. Lecture Notes in Computer Science, vol 13675. Springer, Cham. https://doi.org/10.1007/978-3-031-19784-0_42
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-19783-3
Online ISBN: 978-3-031-19784-0
eBook Packages: Computer Science, Computer Science (R0)