Abstract
This paper tackles the challenge of updating a radiance field after object removal using 3D Gaussian Splatting. The main difficulties lie in preserving geometric consistency and maintaining texture coherence given the highly discrete nature of Gaussian primitives. We introduce a robust framework designed to overcome these obstacles. The key insight of our approach is to enhance information exchange between visible and invisible areas, facilitating content restoration in terms of both geometry and texture. Our method first optimizes the positions of Gaussian primitives to improve geometric consistency across both removed and visible areas, guided by an online registration process informed by monocular depth estimation. We then employ a novel feature propagation mechanism to bolster texture coherence, using a cross-attention design that bridges Gaussians sampled from both uncertain and certain areas. This significantly refines the texture coherence of the final radiance field. Extensive experiments validate that our method not only improves the quality of novel view synthesis for scenes undergoing object removal but also yields notable efficiency gains in training and rendering speed. Project Page: https://w-ted.github.io/publications/gscream.
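To make the feature propagation idea concrete, the following is a minimal sketch of single-head scaled dot-product cross-attention, in which features of Gaussians sampled from uncertain areas act as queries and features of Gaussians from certain areas supply keys and values. It is an illustration under stated assumptions, not the paper's implementation: the function names, the feature dimensions, the random projection matrices, and the single-head, pure-Python formulation are all hypothetical choices made for readability.

```python
import math
import random


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def softmax(row):
    """Numerically stable softmax over a single list of scores."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(uncertain_feats, certain_feats, w_q, w_k, w_v):
    """Attend from uncertain-area features (queries) to certain-area features.

    All argument names are hypothetical; the projections w_q, w_k, w_v would
    be learned in practice and are random in the demo below.
    """
    q = matmul(uncertain_feats, w_q)   # (N_uncertain, d)
    k = matmul(certain_feats, w_k)     # (N_certain, d)
    v = matmul(certain_feats, w_v)     # (N_certain, d)
    scale = math.sqrt(len(q[0]))
    k_t = [list(col) for col in zip(*k)]          # transpose to (d, N_certain)
    scores = matmul(q, k_t)                       # (N_uncertain, N_certain)
    weights = [softmax([s / scale for s in row]) for row in scores]
    return matmul(weights, v), weights


def rand_matrix(rows, cols, std=1.0):
    """Random Gaussian matrix used only to build the toy example."""
    return [[random.gauss(0.0, std) for _ in range(cols)] for _ in range(rows)]


random.seed(0)
feat_dim, attn_dim = 8, 4                      # assumed sizes for the demo
uncertain = rand_matrix(3, feat_dim)           # 3 Gaussians in the removed region
certain = rand_matrix(10, feat_dim)            # 10 Gaussians in visible regions
w_q = rand_matrix(feat_dim, attn_dim, 0.1)
w_k = rand_matrix(feat_dim, attn_dim, 0.1)
w_v = rand_matrix(feat_dim, attn_dim, 0.1)

updated, weights = cross_attention(uncertain, certain, w_q, w_k, w_v)
print(len(updated), len(updated[0]))           # one updated feature per uncertain Gaussian
```

Each row of `weights` is a probability distribution over the certain-area Gaussians, so every updated feature is a convex combination of projected features from regions with reliable observations.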
Acknowledgements
This research is supported in part by the Early Career Scheme of the Research Grants Council (RGC) of the Hong Kong SAR under grant No. 26202321, SAIL Research Project, HKUST-Zeekr Collaborative Research Fund, HKUST-WeBank Joint Lab Project, and Tencent Rhino-Bird Focused Research Program.
Copyright information
© 2025 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Wang, Y., Wu, Q., Zhang, G., Xu, D. (2025). Learning 3D Geometry and Feature Consistent Gaussian Splatting for Object Removal. In: Leonardis, A., Ricci, E., Roth, S., Russakovsky, O., Sattler, T., Varol, G. (eds) Computer Vision – ECCV 2024. ECCV 2024. Lecture Notes in Computer Science, vol 15061. Springer, Cham. https://doi.org/10.1007/978-3-031-72646-0_1
DOI: https://doi.org/10.1007/978-3-031-72646-0_1
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-72645-3
Online ISBN: 978-3-031-72646-0
eBook Packages: Computer Science, Computer Science (R0)