DOI: 10.1145/3446132.3446157
research-article

Colorful 3d reconstruction from a single image based on deep learning

Published: 09 March 2021

Abstract

Simultaneously recovering the 3D shape of an object and its surface color from a single image is very challenging. In this paper, we substantially improve Soft Rasterizer, a state-of-the-art method for colorful 3D object reconstruction. The model adopts an encoder-decoder structure and takes a single image as input. First, the encoder extracts features. These features are then sent simultaneously to a shape generator and a color generator to obtain the shape estimate and the corresponding surface color. Finally, a differentiable renderer renders the colorful 3D model. To preserve the details of the reconstructed 3D model, we introduce an attention mechanism into the encoder, which further improves the reconstruction quality. For surface color reconstruction, we propose a combination loss. The experimental results show that, compared with the 3D reconstruction network models 3D-R2N2 and OccNet, our model improves intersection-over-union (IoU) by 10% and 3%, respectively. Compared with the open-source project SoftRas_O, our model improves structural similarity (SSIM) by 3.8% and reduces mean squared error (MSE) by 1.2%.
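As an illustration of two of the metrics named in the abstract, the following is a minimal sketch of IoU between occupancy grids and MSE between images. The abstract does not state the evaluation protocol, so the flattened-list inputs, the 0.5 occupancy threshold, and the function names `voxel_iou` and `mse` are assumptions made for this example, not the authors' implementation.

```python
from typing import Sequence


def voxel_iou(pred: Sequence[float], target: Sequence[float],
              threshold: float = 0.5) -> float:
    """Intersection-over-union of two flattened occupancy grids.

    The 0.5 threshold is an assumption; the abstract does not specify one.
    """
    if len(pred) != len(target):
        raise ValueError("grids must have the same number of voxels")
    p = [v > threshold for v in pred]
    t = [v > threshold for v in target]
    intersection = sum(a and b for a, b in zip(p, t))
    union = sum(a or b for a, b in zip(p, t))
    # Two empty grids agree perfectly; this also avoids dividing by zero.
    return intersection / union if union else 1.0


def mse(rendered: Sequence[float], reference: Sequence[float]) -> float:
    """Mean squared error between two flattened images of equal size."""
    if len(rendered) != len(reference):
        raise ValueError("images must have the same number of pixels")
    if not rendered:
        raise ValueError("images must not be empty")
    return sum((a - b) ** 2 for a, b in zip(rendered, reference)) / len(rendered)
```

A higher IoU indicates better shape agreement, while a lower MSE indicates a rendered image closer to the reference, which is why the abstract reports an increase for the former and a decrease for the latter.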

References

[1]
Angel X Chang, Thomas Funkhouser, Leonidas Guibas, Pat Hanrahan, Qixing Huang, Zimo Li, Silvio Savarese, Manolis Savva, Shuran Song, Hao Su, et al. 2015. Shapenet: An information-rich 3d model repository. arXiv preprint arXiv:1512.03012 (2015).
[2]
Wenzheng Chen, Huan Ling, Jun Gao, Edward Smith, Jaakko Lehtinen, Alec Jacobson, and Sanja Fidler. 2019. Learning to predict 3d objects with an interpolation-based differentiable renderer. In Advances in Neural Information Processing Systems. 9609–9619.
[3]
Christopher B Choy, Danfei Xu, JunYoung Gwak, Kevin Chen, and Silvio Savarese. 2016. 3d-r2n2: A unified approach for single and multi-view 3d object reconstruction. In European conference on computer vision. Springer, 628–644.
[4]
Haoqiang Fan, Hao Su, and Leonidas J Guibas. 2017. A point set generation network for 3d object reconstruction from a single image. In Proceedings of the IEEE conference on computer vision and pattern recognition. 605–613.
[5]
Leon A Gatys, Alexander S Ecker, and Matthias Bethge. 2016. Image style transfer using convolutional neural networks. In Proceedings of the IEEE conference on computer vision and pattern recognition. 2414–2423.
[6]
Geoffrey E Hinton and Ruslan R Salakhutdinov. 2006. Reducing the dimensionality of data with neural networks. science 313, 5786 (2006), 504–507.
[7]
Jie Hu, Li Shen, and Gang Sun. 2018. Squeeze-and-excitation networks. In Proceedings of the IEEE conference on computer vision and pattern recognition. 7132–7141.
[8]
Angjoo Kanazawa, Shubham Tulsiani, Alexei A Efros, and Jitendra Malik. 2018. Learning category-specific mesh reconstruction from image collections. In Proceedings of the European Conference on Computer Vision (ECCV). 371–386.
[9]
Tero Karras, Samuli Laine, and Timo Aila. 2019. A style-based generator architecture for generative adversarial networks. In Proceedings of the IEEE conference on computer vision and pattern recognition. 4401–4410.
[10]
Hiroharu Kato and Tatsuya Harada. 2019. Learning view priors for single-view 3d reconstruction. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 9778–9787.
[11]
Hiroharu Kato, Yoshitaka Ushiku, and Tatsuya Harada. 2018. Neural 3d mesh renderer. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 3907–3916.
[12]
Shichen Liu, Tianye Li, Weikai Chen, and Hao Li. 2019. Soft rasterizer: A differentiable renderer for image-based 3d reasoning. In Proceedings of the IEEE International Conference on Computer Vision. 7708–7717.
[13]
Matthew Loper, Naureen Mahmood, Javier Romero, Gerard Pons-Moll, and Michael J Black. 2015. SMPL: A skinned multi-person linear model. ACM transactions on graphics (TOG) 34, 6 (2015), 1–16.
[14]
Matthew M Loper and Michael J Black. 2014. OpenDR: An approximate differentiable renderer. In European Conference on Computer Vision. Springer, 154–169.
[15]
Lars Mescheder, Michael Oechsle, Michael Niemeyer, Sebastian Nowozin, and Andreas Geiger. 2019. Occupancy networks: Learning 3d reconstruction in function space. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 4460–4470.
[16]
Ryota Natsume, Shunsuke Saito, Zeng Huang, Weikai Chen, Chongyang Ma, Hao Li, and Shigeo Morishima. 2019. Siclope: Silhouette-based clothed people. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 4480–4490.
[17]
Yongbin Sun, Ziwei Liu, Yue Wang, and Sanjay E Sarma. 2018. Im2avatar: Colorful 3d reconstruction from a single image. arXiv preprint arXiv:1804.06375 (2018).
[18]
Nanyang Wang, Yinda Zhang, Zhuwen Li, Yanwei Fu, Wei Liu, and Yu-Gang Jiang. 2018. Pixel2mesh: Generating 3d mesh models from single rgb images. In Proceedings of the European Conference on Computer Vision (ECCV). 52–67.
[19]
Zhou Wang, Alan C Bovik, Hamid R Sheikh, and Eero P Simoncelli. 2004. Image quality assessment: from error visibility to structural similarity. IEEE transactions on image processing 13, 4 (2004), 600–612.
[20]
Sanghyun Woo, Jongchan Park, Joon-Young Lee, and In So Kweon. 2018. Cbam: Convolutional block attention module. In Proceedings of the European conference on computer vision (ECCV). 3–19.
[21]
Jiajun Wu, Yifan Wang, Tianfan Xue, Xingyuan Sun, Bill Freeman, and Josh Tenenbaum. 2017. Marrnet: 3d shape reconstruction via 2.5 d sketches. In Advances in neural information processing systems. 540–550.
[22]
Jiajun Wu, Chengkai Zhang, Xiuming Zhang, Zhoutong Zhang, William T Freeman, and Joshua B Tenenbaum. 2018. Learning shape priors for single-view 3d completion and reconstruction. In Proceedings of the European Conference on Computer Vision (ECCV). 646–662.
[23]
Zhirong Wu, Shuran Song, Aditya Khosla, Fisher Yu, Linguang Zhang, Xiaoou Tang, and Jianxiong Xiao. 2015. 3d shapenets: A deep representation for volumetric shapes. In Proceedings of the IEEE conference on computer vision and pattern recognition. 1912–1920.
[24]
Xiuming Zhang, Zhoutong Zhang, Chengkai Zhang, Josh Tenenbaum, Bill Freeman, and Jiajun Wu. 2018. Learning to reconstruct shapes from unseen classes. Advances in neural information processing systems 31 (2018), 2257–2268.
[25]
Hang Zhao, Orazio Gallo, Iuri Frosio, and Jan Kautz. 2016. Loss functions for image restoration with neural networks. IEEE Transactions on computational imaging 3, 1 (2016), 47–57.
[26]
Silvia Zuffi, Angjoo Kanazawa, Tanya Berger-Wolf, and Michael Black. 2019. Three-D Safari: Learning to Estimate Zebra Pose, Shape, and Texture from Images “In the Wild”. In 2019 IEEE/CVF International Conference on Computer Vision (ICCV). IEEE, 5358–5367.

Information & Contributors

Information

Published In

ACAI '20: Proceedings of the 2020 3rd International Conference on Algorithms, Computing and Artificial Intelligence
December 2020
576 pages
ISBN:9781450388115
DOI:10.1145/3446132
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. Attention mechanism
  2. Colorful 3D reconstruction
  3. Deep learning
  4. Single image
  5. Differentiable renderer

Qualifiers

  • Research-article
  • Research
  • Refereed limited

Conference

ACAI 2020

Acceptance Rates

Overall Acceptance Rate 173 of 395 submissions, 44%

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (Last 12 months): 21
  • Downloads (Last 6 weeks): 3
Reflects downloads up to 26 Nov 2024

Citations

Cited By

  • (2024) Single-View 3D Reconstruction Based on Gradient-Applied Weighted Loss. Journal of Electrical Engineering & Technology 19:7 (4523-4535). DOI: 10.1007/s42835-024-01812-z. Online publication date: 27-Feb-2024
  • (2023) Deep Learning-Based 3-D Model for the Cultural Heritage Sites in the State of Gujarat, India. Artificial Intelligence and Sustainable Computing (737-750). DOI: 10.1007/978-981-99-1431-9_59. Online publication date: 24-Sep-2023
  • (2022) Research on 3D Reconstruction of Furniture Based on Differentiable Renderer. IEEE Access 10 (94312-94320). DOI: 10.1109/ACCESS.2022.3204650. Online publication date: 2022
  • (2021) Voxel-Based 3D Object Reconstruction from Single 2D Image Using Variational Autoencoders. Mathematics 9:18 (2288). DOI: 10.3390/math9182288. Online publication date: 17-Sep-2021
