Colorful 3d reconstruction from a single image based on deep learning

Authors:

Qiaosheng Feng

ACAI '20: Proceedings of the 2020 3rd International Conference on Algorithms, Computing and Artificial Intelligence

Article No.: 25, Pages 1 - 7

https://doi.org/10.1145/3446132.3446157

Published: 09 March 2021

Abstract

Simultaneously recovering a 3D shape and its surface color from a single image is very challenging. In this paper, we substantially improve Soft Rasterizer, a state-of-the-art method for colored 3D object reconstruction. The model adopts an encoder-decoder structure and takes a single image as input. First, the encoder extracts features, which are sent simultaneously to a shape generator and a color generator to obtain the shape estimate and the corresponding surface color; the final colorful 3D model is then rendered by a differentiable renderer. To preserve the details of the reconstructed 3D model, we introduce an attention mechanism into the encoder, which further improves the reconstruction quality. For surface color reconstruction, we propose a combination loss. Experimental results show that, compared with the 3D reconstruction networks 3D-R2N2 and OccNet, our model increases the intersection-over-union (IoU) by 10% and 3%, respectively. Compared with the open-source project SoftRas_O, our model increases structural similarity (SSIM) by 3.8% and decreases mean square error (MSE) by 1.2%.
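The two simpler metrics named in the abstract can be illustrated with a minimal sketch. It assumes that shapes are compared as voxel occupancy grids flattened to lists and that rendered and reference images are flattened to pixel intensities in [0, 1]. The function names, the 0.5 occupancy threshold, and the convention for two empty grids are illustrative assumptions, not the paper's evaluation code. SSIM is omitted because it relies on windowed local statistics.

```python
def voxel_iou(pred, target, threshold=0.5):
    """Intersection-over-union of two flattened occupancy grids.

    The threshold of 0.5 is an assumed binarization cutoff.
    """
    if len(pred) != len(target):
        raise ValueError("grids must contain the same number of voxels")
    pred_occ = [p > threshold for p in pred]
    target_occ = [t > threshold for t in target]
    intersection = sum(p and t for p, t in zip(pred_occ, target_occ))
    union = sum(p or t for p, t in zip(pred_occ, target_occ))
    # Assumed convention: two empty grids count as a perfect match.
    return intersection / union if union else 1.0


def mean_squared_error(rendered, reference):
    """Mean of squared per-pixel differences between two flattened images."""
    if len(rendered) != len(reference):
        raise ValueError("images must contain the same number of pixels")
    if not rendered:
        raise ValueError("images must not be empty")
    return sum((a - b) ** 2 for a, b in zip(rendered, reference)) / len(rendered)


if __name__ == "__main__":
    # One voxel is shared, three are occupied in at least one grid.
    print(voxel_iou([0.9, 0.8, 0.1, 0.2], [1, 0, 1, 0]))
    # Only the middle pixel differs, by 0.5.
    print(mean_squared_error([0.0, 0.5, 1.0], [0.0, 1.0, 1.0]))
```

A higher IoU indicates closer agreement in shape, while a lower MSE indicates a closer match between the rendered and reference colors, which is why the abstract reports an increase for the former and a decrease for the latter.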

References

[1]

Angel X Chang, Thomas Funkhouser, Leonidas Guibas, Pat Hanrahan, Qixing Huang, Zimo Li, Silvio Savarese, Manolis Savva, Shuran Song, Hao Su, et al. 2015. Shapenet: An information-rich 3d model repository. arXiv preprint arXiv:1512.03012 (2015).

[2]

Wenzheng Chen, Huan Ling, Jun Gao, Edward Smith, Jaakko Lehtinen, Alec Jacobson, and Sanja Fidler. 2019. Learning to predict 3d objects with an interpolation-based differentiable renderer. In Advances in Neural Information Processing Systems. 9609–9619.

[3]

Christopher B Choy, Danfei Xu, JunYoung Gwak, Kevin Chen, and Silvio Savarese. 2016. 3d-r2n2: A unified approach for single and multi-view 3d object reconstruction. In European conference on computer vision. Springer, 628–644.

[4]

Haoqiang Fan, Hao Su, and Leonidas J Guibas. 2017. A point set generation network for 3d object reconstruction from a single image. In Proceedings of the IEEE conference on computer vision and pattern recognition. 605–613.

[5]

Leon A Gatys, Alexander S Ecker, and Matthias Bethge. 2016. Image style transfer using convolutional neural networks. In Proceedings of the IEEE conference on computer vision and pattern recognition. 2414–2423.

[6]

Geoffrey E Hinton and Ruslan R Salakhutdinov. 2006. Reducing the dimensionality of data with neural networks. Science 313, 5786 (2006), 504–507.

[7]

Jie Hu, Li Shen, and Gang Sun. 2018. Squeeze-and-excitation networks. In Proceedings of the IEEE conference on computer vision and pattern recognition. 7132–7141.

[8]

Angjoo Kanazawa, Shubham Tulsiani, Alexei A Efros, and Jitendra Malik. 2018. Learning category-specific mesh reconstruction from image collections. In Proceedings of the European Conference on Computer Vision (ECCV). 371–386.

[9]

Tero Karras, Samuli Laine, and Timo Aila. 2019. A style-based generator architecture for generative adversarial networks. In Proceedings of the IEEE conference on computer vision and pattern recognition. 4401–4410.

[10]

Hiroharu Kato and Tatsuya Harada. 2019. Learning view priors for single-view 3d reconstruction. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 9778–9787.

[11]

Hiroharu Kato, Yoshitaka Ushiku, and Tatsuya Harada. 2018. Neural 3d mesh renderer. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 3907–3916.

[12]

Shichen Liu, Tianye Li, Weikai Chen, and Hao Li. 2019. Soft rasterizer: A differentiable renderer for image-based 3d reasoning. In Proceedings of the IEEE International Conference on Computer Vision. 7708–7717.

[13]

Matthew Loper, Naureen Mahmood, Javier Romero, Gerard Pons-Moll, and Michael J Black. 2015. SMPL: A skinned multi-person linear model. ACM transactions on graphics (TOG) 34, 6 (2015), 1–16.

[14]

Matthew M Loper and Michael J Black. 2014. OpenDR: An approximate differentiable renderer. In European Conference on Computer Vision. Springer, 154–169.

[15]

Lars Mescheder, Michael Oechsle, Michael Niemeyer, Sebastian Nowozin, and Andreas Geiger. 2019. Occupancy networks: Learning 3d reconstruction in function space. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 4460–4470.

[16]

Ryota Natsume, Shunsuke Saito, Zeng Huang, Weikai Chen, Chongyang Ma, Hao Li, and Shigeo Morishima. 2019. Siclope: Silhouette-based clothed people. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 4480–4490.

[17]

Yongbin Sun, Ziwei Liu, Yue Wang, and Sanjay E Sarma. 2018. Im2avatar: Colorful 3d reconstruction from a single image. arXiv preprint arXiv:1804.06375 (2018).

[18]

Nanyang Wang, Yinda Zhang, Zhuwen Li, Yanwei Fu, Wei Liu, and Yu-Gang Jiang. 2018. Pixel2mesh: Generating 3d mesh models from single rgb images. In Proceedings of the European Conference on Computer Vision (ECCV). 52–67.

[19]

Zhou Wang, Alan C Bovik, Hamid R Sheikh, and Eero P Simoncelli. 2004. Image quality assessment: from error visibility to structural similarity. IEEE transactions on image processing 13, 4 (2004), 600–612.

[20]

Sanghyun Woo, Jongchan Park, Joon-Young Lee, and In So Kweon. 2018. Cbam: Convolutional block attention module. In Proceedings of the European conference on computer vision (ECCV). 3–19.

[21]

Jiajun Wu, Yifan Wang, Tianfan Xue, Xingyuan Sun, Bill Freeman, and Josh Tenenbaum. 2017. Marrnet: 3d shape reconstruction via 2.5d sketches. In Advances in neural information processing systems. 540–550.

[22]

Jiajun Wu, Chengkai Zhang, Xiuming Zhang, Zhoutong Zhang, William T Freeman, and Joshua B Tenenbaum. 2018. Learning shape priors for single-view 3d completion and reconstruction. In Proceedings of the European Conference on Computer Vision (ECCV). 646–662.

[23]

Zhirong Wu, Shuran Song, Aditya Khosla, Fisher Yu, Linguang Zhang, Xiaoou Tang, and Jianxiong Xiao. 2015. 3d shapenets: A deep representation for volumetric shapes. In Proceedings of the IEEE conference on computer vision and pattern recognition. 1912–1920.

[24]

Xiuming Zhang, Zhoutong Zhang, Chengkai Zhang, Josh Tenenbaum, Bill Freeman, and Jiajun Wu. 2018. Learning to reconstruct shapes from unseen classes. Advances in neural information processing systems 31 (2018), 2257–2268.

[25]

Hang Zhao, Orazio Gallo, Iuri Frosio, and Jan Kautz. 2016. Loss functions for image restoration with neural networks. IEEE Transactions on computational imaging 3, 1 (2016), 47–57.

[26]

Silvia Zuffi, Angjoo Kanazawa, Tanya Berger-Wolf, and Michael Black. 2019. Three-D Safari: Learning to Estimate Zebra Pose, Shape, and Texture from Images “In the Wild”. In 2019 IEEE/CVF International Conference on Computer Vision (ICCV). IEEE, 5358–5367.

Information & Contributors

Information

Published In

ACAI '20: Proceedings of the 2020 3rd International Conference on Algorithms, Computing and Artificial Intelligence

December 2020

576 pages

ISBN:9781450388115

DOI:10.1145/3446132

Copyright © 2020 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Research-article
Research
Refereed limited

Conference

ACAI 2020

ACAI 2020: 2020 3rd International Conference on Algorithms, Computing and Artificial Intelligence

December 24 - 26, 2020

Sanya, China

Acceptance Rates

Overall Acceptance Rate 173 of 395 submissions, 44%

Cited By

Kim T, Lee J, Lee K, Choe Y (2024). Single-View 3D Reconstruction Based on Gradient-Applied Weighted Loss. Journal of Electrical Engineering & Technology 19:7 (4523-4535). Online publication date: 27-Feb-2024.
https://doi.org/10.1007/s42835-024-01812-z
Pandi G, Aggarwal K (2023). Deep Learning-Based 3-D Model for the Cultural Heritage Sites in the State of Gujarat, India. Artificial Intelligence and Sustainable Computing (737-750). Online publication date: 24-Sep-2023.
https://doi.org/10.1007/978-981-99-1431-9_59
Miao Y, Jiang H, Jiang L, Tong M (2022). Research on 3D Reconstruction of Furniture Based on Differentiable Renderer. IEEE Access 10 (94312-94320). Online publication date: 2022.
https://doi.org/10.1109/ACCESS.2022.3204650
Tahir R, Sargano A, Habib Z (2021). Voxel-Based 3D Object Reconstruction from Single 2D Image Using Variational Autoencoders. Mathematics 9:18 (2288). Online publication date: 17-Sep-2021.
https://doi.org/10.3390/math9182288
