Unsupervised contrastive learning with simple transformation for 3D point cloud data

Authors:

Meili Wang

The Visual Computer, Volume 40, Issue 8

Pages 5169 - 5186

https://doi.org/10.1007/s00371-023-02921-y

Published: 31 July 2023

Abstract

Although a number of point cloud learning methods have been proposed to handle unordered points, most of them are supervised and require labels for training. By contrast, unsupervised learning of point cloud data has received much less attention to date. In this paper, we propose a simple yet effective approach for unsupervised point cloud learning. In particular, we identify a very useful transformation that generates a good contrastive version of an original point cloud; the original and its transformed version form a pair. After both pass through a shared encoder and a shared head network, the consistency between the two output representations is maximized. We introduce two variants of the contrastive loss to facilitate downstream classification and segmentation, respectively. To demonstrate the efficacy of our method, we conduct experiments on three downstream tasks: 3D object classification (on ModelNet40 and ModelNet10), shape part segmentation (on the ShapeNet Part dataset), and scene segmentation (on S3DIS). Comprehensive results show that our unsupervised contrastive representation learning achieves impressive outcomes in object classification and semantic segmentation. It generally outperforms current unsupervised methods and even achieves performance comparable to supervised methods.
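The pipeline outlined in the abstract (transform a cloud, pass both versions through shared networks, maximize agreement) can be illustrated with a minimal NumPy sketch. Everything specific here is an assumption made for illustration: the abstract does not name the transformation, so a rotation about the z-axis stands in for it; `toy_encoder` is a hypothetical stand-in for the shared encoder and head; and the loss is the generic NT-Xent contrastive loss, not necessarily either of the two variants proposed in the paper.

```python
import numpy as np


def rotate_z(points, angle):
    """Rotate an (N, 3) point cloud about the z-axis by `angle` radians.

    Assumption: rotation is only a placeholder for the paper's transformation.
    """
    c, s = np.cos(angle), np.sin(angle)
    rot = np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])
    return points @ rot.T


def toy_encoder(points, weights):
    """Hypothetical shared encoder: per-point linear map + ReLU, then max-pool.

    Max-pooling over points makes the output independent of point order.
    """
    return np.maximum(points @ weights, 0.0).max(axis=0)


def nt_xent(z_a, z_b, temperature=0.5):
    """Generic NT-Xent loss for B positive pairs; z_a and z_b have shape (B, D)."""
    b = z_a.shape[0]
    z = np.concatenate([z_a, z_b], axis=0)
    z = z / (np.linalg.norm(z, axis=1, keepdims=True) + 1e-12)
    sim = z @ z.T / temperature
    np.fill_diagonal(sim, -np.inf)  # a sample is never its own negative
    # Row i's positive is its counterpart in the other half of the batch.
    pos = np.concatenate([np.arange(b, 2 * b), np.arange(0, b)])
    # Numerically stable log-softmax over each row.
    m = sim.max(axis=1, keepdims=True)
    log_prob = sim - m - np.log(np.exp(sim - m).sum(axis=1, keepdims=True))
    return -log_prob[np.arange(2 * b), pos].mean()


# Usage: four random clouds, each paired with a rotated copy of itself.
rng = np.random.default_rng(0)
clouds = rng.normal(size=(4, 128, 3))
weights = rng.normal(size=(3, 16))  # shared by both branches
angles = rng.uniform(0.0, 2.0 * np.pi, size=4)

z_a = np.stack([toy_encoder(c, weights) for c in clouds])
z_b = np.stack([toy_encoder(rotate_z(c, a), weights)
                for c, a in zip(clouds, angles)])
loss = nt_xent(z_a, z_b)
```

In a real training loop the encoder weights would be updated to lower `loss`, pulling each cloud's representation toward that of its transformed copy and away from the other clouds in the batch.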

Index Terms

Unsupervised contrastive learning with simple transformation for 3D point cloud data
1. Computing methodologies
  1. Artificial intelligence
    1. Computer vision
  2. Machine learning

Index terms have been assigned to the content through auto-classification.

Information & Contributors

Information

Published In

The Visual Computer: International Journal of Computer Graphics Volume 40, Issue 8

Aug 2024

782 pages

© The Author(s), under exclusive licence to Springer-Verlag GmbH Germany, part of Springer Nature 2023. Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

Publisher

Springer-Verlag

Berlin, Heidelberg

Publication History

Published: 31 July 2023

Accepted: 28 May 2023

Qualifiers

Research-article

Funding Sources

National Natural Science Foundation of China
Key Research and Development Projects of Shaanxi Province
Science and Technology Innovation Program of Shaanxi Academy of Forestry Science
