Abstract
Deep learning on 3D point clouds is a promising field of research that allows a neural network to learn features of point clouds directly, making it a robust tool for solving 3D scene understanding tasks. While recent works show that point cloud convolutions can be invariant to translation and point permutation, investigations of the rotation invariance property for point cloud convolution have so far been scarce. Although some existing methods perform point cloud convolutions with rotation-invariant features, they generally do not perform as well as their translation-invariant-only counterparts. In this work, we argue that a key reason is that, compared to point coordinates, the rotation-invariant features consumed by point cloud convolution are not as distinctive. To address this problem, we propose a simple yet effective convolution operator that enhances feature distinction by designing powerful rotation-invariant features from the local regions. We consider the relationship between the point of interest and its neighbors, as well as the internal relationships among the neighbors, to largely improve the feature descriptiveness. Our network architecture can capture both local and global context by simply tuning the neighborhood size in each convolution layer. We conduct several experiments on synthetic and real-world point cloud classification, part segmentation, and shape retrieval to evaluate our method, which achieves state-of-the-art accuracy under challenging rotations.
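The core idea of rotation-invariant features can be illustrated with a minimal sketch: distances between points and angles between difference vectors are unchanged by any rigid rotation, so descriptors built from them are rotation invariant by construction. The snippet below is an illustrative example of this principle only, not the authors' exact RIConv++ feature design; the function name and the choice of distances and angles are assumptions for demonstration.

```python
import numpy as np

def rotation_invariant_features(reference, neighbors):
    """Illustrative rotation-invariant descriptors for one local region.

    `reference` is the point of interest, shape (3,); `neighbors` is (k, 3).
    Each neighbor is described by its distance to the reference point, its
    distance to the neighborhood centroid, and the cosine of the angle it
    subtends between the two — all quantities preserved under rotation.
    """
    centroid = neighbors.mean(axis=0)
    feats = []
    for p in neighbors:
        d_ref = np.linalg.norm(p - reference)   # neighbor-to-reference distance
        d_cen = np.linalg.norm(p - centroid)    # neighbor-to-centroid distance
        # angle at the neighbor between directions to reference and centroid
        v1, v2 = reference - p, centroid - p
        denom = np.linalg.norm(v1) * np.linalg.norm(v2)
        cos_a = np.dot(v1, v2) / denom if denom > 1e-12 else 1.0
        feats.append([d_ref, d_cen, cos_a])
    return np.asarray(feats)  # (k, 3) rotation-invariant features

# Rotating the whole region leaves the features unchanged:
rng = np.random.default_rng(0)
pts = rng.normal(size=(8, 3))
ref, nbrs = pts[0], pts[1:]
Q, _ = np.linalg.qr(rng.normal(size=(3, 3)))   # a random orthogonal matrix
f1 = rotation_invariant_features(ref, nbrs)
f2 = rotation_invariant_features(ref @ Q.T, nbrs @ Q.T)
print(np.allclose(f1, f2))  # True: features survive the rotation
```

Features of this kind are what a rotation-invariant convolution consumes in place of raw coordinates; the paper's contribution lies in choosing such features to be as distinctive as possible.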
Acknowledgements
We thank the anonymous reviewers for their constructive comments. This research project is supported by a grant from the Ningbo Research Institute of Zhejiang University (1149957B20210125), and partially supported by an internal grant from HKUST (R9429).
Additional information
Communicated by A. Hilton.
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Cite this article
Zhang, Z., Hua, BS. & Yeung, SK. RIConv++: Effective Rotation Invariant Convolutions for 3D Point Clouds Deep Learning. Int J Comput Vis 130, 1228–1243 (2022). https://doi.org/10.1007/s11263-022-01601-z