
RIConv++: Effective Rotation Invariant Convolutions for 3D Point Clouds Deep Learning

Published in: International Journal of Computer Vision

Abstract

3D point clouds deep learning is a promising field of research that allows a neural network to learn features of point clouds directly, making it a robust tool for solving 3D scene understanding tasks. While recent works show that point cloud convolutions can be invariant to translation and point permutation, investigations of the rotation invariance property for point cloud convolution have so far been scarce. Some existing methods perform point cloud convolutions with rotation-invariant features, but they generally do not perform as well as their translation-invariant-only counterparts. In this work, we argue that a key reason is that, compared to point coordinates, the rotation-invariant features consumed by point cloud convolution are not as distinctive. To address this problem, we propose a simple yet effective convolution operator that enhances feature distinction by designing powerful rotation-invariant features from the local regions. We consider the relationship between the point of interest and its neighbors, as well as the internal relationship of the neighbors, to largely improve the feature descriptiveness. Our network architecture can capture both local and global context by simply tuning the neighborhood size in each convolution layer. We conduct several experiments on synthetic and real-world point cloud classification, part segmentation, and shape retrieval to evaluate our method, which achieves state-of-the-art accuracy under challenging rotations.
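The core idea in the abstract (building features from relationships between a point of interest and its neighbors, rather than from raw coordinates) can be illustrated with a minimal NumPy sketch. This is a generic example of rotation-invariant local descriptors based on distances and angles, not the exact RIConv++ operator from the paper; the function name and feature choices are illustrative assumptions.

```python
import numpy as np

def rotation_invariant_features(p, neighbors):
    """Illustrative rotation-invariant descriptors for one local neighborhood.

    Distances and angles between points are preserved under any rigid
    rotation, so features built only from them are rotation invariant.
    (Sketch only; not the paper's RIConv++ operator.)
    """
    centroid = neighbors.mean(axis=0)
    ref = centroid - p                        # reference direction, rotates with the cloud
    ref_norm = np.linalg.norm(ref) + 1e-9
    feats = []
    for q in neighbors:
        d_pq = np.linalg.norm(q - p)          # distance to the point of interest
        d_qc = np.linalg.norm(q - centroid)   # internal relationship among neighbors
        cos_a = np.dot(q - p, ref) / (d_pq * ref_norm + 1e-9)  # angle to reference
        feats.append([d_pq, d_qc, cos_a])
    return np.array(feats)
```

Because every quantity is a distance or a cosine between vectors that rotate together, applying any rotation matrix `R` to `p` and all neighbors leaves the output unchanged, which is the invariance property the paper's convolution relies on.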



Acknowledgements

We thank the anonymous reviewers for their constructive comments. This research project is supported by the grant from Ningbo Research Institute of Zhejiang University (1149957B20210125), and partially supported by an internal grant from HKUST (R9429).


Additional information

Communicated by A. Hilton.



About this article


Cite this article

Zhang, Z., Hua, BS. & Yeung, SK. RIConv++: Effective Rotation Invariant Convolutions for 3D Point Clouds Deep Learning. Int J Comput Vis 130, 1228–1243 (2022). https://doi.org/10.1007/s11263-022-01601-z
