Abstract
Point cloud registration remains a major challenge in 3D vision, especially for end-to-end deep learning methods, where low-quality point pairs directly degrade registration accuracy. We therefore propose a point cloud registration network based on convolution fusion and a new attention mechanism that obtains high-quality point pairs and improves registration accuracy. First, we fuse kernel point convolution and adaptive point convolution through a cross-attention mechanism and use the result as the feature extraction backbone of the network. Second, we use a transformer to exchange information between the source and target point clouds; it is built on a new attention module, named ReSE-Attention, which obtains a global feature view by adding a squeeze extraction module and deep learnable parameters to the standard attention mechanism. Third, a regression decoder is used to generate the correct point pairs. Finally, we are the first to introduce Focal Loss into the loss function for point cloud registration, in order to balance overlapping and non-overlapping regions. We evaluate our approach on both the scene dataset 3DMatch and the object dataset ModelNet, where it achieves state-of-the-art performance.
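To make the role of Focal Loss in the abstract more concrete, the following is a minimal sketch of the standard binary focal loss of Lin et al. (2017), applied to per-point overlap predictions. It is not the authors' implementation: the function name `binary_focal_loss`, the pure-Python formulation, and the default values of `gamma` and `alpha` (taken from the focal loss paper, not from this article) are assumptions made for illustration.

```python
import math


def binary_focal_loss(probs, labels, gamma=2.0, alpha=0.25, eps=1e-7):
    """Mean binary focal loss over per-point overlap predictions.

    probs  : predicted probability that each point lies in the overlap region
    labels : 1 for overlapping points, 0 for non-overlapping points
    gamma  : focusing parameter; larger values down-weight easy points more
    alpha  : weight of the positive (overlapping) class
    All names and defaults are illustrative assumptions.
    """
    if len(probs) == 0 or len(probs) != len(labels):
        raise ValueError("probs and labels must be non-empty and of equal length")

    total = 0.0
    for p, y in zip(probs, labels):
        # Clamp to avoid log(0) for saturated predictions.
        p = min(max(p, eps), 1.0 - eps)
        # p_t is the probability assigned to the true class.
        p_t = p if y == 1 else 1.0 - p
        alpha_t = alpha if y == 1 else 1.0 - alpha
        # The (1 - p_t)^gamma factor shrinks the loss of well-classified points.
        total += -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)
    return total / len(probs)
```

With `gamma=0` and `alpha=0.5` the expression reduces to half of the ordinary binary cross-entropy, which shows that the focusing term is the only change: confidently classified points contribute little, so the less frequent or harder class (here, points in the overlap region) is not drowned out by the rest.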
Availability of Data and Materials
The datasets generated during the current study are available from the corresponding author upon reasonable request.
Acknowledgements
This work was supported by the Zhejiang Provincial Natural Science Foundation of China under the grants "Research on Dynamic 3D Point Cloud Compression Method for Holographic Video Transmission" (LY21F010009) and "Research on Performance Degradation and Life Prediction of Traction System under Mixed Uncertainty" (LQ23F030016).
Author information
Contributions
Conceptualization, project administration, formal analysis, and funding were provided by WZ. Methodology, software, investigation, writing of the original draft, and visualization were performed by YY. Validation, resources, and review and editing of the manuscript were provided by JZ. Review and editing of the manuscript and funding were provided by XW. YZ reviewed and edited the paper. All authors read and approved the final manuscript.
Ethics declarations
Conflict of interest
All authors declare that they have no conflict of interest.
About this article
Cite this article
Zhu, W., Ying, Y., Zhang, J. et al. Point Cloud Registration Network Based on Convolution Fusion and Attention Mechanism. Neural Process Lett 55, 12625–12645 (2023). https://doi.org/10.1007/s11063-023-11435-6
DOI: https://doi.org/10.1007/s11063-023-11435-6