IPCONV: Convolution with Multiple Different Kernels for Point Cloud Semantic Segmentation
Figure 1. Integrated Point Convolution avoids a single physical understanding. (a) KPCONV; (b) IPCONV.
Figure 2. Kernel point generation. (a) KPCONV: convolution kernel points are generated with an attractive potential at the sphere's center to prevent divergence; (b) cylindrical: convolution kernel points are distributed cylindrically for enhanced feature capture; (c) spherical cone: convolution kernel points are distributed in a spherical cone pattern for improved feature extraction.
Figure 3. Generation of spherical cone kernel points.
Figure 4. The convolution parallelization strategy of IPCONV. $N$ is the sum of the number of cylindrical convolution kernels $N_{cylinder}$ and the number of spherical cone convolution kernels $N_{spherical\ cone}$. Roman numerals (I, II, ...) denote individual cylindrical or spherical cone convolution kernels. Optional blocks: shortcut max pooling (1) is only needed for strided KPCONV.
Figure 5. The network of integrated point convolution. $S_0 > S_1 > S_2 > S_3 > S_4$ denote the numbers of points.
Figure 6. An overview of the ISPRS benchmark dataset. Left: the training set; right: the test set.
Figure 7. Covered areas and separate sections of the LASDU dataset. Upper right corner: Section 1; upper left corner: Section 2; lower left corner: Section 3; lower right corner: Section 4.
Figure 8. The percentage of points by point cloud surface density for the ISPRS benchmark, LASDU, and DFC2019 datasets, estimated within a radius of 2.5 m.
Figure 9. Error map of IPCONV on the ISPRS benchmark dataset.
Figure 10. Semantic segmentation results of IPCONV on the ISPRS benchmark dataset. The enlarged area shows the strong discriminative ability of the network for categories such as building, tree, and roof.
Figure 11. Semantic segmentation results on the ISPRS benchmark dataset. Left column: ground-truth labels; middle column: predictions of the baseline (KPCONV); right column: predictions of IPCONV.
Figure 12. Semantic segmentation results of IPCONV on the LASDU dataset. The enlarged regions show strong semantic segmentation performance in structurally complex areas.
Figure 13. Semantic segmentation results on the DFC 2019 dataset. Left column: ground-truth labels; middle column: predictions of the baseline (KPCONV); right column: predictions of IPCONV.
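The following minimal NumPy sketch illustrates the two kernel point shapes described in Figures 2 and 3 by sampling points uniformly inside a cylinder and a spherical cone. It is illustrative only: the paper, following KPConv, generates kernel points through a potential-based optimization, and the function names, default parameters, and uniform-sampling shortcut here are our assumptions rather than the authors' implementation.

```python
import numpy as np

def cylinder_kernel_points(num_points: int, radius: float = 1.0,
                           height: float = 1.0, rng=None) -> np.ndarray:
    """Sample points uniformly inside a cylinder centred on the origin
    (a stand-in for the cylindrical kernel point distribution)."""
    rng = np.random.default_rng() if rng is None else rng
    # Uniform sampling over a disk needs a square root on the radial coordinate.
    r = radius * np.sqrt(rng.random(num_points))
    theta = 2.0 * np.pi * rng.random(num_points)
    z = height * (rng.random(num_points) - 0.5)
    return np.stack([r * np.cos(theta), r * np.sin(theta), z], axis=1)

def spherical_cone_kernel_points(num_points: int, radius: float = 1.0,
                                 half_angle_deg: float = 45.0,
                                 rng=None) -> np.ndarray:
    """Sample points uniformly inside a spherical cone (the sector of a
    ball around the +z axis with the given half-angle)."""
    rng = np.random.default_rng() if rng is None else rng
    # Uniform direction within the cone: cos(phi) uniform on [cos(alpha), 1].
    alpha = np.deg2rad(half_angle_deg)
    cos_phi = 1.0 - rng.random(num_points) * (1.0 - np.cos(alpha))
    sin_phi = np.sqrt(1.0 - cos_phi ** 2)
    theta = 2.0 * np.pi * rng.random(num_points)
    # Uniform density inside a ball needs a cube root on the radius.
    r = radius * rng.random(num_points) ** (1.0 / 3.0)
    return np.stack([r * sin_phi * np.cos(theta),
                     r * sin_phi * np.sin(theta),
                     r * cos_phi], axis=1)
```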
Abstract
1. Introduction
- (1) Innovative Kernel Generation Methodology: we develop two methods for generating convolution kernel points, one producing cylindrical distributions and the other spherical cone distributions. By tuning the parameters of these distributions, the network captures the distinctive characteristics of different ground objects and improves feature learning.
- (2) Enhanced Local Category Differentiation: to discriminate precisely between local categories, we introduce the Multi-Shape Neighborhood System (MSNS), which concatenates the features learned by the different convolution kernel point generation methods. This strengthens the network's ability to discern complex local structures (a minimal sketch of the fusion pattern follows this list).
- (3) Benchmark Validation: we evaluate the proposed model on multiple 3D benchmark datasets, where it consistently improves on established baselines. On the ISPRS Vaihingen 3D dataset [13], it achieves an average F1 score (Avg.F1) of 70.7% and an overall accuracy (OA) of 84.5%; on the LASDU dataset [14], an Avg.F1 of 75.67% and an OA of 86.66%; and on the DFC 2019 dataset [15], an Avg.F1 of 87.9% and an OA of 97.1%.
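Contribution (2) above describes fusing features learned with differently shaped kernel point sets. The PyTorch sketch below shows one plausible form of that concatenation-based fusion; the class names, branch interface, and MLP fusion head are our assumptions, not the released IPCONV code, and `DummyBranch` merely stands in for a KPConv-style convolution branch.

```python
import torch
import torch.nn as nn

class DummyBranch(nn.Module):
    """Stand-in for one KPConv-style convolution branch; a real branch
    would aggregate neighbor features against its own kernel point shape
    (cylinder or spherical cone)."""
    def __init__(self, in_dim: int, out_dim: int):
        super().__init__()
        self.lin = nn.Linear(in_dim, out_dim)

    def forward(self, points, features, neighbor_idx):
        return self.lin(features)  # placeholder for the point convolution

class MultiShapeNeighborhoodBlock(nn.Module):
    """Sketch of the MSNS idea: run parallel branches whose kernel points
    are generated differently, then fuse by concatenation + shared MLP."""
    def __init__(self, branches, out_dim: int):
        super().__init__()
        self.branches = nn.ModuleList(branches)
        self.fuse = nn.Sequential(
            nn.Linear(len(branches) * out_dim, out_dim),
            nn.BatchNorm1d(out_dim),
            nn.LeakyReLU(0.1),
        )

    def forward(self, points, features, neighbor_idx):
        # Every branch sees the same neighborhoods but uses its own
        # kernel point distribution (cylinder I, II, ..., cone I, II, ...).
        outs = [b(points, features, neighbor_idx) for b in self.branches]
        return self.fuse(torch.cat(outs, dim=-1))

# Example: three branches with 32-d input features, fused to 64-d output.
block = MultiShapeNeighborhoodBlock(
    [DummyBranch(32, 64) for _ in range(3)], out_dim=64)
feats = block(torch.rand(1000, 3), torch.rand(1000, 32), None)  # (1000, 64)
```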
2. Related Work
2.1. Projection-Based Networks
2.2. Voxel-Based Networks
2.3. Point-Based Networks
3. Method
3.1. Convolution Kernel Point Generation
3.1.1. Cylindrical Convolution Kernel Point Generation
3.1.2. Spherical Cone Convolution Kernel Point Generation
3.1.3. Convolution Rules Based on KPCONV
3.2. Multi-Shape Neighborhood System
3.3. Network Architecture
4. Experiments
4.1. Dataset
4.2. Evaluation Metrics
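For reference, the two metrics reported throughout Sections 5 and 6, overall accuracy (OA) and the class-averaged F1 score (Avg.F1), can be computed from a per-class confusion matrix as in the minimal sketch below. It assumes rows index ground-truth classes and columns index predictions; the paper's exact evaluation protocol may differ.

```python
import numpy as np

def oa_and_avg_f1(conf: np.ndarray) -> tuple[float, float]:
    """Overall accuracy and class-averaged F1 from a confusion matrix.

    Assumes conf[i, j] counts points of ground-truth class i predicted
    as class j (an assumption; conventions vary)."""
    tp = np.diag(conf).astype(float)
    oa = tp.sum() / conf.sum()
    precision = tp / np.maximum(conf.sum(axis=0), 1)  # per predicted class
    recall = tp / np.maximum(conf.sum(axis=1), 1)     # per true class
    f1 = 2 * precision * recall / np.maximum(precision + recall, 1e-12)
    return float(oa), float(f1.mean())

# Toy example: three classes, mostly correct predictions.
conf = np.array([[50, 2, 0],
                 [3, 40, 1],
                 [0, 4, 30]])
print(oa_and_avg_f1(conf))
```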
4.3. Model Hyperparameters
5. Results
5.1. Semantic Segmentation Results for the ISPRS Benchmark Dataset
5.2. Semantic Segmentation Results on the LASDU Dataset
5.3. Semantic Segmentation Results on the DFC Dataset
6. Ablation Study
6.1. Effect of Different Convolution Kernel Point Generation
6.2. Effect of the MSNS
7. Conclusions
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
References
1. Shahat, E.; Hyun, C.T.; Yeom, C. City digital twin potentials: A review and research agenda. Sustainability 2021, 13, 3386.
2. Sommer, M.; Stjepandić, J.; Stobrawa, S.; Von Soden, M. Automatic generation of digital twin based on scanning and object recognition. In Transdisciplinary Engineering for Complex Socio-Technical Systems; IOS Press: Amsterdam, The Netherlands, 2019; Volume 7, pp. 645–654.
3. Lamas, D.; Soilán, M.; Grandío, J.; Riveiro, B. Automatic point cloud semantic segmentation of complex railway environments. Remote Sens. 2021, 13, 2332.
4. Pierdicca, R.; Paolanti, M.; Matrone, F.; Martini, M.; Morbidoni, C.; Malinverni, E.S.; Frontoni, E.; Lingua, A.M. Point cloud semantic segmentation using a deep learning framework for cultural heritage. Remote Sens. 2020, 12, 1005.
5. Munir, N.; Awrangjeb, M.; Stantic, B. Power line extraction and reconstruction methods from laser scanning data: A literature review. Remote Sens. 2023, 15, 973.
6. Pulikkaseril, C.; Lam, S. Laser eyes for driverless cars: The road to automotive LIDAR. In Proceedings of the 2019 Optical Fiber Communications Conference and Exhibition (OFC), San Diego, CA, USA, 3–7 March 2019; pp. 1–4.
7. Zhang, J.; Lin, X.; Ning, X. SVM-based classification of segmented airborne LiDAR point clouds in urban areas. Remote Sens. 2013, 5, 3749–3775.
8. Ni, H.; Lin, X.; Zhang, J. Classification of ALS point cloud with improved point cloud segmentation and random forests. Remote Sens. 2017, 9, 288.
9. Guo, Y.; Wang, H.; Hu, Q.; Liu, H.; Liu, L.; Bennamoun, M. Deep learning for 3D point clouds: A survey. IEEE Trans. Pattern Anal. Mach. Intell. 2020, 43, 4338.
10. Bello, S.A.; Yu, S.; Wang, C.; Adam, J.M.; Li, J. Deep learning on 3D point clouds. Remote Sens. 2020, 12, 1729.
11. Thomas, H.; Qi, C.R.; Deschaud, J.E.; Marcotegui, B.; Goulette, F.; Guibas, L.J. KPConv: Flexible and deformable convolution for point clouds. In Proceedings of the IEEE/CVF International Conference on Computer Vision, Seoul, Republic of Korea, 27 October–2 November 2019; pp. 6411–6420.
12. Lin, Y.; Vosselman, G.; Cao, Y.; Yang, M.Y. Local and global encoder network for semantic segmentation of airborne laser scanning point clouds. ISPRS J. Photogramm. Remote Sens. 2021, 176, 151–168.
13. Rottensteiner, F.; Sohn, G.; Jung, J.; Gerke, M.; Baillard, C.; Benitez, S.; Breitkopf, U. The ISPRS benchmark on urban object classification and 3D building reconstruction. ISPRS Ann. Photogramm. Remote Sens. Spat. Inf. Sci. 2012, I-3, 293–298.
14. Ye, Z.; Xu, Y.; Huang, R.; Tong, X.; Li, X.; Liu, X.; Luan, K.; Hoegner, L.; Stilla, U. LASDU: A large-scale aerial LiDAR dataset for semantic labeling in dense urban areas. ISPRS Int. J. Geo-Inf. 2020, 9, 450.
15. Bosch, M.; Foster, K.; Christie, G.; Wang, S.; Hager, G.D.; Brown, M. Semantic stereo for incidental satellite images. In Proceedings of the 2019 IEEE Winter Conference on Applications of Computer Vision (WACV), Waikoloa, HI, USA, 7–11 January 2019; pp. 1524–1532.
16. Su, H.; Maji, S.; Kalogerakis, E.; Learned-Miller, E. Multi-view convolutional neural networks for 3D shape recognition. In Proceedings of the IEEE International Conference on Computer Vision, Santiago, Chile, 7–13 December 2015; pp. 945–953.
17. Ma, C.; Guo, Y.; Yang, J.; An, W. Learning multi-view representation with LSTM for 3-D shape recognition and retrieval. IEEE Trans. Multimed. 2019, 21, 1169–1182.
18. Yu, T.; Meng, J.; Yuan, J. Multi-view harmonized bilinear network for 3D object recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 18–23 June 2018; pp. 186–194.
19. Maset, E.; Padova, B.; Fusiello, A. Efficient large-scale airborne LiDAR data classification via fully convolutional network. Int. Arch. Photogramm. Remote Sens. Spat. Inf. Sci. 2020, 43, 527.
20. Hamdi, A.; Giancola, S.; Ghanem, B. MVTN: Multi-view transformation network for 3D shape recognition. In Proceedings of the IEEE/CVF International Conference on Computer Vision, Montreal, QC, Canada, 10–17 October 2021; pp. 1–11.
21. Song, W.; Li, D.; Sun, S.; Zhang, L.; Xin, Y.; Sung, Y.; Choi, R. 2D&3DHNet for 3D object classification in LiDAR point cloud. Remote Sens. 2022, 14, 3146.
22. Qi, C.R.; Su, H.; Nießner, M.; Dai, A.; Yan, M.; Guibas, L.J. Volumetric and multi-view CNNs for object classification on 3D data. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, 27–30 June 2016; pp. 5648–5656.
23. Riegler, G.; Osman Ulusoy, A.; Geiger, A. OctNet: Learning deep 3D representations at high resolutions. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA, 21–26 July 2017; pp. 3577–3586.
24. Klokov, R.; Lempitsky, V. Escape from cells: Deep kd-networks for the recognition of 3D point cloud models. In Proceedings of the IEEE International Conference on Computer Vision, Venice, Italy, 22–29 October 2017; pp. 863–872.
25. Huang, M.; Wei, P.; Liu, X. An efficient encoding voxel-based segmentation (EVBS) algorithm based on fast adjacent voxel search for point cloud plane segmentation. Remote Sens. 2019, 11, 2727.
26. Meng, H.Y.; Gao, L.; Lai, Y.K.; Manocha, D. VV-Net: Voxel VAE net with group convolutions for point cloud segmentation. In Proceedings of the IEEE/CVF International Conference on Computer Vision, Seoul, Republic of Korea, 27 October–2 November 2019; pp. 8500–8508.
27. Zhu, X.; Zhou, H.; Wang, T.; Hong, F.; Li, W.; Ma, Y.; Li, H.; Yang, R.; Lin, D. Cylindrical and asymmetrical 3D convolution networks for LiDAR-based perception. IEEE Trans. Pattern Anal. Mach. Intell. 2021, 44, 6807.
28. Zhao, L.; Xu, S.; Liu, L.; Ming, D.; Tao, W. SVASeg: Sparse voxel-based attention for 3D LiDAR point cloud semantic segmentation. Remote Sens. 2022, 14, 4471.
29. Zaboli, M.; Rastiveis, H.; Hosseiny, B.; Shokri, D.; Sarasua, W.A.; Homayouni, S. D-Net: A density-based convolutional neural network for mobile LiDAR point clouds classification in urban areas. Remote Sens. 2023, 15, 2317.
30. Qi, C.R.; Su, H.; Mo, K.; Guibas, L.J. PointNet: Deep learning on point sets for 3D classification and segmentation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA, 21–26 July 2017; pp. 652–660.
31. Qi, C.R.; Yi, L.; Su, H.; Guibas, L.J. PointNet++: Deep hierarchical feature learning on point sets in a metric space. Adv. Neural Inf. Process. Syst. 2017, 30, 5105–5114.
32. Jiang, M.Y.; Wu, Y.R.; Zhao, T.Q.; Zhao, Z.L.; Lu, C.W. PointSIFT: A SIFT-like network module for 3D point cloud semantic segmentation. arXiv 2018, arXiv:1807.00652.
33. Zhao, H.; Jiang, L.; Fu, C.W.; Jia, J. PointWeb: Enhancing local neighborhood features for point cloud processing. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Long Beach, CA, USA, 15–20 June 2019; pp. 5565–5573.
34. Hu, Q.; Yang, B.; Xie, L.; Rosa, S.; Guo, Y.; Wang, Z.; Trigoni, N.; Markham, A. RandLA-Net: Efficient semantic segmentation of large-scale point clouds. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Seattle, WA, USA, 13–19 June 2020; pp. 11108–11117.
35. Yin, F.; Huang, Z.; Chen, T.; Luo, G.; Yu, G.; Fu, B. DCNet: Large-scale point cloud semantic segmentation with discriminative and efficient feature aggregation. IEEE Trans. Circuits Syst. Video Technol. 2023, 33, 4083–4095.
36. Wang, Y.; Sun, Y.; Liu, Z.; Sarma, S.E.; Bronstein, M.M.; Solomon, J.M. Dynamic graph CNN for learning on point clouds. ACM Trans. Graph. 2019, 38, 1–12.
37. Zhang, K.; Hao, M.; Wang, J.; de Silva, C.W.; Fu, C. Linked dynamic graph CNN: Learning on point cloud via linking hierarchical features. arXiv 2019, arXiv:1904.10014.
38. Xu, Q.; Zhou, Y.; Wang, W.; Qi, C.R.; Anguelov, D. SPG: Unsupervised domain adaptation for 3D object detection via semantic point generation. In Proceedings of the IEEE/CVF International Conference on Computer Vision, Montreal, QC, Canada, 10–17 October 2021; pp. 15446–15456.
39. Zhou, H.; Feng, Y.; Fang, M.; Wei, M.; Qin, J.; Lu, T. Adaptive graph convolution for point cloud analysis. In Proceedings of the IEEE/CVF International Conference on Computer Vision, Montreal, QC, Canada, 10–17 October 2021; pp. 4965–4974.
40. Wang, L.; Huang, Y.; Hou, Y.; Zhang, S.; Shan, J. Graph attention convolution for point cloud semantic segmentation. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Long Beach, CA, USA, 15–20 June 2019; pp. 10296–10305.
41. Huang, C.Q.; Jiang, F.; Huang, Q.H.; Wang, X.Z.; Han, Z.M.; Huang, W.Y. Dual-graph attention convolution network for 3-D point cloud classification. IEEE Trans. Neural Netw. Learn. Syst. 2022, 99, 1–13.
42. Tran, A.T.; Le, H.S.; Kwon, O.J.; Lee, S.H.; Kwon, K.R. General local graph attention in large-scale point cloud segmentation. In Proceedings of the 2023 IEEE International Conference on Consumer Electronics (ICCE), Las Vegas, NV, USA, 6–8 January 2023; pp. 1–4.
43. Zhao, H.; Jiang, L.; Jia, J.; Torr, P.H.; Koltun, V. Point transformer. In Proceedings of the IEEE/CVF International Conference on Computer Vision, Montreal, QC, Canada, 10–17 October 2021; pp. 16259–16268.
44. Engel, N.; Belagiannis, V.; Dietmayer, K. Point transformer. IEEE Access 2021, 9, 134826–134840.
45. Guo, M.H.; Cai, J.X.; Liu, Z.N.; Mu, T.J.; Martin, R.R.; Hu, S.M. PCT: Point cloud transformer. Comput. Vis. Media 2021, 7, 187–199.
46. Yu, X.; Tang, L.; Rao, Y.; Huang, T.; Zhou, J.; Lu, J. Point-BERT: Pre-training 3D point cloud transformers with masked point modeling. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, New Orleans, LA, USA, 18–24 June 2022; pp. 19313–19322.
47. Hui, L.; Yang, H.; Cheng, M.; Xie, J.; Yang, J. Pyramid point cloud transformer for large-scale place recognition. In Proceedings of the IEEE/CVF International Conference on Computer Vision, Montreal, QC, Canada, 10–17 October 2021; pp. 6098–6107.
48. Li, Y.; Lin, Q.; Zhang, Z.; Zhang, L.; Chen, D.; Shuang, F. MFNet: Multi-level feature extraction and fusion network for large-scale point cloud classification. Remote Sens. 2022, 14, 5707.
49. Lu, D.; Xie, Q.; Gao, K.; Xu, L.; Li, J. 3DCTN: 3D convolution-transformer network for point cloud classification. IEEE Trans. Intell. Transp. Syst. 2022, 23, 24854–24865.
50. Lai, X.; Liu, J.; Jiang, L.; Wang, L.; Zhao, H.; Liu, S.; Qi, X.; Jia, J. Stratified transformer for 3D point cloud segmentation. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, New Orleans, LA, USA, 18–24 June 2022; pp. 8500–8509.
51. Yang, Y.Q.; Guo, Y.X.; Xiong, J.Y.; Liu, Y.; Pan, H.; Wang, P.S.; Tong, X.; Guo, B. Swin3D: A pretrained transformer backbone for 3D indoor scene understanding. arXiv 2023, arXiv:2304.06906.
52. Li, Y.; Bu, R.; Sun, M.; Wu, W.; Di, X.; Chen, B. PointCNN: Convolution on X-transformed points. Adv. Neural Inf. Process. Syst. 2018, 31, 820–830.
53. Komarichev, A.; Zhong, Z.; Hua, J. A-CNN: Annularly convolutional neural networks on point clouds. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Long Beach, CA, USA, 15–20 June 2019; pp. 7421–7430.
54. Boulch, A. ConvPoint: Continuous convolutions for point cloud processing. Comput. Graph. 2020, 88, 24–34.
55. Engelmann, F.; Kontogianni, T.; Leibe, B. Dilated point convolutions: On the receptive field size of point convolutions on 3D point clouds. In Proceedings of the 2020 IEEE International Conference on Robotics and Automation (ICRA), Paris, France, 31 May–31 August 2020; pp. 9463–9469.
56. Wu, W.; Fuxin, L.; Shan, Q. PointConvFormer: Revenge of the point-based convolution. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Vancouver, BC, Canada, 18–22 June 2023; pp. 21802–21813.
57. Wen, C.; Yang, L.; Li, X.; Peng, L.; Chi, T. Directionally constrained fully convolutional neural network for airborne LiDAR point cloud classification. ISPRS J. Photogramm. Remote Sens. 2020, 162, 50–62.
58. Niemeyer, J.; Rottensteiner, F.; Soergel, U. Contextual classification of lidar data and building object detection in urban areas. ISPRS J. Photogramm. Remote Sens. 2014, 87, 152–165.
59. Yang, Z.; Tan, B.; Pei, H.; Jiang, W. Segmentation and multi-scale convolutional neural network-based classification of airborne laser scanner data. Sensors 2018, 18, 3347.
60. Yousefhussien, M.; Kelbe, D.J.; Ientilucci, E.J.; Salvaggio, C. A multi-scale fully convolutional network for semantic labeling of 3D point clouds. ISPRS J. Photogramm. Remote Sens. 2018, 143, 191–204.
61. Winiwarter, L.; Mandlburger, G.; Schmohl, S.; Pfeifer, N. Classification of ALS point clouds using end-to-end deep learning. PFG–J. Photogramm. Remote Sens. Geoinf. Sci. 2019, 87, 75–90.
62. Arief, H.A.A.; Indahl, U.G.; Strand, G.H.; Tveite, H. Addressing overfitting on point cloud classification using Atrous XCRF. ISPRS J. Photogramm. Remote Sens. 2019, 155, 90–101.
63. Mao, Y.; Chen, K.; Diao, W.; Sun, X.; Lu, X.; Fu, K.; Weinmann, M. Beyond single receptive field: A receptive field fusion-and-stratification network for airborne laser scanning point cloud classification. ISPRS J. Photogramm. Remote Sens. 2022, 188, 45–61.
64. Liu, Y.; Fan, B.; Meng, G.; Lu, J.; Xiang, S.; Pan, C. DensePoint: Learning densely contextual representation for efficient point cloud processing. In Proceedings of the IEEE/CVF International Conference on Computer Vision, Seoul, Republic of Korea, 27 October–2 November 2019; pp. 5239–5248.
65. Liu, Z.; Hu, H.; Cao, Y.; Zhang, Z.; Tong, X. A closer look at local aggregation operators in point cloud analysis. In Proceedings of Computer Vision–ECCV 2020: 16th European Conference, Glasgow, UK, 23–28 August 2020; pp. 326–342.
66. Huang, R.; Xu, Y.; Hong, D.; Yao, W.; Ghamisi, P.; Stilla, U. Deep point embedding for urban classification using ALS point clouds: A new perspective from local to global. ISPRS J. Photogramm. Remote Sens. 2020, 163, 62–81.
67. Wu, W.; Qi, Z.; Fuxin, L. PointConv: Deep convolutional networks on 3D point clouds. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Long Beach, CA, USA, 15–20 June 2019; pp. 9621–9630.
68. Li, J.; Weinmann, M.; Sun, X.; Diao, W.; Feng, Y.; Hinz, S.; Fu, K. VD-LAB: A view-decoupled network with local-global aggregation bridge for airborne laser scanning point cloud classification. ISPRS J. Photogramm. Remote Sens. 2022, 186, 19–33.
69. Li, X.; Wang, L.; Wang, M.; Wen, C.; Fang, Y. DANCE-NET: Density-aware convolution networks with context encoding for airborne LiDAR point cloud classification. ISPRS J. Photogramm. Remote Sens. 2020, 166, 128–139.
Number | H | M |
---|---|---|
1 | 8 | 2 |
2 | 4 | 2 |
3 | 2 | 1 |
4 | 2 | 1 |

Number | |
---|---|
5 | 45 |
6 | 60 |
Method | Power | Low Veg | Imp Surf | Car | Fence | Roof | Facade | Shrub | Tree | Avg.F1 | OA |
---|---|---|---|---|---|---|---|---|---|---|---|
LUH [58] | 59.6 | 77.5 | 91.1 | 73.1 | 34.0 | 94.2 | 56.3 | 46.6 | 83.1 | 68.4 | 81.6 |
WhuY4 [59] | 42.5 | 82.7 | 91.4 | 74.7 | 53.7 | 94.3 | 53.1 | 47.9 | 82.8 | 69.2 | 84.9 |
RIT_1 [60] | 37.5 | 77.9 | 91.5 | 73.4 | 18.0 | 94.0 | 49.3 | 45.9 | 82.5 | 63.3 | 81.6 |
alsNet [61] | 70.1 | 80.5 | 90.2 | 45.7 | 7.6 | 93.1 | 47.3 | 34.7 | 74.5 | 60.4 | 80.6 |
A-XCRF [62] | 63.0 | 82.6 | 91.9 | 74.9 | 39.9 | 94.5 | 59.3 | 50.7 | 82.7 | 71.1 | 85.0 |
D-FCN [57] | 70.4 | 80.2 | 91.4 | 78.1 | 37.0 | 93.0 | 60.5 | 46.0 | 79.4 | 70.7 | 82.2 |
LGENet [12] | 76.5 | 82.1 | 91.8 | 80.0 | 40.6 | 93.8 | 64.7 | 49.9 | 83.6 | 73.7 | 84.5 |
RFFS-Net [63] | 75.5 | 80.0 | 90.5 | 78.5 | 45.5 | 92.7 | 57.9 | 48.3 | 75.7 | 71.6 | 82.1 |
KPCONV [12] | 73.5 | 78.7 | 88.0 | 79.4 | 33.0 | 94.2 | 61.3 | 45.7 | 82.0 | 70.6 | 81.7 |
Ours | 66.8 | 82.1 | 91.4 | 74.3 | 36.8 | 94.8 | 65.2 | 42.3 | 82.7 | 70.7 | 84.5 |
Method | Ground | Buil. | Trees | Low Veg | Artifacts | OA | Avg.F1 |
---|---|---|---|---|---|---|---|
PointNet++ [14] | 87.74 | 90.63 | 81.98 | 63.17 | 31.26 | 82.84 | 70.96 |
PointCNN [68] | 89.3 | 92.83 | 84.08 | 62.77 | 31.65 | 85.04 | 72.13 |
DensePoint [68] | 89.78 | 94.77 | 85.2 | 65.45 | 34.17 | 86.31 | 73.87 |
DGCNN [68] | 90.52 | 93.21 | 81.55 | 63.26 | 37.08 | 85.51 | 73.12 |
PosPool [68] | 88.25 | 93.67 | 83.92 | 61.00 | 38.34 | 83.52 | 73.03 |
HAD-PointNet++ [14] | 88.74 | 93.16 | 82.24 | 65.24 | 36.89 | 84.37 | 73.25 |
PointConv [63] | 89.57 | 94.31 | 84.59 | 67.51 | 36.41 | 85.91 | 74.48 |
VD-LAB [68] | 91.19 | 95.53 | 87.26 | 73.49 | 44.64 | 88.01 | 78.42 |
RFFS-Net [63] | 90.92 | 95.35 | 86.81 | 71.01 | 44.36 | 87.12 | 77.69 |
KPCONV [68] | 89.12 | 93.43 | 83.22 | 59.70 | 31.85 | 83.71 | 71.47 |
IPCONV (Ours) | 90.47 | 96.26 | 85.75 | 59.58 | 46.34 | 86.66 | 75.67 |
Method | Ground | Trees | Buil. | Water | Bridge | OA | Avg.F1 |
---|---|---|---|---|---|---|---|
PointNet++ [69] | 98.3 | 95.8 | 79.7 | 4.40 | 7.30 | 92.7 | 57.1 |
PointSIFT [69] | 98.6 | 97.0 | 85.5 | 46.4 | 60.4 | 94.0 | 77.6 |
PointCNN [69] | 98.7 | 97.2 | 84.9 | 44.1 | 65.3 | 93.8 | 78.0 |
D-FCN [57] | 99.1 | 98.1 | 89.9 | 45.0 | 73.0 | 95.6 | 81.0 |
DANCE-NET [69] | 99.1 | 93.9 | 87.0 | 58.3 | 83.9 | 96.8 | 84.4 |
PointConv [63] | 97.3 | 95.8 | 93.6 | 74.5 | 69.2 | 95.3 | 86.1 |
RFFS-Net [63] | 96.6 | 96.1 | 88.7 | 77.8 | 80.1 | 94.3 | 88.0 |
KPCONV | 98.5 | 97.3 | 89.5 | 87.9 | 39.5 | 95.7 | 82.4 |
IPCONV (Ours) | 98.8 | 97.5 | 92.9 | 92.1 | 58.2 | 97.1 | 87.9 |
Method | Power | Low Veg | Imp Surf | Car | Fence | Roof | Facade | Shrub | Tree | Avg.F1 | OA |
---|---|---|---|---|---|---|---|---|---|---|---|
KPCONV [12] | 73.5 | 78.7 | 88.0 | 79.4 | 33.0 | 94.2 | 61.3 | 45.7 | 82.0 | 70.6 | 81.7 |
Cylinder (1) 1 | 72.3 | 82.2 | 91.6 | 74.3 | 24.0 | 93.9 | 63.1 | 43.9 | 80.7 | 69.5 | 83.8 |
Spherical cone (1) | 52.4 | 81.4 | 90.6 | 76.6 | 37.6 | 94.3 | 62.1 | 46.6 | 82.7 | 69.4 | 83.9 |
IPCONV (6) | 66.8 | 82.1 | 91.4 | 74.3 | 36.8 | 94.8 | 65.2 | 42.3 | 82.7 | 70.7 | 84.5 |
Method | Power | Low Veg | Imp Surf | Car | Fence | Roof | Facade | Shrub | Tree | Avg.F1 | OA |
---|---|---|---|---|---|---|---|---|---|---|---|
Vanilla (6) 1 | 67.5 | 81.7 | 91.5 | 62.5 | 30.0 | 94.7 | 63.5 | 41.2 | 81.5 | 68.2 | 83.9 |
Cylinder (6) | 65.2 | 81.8 | 91.1 | 72.4 | 31.5 | 94.9 | 63.2 | 45.8 | 81.9 | 69.8 | 84.3 |
Spherical cone (6) | 64.2 | 81.9 | 91.5 | 62.4 | 31.9 | 94.8 | 62.9 | 44.3 | 82.9 | 68.5 | 84.3 |
IPCONV (6) | 66.8 | 82.1 | 91.4 | 74.3 | 36.8 | 94.8 | 65.2 | 42.3 | 82.7 | 70.7 | 84.5 |