Efficient Point Cloud Object Classifications with GhostMLP
Figure 1. Quick view of the point cloud object classification performance of different methods on the ScanObjectNN benchmark. Of our two models, GhostMLP-S attains good performance with minimal parameters, while GhostMLP achieves the best performance among models that do not require extra training data.
Figure 2. Overview of a stage in GhostMLP. The full network mainly comprises an input embedding module and four pairs of ghost set abstraction modules; for sampling and grouping, the geometric affine module designed in PointMLP is applied.
Figure 3. Feature maps in PointNet++.
Figure 4. Visualization of part segmentation on the ShapeNet-Part dataset. (a) Ground truth of airplane; (b) ground truth of chair; (c) ground truth of chair; (d) ground truth of desk; (e) prediction of airplane; (f) prediction of chair; (g) prediction of chair; (h) prediction of desk.
Figure 5. Visualization of classification results on the Oakland dataset. (a) Raw data of the Oakland dataset. (b) Classification results on the Oakland dataset.
Figure 6. Loss landscapes of GhostMLP and GhostMLP-S on ScanObjectNN. (a) Loss landscape of GhostMLP. (b) Loss landscape of GhostMLP-S.
Abstract
1. Introduction
2. Related Works
2.1. Deep Learning on Point Cloud
2.2. PointMLP
2.3. GhostNet
3. Methodology
3.1. Framework of GhostMLP
Algorithm 1 GhostMLP in a PyTorch-like style.
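As a rough illustration of the idea behind the ghost set abstraction stages, the following is a minimal PyTorch sketch of a Ghost module adapted to per-point features, i.e., a 1-D variant of the GhostNet block [23]. This is our reconstruction rather than the authors' exact code; the module name and the hyper-parameters (`ratio`, `dw_kernel`) are assumptions.

```python
import math
import torch
import torch.nn as nn

class GhostModule1D(nn.Module):
    """1-D Ghost module (after GhostNet [23]): a fraction of the output
    channels are 'intrinsic' features from a 1x1 convolution; the rest are
    'ghost' features produced by a cheap depthwise convolution."""
    def __init__(self, in_channels, out_channels, ratio=2, dw_kernel=3):
        super().__init__()
        init_ch = math.ceil(out_channels / ratio)   # intrinsic channels
        cheap_ch = init_ch * (ratio - 1)            # ghost channels
        self.out_channels = out_channels
        self.primary = nn.Sequential(
            nn.Conv1d(in_channels, init_ch, kernel_size=1, bias=False),
            nn.BatchNorm1d(init_ch),
            nn.ReLU(inplace=True),
        )
        self.cheap = nn.Sequential(
            nn.Conv1d(init_ch, cheap_ch, kernel_size=dw_kernel,
                      padding=dw_kernel // 2, groups=init_ch, bias=False),
            nn.BatchNorm1d(cheap_ch),
            nn.ReLU(inplace=True),
        )

    def forward(self, x):                           # x: (B, C_in, N)
        intrinsic = self.primary(x)
        ghost = self.cheap(intrinsic)               # the cheap operation
        out = torch.cat([intrinsic, ghost], dim=1)
        return out[:, :self.out_channels, :]        # (B, C_out, N)
```

In a ghost set abstraction stage, such a module would replace a plain shared MLP on grouped point features, so that roughly half of the output channels come from the cheap depthwise operation rather than a full dense mapping.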
3.2. Rethinking Training Strategies
3.2.1. Data Augmentation
3.2.2. Training Strategies
4. Experiments and Result Analysis
4.1. How Lightweight GhostMLP Is
4.2. Classification on ScanObjectNN and ModelNet40
4.3. Part Segmentation on ShapeNet and Classification on MLS Data
4.3.1. ShapeNet
4.3.2. Oakland MLS Dataset
- Formatting the dataset to be compatible with GhostMLP. Each point in the Oakland dataset is stored with its coordinates, a class label, and a confidence value. Since the confidence values were almost always 2, we treated confidence as always true and removed this property. We then assigned one of five colors to the labels, since Oakland has only five classes; after formatting, each point stores its coordinates together with the color assigned to its label.
- Classifying according to the area and dividing large point clouds into smaller ones. Although feeding an entire scene to the network is a popular approach in semantic segmentation, we divide large point clouds into smaller ones because every point carries a label. A point cloud with approximately 100k points is therefore shuffled and split into multiple point clouds of 4k points each.
- Dividing the dataset into training, testing, and validation sets. Finally, we shuffled the 4k-point clouds and divided them into training, testing, and validation sets at a ratio of 7:2:1; a minimal sketch of this preprocessing follows this list.
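The following is a minimal NumPy sketch of the preprocessing described above. The function names are ours, and the choice to drop the remainder points that do not fill a complete 4k chunk is an assumption.

```python
import numpy as np

def split_scene(points, chunk_size=4096, seed=0):
    """Shuffle a labeled scene array of shape (N, C) and split it into
    smaller point clouds of exactly `chunk_size` points each."""
    rng = np.random.default_rng(seed)
    points = points[rng.permutation(len(points))]
    n_chunks = len(points) // chunk_size            # drop the remainder
    return [points[i * chunk_size:(i + 1) * chunk_size]
            for i in range(n_chunks)]

def train_test_val_split(clouds, ratios=(0.7, 0.2, 0.1), seed=0):
    """Shuffle a list of point clouds and split it 7:2:1."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(clouds))
    n_train = int(ratios[0] * len(clouds))
    n_test = int((ratios[0] + ratios[1]) * len(clouds))
    return ([clouds[i] for i in idx[:n_train]],
            [clouds[i] for i in idx[n_train:n_test]],
            [clouds[i] for i in idx[n_test:]])
```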
4.4. Interpretability Analysis of GhostMLP
4.4.1. Ablation Studies: Data Augmentations
4.4.2. Ablation Studies: Modules in GhostMLP
4.4.3. Ablation Studies: Network Depth
4.4.4. Loss Landscape
5. Discussion
6. Conclusions
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
References
- Camuffo, E.; Mari, D.; Milani, S. Recent Advancements in Learning Algorithms for Point Clouds: An Updated Overview. Sensors 2022, 22, 1357. [Google Scholar] [CrossRef] [PubMed]
- Chen, X.; Wu, H.; Lichti, D.; Han, X.; Ban, Y.; Li, P.; Deng, H. Extraction of indoor objects based on the exponential function density clustering model. Inf. Sci. 2022, 607, 1111–1135. [Google Scholar] [CrossRef]
- Qi, C.R.; Su, H.; Mo, K.; Guibas, L.J. Pointnet: Deep learning on point sets for 3D classification and segmentation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA, 21–26 July 2017; pp. 652–660. [Google Scholar]
- Qi, C.R.; Yi, L.; Su, H.; Guibas, L.J. Pointnet++: Deep hierarchical feature learning on point sets in a metric space. Adv. Neural Inf. Process. Syst. 2017, 30, 5105–5114. [Google Scholar]
- Wu, W.; Qi, Z.; Fuxin, L. Pointconv: Deep convolutional networks on 3D point clouds. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Long Beach, CA, USA, 15–20 June 2019; pp. 9621–9630. [Google Scholar]
- Thomas, H.; Qi, C.R.; Deschaud, J.E.; Marcotegui, B.; Goulette, F.; Guibas, L.J. Kpconv: Flexible and deformable convolution for point clouds. In Proceedings of the IEEE/CVF International Conference on Computer Vision, Seoul, Republic of Korea, 27 October–2 November 2019; pp. 6411–6420. [Google Scholar]
- Li, G.; Muller, M.; Thabet, A.; Ghanem, B. Deepgcns: Can gcns go as deep as cnns? In Proceedings of the IEEE/CVF International Conference on Computer Vision, Seoul, Republic of Korea, 27 October–2 November 2019; pp. 9267–9276. [Google Scholar]
- Zhao, H.; Jiang, L.; Jia, J.; Torr, P.H.; Koltun, V. Point transformer. In Proceedings of the IEEE/CVF International Conference on Computer Vision, Virtual Conference, 11–17 October 2021; pp. 16259–16268. [Google Scholar]
- Wu, Z.; Song, S.; Khosla, A.; Yu, F.; Zhang, L.; Tang, X.; Xiao, J. 3D shapenets: A deep representation for volumetric shapes. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Boston, MA, USA, 7–12 June 2015; pp. 1912–1920. [Google Scholar]
- Uy, M.A.; Pham, Q.H.; Hua, B.S.; Nguyen, T.; Yeung, S.K. Revisiting point cloud classification: A new benchmark dataset and classification model on real-world data. In Proceedings of the IEEE/CVF International Conference on Computer Vision, Seoul, Republic of Korea, 27 October–2 November 2019; pp. 1588–1597. [Google Scholar]
- Armeni, I.; Sener, O.; Zamir, A.R.; Jiang, H.; Brilakis, I.; Fischer, M.; Savarese, S. 3D semantic parsing of large-scale indoor spaces. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, 27–30 June 2016; pp. 1534–1543. [Google Scholar]
- Qian, G.; Li, Y.; Peng, H.; Mai, J.; Hammoud, H.A.A.K.; Elhoseiny, M.; Ghanem, B. PointNeXt: Revisiting PointNet++ with Improved Training and Scaling Strategies. arXiv 2022, arXiv:2206.04670. [Google Scholar]
- Ma, X.; Qin, C.; You, H.; Ran, H.; Fu, Y. Rethinking network design and local geometry in point cloud: A simple residual mlp framework. arXiv 2022, arXiv:2202.07123. [Google Scholar]
- Zhou, Y.; Tuzel, O. Voxelnet: End-to-end learning for point cloud based 3D object detection. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 18–22 June 2018; pp. 4490–4499. [Google Scholar]
- Hamdi, A.; Giancola, S.; Ghanem, B. Mvtn: Multi-view transformation network for 3D shape recognition. In Proceedings of the IEEE/CVF International Conference on Computer Vision, Montreal, QC, Canada, 11–17 October 2021; pp. 1–11. [Google Scholar]
- Yang, Z.; Ye, Q.; Stoter, J.; Nan, L. Enriching Point Clouds with Implicit Representations for 3D Classification and Segmentation. Remote Sens. 2023, 15, 61. [Google Scholar] [CrossRef]
- Qi, Z.; Dong, R.; Fan, G.; Ge, Z.; Zhang, X.; Ma, K.; Yi, L. Contrast with Reconstruct: Contrastive 3D Representation Learning Guided by Generative Pretraining. arXiv 2023, arXiv:2302.02318. [Google Scholar]
- Zhang, R.; Wang, L.; Qiao, Y.; Gao, P.; Li, H. Learning 3D Representations from 2D Pre-trained Models via Image-to-Point Masked Autoencoders. arXiv 2022, arXiv:2212.06785. [Google Scholar]
- Xue, L.; Gao, M.; Xing, C.; Martín-Martín, R.; Wu, J.; Xiong, C.; Xu, R.; Niebles, J.C.; Savarese, S. ULIP: Learning Unified Representation of Language, Image and Point Cloud for 3D Understanding. arXiv 2022, arXiv:2212.05171. [Google Scholar]
- Ran, H.; Liu, J.; Wang, C. Surface Representation for Point Clouds. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, New Orleans, LA, USA, 18–24 June 2022; pp. 18942–18952. [Google Scholar]
- Ganea, O.; Bécigneul, G.; Hofmann, T. Hyperbolic neural networks. Adv. Neural Inf. Process. Syst. 2018, 31, 1–11. [Google Scholar]
- Montanaro, A.; Valsesia, D.; Magli, E. Rethinking the compositionality of point clouds through regularization in the hyperbolic space. arXiv 2022, arXiv:2209.10318. [Google Scholar]
- Han, K.; Wang, Y.; Tian, Q.; Guo, J.; Xu, C.; Xu, C. Ghostnet: More features from cheap operations. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Virtual Conference, 14–19 June 2020; pp. 1580–1589. [Google Scholar]
- Yang, G.; Lai, H.; Zhou, Q. Visual defects detection model of mobile phone screen. J. Intell. Fuzzy Syst. 2022, 43, 4335–4349. [Google Scholar] [CrossRef]
- Li, S.; Sultonov, F.; Tursunboev, J.; Park, J.H.; Yun, S.; Kang, J.M. Ghostformer: A GhostNet-Based Two-Stage Transformer for Small Object Detection. Sensors 2022, 22, 6939. [Google Scholar] [CrossRef]
- Carreira, J.; Zisserman, A. Quo vadis, action recognition? a new model and the kinetics dataset. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA, 21–26 July 2017; pp. 6299–6308. [Google Scholar]
- Szegedy, C.; Vanhoucke, V.; Ioffe, S.; Shlens, J.; Wojna, Z. Rethinking the inception architecture for computer vision. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, 27–30 June 2016; pp. 2818–2826. [Google Scholar]
- Picard, D. torch.manual_seed(3407) is all you need: On the influence of random seeds in deep learning architectures for computer vision. arXiv 2021, arXiv:2109.08203. [Google Scholar]
- Xu, M.; Ding, R.; Zhao, H.; Qi, X. PAConv: Position Adaptive Convolution With Dynamic Kernel Assembling on Point Clouds. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Nashville, TN, USA, 19–25 June 2021; pp. 3173–3182. [Google Scholar]
- Yi, L.; Kim, V.G.; Ceylan, D.; Shen, I.C.; Yan, M.; Su, H.; Lu, C.; Huang, Q.; Sheffer, A.; Guibas, L. A scalable active framework for region annotation in 3D shape collections. ACM Trans. Graph. (ToG) 2016, 35, 1–12. [Google Scholar] [CrossRef]
- Munoz, D.; Bagnell, J.A.; Vandapel, N.; Hebert, M. Contextual classification with functional max-margin markov networks. In Proceedings of the 2009 IEEE Conference on Computer Vision and Pattern Recognition, Miami, FL, USA, 20–25 June 2009; pp. 975–982. [Google Scholar]
- Che, E.; Jung, J.; Olsen, M.J. Object recognition, segmentation, and classification of mobile laser scanning point clouds: A state of the art review. Sensors 2019, 19, 810. [Google Scholar] [CrossRef] [PubMed]
- Li, H.; Xu, Z.; Taylor, G.; Studer, C.; Goldstein, T. Visualizing the loss landscape of neural nets. Adv. Neural Inf. Process. Syst. 2018, 31, 1–11. [Google Scholar]
- Liu, Y.; Tian, B.; Lv, Y.; Li, L.; Wang, F.Y. Point cloud classification using content-based transformer via clustering in feature space. IEEE/CAA J. Autom. Sinica 2023. [Google Scholar] [CrossRef]
| Methods | Inputs | mAcc (%) | OA (%) | Parameters | FLOPs | Train Speed | Test Speed |
|---|---|---|---|---|---|---|---|
| PointNet | 1 k points | 86.0 | 89.2 | | | 223.8 | 308.5 |
| PointNet++ | 1 k points | | 90.7 | 1.41 M | | 223.8 | 308.5 |
| PointMLP w/o vote | 1 k points | 91.3 | 94.1 | 12.6 M | 14.6 G | 92 * | * (7.7 w/ CPU) |
| PointMLP-elite w/o vote | 1 k points | 90.9 | 93.6 | 0.7 M | 0.8 G | 328 | 822.6 (31 w/ CPU) |
| GhostMLP w/o vote | 1 k points | 91.5 | 93.9 | 6.0 M | 7.2 G | 118.5 | 308.5 (11 w/ CPU) |
| GhostMLP-S w/o vote | 1 k points | 90.6 | 93.3 | 0.6 M | 0.7 G | 378 | 823 (32.5 w/ CPU) |
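For reference, parameter and FLOP counts such as those in the table above are commonly measured with a profiler like THOP. The snippet below is a sketch under the assumption that a `GhostMLP` module and a (batch, channels, points) input layout are available; both are placeholders, not the authors' published interface.

```python
import torch
from thop import profile  # pip install thop

model = GhostMLP().eval()            # placeholder: substitute any nn.Module
points = torch.randn(1, 3, 1024)     # one sample of 1k points, (B, 3, N)
macs, params = profile(model, inputs=(points,))
# THOP counts multiply-accumulates; papers often quote these as FLOPs.
print(f"{macs / 1e9:.1f} G FLOPs, {params / 1e6:.1f} M parameters")
```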
| Methods | ScanObjectNN mAcc (%) | ScanObjectNN OA (%) | ModelNet40 mAcc (%) | ModelNet40 OA (%) | Parameters |
|---|---|---|---|---|---|
| PointNet [3] | 63.4 | 68.2 | 86.0 | 89.2 | - |
| PointNet++ [4] | 75.4 | 77.9 | - | 90.7 | 1.41 M |
| PointMLP [13] | 84.4 | 85.7 | 91.3 (+0.1 w/ vote) | 94.1 (+0.4 w/ vote) | 12.6 M |
| PointMLP-elite | 82.6 | 84.4 | 90.7 (+0.2 w/ vote) | 93.6 (+0.4 w/ vote) | 0.7 M |
| PointNeXt [12] | 86.8 | 88.2 | 91.6 | 94.0 | 1.6 M |
| HyCoRe [22] | 87.0 | 88.3 | 91.9 | 94.5 | - |
| GhostMLP-S | | | 90.9 (+0.1 w/ vote) | 93.3 (+0.6 w/ vote) | 0.6 M |
| GhostMLP | | | 91.6 (+0.4 w/ vote) | 93.7 (+0.3 w/ vote) | 6.0 M |
Part segmentation on ShapeNet-Part (unit: IoU, %), first half of the categories:

| Methods | Instance mIoU | aero | bag | cap | car | chair | earphone | guitar | knife |
|---|---|---|---|---|---|---|---|---|---|
| PointNet++ | 85.1 | 82.4 | 79.0 | 87.7 | 77.3 | 90.8 | 71.8 | 91.0 | 85.9 |
| PointMLP | 86.1 | 83.4 | 83.3 | 87.4 | 80.5 | 90.3 | 78.1 | 92.1 | 88.0 |
| GhostMLP | 86.1 | 84.5 | 87.0 | 88.3 | 80.6 | 90.3 | 81.3 | 92.0 | 88.6 |

Part segmentation on ShapeNet-Part (unit: IoU, %), second half of the categories:

| Methods | Class mIoU | lamp | laptop | moto | mug | pistol | rocket | skateboard | table |
|---|---|---|---|---|---|---|---|---|---|
| PointNet++ | 81.9 | 83.7 | 95.3 | 71.6 | 94.1 | 81.3 | 58.7 | 76.4 | 82.6 |
| PointMLP | 84.5 | 82.5 | 96.2 | 77.5 | 95.7 | 85.3 | 65.7 | 83.3 | 84.3 |
| GhostMLP | 85.0 | 82.1 | 96.0 | 77.6 | 95.0 | 84.3 | 64.2 | 83.7 | 84.0 |
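For context on the two aggregation schemes in these tables: instance mIoU averages the per-shape part IoU over all shapes, whereas class mIoU first averages within each category and then across the 16 categories. Below is a minimal sketch of the standard computation; the function names are ours, and the convention of scoring an absent part as IoU = 1 follows common PointNet-style evaluation code.

```python
import numpy as np

def shape_miou(pred, gt, parts):
    """Mean IoU over the parts of one shape (pred/gt: (N,) part labels)."""
    ious = []
    for p in parts:
        inter = np.sum((pred == p) & (gt == p))
        union = np.sum((pred == p) | (gt == p))
        ious.append(1.0 if union == 0 else inter / union)  # absent part -> 1
    return float(np.mean(ious))

def instance_and_class_miou(shape_ious, shape_cats):
    """shape_ious: per-shape mIoU list; shape_cats: category of each shape."""
    inst = float(np.mean(shape_ious))                      # average over shapes
    cats = sorted(set(shape_cats))
    per_cat = [np.mean([i for i, c in zip(shape_ious, shape_cats) if c == cat])
               for cat in cats]
    cls = float(np.mean(per_cat))                          # average over categories
    return inst, cls
```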
| Methods in ScanObjectNN | Mean Accuracy (%) | Overall Accuracy (%) | Beneficial or Not |
|---|---|---|---|
| GhostMLP-S (baseline) | | | |
| + epoch = 600 (simply called epoch below) | | | ✗ |
| + rotation + epoch | | | ✓ |
| + point dropout + epoch | | | ✓ |
| + scale + epoch | | | ✗ |
| + rotation + point dropout + epoch | | | ✓ |
| + rotation + point dropout + scale + epoch | | | ✓ |
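The three augmentations ablated above are standard point cloud transforms. The sketch below follows common practice from the PointNet++ codebase; the exact parameter ranges used in the paper are assumptions.

```python
import numpy as np

def random_rotate_z(pts):
    """Rotate a point cloud (N, 3) by a random angle about the z-axis."""
    theta = np.random.uniform(0.0, 2.0 * np.pi)
    c, s = np.cos(theta), np.sin(theta)
    rot = np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])
    return pts @ rot.T

def random_point_dropout(pts, max_dropout=0.875):
    """Randomly replace a fraction of points with the first point,
    keeping the tensor shape fixed (PointNet++-style dropout)."""
    ratio = np.random.uniform(0.0, max_dropout)
    mask = np.random.rand(len(pts)) < ratio
    out = pts.copy()
    out[mask] = pts[0]
    return out

def random_scale(pts, low=0.8, high=1.2):
    """Scale the whole cloud by a single random factor."""
    return pts * np.random.uniform(low, high)
```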
| | | OA (%) | mAcc (%) | Parameters (M) | FLOPs (G) |
|---|---|---|---|---|---|
| ✓ | ✓ | | | 0.597 | 0.725 |
| ✗ | ✓ | | | 0.637 | 0.844 |
| ✓ | ✗ | | | 0.637 | 0.730 |
| ✗ | ✗ | | | 0.683 | 0.849 |
| Repeat | Depth | mAcc (%) | OA (%) |
|---|---|---|---|
| | 24 layers | | |
| | 28 layers | | |
| | 40 layers | | |
| Methods | Sampling and Grouping | Feature Space | Feature Extraction |
|---|---|---|---|
| PointNet [3] | FPS | Euclidean | MLP |
| PointNet++ [4] | FPS | Euclidean | MLP |
| Point Transformer [8] | FPS | Euclidean | Attention |
| PointNeXt [12] | FPS | Euclidean | MLP |
| HyCoRe [22] | Geometric Affine | Hyperbolic | MLP |
| PointConT [34] | FPS | Euclidean | Attention |
| GhostMLP | Geometric Affine | Euclidean | MLP |
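Since the geometric affine module is the main point of difference in the sampling-and-grouping column, here is a minimal sketch of that module as described in PointMLP [13]: local neighborhoods are normalized by a scalar standard deviation of their offsets from the center point, then rescaled by learnable per-channel parameters. The tensor shapes and variable names are our assumptions.

```python
import torch
import torch.nn as nn

class GeometricAffine(nn.Module):
    """Geometric affine module (after PointMLP [13]): normalize each local
    neighborhood by the scalar std of its offsets, then apply a learnable
    per-channel affine transform."""
    def __init__(self, channels, eps=1e-5):
        super().__init__()
        self.alpha = nn.Parameter(torch.ones(1, 1, 1, channels))
        self.beta = nn.Parameter(torch.zeros(1, 1, 1, channels))
        self.eps = eps

    def forward(self, grouped, centers):
        # grouped: (B, S, K, C) neighborhood features; centers: (B, S, C)
        diff = grouped - centers.unsqueeze(2)            # offsets from centers
        sigma = diff.reshape(diff.shape[0], -1).std(dim=-1)   # scalar per sample
        sigma = sigma.view(-1, 1, 1, 1)
        return self.alpha * diff / (sigma + self.eps) + self.beta
```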