research-article

Multi-Scale Superpoint Network for 3D Point Cloud Semantic Segmentation

Authors:

Haofeng ZhangAuthors Info & Claims

MMAsia '23: Proceedings of the 5th ACM International Conference on Multimedia in Asia

Article No.: 75, Pages 1 - 7

https://doi.org/10.1145/3595916.3626449

Published: 01 January 2024 Publication History

Abstract

3D point cloud semantic segmentation is a fundamental task for 3D scene understanding. However, most existing pipelines usually use k-NN or ball query operation to form hard neighborhoods, which may cross different semantic objects, resulting low-quality local features. To address this issue, we propose a multi-scale superpoint network that gradually generates multi-scale soft neighborhoods to extract geometric local features, thereby boosting the 3D semantic segmentation performance. Specifically, we present a simple yet efficient superpoint merging module that merge small-scale superpoints to obtain large-scale superpoint by considering the feature similarity of superpoints, so that we can obtain multi-scale geometric features of point clouds. We also develop a superpoint upsampling module that adopt inverse mapping function to propagate multi-scale features from low-resolution point cloud to high-resolution point cloud. By integrating our multi-scale superpoint network into a simple point based semantic segmentation network, our method can obtain SOTA results on S3DIS Area 5 and 6-fold, and competitive results on ScanNet v2.

References

[1]

Iro Armeni, Ozan Sener, Amir R Zamir, Helen Jiang, Ioannis Brilakis, Martin Fischer, and Silvio Savarese. 2016. 3d semantic parsing of large-scale indoor spaces. In Proceedings of the IEEE conference on computer vision and pattern recognition. 1534–1543.

[2]

Daniel Bolya, Cheng-Yang Fu, Xiaoliang Dai, Peizhao Zhang, Christoph Feichtenhofer, and Judy Hoffman. 2022. Token merging: Your vit but faster. arXiv preprint arXiv:2210.09461 (2022).

[3]

Xiaozhi Chen, Huimin Ma, Ji Wan, Bo Li, and Tian Xia. 2017. Multi-view 3D object detection network for autonomous driving. In Proceedings of the IEEE conference on Computer Vision and Pattern Recognition. 1907–1915.

[4]

Christopher Choy, JunYoung Gwak, and Silvio Savarese. 2019. 4D spatio-temporal convnets: Minkowski convolutional neural networks. In Proceedings of the IEEE/CVF conference on computer vision and pattern recognition. 3075–3084.

[5]

Angela Dai, Angel X Chang, Manolis Savva, Maciej Halber, Thomas Funkhouser, and Matthias Nießner. 2017. Scannet: Richly-annotated 3d reconstructions of indoor scenes. In Proceedings of the IEEE conference on computer vision and pattern recognition. 5828–5839.

[6]

Alexey Dosovitskiy, Lucas Beyer, Alexander Kolesnikov, Dirk Weissenborn, Xiaohua Zhai, Thomas Unterthiner, Mostafa Dehghani, Matthias Minderer, Georg Heigold, Sylvain Gelly, 2020. An image is worth 16x16 words: Transformers for image recognition at scale. arXiv preprint arXiv:2010.11929 (2020).

[7]

Benjamin Graham, Martin Engelcke, and Laurens Van Der Maaten. 2018. 3D semantic segmentation with submanifold sparse convolutional networks. In Proceedings of the IEEE conference on computer vision and pattern recognition. 9224–9232.

[8]

Meng-Hao Guo, Jun-Xiong Cai, Zheng-Ning Liu, Tai-Jiang Mu, Ralph R Martin, and Shi-Min Hu. 2021. Pct: Point cloud transformer. Computational Visual Media 7 (2021), 187–199.

[9]

Qingyong Hu, Bo Yang, Linhai Xie, Stefano Rosa, Yulan Guo, Zhihua Wang, Niki Trigoni, and Andrew Markham. 2020. Randla-net: Efficient semantic segmentation of large-scale point clouds. In Proceedings of the IEEE/CVF conference on computer vision and pattern recognition. 11108–11117.

[10]

Le Hui, Jia Yuan, Mingmei Cheng, Jin Xie, Xiaoya Zhang, and Jian Yang. 2021. Superpoint network for point cloud oversegmentation. In Proceedings of the IEEE/CVF International Conference on Computer Vision. 5510–5519.

[11]

Xin Lai, Jianhui Liu, Li Jiang, Liwei Wang, Hengshuang Zhao, Shu Liu, Xiaojuan Qi, and Jiaya Jia. 2022. Stratified transformer for 3D point cloud segmentation. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 8500–8509.

[12]

Loic Landrieu and Martin Simonovsky. 2018. Large-scale point cloud semantic segmentation with superpoint graphs. In Proceedings of the IEEE conference on computer vision and pattern recognition. 4558–4567.

[13]

Bo Li, Tianlei Zhang, and Tian Xia. 2016. Vehicle detection from 3d lidar using fully convolutional network. arXiv preprint arXiv:1608.07916 (2016).

[14]

Yangyan Li, Rui Bu, Mingchao Sun, Wei Wu, Xinhan Di, and Baoquan Chen. 2018. Pointcnn: Convolution on x-transformed points. Advances in neural information processing systems 31 (2018).

[15]

Haojia Lin, Xiawu Zheng, Lijiang Li, Fei Chao, Shanshan Wang, Yan Wang, Yonghong Tian, and Rongrong Ji. 2023. Meta Architecture for Point Cloud Analysis. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 17682–17691.

[16]

Yangbin Lin, Cheng Wang, Dawei Zhai, Wei Li, and Jonathan Li. 2018. Toward better boundary preserved supervoxel segmentation for 3D point clouds. ISPRS journal of photogrammetry and remote sensing 143 (2018), 39–47.

[17]

Daniel Maturana and Sebastian Scherer. 2015. VoxNet: A 3D convolutional neural network for real-time object recognition. In 2015 IEEE/RSJ international conference on intelligent robots and systems (IROS). IEEE, 922–928.

Digital Library

[18]

Yatian Pang, Wenxiao Wang, Francis EH Tay, Wei Liu, Yonghong Tian, and Li Yuan. 2022. Masked autoencoders for point cloud self-supervised learning. In European conference on computer vision. Springer, 604–621.

Digital Library

[19]

Jeremie Papon, Alexey Abramov, Markus Schoeler, and Florentin Worgotter. 2013. Voxel cloud connectivity segmentation-supervoxels for point clouds. In Proceedings of the IEEE conference on computer vision and pattern recognition. 2027–2034.

Digital Library

[20]

Charles R Qi, Hao Su, Kaichun Mo, and Leonidas J Guibas. 2017. PointNet: Deep learning on point sets for 3D classification and segmentation. In Proceedings of the IEEE conference on computer vision and pattern recognition. 652–660.

[21]

Charles Ruizhongtai Qi, Li Yi, Hao Su, and Leonidas J Guibas. 2017. PointNet++: Deep hierarchical feature learning on point sets in a metric space. Advances in neural information processing systems 30 (2017).

[22]

Guocheng Qian, Yuchen Li, Houwen Peng, Jinjie Mai, Hasan Hammoud, Mohamed Elhoseiny, and Bernard Ghanem. 2022. Pointnext: Revisiting pointnet++ with improved training and scaling strategies. Advances in Neural Information Processing Systems 35 (2022), 23192–23204.

[23]

Haoxi Ran, Jun Liu, and Chengjie Wang. 2022. Surface representation for point clouds. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 18942–18952.

[24]

Shuran Song, Fisher Yu, Andy Zeng, Angel X Chang, Manolis Savva, and Thomas Funkhouser. 2017. Semantic scene completion from a single depth image. In Proceedings of the IEEE conference on computer vision and pattern recognition. 1746–1754.

[25]

Hang Su, Subhransu Maji, Evangelos Kalogerakis, and Erik Learned-Miller. 2015. Multi-view convolutional neural networks for 3D shape recognition. In Proceedings of the IEEE international conference on computer vision. 945–953.

Digital Library

[26]

Liyao Tang, Yibing Zhan, Zhe Chen, Baosheng Yu, and Dacheng Tao. 2022. Contrastive boundary learning for point cloud segmentation. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 8489–8499.

[27]

Hugues Thomas, Charles R Qi, Jean-Emmanuel Deschaud, Beatriz Marcotegui, François Goulette, and Leonidas J Guibas. 2019. KPConv: Flexible and deformable convolution for point clouds. In Proceedings of the IEEE/CVF international conference on computer vision. 6411–6420.

[28]

Lei Wang, Yuchun Huang, Yaolin Hou, Shenman Zhang, and Jie Shan. 2019. Graph attention convolution for point cloud semantic segmentation. In Proceedings of the IEEE/CVF conference on computer vision and pattern recognition. 10296–10305.

[29]

Yue Wang, Yongbin Sun, Ziwei Liu, Sanjay E Sarma, Michael M Bronstein, and Justin M Solomon. 2019. Dynamic graph cnn for learning on point clouds. ACM Transactions on Graphics (tog) 38, 5 (2019), 1–12.

Digital Library

[30]

Xiaoyang Wu, Yixing Lao, Li Jiang, Xihui Liu, and Hengshuang Zhao. 2022. Point transformer v2: Grouped vector attention and partition-based pooling. Advances in Neural Information Processing Systems 35 (2022), 33330–33342.

[31]

Xumin Yu, Lulu Tang, Yongming Rao, Tiejun Huang, Jie Zhou, and Jiwen Lu. 2022. Point-Bert: Pre-training 3D point cloud transformers with masked point modeling. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 19313–19322.

[32]

Hengshuang Zhao, Li Jiang, Chi-Wing Fu, and Jiaya Jia. 2019. PointWeb: Enhancing local neighborhood features for point cloud processing. In Proceedings of the IEEE/CVF conference on computer vision and pattern recognition. 5565–5573.

[33]

Hengshuang Zhao, Li Jiang, Jiaya Jia, Philip HS Torr, and Vladlen Koltun. 2021. Point transformer. In Proceedings of the IEEE/CVF international conference on computer vision. 16259–16268.

Index Terms

Multi-Scale Superpoint Network for 3D Point Cloud Semantic Segmentation
1. Computing methodologies
  1. Artificial intelligence
    1. Computer vision
      1. Computer vision tasks
        Scene understanding

Recommendations

Transformer-based Point Cloud Generation Network
MM '23: Proceedings of the 31st ACM International Conference on Multimedia

Point cloud generation is an important research topic in 3D computer vision, which can provide high-quality datasets for various downstream tasks. However, efficiently capturing the geometry of point clouds remains a challenging problem due to their ...
Automatic Segmentation of the Prostate on 3D CT Images by Using Multiple Deep Learning Networks
ICBBE '18: Proceedings of the 2018 5th International Conference on Biomedical and Bioinformatics Engineering

Automatic segmentation of the prostate on CT images has many applications in prostate cancer diagnosis and therapy. However, prostate segmentation from CT images is a very challenging task due to the low contrast of soft tissue and the large variations ...
Point cloud semantic segmentation method based on self-attention and pyramid fusion network model
EITCE '22: Proceedings of the 2022 6th International Conference on Electronic Information Technology and Computer Engineering

Semantic segmentation of point clouds has received extensive attention as the basis for 3D scene understanding, recognition, and various applications. However, the research on semantic segmentation of 3D point cloud scenes is extremely challenging due ...

Comments

Please enable JavaScript to view thecomments powered by Disqus.

Information & Contributors

Information

Published In

cover image ACM Conferences

MMAsia '23: Proceedings of the 5th ACM International Conference on Multimedia in Asia

December 2023

745 pages

ISBN:9798400702051

DOI:10.1145/3595916

Copyright © 2023 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Sponsors

SIGMM: ACM Special Interest Group on Multimedia

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 01 January 2024

Permissions

Request permissions for this article.

Request Permissions

Check for updates

Author Tags

Qualifiers

Research-article
Research
Refereed limited

Conference

MMAsia '23

Sponsor:

SIGMM

MMAsia '23: ACM Multimedia Asia

December 6 - 8, 2023

Tainan, Taiwan

Acceptance Rates

Overall Acceptance Rate 59 of 204 submissions, 29%

Contributors

Other Metrics

View Article Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

0
Total Citations
189
Total Downloads

Downloads (Last 12 months)118
Downloads (Last 6 weeks)11

Reflects downloads up to 05 Mar 2025

Other Metrics

View Author Metrics

Citations

View Options

Login options

Check if you have access through your login credentials or your institution to get full access on this article.

Full Access

Get this Publication

View options

PDF

View or Download as a PDF file.

eReader

View online with eReader.

HTML Format

View this article in HTML Format.

Figures

Tables

Media

View Table of Conten