DOI: 10.1145/3664647.3680700
research-article
Open access

On-the-fly Point Feature Representation for Point Clouds Analysis

Published: 28 October 2024 Publication History

Abstract

Point cloud analysis is challenging because point clouds are unordered, sparse, and irregular. Prior works attempt to capture local relationships through convolution operations or attention mechanisms, exploiting geometric information from coordinates only implicitly. These methods, however, are insufficient to describe explicit local geometry, e.g., curvature and orientation. In this paper, we propose On-the-fly Point Feature Representation (OPFR), which captures abundant geometric information explicitly through the Curve Feature Generator module. This is inspired by the Point Feature Histogram (PFH) from the computer vision community. However, vanilla PFH is difficult to apply to large datasets and dense point clouds because feature generation takes considerable time. To address this, we introduce the Local Reference Constructor module, which approximates the local coordinate systems based on triangle sets. Owing to this, OPFR requires only an extra 1.56 ms for inference (65× faster than vanilla PFH) and 0.012M more parameters, and it can serve as a versatile plug-and-play module for various backbones, particularly the MLP-based and Transformer-based backbones examined in this study. Additionally, we introduce the novel Hierarchical Sampling module, which enhances the quality of the triangle sets and thereby ensures the robustness of the obtained geometric features. Building upon the PointNet++ backbone, our method improves overall accuracy (OA) on ModelNet40 from 90.7% to 94.5% (+3.8%) for classification, and OA on S3DIS Area-5 from 86.4% to 90.0% (+3.6%) for semantic segmentation. When integrated with the Point Transformer backbone, we achieve state-of-the-art results on both tasks: 94.8% OA on ModelNet40 and 91.7% OA on S3DIS Area-5.
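To make the two ingredients named in the abstract more concrete, the following is a minimal sketch, not the authors' implementation. It assumes (a) that a local surface normal can be approximated from a single triangle via a cross product, and (b) the commonly cited PFH pair-feature formulation based on a Darboux frame. The function names `triangle_normal` and `pfh_pair_features` are hypothetical. OPFR's actual Local Reference Constructor, Hierarchical Sampling, and Curve Feature Generator modules are not specified in the abstract and may differ substantially.

```python
import numpy as np


def triangle_normal(p, q, r):
    """Unit normal of triangle (p, q, r) from the cross product of two edges.

    Illustrative assumption: one triangle stands in for the local surface.
    """
    n = np.cross(q - p, r - p)
    norm = np.linalg.norm(n)
    if norm == 0.0:
        raise ValueError("degenerate triangle: points are collinear")
    return n / norm


def pfh_pair_features(p_s, n_s, p_t, n_t):
    """PFH-style features (alpha, phi, theta, d) for one source/target pair.

    Builds the Darboux frame (u, v, w) at the source point and measures how
    the target normal and the connecting line are oriented within it.
    """
    delta = p_t - p_s
    d = np.linalg.norm(delta)
    if d == 0.0:
        raise ValueError("source and target points coincide")
    direction = delta / d

    u = n_s
    v = np.cross(u, direction)
    v_norm = np.linalg.norm(v)
    if v_norm == 0.0:
        raise ValueError("source normal is parallel to the connecting line")
    v = v / v_norm
    w = np.cross(u, v)

    alpha = float(np.dot(v, n_t))
    phi = float(np.dot(u, direction))
    theta = float(np.arctan2(np.dot(w, n_t), np.dot(u, n_t)))
    return alpha, phi, theta, float(d)


# Two points on the flat plane z = 0 share the same normal, so all three
# angular features vanish and only the distance is non-zero.
normal = triangle_normal(
    np.array([0.0, 0.0, 0.0]),
    np.array([1.0, 0.0, 0.0]),
    np.array([0.0, 1.0, 0.0]),
)
features = pfh_pair_features(
    np.array([0.0, 0.0, 0.0]), normal,
    np.array([1.0, 0.0, 0.0]), normal,
)
```

On a flat patch the angular features are all zero. Curvature and changes in orientation show up as non-zero values, which is the kind of explicit local geometry the abstract refers to.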

Published In

MM '24: Proceedings of the 32nd ACM International Conference on Multimedia
October 2024
11719 pages
ISBN: 9798400706868
DOI: 10.1145/3664647
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. classification
  2. local geometry
  3. point clouds representation
  4. scene understanding
  5. semantic segmentation

Qualifiers

  • Research-article

Funding Sources

  • Agency for Science, Technology and Research (A*STAR)

Conference

MM '24
MM '24: The 32nd ACM International Conference on Multimedia
October 28 - November 1, 2024
Melbourne VIC, Australia

Acceptance Rates

MM '24 Paper Acceptance Rate 1,150 of 4,385 submissions, 26%;
Overall Acceptance Rate 2,145 of 8,556 submissions, 25%
