Computer Science > Computer Vision and Pattern Recognition

arXiv:2211.10927 (cs)

[Submitted on 20 Nov 2022]

Title: GLT-T: Global-Local Transformer Voting for 3D Single Object Tracking in Point Clouds

Authors: Jiahao Nie, Zhiwei He, Yuxiang Yang, Mingyu Gao, Jing Zhang


Abstract: Current 3D single object tracking methods are typically based on VoteNet, a 3D region proposal network. Despite their success, using a single seed point feature as the cue for offset learning in VoteNet prevents high-quality 3D proposals from being generated. Moreover, seed points of differing importance are treated equally in the voting process, which aggravates this defect. To address these issues, we propose a novel global-local transformer voting scheme that provides more informative cues and guides the model to pay more attention to potential seed points, promoting the generation of high-quality 3D proposals. Technically, a global-local transformer (GLT) module integrates object- and patch-aware priors into the seed point features, forming a strong feature representation of the seed points' geometric positions and thus providing more robust and accurate cues for offset learning. Subsequently, a simple yet effective training strategy is designed to train the GLT module: we develop an importance prediction branch that learns the potential importance of the seed points, and we treat its output weight vector as a training constraint term. Combining these components, we present a tracking method, GLT-T. Extensive experiments on the challenging KITTI and NuScenes benchmarks demonstrate that GLT-T achieves state-of-the-art performance in the 3D single object tracking task. Further ablation studies show the advantages of the proposed global-local transformer voting scheme over the original VoteNet. Code and models will be available at this https URL.
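The abstract does not give the loss formulation, so the following is only a minimal sketch of one possible reading of "importance as a training constraint": each seed point casts a vote (seed position plus predicted offset), and a per-seed importance score, normalized with a softmax, weights that seed's L1 vote error. The function names `softmax` and `weighted_vote_loss`, the softmax normalization, and the L1 error are all assumptions made for illustration, not the authors' method.

```python
import math
from typing import List, Sequence

Point = Sequence[float]


def softmax(scores: Sequence[float]) -> List[float]:
    """Numerically stable softmax over a list of raw scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def weighted_vote_loss(
    seeds: Sequence[Point],
    offsets: Sequence[Point],
    importance_scores: Sequence[float],
    target_center: Point,
) -> float:
    """Importance-weighted L1 vote error (hypothetical formulation).

    Each vote is seed + offset; its L1 distance to the target center
    is scaled by the seed's normalized importance weight.
    """
    weights = softmax(importance_scores)
    loss = 0.0
    for seed, offset, weight in zip(seeds, offsets, weights):
        vote = [s + o for s, o in zip(seed, offset)]
        error = sum(abs(v - t) for v, t in zip(vote, target_center))
        loss += weight * error
    return loss


# Two seeds with zero offsets: the first is 3 units (L1) from the
# target center, the second sits on it.
seeds = [(0.0, 0.0, 0.0), (1.0, 1.0, 1.0)]
offsets = [(0.0, 0.0, 0.0), (0.0, 0.0, 0.0)]
center = (1.0, 1.0, 1.0)

equal_loss = weighted_vote_loss(seeds, offsets, [0.0, 0.0], center)
skewed_loss = weighted_vote_loss(seeds, offsets, [0.0, 5.0], center)
```

In this toy setup, raising the second seed's importance score shifts weight away from the poorly placed first seed, so `skewed_loss` is smaller than `equal_loss`. That only illustrates the general idea of treating seeds unequally; the paper itself should be consulted for the actual constraint term.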

Comments:	Accepted to AAAI 2023. The source code and models will be available at this https URL
Subjects:	Computer Vision and Pattern Recognition (cs.CV)
Cite as:	arXiv:2211.10927 [cs.CV]
	(or arXiv:2211.10927v1 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.2211.10927

Submission history

From: Jiahao Nie
[v1] Sun, 20 Nov 2022 09:53:24 UTC (1,575 KB)
