Computer Science > Computer Vision and Pattern Recognition

arXiv:2406.20024 (cs)

[Submitted on 28 Jun 2024 (v1), last revised 4 Nov 2024 (this version, v3)]

Title: eMoE-Tracker: Environmental MoE-based Transformer for Robust Event-guided Object Tracking

Abstract: The unique complementarity of frame-based and event cameras for high frame rate object tracking has recently inspired research on multi-modal fusion approaches. However, these methods fuse both modalities directly and thus ignore environmental attributes such as motion blur, illumination variance, occlusion, and scale variation. Meanwhile, insufficient interaction between search and template features makes it difficult to distinguish target objects from the background. As a result, performance degrades, especially under challenging conditions. This paper proposes a novel and effective Transformer-based event-guided tracking framework, called eMoE-Tracker, which achieves new state-of-the-art (SOTA) performance under various conditions. Our key idea is to disentangle the environment into several learnable attributes so that attribute-specific features can be learned dynamically, and to strengthen the target information by improving the interaction between the target template and search regions. To achieve this goal, we first propose an environmental Mix-of-Experts (eMoE) module that is built upon environmental Attributes Disentanglement, which learns attribute-specific features, and environmental Attributes Assembling, which dynamically assembles the attribute-specific features using learnable attribute scores. The eMoE module is a subtle router that prompt-tunes the transformer backbone more efficiently. We then introduce a contrastive relation modeling (CRM) module that emphasizes target information by applying a contrastive learning strategy between the target template and search regions. Extensive experiments on diverse event-based benchmark datasets demonstrate the superior performance of our eMoE-Tracker compared to prior art.
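
The assembling step described in the abstract, in which attribute-specific features are combined according to learnable attribute scores, can be illustrated with a generic softmax-gated mixture of experts. The following is a minimal sketch under stated assumptions, not the authors' implementation: the names `softmax`, `attribute_scores`, and `assemble`, the linear gate, and the toy experts are all hypothetical, and the abstract does not specify how the scores are computed or normalized.

```python
import math
from typing import Callable, List, Sequence

Vector = List[float]
Expert = Callable[[Vector], Vector]


def softmax(scores: Sequence[float]) -> Vector:
    """Numerically stable softmax over a list of raw scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attribute_scores(feature: Vector, gate_weights: Sequence[Vector]) -> Vector:
    """Assumed linear gate: one raw score per attribute, normalized by softmax."""
    raw = [sum(w * x for w, x in zip(row, feature)) for row in gate_weights]
    return softmax(raw)


def assemble(feature: Vector, experts: Sequence[Expert],
             gate_weights: Sequence[Vector]) -> Vector:
    """Weight each expert's output by its attribute score and sum the results."""
    probs = attribute_scores(feature, gate_weights)
    outputs = [expert(feature) for expert in experts]
    return [sum(p * out[i] for p, out in zip(probs, outputs))
            for i in range(len(feature))]


# Hypothetical experts standing in for attribute-specific branches
# (for example one for motion blur and one for low illumination).
toy_experts = [
    lambda v: list(v),              # identity branch
    lambda v: [3.0 * x for x in v]  # scaling branch
]

# With an all-zero gate, both attributes receive equal scores.
uniform_gate = [[0.0, 0.0], [0.0, 0.0]]
mixed = assemble([1.0, 2.0], toy_experts, uniform_gate)
```

In this toy setup the gate weights would be the learnable parameters; training them lets the mixture lean on whichever attribute-specific branch suits the current input.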

Comments:	RGB-event single object tracking
Subjects:	Computer Vision and Pattern Recognition (cs.CV)
Cite as:	arXiv:2406.20024 [cs.CV]
	(or arXiv:2406.20024v3 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.2406.20024

Submission history

From: Yucheng Chen
[v1] Fri, 28 Jun 2024 16:13:55 UTC (9,103 KB)
[v2] Sat, 7 Sep 2024 07:28:03 UTC (11,116 KB)
[v3] Mon, 4 Nov 2024 06:08:30 UTC (9,691 KB)
