research-article

Multi-object tracking algorithm based on interactive attention network and adaptive trajectory reconnection

Published: 17 July 2024

Abstract

Multi-object tracking (MOT) detects multiple targets in an image and assigns a unique identifier to each target. However, rapid motion, occlusion, and camera motion in the tracking scene can cause identity switches (IDs) and missing trajectories, which degrade tracker performance. To address these issues, this paper presents an MOT algorithm based on an interactive attention network and adaptive trajectory reconnection. First, an interactive attention network learns features for the two tasks of detection and tracking, which alleviates feature conflicts between them and allows sufficient feature information to be extracted. Second, a new cost matrix fuses motion and feature information, thereby reducing the number of IDs. Finally, an extreme gradient boosting (XGBoost) reconnection module performs adaptive trajectory reconnection and reduces missing trajectories. The proposed algorithm achieved 61.5% and 55.4% HOTA on the standard MOT17 and MOT20 datasets, respectively, improvements of 3% and 1.6% over FairMOT. Compared with state-of-the-art algorithms, the proposed algorithm also demonstrated superior tracking performance.
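The abstract does not give the form of the proposed cost matrix, so the following is only a minimal sketch of the general idea of fusing motion and feature (appearance) information into one association cost. Everything in it is an assumption made for illustration: the IoU distance as the motion term, the cosine distance as the feature term, the `weight` parameter, the `max_cost` gate, and the greedy matching step (trackers commonly use Hungarian assignment instead). It should not be read as the paper's method.

```python
import math


def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def cosine_distance(u, v):
    """1 - cosine similarity; returns 1.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    if norm_u == 0 or norm_v == 0:
        return 1.0
    return 1.0 - dot / (norm_u * norm_v)


def fused_cost_matrix(track_boxes, track_feats, det_boxes, det_feats, weight=0.5):
    """Rows are tracks, columns are detections.

    Assumed fusion rule (illustrative only):
        cost = weight * (1 - IoU) + (1 - weight) * cosine_distance
    """
    cost = []
    for t_box, t_feat in zip(track_boxes, track_feats):
        row = []
        for d_box, d_feat in zip(det_boxes, det_feats):
            motion = 1.0 - iou(t_box, d_box)
            appearance = cosine_distance(t_feat, d_feat)
            row.append(weight * motion + (1.0 - weight) * appearance)
        cost.append(row)
    return cost


def greedy_match(cost, max_cost=0.7):
    """Pick the cheapest remaining (track, detection) pairs below max_cost."""
    candidates = sorted(
        (c, i, j) for i, row in enumerate(cost) for j, c in enumerate(row)
    )
    used_tracks, used_dets, matches = set(), set(), []
    for c, i, j in candidates:
        if c > max_cost:
            break  # all remaining candidates are more expensive
        if i in used_tracks or j in used_dets:
            continue
        used_tracks.add(i)
        used_dets.add(j)
        matches.append((i, j))
    return matches


# Two tracks and two detections, listed in swapped order.
track_boxes = [(0, 0, 10, 10), (20, 20, 30, 30)]
track_feats = [(1.0, 0.0), (0.0, 1.0)]
det_boxes = [(21, 21, 31, 31), (1, 1, 11, 11)]
det_feats = [(0.0, 1.0), (1.0, 0.0)]

cost = fused_cost_matrix(track_boxes, track_feats, det_boxes, det_feats)
matches = greedy_match(cost)
print(matches)
```

In this toy example each track is paired with the detection that both overlaps it and has a similar feature vector. Using a single fused cost means that a detection with a plausible position but a dissimilar appearance (or the reverse) is penalised, which is the intuition behind reducing identity switches.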

Published In

Expert Systems with Applications: An International Journal, Volume 249, Issue PA
Sep 2024
1543 pages

Publisher

Pergamon Press, Inc.

United States

Author Tags

  1. Multi-object tracking
  2. Interactive attention network
  3. Camera motion compensation
  4. Cost matrix
  5. Adaptive trajectory reconnection

Qualifiers

  • Research-article
