
Anti-distractors: two-branch siamese tracker with both static and dynamic filters for object tracking

  • Regular Paper
  • Published in: Multimedia Systems

Abstract

Visual object tracking is a challenging task because of the large appearance variations caused by illumination, deformation, and motion. Siamese network-based trackers, which locate the target through a matching function, are widely used for visual object tracking and can robustly recognize the target under appearance variation. However, although the filter template is a crucial component of such methods, most of them do not update it effectively and show limited ability to discriminate between the target and objects with similar semantics (distractors). To tackle the challenge of distractors, we add a dynamic filter branch to the traditional siamese network. When multiple peaks are detected on the static response map, the tracker re-detects the target with the dynamic branch, and the final target location is determined by fusing the results of the dynamic and static filter branches. The sample library is then updated with a hard negative mining strategy, and the dynamic filter kernel is updated online. With the fusion of the two branches, the tracker can distinguish the true target from similar objects. We conduct extensive experiments and empirical evaluations on two popular datasets, VisDrone and UAV123: our tracker achieves an AUC of 58% on VisDrone and an AUC of 60.7% on UAV123.
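The switching logic described above (fall back to the dynamic branch only when the static response map is ambiguous, then fuse the two maps) can be sketched as follows. This is a minimal illustration, not the paper's implementation: `find_peaks`, `fuse_and_locate`, the 0.7 peak threshold, and the equal-weight fusion are all hypothetical choices for the sketch.

```python
import numpy as np

def find_peaks(response, threshold_ratio=0.7, min_distance=3):
    """Return local maxima of a 2-D response map scoring at least
    threshold_ratio times the global maximum (a simple stand-in for
    the paper's multi-peak condition)."""
    peaks = []
    h, w = response.shape
    gmax = response.max()
    for i in range(1, h - 1):
        for j in range(1, w - 1):
            v = response[i, j]
            if v < threshold_ratio * gmax:
                continue
            # local maximum within its 3x3 neighbourhood
            if v >= response[i - 1:i + 2, j - 1:j + 2].max():
                # suppress peaks too close to an already accepted one
                if all(abs(i - pi) + abs(j - pj) >= min_distance
                       for pi, pj in peaks):
                    peaks.append((i, j))
    return peaks

def fuse_and_locate(static_resp, dynamic_fn, weight=0.5):
    """If the static branch produces multiple comparable peaks
    (possible distractors), query the dynamic branch and fuse the
    two response maps; otherwise trust the static branch alone."""
    if len(find_peaks(static_resp)) > 1:
        dynamic_resp = dynamic_fn()  # re-detection by the dynamic branch
        fused = weight * static_resp + (1 - weight) * dynamic_resp
    else:
        fused = static_resp
    return np.unravel_index(np.argmax(fused), fused.shape)
```

With a single unambiguous peak the dynamic branch is never queried, which is consistent with the abstract's conditional re-detection: the extra cost is paid only when distractors are suspected.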




Author information

Correspondence to Guangyu Gao.

Additional information

Communicated by I. IDE.

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.



About this article


Cite this article

Shen, H., Lin, D., Song, T. et al. Anti-distractors: two-branch siamese tracker with both static and dynamic filters for object tracking. Multimedia Systems 26, 631–641 (2020). https://doi.org/10.1007/s00530-020-00670-9

