Abstract
This paper addresses tracking challenges such as target occlusion and deformation. It proposes a new tracking method that extracts and evaluates multiple features for both the target region and its adjacent surroundings. These features separately describe the key factors for detecting the target: the color feature, the shape and contour feature, and the distributions of structure and intensity, described by the Pearson correlation coefficient. They serve as the basic representation of the target template and the candidates, and are used to define a matching algorithm between them. The best-matched candidate is taken as the final tracking result. To improve the efficiency of the target template and candidates, a region of importance (ROI) for the target is obtained by evaluating the distribution of saliency values over many extended regions. The ROIs yield more accurate regions from which to form the target template and candidates. Finally, a new template update method is defined based on the precision of the tracked result, so that the template adapts to the target state and tracking can continue in subsequent frames. Using 25 videos from a visual tracking benchmark, we evaluate 12 different trackers quantitatively and qualitatively. The experiments demonstrate that our tracker produces much better results than existing trackers in handling target occlusion, deformation, rotation, and background clutter.
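As a rough illustration of the Pearson correlation coefficient mentioned above, the following sketch scores candidate patches against a template by the correlation of their intensities and keeps the best match. It is not the paper's matching algorithm, which also combines color and shape/contour features; the function names `pearson` and `best_candidate`, the flattened-list patch representation, and the choice to score a flat patch as 0 are all assumptions made for this example.

```python
from math import sqrt

def pearson(a, b):
    """Pearson correlation coefficient of two equal-length intensity sequences."""
    if len(a) != len(b) or not a:
        raise ValueError("patches must be non-empty and of equal length")
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    # Covariance and variances, left unnormalized because the 1/n factors cancel.
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        # Assumption for this sketch: a flat patch has no structure to match.
        return 0.0
    return cov / sqrt(var_a * var_b)

def best_candidate(template, candidates):
    """Return (index, score) of the candidate most correlated with the template."""
    scores = [pearson(template, c) for c in candidates]
    best = max(range(len(scores)), key=scores.__getitem__)
    return best, scores[best]

# Hypothetical flattened grayscale patches.
template = [10, 20, 30, 40]
candidates = [
    [40, 30, 20, 10],  # reversed intensity structure
    [12, 22, 32, 42],  # same structure, uniformly brighter
    [10, 10, 10, 10],  # flat patch
]
index, score = best_candidate(template, candidates)
```

Because the coefficient is invariant to a uniform brightness shift, the second candidate is selected even though its raw intensities differ from the template. This is one reason a correlation-based feature can complement color features under illumination change.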
Acknowledgments
This work was funded by the following projects. Yun Liang is supported by the Natural Science Foundation of China (No. 61772209) and the Science and Technology Planning Project of Guangdong Province (No. 2019A050510034, 2019B020219001). Jian Zhang is supported by the Natural Science Foundation of China (No. 61972361). Chen Lin is supported by the Natural Science Foundation of China (No. 61472335, 61972328). Jun Xiao is supported by the National Natural Science Foundation of China (No. 61572431) and the Zhejiang Natural Science Foundation (No. LY17F020009, LR19F020002, LZ17F020001).
CLC number: TP 37
Appendix 1: All The Precision Plots and Success Plots
Appendix 2: Comparisons on All 25 Videos
Liang, Y., Zhang, J., Wang, Mh. et al. Multi-features guided robust visual tracking. Multimed Tools Appl 80, 16367–16395 (2021). https://doi.org/10.1007/s11042-020-08791-z
DOI: https://doi.org/10.1007/s11042-020-08791-z