Siamese Centerness Prediction Network for Real-Time Visual Object Tracking

Abstract

Siamese networks have been proven to achieve excellent results in visual object tracking, with SiamFC (Fully-Convolutional) among the best-known seminal works. Recently, following the successful application of the Region Proposal Network (RPN) in object detection, Siamese networks combined with RPN have achieved good performance in visual tracking tasks. However, RPN requires selecting the number, aspect ratio and size of the anchor boxes, and these anchor-related parameters more often than not need manual intervention and tuning. In this work, we first add a channel-aware module to the Siamese network to obtain more discriminative features. Thereafter, we propose an anchor-free strategy to replace the RPN module. The proposed framework consists of two networks, namely the Siamese network and the Centerness Prediction Network (CPN). We call the proposed method SiamCPN. In the Siamese network, ResNet50 is used as the backbone. SiamCPN is simple and relatively efficient because it avoids the complicated hyper-parameters of anchor boxes. Extensive experimental results on four visual tracking benchmark datasets, OTB100, VOT2016, UAV123 and LaSOT, show that the proposed framework achieves highly competitive or better performance compared with state-of-the-art trackers. SiamCPN can run at 60 frames per second (FPS) on an AMD processor with two RTX 3090 GPUs.
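
The abstract does not define how centerness is computed. As a rough illustration of what an anchor-free centerness score can look like, the sketch below uses the definition from the FCOS detector, which is an assumption here and not necessarily the formulation used in SiamCPN. The function name `centerness` and the `(x0, y0, x1, y1)` box layout are hypothetical choices made for this example.

```python
import math


def centerness(px, py, box):
    """FCOS-style centerness of point (px, py) for box = (x0, y0, x1, y1).

    Assumption: this follows the FCOS definition, not a formula taken
    from the SiamCPN paper itself.
    """
    x0, y0, x1, y1 = box
    # Distances from the point to the left, right, top and bottom edges.
    left, right = px - x0, x1 - px
    top, bottom = py - y0, y1 - py
    if min(left, right, top, bottom) <= 0:
        # Point lies on or outside the box: scored as 0 in this sketch.
        return 0.0
    # The score is 1 at the box center and decays toward the edges.
    horizontal = min(left, right) / max(left, right)
    vertical = min(top, bottom) / max(top, bottom)
    return math.sqrt(horizontal * vertical)


box = (0, 0, 100, 100)
print(centerness(50, 50, box))   # box center
print(centerness(25, 50, box))   # off-center point
print(centerness(120, 50, box))  # point outside the box
```

A score of this kind needs no anchor count, aspect ratio or size to be chosen, which is the property the abstract attributes to the anchor-free design.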

Funding

This work was funded by the National Key R&D Program of China (No. 2021YFF0603904).

Author information

Authors and Affiliations

School of Intelligent Science and Engineering, Harbin Engineering University, Harbin, 150001, China
Yue Wu & Chengtao Cai
School of Computer Science and Engineering, Nanyang Technological University, Singapore, 639798, Singapore
Chai Kiat Yeo
Key Laboratory of Intelligent Technology and Application of Marine Equipment (Harbin Engineering University), Ministry of Education, Harbin, Heilongjiang Province, 150001, China
Chengtao Cai

Corresponding author

Correspondence to Chengtao Cai.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Wu, Y., Cai, C. & Yeo, C.K. Siamese Centerness Prediction Network for Real-Time Visual Object Tracking. Neural Process Lett 55, 1029–1044 (2023). https://doi.org/10.1007/s11063-022-10924-4

Accepted: 07 June 2022
Published: 04 July 2022
Issue Date: April 2023
DOI: https://doi.org/10.1007/s11063-022-10924-4
