Dice Loss in Siamese Network for Visual Object Tracking

Zhao Wei^11,12,
Changhao Zhang^11,12,
Kaiming Gu^11,12 &
Fei Wang^11,12

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 11644)

Included in the following conference series:

International Conference on Intelligent Computing

Abstract

The problem of visual object tracking has evolved over the years. Traditionally, it is solved by a model that learns the appearance of an object only online, using the video itself as the only training data. In single object tracking, the target is in most cases a relatively small object that undergoes severe deformation. Drawing on the Dice loss used in semantic segmentation, we introduce a new objective function, based on the Dice coefficient, that is optimized during training. In this way, we can handle the strong imbalance between foreground and background patches. To cope with the limited amount of annotations available for training, we augment the data with random nonlinear transformations and histogram matching. Our experimental evaluation shows that our method achieves good performance on challenging test data while requiring only a fraction of the processing time needed by previous methods.
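
The abstract does not give the exact form of the objective, so the following is only a minimal sketch of one common soft Dice loss (the variant with squared terms in the denominator), written in NumPy. The function name `soft_dice_loss`, the smoothing constant `eps`, and the 17x17 label map with a 3x3 foreground are illustrative assumptions, not details taken from the paper.

```python
import numpy as np


def soft_dice_loss(pred, target, eps=1e-6):
    """Soft Dice loss between a predicted score map and a binary label map.

    pred   : probabilities in [0, 1], any shape
    target : same shape, 1 for foreground and 0 for background
    eps    : small smoothing constant (assumed here) to avoid division by zero
    """
    pred = np.asarray(pred, dtype=np.float64).ravel()
    target = np.asarray(target, dtype=np.float64).ravel()

    # Overlap between prediction and label.
    intersection = np.sum(pred * target)
    # Squared-sum denominator; a plain-sum denominator is another common choice.
    denom = np.sum(pred ** 2) + np.sum(target ** 2)

    dice = (2.0 * intersection + eps) / (denom + eps)
    return 1.0 - dice


# Hypothetical, heavily imbalanced label map: 9 foreground cells out of 289.
label = np.zeros((17, 17))
label[7:10, 7:10] = 1.0

# Predicting "background everywhere" is right on about 97% of the cells,
# yet it has no overlap with the foreground, so its Dice loss stays near 1.
loss_all_background = soft_dice_loss(np.zeros_like(label), label)

# A perfect prediction drives the loss to 0.
loss_perfect = soft_dice_loss(label, label)
```

Because the loss depends only on the overlap with the foreground relative to the total foreground mass, the large number of background cells does not dominate it, which is the property the abstract relies on for handling foreground/background imbalance.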

Z. Wei, C. Zhang and K. Gu—Contributed equally.

Author information

Authors and Affiliations

National Engineering Laboratory for Visual Information Processing and Application, XJTU, 99 Yanxiang Road, Xi’an, 710054, Shaanxi, China
Zhao Wei, Changhao Zhang, Kaiming Gu & Fei Wang
School of Electronics and Information Engineering, XJTU, 28 West Xianning Road, Xi’an, 710049, Shaanxi, China
Zhao Wei, Changhao Zhang, Kaiming Gu & Fei Wang

Corresponding author

Correspondence to Fei Wang.

Editor information

Editors and Affiliations

Tongji University, Shanghai, China
De-Shuang Huang
University of Ulsan, Ulsan, Korea (Republic of)
Kang-Hyun Jo
Nanchang Institute of Technology, Nanchang, China
Zhi-Kai Huang

Cite this paper

Wei, Z., Zhang, C., Gu, K., Wang, F. (2019). Dice Loss in Siamese Network for Visual Object Tracking. In: Huang, DS., Jo, KH., Huang, ZK. (eds) Intelligent Computing Theories and Application. ICIC 2019. Lecture Notes in Computer Science, vol 11644. Springer, Cham. https://doi.org/10.1007/978-3-030-26969-2_6

DOI: https://doi.org/10.1007/978-3-030-26969-2_6
Published: 24 July 2019
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-26968-5
Online ISBN: 978-3-030-26969-2
eBook Packages: Computer Science, Computer Science (R0)
