Foreground Detection Using an Attention Module and a Video Encoding

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 13233)

Included in the following conference series:

International Conference on Image Analysis and Processing

Abstract

Foreground detection is the task of labelling each pixel of a video sequence as foreground or background, and it depends on the context of the scene. For many years, methods based on a background model have been the most widely used approaches for detecting foreground; however, these methods are sensitive to errors propagated from the first background model estimations. To address this problem, we propose a U-net based architecture with an attention module, in which the encoding of the entire video sequence is used as attention context to obtain features related to the background model. We tested our network on sixteen scenes from the CDnet2014 dataset, obtaining an average F-measure of 88.42. The results also show that our model outperforms traditional and neural network methods. Thus, we demonstrate that an attention module on a U-net based architecture can deal with the challenges of foreground detection.
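
For readers unfamiliar with the reported metric, the following is a minimal sketch of how an F-measure can be computed for a single predicted foreground mask against its ground truth, using the standard definition (the harmonic mean of precision and recall). The function name f_measure, the 0/1 mask encoding, and the toy masks are illustrative assumptions, not taken from the paper. The sketch also ignores any special ground-truth labels a benchmark may define, and it does not reproduce how the per-scene scores behind the 88.42 average were aggregated, since the abstract does not specify this.

```python
def f_measure(pred, truth):
    """Return the F-measure of a binary foreground mask.

    pred, truth: 2-D lists of 0/1 labels (1 = foreground, 0 = background).
    This encoding is an assumption made for illustration only.
    """
    tp = fp = fn = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            if p == 1 and t == 1:
                tp += 1  # foreground pixel correctly detected
            elif p == 1 and t == 0:
                fp += 1  # background pixel wrongly marked as foreground
            elif p == 0 and t == 1:
                fn += 1  # foreground pixel that was missed
    if tp == 0:
        return 0.0  # no true positives: precision and recall are zero or undefined
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Toy 2x3 masks (hypothetical data).
pred = [[1, 1, 0],
        [0, 1, 0]]
truth = [[1, 0, 0],
         [0, 1, 1]]

score = f_measure(pred, truth)
print(f"F-measure: {100 * score:.2f}")  # scaled by 100, as scores such as 88.42 usually are
```

Here two pixels are true positives, one is a false positive, and one is a false negative, so precision and recall are both 2/3 and the F-measure is 2/3.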

Author information

Authors and Affiliations

Universidad Católica San Pablo, Arequipa, Peru
Anthony A. Benavides-Arce, Victor Flores-Benites & Rensso Mora-Colque

Authors

Anthony A. Benavides-Arce
Victor Flores-Benites
Rensso Mora-Colque

Corresponding author

Correspondence to Anthony A. Benavides-Arce.

Editor information

Editors and Affiliations

Boston University, Boston, MA, USA
Stan Sclaroff
National Research Council, Lecce, Italy
Cosimo Distante
National Research Council, Lecce, Italy
Marco Leo
University of Catania, Catania, Italy
Giovanni M. Farinella
Technische Universität München, Garching, Germany
Federico Tombari

About this paper

Cite this paper

Benavides-Arce, A.A., Flores-Benites, V., Mora-Colque, R. (2022). Foreground Detection Using an Attention Module and a Video Encoding. In: Sclaroff, S., Distante, C., Leo, M., Farinella, G.M., Tombari, F. (eds) Image Analysis and Processing – ICIAP 2022. ICIAP 2022. Lecture Notes in Computer Science, vol 13233. Springer, Cham. https://doi.org/10.1007/978-3-031-06433-3_17

DOI: https://doi.org/10.1007/978-3-031-06433-3_17
Published: 15 May 2022
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-06432-6
Online ISBN: 978-3-031-06433-3
eBook Packages: Computer Science, Computer Science (R0)
