Abstract
The difficulty level of candidate videos is a useful criterion when selecting videos to compose a change detection dataset. These levels are estimated from difficulty maps, which store a value representing the difficulty of each pixel. The problem is that generating a difficulty map requires ground truth, and producing ground truth requires manually labeling the pixels of the frames. If the difficulty level of a video can be identified before its ground truth is produced, researchers can select videos based on this information and then generate ground truth only for the selected videos, which cover different difficulty levels. Datasets containing videos with different difficulty levels can evaluate an algorithm more adequately. In this research, we developed a method to generate the difficulty maps of a video without using its ground truth. Our method uses the videos and ground truths from the CDNet 2014 dataset to generate difficulty maps, which are then used to train a pix2pix neural network. The results showed that the trained network could generate difficulty maps similar to those generated by the traditional approach.
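To illustrate why the traditional approach depends on ground truth, the sketch below computes a per-pixel difficulty value from a ground-truth mask and the binary outputs of several change detection algorithms. The abstract does not define how difficulty is computed, so the error-rate definition used here (the fraction of algorithms that mislabel a pixel) is an assumption made only for illustration, and the names `difficulty_map`, `ground_truth`, and `predictions` are hypothetical.

```python
from typing import List

# Binary mask: 1 = foreground (change), 0 = background.
Mask = List[List[int]]


def difficulty_map(ground_truth: Mask, predictions: List[Mask]) -> List[List[float]]:
    """Return, for each pixel, the fraction of algorithms that mislabel it.

    Assumption: this error-rate definition is illustrative only; it is not
    taken from the paper. Values range from 0.0 (every algorithm is right)
    to 1.0 (every algorithm is wrong).
    """
    if not predictions:
        raise ValueError("at least one prediction mask is required")

    rows, cols = len(ground_truth), len(ground_truth[0])
    errors = [[0.0] * cols for _ in range(rows)]

    # Count, per pixel, how many algorithm outputs disagree with the ground truth.
    for mask in predictions:
        for r in range(rows):
            for c in range(cols):
                if mask[r][c] != ground_truth[r][c]:
                    errors[r][c] += 1.0

    # Normalize the counts by the number of algorithms.
    n = len(predictions)
    return [[value / n for value in row] for row in errors]


if __name__ == "__main__":
    gt = [[0, 1], [1, 0]]
    outputs = [
        [[0, 1], [1, 0]],  # hypothetical algorithm A: all pixels correct
        [[1, 1], [1, 0]],  # hypothetical algorithm B: top-left pixel wrong
    ]
    print(difficulty_map(gt, outputs))
```

Because every step compares an algorithm output against `ground_truth`, a map of this kind cannot be produced for an unlabeled video, which is the gap a learned image-to-image model is meant to fill.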
Code Availability
The data and the code generated and/or analyzed during the current study are available at https://drive.google.com/drive/folders/1Pi_6S8sdVWTV4opj057N_O5EXPVCocyD?usp=sharing.
Funding
The authors declare that no funds, grants, or other support were received during the preparation of this manuscript.
Author information
Contributions
All authors contributed to the study conception and design.
Ethics declarations
Ethical approval
This article does not contain any studies with human participants or animals performed by any of the authors.
Authors' Consent
The manuscript is submitted with the consent of all authors.
Conflict of Interest
The authors have no relevant financial or non-financial interests to disclose.
About this article
Cite this article
Sanches, S.R.R., Custódio Junior, E., Corrêa, C.G. et al. Automatic generation of difficulty maps for datasets using neural network. Multimed Tools Appl 83, 66499–66516 (2024). https://doi.org/10.1007/s11042-024-18271-3
DOI: https://doi.org/10.1007/s11042-024-18271-3