Abstract
Given the existence of many change detection algorithms, each with its own peculiarities and strengths, we propose a combination strategy, which we term IUTIS (In Unity There Is Strength), based on a Genetic Programming framework. This combination strategy aims to leverage the strengths of the algorithms and compensate for their weaknesses. In this paper we report our findings in applying the proposed strategy in two different scenarios. The first scenario is purely performance-based; in the second, performance and efficiency must be balanced. Results demonstrate that, starting from simple algorithms, we can achieve results comparable to those of more complex state-of-the-art change detection algorithms, while keeping the computational complexity affordable for real-time applications.
1 Introduction
Many computer vision applications require the detection of changes within video streams, e.g. video surveillance, smart environments, and video indexing and retrieval [5, 28]. For all these applications, a robust change detection algorithm with a low false alarm rate is required as a pre-processing step. Many algorithms have been proposed to solve the problem of video change detection. Most of them rely on background subtraction techniques to segment the scene into foreground and background components. The outputs of a change detection algorithm are usually binary images of the foreground areas corresponding to moving objects. These algorithms are designed to cope with the challenges found in real-world videos, such as high variation in environmental conditions, illumination changes, shadows, camera movements, and camera-induced distortions. As a result, algorithms are becoming increasingly complex and thus computationally expensive, both in terms of time and memory. Parallelization of background subtraction algorithms on GPUs is a possible way to speed up the computation and make them usable in real-time applications (e.g. [24]).
Notwithstanding these improvements, it is still difficult to design general-purpose background subtraction algorithms. These algorithms have been demonstrated to perform well on some types of videos, but there is no single algorithm able to tackle all the challenges in a robust and computationally efficient way. This can be clearly seen in the CDNET 2014 competition [26] (ChangeDetection.net), where change detection algorithms are evaluated on a common dataset composed of different types of video sequences and ranked according to their performance. To date, more than 35 different algorithms have been evaluated, but many more exist in the literature.
Finally, the output of a background subtraction algorithm is usually refined in order to reduce noisy patterns such as isolated pixels, holes, and jagged boundaries. To improve algorithm accuracy, post-processing of the foreground component, ranging from simple noise removal to more complex object-level techniques, has been investigated. Results indicate that significant improvements in performance are possible if a specific post-processing algorithm is designed and the corresponding parameters are set appropriately [18].
Given the existence of so many change detection algorithms, each with its own peculiarities and strengths, here we are interested in finding how far we can get, in terms of performance, by leveraging existing algorithms to create new ones. Instead of designing a new algorithm from scratch, we combine existing ones with the aim of building a better change detection algorithm. We are interested in testing this idea under two scenarios: a purely performance-based scenario and a performance/efficiency balanced scenario. In [3] we investigated the first scenario by considering the best change detection algorithms in the CDNET 2014 competition, disregarding their computational complexity, and combining them under a Genetic Programming (GP) framework [11]. The resulting algorithms significantly outperform the algorithms used in the combination and even other, more recent, approaches. In this work, we present our findings with respect to the second scenario. We apply the same general approach used in [3], but consider state-of-the-art algorithms that are computationally efficient though not top-performing. We want to investigate whether, in this scenario as well, we are able to create an effective algorithm, and what kind of performance we can achieve.
2 The Proposed Approach
In this section, we summarize our GP-based combining approach. A detailed description of the method can be found in [3]. GP is a domain-independent evolutionary method that genetically breeds a population of functions or, more generally, computer programs to solve a given problem [11]. Evolution is driven by the best-fit individuals according to an objective function (i.e. a fitness function) that must be maximized or minimized. The solutions can be represented as trees, lines of code, expressions in prefix or postfix notation, strings of variable length, etc. GP has been widely used for finding suitable solutions to a wide range of optimization problems. For example, in image processing and computer vision applications GP has been used for image segmentation, enhancement, layouting, classification, feature extraction, and object recognition [1, 2, 4, 7, 13]. For our purposes, we feed GP the set of binary foreground images that correspond to the outputs of the single change detection algorithms, and a set of operators, represented by unary, binary, and n-ary functions, that are used to combine the outputs (via logical AND, logical OR, etc.) as well as to perform post-processing (via filtering operators).
More formally, given a set of n change detection algorithms \(\mathcal {C}=\{C_i\}_{i=1}^n\), the solutions evolved by GP are built using the set of functional symbols \(\mathcal {F}\) and the set of terminal symbols \(\mathcal {T}=\mathcal {C}\). We build the set of functional symbols considering operators that either work in the spatial neighborhood of the image pixel, or combine (stack) the information at the same pixel location across different change detection algorithms. The list of functional symbols used is given below (a minimal implementation sketch follows the list):
- ERO (Erosion): requires one input; works in the spatial domain and performs morphological erosion with a \(3\times 3\) square structuring element;
- DIL (Dilation): requires one input; works in the spatial domain and performs morphological dilation with a \(3\times 3\) square structuring element;
- MF (Median Filter): requires one input; works in the spatial domain and performs median filtering with a \(5 \times 5\) kernel;
- OR (Logical OR): requires two inputs; works in the stack domain and performs the logical OR operation;
- AND (Logical AND): requires two inputs; works in the stack domain and performs the logical AND operation;
- MV (Majority Vote): requires two or more inputs; works in the stack domain and performs the majority vote operation.
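For concreteness, the following is a minimal C++/OpenCV sketch of how these six functional symbols could be implemented on binary foreground masks (8-bit images with values 0/255). The function names and the OpenCV-based realization are our own illustrative assumptions; the paper fixes only the semantics of the operators, not a specific implementation.

```cpp
#include <opencv2/opencv.hpp>
#include <vector>

// ERO / DIL: 3x3 square structuring element, as specified above.
cv::Mat ERO(const cv::Mat& in) {
    cv::Mat out, se = cv::getStructuringElement(cv::MORPH_RECT, cv::Size(3, 3));
    cv::erode(in, out, se);
    return out;
}
cv::Mat DIL(const cv::Mat& in) {
    cv::Mat out, se = cv::getStructuringElement(cv::MORPH_RECT, cv::Size(3, 3));
    cv::dilate(in, out, se);
    return out;
}
// MF: 5x5 median filtering in the spatial domain.
cv::Mat MF(const cv::Mat& in) {
    cv::Mat out;
    cv::medianBlur(in, out, 5);
    return out;
}
// OR / AND: pixel-wise combination of two masks ("stack" domain).
cv::Mat OR(const cv::Mat& a, const cv::Mat& b)  { cv::Mat o; cv::bitwise_or(a, b, o);  return o; }
cv::Mat AND(const cv::Mat& a, const cv::Mat& b) { cv::Mat o; cv::bitwise_and(a, b, o); return o; }
// MV: a pixel is foreground where a strict majority of the input masks agree.
cv::Mat MV(const std::vector<cv::Mat>& masks) {
    cv::Mat votes = cv::Mat::zeros(masks[0].size(), CV_32F);
    for (const cv::Mat& m : masks) {
        cv::Mat f;
        m.convertTo(f, CV_32F, 1.0 / 255.0);  // 0/255 -> 0/1
        votes += f;
    }
    return votes > masks.size() / 2.0;  // CV_8U mask: 255 where the majority votes foreground
}
```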
We define the fitness function used in GP by taking inspiration from the CDNET website, where change detection algorithms are evaluated using different performance measures and ranked accordingly. Given a set of video sequences \(\mathcal {V}=\{V_1,\ldots ,V_S\}\) and a set of performance measures \(\mathcal {M}=\{m_1,\ldots ,m_M\}\), the fitness function \(f(C_0)\) of a candidate solution \(C_0\) is defined as the average rank across video sequences and performance measures, plus two weighted penalty terms:

\[ f(C_0) = \frac{1}{S\,M}\sum_{i=1}^{S}\sum_{j=1}^{M} \mathrm{rank}_{C_0}\big(m_j(V_i)\big) + w_1\,P_1(C_0) + w_2\,P_2(C_0), \]

where \(\mathrm{rank}_{C_0}(\cdot)\) computes the rank of the candidate solution \(C_0\) with respect to the set of algorithms \(\mathcal {C}\) according to the measure \(m_j\) on the sequence \(V_i\) (lower ranks are better, so \(f\) is minimized). \(P_1(C_0)\) is defined as the signed distance between the candidate solution \(C_0\) and the best algorithm in \(\mathcal {C}\) according to the measure \(m_j\):

\[ P_1(C_0) = \frac{1}{S\,M}\sum_{i=1}^{S}\sum_{j=1}^{M}\Big(\max_{C\in\mathcal {C}} m_j(V_i;C) - m_j(V_i;C_0)\Big), \]

and \(P_2(C_0)\) is a penalty term corresponding to the number of different algorithms selected for the candidate solution \(C_0\):

\[ P_2(C_0) = \#\{\,C_i \in \mathcal {C} : C_i \text{ appears in the tree of } C_0\,\}. \]

The role of \(P_1\) is to produce a fitness function \(f(C_0) \in \mathbb {R}\), so that among candidate solutions having the same average rank, the one with better performance measures is considered a fitter individual by GP. The penalty term \(P_2\) forces GP to select a small number of algorithms in \(\mathcal {C}\) to build the candidate solutions. The relative importance of \(P_1\) and \(P_2\) is regulated independently by the weights \(w_1\) and \(w_2\), respectively.
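A compact sketch of how this fitness could be evaluated is given below. The data layout (scores indexed by video, measure, and algorithm) and the function name are our own assumptions for illustration; the actual GP parameters, including \(w_1\) and \(w_2\), are set as in [3].

```cpp
#include <cstddef>
#include <vector>

// Sketch of the fitness evaluation (lower is better). scores[i][j][k] holds the
// value of measure m_j on video V_i for algorithm k, where k = 0 is the
// candidate C0 and k = 1..n are the algorithms in C. For simplicity we assume
// all measures are oriented so that larger is better; minimized measures such
// as FPR, FNR, and PWC would be negated beforehand.
double fitness(const std::vector<std::vector<std::vector<double>>>& scores,
               std::size_t numAlgorithmsInTree,  // distinct algorithms used by C0 (= P2)
               double w1, double w2) {
    const std::size_t S = scores.size(), M = scores[0].size();
    double rankSum = 0.0, p1 = 0.0;
    for (std::size_t i = 0; i < S; ++i)
        for (std::size_t j = 0; j < M; ++j) {
            const std::vector<double>& v = scores[i][j];
            int rank = 1;         // competition rank of C0 among all algorithms
            double best = v[1];   // best algorithm in C (k >= 1)
            for (std::size_t k = 1; k < v.size(); ++k) {
                if (v[k] > v[0]) ++rank;
                if (v[k] > best) best = v[k];
            }
            rankSum += rank;
            p1 += best - v[0];    // signed distance to the best algorithm in C
        }
    return rankSum / (S * M) + w1 * (p1 / (S * M)) + w2 * numAlgorithmsInTree;
}
```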
3 Experimental Setup
Since we wanted to test computationally efficient and simple change detection algorithms, we chose the set of change detection algorithms \(\mathcal {C}\) to be combined among those implemented in the BGSLibrary. BGSLibrary is a free, open-source, and platform-independent library which provides a C++ framework to perform background subtraction using code provided by different authors. We used version 1.9.1 of the library, which implements more than 30 different algorithms. We base our choice of the algorithms on the recent review paper by the authors of the BGSLibrary [20], where the computational costs as well as the performance of the different algorithms have been assessed. The rationale is to use computationally efficient algorithms having above-average performance, possibly exploiting different background subtraction strategies. Based on the results in [20], and on some preliminary tests that we performed, we selected the following algorithms: Static Frame Difference (SFD), Adaptive-Selective Background Learning (ASB), Adaptive Median (AM) [15], Gaussian Average (GA) [27], Gaussian Mixture Model (ZMM) [29], Gaussian Mixture Model (MoG) [6], Gaussian Mixture Model (GMM) [22], Eigenbackground/SL-PCA (EIG) [17], VuMeter (VM) [9], \(\varSigma \varDelta \) Background Estimation (SD) [12], and Multiple Cues (MC) [16]. All the algorithms have been tested in [20], with the exception of the SD and MC algorithms, which were added in recent versions of the BGSLibrary. We decided to include them since they show interesting performance, although they are slightly more computationally intensive than the simpler algorithms.
The performance measures \(\mathcal {M}\) are computed using the framework of the CDNET 2014 challenge [26]. The framework implements the following seven measures: recall, specificity, false positive ratio (FPR), false negative ratio (FNR), percentage of wrong classifications (PWC), precision, and F-measure. A ranking of the tested algorithms is also computed starting from the partial ranks on these measures. The CDNET 2014 dataset is composed of 11 video categories, with four to six video sequences in each category. The categories exhibit different video contents and are chosen to test the background subtraction algorithms under different operating conditions. The challenge rules impose that each algorithm use only a single set of parameters for all the video sequences. For this reason we set the parameters of the algorithms to their default values, i.e. the values in the configuration files provided with the BGSLibrary.
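For reference, these seven measures are computed from the pixel-level confusion-matrix counts (TP, FP, TN, FN) accumulated over a video. The sketch below follows the standard CDNET 2014 definitions; zero-denominator guards are omitted for brevity.

```cpp
// The seven CDNET 2014 measures, computed from the confusion-matrix counts
// (true/false positives and negatives) accumulated over all annotated pixels.
struct Measures {
    double recall, specificity, fpr, fnr, pwc, precision, fmeasure;
};

Measures cdnetMeasures(double TP, double FP, double TN, double FN) {
    Measures m;
    m.recall      = TP / (TP + FN);
    m.specificity = TN / (TN + FP);
    m.fpr         = FP / (FP + TN);
    m.fnr         = FN / (TP + FN);
    m.pwc         = 100.0 * (FN + FP) / (TP + FN + FP + TN);
    m.precision   = TP / (TP + FP);
    m.fmeasure    = 2.0 * m.precision * m.recall / (m.precision + m.recall);
    return m;
}
```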
We set the parameters of GP as in [3]. The GP solutions are generated by considering the shortest video sequence in each of the 11 CDNET 2014 categories as the training set. The images in this set amount to less than 10% of the images in the whole dataset; this limits the over-fitting that could occur if more images were used. We name the best solution found by GP in this way IUTIS-2 (the term IUTIS derives from the Greek fabulist Aesop, 620 BC-560 BC: "In Unity There Is Strength"). We also created a different algorithm, IUTIS-1, by considering a smaller training set composed of all the video sequences in the "Baseline" category. As the name suggests, this category contains basic video sequences. IUTIS-1 exhibits worse performance than IUTIS-2, and since its results are not directly comparable with the reported ones, it will not be considered further in the discussion.
4 Results
The tree structure of the IUTIS-2 solution is shown in Fig. 1. In the same figure, an example of the output at each node on a sample frame is also reported. From the solution tree it is possible to notice that IUTIS-2 selected and combined a subset of four simple change detection algorithms out of the 11 available: GA, ZMM, MC, and ASB. Concerning the tree structure, we can notice that IUTIS-2 presents a single long branch on its right-hand side. Starting from the functionals defined in Sect. 2, GP was able to create new ones. For instance, the solution tree uses a sequence of \(\text {MF}\) operators, which can be seen as an approximation of what could be obtained using a larger kernel for the median filter. The detailed results of the IUTIS-2 algorithm, computed using the evaluation framework of the CDNET 2014 challenge on its 11 video categories, are reported in Table 1. The overall F-measure of IUTIS-2 and of all the change detection algorithms considered is reported in Table 2. From the values reported it is possible to see that our solution is better than the best algorithm fed to GP (i.e. MC), achieving an F-measure that is 7.2% higher.
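To illustrate how such a solution tree is evaluated at run time, the sketch below applies the operators of Sect. 2 bottom-up to the masks produced by the four selected algorithms on a frame. The tree used here is a hypothetical example for illustration only; the actual IUTIS-2 tree is the one shown in Fig. 1.

```cpp
// Hypothetical candidate tree for illustration only (the actual IUTIS-2 tree
// is shown in Fig. 1). ga, zmm, mc, and asb are the binary masks produced by
// the four selected algorithms on the current frame; ERO, DIL, MF, OR, AND,
// and MV are the operator sketches from Sect. 2.
cv::Mat evaluateCandidate(const cv::Mat& ga, const cv::Mat& zmm,
                          const cv::Mat& mc, const cv::Mat& asb) {
    cv::Mat left  = AND(ga, zmm);                // stack-domain combination
    cv::Mat right = MF(MF(MV({zmm, mc, asb}))); // repeated MF ~ larger median kernel
    return DIL(OR(left, right));                 // final spatial clean-up
}
```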
For comparison, we also report here (see Fig. 2) the solution trees of IUTIS-3 and IUTIS-5, which, we recall, were generated by the same method described here but in a purely performance-based scenario [3]. The set of algorithms \(\mathcal {C}\) available to GP to build IUTIS-3 comprised the three top-performing algorithms on CDNET 2014, i.e.: Flux Tensor with Split Gaussian models (FTS) [25], Self-Balanced SENsitivity SEgmenter (SBS) [21], and Change Detection with Weightless Neural Networks (CWS) [10]. The set \(\mathcal {C}\) available for IUTIS-5 also included Change Detection based on Spectral Reflectances (SPC) [19] and the Extension of the Adapting Multi-resolution Background Extractor (AMB) [24].
From the comparison of the solution trees reported in Figs. 1 and 2, it is possible to notice that IUTIS-2 is more complex in terms of the functionals used (see for example the right branch). This is due to the fact that very simple algorithms are used, and more operations on their outputs are needed to achieve higher performance. The overall F-measure of IUTIS-3, IUTIS-5 and of all the change detection algorithms considered is reported in Table 2. From the results it is possible to see that IUTIS-3 is better than any other single algorithm, with an F-measure that is 4.13% higher than that of the best algorithm used by GP. In the case of IUTIS-5 this difference increases to 5.4%. It is worth noting that all our solutions are better than the majority vote solutions (denoted with MV) applied to the corresponding sets \(\mathcal {C}\). In [3] we also experimented with larger cardinalities of \(\mathcal {C}\), i.e. \(\#\mathcal {C}=7\) and 9, but in both cases the corresponding solutions found by GP, i.e. IUTIS-7 and IUTIS-9, obtained performance identical to that of IUTIS-5, and thus we only report them in Table 2.
Outputs of some of the tested algorithms on sample frames in the CDNET 2014 dataset, together with input images and ground truth masks, are shown in Fig. 3.
5 Computational Time
The IUTIS-2 algorithm has been implemented in C++ and uses the OpenCV library for image processing. Table 3 reports the computational time of the proposed algorithm in frames per second. For evaluation purposes, we implemented two versions of IUTIS-2: a "Sequential" one and a "Parallel" one. The first is an implementation of the algorithms without any particular optimization, while the second is an optimized implementation that exploits parallelism on a multicore CPU. We used the OpenMP directives (parallel for and sections) to parallelize both the computation of the masks and the execution branches of the solution tree. The timing measurements were carried out on a 3.3 GHz Intel Core-i5 (quad-core) with 16 GB RAM running the Windows 7 Professional operating system. As can be seen, the IUTIS-2 algorithm can be efficiently parallelized. Specifically, the frame rate of the parallel version is, on average, about twice that of the sequential version.
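The parallelization pattern can be sketched as follows. The Detector interface and the specific tree branches are illustrative assumptions on our part; the sketch only shows how the OpenMP parallel for and sections directives map onto the mask computation and onto independent branches of the solution tree (reusing the operator sketches from Sect. 2).

```cpp
#include <omp.h>
#include <opencv2/opencv.hpp>
#include <vector>

// Hypothetical stand-in for a BGSLibrary algorithm wrapper.
struct Detector {
    virtual cv::Mat apply(const cv::Mat& frame) = 0;  // returns a binary mask
    virtual ~Detector() = default;
};

// 1) parallel for: the selected algorithms compute their masks concurrently.
std::vector<cv::Mat> computeMasks(std::vector<Detector*>& dets, const cv::Mat& frame) {
    std::vector<cv::Mat> masks(dets.size());
    #pragma omp parallel for
    for (int k = 0; k < static_cast<int>(dets.size()); ++k)
        masks[k] = dets[k]->apply(frame);  // each algorithm on its own core
    return masks;
}

// 2) sections: independent branches of the solution tree evaluated in parallel
//    (again an illustrative tree, not the actual IUTIS-2 one).
cv::Mat evaluateTreeParallel(const std::vector<cv::Mat>& m) {
    cv::Mat left, right;
    #pragma omp parallel sections
    {
        #pragma omp section
        { left = MF(AND(m[0], m[1])); }
        #pragma omp section
        { right = MF(MF(m[2])); }
    }
    return OR(left, right);
}
```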
In Table 3 we also report the computational times of IUTIS-3 and IUTIS-5 from [3]. For these algorithms, the computational time is an estimate for a hypothetical parallel implementation and corresponds to that of the slowest algorithm used in the solution tree. For completeness, we compare the computational times of the different IUTIS algorithms with those of the top five algorithms in Table 2 (right). The slowest algorithm, at 10 frames per second, is FTS (which is also used in IUTIS-3 and IUTIS-5). This algorithm is implemented in MATLAB, while the other algorithms are all implemented in C++. The AMB algorithm is the most efficient one, with an impressive 843 frames per second. This result is achieved thanks to its parallel implementation on GPU using the CUDA architecture.
6 Conclusion
In this paper we have presented an evolutionary approach, based on Genetic Programming, to combine simple change detection algorithms into a more robust algorithm. The solutions provided by Genetic Programming allow us to select a subset of the simple algorithms. Moreover, we are able to automatically combine them in different ways, and to post-process their outputs using suitable operators to produce the best results. Our combination strategy is able to produce algorithms that are more effective in solving the change detection problem in different scenarios. If we are interested in obtaining the maximum performance, disregarding the computational complexity of the algorithms themselves, we can combine a few top-performing algorithms and achieve the best overall performance (i.e. IUTIS-3 and IUTIS-5). If, on the contrary, we want to improve the performance of existing algorithms while maintaining a limited computational complexity, we can effectively combine several simple algorithms and achieve results comparable to those of more complex state-of-the-art change detection algorithms (i.e. IUTIS-2). In particular, the parallelized version of IUTIS-2 exhibits remarkable performance while remaining computationally affordable for real-time applications.
References
Al-Sahaf, H., Al-Sahaf, A., Xue, B., Johnston, M., Zhang, M.: Automatically evolving rotation-invariant texture image descriptors by genetic programming. IEEE Trans. Evol. Comput. 21(1), 83–101 (2017)
Amelio, A., Pizzuti, C.: An evolutionary approach for image segmentation. Evol. Comput. 22(4), 525–557 (2014)
Bianco, S., Ciocca, G., Schettini, R.: Combination of video change detection algorithms by genetic programming. IEEE Trans. Evol. Comput. (2017). http://dx.doi.org/10.1109/TEVC.2017.2694160
Bianco, S., Ciocca, G.: User preferences modeling and learning for pleasing photo collage generation. Trans. Multimedia Comput. Commun. Appl. 12(1), 1–23 (2015)
Bianco, S., Ciocca, G., Napoletano, P., Schettini, R., Margherita, R., Marini, G., Pantaleo, G.: Cooking action recognition with iVAT: an interactive video annotation tool. In: Petrosino, A. (ed.) ICIAP 2013. LNCS, vol. 8157, pp. 631–641. Springer, Heidelberg (2013). doi:10.1007/978-3-642-41184-7_64
Bouwmans, T., Baf, F.E., Vachon, B.: Background modeling using mixture of Gaussians for foreground detection: a survey. Recent Pat. Comput. Sci. 1, 219–237 (2008)
Corchs, S., Ciocca, G., Francesca, G.: A genetic programming approach to evaluate complexity of texture images. J. Electron. Imaging 25(6), 061408 (2016)
Elgammal, A., Harwood, D., Davis, L.: Non-parametric model for background subtraction. In: Vernon, D. (ed.) ECCV 2000. LNCS, vol. 1843, pp. 751–767. Springer, Heidelberg (2000). doi:10.1007/3-540-45053-X_48
Goyat, Y., Chateau, T., Malaterre, L., Trassoudaine, L.: Vehicle trajectories evaluation by static video sensors. In: 2006 Intelligent Transportation Systems Conference, ITSC 2006, pp. 864–869. IEEE (2006)
Gregorio, M.D., Giordano, M.: Change detection with weightless neural networks. In: 2014 IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), pp. 409–413. IEEE (2014)
Koza, J.R.: Genetic Programming: On the Programming of Computers by Means of Natural Selection, vol. 1. MIT Press, Cambridge (1992)
Lacassagne, L., Manzanera, A., Dupret, A.: Motion detection: fast and robust algorithms for embedded systems. In: 2009 16th IEEE International Conference on Image Processing (ICIP), pp. 3265–3268 (2009)
Liu, L., Shao, L., Li, X., Lu, K.: Learning spatio-temporal representations for action recognition: a genetic programming approach. IEEE Trans. Cybern. 46(1), 158–170 (2016)
Maddalena, L., Petrosino, A.: The SOBS algorithm: what are the limits? In: 2012 IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), pp. 21–26. IEEE (2012)
McFarlane, N., Schofield, C.: Segmentation and tracking of piglets in images. Mach. Vis. Appl. 8(3), 187–193 (1995)
Noh, S.J., Jeon, M.: A new framework for background subtraction using multiple cues. In: Lee, K.M., Matsushita, Y., Rehg, J.M., Hu, Z. (eds.) ACCV 2012. LNCS, vol. 7726, pp. 493–506. Springer, Heidelberg (2013). doi:10.1007/978-3-642-37431-9_38
Oliver, N., Rosario, B., Pentland, A.: A Bayesian computer vision system for modeling human interactions. IEEE Trans. Pattern Anal. Mach. Intell. 22(8), 831–843 (2000)
Parks, D.H., Fels, S.S.: Evaluation of background subtraction algorithms with post-processing. In: IEEE 5th International Conference on Advanced Video and Signal Based Surveillance, AVSS 2008, pp. 192–199. IEEE (2008)
Sedky, M., Moniri, M., Chibelushi, C.C.: Spectral-360: a physics-based technique for change detection. In: 2014 IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), pp. 405–408. IEEE (2014)
Sobral, A., Vacavant, A.: A comprehensive review of background subtraction algorithms evaluated with synthetic and real videos. Comput. Vis. Image Underst. 122, 4–21 (2014)
St-Charles, P.L., Bilodeau, G.A., Bergevin, R.: Subsense: a universal change detection method with local adaptive sensitivity. IEEE Trans. Image Process. 24(1), 359–373 (2015)
Stauffer, C., Grimson, W.: Adaptive background mixture models for real-time tracking. In: 1999 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, vol. 2, pp. 246–252 (1999)
Varadarajan, S., Miller, P., Zhou, H.: Spatial mixture of Gaussians for dynamic background modelling. In: 2013 10th IEEE International Conference on Advanced Video and Signal Based Surveillance (AVSS), pp. 63–68. IEEE (2013)
Wang, B., Dudek, P.: A fast self-tuning background subtraction algorithm. In: 2014 IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), pp. 401–404. IEEE (2014)
Wang, R., Bunyak, F., Seetharaman, G., Palaniappan, K.: Static and moving object detection using flux tensor with split Gaussian models. In: 2014 IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), pp. 420–424. IEEE (2014)
Wang, Y., Jodoin, P.M., Porikli, F., Konrad, J., Benezeth, Y., Ishwar, P.: CDnet 2014: an expanded change detection benchmark dataset. In: 2014 IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), pp. 393–400. IEEE (2014)
Wren, C., Azarbayejani, A., Darrell, T., Pentland, A.: Pfinder: real-time tracking of the human body. IEEE Trans. Pattern Anal. Mach. Intell. 19(7), 780–785 (1997)
Yu, M., Rhuma, A., Naqvi, S.M., Wang, L., Chambers, J.: A posture recognition-based fall detection system for monitoring an elderly person in a smart home environment. IEEE Trans. Inf. Technol. Biomed. 16(6), 1274–1286 (2012)
Zivkovic, Z., van der Heijden, F.: Efficient adaptive density estimation per image pixel for the task of background subtraction. Pattern Recogn. Lett. 27(7), 773–780 (2006)