1. Introduction
As the most commonly used construction material, concrete is widely used in civil infrastructure such as buildings, tunnels, dams, bridges, and wharfs. However, most of these structures are susceptible to damage from environmental factors including wind, seawater, fog, and ice [1,2,3,4]. Among the various types of concrete damage, cracking is typical: it can markedly alter the stress distribution of structural components and undermine structural integrity. In reinforced concrete structures, cracks can also lead to corrosion of the steel reinforcement and cause concrete cancer, which further accelerates crack development and growth. Accordingly, timely and accurate identification of significant cracks on the structure surface is of great importance for protecting civil infrastructure and avoiding unnecessary economic loss. Traditional crack inspection methods are not only labour intensive and time consuming, but also offer no assurance of accuracy or real-time capability [5,6]. With the rapid development of computer vision technology, using a mobile camera system or remotely piloted aircraft (RPA) to monitor civil infrastructure, combined with advanced image processing to extract useful geometrical features of cracks from captured videos or images, is a more reliable and robust solution than contact-based approaches such as ultrasonic testing [7,8,9]. Although ultrasonic-based techniques may effectively detect internal defects or cracks in a structure, installing such a device on a structure in use is not as convenient as using imaging methods based on a controllable mobile camera system.
To effectively identify concrete cracks, a large number of studies have been conducted on crack edge detection, crack segmentation and crack-sensitive feature extraction. Kim et al. put forward a crack diagnostic method based on fuzzy set theory for reinforced concrete structures [10]. In the proposed method, the crack symptom and concrete condition were taken as the input variables and built-in fuzzy rules were employed to evaluate the cause of cracking in the structure. Nnolim proposed an automated crack identification approach based on partial differential equations [11]. This approach included both edge protection smoothing and edge enhancement to pre-process the crack image. Additionally, the local global maximal gradient matching algorithm was applied to the crack image to capture the crack characteristics. Abdel-Qader et al. compared four image processing approaches for crack detection: the Canny filter, the Sobel filter, the fast Fourier transform (FFT) and the Haar transform (HT) [12]. The comparison demonstrated that the HT method performed best among the four, with a detection accuracy of 86%. Fujita and Hamamoto developed a robust concrete crack identification algorithm, in which a median filter was used to remove slight background variations from the raw image, and probability relaxation and adaptive thresholding were subsequently applied to detect the crack, with a sensitivity of 0.801, specificity of 0.992 and accuracy of 0.606 [13]. A crack detection and classification method based on the Beamlet transform was introduced by Ying and Salari, in which the problem of uneven background brightness was fixed by an enhanced algorithm that evaluates the multiplicative factor [14]. Tsai et al. demonstrated the application of the geodesic minimal path algorithm to the generation of the crack pattern, which can be used for route planning [15]. Valença et al. introduced a novel approach called Image Processing of Cracking in Concrete Surfaces (MCRACK), which is based on a combined global-local method, to process digital images for automatic characterisation of concrete cracks [16]. Kim et al. presented a novel concrete crack assessment approach based on the integration of the RPA technique and hybrid image processing, in which the binarization method was used to evaluate the crack width [17]. The Gabor filter was utilised by Medina et al. to design a diagnostic system for identifying cracks in concrete tunnel surfaces [18]. In that study, the parameters of the Gabor filter were optimized by an enhanced genetic algorithm to obtain the best identification performance.
Even though the aforementioned methods have been effective in concrete crack detection, their identification accuracy is still affected by non-uniform background brightness and noise contamination, and they cannot guarantee accurate detection of crack edges while eliminating the noise. To resolve this issue, machine learning (ML) techniques have been introduced for image processing, and several studies have reported the application of ML to crack classification and segmentation. Lee et al. presented an approach based on back-propagation (BP) neural networks to identify and analyse cracks on the concrete surface [19]. The proposed method can effectively quantify crack geometry, including length, width and orientation, and classify crack patterns as horizontal, vertical, diagonal (−45° and +45°) or random, with a classification accuracy of up to 100%. Chun et al. employed a gradient boosting decision tree to design a crack detection algorithm whose inputs are the colour, gradient and texture characteristics of the crack [20]. The random forest learning algorithm was also applied in the design of crack diagnostic systems, where the local crack patch is predicted from channel and pairwise difference features [21,22]. In these studies [21,22], two magnitude channels, three colour channels and eight orientation channels were employed to make up a total of thirteen channels for feature extraction. Liang et al. classified concrete crack images using a support vector machine (SVM) classifier, in which the mean square deviation and peak ratios of the grey histogram and the distribution of the projective integral are the inputs and the crack type is the output [23,24]. Similar crack detection algorithms based on SVM were also reported in [25,26]. In [27], Mokhtari et al. conducted a comparative study of different ML algorithms, including the artificial neural network (ANN), decision tree (DT), k-nearest neighbours (k-NN) and adaptive neuro-fuzzy inference system (ANFIS), for computer vision-based crack detection. The results showed that ANFIS and ANN are superior with regard to calculation time, prediction accuracy, and result stability and interpretability. Recently, deep learning (DL) techniques have developed quickly and are widely used for processing remote sensing data via dimensionality reduction and feature learning, especially in concrete crack detection. Compared to traditional ML methods with shallow configurations, DL approaches can produce predictions with higher accuracy owing to their deeper architecture [28]. Li et al. employed fully convolutional neural networks (CNN) with a Bayes fusion algorithm for automated crack identification on concrete bridges [29]. Jo and Jadidi adopted deep belief networks to process both infrared and RGB images for crack classification [30]. Zhang et al. combined long short-term memory and a 1-D CNN to analyse images in the frequency domain for detecting cracks on the bridge deck [31]. Furthermore, self-generative adversarial networks [32], semi-supervised deep cross-modal networks [33], deep encoder–decoder networks [34], and graph convolutional neural networks [35] were also developed for processing remote sensing images for the same task. A comprehensive literature review on the application of ML and DL to feature extraction and pattern recognition of imagery data was presented by Rasti et al. [36]. Despite the progressive advances in ML and DL algorithms, further investigation is needed in this area, mainly because the performance of a learning model depends heavily on the complexity of the model configuration, the setting of the model meta-parameters and the quality of the training data set.
To address the challenges in existing concrete crack identification methods, a hierarchical framework is proposed in this study that combines various image processing, ML and information fusion techniques. To start with, the raw images are denoised using the non-local means method. Then, the processed images are sent to different filters for crack edge detection, and the crack-sensitive features are extracted by integral projection. Principal component analysis is also applied to the extracted features to select the optimal components. SVM multi-classifiers, which are well suited to high-dimensional and nonlinear pattern recognition with small sample sizes, are subsequently built to achieve the initial identification of the crack pattern. To improve the generalization capacity of the classifiers, the enhanced salp swarm algorithm is employed to optimize the meta-parameters of SVM. Next, the Dempster–Shafer fusion algorithm is introduced to fuse the initial results of the different classifiers, which correspond to different image filtering methods, to improve the identification accuracy. Finally, concrete images taken by an RPA are used to verify the effectiveness of the proposed framework, with satisfactory results.
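As a rough illustration of the final fusion stage, the following sketch applies Dempster's rule of combination to two bodies of evidence. The class names, the mass values and the function name `dempster_combine` are illustrative assumptions rather than values or code from this study. Each dictionary maps a set of hypotheses to its basic probability mass, and the full frame `THETA` carries the uncertainty.

```python
from itertools import product

# Hypothetical frame of discernment: the four surface conditions.
THETA = frozenset({"without", "longitudinal", "transverse", "oblique"})
T = frozenset({"transverse"})
L = frozenset({"longitudinal"})
O = frozenset({"oblique"})


def dempster_combine(m1, m2):
    """Combine two mass functions with Dempster's rule.

    m1, m2: dict mapping frozenset (hypothesis set) -> mass, each summing to 1.
    """
    combined = {}
    conflict = 0.0
    for (a, pa), (b, pb) in product(m1.items(), m2.items()):
        inter = a & b
        if inter:
            # Agreeing evidence supports the intersection of the two sets.
            combined[inter] = combined.get(inter, 0.0) + pa * pb
        else:
            # Disjoint sets contribute to the conflict mass.
            conflict += pa * pb
    if 1.0 - conflict < 1e-12:
        raise ValueError("sources are in total conflict; rule is undefined")
    # Normalize by the non-conflicting mass.
    return {k: v / (1.0 - conflict) for k, v in combined.items()}


# Made-up outputs of two sub-classifiers (for example, from two filters).
m1 = {T: 0.7, L: 0.1, THETA: 0.2}
m2 = {T: 0.6, O: 0.2, THETA: 0.2}
fused = dempster_combine(m1, m2)
```

In this toy case both sources favour the transverse class, so the fused mass on that class rises above either input while the mass left on the full frame (the uncertainty) shrinks.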
5. Feature Level-Based Crack Diagnosis Using Enhanced Salp Swarm Algorithm-Optimized SVM Classifiers
In this section, corresponding to each filtering method, the classifier based on SVM will be developed to automate the concrete crack identification. To improve the generalization capacity of SVM, an enhanced salp swarm algorithm is proposed to optimize the meta-parameters during the model training.
5.1. SVM Sub-Classifiers for Identifying Concrete Crack
SVM is typically applied to binary classification problems, i.e., judging whether a test sample belongs to the positive or the negative class [46]. However, this study aims to identify different crack types of concrete, which is a multi-class classification problem in nature. One of the most direct approaches is to construct multiple hyperplanes that divide the entire sample space into multiple regions, each corresponding to one class. Although this approach solves the problem in principle, its practical prospects are limited by the large amount of calculation involved. In practice, there are two common strategies for such a multi-class problem: one against rest (OAR) and one against one (OAO). Both strategies transform a multi-class problem into multiple binary classification problems, and each corresponding binary classifier is called a “sub-classifier”. For an n-class classification problem, the OAR strategy only requires n sub-classifiers to be established. In the i-th sub-classifier, the samples of the i-th class are regarded as the positive class samples, while the rest of the samples are regarded as the negative class samples. The final classification result of the OAR strategy is the class whose sub-classifier outputs the positive class. The main benefit of the OAR strategy is that the number of sub-classifiers to be established is relatively small, but “classification overlap” and “unclassifiable” samples may occur. The OAO strategy, in contrast, establishes a sub-classifier for every pair of classes in the n-class classification problem, so a total of n(n − 1)/2 sub-classifiers are required. The final classification result of the OAO strategy is decided by the voting of all the sub-classifiers. The main drawback of the OAO strategy is that the number of sub-classifiers increases rapidly with the number of classes, so the training efficiency is lower than that of the OAR strategy.
In this work, both OAR and OAO strategies are investigated to develop SVM sub-classifiers for concrete surface crack identification. Hence, four (n = 4) sub-classifiers should be developed for the OAR strategy, i.e., without crack–rest (MWAR), longitudinal crack–rest (MLAR), transverse crack–rest (MTAR) and oblique crack–rest (MOAR) sub-classifiers, while six (n(n − 1)/2 = 6) sub-classifiers need to be trained for the OAO strategy, i.e., without crack–longitudinal crack (MWAL), without crack–transverse crack (MWAT), without crack–oblique crack (MWAO), longitudinal crack–transverse crack (MLAT), longitudinal crack–oblique crack (MLAO) and transverse crack–oblique crack (MTAO) sub-classifiers. For the training and validation of the classifiers, 80% of the images of each concrete condition scenario (960) are randomly selected as training data to develop the sub-classifiers, while the rest of the images (240) are employed as validation samples to evaluate the performance of the trained sub-classifiers.
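The sub-classifier counts above can be checked with a short sketch. The class labels and helper names (`oar_tasks`, `oao_tasks`, `oao_vote`) are hypothetical and chosen only for illustration. The voting helper shows the majority rule in its simplest form and does not handle ties.

```python
from collections import Counter
from itertools import combinations

# The four surface conditions considered in this work.
CLASSES = ["without crack", "longitudinal", "transverse", "oblique"]


def oar_tasks(classes):
    """One binary task per class: that class against all the others."""
    return [(c, "rest") for c in classes]


def oao_tasks(classes):
    """One binary task per unordered pair of classes."""
    return list(combinations(classes, 2))


def oao_vote(pairwise_winners):
    """Return the class that wins the most pairwise decisions (ties not handled)."""
    return Counter(pairwise_winners).most_common(1)[0][0]


n = len(CLASSES)
oar = oar_tasks(CLASSES)   # n tasks
oao = oao_tasks(CLASSES)   # n * (n - 1) / 2 tasks

# Made-up outputs of the six OAO sub-classifiers for one image.
winners = ["longitudinal", "transverse", "without crack",
           "transverse", "longitudinal", "transverse"]
predicted = oao_vote(winners)
```

With four classes this gives four OAR tasks and six OAO tasks, matching the sub-classifiers listed above.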
5.2. Optimizing Meta-Parameters of SVM Using ESSA
The basic idea of SVM is to find an optimal separating hyperplane that not only separates the data samples correctly but also maximizes the margin. For data that are not linearly separable, the samples in the input space can be mapped to a high-dimensional feature space through a nonlinear transformation, which turns the nonlinear classification into a linear classification and forms a nonlinear SVM. However, SVM does not compute the dot product in the feature space directly, so a kernel function defined on the original space is employed instead. A number of functions can be used as the kernel of SVM, including the polynomial function, the radial basis function (RBF) and the sigmoid function. In this study, the RBF is selected due to its wider domain of convergence. The mathematical expression of the RBF is:

$$K(x_i, x_j) = \exp\left(-\frac{\lVert x_i - x_j \rVert^2}{2\sigma^2}\right)$$

where σ is a free parameter that indicates the variance of the kernel. The optimal classification function can be written as:

$$f(x) = \operatorname{sgn}\left(\sum_{i=1}^{N} \alpha_i^* y_i K(x_i, x) + b^*\right)$$

where $\alpha_i^*$ are the optimal Lagrange multipliers in the range of (0, C), $y_i \in \{-1, +1\}$ is the class label of the training sample $x_i$, N is the number of training samples, and $b^*$ denotes the bias.
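The kernel and the classification function can be sketched in a few lines of NumPy. This is a minimal illustration that assumes the common RBF form exp(−‖x_i − x_j‖² / (2σ²)). The function names and the toy support vectors, multipliers and bias are hypothetical and are not taken from the trained sub-classifiers.

```python
import numpy as np


def rbf_kernel(x_i, x_j, sigma2):
    """RBF kernel value, assuming the form exp(-||x_i - x_j||^2 / (2 * sigma2))."""
    diff = np.asarray(x_i, dtype=float) - np.asarray(x_j, dtype=float)
    return float(np.exp(-np.dot(diff, diff) / (2.0 * sigma2)))


def svm_decide(x, support_vectors, labels, alphas, bias, sigma2):
    """Sign of the kernel expansion: +1 for the positive class, -1 otherwise."""
    total = sum(
        a * y * rbf_kernel(sv, x, sigma2)
        for sv, y, a in zip(support_vectors, labels, alphas)
    )
    return 1 if total + bias >= 0 else -1


# Toy model with one support vector per class (made-up numbers).
svs = [[0.0, 0.0], [4.0, 4.0]]
ys = [+1, -1]
alphas = [1.0, 1.0]

near_positive = svm_decide([0.5, 0.5], svs, ys, alphas, bias=0.0, sigma2=1.0)
near_negative = svm_decide([3.5, 3.5], svs, ys, alphas, bias=0.0, sigma2=1.0)
```

A sample close to the positive support vector receives a kernel value near 1 from it and almost 0 from the distant negative one, so the sum is positive. The reverse holds near the negative support vector.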
When the SVM-based sub-classifiers are established for concrete crack identification, the meta-parameters of SVM should be appropriately selected. Here, the SVM meta-parameters include the balance coefficient C and the kernel parameter σ. The two parameters influence the SVM performance in different ways. If C is assigned a low value, the classification function will be flat; if it is assigned a high value, more samples will be employed as support vectors in order to predict all the data accurately. Different parameter combinations may result in distinctly different model performances [47]. Accordingly, selecting the optimal values of the meta-parameters is of great importance for developing an SVM with the best generalization ability. In this study, an enhanced salp swarm algorithm (ESSA) is put forward to optimize C and σ during the training of the SVM sub-classifiers. The fitness function of the meta-parameter optimization is defined as the mean prediction accuracy of the training samples using five-fold cross validation (CV), and the optimization problem can be expressed by:

$$\max_{C,\,\sigma}\; \frac{1}{5}\sum_{k=1}^{5} \mathrm{Acc}_k(C, \sigma) \quad \text{s.t.} \quad C_{\min} \le C \le C_{\max}, \;\; \sigma_{\min} \le \sigma \le \sigma_{\max}$$

where $\mathrm{Acc}_k(C, \sigma)$ is the prediction accuracy on the k-th fold, and the subscripts min and max denote the lower and upper search boundaries of each parameter.
The procedure of using ESSA to optimize the meta-parameters of the SVM sub-classifiers is summarized as follows and is also shown in Figure 9.
Step 1. Confirm the optimization target and the parameters to be optimized, and set the algorithm parameters of ESSA, including swarm size of salp, maximum iteration number and parameters affecting S-curve-based decreasing weight.
Step 2. Initialize the locations of the salps in the space of the parameters to be optimized. Here, the parameters are C and σ², and the lower and upper boundaries of the parameters are 0.001 and 100, respectively.
Step 3. Calculate the fitness value of each salp in the swarm, and record the individual optimum and global optimum of the swarm.
Step 4. Set current iteration number CIN as 1.
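Step 5. For each salp, if it is the leader (first salp, m = 1), use Equation (32) to update its location; otherwise, use Equation (33) to update the location:

$$x_j^1 = \begin{cases} w F_j + c_1\left((ub_j - lb_j)\,c_2 + lb_j\right), & c_3 \ge 0.5 \\ w F_j - c_1\left((ub_j - lb_j)\,c_2 + lb_j\right), & c_3 < 0.5 \end{cases} \tag{32}$$

$$x_j^m = \frac{1}{2}\left(x_j^m + x_j^{m-1}\right), \quad m \ge 2 \tag{33}$$

with

$$c_1 = 2\exp\left(-\left(\frac{4l}{l_{\max}}\right)^2\right)$$

where $x_j^1$ denotes the location of the leader in the j-th dimension; $F_j$ denotes the food location; w denotes the weighting factor; $ub_j$ and $lb_j$ are the upper and lower search boundaries; $c_2$ and $c_3$ are two random numbers between 0 and 1; $c_1$ denotes the convergence factor, which is used to balance the exploitation and exploration abilities of the algorithm; l denotes the current iteration number; $l_{\max}$ denotes the maximum iteration number; and $x_j^m$ denotes the location of the m-th follower.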
Step 6. Evaluate the fitness value of each salp, and compare the current individual and global optimum with previous ones. If the current results are better, replace the previous record with current results; or else, keep the record unchanged.
Step 7. Judge whether the current iteration number reaches its maximum value. If so, terminate the optimization. Otherwise, CIN = CIN + 1 and go back to Step 5.
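Steps 1 to 7 can be sketched as the loop below. This is a minimal illustration under several assumptions that are not confirmed by the text: the leader update follows the standard SSA form with the weight w applied to the food location, the sign switch uses a 0.5 threshold, locations are clipped to the search boundaries, and only the global optimum is tracked. The function name `essa_maximize` and the toy fitness are hypothetical. In the framework the fitness would be the five-fold CV accuracy of an SVM trained with the candidate C and σ².

```python
import numpy as np


def essa_maximize(fitness, lb, ub, n_salps=50, max_iter=200, weight=None, seed=0):
    """Maximize `fitness` over the box [lb, ub] with a salp-swarm style search."""
    rng = np.random.default_rng(seed)
    lb = np.asarray(lb, dtype=float)
    ub = np.asarray(ub, dtype=float)
    dim = lb.size
    if weight is None:
        # Assumption: constant weight of 1, which reduces to the plain SSA.
        weight = lambda l, l_max: 1.0

    # Step 2: random initial locations inside the boundaries.
    salps = rng.uniform(lb, ub, size=(n_salps, dim))
    # Step 3: initial fitness and global optimum (the food location).
    scores = np.array([fitness(s) for s in salps])
    best = int(np.argmax(scores))
    food, food_score = salps[best].copy(), scores[best]

    # Steps 4 to 7: iterate until the maximum iteration number is reached.
    for l in range(1, max_iter + 1):
        c1 = 2.0 * np.exp(-(4.0 * l / max_iter) ** 2)  # convergence factor
        w = weight(l, max_iter)
        for m in range(n_salps):
            if m == 0:
                # Leader: move around the (weighted) food location.
                c2 = rng.random(dim)
                c3 = rng.random(dim)
                step = c1 * ((ub - lb) * c2 + lb)
                salps[m] = np.where(c3 >= 0.5, w * food + step, w * food - step)
            else:
                # Follower: average of its own and the preceding salp's location.
                salps[m] = 0.5 * (salps[m] + salps[m - 1])
        salps = np.clip(salps, lb, ub)

        # Step 6: keep the best location found so far.
        scores = np.array([fitness(s) for s in salps])
        best = int(np.argmax(scores))
        if scores[best] > food_score:
            food, food_score = salps[best].copy(), scores[best]
    return food, food_score


def toy_fitness(p):
    """Stand-in for CV accuracy: a smooth function peaking at (3, 7)."""
    return -((p[0] - 3.0) ** 2 + (p[1] - 7.0) ** 2)


best_params, best_score = essa_maximize(
    toy_fitness, lb=[0.001, 0.001], ub=[100.0, 100.0], n_salps=20, max_iter=50
)
```

Passing a decreasing `weight(l, l_max)` callable instead of the default lets the same loop be tried with different weighting schedules.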
5.3. SVM Training Results
In this study, 80% of the images of each class are randomly selected as training samples to develop the SVM sub-classifiers, while the rest of the images are used as validation samples to evaluate the effectiveness of the proposed model. The basic parameters of ESSA are set as follows: the swarm size is 50 and the maximum iteration number is 200. The definition of the decreasing weight w is also important, since it directly affects the accuracy and convergence of the algorithm [48]. Generally, a large weight is needed in the initial stage of the evolution to enhance the global search ability of ESSA, while a small weight is required in the later stage to improve the local search ability. In [49], a linearly decreasing weighting factor was proposed to update the location of the leader salp. However, when the leader is far from the food source, the SSA with a linearly decreasing weight may fall into a local optimum. Accordingly, this study addresses this problem by proposing an S-curve-based decreasing weighting factor, given in Equation (34), whose shape is tuned by two parameters. A comparison between the S-curve-based decreasing weight with different combinations of these two parameters and the linearly decreasing weight is shown in Figure 10. Compared to the linear one, the nonlinear weighting based on the S-curve keeps a larger value for a longer time in the early stage, which lets the leader salp quickly find the rough location of the food, and then rapidly declines to the minimum value in the later stage, which benefits the fine tuning of the food location. Among the three parameter combinations, the combination with values of 10 and 4 is symmetric over the range [0, 1] and is adopted in this study.
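The qualitative difference between the two schedules can be illustrated with a short sketch. The logistic form below is only an assumed stand-in for an S-shaped decreasing weight and is not Equation (34). The names `steepness` and `midpoint`, their default values, and the weight range of 0 to 1 are hypothetical.

```python
import math


def linear_weight(l, l_max, w_max=1.0, w_min=0.0):
    """Weight that falls in a straight line from w_max to w_min."""
    return w_max - (w_max - w_min) * l / l_max


def s_curve_weight(l, l_max, steepness=10.0, midpoint=0.5, w_max=1.0, w_min=0.0):
    """Assumed logistic S-curve: stays high early, drops around `midpoint`, stays low late."""
    t = l / l_max  # normalized iteration in [0, 1]
    return w_min + (w_max - w_min) / (1.0 + math.exp(steepness * (t - midpoint)))


l_max = 200
early, late = 40, 160  # 20% and 80% of the run

early_gap = s_curve_weight(early, l_max) - linear_weight(early, l_max)
late_gap = s_curve_weight(late, l_max) - linear_weight(late, l_max)
```

Under these assumptions the S-curve weight exceeds the linear weight early in the run and falls below it late in the run, which is the behaviour described for Figure 10.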
Then, the training samples are sent to the SVM to obtain the optimal model parameters based on ESSA. Figure 11 shows an example of SVM meta-parameter optimization for the without crack–rest sub-classifier based on the images processed by the Sobel filter, in which Figure 11a depicts the optimal and mean fitness against the iteration number and Figure 11b shows the optimization process of the parameters C and σ². The maximum identification accuracy stays at a relatively stable value of about 93.8%, while the mean identification accuracy fluctuates between 92% and 94%. The values of the two SVM parameters vary markedly at around the 155th iteration and then arrive at the optimal values of 25.3911 and 79.8106, respectively. Table 1 summarizes the details of all trained SVM sub-classifiers, including the optimal meta-parameters and the number of support vectors. With the best meta-parameters, the trained sub-classifiers can achieve the optimal generalization capacity for concrete crack identification.
7. Conclusions
This research develops an intelligent framework for crack diagnosis and classification, which combines signal processing, machine learning, and data fusion techniques. Non-local means and various filters are employed for noise reduction and crack-sensitive pattern recognition, which produce a marked image of the concrete crack. Integral projection, together with PCA, is utilized to diagnose different types of concrete surface condition, including without crack, transverse crack, longitudinal crack and oblique crack. The analysis reveals that the first 15 principal components (PCs) retain more than 95% of the feature information of all the integral projections (IPs) calculated from the results of the Sobel filter. Reducing the number of features to be learned can enhance the performance of the machine learning model. Then, SVM classifiers with soft outputs under both OAR and OAO strategies are established to conduct the initial diagnosis of the concrete surface condition. To enhance the generalization ability of the trained classifiers, the ESSA is selected to optimize the meta-parameters of SVM. The optimization results show that the classification accuracy of the trained model can reach 93.8% for the training samples. To fix the problem of wrong or conflicting diagnoses due to different filters, the D–S fusion algorithm is adopted to combine the initial diagnostic results of different sub-classifiers as well as different filters, which are regarded as independent pieces of evidence. The fusion results show that the confidence probability of the correct proposition can reach 0.99, while the uncertainty of the prediction is reduced to below 0.001 after two-level fusion. In addition, a comparative study demonstrates that the proposed framework outperforms independent SVM models with a single type of feature in concrete crack diagnosis.
Consequently, on the basis of the promising results in this research, the proposed framework can be considered a potential tool for automatic and real-time structural inspection by structural engineers and infrastructure agencies.
In this research, the main target is to develop a diagnostic model based on machine learning and data fusion for classifying different patterns of concrete crack. However, in real situations, the concrete crack pattern may be more complex than the three patterns considered in this study. Accordingly, crack segmentation is necessary for extracting important features of complex cracks, including crack width, length and orientation. In future work, more concrete images with various complex crack patterns, including V-shape and cross-shape cracks, will be collected in the field, and deep learning methods will be employed as potential tools to address this problem by building the diagnostic model from the pre-processed images and the corresponding ground truths. In addition, a normalisation operation will be applied to the raw images to evaluate its effect on diagnosis accuracy.