CN113326790A

CN113326790A - Capsule robot drain pipe disease detection method based on abnormal detection thinking

Info

Publication number: CN113326790A
Application number: CN202110647069.3A
Authority: CN
Inventors: 李清泉; 臧翀; 王全; 朱家松; 刘志; 方旭; 朱松; 王维康
Original assignee: Shenzhen Zhiyuan Space Innovation Technology Co ltd; Shenzhen Huanshui Pipe Network Technology Service Co ltd
Current assignee: Shenzhen Zhiyuan Space Innovation Technology Co ltd; Shenzhen Huanshui Pipe Network Technology Service Co ltd
Priority date: 2021-06-10
Filing date: 2021-06-10
Publication date: 2021-08-31

Abstract

The invention discloses a capsule robot drain pipe disease detection method based on the idea of anomaly detection. The method comprises the following steps: S10, acquiring a shot video file; S20, inputting an image and dividing it into 4 × 4 image blocks; S30, calculating the LBP, GLCM and HOG feature values of each image block, designing a Gabor filter, and calculating the Gabor feature value; S40, combining the feature values obtained in step S30 into a feature combination data set; S50, calculating the probability of each feature value; S60, selecting a cross-validation data set, carrying out steps S30 to S50 in sequence to obtain the total probability value of each image, and determining a threshold ε from the cross-validation set; S70, inputting a new sample image and calculating its total probability value p, the sample image being judged abnormal when p < ε and not abnormal when p > ε. The beneficial effect of the invention is that anomaly clustering detection is carried out on the extracted feature data set to realize pipeline disease detection and output a disease type label.

Description

Capsule robot drain pipe disease detection method based on abnormal detection thinking

Technical Field

The invention relates to the technical field of drainage pipeline detection, in particular to a capsule robot drainage pipeline disease detection method based on an abnormal detection thought.

Background

The underground pipe network is an important piece of urban infrastructure and a lifeline for the safe operation of a city. However, owing to rapid urban development, flow loads beyond design standards, aging facilities and insufficient maintenance, more and more pipe network problems are being exposed during construction and use. In recent years, urban disasters caused by underground pipe network diseases, such as waterlogging, environmental pollution and even ground collapse, have occurred frequently, causing heavy casualties and economic losses. According to Shenzhen's ground subsidence statistics, nearly 90% of urban ground subsidence is caused by underground pipe network diseases of various kinds. Carrying out large-scale, routine surveys of underground pipe network diseases is therefore important work for effectively preventing pipe network accidents.

Inspection with pipeline detection instruments, such as the pipeline closed-circuit television detection system (CCTV) and the pipeline periscope (QV), has become one of the main means of inspecting urban drainage pipe networks. During a CCTV or QV inspection, a video analyst watches the recorded pipeline images, finds the diseases in the pipeline, and classifies and marks them. Manual analysis alone, however, is far from sufficient: the volume of CCTV or QV image data is extremely large, and manual screening is time-consuming and labor-intensive. In recent years, with the development of image recognition and artificial intelligence, researchers at home and abroad have proposed automatic pipeline disease recognition based on deep learning. In this approach, a large number of manually labeled drainage pipeline disease images are fed to a deep learning model as training samples; newly collected pipeline images are then input into the trained model, which outputs disease type labels. This reduces the manual screening workload and improves identification efficiency. The approach is a supervised learning process, so its performance depends on the input sample images: a good identification result is obtained only when enough sample data are provided for different pipes, different pipeline environments and different disease types. However, drainage pipelines built in different cities and in different periods in China differ markedly, pipeline environments are complex, disease types are varied, and entirely new disease types often appear. Under these conditions, disease identification based on supervised learning can hardly cover all disease sample data, which leads to missed and false detections of pipeline diseases and degrades the identification results.

Therefore, the drainage pipeline disease detection technology based on deep learning still needs to be improved and developed.

Disclosure of Invention

In order to overcome the defects of the prior art, the invention provides a method for detecting the diseases of the drainage pipe of the capsule robot based on abnormal detection thinking.

The technical scheme adopted by the invention to solve the technical problem is as follows: a capsule robot drain pipe disease detection method based on abnormal detection thinking, comprising the following steps:

S10, acquiring a shot video file, and framing the video file into sequence images of the inner wall of the drain pipe;

S20, inputting an image, and dividing the image into 4 × 4 image blocks;

S30, calculating the LBP feature value, the GLCM feature value and the HOG feature value of each image block, designing a Gabor filter, and calculating the Gabor feature value;

S40, combining the feature values obtained in step S30 to obtain a feature combination data set;

S50, calculating the probability of each feature value;

S60, selecting a cross-validation data set, carrying out steps S30 to S50 in sequence to obtain the total probability value of each image, and then determining a threshold ε from the cross-validation set;

S70, inputting a new sample image and calculating its total probability value p, the sample image being judged abnormal when p < ε and not abnormal when p > ε.

Further, step S40 specifically includes the following steps:

combining the feature values obtained in step S30 to obtain a feature combination data set $(x_1, x_2, x_3, \ldots, x_n)$; given an m × n-dimensional training set, converting the training set into an n-dimensional Gaussian distribution, and analyzing the distribution of the m training samples to obtain the probability density function of the training set, that is, the mathematical expectation $\mu$ and the variance $\sigma^2$ of the training set in each dimension. The mathematical expectation $\mu_j$ and the variance $\sigma_j^2$ in the j-th dimension are calculated as follows:

$$\mu_j = \frac{1}{m}\sum_{i=1}^{m} x_j^{(i)}, \qquad \sigma_j^2 = \frac{1}{m}\sum_{i=1}^{m}\left(x_j^{(i)} - \mu_j\right)^2$$

wherein $x_j^{(i)}$ represents the j-th dimension feature data of the i-th training sample.

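Further, in step S50, when a new point $x$ is given, the probability p of the new point under the Gaussian distribution is determined, and the calculation formula of the probability p is as follows:

$$p(x) = \prod_{j=1}^{n} p\left(x_j;\mu_j,\sigma_j^2\right) = \prod_{j=1}^{n} \frac{1}{\sqrt{2\pi}\,\sigma_j}\exp\left(-\frac{(x_j-\mu_j)^2}{2\sigma_j^2}\right)$$

Further, in step S10, the inside of the drain pipe is photographed through the fish-eye lens of the capsule robot to obtain the video file.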

Further, in step S30, the LBP feature value is calculated as follows:

wherein p denotes the p-th pixel other than the central pixel in the 3 × 3 window; I(c) denotes the gray value of the central pixel, and I(p) denotes the gray value of the p-th pixel in the neighborhood; the formula of s(x) is as follows:

Further, in step S30, the image feature of the GLCM feature value is expressed as:

wherein $P_{i,j}$ represents the number or frequency of occurrences of two pixels with gray levels i and j respectively.

Further, in step S30, Gabor filters are designed, and 24 Gabor filters are formed by selecting 4 sizes and 6 directions.

Further, convolving the Gabor filter with the image to obtain a Gabor feature, where the two-dimensional Gabor function is expressed as follows:

x′＝xcosθ+ysinθ；

y′＝-xsinθ+ycosθ；

wherein x and y represent the pixel coordinates; λ represents the wavelength of the filter, in pixels; θ represents the tilt angle of the Gabor kernel and specifies the direction of the parallel stripes of the Gabor function; ψ represents the phase offset, ranging from -180 degrees to 180 degrees, where 0 degrees and 180 degrees correspond to the centrally symmetric center-on and center-off functions respectively, and -90 degrees and 90 degrees correspond to antisymmetric functions; σ represents the standard deviation of the Gaussian function; γ represents the aspect ratio: when γ = 1 the shape is circular, and when γ < 1 the shape is elongated along the direction of the parallel stripes.

Further, λ = 3, σ = 0.56λ, γ = 0.5, θ = 60 degrees, and ψ = 90 degrees.

Further, in step S30, the HOG feature is a feature descriptor used for object detection in computer vision and image processing, the HOG feature is configured by calculating and counting a gradient direction histogram of a local region of the image, and a gradient magnitude and direction calculation formula of a pixel point of the image is as follows:

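$$G_x(x,y) = H(x+1,y) - H(x-1,y)$$

$$G_y(x,y) = H(x,y+1) - H(x,y-1)$$

$$G(x,y) = \sqrt{G_x(x,y)^2 + G_y(x,y)^2}$$

$$\alpha(x,y) = \arctan\left(\frac{G_y(x,y)}{G_x(x,y)}\right)$$

wherein $G_x(x,y)$, $G_y(x,y)$, $G(x,y)$, $\alpha(x,y)$ and $H(x,y)$ respectively represent the horizontal gradient, the vertical gradient, the gradient magnitude, the gradient direction and the pixel value at the pixel point (x, y) in the input image.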

The invention has the following beneficial effects: on the one hand, the LBP, GLCM, Gabor and HOG features of the image are used together, and combining local texture change, image brightness change, edge information, object information and other characteristics of the image improves the disease identification accuracy. On the other hand, no disease sample base needs to be established in advance, which reduces the missed detection rate caused by incomplete disease sample data and greatly reduces the heavy manual labeling workload that a supervised learning method requires in the early stage.

Drawings

Fig. 1 is a schematic flow chart of a method for detecting diseases of a drain pipe of a capsule robot based on abnormal detection thinking.

Fig. 2 is a diagram of an embodiment of a capsule robot drain disease detection method based on abnormal detection thinking according to the invention.

FIG. 3 is a partial sample diagram of experimental data of a method for detecting diseases in a drain pipe of a capsule robot based on abnormal detection thinking.

Fig. 4 to 9 are comparative diagrams of methods and feature combinations.

FIG. 10 is a graph showing the statistical results of various methods.

Detailed Description

The invention is further illustrated with reference to the following figures and examples.

The conception, the specific structure, and the technical effects produced by the present invention will be clearly and completely described below in conjunction with the embodiments and the accompanying drawings to fully understand the objects, the features, and the effects of the present invention. It is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all embodiments, and those skilled in the art can obtain other embodiments without inventive effort based on the embodiments of the present invention, and all embodiments are within the protection scope of the present invention. In addition, all the connection/connection relations referred to in the patent do not mean that the components are directly connected, but mean that a better connection structure can be formed by adding or reducing connection auxiliary components according to specific implementation conditions. All technical characteristics in the invention can be interactively combined on the premise of not conflicting with each other.

The invention aims to provide a pipe network disease video detection method based on an unsupervised anomaly detection idea. The human visual recognition mechanism consists in: when watching a section of video data, people can distinguish the pipeline diseases because the picture has abnormal characteristics, namely, the picture has a significant difference with the previous and next pictures. The invention is based on a human visual identification mechanism, takes disease image data as an abnormal signal, utilizes a section of continuous sequence pipeline video image to extract the significant abnormal characteristics of the front and back images, and carries out unsupervised abnormal cluster detection to realize pipeline disease detection.

The invention adopts an unsupervised learning image identification technology, does not need to establish a disease sample base in advance, treats disease image data as an abnormal signal, utilizes a section of continuous sequence pipeline video image, extracts the significant abnormal characteristics existing in the images before and after the extraction, performs abnormal clustering detection on the extracted characteristic data set, realizes pipeline disease detection, and outputs a disease type label.

According to the method, a video file shot by the capsule robot in a drainage pipeline is framed into sequence images, and continuous segments of the sequence images are selected in turn. The LBP, GLCM, Gabor and HOG features of the images are extracted to capture local texture change, image brightness change, edge information and object information, and combining these different image features improves the accuracy with which the salient abnormal features of consecutive images are extracted. The method adopts an anomaly detection algorithm based on Gaussian probability estimation: when the feature values are mutually independent, the total probability equals the product of the probabilities of the individual feature values, and a good result can still be obtained when the feature values are not mutually independent.

Referring to Fig. 1, the invention provides a capsule robot drain pipe disease detection method based on abnormal detection thinking, which comprises the following steps:

S10, acquiring a shot video file, and framing the video file into sequence images of the inner wall of the drain pipe; in this embodiment, the inside of the drain pipe is photographed through the fish-eye lens of the capsule robot to obtain the video file;

S20, inputting an image and dividing it into 4 × 4 image blocks, 16 image blocks in total, to form a sequence image;
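The block division of step S20 can be sketched in Python as follows. This is only an illustration under stated assumptions: the image is a grayscale list of rows whose height and width are multiples of the grid size, and the function name `split_into_blocks` is hypothetical rather than taken from the patent.

```python
def split_into_blocks(image, rows=4, cols=4):
    """Split a 2-D list (height x width) into rows * cols rectangular blocks.

    Blocks are returned in row-major order (left to right, top to bottom).
    """
    height, width = len(image), len(image[0])
    if height % rows or width % cols:
        raise ValueError("image size must be divisible by the grid size")
    block_h, block_w = height // rows, width // cols
    blocks = []
    for r in range(rows):
        for c in range(cols):
            block = [row[c * block_w:(c + 1) * block_w]
                     for row in image[r * block_h:(r + 1) * block_h]]
            blocks.append(block)
    return blocks


# Toy 8 x 8 "image" whose pixel value encodes its position.
image = [[y * 8 + x for x in range(8)] for y in range(8)]
blocks = split_into_blocks(image)
```

Each of the 16 blocks can then be passed to the feature extractors of step S30 independently.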

In the method, the video data are converted into sequence images and features are extracted from each image; in view of the characteristics of drain pipe diseases, the method mainly extracts LBP (local binary pattern) features, GLCM (gray-level co-occurrence matrix) features, Gabor features and HOG (histogram of oriented gradients) features;

LBP features: LBP (Local Binary Pattern) is used to extract texture features and has notable advantages such as rotation invariance and gray-scale invariance. The extracted features are local texture features of the image. The original LBP operator is defined on a 3 × 3 window: the central pixel of the window is taken as the threshold and compared with the gray values of the 8 adjacent pixels; if a surrounding pixel value is larger than the central pixel value, that position is marked as 1, otherwise it is marked as 0. This yields an 8-bit binary number, which is usually converted into a decimal number (the LBP code, with 256 possible values), and this value is taken as the LBP value of the central pixel of the window to reflect the texture information of the 3 × 3 region.

The calculation formula of the LBP characteristic value is as follows:
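One common way to write the operator described above is $LBP(x_c, y_c) = \sum_{p=0}^{7} s\big(I(p) - I(c)\big)\,2^{p}$, with $s(x) = 1$ for $x > 0$ and $s(x) = 0$ otherwise. The indexing $p = 0, \ldots, 7$, the clockwise bit order and the strict comparison (taken from the description above; many implementations use $x \ge 0$) are assumptions of this sketch, and the function names are hypothetical.

```python
def lbp_code(window):
    """LBP code of a 3 x 3 window (list of three rows of gray values).

    Assumption: neighbours are visited clockwise from the top-left pixel,
    and neighbour p contributes bit 2**p when it is strictly brighter
    than the centre pixel.
    """
    center = window[1][1]
    offsets = [(0, 0), (0, 1), (0, 2), (1, 2),
               (2, 2), (2, 1), (2, 0), (1, 0)]
    code = 0
    for p, (r, c) in enumerate(offsets):
        if window[r][c] > center:
            code |= 1 << p
    return code


def lbp_image(gray):
    """LBP codes for all interior pixels of a 2-D gray image."""
    h, w = len(gray), len(gray[0])
    return [[lbp_code([row[x - 1:x + 2] for row in gray[y - 1:y + 2]])
             for x in range(1, w - 1)]
            for y in range(1, h - 1)]


bright_ring = [[9, 9, 9], [9, 5, 9], [9, 9, 9]]
dark_ring = [[1, 1, 1], [1, 5, 1], [1, 1, 1]]
```

A histogram of the codes in each image block is one typical way to turn the code map into a feature vector; the text does not say whether the method does this.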

GLCM features: the GLCM (Gray-Level Co-occurrence Matrix) is an L × L square matrix, where L is the number of gray levels of the source image. It describes the joint distribution of two pixels with a given spatial relationship and can be regarded as a joint histogram of pairs of pixel gray levels. It reflects not only the distribution of brightness but also the positional distribution of pixels with the same or similar brightness, and is a second-order statistical feature related to the brightness variation of the image. The gray-level co-occurrence matrix of an image reflects comprehensive information about the image gray levels with respect to direction, adjacent interval and variation amplitude. The image features of the GLCM feature values are expressed as:
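The co-occurrence counting described above can be sketched as follows. The pixel offset (dx, dy), the non-symmetric counting, and the choice of contrast and energy as example statistics are assumptions made for illustration; the text does not state which GLCM statistics the method uses.

```python
def glcm(gray, levels, dx=1, dy=0):
    """Normalised co-occurrence matrix P[i][j].

    P[i][j] is the relative frequency with which a pixel of gray level i
    has a pixel of gray level j at offset (dx, dy).
    """
    counts = [[0] * levels for _ in range(levels)]
    h, w = len(gray), len(gray[0])
    total = 0
    for y in range(h):
        for x in range(w):
            y2, x2 = y + dy, x + dx
            if 0 <= y2 < h and 0 <= x2 < w:
                counts[gray[y][x]][gray[y2][x2]] += 1
                total += 1
    if total == 0:
        raise ValueError("offset leaves no valid pixel pairs")
    return [[c / total for c in row] for row in counts]


def contrast(p):
    """Sum of (i - j)^2 * P[i][j]; large for strong local gray changes."""
    return sum((i - j) ** 2 * v
               for i, row in enumerate(p) for j, v in enumerate(row))


def energy(p):
    """Sum of P[i][j]^2; large for uniform textures."""
    return sum(v * v for row in p for v in row)


# Small 4-level test image.
gray = [[0, 0, 1, 1],
        [0, 0, 1, 1],
        [0, 2, 2, 2],
        [2, 2, 3, 3]]
P = glcm(gray, levels=4)
```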

Gabor features: Gabor filters are designed, and 4 sizes and 6 directions are selected to form 24 Gabor filters; the Gabor filters are convolved with the image to obtain the Gabor features, where the two-dimensional Gabor function is expressed as follows:

x′＝xcosθ+ysinθ；

y′＝-xsinθ+ycosθ；

wherein x and y represent the pixel coordinates; λ represents the wavelength of the filter, in pixels; θ represents the tilt angle of the Gabor kernel and specifies the direction of the parallel stripes of the Gabor function; ψ represents the phase offset, ranging from -180 degrees to 180 degrees, where 0 degrees and 180 degrees correspond to the centrally symmetric center-on and center-off functions respectively, and -90 degrees and 90 degrees correspond to antisymmetric functions; σ represents the standard deviation of the Gaussian function; γ represents the aspect ratio: when γ = 1 the shape is circular, and when γ < 1 the shape is elongated along the direction of the parallel stripes. In the above embodiment, λ = 3, σ = 0.56λ, γ = 0.5, θ = 60 degrees, and ψ = 90 degrees.
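A Gabor kernel with these parameters can be sketched as follows. The sketch assumes the commonly used real-valued form $g(x,y) = \exp\!\left(-\frac{x'^2 + \gamma^2 y'^2}{2\sigma^2}\right)\cos\!\left(\frac{2\pi x'}{\lambda} + \psi\right)$ and angles given in degrees; the kernel sizes and the 30-degree direction spacing of the filter bank are hypothetical values chosen only to obtain 4 sizes × 6 directions, since the text does not list them.

```python
import math


def gabor_kernel(size, lam, theta_deg, psi_deg, sigma, gamma):
    """Real-valued Gabor kernel on a size x size grid centred at (0, 0).

    Assumption: size is odd so that the grid has a centre pixel.
    """
    half = size // 2
    theta, psi = math.radians(theta_deg), math.radians(psi_deg)
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    kernel = []
    for y in range(-half, half + 1):
        row = []
        for x in range(-half, half + 1):
            xr = x * cos_t + y * sin_t       # x' = x cos(theta) + y sin(theta)
            yr = -x * sin_t + y * cos_t      # y' = -x sin(theta) + y cos(theta)
            envelope = math.exp(-(xr ** 2 + gamma ** 2 * yr ** 2)
                                / (2 * sigma ** 2))
            carrier = math.cos(2 * math.pi * xr / lam + psi)
            row.append(envelope * carrier)
        kernel.append(row)
    return kernel


lam = 3.0
kernel = gabor_kernel(size=7, lam=lam, theta_deg=60, psi_deg=90,
                      sigma=0.56 * lam, gamma=0.5)

# Hypothetical bank: 4 kernel sizes x 6 directions = 24 filters.
bank = [gabor_kernel(size, lam, 30 * k, 90, 0.56 * lam, 0.5)
        for size in (5, 7, 9, 11) for k in range(6)]
```

With ψ = 90 degrees the kernel is antisymmetric about its centre, which matches the description of the phase offset above.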

HOG characteristics: the HOG feature is a feature descriptor used for object detection in computer vision and image processing, the HOG feature forms a feature by calculating and counting a gradient direction histogram of a local area of an image, and a gradient size and direction calculation formula of an image pixel point is as follows:

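$$G_x(x,y) = H(x+1,y) - H(x-1,y)$$

$$G_y(x,y) = H(x,y+1) - H(x,y-1)$$

$$G(x,y) = \sqrt{G_x(x,y)^2 + G_y(x,y)^2}$$

$$\alpha(x,y) = \arctan\left(\frac{G_y(x,y)}{G_x(x,y)}\right)$$

wherein $G_x(x,y)$, $G_y(x,y)$, $G(x,y)$, $\alpha(x,y)$ and $H(x,y)$ respectively represent the horizontal gradient, the vertical gradient, the gradient magnitude, the gradient direction and the pixel value at the pixel point (x, y) in the input image. The HOG feature is extracted through the following steps: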

1) Graying (treating the image as a three-dimensional image in x, y, z (grayscale));

2) standardizing (normalizing) the color space of the input image by using a Gamma correction method; the method aims to adjust the contrast of an image, reduce the influence caused by local shadow and illumination change of the image and inhibit the interference of noise;

3) calculating the gradient (including magnitude and direction) of each pixel of the image; mainly for capturing contour information while further attenuating the interference of illumination.

4) Dividing the image into small cells (e.g., 6 x 6 pixels/cell);

5) counting the gradient histogram (the number of different gradients) of each cell to form a descriptor of each cell;

6) grouping several cells into a block (for example, 3 × 3 cells/block), and concatenating the feature descriptors of all the cells in the block to obtain the HOG feature descriptor of the block.

7) The HOG feature descriptors of all blocks in the image are connected in series to obtain the HOG feature descriptors of the image. This is the final feature vector available for classification.
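Steps 3) to 5) above can be sketched as follows. The 9 orientation bins over 0 to 180 degrees and the magnitude-weighted voting are assumptions borrowed from common HOG practice, not values stated in the text; `atan2` is used instead of a plain arctangent of the ratio so that a zero horizontal gradient does not cause a division by zero.

```python
import math


def gradients(img):
    """Central-difference gradient magnitude and unsigned direction.

    img is a 2-D list indexed as img[y][x]; border pixels are left at 0.
    Directions are folded into [0, 180) degrees.
    """
    h, w = len(img), len(img[0])
    mag = [[0.0] * w for _ in range(h)]
    ang = [[0.0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            gx = img[y][x + 1] - img[y][x - 1]   # G_x(x, y)
            gy = img[y + 1][x] - img[y - 1][x]   # G_y(x, y)
            mag[y][x] = math.hypot(gx, gy)
            ang[y][x] = math.degrees(math.atan2(gy, gx)) % 180.0
    return mag, ang


def cell_histogram(mag, ang, y0, x0, cell=6, bins=9):
    """Orientation histogram of one cell, weighted by gradient magnitude."""
    hist = [0.0] * bins
    bin_width = 180.0 / bins
    for y in range(y0, y0 + cell):
        for x in range(x0, x0 + cell):
            b = int(ang[y][x] // bin_width) % bins
            hist[b] += mag[y][x]
    return hist


# Horizontal brightness ramp: every interior gradient points along x.
img = [[10 * x for x in range(8)] for _ in range(8)]
mag, ang = gradients(img)
hist = cell_histogram(mag, ang, y0=1, x0=1)
```

Concatenating such cell histograms per block, and then over all blocks, gives the descriptor of steps 6) and 7).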

S40, combining the feature values obtained in step S30 to obtain a feature combination data set $(x_1, x_2, x_3, \ldots, x_n)$; given an m × n-dimensional training set, converting the training set into an n-dimensional Gaussian distribution, and analyzing the distribution of the m training samples to obtain the probability density function of the training set, that is, the mathematical expectation $\mu$ and the variance $\sigma^2$ of the training set in each dimension. The mathematical expectation $\mu_j$ and the variance $\sigma_j^2$ in the j-th dimension are calculated as follows:

$$\mu_j = \frac{1}{m}\sum_{i=1}^{m} x_j^{(i)}, \qquad \sigma_j^2 = \frac{1}{m}\sum_{i=1}^{m}\left(x_j^{(i)} - \mu_j\right)^2$$

wherein $x_j^{(i)}$ represents the j-th dimension feature data of the i-th training sample.

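S50, when a new point $x$ is given, determining the probability p of the new point under the Gaussian distribution, wherein the calculation formula of the probability p is as follows:

$$p(x) = \prod_{j=1}^{n} p\left(x_j;\mu_j,\sigma_j^2\right) = \prod_{j=1}^{n} \frac{1}{\sqrt{2\pi}\,\sigma_j}\exp\left(-\frac{(x_j-\mu_j)^2}{2\sigma_j^2}\right)$$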
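Steps S40 and S50 can be sketched as follows. The sketch assumes one independent Gaussian per feature dimension, with the mean and the 1/m variance estimated from the training set, and a total probability equal to the product of the per-dimension densities; the function names and the toy data are hypothetical.

```python
import math


def fit_gaussian(train):
    """Per-dimension mean and variance of an m x n training set."""
    m, n = len(train), len(train[0])
    mu = [sum(row[j] for row in train) / m for j in range(n)]
    var = [sum((row[j] - mu[j]) ** 2 for row in train) / m for j in range(n)]
    return mu, var


def probability(x, mu, var):
    """Product of the univariate Gaussian densities of each feature.

    Assumption: every variance is strictly positive.
    """
    p = 1.0
    for xj, mj, vj in zip(x, mu, var):
        p *= math.exp(-(xj - mj) ** 2 / (2 * vj)) / math.sqrt(2 * math.pi * vj)
    return p


# Toy training set: 3 samples, 2 features.
train = [[1.0, 10.0], [2.0, 12.0], [3.0, 14.0]]
mu, var = fit_gaussian(train)
p_typical = probability([2.0, 12.0], mu, var)
p_unusual = probability([10.0, 30.0], mu, var)
```

For feature vectors with many dimensions the product can underflow, so summing log-densities and thresholding the log-probability is a common alternative.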

S60, selecting a cross-validation data set, carrying out steps S30 to S50 in sequence to obtain the total probability value p of each image, and then determining a threshold ε from the cross-validation set;
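The text does not say how ε is derived from the cross-validation set. One common choice, shown here purely as an assumption, is to scan candidate thresholds and keep the one with the best F1 score on a small labeled validation set, applying the decision rule p < ε of step S70. The function name and the toy values are hypothetical.

```python
def select_threshold(probs, labels, steps=1000):
    """Pick epsilon maximising F1 for the rule "abnormal if p < epsilon".

    probs  : total probability value of each validation image
    labels : 1 for an abnormal (disease) image, 0 for a normal one
    Here tp counts abnormal images that are correctly flagged.
    """
    lo, hi = min(probs), max(probs)
    best_eps, best_f1 = lo, 0.0
    for k in range(1, steps):
        eps = lo + (hi - lo) * k / steps
        tp = sum(1 for p, y in zip(probs, labels) if p < eps and y == 1)
        fp = sum(1 for p, y in zip(probs, labels) if p < eps and y == 0)
        fn = sum(1 for p, y in zip(probs, labels) if p >= eps and y == 1)
        if tp == 0:
            continue
        precision = tp / (tp + fp)
        recall = tp / (tp + fn)
        f1 = 2 * precision * recall / (precision + recall)
        if f1 > best_f1:
            best_eps, best_f1 = eps, f1
    return best_eps, best_f1


# Made-up validation values for illustration only.
probs = [0.30, 0.28, 0.25, 0.02, 0.31, 0.01]
labels = [0, 0, 0, 1, 0, 1]
eps, f1 = select_threshold(probs, labels)
```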

In the video data acquired by drainage pipeline detection, the disease condition only occupies a small part of images and can be regarded as an abnormal signal, and the task of abnormal detection is to find objects different from most other objects. The invention adopts the anomaly detection algorithm based on Gaussian probability density estimation to carry out anomaly detection, and the method has the advantage that a better detection result can be obtained no matter whether the selected feature combinations are mutually independent or not.

On one hand, the LBP characteristic, the GLCM characteristic, the Gabor characteristic and the HOG characteristic of the image are comprehensively utilized, and the disease identification accuracy is improved through the combination of the characteristics of local texture change, image brightness change, edge information, object information and the like of the image. On the other hand, the method belongs to an unsupervised learning mode, prior knowledge is not needed, a disease sample database is not needed to be established in advance, the missing rate of the pipeline diseases caused by incomplete disease sample data can be reduced, and meanwhile, the workload of manually marking the diseases in the early stage by adopting a supervised learning method can be greatly reduced.

Referring to fig. 2, the present invention adopts and compares four typical anomaly detection algorithms and an anomaly detection method based on a clustering algorithm, and obtains a final evaluation result.

(1) iForest: the isolation forest (iForest) is a fast, ensemble-based anomaly detection method with linear time complexity and high precision. It is a non-parametric, unsupervised method that requires neither a predefined mathematical model nor labeled training data. iForest partitions the data with binary trees, and the depth of a data point in a binary tree reflects how sparse the data are around it. An iForest consists of t isolation trees (iTrees), each of which is a binary tree. The algorithm steps are as follows:

The first step: constructing the isolation trees.

The second step: calculating the average height h(x) of the sample point to be tested over all trees. Each iTree is traversed to obtain the layer number h_t(x) at which the tested data point x finally lands in the t-th iTree. h_t(x) represents the depth in the tree: the closer to the root node, the smaller h_t(x), and the closer to the bottom layer, the larger h_t(x); the height of the root node is 0.

The third step: judging from h(x) whether x is an abnormal point. The anomaly score of x is generally calculated with the following formula:

$$s(x,m) = 2^{-\frac{h(x)}{c(m)}}$$

The value range of s(x, m) is [0, 1], and the closer the value is to 1, the higher the probability that x is an outlier. Here m is the number of samples, and c(m) is expressed as follows:

$$c(m) = 2H(m-1) - \frac{2(m-1)}{m}, \qquad H(k) \approx \ln k + \xi$$

where ξ ≈ 0.5772 is Euler's constant. As can be seen from the expression of s(x, m), if the height h(x) → 0, then s(x, m) → 1, that is, the probability of being an outlier is 100%; if the height h(x) → m − 1, then s(x, m) → 0, that is, x cannot be an outlier; and if the height h(x) → c(m), then s(x, m) → 0.5, that is, the probability of being an outlier is 50%. In practice, a threshold on s(x, m) is set and then tuned, and points whose score exceeds the threshold are considered outliers.
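The scoring step can be sketched as follows, assuming the standard isolation forest normalisation $c(m) = 2H(m-1) - 2(m-1)/m$ with $H(k) \approx \ln k + 0.5772$. Only the score is shown; building the trees and measuring the average depth are omitted, and the function names are hypothetical.

```python
import math

EULER_GAMMA = 0.5772156649


def c(m):
    """Average path length used to normalise depths for m samples."""
    if m <= 1:
        return 0.0
    if m == 2:
        return 1.0
    harmonic = math.log(m - 1) + EULER_GAMMA   # approximation of H(m - 1)
    return 2.0 * harmonic - 2.0 * (m - 1) / m


def anomaly_score(mean_depth, m):
    """s(x, m) = 2 ** (-h(x) / c(m)); assumes m >= 2 so that c(m) > 0."""
    return 2.0 ** (-mean_depth / c(m))


m = 256
shallow = anomaly_score(0.0, m)        # isolated at the root
average = anomaly_score(c(m), m)       # depth equal to c(m)
deep = anomaly_score(m - 1, m)         # deepest possible path
```

The three values reproduce the limiting cases discussed above: a score of 1 at depth 0, 0.5 at depth c(m), and a score close to 0 at depth m − 1.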

(2) One-Class SVM: the One-Class SVM also belongs to the support vector machine family, but unlike the traditional supervised support vector machines used for classification and regression, it is an unsupervised learning method that does not need labeled outputs for the training set. Since there are no class labels, the separating boundary and the support vectors are found with the SVDD method. SVDD expects all non-abnormal samples to belong to the positive class and uses a hypersphere instead of a hyperplane for the division: the algorithm obtains a spherical boundary around the data in the feature space and seeks to minimize the volume of this hypersphere, thereby minimizing the influence of abnormal points.

Let the generated hypersphere have center o and radius r > 0. The hypersphere volume V(r) is minimized, and the center o is a linear combination of the support vectors. As in the conventional SVM method, the distances from all training data points x_i to the center could be required to be strictly less than r, but slack variables ξ_i with penalty coefficient C are introduced at the same time, and the optimization problem is as follows:

$$\min_{r,\,o}\; V(r) + C\sum_{i=1}^{m}\xi_i$$

subject to

$$\|x_i - o\|_2 \le r + \xi_i,\quad i = 1, 2, \ldots, m$$

$$\xi_i \ge 0,\quad i = 1, 2, \ldots, m$$

After the Lagrangian dual problem is solved, it can be judged whether a new data point z belongs to the class: if the distance from z to the center is less than or equal to the radius r, z is not an abnormal point; if z lies outside the hypersphere, it is an abnormal point.
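The decision rule and the slack variables can be illustrated as follows. The sketch assumes that a center and radius have already been found by some solver (the dual optimisation itself is not shown), and the function names and toy values are hypothetical.

```python
import math


def svdd_slack(points, center, radius):
    """Slack xi_i = max(0, ||x_i - o|| - r) for each training point.

    A point inside or on the sphere needs no slack; a point outside needs
    exactly the amount by which it violates the radius constraint.
    """
    return [max(0.0, math.dist(x, center) - radius) for x in points]


def is_outlier(z, center, radius):
    """A new point is abnormal when it lies outside the hypersphere."""
    return math.dist(z, center) > radius


center, radius = (0.0, 0.0), 2.0
slack = svdd_slack([(1.0, 0.0), (3.0, 0.0)], center, radius)
```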

(3) Local Outlier Factor (LOF): this algorithm judges whether each point p is an abnormal point by comparing the density of p with the density of its neighboring points; the lower the density of p, the more likely it is to be considered abnormal. The outlier factor is the ratio of the average local density of the neighbors of p to the local density of p. If the ratio is close to 1, the density of p is about the same as that of its neighborhood points, and p probably belongs to the same cluster as its neighborhood; if the ratio is less than 1, the density of p is higher than that of its neighborhood points, and p is a dense point; if the ratio is greater than 1, the density of p is lower than that of its neighborhood points, and p is more likely to be an outlier.

The outlier factor of point p is expressed as:

$$LOF_k(p) = \frac{1}{|N_k(p)|}\sum_{o \in N_k(p)} \frac{lrd_k(o)}{lrd_k(p)}$$

where $N_k(p)$ is the k-distance neighborhood of p and $lrd_k$ is the local reachability density.
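The local outlier factor described above can be sketched in pure Python as follows. The sketch uses the usual definitions of k-distance, reachability distance and local reachability density; for simplicity it assumes exactly k neighbors per point (ties at the k-distance are ignored) and no duplicate points. The function name and the toy data are hypothetical.

```python
import math


def lof_scores(points, k):
    """Local outlier factor of every point, using its k nearest neighbours."""
    n = len(points)
    dist = [[math.dist(p, q) for q in points] for p in points]
    neighbors, k_dist = [], []
    for i in range(n):
        order = sorted((j for j in range(n) if j != i),
                       key=lambda j: dist[i][j])
        neighbors.append(order[:k])
        k_dist.append(dist[i][order[k - 1]])

    def lrd(i):
        # Reachability distance of i from o: max(k-distance(o), d(i, o)).
        reach = [max(k_dist[o], dist[i][o]) for o in neighbors[i]]
        return len(reach) / sum(reach)

    lrds = [lrd(i) for i in range(n)]
    return [sum(lrds[o] for o in neighbors[i]) / (k * lrds[i])
            for i in range(n)]


# Four points on a unit square plus one distant point.
points = [(0, 0), (0, 1), (1, 0), (1, 1), (5, 5)]
scores = lof_scores(points, k=2)
```

The four square corners get a score of 1 (same density as their neighbors), while the distant point gets a score well above 1.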

(4) Anomaly detection algorithm based on Gaussian probability density estimation: the anomaly detection algorithm based on the Gaussian distribution is widely used in many scenarios. The core idea of the algorithm is as follows: given an m × n-dimensional training set, the training set is converted into an n-dimensional Gaussian distribution, and the probability density function of the training set is obtained by analyzing the distribution of the m training samples, that is, the mathematical expectation μ and the variance σ² of the training set in each dimension are obtained; a threshold ε is determined using a small cross-validation set; when a new point is given, its probability p under the Gaussian distribution is calculated and compared with the threshold ε, and the point is judged abnormal when p < ε and not abnormal when p > ε.

(5) K-Means++: the K-Means++ algorithm improves on the random centroid initialization of K-Means, and comprises the following steps:

1) randomly selecting a point from the input data point set as the first cluster center μ_1;

2) for each point x_i in the data set, calculating its distance to the closest of the cluster centers already selected:

$$D(x_i) = \min_{r}\,\|x_i - \mu_r\|_2$$

where r runs over the cluster centers selected so far;

3) selecting a new data point as a new cluster center, the selection principle being that a point with a larger D(x) has a higher probability of being selected as a cluster center;

4) repeating steps 2) and 3) until k cluster centroids have been selected;

5) running the standard K-Means algorithm with these k centroids as the initial centroids.
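The seeding steps 1) to 4) can be sketched as follows. The sketch assumes the usual K-Means++ weighting, in which a point is drawn with probability proportional to the squared distance D(x)² to its nearest chosen center; the text only states that a larger D(x) means a higher probability. The function names, the seed and the toy data are hypothetical.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans_pp_init(points, k, seed=0):
    """Choose k initial centers with K-Means++ style seeding."""
    rng = random.Random(seed)
    centers = [rng.choice(points)]                      # step 1
    while len(centers) < k:                             # step 4
        # Step 2: squared distance of each point to its nearest center.
        d2 = [min(sq_dist(p, c) for c in centers) for p in points]
        # Step 3: draw the next center with probability proportional to d2.
        centers.append(rng.choices(points, weights=d2, k=1)[0])
    return centers


points = [(0, 0), (0, 1), (10, 10), (10, 11), (20, 0), (21, 0)]
centers = kmeans_pp_init(points, k=3)
```

Points already chosen as centers have weight 0, so with distinct input points the same point is not selected twice.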

(6) DBSCAN: DBSCAN is a density-based clustering method. A core object that has not yet been assigned a category is selected as a seed, and all samples that are density-reachable from it are found; the maximal density-connected sample set derived from the density-reachability relation forms one category, or cluster, of the final clustering. Another core object without a category is then selected and its density-reachable sample set is found, giving another cluster. This continues until every core object has a category.
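The procedure can be sketched as follows. The parameter names `eps` (neighborhood radius) and `min_pts` (minimum neighborhood size for a core object, counting the point itself) follow common usage; their values are not given in the text, and the toy data are hypothetical.

```python
import math


def dbscan(points, eps, min_pts):
    """Return one cluster label per point; -1 marks noise."""
    n = len(points)
    labels = [None] * n

    def region(i):
        # Indices of all points within eps of point i (including i itself).
        return [j for j in range(n)
                if math.dist(points[i], points[j]) <= eps]

    cluster = -1
    for i in range(n):
        if labels[i] is not None:
            continue
        neigh = region(i)
        if len(neigh) < min_pts:
            labels[i] = -1              # noise for now, may become a border point
            continue
        cluster += 1                    # i is an unassigned core object: new seed
        labels[i] = cluster
        queue = [j for j in neigh if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster     # border point reached from a core object
            if labels[j] is not None:
                continue
            labels[j] = cluster
            neigh_j = region(j)
            if len(neigh_j) >= min_pts:
                queue.extend(neigh_j)   # j is also a core object: keep expanding
    return labels


points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10), (50, 50)]
labels = dbscan(points, eps=1.5, min_pts=3)
```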

(7) Hierarchical clustering algorithm: hierarchical methods first calculate the distances between samples and each time merge the closest points into the same class. The distances between classes are then calculated, and the closest classes are merged into a larger class. Merging continues until a single class remains.

Hierarchical clustering is a clustering algorithm that creates a hierarchical nested cluster tree by calculating the similarity between data points of different classes. In the cluster tree, the original data points of the different classes form the lowest layer of the tree, and the top layer of the tree is the root node of a single cluster. The agglomerative algorithm of hierarchical clustering calculates the similarity between every pair of data points, merges the two most similar data points, and repeats this process iteratively. In brief, the agglomerative algorithm determines the similarity between data points of each class by calculating the distance between them: the smaller the distance, the higher the similarity. The two closest data points or classes are merged repeatedly to generate the cluster tree.
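The merging procedure can be sketched as follows. The text does not state how the distance between two classes is measured; single linkage (the distance between the two closest members) is assumed here, and merging stops at a requested number of clusters instead of continuing to a single root. The function name and the toy data are hypothetical.

```python
import math


def agglomerative(points, n_clusters):
    """Merge the two closest clusters until n_clusters remain.

    Clusters are returned as lists of point indices.
    """
    clusters = [[i] for i in range(len(points))]

    def linkage(a, b):
        # Single linkage: distance between the closest pair of members.
        return min(math.dist(points[i], points[j]) for i in a for j in b)

    while len(clusters) > n_clusters:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                d = linkage(clusters[a], clusters[b])
                if best is None or d < best[0]:
                    best = (d, a, b)
        _, a, b = best
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return clusters


points = [(0, 0), (0, 1), (5, 5), (5, 6), (20, 20)]
clusters = agglomerative(points, n_clusters=3)
```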

Referring to Fig. 3, which shows part of the experimental data samples, the first two rows are images taken inside the pipeline, and the last row shows pictures taken when the capsule robot has just entered the pipeline and when it is recovered from the pipeline. The video between two manhole covers is converted into sequence images, which are then converted into sequence signal features through feature extraction. Disease detection is then carried out with the anomaly detection algorithms, and different features are combined. The video between the two manholes lasts 40 seconds and was converted into 899 images, on which feature extraction and feature combination were performed.

Five parameter indexes are adopted to evaluate the disease detection algorithms. Suppose there are P positive samples (non-disease samples) and N negative samples (disease samples). In the detection result of an algorithm, TP positive samples are judged as positive and FN positive samples are judged as negative (false negatives), so P = TP + FN; TN negative samples are judged as negative and FP negative samples are judged as positive (false positives), so N = TN + FP. The indexes are defined as follows:

Non-disease sample detection precision: precision = TP/(TP + FP);

Disease sample detection precision: precision_b1 = TN/(TN + FN);

Accuracy, the proportion of correctly judged samples: acc = (TP + TN)/(P + N);

Non-disease sample recall: recall = TP/P;

Disease sample recall: recall_b1 = TN/N.
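The five indexes can be computed from the four counts as in the sketch below. The function name and the counts in the example are made up for illustration and are not experimental results from the patent.

```python
def evaluate(tp, fn, tn, fp):
    """Five evaluation indexes; positive = non-disease, negative = disease."""
    p, n = tp + fn, tn + fp
    return {
        "precision": tp / (tp + fp),        # non-disease detection precision
        "precision_b1": tn / (tn + fn),     # disease detection precision
        "acc": (tp + tn) / (p + n),         # overall accuracy
        "recall": tp / p,                   # non-disease recall
        "recall_b1": tn / n,                # disease recall
    }


# Illustrative counts only.
metrics = evaluate(tp=80, fn=10, tn=6, fp=4)
```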

The disease sample recall, the disease sample detection precision and the overall detection accuracy are very important parameters, and the algorithm with the largest mean and the smallest variance of these values is considered to have the best detection effect. Comparisons of the various methods and feature combinations are shown in figs. 4-9.

First, from the viewpoint of feature combination, the feature combinations that include the GLCM features give the best detection effect, which indicates that GLCM extracts the texture features of the image well. Second, the original image features and the Gabor features also classify well, which indicates that they too reflect the texture information of the image data. From the viewpoint of the anomaly detection algorithm, the iForest algorithm is the most stable. Even under different feature combinations it achieves about 60 percent correct classification, and about 70 percent with the GLCM features and the original data features. The algorithm is relatively robust and, when there is no a priori knowledge of the data, may be given priority.

For the anomaly detection algorithms based on clustering, the basic parameters follow these principles: KMeans++ uses 10 initial clusters and 5 final clusters, hierarchical clustering uses 10 clusters, and DBSCAN does not need the number of clusters to be specified. After the final result is obtained, the clusters are sorted by the number of samples they contain, from small to large, and, with K denoting the number of clusters, the first J ＝ Int(0.6·K) clusters are taken as the abnormal categories. Because the problem addressed by the method is a disease detection scenario, the two parameters disease detection precision (precision_b1) and disease recall are very important. In essence, these two evaluation indexes also determine the other evaluation indexes, such as the false alarm rate, the missed detection rate and the overall accuracy.
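The cluster-size rule for the clustering-based methods can be sketched as follows, reading the rule as J = Int(0.6 × K): clusters are ordered from smallest to largest and the first J are treated as abnormal. The function name `abnormal_clusters`, the `ratio` parameter, the tie-breaking by label, and the example labels are assumptions introduced for illustration; the cluster labels themselves would come from KMeans++, hierarchical clustering or DBSCAN.

```python
from collections import Counter

def abnormal_clusters(labels, ratio=0.6):
    """Return the set of cluster labels treated as abnormal.

    Clusters are sorted by size, smallest first, and the first
    J = int(ratio * K) of the K clusters are taken as abnormal.
    """
    sizes = Counter(labels)
    k = len(sizes)
    j = int(ratio * k)
    # Ascending size; ties are broken by label so the result is deterministic.
    ordered = sorted(sizes, key=lambda c: (sizes[c], c))
    return set(ordered[:j])

# Hypothetical cluster assignment for 90 image blocks in K = 5 clusters.
labels = [0] * 50 + [1] * 3 + [2] * 30 + [3] * 2 + [4] * 5
abnormal = abnormal_clusters(labels)
flags = [label in abnormal for label in labels]  # True marks a suspected disease sample
```

The rule rests on the usual anomaly detection premise that normal pipe images dominate the data, so diseased samples tend to fall into the small clusters.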
In machine learning, the performance of a classification algorithm needs to be evaluated. In this project, the mean and variance of the precision and recall of the negative samples are used to measure the performance of the algorithm, and the algorithm with the larger mean and the smaller variance is considered the best. These two values are also consistent with the idea behind drawing an ROC curve and with human intuition. The GLCM features, which give the best feature combination effect, are selected in this project, and the results of the different algorithms are compared, as shown in the table:

TABLE 3.3-1 anomaly detection results for different algorithms

Fig. 10 is a schematic diagram of statistical results of different methods.

As shown in the figure, iForest, Gaussian-D, KMeans++ and DBSCAN all achieve good results with the GLCM features, and KMeans++ shows the best results. The clustering-based methods require some prior knowledge: two important parameters, the number of clusters and the proportion of abnormal clusters, must be set in advance. If such prior knowledge is lacking, the iForest and Gaussian-D anomaly detection algorithms are recommended, as they better fit practical cases.

In conclusion, the Gaussian probability density anomaly detection algorithm combined with gray-level co-occurrence matrix (GLCM) features gives superior results. Therefore, in a real pipeline scene, disease detection based on pipe network video can be realized by combining an anomaly detection algorithm with texture feature extraction.
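A minimal sketch of Gaussian probability density anomaly detection is given below for illustration. It assumes that each feature dimension is modelled by an independent one-dimensional Gaussian, that the variance is the maximum-likelihood estimate (division by m), and that the total probability p is the product of the per-dimension densities; a sample is treated as abnormal when p < ε. The function names, the training vectors and the value of `epsilon` are hypothetical. In the method, ε is determined on a cross validation set rather than fixed by hand.

```python
import math

def fit_gaussian(samples):
    """Estimate the mean and variance of every feature dimension."""
    m = len(samples)
    n = len(samples[0])
    mu = [sum(x[j] for x in samples) / m for j in range(n)]
    # Assumption: maximum-likelihood variance (divide by m, not m - 1).
    var = [sum((x[j] - mu[j]) ** 2 for x in samples) / m for j in range(n)]
    return mu, var

def density(x, mu, var):
    """Product of per-dimension Gaussian densities (requires var > 0)."""
    p = 1.0
    for xj, mj, vj in zip(x, mu, var):
        p *= math.exp(-((xj - mj) ** 2) / (2 * vj)) / math.sqrt(2 * math.pi * vj)
    return p

def is_abnormal(x, mu, var, epsilon):
    """A sample is abnormal when its total probability is below epsilon."""
    return density(x, mu, var) < epsilon

# Hypothetical two-dimensional feature vectors of normal samples.
train = [(1.0, 2.0), (1.2, 1.9), (0.9, 2.1), (1.1, 2.0), (0.8, 2.0)]
mu, var = fit_gaussian(train)
epsilon = 1e-3  # hypothetical threshold, normally chosen by cross validation

normal_flag = is_abnormal((1.0, 2.0), mu, var, epsilon)
outlier_flag = is_abnormal((3.0, 2.0), mu, var, epsilon)
```

With high-dimensional feature combinations the product of densities can underflow, so summing log densities and comparing against log ε is a common variant.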

While the preferred embodiments of the present invention have been illustrated and described, it will be understood by those skilled in the art that various changes in form and details may be made therein without departing from the spirit and scope of the invention as defined by the appended claims.

Claims

1. A capsule robot drain pipe disease detection method based on abnormal detection thinking is characterized by comprising the following steps:

S20, inputting an image, and dividing the image into 4-by-4 image blocks;

S50, calculating a probability formula of each characteristic value;

2. The capsule robot drain pipe disease detection method based on abnormal detection thinking as claimed in claim 1, wherein the step S40 comprises the following steps:

combining the feature values obtained in step S30 to obtain a feature combination data set (x₁, x₂, x₃, …, x_n); given an m×n-dimensional training set, converting the training set into an n-dimensional Gaussian distribution, and analyzing the distribution of the m training samples to obtain the probability density function of the training set, i.e. the mathematical expectation μ and the variance σ² of the training set in each dimension; the mathematical expectation μ_j and the variance σ_j² in the j-th dimension are calculated as follows:

wherein,

representing the jth dimension characteristic data.

3. The capsule robot drain pipe disease detection method based on abnormal detection thinking as claimed in claim 2, wherein in step S50, when a new point is given, the probability p of the new point under the Gaussian distribution is determined, and the calculation formula of the probability p is as follows:

4. The capsule robot drain pipe disease detection method based on abnormal detection thinking as claimed in claim 1, wherein in step S10, the inside of the drain pipe is photographed by a fish-eye lens of the capsule robot to obtain a video file.

5. The capsule robot drain pipe disease detection method based on abnormal detection thinking as claimed in claim 1, wherein in step S30, the LBP characteristic value is calculated as follows:

6. The capsule robot drain pipe disease detection method based on abnormal detection thinking as claimed in claim 1, wherein in step S30, the image characteristics of the GLCM characteristic value are expressed as:

7. The capsule robot drain pipe disease detection method based on abnormal detection thinking as claimed in claim 1, wherein in step S30, Gabor filters are designed, and 24 Gabor filters are formed by selecting 4 sizes and 6 directions.

8. The capsule robot drain pipe disease detection method based on abnormal detection thinking as claimed in claim 7, wherein the Gabor filter is convolved with the image to obtain the Gabor characteristics, and the two-dimensional Gabor function is expressed as follows:

x′＝x cosθ+y sinθ；

y′＝-x sinθ+y cosθ；

wherein x and y respectively represent the pixel coordinate positions; λ represents the wavelength of the filter, in pixels; θ represents the inclination angle of the Gabor kernel function image and specifies the direction of the parallel stripes of the Gabor function; ψ represents the phase offset, in the range of -180 degrees to 180 degrees, where 0 degrees and 180 degrees correspond to the centrally symmetric center-on function and center-off function respectively, and -90 degrees and 90 degrees correspond to the antisymmetric functions; σ represents the standard deviation of the Gaussian function; γ represents the aspect ratio, and when γ ＝ 1 the shape is circular, and when γ < 1 the shape is elongated along the direction of the parallel stripes.

9. The capsule robot drain pipe disease detection method based on abnormal detection thinking as claimed in claim 8, wherein λ ＝ 3, σ ＝ 0.56λ, γ ＝ 0.5, θ ＝ 60, and ψ ＝ 90.

10. The capsule robot drain pipe disease detection method based on abnormal detection thinking as claimed in claim 1, wherein in step S30, the HOG feature is a feature descriptor used for object detection in computer vision and image processing; the HOG feature is formed by calculating and counting the histograms of gradient direction of local areas of the image, and the calculation formulas of the gradient magnitude and direction of an image pixel are as follows:

G_x(x,y)＝H(x+1,y)-H(x-1,y)；

G_y(x,y)＝H(x,y+1)-H(x,y-1)；