CN113326790A - Capsule robot drain pipe disease detection method based on abnormal detection thinking - Google Patents
- Publication number
- CN113326790A (application CN202110647069.3A)
- Authority
- CN
- China
- Prior art keywords
- image
- abnormal
- detection
- disease
- gabor
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Withdrawn
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V20/00—Scenes; Scene-specific elements
- G06V20/40—Scenes; Scene-specific elements in video content
- G06V20/46—Extracting features or characteristics from the video content, e.g. video fingerprints, representative shots or key frames
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/23—Clustering techniques
- G06F18/232—Non-hierarchical techniques
- G06F18/2321—Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/23—Clustering techniques
- G06F18/232—Non-hierarchical techniques
- G06F18/2321—Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions
- G06F18/23213—Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions with fixed number of clusters, e.g. K-means clustering
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
- G06F18/241—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
- G06F18/2411—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on the proximity to a decision surface, e.g. support vector machines
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
- G06F18/243—Classification techniques relating to the number of classes
- G06F18/24323—Tree-organised classifiers
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
- G06N3/088—Non-supervised learning, e.g. competitive learning
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/40—Extraction of image or video features
- G06V10/44—Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V20/00—Scenes; Scene-specific elements
- G06V20/40—Scenes; Scene-specific elements in video content
- G06V20/41—Higher-level, semantic clustering, classification or understanding of video scenes, e.g. detection, labelling or Markovian modelling of sport events or news items
Abstract
The invention discloses a capsule robot drain pipe disease detection method based on abnormal detection thinking. The method comprises the following steps: S10, acquiring a captured video file; S20, inputting an image and dividing it into 4 × 4 image blocks; S30, calculating the LBP, GLCM and HOG feature values of each image block, designing Gabor filters, and calculating the Gabor feature values; S40, combining the feature values obtained in step S30 into a combined feature data set; S50, calculating the probability of each feature value according to the probability formula; S60, selecting a cross-validation data set, carrying out steps S30 to S50 in sequence to obtain the total probability value of each image, and then determining a threshold ε from the cross-validation set; S70, inputting a new sample image and calculating its total probability value p, the sample image being considered abnormal when p < ε and not abnormal when p > ε. The beneficial effects of the invention are that anomaly clustering detection is performed on the extracted feature data set, pipeline disease detection is achieved, and a disease type label is output.
Description
Technical Field
The invention relates to the technical field of drainage pipeline detection, in particular to a capsule robot drainage pipeline disease detection method based on an abnormal detection thought.
Background
The underground pipe network is an important part of urban infrastructure and a lifeline for the safe operation of a city. However, during construction and use, more and more pipe network problems are being exposed because of factors such as rapid urban development, load flows exceeding design standards, aging facilities and insufficient maintenance. In recent years, urban disasters caused by underground pipe network diseases, such as urban waterlogging, environmental pollution and even surface collapse, have occurred frequently, causing heavy casualties and economic losses. According to ground subsidence statistics for Shenzhen, nearly 90% of urban ground subsidence is caused by various underground pipe network diseases. Carrying out large-scale, routine surveys of underground pipe network diseases is therefore important for effectively preventing such accidents.
The detection method of pipeline detection instruments (such as pipeline closed-circuit television detection system CCTV, pipeline periscope QV) has become one of the main detection means of urban drainage pipe networks. During CCTV or QV detection operation, a video analyst looks at pipeline images recorded by the CCTV or QV to find out diseases in the pipeline and classifies and marks the diseases. However, it is far from sufficient to analyze the disease in the video by a manual method only. The CCTV or QV pipeline image data volume is extremely large, and the manual screening method is time-consuming and labor-consuming. In recent years, with the development of image recognition and artificial intelligence technologies, researches at home and abroad propose a technology for realizing automatic pipeline disease recognition by using a deep learning technology. According to the method, a large number of drainage pipeline disease images which are marked manually are input into a deep learning model as sample data for training, then the detected and collected pipeline images are input into the trained deep learning model for recognition, and disease type labels are output. The method reduces the workload of manual screening and improves the disease identification efficiency. The method belongs to the process of supervised learning, the disease identification effect depends on input sample image data, and the good disease identification effect can be obtained only by providing enough sample data aiming at different pipes, different pipeline environments and different disease types and training the model. However, drainage pipelines constructed in different cities and different ages in China have obvious difference, complex pipeline environments and various disease types, and brand new disease types often appear. 
In such cases, disease identification based on supervised learning can hardly cover all disease sample data, which leads to missed and false detections and degrades the identification of pipeline diseases.
Therefore, the drainage pipeline disease detection technology based on deep learning still needs to be improved and developed.
Disclosure of Invention
In order to overcome the defects of the prior art, the invention provides a method for detecting the diseases of the drainage pipe of the capsule robot based on abnormal detection thinking.
The technical solution adopted by the invention to solve the technical problem is a capsule robot drain pipe disease detection method based on abnormal detection thinking, characterized by comprising the following steps:
S10, acquiring a captured video file and splitting it into a sequence of frame images of the inner wall of the drain pipe;
S20, inputting an image and dividing it into 4 × 4 image blocks;
S30, calculating the LBP, GLCM and HOG feature values of each image block, designing Gabor filters, and calculating the Gabor feature values;
S40, combining the feature values obtained in step S30 into a combined feature data set;
S50, calculating the probability of each feature value according to the probability formula;
S60, selecting a cross-validation data set, carrying out steps S30 to S50 in sequence to obtain the total probability value of each image, and then determining a threshold ε from the cross-validation set;
S70, inputting a new sample image and calculating its total probability value p; the sample image is considered abnormal when p < ε and not abnormal when p > ε.
Further, step S40 specifically includes the following steps:
combining the feature values obtained in step S30 to obtain a combined feature data set $(x_1, x_2, x_3, \dots, x_n)$; given an m × n training set, fitting an n-dimensional Gaussian distribution to it and analyzing the distribution of the m training samples to obtain the probability density function of the training set, that is, the mathematical expectation $\mu$ and variance $\sigma^2$ of the training set in each dimension; the mathematical expectation $\mu_j$ and variance $\sigma_j^2$ in the j-th dimension are calculated as follows, where $x_j^{(i)}$ denotes the j-th feature of the i-th training sample:
$$\mu_j = \frac{1}{m}\sum_{i=1}^{m} x_j^{(i)}, \qquad \sigma_j^2 = \frac{1}{m}\sum_{i=1}^{m}\left(x_j^{(i)} - \mu_j\right)^2$$
Further, in step S50, when a new point x is given, its probability p under the Gaussian distribution is determined, and p is calculated as follows:
$$p(x) = \prod_{j=1}^{n} p\left(x_j; \mu_j, \sigma_j^2\right) = \prod_{j=1}^{n} \frac{1}{\sqrt{2\pi}\,\sigma_j}\exp\left(-\frac{(x_j - \mu_j)^2}{2\sigma_j^2}\right)$$
Further, in step S10, the inside of the drain pipe is filmed through the fish-eye lens of the capsule robot to obtain the video file.
Further, in step S30, the LBP feature value is calculated as follows:
$$\mathrm{LBP} = \sum_{p=0}^{7} s\big(I(p) - I(c)\big)\, 2^{p}$$
where p denotes the p-th pixel, other than the central pixel, in the 3 × 3 window; I(c) denotes the gray value of the central pixel, and I(p) denotes the gray value of the p-th pixel in the neighborhood; s(x) is defined as follows:
$$s(x) = \begin{cases} 1, & x > 0 \\ 0, & \text{otherwise} \end{cases}$$
Further, in step S30, the image features derived from the GLCM are expressed as:
where $P_{i,j}$ denotes the number or frequency of occurrences of pixel pairs with gray levels i and j, respectively.
further, in step S30, Gabor filters are designed, and 24 Gabor filters are formed by selecting 4 sizes and 6 directions.
Further, the Gabor filters are convolved with the image to obtain the Gabor features, where the two-dimensional Gabor function is expressed as follows:
$$g(x, y; \lambda, \theta, \psi, \sigma, \gamma) = \exp\left(-\frac{x'^2 + \gamma^2 y'^2}{2\sigma^2}\right)\cos\left(2\pi\frac{x'}{\lambda} + \psi\right)$$
$$x' = x\cos\theta + y\sin\theta$$
$$y' = -x\sin\theta + y\cos\theta$$
where x and y denote the pixel coordinates; λ denotes the wavelength of the filter, in pixels; θ denotes the tilt angle of the Gabor kernel and specifies the direction of the parallel stripes of the Gabor function; ψ denotes the phase offset, ranging from -180 degrees to 180 degrees, where 0 and 180 degrees correspond to the centrally symmetric center-on and center-off functions, respectively, and -90 and 90 degrees correspond to antisymmetric functions; σ denotes the standard deviation of the Gaussian envelope; γ denotes the aspect ratio: the kernel is circular when γ = 1 and elongated along the direction of the parallel stripes when γ < 1.
Further, λ = 3, σ = 0.56λ, γ = 0.5, θ = 60° and ψ = 90°.
Further, in step S30, the HOG feature is a feature descriptor used for object detection in computer vision and image processing; it is formed by computing and accumulating histograms of gradient directions over local regions of the image, and the gradient magnitude and direction of an image pixel are calculated as follows:
$$G_x(x,y) = H(x+1,y) - H(x-1,y)$$
$$G_y(x,y) = H(x,y+1) - H(x,y-1)$$
$$G(x,y) = \sqrt{G_x(x,y)^2 + G_y(x,y)^2}$$
$$\alpha(x,y) = \arctan\left(\frac{G_y(x,y)}{G_x(x,y)}\right)$$
where $G_x(x,y)$, $G_y(x,y)$, $G(x,y)$, $\alpha(x,y)$ and $H(x,y)$ denote the horizontal gradient, vertical gradient, gradient magnitude, gradient direction and pixel value at pixel (x, y) of the input image, respectively.
The invention has the following beneficial effects: on the one hand, the LBP, GLCM, Gabor and HOG features of the image are used together, and the disease identification accuracy is improved by combining features such as local texture change, image brightness change, edge information and object information. On the other hand, no disease sample database needs to be established in advance, which reduces the missed-detection rate caused by incomplete disease sample data and greatly reduces the workload of the extensive manual labeling that supervised learning methods require in the early stage.
Drawings
Fig. 1 is a schematic flow chart of a method for detecting diseases of a drain pipe of a capsule robot based on abnormal detection thinking.
Fig. 2 is a diagram of an embodiment of a capsule robot drain disease detection method based on abnormal detection thinking according to the invention.
FIG. 3 is a partial sample diagram of experimental data of a method for detecting diseases in a drain pipe of a capsule robot based on abnormal detection thinking.
Fig. 4 to 9 are comparative diagrams of methods and feature combinations.
FIG. 10 is a graph showing the statistical results of various methods.
Detailed Description
The invention is further illustrated with reference to the following figures and examples.
The concept, specific structure and technical effects of the present invention are described clearly and completely below with reference to the embodiments and the accompanying drawings, so that the objects, features and effects of the invention can be fully understood. The described embodiments are only some of the embodiments of the present invention, not all of them; other embodiments obtained by those skilled in the art from these embodiments without inventive effort all fall within the protection scope of the present invention. In addition, the coupling/connection relations referred to in this patent do not mean that components are necessarily connected directly; a better connection structure may be formed by adding or removing auxiliary connecting components according to the specific implementation. The technical features of the invention can be combined with one another provided they do not conflict.
The invention aims to provide a pipe network disease video detection method based on an unsupervised anomaly detection idea. The human visual recognition mechanism consists in: when watching a section of video data, people can distinguish the pipeline diseases because the picture has abnormal characteristics, namely, the picture has a significant difference with the previous and next pictures. The invention is based on a human visual identification mechanism, takes disease image data as an abnormal signal, utilizes a section of continuous sequence pipeline video image to extract the significant abnormal characteristics of the front and back images, and carries out unsupervised abnormal cluster detection to realize pipeline disease detection.
The invention adopts an unsupervised image recognition technique and does not need a disease sample database to be established in advance. It treats disease image data as an abnormal signal, uses a continuous sequence of pipeline video images, extracts the salient abnormal features between preceding and following images, performs anomaly clustering detection on the extracted feature data set, achieves pipeline disease detection, and outputs a disease type label.
In the method, a video file captured by the capsule robot in a drainage pipeline is split into a sequence of frames, continuous segments of the sequence are selected in turn, and the LBP, GLCM, Gabor and HOG features of the images are extracted to obtain the local texture change, image brightness change, edge information and object information of the images; combining these different image features improves the accuracy with which the salient abnormal features between preceding and following images are extracted. The method adopts an anomaly detection algorithm based on Gaussian probability estimation: when the feature values are mutually independent, the total probability equals the product of the probabilities of the individual feature values, and a good result can still be obtained when the feature values are not mutually independent.
Referring to fig. 1, the present invention provides a method for detecting diseases in drain pipes of a capsule robot based on abnormal detection thinking, which comprises the following steps:
S10, acquiring a captured video file and splitting it into a sequence of frame images of the inner wall of the drain pipe;
in this embodiment, the inside of the drain pipe is filmed through the fish-eye lens of the capsule robot to obtain the video file;
S20, inputting an image and dividing it into 4 × 4 image blocks, 16 image blocks in total, to form a sequence image;
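The block division in step S20 can be sketched as follows. This is a minimal illustration in plain Python, not the patent's implementation: the function name `split_into_blocks` is hypothetical, the image is assumed to be a 2-D list of gray values, and any remainder pixels that do not divide evenly into the grid are simply dropped.

```python
def split_into_blocks(image, rows=4, cols=4):
    """Split a 2-D image (list of equal-length rows) into rows * cols blocks."""
    height, width = len(image), len(image[0])
    # Assumption: remainder pixels at the right and bottom edges are dropped.
    block_h, block_w = height // rows, width // cols
    blocks = []
    for r in range(rows):
        for c in range(cols):
            block = [row[c * block_w:(c + 1) * block_w]
                     for row in image[r * block_h:(r + 1) * block_h]]
            blocks.append(block)
    return blocks


# Toy 8 x 8 "image" whose pixel value encodes its position.
image = [[y * 8 + x for x in range(8)] for y in range(8)]
blocks = split_into_blocks(image)
print(len(blocks), blocks[0])
```

Each of the 16 blocks can then be passed to the feature extractors of step S30 independently.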
S30, calculating the LBP, GLCM and HOG feature values of each image block, designing Gabor filters, and calculating the Gabor feature values;
the method converts the video data into sequence images and extracts features from each image; in view of the characteristics of drain pipe diseases, it mainly extracts LBP (local binary pattern) features, GLCM (gray-level co-occurrence matrix) features, Gabor features and HOG (histogram of oriented gradients) features;
LBP (Local Binary Pattern) is used to extract texture features and has notable advantages such as rotation invariance and gray-scale invariance. The extracted features are local texture features of the image. The original LBP operator is defined on a 3 × 3 window: the central pixel of the window is used as the threshold and compared with the gray values of the 8 adjacent pixels; if a surrounding pixel value is larger than the central pixel value, that position is marked as 1, otherwise as 0. This yields an 8-bit binary number, which is usually converted into a decimal number (256 possible LBP codes); this value is used as the LBP value of the central pixel of the window and reflects the texture information of the 3 × 3 region.
The LBP feature value is calculated as follows:
$$\mathrm{LBP} = \sum_{p=0}^{7} s\big(I(p) - I(c)\big)\, 2^{p}$$
where p denotes the p-th pixel, other than the central pixel, in the 3 × 3 window; I(c) denotes the gray value of the central pixel, and I(p) denotes the gray value of the p-th pixel in the neighborhood; s(x) is defined as follows:
$$s(x) = \begin{cases} 1, & x > 0 \\ 0, & \text{otherwise} \end{cases}$$
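The LBP operator described above can be illustrated with a short sketch. The neighbor ordering (clockwise from the top-left corner) and the use of a 256-bin histogram as the per-block feature are assumptions made for illustration; the text fixes neither. The comparison follows the description (a neighbor larger than the center is marked 1).

```python
def lbp_code(window):
    """LBP code of the center pixel of a 3x3 window (list of 3 rows)."""
    center = window[1][1]
    # Assumption: neighbors are visited clockwise from the top-left corner.
    offsets = [(0, 0), (0, 1), (0, 2), (1, 2), (2, 2), (2, 1), (2, 0), (1, 0)]
    code = 0
    for p, (r, c) in enumerate(offsets):
        if window[r][c] > center:   # s(x) = 1 when the neighbor is brighter
            code += 1 << p          # weight 2^p
    return code


def lbp_histogram(image):
    """256-bin histogram of LBP codes over all interior pixels (assumed feature)."""
    hist = [0] * 256
    for y in range(1, len(image) - 1):
        for x in range(1, len(image[0]) - 1):
            window = [row[x - 1:x + 2] for row in image[y - 1:y + 2]]
            hist[lbp_code(window)] += 1
    return hist


window = [[6, 5, 2],
          [7, 6, 1],
          [9, 8, 7]]
print(lbp_code(window))  # 240: only the four neighbors at positions 4..7 exceed 6
```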
GLCM features: the gray-level co-occurrence matrix (GLCM) is an L × L square matrix, where L is the number of gray levels of the source image. It describes the joint distribution of two pixels with a given spatial relationship and can be regarded as a joint histogram of gray-level pairs. It reflects not only the distribution of brightness but also the positional distribution of pixels with the same or similar brightness, and is a second-order statistical feature of image brightness variation. The GLCM of an image reflects comprehensive information about the image gray levels with respect to direction, adjacent interval and variation amplitude. The image features derived from the GLCM are expressed as:
where $P_{i,j}$ denotes the number or frequency of occurrences of pixel pairs with gray levels i and j, respectively.
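How the co-occurrence frequencies $P_{i,j}$ are obtained can be sketched as below. The pixel offset (one pixel to the right) is an assumption, and the contrast statistic is shown only as one common example of a GLCM-derived feature; the text does not state which statistics the method actually uses.

```python
def glcm(image, levels, dx=1, dy=0):
    """Normalized gray-level co-occurrence matrix for the offset (dx, dy)."""
    counts = [[0] * levels for _ in range(levels)]
    height, width = len(image), len(image[0])
    for y in range(height):
        for x in range(width):
            y2, x2 = y + dy, x + dx
            if 0 <= y2 < height and 0 <= x2 < width:
                counts[image[y][x]][image[y2][x2]] += 1
    total = sum(map(sum, counts)) or 1   # avoid division by zero on tiny images
    return [[c / total for c in row] for row in counts]


def contrast(P):
    """Example statistic (assumption): sum of (i - j)^2 * P[i][j]."""
    return sum((i - j) ** 2 * P[i][j]
               for i in range(len(P)) for j in range(len(P)))


image = [[0, 0, 1, 1],
         [0, 0, 1, 1],
         [0, 2, 2, 2],
         [2, 2, 3, 3]]
P = glcm(image, levels=4)
print(P[2][2], contrast(P))
```

For this toy image there are 12 horizontal pixel pairs, three of which are (2, 2), so `P[2][2]` is 0.25.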
Gabor features: Gabor filters are designed, with 4 sizes and 6 directions selected to form 24 Gabor filters; the Gabor filters are convolved with the image to obtain the Gabor features, where the two-dimensional Gabor function is expressed as follows:
$$g(x, y; \lambda, \theta, \psi, \sigma, \gamma) = \exp\left(-\frac{x'^2 + \gamma^2 y'^2}{2\sigma^2}\right)\cos\left(2\pi\frac{x'}{\lambda} + \psi\right)$$
$$x' = x\cos\theta + y\sin\theta$$
$$y' = -x\sin\theta + y\cos\theta$$
where x and y denote the pixel coordinates; λ denotes the wavelength of the filter, in pixels; θ denotes the tilt angle of the Gabor kernel and specifies the direction of the parallel stripes of the Gabor function; ψ denotes the phase offset, ranging from -180 degrees to 180 degrees, where 0 and 180 degrees correspond to the centrally symmetric center-on and center-off functions, respectively, and -90 and 90 degrees correspond to antisymmetric functions; σ denotes the standard deviation of the Gaussian envelope; γ denotes the aspect ratio: the kernel is circular when γ = 1 and elongated along the direction of the parallel stripes when γ < 1. In this embodiment, λ = 3, σ = 0.56λ, γ = 0.5, θ = 60° and ψ = 90°.
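A bank of 24 kernels (4 sizes × 6 directions) could be generated along the following lines. This sketch assumes the real-valued (cosine) form of the Gabor function, interprets "size" as the kernel window width, and picks the window widths and the 30° direction step purely for illustration; only λ, σ, γ and ψ are taken from the embodiment.

```python
import math


def gabor_kernel(size, lam, theta, psi, sigma, gamma):
    """Real-valued Gabor kernel sampled on a size x size grid (size odd)."""
    half = size // 2
    kernel = []
    for y in range(-half, half + 1):
        row = []
        for x in range(-half, half + 1):
            xr = x * math.cos(theta) + y * math.sin(theta)    # x'
            yr = -x * math.sin(theta) + y * math.cos(theta)   # y'
            envelope = math.exp(-(xr ** 2 + (gamma * yr) ** 2) / (2 * sigma ** 2))
            row.append(envelope * math.cos(2 * math.pi * xr / lam + psi))
        kernel.append(row)
    return kernel


lam = 3.0
sizes = [7, 9, 11, 13]                            # assumption: window widths
thetas = [k * math.pi / 6 for k in range(6)]      # assumption: 6 directions, 30 degrees apart
bank = [gabor_kernel(s, lam, t, math.radians(90), 0.56 * lam, 0.5)
        for s in sizes for t in thetas]
print(len(bank))  # 24
```

Convolving each kernel with an image block and summarizing the responses (for example by mean and variance) would then give the Gabor feature values; that summarization step is likewise not specified in the text.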
HOG features: the HOG feature is a feature descriptor used for object detection in computer vision and image processing; it is formed by computing and accumulating histograms of gradient directions over local regions of the image, and the gradient magnitude and direction of an image pixel are calculated as follows:
$$G_x(x,y) = H(x+1,y) - H(x-1,y)$$
$$G_y(x,y) = H(x,y+1) - H(x,y-1)$$
$$G(x,y) = \sqrt{G_x(x,y)^2 + G_y(x,y)^2}$$
$$\alpha(x,y) = \arctan\left(\frac{G_y(x,y)}{G_x(x,y)}\right)$$
where $G_x(x,y)$, $G_y(x,y)$, $G(x,y)$, $\alpha(x,y)$ and $H(x,y)$ denote the horizontal gradient, vertical gradient, gradient magnitude, gradient direction and pixel value at pixel (x, y) of the input image, respectively.
The HOG feature is extracted through the following steps:
1) graying (treating the image as a three-dimensional surface over x and y, with the gray level as z);
2) standardizing (normalizing) the color space of the input image by Gamma correction, in order to adjust the image contrast, reduce the influence of local shadows and illumination changes, and suppress noise interference;
3) calculating the gradient (magnitude and direction) of each pixel of the image, mainly to capture contour information while further weakening the interference of illumination;
4) dividing the image into small cells (e.g., 6 × 6 pixels per cell);
5) computing the gradient histogram of each cell (counting the occurrences of the different gradient directions) to form the descriptor of each cell;
6) grouping several cells into a block (e.g., 3 × 3 cells per block) and concatenating the feature descriptors of all cells in the block to obtain the HOG feature descriptor of the block;
7) concatenating the HOG feature descriptors of all blocks in the image to obtain the HOG feature descriptor of the image. This is the final feature vector available for classification.
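Steps 3) and 5) above can be sketched for a single cell as follows. The choice of 9 orientation bins over 0° to 180° (unsigned gradients) and magnitude-weighted voting are common conventions assumed here, not values stated in the text; Gamma correction and block grouping are omitted for brevity.

```python
import math


def cell_histogram(cell, bins=9):
    """Orientation histogram of one cell (2-D list of gray values)."""
    height, width = len(cell), len(cell[0])
    hist = [0.0] * bins
    bin_width = 180.0 / bins
    for y in range(1, height - 1):          # border pixels lack both neighbors
        for x in range(1, width - 1):
            gx = cell[y][x + 1] - cell[y][x - 1]   # G_x = H(x+1, y) - H(x-1, y)
            gy = cell[y + 1][x] - cell[y - 1][x]   # G_y = H(x, y+1) - H(x, y-1)
            magnitude = math.hypot(gx, gy)
            angle = math.degrees(math.atan2(gy, gx)) % 180.0
            hist[int(angle // bin_width) % bins] += magnitude
    return hist


# 6 x 6 cell containing a vertical edge: gradients point horizontally.
cell = [[0, 0, 0, 10, 10, 10] for _ in range(6)]
hist = cell_histogram(cell)
print(hist)
```

All of the gradient energy of this vertical edge lands in the first bin (directions near 0°), which is the kind of contour information the descriptor is meant to capture.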
S40, combining the feature values obtained in step S30 to obtain a combined feature data set $(x_1, x_2, x_3, \dots, x_n)$; given an m × n training set, fitting an n-dimensional Gaussian distribution to it and analyzing the distribution of the m training samples to obtain the probability density function of the training set, that is, the mathematical expectation $\mu$ and variance $\sigma^2$ of the training set in each dimension; the mathematical expectation $\mu_j$ and variance $\sigma_j^2$ in the j-th dimension are calculated as follows, where $x_j^{(i)}$ denotes the j-th feature of the i-th training sample:
$$\mu_j = \frac{1}{m}\sum_{i=1}^{m} x_j^{(i)}, \qquad \sigma_j^2 = \frac{1}{m}\sum_{i=1}^{m}\left(x_j^{(i)} - \mu_j\right)^2$$
S50, when a new point x is given, determining its probability p under the Gaussian distribution, where p is calculated as follows:
$$p(x) = \prod_{j=1}^{n} p\left(x_j; \mu_j, \sigma_j^2\right) = \prod_{j=1}^{n} \frac{1}{\sqrt{2\pi}\,\sigma_j}\exp\left(-\frac{(x_j - \mu_j)^2}{2\sigma_j^2}\right)$$
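Steps S40 and S50 amount to fitting one univariate Gaussian per feature dimension and multiplying the per-feature densities. A minimal sketch, assuming the product-of-independent-Gaussians form and hypothetical function names `fit_gaussian` and `probability`:

```python
import math


def fit_gaussian(samples):
    """Per-dimension mean and variance of an m x n training set (list of rows)."""
    m, n = len(samples), len(samples[0])
    mu = [sum(s[j] for s in samples) / m for j in range(n)]
    var = [sum((s[j] - mu[j]) ** 2 for s in samples) / m for j in range(n)]
    return mu, var


def probability(x, mu, var):
    """Product of the per-feature Gaussian densities at point x."""
    p = 1.0
    for xj, mj, vj in zip(x, mu, var):
        # Note: a zero variance would divide by zero; a small constant could
        # be added to vj in practice (not part of the described method).
        p *= math.exp(-(xj - mj) ** 2 / (2 * vj)) / math.sqrt(2 * math.pi * vj)
    return p


train = [[1.0, 10.0], [2.0, 12.0], [3.0, 14.0]]
mu, var = fit_gaussian(train)
print(mu, probability([2.0, 12.0], mu, var), probability([5.0, 20.0], mu, var))
```

A point near the training mean receives a much higher p than a point far from it. With many features the product can underflow, so summing log-densities is a common alternative.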
S60, selecting a cross-validation data set, carrying out steps S30 to S50 in sequence to obtain the total probability value p of each image, and then determining a threshold ε from the cross-validation set;
S70, inputting a new sample image and calculating its total probability value p; the sample image is considered abnormal when p < ε and not abnormal when p > ε.
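The text does not say how ε is chosen from the cross-validation set. One common approach, shown here purely as an assumption, is to try each observed probability as a candidate threshold and keep the one with the best F1 score against known labels of the cross-validation images (1 = abnormal, 0 = normal).

```python
def select_threshold(probs, labels):
    """Pick epsilon maximizing F1 on a labeled cross-validation set.

    Returns (0.0, -1.0) if no candidate yields a true positive.
    """
    best_eps, best_f1 = 0.0, -1.0
    for eps in sorted(set(probs)):
        # A sample is predicted abnormal when p < eps (step S70).
        tp = sum(1 for p, y in zip(probs, labels) if p < eps and y == 1)
        fp = sum(1 for p, y in zip(probs, labels) if p < eps and y == 0)
        fn = sum(1 for p, y in zip(probs, labels) if p >= eps and y == 1)
        if tp == 0:
            continue
        precision, recall = tp / (tp + fp), tp / (tp + fn)
        f1 = 2 * precision * recall / (precision + recall)
        if f1 > best_f1:
            best_eps, best_f1 = eps, f1
    return best_eps, best_f1


probs = [0.30, 0.25, 0.001, 0.28, 0.002, 0.27]   # toy p values for six images
labels = [0, 0, 1, 0, 1, 0]
eps, f1 = select_threshold(probs, labels)
print(eps, f1)
```

In this toy example the two abnormal images have far lower p than the rest, so the smallest normal probability is selected as ε and separates the two groups perfectly.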
In the video data acquired by drainage pipeline inspection, disease conditions occupy only a small part of the images and can be regarded as abnormal signals; the task of anomaly detection is to find objects that differ from most other objects. The invention performs anomaly detection with an algorithm based on Gaussian probability density estimation, which has the advantage that a good detection result can be obtained whether or not the selected features are mutually independent.
On one hand, the LBP characteristic, the GLCM characteristic, the Gabor characteristic and the HOG characteristic of the image are comprehensively utilized, and the disease identification accuracy is improved through the combination of the characteristics of local texture change, image brightness change, edge information, object information and the like of the image. On the other hand, the method belongs to an unsupervised learning mode, prior knowledge is not needed, a disease sample database is not needed to be established in advance, the missing rate of the pipeline diseases caused by incomplete disease sample data can be reduced, and meanwhile, the workload of manually marking the diseases in the early stage by adopting a supervised learning method can be greatly reduced.
Referring to fig. 2, the present invention adopts and compares four typical anomaly detection algorithms and an anomaly detection method based on a clustering algorithm, and obtains a final evaluation result.
(1) iForest: the isolation forest (iForest) is a fast ensemble-based anomaly detection method with linear time complexity and high precision. It is a non-parametric, unsupervised method that requires neither a mathematical model nor labeled training data. iForest partitions the data with binary trees, and the depth of a data point in a tree reflects how sparse the data is around that point. An iForest consists of t isolation trees (iTrees), each of which is a binary tree. The algorithm steps are as follows:
The first step: construct the t iTrees.
The second step: calculate the average height h(x) of the sample point x to be detected over all trees. Each iTree is traversed to obtain the layer number h_t(x) at which the data point x finally falls in the t-th iTree. h_t(x) represents the depth in the tree: the closer to the root node, the smaller h_t(x); the closer to the bottom layer, the larger h_t(x). The height of the root node is 0;
The third step: judge whether x is an abnormal point according to h(x). The anomaly score of x is generally calculated with the following equation:
$$s(x, m) = 2^{-\frac{h(x)}{c(m)}}$$
The value range of s(x, m) is [0, 1], and the closer the value is to 1, the higher the probability that x is an outlier. Here m is the number of samples and c(m) is the average path length, whose expression is as follows:
$$c(m) = 2H(m-1) - \frac{2(m-1)}{m}, \quad H(i) \approx \ln(i) + 0.5772156649$$
As can be seen from the expression of s(x, m), if the height h(x) → 0, then s(x, m) → 1, i.e., x is almost certainly an outlier; if the height h(x) → m-1, then s(x, m) → 0, i.e., x cannot be an outlier; if the height h(x) → c(m), then s(x, m) → 0.5, i.e., the probability of being an outlier is 50%. In practice, a threshold on s(x, m) is set and then tuned, and points whose score exceeds the threshold are considered outliers.
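The scoring rule described above can be illustrated with a short sketch. It assumes the standard isolation forest formulas s(x, m) = 2^(-h(x)/c(m)) and c(m) = 2H(m-1) - 2(m-1)/m; the function names and the threshold value 0.6 are hypothetical and only serve the illustration.

```python
import math

EULER_GAMMA = 0.5772156649  # Euler-Mascheroni constant, used to approximate H(i)


def c(m: int) -> float:
    """Average path length c(m) used to normalise the height h(x)."""
    if m <= 1:
        return 0.0
    if m == 2:
        return 1.0
    harmonic = math.log(m - 1) + EULER_GAMMA  # H(m-1) ~ ln(m-1) + gamma
    return 2.0 * harmonic - 2.0 * (m - 1) / m


def anomaly_score(avg_height: float, m: int) -> float:
    """s(x, m) = 2 ** (-h(x) / c(m)); values near 1 suggest an outlier."""
    return 2.0 ** (-avg_height / c(m))


def is_outlier(avg_height: float, m: int, threshold: float = 0.6) -> bool:
    """Flag a point whose score exceeds the (hypothetical) tuned threshold."""
    return anomaly_score(avg_height, m) > threshold
```

A point isolated near the root (small average height) gets a score close to 1, while a point whose average height equals c(m) scores exactly 0.5.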
(2) One-Class SVM: the One-Class SVM also belongs to the support vector machine family, but unlike the traditional supervised support vector machines for classification and regression, it is an unsupervised learning method that does not need output labels for the training set. Since there are no class labels, the boundary and the support vectors are found with the SVDD method. In SVDD, all non-abnormal samples are expected to belong to the positive class, and a hypersphere is used for the division instead of a hyperplane. The algorithm obtains a spherical boundary around the data in the feature space and minimizes the volume of this hypersphere, thereby minimizing the influence of abnormal data points.
Assume that the generated hypersphere has center o and radius r > 0, that the hypersphere volume V(r) is to be minimized, and that the center o is a linear combination of the support vectors. Similar to the conventional SVM method, the distance from every training data point x_i to the center is required to be less than r, but a slack variable ξ_i with penalty coefficient C is introduced at the same time. The optimization problem is as follows:
$$\min_{r,\,o,\,\xi}\; V(r) + C\sum_{i=1}^{m}\xi_i$$
$$\text{s.t.}\quad \|x_i - o\|_2 \le r + \xi_i,\quad i = 1, 2, \ldots, m;$$
$$\xi_i \ge 0,\quad i = 1, 2, \ldots, m;$$
After the problem is solved through the Lagrange dual, it can be judged whether a new data point z belongs to the class: if the distance from z to the center is less than or equal to the radius r, z is not an abnormal point; if z lies outside the hypersphere, it is an abnormal point.
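The decision rule can be sketched as follows. The sketch does not solve the dual problem: the coefficients `alphas`, the support vectors and the radius are hypothetical values standing in for the output of a solver, and only the two facts stated above are used (the center is a linear combination of the support vectors, and a point outside the hypersphere is abnormal).

```python
import math


def sphere_center(alphas, support_vectors):
    """Center o as a linear combination of the support vectors."""
    dim = len(support_vectors[0])
    return [sum(a * sv[d] for a, sv in zip(alphas, support_vectors))
            for d in range(dim)]


def is_abnormal(z, center, radius):
    """A new point z is abnormal if it lies outside the hypersphere."""
    dist = math.sqrt(sum((zi - ci) ** 2 for zi, ci in zip(z, center)))
    return dist > radius


# Hypothetical solver output: four support vectors with equal weights.
alphas = [0.25, 0.25, 0.25, 0.25]
support_vectors = [(1.0, 0.0), (-1.0, 0.0), (0.0, 1.0), (0.0, -1.0)]
center = sphere_center(alphas, support_vectors)
radius = 1.0
```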
(3) Local Outlier Factor (LOF): the method judges whether each point p is an abnormal point by comparing the density of p with the density of its neighboring points; the lower the density of p relative to its neighbors, the more likely p is an abnormal point. The comparison is expressed as the ratio of the average density of the neighborhood points of p to the density of p. If the ratio is close to 1, the density of p is almost the same as that of its neighborhood points, and p probably belongs to the same cluster as its neighborhood; if the ratio is less than 1, the density of p is higher than that of its neighborhood points, and p is a dense point; if the ratio is greater than 1, the density of p is lower than that of its neighborhood points, and p is more likely to be an outlier.
The local outlier factor of point p is expressed as:
$$LOF_k(p) = \frac{\sum_{o \in N_k(p)} \frac{lrd_k(o)}{lrd_k(p)}}{|N_k(p)|}$$
where N_k(p) is the k-distance neighborhood of p and lrd_k is the local reachability density.
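The density ratio described above can be made concrete with a small pure-Python sketch of the standard LOF definition (k-distance, reachability distance, local reachability density). As a simplification that is an assumption of this sketch, ties at the k-th neighbor are ignored and exactly k neighbors are used; the function name is hypothetical.

```python
import math


def lof_scores(points, k):
    """Local outlier factor of every point, using exactly k neighbours."""
    n = len(points)
    dist = [[math.dist(p, q) for q in points] for p in points]

    neighbours, k_distance = [], []
    for i in range(n):
        order = sorted((j for j in range(n) if j != i), key=lambda j: dist[i][j])
        neighbours.append(order[:k])
        k_distance.append(dist[i][order[k - 1]])

    # Local reachability density: inverse of the mean reachability distance.
    lrd = []
    for i in range(n):
        reach = [max(k_distance[o], dist[i][o]) for o in neighbours[i]]
        lrd.append(len(reach) / sum(reach))

    # LOF: mean ratio of the neighbours' density to the point's own density.
    return [sum(lrd[o] for o in neighbours[i]) / (len(neighbours[i]) * lrd[i])
            for i in range(n)]


points = [(0, 0), (0, 1), (1, 0), (1, 1), (5, 5)]
scores = lof_scores(points, k=2)
```

In this toy data set the four corner points form a cluster with a ratio of about 1, while the isolated point (5, 5) gets a ratio well above 1.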
(4) Anomaly detection algorithm based on Gaussian probability density estimation: the anomaly detection algorithm based on the Gaussian distribution is widely used in many scenarios. The core idea of the algorithm is as follows: given an m × n training set (m samples, n features), the training set is modeled as an n-dimensional Gaussian distribution; the probability density function is obtained by analyzing the distribution of the m training samples, i.e., by obtaining the mathematical expectation μ and the variance σ² of the training set in each dimension; a threshold ε is determined with a small cross-validation set; when a new point is given, its probability p under the Gaussian distribution is compared with ε, and the point is judged abnormal when p < ε and not abnormal when p > ε.
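A minimal sketch of this procedure is given here. It assumes that the density is the product of per-dimension univariate Gaussians and that ε is chosen by maximizing the F1 score of the abnormal class on a labeled validation set; the text does not state how ε is selected, so the F1 criterion, the function names and the toy data are assumptions.

```python
import math


def fit_gaussian(train):
    """Per-dimension mean and variance of an m x n training set (list of rows)."""
    m, n = len(train), len(train[0])
    mu = [sum(row[j] for row in train) / m for j in range(n)]
    var = [sum((row[j] - mu[j]) ** 2 for row in train) / m for j in range(n)]
    return mu, var


def density(x, mu, var):
    """Product of the n univariate Gaussian densities evaluated at x."""
    p = 1.0
    for xj, mj, vj in zip(x, mu, var):
        p *= math.exp(-((xj - mj) ** 2) / (2.0 * vj)) / math.sqrt(2.0 * math.pi * vj)
    return p


def choose_epsilon(val_x, val_abnormal, mu, var):
    """Pick the candidate threshold with the best F1 on the validation set."""
    probs = sorted({density(x, mu, var) for x in val_x})
    candidates = [(a + b) / 2.0 for a, b in zip(probs, probs[1:])]
    best_eps, best_f1 = 0.0, -1.0
    for eps in candidates:
        flagged = [density(x, mu, var) < eps for x in val_x]
        tp = sum(f and a for f, a in zip(flagged, val_abnormal))
        fp = sum(f and not a for f, a in zip(flagged, val_abnormal))
        fn = sum((not f) and a for f, a in zip(flagged, val_abnormal))
        f1 = 2 * tp / (2 * tp + fp + fn) if tp else 0.0
        if f1 > best_f1:
            best_eps, best_f1 = eps, f1
    return best_eps


# Toy data (hypothetical): normal samples cluster around (1, 2).
train = [(1.0, 2.0), (1.2, 1.8), (0.8, 2.2), (1.0, 2.0)]
mu, var = fit_gaussian(train)
val_x = [(1.0, 2.0), (1.1, 1.9), (3.0, 5.0)]
val_abnormal = [False, False, True]
epsilon = choose_epsilon(val_x, val_abnormal, mu, var)


def is_abnormal(x):
    """A new point is abnormal when its density falls below epsilon."""
    return density(x, mu, var) < epsilon
```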
(5) K-Means++: the K-Means++ algorithm improves on the random centroid initialization of K-Means and comprises the following steps:
1) randomly select a point from the input data set as the first cluster center μ_1;
2) for each point x_i in the data set, calculate its distance D(x_i) to the closest of the cluster centers already selected:
$$D(x_i) = \min_{r} \|x_i - \mu_r\|_2, \quad r = 1, 2, \ldots, k_{selected}$$
3) select a new data point as the next cluster center, according to the principle that a point with a larger D(x) has a higher probability of being selected;
4) repeat steps 2) and 3) until k cluster centroids have been selected;
5) run the standard K-Means algorithm with these k centroids as the initial centroids.
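Steps 1) to 4) can be sketched as follows. The sketch samples each new center with probability proportional to D(x)², which is the weighting used in the standard K-Means++ formulation and an assumption here, since the text only states that a larger D(x) means a higher probability. The function names and the seed are hypothetical.

```python
import random


def squared_distance(a, b):
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))


def kmeans_pp_init(points, k, seed=0):
    """Choose k initial centers; far-away points are more likely to be picked."""
    rng = random.Random(seed)
    centers = [rng.choice(points)]  # step 1: first center chosen at random
    while len(centers) < k:
        # step 2: squared distance of each point to its closest chosen center
        d2 = [min(squared_distance(p, c) for c in centers) for p in points]
        # step 3: sample the next center with probability proportional to D(x)^2
        centers.append(rng.choices(points, weights=d2, k=1)[0])
    return centers


points = [(0.0, 0.0), (0.1, 0.0), (5.0, 5.0), (5.1, 5.0), (10.0, 0.0), (10.1, 0.0)]
centers = kmeans_pp_init(points, k=3)
```

Points that are already centers have weight zero, so they are never selected again; the resulting centers are then passed to a standard K-Means run (step 5, not shown).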
(6) DBSCAN: DBSCAN is a density-based clustering method. A core object that has not yet been assigned a category is selected as a seed, and the set of all samples that are density-reachable from this core object is found; this maximal density-connected sample set, derived from the density-reachability relation, forms one cluster of the final clustering. Another core object without a category is then selected to find its density-reachable sample set, which yields another cluster. The procedure runs until every core object has a category.
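The seed-and-expand procedure can be illustrated with a compact sketch. The parameter names `eps` and `min_pts` (neighborhood radius and minimum number of points for a core object) follow common DBSCAN usage and are not taken from the text; the toy data is hypothetical.

```python
import math

NOISE = -1


def dbscan(points, eps, min_pts):
    """Return one cluster label per point; NOISE marks points in no cluster."""
    n = len(points)
    labels = [None] * n

    def region(i):
        return [j for j in range(n) if math.dist(points[i], points[j]) <= eps]

    cluster = 0
    for i in range(n):
        if labels[i] is not None:
            continue
        neighbours = region(i)
        if len(neighbours) < min_pts:      # not a core object
            labels[i] = NOISE
            continue
        labels[i] = cluster                # new seed: start a cluster
        queue = [j for j in neighbours if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:         # border point reached from a core object
                labels[j] = cluster
            if labels[j] is not None:
                continue
            labels[j] = cluster
            reachable = region(j)
            if len(reachable) >= min_pts:  # j is a core object: keep expanding
                queue.extend(reachable)
        cluster += 1
    return labels


points = [(0, 0), (0, 1), (1, 0), (1, 1),
          (10, 10), (10, 11), (11, 10), (11, 11), (5, 5)]
labels = dbscan(points, eps=1.5, min_pts=3)
```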
(7) Hierarchical clustering algorithm: hierarchical methods first calculate the distance between samples and merge the closest points into the same class each time. The distance between classes is then calculated, and the closest classes are merged into a larger class. Merging continues until a single class remains.
Hierarchical clustering is a clustering algorithm that creates a hierarchical nested cluster tree by calculating the similarity between data points of different classes. In the cluster tree, the original data points of the different classes form the lowest layer, and the top layer is the root node of a single cluster. The merging algorithm of hierarchical clustering determines the similarity between data points or classes by calculating the distance between them (the smaller the distance, the higher the similarity), combines the two closest data points or classes, and repeats this process iteratively to generate the cluster tree.
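The merging procedure can be sketched as follows. The text does not say how the distance between two classes is measured, so the sketch assumes single linkage (the distance between the closest pair of members); it also stops at a requested number of clusters instead of merging down to one. Function names and data are hypothetical.

```python
import math


def agglomerative(points, n_clusters):
    """Merge the two closest clusters until n_clusters remain (single linkage)."""
    clusters = [[i] for i in range(len(points))]  # every point starts alone

    def linkage(a, b):
        return min(math.dist(points[i], points[j]) for i in a for j in b)

    while len(clusters) > n_clusters:
        best = None
        for x in range(len(clusters)):
            for y in range(x + 1, len(clusters)):
                d = linkage(clusters[x], clusters[y])
                if best is None or d < best[0]:
                    best = (d, x, y)
        _, x, y = best
        clusters[x] = clusters[x] + clusters[y]   # merge the closest pair
        del clusters[y]
    return clusters


points = [(0, 0), (0, 1), (5, 5), (5, 6), (10, 0)]
clusters = agglomerative(points, n_clusters=3)
```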
Referring to Fig. 3, which shows part of the experimental data samples, the first two rows are images taken inside the pipeline, and the last row shows pictures taken when the capsule robot has just entered the pipeline and when it is recovered from the pipeline. The video between two manhole covers is converted into a sequence of images, which is then converted into a sequence of signal features through feature extraction. Disease detection is then carried out with an anomaly detection algorithm. The video between the two manhole covers lasts 40 seconds and was converted into 899 images, from which the features were extracted and combined in different ways.
Five indexes are adopted to evaluate a disease detection algorithm. Let there be P positive samples (non-diseased samples) and N negative samples (diseased samples). In the detection result of the algorithm, TP is the number of positive samples judged as positive, FN is the number of positive samples judged as negative (false negatives), and P = TP + FN; TN is the number of negative samples judged as negative, FP is the number of negative samples judged as positive (false positives), and N = TN + FP. The five indexes are:
positive sample precision: precision = TP / (TP + FP);
negative sample precision: precision_b1 = TN / (TN + FN);
accuracy, i.e., the proportion of correctly judged samples: acc = (TP + TN) / (P + N);
non-disease sample recall: recall = TP / P;
disease sample recall: recall_b1 = TN / N.
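The five indexes can be computed as in the following sketch. It assumes labels are encoded as 1 for positive (non-diseased) and 0 for negative (diseased) samples; this encoding, the function names and the convention of returning 0.0 for an empty denominator are assumptions of the sketch.

```python
def ratio(num, den):
    """Safe division: an empty denominator yields 0.0 (a convention of this sketch)."""
    return num / den if den else 0.0


def evaluate(y_true, y_pred):
    """Five evaluation indexes; 1 = positive (non-diseased), 0 = negative (diseased)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    return {
        "precision": ratio(tp, tp + fp),      # positive sample precision
        "precision_b1": ratio(tn, tn + fn),   # negative (disease) sample precision
        "acc": ratio(tp + tn, len(pairs)),    # overall accuracy
        "recall": ratio(tp, tp + fn),         # non-disease sample recall
        "recall_b1": ratio(tn, tn + fp),      # disease sample recall
    }


metrics = evaluate([1, 1, 1, 1, 0, 0], [1, 1, 1, 0, 0, 1])
```

In this toy case TP = 3, FN = 1, TN = 1 and FP = 1, so the disease sample recall is 1/2 even though the overall accuracy is 4/6, which is why the disease-specific indexes are reported separately.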
The disease sample recall, the disease sample detection precision and the overall detection accuracy are the most important indexes, and the algorithm with the largest mean and the smallest variance is considered to have the best detection effect. Comparisons of the methods and feature combinations are shown in Figs. 4-9.
First, from the viewpoint of feature combination, the feature combinations that include the GLCM features have the best detection effect, which indicates that GLCM extracts the texture features of the image well. Second, the original image features and the Gabor features also give good classification results, indicating that both can be extracted well and that the image data reflects texture information well. From the viewpoint of the anomaly detection algorithm, iForest is the most stable: even under different feature combinations it achieves about 60% correct classification, and about 70% with the GLCM features and the original data features. The algorithm is relatively robust and can be given priority when no prior knowledge of the data is available. For the anomaly detection methods based on clustering, the basic parameters are set according to the following principles: for KMeans++ the initial number of clusters is 10 and the final number of clusters is 5, and for hierarchical clustering the number of clusters is 10; DBSCAN does not need the number of clusters to be specified. After the final result is obtained, the categories are sorted by number of members from small to large, and, with K being the number of clusters, the first J = Int(0.6·K) categories are taken as abnormal categories. In the disease detection scenario addressed by the method, the two parameters disease detection precision (precision_b1) and disease recall are very important. In essence, these two evaluation indexes also determine the other evaluation indexes such as the false alarm rate, the missed detection rate and the overall accuracy.
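The rule for turning a clustering result into abnormal categories can be sketched as follows. The sketch reads the rule as J = Int(0.6 × K), which is an assumption about the intended expression, and breaks ties between equally sized clusters by label, which is also an assumption; the function name and data are hypothetical.

```python
from collections import Counter


def abnormal_clusters(labels, fraction=0.6):
    """Return the J smallest clusters, with J = int(fraction * K)."""
    counts = Counter(labels)
    k = len(counts)                       # K: number of clusters found
    j = int(fraction * k)                 # J: number of clusters treated as abnormal
    ordered = sorted(counts, key=lambda c: (counts[c], c))  # smallest first
    return set(ordered[:j])


# Hypothetical clustering of 90 images into K = 5 clusters.
labels = [0] * 50 + [1] * 30 + [2] * 5 + [3] * 3 + [4] * 2
abnormal = abnormal_clusters(labels)
is_abnormal_image = [label in abnormal for label in labels]
```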
To evaluate the performance of the classification algorithms, the mean and variance of the precision and recall of the negative samples are used in this project: the larger the mean and the smaller the variance, the better the algorithm is considered to be. These two values are also consistent with the idea behind the ROC curve and with intuition. The GLCM features, which give the best feature combination effect, are selected in this project, and the results of the different algorithms are compared, as shown in the table:
TABLE 3.3-1 anomaly detection results for different algorithms
Fig. 10 is a schematic diagram of statistical results of different methods.
As shown in the figure, iForest, Gaussian-D, KMeans++ and DBSCAN all perform well with the GLCM features, and KMeans++ shows the best results. However, the clustering-based methods require some prior knowledge: two important parameters, the number of clusters and the proportion of abnormal clusters, must be set in advance. If prior knowledge is lacking, the iForest and Gaussian-D anomaly detection algorithms are recommended, as they better match practical cases.
In conclusion, the combination of the gray level co-occurrence matrix features with the anomaly detection algorithm based on Gaussian probability density is superior. Therefore, in a real pipeline scene, disease detection based on pipe network video can be realized by combining an anomaly detection algorithm with texture feature extraction.
While the preferred embodiments of the present invention have been illustrated and described, it will be understood by those skilled in the art that various changes in form and details may be made therein without departing from the spirit and scope of the invention as defined by the appended claims.
Claims (10)
1. A capsule robot drain pipe disease detection method based on abnormal detection thinking, characterized by comprising the following steps:
S10, acquiring a captured video file, and splitting the video file into frames to obtain a sequence of images of the inner wall of the drainage pipe;
S20, inputting an image, and dividing the image into 4-by-4 image blocks;
S30, respectively calculating an LBP feature value, a GLCM feature value and an HOG feature value of each image block, designing a Gabor filter, and calculating a Gabor feature value;
S40, combining the feature values obtained in step S30 to obtain a feature combination data set;
S50, calculating the probability of each feature value with the probability formula;
S60, selecting a cross-validation data set, performing steps S30 to S50 in sequence to obtain the total probability value of each image, and then determining a threshold ε through the cross-validation set;
S70, inputting a new sample image and calculating the total probability value p of the sample image; the sample image is considered abnormal when p < ε and not abnormal when p > ε.
2. The capsule robot drain pipe disease detection method based on abnormal detection thinking as claimed in claim 1, wherein step S40 comprises the following steps:
combining the feature values obtained in step S30 to obtain a feature combination data set (x_1, x_2, x_3, …, x_n); given an m × n training set, modeling the training set as an n-dimensional Gaussian distribution, and obtaining the probability density function of the training set by analyzing the distribution of the m training samples, i.e., obtaining the mathematical expectation μ and the variance σ² of the training set in each dimension; the mathematical expectation μ_j and the variance σ_j² in the j-th dimension are calculated as follows:
$$\mu_j = \frac{1}{m}\sum_{i=1}^{m} x_j^{(i)}$$
$$\sigma_j^2 = \frac{1}{m}\sum_{i=1}^{m}\left(x_j^{(i)} - \mu_j\right)^2$$
3. The capsule robot drain pipe disease detection method based on abnormal detection thinking as claimed in claim 2, wherein in step S50, when a new point is given, the probability p of the new point under the Gaussian distribution is calculated as follows:
$$p(x) = \prod_{j=1}^{n} p(x_j; \mu_j, \sigma_j^2) = \prod_{j=1}^{n} \frac{1}{\sqrt{2\pi}\,\sigma_j}\exp\left(-\frac{(x_j - \mu_j)^2}{2\sigma_j^2}\right)$$
4. The capsule robot drain pipe disease detection method based on abnormal detection thinking as claimed in claim 1, wherein in step S10, the inside of the drain pipe is filmed with a fish-eye lens of the capsule robot to obtain the video file.
5. The capsule robot drain pipe disease detection method based on abnormal detection thinking as claimed in claim 1, wherein in step S30, the LBP feature value is calculated as follows:
$$LBP(x_c, y_c) = \sum_{p=0}^{7} s\big(I(p) - I(c)\big)\,2^{p}$$
wherein (x_c, y_c) is the position of the central pixel point; p denotes the p-th pixel point other than the central pixel point in the 3 × 3 window; I(c) represents the gray value of the central pixel point, and I(p) represents the gray value of the p-th pixel point in the neighborhood; the formula of s(x) is as follows:
$$s(x) = \begin{cases} 1, & x \ge 0 \\ 0, & \text{otherwise} \end{cases}$$
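As an illustration of the LBP computation in this claim, the following sketch assumes the common 8-neighbor form in which each neighbor contributes the bit s(I(p) - I(c)) weighted by 2^p for p = 0..7. The neighbor ordering (clockwise from the top-left pixel) and the function names are assumptions of the sketch.

```python
def s(x):
    """Threshold function: 1 if the neighbour is at least as bright as the centre."""
    return 1 if x >= 0 else 0


# Neighbour offsets (row, col), clockwise from the top-left pixel (assumed order).
OFFSETS = [(-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1)]


def lbp_code(image, r, c):
    """LBP code of the pixel at row r, column c of a 2-D gray-level image."""
    center = image[r][c]
    return sum(s(image[r + dr][c + dc] - center) << p
               for p, (dr, dc) in enumerate(OFFSETS))


window = [[6, 5, 2],
          [7, 6, 1],
          [9, 8, 7]]
code = lbp_code(window, 1, 1)
```

For this window the neighbors at positions 0, 4, 5, 6 and 7 are at least as bright as the center value 6, giving the code 1 + 16 + 32 + 64 + 128 = 241.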
6. The capsule robot drain pipe disease detection method based on abnormal detection thinking as claimed in claim 1, wherein in step S30, the image features of the GLCM feature value are expressed as:
wherein P_{i,j} represents the number or frequency of occurrences of two pixels with gray levels i and j respectively,
7. The capsule robot drain pipe disease detection method based on abnormal detection thinking as claimed in claim 1, wherein in step S30, Gabor filters are designed, and 4 sizes and 6 directions are selected to form 24 Gabor filters.
8. The capsule robot drain pipe disease detection method based on abnormal detection thinking as claimed in claim 7, wherein the Gabor filters are convolved with the image to obtain the Gabor features, and the two-dimensional Gabor function is expressed as follows:
$$g(x, y; \lambda, \theta, \psi, \sigma, \gamma) = \exp\left(-\frac{x'^2 + \gamma^2 y'^2}{2\sigma^2}\right)\cos\left(2\pi\frac{x'}{\lambda} + \psi\right)$$
$$x' = x\cos\theta + y\sin\theta;$$
$$y' = -x\sin\theta + y\cos\theta;$$
wherein x and y represent the pixel coordinates; λ represents the wavelength of the filter, in pixels; θ represents the orientation of the Gabor kernel and specifies the direction of the parallel stripes of the Gabor function; ψ represents the phase offset, in the range -180 degrees to 180 degrees, where 0 degrees and 180 degrees correspond to the centrally symmetric center-on and center-off functions respectively, and -90 degrees and 90 degrees correspond to antisymmetric functions; σ represents the standard deviation of the Gaussian envelope; γ represents the aspect ratio: when γ = 1 the shape is circular, and when γ < 1 the shape is elongated along the direction of the parallel stripes.
9. The capsule robot drain pipe disease detection method based on abnormal detection thinking as claimed in claim 8, wherein λ = 3, σ = 0.56λ, γ = 0.5, θ = 60 and ψ = 90.
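A Gabor filter bank of the kind described in claims 7 to 9 can be sketched as follows. The sketch assumes the real-valued (cosine) form of the Gabor function and uses λ = 3, σ = 0.56λ and γ = 0.5 as example values; the four kernel sizes, the six orientations and the phase offset of 0 degrees are hypothetical choices, since the claims only state that 4 sizes and 6 directions are used. The convolution with the image is not shown.

```python
import math


def gabor_kernel(size, lam, theta_deg, psi_deg, sigma, gamma):
    """Real-valued Gabor kernel sampled on a size x size grid (size is odd)."""
    theta, psi = math.radians(theta_deg), math.radians(psi_deg)
    half = size // 2
    kernel = []
    for y in range(-half, half + 1):
        row = []
        for x in range(-half, half + 1):
            xr = x * math.cos(theta) + y * math.sin(theta)    # rotated x'
            yr = -x * math.sin(theta) + y * math.cos(theta)   # rotated y'
            envelope = math.exp(-(xr ** 2 + gamma ** 2 * yr ** 2) / (2.0 * sigma ** 2))
            row.append(envelope * math.cos(2.0 * math.pi * xr / lam + psi))
        kernel.append(row)
    return kernel


LAMBDA = 3.0
SIGMA = 0.56 * LAMBDA
GAMMA = 0.5
SIZES = (7, 9, 11, 13)                  # hypothetical kernel sizes
THETAS = (0, 30, 60, 90, 120, 150)      # hypothetical orientations in degrees

# 4 sizes x 6 directions = 24 filters.
bank = [gabor_kernel(size, LAMBDA, theta, 0.0, SIGMA, GAMMA)
        for size in SIZES for theta in THETAS]
```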
10. The capsule robot drain pipe disease detection method based on abnormal detection thinking as claimed in claim 1, wherein in step S30, the HOG feature is a feature descriptor used for object detection in computer vision and image processing, and is formed by calculating and counting the histogram of gradient directions in local areas of the image; the gradient magnitude and direction of an image pixel are calculated as follows:
$$G_x(x, y) = H(x+1, y) - H(x-1, y);$$
$$G_y(x, y) = H(x, y+1) - H(x, y-1);$$
$$G(x, y) = \sqrt{G_x(x, y)^2 + G_y(x, y)^2};$$
$$\alpha(x, y) = \tan^{-1}\left(\frac{G_y(x, y)}{G_x(x, y)}\right);$$
wherein G_x(x, y), G_y(x, y), G(x, y), α(x, y) and H(x, y) respectively represent the horizontal gradient, the vertical gradient, the gradient magnitude, the gradient direction and the pixel value at the pixel point (x, y) in the input image.
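The gradient computation of claim 10 and the resulting orientation histogram can be sketched as follows. The sketch assumes that H(x, y) is stored as `image[y][x]`, uses `atan2` in place of the arctangent of G_y/G_x so that G_x = 0 is handled, and accumulates magnitudes into 9 unsigned-orientation bins over the interior pixels; the bin count and the function names are assumptions, and block normalization is omitted.

```python
import math


def gradient(image, x, y):
    """Central-difference gradient, magnitude and direction (degrees) at (x, y)."""
    gx = image[y][x + 1] - image[y][x - 1]
    gy = image[y + 1][x] - image[y - 1][x]
    magnitude = math.hypot(gx, gy)
    direction = math.degrees(math.atan2(gy, gx))
    return gx, gy, magnitude, direction


def orientation_histogram(image, bins=9):
    """Magnitude-weighted histogram of unsigned gradient directions (0..180)."""
    hist = [0.0] * bins
    bin_width = 180.0 / bins
    for y in range(1, len(image) - 1):
        for x in range(1, len(image[0]) - 1):
            _, _, magnitude, direction = gradient(image, x, y)
            unsigned = direction % 180.0
            hist[int(unsigned // bin_width) % bins] += magnitude
    return hist


horizontal_ramp = [[0, 1, 2],
                   [0, 1, 2],
                   [0, 1, 2]]
vertical_ramp = [[0, 0, 0],
                 [1, 1, 1],
                 [2, 2, 2]]
```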
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110647069.3A CN113326790A (en) | 2021-06-10 | 2021-06-10 | Capsule robot drain pipe disease detection method based on abnormal detection thinking |
Publications (1)
Publication Number | Publication Date |
---|---|
CN113326790A (en) | 2021-08-31 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
WW01 | Invention patent application withdrawn after publication | Application publication date: 20210831 |