CN116912625A - Data enhancement method based on priori defect characteristics and SSPCAB attention mechanism - Google Patents
- Publication number
- CN116912625A (application number CN202310912045.5A)
- Authority
- CN
- China
- Prior art keywords: image, defect, sample, training, neural network
- Prior art date: 2023-07-25
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
- G06V10/774—Generating sets of training patterns; Bootstrap methods, e.g. bagging or boosting
- G06F16/901—Indexing; Data structures therefor; Storage structures
- G06N3/0464—Convolutional networks [CNN, ConvNet]
- G06N3/048—Activation functions
- G06N3/0895—Weakly supervised learning, e.g. semi-supervised or self-supervised learning
- G06T7/0004—Industrial image inspection
- G06V10/764—Arrangements for image or video recognition or understanding using pattern recognition or machine learning using classification, e.g. of video objects
- G06V10/776—Validation; Performance evaluation
- G06V10/82—Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
- G06T2207/20081—Training; Learning
- G06T2207/20084—Artificial neural networks [ANN]
- G06T2207/20132—Image cropping
- G06T2207/20221—Image fusion; Image merging
- Y02P90/30—Computing systems specially adapted for manufacturing
Abstract
The invention provides a data enhancement method based on prior defect characteristics and the SSPCAB attention mechanism, and relates to the field of industrial defect detection. A rectangular region of random size is cut from a random position of an input normal sample image. The defect types and shapes of the industrial product are collected as prior knowledge, and a mask transformation is applied to the rectangle to obtain an image patch resembling the defect shape of the corresponding product. The patch is randomly rotated by a certain angle, randomly color-jittered, and pasted at a random position of the original image to obtain a simulated defect image. The simulated defect images and normal images are fed into a ResNet-18 neural network into which a self-supervised attention module (SSPCAB) is integrated. The cyclical focal loss (CFL) serves as the loss function of the network, and the mean squared error loss produced by the attention module is added with a weight to form the objective function, finally yielding a defect detection model.
Description
Technical Field
The invention relates to the technical field of defect detection, and in particular to a data enhancement method based on prior defect characteristics and the SSPCAB attention mechanism.
Background
Surface defect detection of industrial products has long been a research hotspot and is widely applied in various industrial processes, such as the printing industry, glass manufacturing, and the textile industry. Its purpose is to replace manual inspection by automatically detecting and locating defects such as scratches, cracks, dust, and ink marks on product surfaces. With the rapid development of deep learning, its performance now far exceeds that of traditional algorithms in most settings. However, a major difference between deep learning and traditional algorithms is that deep learning must be driven by huge amounts of data, such as the COCO dataset for object detection and the ImageNet dataset for image classification. Driven by such large datasets, the performance of deep learning models on object detection, semantic segmentation, and classification tasks improves greatly, so data acquisition is critical for deep learning algorithms.
However, in current industry, data acquisition suffers from several problems. 1. Defective (abnormal) samples are hard to obtain: in actual production the defect rate is extremely low, and when abnormal samples do occur they are often numerous repetitions of the same useless defect; sometimes no usable abnormal sample exists at all. 2. In actual production, anomalies appear at random positions; even with a sufficiently large dataset, one cannot guarantee that anomalies cover all positions of the image, so a model driven by such a dataset cannot be guaranteed to detect well against all backgrounds. 3. Once a large number of abnormal samples is obtained, they must be annotated; but compared with the targets in training sets such as COCO, the defects produced in industrial manufacturing are tiny and hard to find against complex backgrounds, so annotation often consumes a great deal of manpower and material resources.
Although research on self-supervised and unsupervised defect detection methods has continued in recent years, some basic problems remain. Current embedding-based anomaly detection techniques use the distance between the embedding vectors of a test sample and normal samples as the criterion for computing an anomaly score, and train a network to improve its feature extraction capability so as to obtain the anomalous region. However, because the features extracted by the model must be matched, the computation speed is severely limited. Reconstruction-based methods detect and locate anomalies in an image through the error produced during image reconstruction; autoencoders and generative adversarial networks (GANs) are often used in this approach. The autoencoder is trained with an adversarial loss, and the anomaly score is computed from the errors produced during image reconstruction. Given the strong learning ability of neural networks, however, even abnormal images can be reconstructed with little loss. This violates the premise of the reconstruction method, namely that the abnormal region of an abnormal image cannot be fully reconstructed and can thereby be distinguished from the original image, and thus defeats the method. Anomaly detection techniques based on data enhancement require no data annotation and learn a feature extractor from unlabeled data. A common approach is to randomly copy a small rectangular region from the input image and randomly paste it back into the image to simulate an abnormal sample; structural anomalies are created by pasting rectangular patches that differ in size, aspect ratio, and rotation angle. Most current data enhancement methods, however, cannot simulate real defects well and can only generate abnormal samples by crudely constructing irregularities in the image. A data enhancement method closer to real defects is therefore needed to generate simulation data and thereby drive a better-performing neural network model.
Disclosure of Invention
To address the shortcomings of the prior art, the invention provides a data enhancement method based on prior defect characteristics and the SSPCAB attention mechanism, comprising the following steps:
step 1: establish an industrial defect detection dataset; the dataset comprises a test set and a training set, where the test set contains a certain proportion of normal samples and defect samples, and the training set contains only a certain number of normal samples;
step 2: establish a prior knowledge base of industrial product defects; through on-site investigation and analysis of the industrial site, the shape and type information of defects that frequently occur on different industrial products is obtained as prior knowledge;
step 3: adopt a data enhancement strategy with prior knowledge: a rectangular patch is obtained by cutting a normal image, the patch is fused with a prior defect shape to obtain a new defect patch that has the defect shape but the characteristics of the original sample, and the patch is pasted back into the original image to generate a simulated abnormal image;
the data enhancement strategy with prior knowledge has two working modes, according to the size of the simulated anomaly: in the first working mode, during cutting, the size of the rectangle to be cut is determined as a certain proportion of the size of the original image; in the second working mode, a numerical range far smaller than the image size is given, and values are randomly selected from that range as the side lengths of the cut rectangle, so as to generate tiny abnormal patches; these two different anomaly simulations enrich the data types;
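For illustration, the following is a minimal sketch of this augmentation step, assuming PIL images and a list of binary prior defect-shape masks collected in step 2. The concrete size ranges (5 to 15 percent of the image side for standard anomalies, 4 to 16 pixels for tiny ones) and the jitter strengths are illustrative assumptions, not values fixed by the invention:

```python
import random
from PIL import Image, ImageEnhance

def simulate_defect(image, prior_masks, tiny=False):
    """Cut a rectangle from a normal image, shape it with a prior defect
    mask, jitter it, and paste it back at a random position."""
    w, h = image.size
    if tiny:
        # Second working mode: side lengths from a small fixed range (assumed 4-16 px).
        cw, ch = random.randint(4, 16), random.randint(4, 16)
    else:
        # First working mode: rectangle sized as a fraction of the image (assumed 5-15%).
        cw = int(w * random.uniform(0.05, 0.15))
        ch = int(h * random.uniform(0.05, 0.15))
    # Cut a rectangular patch at a random position of the normal image.
    x, y = random.randint(0, w - cw), random.randint(0, h - ch)
    patch = image.crop((x, y, x + cw, y + ch))
    # Fuse the patch with a prior defect shape: a binary mask from the knowledge base.
    mask = random.choice(prior_masks).convert("L").resize((cw, ch))
    # Random rotation and random color jitter.
    angle = random.uniform(0, 360)
    patch = patch.rotate(angle, expand=True)
    mask = mask.rotate(angle, expand=True)  # rotated corners stay masked out
    patch = ImageEnhance.Color(patch).enhance(random.uniform(0.8, 1.2))
    patch = ImageEnhance.Brightness(patch).enhance(random.uniform(0.8, 1.2))
    # Paste the shaped patch back at a random position (assumes it still fits).
    pw, ph = patch.size
    px, py = random.randint(0, max(0, w - pw)), random.randint(0, max(0, h - ph))
    out = image.copy()
    out.paste(patch, (px, py), mask)
    return out
```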
step 4: taking normal samples and the simulated abnormal samples generated from them as input, train an image classifier able to detect abnormal images using the simulated and normal images, based on a self-supervised training method; the classifier has four convolutional layers with channel sizes 64, 128, 256 and 512; using ReLU activation functions and two pooling layers, the image is reduced to a 512×1 feature vector, and finally softmax converts the output into a probability distribution;
the classifier has two working modes, according to the classification scheme: in the first working mode, the classification task contains only two categories, normal and abnormal; normal images are labeled 0, and for each abnormal image one of the tiny-anomaly and standard-anomaly simulations is randomly selected during data enhancement and the image is labeled 1; in the second working mode, the classification task contains three categories (normal images, standard anomalies and tiny anomalies); both anomaly types are generated simultaneously during data enhancement, and normal images are labeled 0, standard anomalies 1 and tiny anomalies 2; through these different classification tasks a better-performing classifier is trained, as in the sketch below;
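A minimal PyTorch sketch of such a classifier follows. Only the channel widths (64, 128, 256, 512), the ReLU activations, the two pooling layers, the 512×1 output and the final softmax are fixed by the description (the abstract alternatively names a ResNet-18 backbone); the kernel sizes, the pooling positions and the adaptive pooling used to reach 512×1 are assumptions:

```python
import torch
import torch.nn as nn

class DefectClassifier(nn.Module):
    """Four conv layers (64, 128, 256, 512 channels), ReLU, two pooling
    stages, a 512-dim descriptor, and a softmax over the classes."""
    def __init__(self, num_classes=2):  # 2 for working mode 1, 3 for mode 2
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 64, 3, padding=1), nn.ReLU(inplace=True),
            nn.MaxPool2d(2),                      # first pooling layer
            nn.Conv2d(64, 128, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(128, 256, 3, padding=1), nn.ReLU(inplace=True),
            nn.MaxPool2d(2),                      # second pooling layer
            nn.Conv2d(256, 512, 3, padding=1), nn.ReLU(inplace=True),
            nn.AdaptiveAvgPool2d(1),              # collapse to 512 x 1 x 1
        )
        self.head = nn.Linear(512, num_classes)

    def forward(self, x):
        z = self.features(x).flatten(1)            # (B, 512)
        return torch.softmax(self.head(z), dim=1)  # probability distribution
```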
step 5: build a neural network incorporating a self-supervised attention mechanism;
the self-supervised attention mechanism has two parts: (1) the masked convolution layer uses a 3×3 convolution kernel with a masked region M of size 1 at its center, which does not participate in the computation during convolution; the learnable parameters of the masked convolution lie in sub-kernels K_i at the four corners of the receptive field, where k′ ∈ N^+ is a hyperparameter defining the sub-kernel size, C is the number of input channels, d is the distance between a sub-kernel K_i and the masked region M, and M ∈ R^{1×1×C}; the predicted value of M is the sum of the K_i responses; (2) a channel attention mechanism is fused after the convolution layer: the features extracted by the convolution layer undergo a Squeeze compression operation that aggregates the feature maps across the spatial dimensions H×W to produce a channel descriptor, i.e. H×W×C → 1×1×C; global spatial information is thereby compressed into channel descriptors that subsequent layers can use; Squeeze is implemented as global average pooling, expressed as:

z_c = F_sq(u_c) = (1/(H×W)) Σ_{i=1..H} Σ_{j=1..W} u_c(i, j)

where H is the height of the feature map, W is its width, and u_c is the converted feature matrix of channel c; the number of channels C is kept unchanged while the features are compressed into a channel descriptor;
the gating mechanism is then parameterized by a bottleneck of two fully connected layers, W_1 for dimension reduction and W_2 for dimension restoration, expressed as:

s = F_ex(z, W) = σ(g(z, W)) = σ(W_2 δ(W_1 z))

where W_1 ∈ R^{(C/r)×C} and W_2 ∈ R^{C×(C/r)} are fully connected layers, and r is a scaling parameter (here set to 16) that reduces the number of channels to lower the computational cost; δ is a ReLU layer, which leaves the output dimension unchanged; after multiplication by W_2, s is finally obtained through a sigmoid function;
after this module is inserted after the last convolutional layer of the backbone network, its parameters are updated synchronously with the training of the network;
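A PyTorch sketch of this block is given below, matching the description above: four learnable corner sub-kernels whose summed responses predict the masked center (the defaults k′ = 1, d = 0 reproduce the 3×3 masked kernel with a size-1 masked region), followed by squeeze-and-excitation channel attention with r = 16. The zero padding and the module interface are assumptions:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SSPCAB(nn.Module):
    """Masked convolution (learnable corner sub-kernels K1..K4, masked
    center M) followed by squeeze-and-excitation channel attention."""
    def __init__(self, channels, kernel_dim=1, dilation=0, reduction=16):
        super().__init__()
        self.pad = kernel_dim + dilation           # pad so output keeps H x W
        self.crop = kernel_dim + 2 * dilation + 1  # offset between opposite corners
        self.k1 = nn.Conv2d(channels, channels, kernel_dim)  # top-left
        self.k2 = nn.Conv2d(channels, channels, kernel_dim)  # top-right
        self.k3 = nn.Conv2d(channels, channels, kernel_dim)  # bottom-left
        self.k4 = nn.Conv2d(channels, channels, kernel_dim)  # bottom-right
        self.fc1 = nn.Linear(channels, channels // reduction)  # W1: reduce
        self.fc2 = nn.Linear(channels // reduction, channels)  # W2: restore

    def forward(self, x):
        p, b = self.pad, self.crop
        xp = F.pad(x, (p, p, p, p))
        # Prediction of the masked center = sum of the four corner responses.
        out = (self.k1(xp[:, :, :-b, :-b]) + self.k2(xp[:, :, :-b, b:]) +
               self.k3(xp[:, :, b:, :-b]) + self.k4(xp[:, :, b:, b:]))
        # Squeeze: H x W x C -> 1 x 1 x C by global average pooling.
        z = out.mean(dim=(2, 3))
        # Excitation: sigmoid(W2 relu(W1 z)), then rescale the channels.
        s = torch.sigmoid(self.fc2(F.relu(self.fc1(z))))
        return out * s.view(s.size(0), -1, 1, 1)
```

The self-supervised reconstruction term α(G(X) − X)² of the objective in step 6 is then simply α times the mean squared error between the block's output and its input.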
step 6: repeat steps 4 and 5 until the objective function of the neural network converges, obtaining the optimal classification network model;
the objective function is expressed as:

L_mask = −ξ(1 + p_t)^{γ_hc}·log(p_t) + (1 − ξ)·FL + α·(G(X) − X)²

where p_t is the predicted probability; FL is the focal loss; γ_hc is a hyperparameter; ξ is a time-varying parameter assigned different weights in different training periods, whose goal is to combine the early-training focus on confidently predicted classes with the mid-training focus on misclassified hard samples; α is a defined weight hyperparameter, generally taken as 0.1; G is the self-supervised attention module and X is its input tensor; in detail, the parameter ξ, which varies with the training epoch, is defined as:

ξ = 1 − τ·t/T, if τ·t/T ≤ 1;  ξ = (τ·t/T − 1)/(τ − 1), otherwise

where τ is a hyperparameter with τ ≥ 1, used to divide the total number of epochs; T is the total number of epochs; t is the current epoch. For example, if τ is set to 2, the schedule has an inverted-triangle shape: ξ decreases from 1 to 0 in the first half of training and increases from 0 back to 1 in the second half;
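As a sketch, the objective can be computed as below, with ξ implemented from the piecewise-linear description above. The focal-loss form (1 − p_t)^γ with γ = 2 and the default γ_hc = 2 are assumptions, since the text fixes neither value:

```python
import torch

def xi_schedule(t, T, tau=2.0):
    """Cyclical weight: with tau = 2 it traces an inverted triangle,
    falling from 1 to 0 in the first half and rising back to 1."""
    r = tau * t / T
    return 1.0 - r if r <= 1.0 else (r - 1.0) / (tau - 1.0)

def cyclical_focal_loss(probs, targets, t, T, gamma_hc=2.0, gamma=2.0,
                        tau=2.0, alpha=0.1, g_out=None, g_in=None):
    """L_mask = -xi (1 + p_t)^gamma_hc log(p_t) + (1 - xi) FL
               + alpha (G(X) - X)^2, for softmax probabilities `probs`."""
    xi = xi_schedule(t, T, tau)
    p_t = probs.gather(1, targets.unsqueeze(1)).squeeze(1).clamp_min(1e-8)
    log_pt = p_t.log()
    confident_term = -xi * (1.0 + p_t) ** gamma_hc * log_pt   # early epochs
    focal_term = -(1.0 - xi) * (1.0 - p_t) ** gamma * log_pt  # hard samples (FL)
    loss = (confident_term + focal_term).mean()
    if g_out is not None:  # SSPCAB reconstruction term, weight alpha ~ 0.1
        loss = loss + alpha * torch.mean((g_out - g_in) ** 2)
    return loss
```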
step 7: evaluate the classification model obtained in step 6 on the test set, computing the anomaly score of each image with a Gaussian density estimator in order to classify it;
the Gaussian density estimator performs anomaly detection on the images and computes their anomaly scores; the input images carry only image-level labels, with no pixel-level labels; in the evaluation stage, the features extracted by the last convolutional layer of the model are taken as output and fed to the Gaussian density estimator, and the anomaly score of the features is computed, expressed as:

score(x) = (x − μ)^T Σ^{−1} (x − μ)

where x is the input feature, μ is the mathematical expectation, and Σ is an n-order matrix obtained during learning; they are respectively expressed as:

μ = (1/m)·Σ_{i=1..m} x_i

where x_i is a feature matrix and m is the number of feature matrices;

Σ = (1/m)·Σ_{i=1..m} (x_i − μ)(x_i − μ)^T

where x_i is a feature matrix, μ is the data expectation, and m is the number of feature matrices;
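A minimal NumPy sketch of this estimator follows, under the assumption that the anomaly score is the Mahalanobis distance implied by the μ and Σ formulas above; the small ridge added to keep the covariance invertible is also an assumption:

```python
import numpy as np

class GaussianDensityEstimator:
    """Fit a multivariate Gaussian to the features of normal training
    images; score test features by Mahalanobis distance."""
    def fit(self, feats):                  # feats: (m, n) feature matrix
        self.mu = feats.mean(axis=0)       # mu = (1/m) sum x_i
        diff = feats - self.mu
        cov = diff.T @ diff / len(feats)   # Sigma = (1/m) sum (x-mu)(x-mu)^T
        self.inv_cov = np.linalg.inv(cov + 1e-6 * np.eye(cov.shape[0]))

    def score(self, x):                    # anomaly score of one feature vector
        d = x - self.mu
        return float(d @ self.inv_cov @ d)
```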
step 8: plot the classification results of the classification model as an ROC curve; the ROC curve serves as an evaluation index of the classifier and as the basis for computing the AUC;
step 9: from the ROC curve obtained in step 8, compute the AUC score of the classifier as the score of the model evaluation, expressed as:

AUC = (Σ_{i∈positives} rank_i − N(N + 1)/2) / (M·N)

where M is the number of negative samples and N the number of positive samples; their product M·N is the number of pairwise positive-negative orderings; the ranks of the positive samples are summed, rank_i being the 1-based rank of positive sample i when all samples are sorted in ascending order of score (one more than the number of samples with a lower score). If the score is not satisfactory, repeat steps 4 to 9.
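A short NumPy sketch of this rank-based AUC computation (tie handling is omitted for simplicity):

```python
import numpy as np

def auc_from_ranks(scores, labels):
    """AUC = (sum of positive-sample ranks - N(N+1)/2) / (M * N),
    with 1-based ranks over all M + N samples sorted by score."""
    scores, labels = np.asarray(scores), np.asarray(labels)
    order = scores.argsort()
    ranks = np.empty(len(scores))
    ranks[order] = np.arange(1, len(scores) + 1)
    n_pos = int(labels.sum())              # N positive samples
    n_neg = len(labels) - n_pos            # M negative samples
    pos_rank_sum = ranks[labels == 1].sum()
    return (pos_rank_sum - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)
```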
Step 10: if the neural network model obtained in step 9 reaches the required AUC score (set according to actual requirements), it is taken as the final detection model; that is, training of the neural network model using the data enhancement method with prior defect characteristics and the SSPCAB self-supervised attention mechanism is complete.
Further, the abnormal image generation strategy of step 3 has two working modes according to the size of the simulated anomaly: in the first working mode, during cutting, the size of the rectangle to be cut is determined as a certain proportion of the size of the original image; in the second working mode, a numerical range far smaller than the image size is given, and values are randomly selected from that range as the side lengths of the cut rectangle, so as to generate tiny abnormal patches; these two different anomaly simulations enrich the data types.
Further, the normal samples and simulated abnormal samples input to the self-supervised training method of step 4 support two working modes according to the classification scheme: in the first working mode, the classification task contains only two categories, normal and abnormal; normal images are labeled 0, and for each abnormal image one of the tiny-anomaly and standard-anomaly simulations is randomly selected during data enhancement and the image is labeled 1; in the second working mode, the classification task contains three categories (normal images, standard anomalies and tiny anomalies); both anomaly types are generated simultaneously during data enhancement, and normal images are labeled 0, standard anomalies 1 and tiny anomalies 2; through these different classification tasks a better-performing classifier is trained.
Further, in step 7, the Gaussian density estimator performs anomaly detection on the images and computes their anomaly scores; the input images carry only image-level labels, with no pixel-level labels; in the evaluation stage, the features extracted by the last convolutional layer of the model are taken as output and fed to the Gaussian density estimator, and the anomaly score of the features is computed, expressed as:

score(x) = (x − μ)^T Σ^{−1} (x − μ)

where x is the input feature, μ is the mathematical expectation, and Σ is an n-order matrix obtained during learning; they are respectively expressed as:

μ = (1/m)·Σ_{i=1..m} x_i

where x_i is a feature matrix and m is the number of feature matrices;

Σ = (1/m)·Σ_{i=1..m} (x_i − μ)(x_i − μ)^T

where x_i is a feature matrix, μ is the data expectation, and m is the number of feature matrices;
Further, the objective function in step 6 is expressed as:

L_mask = −ξ(1 + p_t)^{γ_hc}·log(p_t) + (1 − ξ)·FL + α·(G(X) − X)²

where p_t is the predicted probability; FL is the focal loss; γ_hc is a hyperparameter; ξ is a time-varying parameter assigned different weights in different training periods, whose goal is to combine the early-training focus on confidently predicted classes with the mid-training focus on misclassified hard samples; α is a defined weight hyperparameter, generally taken as 0.1; G is the self-supervised attention module and X is its input tensor; in detail, the parameter ξ, which varies with the training epoch, is defined as:

ξ = 1 − τ·t/T, if τ·t/T ≤ 1;  ξ = (τ·t/T − 1)/(τ − 1), otherwise

where τ is a hyperparameter with τ ≥ 1, used to divide the total number of epochs; T is the total number of epochs; t is the current epoch. For example, if τ is set to 2, the schedule has an inverted-triangle shape: ξ decreases from 1 to 0 in the first half of training and increases from 0 back to 1 in the second half.
the beneficial effects of adopting above-mentioned technical scheme to produce lie in:
the invention provides a data enhancement method based on prior defect characteristics and SSPCAB attention mechanisms, which adopts a neural network combined with a self-supervision attention prediction mechanism, and adds industrial defect prior knowledge acquired on an industrial site in the data enhancement process, so that a generated simulated defect image is more in line with a real defect image, the finally obtained detection precision is higher, and the defect detection efficiency is effectively improved.
Drawings
FIG. 1 is a flow chart of the self-supervised data enhancement method combined with an attention mechanism in an embodiment of the present invention.
FIG. 2 is a schematic diagram of a data enhancement strategy according to an embodiment of the present invention.
Fig. 3 is a block diagram of the attention mechanism in the present invention.
Detailed Description
The following describes the embodiments of the present invention in further detail with reference to the drawings and examples. The following examples illustrate the invention but are not intended to limit its scope.
Traditional industrial defect detection methods suffer heavy interference from external factors in complex scenes; for example, detection is strongly affected when the industrial site experiences illumination changes, contamination of the background light source, or vibration of production-line equipment. Current deep learning methods can avoid these problems to some extent, but they require a large amount of defect data to train the model, which industry currently finds difficult to provide. On this basis, as shown in FIG. 1, the present invention provides a data enhancement method based on prior defect characteristics and the SSPCAB attention mechanism, comprising the following steps:
step 1: establish an industrial defect detection dataset; the dataset comprises a test set and a training set, where the test set contains a certain proportion of normal samples and defect samples, and the training set contains only a certain number of normal samples;
step 2: establish a prior knowledge base of industrial product defects; through on-site investigation and analysis of the industrial site, the shape and type information of defects that frequently occur on different industrial products is obtained as prior knowledge;
step 3: adopt a data enhancement strategy with prior knowledge: a rectangular patch is obtained by cutting a normal image, the patch is fused with a prior defect shape to obtain a new defect patch that has the defect shape but the characteristics of the original sample, and the patch is pasted back into the original image to generate a simulated abnormal image, as shown in FIG. 2;
the data enhancement strategy with prior knowledge has two working modes, according to the size of the simulated anomaly: in the first working mode, during cutting, the size of the rectangle to be cut is determined as a certain proportion of the size of the original image; in the second working mode, a numerical range far smaller than the image size is given, and values are randomly selected from that range as the side lengths of the cut rectangle, so as to generate tiny abnormal patches; these two different anomaly simulations enrich the data types;
step 4: taking normal samples and the simulated abnormal samples generated from them as input, train an image classifier able to detect abnormal images using the simulated and normal images, based on a self-supervised training method; the classifier has four convolutional layers with channel sizes 64, 128, 256 and 512; using ReLU activation functions and two pooling layers, the image is reduced to a 512×1 feature vector, and finally softmax converts the output into a probability distribution;
the classifier has two working modes, according to the classification scheme: in the first working mode, the classification task contains only two categories, normal and abnormal; normal images are labeled 0, and for each abnormal image one of the tiny-anomaly and standard-anomaly simulations is randomly selected during data enhancement and the image is labeled 1; in the second working mode, the classification task contains three categories (normal images, standard anomalies and tiny anomalies); both anomaly types are generated simultaneously during data enhancement, and normal images are labeled 0, standard anomalies 1 and tiny anomalies 2; through these different classification tasks a better-performing classifier is trained;
step 5: build a neural network incorporating a self-supervised attention mechanism, as shown in FIG. 3;
the self-supervised attention mechanism has two parts: (1) the masked convolution layer uses a 3×3 convolution kernel with a masked region M of size 1 at its center, which does not participate in the computation during convolution; the learnable parameters of the masked convolution lie in sub-kernels K_i at the four corners of the receptive field, where k′ ∈ N^+ is a hyperparameter defining the sub-kernel size, C is the number of input channels, d is the distance between a sub-kernel K_i and the masked region M, and M ∈ R^{1×1×C}; the predicted value of M is the sum of the K_i responses; (2) a channel attention mechanism is fused after the convolution layer: the features extracted by the convolution layer undergo a Squeeze compression operation that aggregates the feature maps across the spatial dimensions H×W to produce a channel descriptor, i.e. H×W×C → 1×1×C; global spatial information is thereby compressed into channel descriptors that subsequent layers can use; Squeeze is implemented as global average pooling, expressed as:

z_c = F_sq(u_c) = (1/(H×W)) Σ_{i=1..H} Σ_{j=1..W} u_c(i, j)

where H is the height of the feature map, W is its width, and u_c is the converted feature matrix of channel c; the number of channels C is kept unchanged while the features are compressed into a channel descriptor;
the gating mechanism is then parameterized by a bottleneck of two fully connected layers, W_1 for dimension reduction and W_2 for dimension restoration, expressed as:

s = F_ex(z, W) = σ(g(z, W)) = σ(W_2 δ(W_1 z))

where W_1 ∈ R^{(C/r)×C} and W_2 ∈ R^{C×(C/r)} are fully connected layers, and r is a scaling parameter (here set to 16) that reduces the number of channels to lower the computational cost; δ is a ReLU layer, which leaves the output dimension unchanged; after multiplication by W_2, s is finally obtained through a sigmoid function;
after this module is inserted after the last convolutional layer of the backbone network, its parameters are updated synchronously with the training of the network;
step 6: repeat steps 4 and 5 until the objective function of the neural network converges, obtaining the optimal classification network model;
the objective function is expressed as:

L_mask = −ξ(1 + p_t)^{γ_hc}·log(p_t) + (1 − ξ)·FL + α·(G(X) − X)²

where p_t is the predicted probability; FL is the focal loss; γ_hc is a hyperparameter; ξ is a time-varying parameter assigned different weights in different training periods, whose goal is to combine the early-training focus on confidently predicted classes with the mid-training focus on misclassified hard samples; α is a defined weight hyperparameter, generally taken as 0.1; G is the self-supervised attention module and X is its input tensor; in detail, the parameter ξ, which varies with the training epoch, is defined as:

ξ = 1 − τ·t/T, if τ·t/T ≤ 1;  ξ = (τ·t/T − 1)/(τ − 1), otherwise

where τ is a hyperparameter with τ ≥ 1, used to divide the total number of epochs; T is the total number of epochs; t is the current epoch. For example, if τ is set to 2, the schedule has an inverted-triangle shape: ξ decreases from 1 to 0 in the first half of training and increases from 0 back to 1 in the second half;
step 7: evaluate the classification model obtained in step 6 on the test set, computing the anomaly score of each image with a Gaussian density estimator in order to classify it;
the Gaussian density estimator performs anomaly detection on the images and computes their anomaly scores; the input images carry only image-level labels, with no pixel-level labels; in the evaluation stage, the features extracted by the last convolutional layer of the model are taken as output and fed to the Gaussian density estimator, and the anomaly score of the features is computed, expressed as:

score(x) = (x − μ)^T Σ^{−1} (x − μ)

where x is the input feature, μ is the mathematical expectation, and Σ is an n-order matrix obtained during learning; they are respectively expressed as:

μ = (1/m)·Σ_{i=1..m} x_i

where x_i is a feature matrix and m is the number of feature matrices;

Σ = (1/m)·Σ_{i=1..m} (x_i − μ)(x_i − μ)^T

where x_i is a feature matrix, μ is the data expectation, and m is the number of feature matrices;
step 8: plot the classification results of the classification model as an ROC curve; the ROC curve serves as an evaluation index of the classifier and as the basis for computing the AUC;
step 9: from the ROC curve obtained in step 8, compute the AUC score of the classifier as the score of the model evaluation, expressed as:

AUC = (Σ_{i∈positives} rank_i − N(N + 1)/2) / (M·N)

where M is the number of negative samples and N the number of positive samples; their product M·N is the number of pairwise positive-negative orderings; the ranks of the positive samples are summed, rank_i being the 1-based rank of positive sample i when all samples are sorted in ascending order of score (one more than the number of samples with a lower score). If the score is not satisfactory, repeat steps 4 to 9.
Step 10: if the neural network model obtained in step 9 reaches the required AUC score (set according to actual requirements), it is taken as the final detection model; that is, training of the neural network model using the data enhancement method with prior defect characteristics and the SSPCAB self-supervised attention mechanism is complete.
The embodiments of this self-supervised data enhancement method combined with an attention mechanism can be applied to any device with data processing capability, such as a computer. The apparatus embodiments may be implemented in software, in hardware, or in a combination of hardware and software. Taking a software implementation as an example, the apparatus in the logical sense is formed by the processor of the device reading the corresponding computer program instructions from non-volatile memory into memory and executing them.
The foregoing description covers only the preferred embodiments of the present disclosure and the principles of the technology employed. Those skilled in the art will appreciate that the scope of the invention in the embodiments of the present disclosure is not limited to the specific combination of the above technical features, and also encompasses other technical solutions formed by any combination of the above technical features or their equivalents without departing from the spirit of the invention, for example solutions in which the above features are replaced by features with similar functions disclosed (but not limited to those disclosed) in the embodiments of the present disclosure.
Claims (5)
1. A data enhancement method based on prior defect characteristics and the SSPCAB attention mechanism, comprising the following steps:
step 1: establish an industrial defect detection dataset; the dataset comprises a test set and a training set, where the test set contains a certain proportion of normal samples and defect samples, and the training set contains only a certain number of normal samples;
step 2: establish a prior knowledge base of industrial product defects; through on-site investigation and analysis of the industrial site, the shape and type information of defects that frequently occur on different industrial products is obtained as prior knowledge;
step 3: apply a data enhancement strategy with prior knowledge: a rectangular patch is obtained by cutting a normal image, the patch is fused with a prior defect shape to obtain a new defect patch that has the defect shape but the characteristics of the original sample, and the patch is pasted back into the original image to generate a simulated abnormal image;
step 4: taking normal samples and the simulated abnormal samples generated from them as input, train an image classifier able to detect abnormal images using the simulated and normal images, based on a self-supervised training method; the classifier has four convolutional layers with channel sizes 64, 128, 256 and 512; using ReLU activation functions and two pooling layers, the image is reduced to a 512×1 feature vector, and finally softmax converts the output into a probability distribution;
step 5: build a neural network incorporating a self-supervised attention mechanism;
the self-supervised attention mechanism has two parts: (1) the masked convolution layer uses a 3×3 convolution kernel with a masked region M of size 1 at its center, which does not participate in the computation during convolution; the learnable parameters of the masked convolution lie in sub-kernels K_i at the four corners of the receptive field, where k′ ∈ N^+ is a hyperparameter defining the sub-kernel size, C is the number of input channels, d is the distance between a sub-kernel K_i and the masked region M, and M ∈ R^{1×1×C}; the predicted value of M is the sum of the K_i responses; (2) a channel attention mechanism is fused after the convolution layer: the features extracted by the convolution layer undergo a Squeeze compression operation that aggregates the feature maps across the spatial dimensions H×W to produce a channel descriptor, i.e. H×W×C → 1×1×C; global spatial information is thereby compressed into channel descriptors that subsequent layers can use; Squeeze is implemented as global average pooling, expressed as:

z_c = F_sq(u_c) = (1/(H×W)) Σ_{i=1..H} Σ_{j=1..W} u_c(i, j)

where H is the height of the feature map, W is its width, and u_c is the converted feature matrix of channel c; the number of channels C is kept unchanged while the features are compressed into a channel descriptor;
the gating mechanism is then parameterized by a bottleneck of two fully connected layers, W_1 for dimension reduction and W_2 for dimension restoration, expressed as:

s = F_ex(z, W) = σ(g(z, W)) = σ(W_2 δ(W_1 z))

where W_1 ∈ R^{(C/r)×C} and W_2 ∈ R^{C×(C/r)} are fully connected layers, and r is a scaling parameter (here set to 16) that reduces the number of channels to lower the computational cost; δ is a ReLU layer, which leaves the output dimension unchanged; after multiplication by W_2, s is finally obtained through a sigmoid function;
after this module is inserted after the last convolutional layer of the backbone network, its parameters are updated synchronously with the training of the network;
step 6: repeat steps 4 and 5 until the objective function of the neural network converges, obtaining the optimal classification network model;
step 7: evaluate the classification model obtained in step 6 on the test set, classifying by computing the anomaly score of each image through Gaussian density estimation;
step 8: plot the classification results of the classification model as an ROC curve; the ROC curve serves as an evaluation index of the classifier and as the basis for computing the AUC;
step 9: from the ROC curve obtained in step 8, compute the AUC score of the classifier as the score of the model evaluation, expressed as:

AUC = (Σ_{i∈positives} rank_i − N(N + 1)/2) / (M·N)

where M is the number of negative samples and N the number of positive samples; their product M·N is the number of pairwise positive-negative orderings; the ranks of the positive samples are summed, rank_i being the 1-based rank of positive sample i when all samples are sorted in ascending order of score (one more than the number of samples with a lower score); if the score is not satisfactory, repeat steps 4 to 9;
step 10: if the neural network model obtained in step 9 reaches the required AUC score (set according to actual requirements), it is taken as the final detection model; that is, training of the neural network model using the data enhancement method with prior defect characteristics and the SSPCAB self-supervised attention mechanism is complete.
2. The data enhancement method based on prior defect characteristics and the SSPCAB attention mechanism according to claim 1, wherein the abnormal image generation strategy of step 3 has two working modes according to the size of the simulated anomaly: in the first working mode, during cutting, the size of the rectangle to be cut is determined as a certain proportion of the size of the original image; in the second working mode, a numerical range far smaller than the image size is given, and values are randomly selected from that range as the side lengths of the cut rectangle, so as to generate tiny abnormal patches; these two different anomaly simulations enrich the data types.
3. The data enhancement method based on prior defect characteristics and the SSPCAB attention mechanism according to claim 1, wherein the normal samples and simulated abnormal samples input to the self-supervised training method of step 4 support two working modes according to the classification scheme: in the first working mode, the classification task contains only two categories, normal and abnormal; normal images are labeled 0, and for each abnormal image one of the tiny-anomaly and standard-anomaly simulations is randomly selected during data enhancement and the image is labeled 1; in the second working mode, the classification task contains three categories (normal images, standard anomalies and tiny anomalies); both anomaly types are generated simultaneously during data enhancement, and normal images are labeled 0, standard anomalies 1 and tiny anomalies 2; through these different classification tasks a better-performing classifier is trained.
4. The data enhancement method based on prior defect characteristics and the SSPCAB attention mechanism according to claim 1, wherein in step 7 the Gaussian density estimator performs anomaly detection on the images and computes their anomaly scores, the input images carrying only image-level labels and no pixel-level labels; in the evaluation stage, the features extracted by the last convolutional layer of the model are taken as output and fed to the Gaussian density estimator, and the anomaly score of the features is computed, expressed as:

score(x) = (x − μ)^T Σ^{−1} (x − μ)

where x is the input feature, μ is the mathematical expectation, and Σ is an n-order matrix obtained during learning; they are respectively expressed as:

μ = (1/m)·Σ_{i=1..m} x_i

where x_i is a feature matrix and m is the number of feature matrices;

Σ = (1/m)·Σ_{i=1..m} (x_i − μ)(x_i − μ)^T

where x_i is a feature matrix, μ is the data expectation, and m is the number of feature matrices.
5. The data enhancement method based on prior defect characteristics and the SSPCAB attention mechanism according to claim 1, wherein the objective function in step 6 is expressed as:

L_mask = −ξ(1 + p_t)^{γ_hc}·log(p_t) + (1 − ξ)·FL + α·(G(X) − X)²

where p_t is the predicted probability; FL is the focal loss; γ_hc is a hyperparameter; ξ is a time-varying parameter assigned different weights in different training periods, whose goal is to combine the early-training focus on confidently predicted classes with the mid-training focus on misclassified hard samples; α is a defined weight hyperparameter, generally taken as 0.1; G is the self-supervised attention module and X is its input tensor; in detail, the parameter ξ, which varies with the training epoch, is defined as:

ξ = 1 − τ·t/T, if τ·t/T ≤ 1;  ξ = (τ·t/T − 1)/(τ − 1), otherwise

where τ is a hyperparameter with τ ≥ 1, used to divide the total number of epochs; T is the total number of epochs; t is the current epoch. For example, if τ is set to 2, the schedule has an inverted-triangle shape: ξ decreases from 1 to 0 in the first half of training and increases from 0 back to 1 in the second half.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202310912045.5A CN116912625A (en) | 2023-07-25 | 2023-07-25 | Data enhancement method based on priori defect characteristics and SSPCAB attention mechanism |
Publications (1)
Publication Number | Publication Date |
---|---|
CN116912625A (en) | 2023-10-20
Family
ID=88352806
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202310912045.5A Pending CN116912625A (en) | 2023-07-25 | 2023-07-25 | Data enhancement method based on priori defect characteristics and SSPCAB attention mechanism |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN116912625A (en) |
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN117557872A (en) * | 2024-01-12 | 2024-02-13 | 苏州大学 | Unsupervised anomaly detection method and device for optimizing storage mode |
CN117557872B (en) * | 2024-01-12 | 2024-03-22 | 苏州大学 | Unsupervised anomaly detection method and device for optimizing storage mode |
CN118485592A (en) * | 2024-07-05 | 2024-08-13 | 华侨大学 | Low-illumination phase contrast cell microscopic image enhancement method based on multi-scale transducer |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN114627383B (en) | Small sample defect detection method based on metric learning | |
CN111444939B (en) | Small-scale equipment component detection method based on weak supervision cooperative learning in open scene of power field | |
CN116912625A (en) | Data enhancement method based on priori defect characteristics and SSPCAB attention mechanism | |
CN109740676B (en) | Object detection and migration method based on similar targets | |
CN116310785B (en) | Unmanned aerial vehicle image pavement disease detection method based on YOLO v4 | |
CN112434586B (en) | Multi-complex scene target detection method based on domain self-adaptive learning | |
CN109284779A (en) | Object detection method based on deep full convolution network | |
CN111199543A (en) | Refrigerator-freezer surface defect detects based on convolutional neural network | |
CN115136209A (en) | Defect detection system | |
CN115880298A (en) | Glass surface defect detection method and system based on unsupervised pre-training | |
CN115526847A (en) | Mainboard surface defect detection method based on semi-supervised learning | |
CN113591948A (en) | Defect pattern recognition method and device, electronic equipment and storage medium | |
Cui et al. | Real-time detection of wood defects based on SPP-improved YOLO algorithm | |
CN117036243A (en) | Method, device, equipment and storage medium for detecting surface defects of shaving board | |
CN116563250A (en) | Recovery type self-supervision defect detection method, device and storage medium | |
CN118212196B (en) | Industrial defect detection method based on image restoration | |
Ruediger-Flore et al. | CAD-based data augmentation and transfer learning empowers part classification in manufacturing | |
CN116912144A (en) | Data enhancement method based on discipline algorithm and channel attention mechanism | |
CN117911399A (en) | Light-weight multi-scale aluminum profile surface defect detection method | |
CN115294392B (en) | Visible light remote sensing image cloud removal method and system based on network model generation | |
KR102494829B1 (en) | Structure damage evaluation method for using the convolutional neural network, and computing apparatus for performing the method | |
CN113808079B (en) | Industrial product surface defect self-adaptive detection method based on deep learning model AGLNet | |
CN115601610A (en) | Fabric flaw detection method based on improved EfficientDet model | |
US20230084761A1 (en) | Automated identification of training data candidates for perception systems | |
Chen et al. | Noise-assisted data enhancement promoting image classification of municipal solid waste |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | |
SE01 | Entry into force of request for substantive examination | |