CN116912625A - Data enhancement method based on priori defect characteristics and SSPCAB attention mechanism - Google Patents

Data enhancement method based on priori defect characteristics and SSPCAB attention mechanism

Info

Publication number
CN116912625A
Authority
CN
China
Prior art keywords
image
defect
sample
training
neural network
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310912045.5A
Other languages
Chinese (zh)
Inventor
杨文翰
汤晟楠
王正松
杨乐
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Northeastern University Qinhuangdao Branch
Original Assignee
Northeastern University Qinhuangdao Branch
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Northeastern University Qinhuangdao Branch
Priority to CN202310912045.5A
Publication of CN116912625A
Legal status: Pending

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/77Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
    • G06V10/774Generating sets of training patterns; Bootstrap methods, e.g. bagging or boosting
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90Details of database functions independent of the retrieved data types
    • G06F16/901Indexing; Data structures therefor; Storage structures
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/0464Convolutional networks [CNN, ConvNet]
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/048Activation functions
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • G06N3/0895Weakly supervised learning, e.g. semi-supervised or self-supervised learning
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/0002Inspection of images, e.g. flaw detection
    • G06T7/0004Industrial image inspection
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/764Arrangements for image or video recognition or understanding using pattern recognition or machine learning using classification, e.g. of video objects
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/77Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
    • G06V10/776Validation; Performance evaluation
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/82Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20081Training; Learning
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20084Artificial neural networks [ANN]
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20112Image segmentation details
    • G06T2207/20132Image cropping
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20212Image combination
    • G06T2207/20221Image fusion; Image merging
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02PCLIMATE CHANGE MITIGATION TECHNOLOGIES IN THE PRODUCTION OR PROCESSING OF GOODS
    • Y02P90/00Enabling technologies with a potential contribution to greenhouse gas [GHG] emissions mitigation
    • Y02P90/30Computing systems specially adapted for manufacturing

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Evolutionary Computation (AREA)
  • Software Systems (AREA)
  • Artificial Intelligence (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Computing Systems (AREA)
  • Databases & Information Systems (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Multimedia (AREA)
  • Medical Informatics (AREA)
  • Data Mining & Analysis (AREA)
  • General Engineering & Computer Science (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Molecular Biology (AREA)
  • Mathematical Physics (AREA)
  • Quality & Reliability (AREA)
  • Image Analysis (AREA)

Abstract

The invention provides a data enhancement method based on prior defect characteristics and the SSPCAB attention mechanism, and relates to the field of industrial defect detection. A rectangular region of random size is cut from a random position of an input normal sample image. The defect types and shapes of the industrial product are collected as prior knowledge, and a mask conversion is applied to the rectangle to obtain an image patch whose shape resembles a defect of the corresponding product. The patch is rotated by a random angle, subjected to random color jitter, and pasted at a random position of the original image to obtain a simulated defect image. The simulated defect images and the normal images are input into a ResNet-18 neural network into which a self-supervised attention module (SSPCAB) is integrated. The cyclical focal loss (CFL) is used as the loss function of the network, and the mean square error loss produced by the attention module is added to it with a weight to form the objective function, finally yielding a defect detection model.

Description

Data enhancement method based on priori defect characteristics and SSPCAB attention mechanism
Technical Field
The invention relates to the technical field of defect detection, and in particular to a data enhancement method based on prior defect characteristics and the SSPCAB attention mechanism.
Background
Surface defect detection for industrial products has long been an active research topic and is widely used in industrial processes such as printing, glass manufacturing and textiles. Its purpose is to replace manual inspection by automatically detecting and locating defects such as scratches, cracks, dust and ink stains on the product surface. With the rapid development of deep learning, its performance now far exceeds that of traditional algorithms in most settings. However, a major difference between deep learning and traditional algorithms is that deep learning must be driven by large amounts of data, such as the COCO dataset for object detection and the ImageNet dataset for image classification. Driven by such large datasets, the performance of deep learning models on object detection, semantic segmentation and classification tasks improves greatly, so data acquisition is crucial for deep learning algorithms.
However, data acquisition in industry currently faces several problems: 1. Defective (abnormal) samples are difficult to obtain, because the defect rate in actual production is extremely low, and when abnormal samples do occur they are often large numbers of repetitive, uninformative samples; sometimes no usable abnormal samples exist at all. 2. In actual production the position of an anomaly is usually random. Even if a sufficiently large dataset is collected, there is no guarantee that anomalies appear at every position of the image, so a model driven by that dataset is not guaranteed to detect well against varied backgrounds. 3. Once a large number of abnormal samples has been obtained, they must be annotated. Compared with the objects in training sets such as COCO, defects arising in industrial production are very small and hard to find against a complex background, so annotation often consumes a great deal of labour and resources.
Although research on self-supervised and unsupervised defect detection methods has continued in recent years, some basic problems remain. Current embedding-based anomaly detection techniques use the distance between the embedding vector of a test sample and those of normal samples as the criterion for computing the anomaly score, and train a network to improve its feature extraction ability in order to locate the abnormal region. However, by the principle of this kind of anomaly detection, the features extracted by the model must be matched against one another, which severely limits computation speed. Reconstruction-based methods detect and locate anomalies through the error produced when reconstructing the image; autoencoders and generative adversarial networks are often used for this. The autoencoder is trained with an adversarial loss, and the anomaly score is computed from the reconstruction error. Because neural networks have strong learning capacity, even abnormal images can be reconstructed with little loss. This violates the premise of the reconstruction approach, namely that the abnormal region of an abnormal image cannot be fully reconstructed and can therefore be distinguished from the original image, and so the approach fails. Anomaly detection based on data enhancement requires no data annotation and learns a feature extractor from unlabeled data. A common method, for example, randomly copies a small rectangular region from the input image and pastes it at a random position in the image to simulate an abnormal sample. Structural anomalies are created by pasting rectangular patches of different sizes, aspect ratios and rotation angles.
Most current data enhancement methods cannot simulate real defects well and only generate abnormal samples by introducing irregularities into the image. A data enhancement method closer to real defects is therefore needed to generate simulation data and thus drive a better-performing neural network model.
Disclosure of Invention
To address the shortcomings of the prior art, the invention provides a data enhancement method based on prior defect characteristics and the SSPCAB attention mechanism, which comprises the following steps:
Step 1: establish an industrial defect detection data set. The data set comprises a test set and a training set; the test set contains normal samples and defect samples in a certain proportion, and the training set contains only a certain number of normal samples;
Step 2: establish a prior knowledge base of industrial product defects. Through field investigation and analysis at the industrial site, the shapes and types of defects that frequently occur in different industrial products are collected as prior knowledge;
Step 3: apply a data enhancement strategy with prior knowledge. A rectangular patch is cut from a normal image, the patch is fused with a prior defect shape to obtain a new defect patch that has the defect shape and the characteristics of the original sample, and the patch is pasted back onto the original image to generate a simulated abnormal image;
the data enhancement strategy with prior knowledge has two working modes according to the size of the simulated anomaly. In the first mode, the size of the rectangle to be cut is determined as a proportion of the original image size. In the second mode, a numerical range far smaller than the image size is given, and values drawn at random from this range are used as the size of the cut rectangle, generating a tiny abnormal patch. The two kinds of simulated anomaly enrich the variety of the data;
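The cut, mask, rotate, jitter and paste sequence of step 3 can be sketched roughly as follows. This is only an illustration under stated assumptions, not the patented implementation: the helper names (`sample_patch_size`, `ellipse_mask`, `simulate_defect`), the numeric ranges, the elliptical "prior shape", the restriction of rotation to multiples of 90 degrees and the brightness-only jitter are all hypothetical choices made to keep the example short and dependent only on NumPy.

```python
import numpy as np

def sample_patch_size(img_h, img_w, mode, rng,
                      ratio_range=(0.02, 0.15), tiny_range=(2, 16)):
    """Pick the (height, width) of the rectangle to cut.

    ratio_range and tiny_range are illustrative values, not from the source.
    """
    if mode == "standard":
        # Mode 1: the size is a proportion of the original image size.
        area = rng.uniform(*ratio_range) * img_h * img_w
        aspect = rng.uniform(0.3, 3.3)
        h, w = np.sqrt(area * aspect), np.sqrt(area / aspect)
    elif mode == "tiny":
        # Mode 2: side lengths come from a small fixed pixel range.
        h = rng.integers(tiny_range[0], tiny_range[1] + 1)
        w = rng.integers(tiny_range[0], tiny_range[1] + 1)
    else:
        raise ValueError(f"unknown mode: {mode}")
    side = min(img_h, img_w)  # keeps the patch pasteable after rotation
    h = max(1, min(int(round(float(h))), side))
    w = max(1, min(int(round(float(w))), side))
    return h, w

def ellipse_mask(h, w):
    """Hypothetical prior defect shape: a filled ellipse as a boolean mask."""
    yy, xx = np.mgrid[0:h, 0:w]
    cy, cx = (h - 1) / 2.0, (w - 1) / 2.0
    return ((yy - cy) / (h / 2.0)) ** 2 + ((xx - cx) / (w / 2.0)) ** 2 <= 1.0

def simulate_defect(image, mode, rng, shape_fn=ellipse_mask):
    """Return a copy of `image` (8-bit values assumed) with one simulated defect."""
    img_h, img_w = image.shape[:2]
    h, w = sample_patch_size(img_h, img_w, mode, rng)
    # 1. Cut a rectangle at a random position.
    y0 = int(rng.integers(0, img_h - h + 1))
    x0 = int(rng.integers(0, img_w - w + 1))
    patch = image[y0:y0 + h, x0:x0 + w].astype(np.float32)
    # 2. Mask conversion: keep only pixels inside the prior defect shape.
    mask = shape_fn(h, w)
    # 3. Random rotation (multiples of 90 degrees as a simple stand-in).
    k = int(rng.integers(0, 4))
    patch, mask = np.rot90(patch, k), np.rot90(mask, k)
    # 4. Random color jitter (brightness scaling only in this sketch).
    patch = np.clip(patch * rng.uniform(0.7, 1.3), 0, 255)
    # 5. Paste the shaped patch at a random position of the original image.
    ph, pw = patch.shape[:2]
    y1 = int(rng.integers(0, img_h - ph + 1))
    x1 = int(rng.integers(0, img_w - pw + 1))
    out = image.copy()
    region = out[y1:y1 + ph, x1:x1 + pw]  # view into `out`
    region[mask] = patch[mask].astype(image.dtype)
    return out
```

Calling `simulate_defect(image, "standard", rng)` or `simulate_defect(image, "tiny", rng)` with `rng = np.random.default_rng()` gives one simulated abnormal image per call; in practice the shape function would be drawn from the prior knowledge base of step 2 rather than fixed to an ellipse.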
Step 4: take the normal samples and the simulated abnormal samples generated from them as input and, using a self-supervised training method, train on the simulated and normal images an image classifier capable of detecting abnormal images. The classifier has four convolution layers with channel sizes 64, 128, 256 and 512; ReLU activation functions and two pooling layers reduce the feature map to 512 × 1, and softmax finally converts the output into a probability distribution;
the classifier has two working modes according to the classification scheme. In the first mode, the classification task has only two classes, normal and abnormal: normal images are labeled 0, and in each data enhancement one of the tiny anomaly and the standard anomaly is chosen at random for simulation and labeled 1. In the second mode, the task has three classes, normal, standard anomaly and tiny anomaly: both a standard-anomaly image and a tiny-anomaly image are generated during data enhancement, with normal images labeled 0, standard anomalies labeled 1 and tiny anomalies labeled 2. Training on these different classification tasks yields a better-performing classifier;
Step 5: incorporate a self-supervised attention mechanism into the neural network;
the self-supervised attention mechanism consists of two parts. (1) A masked convolution layer. It uses a convolution kernel of size 3 × 3 with a masked region M of size 1 at its centre; M does not take part in the convolution. The learnable parameters of the masked convolution are the sub-kernels $K_i$ located at the four corners of the receptive field, where $k' \in \mathbb{N}^+$ is a hyperparameter defining the sub-kernel size, C is the number of input channels, and d is the distance between a sub-kernel $K_i$ and the masked region M. $M \in \mathbb{R}^{1 \times 1 \times C}$, and its predicted value is the sum over the $K_i$. (2) A channel attention mechanism fused after the convolution layer. The features extracted by the convolution layer undergo a Squeeze operation, which aggregates the feature maps across the spatial dimensions H × W to produce a channel descriptor, i.e. H × W × C → 1 × 1 × C. Global spatial information is thereby compressed into the channel descriptor so that it can be used by the layers that take it as input. Squeeze is implemented as global average pooling, expressed as:
$$z_c = F_{sq}(u_c) = \frac{1}{H \times W}\sum_{i=1}^{H}\sum_{j=1}^{W} u_c(i,j)$$
where H is the height of the feature map, W is its width, and $u_c$ is the feature matrix of channel c; after the conversion the number of channels C is unchanged and the features are compressed into a channel descriptor;
the gating mechanism is then parameterized by a bottleneck of two fully connected layers, $W_1$ for dimension reduction and $W_2$ for dimension increase, expressed as:
$$s = F_{ex}(z, W) = \sigma(g(z, W)) = \sigma(W_2\,\delta(W_1 z))$$
where $W_1$ and $W_2$ are fully connected layers; r is a scaling parameter, here set to 16, by which $W_1$ reduces the number of channels to lower the amount of computation; $\delta$ is a ReLU layer, which leaves the output dimension unchanged; the result is multiplied by $W_2$ and passed through a sigmoid function $\sigma$ to obtain s;
the module is inserted at the last convolutional layer of the backbone neural network, and its parameters are updated together with the network during training;
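To make the two parts of the attention module concrete, here is a minimal NumPy forward-pass sketch under simplifying assumptions: a single image in height × width × channel layout, sub-kernel size 1 with distance 0 (so the 3 × 3 receptive field uses only its four corner positions), zero padding at the border, and no bias terms. The function names and the weight shapes are hypothetical; a real implementation would live in the training framework so that the weights are learned.

```python
import numpy as np

def relu(a):
    return np.maximum(a, 0.0)

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def masked_conv3x3(x, corner_kernels):
    """Masked convolution over a 3x3 receptive field.

    x: (H, W, C) feature map.
    corner_kernels: (4, C, C_out), one weight matrix per corner sub-kernel.
    The centre (masked region M) and the edge positions are never read.
    """
    H, W, _ = x.shape
    xp = np.pad(x, ((1, 1), (1, 1), (0, 0)))       # zero padding (assumed)
    offsets = [(0, 0), (0, 2), (2, 0), (2, 2)]     # four corners
    out = np.zeros((H, W, corner_kernels.shape[2]))
    for (dy, dx), k in zip(offsets, corner_kernels):
        out += xp[dy:dy + H, dx:dx + W, :] @ k     # sum of the K_i responses
    return out

def channel_attention(u, w1, w2):
    """Squeeze (global average pooling) followed by the two-layer gate."""
    z = u.mean(axis=(0, 1))            # H x W x C -> C channel descriptor
    s = sigmoid(w2 @ relu(w1 @ z))     # w1: (C // r, C), w2: (C, C // r)
    return u * s                       # rescale each channel by its gate

def sspcab(x, corner_kernels, w1, w2):
    """Return the module output G(X) and the mean squared error (G(X) - X)^2."""
    g = channel_attention(masked_conv3x3(x, corner_kernels), w1, w2)
    recon_loss = float(np.mean((g - x) ** 2))
    return g, recon_loss
```

With C = 512 channels and r = 16, `w1` has shape (32, 512) and `w2` has shape (512, 32). The returned reconstruction loss is the term that is weighted and added to the classification loss in the objective function.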
Step 6: repeat steps 4 and 5 until the objective function of the neural network converges, obtaining the optimal classification neural network model;
the objective function is expressed as:
$$L_{mask} = -\xi\,(1+p_t)^{\gamma_{hc}}\log(p_t) + (1-\xi)\,FL + \alpha\,(G(X)-X)^2$$
where $p_t$ is the predicted value; FL is the focal loss; $\gamma_{hc}$ is a hyperparameter; $\xi$ is a time-varying parameter that takes different weights in different training periods, so that the loss emphasises confidently predicted classes early in training and misclassified hard samples in the middle of training. $\alpha$ is a weighting hyperparameter, generally set to 0.1; G is the self-supervised attention module and X is the tensor of the self-supervised attention module. In detail, the parameter $\xi$, which varies with the training epoch, is defined as:
where $\tau$ is a hyperparameter with $\tau \geq 1$ that divides the total number of epochs; T is the total number of epochs; t is the current epoch. For example, if $\tau$ is set to 2, the schedule is an inverted triangle: $\xi$ goes from 1 to 0 in the first half of training and from 0 back to 1 in the second half;
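The following sketch shows one way the epoch-dependent weight and the objective could be computed for a single sample. It is hedged in two places. First, the piecewise-linear form of `xi_schedule` is an assumption borrowed from the usual cyclical focal loss formulation; it is chosen because it reproduces the behaviour described for τ = 2 (1 → 0 in the first half, 0 → 1 in the second). Second, the focal loss term uses the standard form −(1 − p_t)^γ log(p_t) with an assumed exponent `gamma_fl`. The function names and default values are hypothetical.

```python
import math

def xi_schedule(t, T, tau=2.0):
    """Epoch-dependent weight xi (assumed piecewise-linear form).

    t: current epoch, T: total number of epochs, tau >= 1.
    """
    if tau < 1.0:
        raise ValueError("tau must be >= 1")
    pos = tau * t / T
    if pos <= 1.0:
        return 1.0 - pos                  # falls from 1 to 0
    return (pos - 1.0) / (tau - 1.0)      # rises from 0 back to 1

def cfl_objective(p_t, t, T, recon_sq_err,
                  tau=2.0, gamma_hc=2.0, gamma_fl=2.0, alpha=0.1):
    """Objective for one sample.

    p_t: predicted probability of the true class (0 < p_t <= 1).
    recon_sq_err: squared error (G(X) - X)^2 from the attention module.
    gamma_hc, gamma_fl and tau defaults are illustrative; alpha = 0.1
    follows the value given in the text.
    """
    xi = xi_schedule(t, T, tau)
    high_conf = -((1.0 + p_t) ** gamma_hc) * math.log(p_t)   # confident-sample term
    focal = -((1.0 - p_t) ** gamma_fl) * math.log(p_t)       # standard focal loss
    return xi * high_conf + (1.0 - xi) * focal + alpha * recon_sq_err
```

For example, with T = 100 and τ = 2 the weight is 1.0 at epoch 0, 0.5 at epoch 25, 0.0 at epoch 50 and 1.0 again at epoch 100, so the focal term dominates around the middle of training.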
Step 7: evaluate the classification model obtained in step 6 on the test set, computing the anomaly score of each image with a Gaussian density estimator in order to classify it;
the Gaussian density estimator performs anomaly detection on the image and computes the anomaly score; the input images carry only image-level labels and no pixel-level labels. In the evaluation stage, the features extracted by the last convolution layer of the model are taken as output and fed to the Gaussian density estimator, and the anomaly score of the features is computed as:
where x is the input feature, μ is the mathematical expectation, and Σ is an n-order matrix obtained during learning, which are respectively expressed as:
where x is a feature matrix and m is the number of feature matrices;
where x is a feature matrix, μ is the expectation of the data, and m is the number of feature matrices;
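As a rough illustration of the Gaussian density estimator, the sketch below fits a mean vector and covariance matrix to features of normal samples and scores a new feature by its squared Mahalanobis distance. The exact score expression and normalisation used in the source are not recoverable from the text, so the choices here (division by m, a small ridge term `eps` for numerical stability, the factor 1/2, and the function names) are assumptions.

```python
import numpy as np

def fit_gaussian(features, eps=1e-6):
    """Estimate mu and Sigma from m feature vectors of dimension n.

    features: array of shape (m, n) taken from normal training samples.
    eps: small ridge added to the diagonal (assumed, for invertibility).
    """
    m, n = features.shape
    mu = features.mean(axis=0)
    centered = features - mu
    sigma = centered.T @ centered / m + eps * np.eye(n)
    return mu, sigma

def anomaly_score(x, mu, sigma):
    """Half the squared Mahalanobis distance of x from the fitted Gaussian.

    Larger values mean x is less likely under the normal-sample model.
    """
    diff = x - mu
    return float(0.5 * diff @ np.linalg.solve(sigma, diff))
```

In use, `fit_gaussian` would be called once on the features of the normal training images, and `anomaly_score` on each test image's feature vector; thresholding or ranking these scores gives the image-level classification.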
Step 8: plot the classification results of the classification model as an ROC curve, which serves as an evaluation index of the classifier and as the basis for computing the AUC;
Step 9: from the ROC curve obtained in step 8, compute the AUC score of the classifier as the evaluation score of the model, expressed as:
where M is the number of negative samples and N is the number of positive samples; their product is the total number of positive-negative sample pairs; the ranks of the positive samples are computed and summed, where $rank_i$ is the number of samples whose score is less than that of sample i. If the score is not satisfactory, repeat steps 4 to 9.
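A rank-based AUC computation in the spirit of step 9 can be sketched as below. The closed form used here is the common rank-sum identity, with 1-based ascending ranks and average ranks for tied scores; whether the source uses exactly this convention for the rank is an assumption, and the function name is hypothetical.

```python
import numpy as np

def auc_from_ranks(scores, labels):
    """AUC via the rank-sum identity.

    scores: anomaly scores, higher means more anomalous.
    labels: 1 for positive (defect) samples, 0 for negative (normal) samples.
    """
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels)
    n = len(scores)
    order = np.argsort(scores, kind="mergesort")
    sorted_scores = scores[order]
    ranks = np.empty(n)
    i = 0
    while i < n:
        j = i
        # Extend j over a run of tied scores and give them the average rank.
        while j + 1 < n and sorted_scores[j + 1] == sorted_scores[i]:
            j += 1
        ranks[order[i:j + 1]] = (i + j) / 2.0 + 1.0
        i = j + 1
    n_pos = int((labels == 1).sum())   # N in the text
    n_neg = n - n_pos                  # M in the text
    pos_rank_sum = ranks[labels == 1].sum()
    return float((pos_rank_sum - n_pos * (n_pos + 1) / 2.0) / (n_pos * n_neg))
```

The result equals the fraction of positive-negative pairs in which the positive sample receives the higher score, with ties counted as one half.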
Step 10: if the neural network model obtained in step 9 reaches the required AUC score (set according to actual requirements), it is taken as the final detection model; that is, training of the neural network model using the data enhancement method with prior defect characteristics and the SSPCAB self-supervised attention mechanism is complete.
Further, the abnormal image generation strategy in step 3 has two working modes according to the size of the simulated anomaly. In the first mode, the size of the rectangle to be cut is determined as a proportion of the original image size. In the second mode, a numerical range far smaller than the image size is given, and values drawn at random from this range are used as the size of the cut rectangle, generating a tiny abnormal patch. The two kinds of simulated anomaly enrich the variety of the data;
Further, the normal samples and simulated abnormal samples input to the self-supervised training method in step 4 are handled in two working modes according to the classification scheme. In the first mode, the classification task has only two classes, normal and abnormal: normal images are labeled 0, and in each data enhancement one of the tiny anomaly and the standard anomaly is chosen at random for simulation and labeled 1. In the second mode, the task has three classes, normal, standard anomaly and tiny anomaly: both a standard-anomaly image and a tiny-anomaly image are generated during data enhancement, with normal images labeled 0, standard anomalies labeled 1 and tiny anomalies labeled 2. Training on these different classification tasks yields a better-performing classifier;
Further, in step 7 the Gaussian density estimator performs anomaly detection on the image and computes the anomaly score; the input images carry only image-level labels and no pixel-level labels. In the evaluation stage, the features extracted by the last convolution layer of the model are taken as output and fed to the Gaussian density estimator, and the anomaly score of the features is computed as:
where x is the input feature, μ is the mathematical expectation, and Σ is an n-order matrix obtained during learning, which are respectively expressed as:
where x is a feature matrix and m is the number of feature matrices;
where x is a feature matrix, μ is the expectation of the data, and m is the number of feature matrices;
Further, the objective function in step 6 is expressed as:
$$L_{mask} = -\xi\,(1+p_t)^{\gamma_{hc}}\log(p_t) + (1-\xi)\,FL + \alpha\,(G(X)-X)^2$$
where $p_t$ is the predicted value; FL is the focal loss; $\gamma_{hc}$ is a hyperparameter; $\xi$ is a time-varying parameter that takes different weights in different training periods, so that the loss emphasises confidently predicted classes early in training and misclassified hard samples in the middle of training. $\alpha$ is a weighting hyperparameter, generally set to 0.1; G is the self-supervised attention module and X is the tensor of the self-supervised attention module. In detail, the parameter $\xi$, which varies with the training epoch, is defined as:
where $\tau$ is a hyperparameter with $\tau \geq 1$ that divides the total number of epochs; T is the total number of epochs; t is the current epoch. For example, if $\tau$ is set to 2, the schedule is an inverted triangle: $\xi$ goes from 1 to 0 in the first half of training and from 0 back to 1 in the second half;
The beneficial effects of adopting the above technical scheme are as follows:
The invention provides a data enhancement method based on prior defect characteristics and the SSPCAB attention mechanism. It uses a neural network combined with a self-supervised predictive attention mechanism and adds prior knowledge of industrial defects, collected at the industrial site, to the data enhancement process, so that the generated simulated defect images better match real defect images, the final detection accuracy is higher, and defect detection efficiency is effectively improved.
Drawings
FIG. 1 is a flow chart of the self-supervised data enhancement method combined with an attention mechanism in an embodiment of the present invention.
FIG. 2 is a schematic diagram of the data enhancement strategy in an embodiment of the present invention.
FIG. 3 is a block diagram of the attention mechanism in the present invention.
Detailed Description
The embodiments of the present invention are described in further detail below with reference to the drawings and examples. The following examples illustrate the invention but are not intended to limit its scope.
Traditional industrial defect detection methods are strongly disturbed by external factors in complex scenes; for example, detection is greatly affected when an industrial site has illumination changes, contamination from background light sources, or vibration of production line equipment. Current deep learning methods can avoid these problems to some extent, but they require a large amount of defect data to train the model, which industry currently finds difficult to provide. On this basis, as shown in FIG. 1, the present invention provides a data enhancement method based on prior defect characteristics and the SSPCAB attention mechanism, comprising the following steps:
Step 1: establish an industrial defect detection data set. The data set comprises a test set and a training set; the test set contains normal samples and defect samples in a certain proportion, and the training set contains only a certain number of normal samples;
Step 2: establish a prior knowledge base of industrial product defects. Through field investigation and analysis at the industrial site, the shapes and types of defects that frequently occur in different industrial products are collected as prior knowledge;
Step 3: apply a data enhancement strategy with prior knowledge, as shown in FIG. 2. A rectangular patch is cut from a normal image, the patch is fused with a prior defect shape to obtain a new defect patch that has the defect shape and the characteristics of the original sample, and the patch is pasted back onto the original image to generate a simulated abnormal image;
the data enhancement strategy with prior knowledge has two working modes according to the size of the simulated anomaly. In the first mode, the size of the rectangle to be cut is determined as a proportion of the original image size. In the second mode, a numerical range far smaller than the image size is given, and values drawn at random from this range are used as the size of the cut rectangle, generating a tiny abnormal patch. The two kinds of simulated anomaly enrich the variety of the data;
Step 4: take the normal samples and the simulated abnormal samples generated from them as input and, using a self-supervised training method, train on the simulated and normal images an image classifier capable of detecting abnormal images. The classifier has four convolution layers with channel sizes 64, 128, 256 and 512; ReLU activation functions and two pooling layers reduce the feature map to 512 × 1, and softmax finally converts the output into a probability distribution;
the classifier has two working modes according to the classification scheme. In the first mode, the classification task has only two classes, normal and abnormal: normal images are labeled 0, and in each data enhancement one of the tiny anomaly and the standard anomaly is chosen at random for simulation and labeled 1. In the second mode, the task has three classes, normal, standard anomaly and tiny anomaly: both a standard-anomaly image and a tiny-anomaly image are generated during data enhancement, with normal images labeled 0, standard anomalies labeled 1 and tiny anomalies labeled 2. Training on these different classification tasks yields a better-performing classifier;
step 5: a neural network incorporating a self-supervising attentiveness mechanism; as shown in fig. 3;
the self-supervised attention mechanism has two parts. (1) A masked convolution layer: the convolution kernel has size 3×3, with a masked region M of size 1 at its center that takes no part in the convolution; the learnable parameters of the masked convolution are located at the four corners of the receptive field, as sub-kernels $K_i$. Here $k' \in \mathbb{N}^+$ is a hyperparameter defining the size of the sub-kernels, C is the number of input channels, and d is the distance between a sub-kernel $K_i$ and the masked region M, where $M \in \mathbb{R}^{1 \times 1 \times C}$; the predicted value at M is the sum of the $K_i$ responses. (2) A channel attention mechanism fused after the convolution layer: the features extracted by the convolution layer undergo a Squeeze (compression) operation, in which the feature maps are aggregated across the spatial dimensions H×W to produce a channel descriptor, i.e. H×W×C → 1×1×C. The global spatial information is thereby compressed into channel descriptors that can be used by the layers that take them as input. The Squeeze operation is global average pooling, expressed as:
$$z_c = F_{sq}(u_c) = \frac{1}{H \times W} \sum_{i=1}^{H} \sum_{j=1}^{W} u_c(i, j)$$
where H is the length of the feature map, W is the width of the feature map, and $u_c$ is the converted feature matrix; the number of channels C is kept unchanged and the features are compressed into a channel descriptor;
thereafter, the gating mechanism is parameterized by a bottleneck formed by two fully connected layers, i.e. $W_1$ for dimension reduction and $W_2$ for dimension increase, expressed as:
$$s = F_{ex}(z, W) = \sigma(g(z, W)) = \sigma(W_2\,\delta(W_1 z))$$
where $W_1$ and $W_2$ are fully connected layers. $W_1$ reduces the number of channels by the scaling parameter r, here set to 16, which lowers the amount of computation. δ is a ReLU layer, which leaves the output dimension unchanged; its output is multiplied by $W_2$, and s is finally obtained through a sigmoid function;
the module is inserted at the last convolutional layer of the backbone neural network, and its parameters are then updated together with the network during training;
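A minimal NumPy sketch of the two parts of the module described above is given below for illustration. It assumes sub-kernels of size 1 placed at the four corners of a 3×3 window, zero padding, and randomly initialized weights in any usage; the function names `masked_conv` and `channel_gate` and the weight layouts are assumptions of this sketch, not part of the original disclosure.

```python
import numpy as np


def masked_conv(x, corner_weights):
    """Masked 3x3 convolution in which only the four corner taps are used.

    x: feature map of shape (H, W, C).
    corner_weights: array of shape (4, C, C), one C x C matrix per corner
    (top-left, top-right, bottom-left, bottom-right). The center (the
    masked region M) and the edge positions take no part in the result.
    """
    h, w, _ = x.shape
    padded = np.pad(x, ((1, 1), (1, 1), (0, 0)))  # zero padding keeps H x W
    offsets = [(0, 0), (0, 2), (2, 0), (2, 2)]
    out = np.zeros(x.shape, dtype=float)
    for (dy, dx), k in zip(offsets, corner_weights):
        # Add the response of one corner sub-kernel K_i.
        out += padded[dy:dy + h, dx:dx + w, :] @ k
    return out


def channel_gate(u, w1, w2):
    """Channel attention gate s = sigmoid(W2 @ relu(W1 @ z)).

    u: feature map of shape (H, W, C); w1: (C // r, C); w2: (C, C // r).
    """
    z = u.mean(axis=(0, 1))                    # Squeeze: H x W x C -> C
    hidden = np.maximum(w1 @ z, 0.0)           # W1 reduces C to C // r, then ReLU
    return 1.0 / (1.0 + np.exp(-(w2 @ hidden)))  # W2 restores C, then sigmoid
```

With C = 32 and r = 16, for example, `w1` has shape (2, 32) and `w2` has shape (32, 2), and the returned gate `s` holds one value between 0 and 1 per channel.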
step 6: repeating steps 4 and 5 until the objective function of the neural network converges, to obtain the optimal classification neural network model;
the objective function is expressed as:
wherein p is t Is a predicted value; FL is the focus loss; gamma ray hc Is a super parameter; ζ is a time-varying parameter that is assigned different weights during different training periods, with the goal of combining early-to-confidence predicted class concerns with mid-to-misclassified hard sample class concerns. Alpha is defined weight super parameter, which is generally taken as 0.1, G is self-supervision attention module, X is tensor of self-supervision attention module; in detail, the parameter ζ that varies with the training epoch is defined as:
wherein, tau is a super parameter and tau is more than or equal to 1, which is used for dividing the total epoch; t is the total epoch number; t is the current epoch number. For example, if τ is set to 2, the periodic shape is an inverted triangle, τ changes from 1 to 0 in the first half and from 0 in the second half;
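The defining formula for the epoch-dependent parameter is not reproduced in this text. Purely as an illustration, the sketch below assumes a piecewise-linear schedule chosen to match the behavior described above (for τ = 2, a fall from 1 to 0 over the first half of training followed by a rise from 0); the function name `zeta_schedule` and the exact form of the schedule are assumptions of this sketch.

```python
def zeta_schedule(t, total_epochs, tau=2.0):
    """Assumed piecewise-linear weight for epoch t out of total_epochs.

    tau >= 1 sets where the falling phase ends: the weight falls
    linearly from 1 to 0 until t = total_epochs / tau, then rises
    linearly back toward 1 at t = total_epochs.
    """
    if tau < 1:
        raise ValueError("tau must be >= 1")
    progress = tau * t / total_epochs
    if progress <= 1.0:
        return 1.0 - progress               # falling phase
    return (progress - 1.0) / (tau - 1.0)   # rising phase (only reached when tau > 1)
```

With τ = 1 the rising phase is never reached for 0 ≤ t ≤ T, so the weight simply falls from 1 to 0 over the whole training run.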
step 7: evaluating the classification model obtained in step 6 on the test set, and classifying each image by computing its anomaly score with a Gaussian density estimator;
the Gaussian density estimator detects image anomalies and computes the anomaly score; the input images carry only image-level labels and no pixel-level labels. In the evaluation stage, the features extracted by the last convolution layer of the model are taken as output and fed to the Gaussian density estimator, and the anomaly score of the features is computed as:
where x is the input feature, μ is the mathematical expectation, and Σ is an n-th order matrix obtained during learning; they are respectively expressed as:
where x is a feature matrix and m is the number of feature matrices;
where x is a feature matrix, μ is the expectation of the data, and m is the number of feature matrices;
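The score and parameter formulas are not reproduced in this text. As a hedged illustration of a Gaussian density estimator of this kind, the sketch below estimates μ and Σ from m feature vectors and scores a new feature by its squared Mahalanobis distance, which equals the negative Gaussian log-density up to constants. The 1/m normalization, the small regularization term `eps`, and the function names are assumptions of this sketch.

```python
import numpy as np


def fit_gaussian(features):
    """Estimate the mean and covariance from an (m, n) array of features."""
    m = features.shape[0]
    mu = features.mean(axis=0)
    centered = features - mu
    sigma = centered.T @ centered / m  # n x n covariance (1/m form assumed)
    return mu, sigma


def anomaly_score(x, mu, sigma, eps=1e-6):
    """Squared Mahalanobis distance of feature x from the fitted Gaussian.

    Larger values mean x is less likely under the Gaussian fitted to
    normal samples. eps regularizes sigma so the linear solve is stable.
    """
    diff = x - mu
    reg = sigma + eps * np.eye(sigma.shape[0])
    return float(diff @ np.linalg.solve(reg, diff))
```

An image would then be classified as abnormal when its score exceeds a threshold chosen on validation data, or the raw scores can be used directly to draw the ROC curve.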
step 8: plotting the classification results of the classification model as an ROC curve, which serves as an evaluation index of the classifier and as the basis for computing the AUC;
step 9: computing the AUC score of the classifier from the ROC curve obtained in step 8, as the evaluation score of the model, expressed as:
where M is the number of negative samples and N is the number of positive samples, so that their product is the total number of positive-negative sample pairs; the ranks of the positive samples are summed, where $rank_i$ is the number of samples whose score is lower than that of sample i. If the score is not satisfactory, steps 4-9 are repeated.
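The AUC formula is not reproduced in this text. For illustration, the sketch below uses the standard rank-sum form of the AUC, in which samples are ranked by score (1-based, ties sharing their average rank), the ranks of the N positive samples are summed, N(N+1)/2 is subtracted, and the result is divided by M×N. This specific form and the function name `auc_from_ranks` are assumptions of this sketch.

```python
def auc_from_ranks(scores, labels):
    """AUC from the rank-sum formula; labels are 1 (positive) or 0 (negative)."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    i = 0
    while i < len(order):
        # Find the run of samples that share the same score.
        j = i
        while j + 1 < len(order) and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # average 1-based rank for tied samples
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    n_pos = sum(labels)            # N: number of positive samples
    n_neg = len(labels) - n_pos    # M: number of negative samples
    pos_rank_sum = sum(r for r, y in zip(ranks, labels) if y == 1)
    return (pos_rank_sum - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)
```

For scores [0.1, 0.4, 0.35, 0.8] with labels [0, 0, 1, 1], the positive samples have ranks 2 and 4, giving (6 - 3) / 4 = 0.75.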
Step 10: if the neural network model obtained in step 9 reaches the required AUC score (set according to actual requirements), it is used as the final detection model; that is, training of the neural network model using the data enhancement method based on prior defect features and the SSPCAB self-supervised attention mechanism is complete.
Embodiments of the self-supervised data enhancement method combined with the attention mechanism can be applied to any device with data processing capability, such as a computer. The apparatus embodiments may be implemented in software, in hardware, or in a combination of hardware and software. Taking a software implementation as an example, the apparatus in the logical sense is formed when the processor of the device with data processing capability reads the corresponding computer program instructions from non-volatile memory into memory and runs them.
The foregoing describes only the preferred embodiments of the present disclosure and the technical principles employed. Those skilled in the art will appreciate that the scope of the invention in the embodiments of the present disclosure is not limited to the specific combination of the above technical features, but also covers other technical solutions formed by any combination of the above technical features or their equivalents without departing from the spirit of the invention, for example solutions in which the above features are replaced by (but not limited to) features with similar functions disclosed in the embodiments of the present disclosure.

Claims (5)

1. A data enhancement method based on prior defect features and the SSPCAB attention mechanism, comprising the following steps:
step 1: establishing an industrial defect detection data set; the data set comprises a test set and a training set, wherein the test set comprises a certain proportion of normal samples and defect samples, and the training set only comprises a certain number of normal samples;
step 2: establishing a prior knowledge base of defective industrial products; defect shape information and type information that frequently occur in different industrial products are obtained as prior knowledge through field investigation and analysis at industrial sites;
step 3: data enhancement with prior knowledge: a rectangular patch is cut from a normal image, the patch is fused with a prior defect shape to obtain a new defect patch that has the defect shape and the features of the original sample, and the defect patch is pasted back onto the original image to generate a simulated abnormal image;
step 4: taking normal samples and the simulated abnormal samples generated from them as input, and, using a self-supervised learning training method, training on the simulated images and the normal images an image classifier capable of detecting abnormal images, wherein the classifier has four convolution layers with 64, 128, 256 and 512 channels, uses the ReLU activation function and two pooling layers to reduce the image to a size of 512×1, and finally converts the output into a probability distribution with softmax;
step 5: incorporating a self-supervised attention mechanism into the neural network;
the self-supervised attention mechanism has two parts. (1) A masked convolution layer: the convolution kernel has size 3×3, with a masked region M of size 1 at its center that takes no part in the convolution; the learnable parameters of the masked convolution are located at the four corners of the receptive field, as sub-kernels $K_i$. Here $k' \in \mathbb{N}^+$ is a hyperparameter defining the size of the sub-kernels, C is the number of input channels, and d is the distance between a sub-kernel $K_i$ and the masked region M, where $M \in \mathbb{R}^{1 \times 1 \times C}$; the predicted value at M is the sum of the $K_i$ responses. (2) A channel attention mechanism fused after the convolution layer: the features extracted by the convolution layer undergo a Squeeze (compression) operation, in which the feature maps are aggregated across the spatial dimensions H×W to produce a channel descriptor, i.e. H×W×C → 1×1×C. The global spatial information is thereby compressed into channel descriptors that can be used by the layers that take them as input. The Squeeze operation is global average pooling, expressed as:
$$z_c = F_{sq}(u_c) = \frac{1}{H \times W} \sum_{i=1}^{H} \sum_{j=1}^{W} u_c(i, j)$$
where H is the length of the feature map, W is the width of the feature map, and $u_c$ is the converted feature matrix; the number of channels C is kept unchanged and the features are compressed into a channel descriptor;
the gating mechanism is then parameterized by a bottleneck formed by two fully connected layers, i.e. $W_1$ for dimension reduction and $W_2$ for dimension increase, expressed as:
$$s = F_{ex}(z, W) = \sigma(g(z, W)) = \sigma(W_2\,\delta(W_1 z))$$
where $W_1$ and $W_2$ are fully connected layers. $W_1$ reduces the number of channels by the scaling parameter r, here set to 16, which lowers the amount of computation. δ is a ReLU layer, which leaves the output dimension unchanged; its output is multiplied by $W_2$, and s is finally obtained through a sigmoid function;
the module is inserted at the last convolutional layer of the backbone neural network, and its parameters are then updated together with the network during training;
step 6: repeating steps 4 and 5 until the objective function of the neural network converges, to obtain the optimal classification neural network model;
step 7: evaluating the classification model obtained in step 6 on the test set, and classifying each image by computing its anomaly score with Gaussian density estimation;
step 8: plotting the classification results of the classification model as an ROC curve, which serves as an evaluation index of the classifier and as the basis for computing the AUC;
step 9: computing the AUC score of the classifier from the ROC curve obtained in step 8, as the evaluation score of the model, expressed as:
where M is the number of negative samples and N is the number of positive samples, so that their product is the total number of positive-negative sample pairs; the ranks of the positive samples are summed, where $rank_i$ is the number of samples whose score is lower than that of sample i. If the score is not satisfactory, steps 4-9 are repeated.
Step 10: if the neural network model obtained in step 9 reaches the required AUC score (set according to actual requirements), it is used as the final detection model; that is, training of the neural network model using the data enhancement method based on prior defect features and the SSPCAB self-supervised attention mechanism is complete.
2. The data enhancement method based on prior defect features and the SSPCAB attention mechanism according to claim 1, wherein the abnormal image generation strategy of step 3 has two working modes, depending on the size of the simulated anomaly. In the first mode, the size of the rectangle to be cut is determined as a certain proportion of the size of the original image. In the second mode, a numerical range far smaller than the image size is given, and values are selected at random within this range as the size of the cut rectangle, so as to generate a tiny abnormal patch. The two different anomaly simulations enrich the variety of the data.
3. The data enhancement method based on prior defect features and the SSPCAB attention mechanism according to claim 1, wherein the normal samples and simulated abnormal samples input to the self-supervised training method in step 4 are handled in two working modes, depending on how the classification task is defined. In the first mode, the task has only two categories, normal images and abnormal images: normal images are labeled 0, and in each data enhancement pass one of the two anomaly types (tiny or standard) is selected at random, simulated, and labeled 1. In the second mode, the task has three categories, normal images, standard anomalies, and tiny anomalies: both kinds of anomaly image are generated in the same data enhancement pass, with normal images labeled 0, standard anomalies labeled 1, and tiny anomalies labeled 2. Training on these different classification tasks yields a better-performing classifier.
4. The data enhancement method based on prior defect features and the SSPCAB attention mechanism according to claim 1, wherein in step 7 the Gaussian density estimator detects image anomalies and computes the anomaly score; the input images carry only image-level labels and no pixel-level labels. In the evaluation stage, the features extracted by the last convolution layer of the model are taken as output and fed to the Gaussian density estimator, and the anomaly score of the features is computed as:
where x is the input feature, μ is the mathematical expectation, and Σ is an n-th order matrix obtained during learning; they are respectively expressed as:
where x is a feature matrix and m is the number of feature matrices;
where x is a feature matrix, μ is the expectation of the data, and m is the number of feature matrices;
5. The method of claim 1, wherein the objective function in step 6 is expressed as:
where $p_t$ is the predicted value; FL is the focal loss; $\gamma_{hc}$ is a hyperparameter; ζ is a time-varying parameter that takes different weights in different training periods, the aim being to combine a focus on confidently predicted classes in the early period with a focus on misclassified hard samples in the middle period. α is a weighting hyperparameter, generally set to 0.1; G is the self-supervised attention module and X is the tensor of the self-supervised attention module. In detail, the parameter ζ that varies with the training epoch is defined as:
where τ is a hyperparameter with τ ≥ 1 that is used to divide the total epochs; T is the total number of epochs; t is the current epoch number. For example, if τ is set to 2, the cycle has the shape of an inverted triangle: ζ falls from 1 to 0 in the first half of training and rises from 0 again in the second half.
CN202310912045.5A 2023-07-25 2023-07-25 Data enhancement method based on priori defect characteristics and SSPCAB attention mechanism Pending CN116912625A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310912045.5A CN116912625A (en) 2023-07-25 2023-07-25 Data enhancement method based on priori defect characteristics and SSPCAB attention mechanism

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310912045.5A CN116912625A (en) 2023-07-25 2023-07-25 Data enhancement method based on priori defect characteristics and SSPCAB attention mechanism

Publications (1)

Publication Number Publication Date
CN116912625A true CN116912625A (en) 2023-10-20

Family

ID=88352806

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310912045.5A Pending CN116912625A (en) 2023-07-25 2023-07-25 Data enhancement method based on priori defect characteristics and SSPCAB attention mechanism

Country Status (1)

Country Link
CN (1) CN116912625A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117557872A (en) * 2024-01-12 2024-02-13 苏州大学 Unsupervised anomaly detection method and device for optimizing storage mode
CN118485592A (en) * 2024-07-05 2024-08-13 华侨大学 Low-illumination phase contrast cell microscopic image enhancement method based on multi-scale transducer

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117557872A (en) * 2024-01-12 2024-02-13 苏州大学 Unsupervised anomaly detection method and device for optimizing storage mode
CN117557872B (en) * 2024-01-12 2024-03-22 苏州大学 Unsupervised anomaly detection method and device for optimizing storage mode
CN118485592A (en) * 2024-07-05 2024-08-13 华侨大学 Low-illumination phase contrast cell microscopic image enhancement method based on multi-scale transducer


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination