CN118864454B - An unsupervised anomaly detection method and system based on memory expert guidance - Google Patents
- Publication number
- CN118864454B CN202411336461.6A
- Authority
- CN
- China
- Prior art keywords
- network
- memory
- feature
- normal
- denoising
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/0002—Inspection of images, e.g. flaw detection
- G06T7/0004—Industrial image inspection
- G06T7/0008—Industrial image inspection checking presence/absence
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
- G06N3/088—Non-supervised learning, e.g. competitive learning
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
- G06N3/096—Transfer learning
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T5/00—Image enhancement or restoration
- G06T5/70—Denoising; Smoothing
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20084—Artificial neural networks [ANN]
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02P—CLIMATE CHANGE MITIGATION TECHNOLOGIES IN THE PRODUCTION OR PROCESSING OF GOODS
- Y02P90/00—Enabling technologies with a potential contribution to greenhouse gas [GHG] emissions mitigation
- Y02P90/30—Computing systems specially adapted for manufacturing
Abstract
The invention provides an unsupervised anomaly detection method and system based on memory expert guidance, and relates to the technical field of anomaly detection. The method comprises: obtaining a defect image to be detected; and inputting the defect image to be detected into a trained defect detection model for detection to obtain a glass container surface defect detection result. The defect detection model comprises, connected in sequence, a feature distillation network for extracting a difference significant feature map and an anomaly refinement network for generating the defect detection result from the difference significant feature map. The feature distillation network uses a normal memory expert to help a denoising student network learn normal samples: the normal memory expert stores memory vectors obtained by a teacher network from normal sample features, and the denoising student network uses the memory vectors to update the query features generated from the defect image to be detected. The invention can improve the accuracy of glass container surface defect detection.
Description
Technical Field
The invention relates to the technical field of anomaly detection, in particular to an unsupervised anomaly detection method and system based on memory expert guidance.
Background
In intelligent manufacturing, anomaly detection is one of the key technologies. It helps enterprises monitor production processes in real time and discover and handle potential problems promptly, thereby safeguarding product quality and production safety. Anomaly detection is the process of identifying abnormal or rare objects, events, and patterns in an image or video. An anomaly may be any condition that does not conform to an expected pattern or deviates significantly from most of the data. In industrial defect detection, unsupervised anomaly detection methods have attracted wide attention because data acquisition and labeling are costly and the anomaly types of industrial products are unpredictable.
In the glass container production process, anomaly detection is particularly important. During the production of glass containers, minor defects such as bubbles, cracks or uneven coating may affect the quality of the product and even lead to rejection of the product. Because of the variety of types of defects and the difficulty in labeling them completely, unsupervised anomaly detection techniques are critical for glass container defect detection.
Knowledge distillation methods have become an important technique in unsupervised anomaly detection tasks. The knowledge distillation-based anomaly detection framework can effectively utilize the advantages of the teacher network, guide the student network to learn the representation of the normal mode, and detect anomalies by identifying differences between the teacher network and the student network. However, in this process, the following problems often occur:
(1) Insufficient knowledge extraction: when knowledge is transferred from the teacher network to the student network, insufficient model capacity or training may leave the extraction incomplete, which impairs the student network's learning of defect features.
(2) Insufficient feature extraction: the gap in feature extraction capability between the teacher network and the student network may prevent the student network from adequately learning abnormal features, especially for subtle defects.
(3) Weak teacher constraint: the teacher network may not constrain the student network strongly enough, so the student network performs poorly when detecting glass container defects.
Disclosure of Invention
In order to solve at least one of the defects in the prior art, the invention aims to provide an unsupervised anomaly detection method and system based on memory expert guidance so as to improve the accuracy of glass container surface defect detection.
To achieve the above object, according to some embodiments, a first aspect of the present invention provides an unsupervised anomaly detection method based on memory expert guidance, including:
Obtaining a defect image to be detected;
Inputting the defect image to be detected into a trained defect detection model for detection to obtain a glass container surface defect detection result;
The defect detection model comprises a feature distillation network for extracting a difference significant feature map and an anomaly refinement network for generating the defect detection result from the difference significant feature map. The difference significant feature map is obtained from a denoising student network and a teacher network of the feature distillation network. The feature distillation network uses a normal memory expert to help the denoising student network learn normal samples: the normal memory expert stores memory vectors obtained by the teacher network from normal sample features, and the denoising student network uses the memory vectors to update the query features generated from the defect image to be detected.
In a second aspect of the present invention, there is provided an unsupervised anomaly detection system based on memory expert guidance, comprising:
The image acquisition module is configured to acquire a defect image to be detected;
The defect detection module is configured to input a defect image to be detected into a trained defect detection model for detection, so as to obtain a glass container surface defect detection result;
The defect detection model comprises a feature distillation network for extracting a difference significant feature map and an anomaly refinement network for generating the defect detection result from the difference significant feature map. The difference significant feature map is obtained from a denoising student network and a teacher network of the feature distillation network. The feature distillation network uses a normal memory expert to help the denoising student network learn normal samples: the normal memory expert stores memory vectors obtained by the teacher network from normal sample features, and the denoising student network uses the memory vectors to update the query features generated from the defect image to be detected.
Compared with the prior art, the invention has the beneficial effects that:
The invention provides an unsupervised anomaly detection method and system based on memory expert guidance, which introduces a memory expert mechanism into the knowledge distillation process. High-level normal features are extracted from normal samples by a teacher network, stored in a normal memory expert in the form of memory vectors, and passed to a denoising student network. The denoising student network updates its query features according to the memory vectors stored in the normal memory expert, which accelerates the image compression process, strengthens knowledge distillation, and improves the accuracy of the denoising student network in identifying tiny defects. An adaptive perception state enhancement module is introduced into the denoising student network to fully extract global and local information during reconstruction, so that the denoising student network attends better to the accurate recovery of the surface details and structure of the glass container during denoising, which improves the reconstruction quality of normal samples. A dual-domain comparison method introduces frequency-domain information on top of the spatial domain, which strengthens the constraint of the teacher network on the denoising student network, so that the teacher network can guide the denoising student network to reconstruct higher-quality normal images and the accuracy of glass container surface defect detection is further improved.
Additional aspects of the invention will be set forth in part in the description which follows and, in part, will be obvious from the description, or may be learned by practice of the invention.
Drawings
The accompanying drawings, which are included to provide a further understanding of the invention and are incorporated in and constitute a part of this specification, illustrate embodiments of the invention and together with the description serve to explain the invention.
FIG. 1 is a flow chart of a method according to a first embodiment of the invention;
FIG. 2 is a schematic diagram of the overall architecture of a defect detection model;
FIG. 3 is a schematic diagram of a normal information storage and recall network;
FIG. 4 is a schematic diagram of an adaptive perception state enhancement module;
FIG. 5 is a schematic diagram of a lossless frequency-domain feature encoder;
FIG. 6 is a schematic diagram of an anomaly refinement network.
Detailed Description
The invention will be further described with reference to the drawings and examples.
Example 1
An embodiment of the present invention provides an unsupervised anomaly detection method based on memory expert guidance, as shown in FIGS. 1 to 6, including:
Obtaining a defect image to be detected;
Inputting the defect image to be detected into a trained defect detection model for detection to obtain a glass container surface defect detection result;
The defect detection model comprises a feature distillation network for extracting a difference significant feature map and an anomaly refinement network for generating the defect detection result from the difference significant feature map. The difference significant feature map is obtained from a denoising student network and a teacher network of the feature distillation network. The feature distillation network uses a normal memory expert to help the denoising student network learn normal samples: the normal memory expert stores memory vectors obtained by the teacher network from normal sample features, and the denoising student network uses the memory vectors to update the query features generated from the defect image to be detected.
To address the problems of existing unsupervised anomaly detection based on knowledge distillation, this embodiment constructs a defect detection model and introduces a memory expert mechanism into it to help the student network identify tiny defects in glass containers more accurately. First, a defect detection model consisting of a feature distillation network and an anomaly refinement network is constructed. Then the model is trained and tested with collected surface images of industrial products, both defect-free and defective. Finally, the trained model is deployed in an industrial application scenario to detect anomalies in glass containers during production.
The specific implementation process comprises the following steps:
Step S1: acquiring industrial product surface defect images for training and verifying the model, and preprocessing the acquired images to obtain a training set and a verification set;
Step S2: constructing the feature distillation network;
Step S3: constructing the anomaly refinement network;
Step S4: connecting the feature distillation network and the anomaly refinement network to obtain the defect detection model, training it with the training set obtained in step S1, and testing it with the test set to obtain the trained defect detection model;
Step S5: deploying the trained defect detection model.
In step S1, the industrial product surface images are screened and divided systematically. Defect-free images are selected as the training set so that the anomaly detection model learns to recognize normal surface features accurately, while defective images are used as the test set to evaluate the performance and accuracy of the model in the actual detection task. To further improve the model's generalization ability and sensitivity to anomalies, pseudo-anomalies are constructed on the normal samples in the training set: a pseudo-anomaly image is generated for each normal image, and a mask image is created for each pseudo-anomaly image. The mask images not only define the anomalous regions but also provide additional training information for the model. Together they form a pseudo-anomaly training set.
Pseudo-anomaly generation aims to produce anomalous images in a more complex and realistic way. Specifically, multi-stage nonlinear processing and noise transformation are introduced to increase the diversity of the generated images. The abnormal image P_a is generated as follows:
;
where the product is element-wise; T is an anomaly-free training set image, which retains the information of the original scene; the opacity factor is selected randomly in the range (0, 1) as a data augmentation means; M is the anomaly mask generated through multi-level nonlinear transformation and activation function processing, and represents the anomalous region; and the two additional noise terms represent local and global complex noise respectively, which increases the diversity of the abnormal images, with the characteristics of each noise term adjusted by its own parameter.
In the generation of the abnormal image P_a, the non-anomalous region is represented by (1 - M), which makes the distinction between anomalous and non-anomalous regions more direct and obvious, especially at the boundary of the anomalous region. The additional noise terms comprise local and global complex noise, so the characteristics of the noise can be controlled more intuitively. The opacity factor allows the visibility of the anomaly to be adjusted flexibly when generating the image, which helps train the model to identify anomalies of different degrees.
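The blending described above can be sketched in NumPy. The exact formula is not reproduced in this text, so the combination below is only one plausible form and not the patented formula. The function name `make_pseudo_anomaly`, the Gaussian noise model, and the scale parameters `sigma_local` and `sigma_global` are assumptions made for illustration.

```python
import numpy as np

def make_pseudo_anomaly(T, M, beta, sigma_local=0.5, sigma_global=0.02, rng=None):
    """Blend a normal image T with noise inside the mask M (hypothetical form).

    T    : normal image with values in [0, 1]
    M    : anomaly mask in [0, 1], same shape as T (1 marks the anomalous region)
    beta : opacity factor in (0, 1); larger values make the anomaly more visible
    """
    rng = np.random.default_rng() if rng is None else rng
    # Assumed noise model: Gaussian local noise (used only inside the mask)
    # and weak Gaussian global noise (added to the whole image).
    n_local = rng.normal(0.0, sigma_local, size=T.shape)
    n_global = rng.normal(0.0, sigma_global, size=T.shape)
    corrupted = T + n_local
    # Inside the mask, mix the original and corrupted content by the opacity factor.
    inside = (1.0 - beta) * T + beta * corrupted
    # (1 - M) keeps the non-anomalous region, M selects the anomalous region.
    P_a = (1.0 - M) * T + M * inside + n_global
    return np.clip(P_a, 0.0, 1.0)

rng = np.random.default_rng(0)
T = rng.uniform(0.2, 0.8, size=(8, 8))      # stand-in for a normal image
M = np.zeros((8, 8))
M[2:5, 2:5] = 1.0                           # square anomalous region
beta = rng.uniform(0.0, 1.0)                # random opacity as data augmentation
P_a = make_pseudo_anomaly(T, M, beta, rng=rng)
```

The pair (P_a, M) then plays the role of one pseudo-anomaly image and its mask in the pseudo-anomaly training set.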
The generation process of the anomaly mask M is as follows:
;
where W_i is the convolution kernel and X_i is the input image; the nonlinear activation function is ReLU; the Sigmoid activation function is used to generate the mask; and N is the number of convolution layers, which determines the complexity of the generated anomaly mask. The convolution kernel W_i and the activation function of each stage adjust the transform strength.
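A minimal sketch of such a mask generator follows, assuming the N convolution stages are applied one after another, with ReLU after every stage except the last and a Sigmoid on the final response. The formula itself is not reproduced in this text, so this stacking order, the helper `conv2d_same`, and the function `anomaly_mask` are illustrative assumptions.

```python
import numpy as np

def conv2d_same(x, w):
    """2D cross-correlation with reflect padding; output has the shape of x.

    Assumes an odd-sized kernel w (e.g. 3x3).
    """
    kh, kw = w.shape
    ph, pw = kh // 2, kw // 2
    padded = np.pad(x, ((ph, ph), (pw, pw)), mode="reflect")
    H, W = x.shape
    out = np.zeros((H, W), dtype=float)
    # Accumulate one shifted copy of the input per kernel tap.
    for i in range(kh):
        for j in range(kw):
            out += w[i, j] * padded[i:i + H, j:j + W]
    return out

def anomaly_mask(x, kernels):
    """Apply N = len(kernels) convolution stages and squash to a mask in [0, 1]."""
    h = x
    for w in kernels[:-1]:
        h = np.maximum(conv2d_same(h, w), 0.0)   # ReLU after each hidden stage
    logits = conv2d_same(h, kernels[-1])         # last stage feeds the Sigmoid
    return 1.0 / (1.0 + np.exp(-logits))         # Sigmoid produces the soft mask

rng = np.random.default_rng(0)
x = rng.uniform(0.0, 1.0, size=(16, 16))                  # stand-in input image
kernels = [0.3 * rng.normal(size=(3, 3)) for _ in range(3)]  # N = 3 stages
M = anomaly_mask(x, kernels)
```

Increasing the number of kernels makes the mask shape more irregular, which matches the role the text assigns to N.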
In step S2, the feature distillation network comprises a denoising student network and a teacher network. A normal information storage and recall network and an adaptive perception state enhancement module are designed in the denoising student network, and the parameters of the feature distillation network are updated by a dual-domain comparison method.
Specifically, as shown in FIG. 2, the teacher network consists of three pre-trained ResNet blocks of corresponding dimensions and extracts features of normal samples. The denoising student network has an encoder-decoder architecture. The encoder comprises four ResNet blocks and three normal information storage and recall networks, with one normal information storage and recall network connected after each of the first three ResNet blocks. The decoder comprises two ResNet blocks and two adaptive perception state enhancement modules connected after the ResNet blocks.
Inspired by the human process of recognizing anomalies, this embodiment introduces a memory expert mechanism into the knowledge distillation process to assist the denoising learning of the denoising student network. The core of the memory expert mechanism is the normal information storage and recall network. As shown in FIG. 3, a normal memory expert is established to store high-level normal features extracted from normal samples by a pre-trained ResNet encoder (i.e., the teacher network), forming a set of memory information that represents the normal state. These high-level features accurately reflect the high-level information of the normal samples. During training of the student network, the normal memory expert helps the student network perform denoising learning: the student network not only learns to identify normal samples, but also becomes more robust to noise and anomalies through the high-level features provided by the memory expert.
A normal learning strategy is adopted to help the memory expert grasp the prior knowledge in normal data during training. First, a randomly initialized normal memory expert is established to store a certain number of memory vectors m_i. In the training phase, normal samples in the training set are fed into the teacher network, and k training samples are encoded by the encoder of the teacher network into normal sample features, each a vector in a d-dimensional real vector space. These normal sample features are saved in the normal memory expert so that the learned normal knowledge can later be recalled from query features.
In order to further improve the learning effect, an adaptive weighting update mechanism is introduced, and weights are adjusted based on the dynamic relation between the memory vector and the normal sample characteristics of the current input:
;
where the update relates the value of the memory vector in the t-th iteration to its value in the (t+1)-th iteration; the dynamic learning rate is adaptively adjusted according to the change of the loss function at each iteration step; the nonlinear function, here a Sigmoid function, adjusts the update amplitude; and the element-wise product is taken with the currently input normal sample feature. Because the learning rate adapts to the loss at each iteration, the method can converge faster or avoid overfitting. The nonlinear adjustment dynamically scales the update amplitude according to the similarity between the memory vector m_i and the normal sample feature, which enhances the flexibility and expressive power of the model. The element-wise product preserves the relative orientation information between the input normal sample feature and the memory vector, which helps capture the relationship between features more accurately. Subsequently, each normal sample feature is flattened, its cosine similarity with each memory vector m_i in the memory expert is computed, and similarity weights are obtained through a softmax activation function to form a first similarity score matrix:
;
where K is the total number of memory vectors in the normal memory expert. The memory vectors m_i are then aggregated according to these weights to obtain the first normalized feature:
;
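The two steps just described (cosine similarity followed by softmax, then aggregation of the memory vectors by the resulting weights) can be sketched as follows. The function name `recall_from_memory`, the array layout, and the absence of any temperature or scaling factor inside the softmax are assumptions, since the formulas themselves are not reproduced in this text.

```python
import numpy as np

def recall_from_memory(features, memory, eps=1e-8):
    """Address a memory bank by cosine similarity and aggregate its vectors.

    features : (k, d) flattened sample features
    memory   : (K, d) memory vectors m_i
    returns  : weights (k, K) with rows summing to 1, recalled features (k, d)
    """
    f = features / (np.linalg.norm(features, axis=1, keepdims=True) + eps)
    m = memory / (np.linalg.norm(memory, axis=1, keepdims=True) + eps)
    sim = f @ m.T                                  # cosine similarities, shape (k, K)
    sim = sim - sim.max(axis=1, keepdims=True)     # shift for a numerically stable softmax
    weights = np.exp(sim)
    weights /= weights.sum(axis=1, keepdims=True)  # softmax over the K memory slots
    recalled = weights @ memory                    # weighted sum of memory vectors
    return weights, recalled

rng = np.random.default_rng(0)
memory = rng.normal(size=(5, 16))       # K = 5 memory vectors of dimension d = 16
features = rng.normal(size=(3, 16))     # k = 3 flattened normal sample features
weights, recalled = recall_from_memory(features, memory)
```

Each row of `weights` plays the role of one row of the similarity score matrix, and each row of `recalled` plays the role of one normalized feature.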
To ensure that normal information is remembered from the normal sample features, a normal memory loss function L_mem is employed to minimize the difference between the normal sample features and the first normalized features:
;
where the loss includes a sparsity loss function and a cross-sample consistency loss function, each with its own weight coefficient.
To promote the sparsity of the memory vectors and avoid overfitting, a sparsity loss function is introduced to ensure that the memory vectors m_i are sparse:
。
To ensure feature consistency among different samples, a cross-sample consistency loss is introduced so that the structural relationship among samples remains consistent after reconstruction:
;
where the loss compares the difference between two different normal samples with the difference between the first normalized features of those two samples.
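The three-part structure of the memory loss (a reconstruction term plus weighted sparsity and cross-sample consistency terms) can be sketched as below. The concrete choices are assumptions, because the formulas are not reproduced in this text: mean squared error for the reconstruction term, a mean absolute value (L1) penalty for sparsity, Euclidean pairwise distances for the consistency term, and the weights `lam_sparse` and `lam_cons`.

```python
import numpy as np

def pairwise_dist(x):
    """Euclidean distance between every pair of rows of x, shape (k, k)."""
    diff = x[:, None, :] - x[None, :, :]
    return np.linalg.norm(diff, axis=-1)

def memory_loss(features, recalled, memory, lam_sparse=0.01, lam_cons=0.1):
    """Hypothetical L_mem = reconstruction + lam_sparse * sparsity + lam_cons * consistency."""
    # Difference between normal sample features and their normalized (recalled) versions.
    recon = np.mean((features - recalled) ** 2)
    # Assumed sparsity term: L1 penalty on the memory vectors.
    sparse = np.mean(np.abs(memory))
    # Assumed consistency term: pairwise sample distances should survive reconstruction.
    cons = np.mean((pairwise_dist(features) - pairwise_dist(recalled)) ** 2)
    return recon + lam_sparse * sparse + lam_cons * cons

rng = np.random.default_rng(0)
features = rng.normal(size=(4, 8))                        # k = 4 normal sample features
recalled = features + 0.1 * rng.normal(size=(4, 8))       # stand-in for recalled features
memory = rng.normal(size=(6, 8))                          # K = 6 memory vectors
loss = memory_loss(features, recalled, memory)
```

With a perfect reconstruction and an all-zero memory every term vanishes, which is a convenient sanity check for an implementation.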
After the teacher network has finished training the normal memory expert, the final goal of the normal information storage and recall network is to adaptively adjust the normality of the features generated by the student network. Query features are used to retrieve the memorized prior knowledge: the pseudo-anomaly images corresponding to the training set are input into the denoising student network, query features are obtained through the denoising student network, and the normal information stored in the memory expert is recalled according to the query features.
Specifically, the pseudo-anomaly images corresponding to the training set are fed into the denoising student network, and k training samples are encoded by the denoising student network into query features. The query features recall the normal knowledge learned by the normal memory expert. Each query feature is flattened, its cosine similarity with each memory vector m_i in the normal memory expert is computed, and similarity weights are obtained through a softmax activation function to form a second similarity score matrix:
;
where K is the total number of memory vectors in the memory expert.
The second similarity score matrix controls how much of the relevant normal information is recalled for integration during normalization. The memory vectors m_i are then aggregated according to these weights to obtain the second normalized feature:
;
Finally, the second normalized feature is converted and then concatenated with the original query feature to form the input of the next stage of the student network.
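The recall path on the student side can be sketched end to end for a single query feature map. This sketch assumes that the whole (C, H, W) map is flattened into one query vector, that the recalled vector is reshaped back to (C, H, W), and that concatenation happens along the channel axis. None of these layout details are stated in this text, and the name `recall_and_concat` is hypothetical.

```python
import numpy as np

def recall_and_concat(query_map, memory, eps=1e-8):
    """Recall normal information for one query map and append it as extra channels.

    query_map : (C, H, W) query feature from the student encoder
    memory    : (K, C*H*W) memory vectors m_i
    returns   : (2C, H, W) array, original query followed by the recalled feature
    """
    c, h, w = query_map.shape
    q = query_map.reshape(-1)                                   # flatten the query
    sims = (memory @ q) / (np.linalg.norm(memory, axis=1) * np.linalg.norm(q) + eps)
    sims = sims - sims.max()                                    # stable softmax
    weights = np.exp(sims) / np.exp(sims).sum()                 # weights over K slots
    recalled = (weights @ memory).reshape(c, h, w)              # back to map shape
    return np.concatenate([query_map, recalled], axis=0)        # channel-wise concat

rng = np.random.default_rng(0)
query_map = rng.normal(size=(4, 3, 3))
memory = rng.normal(size=(6, 4 * 3 * 3))
next_input = recall_and_concat(query_map, memory)
```

Because the output has twice as many channels as the query, the next stage in such a design would need to accept the doubled channel count.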
The encoder of the denoising student network comprises four Resnet18 blocks and three normal information storage and recall networks, wherein the first Resnet blocks are connected with the first normal information storage and recall network NMR1, the first normal information storage and recall network NMR1 is connected with the second Resnet blocks, the second Resnet blocks are connected with the second normal information storage and recall network NMR2, the second normal information storage and recall network NMR2 is connected with the third Resnet blocks, the third Resnet blocks are connected with the third normal information storage and recall network NMR3, and the third normal information storage and recall network NMR3 is connected with the fourth Resnet blocks. The teacher network comprises three Resnet blocks which correspond to the first three Resnet blocks of the denoising student network encoder, and the features extracted by each Resnet block are input into the normal information storage and recall network after the corresponding Resnet blocks in the denoising student network, so as to train the normal memory expert in the corresponding normal information storage and recall network.
The self-adaptive perception state enhancement module is based on Mamba state space models, considers the requirements of the models in the embodiment, improves the post-processing process after the visual state space blocks, extracts global and local information, and enables the denoising student network to pay more attention to accurate recovery of details and structures during denoising.
Specifically, the self-adaptive perception state enhancement module uses the visual state space block to extract the space remote dependency relationship, so that the extraction of local information is increased on the basis, and after the extraction of the space features is completed by using the visual state space block, the capturing capability of the local region features is improved by adopting a strategy combining average pooling and maximum pooling, so that the model is helped to more accurately locate and aim at the interested object.
The structure of the adaptive perception state enhancement module EP-VSS is shown in FIG. 4. The input feature X of the module is first sent to the visual state space block VSS for processing, giving the output result of the visual state space block. Average pooling and maximum pooling are then applied to this output to capture global average features and global maximum features respectively, and the two features are concatenated (cat) to enhance the model's global understanding of the input data, giving the concatenated feature y. The calculation formula for the concatenated feature y can be expressed as:
;
Here the pooled input is the output result of the visual state space block, H and W represent the height and width of the feature map, respectively, and (H, 1) and (1, W) represent the window sizes of the pooling operations, which pool in the height and width directions, respectively.
Features are then extracted through a convolution layer, and the enhanced feature is obtained through batch normalization (BN) and a Sigmoid activation function, which strengthen the nonlinear expression capability of the model. The calculation formula is as follows:
;
where conv(y) denotes applying a convolution operation to y, and the product operator in the formula represents element-wise multiplication.
Thereafter, the enhanced feature is divided into two parts along the second dimension (the channel dimension), with sizes H and W respectively, giving the feature maps x_w and x_h. The formula is as follows:
。
Thereafter, the attention weights a_w and a_h in the width and height directions are calculated by a convolution layer and the Sigmoid activation function:
$$a_w = \sigma(\mathrm{conv}(\mathrm{permute}(x_w)));$$
$$a_h = \sigma(\mathrm{conv}(x_h));$$
where σ is the Sigmoid activation function and permute(x_w) permutes x_w to accommodate the convolution operation. Finally, the attention weights in the width and height directions are multiplied to obtain the final attention map, which is multiplied with the input feature map to enhance important features:
$$out = identity \otimes a_w \otimes a_h;$$
where identity represents an identity mapping of the input, that is, the input feature X is used directly, ⊗ represents element-wise multiplication, and out is the feature map output by the adaptive perception state enhancement module.
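The data flow of the adaptive perception state enhancement module can be sketched in NumPy as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the visual state space block is taken as already applied (its output is passed in as `x_vss`), batch normalization is omitted, each convolution is replaced by a 1 × 1 channel-mixing matrix (`w_mix`, `w_h`, `w_w`, all hypothetical names), average pooling is paired with the (H, 1) window and maximum pooling with the (1, W) window, and the enhancement step is reduced to a convolution followed by Sigmoid.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def directional_attention(x_vss, identity, w_mix, w_h, w_w):
    """Gate `identity` with height-wise and width-wise attention weights.

    x_vss, identity: arrays of shape (C, H, W).
    w_mix, w_h, w_w: (C, C) matrices standing in for 1x1 convolutions (assumption).
    """
    _, height, _ = x_vss.shape
    # Average pooling with an (H, 1) window collapses the height axis -> (C, W).
    pooled_w = x_vss.mean(axis=1)
    # Maximum pooling with a (1, W) window collapses the width axis -> (C, H).
    pooled_h = x_vss.max(axis=2)
    # Concatenate the two pooled descriptors along the spatial axis -> (C, H + W).
    y = np.concatenate([pooled_h, pooled_w], axis=1)
    # Simplified enhancement: 1x1 convolution stand-in, then Sigmoid (BN omitted).
    y_hat = sigmoid(w_mix @ y)
    # Split back into a height part (size H) and a width part (size W).
    x_h, x_w = y_hat[:, :height], y_hat[:, height:]
    a_h = sigmoid(w_h @ x_h)[:, :, None]  # (C, H, 1)
    a_w = sigmoid(w_w @ x_w)[:, None, :]  # (C, 1, W)
    # out = identity * a_w * a_h, broadcast over the whole feature map.
    return identity * a_w * a_h


rng = np.random.default_rng(0)
C, H, W = 4, 6, 8
x = rng.normal(size=(C, H, W))
weights = [rng.normal(size=(C, C)) for _ in range(3)]
out = directional_attention(x, x, *weights)
```

Because both attention weights lie between 0 and 1, the output keeps the shape of the input and never exceeds it in magnitude; the module can only attenuate or preserve each position of the identity branch.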
As shown in FIG. 2, in order to improve the performance of the feature distillation network and optimize the network structure of the denoising student network, the embodiment of the invention adopts a dual-domain comparison method between the denoising student network and the teacher network, which extracts normal features. On the basis of traditional spatial-domain analysis, the method introduces a more efficient frequency-domain analysis strategy: the outputs of the teacher network at three scales are compared, in both domains, with the outputs of the last ResNet block and the two adaptive perception state enhancement modules of the denoising student network, so that the denoising student network is driven to reconstruct higher-quality normal images under the guidance of the teacher network.
Specifically, the image obtained by the teacher network and the corresponding image processed by the denoising student network are first passed through a lossless frequency domain feature encoder (LFDFE). The key to this step is that the conversion of features from the spatial domain to the frequency domain loses no image information. Redundant information is then removed by a filter. On this basis, a dual-domain composite loss function is constructed, which jointly considers image restoration performance in the feature space and in the wavelet frequency-domain space.
The lossless frequency-domain feature encoder first performs a Haar wavelet transform on the input feature map I, which decomposes the feature map into four key components: an approximation (low-frequency) component A, and detail (high-frequency) components in the horizontal (C), vertical (V), and diagonal (D) directions. The four components are concatenated and then further refined by a filter. The filter consists of a standard 1 × 1 convolution layer, a batch normalization layer, and a RELU activation function, and its objective is to filter out redundant information and provide more efficient, representative features. The process of the Haar wavelet transform can be expressed by the following formula:
$$I_{Haar} = H(I);$$
where I_Haar denotes the feature obtained after the Haar wavelet transform, and H(·) denotes the Haar wavelet transform.
After the four key components obtained by the wavelet transform are concatenated, they are further processed by the filter:
$$I_{F} = \mathrm{RELU}(\mathrm{BN}(\mathrm{conv}_{1\times1}(\mathrm{cat}(A, C, V, D))));$$
where cat concatenates the four key components A, C, V, and D, conv_{1×1} is the 1 × 1 convolution that processes them further, and batch normalization BN and the activation function RELU then yield the filtered feature I_F.
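The encoder described above can be sketched in NumPy as follows. This is an illustrative sketch, not the patented implementation: it uses a single-level orthonormal Haar transform, replaces the 1 × 1 convolution with a channel-mixing matrix (`weight`, a hypothetical parameter), and omits batch normalization. Which detail sub-band is called horizontal and which vertical varies between conventions, so the labels in the comments are assumptions. The inverse transform is included only to show that the decomposition loses no information.

```python
import numpy as np


def haar_dwt2(feature):
    """Single-level 2D Haar transform of a (C, H, W) array with even H and W."""
    a = feature[:, 0::2, 0::2]
    b = feature[:, 0::2, 1::2]
    c = feature[:, 1::2, 0::2]
    d = feature[:, 1::2, 1::2]
    approx = (a + b + c + d) / 2.0   # low-frequency component A
    horiz = (a - b + c - d) / 2.0    # detail component, labelled C here
    vert = (a + b - c - d) / 2.0     # detail component, labelled V here
    diag = (a - b - c + d) / 2.0     # diagonal detail component D
    return approx, horiz, vert, diag


def haar_idwt2(approx, horiz, vert, diag):
    """Invert haar_dwt2 exactly, showing that the transform is lossless."""
    channels, half_h, half_w = approx.shape
    out = np.empty((channels, 2 * half_h, 2 * half_w), dtype=approx.dtype)
    out[:, 0::2, 0::2] = (approx + horiz + vert + diag) / 2.0
    out[:, 0::2, 1::2] = (approx - horiz + vert - diag) / 2.0
    out[:, 1::2, 0::2] = (approx + horiz - vert - diag) / 2.0
    out[:, 1::2, 1::2] = (approx - horiz - vert + diag) / 2.0
    return out


def frequency_filter(components, weight):
    """cat -> 1x1 convolution (as a channel-mixing matrix) -> ReLU; BN omitted."""
    stacked = np.concatenate(components, axis=0)        # (4C, H/2, W/2)
    mixed = np.einsum("oc,chw->ohw", weight, stacked)   # 1x1 convolution
    return np.maximum(mixed, 0.0)                       # ReLU


rng = np.random.default_rng(0)
feature_map = rng.normal(size=(3, 8, 8))
parts = haar_dwt2(feature_map)
restored = haar_idwt2(*parts)
filtered = frequency_filter(parts, rng.normal(size=(5, 12)))
```

The four sub-bands together hold exactly as many values as the input, which is why the transform itself discards nothing; any reduction of redundancy happens only in the filter that follows.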
The wavelet transform provides a multi-scale, multi-directional representation of the image, which helps to identify and remove noise more accurately while preserving the important structural information of the image. By using the frequency-domain loss function to minimize the difference between the wavelet transform of the normal image features and the wavelet transform of the denoised features, the network learns how to recover image details effectively in the wavelet domain. The frequency-domain loss function can be expressed as:
;
where W represents the lossless frequency-domain feature encoding, the first feature term is the normal image feature output by the teacher network, the second feature term is the denoised image feature output by the denoising student network, r indexes the different scales, and R = 3 is the number of scales. The frequency-domain differences are compared for three pairs of output features: the first ResNet block of the teacher network and the last adaptive perception state enhancement module of the denoising student network decoder; the second ResNet block of the teacher network and the penultimate adaptive perception state enhancement module of the denoising student network decoder; and the third ResNet block of the teacher network and the last ResNet block of the denoising student network decoder. By minimizing the frequency-domain loss function, the network can optimize its parameters in the wavelet domain to achieve better denoising performance.
The feature-domain loss and the frequency-domain loss are combined to obtain the dual-domain composite loss function, so that the network can be optimized in both spaces:
;
;
where the first loss term is the feature-domain loss function, r indexes the different scales with R = 3 scales in total, the two weight parameters balance the two loss terms, h and w represent the height and width of the feature map, respectively, and the two feature terms represent the original and denoised feature values at position (i, j), respectively.
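The way the two loss terms combine can be sketched as follows. This is a sketch under assumptions rather than the patented formula: both terms are taken to be mean squared errors summed over the scales, the frequency-domain encoder is passed in as an arbitrary callable (`encode`), and `lam_feat` and `lam_freq` are hypothetical names for the two balancing weights.

```python
import numpy as np


def mse(a, b):
    return float(np.mean((a - b) ** 2))


def dual_domain_loss(teacher_feats, student_feats, encode, lam_feat=1.0, lam_freq=1.0):
    """Weighted sum of a feature-domain and a frequency-domain loss over all scales.

    teacher_feats, student_feats: lists of (C, h, w) arrays, one pair per scale.
    encode: callable mapping a feature map to its frequency-domain encoding.
    """
    pairs = list(zip(teacher_feats, student_feats))
    feat_loss = sum(mse(t, s) for t, s in pairs)
    freq_loss = sum(mse(encode(t), encode(s)) for t, s in pairs)
    return lam_feat * feat_loss + lam_freq * freq_loss, feat_loss, freq_loss


def double(feature):
    """Toy stand-in for the frequency-domain encoder (assumption)."""
    return 2.0 * feature


rng = np.random.default_rng(0)
shapes = [(8, 16, 16), (16, 8, 8), (32, 4, 4)]   # three scales, R = 3
teacher = [rng.normal(size=s) for s in shapes]
student = [t + 0.1 for t in teacher]             # student off by a constant 0.1
total, feat, freq = dual_domain_loss(teacher, student, double, lam_feat=0.5, lam_freq=0.25)
```

Keeping the two terms separate in the return value makes it easy to monitor whether the student is lagging in the feature domain or in the frequency domain while the weights are tuned.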
The original and denoised features at the three different scales are fused to a uniform scale and concatenated to obtain the difference significant feature map. The formula is as follows:
;
where r = 1, 2, 3 denotes the three different scales, U(·) denotes an up-sampling operation, and cat(·) denotes a concatenation operation.
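The fusion step can be illustrated with the following NumPy sketch. It rests on assumptions that the surviving text does not confirm: the per-scale discrepancy between teacher and student features is taken to be the squared difference averaged over channels, up-sampling is nearest-neighbour, and the function names are hypothetical.

```python
import numpy as np


def upsample_nearest(x, factor):
    """Nearest-neighbour up-sampling of a (C, h, w) array by an integer factor."""
    return x.repeat(factor, axis=1).repeat(factor, axis=2)


def difference_feature_map(teacher_feats, student_feats, out_size):
    """Per-scale discrepancy maps, up-sampled to out_size and concatenated."""
    maps = []
    for t, s in zip(teacher_feats, student_feats):
        # Assumed discrepancy: squared difference averaged over channels -> (1, h, w).
        diff = np.mean((t - s) ** 2, axis=0, keepdims=True)
        maps.append(upsample_nearest(diff, out_size // diff.shape[1]))
    return np.concatenate(maps, axis=0)   # (R, out_size, out_size)


rng = np.random.default_rng(0)
shapes = [(8, 16, 16), (16, 8, 8), (32, 4, 4)]
teacher = [rng.normal(size=s) for s in shapes]
student = [t.copy() for t in teacher]
student[2] = student[2] + 1.0             # disagree only at the coarsest scale
fmap = difference_feature_map(teacher, student, out_size=16)
```

In this toy input only the coarsest scale disagrees, so only the third channel of the fused map is non-zero; each channel of the result therefore shows at which scale the student failed to reproduce the teacher.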
In step S3, the anomaly refinement network of the anomaly detection model is constructed, and the difference significant feature map obtained by the feature distillation network is further refined to improve the accuracy of anomaly detection. As shown in FIG. 6, the anomaly refinement network comprises a multi-scale feature extraction module, a self-attention fusion module, a cascade convolution module, a global context sensing module, an attention-guided feature fusion module, and a refinement output module, which are connected in sequence and finally generate the anomaly detection result.
The multi-scale feature extraction module extracts information from the input feature map at different scales, enhancing the ability to capture anomalous features at multiple scales. Feature extraction at each scale uses a combination of convolution and pooling layers, with the following formula:
;
where Conv represents convolution, Pooling represents the pooling operation, and the result is the multi-scale feature extracted at the r-th scale.
The features of different scales are then fused by the self-attention fusion module, which highlights key regions through a self-attention mechanism, with the following formula:
;
where the self-attention operator computes self-attention weights and applies them to the multi-scale features, and the result is the output feature of the self-attention fusion module.
Next, in the cascade convolution module, each convolution layer further refines the features produced by the previous convolution layer, with the following formula:
;
where BN represents batch normalization, RELU represents the activation function, and the result is the output feature of the cascade convolution module.
Then the global context sensing module strengthens the capture of global information through a global context awareness mechanism, with the following formula:
;
where GAP represents global average pooling, FC represents a fully connected layer, and the result is the output feature of the global context sensing module.
Then the attention-guided feature fusion module fuses the multi-scale features through a multi-head attention mechanism to enhance feature expression capability, with the following formulas:
;
;
where the head terms represent the attention mechanisms of heads 1, 2, ..., n, cat represents concatenation, the first result is the fused multi-scale feature, and the second result is the output feature of the attention-guided feature fusion module. Finally, the refinement output module processes the final features to generate the anomaly detection result:
;
where Upsample denotes an up-sampling operation used to increase the resolution of the feature map, and the result is the output feature of the refinement output module.
The feature distillation network is then fixed. The pseudo-abnormal images are fed into the teacher network and the denoising student network, and the anomaly refinement network is trained with the features extracted by the two networks. An anomaly refinement loss function that combines a multi-scale weighted loss and an adaptive fusion loss is designed to better capture anomalous features while balancing the influence of different categories, thereby improving the performance of the anomaly refinement network.
The multi-scale weighted loss function aims to strengthen attention to anomalous regions through feature extraction and weighting at different scales, with the following formula:
;
where S is the number of scales, N_s is the number of samples at the s-th scale, the scale weight is the weight of the s-th scale, the sample weight is a weighting parameter of the sample at the s-th scale used to balance the loss across scales, and the label and prediction terms are the true label and the predicted value of sample c at the s-th scale, respectively.
The adaptive fusion loss function optimizes anomaly detection performance by adaptively fusing the outputs of multiple feature layers, with the following formula:
;
where N is the number of samples, the adaptive weight of sample d balances the loss across samples, L is the number of feature layers, the fusion weight is that of the l-th feature layer, and the label and prediction terms are the true label of sample d and the predicted value of sample d at the l-th feature layer, respectively.
The anomaly refinement loss function combines the multi-scale weighted loss and the adaptive fusion loss to optimize the performance of the anomaly refinement network. Its formula is as follows:
;
where the two weight parameters, one for the multi-scale weighted loss and one for the adaptive fusion loss, balance the contributions of the two losses.
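The structure of the three losses can be sketched as follows. The base loss is not recoverable from the text, so binary cross-entropy is assumed here purely for illustration; the function and parameter names (`scale_weights`, `layer_weights`, `lam_ms`, `lam_af`) are hypothetical, and the normalization by a simple mean is likewise an assumption.

```python
import numpy as np


def bce(y_true, y_pred, eps=1e-7):
    """Element-wise binary cross-entropy (assumed base loss)."""
    p = np.clip(y_pred, eps, 1.0 - eps)
    return -(y_true * np.log(p) + (1.0 - y_true) * np.log(1.0 - p))


def multi_scale_weighted_loss(labels, preds, scale_weights, sample_weights):
    """Sum over scales of the scale weight times the sample-weighted mean loss."""
    total = 0.0
    for y, p, w_s, a_s in zip(labels, preds, scale_weights, sample_weights):
        total += w_s * float(np.mean(a_s * bce(y, p)))
    return total


def adaptive_fusion_loss(label, layer_preds, layer_weights, sample_weights):
    """Sample-weighted mean of the layer-weighted losses."""
    per_sample = sum(b * bce(label, p) for b, p in zip(layer_weights, layer_preds))
    return float(np.mean(sample_weights * per_sample))


def anomaly_refinement_loss(ms_loss, af_loss, lam_ms=1.0, lam_af=1.0):
    """Weighted combination of the two losses."""
    return lam_ms * ms_loss + lam_af * af_loss


y = np.array([0.0, 1.0, 1.0, 0.0])
good = np.array([0.1, 0.9, 0.8, 0.2])
bad = np.array([0.9, 0.1, 0.2, 0.8])
ones = np.ones_like(y)
ms_good = multi_scale_weighted_loss([y, y], [good, good], [0.6, 0.4], [ones, ones])
ms_bad = multi_scale_weighted_loss([y, y], [bad, bad], [0.6, 0.4], [ones, ones])
af_good = adaptive_fusion_loss(y, [good, good], [0.5, 0.5], ones)
total = anomaly_refinement_loss(ms_good, af_good, lam_ms=1.0, lam_af=0.5)
```

With uniform sample weights and scale or layer weights that sum to one, both component losses reduce to the plain mean loss, which gives a convenient sanity check before non-uniform weights are introduced.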
In step S4, the construction of the defect detection model is completed and the model is trained with the training set: the normal information storage and recall network is trained based on the normal memory loss function, the feature distillation network is trained based on the dual-domain composite loss function, and the anomaly refinement network is trained based on the anomaly refinement loss function. The test set is then used to verify the effect of the model.
The training process is divided into two stages. In the first stage, the normal image and the pseudo-abnormal image are used as the inputs of the teacher network and the denoising student network, respectively. After the teacher network has finished training the normal information storage and recall network, the intermediate features output by the denoising student network encoder are input into the normal information storage and recall network for recombination and then sent to the decoder of the denoising student network. The goal of training is to make the feature representations generated by the denoising student network as similar as possible to those of the teacher network. Finally, the feature distillation network generates the difference significant feature map. In the second stage, the student network is fixed, and the pseudo-abnormal image is used as the input of both the teacher network and the student network. The anomaly refinement network receives the difference significant feature map from the feature distillation network and generates the final anomaly detection result through multi-level refinement. The loss function of the model consists of three parts: the normal memory loss function, used to train the normal information storage and recall network; the dual-domain composite loss function, used to train the denoising student network, which comprises the frequency-domain loss function and the feature-domain loss function; and the anomaly refinement loss function, used to train the anomaly refinement network.
In training the defect detection model, the first 1000 rounds concentrate on memory enhancement and feature optimization: the normal information storage and recall network and the feature distillation network are trained to develop the anomaly removal capability of the student network. The subsequent 4000 rounds concentrate on feature refinement and performance optimization: the anomaly refinement network is trained and tuned to further improve the accuracy and generalization capability of the anomaly detection system. After every 1000 rounds, the model is evaluated on a separate test set and the best-performing model weights are retained. This strategy ensures that the best-performing model is selected at each stage of training and that the model performing best on the test set is finally selected for deployment, so that the defect detection model achieves the best performance in practical applications.
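The two-stage schedule can be sketched as a plain Python loop. The round counts and the evaluation interval come from the description above; the stage names and the `train_step` and `evaluate` callables are hypothetical placeholders for the actual training and test-set evaluation routines.

```python
DISTILLATION_ROUNDS = 1000   # stage 1: memory enhancement and feature optimization
REFINEMENT_ROUNDS = 4000     # stage 2: anomaly refinement
EVAL_INTERVAL = 1000         # evaluate on the test set after every 1000 rounds


def stage_for_round(round_index):
    """Return the training stage for a 1-based round index."""
    total = DISTILLATION_ROUNDS + REFINEMENT_ROUNDS
    if not 1 <= round_index <= total:
        raise ValueError(f"round_index must be in 1..{total}")
    return "distillation" if round_index <= DISTILLATION_ROUNDS else "refinement"


def run_schedule(train_step, evaluate):
    """Run both stages and return the best-scoring checkpoint round and score."""
    best_round, best_score = None, float("-inf")
    for round_index in range(1, DISTILLATION_ROUNDS + REFINEMENT_ROUNDS + 1):
        train_step(stage_for_round(round_index))
        if round_index % EVAL_INTERVAL == 0:
            score = evaluate(round_index)
            if score > best_score:
                best_round, best_score = round_index, score
    return best_round, best_score


calls = []
best = run_schedule(calls.append, lambda r: -abs(r - 3000))
```

In this toy run the dummy score peaks at round 3000, so that checkpoint is the one retained; in practice `evaluate` would return the detection metric measured on the test set.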
Embodiment Two
The embodiment provides an unsupervised anomaly detection system based on memory expert guidance, which comprises:
The image acquisition module is configured to acquire a defect image to be detected;
The defect detection module is configured to input a defect image to be detected into a trained defect detection model for detection, so as to obtain a glass container surface defect detection result;
The defect detection model comprises a feature distillation network for extracting a difference significant feature map and an anomaly refinement network for generating a defect detection result from the difference significant feature map. The difference significant feature map is obtained from the denoising student network and the teacher network of the feature distillation network. The feature distillation network helps the denoising student network learn normal samples based on a normal memory expert: the memory vectors obtained by the teacher network from normal sample features are stored in the normal memory expert, and the denoising student network uses these memory vectors to update the query features generated from the defect image to be detected.
It should be noted that each module in this embodiment corresponds to a step of the method in the first embodiment and is implemented in the same way, so the details are not repeated here.
The above description is only of the preferred embodiments of the present invention and is not intended to limit the present invention, but various modifications and variations can be made to the present invention by those skilled in the art. Any modification, equivalent replacement, improvement, etc. made within the spirit and principle of the present invention should be included in the protection scope of the present invention.
Claims (8)
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202411336461.6A CN118864454B (en) | 2024-09-25 | 2024-09-25 | An unsupervised anomaly detection method and system based on memory expert guidance |
Publications (2)
Publication Number | Publication Date |
---|---|
CN118864454A CN118864454A (en) | 2024-10-29 |
CN118864454B true CN118864454B (en) | 2025-01-21 |
Family
ID=93170376
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202411336461.6A Active CN118864454B (en) | 2024-09-25 | 2024-09-25 | An unsupervised anomaly detection method and system based on memory expert guidance |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN118864454B (en) |
Also Published As
Publication number | Publication date |
---|---|
CN118864454A (en) | 2024-10-29 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||