CN115984662A - Multi-mode data pre-training and recognition method, device, equipment and medium - Google Patents
Multi-mode data pre-training and recognition method, device, equipment and medium
- Publication number: CN115984662A
- Application number: CN202310272537.2A
- Authority
- CN
- China
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02P—CLIMATE CHANGE MITIGATION TECHNOLOGIES IN THE PRODUCTION OR PROCESSING OF GOODS
- Y02P90/00—Enabling technologies with a potential contribution to greenhouse gas [GHG] emissions mitigation
- Y02P90/30—Computing systems specially adapted for manufacturing
Landscapes
- Image Analysis (AREA)
- Investigating Materials By The Use Of Optical Means Adapted For Particular Applications (AREA)
Abstract
The invention discloses a multi-modal data pre-training and recognition method, device, equipment and medium. A defect scene rule database is constructed by performing multi-source heterogeneous data fusion on acquired basic defect data; defect type information, feature information and scene information are extracted from the defect scene rule database, data association is performed, and the scene factors of the database are extracted; a self-encoding network structure model carrying defect scene information is constructed, the scene factors are integrated into it, feature vectors obtained by encoding sample data of various defects are input, matching training of data against rules is performed, and a modal recognition model is generated; defects of a sample to be detected are then identified with the modal recognition model. This improves product defect detection accuracy and model robustness.
Description
Technical Field
The invention relates to the field of image recognition, in particular to a method, a device, equipment and a medium for multi-mode data pre-training and recognition.
Background
With the rapid development of the precision manufacturing industry, losses caused by surface defects of high-precision instruments reach the billion-yuan level every year, and the demand for high-precision defect detection of industrial products is increasingly strong. Industrial production environments in particular involve highly complex conditions such as noise, occlusion, vibration and dim light, so defect detection must be intelligent, highly accurate, long-running and efficient.
Although deep-learning algorithms have improved defect detection accuracy to some extent, defect samples in existing high-precision defect detection are small in number and unbalanced, and detection is easily affected by occlusion, oxidation, vibration and similar conditions, leading to low product defect detection accuracy and weak model robustness.
Disclosure of Invention
In order to solve the technical problems, the invention provides a method, a device, equipment and a medium for multi-mode data pre-training and recognition, which improve the product defect detection accuracy and the robustness of a model.
The embodiment of the invention provides a multi-mode data pre-training and recognition method, which comprises the following steps:
performing multi-source heterogeneous data fusion on acquired defect basic data, and constructing a defect scene rule database;
extracting defect type information, characteristic information and scene information from the defect scene rule database, performing data association, and extracting scene factors of the defect scene rule database;
constructing a self-coding network structure model carrying defect scene information, integrating the scene factors into the self-coding network structure model, inputting feature vectors obtained by coding sample data of various defects, performing matching training of data and rules, and generating a modal identification model;
and identifying the defects of the sample to be detected according to the modal identification model.
Further, the multi-source heterogeneous data fusion is performed on the acquired defect basic data, and a defect scene rule database is constructed, specifically including:
performing multi-source heterogeneous data fusion on defect basic data consisting of historical experience data, common rule data and defect standard data to form a defect scene rule database which is associated with defect types, positions and scales;
the defect scene rules database includes: a surface defect data set, a defect rule data set, a detection system data set, and a process scene data set.
As an improvement of the above solution, the surface defect data set D1= [ surface defect ID, defect geometry, spatial distribution data, defect statistics, defect spectrum data ];
the defect rule data set D2= [ defect rule ID, detection object type, defect classification statistical data, damage mechanism data, defect cause rule, defect grade ];
the detection system data set D3= [ detection system ID, device type, production line design data, technology type ];
the process scene data set D4= [ process scene data ID, detection object type, scene factor, production process ];
the defect geometry includes: point, line and surface defects, boundary, skeleton, shape, position, size, stretch, and translation;
the spatial distribution data includes: entropy, contrast, consistency, and correlation;
the defect statistical data include gray-level co-occurrence matrices, autocorrelation coefficients, mathematical morphology, histogram statistical features, fractal values, and defect spectrum subsets;
the histogram statistical features include range, mean, geometric mean, harmonic mean, standard deviation, variance, and median;
the fractal values include stretch and translation fractal dimensions and porosity;
the defect spectrum subset includes a texture spectrum, a stain spectrum, and a sawtooth spectrum;
the defect classification statistical data are specifically failure modes for automatic defect classification;
the defect grade includes the detection object type;
the detection object types include semiconductors, circuit boards, wafers, fabrics, metal surfaces, and wood;
the scene factors include operation scale and equipment type selection;
the production process includes blank making, grinding, rolling, shearing, bundling, and finished-product forming.
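The four data sets D1 to D4 can be sketched as plain records. A minimal sketch follows; the field names mirror the lists in the text, but the concrete types and the builder functions are illustrative assumptions, not a schema defined by the patent:

```python
# Hypothetical sketch of two of the D1-D4 record layouts described above.

def make_surface_defect(defect_id, geometry, spatial, stats, spectrum):
    """D1: surface defect data set entry."""
    return {
        "surface_defect_id": defect_id,
        "defect_geometry": geometry,      # boundary, skeleton, shape, position, ...
        "spatial_distribution": spatial,  # entropy, contrast, consistency, correlation
        "defect_statistics": stats,       # GLCM, histogram features, fractal values, ...
        "defect_spectrum": spectrum,      # texture / stain / sawtooth spectra
    }

def make_process_scene(scene_id, object_type, scene_factor, process_steps):
    """D4: process scene data set entry."""
    return {
        "process_scene_id": scene_id,
        "detection_object_type": object_type,  # semiconductor, wafer, fabric, ...
        "scene_factor": scene_factor,          # operation scale, equipment selection
        "production_process": process_steps,   # blank making, grinding, rolling, ...
    }

d1 = make_surface_defect("SD-001", "point", {"entropy": 4.2}, {}, {})
d4 = make_process_scene("PS-001", "wafer", "small-batch", ["grinding", "rolling"])
```

The defect scene rule database would then associate entries of D1 with D2 (rules), D3 (detection system) and D4 (process scene) by defect type, position, and scale.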
Preferably, the extracting the defect type information, the feature information, and the scene information from the defect scene rule database, performing data association, and extracting the scene factor of the defect scene rule database specifically includes:
extracting defect type information from the surface defect dataset, extracting feature information from the surface defect dataset and the defect rule dataset, and extracting scene information from the inspection system dataset and the process scene dataset;
for the defects, a hierarchical matrix Z × T × R is constructed from the extracted defect type information, feature information, and scene information;
for the defect-feature associated information, a first extraction factor a_ij is used to map and extract from the matrix Z × T an antecedent defect scene factor; all extracted antecedent defect scene factors together form the antecedent scene factor;
for the feature-scene associated information, a second extraction factor b_ij is used to map and extract from the matrix T × R a consequent defect scene factor; all extracted consequent defect scene factors together form the consequent scene factor;
the scene factor is determined from the extracted antecedent and consequent scene factors;
where n is the number of defect classes, j is the feature-vector dimension, Z_ij is an element of the defect matrix, T_ij is an element of the feature-information matrix, and R_ij (i = 1, 2, ..., n) is an element of the scene-information matrix; the extraction factors a_ij and b_ij take the value 0 where the corresponding association is absent and a nonzero value where it is present.
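A minimal NumPy sketch of this extraction step follows. Because the original formulas did not survive extraction, the specific form assumed here (a_ij and b_ij acting as binary masks selecting associated entries of the Z × T and T × R matrices) is an explicit assumption, not the patent's exact definition:

```python
import numpy as np

# n defect classes, j-dimensional feature vectors (toy sizes).
n, j = 3, 4
ZT = np.random.rand(n, j)   # defect-feature association matrix Z x T
TR = np.random.rand(n, j)   # feature-scene association matrix T x R

# Assumed form of the extraction factors: 1 where an association is
# present, 0 otherwise (binary masks a_ij and b_ij).
a = (ZT > 0.5).astype(float)
b = (TR > 0.5).astype(float)

antecedent = a * ZT   # antecedent defect scene factors from Z x T
consequent = b * TR   # consequent defect scene factors from T x R

# The scene factor combines antecedent and consequent parts.
scene_factor = np.stack([antecedent, consequent])
print(scene_factor.shape)  # (2, 3, 4)
```

The masked entries are exactly the "discarded" associations; the surviving entries are the scene factors later fed into the self-encoding network.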
Preferably, the constructing a self-coding network structure model carrying defect scene information, merging the scene factors into the self-coding network structure model, inputting feature vectors obtained by coding sample data of various defects, performing matching training of data and rules, and generating a modal identification model specifically includes:
applying the former scene factor in the scene factors to an encoder of the self-encoding network structure model to extract effective characteristics;
applying the latter scene factor in the scene factors to a decoder of the self-coding network structure model to generate rules;
inputting a feature vector W obtained by encoding sample data of various defects; drawing on the idea of residual networks, the scene factor is introduced into the structure of the basic operation block during stacking, so that it is embedded in the hierarchical structure of the stacked self-encoding network; decoding then outputs a scene rule output [type, feature, scene];
the scene rules are output through a semi-supervised stacked self-encoder; a classifier is added at the decoding stage to realize the classification function, the classifier of the self-encoding network structure model is optimized through matching training of data against rules, and the modal recognition model is generated.
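The model construction above can be illustrated with a small NumPy autoencoder in which the scene factor is added through a residual-style connection inside the stack and a classifier head is attached at the decoding stage. The layer sizes, the sigmoid activation, and the random initialization are all assumptions for illustration, not the patent's actual architecture:

```python
import numpy as np

rng = np.random.default_rng(0)
sigmoid = lambda x: 1.0 / (1.0 + np.exp(-x))

d_in, d_hid, n_classes = 8, 4, 3
W_enc = rng.normal(size=(d_in, d_hid)) * 0.1   # encoder weights
W_dec = rng.normal(size=(d_hid, d_in)) * 0.1   # decoder weights
W_cls = rng.normal(size=(d_hid, n_classes)) * 0.1  # classifier head

def forward(w, scene_factor):
    """Encode feature vector w; inject the scene factor residually."""
    h = sigmoid(w @ W_enc)
    h = h + scene_factor          # residual-style injection into the stack
    recon = sigmoid(h @ W_dec)    # decoder: rule-generation path
    logits = h @ W_cls            # classifier added at the decoding stage
    return recon, logits

w = rng.normal(size=d_in)             # encoded defect-sample feature vector
scene = rng.normal(size=d_hid) * 0.1  # scene factor for this sample
recon, logits = forward(w, scene)
print(recon.shape, logits.shape)
```

In a real stacked self-encoder, this block would repeat per layer, with the antecedent scene factors shaping the encoder and the consequent ones shaping the decoder, as described above.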
In the objective function and loss function of the self-encoding network structure model, V(G, D) is the overall objective function and N is the number of original labels; one term represents the probability P that a defect sample x corresponds to its original label in the output data D(x) after passing through the self-encoding network, and another represents the probability P of the original label in the output data z(x) after a sample x carrying defect knowledge passes through the self-encoding network; D(x) is a conditional probability calculation function, and G(z) gives the probability of the output information y in the applied classification category data under the class model G(z); an indicator term records whether a defect of a given class is present; a, b, w, h and c are the constituent variables of each grid cell during defect detection, where (a, b) is the lower-left corner of the cell, w and h are its width and height, and c is the cell confidence; the coordinate loss of the defect bounding box is the mean square error computed from the position information, the size loss is the absolute mean square error computed from the size information, and the confidence loss is computed by judging whether the cell belongs to the given defect class.
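Since the original loss formulas were rendered as images and did not survive extraction, the following is only a plausible sketch of the three described terms (coordinate loss as mean square error on position, size loss on width and height, confidence loss on c), in the spirit of grid-based detectors; it is not the patent's exact loss:

```python
import numpy as np

def detection_loss(pred, target):
    """pred/target rows: (a, b, w, h, c) per grid cell."""
    pred, target = np.asarray(pred, float), np.asarray(target, float)
    coord = np.mean((pred[:, :2] - target[:, :2]) ** 2)         # coordinate loss (MSE on a, b)
    size = np.mean(np.abs(pred[:, 2:4] - target[:, 2:4]) ** 2)  # size loss on w, h
    conf = np.mean((pred[:, 4] - target[:, 4]) ** 2)            # confidence loss on c
    return coord + size + conf

p = [[0.5, 0.5, 1.0, 1.0, 0.9]]
t = [[0.5, 0.5, 1.0, 1.0, 1.0]]
print(round(detection_loss(p, t), 4))  # 0.01
```

Weighting coefficients between the three terms, which such detectors typically carry, are omitted here because the source does not specify them.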
Preferably, the scene rule output is further trained through the hidden layers of the stacked self-encoder, continuously generating and updating defect scene rules, which are supplemented into the defect scene rule database.
The embodiment of the invention also provides a device for pre-training and recognizing the multi-modal data, which comprises:
the database construction module is used for carrying out multi-source heterogeneous data fusion on the acquired defect basic data and constructing a defect scene rule database;
the scene factor extraction module is used for extracting defect type information, characteristic information and scene information from the defect scene rule database, performing data association and extracting scene factors of the defect scene rule database;
the model generation module is used for constructing a self-coding network structure model carrying defect scene information, integrating the scene factors into the self-coding network structure model, inputting a characteristic vector obtained by coding sample data of various defects, performing matching training of data and rules and generating a modal identification model;
and the defect identification module is used for identifying the defects of the sample to be detected according to the modal identification model.
Preferably, the database construction module is specifically configured to:
performing multi-source heterogeneous data fusion on defect basic data consisting of historical experience data, common rule data and defect standard data to form a defect scene rule database which is associated with defect types, positions and scales;
the defect scene rules database includes: a surface defect data set, a defect rule data set, a detection system data set, and a process scene data set.
Further, the surface defect data set D1= [ surface defect ID, defect geometry, spatial distribution data, defect statistics, defect spectrum data ];
the defect rule data set D2= [ defect rule ID, detection object type, defect classification statistical data, damage mechanism data, defect cause rule, defect grade ];
the detection system data set D3= [ detection system ID, device type, production line design data, technology type ];
the process scene data set D4= [ process scene data ID, detection object type, scene factor, production process ];
the defect geometry includes: point, line and surface defects, boundary, skeleton, shape, position, size, stretch, and translation;
the spatial distribution data includes: entropy, contrast, consistency, and correlation;
the defect statistical data include gray-level co-occurrence matrices, autocorrelation coefficients, mathematical morphology, histogram statistical features, fractal values, and defect spectrum subsets;
the histogram statistical features include range, mean, geometric mean, harmonic mean, standard deviation, variance, and median;
the fractal values include stretch and translation fractal dimensions and porosity;
the defect spectrum subset includes a texture spectrum, a stain spectrum, and a sawtooth spectrum;
the defect classification statistical data are specifically failure modes for automatic defect classification;
the defect grade includes the detection object type;
the detection object types include semiconductors, circuit boards, wafers, fabrics, metal surfaces, and wood;
the scene factors include operation scale and equipment type selection;
the production process includes blank making, grinding, rolling, shearing, bundling, and finished-product forming.
Preferably, the scene factor extraction module is specifically configured to:
extracting defect type information from the surface defect dataset, extracting feature information from the surface defect dataset and the defect rule dataset, and extracting scene information from the inspection system dataset and the process scene dataset;
for the defects, a hierarchical matrix Z × T × R is constructed from the extracted defect type information, feature information, and scene information;
for the defect-feature associated information, a first extraction factor a_ij is used to map and extract from the matrix Z × T an antecedent defect scene factor; all extracted antecedent defect scene factors together form the antecedent scene factor;
for the feature-scene associated information, a second extraction factor b_ij is used to map and extract from the matrix T × R a consequent defect scene factor; all extracted consequent defect scene factors together form the consequent scene factor;
the scene factor is determined from the extracted antecedent and consequent scene factors;
where n is the number of defect classes, j is the feature-vector dimension, Z_ij is an element of the defect matrix, T_ij is an element of the feature-information matrix, and R_ij (i = 1, 2, ..., n) is an element of the scene-information matrix; the extraction factors a_ij and b_ij take the value 0 where the corresponding association is absent and a nonzero value where it is present.
Preferably, the model generation module is specifically configured to:
applying the antecedent scene factors in the scene factors to an encoder of the self-encoding network structure model to extract effective features;
applying the latter scene factor in the scene factors to a decoder of the self-coding network structure model to generate rules;
inputting a feature vector W obtained by encoding sample data of various defects; drawing on the idea of residual networks, the scene factor is introduced into the structure of the basic operation block during stacking, so that it is embedded in the hierarchical structure of the stacked self-encoding network; decoding then outputs a scene rule output [type, feature, scene];
the scene rules are output through a semi-supervised stacked self-encoder; a classifier is added at the decoding stage to realize the classification function, the classifier of the self-encoding network structure model is optimized through matching training of data against rules, and the modal recognition model is generated.
In the objective function and loss function of the self-encoding network structure model, V(G, D) is the overall objective function and N is the number of original labels; one term represents the probability P that a defect sample x corresponds to its original label in the output data D(x) after passing through the self-encoding network, and another represents the probability P of the original label in the output data z(x) after a sample x carrying defect knowledge passes through the self-encoding network; D(x) is a conditional probability calculation function, and G(z) gives the probability of the output information y in the applied classification category data under the class model G(z); an indicator term records whether a defect of a given class is present; a, b, w, h and c are the constituent variables of each grid cell during defect detection, where (a, b) is the lower-left corner of the cell, w and h are its width and height, and c is the cell confidence; the coordinate loss of the defect bounding box is the mean square error computed from the position information, the size loss is the absolute mean square error computed from the size information, and the confidence loss is computed by judging whether the cell belongs to the given defect class.
Further, the scene rule output is continuously generated and updated through hidden-layer training of the stacked self-encoder and supplemented into the defect scene rule database.
The invention also provides a computer-readable storage medium, which includes a stored computer program, wherein when the computer program runs, the device where the computer-readable storage medium is located is controlled to execute the multimodal data pre-training and recognition method as described in any one of the above embodiments.
The invention further provides a terminal device, which comprises a processor, a memory and a computer program stored in the memory and configured to be executed by the processor, wherein the processor executes the computer program to implement the multimodal data pre-training and recognition method as described in any one of the above embodiments.
The invention provides a multi-modal data pre-training and recognition method, device, equipment and medium. A defect scene rule database is constructed by performing multi-source heterogeneous data fusion on acquired basic defect data; defect type information, feature information and scene information are extracted from the defect scene rule database, data association is performed, and the scene factors of the database are extracted; a self-encoding network structure model carrying defect scene information is constructed, the scene factors are integrated into it, feature vectors obtained by encoding sample data of various defects are input, matching training of data against rules is performed, and a modal recognition model is generated; defects of a sample to be detected are then identified with the modal recognition model. This improves product defect detection accuracy and model robustness.
Drawings
FIG. 1 is a schematic flow chart of a multi-modal data pre-training and recognition method according to an embodiment of the present invention;
FIG. 2 is a flow chart of a method for pre-training and recognizing multi-modal data according to another embodiment of the present invention;
FIG. 3 is a schematic structural diagram of a multi-modal data pre-training and recognition apparatus according to an embodiment of the present invention;
FIG. 4 is a schematic structural diagram of a terminal device according to an embodiment of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
The embodiment of the invention provides a multi-modal data pre-training and recognition method. Referring to FIG. 1, a schematic flow chart of the multi-modal data pre-training and recognition method provided by the embodiment of the invention, the method includes the following steps S1 to S4:
s1, performing multi-source heterogeneous data fusion on acquired defect basic data to construct a defect scene rule database;
s2, extracting defect type information, characteristic information and scene information from the defect scene rule database, performing data association, and extracting scene factors of the defect scene rule database;
s3, constructing a self-coding network structure model carrying defect scene information, integrating the scene factors into the self-coding network structure model, inputting feature vectors obtained by coding sample data of various defects, performing matching training of data and rules, and generating a modal identification model;
and S4, identifying the defects of the sample to be detected according to the modal identification model.
In specific implementation, basic defect data are collected, namely historical defect data of the sample to be detected. Multi-source heterogeneous data for defect detection are fused, and through this fusion a basic defect scene rule database is constructed that contains static defect representation, dynamic defect evolution, defect classification, defect-scene rules and related information;
scene factors are refined from the defect scene rule database; together they build a three-dimensional vector matrix containing defect type information, feature information and scene information. Applying this matrix as a constraint forces the self-encoder to decide which parts of the input data should be retained and optimized and which should be discarded, so the self-encoder learns the effective features of the data and drops irrelevant ones, generating more defect scene rules; data association is performed and the scene factors of the defect scene rule database are extracted;
the construction of a scene rule knowledge base based on a semi-supervised self-encoding network is studied; a stacked self-encoding network structure carrying defect scene information is designed, and scene factors are introduced so that they are embedded in the hierarchical structure of the stacked self-encoding network; feature vectors obtained by encoding sample data of various defects are input, matching training of data against rules is performed, and a modal recognition model is generated;
defects of the sample are then identified with the generated modal recognition model.
Under low defect sampling rates and unbalanced samples, the method combines the production process scene and fuses material characteristics, manufacturing process data, and sub-pixel features of high-resolution defect images. A scene rule knowledge base is constructed through sample generation based on material process data, sub-pixel feature encoding of high-resolution defect images, and deep-learning classification. The self-encoding network handles the various mapping relations in small-sample defect data well and performs feature encoding and knowledge modeling, addressing the core problems of difficult defect recognition and classification and weak robustness under complex backgrounds such as occlusion, oxidation and vibration, as well as the large volume of images to be detected, the low computational efficiency of deep-learning methods, and the difficulty of tracing defect origins.
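Steps S1 to S4 can be tied together as a high-level pipeline. In the sketch below, every function name and data shape is an illustrative placeholder for the corresponding step, not an API defined by the text:

```python
# Hypothetical end-to-end sketch of steps S1-S4.

def build_rule_database(defect_base_data):
    """S1: fuse multi-source heterogeneous data into a rule database."""
    return {"records": list(defect_base_data)}

def extract_scene_factors(rule_db):
    """S2: associate type/feature/scene info and extract scene factors."""
    return [r.get("scene") for r in rule_db["records"]]

def train_modal_model(rule_db, scene_factors):
    """S3: train the scene-factor-aware self-encoding network (stub)."""
    return {"rules": rule_db, "factors": scene_factors}

def identify_defects(model, sample):
    """S4: classify a sample with the trained modal model (stub)."""
    return "defect" if sample.get("scene") in model["factors"] else "normal"

db = build_rule_database([{"scene": "wafer-grinding"}])
model = train_modal_model(db, extract_scene_factors(db))
print(identify_defects(model, {"scene": "wafer-grinding"}))  # defect
```

The stubs stand in for the database fusion, matrix extraction, and network training detailed in the earlier sections; only the control flow between the four steps is being illustrated.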
In another embodiment provided by the present invention, the step S1 specifically includes:
performing multi-source heterogeneous data fusion on defect basic data consisting of historical experience data, common rule data and defect standard data to form a defect scene rule database which is associated with defect types, positions and scales;
the defect scene rules database includes: a surface defect data set, a defect rule data set, a detection system data set, and a process scene data set.
When the embodiment is implemented specifically, the sources of the defect basic data include historical experience data, common rule data and defect standard data, and the historical experience data is specifically historical data of judgment of an expert on the defect;
the defects of common industrial products are mainly as follows: the method is characterized in that the method comprises the following steps of detecting defects such as lines, scratches, oil stains, points, shadows, textures, sawteeth and the like, representing the defects in an image in another form during defect detection, combining the common data representation condition of the defect image, and combining the characteristics of business activities with scene analysis, wherein the links of detected industrial products in business belong to, and the defects have important influence on scene judgment formed by defect detection. Finally, forming a defect scene rule database which is associated with the defect scene, the defect type, the defect position and the defect scale through association of each data set;
the defect scene rule database comprises a surface defect data set, a defect rule data set, a detection system data set and a process scene data set.
Accurate defect identification is realized by classifying and associating the complex backgrounds, such as occlusion, oxidation and vibration, present in micron-scale visual-image defect detection.
In yet another embodiment provided by the present invention, the surface defect data set D1= [ surface defect ID, defect geometry, spatial distribution data, defect statistics, defect spectral data ];
the defect rule data set D2= [ defect rule ID, detection object type, defect classification statistical data, damage mechanism data, defect cause rule, defect grade ];
the detection system data set D3= [ detection system ID, device type, production line design data, technology model selection ];
the process scene data set D4= [ process scene data ID, detection object type, scene factor, production process ];
the defect geometry comprises: point, line and surface defects, boundary, skeleton, shape, position, size, stretch, and translation;
the spatial distribution data includes: entropy, contrast, consistency and correlation;
the defect statistical data comprise gray level co-occurrence matrixes, autocorrelation coefficients, mathematical morphology, histogram statistical characteristics, fractal values and defect frequency spectrum subsets;
the histogram statistical features include range, mean, geometric mean, harmonic mean, standard deviation, variance, and median;
the fractal values comprise stretching and translation fractal dimension and porosity;
the defect frequency spectrum subset comprises a texture spectrum, a stain spectrum and a sawtooth spectrum;
the defect classification statistical data specifically refers to the failure modes into which defects are automatically classified;
the defect grade comprises the detection object type;
the detection object types comprise semiconductors, circuit boards, wafers, fabrics, metal surfaces and woods;
the scene factors comprise operation scale and equipment type selection;
the production process comprises the steps of blank making, grinding, rolling, shearing, bundling and finished product forming.
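The four data sets D1–D4 listed above can be sketched as record types. The sketch below is illustrative only: the class and field names are hypothetical renderings of the listed fields, and the join logic (by shared object type and geometry fields) is an assumption about how the association with defect type, position and scale might be realized.

```python
from dataclasses import dataclass

@dataclass
class SurfaceDefect:        # data set D1
    surface_defect_id: str
    geometry: dict          # point/line/surface defect, boundary, shape, position, size, ...
    spatial_distribution: dict
    statistics: dict
    spectrum: dict

@dataclass
class DefectRule:           # data set D2
    rule_id: str
    object_type: str        # e.g. "circuit board"
    defect_grade: str

@dataclass
class DetectionSystem:      # data set D3
    system_id: str
    device_type: str

@dataclass
class ProcessScene:         # data set D4
    scene_id: str
    object_type: str
    scene_factors: dict     # operation scale, equipment selection
    production_step: str    # blank making, grinding, rolling, ...

def build_rule_record(d1, d2, d3, d4):
    """Fuse one entry of each data set into a scene-rule record
    associated with defect type, position and scale."""
    return {
        "defect_type": d2.object_type,
        "position": d1.geometry.get("position"),
        "scale": d1.geometry.get("size"),
        "grade": d2.defect_grade,
        "device": d3.device_type,
        "scene": d4.scene_factors,
    }

rec = build_rule_record(
    SurfaceDefect("SD-001", {"position": (10, 20), "size": 3.5}, {}, {}, {}),
    DefectRule("DR-001", "circuit board", "minor"),
    DetectionSystem("DS-001", "line-scan camera"),
    ProcessScene("PS-001", "circuit board", {"operation_scale": "batch"}, "rolling"),
)
```

In practice such records would be persisted in the defect scene rule database and keyed by the IDs shown in the data set definitions.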
In this embodiment, the surface defect data set specifically includes defect geometric features (point, line and surface defects, boundary, skeleton, shape, position, size, stretch, translation), spatial distribution data (entropy, contrast, consistency and correlation), defect statistics (gray level co-occurrence matrix, autocorrelation coefficient, mathematical morphology, histogram statistics (range, mean, geometric mean, harmonic mean, standard deviation, variance and median) and fractal values (stretch and translation fractal dimension, and porosity)), and defect spectrum data (texture spectrum, stain spectrum and sawtooth spectrum).
The entropy reflects the randomness of the image pixels: the larger the entropy, the coarser the texture. Contrast refers to the average difference between bright and dark in the defect scene image; consistency refers to the degree of quantitative consistency across the batch of images; correlation refers to the degree of correlation of the acquired image with the detected scene. In general, these specific data sets, which are really the detection data sets of the image data, classify the image data from different angles to form different subsets, so as to facilitate image processing and recognition.
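Entropy, contrast and correlation as used here can be derived from a gray-level co-occurrence matrix. The sketch below, assuming a single horizontal pixel offset and an input image normalized to [0, 1], is a minimal NumPy illustration (real pipelines average several offsets and angles); the function name and quantization level are assumptions:

```python
import numpy as np

def glcm_features(img, levels=8):
    """Entropy, contrast and correlation from a horizontal gray-level
    co-occurrence matrix (illustrative single-offset version)."""
    q = np.clip((img * levels).astype(int), 0, levels - 1)  # quantize [0,1] image
    glcm = np.zeros((levels, levels))
    for a, b in zip(q[:, :-1].ravel(), q[:, 1:].ravel()):   # offset (0, 1)
        glcm[a, b] += 1
    p = glcm / glcm.sum()
    i, j = np.indices(p.shape)
    entropy = -np.sum(p[p > 0] * np.log2(p[p > 0]))         # randomness of pixel pairs
    contrast = np.sum(p * (i - j) ** 2)                     # local bright/dark difference
    mi, mj = np.sum(i * p), np.sum(j * p)
    si = np.sqrt(np.sum(p * (i - mi) ** 2))
    sj = np.sqrt(np.sum(p * (j - mj) ** 2))
    correlation = np.sum(p * (i - mi) * (j - mj)) / (si * sj + 1e-12)
    return entropy, contrast, correlation

rng = np.random.default_rng(0)
e_noise, c_noise, _ = glcm_features(rng.random((64, 64)))   # noisy (coarse) patch
e_flat, c_flat, _ = glcm_features(np.full((64, 64), 0.5))   # uniform (defect-free) patch
```

As the text states, the uniform patch yields lower entropy and contrast than the noisy one, which is exactly the discriminative signal the spatial distribution subset stores.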
The defect rule data set includes defect classification statistics (defects are automatically classified into corresponding failure modes), damage mechanism data, defect cause rules, and defect classes (inspection object types (semiconductor, circuit board, wafer, fabric, metal surface, wood, etc.)). The detection system data set comprises equipment types, production line design data and technology model selection;
the process scene data includes inspection object type (semiconductor, circuit board, wafer, fabric, metal surface, wood, etc.), scene factor (operation scale, equipment selection), production process (blank making, grinding, rolling, cutting, bundling, finished product, etc.).
Respectively expressing a surface defect data set, a defect rule data set, a detection system data set and a process scene data set in a data set form as follows:
a surface defect data set D1= [ surface defect ID, defect geometry, spatial distribution data, defect statistics, defect spectral data ];
defect geometry subset = [ surface defect ID, defect geometry ID, point, line, plane defect, boundary, bone, shape, position, size, stretch, translation ];
spatial distribution subset = [ surface defect ID, spatial distribution ID, entropy, contrast, consistency, correlation ];
defect statistics subset = [ surface defect ID, defect statistics ID, gray level co-occurrence matrix, autocorrelation coefficient, mathematical morphology, histogram statistical characteristics, fractal value ];
the defect statistical subset refers to data values obtained by statistically calculating the defect data. Although they do not directly describe the defect characteristics, mastering the statistical distribution of the characteristics helps analyze the relationship between defect types and common characteristics. This subset intersects with the D2 data set: these statistics are eventually associated with defect rules, which makes defect scene rules easier to form.
Histogram statistical feature subset = [ surface defect ID, defect statistics ID, histogram statistics ID, range, mean, geometric mean, harmonic mean, standard deviation, variance, and median ];
fractal value subset = [ surface defect ID, defect statistics ID, fractal value ID, fractal dimension for stretching, translation, and porosity characteristics ];
the fractal value reflects the stretching and deformation degree of defects; improper application of the process level during product manufacturing often causes overall stretching of components, leading to industrial gap defects and the like.
Defect spectrum subset = [ surface defect ID, defect spectrum ID, texture spectrum, stain spectrum, sawtooth spectrum ];
the defect frequency spectrum refers to the spectral characteristics exhibited by the defect image; texture, stain and sawtooth form different spectral characteristics, and this data set holds the spectral characteristics of the textures, stains and sawteeth collected during image defect detection.
The defect rule data set D2= [ defect rule ID, detection object type, defect classification statistical data, damage mechanism data, defect cause rule, defect grade ];
the equipment type refers to the detection equipment, and the detection object type refers to the object being inspected, such as PCB detection, steel detection, chip detection or mobile phone accessory detection. Different detection objects have different detection scenes.
The detection system data set D3= [ detection system ID, device type, production line design data, technology model selection ];
the process scenario data set D4= [ process scenario data ID, detection object type, scenario factor, production procedure ].
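The histogram statistical feature subset defined above (range, mean, geometric mean, harmonic mean, standard deviation, variance, median) can be computed directly from pixel gray levels. A minimal sketch, assuming positive gray values (the geometric and harmonic means require them) and a hypothetical function name:

```python
import numpy as np

def histogram_statistics(gray):
    """Histogram statistical feature subset of pixel gray levels:
    range, mean, geometric mean, harmonic mean, std, variance, median."""
    g = np.asarray(gray, dtype=float).ravel()
    g = g[g > 0]                      # geometric/harmonic means need positive values
    return {
        "range": g.max() - g.min(),
        "mean": g.mean(),
        "geometric_mean": np.exp(np.log(g).mean()),
        "harmonic_mean": len(g) / np.sum(1.0 / g),
        "std": g.std(),
        "variance": g.var(),
        "median": np.median(g),
    }

feats = histogram_statistics([32, 64, 64, 128, 200])
```

Note the standard inequality harmonic mean ≤ geometric mean ≤ arithmetic mean holds for any such patch, which is a quick sanity check when populating the subset.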
In another embodiment provided by the present invention, the step S2 specifically includes:
extracting defect type information from the surface defect dataset, extracting feature information from the surface defect dataset and the defect rule dataset, and extracting scene information from the inspection system dataset and the process scene dataset;
for the defect Z, constructing a layered matrix Z multiplied by T multiplied by R according to the extracted defect type information, the extracted characteristic information and the extracted scene information;
for the defect-feature associated information, a first extraction factor a_ij is used to map and extract from the matrix Z × T the antecedent defect scene factors x_i; all extracted antecedent defect scene factors x_i form the antecedent scene factor X = [x_1, x_2, ..., x_n];
for the feature-scene associated information, a second extraction factor b_ij is used to map and extract from the matrix T × R the consequent defect scene factors y_i; all extracted consequent defect scene factors y_i form the consequent scene factor Y = [y_1, y_2, ..., y_n];
determining the scene factor according to the extracted antecedent scene factor and the extracted consequent scene factor;
wherein n is the number of defect classes, j is the feature-vector dimension, Z_ij is the value of an element in the defect matrix, T_ij is the value of an element in the characteristic information matrix, and R_ij is the value of an element in the scene information matrix, i = 1, 2, ..., n; when Z_ij × T_ij = 0, a_ij = 0, otherwise a_ij ≠ 0; when T_ij × R_ij = 0, b_ij = 0, otherwise b_ij ≠ 0.
When the method is implemented specifically, scene factors are extracted according to a basic knowledge base, the scene factors are constructed into a three-dimensional vector matrix containing types, characteristics and scenes together, and the matrix constraint is applied to force the self-encoder to consider which parts of input data need to be optimized and copied and which parts need to be discarded, so that the self-encoder can learn the effective characteristics of the data and discard irrelevant characteristics, and further more defect scene rules are generated.
And finally forming a three-dimensional vector matrix containing type information, characteristic information and scene information after carrying out data cleaning, data association and conversion on the defect scene rule database.
Extracting defect type information from the surface defect data set D1; extracting feature information from a surface defect data set D1 and the defect rule data set D2; extracting scene information from the detection system dataset D3 and the process scene dataset D4;
for the defect Z, it can be expressed as a matrix Z = [Z_ij]; the characteristic information can be expressed as a matrix T = [T_ij]; the scene information can be expressed as a matrix R = [R_ij]; finally, a Z × T × R hierarchical matrix is formed.
Wherein n is the number of defect categories and j is the dimension of the feature vector of a sample; for example, for the defect Z, the surface defect data set D1 and the defect rule data set D2 represent the feature information, and if the fields of D1 and D2 sum to 11, j ranges from 1 to 11;
Z_ij is the value of an element in the defect matrix, T_ij is the value of an element in the characteristic information matrix, and R_ij is the value of an element in the scene information matrix, i = 1, 2, ..., n;
for the defect-feature associated information, mapping information is extracted from Z × T, and the first extraction factor a_ij (from defect to feature) extracts the antecedent defect scene factor x_i;
wherein x_i is a staged symbol used in the calculation process; when Z_ij × T_ij = 0, a_ij = 0; otherwise a_ij ≠ 0;
for the feature-scene associated information, mapping information is extracted from T × R, and the second extraction factor b_ij (from feature to scene) extracts the consequent defect scene factor y_i;
wherein y_i is a staged symbol used in the calculation process; when T_ij × R_ij = 0, b_ij = 0; otherwise b_ij ≠ 0;
scene factor = [ antecedent scene factor X, consequent scene factor Y ].
The antecedent scene factor represents the information associating defects with features; applied before the encoder, it guides effective feature extraction and reduces sample noise.
The consequent scene factor represents the information associating features with scenes; applied after the decoder and before rule generation, it guides rule generation and filters invalid rules.
In another embodiment provided by the present invention, the step S3 specifically includes: applying the antecedent scene factor among the scene factors to the encoder of the self-coding network structure model to perform effective feature extraction;
applying the consequent scene factor among the scene factors to the decoder of the self-coding network structure model to generate rules;
inputting a feature vector W obtained by coding sample data of various defects; drawing on the idea of residual networks, a scene factor is introduced into the basic operation block structure at each superposition, so that the scene factor is hidden in the hierarchical structure of the stacked self-coding network structure model; decoding and outputting then yields the scene rule output [type, feature, scene];
outputting the scene rules through a semi-supervised stacking self-encoder, adding a classifier in a decoding stage to realize a classification function, optimizing the self-encoding network structure model classifier through matching training of data and rules, and generating the modal identification model.
In the specific implementation of the present embodiment, referring to fig. 2, a schematic flow chart of a multi-modal data pre-training and recognition method according to another embodiment of the present invention is shown;
in fig. 2, a scene rule knowledge base construction based on a semi-supervised self-coding network is studied, and a stacked self-coding network structure carrying defect scene information is designed;
applying the antecedent scene factors (associating defects and features) among the scene factors to the encoder of the self-coding network structure model to extract effective features; applying the consequent scene factors (associating features and scenes) to the decoder of the self-coding network structure model to generate rules, so that the scene factors are hidden in the hierarchical structure of the stacked self-coding network; after stacking, coding structures and classification feature information of each category are added so that the constructed model has the functions of modal identification and scene prejudgment;
firstly, in the stacked self-coding network, the encoder and the decoder form a symmetric structural model, and the basic operation block structure of the network is designed in the coding network. Drawing on the idea of residual networks, a scene factor is introduced into the basic operation block structure at each superposition, so that the scene factor is hidden in the hierarchical structure of the stacked self-coding network;
the input sample data X1–Xi is preprocessed into sample data W1–Wi, which forms the feature vector W input to the self-coding network structure model; with the scene factor introduced at each superposition of the basic operation block, the scene factor stays hidden in the hierarchical structure of the stack, and decoding outputs the scene rule output [type, feature, scene];
outputting the scene rules through a semi-supervised stacking self-encoder, adding a classifier in a decoding stage to realize a classification function, optimizing the self-encoding network structure model classifier through matching training of data and rules, and generating the modal identification model.
A modal identification and scene prejudgment method based on a semi-supervised self-coding network first constructs a basic defect scene knowledge base containing static defect representation, dynamic defect evolution, defect classification and defect-scene rules through multi-source heterogeneous data fusion. Then, based on the self-coding network, scene factors are introduced and fused into the stacked self-coding network; by learning data samples, a class of data samples is encoded into feature vectors, the mapping from a class of image space to a latent space is learned, feature models of various types, positions and degrees are generated, and matching training of data and rules is performed. By constructing and applying the defect scene knowledge base, the defect detection model gains the function of scene prejudgment and can infer the cause of a defect from the defect information, which helps the production line design and process optimization of industrial defect products.
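The stacked self-encoder with residual-style scene-factor injection can be sketched in a few lines of NumPy. This is a toy forward pass only, with assumed layer sizes, zero scene factors and no training loop; the class and method names are hypothetical, and the additive injection is one plausible reading of "introducing a scene factor during superposition":

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

class SceneStackedAE:
    """Stacked autoencoder with symmetric encoder/decoder; a scene factor
    is added residual-style at each superposition, so it stays hidden
    in the hierarchical structure of the stack (illustrative sketch)."""
    def __init__(self, dims, seed=0):
        rng = np.random.default_rng(seed)
        # e.g. dims = [16, 8, 4]: encoder 16->8->4, decoder mirrors it
        self.enc = [rng.normal(0, 0.1, (a, b)) for a, b in zip(dims[:-1], dims[1:])]
        self.dec = [w.T.copy() for w in reversed(self.enc)]

    def encode(self, w, scene):
        h = w
        for W, s in zip(self.enc, scene):   # antecedent factors guide the encoder
            h = sigmoid(h @ W) + s          # residual-style scene injection
        return h

    def decode(self, h, scene):
        for W, s in zip(self.dec, scene):   # consequent factors guide rule output
            h = sigmoid(h @ W) + s
        return h

dims = [16, 8, 4]
ae = SceneStackedAE(dims)
w = np.random.default_rng(2).random((5, 16))      # coded sample features W (5 samples)
antecedent = [np.zeros(d) for d in dims[1:]]      # per-layer antecedent factors
consequent = [np.zeros(d) for d in dims[-2::-1]]  # per-layer consequent factors
rule = ae.decode(ae.encode(w, antecedent), consequent)  # -> scene rule output
```

With non-zero scene factors the same shapes apply; training would add a reconstruction loss and, per the text, a classifier head at the decoding stage.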
In another embodiment provided by the present invention, the objective function of the self-coding network structure model is specifically:

V(G, D) = (1/N) Σ_{i=1}^{N} [ P1(x_i) · log D(x_i) + P2(x_i) · log(1 − D(G(z_i))) ]

the loss function of the self-coding network structure model is specifically:

Loss = Σ_{i=1}^{N} I_obj [ (a_i − â_i)² + (b_i − b̂_i)² ] + Σ_{i=1}^{N} I_obj [ |w_i − ŵ_i| + |h_i − ĥ_i| ] + Σ_{i=1}^{N} I_obj (c_i − ĉ_i)²

wherein V(G, D) is the whole defined objective function; N is the number of original labels; P1(x) represents the probability P of the original label in the output data(x) of the defect sample x after passing through the self-coding network; P2(x) represents the probability P of the original label in the output data_z(x) after the sample x carrying defect knowledge passes through the self-coding network; D(X) is a conditional probability calculation function; G(z) is the probability of comparing the output information y in the applied classification category data under the condition of the category model G(z); I_obj represents the presence or absence of a class defect in a grid; a, b, w, h and c are the constituent variables of each grid during defect detection, a and b being the point at the lower-left corner of the grid, w and h the width and height of the grid, and c the confidence of the grid. The first sum computes the mean square error of the position information, representing the coordinate loss of the defect bounding box; the second sum computes the absolute error of the size information, representing the size loss of the defect bounding box; the third sum computes the confidence loss by judging whether the defect belongs to the class.
In the specific implementation of this embodiment, the objective function designed when the self-coding network structure model carrying defect scene information is applied to classification and identification is the function V(G, D). It is calculated from the angle of maximum contribution and yields the conditional probability calculation function D(X) that improves the adversarial network equation. The function divides into three parts. The first part reflects the objective function calculation of the encoding stage; this stage, like the whole function, is pushed to be as large as possible so as to obtain the most representative feature information. The second part is the decoding stage; its output value is required to be as small as possible while the whole equation remains as large as possible, so that the decoding difference stays small. The third part covers target classification and identification: G(z) is the probability of comparing the output information y under the condition of the category model G(z), which represents the accuracy of classification. P1(x) is the probability of the original label in the output data(x) of the defect sample x after passing through the self-coding network, and P2(x) is the probability of the original label in the output data_z(x) after the sample x carrying defect knowledge passes through the self-coding network; the center point is estimated from these quantities.
The loss function combines three terms. In it, a, b, w, h and c are the constituent variables of each grid during defect detection; N is the number of original labels; a and b are the point at the lower-left corner of the grid, w and h are the width and height of the grid, and c is the confidence of the grid. The mean square error of the position information gives the coordinate loss of the defect bounding box; the absolute error of the size information gives the size loss of the defect bounding box; and the confidence loss is computed by judging whether the defect belongs to the class.
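The three loss terms just described (coordinate loss, size loss, confidence loss over the grid variables a, b, w, h, c) can be computed numerically. A minimal sketch, assuming unit term weights, a per-cell object mask, and hypothetical variable layout (the patent's exact formulation is not recoverable from the text):

```python
import numpy as np

def detection_loss(pred, target, obj_mask):
    """Grid-cell loss with the three terms described above: MSE on the
    (a, b) corner coordinates, absolute error on the (w, h) size, and
    squared confidence error on c, counted only where a defect exists."""
    a, b, w, h, c = (pred[..., k] for k in range(5))
    ta, tb, tw, th, tc = (target[..., k] for k in range(5))
    coord = np.sum(obj_mask * ((a - ta) ** 2 + (b - tb) ** 2))   # coordinate loss
    size = np.sum(obj_mask * (np.abs(w - tw) + np.abs(h - th)))  # size loss
    conf = np.sum(obj_mask * (c - tc) ** 2)                      # confidence loss
    return coord + size + conf

pred = np.zeros((3, 3, 5))                   # 3x3 grid, (a, b, w, h, c) per cell
target = np.zeros((3, 3, 5))
target[1, 1] = [0.5, 0.5, 0.2, 0.2, 1.0]     # one defect in the center cell
mask = np.zeros((3, 3))
mask[1, 1] = 1.0                             # I_obj: defect present only there
loss = detection_loss(pred, target, mask)
```

Only the masked cell contributes, so the loss decomposes as 0.5 (coordinates) + 0.4 (size) + 1.0 (confidence).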
In another embodiment of the present invention, the scene rule output is further processed through hidden-layer training of the stacked self-encoder, continuously generating and updating defect scene rules, which are supplemented into the defect scene rule database.
In the specific implementation of this embodiment, a classifier added in the decoding stage of the semi-supervised stacked self-encoder gives the decoder output a classification function; through the hidden-layer training of the stacked self-encoder, defect-scene rule knowledge is continuously generated, updated, and supplemented into the defect scene rule database, further perfecting the knowledge base of defect-scene mapping rules.
In the specific implementation of this embodiment, referring to fig. 2, the scene rule knowledge base is supplemented with the rules produced at the decoder output: the scene factor is extracted through the previous consequent factor [Yi−1], the scene hierarchical matrix of the self-coding network structure model is updated, and the vector matrix [Yi] of the extracted scene factor is also supplemented into the input feature vector;
the stacking is performed in the form of a scene factor structure. In the stacking substructure, the antecedent scene factors are merged into a first layer of training, and the consequent scene factors are merged into a second layer of training; the usage is the same, one is threshold value use, and the other is weight amplification; threshold value use means that an activation function is influenced, on the basis of original full connection, through matrix entry verification of antecedent/consequent scene factors, defect features with an excessively small threshold value can be directly discarded, so that excessive feature/scene information is prevented, and finally overfitting can be prevented in application; on the other hand, effective features are further amplified, so that the phenomenon of gradient disappearance easily generated by deep learning can be prevented, and the loss of the effective features is prevented. Through the two aspects, the rules formed by the training of the stacked self-coding network are more suitable for defect scenes.
In another embodiment provided by the present invention, referring to fig. 3, it is a schematic structural diagram of a multi-modal data pre-training and recognition apparatus provided by the embodiment of the present invention, the apparatus includes:
the database construction module is used for carrying out multi-source heterogeneous data fusion on the acquired defect basic data and constructing a defect scene rule database;
the scene factor extraction module is used for extracting defect type information, characteristic information and scene information from the defect scene rule database, performing data association and extracting scene factors of the defect scene rule database;
the model generation module is used for constructing a self-coding network structure model carrying defect scene information, integrating the scene factors into the self-coding network structure model, inputting a characteristic vector obtained by coding sample data of various defects, performing matching training of data and rules and generating a modal identification model;
and the defect identification module is used for identifying the defects of the sample to be detected according to the modal identification model.
It should be noted that the multi-modal data pre-training and recognition apparatus provided in the embodiment of the present invention can perform the multi-modal data pre-training and recognition method described in any embodiment of the above embodiments, and specific functions of the multi-modal data pre-training and recognition apparatus are not described in detail herein.
Fig. 4 is a schematic structural diagram of a terminal device according to an embodiment of the present invention. The terminal device of this embodiment includes: a processor, a memory, and a computer program, such as a multimodal data pre-training and recognition program, stored in the memory and executable on the processor. When the processor executes the computer program, the steps in each of the above embodiments of the method for pre-training and recognizing multimodal data, such as steps S1 to S5 shown in fig. 1, are implemented. Alternatively, the processor implements the functions of the modules in the above device embodiments when executing the computer program.
Illustratively, the computer program may be partitioned into one or more modules/units that are stored in the memory and executed by the processor to implement the invention. The one or more modules/units may be a series of computer program instruction segments capable of performing specific functions, which are used for describing the execution process of the computer program in the terminal device. For example, the computer program may be divided into modules, and the specific functions of the modules are not described again.
The terminal device can be a desktop computer, a notebook, a palm computer, a cloud server and other computing devices. The terminal device may include, but is not limited to, a processor, a memory. It will be appreciated by those skilled in the art that the schematic diagram is merely an example of a terminal device and does not constitute a limitation of a terminal device, and may include more or less components than those shown, or combine certain components, or different components, for example, the terminal device may also include input output devices, network access devices, buses, etc.
The Processor may be a Central Processing Unit (CPU), other general purpose Processor, a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field Programmable Gate Array (FPGA) or other Programmable logic device, discrete Gate or transistor logic, discrete hardware components, etc. The general-purpose processor may be a microprocessor or the processor may be any conventional processor or the like, which is the control center of the terminal device and connects the various parts of the whole terminal device using various interfaces and lines.
The memory may be used for storing the computer programs and/or modules, and the processor may implement various functions of the terminal device by executing the computer programs and/or modules stored in the memory and calling data stored in the memory. The memory may mainly include a storage program area and a storage data area, wherein the storage program area may store an operating system and the application programs required by at least one function (such as a sound playing function, an image playing function, etc.); the storage data area may store data (such as audio data, a phonebook, etc.) created according to the use of the device. In addition, the memory may include high-speed random access memory, and may also include non-volatile memory, such as a hard disk, a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) Card, a Flash memory Card (Flash Card), at least one magnetic disk storage device, a Flash memory device, or other non-volatile solid-state storage device.
The terminal device's integrated modules/units, if implemented in the form of software functional units and sold or used as independent products, may be stored in a computer-readable storage medium. Based on this understanding, all or part of the flow of the methods of the embodiments of the present invention may also be implemented by a computer program, which may be stored in a computer-readable storage medium; when the computer program is executed by a processor, the steps of the method embodiments are implemented. The computer program comprises computer program code, which may be in source code form, object code form, an executable file, or some intermediate form. The computer-readable medium may include: any entity or device capable of carrying the computer program code, a recording medium, a USB flash disk, a removable hard disk, a magnetic disk, an optical disk, a computer memory, a Read-Only Memory (ROM), a Random Access Memory (RAM), an electrical carrier wave signal, a telecommunications signal, a software distribution medium, and the like.
It should be noted that the above-described device embodiments are merely illustrative, where the units described as separate parts may or may not be physically separate, and the parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on multiple network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of the present embodiment. In addition, in the drawings of the embodiment of the apparatus provided by the present invention, the connection relationship between the modules indicates that there is a communication connection between them, and may be specifically implemented as one or more communication buses or signal lines. One of ordinary skill in the art can understand and implement it without inventive effort.
While the foregoing is directed to the preferred embodiment of the present invention, it will be understood by those skilled in the art that various changes and modifications may be made without departing from the spirit and scope of the invention.
Claims (10)
1. A multi-modal data pre-training and recognition method, the method comprising:
performing multi-source heterogeneous data fusion on acquired defect basic data, and constructing a defect scene rule database;
extracting defect type information, characteristic information and scene information from the defect scene rule database, performing data association, and extracting scene factors of the defect scene rule database;
constructing a self-coding network structure model carrying defect scene information, integrating the scene factors into the self-coding network structure model, inputting feature vectors obtained by coding sample data of various defects, performing matching training of data and rules, and generating a modal identification model;
and identifying the defects of the sample to be detected according to the modal identification model.
2. The multi-modal data pre-training and recognition method of claim 1, wherein the multi-source heterogeneous data fusion is performed on the acquired defect base data to construct a defect scene rule database, specifically comprising:
performing multi-source heterogeneous data fusion on defect basic data consisting of historical experience data, common rule data and defect standard data to form a defect scene rule database which is associated with defect types, positions and scales;
the defect scene rules database includes: a surface defect data set, a defect rule data set, a detection system data set, and a process scene data set.
3. The method of claim 2, wherein the surface defect dataset D1= [ surface defect ID, defect geometry, spatial distribution data, defect statistics, defect spectral data ];
the defect rule data set D2= [ defect rule ID, detection object type, defect classification statistical data, damage mechanism data, defect cause rule, defect grade ];
the detection system data set D3= [ detection system ID, device type, production line design data, technology model selection ];
the process scene data set D4= [ process scene data ID, detection object type, scene factor, production process ];
the defect geometry includes: point, line, and surface defects, boundaries, skeletons, shapes, positions, sizes, stretching, and translation;
the spatial distribution data includes: entropy, contrast, homogeneity, and correlation;
the defect statistical data comprise gray level co-occurrence matrixes, autocorrelation coefficients, mathematical morphology, histogram statistical characteristics, fractal values and defect frequency spectrum subsets;
the histogram statistical features include range, mean, geometric mean, harmonic mean, standard deviation, variance, and median;
the fractal values include the fractal dimension under stretching and translation, and the porosity;
the defect spectrum subset comprises a texture spectrum, a stain spectrum, and a sawtooth spectrum;
the defect classification statistical data are specifically the failure modes obtained by automatic defect classification;
the defect level comprises the detection object type;
the detection object types comprise semiconductors, circuit boards, wafers, fabrics, metal surfaces, and wood;
the scene factors comprise operation scale and equipment type selection;
the production process comprises the steps of blank making, grinding, rolling, shearing, bundling and finished product forming.
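As an illustrative sketch only (not part of the claims), the four data sets D1-D4 described above could be carried as record types; every class name, field name, and sample value below is a paraphrase invented for illustration, not terminology from the patent:

```python
from dataclasses import dataclass

@dataclass
class SurfaceDefect:            # D1 = [surface defect ID, geometry, spatial distribution, statistics, spectra]
    surface_defect_id: str
    defect_geometry: dict       # point/line/surface defects, boundary, skeleton, shape, position, size
    spatial_distribution: dict  # entropy, contrast, homogeneity, correlation
    defect_statistics: dict     # GLCM, autocorrelation, histogram statistics, fractal values
    defect_spectra: dict        # texture / stain / sawtooth spectrum

@dataclass
class DefectRule:               # D2 = [defect rule ID, object type, classification stats, mechanism, cause, grade]
    defect_rule_id: str
    object_type: str            # semiconductor, circuit board, wafer, fabric, metal surface, wood
    classification_stats: dict
    damage_mechanism: dict
    cause_rule: str
    defect_grade: str

@dataclass
class DetectionSystem:          # D3 = [detection system ID, device type, line design, technology selection]
    system_id: str
    device_type: str
    line_design: dict
    technology_selection: str

@dataclass
class ProcessScene:             # D4 = [process scene ID, object type, scene factors, production process]
    scene_id: str
    object_type: str
    scene_factors: dict         # operation scale, equipment selection
    production_process: list    # blank making, grinding, rolling, shearing, bundling, forming

# A defect scene rule database is then a keyed collection of the four data sets:
database = {
    "D1": [SurfaceDefect("sd-001", {}, {}, {}, {})],
    "D2": [DefectRule("dr-001", "metal surface", {}, {}, "rolling scratch", "A")],
    "D3": [DetectionSystem("ds-001", "line-scan camera", {}, "CCD")],
    "D4": [ProcessScene("ps-001", "metal surface", {"scale": "line"}, ["rolling", "shearing"])],
}
```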
4. The multi-modal data pre-training and recognition method of claim 2, wherein the extracting defect type information, feature information, and scene information from the defect scene rules database, performing data association, and extracting scene factors of the defect scene rules database, specifically comprises:
extracting defect type information from the surface defect dataset, extracting feature information from the surface defect dataset and the defect rule dataset, and extracting scene information from the inspection system dataset and the process scene dataset;
for the defect Z, constructing a layered matrix Z × T × R according to the extracted defect type information, characteristic information, and scene information;
for the defect-feature association information, a first extraction factor a_ij is used to map and extract from the matrix Z × T the antecedent defect scene factors, and all extracted antecedent defect scene factors together form the antecedent scene factor;
for the feature-scene association information, a second extraction factor b_ij is used to map and extract from the matrix T × R the consequent defect scene factors, and all extracted consequent defect scene factors together form the consequent scene factor;
Determining the scene factor according to the extracted antecedent scene factor and consequent scene factor;
wherein Z, T, and R are n × j matrices, n is the number of defect classes, and j is the feature vector dimension; Z_ij is an element of the defect matrix, T_ij an element of the characteristic information matrix, and R_ij an element of the scene information matrix, i = 1, 2, ..., n; the first extraction factor a_ij = 0 when its extraction condition on Z × T does not hold and takes the corresponding mapping value when it does, and the second extraction factor b_ij = 0 when its extraction condition on T × R does not hold and takes the corresponding mapping value when it does.
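A minimal numeric sketch of the mapping in claim 4, assuming elementwise layered products and a hypothetical 0/1 gating threshold standing in for the extraction conditions (the conditions themselves are not reproduced in this text, so the gate below is purely illustrative):

```python
import numpy as np

n, j = 4, 3                          # n defect classes, j-dimensional feature vectors
rng = np.random.default_rng(0)
Z = rng.random((n, j))               # defect type information matrix
T = rng.random((n, j))               # characteristic information matrix
R = rng.random((n, j))               # scene information matrix

# Defect-feature association: layered product Z x T, gated by a first
# extraction factor a_ij (0/1 mask here as a stand-in for the claim's condition).
a = (Z * T > 0.25).astype(float)     # hypothetical gating condition
antecedent = a * (Z * T)             # antecedent defect scene factors

# Feature-scene association: T x R gated by a second extraction factor b_ij.
b = (T * R > 0.25).astype(float)
consequent = b * (T * R)             # consequent defect scene factors

# The overall scene factor is determined from the antecedent and consequent parts.
scene_factor = np.concatenate([antecedent, consequent], axis=1)
```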
5. The method according to claim 1, wherein the constructing a self-coding network structure model carrying scene information of the defect, fusing the scene factor into the self-coding network structure model, inputting a feature vector obtained by coding sample data of various defects, performing matching training of data and rules, and generating a modal recognition model specifically comprises:
applying the antecedent scene factors in the scene factors to an encoder of the self-encoding network structure model to extract effective features;
applying the latter scene factor in the scene factors to a decoder of the self-coding network structure model to generate rules;
inputting the characteristic vector W obtained by encoding the sample data of various defects; drawing on the idea of residual networks, introducing a scene factor into the basic operation block structure at each stacking step, so that the scene factors are hidden in the hierarchical structure of the stack of the self-coding network structure model; and decoding and outputting to obtain the scene rule output [type, characteristic, scene];
and outputting the scene rules through a semi-supervised stacking self-encoder, adding a classifier in a decoding stage to realize a classification function, and optimizing the self-encoding network structure model classifier through matching training of data and rules to generate the modal recognition model.
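The scene-factor injection of claim 5 can be sketched as follows; the residual-style addition, the weight shapes, and the softmax classifier head are illustrative assumptions, not the patented architecture:

```python
import numpy as np

rng = np.random.default_rng(1)

def encoder(x, scene_antecedent, W1):
    """Encode with the antecedent scene factor injected residual-style."""
    h = np.tanh(x @ W1)
    return h + scene_antecedent          # residual-style scene injection

def decoder(h, scene_consequent, W2):
    """Decode with the consequent scene factor guiding rule generation."""
    return np.tanh((h + scene_consequent) @ W2)

d_in, d_hid, n_classes = 8, 4, 3
W1 = rng.standard_normal((d_in, d_hid)) * 0.1
W2 = rng.standard_normal((d_hid, d_in)) * 0.1

x = rng.standard_normal(d_in)            # feature vector W from encoded defect samples
s_ant = rng.standard_normal(d_hid) * 0.1 # antecedent scene factor (illustrative values)
s_con = rng.standard_normal(d_hid) * 0.1 # consequent scene factor (illustrative values)

h = encoder(x, s_ant, W1)
y = decoder(h, s_con, W2)                # reconstruction has the input's shape

# A classifier head added in the decoding stage (claim 5's semi-supervised step)
# could be a simple softmax over defect classes:
Wc = rng.standard_normal((d_hid, n_classes)) * 0.1
logits = h @ Wc
probs = np.exp(logits) / np.exp(logits).sum()
```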
6. The method for pre-training and recognizing multimodal data as claimed in claim 1, wherein the objective function of the self-coding network structure model is specifically:
the loss function of the self-coding network structure model is specifically as follows:
wherein V(G, D) is the overall objective function and N is the number of original labels; P(x) denotes the probability that a defect sample x corresponds to the original label in the output data after passing through the self-coding network, and Pz(x) denotes the probability of the original label in the output data z(x) after a sample x carrying defect knowledge passes through the self-coding network; D(x) is the conditional probability calculation function, and G(z) is the probability of comparing the output information y in the applied classification category data under the category model G(z); the indicator 1_ij^obj denotes whether the grid contains a defect of the given class; a, b, w, h, and c are the constituent variables of each grid during defect detection, where (a, b) is the lower-left corner point of the grid, w and h are the width and height of the grid, and c is the confidence of the grid; the coordinate loss of the defect bounding box is the mean square error computed from the position information; the size loss of the defect bounding box is the absolute mean square error computed from the size information; and the confidence loss is computed according to whether the grid contains the given defect class.
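The formula images for claim 6 do not survive in this text. A plausible form, assuming a standard adversarial (GAN-style) objective and a YOLO-style grid detection loss consistent with the symbols defined above — an assumption for the reader's orientation, not the patented equations — would read:

```latex
% Assumed GAN-style objective (hypothetical reconstruction):
\min_G \max_D V(G, D) =
  \mathbb{E}_{x \sim P(x)}\!\left[\log D(x)\right]
  + \mathbb{E}_{z \sim P_z(x)}\!\left[\log\bigl(1 - D(G(z))\bigr)\right]

% Assumed YOLO-style detection loss over N grid cells:
L = \sum_{i=1}^{N} \mathbb{1}_{ij}^{\mathrm{obj}}
      \bigl[(a_i - \hat{a}_i)^2 + (b_i - \hat{b}_i)^2\bigr]                        % coordinate loss
  + \sum_{i=1}^{N} \mathbb{1}_{ij}^{\mathrm{obj}}
      \bigl[\lvert w_i - \hat{w}_i\rvert^2 + \lvert h_i - \hat{h}_i\rvert^2\bigr]  % size loss
  + \sum_{i=1}^{N} \mathbb{1}_{ij}^{\mathrm{obj}} (c_i - \hat{c}_i)^2              % confidence loss
```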
7. The method of claim 5, wherein the scene rule output is further trained through hidden layers of a stacked self-encoder, continuously generating and updating defect scene rules, and supplementing the defect scene rules into the defect scene rules database.
8. A multi-modal data pre-training and recognition device, the device comprising:
the database construction module is used for carrying out multi-source heterogeneous data fusion on the acquired defect basic data and constructing a defect scene rule database;
the scene factor extraction module is used for extracting defect type information, characteristic information and scene information from the defect scene rule database, performing data association and extracting scene factors of the defect scene rule database;
the model generation module is used for constructing a self-coding network structure model carrying defect scene information, integrating the scene factors into the self-coding network structure model, inputting a characteristic vector obtained by coding sample data of various defects, performing matching training of data and rules and generating a modal identification model;
and the defect identification module is used for identifying the defects of the sample to be detected according to the modal identification model.
9. A computer-readable storage medium, comprising a stored computer program, wherein the computer program, when executed, controls a device on which the computer-readable storage medium is located to perform the multimodal data pre-training and recognition method as claimed in any one of claims 1 to 7.
10. A terminal device comprising a processor, a memory, and a computer program stored in the memory and configured to be executed by the processor, the processor implementing the multimodal data pre-training and recognition method as claimed in any one of claims 1 to 7 when executing the computer program.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202310272537.2A CN115984662B (en) | 2023-03-21 | 2023-03-21 | Multi-mode data pre-training and identifying method, device, equipment and medium |
Publications (2)
Publication Number | Publication Date |
---|---|
CN115984662A true CN115984662A (en) | 2023-04-18 |
CN115984662B CN115984662B (en) | 2023-08-04 |
Family
ID=85958593
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202310272537.2A Active CN115984662B (en) | 2023-03-21 | 2023-03-21 | Multi-mode data pre-training and identifying method, device, equipment and medium |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN115984662B (en) |
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN117036141A (en) * | 2023-10-08 | 2023-11-10 | 交通运输部公路科学研究所 | Data processing method and data interaction system for highway full life cycle |
CN117376632A (en) * | 2023-12-06 | 2024-01-09 | 中国信息通信研究院 | Data recovery method and system based on intelligent depth synthesis |
CN118505704A (en) * | 2024-07-18 | 2024-08-16 | 成都数之联科技股份有限公司 | General model modeling detection method for detecting defects of panel production line |
Citations (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109919934A (en) * | 2019-03-11 | 2019-06-21 | 重庆邮电大学 | A kind of liquid crystal display panel defect inspection method based on the study of multi-source domain depth migration |
CN112164067A (en) * | 2020-10-12 | 2021-01-01 | 西南科技大学 | Medical image segmentation method and device based on multi-mode subspace clustering |
US20210183052A1 (en) * | 2018-12-28 | 2021-06-17 | Omron Corporation | Defect inspecting device, defect inspecting method, and storage medium |
CN113066070A (en) * | 2021-03-31 | 2021-07-02 | 广东电网有限责任公司 | Multi-source data fusion interaction method in three-dimensional scene |
CN113436184A (en) * | 2021-07-15 | 2021-09-24 | 南瑞集团有限公司 | Power equipment image defect judging method and system based on improved twin network |
US20220383479A1 (en) * | 2021-05-20 | 2022-12-01 | Hon Hai Precision Industry Co., Ltd. | Method for detecting defects in images, computer device, and storage medium |
US20220405909A1 (en) * | 2020-12-03 | 2022-12-22 | Boe Technology Group Co., Ltd. | Computer-implemented method for defect analysis, apparatus for defect analysis, computer-program product, and intelligent defect analysis system |
Non-Patent Citations (1)
Title |
---|
ZHANG Xuzhong et al., "Research on wood defect image recognition and segmentation model based on deep reinforcement learning", Electronic Measurement Technology, no. 17, pp. 86-92
Also Published As
Publication number | Publication date |
---|---|
CN115984662B (en) | 2023-08-04 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
CB03 | Change of inventor or designer information | | Inventor after: Luo Liang, Lin Zhu, Li Haiwei, Ma Zhiping, Feng Diehua. Inventor before: Luo Liang, Lin Zhu, Li Haiwei, Ma Zhiping, Feng Zhihua. |
GR01 | Patent grant | ||