
CN115984662A - Multi-mode data pre-training and recognition method, device, equipment and medium - Google Patents

Multi-mode data pre-training and recognition method, device, equipment and medium Download PDF

Info

Publication number
CN115984662A
CN115984662A (application CN202310272537.2A)
Authority
CN
China
Prior art keywords
defect
scene
data
information
self
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202310272537.2A
Other languages
Chinese (zh)
Other versions
CN115984662B (en)
Inventor
罗亮
林珠
李海威
马志平
冯秩华
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guangdong Science & Technology Infrastructure Center
Original Assignee
Guangdong Science & Technology Infrastructure Center
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Guangdong Science & Technology Infrastructure Center
Priority to CN202310272537.2A
Publication of CN115984662A
Application granted
Publication of CN115984662B
Legal status: Active
Anticipated expiration

Links

Images

Classifications

    • Y — General tagging of new technological developments; general tagging of cross-sectional technologies spanning over several sections of the IPC; technical subjects covered by former USPC cross-reference art collections [XRACs] and digests
    • Y02 — Technologies or applications for mitigation or adaptation against climate change
    • Y02P — Climate change mitigation technologies in the production or processing of goods
    • Y02P90/00 — Enabling technologies with a potential contribution to greenhouse gas [GHG] emissions mitigation
    • Y02P90/30 — Computing systems specially adapted for manufacturing

Landscapes

  • Image Analysis (AREA)
  • Investigating Materials By The Use Of Optical Means Adapted For Particular Applications (AREA)

Abstract

The invention discloses a method, a device, equipment and a medium for multi-mode data pre-training and recognition, wherein a defect scene rule database is constructed by performing multi-source heterogeneous data fusion on acquired defect basic data; extracting defect type information, characteristic information and scene information from the defect scene rule database, performing data association, and extracting scene factors of the defect scene rule database; constructing a self-coding network structure model carrying defect scene information, integrating the scene factors into the self-coding network structure model, inputting a characteristic vector obtained by coding sample data of various defects, performing matching training of data and rules, and generating a modal identification model; and identifying the defects of the sample to be detected according to the modal identification model. The product defect detection accuracy and the model robustness can be improved.

Description

Multi-mode data pre-training and recognition method, device, equipment and medium
Technical Field
The invention relates to the field of image recognition, in particular to a method, a device, equipment and a medium for multi-mode data pre-training and recognition.
Background
With the rapid development of the precision manufacturing industry, losses caused by surface defects of high-precision instruments reach the billion-yuan level every year, and the demand for high-precision defect detection of industrial products is growing increasingly strong. In particular, industrial production environments involve highly complex conditions such as noise, occlusion, vibration and dim light, so defect detection must be intelligent, highly precise, long-running and highly efficient.
Although applying deep learning algorithms has improved defect-detection accuracy to a certain extent, in existing high-precision defect detection the defect samples are few and unbalanced, and are easily affected by environmental factors such as occlusion, oxidation and vibration, so product defect detection suffers from low accuracy and weak model robustness.
Disclosure of Invention
In order to solve the technical problems, the invention provides a method, a device, equipment and a medium for multi-mode data pre-training and recognition, which improve the product defect detection accuracy and the robustness of a model.
The embodiment of the invention provides a multi-mode data pre-training and recognition method, which comprises the following steps:
performing multi-source heterogeneous data fusion on acquired defect basic data, and constructing a defect scene rule database;
extracting defect type information, characteristic information and scene information from the defect scene rule database, performing data association, and extracting scene factors of the defect scene rule database;
constructing a self-coding network structure model carrying defect scene information, integrating the scene factors into the self-coding network structure model, inputting feature vectors obtained by coding sample data of various defects, performing matching training of data and rules, and generating a modal identification model;
and identifying the defects of the sample to be detected according to the modal identification model.
Further, the multi-source heterogeneous data fusion is performed on the acquired defect basic data, and a defect scene rule database is constructed, specifically including:
performing multi-source heterogeneous data fusion on defect basic data consisting of historical experience data, common rule data and defect standard data to form a defect scene rule database which is associated with defect types, positions and scales;
the defect scene rules database includes: a surface defect data set, a defect rule data set, a detection system data set, and a process scene data set.
As an improvement of the above solution, the surface defect data set D1= [ surface defect ID, defect geometry, spatial distribution data, defect statistics, defect spectrum data ];
the defect rule data set D2= [ defect rule ID, detection object type, defect classification statistical data, damage mechanism data, defect cause rule, defect grade ];
the detection system data set D3= [ detection system ID, device type, production line design data, technology type ];
the process scene data set D4= [ process scene data ID, detection object type, scene factor, production process ];
the defect geometry includes: point-line-surface defects, boundaries, bones, shapes, positions, sizes, stretches, and translates;
the spatial distribution data includes: entropy, contrast, consistency and correlation;
the defect statistical data comprise gray level co-occurrence matrixes, autocorrelation coefficients, mathematical morphology, histogram statistical characteristics, fractal values and defect frequency spectrum subsets;
the histogram statistics include range, mean, geometric mean, harmonic mean, standard deviation, variance, and median;
The fractal values comprise stretching and translation fractal dimension and porosity;
the defect frequency spectrum subset comprises a texture frequency spectrum, a taint frequency spectrum and a sawtooth frequency spectrum;
the defect classification statistical data is specifically a fault mode of automatic defect division;
the defect level comprises the detection object type;
the detection object types comprise semiconductors, circuit boards, wafers, fabrics, metal surfaces and woods;
the scene factors comprise operation scale and equipment type selection;
the production process comprises the steps of blank making, grinding, rolling, shearing, bundling and finished product forming.
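The four data sets D1-D4 can be pictured as simple keyed records linked by shared fields such as the detection object type. The sketch below is illustrative only: the field values, record contents and the linking helper are assumptions, not data from the patent.

```python
# Hypothetical records for the four data sets of the defect scene rule
# database; field names follow D1-D4 in the text, contents are invented.
surface_defect = {          # D1: surface defect data set
    "surface_defect_id": "SD-001",
    "defect_geometry": {"shape": "line", "position": (120, 64), "size": (3, 48)},
    "spatial_distribution": {"entropy": 4.2, "contrast": 0.31,
                             "consistency": 0.88, "correlation": 0.74},
    "defect_statistics": {"histogram": {"mean": 0.42, "variance": 0.02}},
    "defect_spectrum": ["texture", "taint", "sawtooth"],
}
defect_rule = {             # D2: defect rule data set
    "defect_rule_id": "DR-001",
    "object_type": "wafer",
    "failure_mode": "scratch",
    "damage_mechanism": "abrasion",
    "cause_rule": "handling",
    "defect_grade": 2,
}
detection_system = {        # D3: detection system data set
    "detection_system_id": "DS-001",
    "device_type": "line-scan camera",
    "line_design": "rolling",
    "technology_type": "optical",
}
process_scene = {           # D4: process scene data set
    "process_scene_id": "PS-001",
    "object_type": "wafer",
    "scene_factor": {"operation_scale": "batch", "equipment": "grinder"},
    "production_step": "grinding",
}

def link_by_object_type(rule, scene):
    """Associate records the way the database links type/position/scale."""
    return rule["object_type"] == scene["object_type"]
```

A join on the shared detection object type is one simple way such records could be associated into defect-scene rules.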
Preferably, the extracting the defect type information, the feature information, and the scene information from the defect scene rule database, performing data association, and extracting the scene factor of the defect scene rule database specifically includes:
extracting defect type information from the surface defect dataset, extracting feature information from the surface defect dataset and the defect rule dataset, and extracting scene information from the inspection system dataset and the process scene dataset;
for the defect Z, constructing a layered matrix Z × T × R from the extracted defect type information, feature information and scene information;
for the defect-feature association information, using a first extraction factor a_ij to map and extract from the matrix Z × T, obtaining the antecedent defect scene factors; all the extracted antecedent defect scene factors together form the antecedent scene factor;
for the feature-scene association information, using a second extraction factor b_ij to map and extract from the matrix T × R, obtaining the consequent defect scene factors; all the extracted consequent defect scene factors together form the consequent scene factor;
determining the scene factor according to the extracted antecedent scene factor and consequent scene factor;
wherein n is the number of defect classes, j is the feature-vector dimension, Z_ij, T_ij and R_ij are the element values of the defect matrix, the feature-information matrix and the scene-information matrix respectively, and i = 1, 2, …, n.
[The defining formulas for the extraction factors, and the piecewise conditions under which a factor is set to 0, appear in the source only as formula images (Figure SMS_1 through SMS_23) and cannot be reconstructed here.]
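Since the extraction formulas survive only as images in the source, the following sketch implements one plausible reading of the antecedent/consequent factor extraction: the extraction factors a_ij and b_ij weight the element-wise Z × T and T × R associations, and sub-threshold entries are zeroed in the spirit of the piecewise conditions. Every formula choice here is an assumption.

```python
import numpy as np

def extract_scene_factors(Z, T, R, a, b, threshold=0.0):
    """Z, T, R: (n, j) matrices of defect, feature and scene element values.

    a, b: (n, j) extraction-factor matrices (assumed element-wise weighting).
    Returns the antecedent and consequent scene-factor matrices.
    """
    antecedent = a * (Z * T)      # map-and-extract from the Z x T association
    consequent = b * (T * R)      # map-and-extract from the T x R association
    # Piecewise zeroing condition (assumed form of the SMS_* formulas):
    antecedent[antecedent <= threshold] = 0.0
    consequent[consequent <= threshold] = 0.0
    return antecedent, consequent

# Toy example: n = 3 defect classes, j = 4 feature dimensions.
n, j = 3, 4
Z = np.ones((n, j))
T = np.full((n, j), 0.5)
R = np.full((n, j), 2.0)
a = np.full((n, j), 0.2)
b = np.full((n, j), 0.1)
G, H = extract_scene_factors(Z, T, R, a, b)   # antecedent, consequent
```

The element-wise form keeps the two factor matrices aligned with the (defect class, feature dimension) indexing used in the text.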
Preferably, the constructing a self-coding network structure model carrying defect scene information, merging the scene factors into the self-coding network structure model, inputting feature vectors obtained by coding sample data of various defects, performing matching training of data and rules, and generating a modal identification model specifically includes:
applying the former scene factor in the scene factors to an encoder of the self-encoding network structure model to extract effective characteristics;
applying the latter scene factor in the scene factors to a decoder of the self-coding network structure model to generate rules;
inputting a characteristic vector W coded by sample data of various defects, introducing a scene factor in the structure of a basic operation block during superposition by using the thought of a residual error network for reference, so that the scene factor is hidden in a hierarchical structure in the stack of the self-coding network structure model, and decoding and outputting to obtain a scene rule output [ type, characteristic and scene ];
outputting the scene rules through a semi-supervised stacking self-encoder, adding a classifier in a decoding stage to realize a classification function, optimizing the self-encoding network structure model classifier through matching training of data and rules, and generating the modal identification model.
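A minimal numpy sketch of the described model, assuming a plain stacked autoencoder whose encoder layers add the antecedent scene factor residual-style, as the text's borrowing of the residual-network idea suggests; layer sizes, the sigmoid nonlinearity and the injection point are illustrative choices, not taken from the patent.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

class SceneAwareAutoencoder:
    """Stacked autoencoder with a scene factor hidden in each encoder layer."""

    def __init__(self, dims, scene_factor):
        # dims, e.g. [16, 8, 4]: feature vector -> hidden layer -> code
        self.enc = [rng.standard_normal((a, b)) * 0.1
                    for a, b in zip(dims[:-1], dims[1:])]
        self.dec = [rng.standard_normal((b, a)) * 0.1
                    for a, b in zip(dims[:-1], dims[1:])][::-1]
        self.scene = scene_factor   # antecedent scene factor (assumed vector)

    def encode(self, w):
        for W in self.enc:
            # Residual-style injection: add the scene factor after each layer.
            w = sigmoid(w @ W) + self.scene[: W.shape[1]]
        return w

    def decode(self, code):
        for W in self.dec:
            code = sigmoid(code @ W)
        return code

    def forward(self, w):
        return self.decode(self.encode(w))

ae = SceneAwareAutoencoder([16, 8, 4], scene_factor=np.full(8, 0.05))
x = rng.standard_normal(16)        # feature vector W of an encoded sample
recon = ae.forward(x)
```

In the patent's full scheme a classifier would also be attached to the decoding stage; here only the encode/decode path with scene-factor injection is sketched.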
As a preferred scheme, the objective function of the self-coding network structure model is specifically:
[the objective function is given in the source only as a formula image (Figure SMS_24) and cannot be reconstructed here]
the loss function of the self-coding network structure model is likewise given only as a formula image (Figure SMS_25);
wherein V(G, D) is the overall objective function defined, and N is the number of original labels; P denotes the probability that the defect sample x corresponds to its original label in the output data D(x) after passing through the self-coding network, and P_z denotes the corresponding probability for a sample x carrying defect knowledge in the output data z(x); D(x) is a conditional-probability calculation function, and G(z) is the probability of comparing the output information y in the applied classification-category data under the category model G(z); a, b, w, h and c are the constituent variables of each grid cell during defect detection: (a, b) is the lower-left corner point of the grid, w and h are the width and height of the grid, and c is the grid confidence; the coordinate loss of the defect bounding box is represented by calculating the mean square error of the position information; the size loss of the defect bounding box is represented by the calculated absolute mean square error of the size information; and the confidence loss is calculated by judging whether the sample belongs to the given defect class.
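The three loss terms named above (coordinate, size and confidence loss) can be sketched as follows; the exact weighting and normalization are not recoverable from the source, so the forms below are assumed.

```python
import numpy as np

def detection_loss(pred, target, has_defect):
    """pred/target: (N, 5) arrays of [a, b, w, h, c] per grid cell.

    has_defect: (N,) indicator of whether the cell holds the judged class.
    All three terms are assumed forms of the losses named in the text.
    """
    # Coordinate loss: mean square error of the (a, b) position information.
    coord = np.mean((pred[:, :2] - target[:, :2]) ** 2)
    # Size loss: absolute mean square error of the (w, h) size information.
    size = np.mean(np.abs(pred[:, 2:4] ** 2 - target[:, 2:4] ** 2))
    # Confidence loss: only counted where the cell belongs to the defect class.
    conf = np.mean(has_defect * (pred[:, 4] - target[:, 4]) ** 2)
    return coord + size + conf

pred = np.array([[0.5, 0.5, 1.0, 1.0, 0.9]])
target = np.array([[0.5, 0.5, 1.0, 1.0, 1.0]])
loss = detection_loss(pred, target, has_defect=np.array([1.0]))
```

With position and size exactly matched, only the confidence term contributes here.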
Preferably, the scene rule output is further trained by the hidden layers of the stacked self-encoder, and the defect scene rules thus continuously generated and updated are supplemented into the defect scene rule database.
The embodiment of the invention also provides a device for pre-training and recognizing the multi-modal data, which comprises:
the database construction module is used for carrying out multi-source heterogeneous data fusion on the acquired defect basic data and constructing a defect scene rule database;
the scene factor extraction module is used for extracting defect type information, characteristic information and scene information from the defect scene rule database, performing data association and extracting scene factors of the defect scene rule database;
the model generation module is used for constructing a self-coding network structure model carrying defect scene information, integrating the scene factors into the self-coding network structure model, inputting a characteristic vector obtained by coding sample data of various defects, performing matching training of data and rules and generating a modal identification model;
and the defect identification module is used for identifying the defects of the sample to be detected according to the modal identification model.
Preferably, the database construction module is specifically configured to:
performing multi-source heterogeneous data fusion on defect basic data consisting of historical experience data, common rule data and defect standard data to form a defect scene rule database which is associated with defect types, positions and scales;
the defect scene rules database includes: a surface defect data set, a defect rule data set, a detection system data set, and a process scene data set.
Further, the surface defect data set D1= [ surface defect ID, defect geometry, spatial distribution data, defect statistics, defect spectrum data ];
the defect rule data set D2= [ defect rule ID, detection object type, defect classification statistical data, damage mechanism data, defect cause rule, defect grade ];
the detection system data set D3= [ detection system ID, device type, production line design data, technology type ];
the process scene data set D4= [ process scene data ID, detection object type, scene factor, production process ];
the defect geometry includes: point-line-surface defects, boundaries, bones, shapes, positions, sizes, stretches, and translates;
the spatial distribution data includes: entropy, contrast, consistency and correlation;
the defect statistical data comprise gray level co-occurrence matrixes, autocorrelation coefficients, mathematical morphology, histogram statistical characteristics, fractal values and defect frequency spectrum subsets;
the histogram statistical features include range, mean, geometric mean, harmonic mean, standard deviation, variance, and median;
The fractal values include stretching, translating fractal dimension and porosity;
the defect frequency spectrum subset comprises a texture frequency spectrum, a taint frequency spectrum and a sawtooth frequency spectrum;
the defect classification statistical data are fault modes of automatic defect division;
the defect grade comprises the detection object type;
the detection object types comprise semiconductors, circuit boards, wafers, fabrics, metal surfaces and woods;
the scene factors comprise operation scale and equipment type selection;
the production process comprises blank making, grinding, rolling, shearing, bundling and finished product forming.
Preferably, the scene factor extraction module is specifically configured to:
extracting defect type information from the surface defect dataset, extracting feature information from the surface defect dataset and the defect rule dataset, and extracting scene information from the inspection system dataset and the process scene dataset;
for the defect Z, constructing a layered matrix Z × T × R from the extracted defect type information, feature information and scene information;
for the defect-feature association information, using a first extraction factor a_ij to map and extract from the matrix Z × T, obtaining the antecedent defect scene factors; all the extracted antecedent defect scene factors together form the antecedent scene factor;
for the feature-scene association information, using a second extraction factor b_ij to map and extract from the matrix T × R, obtaining the consequent defect scene factors; all the extracted consequent defect scene factors together form the consequent scene factor;
determining the scene factor according to the extracted antecedent scene factor and the extracted consequent scene factor;
wherein n is the number of defect classes, j is the feature-vector dimension, Z_ij, T_ij and R_ij are the element values of the defect matrix, the feature-information matrix and the scene-information matrix respectively, and i = 1, 2, …, n.
[As in the method description above, the defining formulas for the extraction factors and the piecewise zeroing conditions appear in the source only as formula images (Figure SMS_34 through SMS_56) and cannot be reconstructed here.]
Preferably, the model generation module is specifically configured to:
applying the antecedent scene factors in the scene factors to an encoder of the self-encoding network structure model to extract effective features;
applying the latter scene factor in the scene factors to a decoder of the self-coding network structure model to generate rules;
inputting a characteristic vector W coded by sample data of various defects, introducing a scene factor in the structure of a basic operation block during superposition by using the thought of a residual error network for reference, so that the scene factor is hidden in a hierarchical structure in the stack of the self-coding network structure model, and decoding and outputting to obtain a scene rule output [ type, characteristic and scene ];
outputting the scene rules through a semi-supervised stacking self-encoder, adding a classifier in a decoding stage to realize a classification function, optimizing the self-encoding network structure model classifier through matching training of data and rules, and generating the modal identification model.
Preferably, the objective function of the self-coding network structure model is specifically:
[the objective function is given in the source only as a formula image (Figure SMS_57) and cannot be reconstructed here]
the loss function of the self-coding network structure model is likewise given only as a formula image (Figure SMS_58);
wherein V(G, D) is the overall objective function defined, and N is the number of original labels; P denotes the probability that the defect sample x corresponds to its original label in the output data D(x) after passing through the self-coding network, and P_z denotes the corresponding probability for a sample x carrying defect knowledge in the output data z(x); D(x) is a conditional-probability calculation function, and G(z) is the probability of comparing the output information y in the applied classification-category data under the category model G(z); a, b, w, h and c are the constituent variables of each grid cell during defect detection: (a, b) is the lower-left corner point of the grid, w and h are the width and height of the grid, and c is the grid confidence; the coordinate loss of the defect bounding box is represented by calculating the mean square error of the position information; the size loss of the defect bounding box is represented by the calculated absolute mean square error of the size information; and the confidence loss is calculated by judging whether the sample belongs to the given defect class.
Further, the scene rule output is further trained by the hidden layers of the stacked self-encoder, and the defect scene rules thus continuously generated and updated are supplemented into the defect scene rule database.
The invention also provides a computer-readable storage medium, which includes a stored computer program, wherein when the computer program runs, the device where the computer-readable storage medium is located is controlled to execute the multimodal data pre-training and recognition method as described in any one of the above embodiments.
The invention further provides a terminal device, which comprises a processor, a memory and a computer program stored in the memory and configured to be executed by the processor, wherein the processor executes the computer program to implement the multimodal data pre-training and recognition method as described in any one of the above embodiments.
The invention provides a multi-mode data pre-training and recognition method, a device, equipment and a medium, wherein a defect scene rule database is constructed by performing multi-source heterogeneous data fusion on acquired defect basic data; extracting defect type information, characteristic information and scene information from the defect scene rule database, performing data association, and extracting scene factors of the defect scene rule database; constructing a self-coding network structure model carrying defect scene information, integrating the scene factors into the self-coding network structure model, inputting feature vectors obtained by coding sample data of various defects, performing matching training of data and rules, and generating a modal identification model; and identifying the defects of the sample to be detected according to the modal identification model. The product defect detection accuracy and the model robustness can be improved.
Drawings
FIG. 1 is a schematic flow chart of a multi-modal data pre-training and recognition method according to an embodiment of the present invention;
FIG. 2 is a flow chart of a method for pre-training and recognizing multi-modal data according to another embodiment of the present invention;
FIG. 3 is a schematic structural diagram of a multi-modal data pre-training and recognition apparatus according to an embodiment of the present invention;
fig. 4 is a schematic structural diagram of a terminal device according to an embodiment of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
The embodiment of the invention provides a multi-modal data pre-training and recognition method, and relates to fig. 1, which is a flow diagram of the multi-modal data pre-training and recognition method provided by the embodiment of the invention, wherein the method comprises the following steps of S1-S4:
s1, performing multi-source heterogeneous data fusion on acquired defect basic data to construct a defect scene rule database;
s2, extracting defect type information, characteristic information and scene information from the defect scene rule database, performing data association, and extracting scene factors of the defect scene rule database;
s3, constructing a self-coding network structure model carrying defect scene information, integrating the scene factors into the self-coding network structure model, inputting feature vectors obtained by coding sample data of various defects, performing matching training of data and rules, and generating a modal identification model;
and S4, identifying the defects of the sample to be detected according to the modal identification model.
When the method is implemented specifically, defect basic data is collected, the defect basic data is specifically historical defect data of a sample to be detected, multi-source heterogeneous data for defect detection is fused, and a basic defect scene rule database containing information such as static defect representation, dynamic defect evolution, defect classification and defect-scene rules is constructed through the fusion of the multi-source heterogeneous data;
refining scene factors according to a defect scene rule database, wherein the scene factors jointly construct a three-dimensional vector matrix containing defect type information, characteristic information and scene information, and the matrix constraint is applied to force a self-encoder to consider which parts of input data need to be optimized and copied and which parts need to be discarded, so that the self-encoder can learn the effective characteristics of the data and discard irrelevant characteristics, thereby generating more defect scene rules, performing data association and extracting the scene factors of the defect scene rule database;
the method comprises the steps of researching the construction of a scene rule knowledge base based on a semi-supervised self-coding network, designing a stacking self-coding network structure carrying defect scene information, introducing scene factors, enabling the scene factors to be hidden in a hierarchical structure in the stacking of the self-coding network, inputting a characteristic vector obtained by encoding sample data of various defects, performing matching training of data and rules, and generating a modal identification model;
and identifying the defects of the sample according to the generated modal identification model.
Under conditions of a low defect sampling rate and unbalanced samples, the method combines the production process scene and fuses material characteristics, manufacturing process data and sub-pixel features of high-resolution defect images. A scene rule knowledge base is constructed through sample generation based on material process data, sub-pixel feature coding of high-resolution defect images, and deep-learning classification. Because the self-coding network handles the various mapping relations in small-sample defect data well and performs feature coding and knowledge modeling, the method can address the core problems of deep-learning defect detection under complex backgrounds such as occlusion, oxidation and vibration: difficult defect identification and classification, weak robustness, large volumes of images to be inspected, low computational efficiency, and difficulty in tracing defect origins.
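The flow of steps S1-S4 can be summarized as a small pipeline; every function below is a hypothetical stub standing in for a stage the text describes, not an API from the patent.

```python
# Illustrative stubs for the four steps of the method.
def fuse_multisource(base_data):
    """S1: fuse multi-source heterogeneous data into a rule database."""
    return {"rules": base_data}

def extract_factors(database):
    """S2: refine scene factors from the rule database (stub: a count)."""
    return len(database["rules"])

def train_matching(factor, encoded_samples):
    """S3: match data and rules, yielding a (stub) modal recognition model."""
    return lambda s: "defect" if factor and s else "ok"

def recognize(base_data, samples):
    """S4: identify defects of the samples with the generated model."""
    db = fuse_multisource(base_data)
    factor = extract_factors(db)
    model = train_matching(factor, [hash(s) for s in samples])
    return [model(s) for s in samples]

results = recognize(["historical", "rules", "standards"], ["sample-A"])
```

The point of the sketch is only the data flow: database construction feeds factor extraction, which conditions model training, which drives recognition.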
In another embodiment provided by the present invention, the step S1 specifically includes:
performing multi-source heterogeneous data fusion on defect basic data consisting of historical experience data, common rule data and defect standard data to form a defect scene rule database which is associated with defect types, positions and scales;
the defect scene rules database includes: a surface defect data set, a defect rule data set, a detection system data set, and a process scene data set.
When the embodiment is implemented specifically, the sources of the defect basic data include historical experience data, common rule data and defect standard data, and the historical experience data is specifically historical data of judgment of an expert on the defect;
the defects of common industrial products are mainly as follows: the method is characterized in that the method comprises the following steps of detecting defects such as lines, scratches, oil stains, points, shadows, textures, sawteeth and the like, representing the defects in an image in another form during defect detection, combining the common data representation condition of the defect image, and combining the characteristics of business activities with scene analysis, wherein the links of detected industrial products in business belong to, and the defects have important influence on scene judgment formed by defect detection. Finally, forming a defect scene rule database which is associated with the defect scene, the defect type, the defect position and the defect scale through association of each data set;
the defect scene rule database comprises a surface defect data set, a defect rule data set, a detection system data set and a process scene data set.
Accurate defect identification is realized by classifying and associating complex backgrounds such as shielding, oxidation, vibration and the like in the defect detection process of the micron-sized visual image.
In yet another embodiment provided by the present invention, the surface defect data set D1= [ surface defect ID, defect geometry, spatial distribution data, defect statistics, defect spectral data ];
the defect rule data set D2= [ defect rule ID, detection object type, defect classification statistical data, damage mechanism data, defect cause rule, defect grade ];
the detection system data set D3= [ detection system ID, device type, production line design data, technology type ];
the process scene data set D4= [ process scene data ID, detection object type, scene factor, production process ];
the defect geometry comprises: point-line-surface defects, boundaries, bones, shapes, positions, sizes, stretches, and translates;
the spatial distribution data includes: entropy, contrast, consistency and correlation;
the defect statistical data comprise gray level co-occurrence matrixes, autocorrelation coefficients, mathematical morphology, histogram statistical characteristics, fractal values and defect frequency spectrum subsets;
the histogram statistical features include range, mean, geometric mean, harmonic mean, standard deviation, variance, and median;
The fractal values comprise stretching and translation fractal dimension and porosity;
the defect frequency spectrum subset comprises a texture frequency spectrum, a taint frequency spectrum and a sawtooth frequency spectrum;
the defect classification statistical data is specifically a fault mode of automatic defect division;
the defect grade comprises the detection object type;
the detection object types comprise semiconductors, circuit boards, wafers, fabrics, metal surfaces and woods;
the scene factors comprise operation scale and equipment type selection;
the production process comprises the steps of blank making, grinding, rolling, shearing, bundling and finished product forming.
In the embodiment, the surface defect data set specifically includes defect geometric features (point-line-surface defect, boundary, skeleton, shape, location, size, stretch, translation), spatial distribution data (entropy, contrast, consistency, and correlation), defect statistics (gray level co-occurrence matrix, autocorrelation coefficient, mathematical morphology, histogram statistics (range, mean, geometric mean, harmonic mean, standard deviation, variance, and median), and fractal values (stretch, translated fractal dimension, and porosity)), and defect spectrum data (texture spectrum, stain spectrum, and sawtooth spectrum).
The entropy reflects the randomness of the image pixels; the larger the entropy, the coarser the texture. Contrast refers to the average light-dark difference of the defect scene image; consistency refers to the degree of quantitative consistency within the batch of images; correlation refers to the degree of correlation between the acquired image and the detected scene. In general, these specific data sets, which are in fact detection data sets of the image data, are classified from different angles to form different subsets, so as to facilitate processing and recognition of the images.
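As an illustration of how these spatial-distribution measures are computed in practice, the following sketch derives entropy, contrast, consistency (energy) and correlation from a gray-level co-occurrence matrix; the 4-level quantization, the horizontal neighbor offset and the function names are illustrative assumptions, not part of the embodiment:

```python
import numpy as np

def glcm(img, levels=4):
    """Normalized gray-level co-occurrence matrix for the horizontal
    neighbor offset (dx=1, dy=0), after quantizing to `levels` gray bins."""
    q = np.floor(img.astype(float) / 256 * levels).astype(int)
    m = np.zeros((levels, levels))
    for a, b in zip(q[:, :-1].ravel(), q[:, 1:].ravel()):
        m[a, b] += 1
    return m / m.sum()

def spatial_features(p):
    """Entropy, contrast, consistency (energy) and correlation of a GLCM p."""
    i, j = np.indices(p.shape)
    eps = 1e-12
    entropy = -(p * np.log(p + eps)).sum()      # randomness: larger => coarser texture
    contrast = ((i - j) ** 2 * p).sum()         # average light/dark difference
    consistency = (p ** 2).sum()                # uniformity within the co-occurrences
    mu_i, mu_j = (i * p).sum(), (j * p).sum()
    sd_i = np.sqrt(((i - mu_i) ** 2 * p).sum())
    sd_j = np.sqrt(((j - mu_j) ** 2 * p).sum())
    correlation = ((i - mu_i) * (j - mu_j) * p).sum() / (sd_i * sd_j + eps)
    return entropy, contrast, consistency, correlation

img = np.random.default_rng(0).integers(0, 256, (32, 32))
feats = spatial_features(glcm(img))
```

In a full system these four values would populate the spatial distribution subset [surface defect ID, spatial distribution ID, entropy, contrast, consistency, correlation] for each acquired image.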
The defect rule data set includes defect classification statistics (defects are automatically classified into corresponding failure modes), damage mechanism data, defect cause rules, and defect classes (inspection object types (semiconductor, circuit board, wafer, fabric, metal surface, wood, etc.)). The detection system data set comprises equipment types, production line design data and technology model selection;
the process scene data includes inspection object type (semiconductor, circuit board, wafer, fabric, metal surface, wood, etc.), scene factor (operation scale, equipment selection), production process (blank making, grinding, rolling, cutting, bundling, finished product, etc.).
Respectively expressing a surface defect data set, a defect rule data set, a detection system data set and a process scene data set in a data set form as follows:
a surface defect data set D1= [ surface defect ID, defect geometry, spatial distribution data, defect statistics, defect spectral data ];
defect geometry subset = [ surface defect ID, defect geometry ID, point, line, plane defect, boundary, bone, shape, position, size, stretch, translation ];
spatial distribution subset = [ surface defect ID, spatial distribution ID, entropy, contrast, consistency, correlation ];
defect statistics subset = [ surface defect ID, defect statistics ID, gray level co-occurrence matrix, autocorrelation coefficient, mathematical morphology, histogram statistical characteristics, fractal value ];
the defect statistical subset refers to data values obtained by statistically calculating defect data. Although the defect characteristics are not directly described, the statistical data of the distribution of the characteristics are mastered, and the method is favorable for analyzing the relationship between the defect types and the common characteristics. This is intersected in the D2 data set, i.e. these statistics will eventually be associated with defect rules, which make it easier to form defect scene rules.
Histogram statistical feature subset = [ surface defect ID, defect statistics ID, histogram statistics ID, range, mean, geometric mean, harmonic mean, standard deviation, variance, and median ];
fractal value subset = [ surface defect ID, defect statistics ID, fractal value ID, fractal dimension for stretching, translation, and porosity characteristics ];
the fractal value can reflect the degree of stretching and deformation of a defect; improper application of the process level during product manufacturing often causes overall stretching of components, resulting in industrial gap defects and the like.
Defect spectrum subset = [ surface defect ID, defect spectrum ID, texture spectrum, stain spectrum, sawtooth spectrum ];
the defect frequency spectrum refers to the frequency-spectrum characteristics exhibited by the defect image; the spectrum characteristics formed by texture, stain and sawtooth differ from one another, and this data set stores the spectrum characteristics of texture, stain and sawtooth defects collected during image defect detection.
The defect rule data set D2= [ defect rule ID, detection object type, defect classification statistical data, damage mechanism data, defect cause rule, defect grade ];
the equipment type refers to the detection equipment, and the detection object type refers to the detected object, such as PCB detection, steel detection, chip detection, mobile phone accessory detection and the like. Different detection objects have different detection scenes.
The detection system data set D3= [ detection system ID, device type, production line design data, technology model selection ];
the process scenario data set D4= [ process scenario data ID, detection object type, scenario factor, production procedure ].
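A minimal relational sketch of the four data sets and an ID-based association producing defect scene rules is given below; the field names follow the D1-D4 listings above, while the sample values and the join helper are invented for illustration:

```python
# Minimal relational sketch of the defect scene rule database D1-D4.
# All sample values below are invented for illustration.
d1 = [{"surface_defect_id": 1, "defect_geometry": "line/scratch",
       "spatial_distribution": {"entropy": 4.1, "contrast": 0.8},
       "defect_statistics": {"autocorrelation": 0.6},
       "defect_spectrum": "texture"}]
d2 = [{"defect_rule_id": 10, "object_type": "metal surface",
       "failure_mode": "rolling scratch", "defect_grade": "B",
       "surface_defect_id": 1}]
d3 = [{"system_id": 100, "device_type": "line-scan camera",
       "line_design": "cold rolling line", "object_type": "metal surface"}]
d4 = [{"scene_id": 1000, "object_type": "metal surface",
       "scene_factor": {"scale": "large", "device": "line-scan camera"},
       "process": "rolling"}]

def defect_scene_rules(d1, d2, d4):
    """Join D1/D2/D4 on surface defect ID and detection object type to form
    [defect scene, defect type, defect geometry, defect grade] rules."""
    rules = []
    for r in d2:
        surf = next(s for s in d1 if s["surface_defect_id"] == r["surface_defect_id"])
        for sc in (s for s in d4 if s["object_type"] == r["object_type"]):
            rules.append({"scene": sc["process"], "type": r["failure_mode"],
                          "geometry": surf["defect_geometry"],
                          "grade": r["defect_grade"]})
    return rules

rules = defect_scene_rules(d1, d2, d4)
```

The detection system set D3 would be linked through the same object type to carry equipment and line-design context into the rules; that join is omitted here for brevity.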
In another embodiment provided by the present invention, the step S2 specifically includes:
extracting defect type information from the surface defect dataset, extracting feature information from the surface defect dataset and the defect rule dataset, and extracting scene information from the inspection system dataset and the process scene dataset;
for the defect Z, constructing a layered matrix Z multiplied by T multiplied by R according to the extracted defect type information, the extracted characteristic information and the extracted scene information;
for the defect-feature associated information, a first extraction factor a is adopted ij Mapping from a matrix of Z TExtracting to obtain the scene factor of the previous defect
Figure SMS_67
Based on all previous defect scene factors extracted>
Figure SMS_68
Forming a preceding scene factor->
Figure SMS_69
For the characteristic-scene associated information, a second extraction factor b is adopted ij Mapping and extracting from the matrix T multiplied by R to obtain the background factor of the defect
Figure SMS_70
Based on all the post-defect scene factors extracted ≥>
Figure SMS_71
Forming a post-term scene factor->
Figure SMS_72
Determining the scene factor according to the extracted antecedent scene factor and the extracted consequent scene factor;
wherein,
Figure SMS_83
,T
Figure SMS_74
Figure SMS_79
n is the number of defect classes, j is the eigenvector dimension, Z i j Being the value of an element in the defect matrix, T i j For the value of an element in the characteristic information matrix, R i j I =1,2, \8230nfor element values in the scene information matrix;
Figure SMS_75
Figure SMS_77
Figure SMS_81
Then is greater or less>
Figure SMS_85
=0,
Figure SMS_84
Then>
Figure SMS_88
Figure SMS_73
Figure SMS_80
Figure SMS_82
Figure SMS_86
Then>
Figure SMS_87
=0,
Figure SMS_89
Then it is
Figure SMS_76
Figure SMS_78
In specific implementation, scene factors are extracted according to the basic knowledge base and jointly constructed into a three-dimensional vector matrix containing types, features and scenes; the matrix constraint forces the self-encoder to consider which parts of the input data need to be optimized and copied and which parts discarded, so that the self-encoder learns the effective features of the data and discards irrelevant features, thereby generating more defect scene rules.
And finally forming a three-dimensional vector matrix containing type information, characteristic information and scene information after carrying out data cleaning, data association and conversion on the defect scene rule database.
Extracting defect type information from the surface defect data set D1; extracting feature information from a surface defect data set D1 and the defect rule data set D2; extracting scene information from the detection system dataset D3 and the process scene dataset D4;
for the defect Z, it can be expressed as the matrix Z = [Z_ij]; the feature information can be expressed as T = [T_ij]; the scene information can be expressed as R = [R_ij].
Finally, a Z × T × R hierarchical matrix is formed.
Wherein n is the number of defect categories and j is the feature-vector dimension; for example, for the defect Z, the surface defect data set D1 and the defect rule data set D2 provide the feature information, and if the fields of D1 and D2 together number 11, then j runs from 1 to 11;
Z_ij is an element value in the defect matrix, T_ij is an element value in the feature information matrix, R_ij is an element value in the scene information matrix, i = 1, 2, …, n;
for the defect-feature associated information, mapping information is extracted from Z×T, and the first extraction factor a_ij from defect to feature is adopted to extract the antecedent defect scene factor X_i;
wherein a_ij is a piecewise-defined symbol used in the calculation process: if Z_ij · T_ij = 0, then a_ij = 0; if Z_ij · T_ij ≠ 0, then a_ij = 1, and X_i = Σ_j a_ij · T_ij;
the extracted antecedent defect scene factors X_i form the antecedent scene factor X = [X_1, X_2, …, X_n];
for the feature-scene associated information, mapping information is extracted from T×R, and the second extraction factor b_ij from feature to scene is adopted to extract the consequent defect scene factor Y_i;
wherein b_ij is a piecewise-defined symbol used in the calculation process: if T_ij · R_ij = 0, then b_ij = 0; if T_ij · R_ij ≠ 0, then b_ij = 1, and Y_i = Σ_j b_ij · R_ij;
the extracted consequent defect scene factors Y_i form the consequent scene factor Y = [Y_1, Y_2, …, Y_n];
Scene factor = [antecedent scene factor, consequent scene factor].
The antecedent scene factor represents the information of the defect-feature association; applied before the encoder, it guides effective feature extraction and reduces the noise of the samples;
the consequent scene factor represents the information of the feature-scene association; applied after the decoder and before rule generation, it guides rule generation and filters invalid rules.
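Under one reading of the extraction step, with binary extraction factors (a_ij, b_ij zero wherever the corresponding element product vanishes) and row-sum aggregation, computing the antecedent and consequent scene factors can be sketched in numpy; these exact definitions are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(1)
n, d = 4, 6                        # n defect classes, d feature-vector dimensions
Z = rng.integers(0, 2, (n, d))     # defect type matrix
T = rng.random((n, d))             # feature information matrix
R = rng.integers(0, 2, (n, d))     # scene information matrix

# First extraction factor: a_ij = 0 where Z_ij * T_ij = 0, else 1.
A = (Z * T != 0).astype(float)
X = (A * T).sum(axis=1)            # antecedent defect scene factors X_1..X_n

# Second extraction factor: b_ij = 0 where T_ij * R_ij = 0, else 1.
B = (T * R != 0).astype(float)
Y = (B * R).sum(axis=1)            # consequent defect scene factors Y_1..Y_n

scene_factor = np.concatenate([X, Y])   # [antecedent, consequent]
```

X would then be injected before the encoder (feature extraction) and Y after the decoder (rule filtering), matching the two roles described above.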
In another embodiment provided by the present invention, the step S3 specifically includes: applying the antecedent scene factor among the scene factors to the encoder of the self-coding network structure model to perform effective feature extraction;
applying the consequent scene factor among the scene factors to the decoder of the self-coding network structure model to generate rules;
inputting the feature vector W obtained by encoding sample data of various defects; borrowing the idea of residual networks, a scene factor is introduced at the superposition step in the structure of the basic operation block, so that the scene factor is hidden in the hierarchical structure of the stacked self-coding network structure model; decoding and outputting yields the scene rule output [type, feature, scene];
outputting the scene rules through a semi-supervised stacking self-encoder, adding a classifier in a decoding stage to realize a classification function, optimizing the self-encoding network structure model classifier through matching training of data and rules, and generating the modal identification model.
In the specific implementation of the present embodiment, referring to fig. 2, a schematic flow chart of a multi-modal data pre-training and recognition method according to another embodiment of the present invention is shown;
in fig. 2, a scene rule knowledge base construction based on a semi-supervised self-coding network is studied, and a stacked self-coding network structure carrying defect scene information is designed;
applying the antecedent scene factors (defect and feature) among the scene factors to the encoder of the self-coding network structure model to extract effective features; applying the consequent scene factors (feature and scene) among the scene factors to the decoder of the self-coding network structure model to generate rules, so that the scene factors are hidden in the hierarchical structure of the stacked self-coding network; and adding coding structures and various kinds of classification feature information after the stacked self-coding network, so that the constructed model has the functions of modal identification and scene prejudgment;
firstly, in a stacked self-coding network, an encoder and a decoder are in a symmetrical structural model, and the basic operation block structure of the network is designed in the coding network. By taking the thought of a residual error network as a reference, in the structure of a basic operation block, a scene factor is introduced during superposition, so that the scene factor is hidden in a hierarchical structure in the stacking of a self-coding network;
inputting a characteristic vector W consisting of sample data W1-Wi after data preprocessing is carried out on input sample data X1-Xi into a self-coding network structure model, introducing a scene factor in the structure of a basic operation block during superposition by using the thought of a residual error network, so that the scene factor is hidden in a hierarchical structure in the stack of the self-coding network structure model, and decoding and outputting to obtain a scene rule output [ type, characteristic and scene ];
outputting the scene rules through a semi-supervised stacking self-encoder, adding a classifier in a decoding stage to realize a classification function, optimizing the self-encoding network structure model classifier through matching training of data and rules, and generating the modal identification model.
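One possible shape of this stacked structure, with residual injection of a scene factor at each encoder block, a symmetric decoder and a softmax classifier head, is sketched below as a plain forward pass; the layer sizes, the tanh activation and the softmax head are assumptions, not specified by the embodiment:

```python
import numpy as np

rng = np.random.default_rng(0)

def block(x, w, scene):
    # Basic operation block: affine map + activation, then residual
    # injection of the scene factor at the superposition step.
    return np.tanh(x @ w) + scene

dims = [12, 8, 4]                                  # stacked encoder: 12 -> 8 -> 4
enc_w = [rng.normal(0, 0.1, (a, b)) for a, b in zip(dims, dims[1:])]
dec_w = [w.T.copy() for w in reversed(enc_w)]      # symmetric decoder
cls_w = rng.normal(0, 0.1, (dims[0], 3))           # classifier over 3 defect classes

x = rng.random((5, 12))                            # 5 encoded sample feature vectors W
scenes = [rng.random(b) * 0.01 for b in dims[1:]]  # one scene factor per layer

h = x
for w, s in zip(enc_w, scenes):                    # encoding with hidden scene factors
    h = block(h, w, s)
for w in dec_w:                                    # symmetric decoding
    h = np.tanh(h @ w)
logits = h @ cls_w                                 # classifier head added after decoding
probs = np.exp(logits) / np.exp(logits).sum(axis=1, keepdims=True)
```

Training (matching data against rules and optimizing the classifier) is omitted; the sketch only shows where the scene factors sit inside the stack.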
A modal identification and scene prejudgment method based on a semi-supervised self-coding network first constructs a basic defect scene knowledge base containing information such as static defect representation, dynamic defect evolution, defect classification and defect-scene rules through multi-source heterogeneous data fusion. Then, based on a self-coding network, scene factors are introduced and fused into the stacked self-coding network; through learning on data samples, a certain class of data sample is encoded into a feature vector, the mapping from a certain class of image space to a latent space is learned, feature models of various types, positions and degrees are generated, and matching training of data and rules is performed. By constructing and applying the defect scene knowledge base, the defect detection model gains the function of scene prejudgment, can infer the cause of a defect from the defect information, and helps the production line design and process optimization of industrial products.
In another embodiment provided by the present invention, the objective function of the self-coding network structure model is specifically:
V(G, D) = E_{x~P_data(x)}[log D(x)] + E_{z~P_z(x)}[log(1 − D(G(z)))] + E_z[log P(y | G(z))]
the loss function of the self-coding network structure model is specifically as follows:
Loss = Σ_{i=1}^{N} 1_i · [(a_i − â_i)² + (b_i − b̂_i)²] + Σ_{i=1}^{N} 1_i · [|w_i − ŵ_i| + |h_i − ĥ_i|] + Σ_{i=1}^{N} 1_i^c · (c_i − ĉ_i)²
wherein V(G, D) is the whole defined objective function and N is the number of original labels; E_{x~P_data(x)} represents the probability P of the original label in the output data(x) of the defect sample x after passing through the self-coding network, and E_{z~P_z(x)} represents the probability P of the original label in the output data z(x) after the sample x carrying defect knowledge passes through the self-coding network; D(X) is a conditional probability calculation function, and G(z) is the probability of comparing the output information y in the applied classification category data under the condition of the class model G(z); 1_i represents the presence or absence of class-i defects in the image; a, b, w, h and c are the constituent variables of each grid during defect detection, a and b being the lower-left corner point of the grid, w and h the width and height of the grid, and c the confidence of the grid; the term (a_i − â_i)² + (b_i − b̂_i)² represents the coordinate loss of the defect bounding box, calculated as the mean square error of the position information; the term |w_i − ŵ_i| + |h_i − ĥ_i| represents the size loss of the defect bounding box, calculated as the absolute mean square error of the size information; and the term 1_i^c · (c_i − ĉ_i)² calculates the confidence loss by judging whether the grid belongs to the i-th defect type.
In the specific implementation of this embodiment, the objective function designed when the self-coding network structure model with defect scene information designed in this patent is applied to classification and identification is:
V(G, D) = E_{x~P_data(x)}[log D(x)] + E_{z~P_z(x)}[log(1 − D(G(z)))] + E_z[log P(y | G(z))]
where V(G, D) is the defined overall objective function, calculated from the angle of maximum contribution, and D(X) is a conditional probability calculation function obtained by improving the adversarial network equation; the function divides into three parts. The first part reflects the objective function calculation of the encoding stage; both this stage and the overall function are to be made as large as possible, so as to obtain the most representative feature information. The second part is the decoding stage; the output value of this stage should be as small as possible while the overall equation remains as large as possible, so that the decoding difference is small. The third part is target classification and identification: G(z) is the probability of comparing the output information y in the applied classification category data under the condition of the class model G(z), which represents the accuracy of classification;
E_{x~P_data(x)} represents the probability P of the original label in the output data(x) of the defect sample x after passing through the self-coding network, and E_{z~P_z(x)} represents the probability P of the original label in the output data z(x) after the sample x carrying defect knowledge passes through the self-coding network; (â, b̂) estimates the center point;
the loss function of the self-coding network structure model is specifically:
Loss = Σ_{i=1}^{N} 1_i · [(a_i − â_i)² + (b_i − b̂_i)²] + Σ_{i=1}^{N} 1_i · [|w_i − ŵ_i| + |h_i − ĥ_i|] + Σ_{i=1}^{N} 1_i^c · (c_i − ĉ_i)²
wherein a, b, w, h and c are the constituent variables of each grid during defect detection, N is the number of original labels, a and b are the lower-left corner point of the grid, w and h are the width and height of the grid, and c is the confidence of the grid; the term (a_i − â_i)² + (b_i − b̂_i)² represents the coordinate loss of the defect bounding box, calculated as the mean square error of the position information; the term |w_i − ŵ_i| + |h_i − ĥ_i| represents the size loss of the defect bounding box, calculated as the absolute mean square error of the size information; and the term 1_i^c · (c_i − ĉ_i)² calculates the confidence loss by judging whether the grid belongs to the i-th defect type.
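The grid-wise loss described above (squared coordinate error, absolute size error, and a confidence term gated by the class indicator) can be sketched as follows; the per-cell array layout [a, b, w, h, c] and the uniform gating of all three terms by the same indicator are assumptions:

```python
import numpy as np

def detection_loss(pred, truth, indicator):
    """pred/truth: per-grid rows [a, b, w, h, c] (lower-left corner, width,
    height, confidence); indicator: 1 where the grid holds an i-th class
    defect, 0 otherwise (assumed to gate every term)."""
    a, b, w, h, c = pred.T
    ah, bh, wh, hh, ch = truth.T
    coord = (indicator * ((a - ah) ** 2 + (b - bh) ** 2)).sum()   # coordinate loss
    size = (indicator * (np.abs(w - wh) + np.abs(h - hh))).sum()  # size loss
    conf = (indicator * (c - ch) ** 2).sum()                      # confidence loss
    return coord + size + conf

pred = np.array([[0.6, 0.5, 0.25, 0.2, 0.9], [0.1, 0.1, 0.3, 0.3, 0.2]])
truth = np.array([[0.5, 0.5, 0.2, 0.2, 1.0], [0.0, 0.0, 0.0, 0.0, 0.0]])
ind = np.array([1.0, 0.0])
loss = detection_loss(pred, truth, ind)
```

With the sample values above, only the first grid cell contributes: 0.1² (coordinate) + 0.05 (size) + 0.1² (confidence).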
In another embodiment of the present invention, the scene rule output is further processed by hidden-layer training of the stacked self-encoder, continuously generating and updating defect scene rules and supplementing them into the defect scene rule database.
In the specific implementation of this embodiment, a classifier is added in the decoding stage of the semi-supervised stacked self-encoder so that the output after the decoder realizes the classification function; through hidden-layer training of the stacked self-encoder, defect-scene rule knowledge can be continuously generated, updated and supplemented into the defect scene rule database, further perfecting the knowledge base of defect-scene mapping rules.
In the specific implementation of this embodiment, referring to fig. 2, the scene rule knowledge base is supplemented with the rules generated after the decoder output: a scene factor is extracted through the previous consequent factor [Yi-1], the scene hierarchical matrix is updated for the self-coding network structure model, and the vector matrix [Yi] of the extracted scene factor is likewise supplemented into the input feature vector;
the stacking is performed in the form of a scene factor structure. In the stacking substructure, the antecedent scene factors are merged into the first layer of training and the consequent scene factors into the second layer; the usage is the same in both cases, one being threshold use and the other weight amplification. Threshold use means influencing the activation function: on the basis of the original full connection, defect features whose value falls below a threshold after matrix-entry verification against the antecedent/consequent scene factors are directly discarded, which avoids excessive feature/scene information and ultimately prevents overfitting in application. Weight amplification, on the other hand, further amplifies the effective features, which prevents the vanishing-gradient phenomenon that deep learning easily produces and thus prevents the loss of effective features. Through these two aspects, the rules formed by training the stacked self-coding network fit defect scenes better.
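The two uses described above, threshold use (discarding features whose scene-weighted value falls below a threshold) and weight amplification (boosting the surviving effective features), can be sketched as a single gating step; the threshold and gain values are arbitrary illustrative choices:

```python
import numpy as np

def apply_scene_factor(features, scene, threshold=0.1, gain=1.5):
    """Threshold use: discard features whose scene-weighted value is below
    the threshold (avoids excess feature/scene information, prevents
    overfitting). Weight amplification: amplify the surviving effective
    features (counters vanishing gradients / loss of effective features)."""
    weighted = features * scene                 # matrix-entry check against the factor
    mask = np.abs(weighted) >= threshold
    return np.where(mask, weighted * gain, 0.0)

f = np.array([0.8, 0.05, 0.4, 0.02])
s = np.array([1.0, 1.0, 0.5, 2.0])
out = apply_scene_factor(f, s)
```

Here the second and fourth features fall below the 0.1 threshold and are zeroed, while the surviving features are amplified by the gain.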
In another embodiment provided by the present invention, referring to fig. 3, it is a schematic structural diagram of a multi-modal data pre-training and recognition apparatus provided by the embodiment of the present invention, the apparatus includes:
the database construction module is used for carrying out multi-source heterogeneous data fusion on the acquired defect basic data and constructing a defect scene rule database;
the scene factor extraction module is used for extracting defect type information, characteristic information and scene information from the defect scene rule database, performing data association and extracting scene factors of the defect scene rule database;
the model generation module is used for constructing a self-coding network structure model carrying defect scene information, integrating the scene factors into the self-coding network structure model, inputting a characteristic vector obtained by coding sample data of various defects, performing matching training of data and rules and generating a modal identification model;
and the defect identification module is used for identifying the defects of the sample to be detected according to the modal identification model.
It should be noted that the multi-modal data pre-training and recognition apparatus provided in the embodiment of the present invention can perform the multi-modal data pre-training and recognition method described in any embodiment of the above embodiments, and specific functions of the multi-modal data pre-training and recognition apparatus are not described in detail herein.
Fig. 4 is a schematic structural diagram of a terminal device according to an embodiment of the present invention. The terminal device of this embodiment includes: a processor, a memory, and a computer program, such as a multimodal data pre-training and recognition program, stored in the memory and executable on the processor. When the processor executes the computer program, the steps in each of the above embodiments of the method for pre-training and recognizing multimodal data, such as steps S1 to S5 shown in fig. 1, are implemented. Alternatively, the processor implements the functions of the modules in the above device embodiments when executing the computer program.
Illustratively, the computer program may be partitioned into one or more modules/units that are stored in the memory and executed by the processor to implement the invention. The one or more modules/units may be a series of computer program instruction segments capable of performing specific functions, which are used for describing the execution process of the computer program in the terminal device. For example, the computer program may be divided into modules, and the specific functions of the modules are not described again.
The terminal device can be a desktop computer, a notebook, a palm computer, a cloud server and other computing devices. The terminal device may include, but is not limited to, a processor, a memory. It will be appreciated by those skilled in the art that the schematic diagram is merely an example of a terminal device and does not constitute a limitation of a terminal device, and may include more or less components than those shown, or combine certain components, or different components, for example, the terminal device may also include input output devices, network access devices, buses, etc.
The Processor may be a Central Processing Unit (CPU), other general purpose Processor, a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field Programmable Gate Array (FPGA) or other Programmable logic device, discrete Gate or transistor logic, discrete hardware components, etc. The general-purpose processor may be a microprocessor or the processor may be any conventional processor or the like, which is the control center of the terminal device and connects the various parts of the whole terminal device using various interfaces and lines.
The memory may be used for storing the computer programs and/or modules, and the processor may implement various functions of the terminal device by executing or executing the computer programs and/or modules stored in the memory and calling data stored in the memory. The memory may mainly include a storage program area and a storage data area, wherein the storage program area may store an operating system, an application program required by at least one function (such as a sound playing function, an image playing function, etc.), and the like; the storage data area may store data (such as audio data, a phonebook, etc.) created according to the use of the cellular phone, and the like. In addition, the memory may include high-speed random access memory, and may also include non-volatile memory, such as a hard disk, a memory, a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) Card, a Flash memory Card (Flash Card), at least one magnetic disk storage device, a Flash memory device, or other volatile solid state storage device.
Wherein, the terminal device integrated module/unit can be stored in a computer readable storage medium if it is implemented in the form of software functional unit and sold or used as an independent product. Based on such understanding, all or part of the flow of the method according to the embodiments of the present invention may also be implemented by a computer program, which may be stored in a computer-readable storage medium, and when the computer program is executed by a processor, the steps of the method embodiments may be implemented. Wherein the computer program comprises computer program code, which may be in code form, in object code form, in an executable file or in some intermediate form, etc. The computer-readable medium may include: any entity or device capable of carrying the computer program code, recording medium, usb disk, removable hard disk, magnetic disk, optical disk, computer Memory, read-Only Memory (ROM), random Access Memory (RAM), electrical carrier wave signals, telecommunications signals, software distribution medium, and the like.
It should be noted that the above-described device embodiments are merely illustrative, where the units described as separate parts may or may not be physically separate, and the parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on multiple network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of the present embodiment. In addition, in the drawings of the embodiment of the apparatus provided by the present invention, the connection relationship between the modules indicates that there is a communication connection between them, and may be specifically implemented as one or more communication buses or signal lines. One of ordinary skill in the art can understand and implement it without inventive effort.
While the foregoing is directed to the preferred embodiment of the present invention, it will be understood by those skilled in the art that various changes and modifications may be made without departing from the spirit and scope of the invention.

Claims (10)

1. A multi-modal data pre-training and recognition method, the method comprising:
performing multi-source heterogeneous data fusion on acquired defect basic data, and constructing a defect scene rule database;
extracting defect type information, characteristic information and scene information from the defect scene rule database, performing data association, and extracting scene factors of the defect scene rule database;
constructing a self-coding network structure model carrying defect scene information, integrating the scene factors into the self-coding network structure model, inputting feature vectors obtained by coding sample data of various defects, performing matching training of data and rules, and generating a modal identification model;
and identifying the defects of the sample to be detected according to the modal identification model.
2. The multi-modal data pre-training and recognition method of claim 1, wherein the multi-source heterogeneous data fusion is performed on the acquired defect base data to construct a defect scene rule database, specifically comprising:
performing multi-source heterogeneous data fusion on defect basic data consisting of historical experience data, common rule data and defect standard data to form a defect scene rule database which is associated with defect types, positions and scales;
the defect scene rules database includes: a surface defect data set, a defect rule data set, a detection system data set, and a process scene data set.
3. The method of claim 2, wherein the surface defect dataset D1= [ surface defect ID, defect geometry, spatial distribution data, defect statistics, defect spectral data ];
the defect rule data set D2= [ defect rule ID, detection object type, defect classification statistical data, damage mechanism data, defect cause rule, defect grade ];
the detection system data set D3= [ detection system ID, device type, production line design data, technology model selection ];
the process scene data set D4= [ process scene data ID, detection object type, scene factor, production process ];
the defect geometry includes: point-line-surface defects, boundaries, skeletons, shapes, positions, sizes, stretching and translation;
the spatial distribution data includes: entropy, contrast, consistency and correlation;
the defect statistical data comprise gray level co-occurrence matrixes, autocorrelation coefficients, mathematical morphology, histogram statistical characteristics, fractal values and defect frequency spectrum subsets;
the histogram statistical features include range, mean, geometric mean, harmonic mean, standard deviation, variance and median;
the fractal values include the fractal dimensions under stretching and translation, and the porosity;
the defect spectrum subset comprises a texture spectrum, a taint spectrum, and a sawtooth spectrum;
the defect classification statistical data specifically comprises the failure modes obtained by automatic defect division;
the defect level comprises the detection object type;
the detection object types comprise semiconductors, circuit boards, wafers, fabrics, metal surfaces and woods;
the scene factors comprise operation scale and equipment type selection;
the production process comprises the steps of blank making, grinding, rolling, shearing, bundling and finished product forming.
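The multi-source fusion of the four datasets into one rule entry can be sketched as follows. This is an illustrative Python sketch only: all field names (`detection_object_type`, `defect_geometry`, and so on) and the join layout are assumptions derived from the claim text, not the patented implementation.

```python
# Hypothetical sketch: associate one record from each of the four datasets
# (D1 surface defect, D2 defect rule, D3 detection system, D4 process scene)
# into a single defect-scene-rule entry keyed by defect type, position and
# scale, as described in claim 2. Field names are assumed from the claims.

def fuse_defect_scene_rule(d1, d2, d3, d4):
    """Fuse surface-defect, defect-rule, detection-system and
    process-scene records into one rule-database entry."""
    return {
        "defect_type": d2["detection_object_type"],
        "position": d1["defect_geometry"]["position"],
        "scale": d1["defect_geometry"]["size"],
        "surface_defect": d1,
        "defect_rule": d2,
        "detection_system": d3,
        "process_scene": d4,
    }

d1 = {"surface_defect_id": "SD-001",
      "defect_geometry": {"position": (12, 40), "size": 3.5}}
d2 = {"defect_rule_id": "DR-007",
      "detection_object_type": "metal surface", "defect_grade": 2}
d3 = {"detection_system_id": "DS-003", "device_type": "line-scan camera"}
d4 = {"process_scene_data_id": "PS-010",
      "scene_factor": "large-scale", "production_process": "rolling"}

rule = fuse_defect_scene_rule(d1, d2, d3, d4)
```

A real database would index such entries by the (type, position, scale) key so that later claims can look up rules by defect association.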
4. The multi-modal data pre-training and recognition method of claim 2, wherein the extracting defect type information, feature information, and scene information from the defect scene rules database, performing data association, and extracting scene factors of the defect scene rules database, specifically comprises:
extracting defect type information from the surface defect dataset, extracting feature information from the surface defect dataset and the defect rule dataset, and extracting scene information from the inspection system dataset and the process scene dataset;
for the defect Z, constructing a layered matrix Z×T×R according to the extracted defect type information, the characteristic information and the scene information;
for the defect-feature association information, mapping and extracting from the matrix Z×T with a first extraction factor a_ij to obtain the antecedent defect scene factors, and forming the antecedent scene factor from all the extracted antecedent defect scene factors;
for the feature-scene association information, mapping and extracting from the matrix T×R with a second extraction factor b_ij to obtain the consequent defect scene factors, and forming the consequent scene factor from all the extracted consequent defect scene factors;
determining the scene factor according to the extracted antecedent scene factor and consequent scene factor;
[the symbols denoting the antecedent and consequent scene factors survive only as formula images in the source]
wherein n is the number of defect classes, j is the feature-vector dimension, Z_ij is an element of the defect matrix, T_ij is an element of the characteristic-information matrix, R_ij is an element of the scene-information matrix, and i = 1, 2, …, n; the definitions of the extraction factors and the piecewise conditions under which the scene factors are set to 0 survive only as formula images in the source.
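Because the extraction-factor formulas survive only as images, the numpy sketch below illustrates only the shape of the computation in claim 4: element-wise defect-feature products stand in for the first extraction-factor mapping over Z×T, feature-scene products for the second mapping over T×R, and the row-sum aggregation into one factor per defect class is an assumption.

```python
import numpy as np

def scene_factors(Z, T, R):
    """Illustrative antecedent/consequent scene-factor extraction.

    Z: defect matrix (n x j), T: characteristic-information matrix (n x j),
    R: scene-information matrix (n x j). Element-wise products stand in for
    the first/second extraction-factor mappings; the row-sum aggregation
    into one factor per defect class is an assumption.
    """
    antecedent = (Z * T).sum(axis=1)  # defect-feature association per class
    consequent = (T * R).sum(axis=1)  # feature-scene association per class
    return antecedent, consequent

# Toy matrices for n = 2 defect classes and j = 2 feature dimensions.
Z = np.array([[1.0, 0.0], [0.5, 1.0]])
T = np.array([[0.2, 0.9], [0.4, 0.0]])
R = np.array([[1.0, 1.0], [0.0, 0.7]])
q, h = scene_factors(Z, T, R)
```

Note how a zero entry in either operand zeroes the corresponding product term, mirroring the claim's piecewise "set to 0" conditions.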
5. The method according to claim 1, wherein the constructing a self-coding network structure model carrying scene information of the defect, fusing the scene factor into the self-coding network structure model, inputting a feature vector obtained by coding sample data of various defects, performing matching training of data and rules, and generating a modal recognition model specifically comprises:
applying the antecedent scene factors in the scene factors to an encoder of the self-encoding network structure model to extract effective features;
applying the latter scene factor in the scene factors to a decoder of the self-coding network structure model to generate rules;
inputting a feature vector W obtained by coding the sample data of various defects; drawing on the idea of residual networks, introducing the scene factor into the structure of the basic operation block during stacking, so that the scene factor is hidden in the hierarchical structure of the stack of the self-coding network structure model; and decoding and outputting to obtain the scene rule output [type, feature, scene];
and outputting the scene rules through a semi-supervised stacking self-encoder, adding a classifier in a decoding stage to realize a classification function, and optimizing the self-encoding network structure model classifier through matching training of data and rules to generate the modal recognition model.
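A minimal numpy sketch of this stacking idea follows: the scene factor is added residual-style inside each stacked block, and a softmax classifier is attached at the decoding stage. The layer sizes, the additive injection, and the 4-class classifier head are all assumptions for illustration, not the patented architecture.

```python
import numpy as np

rng = np.random.default_rng(0)

def relu(x):
    return np.maximum(x, 0.0)

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def stacked_forward(x, weights, scene_factor, W_cls):
    """Forward pass of a stacked self-encoder where the scene factor is
    injected residual-style into every basic operation block, so it stays
    hidden in the hierarchy of the stack."""
    h = x
    for W in weights:
        h = relu(W @ h) + scene_factor  # residual-style scene injection
    return softmax(W_cls @ h)           # classifier at the decoding stage

x = rng.normal(size=8)                    # encoded feature vector W
weights = [0.1 * rng.normal(size=(8, 8)) for _ in range(3)]
scene = np.full(8, 0.05)                  # assumed per-unit scene factor
W_cls = 0.1 * rng.normal(size=(4, 8))     # 4 defect classes (assumed)
probs = stacked_forward(x, weights, scene, W_cls)
```

Training would then optimize the block weights and the classifier jointly against the matched data-rule pairs, which is what the claim calls "matching training of data and rules".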
6. The method for pre-training and recognizing multimodal data as claimed in claim 1, wherein the objective function of the self-coding network structure model is specifically:
[the objective function survives only as a formula image in the source]
the loss function of the self-coding network structure model is specifically as follows:
[the loss function survives only as a formula image in the source]
wherein V(G, D) is the overall defined objective function; N is the number of original labels; P(x) denotes the probability that the defect sample x is the original label in the output data after passing through the self-coding network; P_z(x) denotes the probability of the original label in the output data after the sample x carrying defect knowledge passes through the self-coding network; D(X) is a conditional probability calculation function, and G(z) is the probability of comparing the output information y in the applied classification-category data under the category model G(z); an indicator term denotes whether a defect of a given class is present; a, b, w, h and c are the constituent variables of each grid during defect detection, where a and b give the lower-left corner point of the grid, w and h are the width and height of the grid, and c is the confidence of the grid; the coordinate loss of the defect bounding box is the mean square error computed from the position information; the size loss of the defect bounding box is the absolute mean square error computed from the size information; and the confidence loss is computed per defect type.
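Since the exact loss formulas survive only as images, the sketch below illustrates just the three named terms for a single grid cell, with (a, b) the lower-left corner, (w, h) the width/height and c the confidence. The square-root scaling on the size term is an assumption borrowed from YOLO-style detectors, not taken from the patent.

```python
import numpy as np

def bbox_losses(pred, target):
    """Per-grid loss terms named in the claim: coordinate loss (mean square
    error on the lower-left corner a, b), size loss (absolute mean square
    error on width/height w, h; sqrt scaling is an assumed convention from
    YOLO-style detectors), and confidence loss on c."""
    a, b, w, h, c = pred
    at, bt, wt, ht, ct = target
    coord = (a - at) ** 2 + (b - bt) ** 2
    size = abs(np.sqrt(w) - np.sqrt(wt)) ** 2 \
         + abs(np.sqrt(h) - np.sqrt(ht)) ** 2
    conf = (c - ct) ** 2
    return coord, size, conf

coord, size, conf = bbox_losses((0.4, 0.5, 2.0, 2.0, 0.9),
                                (0.5, 0.5, 2.0, 2.0, 1.0))
total = coord + size + conf
```

In a full detector these per-cell terms would be weighted and summed over all grid cells and defect classes to give the total training loss.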
7. The method of claim 5, wherein the scene rule output is further trained through hidden layers of a stacked self-encoder, continuously generating and updating defect scene rules, and supplementing the defect scene rules into the defect scene rules database.
8. A multi-modal data pre-training and recognition device, the device comprising:
the database construction module is used for carrying out multi-source heterogeneous data fusion on the acquired defect basic data and constructing a defect scene rule database;
the scene factor extraction module is used for extracting defect type information, characteristic information and scene information from the defect scene rule database, performing data association and extracting scene factors of the defect scene rule database;
the model generation module is used for constructing a self-coding network structure model carrying defect scene information, integrating the scene factors into the self-coding network structure model, inputting a characteristic vector obtained by coding sample data of various defects, performing matching training of data and rules and generating a modal identification model;
and the defect identification module is used for identifying the defects of the sample to be detected according to the modal identification model.
9. A computer-readable storage medium, comprising a stored computer program, wherein the computer program, when executed, controls a device on which the computer-readable storage medium is located to perform the multimodal data pre-training and recognition method as claimed in any one of claims 1 to 7.
10. A terminal device comprising a processor, a memory, and a computer program stored in the memory and configured to be executed by the processor, the processor implementing the multimodal data pre-training and recognition method as claimed in any one of claims 1 to 7 when executing the computer program.
CN202310272537.2A 2023-03-21 2023-03-21 Multi-mode data pre-training and identifying method, device, equipment and medium Active CN115984662B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310272537.2A CN115984662B (en) 2023-03-21 2023-03-21 Multi-mode data pre-training and identifying method, device, equipment and medium

Publications (2)

Publication Number Publication Date
CN115984662A true CN115984662A (en) 2023-04-18
CN115984662B CN115984662B (en) 2023-08-04

Family

ID=85958593

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310272537.2A Active CN115984662B (en) 2023-03-21 2023-03-21 Multi-mode data pre-training and identifying method, device, equipment and medium

Country Status (1)

Country Link
CN (1) CN115984662B (en)


Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109919934A (en) * 2019-03-11 2019-06-21 重庆邮电大学 A kind of liquid crystal display panel defect inspection method based on the study of multi-source domain depth migration
CN112164067A (en) * 2020-10-12 2021-01-01 西南科技大学 Medical image segmentation method and device based on multi-mode subspace clustering
US20210183052A1 (en) * 2018-12-28 2021-06-17 Omron Corporation Defect inspecting device, defect inspecting method, and storage medium
CN113066070A (en) * 2021-03-31 2021-07-02 广东电网有限责任公司 Multi-source data fusion interaction method in three-dimensional scene
CN113436184A (en) * 2021-07-15 2021-09-24 南瑞集团有限公司 Power equipment image defect judging method and system based on improved twin network
US20220383479A1 (en) * 2021-05-20 2022-12-01 Hon Hai Precision Industry Co., Ltd. Method for detecting defects in images, computer device, and storage medium
US20220405909A1 (en) * 2020-12-03 2022-12-22 Boe Technology Group Co., Ltd. Computer-implemented method for defect analysis, apparatus for defect analysis, computer-program product, and intelligent defect analysis system


Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Zhang Xuzhong et al.: "Research on wood defect image recognition and segmentation models based on deep reinforcement learning", Electronic Measurement Technology, no. 17, pp. 86-92 *

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117036141A (en) * 2023-10-08 2023-11-10 交通运输部公路科学研究所 Data processing method and data interaction system for highway full life cycle
CN117036141B (en) * 2023-10-08 2023-12-08 交通运输部公路科学研究所 Data processing method and data interaction system for highway full life cycle
CN117376632A (en) * 2023-12-06 2024-01-09 中国信息通信研究院 Data recovery method and system based on intelligent depth synthesis
CN117376632B (en) * 2023-12-06 2024-02-06 中国信息通信研究院 Data recovery method and system based on intelligent depth synthesis
CN118505704A (en) * 2024-07-18 2024-08-16 成都数之联科技股份有限公司 General model modeling detection method for detecting defects of panel production line
CN118505704B (en) * 2024-07-18 2024-10-22 成都数之联科技股份有限公司 General model modeling detection method for detecting defects of panel production line

Also Published As

Publication number Publication date
CN115984662B (en) 2023-08-04

Similar Documents

Publication Publication Date Title
CN111833306B (en) Defect detection method and model training method for defect detection
CN110738207B (en) Character detection method for fusing character area edge information in character image
CN111462120B (en) Defect detection method, device, medium and equipment based on semantic segmentation model
CN115984662A (en) Multi-mode data pre-training and recognition method, device, equipment and medium
CN114155244B (en) Defect detection method, device, equipment and storage medium
CN111079638A (en) Target detection model training method, device and medium based on convolutional neural network
CN112989995B (en) Text detection method and device and electronic equipment
CN114092474B (en) Method and system for detecting processing defects of complex texture background of mobile phone shell
CN111325237B (en) Image recognition method based on attention interaction mechanism
CN114255223B (en) Deep learning-based double-stage bathroom ceramic surface defect detection method and equipment
CN110570442A (en) Contour detection method under complex background, terminal device and storage medium
CN114445410A (en) Circuit board detection method based on image recognition, computer and readable storage medium
CN114581646A (en) Text recognition method and device, electronic equipment and storage medium
CN116205881A (en) Digital jet printing image defect detection method based on lightweight semantic segmentation
CN113673528B (en) Text processing method, text processing device, electronic equipment and readable storage medium
CN111144425A (en) Method and device for detecting screen shot picture, electronic equipment and storage medium
CN117523087B (en) Three-dimensional model optimization method based on content recognition
CN108428234B (en) Interactive segmentation performance optimization method based on image segmentation result evaluation
CN116109627B (en) Defect detection method, device and medium based on migration learning and small sample learning
CN113378837A (en) License plate shielding identification method and device, electronic equipment and storage medium
CN112966730A (en) Vehicle damage identification method, device, equipment and storage medium
Xu et al. Tolerance Information Extraction for Mechanical Engineering Drawings–A Digital Image Processing and Deep Learning-based Model
CN115345895B (en) Image segmentation method and device for visual detection, computer equipment and medium
CN114511862B (en) Form identification method and device and electronic equipment
CN113505784A (en) Automatic nail annotation analysis method and device, electronic equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
CB03 Change of inventor or designer information

Inventor after: Luo Liang

Inventor after: Lin Zhu

Inventor after: Li Haiwei

Inventor after: Ma Zhiping

Inventor after: Feng Diehua

Inventor before: Luo Liang

Inventor before: Lin Zhu

Inventor before: Li Haiwei

Inventor before: Ma Zhiping

Inventor before: Feng Zhihua

GR01 Patent grant