
CN111340785A - Model training method, product surface defect detection method and storage medium - Google Patents

Model training method, product surface defect detection method and storage medium

Info

Publication number
CN111340785A
CN111340785A (application CN202010122929.7A)
Authority
CN
China
Prior art keywords
network
product surface
discriminator
training
image
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202010122929.7A
Other languages
Chinese (zh)
Other versions
CN111340785B (en)
Inventor
蔡长青 (Cai Changqing)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guangzhou University
Original Assignee
Guangzhou University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Guangzhou University filed Critical Guangzhou University
Priority to CN202010122929.7A priority Critical patent/CN111340785B/en
Publication of CN111340785A publication Critical patent/CN111340785A/en
Application granted granted Critical
Publication of CN111340785B publication Critical patent/CN111340785B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical


Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/0002Inspection of images, e.g. flaw detection
    • G06T7/0004Industrial image inspection
    • GPHYSICS
    • G01MEASURING; TESTING
    • G01NINVESTIGATING OR ANALYSING MATERIALS BY DETERMINING THEIR CHEMICAL OR PHYSICAL PROPERTIES
    • G01N21/00Investigating or analysing materials by the use of optical means, i.e. using sub-millimetre waves, infrared, visible or ultraviolet light
    • G01N21/84Systems specially adapted for particular applications
    • G01N21/88Investigating the presence of flaws or contamination
    • G01N21/8851Scan or image signal processing specially adapted therefor, e.g. for scan signal adjustment, for detecting different kinds of defects, for compensating for structures, markings, edges
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/21Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/24Classification techniques
    • G06F18/241Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • GPHYSICS
    • G01MEASURING; TESTING
    • G01NINVESTIGATING OR ANALYSING MATERIALS BY DETERMINING THEIR CHEMICAL OR PHYSICAL PROPERTIES
    • G01N21/00Investigating or analysing materials by the use of optical means, i.e. using sub-millimetre waves, infrared, visible or ultraviolet light
    • G01N21/84Systems specially adapted for particular applications
    • G01N21/88Investigating the presence of flaws or contamination
    • G01N21/8851Scan or image signal processing specially adapted therefor, e.g. for scan signal adjustment, for detecting different kinds of defects, for compensating for structures, markings, edges
    • G01N2021/8854Grading and classifying of flaws
    • G01N2021/8874Taking dimensions of defect into account
    • GPHYSICS
    • G01MEASURING; TESTING
    • G01NINVESTIGATING OR ANALYSING MATERIALS BY DETERMINING THEIR CHEMICAL OR PHYSICAL PROPERTIES
    • G01N21/00Investigating or analysing materials by the use of optical means, i.e. using sub-millimetre waves, infrared, visible or ultraviolet light
    • G01N21/84Systems specially adapted for particular applications
    • G01N21/88Investigating the presence of flaws or contamination
    • G01N21/8851Scan or image signal processing specially adapted therefor, e.g. for scan signal adjustment, for detecting different kinds of defects, for compensating for structures, markings, edges
    • G01N2021/8854Grading and classifying of flaws
    • G01N2021/888Marking defects
    • GPHYSICS
    • G01MEASURING; TESTING
    • G01NINVESTIGATING OR ANALYSING MATERIALS BY DETERMINING THEIR CHEMICAL OR PHYSICAL PROPERTIES
    • G01N21/00Investigating or analysing materials by the use of optical means, i.e. using sub-millimetre waves, infrared, visible or ultraviolet light
    • G01N21/84Systems specially adapted for particular applications
    • G01N21/88Investigating the presence of flaws or contamination
    • G01N21/8851Scan or image signal processing specially adapted therefor, e.g. for scan signal adjustment, for detecting different kinds of defects, for compensating for structures, markings, edges
    • G01N2021/8887Scan or image signal processing specially adapted therefor, e.g. for scan signal adjustment, for detecting different kinds of defects, for compensating for structures, markings, edges based on image processing techniques
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/10Image acquisition modality
    • G06T2207/10004Still image; Photographic image
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20081Training; Learning
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/30Subject of image; Context of image processing
    • G06T2207/30108Industrial image inspection
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02PCLIMATE CHANGE MITIGATION TECHNOLOGIES IN THE PRODUCTION OR PROCESSING OF GOODS
    • Y02P90/00Enabling technologies with a potential contribution to greenhouse gas [GHG] emissions mitigation
    • Y02P90/30Computing systems specially adapted for manufacturing

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Physics & Mathematics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Computation (AREA)
  • General Engineering & Computer Science (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Software Systems (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Evolutionary Biology (AREA)
  • Quality & Reliability (AREA)
  • Signal Processing (AREA)
  • Chemical & Material Sciences (AREA)
  • Analytical Chemistry (AREA)
  • Biochemistry (AREA)
  • Immunology (AREA)
  • Pathology (AREA)
  • Image Analysis (AREA)
  • Image Processing (AREA)

Abstract

The invention discloses a model training method, a product surface defect detection method, and a storage medium. The training method comprises: obtaining an auto-encoding network and training it; obtaining a generative adversarial network comprising a generator network and a discriminator network, the discriminator network being constructed from the encoder network of the auto-encoding network; and training the generative adversarial network. The trained joint network, which combines a convolutional auto-encoding network with a generative adversarial network, retains both the unsupervised-training advantages of the convolutional auto-encoding network and the semi-supervised-training advantages of the generative adversarial network. It generalizes well and can cope with the few training samples, complex image appearance, and large intra-class variation typical of product surface defect detection, in particular steel surface defect detection, achieving good recognition results. The invention is widely applicable in the technical field of image detection.

Description

Model training method, product surface defect detection method and storage medium

Technical Field

The invention relates to the technical field of image detection, and in particular to a model training method, a product surface defect detection method, and a storage medium.

Background

In the production and maintenance of steel and other products, detecting and analyzing surface defects is an efficient approach to defect inspection. However, identifying surface defects has long been a difficult task, because defects occur rarely and vary widely in appearance.

In recent years, deep learning methods have shown excellent performance in image classification, especially when sufficient training samples are available. A number of prior techniques for identifying and detecting product surface defects have therefore emerged, including the extreme learning machine (ELM) and the support vector machine (SVM), further improved with genetic algorithms (GA), the RNAMlet feature corrector, the scale-invariant feature transform (SIFT), the shearlet transform, and so on. The general principle is to photograph the product surface, feed the captured image into a trained deep learning model, and use the model's output to decide whether the surface contains a defect and, if so, of what type. The effectiveness of this approach therefore depends on the performance of the deep learning model.

However, product surface defects have complex appearances, which makes the captured images highly complex as well. Figure 1 shows surface defects of steel products, where parts a-d are seams and parts e-h are oxide scale. As Figure 1 shows, even the same kind of surface defect can look very different: product surface defect images exhibit large intra-class variation, and their backgrounds are very complex. This gives existing techniques poor generalization ability, so in practice a model has to be built and trained separately for each appearance, which greatly increases the cost of use and reduces efficiency. Another characteristic of product surface defect images is that it is difficult to obtain enough sample images to train a model, and the context of these images differs greatly from that of most pre-trained models, so techniques aimed at small-data scenarios, such as transfer learning, are hard to apply in the field of product surface defect detection.

Summary of the Invention

In view of at least one of the above technical problems, the present invention aims to provide a model training method, a product surface defect detection method, and a storage medium.

In one aspect, an embodiment of the present invention provides a training method for a product surface defect detection model, comprising the following steps:

obtaining an auto-encoding network, the auto-encoding network comprising an encoder network and a decoder network;

obtaining sample images, at least some of which contain product surface defects;

training the auto-encoding network using the sample images;

obtaining a generative adversarial network comprising a generator network and a discriminator network, the discriminator network being constructed from the encoder network of the auto-encoding network;

obtaining real images and their labels, the real images containing product surface defects and each label indicating the type of product surface defect in the corresponding real image;

inputting the real images, or fake images generated by the generator network, into the discriminator network, obtaining the output of the discriminator network, and adjusting the parameters of the discriminator network and/or the generator network until the loss function of the discriminator network and/or the loss function of the generator network reaches a target value;

wherein the trained generative adversarial network serves as the product surface defect detection model.

Further, the step of adjusting the parameters of the discriminator network and/or the generator network specifically comprises:

reconstructing the sample images with the decoder network;

obtaining the error of the decoder network during reconstruction;

adjusting the encoder network according to the error.

Further, the loss function of the discriminator network is:

$$\mathrm{Loss}_D = \max \; \mathbb{E}_{x \sim p_{data}(x)}\left[\log D(x)\right] + \mathbb{E}_{z \sim p_z(z)}\left[\log\left(1 - D(G(z))\right)\right] + \alpha \left\| x - De(En(x)) \right\|^2$$

Further, the discriminator network is obtained by connecting the encoder network with a multi-class classification network.

Further, the multi-class classification network is a softmax network.

Further, the auto-encoding network is a convolutional auto-encoding network; the encoder network comprises a plurality of convolutional layers and a passthrough layer; the passthrough layer reduces the size of feature maps output by a shallower convolutional layer and concatenates them with feature maps output by a deeper convolutional layer.

Further, the deeper convolutional layer is the last convolutional layer in the encoder network.

Further, at least one of the convolutional layers in the encoder network performs a max-pooling operation.

In another aspect, an embodiment of the present invention further provides a product surface defect detection method, comprising the following steps:

obtaining an image to be inspected, the image containing a product surface;

inputting the image to be inspected into a product surface defect detection model trained by the training method described in the embodiments;

obtaining the output of the product surface defect detection model, and determining the product surface defect type according to the output.

In another aspect, an embodiment of the present invention further provides a storage medium storing processor-executable instructions which, when executed by a processor, perform the training method and/or detection method described in the embodiments.

The beneficial effects of the invention are as follows: the generative adversarial network trained by the model training method of the embodiments is in fact a joint convolutional auto-encoding network / generative adversarial network. It combines the unsupervised-training advantages of the convolutional auto-encoding network with the semi-supervised-training advantages of the generative adversarial network, generalizes well, and can cope with the few training samples, complex image appearance, and large intra-class variation faced in product surface defect detection, especially steel surface defect detection, achieving good recognition results.

Brief Description of the Drawings

Figure 1 is an image of surface defects of steel products;

Figure 2 is a schematic diagram of the principle of the model training method in the embodiments;

Figure 3 is a schematic diagram of the principle of the convolutional auto-encoding network constructed in the embodiments;

Figure 4 is a schematic diagram of the structure of the convolutional auto-encoding network constructed in the embodiments;

Figure 5 is a schematic diagram of the principle of the upsampling performed in the decoder network in the embodiments;

Figure 6 is a schematic diagram of the principle of the generative adversarial network in the embodiments.

Detailed Description

Embodiment 1

The training method proposed in this embodiment comprises the following steps:

P1. Obtain a convolutional auto-encoding network; the convolutional auto-encoding network comprises an encoder network and a decoder network.

P2. Obtain sample images; at least some of the sample images contain product surface defects.

P3. Train the convolutional auto-encoding network using the sample images.

P4. Obtain a generative adversarial network; the generative adversarial network comprises a generator network and a discriminator network, the discriminator network being constructed from the encoder network of the convolutional auto-encoding network.

P5. Obtain real images and their labels; the real images contain product surface defects, and each label indicates the type of product surface defect in the corresponding real image.

P6. Input the real images, or fake images generated by the generator network, into the discriminator network, obtain the output of the discriminator network, and adjust the parameters of the discriminator network and/or the generator network until the loss function of the discriminator network reaches a target value; the loss function is determined by the output of the discriminator network together with the labels of the real images or the labels of the fake images.

The trained generative adversarial network serves as the product surface defect detection model.

Provided the logic is preserved, the order of steps P1-P6 can be adjusted without affecting the implementation of the training method of this embodiment.

The goal of steps P1-P6 is to build a generative adversarial network and train it; the trained generative adversarial network is the desired product surface defect detection model.

The principle of steps P1-P6 is shown in Figure 2. Preferably, the auto-encoding network used in this embodiment is a convolutional auto-encoding network (CAE). Compared with a standard auto-encoding network, a convolutional auto-encoding network is better suited to high-dimensional image data. Moreover, in a convolutional auto-encoding network the weights and biases are shared across all positions of the input, which means it can preserve spatial locality. The decoder network's reconstruction is therefore based on a linear combination of basic image patches output by the encoder network.

Through steps P1-P4, the convolutional auto-encoding network is built first and trained on the sample images. The encoder network of the trained convolutional auto-encoding network, used directly or after slight modification, then serves as the discriminator network of the generative adversarial network and, together with the constructed generator network, forms the generative adversarial network.

In steps P1-P3, the principle of the constructed convolutional auto-encoding network is shown in Figure 3, where the input data x is an m-dimensional vector, x ∈ R^m, and the output features form an n-dimensional vector in R^n, with m > n. The principle of the convolutional auto-encoding network is to first transform the input data into a (usually lower-dimensional) space in the encoder network, and then expand it in the decoder network to reproduce the original data.

At run time, the convolutional auto-encoding network involves three main steps:

Step 1: transform the input x into the encoder network's code by:

$$f = \mathrm{sigmoid}\left(W_1^T x + b_1\right)$$

where x is the input vector; $W_1^T$ is the weight matrix between the input layer and the hidden layer; $b_1$ is the bias vector; f is the output value of the encoder network.

Step 2: from the encoder network's output, the decoder network reconstructs the input value by:

$$\hat{x} = \mathrm{sigmoid}\left(W_2^T f + b_2\right)$$

where f is the output of the encoder network; $W_2^T$ is the weight matrix between the hidden layer and the output layer; $b_2$ is the bias vector; $\hat{x}$ is the output value of the decoder network.
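As a minimal numeric illustration of these two steps, the following Python sketch runs a single-layer encode/decode pass; the dimensions m and n are arbitrary toy values rather than those of the convolutional network actually used in the embodiment.

```python
import numpy as np

def sigmoid(v):
    return 1.0 / (1.0 + np.exp(-v))

# Toy dimensions: m-dimensional input, n-dimensional code, m > n.
m, n = 784, 64
rng = np.random.default_rng(0)
W1 = rng.normal(scale=0.01, size=(m, n))  # input-to-hidden weights
b1 = np.zeros(n)                          # hidden bias
W2 = rng.normal(scale=0.01, size=(n, m))  # hidden-to-output weights
b2 = np.zeros(m)                          # output bias

x = rng.random(m)                 # stand-in for a flattened image
f = sigmoid(x @ W1 + b1)          # step 1: f = sigmoid(W1^T x + b1)
x_hat = sigmoid(f @ W2 + b2)      # step 2: x_hat = sigmoid(W2^T f + b2)
print(f.shape, x_hat.shape)       # (64,) (784,)
```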

Training the convolutional auto-encoding network is unsupervised: the sample images obtained in step P2 need not be labeled; it suffices that some of the sample images contain product surface defects and the rest do not. The training objective is to minimize the error between the original image and the decoder network's output:

$$\mathrm{Loss} = \min_{En,\,De} \left\| x - \hat{x} \right\|^2, \qquad \hat{x} = De(En(x))$$

where En denotes the encoder network, De the decoder network, x the input vector, and $\hat{x}$ the output of the decoder network. A penalty term α can additionally be appended to the loss to force the convolutional auto-encoding network to encode a sparse representation of the input image. In this embodiment, the penalty term α satisfies:

$$\alpha = \sum_j \mathrm{KL}\left(\rho \,\middle\|\, \hat{\rho}_j\right)$$

where $\hat{\rho}_j$ is the average output of the encoder network; ρ is the target value, set to 0.05 in the experiments; KL is the Kullback-Leibler divergence, which pushes the encoder network to output sparser values.

A more detailed structure of the convolutional auto-encoding network used is shown in Figure 4; the specific parameters of the encoder network are listed in Table 1, and those of the decoder network in Table 2.

Table 1: encoder network parameters (reproduced as an image in the original; contents not recoverable here)

Table 2: decoder network parameters (reproduced as an image in the original; contents not recoverable here)

The encoder network contains multiple convolutional layers; max-pooling layers follow some of them to perform a max-pooling operation on the data those layers output.

In this embodiment, the encoder network takes a 224×224 image as input and processes it with eight convolutional layers and four max-pooling layers to construct a set of discriminative feature maps. A convolutional layer in the encoder network can be described as:

$$y = f\left(\sum_c f_c * k_c + b_c\right)$$

where y is the output feature map of the convolutional layer; $f_c$ and $k_c$ are the c-th input feature map and the c-th convolution kernel, respectively; $b_c$ is the c-th bias; * denotes the convolution operation. Training the convolutional auto-encoding network amounts to continually learning $k_c$ and $b_c$.

In this embodiment, a rectified linear unit (ReLU) is used in every convolutional layer of the encoder network, which improves the nonlinear expressive power of the convolutional auto-encoding network. A rectified linear unit is expressed as:

$$y = \max(x, 0)$$

where x is the output of the convolutional layer. A max-pooling layer reduces the dimensionality of the feature maps and can be expressed as:

$$y_{(i,j)} = \max_{a \in x_{(i,j)}} a, \qquad x_{(i,j)} \in R$$

where $x_{(i,j)}$ denotes the (i,j)-th pooling region; R is the set of all pooling regions; $y_{(i,j)}$ is the output for the (i,j)-th pooling region. Max pooling increases the translation invariance of the convolutional auto-encoding network and encourages it to encode sparser features.

Since the image input to the convolutional auto-encoding network is 224×224 and the encoder network contains four max-pooling layers, the encoder network's output feature maps are of size 14×14; these are then fed into the decoder network to reconstruct the original image, which is the final output of the convolutional auto-encoding network.
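A Keras sketch of this encoder backbone (before the passthrough layer described below) is given here. The per-layer filter counts and pooling positions are assumptions, since Table 1 survives only as an image; the 224×224 input, the eight convolutions, the four max-pooling layers, the ReLU activations, and the feature-map sizes quoted in the text are from the document.

```python
from tensorflow import keras
from tensorflow.keras import layers

def build_encoder_backbone():
    inp = keras.Input(shape=(224, 224, 3))
    x = layers.Conv2D(32, 3, padding="same", activation="relu")(inp)   # conv1
    x = layers.MaxPooling2D()(x)                                       # 112x112
    x = layers.Conv2D(64, 3, padding="same", activation="relu")(x)     # conv2
    x = layers.MaxPooling2D()(x)                                       # 56x56
    x = layers.Conv2D(64, 3, padding="same", activation="relu")(x)     # conv3
    x = layers.Conv2D(128, 3, padding="same", activation="relu")(x)    # conv4
    x = layers.MaxPooling2D()(x)                                       # 28x28
    x = layers.Conv2D(128, 3, padding="same", activation="relu")(x)    # conv5: 28x28x128
    x = layers.Conv2D(128, 3, padding="same", activation="relu")(x)    # conv6
    x = layers.Conv2D(128, 3, padding="same", activation="relu")(x)    # conv7
    x = layers.MaxPooling2D()(x)                                       # 14x14
    x = layers.Conv2D(128, 3, padding="same", activation="relu")(x)    # conv8: 14x14x128
    return keras.Model(inp, x, name="encoder_backbone")
```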

In this embodiment, a passthrough layer is placed in the encoder network. As shown in Figure 4, the input of the passthrough layer is connected to the output of a shallower convolutional layer in the encoder network; here "shallow" and "deep" describe a convolutional layer's position in the encoder network, a deeper convolutional layer coming after a shallower one. The passthrough layer takes the feature map output by the shallower convolutional layer and stacks adjacent features into separate feature-map channels, converting the feature map into several smaller ones. For example, the passthrough layer can convert a 4×4 feature map output by a shallower convolutional layer into four different 2×2 feature maps.

The output of the passthrough layer is connected to a deeper convolutional layer in the encoder network. After the above processing, the passthrough layer concatenates these feature maps from the shallow convolutional layer with the feature maps output by the deeper convolutional layer along the depth channel, forming the final output of the encoder network.

In this embodiment, the shallower convolutional layer connected to the passthrough layer is the fifth convolutional layer in the encoder network; it outputs 128 different 28×28 feature maps, which the passthrough layer converts into 512 feature maps of size 14×14. The deeper convolutional layer connected to the passthrough layer is the last convolutional layer in the encoder network, which itself outputs 128 feature maps of size 14×14. The passthrough layer concatenates the converted feature maps with those output by the last convolutional layer, giving 640 feature maps of size 14×14; that is, the final output of the encoder network is 14×14×640.

The passthrough layer preserves detailed features from the shallower convolutional layers; that is, it helps the encoder network extract fine-grained features from the input data, so that the final output of the convolutional auto-encoding network still contains abundant detail. This avoids the loss of detail caused by a large number of convolutional layers and improves the resolution of the encoder network.
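The passthrough operation maps directly onto TensorFlow's space-to-depth primitive; the following sketch uses the feature-map sizes stated in the text.

```python
import tensorflow as tf

# shallow: output of the 5th conv layer; deep: output of the last conv layer.
shallow = tf.zeros([1, 28, 28, 128])  # 128 maps of size 28x28 (from the text)
deep = tf.zeros([1, 14, 14, 128])     # 128 maps of size 14x14 (from the text)

# Stack each 2x2 neighbourhood into channels: 28x28x128 -> 14x14x512,
# then concatenate with the deep maps along the depth channel.
passthrough = tf.nn.space_to_depth(shallow, block_size=2)  # (1, 14, 14, 512)
encoder_out = tf.concat([passthrough, deep], axis=-1)      # (1, 14, 14, 640)
```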

The decoder network takes the encoder network's output as input and processes it with nine convolutional layers and upsampling layers. The principle of the upsampling performed in the decoder network is shown in Figure 5: feature maps are upsampled by duplication, a process whose effect is the inverse of max pooling, expanding the size of the feature maps.
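Upsampling by duplication can be sketched in a few lines; this mirrors what Keras' UpSampling2D layer does in its default nearest-neighbour mode.

```python
import numpy as np

def upsample_by_copy(fmap, factor=2):
    """Duplicate each value factor x factor times along height and width."""
    return np.repeat(np.repeat(fmap, factor, axis=0), factor, axis=1)

patch = np.array([[1, 2],
                  [3, 4]])
print(upsample_by_copy(patch))
# [[1 1 2 2]
#  [1 1 2 2]
#  [3 3 4 4]
#  [3 3 4 4]]
```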

After the convolutional auto-encoding network has been built and trained, its encoder network can be used directly as the discriminator network of the generative adversarial network.

A generator network is then constructed. In this embodiment, the structure of the generator network is shown in Table 3: it consists of nine convolutional layers and four upsampling layers, and its input is of size 14×14, sampled from a uniform distribution on [0, 1]. The upsampling layers play the same role as in the decoder network of the convolutional auto-encoding network, namely expanding the size of the feature maps. After these upsamplings and convolutions, the random-noise input received by the generator network is mapped into a fake image.

Table 3: generator network parameters (reproduced as an image in the original; contents not recoverable here)
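A sketch of such a generator follows. The filter counts are assumptions, since Table 3 survives only as an image; the 14×14 uniform-noise input, the nine convolutions, and the four upsampling layers are from the text.

```python
from tensorflow import keras
from tensorflow.keras import layers

def build_generator():
    # 14x14 noise -> four 2x upsamplings -> 224x224 fake image.
    inp = keras.Input(shape=(14, 14, 1))
    x = inp
    for filters in (128, 128, 64, 64):  # assumed channel schedule
        x = layers.Conv2D(filters, 3, padding="same", activation="relu")(x)
        x = layers.Conv2D(filters, 3, padding="same", activation="relu")(x)
        x = layers.UpSampling2D()(x)    # duplicate values, doubling the size
    out = layers.Conv2D(3, 3, padding="same", activation="sigmoid")(x)  # 9th conv
    return keras.Model(inp, out, name="generator")
```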

A fake image is defined relative to the real images, i.e., it is an image that is not a real one. Fake and real images are input into the discriminator network, which outputs its judgment on them. Referring to Figure 6, when the encoder network of the convolutional auto-encoding network is used directly as the discriminator network, the discriminator's judgment is binary: whether the image input to it is a real photograph of a product surface defect or a fake image generated from random noise.

From the discriminator's judgment, the specific value of the discriminator network's loss function is determined; the loss function of the discriminator network can be defined as:

$$\mathrm{Loss}_D = \max_D \; \mathbb{E}_{x \sim p_{data}(x)}\left[\log D(x)\right] + \mathbb{E}_{z \sim p_z(z)}\left[\log\left(1 - D(G(z))\right)\right]$$

where D denotes the discriminator network, G the generator network taking random noise z as input, and x a real image from the dataset; max indicates that the training objective is to maximize the loss function.

The training goal of the generator network is to generate images that obtain the largest possible discriminator score D(G(z)), so as to test the discriminator network. The generator's loss function can be defined as:

$$\mathrm{Loss}_G = \min_G \; \mathbb{E}_{z \sim p_z(z)}\left[\log\left(1 - D(G(z))\right)\right]$$

While the loss functions have not reached their target values, the parameters of the generator network and the discriminator network are adjusted along the direction of gradient descent, and the next round of processing begins. Two adjacent rounds can adjust the parameters of the generator network and of the discriminator network respectively, so that the two are trained in turns. When the discriminator network's loss function reaches its target value, training of the discriminator network ends, i.e., its parameters are no longer changed; when the generator network's loss function reaches its target value, training of the generator network ends, i.e., its parameters are no longer changed. When both the discriminator network and the generator network have finished training, training of the entire generative adversarial network is complete.
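The alternating update can be sketched as a standard TensorFlow training step. `generator` and `discriminator` are assumed here to be Keras models, the discriminator ending in a single sigmoid output for this binary stage; the optimizer settings are placeholders.

```python
import tensorflow as tf

bce = tf.keras.losses.BinaryCrossentropy()
d_opt = tf.keras.optimizers.SGD(learning_rate=1e-4, momentum=0.9)
g_opt = tf.keras.optimizers.SGD(learning_rate=1e-4, momentum=0.9)

def train_step(generator, discriminator, real_images):
    batch = tf.shape(real_images)[0]
    z = tf.random.uniform([batch, 14, 14, 1])  # noise sampled from [0, 1)
    # Discriminator round: push D(real) toward 1 and D(G(z)) toward 0,
    # i.e. maximize E[log D(x)] + E[log(1 - D(G(z)))].
    with tf.GradientTape() as tape:
        fake = generator(z, training=False)
        d_real = discriminator(real_images, training=True)
        d_fake = discriminator(fake, training=True)
        d_loss = (bce(tf.ones_like(d_real), d_real)
                  + bce(tf.zeros_like(d_fake), d_fake))
    grads = tape.gradient(d_loss, discriminator.trainable_variables)
    d_opt.apply_gradients(zip(grads, discriminator.trainable_variables))
    # Generator round: push D(G(z)) toward 1.
    with tf.GradientTape() as tape:
        d_fake = discriminator(generator(z, training=True), training=False)
        g_loss = bce(tf.ones_like(d_fake), d_fake)
    grads = tape.gradient(g_loss, generator.trainable_variables)
    g_opt.apply_gradients(zip(grads, generator.trainable_variables))
    return d_loss, g_loss
```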

In this embodiment, the discriminator network of the generative adversarial network is obtained by attaching a multi-class classification network to the encoder network of the convolutional auto-encoding network, and this multi-class classification network may be chosen as a softmax network. A multi-class network can output multiple classification results: besides recognizing whether a received image is real or fake, it can also identify which kind of product surface defect the image contains, so a discriminator network improved in this way has better generalization ability. Correspondingly, in the training data used, the label of a real image indicates not only that it is real but also the kind of product surface defect it contains, so that the discriminator network can be trained to identify the types of product surface defects.
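A sketch of attaching such a head is shown below. Treating "fake" as one extra softmax class alongside the defect classes is a common semi-supervised-GAN arrangement and is an assumption here, as is the global pooling before the dense layer.

```python
from tensorflow import keras
from tensorflow.keras import layers

def build_discriminator(encoder, num_defect_classes):
    """encoder: a model mapping 224x224 images to 14x14x640 feature maps."""
    inp = keras.Input(shape=(224, 224, 3))
    feats = encoder(inp)                        # 14x14x640 (with passthrough)
    x = layers.GlobalAveragePooling2D()(feats)  # assumed pooling before the head
    # Softmax over the defect classes plus one extra "fake image" class.
    out = layers.Dense(num_defect_classes + 1, activation="softmax")(x)
    return keras.Model(inp, out, name="discriminator")
```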

When step P4 is executed to use the encoder network as the discriminator network, the connection between the decoder network and the encoder network can be severed entirely. In this embodiment, however, that connection may also be retained during training of the generative adversarial network. In that case, the step in P6 of adjusting the parameters of the discriminator network and/or the generator network consists of the following steps:

P401. Reconstruct the sample images with the decoder network.

P402. Obtain the error of the decoder network during reconstruction.

P403. Adjust the encoder network according to the error.

The loss function of the discriminator network correspondingly becomes:

$$\mathrm{Loss}_D = \max \; \mathbb{E}_{x \sim p_{data}(x)}\left[\log D(x)\right] + \mathbb{E}_{z \sim p_z(z)}\left[\log\left(1 - D(G(z))\right)\right] + \alpha \left\| x - De(En(x)) \right\|^2$$

where De denotes the decoder network; En denotes the encoder network, which also serves as the convolutional layers of the discriminator network; α reduces the weight of the image-reconstruction loss to ensure that the discriminator network converges quickly.

By retaining the connection between the decoder network and the encoder network, the decoder network still reconstructs sample images from the actual production line and propagates the error back to the encoder network. In this way, while the generative adversarial network is being trained, the discriminator network can still learn the features of the sample images through the decoder network, improving its recognition accuracy.

In summary, steps P1-P6 actually comprise two stages: the first stage is the construction and training of the convolutional auto-encoding network in steps P1-P3, and the second stage is the construction and training of the generative adversarial network in steps P4-P6. In practice, Keras or TensorFlow can be used to build the convolutional auto-encoding network and the generative adversarial network; setting the learning rates of the first and second stages to 0.001 and 0.0001 respectively, the momentum to 0.9, and the weight decay to 0.0005 achieves a good recognition rate.
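These hyperparameters map directly onto Keras' SGD optimizer. A sketch, assuming a recent Keras version whose SGD accepts a weight_decay argument (older versions would apply weight decay through kernel regularizers instead):

```python
from tensorflow import keras

# Stage 1: convolutional auto-encoding network.
cae_opt = keras.optimizers.SGD(learning_rate=0.001, momentum=0.9,
                               weight_decay=0.0005)
# Stage 2: generative adversarial network.
gan_opt = keras.optimizers.SGD(learning_rate=0.0001, momentum=0.9,
                               weight_decay=0.0005)
```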

The generative adversarial network built and trained through steps P1-P6 is in fact a joint convolutional auto-encoding network / generative adversarial network. It combines the unsupervised-training advantages of the convolutional auto-encoding network with the semi-supervised-training advantages of the generative adversarial network, generalizes well, and can cope with the few training samples, complex image appearance, and large intra-class variation faced in product surface defect detection, especially steel surface defect detection, achieving good recognition results.

Embodiment 2

After the training method of Embodiment 1 has been performed, the resulting generative adversarial network is used as the product surface defect detection model to execute the following product surface defect detection method:

S1. Obtain an image to be inspected; the image contains a product surface and can usually be obtained by photographing or scanning.

S2. Input the image to be inspected into the product surface defect detection model.

S3. Obtain the output of the product surface defect detection model, and determine the product surface defect type according to the output.
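Once the model is trained, steps S1-S3 reduce to a few lines. The file names and class list below are placeholders for illustration, not values from the document.

```python
import numpy as np
from tensorflow import keras

model = keras.models.load_model("defect_detector.h5")       # hypothetical path
img = keras.utils.load_img("surface.png", target_size=(224, 224))
x = keras.utils.img_to_array(img)[np.newaxis] / 255.0       # S1: image to inspect
probs = model.predict(x)[0]                                 # S2: run the model
classes = ["seam", "oxide scale", "fake"]                   # placeholder labels
print("Detected:", classes[int(np.argmax(probs))])          # S3: defect type
```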

Executing the product surface defect detection method with the model obtained in Embodiment 1 yields the same technical effects as described there: it can cope with the complex image appearance and large intra-class variation encountered in product surface defect detection, especially steel surface defect detection, and achieves high recognition accuracy.

Embodiment 3

The training method of Embodiment 1 and the detection method of Embodiment 2 are written as corresponding computer code and stored on a storage medium. When the storage medium is connected to a controller, the computer program code on it can be read out and executed, automatically performing steps P1-P6 or S1-S3 and achieving the same technical effects as described in Embodiment 1 or Embodiment 2.

It should be noted that, unless otherwise specified, when a feature is said to be "fixed" or "connected" to another feature, it may be directly fixed or connected to that feature, or indirectly fixed or connected to it. Moreover, descriptions such as up, down, left, and right used in this disclosure refer only to the mutual positional relationships of the components of this disclosure in the accompanying drawings. The singular forms "a", "said", and "the" used in this disclosure are intended to include the plural forms as well, unless the context clearly indicates otherwise. In addition, unless otherwise defined, all technical and scientific terms used in the embodiments have the same meanings as commonly understood by those skilled in the art. The terminology used in the description of the embodiments is only for describing specific embodiments, not for limiting the invention. The term "and/or" as used in the embodiments includes any combination of one or more of the associated listed items.

It should be understood that although the terms first, second, third, etc. may be used in this disclosure to describe various elements, these elements should not be limited by those terms, which serve only to distinguish elements of the same type from one another. For example, a first element could also be called a second element and, similarly, a second element could be called a first element, without departing from the scope of this disclosure. The use of any and all examples or exemplary language ("for example", "such as", etc.) in the embodiments is intended only to better illustrate embodiments of the invention and, unless otherwise claimed, does not limit the scope of the invention.

It should be recognized that embodiments of the invention may be realized or implemented by computer hardware, a combination of hardware and software, or by computer instructions stored in non-transitory computer-readable memory. The methods may be implemented in a computer program using standard programming techniques, including a non-transitory computer-readable storage medium configured with the computer program, where the storage medium so configured causes a computer to operate in a specific and predefined manner, according to the methods and figures described in the specific embodiments. Each program may be implemented in a high-level procedural or object-oriented programming language to communicate with a computer system; alternatively, the program may be implemented in assembly or machine language if desired. In any case, the language may be compiled or interpreted. Furthermore, for this purpose the program can run on a programmed application-specific integrated circuit.

Furthermore, the operations of the processes described in the embodiments may be performed in any suitable order unless otherwise indicated herein or clearly contradicted by context. The processes described in the embodiments (or variations and/or combinations thereof) may be performed under the control of one or more computer systems configured with executable instructions, and may be implemented as code (e.g., executable instructions, one or more computer programs, or one or more applications) executing collectively on one or more processors, by hardware, or by combinations thereof. The computer program comprises a plurality of instructions executable by one or more processors.

Further, the methods may be implemented on any type of suitable, operably connected computing platform, including but not limited to a personal computer, minicomputer, mainframe, workstation, networked or distributed computing environment, a separate or integrated computer platform, or a platform communicating with charged-particle tools or other imaging devices, and so on. Aspects of the invention may be implemented as machine-readable code stored on a non-transitory storage medium or device, whether removable or integrated into a computing platform, such as a hard disk, an optically readable and/or writable storage medium, or RAM or ROM, such that it is readable by a programmable computer; when the storage medium or device is read by the computer, it can configure and operate the computer to perform the processes described herein. Furthermore, the machine-readable code, or portions thereof, may be transmitted over wired or wireless networks. The invention described in these embodiments includes these and other various types of non-transitory computer-readable storage media when such media contain instructions or programs that, in conjunction with a microprocessor or other data processor, implement the steps described above. The invention also includes the computer itself when programmed according to the methods and techniques described herein.

A computer program can be applied to input data to perform the functions described in the embodiments, thereby transforming the input data to generate output data stored in non-volatile memory. The output information may also be applied to one or more output devices such as a display. In a preferred embodiment of the invention, the transformed data represent physical and tangible objects, including the particular visual depictions of physical and tangible objects produced on the display.

The above are only preferred embodiments of the invention; the invention is not limited to the above implementations. As long as the same technical effects of the invention are achieved by the same means, any modifications, equivalent substitutions, improvements, etc. made within the spirit and principles of the invention shall fall within the scope of protection of the invention. Within that scope, the technical solutions and/or implementations may be modified and varied in various ways.

Claims (10)

1. A training method for a product surface defect detection model, characterized by comprising the following steps:
obtaining an auto-encoding network, the auto-encoding network comprising an encoder network and a decoder network;
obtaining sample images, at least some of which contain product surface defects;
training the auto-encoding network using the sample images;
obtaining a generative adversarial network, the generative adversarial network comprising a generator network and a discriminator network, the discriminator network being constructed from the encoder network of the auto-encoding network;
obtaining real images and their labels, the real images containing product surface defects and each label indicating the type of product surface defect in the corresponding real image;
inputting the real images, or fake images generated by the generator network, into the discriminator network, obtaining the output of the discriminator network, and adjusting the parameters of the discriminator network and/or the generator network until the loss function of the discriminator network and/or the loss function of the generator network reaches a target value;
wherein the trained generative adversarial network serves as the product surface defect detection model.

2. The training method according to claim 1, characterized in that the step of adjusting the parameters of the discriminator network and/or the generator network specifically comprises:
reconstructing the sample images with the decoder network;
obtaining the error of the decoder network during reconstruction;
adjusting the encoder network according to the error.

3. The training method according to claim 2, characterized in that the loss function of the discriminator network is:

$$\mathrm{Loss}_D = \max \; \mathbb{E}_{x \sim p_{data}(x)}\left[\log D(x)\right] + \mathbb{E}_{z \sim p_z(z)}\left[\log\left(1 - D(G(z))\right)\right] + \alpha \left\| x - De(En(x)) \right\|^2$$

4. The training method according to any one of claims 1-3, characterized in that the discriminator network is obtained by connecting the encoder network with a multi-class classification network.

5. The training method according to claim 4, characterized in that the multi-class classification network is a softmax network.

6. The training method according to claim 4, characterized in that the auto-encoding network is a convolutional auto-encoding network; the encoder network comprises a plurality of convolutional layers and a passthrough layer; and the passthrough layer reduces the size of feature maps output by a shallower convolutional layer and concatenates them with feature maps output by a deeper convolutional layer.

7. The training method according to claim 6, characterized in that the deeper convolutional layer is the last convolutional layer in the encoder network.

8. The training method according to claim 4, characterized in that at least one of the convolutional layers in the encoder network performs a max-pooling operation.

9. A product surface defect detection method, characterized by comprising the following steps:
obtaining an image to be inspected, the image containing a product surface;
inputting the image to be inspected into a product surface defect detection model trained by the training method of any one of claims 1-8;
obtaining the output of the product surface defect detection model and determining the product surface defect type according to the output.

10. A storage medium storing processor-executable instructions, characterized in that the processor-executable instructions, when executed by a processor, perform the method of any one of claims 1-9.
CN202010122929.7A 2020-02-27 2020-02-27 Model training method, product surface defect detection method and storage medium Active CN111340785B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010122929.7A CN111340785B (en) 2020-02-27 2020-02-27 Model training method, product surface defect detection method and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010122929.7A CN111340785B (en) 2020-02-27 2020-02-27 Model training method, product surface defect detection method and storage medium

Publications (2)

Publication Number Publication Date
CN111340785A 2020-06-26
CN111340785B (en) 2023-04-07

Family

ID=71185482

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010122929.7A Active CN111340785B (en) 2020-02-27 2020-02-27 Model training method, product surface defect detection method and storage medium

Country Status (1)

Country Link
CN (1) CN111340785B (en)



Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108376387A (en) * 2018-01-04 2018-08-07 复旦大学 Image deblurring method based on polymerization expansion convolutional network
CN108805188A (en) * 2018-05-29 2018-11-13 徐州工程学院 A kind of feature based recalibration generates the image classification method of confrontation network
CN109242841A (en) * 2018-08-30 2019-01-18 广东工业大学 A kind of transmission tower defect inspection method based on generation confrontation network
CN110796637A (en) * 2019-09-29 2020-02-14 郑州金惠计算机系统工程有限公司 Training and testing method and device of image defect detection model and storage medium

Cited By (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111815601A (en) * 2020-07-03 2020-10-23 浙江大学 A texture image surface defect detection method based on depthwise convolutional autoencoder
CN111815601B (en) * 2020-07-03 2021-02-19 浙江大学 Texture image surface defect detection method based on depth convolution self-encoder
CN111860485A (en) * 2020-07-24 2020-10-30 腾讯科技(深圳)有限公司 Training method of image recognition model, and image recognition method, device and equipment
CN111860485B (en) * 2020-07-24 2024-04-26 腾讯科技(深圳)有限公司 Training method of image recognition model, image recognition method, device and equipment
CN112561864A (en) * 2020-12-04 2021-03-26 深圳格瑞健康管理有限公司 Method, system and storage medium for training caries image classification model
CN112561864B (en) * 2020-12-04 2024-03-29 深圳格瑞健康科技有限公司 Training method, system and storage medium for caries image classification model
CN112634219A (en) * 2020-12-17 2021-04-09 五邑大学 Metal surface defect detection method, system, device and storage medium
CN112634219B (en) * 2020-12-17 2024-02-20 五邑大学 A metal surface defect detection method, system, device and storage medium
CN114972375A (en) * 2022-05-09 2022-08-30 梅卡曼德(北京)机器人科技有限公司 Training method and device of image generation model, equipment and storage medium

Also Published As

Publication number Publication date
CN111340785B (en) 2023-04-07

Similar Documents

Publication Publication Date Title
CN111340785B (en) Model training method, product surface defect detection method and storage medium
Han et al. View inter-prediction GAN: Unsupervised representation learning for 3D shapes by learning global shape memories to support local view predictions
US11941719B2 (en) Learning robotic tasks using one or more neural networks
KR102318772B1 (en) Domain Separation Neural Networks
CN112036513B (en) Image anomaly detection method based on memory-enhanced potential spatial autoregression
CN105447498B (en) Client device, system and server system configured with neural network
US20190087726A1 (en) Hypercomplex deep learning methods, architectures, and apparatus for multimodal small, medium, and large-scale data representation, analysis, and applications
US20230070008A1 (en) Generating three-dimensional object models from two-dimensional images
WO2021048607A1 (en) Motion deblurring using neural network architectures
CN107609399A (en) Malicious code mutation detection method based on NIN neutral nets
US11954755B2 (en) Image processing device and operation method thereof
KR102370910B1 (en) Method and apparatus for few-shot image classification based on deep learning
JP2011508323A (en) Permanent visual scene and object recognition
CN104951791A (en) Data classification method and apparatus
JP2024026745A (en) Using imager with on-purpose controlled distortion for inference or training of artificial intelligence neural network
CN110674925A (en) No-reference VR video quality evaluation method based on 3D convolutional neural network
CN112668662B (en) Target detection method in wild mountain forest environment based on improved YOLOv3 network
CN111597847A (en) Two-dimensional code identification method, device and equipment and readable storage medium
Han et al. BLNet: Bidirectional learning network for point clouds
US20240273811A1 (en) Robustifying NeRF Model Novel View Synthesis to Sparse Data
Gu et al. Staying in shape: learning invariant shape representations using contrastive learning
CN113724261A (en) Fast image composition method based on convolutional neural network
Matsuo et al. Synthetic document images with diverse shadows for deep shadow removal networks
Thamizharasan et al. Face attribute analysis from structured light: an end-to-end approach
EP4407519A1 (en) Canonicalized codebook for 3d object generation

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant