
CN114067168B - Cloth defect image generation system and method based on improved variational autoencoder network - Google Patents

Cloth defect image generation system and method based on improved variational autoencoder network Download PDF

Info

Publication number
CN114067168B
CN114067168B (application CN202111198444.7A)
Authority
CN
China
Prior art keywords
network
layer
loss
convolution
variational autoencoder
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202111198444.7A
Other languages
Chinese (zh)
Other versions
CN114067168A (en)
Inventor
宋亚林
乐飞
袁明阳
张潮
庞子龙
何欣
于俊洋
甘志华
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Henan University
Original Assignee
Henan University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Henan University filed Critical Henan University
Priority to CN202111198444.7A priority Critical patent/CN114067168B/en
Publication of CN114067168A publication Critical patent/CN114067168A/en
Application granted granted Critical
Publication of CN114067168B publication Critical patent/CN114067168B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Links

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/21Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/048Activation functions
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • G06N3/084Backpropagation, e.g. using gradient descent
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/0002Inspection of images, e.g. flaw detection
    • G06T7/0004Industrial image inspection
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/10Image acquisition modality
    • G06T2207/10004Still image; Photographic image
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/30Subject of image; Context of image processing
    • G06T2207/30108Industrial image inspection
    • G06T2207/30124Fabrics; Textile; Paper
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02PCLIMATE CHANGE MITIGATION TECHNOLOGIES IN THE PRODUCTION OR PROCESSING OF GOODS
    • Y02P90/00Enabling technologies with a potential contribution to greenhouse gas [GHG] emissions mitigation
    • Y02P90/30Computing systems specially adapted for manufacturing

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Engineering & Computer Science (AREA)
  • Computing Systems (AREA)
  • Software Systems (AREA)
  • Molecular Biology (AREA)
  • Computational Linguistics (AREA)
  • Biophysics (AREA)
  • Biomedical Technology (AREA)
  • Mathematical Physics (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Evolutionary Biology (AREA)
  • Quality & Reliability (AREA)
  • Image Analysis (AREA)

Abstract


The present invention provides a cloth defect image generation system and method based on an improved variational autoencoder network. The system includes a variational autoencoder network and a discriminator network, wherein the variational autoencoder network is divided into an encoder network and a decoder network; the encoder network is used to encode the real target image into a normal distribution q(z|x) of a latent space variable X; the decoder network is used to sample the latent space variable X from the normal distribution q(z|x) to generate a new target image; the discriminator network is used to measure the similarity between the generated target image and the real target image, calculate the adversarial loss, pass the adversarial loss into the encoder network and the decoder network, and replace the pixel-based reconstruction metric in the variational autoencoder network with the feature metric represented in the discriminator network. The present invention improves the variational autoencoder network by means of the discriminator network of a generative adversarial network, thereby improving the quality of the generated images.

Description

Cloth defect image generation system and method based on an improved variational autoencoder network
Technical Field
The invention relates to the technical field of image generation, in particular to a cloth defect image generation system and method based on an improved variational autoencoder network.
Background
Currently, the clothing industry in China is a huge market, and whether cloth is defective is an important criterion for checking cloth quality in production. When problems such as cloth defect detection and defect classification are to be solved with deep learning techniques, there is not enough image data for the three types of cloth defects (holes, stains and loose warp) to support model training.
Many image generation methods have been proposed, such as the generative adversarial network GAN (e.g., Goodfellow I J, Pouget-Abadie J, Mirza M, et al. Generative Adversarial Networks [J]. arXiv:1406.2661 [cs, stat], 2014), the variational autoencoder VAE (e.g., Kingma D P, Welling M. Auto-Encoding Variational Bayes [J]. arXiv:1312.6114 [cs, stat], 2014), PixelCNN and PixelRNN. Compared with conventional models, a GAN has two different networks rather than a single network and is trained adversarially, finally yielding a stable model that can generate sufficiently realistic images; however, GAN training suffers from instability, vanishing gradients and mode collapse. The VAE adds a constraint to the encoder on top of a standard autoencoder (AE), forcing it to produce latent variables that follow a unit Gaussian distribution; samples are drawn from this unit Gaussian distribution and passed to the decoder to generate new images, but the images produced by a VAE tend to be slightly blurred.
Disclosure of Invention
Aiming at the problem that there is not enough cloth defect image data with holes, stains and loose warp to support the training of a deep learning model, the invention provides a cloth defect image generation system and method based on an improved variational autoencoder network. The variational autoencoder network is improved by means of the discriminator network of a generative adversarial network: the feature representation learned by the GAN discriminator is used as the basis of the VAE reconstruction target, and the pixel similarity metric of the VAE is replaced by the feature metric learned by the discriminator, thereby improving the quality of the generated images.
In one aspect, the present invention provides a cloth defect image generation system based on an improved variational autoencoder network, comprising a variational autoencoder network and a discriminator network, the variational autoencoder network being divided into an encoder network and a decoder network;
an encoder network for encoding the real target image into a normal distribution q(z|x) of the latent space variable X;
a decoder network for sampling the latent space variable X from the normal distribution q(z|x) to generate a new target image;
and a discriminator network for measuring the similarity between the generated target image and the real target image, calculating the adversarial loss, passing the adversarial loss into the encoder network and the decoder network, and replacing the pixel-based reconstruction metric in the variational autoencoder network with the feature metric represented in the discriminator network.
Further, the encoder network comprises a ResNet18 network; the ResNet18 network comprises, from shallow to deep, a Conv2d layer, a max pooling layer, 8 residual blocks and an average pooling layer connected in sequence; the 8 residual blocks are connected in series;
The Conv2d layer consists of a convolution layer with a 7×7 kernel and a stride of 2, a normalization layer and a ReLU activation function; the max pooling layer has a 3×3 kernel and a stride of 2; the average pooling layer has a 1×1 kernel; the 1st and 2nd residual blocks each contain 2 convolution layers with a 3×3 kernel and 64 output channels; the 3rd and 4th residual blocks each contain 2 convolution layers with a 3×3 kernel and 128 output channels; the 5th and 6th residual blocks each contain 2 convolution layers with a 3×3 kernel and 256 output channels; the 7th and 8th residual blocks each contain 2 convolution layers with a 3×3 kernel and 512 output channels.
Further, the decoder network comprises 6 layers: the first layer comprises a deconvolution layer with a 4×4 kernel and a ReLU activation function; the second to fifth layers each comprise a deconvolution layer with a 4×4 kernel, a normalization layer and a ReLU activation function; the sixth layer comprises a deconvolution layer with a 4×4 kernel and a Tanh activation function; the numbers of output channels of the first to sixth layers are 512, 384, 192, 96, 64 and 3 in order.
Further, the discriminator network comprises 6 layers: the first layer comprises a convolution layer with a 4×4 kernel and a stride of 2 and a LeakyReLU activation function; the second to fifth layers each consist of a convolution layer with a 4×4 kernel and a stride of 2, a normalization layer and a ReLU activation function; the sixth layer consists of a convolution layer with a 4×4 kernel and a stride of 1 and a Sigmoid activation function.
In another aspect, the present invention provides a cloth defect image generation method based on an improved variational autoencoder network, comprising:
Step 1: constructing the cloth defect image generation system based on the improved variational autoencoder network described above;
Step 2: screening cloth defect image data with holes, stains or loose warp defects as a training set, and scaling and cropping each image to 256×256 pixels;
Step 3: defining a loss function of the cloth defect image generation system;
Step 4: initializing the cloth defect image generation system;
Step 5: training the cloth defect image generation system on the training set using the Adam algorithm;
Step 6: generating cloth defect images with holes, stains or loose warp defects using the trained cloth defect image generation system.
Further, the loss function of the cloth defect image generation system comprises the regularized prior loss L_prior of the variational autoencoder network, the adversarial loss L_GAN between the variational autoencoder network and the discriminator network, and the feature loss of the l-th layer of the discriminator network, which replaces the pixel loss.
The regularized prior loss L_prior of the variational autoencoder network is defined as the KL divergence between the normal distribution q(z|x), into which the target image is encoded as the latent space variable X, and a given prior normal distribution p(z):
L_prior = D_KL(q(z|x) || p(z))
The adversarial loss L_GAN and the feature loss of the l-th layer of the discriminator network are respectively defined as:
L_GAN = log(Dis(x)) + log(1 - Dis(Gen(z)))
where D_KL denotes the KL divergence loss function; Dis denotes the discriminator function, which judges whether the picture input to the discriminator is real or fake, outputting 1 if real and 0 otherwise; Gen denotes the variational autoencoder, whose encoder encodes the input real picture into a latent space variable and whose decoder decodes the latent space variable into a new picture that serves as input to the discriminator; and E denotes the expectation over the distribution.
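The explicit formula for the l-th layer feature loss is not reproduced above; a plausible form, following the standard VAE-GAN formulation and therefore an assumption rather than the patent's exact expression, is

$$L^{Dis_l} = -\mathbb{E}_{q(z|x)}\big[\log p\big(Dis_l(x)\mid z\big)\big]$$

where $Dis_l(x)$ is the hidden representation of the l-th layer of the discriminator and $p(Dis_l(x)\mid z)$ is taken as a Gaussian centered at $Dis_l(Gen(z))$, so that the reconstruction error is measured in the discriminator's feature space rather than in pixel space.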
Further, step 5 specifically includes: in the training process, updating the system parameters iteratively until the system converges; in each iteration, the process of updating the system parameters specifically includes:
calculating the losses of the encoder network and the decoder network through the loss function;
calculating the increment ΔW(i) of the convolution layer weights from the obtained loss using the back propagation algorithm and the gradient descent algorithm, and executing W(i+1) = W(i) + η·ΔW(i) to update the convolution kernel parameters;
where W(i) denotes the convolution kernel parameters of the convolution layer after the i-th iteration, ΔW(i) is the parameter update computed by the back propagation algorithm and the gradient descent algorithm in the i-th iteration, and η is the step size.
The beneficial effects of the invention are as follows:
The invention uses the discriminator network of a generative adversarial network to improve the variational autoencoder network, and designs a cloth defect image generation system based on the improved variational autoencoder network together with its parameter configuration, aiming at the problem that there is not enough image data of the three cloth defects (holes, stains and loose warp) to support deep learning model training. Specifically: the network structure of the VAE is improved with the GAN to form a new VAE-GAN; an improved variational autoencoder network capable of generating higher-quality images of the three cloth defects of holes, stains and loose warp is designed; and the problem that the image data of these three cloth defects is insufficient to support the training of deep learning models is effectively alleviated.
Drawings
FIG. 1 is an overall block diagram of a cloth defect image generation system based on an improved variational autoencoder network provided by an embodiment of the present invention;
FIG. 2 is an overall block diagram of the variational autoencoder network provided by an embodiment of the present invention;
FIG. 3 is a structure diagram corresponding to the encoder network of the variational autoencoder network provided by an embodiment of the present invention;
FIG. 4 is a structure diagram corresponding to the decoder network of the variational autoencoder network provided by an embodiment of the present invention;
FIG. 5 is a structure diagram corresponding to the discriminator network provided by an embodiment of the present invention;
FIG. 6 is a flowchart of the cloth defect image generation method based on an improved variational autoencoder network provided by an embodiment of the present invention.
Detailed Description
For the purpose of making the objects, technical solutions and advantages of the present invention more apparent, the technical solutions in the embodiments of the present invention will be clearly described below with reference to the accompanying drawings in the embodiments of the present invention, and it is apparent that the described embodiments are some embodiments of the present invention, but not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
As shown in FIG. 1, an embodiment of the present invention provides a cloth defect image generation system based on an improved variational autoencoder network, which includes a variational autoencoder network (Generator) and a discriminator network (Discriminator), wherein the variational autoencoder network is divided into an encoder network (Encoder) and a decoder network (Decoder);
an encoder network for encoding the real target image into a normal distribution q(z|x) of the latent space variable X;
a decoder network for sampling the latent space variable X from the normal distribution q(z|x) to generate a new target image;
and a discriminator network for measuring the similarity between the generated target image and the real target image, calculating the adversarial loss, passing the adversarial loss into the encoder network and the decoder network, and replacing the pixel-based reconstruction metric in the variational autoencoder network with the feature metric represented in the discriminator network.
As an embodiment, as shown in FIG. 2, the encoder network comprises a ResNet18 network; the ResNet18 network comprises, from shallow to deep, a Conv2d layer, a max pooling layer, 8 residual blocks and an average pooling layer connected in sequence; the 8 residual blocks are connected in series;
The Conv2d layer consists of a convolution layer with a 7×7 kernel and a stride of 2, a normalization layer and a ReLU activation function; the max pooling layer has a 3×3 kernel and a stride of 2; the average pooling layer has a 1×1 kernel; the 1st and 2nd residual blocks each contain 2 convolution layers with a 3×3 kernel and 64 output channels; the 3rd and 4th residual blocks each contain 2 convolution layers with a 3×3 kernel and 128 output channels; the 5th and 6th residual blocks each contain 2 convolution layers with a 3×3 kernel and 256 output channels; the 7th and 8th residual blocks each contain 2 convolution layers with a 3×3 kernel and 512 output channels.
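For concreteness, the following is a minimal PyTorch sketch of an encoder consistent with this description: a ResNet18 feature extractor whose pooled output is mapped to the mean and log-variance of q(z|x). The latent dimension and the two linear heads are assumptions of this sketch (the patent does not specify how the distribution parameters are produced), and the torchvision call targets the PyTorch 1.6 / torchvision 0.7 pairing mentioned in the experiments.

```python
import torch.nn as nn
from torchvision.models import resnet18

class Encoder(nn.Module):
    """Encodes a real image into the parameters (mu, logvar) of q(z|x).
    The ResNet18 backbone matches the description (7x7 conv, max pooling,
    8 residual blocks, average pooling); the linear heads and latent_dim
    are assumptions of this sketch."""
    def __init__(self, latent_dim=128):
        super().__init__()
        backbone = resnet18(pretrained=False)
        self.features = nn.Sequential(*list(backbone.children())[:-1])  # drop the final fc layer
        self.fc_mu = nn.Linear(512, latent_dim)
        self.fc_logvar = nn.Linear(512, latent_dim)

    def forward(self, x):
        h = self.features(x).flatten(1)           # (N, 512) pooled feature vector
        return self.fc_mu(h), self.fc_logvar(h)   # parameters of q(z|x)
```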
As an embodiment, as shown in FIG. 3, the decoder network comprises 6 layers. The first layer comprises a deconvolution layer with a 4×4 kernel and a ReLU activation function, with the structure UpConv = [ConvTranspose 4×4 - ReLU]; the second to fifth layers each comprise a deconvolution layer with a 4×4 kernel, a normalization layer and a ReLU activation function, with the structure UpConv = [ConvTranspose 4×4 - BN - ReLU]; the sixth layer comprises a deconvolution layer with a 4×4 kernel and a Tanh activation function, with the structure UpConv = [ConvTranspose 4×4 - Tanh]. The numbers of output channels of the first to sixth layers are 512, 384, 192, 96, 64 and 3 in order.
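A corresponding sketch of the decoder is shown below. The 4×4 kernels, the ReLU/BN/Tanh pattern and the channel sequence 512-384-192-96-64-3 follow the description; the stride, padding and the initial 4×4 spatial map obtained from a linear layer are assumptions needed to reach a 256×256 output.

```python
import torch.nn as nn

class Decoder(nn.Module):
    """Decodes a latent sample z into a 3-channel image with Tanh output."""
    def __init__(self, latent_dim=128):
        super().__init__()
        self.fc = nn.Linear(latent_dim, 512 * 4 * 4)  # assumed: reshape z to a 4x4 map

        def up(cin, cout, bn=True):
            # One decoder layer: 4x4 deconvolution (+ BN) + ReLU.
            layers = [nn.ConvTranspose2d(cin, cout, 4, stride=2, padding=1)]
            if bn:
                layers.append(nn.BatchNorm2d(cout))
            layers.append(nn.ReLU(inplace=True))
            return layers

        self.net = nn.Sequential(
            *up(512, 512, bn=False),       # layer 1: ConvTranspose4x4 + ReLU
            *up(512, 384), *up(384, 192),  # layers 2-3: ConvTranspose4x4 + BN + ReLU
            *up(192, 96), *up(96, 64),     # layers 4-5
            nn.ConvTranspose2d(64, 3, 4, stride=2, padding=1), nn.Tanh(),  # layer 6
        )

    def forward(self, z):
        h = self.fc(z).view(-1, 512, 4, 4)
        return self.net(h)  # (N, 3, 256, 256)
```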
The variational autoencoder network in the embodiment of the present invention extracts image features with the ResNet18 network, uses the idea of residual blocks to alleviate the vanishing-gradient problem caused by an overly deep network, and generates a new target image by sampling X from the normal distribution q(z|x) and passing it through the decoder network.
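Sampling X from q(z|x) is typically implemented with the reparameterization trick so that gradients can flow back into the encoder; the sketch below assumes that convention, which the text does not spell out.

```python
import torch

def sample_latent(mu, logvar):
    """Reparameterization trick: z = mu + sigma * eps with eps ~ N(0, I),
    keeping the sampling step differentiable w.r.t. the encoder outputs."""
    std = torch.exp(0.5 * logvar)
    eps = torch.randn_like(std)
    return mu + eps * std
```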
As an embodiment, as shown in FIG. 4, the discriminator network comprises 6 layers: the first layer comprises a convolution layer with a 4×4 kernel and a stride of 2 and a LeakyReLU activation function; the second to fifth layers each consist of a convolution layer with a 4×4 kernel and a stride of 2, a normalization layer and a ReLU activation function; the sixth layer consists of a convolution layer with a 4×4 kernel and a stride of 1 and a Sigmoid activation function.
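A minimal sketch of a discriminator with this layout is given below; it also returns an intermediate feature map so that the feature metric described above can be computed from it. The channel widths and the choice of which layer supplies the feature loss are assumptions.

```python
import torch.nn as nn

class Discriminator(nn.Module):
    """Six layers: 4x4 convolutions with stride 2 (stride 1 in the last layer),
    LeakyReLU in the first layer, BN + ReLU in layers 2-5, Sigmoid at the end,
    as described; the channel widths are assumptions of this sketch."""
    def __init__(self):
        super().__init__()
        chans = [3, 64, 128, 256, 512, 512]
        layers = [nn.Conv2d(chans[0], chans[1], 4, 2, 1), nn.LeakyReLU(0.2, inplace=True)]
        for cin, cout in zip(chans[1:-1], chans[2:]):
            layers += [nn.Conv2d(cin, cout, 4, 2, 1), nn.BatchNorm2d(cout), nn.ReLU(inplace=True)]
        self.features = nn.Sequential(*layers)
        self.head = nn.Sequential(nn.Conv2d(chans[-1], 1, 4, 1, 0), nn.Sigmoid())

    def forward(self, x):
        feat = self.features(x)                        # feature map used for the l-th layer loss
        score = self.head(feat).flatten(1).mean(dim=1) # one real/fake probability per image
        return score, feat
```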
In the embodiment of the invention, the discriminator network extracts image features with its convolution layers and judges whether the input image is real or fake through the Sigmoid function; the l-th layer feature loss of the discriminator network and the discrimination result are fed back to the variational autoencoder network, and the variational autoencoder network retrains on the generated images according to this feedback until the model is stable.
The invention also provides a cloth defect image generation method based on the improved variational autoencoder network, which uses the cloth defect image generation system of the above embodiment and specifically comprises the following steps:
S101: constructing a cloth defect image generation system based on an improved variational autoencoder network, such as the cloth defect image generation system of the above embodiment;
S102: screening cloth defect image data with holes, stains or loose warp defects as a training set, and scaling and cropping each image to 256×256 pixels;
S103: defining a loss function of the cloth defect image generating system;
Specifically, in the embodiment of the present invention, the loss function of the cloth defect image generation system comprises the regularized prior loss L_prior of the variational autoencoder network, the adversarial loss L_GAN between the variational autoencoder network and the discriminator network, and the feature loss of the l-th layer of the discriminator network, which replaces the pixel loss. The feature representation learned by the l-th layer of the discriminator is extracted as the loss of the generated image because pixel loss is less suitable for image data; a higher-level and sufficiently invariant representation of the image is therefore used to measure image similarity.
The regularized prior loss L_prior of the variational autoencoder network is defined as the KL divergence between the normal distribution q(z|x), into which the target image is encoded as the latent space variable X, and a given prior normal distribution p(z):
L_prior = D_KL(q(z|x) || p(z))
The adversarial loss L_GAN and the feature loss of the l-th layer of the discriminator network are respectively defined as:
L_GAN = log(Dis(x)) + log(1 - Dis(Gen(z)))
where D_KL denotes the KL divergence loss function; Dis denotes the discriminator function, which judges whether the picture input to the discriminator is real or fake, outputting 1 if real and 0 otherwise; Gen denotes the variational autoencoder, whose encoder encodes the input real picture into a latent space variable and whose decoder decodes the latent space variable into a new picture that serves as input to the discriminator; and E denotes the expectation over the distribution, a larger expectation corresponding to a smaller loss.
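The three loss terms can be computed as sketched below, using the Encoder, Decoder and Discriminator sketches given earlier. The closed-form KL term, the binary-cross-entropy form of the adversarial term and the mean-squared feature reconstruction are assumptions consistent with the standard VAE-GAN formulation rather than the patent's literal expressions.

```python
import torch
import torch.nn.functional as F

def vaegan_losses(x, encoder, decoder, discriminator):
    """Returns (L_prior, L_GAN, L_feature) for a batch of real images x."""
    mu, logvar = encoder(x)
    z = mu + torch.exp(0.5 * logvar) * torch.randn_like(mu)   # reparameterized sample from q(z|x)
    x_gen = decoder(z)

    # L_prior: KL divergence between q(z|x) = N(mu, sigma^2) and p(z) = N(0, I).
    l_prior = -0.5 * torch.mean(1 + logvar - mu.pow(2) - logvar.exp())

    # L_GAN: log Dis(x) + log(1 - Dis(Gen(z))), written as binary cross-entropy.
    d_real, feat_real = discriminator(x)
    d_fake, feat_fake = discriminator(x_gen)
    l_gan = F.binary_cross_entropy(d_real, torch.ones_like(d_real)) + \
            F.binary_cross_entropy(d_fake, torch.zeros_like(d_fake))

    # Feature loss: reconstruction error in the discriminator's l-th layer
    # feature space, replacing a pixel-wise reconstruction term.
    l_feature = F.mse_loss(feat_fake, feat_real)
    return l_prior, l_gan, l_feature
```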
S104: initializing the cloth defect image generation system, comprising: for the convolution layers in the convolutional networks, initializing the convolution kernel parameters with Xavier initialization.
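One way to apply Xavier initialization to every convolution and deconvolution layer in PyTorch is sketched below; setting the bias terms to 0.0 follows the experimental description later in the text.

```python
import torch.nn as nn

def init_weights(module):
    """Xavier-initialize convolution kernels and zero the bias terms."""
    if isinstance(module, (nn.Conv2d, nn.ConvTranspose2d)):
        nn.init.xavier_uniform_(module.weight)
        if module.bias is not None:
            nn.init.constant_(module.bias, 0.0)

# Usage: encoder.apply(init_weights); decoder.apply(init_weights); discriminator.apply(init_weights)
```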
S105: training the cloth defect image generation system on the training set using the Adam algorithm;
Specifically, the image generated by decoding the latent space variable X and the real image are sent to the discriminator network to calculate the adversarial loss; the adversarial loss measures the quality of the image generated by decoding the latent space variable X, and the result is fed back to the variational autoencoder network.
In the training process, the system parameters are updated iteratively until the system converges; in each iteration, the process of updating the system parameters specifically includes:
calculating the losses of the encoder network and the decoder network through the loss function;
calculating the increment ΔW(i) of the convolution layer weights from the obtained loss using the back propagation algorithm and the gradient descent algorithm, and executing W(i+1) = W(i) + η·ΔW(i) to update the convolution kernel parameters;
where W(i) denotes the convolution kernel parameters of the convolution layer after the i-th iteration, ΔW(i) is the parameter update computed by the back propagation algorithm and the gradient descent algorithm in the i-th iteration, and η is the step size. The step size η is also referred to as the learning rate; its value determines how fast and how far the network converges during training. When η is relatively large, the network converges quickly during training but may fail to reach the global optimum; when η is small, training converges more slowly but can often reach the global optimum. In the preferred embodiment, Adam is used as the iterative parameter-update algorithm, and the learning rate is initially set to 0.001.
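A compact training-loop sketch combining the pieces above with Adam is given below, using the vaegan_losses helper sketched earlier. The separate generator/discriminator optimizers, the unit weighting of the three loss terms and the use of -L_GAN for the encoder/decoder update are assumptions; the patent states only that Adam is used with an initial learning rate of 0.001.

```python
import itertools
import torch

def train(encoder, decoder, discriminator, dataloader, epochs=100, lr=0.001):
    # One optimizer for the variational autoencoder (encoder + decoder), one for the discriminator.
    opt_g = torch.optim.Adam(itertools.chain(encoder.parameters(), decoder.parameters()), lr=lr)
    opt_d = torch.optim.Adam(discriminator.parameters(), lr=lr)

    for _ in range(epochs):
        for x, _ in dataloader:            # dataloader assumed to yield (image batch, label)
            # Discriminator step: minimize the adversarial loss.
            _, l_gan, _ = vaegan_losses(x, encoder, decoder, discriminator)
            opt_d.zero_grad()
            l_gan.backward()
            opt_d.step()

            # Encoder/decoder step on a fresh forward pass: prior + feature loss,
            # minus the adversarial loss so that generated images fool the discriminator.
            l_prior, l_gan, l_feature = vaegan_losses(x, encoder, decoder, discriminator)
            opt_g.zero_grad()
            (l_prior + l_feature - l_gan).backward()
            opt_g.step()
```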
S106: generating cloth defect images with holes, stains or loose warp defects using the trained cloth defect image generation system.
Specifically, the cloth defect images with holes, stains or loose warp defects generated by the trained cloth defect image generation system can be compared with the images generated by a VAE and a GAN, and the quality of the images generated by the proposed method can be evaluated with the Cosine similarity and Mean squared error of the generated images, so as to measure the performance of the cloth defect system provided by the invention.
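A sketch of the two evaluation metrics mentioned here, computed between a generated image batch and a reference batch; flattening each image into a vector before taking the cosine similarity is an assumption about how the metric is applied.

```python
import torch.nn.functional as F

def evaluate(generated, reference):
    """Cosine similarity and mean squared error between two image tensors of shape (N, C, H, W)."""
    g = generated.flatten(1)
    r = reference.flatten(1)
    cosine = F.cosine_similarity(g, r, dim=1).mean()
    mse = F.mse_loss(generated, reference)
    return cosine.item(), mse.item()
```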
In order to verify the effectiveness of the cloth defect image generation system and method provided by the invention, the following experiment is also provided.
In the experiment, PyTorch 1.6 is used to implement the system network structure, and the LFW test set is used to test the network performance. The network is trained with the Adam algorithm, whose learning-rate hyperparameter is initially set to 0.0002 and adjusted automatically in a dynamic learning-rate mode. All convolution kernels of the convolution layers are initialized with the Xavier initialization method in PyTorch, and the bias terms are set to 0.0. In each iteration, any 4 images in the training set form a batch and are input into the network for training; this is repeated until the network converges. After training, the similarity of image attributes is expressed by the evaluation indices Cosine similarity and Mean squared error, and the method is compared with networks such as the VAE and GAN, using the same dataset split and testing their performance with their original configurations.
As shown in Table 1, the Cosine similarity and Mean squared error values of the generated cloth defect images were tested under the VAE, the GAN, and the system and method of the present invention; as can be seen from Table 1, the cloth defect images produced by the system and method of the present invention perform better on the dataset than those of the VAE and GAN.
TABLE 1
Model       Cosine similarity    Mean squared error
test_set    0.9193               14.1987
VAE         0.9030               27.59±1.42
GAN         0.8892               27.89±3.0
VAEGAN      0.9114               22.39±1.16
Finally, it should be noted that: the above embodiments are only for illustrating the technical solution of the present invention, and are not limiting; although the invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical scheme described in the foregoing embodiments can be modified or some technical features thereof can be replaced by equivalents; such modifications and substitutions do not depart from the spirit and scope of the technical solutions of the embodiments of the present invention.

Claims (4)

1. A cloth defect image generation system based on an improved variational autoencoder network, characterized in that it comprises a variational autoencoder network and a discriminator network, the variational autoencoder network being divided into an encoder network and a decoder network;
an encoder network for encoding a real target image into a normal distribution q(z|x) of a latent space variable X;
a decoder network for sampling the latent space variable X from the normal distribution q(z|x) to generate a new target image;
a discriminator network for measuring the similarity between the generated target image and the real target image, calculating the adversarial loss, passing the adversarial loss into the encoder network and the decoder network, and replacing the pixel-based reconstruction metric in the variational autoencoder network with the feature metric represented in the discriminator network;
the encoder network comprises a ResNet18 network; the ResNet18 network comprises, from shallow to deep, a Conv2d layer, a max pooling layer, 8 residual blocks and an average pooling layer connected in sequence; the 8 residual blocks are connected in series;
the Conv2d layer consists of a convolution layer with a 7×7 kernel and a stride of 2, a normalization layer and a ReLU activation function; the max pooling layer has a 3×3 kernel and a stride of 2; the average pooling layer has a 1×1 kernel; the 1st and 2nd residual blocks each contain 2 convolution layers with a 3×3 kernel and 64 output channels; the 3rd and 4th residual blocks each contain 2 convolution layers with a 3×3 kernel and 128 output channels; the 5th and 6th residual blocks each contain 2 convolution layers with a 3×3 kernel and 256 output channels; the 7th and 8th residual blocks each contain 2 convolution layers with a 3×3 kernel and 512 output channels;
the decoder network comprises 6 layers: the first layer comprises a deconvolution layer with a 4×4 kernel and a ReLU activation function; the second to fifth layers each comprise a deconvolution layer with a 4×4 kernel, a normalization layer and a ReLU activation function; the sixth layer comprises a deconvolution layer with a 4×4 kernel and a Tanh activation function; the numbers of output channels of the first to sixth layers are 512, 384, 192, 96, 64 and 3 in order;
the discriminator network comprises 6 layers: the first layer comprises a convolution layer with a 4×4 kernel and a stride of 2 and a LeakyReLU activation function; the second to fifth layers each consist of a convolution layer with a 4×4 kernel and a stride of 2, a normalization layer and a ReLU activation function; the sixth layer consists of a convolution layer with a 4×4 kernel and a stride of 1 and a Sigmoid activation function.
2. A cloth defect image generation method based on an improved variational autoencoder network, characterized by comprising:
Step 1: constructing the cloth defect image generation system based on the improved variational autoencoder network according to claim 1;
Step 2: screening cloth defect image data with holes, stains or loose warp defects as a training set, and scaling and cropping each image to 256×256 pixels;
Step 3: defining a loss function of the cloth defect image generation system;
Step 4: initializing the cloth defect image generation system;
Step 5: training the cloth defect image generation system on the training set using the Adam algorithm;
Step 6: generating cloth defect images with holes, stains or loose warp defects using the trained cloth defect image generation system.
3. The cloth defect image generation method based on the improved variational autoencoder network according to claim 2, characterized in that the loss function of the cloth defect image generation system comprises the regularized prior loss L_prior of the variational autoencoder network, the adversarial loss L_GAN between the variational autoencoder network and the discriminator network, and the feature loss of the l-th layer of the discriminator network, which replaces the pixel loss;
wherein the regularized prior loss L_prior of the variational autoencoder network is defined as the KL divergence between the normal distribution q(z|x), into which the target image is encoded as the latent space variable X, and a given prior normal distribution p(z):
L_prior = D_KL(q(z|x) || p(z))
the adversarial loss L_GAN and the feature loss of the l-th layer of the discriminator network are respectively defined as:
L_GAN = log(Dis(x)) + log(1 - Dis(Gen(z)))
where D_KL denotes the KL divergence loss function; Dis denotes the discriminator function, which judges whether the picture input to the discriminator is real or fake, outputting 1 if real and 0 otherwise; Gen denotes the variational autoencoder, whose encoder encodes the input real picture into a latent space variable and whose decoder decodes the latent space variable into a new picture that serves as input to the discriminator; and E denotes the expectation over the distribution.
4. The cloth defect image generation method based on the improved variational autoencoder network according to claim 2, characterized in that step 5 specifically comprises: in the training process, updating the system parameters iteratively until the system converges; wherein in each iteration, the process of updating the system parameters specifically comprises:
calculating the losses of the encoder network and the decoder network through the loss function;
calculating the increment ΔW(i) of the convolution layer weights from the obtained loss using the back propagation algorithm and the gradient descent algorithm, and executing W(i+1) = W(i) + η·ΔW(i) to update the convolution kernel parameters;
where W(i) denotes the convolution kernel parameters of the convolution layer after the i-th iteration, ΔW(i) is the parameter update computed by the back propagation algorithm and the gradient descent algorithm in the i-th iteration, and η is the step size.
CN202111198444.7A 2021-10-14 2021-10-14 Cloth defect image generation system and method based on improved variational autoencoder network Active CN114067168B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111198444.7A CN114067168B (en) 2021-10-14 2021-10-14 Cloth defect image generation system and method based on improved variational autoencoder network

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111198444.7A CN114067168B (en) 2021-10-14 2021-10-14 Cloth defect image generation system and method based on improved variational autoencoder network

Publications (2)

Publication Number Publication Date
CN114067168A CN114067168A (en) 2022-02-18
CN114067168B true CN114067168B (en) 2024-11-12

Family

ID=80234554

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111198444.7A Active CN114067168B (en) 2021-10-14 2021-10-14 Cloth defect image generation system and method based on improved variational autoencoder network

Country Status (1)

Country Link
CN (1) CN114067168B (en)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114627297A (en) * 2022-03-30 2022-06-14 无锡东如科技有限公司 Image semantic segmentation method for irregular material of transfer learning
CN115346091B (en) * 2022-10-14 2023-01-31 深圳精智达技术股份有限公司 Method and device for generating Mura defect image data set
CN117173543B (en) * 2023-11-02 2024-02-02 天津大学 A hybrid image reconstruction method and system for lung adenocarcinoma and tuberculosis

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109191402A (en) * 2018-09-03 2019-01-11 武汉大学 The image repair method and system of neural network are generated based on confrontation
CN112365551A (en) * 2020-10-15 2021-02-12 上海市精神卫生中心(上海市心理咨询培训中心) Image quality processing system, method, device and medium

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP3888007A1 (en) * 2018-11-27 2021-10-06 Raytheon Company Computer architecture for artificial image generation using auto-encoder
CN112597831A (en) * 2021-02-22 2021-04-02 杭州安脉盛智能技术有限公司 Signal abnormity detection method based on variational self-encoder and countermeasure network

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109191402A (en) * 2018-09-03 2019-01-11 武汉大学 The image repair method and system of neural network are generated based on confrontation
CN112365551A (en) * 2020-10-15 2021-02-12 上海市精神卫生中心(上海市心理咨询培训中心) Image quality processing system, method, device and medium

Also Published As

Publication number Publication date
CN114067168A (en) 2022-02-18

Similar Documents

Publication Publication Date Title
CN114067168B (en) Cloth defect image generation system and method based on improved variational autoencoder network
CN111182292B (en) No reference video quality assessment method, system, video receiver, intelligent terminal
CN105046277B (en) Robust Mechanism Research Method of Feature Saliency in Image Quality Evaluation
CN112036513B (en) Image anomaly detection method based on memory-enhanced potential spatial autoregression
CN108230278B (en) An image raindrop removal method based on generative adversarial network
CN113780461B (en) A Robust Neural Network Training Method Based on Feature Matching
CN110619618A (en) Surface defect detection method and device and electronic equipment
CN111738942A (en) A Generative Adversarial Network Image Dehazing Method Fusion Feature Pyramid
WO2021253632A1 (en) Cloth defect detection method based on adversarial neural network, and terminal and storage medium
CN112950561B (en) Optical fiber end face defect detection method, device and storage medium
CN111462012A (en) SAR image simulation method for generating countermeasure network based on conditions
CN114565594B (en) Image anomaly detection method based on soft mask contrast loss
CN108648188A (en) A kind of non-reference picture quality appraisement method based on generation confrontation network
CN108932705A (en) A kind of image processing method based on matrix variables variation self-encoding encoder
CN113129272A (en) Defect detection method and device based on denoising convolution self-encoder
CN108154499A (en) A kind of woven fabric texture flaw detection method based on K-SVD study dictionaries
CN111861906A (en) A virtual augmentation model for pavement crack images and a method for image augmentation
CN110070539A (en) Image quality evaluating method based on comentropy
CN109389171A (en) Medical image classification method based on more granularity convolution noise reduction autocoder technologies
CN113627597B (en) An adversarial sample generation method and system based on universal perturbation
CN118898610B (en) A textile defect detection method and system based on machine vision
CN114612714A (en) A no-reference image quality assessment method based on curriculum learning
CN117523555A (en) Aircraft part defect detection method based on self-encoder
CN117934463A (en) Beef cattle carcass quality grading method based on optical test
CN116468672A (en) No-reference image quality assessment method based on adaptive feature weighted fusion

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant