WO2024160178A1 - Training of image translation model, image translation method, device and storage medium - Google Patents
Training of image translation model, image translation method, device and storage medium
- Publication number
- WO2024160178A1 WO2024160178A1 PCT/CN2024/074513 CN2024074513W WO2024160178A1 WO 2024160178 A1 WO2024160178 A1 WO 2024160178A1 CN 2024074513 W CN2024074513 W CN 2024074513W WO 2024160178 A1 WO2024160178 A1 WO 2024160178A1
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- image
- model
- training
- discrimination
- pair
- Prior art date
Links
- 238000013519 translation Methods 0.000 title claims abstract description 171
- 238000012549 training Methods 0.000 title claims abstract description 135
- 238000000034 method Methods 0.000 title claims abstract description 77
- 238000012545 processing Methods 0.000 claims abstract description 9
- 239000013598 vector Substances 0.000 claims description 34
- 230000008569 process Effects 0.000 claims description 14
- 238000004590 computer program Methods 0.000 claims description 4
- 230000006870 function Effects 0.000 description 35
- 238000010586 diagram Methods 0.000 description 6
- 238000005070 sampling Methods 0.000 description 4
- 230000008878 coupling Effects 0.000 description 3
- 238000010168 coupling process Methods 0.000 description 3
- 238000005859 coupling reaction Methods 0.000 description 3
- 230000000694 effects Effects 0.000 description 3
- 238000004891 communication Methods 0.000 description 2
- 238000012850 discrimination method Methods 0.000 description 2
- 230000009471 action Effects 0.000 description 1
- 238000013528 artificial neural network Methods 0.000 description 1
- 230000008859 change Effects 0.000 description 1
- 238000006243 chemical reaction Methods 0.000 description 1
- 238000013135 deep learning Methods 0.000 description 1
- 238000005516 engineering process Methods 0.000 description 1
- 230000005012 migration Effects 0.000 description 1
- 238000013508 migration Methods 0.000 description 1
- 230000003287 optical effect Effects 0.000 description 1
- 238000004088 simulation Methods 0.000 description 1
- 230000000007 visual effect Effects 0.000 description 1
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/77—Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
- G06V10/774—Generating sets of training patterns; Bootstrap methods, e.g. bagging or boosting
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/82—Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
Definitions
- the present application relates to the field of image processing technology, and in particular to the training of image translation models, image translation methods, devices and storage media.
- images contain rich semantic information.
- images are the most widely spread and frequently transmitted form of information besides natural language.
- Human beings' ability to understand the unique semantic information of images is called vision, and the simplest one is the classification of images.
- Classification generally assigns each sample to a domain according to the commonalities and differences of certain features.
- these classification fields include low-level semantics such as resolution and color, as well as high-level semantics such as content, style, and object relationships.
- Just as translation exists between natural languages, a translation task also exists between images; this is called image-to-image translation, or image translation for short.
- image translation models need to generate and transform the input image to a certain extent so that it has new features while retaining other irrelevant features.
- the first type of solution uses offline real data sets collected in reality to train the model
- the second type of solution uses pre-trained image generators to reversely reconstruct latent variables and edit latent variables.
- the second type of solution is more commonly used, but its training speed is slow and it is difficult to meet the requirements of real-world scenarios for reconstruction quality and running speed.
- an embodiment of the present application provides a method for training an image translation model, comprising:
- the sample image pair includes: the source image and a target image, wherein the target image is a reference standard for translating the source image;
- the training image pair includes: the source image and the training image;
- the parameters of the initial image translation model and the initial image discrimination model are adjusted to obtain a target image translation model.
- before adopting the initial image translation model and inputting the source image in the sample image pair to obtain the training image, the method further includes:
- a plurality of groups of noise are input into a preset image generator to obtain a plurality of groups of the source images and the target images, forming a plurality of groups of the sample image pairs.
- the initial image translation model includes: an encoder and a re-parameter generator, the re-parameter generator includes: a plurality of modulated convolutional layers and a mapper, and adopting the initial image translation model and inputting a source image in a sample image pair to obtain a training image includes:
- the mapper is used to convert the preset noise into a noise offset vector in a feature space
- the re-parameter generator is used to generate the training image according to the source image vector, the noise offset vector and the constant feature map.
- before using the re-parameter generator to generate the training image according to the source image vector, the noise offset vector and the constant feature map, the method further includes:
- the pre-trained generator is re-parameterized according to the first affine parameter and the second affine parameter to obtain the re-parameterized generator.
- the initial image discrimination model is adopted, sample image pairs and training image pairs are input, and image pair discrimination results are obtained, including:
- the re-parameterized image discrimination model is adopted, the sample image pair and the training image pair are input, and an image pair discrimination result is obtained.
- adjusting parameters of the initial image translation model and the initial image discrimination model according to the image pair discrimination result to obtain a target image translation model includes:
- the parameters of the initial image translation model and the initial image discrimination model are adjusted cyclically until the deviation of the image determination loss function is less than a preset threshold, thereby obtaining a target image translation model.
- cyclically adjusting the parameters of the initial image translation model and the initial image discrimination model until the deviation of the image determination loss function is less than a preset threshold, to obtain the target image translation model, includes:
- the parameters of the adjusted image discrimination model are kept unchanged, and the parameters of the initial image translation model are adjusted until the deviation of the image determination loss function is less than the preset threshold, thereby obtaining the target image translation model.
- an embodiment of the present application provides an image translation method, the method comprising:
- the target image translation model described in any one of the first aspects is used to process the image to be translated to obtain a target image.
- an embodiment of the present application provides a training device for an image translation model.
- the training device comprises:
- a translation module used to adopt an initial image translation model, input a source image in a sample image pair, and obtain a training image;
- the sample image pair includes: the source image and a target image, wherein the target image is a reference standard for translating the source image;
- a discrimination module used to adopt an initial image discrimination model, input the sample image pair and the training image pair, and obtain an image pair discrimination result; wherein the training image pair includes: the source image and the training image;
- the adjustment module is used to adjust the parameters of the initial image translation model and the initial image discrimination model according to the image pair discrimination result to obtain the target image translation model.
- an embodiment of the present application provides an image translation device.
- the device comprises:
- the acquisition module is used to acquire the image to be translated.
- the processing module is used to process the image to be translated using the target image translation model in the first aspect to obtain a target image.
- an embodiment of the present application provides an electronic device, comprising: a processor and a storage medium, wherein the processor and the storage medium are communicatively connected via a bus, the storage medium stores program instructions executable by the processor, and the processor calls the program stored in the storage medium to execute the training method of the image translation model as described in any one of the first aspect or the steps of the image translation method as described in the second aspect.
- an embodiment of the present application provides a storage medium having a computer program stored thereon, and when the computer program is executed by a processor, the training method of the image translation model as described in any one of the first aspect or the steps of the image translation method as described in the second aspect are executed.
- FIG1 is a flow chart of a method for training an image translation model provided in some embodiments of the present application.
- FIG2 is a flow chart of a method for obtaining a training image by translation provided by some embodiments of the present application.
- FIG3 is a schematic flow chart of an image pair identification method provided in some embodiments of the present application.
- FIG4 is a schematic flow chart of a model determination method provided in some embodiments of the present application.
- FIG5 is a flow chart of a model adjustment method provided in some embodiments of the present application.
- FIG6 is a flow chart of an image translation method provided in some embodiments of the present application.
- FIG7 is a schematic diagram of a training device for an image translation model provided in some embodiments of the present application.
- FIG8 is a schematic diagram of an image translation device provided by some embodiments of the present application.
- FIG9 is a schematic diagram of an electronic device provided in some embodiments of the present application.
- Icons: 701 - translation module, 702 - discrimination module, 703 - adjustment module, 704 - generation module, 801 - acquisition module, 802 - processing module, 901 - processor, 902 - storage medium.
- this application provides a training method for an image translation model, an image translation method, a device, and a storage medium.
- FIG1 is a flow chart of a training method for an image translation model provided by the present application, and the execution subject of the method may be an electronic device, and the electronic device may be a device with a computing function, such as a desktop computer, a laptop computer, etc. As shown in FIG1 , the method includes steps S101-S103.
- the sample image pair includes: a source image and a target image, wherein the target image is a reference standard for translating the source image, and the source image is an image to be translated, that is, the target image is a standard image obtained by translating the source image.
- the initial image translation model is used to input the source image, translate the source image, and obtain the training image.
- the training image is the translation result of the source image in this round of image translation.
- the user can pre-process the source image according to the image translation requirements (for example, generate bangs for portraits, perform animation processing on portraits) to obtain the target image.
- the training image obtained by the image translation model will use the target image as a reference standard. In other words, the more similar the training image is to the target image, the better the effect of the trained image translation model.
- the training image pair includes: a source image and a training image.
- the source image and the training image can be combined into a training image pair, and then combined with the sample image pair to form two pairs of image pairs.
- the sample image pair represents the difference between the source image and the target image.
- the training image pair represents the difference between the source image and the training image.
- an image translation model and an image discrimination model are used.
- the purpose of the image translation model is to translate a training image that is closer to the target image so as to pass the discrimination of the image discrimination model (successfully deceive the image discrimination model); while the purpose of the image discrimination model is strict discrimination (so that it is not deceived by the image translation model).
- the two supervise each other and conduct adversarial training, forming an unsupervised model adversarial training network.
- the parameters of the initial image translation model and the initial image discrimination model are adjusted.
- the purpose of the adjustment is to make the image translation model translate images more accurately (more able to successfully deceive the image discrimination model) and the image discrimination model perform discrimination more strictly (more able to prevent itself from being deceived by the image translation model).
- Input multiple groups of sample image pairs, and repeat the above steps S101 and S102. Each time, adjust the parameters of the initial image translation model and the initial image discrimination model according to the image pair discrimination results until the trained model meets the user's needs and the model training is completed.
- the trained image translation model is used as the target image translation model.
- the source image in the sample image pair is input to obtain the training image;
- the sample image pair includes: the source image and the target image, wherein the target image is the reference standard for the source image translation;
- the initial image discrimination model is adopted, the sample image pair and the training image pair are input to obtain the image pair discrimination result;
- the training image pair includes: the source image and the training image; according to the image pair discrimination result, the parameters of the initial image translation model and the initial image discrimination model are adjusted to obtain the target image translation model.
- the embodiment of the present application adopts an initial image translation model in S101, inputs a source image in a sample image pair, and before obtaining a training image, the method further includes: inputting multiple groups of noise into a preset image generator to obtain multiple groups of source images and target images to form multiple groups of sample image pairs.
- the preset image generator is a pair of pre-trained generators, including: a source image generator and a target image generator.
- the input of the preset image generator is noise and the output is an image.
- Multiple groups of noise can be randomly sampled from a standard Gaussian distribution.
- a group of noise is input into the source image generator (including the initial source image information) and the target image generator (including the initial target image information) to obtain a group of source images and a group of target images.
- a group of source images and a group of target images constitute a group of sample image pairs.
- z is noise
- Gx is the expression of the target image generator
- Gy is the expression of the source image generator
- x is the target image
- y is the source image.
- multiple sets of source images and target images can be generated with only one set of real initial source images and initial target images. There is no need to obtain multiple sets of source images from real scenes and then process the multiple sets of source images in real scenes to obtain target images. This greatly expands the number of training samples, reduces the cost of obtaining samples, and improves the efficiency of obtaining samples.
- multiple groups of noise are input into a preset image generator to obtain multiple groups of source images and target images, which form multiple groups of sample image pairs.
- the number of training samples is greatly expanded, the cost of obtaining samples is reduced, and the efficiency of obtaining samples is improved.
- the present application further provides a method for obtaining a training image by translation.
- FIG2 is a flow chart of a method for obtaining a training image by translation provided by the present application.
- the initial image translation model includes: an encoder and a re-parameter generator, and the re-parameter generator includes: a plurality of modulated convolutional layers and a mapper.
- the reparameter generator refers to the reparameterized generator.
- Reparameterization means sampling from a parameterized distribution. If sampling is performed directly (the sampling action is discrete, and discrete data is not differentiable), there is no gradient information, so the parameter gradients will not be updated during the backpropagation of the neural network. Reparameterization ensures sampling from the distribution while retaining the gradient information.
- the initial image translation model is adopted, the source image in the sample image pair is input, and the training image is obtained, including steps S201-S204.
- the preset noise and constant feature map are also input at the same time.
- the preset noise and constant feature map are the noise and constant feature map corresponding to the re-parameter generator.
- the constant feature map is the constant input in the StyleGAN generator.
- the constant feature map is a preset fixed constant vector, which serves as the input of the StyleGAN generator.
- the preset noise can generate vectors in the feature space, which are input into the StyleGAN generator as the style information of the image.
- S202 Use a mapper to convert the preset noise into a noise offset vector in the feature space.
- the feature space may be W space
- M is the expression of the mapper
- z is the preset noise
- w is the noise offset vector
- S203 Use an encoder to convert the source image into a source image vector in a feature space.
- the output of the encoder consists of two parts: the output of each layer of the encoder and the noise offset vector.
- the image translation model uses a re-parameter generator, which turns the image translation model into a re-parameter network structure.
- the re-parameter generator is used to generate training images based on the source image vector, noise offset vector, and constant feature map. It can smoothly migrate between tasks at the lowest cost to ensure the properties of the re-parameter network and improve the training speed.
- the initial image translation model includes: an encoder and a re-parameter generator
- the re-parameter generator includes: multiple modulated convolutional layers and mappers, through the input source image, preset noise and constant feature map; the mapper is used to convert the preset noise into a noise offset vector in the feature space; the encoder is used to convert the source image into a source image vector in the feature space; the re-parameter generator is used to generate a training image according to the source image vector, the noise offset vector and the constant feature map.
- the embodiment of the present application adopts a re-parameter generator in S204, and before generating a training image according to the source image vector, the noise offset vector and the constant feature map, the method also includes: re-parameterizing the pre-trained generator according to the first affine parameter and the second affine parameter to obtain the re-parameter generator.
- ci is the output feature of the i-th layer of the pre-trained generator
- w is the noise offset vector
- the pre-trained generator is re-parameterized, the structure of the pre-trained generator is changed, and a re-parameterized generator is obtained.
- α is the first affine parameter, β is the second affine parameter, ei is the output of the i-th layer of the encoder, and wi is the W+ spatial offset.
- the first affine parameter and the second affine parameter are introduced, and the output ei of each layer in the encoder and the final output W+ spatial offset are used to re-parameterize the pre-trained generator.
- the first affine parameter and the second affine parameter are affine parameters that can be learned and optimized. They are initialized to 0, so that the prior of the re-parameterized generator network can be utilized, and the smooth migration between tasks can be achieved at the minimum cost to ensure the properties of the re-parameterized network, and a fast convergence speed is obtained, which improves the training speed.
- the pre-trained generator is re-parameterized according to the first affine parameter and the second affine parameter to obtain a re-parameterized generator, thereby ensuring the properties of the re-parameterized network and improving the training speed.
- FIG3 is a flow chart of an image pair discrimination method provided by the present application. As shown in FIG3, in S102, an initial image discrimination model is used, sample image pairs and training image pairs are input, and image pair discrimination results are obtained, including steps S301-S302.
- the image discrimination model is used to discriminate images.
- the pre-trained image discrimination model i.e., the original image discrimination model
- its input is a single image
- the above embodiments of the present application generate paired data (image pairs), so the input of the image discrimination model is also paired images (image pairs); therefore, the pre-trained image discrimination model also needs to be re-parameterized.
- the image discrimination model can be regarded as a stack of n downsampled residual blocks d, taking the feature map encoded by the first layer dn as a whole, as shown in the following formula (6):
- the re-parameterized image discrimination model is obtained in the form of the following formula (7):
- γ is the third affine parameter
- the third affine parameter is an affine parameter that is initialized to 0 and can be optimized for learning
- d′ n is a newly introduced preset convolution block
- the preset convolution block can be a discriminator convolution block, and is randomly initialized.
- the sample image pair includes: a source image and a target image; the training image pair includes: a source image and a training image.
- the sample image pair is input into the re-parameterized image discrimination model to obtain the discrimination result of the sample image pair.
- the training image pair is input into the re-parameterized image discrimination model to obtain the discrimination result of the training image pair.
- the pre-trained image discrimination model is re-parameterized to obtain a re-parameterized image discrimination model; the re-parameterized image discrimination model is used to input the sample image pair and the training image pair to obtain the image pair discrimination result.
- the image pair can be discriminated, thereby improving the discrimination efficiency.
- the present application further provides a model determination method.
- FIG4 is a flow chart of a model determination method provided by the present application. As shown in FIG4, in S103, according to the image pair discrimination result, the parameters of the initial image translation model and the initial image discrimination model are adjusted to obtain the target image translation model, including steps S401-S402.
- the image pair discrimination results include: the discrimination results of the sample image pair and the discrimination results of the training image pair.
- the image discrimination loss function determined according to the discrimination results of the sample image pair and the discrimination results of the training image pair is in the form of the following formula (8):
- D(x,y) is the discrimination result of the sample image pair
- D(T(y),y) is the discrimination result of the training image pair
- T(y) is the training image.
- the loss function expresses that the image discrimination model needs to accurately determine which image pair is the sample image pair and which is the training image pair, while the image translation model needs to make the training image pair more likely to pass the judgment of the image discrimination model, so that the image discrimination model judges the training image pair to be a sample image pair (the smaller the gap between the training image pair and the sample image pair, the more successfully the image translation model deceives the image discrimination model).
- each time a group of image pair discrimination results is input, the corresponding deviation of the image determination loss function is obtained. If the deviation of the image determination loss function is greater than or equal to the preset threshold, the loss function is still not stable enough and the function curve is not smooth enough, so it is necessary to continue adjusting the parameters of the initial image translation model and the initial image discrimination model (for example, the first affine parameter, the second affine parameter, and the third affine parameter in the above embodiment).
- if the deviation of the image determination loss function is less than the preset threshold, the loss function is stable enough and the function curve is smooth enough, so there is no need to continue adjusting the parameters of the initial image translation model and the initial image discrimination model, and the adjusted image translation model is used as the target image translation model.
- the preset threshold can be set according to the actual needs of model training, and the deviation of the image determination loss function can be the mean square deviation of the image determination loss function.
- the image determination loss function is determined according to the image pair discrimination result; the parameters of the initial image translation model and the initial image discrimination model are adjusted cyclically until the deviation of the image determination loss function is less than a preset threshold, and the target image translation model is obtained.
- the loss function makes the model training more accurate.
- the present application further provides a model adjustment method.
- FIG5 is a flow chart of a model adjustment method provided by the present application. As shown in FIG5, the parameters of the initial image translation model and the initial image discrimination model are adjusted cyclically in S402 until the deviation of the image determination loss function is less than a preset threshold, to obtain the target image translation model, including steps S501-S502.
- S501 Keep the parameters of the initial image translation model unchanged, adjust the parameters of the initial image discrimination model, and obtain the adjusted image discrimination model.
- the parameters of the initial image translation model can be kept unchanged (e.g., the first affine parameter and the second affine parameter are kept unchanged), and the parameters of the initial image discrimination model can be adjusted (e.g., the third affine parameter is adjusted).
- in the loss function, this is reflected in keeping the translation structure T(y) unchanged and changing the discrimination structure of D(x,y) and D(T(y),y), thereby obtaining the adjusted image discrimination model.
- the corresponding loss function can be expressed as follows:
- the model loss is calculated according to formula (9), and then the parameters of the initial image discrimination model are optimized and adjusted through the gradient descent back propagation algorithm to update the parameters of the initial image discrimination model.
- S502 Keep the parameters of the adjusted image discrimination model unchanged, adjust the parameters of the initial image translation model until the deviation of the image determination loss function is less than a preset threshold, and obtain the target image translation model.
- the parameters of the initial image discriminant model are kept unchanged (e.g., the third affine parameter is kept unchanged), and the parameters of the initial image translation model are adjusted (e.g., the first affine parameter and the second affine parameter are adjusted).
- in the loss function, this is reflected in keeping the discrimination structure of D(x,y) and D(T(y),y) unchanged and changing the translation structure of T(y).
- the adjusted image translation model is then obtained.
- the corresponding loss function can be expressed as follows:
- the model loss is calculated according to formula (10), and the parameters of the initial image translation model are optimized and adjusted through the gradient descent back propagation algorithm to update the parameters of the initial image translation model.
- the training images generated by the image translation model will become closer and closer to the target images.
- the parameters of the initial image translation model are kept unchanged, the parameters of the initial image discrimination model are adjusted, and the adjusted image discrimination model is obtained; the parameters of the adjusted image discrimination model are kept unchanged, and the parameters of the initial image translation model are adjusted until the deviation of the image determination loss function is less than a preset threshold, and the target image translation model is obtained.
- the model training accuracy is improved through adversarial training.
- FIG6 is a flow chart of an image translation method provided by the present application.
- the execution subject of the method may be an electronic device, and the electronic device may be a device with a computing function, such as a desktop computer, a laptop computer, etc.
- the method includes:
- S602 Use the target image translation model of any one of the above embodiments to process the image to be translated to obtain a target image.
- an image to be translated is obtained, and the target image translation model of any one of the above embodiments is used to process the image to be translated to obtain a target image, thereby improving the accuracy of image translation.
- This application was developed on the Ubuntu platform, with the deep learning framework based on PyTorch.
- the main language used in this application is Python.
- DDP (Distributed Data Parallel) is used as the basic parallel architecture for multi-GPU training.
- the sample image generator and target image generator are based on StyleCLIP and StyleGAN-nada.
- For StyleCLIP, this application uses the text pair "Face" and "Face with fringe hair" to achieve local manipulation in the form of adding bangs.
- For StyleGAN-nada, this application uses the text pair "Photo" and "Pixar" to achieve the global manipulation goal of face cartoonization.
- the CLIP model herein uses a Vision Transformer as the visual encoding model. After an image pair is generated, synchronized data augmentation is applied to the two images, including horizontal flipping and color changes.
- the generator of the image translation model uses the StyleGAN2 structure. Both the generator and the image discriminator model are initialized to the generator and discriminator in the default face generator StyleGAN2 model.
- the encoder part of the image translation model uses a similar structure to the discriminator and is randomly initialized.
- the steps in the flowcharts involved in the above embodiments may include multiple steps or stages; these steps or stages are not necessarily executed at the same moment but may be executed at different moments, and their execution order is not necessarily sequential: they may be executed in turn or alternately with other steps or with at least part of the steps or stages in other steps.
- FIG. 7 is a schematic diagram of a training device for an image translation model provided in an embodiment of the present application. As shown in FIG. 7 , the device includes:
- the translation module 701 is used to adopt the initial image translation model, input the source image in the sample image pair, and obtain the training image;
- the sample image pair includes: a source image and a target image, wherein the target image is a reference standard for translating the source image.
- the discrimination module 702 is used to adopt an initial image discrimination model, input a sample image pair and a training image pair, and obtain an image pair discrimination result; wherein the training image pair includes: a source image and a training image.
- the adjustment module 703 is used to adjust the parameters of the initial image translation model and the initial image discrimination model according to the image pair discrimination result to obtain the target image translation model.
- the generating module 704 is used to input multiple sets of noise into a preset image generator to obtain multiple sets of source images and target images, forming multiple groups of sample image pairs.
- the translation module 701 is specifically configured such that the initial image translation model includes an encoder and a re-parameterized generator, the re-parameterized generator including multiple modulated convolutional layers and a mapper: inputting a source image, preset noise and a constant feature map; using the mapper to convert the preset noise into a noise offset vector in the feature space; using the encoder to convert the source image into a source image vector in the feature space; and using the re-parameterized generator to generate a training image according to the source image vector, the noise offset vector and the constant feature map.
- the translation module 701 is specifically used to re-parameterize the pre-trained generator according to the first affine parameter and the second affine parameter to obtain a re-parameterized generator.
- the discrimination module 702 is specifically used to re-parameterize the pre-trained image discrimination model according to the third affine parameter and the preset convolution block to obtain the re-parameterized image discrimination model; using the re-parameterized image discrimination model, inputting the sample image pair and the training image pair to obtain the image pair discrimination result.
- the adjustment module 703 is specifically used to determine the image determination loss function according to the image pair discrimination result; cyclically adjust the parameters of the initial image translation model and the initial image discrimination model until the deviation of the image determination loss function is less than a preset threshold, and obtain the target image translation model.
- the adjustment module 703 is specifically used to keep the parameters of the initial image translation model unchanged, adjust the parameters of the initial image discrimination model, and obtain the adjusted image discrimination model; keep the parameters of the adjusted image discrimination model unchanged, adjust the parameters of the initial image translation model until the deviation of the image determination loss function is less than a preset threshold, and obtain the target image translation model.
- FIG8 is a schematic diagram of an image translation device provided in an embodiment of the present application. As shown in FIG8 , the device includes:
- the acquisition module 801 is used to acquire the image to be translated.
- the processing module 802 is used to process the image to be translated using any target image translation model in the above embodiments to obtain a target image.
- the training device of the above-mentioned image translation model and each module in the image translation device can be implemented in whole or in part by software, hardware and a combination thereof.
- the above-mentioned modules can be embedded in or independent of the processor in the computer device in the form of hardware, or can be stored in the memory in the computer device in the form of software, so that the processor can call and execute the operations corresponding to the above modules.
- Figure 9 is a schematic diagram of an electronic device provided in an embodiment of the present application, which can be a device with computing and processing functions.
- the electronic device includes: a processor 901 and a storage medium 902.
- the processor 901 and the storage medium 902 are connected via a bus.
- the storage medium 902 is used to store programs, and the processor 901 calls the programs stored in the storage medium 902 to execute the above method embodiment.
- the specific implementation method and technical effect are similar and will not be repeated here.
- the present application also provides a storage medium, including a program, which is used to execute the above-mentioned method embodiment when executed by a processor.
- the disclosed devices and methods can be implemented in other ways.
- the device embodiments described above are merely schematic.
- the division of the units is only a logical function division. There may be other division methods in actual implementation, such as multiple units or components can be combined or integrated into another system, or some features can be ignored or not executed.
- In addition, the mutual coupling, direct coupling, or communication connection shown or discussed may be indirect coupling or communication connection through some interfaces, devices, or units, and may be electrical, mechanical, or in other forms.
- the units described as separate components may or may not be physically separated, and the components shown as units may or may not be physical units, that is, they may be located in one place or distributed on multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
- each functional unit in each embodiment of the present application may be integrated into one processing unit, or each unit may exist physically separately, or two or more units may be integrated into one unit.
- the above-mentioned integrated unit may be implemented in the form of hardware or in the form of hardware plus software functional units.
- the above-mentioned integrated unit implemented in the form of a software functional unit can be stored in a storage medium.
- the above-mentioned software functional unit is stored in a storage medium, including a number of instructions for a computer device (which can be a personal computer, a server, or a network device, etc.) or a processor to execute some steps of the method described in each embodiment of the present application.
- the aforementioned storage medium includes: U disk, mobile hard disk, read-only memory (Read-Only Memory, referred to as: ROM), random access memory (Random Access Memory, referred to as: RAM), disk or optical disk and other media that can store program codes.
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Evolutionary Computation (AREA)
- Physics & Mathematics (AREA)
- Artificial Intelligence (AREA)
- General Physics & Mathematics (AREA)
- Computing Systems (AREA)
- Health & Medical Sciences (AREA)
- General Health & Medical Sciences (AREA)
- Software Systems (AREA)
- Databases & Information Systems (AREA)
- Medical Informatics (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Multimedia (AREA)
- Life Sciences & Earth Sciences (AREA)
- Biomedical Technology (AREA)
- Biophysics (AREA)
- Computational Linguistics (AREA)
- Data Mining & Analysis (AREA)
- Molecular Biology (AREA)
- General Engineering & Computer Science (AREA)
- Mathematical Physics (AREA)
- Image Analysis (AREA)
Abstract
The present application provides a method for training an image translation model, an image translation method, a device, and a storage medium, relating to the field of image processing technology. In the training method, an initial image translation model is adopted, and a source image in a sample image pair is input to obtain a training image (S101); the sample image pair includes the source image and a target image, the target image being a reference standard for translating the source image; an initial image discrimination model is adopted, and the sample image pair and a training image pair are input to obtain an image pair discrimination result (S102); the training image pair includes the source image and the training image; and the parameters of the initial image translation model and the initial image discrimination model are adjusted according to the image pair discrimination result to obtain a target image translation model (S103).
Description
Related Application
This application claims priority to Chinese Patent Application No. 202310082772.3, filed on January 30, 2023 and entitled "Training of Image Translation Model, Image Translation Method, Device and Storage Medium", which is incorporated herein by reference in its entirety.
The present application relates to the field of image processing technology, and in particular to the training of an image translation model, an image translation method, a device, and a storage medium.
As a very common information medium in nature, images carry rich semantic information. Especially in the Internet era, images are the most widely and most frequently spread form of information besides natural language. The human ability to understand the unique semantic information of images is called vision, and the simplest form of it is image classification. Classification generally assigns each sample to a domain according to the commonalities and differences of certain features. For images, these domains include low-level semantics such as resolution and color, as well as high-level semantics such as content, style, and object relationships. Just as translation exists between the various natural languages, a translation task also exists between images; this is called image-to-image translation, or image translation for short. Generally speaking, an image translation model needs to generate and transform the input image to a certain extent so that it acquires new features while retaining other irrelevant features.
At present, there are generally two classes of solutions: the first class trains the model on offline real data sets collected from the real world, and the second class uses a pre-trained image generator to invert images into latent variables and then edits the latent variables.
Generally, the second class of solutions is more widely used, but its training speed is slow, and it is difficult to meet the requirements of real-world scenarios for reconstruction quality and running speed.
Summary
In a first aspect, an embodiment of the present application provides a method for training an image translation model, including:
adopting an initial image translation model and inputting a source image in a sample image pair to obtain a training image, wherein the sample image pair includes the source image and a target image, the target image being a reference standard for translating the source image;
adopting an initial image discrimination model and inputting the sample image pair and a training image pair to obtain an image pair discrimination result,
wherein the training image pair includes the source image and the training image; and
adjusting parameters of the initial image translation model and the initial image discrimination model according to the image pair discrimination result to obtain a target image translation model.
Optionally, before adopting the initial image translation model and inputting the source image in the sample image pair to obtain the training image, the method further includes:
inputting multiple groups of noise into a preset image generator to obtain multiple groups of the source images and the target images, forming multiple groups of the sample image pairs.
Optionally, the initial image translation model includes an encoder and a re-parameterized generator, the re-parameterized generator includes multiple modulated convolutional layers and a mapper, and adopting the initial image translation model and inputting the source image in the sample image pair to obtain the training image includes:
inputting the source image, preset noise, and a constant feature map;
using the mapper to convert the preset noise into a noise offset vector in a feature space;
using the encoder to convert the source image into a source image vector in the feature space; and
using the re-parameterized generator to generate the training image according to the source image vector, the noise offset vector, and the constant feature map.
Optionally, before using the re-parameterized generator to generate the training image according to the source image vector, the noise offset vector, and the constant feature map, the method further includes:
re-parameterizing a pre-trained generator according to a first affine parameter and a second affine parameter to obtain the re-parameterized generator.
Optionally, adopting the initial image discrimination model and inputting the sample image pair and the training image pair to obtain the image pair discrimination result includes:
re-parameterizing a pre-trained image discrimination model according to a third affine parameter and a preset convolution block to obtain a re-parameterized image discrimination model; and
adopting the re-parameterized image discrimination model and inputting the sample image pair and the training image pair to obtain the image pair discrimination result.
Optionally, adjusting the parameters of the initial image translation model and the initial image discrimination model according to the image pair discrimination result to obtain the target image translation model includes:
determining an image determination loss function according to the image pair discrimination result; and
cyclically adjusting the parameters of the initial image translation model and the initial image discrimination model until the deviation of the image determination loss function is less than a preset threshold, to obtain the target image translation model.
Optionally, cyclically adjusting the parameters of the initial image translation model and the initial image discrimination model until the deviation of the image determination loss function is less than the preset threshold, to obtain the target image translation model, includes:
keeping the parameters of the initial image translation model unchanged and adjusting the parameters of the initial image discrimination model to obtain an adjusted image discrimination model; and
keeping the parameters of the adjusted image discrimination model unchanged and adjusting the parameters of the initial image translation model until the deviation of the image determination loss function is less than the preset threshold, to obtain the target image translation model.
In a second aspect, an embodiment of the present application provides an image translation method, the method including:
acquiring an image to be translated; and
processing the image to be translated using the target image translation model according to any one of the first aspect to obtain a target image.
In a third aspect, an embodiment of the present application provides a training apparatus for an image translation model, the training apparatus including:
a translation module, configured to adopt an initial image translation model and input a source image in a sample image pair to obtain a training image, wherein the sample image pair includes the source image and a target image, the target image being a reference standard for translating the source image;
a discrimination module, configured to adopt an initial image discrimination model and input the sample image pair and a training image pair to obtain an image pair discrimination result, wherein the training image pair includes the source image and the training image; and
an adjustment module, configured to adjust parameters of the initial image translation model and the initial image discrimination model according to the image pair discrimination result to obtain a target image translation model.
In a fourth aspect, an embodiment of the present application provides an image translation apparatus, the apparatus including:
an acquisition module, configured to acquire an image to be translated; and
a processing module, configured to process the image to be translated using the target image translation model of the first aspect to obtain a target image.
In a fifth aspect, an embodiment of the present application provides an electronic device, including a processor and a storage medium, wherein the processor and the storage medium are communicatively connected via a bus, the storage medium stores program instructions executable by the processor, and the processor calls the program stored in the storage medium to execute the steps of the training method for an image translation model according to any one of the first aspect or of the image translation method according to the second aspect.
In a sixth aspect, an embodiment of the present application provides a storage medium having a computer program stored thereon, wherein, when the computer program is run by a processor, the steps of the training method for an image translation model according to any one of the first aspect or of the image translation method according to the second aspect are executed.
The details of one or more embodiments of the present application are set forth in the drawings and description below. Other features, objects, and advantages of the present application will become apparent from the specification, the drawings, and the claims.
To more clearly illustrate the technical solutions of the embodiments of the present application, the drawings required in the embodiments are briefly introduced below. It should be understood that the following drawings only show some embodiments of the present application and should therefore not be regarded as limiting the scope; for those of ordinary skill in the art, other related drawings can also be obtained from these drawings without creative effort.
FIG. 1 is a schematic flowchart of a method for training an image translation model provided by some embodiments of the present application;
FIG. 2 is a schematic flowchart of a method for obtaining a training image by translation provided by some embodiments of the present application;
FIG. 3 is a schematic flowchart of an image pair discrimination method provided by some embodiments of the present application;
FIG. 4 is a schematic flowchart of a model determination method provided by some embodiments of the present application;
FIG. 5 is a schematic flowchart of a model adjustment method provided by some embodiments of the present application;
FIG. 6 is a schematic flowchart of an image translation method provided by some embodiments of the present application;
FIG. 7 is a schematic diagram of a training apparatus for an image translation model provided by some embodiments of the present application;
FIG. 8 is a schematic diagram of an image translation apparatus provided by some embodiments of the present application;
FIG. 9 is a schematic diagram of an electronic device provided by some embodiments of the present application;
Icons: 701 - translation module, 702 - discrimination module, 703 - adjustment module, 704 - generation module, 801 - acquisition module, 802 - processing module, 901 - processor, 902 - storage medium.
To make the objectives, technical solutions, and advantages of the embodiments of the present application clearer, the technical solutions in the embodiments of the present application are described clearly and completely below with reference to the drawings in the embodiments of the present application. Obviously, the described embodiments are some, rather than all, of the embodiments of the present application. The components of the embodiments of the present application generally described and shown in the drawings herein can be arranged and designed in various different configurations.
Therefore, the following detailed description of the embodiments of the present application provided in the drawings is not intended to limit the scope of the claimed application, but merely represents selected embodiments of the present application. Based on the embodiments of the present application, all other embodiments obtained by those of ordinary skill in the art without creative effort fall within the scope of protection of the present application.
It should be noted that similar reference numerals and letters denote similar items in the following drawings; therefore, once an item is defined in one drawing, it does not need to be further defined or explained in subsequent drawings.
In addition, the terms "first", "second", and the like, if present, are only used to distinguish descriptions and cannot be understood as indicating or implying relative importance.
It should be noted that, in the absence of conflict, the features in the embodiments of the present application can be combined with each other.
To improve the training speed and training accuracy of image translation models, the present application provides a method for training an image translation model, an image translation method, a device, and a storage medium.
The training method for an image translation model provided by the present application is explained below through specific examples. FIG. 1 is a schematic flowchart of a method for training an image translation model provided by the present application. The execution subject of the method may be an electronic device with computing and processing functions, such as a desktop computer or a laptop computer. As shown in FIG. 1, the method includes steps S101-S103.
S101: Adopt an initial image translation model and input a source image in a sample image pair to obtain a training image.
The sample image pair includes a source image and a target image, wherein the target image is a reference standard for translating the source image and the source image is the image to be translated; that is, the target image is the standard image obtained by translating the source image.
The initial image translation model is adopted, the source image is input, and the source image is translated to obtain the training image. The training image is the translation result of the source image in the current round of image translation.
In the process of training the image translation model, to achieve the expected training effect, the user may process the source image in advance according to the image translation requirements (for example, generating bangs for a portrait, or cartoonizing a portrait) to obtain the target image. The training image obtained by the image translation model takes the target image as the reference standard; in other words, the more similar the training image is to the target image, the better the effect of the trained image translation model.
S102: Adopt an initial image discrimination model and input the sample image pair and a training image pair to obtain an image pair discrimination result.
The training image pair includes the source image and the training image.
After the training image translated from the source image by the image translation model is obtained, the source image and the training image can be combined into a training image pair, which together with the sample image pair forms two image pairs. Understandably, the sample image pair characterizes the gap between the source image and the target image, while the training image pair characterizes the gap between the source image and the training image. Compared with discriminating only between the target image and the training image, discriminating between the sample image pair and the training image pair also introduces the source image into the discrimination process, which makes the discrimination result more accurate and improves the discrimination speed.
S103: Adjust the parameters of the initial image translation model and the initial image discrimination model according to the image pair discrimination result to obtain a target image translation model.
In the process of training the model, an image translation model and an image discrimination model are adopted. The purpose of the image translation model is to translate a training image closer to the target image so as to pass the discrimination of the image discrimination model (to successfully deceive it), while the purpose of the image discrimination model is to discriminate strictly (so as not to be deceived by the image translation model). Through the combination of the two, they supervise each other and train adversarially, forming an unsupervised model adversarial training network.
The parameters of the initial image translation model and the initial image discrimination model are adjusted according to the image pair discrimination result. The purpose of the adjustment is to make the image translation model translate images more accurately (better able to deceive the image discrimination model) and the image discrimination model discriminate more strictly (better able to avoid being deceived by the image translation model).
Multiple groups of sample image pairs are input, and the above steps S101 and S102 are repeated; each time, the parameters of the initial image translation model and the initial image discrimination model are adjusted according to the image pair discrimination result, until the trained model meets the user's requirements and the model training is completed. The trained image translation model is taken as the target image translation model. By training with image pairs and combining the image translation model with the image discrimination model, an unsupervised model training network is formed, which improves the training accuracy and training speed of the model.
In summary, in this embodiment, an initial image translation model is adopted and a source image in a sample image pair is input to obtain a training image, the sample image pair including the source image and a target image, the target image being a reference standard for translating the source image; an initial image discrimination model is adopted and the sample image pair and a training image pair are input to obtain an image pair discrimination result, the training image pair including the source image and the training image; and the parameters of the initial image translation model and the initial image discrimination model are adjusted according to the image pair discrimination result to obtain a target image translation model. Thus, by training with image pairs and combining the image translation model with the image discrimination model, an unsupervised model training network is formed, which improves the training accuracy and training speed of the model.
On the basis of the embodiment corresponding to FIG. 1, in the embodiment of the present application, before adopting the initial image translation model and inputting the source image in the sample image pair to obtain the training image in S101, the method further includes: inputting multiple groups of noise into a preset image generator to obtain multiple groups of source images and target images, forming multiple groups of sample image pairs.
For example, the preset image generator is a pair of pre-trained generators, including a source image generator and a target image generator; the input of the preset image generator is noise, and its output is an image. Multiple groups of noise can be randomly sampled from a standard Gaussian distribution. In each round, a group of noise is separately input into the source image generator (containing the initial source image information) and the target image generator (containing the initial target image information) to obtain a group of source images and a group of target images; a group of source images and target images forms a group of sample image pairs. The specific image generation is shown in the following formula (1):
(x,y)=(Gx(z),Gy(z)) (1)
where z is the noise, Gx is the expression of the target image generator, Gy is the expression of the source image generator, x is the target image, and y is the source image.
By inputting multiple groups of noise into the preset image generator, multiple groups of source images and target images can be generated with only one group of real initial source images and initial target images; there is no need to collect multiple groups of source images from real scenes and then process them to obtain the target images. This greatly expands the number of training samples, reduces the cost of obtaining samples, and improves the efficiency of obtaining samples.
In summary, in this embodiment, multiple groups of noise are input into the preset image generator to obtain multiple groups of source images and target images, forming multiple groups of sample image pairs. This greatly expands the number of training samples, reduces the cost of obtaining samples, and improves the efficiency of obtaining samples.
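As an illustration of this sampling scheme, the following PyTorch-style sketch draws paired training data from two frozen generators that share the same noise input, following formula (1). The two small networks stand in for the pre-trained StyleGAN-based source and target generators; they are placeholders for the sketch, not the original models.

```python
import torch
import torch.nn as nn

# Stand-ins for the pre-trained target/source generators Gx and Gy;
# in the application's setting these would be StyleGAN-based generators.
Gx = nn.Sequential(nn.Linear(512, 3 * 64 * 64), nn.Tanh())  # target generator
Gy = nn.Sequential(nn.Linear(512, 3 * 64 * 64), nn.Tanh())  # source generator

@torch.no_grad()
def sample_image_pairs(num_pairs: int, z_dim: int = 512):
    """Formula (1): (x, y) = (Gx(z), Gy(z)) with shared Gaussian noise z."""
    z = torch.randn(num_pairs, z_dim)         # groups of noise ~ N(0, I)
    x = Gx(z).view(num_pairs, 3, 64, 64)      # target images
    y = Gy(z).view(num_pairs, 3, 64, 64)      # source images
    return x, y                               # (x[i], y[i]) is one sample pair
```

Because both generators consume the identical z, each pair is semantically aligned, which is what allows one group of pre-trained generators to replace real-scene data collection.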
In another embodiment, the present application further provides a method for obtaining a training image by translation. FIG. 2 is a schematic flowchart of a method for obtaining a training image by translation provided by the present application. As shown in FIG. 2, the initial image translation model includes an encoder and a re-parameterized generator, and the re-parameterized generator includes multiple modulated convolutional layers and a mapper.
The re-parameterized generator refers to a generator that has been re-parameterized. Re-parameterization refers to sampling from a parameterized distribution: if sampling is performed directly (the sampling action is discrete, and discrete data is not differentiable), there is no gradient information, so the parameter gradients will not be updated during the backpropagation of the neural network. Re-parameterization ensures sampling from the distribution while retaining the gradient information.
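As a minimal illustration of why re-parameterization preserves gradients, the generic sketch below moves the randomness into an auxiliary variable eps so that the sample z = mu + sigma * eps stays differentiable with respect to the distribution parameters; this is the standard trick in general form, not code from the application itself.

```python
import torch

mu = torch.zeros(4, requires_grad=True)         # learnable mean
log_sigma = torch.zeros(4, requires_grad=True)  # learnable log std-dev

# Re-parameterized sampling: randomness lives in eps, so the sample is a
# differentiable function of mu and log_sigma and gradients can flow back.
eps = torch.randn(4)                  # eps ~ N(0, I), parameter-free
z = mu + log_sigma.exp() * eps        # z ~ N(mu, sigma^2), differentiable

loss = (z ** 2).sum()
loss.backward()
print(mu.grad, log_sigma.grad)        # both populated; training can proceed
```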
Adopting the initial image translation model and inputting the source image in the sample image pair to obtain the training image in S101 includes steps S201-S204.
S201: Input the source image, preset noise, and a constant feature map.
For example, when the source image is input into the image translation model, the preset noise and the constant feature map also need to be input at the same time. The preset noise and the constant feature map are the noise and constant feature map corresponding to the re-parameterized generator. Taking a pre-trained StyleGAN generator as the re-parameterized generator as an example, the constant feature map is the constant input of the StyleGAN generator: a preset fixed constant vector serving as the input of the StyleGAN generator. The preset noise can generate vectors in the feature space, and these vectors are input into the StyleGAN generator as the style information of the image.
S202: Use the mapper to convert the preset noise into a noise offset vector in the feature space.
For example, the feature space may be the W space, and the specific conversion is shown in the following formula (2):
w=M(z) (2)
where M is the expression of the mapper, z is the preset noise, and w is the noise offset vector.
S203: Use the encoder to convert the source image into a source image vector in the feature space.
The encoder can be expressed as a composition of layers, where fi denotes the i-th layer of the encoder. Further, the output of the i-th layer of the encoder can be expressed by the following formula (3):
ei=fi(ei+1)=Convi(ei+1) (3)
where en+1 = y is the source image, i = (0, 1, …, n), and ei denotes the output of the i-th layer of the encoder.
When i = 0, the final output of the encoder is e0 = {w1, …, wn}, which serves as the W+ spatial offset of the encoded output, that is, the noise offset vector in step S202.
Therefore, the output of the encoder consists of two parts: the output of each layer of the encoder and the noise offset vector.
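A minimal sketch of an encoder with this two-part output is given below: each layer fi is a strided convolution applied top-down per formula (3), every intermediate feature ei is kept, and a pooled projection produces the per-layer W+ offsets. The channel widths and the pooling-based projection head are illustrative assumptions, not the application's exact architecture.

```python
import torch
import torch.nn as nn

class PyramidEncoder(nn.Module):
    """Sketch of e_i = Conv_i(e_{i+1}) with per-layer outputs and W+ offsets."""
    def __init__(self, n_layers: int = 4, ch: int = 32, w_dim: int = 512):
        super().__init__()
        self.convs = nn.ModuleList(
            nn.Conv2d(3 if k == 0 else ch, ch, 3, stride=2, padding=1)
            for k in range(n_layers)
        )
        # Illustrative head mapping pooled features to per-layer W+ offsets.
        self.to_w = nn.Linear(ch, w_dim)

    def forward(self, y: torch.Tensor):
        feats, e = [], y                     # e_{n+1} = y, the source image
        for conv in self.convs:              # apply f_n, ..., f_1 in turn
            e = torch.relu(conv(e))
            feats.append(e)                  # per-layer outputs e_i
        w_offsets = [self.to_w(f.mean(dim=(2, 3))) for f in feats]  # e_0
        return feats, w_offsets              # the encoder's two outputs
```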
S204: Use the re-parameterized generator to generate the training image according to the source image vector, the noise offset vector, and the constant feature map.
Steps S203 and S204 are repeated, traversing all layer outputs of the encoder and the re-parameterized generator, to generate the complete training image.
The image translation model adopts a re-parameterized generator, which turns the image translation model into a re-parameterized network structure. The re-parameterized generator generates the training image according to the source image vector, the noise offset vector, and the constant feature map; it can migrate smoothly between tasks at minimal cost, ensuring the properties of the re-parameterized network and improving the training speed.
In summary, in this embodiment, the initial image translation model includes an encoder and a re-parameterized generator, the re-parameterized generator including multiple modulated convolutional layers and a mapper. The source image, the preset noise, and the constant feature map are input; the mapper converts the preset noise into a noise offset vector in the feature space; the encoder converts the source image into a source image vector in the feature space; and the re-parameterized generator generates the training image according to the source image vector, the noise offset vector, and the constant feature map. Thus, the properties of the re-parameterized network are ensured and the training speed is improved.
On the basis of the embodiment corresponding to FIG. 2, in the embodiment of the present application, before using the re-parameterized generator to generate the training image according to the source image vector, the noise offset vector, and the constant feature map in S204, the method further includes: re-parameterizing the pre-trained generator according to the first affine parameter and the second affine parameter to obtain the re-parameterized generator.
The output feature of the i-th layer of the pre-trained generator can be expressed by the following formula (4):
ci=ModConvi(ci-1,w) (4)
where ci is the output feature of the i-th layer of the pre-trained generator and w is the noise offset vector.
The pre-trained generator is re-parameterized according to the first affine parameter and the second affine parameter, changing the structure of the pre-trained generator to obtain the re-parameterized generator; the output feature of the i-th layer of the re-parameterized generator can be expressed by the following formula (5):
ci=ModConvi(ci-1,w+αwi)+βei (5)
where α is the first affine parameter, β is the second affine parameter, ei is the output of the i-th layer of the encoder, and wi is the W+ spatial offset.
The first affine parameter and the second affine parameter are introduced, and the output ei of each layer of the encoder and the final output W+ spatial offset are used to re-parameterize the pre-trained generator. The first affine parameter and the second affine parameter are learnable affine parameters initialized to 0, so that the prior of the re-parameterized generator network can be utilized while migrating smoothly between tasks at minimal cost, ensuring the properties of the re-parameterized network, achieving fast convergence, and improving the training speed.
In summary, in this embodiment, the pre-trained generator is re-parameterized according to the first affine parameter and the second affine parameter to obtain the re-parameterized generator, thereby ensuring the properties of the re-parameterized network and improving the training speed.
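The following sketch illustrates the re-parameterized layer of formula (5): the frozen modulated convolution is preserved, while learnable scalars α and β, both initialized to 0, inject the encoder's per-layer offset wi and skip feature ei. The ModulatedConv stand-in is a heavy simplification of StyleGAN's real modulated convolution; treat it as illustrative only.

```python
import torch
import torch.nn as nn

class ModulatedConv(nn.Module):
    """Simplified stand-in for a StyleGAN modulated convolution layer."""
    def __init__(self, ch: int, w_dim: int = 512):
        super().__init__()
        self.conv = nn.Conv2d(ch, ch, 3, padding=1)
        self.affine = nn.Linear(w_dim, ch)

    def forward(self, c, w):
        style = self.affine(w).unsqueeze(-1).unsqueeze(-1)  # per-channel scale
        return self.conv(c * style)

class ReparamLayer(nn.Module):
    """Formula (5): c_i = ModConv_i(c_{i-1}, w + alpha * w_i) + beta * e_i."""
    def __init__(self, ch: int, w_dim: int = 512):
        super().__init__()
        self.modconv = ModulatedConv(ch, w_dim)
        self.alpha = nn.Parameter(torch.zeros(1))  # first affine parameter
        self.beta = nn.Parameter(torch.zeros(1))   # second affine parameter

    def forward(self, c_prev, w, w_i, e_i):
        # With alpha = beta = 0 at initialization, this reduces exactly to
        # the pre-trained layer c_i = ModConv_i(c_{i-1}, w), so the generator
        # prior is preserved and training starts from the pre-trained output.
        return self.modconv(c_prev, w + self.alpha * w_i) + self.beta * e_i
```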
In another embodiment, the present application further provides an image pair discrimination method. FIG. 3 is a schematic flowchart of an image pair discrimination method provided by the present application. As shown in FIG. 3, adopting the initial image discrimination model and inputting the sample image pair and the training image pair to obtain the image pair discrimination result in S102 includes steps S301-S302.
S301: Re-parameterize the pre-trained image discrimination model according to the third affine parameter and a preset convolution block to obtain a re-parameterized image discrimination model.
The image discrimination model is used to discriminate images. The input of the pre-trained image discrimination model (that is, the original image discrimination model) is a single image, while the above embodiments of the present application generate paired data (image pairs), so the input of the image discrimination model is also paired images (image pairs); therefore, the pre-trained image discrimination model also needs to be re-parameterized.
The image discrimination model can be regarded as a stack of n downsampled residual blocks d. Taking the feature map encoded by the first layer dn as a whole, it is shown in the following formula (6):
Re-parameterization is performed according to the third affine parameter and the preset convolution block, and the obtained re-parameterized image discrimination model takes the form of the following formula (7):
where γ is the third affine parameter, which is a learnable affine parameter initialized to 0, and d′n is a newly introduced preset convolution block; the preset convolution block may be a discriminator convolution block and is randomly initialized.
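Since formula (7) itself is not reproduced in the text, the sketch below shows one plausible reading of the construction: the pre-trained first block dn keeps its single-image pathway, while the randomly initialized pair block d′n sees the concatenated 6-channel image pair, weighted by γ initialized to 0 so that the discriminator prior is preserved at the start. The block widths and the exact combination are assumptions.

```python
import torch
import torch.nn as nn

class PairDiscriminatorHead(nn.Module):
    """Sketch of the first, re-parameterized stage of the pair discriminator."""
    def __init__(self, ch: int = 64):
        super().__init__()
        self.d_n = nn.Conv2d(3, ch, 3, stride=2, padding=1)       # pre-trained d_n
        self.d_n_pair = nn.Conv2d(6, ch, 3, stride=2, padding=1)  # new block d'_n
        self.gamma = nn.Parameter(torch.zeros(1))  # third affine parameter

    def forward(self, img_a: torch.Tensor, img_b: torch.Tensor):
        pair = torch.cat([img_a, img_b], dim=1)   # (B, 6, H, W) image pair
        # At init gamma = 0, so the head reduces to the pre-trained d_n
        # applied to the first image of the pair, preserving the prior.
        return self.d_n(img_a) + self.gamma * self.d_n_pair(pair)
```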
S302: Adopt the re-parameterized image discrimination model and input the sample image pair and the training image pair to obtain the image pair discrimination result.
The sample image pair includes the source image and the target image, and the training image pair includes the source image and the training image. The sample image pair is input into the re-parameterized image discrimination model to obtain the discrimination result of the sample image pair, and the training image pair is input into the re-parameterized image discrimination model to obtain the discrimination result of the training image pair.
In summary, in this embodiment, the pre-trained image discrimination model is re-parameterized according to the third affine parameter and the preset convolution block to obtain the re-parameterized image discrimination model, and the re-parameterized image discrimination model is adopted with the sample image pair and the training image pair as input to obtain the image pair discrimination result. Thus, by re-parameterizing the image discrimination model, image pairs can be discriminated, which improves the discrimination efficiency.
In another embodiment, the present application further provides a model determination method. FIG. 4 is a schematic flowchart of a model determination method provided by the present application. As shown in FIG. 4, adjusting the parameters of the initial image translation model and the initial image discrimination model according to the image pair discrimination result to obtain the target image translation model in S103 includes steps S401-S402.
S401: Determine an image determination loss function according to the image pair discrimination result.
The image pair discrimination result includes the discrimination result of the sample image pair and the discrimination result of the training image pair; the image determination loss function determined according to these results takes the form of the following formula (8):
where D(x, y) is the discrimination result of the sample image pair, D(T(y), y) is the discrimination result of the training image pair, and T(y) is the training image.
The loss function expresses that the image discrimination model needs to accurately determine which image pair is the sample image pair and which is the training image pair, while the image translation model needs to make the training image pair more likely to pass the judgment of the image discrimination model, so that the image discrimination model judges the training image pair to be a sample image pair (the smaller the gap between the training image pair and the sample image pair, the more successfully the image translation model deceives the image discrimination model).
S402: Cyclically adjust the parameters of the initial image translation model and the initial image discrimination model until the deviation of the image determination loss function is less than a preset threshold, to obtain the target image translation model.
Each time a group of image pair discrimination results is input, the corresponding deviation of the image determination loss function is obtained. If the deviation of the image determination loss function is greater than or equal to the preset threshold, the loss function is still not stable enough and the function curve is not smooth enough, so the parameters of the initial image translation model and the initial image discrimination model (for example, the first affine parameter, the second affine parameter, and the third affine parameter in the above embodiments) need to be adjusted further. If the deviation of the image determination loss function is less than the preset threshold, the loss function is stable enough and the function curve is smooth enough, so there is no need to continue adjusting the parameters, and the adjusted image translation model is taken as the target image translation model.
For example, the preset threshold can be set according to the actual needs of model training, and the deviation of the image determination loss function can be the mean square deviation of the image determination loss function.
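As a sketch of such a stopping test, the helper below measures the mean square deviation of the most recent loss values around their mean and compares it with the preset threshold; the window size and threshold value are illustrative assumptions.

```python
from collections import deque

def loss_deviation_converged(history: deque, threshold: float = 1e-4,
                             window: int = 50) -> bool:
    """True once the mean square deviation of recent losses drops below threshold."""
    if len(history) < window:
        return False                        # not enough rounds observed yet
    recent = list(history)[-window:]
    mean = sum(recent) / window
    msd = sum((v - mean) ** 2 for v in recent) / window
    return msd < threshold                  # loss curve stable and smooth

# Usage: append each round's loss value and stop training when it returns True.
```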
In summary, in this embodiment, the image determination loss function is determined according to the image pair discrimination result, and the parameters of the initial image translation model and the initial image discrimination model are adjusted cyclically until the deviation of the image determination loss function is less than the preset threshold, to obtain the target image translation model. Thus, the loss function makes the model training more accurate.
In another embodiment, the present application further provides a model adjustment method. FIG. 5 is a schematic flowchart of a model adjustment method provided by the present application. As shown in FIG. 5, cyclically adjusting the parameters of the initial image translation model and the initial image discrimination model until the deviation of the image determination loss function is less than the preset threshold, to obtain the target image translation model, in S402 includes steps S501-S502.
S501: Keep the parameters of the initial image translation model unchanged and adjust the parameters of the initial image discrimination model to obtain an adjusted image discrimination model.
In each round of parameter adjustment, the parameters of the initial image translation model can first be kept unchanged (e.g., keeping the first affine parameter and the second affine parameter unchanged) while the parameters of the initial image discrimination model are adjusted (e.g., adjusting the third affine parameter). In the loss function, this is reflected in keeping the translation structure T(y) unchanged and changing the discrimination structure of D(x, y) and D(T(y), y), thereby obtaining the adjusted image discrimination model.
The corresponding loss function can be expressed by the following formula (9):
The model loss is calculated according to formula (9), and the parameters of the initial image discrimination model are then optimized and adjusted by gradient-descent backpropagation to update the parameters of the initial image discrimination model.
S502: Keep the parameters of the adjusted image discrimination model unchanged and adjust the parameters of the initial image translation model until the deviation of the image determination loss function is less than the preset threshold, to obtain the target image translation model.
Further, the parameters of the adjusted image discrimination model are kept unchanged (e.g., keeping the third affine parameter unchanged) while the parameters of the initial image translation model are adjusted (e.g., adjusting the first affine parameter and the second affine parameter). In the loss function, this is reflected in keeping the discrimination structure of D(x, y) and D(T(y), y) unchanged and changing the translation structure of T(y), thereby obtaining the adjusted image translation model.
The corresponding loss function can be expressed by the following formula (10):
The model loss is calculated according to formula (10), and the parameters of the initial image translation model are optimized and adjusted by gradient-descent backpropagation to update the parameters of the initial image translation model.
Through such adversarial training, the training images generated by the image translation model become closer and closer to the target images.
In summary, in this embodiment, the parameters of the initial image translation model are kept unchanged while the parameters of the initial image discrimination model are adjusted to obtain the adjusted image discrimination model; then the parameters of the adjusted image discrimination model are kept unchanged while the parameters of the initial image translation model are adjusted until the deviation of the image determination loss function is less than the preset threshold, to obtain the target image translation model. Thus, adversarial training improves the training accuracy of the model.
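The alternating schedule of S501-S502 can be sketched as a standard GAN-style loop. Because formulas (8)-(10) are not reproduced in the text, the sketch substitutes the common non-saturating adversarial objective on image pairs; treat the specific loss choice as an assumption rather than the application's exact formulas.

```python
import torch.nn.functional as F

def adversarial_round(T, D, opt_T, opt_D, x, y):
    """One round: S501 updates D with T frozen, then S502 updates T with D fixed."""
    # S501: translation parameters fixed, discrimination parameters adjusted.
    fake = T(y).detach()                       # cut gradients into T
    d_loss = (F.softplus(-D(x, y)) +           # sample pair (x, y) -> "real"
              F.softplus(D(fake, y))).mean()   # training pair -> "fake"
    opt_D.zero_grad(); d_loss.backward(); opt_D.step()

    # S502: discrimination parameters fixed (opt_D not stepped),
    # translation parameters adjusted so the training pair passes D.
    g_loss = F.softplus(-D(T(y), y)).mean()
    opt_T.zero_grad(); g_loss.backward(); opt_T.step()
    return d_loss.item(), g_loss.item()
```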
The image translation method provided by the present application is explained below through a specific example. FIG. 6 is a schematic flowchart of an image translation method provided by the present application. The execution subject of the method may be an electronic device with computing and processing functions, such as a desktop computer or a laptop computer. As shown in FIG. 6, the method includes:
S601: Acquire an image to be translated.
S602: Use the target image translation model of any one of the above embodiments to process the image to be translated to obtain a target image.
Through the above training method, a target image translation model with accurate translation has been obtained; using the target image translation model to process the image to be translated yields a more accurate target image.
In summary, in this embodiment, the image to be translated is acquired, and the target image translation model of any one of the above embodiments is used to process the image to be translated to obtain the target image, thereby improving the accuracy of image translation.
Further, the simulation experiment environment of the present application is described below.
The present application was developed on the Ubuntu platform, with the deep learning framework based on PyTorch. The main language used is Python, and DDP (Distributed Data Parallel) is used as the basic parallel architecture for multi-GPU training.
The sample image generator and target image generator are based on the StyleCLIP and StyleGAN-nada approaches. For StyleCLIP, this application uses the text pair "Face" and "Face with fringe hair" to achieve local manipulation in the form of adding bangs. For StyleGAN-nada, this application uses the text pair "Photo" and "Pixar" to achieve the global manipulation goal of face cartoonization. The CLIP model herein uses a Vision Transformer as the visual encoding model. After an image pair is generated, synchronized data augmentation is applied to the two images, including horizontal flipping and color changes.
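Synchronized augmentation here means both images of a pair receive the identical random transform, so their alignment is preserved. A minimal sketch under that reading (the disclosure names only flipping and color changes; the jitter range is an assumption):

```python
import torch

def augment_pair(x: torch.Tensor, y: torch.Tensor):
    """Apply the same random flip and color scaling to both images of a pair."""
    if torch.rand(()) < 0.5:                   # one shared flip decision
        x, y = x.flip(-1), y.flip(-1)          # horizontal flip of both images
    scale = 1.0 + 0.2 * (torch.rand(3) - 0.5)  # one shared per-channel color change
    scale = scale.view(1, 3, 1, 1)
    return (x * scale).clamp(-1, 1), (y * scale).clamp(-1, 1)
```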
The generator of the image translation model adopts the StyleGAN2 structure. Both the generator and the image discrimination model are initialized to the generator and discriminator of the default face-generation StyleGAN2 model. The encoder part of the image translation model adopts a structure similar to the discriminator and is randomly initialized.
It should be understood that although the steps in the flowcharts of the above embodiments are displayed in sequence as indicated by the arrows, these steps are not necessarily executed in that order. Unless explicitly stated herein, the execution of these steps has no strict order restriction, and they may be executed in other orders. Moreover, at least some of the steps in the flowcharts of the above embodiments may include multiple steps or stages; these steps or stages are not necessarily completed at the same moment but may be executed at different moments, and their execution order is not necessarily sequential: they may be executed in turn or alternately with other steps or with at least part of the steps or stages in other steps.
The image translation model training apparatus, image translation apparatus, device, and storage medium provided by the present application for carrying out the above are described below; for their specific implementation processes and technical effects, refer to the above, which will not be repeated below.
FIG. 7 is a schematic diagram of a training apparatus for an image translation model provided in an embodiment of the present application. As shown in FIG. 7, the apparatus includes:
a translation module 701, configured to adopt an initial image translation model and input a source image in a sample image pair to obtain a training image, the sample image pair including the source image and a target image, the target image being a reference standard for translating the source image;
a discrimination module 702, configured to adopt an initial image discrimination model and input the sample image pair and a training image pair to obtain an image pair discrimination result, the training image pair including the source image and the training image;
an adjustment module 703, configured to adjust the parameters of the initial image translation model and the initial image discrimination model according to the image pair discrimination result to obtain a target image translation model; and
a generation module 704, configured to input multiple groups of noise into a preset image generator to obtain multiple groups of source images and target images, forming multiple groups of sample image pairs.
Further, the translation module 701 is specifically configured such that the initial image translation model includes an encoder and a re-parameterized generator, the re-parameterized generator including multiple modulated convolutional layers and a mapper: to input the source image, preset noise, and a constant feature map; to use the mapper to convert the preset noise into a noise offset vector in the feature space; to use the encoder to convert the source image into a source image vector in the feature space; and to use the re-parameterized generator to generate the training image according to the source image vector, the noise offset vector, and the constant feature map.
Further, the translation module 701 is specifically further configured to re-parameterize the pre-trained generator according to the first affine parameter and the second affine parameter to obtain the re-parameterized generator.
Further, the discrimination module 702 is specifically configured to re-parameterize the pre-trained image discrimination model according to the third affine parameter and the preset convolution block to obtain a re-parameterized image discrimination model, and to adopt the re-parameterized image discrimination model with the sample image pair and the training image pair as input to obtain the image pair discrimination result.
Further, the adjustment module 703 is specifically configured to determine an image determination loss function according to the image pair discrimination result, and to cyclically adjust the parameters of the initial image translation model and the initial image discrimination model until the deviation of the image determination loss function is less than a preset threshold, to obtain the target image translation model.
Further, the adjustment module 703 is specifically further configured to keep the parameters of the initial image translation model unchanged and adjust the parameters of the initial image discrimination model to obtain an adjusted image discrimination model, and to keep the parameters of the adjusted image discrimination model unchanged and adjust the parameters of the initial image translation model until the deviation of the image determination loss function is less than the preset threshold, to obtain the target image translation model.
FIG. 8 is a schematic diagram of an image translation apparatus provided in an embodiment of the present application. As shown in FIG. 8, the apparatus includes:
an acquisition module 801, configured to acquire an image to be translated; and
a processing module 802, configured to process the image to be translated using the target image translation model of any one of the above embodiments to obtain a target image.
Each module in the above training apparatus for an image translation model and the above image translation apparatus can be implemented in whole or in part by software, hardware, or a combination thereof. The above modules can be embedded in or independent of a processor in a computer device in the form of hardware, or can be stored in a memory in the computer device in the form of software, so that the processor can call and execute the operations corresponding to the above modules. FIG. 9 is a schematic diagram of an electronic device provided in an embodiment of the present application, which can be a device with computing and processing functions.
The electronic device includes a processor 901 and a storage medium 902, which are connected via a bus.
The storage medium 902 is used to store programs, and the processor 901 calls the programs stored in the storage medium 902 to execute the above method embodiments. The specific implementations and technical effects are similar and will not be repeated here.
Optionally, the present application further provides a storage medium including a program, which, when executed by a processor, is used to execute the above method embodiments. In the several embodiments provided in the present application, it should be understood that the disclosed apparatuses and methods can be implemented in other ways. For example, the apparatus embodiments described above are merely illustrative; for instance, the division of the units is only a logical function division, and there may be other division methods in actual implementation: multiple units or components may be combined or integrated into another system, or some features may be ignored or not executed. In addition, the mutual coupling, direct coupling, or communication connection shown or discussed may be indirect coupling or communication connection through some interfaces, apparatuses, or units, and may be electrical, mechanical, or in other forms.
The units described as separate components may or may not be physically separated, and the components shown as units may or may not be physical units; that is, they may be located in one place or distributed on multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, the functional units in the embodiments of the present application may be integrated into one processing unit, or each unit may exist physically separately, or two or more units may be integrated into one unit. The above integrated unit may be implemented in the form of hardware or in the form of hardware plus software functional units.
The above integrated unit implemented in the form of a software functional unit can be stored in a storage medium. The above software functional unit is stored in a storage medium and includes a number of instructions for causing a computer device (which may be a personal computer, a server, a network device, or the like) or a processor to execute some steps of the methods described in the embodiments of the present application. The aforementioned storage medium includes various media that can store program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disc.
The technical features of the above embodiments can be combined arbitrarily. For brevity, not all possible combinations of the technical features in the above embodiments are described; however, as long as there is no contradiction in the combination of these technical features, they should be considered within the scope of this specification.
The above embodiments only express several implementations of the present application, and their descriptions are relatively specific and detailed, but they should not be understood as limiting the scope of the patent of the present application. It should be pointed out that, for those of ordinary skill in the art, several modifications and improvements can be made without departing from the concept of the present application, all of which fall within the scope of protection of the present application. Therefore, the scope of protection of the present application shall be subject to the appended claims.
Claims (12)
- A method for training an image translation model, comprising: adopting an initial image translation model and inputting a source image in a sample image pair to obtain a training image, wherein the sample image pair comprises the source image and a target image, the target image being a reference standard for translating the source image; adopting an initial image discrimination model and inputting the sample image pair and a training image pair to obtain an image pair discrimination result, wherein the training image pair comprises the source image and the training image; and adjusting parameters of the initial image translation model and the initial image discrimination model according to the image pair discrimination result to obtain a target image translation model.
- The method according to claim 1, wherein before adopting the initial image translation model and inputting the source image in the sample image pair to obtain the training image, the method further comprises: inputting multiple groups of noise into a preset image generator to obtain multiple groups of the source images and the target images, forming multiple groups of the sample image pairs.
- The method according to claim 1, wherein the initial image translation model comprises an encoder and a re-parameterized generator, the re-parameterized generator comprises multiple modulated convolutional layers and a mapper, and adopting the initial image translation model and inputting the source image in the sample image pair to obtain the training image comprises: inputting the source image, preset noise, and a constant feature map; using the mapper to convert the preset noise into a noise offset vector in a feature space; using the encoder to convert the source image into a source image vector in the feature space; and using the re-parameterized generator to generate the training image according to the source image vector, the noise offset vector, and the constant feature map.
- The method according to claim 3, wherein before using the re-parameterized generator to generate the training image according to the source image vector, the noise offset vector, and the constant feature map, the method further comprises: re-parameterizing a pre-trained generator according to a first affine parameter and a second affine parameter to obtain the re-parameterized generator.
- The method according to claim 1, wherein adopting the initial image discrimination model and inputting the sample image pair and the training image pair to obtain the image pair discrimination result comprises: re-parameterizing a pre-trained image discrimination model according to a third affine parameter and a preset convolution block to obtain a re-parameterized image discrimination model; and adopting the re-parameterized image discrimination model and inputting the sample image pair and the training image pair to obtain the image pair discrimination result.
- The method according to claim 1, wherein adjusting the parameters of the initial image translation model and the initial image discrimination model according to the image pair discrimination result to obtain the target image translation model comprises: determining an image determination loss function according to the image pair discrimination result; and cyclically adjusting the parameters of the initial image translation model and the initial image discrimination model until the deviation of the image determination loss function is less than a preset threshold, to obtain the target image translation model.
- The method according to claim 6, wherein cyclically adjusting the parameters of the initial image translation model and the initial image discrimination model until the deviation of the image determination loss function is less than the preset threshold, to obtain the target image translation model, comprises: keeping the parameters of the initial image translation model unchanged and adjusting the parameters of the initial image discrimination model to obtain an adjusted image discrimination model; and keeping the parameters of the adjusted image discrimination model unchanged and adjusting the parameters of the initial image translation model until the deviation of the image determination loss function is less than the preset threshold, to obtain the target image translation model.
- An image translation method, comprising: acquiring an image to be translated; and processing the image to be translated using the target image translation model according to any one of claims 1-6 to obtain a target image.
- A training apparatus for an image translation model, comprising: a translation module, configured to adopt an initial image translation model and input a source image in a sample image pair to obtain a training image, wherein the sample image pair comprises the source image and a target image, the target image being a reference standard for translating the source image; a discrimination module, configured to adopt an initial image discrimination model and input the sample image pair and a training image pair to obtain an image pair discrimination result, wherein the training image pair comprises the source image and the training image; and an adjustment module, configured to adjust parameters of the initial image translation model and the initial image discrimination model according to the image pair discrimination result to obtain a target image translation model.
- An image translation apparatus, comprising: an acquisition module, configured to acquire an image to be translated; and a processing module, configured to process the image to be translated using the target image translation model according to any one of claims 1 to 7 to obtain a target image.
- An electronic device, comprising a processor and a storage medium, wherein the processor and the storage medium are communicatively connected via a bus, the storage medium stores program instructions executable by the processor, and the processor calls the program stored in the storage medium to execute the steps of the method according to any one of claims 1 to 8.
- A storage medium having a computer program stored thereon, wherein, when the computer program is run by a processor, the steps of the method according to any one of claims 1 to 8 are executed.
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202310082772.3 | 2023-01-30 | ||
CN202310082772.3A CN116229206A (zh) | 2023-01-30 | 2023-01-30 | Training of image translation model, image translation method, device and storage medium
Publications (1)
Publication Number | Publication Date |
---|---|
WO2024160178A1 true WO2024160178A1 (zh) | 2024-08-08 |
Family
ID=86578019
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/CN2024/074513 WO2024160178A1 (zh) | 2024-01-29 | Training of image translation model, image translation method, device and storage medium
Country Status (2)
Country | Link |
---|---|
CN (1) | CN116229206A (zh) |
WO (1) | WO2024160178A1 (zh) |
Families Citing this family (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN116229206A (zh) * | 2023-01-30 | 2023-06-06 | 厦门美图之家科技有限公司 | Training of image translation model, image translation method, device and storage medium |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110866909A (zh) * | 2019-11-13 | 2020-03-06 | 上海联影智能医疗科技有限公司 | Training method of image generation network, image prediction method, and computer device |
CN113139893A (zh) * | 2020-01-20 | 2021-07-20 | 北京达佳互联信息技术有限公司 | Image translation model construction method and apparatus, and image translation method and apparatus |
WO2022057837A1 (zh) * | 2020-09-16 | 2022-03-24 | 广州虎牙科技有限公司 | Image processing and portrait super-resolution reconstruction and model training methods and apparatuses, electronic device, and storage medium |
CN115424109A (zh) * | 2022-08-17 | 2022-12-02 | 之江实验室 | Deformable instance-level image translation method |
CN116229206A (zh) * | 2023-01-30 | 2023-06-06 | 厦门美图之家科技有限公司 | Training of image translation model, image translation method, device and storage medium |
-
2023
- 2023-01-30 CN CN202310082772.3A patent/CN116229206A/zh active Pending
-
2024
- 2024-01-29 WO PCT/CN2024/074513 patent/WO2024160178A1/zh unknown
Also Published As
Publication number | Publication date |
---|---|
CN116229206A (zh) | 2023-06-06 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
Masood et al. | Deepfakes generation and detection: State-of-the-art, open challenges, countermeasures, and way forward | |
Lu et al. | Image generation from sketch constraint using contextual gan | |
US11727717B2 (en) | Data-driven, photorealistic social face-trait encoding, prediction, and manipulation using deep neural networks | |
KR102663519B1 (ko) | 교차 도메인 이미지 변환 기법 | |
CN111553267B (zh) | 图像处理方法、图像处理模型训练方法及设备 | |
CN112734634A (zh) | 换脸方法、装置、电子设备和存储介质 | |
KR102287407B1 (ko) | 이미지 생성을 위한 학습 장치 및 방법과 이미지 생성 장치 및 방법 | |
US11893717B2 (en) | Initializing a learned latent vector for neural-network projections of diverse images | |
CN111275784A (zh) | 生成图像的方法和装置 | |
KR20220011100A (ko) | 얼굴 이미지 검색을 통한 가상 인물 생성 시스템 및 방법 | |
KR20210034462A (ko) | 픽셀 별 주석을 생성하는 생성적 적대 신경망(gan)을 학습시키는 방법 | |
WO2024160178A1 (zh) | 图像翻译模型的训练、图像翻译方法、设备及存储介质 | |
CN110728319A (zh) | 一种图像生成方法、装置以及计算机存储介质 | |
CN110874575A (zh) | 一种脸部图像处理方法及相关设备 | |
WO2025031069A1 (zh) | 活体检测模型的训练方法、装置、介质及电子设备、产品 | |
CN114913303A (zh) | 虚拟形象生成方法及相关装置、电子设备、存储介质 | |
CN114783017A (zh) | 基于逆映射的生成对抗网络优化方法及装置 | |
KR102504722B1 (ko) | 감정 표현 영상 생성을 위한 학습 장치 및 방법과 감정 표현 영상 생성 장치 및 방법 | |
US20220101122A1 (en) | Energy-based variational autoencoders | |
US20220101145A1 (en) | Training energy-based variational autoencoders | |
CN114330514A (zh) | 一种基于深度特征与梯度信息的数据重建方法及系统 | |
CN109754416B (zh) | 图像处理装置和方法 | |
WO2024050107A1 (en) | Three-dimensional diffusion models | |
Ye et al. | MimicTalk: Mimicking a personalized and expressive 3D talking face in minutes | |
Fan et al. | Facial expression animation through action units transfer in latent space |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
121 | Ep: the epo has been informed by wipo that ep was designated in this application |
Ref document number: 24749657 Country of ref document: EP Kind code of ref document: A1 |