CN116630147B - Face image editing method based on reinforcement learning
- Publication number: CN116630147B (application number CN202310908009.1A)
- Authority
- CN
- China
- Prior art keywords
- image
- face image
- reinforcement learning
- attribute
- network
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T3/00—Geometric image transformations in the plane of the image
- G06T3/04—Context-preserving transformations, e.g. by using an importance map
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
- G06N3/0455—Auto-encoder networks; Encoder-decoder networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
- G06N3/092—Reinforcement learning
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T3/00—Geometric image transformations in the plane of the image
- G06T3/40—Scaling of whole images or parts thereof, e.g. expanding or contracting
- G06T3/4046—Scaling of whole images or parts thereof, e.g. expanding or contracting using neural networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/764—Arrangements for image or video recognition or understanding using pattern recognition or machine learning using classification, e.g. of video objects
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/766—Arrangements for image or video recognition or understanding using pattern recognition or machine learning using regression, e.g. by projecting features on hyperplanes
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/82—Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/98—Detection or correction of errors, e.g. by rescanning the pattern or by human intervention; Evaluation of the quality of the acquired patterns
- G06V10/993—Evaluation of the quality of the acquired pattern
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V40/00—Recognition of biometric, human-related or animal-related patterns in image or video data
- G06V40/10—Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
- G06V40/16—Human faces, e.g. facial parts, sketches or expressions
- G06V40/168—Feature extraction; Face representation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/30—Subject of image; Context of image processing
- G06T2207/30196—Human being; Person
- G06T2207/30201—Face
Abstract
A face image editing method based on reinforcement learning includes the steps of: acquiring a face image to be edited and extracting a first facial attribute; mapping the face image to be edited through an encoding module to obtain an image hidden variable; acquiring a pre-trained generator; inputting the image hidden variable into the generator to generate a first face image; inputting the first face image into a trained image evaluation model to obtain an evaluation result; inputting the first facial attribute and the evaluation result into a trained reinforcement learning module to generate a second facial attribute; inputting the image hidden variable and the second facial attribute into a continuous normalizing flow module to generate a hidden variable of a target face image; and inputting the hidden variable of the target face image into the generator to generate a second face image. The invention realizes automatic adjustment of facial attributes through reinforcement learning and improves the aesthetic quality of face images.
Description
Technical Field
The invention relates to the technical field of image editing, in particular to a face image editing method based on reinforcement learning.
Background
The pursuit of beauty is part of human nature and an objective need: it satisfies people's emotional needs and brings pleasure. Images are important carriers for conveying information and expressing emotion; the aesthetic appeal of different images varies greatly, and image quality shapes the viewer's experience. Artificial intelligence has developed rapidly in perceiving and evaluating beauty, but there is still ample room for progress in creating beauty.
With the popularity of social applications, people want to upload more attractive personal images to increase their social appeal, and more image-editing applications are being put into practical production and research. The shift from posting unedited photos to retouching and beautifying images before uploading shows that aesthetic expectations keep rising. Existing image beautification software intelligently edits real face images according to templated standards and provides personalized beautification guidance; it is widely needed in daily life, has enormous potential in professional fields such as medical cosmetology, print-advertisement design, and image post-processing, and has a bright outlook.
Therefore, how to select facial attributes according to different aesthetic requirements and edit face images of high aesthetic quality is a problem to be solved by those skilled in the art.
Disclosure of Invention
In view of this, the invention adjusts the underlying facial semantic attributes of the StyleGAN generator by a reinforcement learning method, and then edits the image to obtain a beautified face image that better conforms to human aesthetics.
In order to achieve the above purpose, the present invention adopts the following technical scheme:
the face image editing method based on reinforcement learning is characterized by comprising the following steps:
acquiring a face image to be edited and extracting a first face attribute;
mapping the face image to be edited through an encoding module to obtain an image hidden variable;
acquiring a pre-trained generator;
inputting the image hidden variable into the generator to generate a first face image;
selecting an attribute to be edited from the first facial attribute, and inputting a first face image into a trained image evaluation model to obtain an evaluation result;
inputting the attribute to be edited and the evaluation result to a trained reinforcement learning module to generate a second facial attribute;
inputting the image hidden variable and the second facial attribute into a continuous normalizing flow module to generate a hidden variable of a target face image;
and inputting the hidden variable of the target face image into the generator to generate a second face image.
Further, the step of inputting the first face image into a trained image evaluation model to obtain an evaluation result includes:
preprocessing the first face image, inputting the preprocessed first face image into a backbone network for feature extraction to obtain a feature vector;
inputting the feature vector to a channel attention module to obtain a three-dimensional vector, and expanding the three-dimensional vector into a one-dimensional vector after activation and self-adaptive average pooling;
and inputting the one-dimensional vector into a regression network, and outputting an evaluation result.
Further, the training step of the image evaluation model includes:
training a classification network: inputting training data into a backbone network to extract features, and classifying the features through the classification network; during classification training, the loss value is back-propagated through a cross-entropy function, while no gradient is propagated back to the parameters of the regression network;
performing regression training on the data on top of the classification network to extract more aesthetic features, at which point only the regression network is unfrozen, and the parameters of the backbone network and the classification network are frozen.
Further, the training step of the reinforcement learning module specifically includes:
initializing the facial attribute, and generating a plurality of groups of corresponding training images according to the selected attribute;
respectively calculating each group of training images according to a preset reinforcement learning strategy, generating new face attributes, and generating new face images corresponding to the new face attributes;
and evaluating the new face image through the image evaluation model, and updating the gradient according to the evaluation result by adopting a soft policy gradient strategy, iterating until convergence.
Further, the reinforcement learning strategy $\pi^*$ is:

$$\pi^* = \arg\max_{\pi} \sum_{t} \mathbb{E}_{(s_t, a_t) \sim \rho_\pi}\left[ r(s_t, a_t) + \alpha \mathcal{H}\big(\pi(\cdot \mid s_t)\big) \right]$$

wherein $s_t$ is the state vector of the $t$-th iteration, $a_t$ is the action vector of the $t$-th iteration, and the reward $r$ satisfies $0 \le r \le 1$; $\mathcal{H}$ is the entropy, and $\alpha$ is a hyperparameter controlling the relative importance of the entropy term in the objective; $\rho_\pi$ is the state-action distribution induced by the policy.
Further, the soft policy gradient is calculated as:

$$\nabla_\theta J(\theta) = \mathbb{E}_{(s_t, a_t) \sim \rho_\pi}\left[ \nabla_\theta \log \pi_\theta(a_t \mid s_t)\big( Q(s_t, a_t) - b(s_t) - \alpha \log \pi_\theta(a_t \mid s_t) \big) \right]$$

wherein $\alpha$ is a temperature hyperparameter controlling the exploration range, $Q(s_t, a_t)$ is the Q value of the policy $\pi_\theta$, and $b(s_t)$ is a state-dependent baseline; $\theta$ is the learnable parameter of the policy $\pi_\theta$.
Further, the reinforcement learning module comprises a feature extraction unit and a gated recurrent unit;
the feature extraction unit extracts image features from the initial image and inputs them into the gated recurrent unit;
and the end of the hidden layer in the gated recurrent unit is connected to a fully connected layer, and the value of the selected attribute is output through the fully connected layer.
Further, the feature extraction unit is a ResNet18 network.
A neural network model, comprising: an encoder, a generator, an image evaluation network, and a reinforcement learning network;
the encoder converts the image to be edited into a hidden space vector and then carries out image reconstruction through a generator; during reconstruction, the generator generates facial attributes according to the hidden space vector and obtains a reconstructed image;
the image evaluation network evaluates the reconstructed image;
the reinforcement learning network selects among the facial attributes generated in the reconstruction process, optimizes them according to the evaluation result to obtain the optimized facial attribute, and inputs the optimized facial attribute into the generator again to generate a final image.
A face image editing system based on reinforcement learning comprises an image acquisition module, an image editing module and an image generation module;
the image acquisition module is used for acquiring a face image to be edited;
the image editing module is used for selecting editing attributes according to the face image to be edited;
the image generation module is used for performing a preliminary evaluation on the face image to be edited, optimizing the selected editing attribute according to the evaluation result to obtain the optimized facial attribute, and generating an editing result according to the optimized facial attribute.
The invention has the beneficial effects that:
compared with the prior art, the invention discloses a face image editing method based on reinforcement learning, which realizes automatic adjustment of the image face attribute through reinforcement learning and generates an aesthetic high-quality face image conforming to image evaluation; the invention enhances the face image through reinforcement learning, and provides a novel model optimization method of the soft gradient strategy with a self-criticizing training mode in reinforcement learning.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings that are required to be used in the embodiments or the description of the prior art will be briefly described below, and it is obvious that the drawings in the following description are only embodiments of the present invention, and that other drawings can be obtained according to the provided drawings without inventive effort for a person skilled in the art.
Fig. 1 is a schematic diagram of a face image editing method based on reinforcement learning according to an embodiment of the present invention;
FIG. 2 is a schematic diagram of an image evaluation process according to an embodiment of the present invention;
fig. 3 is a schematic diagram of a reinforcement learning module according to an embodiment of the invention.
Detailed Description
The following description of the embodiments of the present invention will be made clearly and completely with reference to the accompanying drawings, in which it is apparent that the embodiments described are only some embodiments of the present invention, but not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
Referring to fig. 1, an embodiment of the present invention provides a face image editing method based on reinforcement learning, including the steps of:
s1: acquiring a face image to be edited and extracting a first face attribute; mapping the face image to be edited by an encoding module to obtain an image hidden variable; the encoding module adopts a pixel2style2pixel framework, which is hereinafter referred to as pSp. pSp framework is based on a new type of encoder network that generates a series of style vectors. The first facial attribute includes posture, hair volume, beard, age, expression, and the like.
S2: acquiring a pre-trained generator; inputting the image hidden variable into a generator to generate a first face image; specifically, a StyleGAN network model structure is used as a generator to generate an image with a resolution of 1024×1024. After the pSp encoder converts the face image to be edited into the style vector in the hidden space, the generator is used for reconstruction.
S3: and selecting the attribute to be edited from the first facial attribute, and inputting the first face image into the trained image evaluation model to obtain an evaluation result.
In one embodiment, the image evaluation model consists of a backbone network and a regression network: the backbone network extracts image features, and the regression network performs regression on these features and outputs a final value as the evaluation result. Specifically, a channel attention module is placed between the backbone network and the regression network. The evaluation process is as follows: the face image is preprocessed by rotating, cropping, and scaling it to 800×800 resolution centered on the face; image features are extracted with a pre-built neural network and the extracted features are convolved; the convolution result is normalized and passed through an activation function; the activated features are input to an ECA attention module to obtain a three-dimensional tensor, which, after activation and adaptive average pooling, is flattened into a one-dimensional vector; the one-dimensional vector is input into the regression network, which outputs the image quality parameter, i.e. the aesthetic score, ranging from 0 to 1. EfficientNet-B4 can be used as the pre-trained model to extract features of the image to be evaluated, followed by a convolution operation with kernel size 3.
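To make the data flow concrete, the following PyTorch sketch shows one possible assembly of the evaluation model described above. It is a minimal sketch under assumptions: the torchvision EfficientNet-B4 backbone, the 512-channel width after the 3×3 convolution, and this particular ECA implementation are illustrative choices, not the claimed implementation.

```python
import torch
import torch.nn as nn
from torchvision.models import efficientnet_b4

class ECA(nn.Module):
    """Efficient Channel Attention: a 1-D convolution over pooled channel descriptors."""
    def __init__(self, kernel_size=3):
        super().__init__()
        self.pool = nn.AdaptiveAvgPool2d(1)
        self.conv = nn.Conv1d(1, 1, kernel_size, padding=kernel_size // 2, bias=False)

    def forward(self, x):                                  # x: (B, C, H, W)
        w = self.pool(x).squeeze(-1).transpose(1, 2)       # (B, 1, C)
        w = torch.sigmoid(self.conv(w)).transpose(1, 2).unsqueeze(-1)
        return x * w                                       # channel-reweighted features

class AestheticModel(nn.Module):
    def __init__(self):
        super().__init__()
        self.backbone = efficientnet_b4(weights="IMAGENET1K_V1").features  # pre-trained extractor
        self.conv = nn.Conv2d(1792, 512, kernel_size=3, padding=1)  # convolution with kernel size 3
        self.bn = nn.BatchNorm2d(512)                      # normalization of the convolved result
        self.eca = ECA()                                   # channel attention module
        self.pool = nn.AdaptiveAvgPool2d(1)                # adaptive average pooling
        self.regressor = nn.Sequential(nn.Flatten(), nn.Linear(512, 1), nn.Sigmoid())

    def forward(self, x):                                  # x: (B, 3, 800, 800) face-centered crop
        f = self.backbone(x)                               # backbone feature extraction
        f = torch.relu(self.bn(self.conv(f)))              # normalize, then activate
        f = self.eca(f)                                    # three-dimensional attended tensor
        return self.regressor(self.pool(f))                # aesthetic score in [0, 1]
```

An 800×800 face crop passed through `AestheticModel` yields a single score in [0, 1], matching the evaluation output described above.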
In this embodiment, the training steps of the image evaluation model are:
training a classification network: training data are input into the backbone network to extract features, which are classified through the classification network; during classification training, the loss value is back-propagated through a cross-entropy function, while no gradient is propagated back to the parameters of the regression network. Regression training is then performed on the data on top of the classification network to extract more aesthetic features, at which point only the regression network is unfrozen, and the parameters of the backbone network and the classification network are frozen.
Specifically, the classification network is trained with the aesthetic score binned at a step size of 0.1; during classification training the loss value is back-propagated through a cross-entropy function, and the parameters of the regression network are kept from being updated.
In the classification training, the batch size was set to 32 and the initial learning rate to 0.001; the learning rate was automatically halved whenever the accuracy failed to improve for several consecutive epochs. The Adam optimization algorithm was selected, with the running-average coefficients for the gradient and its square set to (0.98, 0.999) and the weight decay coefficient set to 0.0001. In the regression training, the batch size is set to 64; if the mean squared error does not decrease after multiple training epochs, the learning rate is likewise automatically halved.
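A minimal sketch of this two-stage regime is shown below, continuing the `AestheticModel` sketch above. The data loaders, the ten-class binning of scores at a 0.1 step, and the exact set of frozen parameters are assumptions for illustration.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F
from torch.optim import Adam

model = AestheticModel()                         # from the sketch above
cls_head = nn.Linear(512, 10)                    # scores binned at 0.1 -> 10 classes (assumed)

def trunk(x):                                    # shared path up to the pooled 512-d vector
    f = torch.relu(model.bn(model.conv(model.backbone(x))))
    return model.pool(model.eca(f)).flatten(1)

# Stage 1: classification pre-training; the regression head receives no gradient.
opt1 = Adam(list(model.backbone.parameters()) + list(cls_head.parameters()),
            lr=1e-3, betas=(0.98, 0.999), weight_decay=1e-4)
for imgs, scores in cls_loader:                  # assumed DataLoader, batch size 32
    labels = (scores * 10).clamp(max=9).long()   # bin [0, 1] scores with step 0.1
    loss = F.cross_entropy(cls_head(trunk(imgs)), labels)
    opt1.zero_grad(); loss.backward(); opt1.step()
    # halve the learning rate when accuracy stalls over consecutive epochs (omitted)

# Stage 2: freeze the backbone and classifier; release only the regression head.
for p in list(model.backbone.parameters()) + list(cls_head.parameters()):
    p.requires_grad = False
opt2 = Adam(model.regressor.parameters(), lr=1e-3, betas=(0.98, 0.999), weight_decay=1e-4)
for imgs, scores in reg_loader:                  # assumed DataLoader, batch size 64
    loss = F.mse_loss(model(imgs).squeeze(1), scores)  # halve LR if MSE plateaus (omitted)
    opt2.zero_grad(); loss.backward(); opt2.step()
```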
S4: inputting the attribute to be edited and the evaluation result into the trained reinforcement learning module to generate the second facial attribute. As shown in fig. 3, the reinforcement learning module comprises a feature extraction unit and a gated recurrent unit; the feature extraction unit extracts image features from the initial image and inputs them into the gated recurrent unit; the end of the hidden layer in the gated recurrent unit is connected to a fully connected layer, through which the value of the selected attribute is output; the feature extraction unit is a ResNet18 network.
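The following sketch outlines one plausible form of this module, assuming the attribute values are discretized into bins (`n_bins` is an illustrative choice) and that the GRU proposes one attribute value per step:

```python
import torch
import torch.nn as nn
from torchvision.models import resnet18

class AttributePolicy(nn.Module):
    """ResNet18 encodes the current image; a GRU proposes one attribute value per step."""
    def __init__(self, n_bins=21):               # assumed discretization of attribute values
        super().__init__()
        cnn = resnet18(weights="IMAGENET1K_V1")  # pre-trained on ImageNet
        self.encoder = nn.Sequential(*list(cnn.children())[:-1])  # drop the final FC -> 512-d
        self.gru = nn.GRU(input_size=512, hidden_size=512, batch_first=True)  # hidden size 512
        self.fc = nn.Linear(512, n_bins)         # fully connected layer at the hidden-layer end

    def forward(self, image, h=None):
        feat = self.encoder(image).flatten(1).unsqueeze(1)   # (B, 1, 512)
        out, h = self.gru(feat, h)
        logits = self.fc(out.squeeze(1))         # distribution over values of the selected attribute
        return torch.distributions.Categorical(logits=logits), h
```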
The training step of the reinforcement learning module comprises the following steps:
s41: initializing the facial attributes and generating a plurality of groups of training images according to the selected attributes.
S42: each group of training images computes the value of the corresponding attribute, i.e. the new facial attribute, through the reinforcement learning module. Each group of training images is processed separately under a preset reinforcement learning strategy to generate new facial attributes, and new face images corresponding to the new facial attributes are generated. In reinforcement learning, the agent interacts continuously with the environment, modeled as a Markov decision process $(S, A, P, r, \rho_0, \gamma)$, where $S$ and $A$ are the state and action spaces, $P$ is the state transition probability, $r$ is the reward, $\rho_0$ is the distribution of the initial state $s_0$, and the discount factor $\gamma$ determines how strongly the agent is affected by distant future states. The goal is to learn a stochastic policy $\pi$ such that the expected reward of the trajectory is maximized when actions are taken as $a_t \sim \pi(\cdot \mid s_t)$.
The method comprises the following specific steps:
First, the attribute value corresponding to each selected feature dimension is set to 0. The selected feature dimensions are then explored separately; the exploration process builds several exploration trajectories according to preset values. To this end, 5 initial images are obtained by cloning the initial attributes, each image corresponding to the input of one exploration trajectory; each trajectory explores one attribute, computes a new attribute value through exploration, and generates an image. Finally, image quality evaluation yields 5 scores, and the state-dependent baseline $b(s_t)$ is set to the average of the five scores, so that the reinforcement learning module is updated to raise the probability of trajectories with higher aesthetic scores; the baseline effectively reduces the variance during learning and thus improves the stability of training. In this step, the discount factor $\gamma$ is set to 1. In addition, the agent is rewarded only at the terminal state; intermediate states of the trajectory receive no reward. The reward value is the score of the image after the series of adjustments. Gradient descent is performed with the Adam optimizer at a learning rate of 1e-5, without L2 regularization of the parameters. The ResNet18 module is pre-trained on ImageNet. For the gated recurrent unit, the hidden-state size is 512, the same as the output of the ResNet18 module.
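The exploration step can be sketched as follows; `init_attrs`, `selected_dims`, `generate_image` (wrapping the flow module and the generator), `bin_to_value`, and `aesthetic` are assumed helpers standing in for the modules described above:

```python
import torch

K = 5                                            # five cloned exploration trajectories
attrs = init_attrs.repeat(K, 1)                  # init_attrs: (1, n_attrs) initial attributes
attrs[:, selected_dims] = 0.0                    # selected attribute values start at 0

policy = AttributePolicy()                       # from the sketch above
log_probs, entropies, h = [], [], None
for dim in selected_dims:                        # each trajectory explores one attribute per step
    dist, h = policy(generate_image(attrs), h)   # generate_image: flow module + generator (assumed)
    a = dist.sample()                            # sample an attribute-value bin
    attrs[:, dim] = bin_to_value(a)              # assumed mapping from bin index to value
    log_probs.append(dist.log_prob(a))
    entropies.append(dist.entropy())

# Reward only at the terminal state; discount factor gamma = 1.
scores = aesthetic(generate_image(attrs)).squeeze(1)  # five aesthetic scores
baseline = scores.mean()                         # b(s): mean of the five scores
advantage = scores - baseline                    # self-critical variance reduction
```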
S43: evaluating the new face image through the image evaluation model, and updating the gradient according to the evaluation result with the soft policy gradient strategy, iterating until convergence.
In this embodiment, a maximum entropy reinforcement learning framework is used during reinforcement learning to ensure the randomness of exploration and to prevent premature convergence to a suboptimal strategy. The objective of reinforcement learning is set as the policy:

$$\pi^* = \arg\max_{\pi} \sum_{t} \mathbb{E}_{(s_t, a_t) \sim \rho_\pi}\left[ r(s_t, a_t) + \alpha \mathcal{H}\big(\pi(\cdot \mid s_t)\big) \right]$$

where $S$ and $A$ are, respectively, the set of all possible states the agent may encounter and the set of all possible actions it may generate, and the discount factor satisfies $0 \le \gamma \le 1$. $\mathcal{H}$ is the entropy, and $\alpha$ is a hyperparameter, set to 0.01, that controls the relative importance of the entropy term in the objective.
The soft strategy gradient formula is as follows:
wherein,is a temperature super parameter for controlling the detection range, +.>Is policy +>Value of->Is a state-dependent baseline and can be any function as long as it does not change with motion.
In one embodiment, a self-critical training mode is incorporated into the soft policy gradient update. The Monte Carlo method is used to compute the Q value for reinforcement learning, where the Q value is the aesthetic score given by the image quality assessment model. In this embodiment, the reinforcement learning module uses a batch size of 16, and each gradient update step uses a batch size of 80.
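Continuing the exploration sketch above, one way to realize this self-critical soft policy gradient update with the stated hyperparameters (α = 0.01, Adam, learning rate 1e-5, no L2 regularization) is:

```python
import torch

def soft_pg_loss(log_probs, entropies, scores, baseline, alpha=0.01):
    """One way to realize the soft policy gradient with a self-critical baseline:
    REINFORCE on (Q - b(s) - alpha * log pi), plus an explicit entropy bonus."""
    logp = torch.stack(log_probs).sum(0)         # sum_t log pi_theta(a_t | s_t), per trajectory
    ent = torch.stack(entropies).sum(0)          # summed per-step policy entropies
    soft_adv = (scores - baseline - alpha * logp).detach()  # Q from the Monte Carlo rollout
    return -(logp * soft_adv + alpha * ent).mean()

opt = torch.optim.Adam(policy.parameters(), lr=1e-5, weight_decay=0.0)  # no L2 regularization
loss = soft_pg_loss(log_probs, entropies, scores, baseline)
opt.zero_grad(); loss.backward(); opt.step()
```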
S5: inputting the image hidden variables and the new facial attributes into a continuous normalizing flow (CNF) module to generate the hidden variables of the target face image;
S6: inputting the hidden variables of the target face image into the generator to generate the final face image.
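Putting steps S1-S6 together, an inference-time sketch might look as follows; every module interface (`encoder`, `generator`, `cnf`, `aesthetic`, and the helpers `extract_attributes` and `bin_to_value`) is an assumed stand-in for the components described above, and the argmax selection follows the inference behavior described below:

```python
import torch

@torch.no_grad()
def edit_face(img, selected_dims, encoder, generator, cnf, policy, aesthetic):
    """Steps S1-S6 end to end; every module interface here is an assumed stand-in."""
    w = encoder(img)                             # S1: pSp encoder -> latent style vectors
    first = generator(w)                         # S2: reconstruct the first face image
    _ = aesthetic(first)                         # S3: evaluate the first face image
    attrs, h = extract_attributes(first), None   # assumed facial-attribute extractor
    for dim in selected_dims:                    # S4: at inference, take the argmax action
        dist, h = policy(first, h)
        attrs[:, dim] = bin_to_value(dist.probs.argmax(-1))
    w_target = cnf(w, attrs)                     # S5: continuous normalizing flow -> target latent
    return generator(w_target)                   # S6: final edited face image
```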
Example 2
Based on the same inventive concept, an embodiment of the present invention discloses a neural network model for implementing the image editing method in embodiment 1, comprising: an encoder, a generator, an image evaluation network, and a reinforcement learning network. The encoder converts the image to be edited into hidden space vectors, and image reconstruction is then carried out through the generator; during reconstruction, the generator generates facial attributes according to the hidden space vectors and obtains a reconstructed image. The image evaluation network evaluates the reconstructed image. The reinforcement learning network selects among the facial attributes generated during reconstruction, optimizes them according to the evaluation result to obtain the optimized facial attributes, and inputs the optimized facial attributes into the generator again to generate the final image.
Example 3
Based on the same inventive concept, the embodiment of the invention discloses a face image editing system based on reinforcement learning, which is characterized by comprising an image acquisition module, an image editing module and an image generation module; the image acquisition module is used for acquiring a face image to be edited; the image editing module is used for selecting editing attributes according to the face image to be edited; the image generation module is used for carrying out preliminary evaluation according to the face image to be edited, optimizing the selected editing attribute according to the evaluation result and obtaining the optimized face attribute; and the method is used for generating an editing result according to the optimized facial attribute.
According to the invention, control over the facial attributes of a face image is realized through reinforcement learning, and facial attributes can be independently selected and edited according to different aesthetic requirements to obtain a high-quality face image. In the inference process, instead of sampling from the softmax layer of the reinforcement learning module, the action with the highest probability is selected, and the agent automatically sets values for the selected feature dimensions in sequence to obtain the new facial attributes. The image hidden variables and the new facial attributes are input into the continuous normalizing flow module to generate the hidden variables of the target face image, which are then input into the generator to obtain the edited face image.
In this specification, the embodiments are described in a progressive manner, each embodiment focusing on its differences from the others; for identical or similar parts, the embodiments may be referred to one another. Since the device disclosed in an embodiment corresponds to the method disclosed in an embodiment, its description is relatively brief, and relevant details can be found in the description of the method.
The previous description of the disclosed embodiments is provided to enable any person skilled in the art to make or use the present invention. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other embodiments without departing from the spirit or scope of the invention. Thus, the present invention is not intended to be limited to the embodiments shown herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.
Claims (7)
1. The face image editing method based on reinforcement learning is characterized by comprising the following steps:
acquiring a face image to be edited and extracting a first face attribute;
mapping the face image to be edited through an encoding module to obtain an image hidden variable;
acquiring a pre-trained generator;
inputting the image hidden variable into the generator to generate a first face image;
selecting an attribute to be edited from the first facial attribute, and inputting a first face image into a trained image evaluation model to obtain an evaluation result;
inputting the attribute to be edited and the evaluation result to a trained reinforcement learning module to generate a second facial attribute;
inputting the image hidden variable and the second facial attribute into a continuous normalizing flow module to generate a hidden variable of a target face image;
inputting the hidden variable of the target face image into the generator to generate a second face image;
the training step of the reinforcement learning module comprises the following specific steps:
initializing the facial attribute, and generating a plurality of groups of corresponding training images according to the selected attribute;
respectively calculating each group of training images according to a preset reinforcement learning strategy, generating new face attributes, and generating new face images corresponding to the new face attributes;
evaluating the new face image through the image evaluation model, and updating the gradient according to the evaluation result by adopting a soft policy gradient strategy, iterating until convergence;
the reinforcement learning strategy $\pi^*$ is:

$$\pi^* = \arg\max_{\pi} \sum_{t} \mathbb{E}_{(s_t, a_t) \sim \rho_\pi}\left[ r(s_t, a_t) + \alpha \mathcal{H}\big(\pi(\cdot \mid s_t)\big) \right]$$

wherein $s_t$ is the state vector of the $t$-th iteration, $a_t$ is the action vector of the $t$-th iteration, and the reward $r$ satisfies $0 \le r \le 1$; $\mathcal{H}$ is the entropy, and $\alpha$ is a hyperparameter controlling the relative importance of the entropy term in the objective; $\rho_\pi$ is the state-action distribution induced by the policy;

the soft policy gradient is calculated as:

$$\nabla_\theta J(\theta) = \mathbb{E}_{(s_t, a_t) \sim \rho_\pi}\left[ \nabla_\theta \log \pi_\theta(a_t \mid s_t)\big( Q(s_t, a_t) - b(s_t) - \alpha \log \pi_\theta(a_t \mid s_t) \big) \right]$$

wherein $\alpha$ is a temperature hyperparameter controlling the exploration range, $Q(s_t, a_t)$ is the Q value of the policy, and $b(s_t)$ is a state-dependent baseline; $\theta$ is the learnable parameter of the policy $\pi_\theta$.
2. The method for editing a face image based on reinforcement learning according to claim 1, wherein the step of inputting the first face image into a trained image evaluation model to obtain an evaluation result comprises:
preprocessing the first face image, inputting the preprocessed first face image into a backbone network for feature extraction to obtain a feature vector;
inputting the feature vector to a channel attention module to obtain a three-dimensional vector, and expanding the three-dimensional vector into a one-dimensional vector after activation and self-adaptive average pooling;
and inputting the one-dimensional vector into a regression network, and outputting an evaluation result.
3. The method for editing a face image based on reinforcement learning according to claim 1, wherein the training step of the image evaluation model comprises:
training a classification network: inputting training data into a backbone network to extract features, and classifying the features through the classification network; during classification training, the loss value is back-propagated through a cross-entropy function, while no gradient is propagated back to the parameters of the regression network;
performing regression training on the data on top of the classification network to extract more aesthetic features, at which point only the regression network is unfrozen, and the parameters of the backbone network and the classification network are frozen.
4. The face image editing method based on reinforcement learning according to claim 1, wherein the reinforcement learning module comprises a feature extraction unit and a gated recurrent unit;
the feature extraction unit extracts image features from the initial image and inputs them into the gated recurrent unit;
and the end of the hidden layer in the gated recurrent unit is connected to a fully connected layer, and the value of the selected attribute is output through the fully connected layer.
5. The face image editing method based on reinforcement learning according to claim 4, wherein the feature extraction unit is a ResNet18 network.
6. A neural network model for implementing the image editing method of any of claims 1-5, comprising: an encoder, a generator, an image evaluation network, and a reinforcement learning network;
the encoder converts the image to be edited into a hidden space vector and then carries out image reconstruction through a generator; during reconstruction, the generator generates facial attributes according to the hidden space vector and obtains a reconstructed image;
the image evaluation network evaluates the reconstructed image;
the reinforcement learning network selects among the facial attributes generated in the reconstruction process, optimizes them according to the evaluation result to obtain the optimized facial attribute, and inputs the optimized facial attribute into the generator again to generate a final image.
7. A face image editing system based on reinforcement learning, which is characterized by adopting the neural network model in claim 6 and comprising an image acquisition module, an image editing module and an image generation module;
the image acquisition module is used for acquiring a face image to be edited;
the image editing module is used for selecting editing attributes according to the face image to be edited;
the image generation module is used for carrying out preliminary evaluation according to the face image to be edited, optimizing the selected editing attribute according to the evaluation result and obtaining the optimized face attribute; and the method is used for generating an editing result according to the optimized facial attribute.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202310908009.1A CN116630147B (en) | 2023-07-24 | 2023-07-24 | Face image editing method based on reinforcement learning |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202310908009.1A CN116630147B (en) | 2023-07-24 | 2023-07-24 | Face image editing method based on reinforcement learning |
Publications (2)
Publication Number | Publication Date |
---|---|
CN116630147A CN116630147A (en) | 2023-08-22 |
CN116630147B true CN116630147B (en) | 2024-02-06 |
Family
ID=87617445
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202310908009.1A Active CN116630147B (en) | 2023-07-24 | 2023-07-24 | Face image editing method based on reinforcement learning |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN116630147B (en) |
Citations (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2014001610A1 (en) * | 2012-06-25 | 2014-01-03 | Nokia Corporation | Method, apparatus and computer program product for human-face features extraction |
CN111260754A (en) * | 2020-04-27 | 2020-06-09 | 腾讯科技(深圳)有限公司 | Face image editing method and device and storage medium |
CN112800893A (en) * | 2021-01-18 | 2021-05-14 | 南京航空航天大学 | Human face attribute editing method based on reinforcement learning |
CN112907725A (en) * | 2021-01-22 | 2021-06-04 | 北京达佳互联信息技术有限公司 | Image generation method, image processing model training method, image processing device, and image processing program |
CN113221794A (en) * | 2021-05-24 | 2021-08-06 | 厦门美图之家科技有限公司 | Training data set generation method, device, equipment and storage medium |
CN113255551A (en) * | 2021-06-04 | 2021-08-13 | 广州虎牙科技有限公司 | Training, face editing and live broadcasting method of face editor and related device |
WO2022135013A1 (en) * | 2020-12-24 | 2022-06-30 | 百果园技术(新加坡)有限公司 | Facial attribute editing method and system, and electronic device and storage medium |
- 2023-07-24: application CN202310908009.1A filed in CN; granted as patent CN116630147B (status: Active)
Also Published As
Publication number | Publication date |
---|---|
CN116630147A (en) | 2023-08-22 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
Liu et al. | More control for free! image synthesis with semantic diffusion guidance | |
CN109036465B (en) | Speech emotion recognition method | |
US11354792B2 (en) | System and methods for modeling creation workflows | |
CN106503654A (en) | A kind of face emotion identification method based on the sparse autoencoder network of depth | |
CN111553467B (en) | Method for realizing general artificial intelligence | |
CN113435211B (en) | Text implicit emotion analysis method combined with external knowledge | |
CN108197533A (en) | A kind of man-machine interaction method based on user's expression, electronic equipment and storage medium | |
US11823490B2 (en) | Non-linear latent to latent model for multi-attribute face editing | |
CN108985464A (en) | The continuous feature generation method of face for generating confrontation network is maximized based on information | |
US20220101144A1 (en) | Training a latent-variable generative model with a noise contrastive prior | |
Dogan et al. | Semi-supervised image attribute editing using generative adversarial networks | |
Zhai et al. | Asian female facial beauty prediction using deep neural networks via transfer learning and multi-channel feature fusion | |
CN108846073A (en) | A kind of man-machine emotion conversational system of personalization | |
WO2021226731A1 (en) | Method for imitating human memory to realize universal machine intelligence | |
CN117409109A (en) | Image generation method and data processing method for image generation | |
Chen et al. | CNN-based broad learning with efficient incremental reconstruction model for facial emotion recognition | |
CN117576257A (en) | Method, terminal and storage medium for editing face image through text | |
Ye et al. | Multi-style transfer and fusion of image’s regions based on attention mechanism and instance segmentation | |
Xia et al. | Semantic translation of face image with limited pixels for simulated prosthetic vision | |
Feng et al. | Improved visual story generation with adaptive context modeling | |
WO2021223042A1 (en) | Method for implementing machine intelligence similar to human intelligence | |
CN116630147B (en) | Face image editing method based on reinforcement learning | |
Tsapatsoulis et al. | A fuzzy system for emotion classification based on the MPEG-4 facial definition parameter set | |
KR20190129698A (en) | Electronic apparatus for compressing recurrent neural network and method thereof | |
Liu et al. | Multimodal face aging framework via learning disentangled representation |
Legal Events
Date | Code | Title | Description
---|---|---|---
| PB01 | Publication | |
| SE01 | Entry into force of request for substantive examination | |
| GR01 | Patent grant | |