
CN116630147B - Face image editing method based on reinforcement learning - Google Patents

Face image editing method based on reinforcement learning

Info

Publication number
CN116630147B
Authority
CN
China
Prior art keywords
image
face image
reinforcement learning
attribute
network
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202310908009.1A
Other languages
Chinese (zh)
Other versions
CN116630147A (en)
Inventor
金鑫
赵姝
章乐
赵鑫
邓强
肖超恩
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Hidden Computing Technology Co ltd
Original Assignee
Beijing Hidden Computing Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Hidden Computing Technology Co ltd filed Critical Beijing Hidden Computing Technology Co ltd
Priority to CN202310908009.1A priority Critical patent/CN116630147B/en
Publication of CN116630147A publication Critical patent/CN116630147A/en
Application granted granted Critical
Publication of CN116630147B publication Critical patent/CN116630147B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T3/00Geometric image transformations in the plane of the image
    • G06T3/04Context-preserving transformations, e.g. by using an importance map
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • G06N3/0455Auto-encoder networks; Encoder-decoder networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • G06N3/092Reinforcement learning
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T3/00Geometric image transformations in the plane of the image
    • G06T3/40Scaling of whole images or parts thereof, e.g. expanding or contracting
    • G06T3/4046Scaling of whole images or parts thereof, e.g. expanding or contracting using neural networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/764Arrangements for image or video recognition or understanding using pattern recognition or machine learning using classification, e.g. of video objects
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/766Arrangements for image or video recognition or understanding using pattern recognition or machine learning using regression, e.g. by projecting features on hyperplanes
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/82Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/98Detection or correction of errors, e.g. by rescanning the pattern or by human intervention; Evaluation of the quality of the acquired patterns
    • G06V10/993Evaluation of the quality of the acquired pattern
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16Human faces, e.g. facial parts, sketches or expressions
    • G06V40/168Feature extraction; Face representation
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/30Subject of image; Context of image processing
    • G06T2207/30196Human being; Person
    • G06T2207/30201Face
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02TCLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
    • Y02T10/00Road transport of goods or passengers
    • Y02T10/10Internal combustion engine [ICE] based vehicles
    • Y02T10/40Engine management systems

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Health & Medical Sciences (AREA)
  • Evolutionary Computation (AREA)
  • General Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Software Systems (AREA)
  • Multimedia (AREA)
  • Computing Systems (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Databases & Information Systems (AREA)
  • Medical Informatics (AREA)
  • Quality & Reliability (AREA)
  • Oral & Maxillofacial Surgery (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Molecular Biology (AREA)
  • General Engineering & Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Human Computer Interaction (AREA)
  • Image Analysis (AREA)
  • Image Processing (AREA)

Abstract

A face image editing method based on reinforcement learning includes the steps of: acquiring a face image to be edited and extracting a first facial attribute; mapping the face image to be edited through an encoding module to obtain an image hidden variable; acquiring a pre-trained generator; inputting the image hidden variable into the generator to generate a first face image; inputting the first face image into a trained image evaluation model to obtain an evaluation result; inputting the first facial attribute and the evaluation result into a trained reinforcement learning module to generate a second facial attribute; inputting the image hidden variable and the second facial attribute into a continuous normalizing flow module to generate the hidden variable of a target face image; and inputting the hidden variable of the target face image into the image generation module to generate a second face image. The invention automatically adjusts facial attributes through reinforcement learning and improves the aesthetic quality of the face image.

Description

Face image editing method based on reinforcement learning
Technical Field
The invention relates to the technical field of image editing, in particular to a face image editing method based on reinforcement learning.
Background
The pursuit of beauty is human nature and an objective need: it satisfies people's emotional needs and brings pleasure. Images are important carriers for conveying information and expressing emotion, yet their aesthetic appeal varies widely, and image quality shapes how audiences feel. Artificial intelligence has developed rapidly in perceiving and evaluating beauty, but there is still much room for progress in creating it.
With the popularization of social applications, people hope to upload more attractive personal images, increasing their appeal when making friends, and more image-retouching applications are entering practical production and research. Where people once posted original photos, most now choose to retouch and beautify images before uploading, which shows that aesthetic expectations keep rising. Existing image beautification software intelligently edits real face images according to templated standards and provides personalized beautification guidance; it has broad application demand in daily life, huge development potential in professional fields such as medical cosmetology, print advertisement design, and image post-processing, and a bright prospect.
Therefore, how to select facial attributes and edit facial images with high aesthetic quality according to different aesthetic requirements is a problem that needs to be solved by those skilled in the art.
Disclosure of Invention
In view of this, the invention adjusts the underlying facial semantic attributes of a StyleGAN generator by a reinforcement learning method, and then edits the face image to obtain a beautified result that better conforms to human aesthetics.
In order to achieve the above purpose, the present invention adopts the following technical scheme:
the face image editing method based on reinforcement learning is characterized by comprising the following steps:
acquiring a face image to be edited and extracting a first face attribute;
mapping the face image to be edited through an encoding module to obtain an image hidden variable;
acquiring a pre-trained generator;
inputting the image hidden variable into the generator to generate a first face image;
selecting an attribute to be edited from the first facial attribute, and inputting a first face image into a trained image evaluation model to obtain an evaluation result;
inputting the attribute to be edited and the evaluation result to a trained reinforcement learning module to generate a second facial attribute;
inputting the image hidden variable and the second facial attribute into a continuous normalizing flow module to generate a hidden variable of a target face image;
and inputting the hidden variable of the target face image into the image generation module to generate a second face image.
Further, the first face image is input to a trained image evaluation model to obtain an evaluation result, and the steps include:
preprocessing the first face image, inputting the preprocessed first face image into a backbone network for feature extraction to obtain a feature vector;
inputting the feature vector into a channel attention module to obtain a three-dimensional vector, and flattening it into a one-dimensional vector after activation and adaptive average pooling;
and inputting the one-dimensional vector into a regression network, and outputting an evaluation result.
Further, the training step of the image evaluation model includes:
training a classification network: inputting training data into a backbone network to extract features, and classifying them through the classification network; when training the classification network, the loss value is back-propagated through a cross-entropy function while the parameters of the regression network are kept frozen (no gradients are returned to them);
regression training is performed on the data based on the classification network to extract more aesthetic features, at which time only the regression network is released, freezing the parameters of the backbone network and the classification network.
Further, the training step of the reinforcement learning module specifically includes:
initializing the facial attribute, and generating a plurality of groups of corresponding training images according to the selected attribute;
respectively calculating each group of training images according to a preset reinforcement learning strategy, generating new face attributes, and generating new face images corresponding to the new face attributes;
and evaluating the new face image through the image evaluation model, and updating the gradient with a soft gradient strategy according to the evaluation result, iterating until convergence.
Further, the reinforcement learning strategy $\pi^*$ is:

$$\pi^* = \arg\max_{\pi} \sum_{t} \mathbb{E}_{(s_t, a_t) \sim \rho_\pi} \big[ r(s_t, a_t) + \alpha \mathcal{H}(\pi(\cdot \mid s_t)) \big]$$

where $s_t$ is the state vector of the t-th iteration and $a_t$ is the action vector of the t-th iteration; $r$ is the reward, with the discount factor satisfying $0 \le \gamma \le 1$; $\mathcal{H}$ is the entropy; $\alpha$ is a hyperparameter controlling the relative importance of $\mathcal{H}$ in the objective; and $\rho_\pi$ is the state-action distribution induced by the policy.
Further, the soft gradient policy is calculated as:

$$\nabla_\theta J(\theta) = \mathbb{E}_{(s_t, a_t) \sim \rho_\pi} \big[ \nabla_\theta \log \pi_\theta(a_t \mid s_t) \big( Q(s_t, a_t) - \alpha \log \pi_\theta(a_t \mid s_t) - b(s_t) \big) \big]$$

where $\alpha$ is a temperature hyperparameter that controls the exploration range, $Q(s_t, a_t)$ is the Q-value of the policy, and $b(s_t)$ is a state-dependent baseline; $\theta$ in the policy $\pi_\theta$ is a differentiable parameter.
Further, the reinforcement learning module comprises a feature extraction unit and a gated recurrent unit;
the feature extraction unit extracts image features from the initial image and inputs them into the gated recurrent unit;
and the end of the hidden layer in the gated recurrent unit is connected to a fully connected layer, through which the value of the selected attribute is output.
Further, the feature extraction unit is a ResNet18 network.
A neural network model, comprising: an encoder, a generator, an image evaluation network, and a reinforcement learning network;
the encoder converts the image to be edited into a hidden space vector and then carries out image reconstruction through a generator; during reconstruction, the generator generates facial attributes according to the hidden space vector and obtains a reconstructed image;
the image evaluation network evaluates the reconstructed image;
the reinforcement learning network selects according to the generated facial attribute in the reconstruction process, optimizes according to the evaluation structure, obtains the optimized facial attribute, and inputs the optimized facial attribute to the generator again to generate a final image.
A face image editing system based on reinforcement learning comprises an image acquisition module, an image editing module and an image generation module;
the image acquisition module is used for acquiring a face image to be edited;
the image editing module is used for selecting editing attributes according to the face image to be edited;
the image generation module is used for carrying out preliminary evaluation according to the face image to be edited, optimizing the selected editing attribute according to the evaluation result and obtaining the optimized face attribute; and the method is used for generating an editing result according to the optimized facial attribute.
The invention has the beneficial effects that:
compared with the prior art, the invention discloses a face image editing method based on reinforcement learning, which realizes automatic adjustment of the image face attribute through reinforcement learning and generates an aesthetic high-quality face image conforming to image evaluation; the invention enhances the face image through reinforcement learning, and provides a novel model optimization method of the soft gradient strategy with a self-criticizing training mode in reinforcement learning.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings that are required to be used in the embodiments or the description of the prior art will be briefly described below, and it is obvious that the drawings in the following description are only embodiments of the present invention, and that other drawings can be obtained according to the provided drawings without inventive effort for a person skilled in the art.
Fig. 1 is a schematic diagram of a face image editing method based on reinforcement learning according to an embodiment of the present invention;
FIG. 2 is a schematic diagram of an image evaluation process according to an embodiment of the present invention;
fig. 3 is a schematic diagram of a reinforcement learning module according to an embodiment of the invention.
Detailed Description
The following description of the embodiments of the present invention will be made clearly and completely with reference to the accompanying drawings, in which it is apparent that the embodiments described are only some embodiments of the present invention, but not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
Referring to fig. 1, an embodiment of the present invention provides a face image editing method based on reinforcement learning, including the steps of:
S1: acquiring a face image to be edited and extracting a first facial attribute; mapping the face image to be edited through an encoding module to obtain an image hidden variable. The encoding module adopts the pixel2style2pixel framework, hereinafter referred to as pSp. The pSp framework is based on a novel encoder network that generates a series of style vectors. The first facial attribute includes posture, hair volume, beard, age, expression, and the like.
S2: acquiring a pre-trained generator; inputting the image hidden variable into a generator to generate a first face image; specifically, a StyleGAN network model structure is used as a generator to generate an image with a resolution of 1024×1024. After the pSp encoder converts the face image to be edited into the style vector in the hidden space, the generator is used for reconstruction.
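As a minimal illustration of steps S1 and S2, the sketch below wires together stand-in modules with the interfaces described above: an encoder mapping a face image to 18×512 style vectors (StyleGAN's extended latent space) and a generator reconstructing an image from them. These toy modules only mimic the tensor shapes involved; they are not the real pSp encoder or StyleGAN generator, and the 64×64 resolution is chosen solely to keep the example small (the real generator outputs 1024×1024).

```python
import torch
import torch.nn as nn

class ToyEncoder(nn.Module):
    """Stand-in pSp encoder: face image -> 18 x 512 style vectors (W+)."""
    def __init__(self) -> None:
        super().__init__()
        self.net = nn.Sequential(nn.Flatten(), nn.Linear(3 * 64 * 64, 18 * 512))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.net(x).view(-1, 18, 512)

class ToyGenerator(nn.Module):
    """Stand-in generator: style vectors -> image (1024 x 1024 in StyleGAN)."""
    def __init__(self) -> None:
        super().__init__()
        self.net = nn.Linear(18 * 512, 3 * 64 * 64)

    def forward(self, w: torch.Tensor) -> torch.Tensor:
        return self.net(w.flatten(1)).view(-1, 3, 64, 64)

face = torch.randn(1, 3, 64, 64)     # face image to be edited
w_plus = ToyEncoder()(face)          # image hidden variable (style vectors)
first_face = ToyGenerator()(w_plus)  # first (reconstructed) face image
```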
S3: and selecting the attribute to be edited from the first facial attribute, and inputting the first face image into the trained image evaluation model to obtain an evaluation result.
In one embodiment, the image evaluation model is composed of a backbone network and a regression network: the backbone network extracts image features, and the regression network performs regression on those features and outputs a final value as the evaluation result. Specifically, a channel attention module is further arranged between the backbone network and the regression network, and the evaluation process is as follows: the face image is preprocessed by rotating, cropping, and scaling it to 800×800 resolution centered on the face; a pre-built neural network extracts image features, which are then convolved; the convolved result is normalized and passed through an activation function; the activated features are input into an ECA attention module to obtain a three-dimensional vector, which is flattened into a one-dimensional vector after activation and adaptive average pooling; the one-dimensional vector is input into the regression network, which outputs the image quality parameter, i.e. the aesthetic score, ranging from 0 to 1. EfficientNet-B4 can be used as a pre-trained model to extract the features of the image to be evaluated, after which a convolution with kernel size 3 is applied.
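The following PyTorch sketch assembles an evaluation network of this shape: an EfficientNet-B4 backbone, a 3×3 convolution with normalization and activation, an ECA channel-attention module, adaptive average pooling, and a regression head bounded to (0, 1). It is a hedged reading of the text, not the patented implementation: the 1792-channel width follows torchvision's EfficientNet-B4, while the ECA kernel size and the single-layer regression head are assumptions.

```python
import torch
import torch.nn as nn
from torchvision.models import efficientnet_b4

class ECA(nn.Module):
    """Efficient Channel Attention: per-channel weights via a 1-D conv."""
    def __init__(self, kernel_size: int = 3) -> None:
        super().__init__()
        self.pool = nn.AdaptiveAvgPool2d(1)
        self.conv = nn.Conv1d(1, 1, kernel_size, padding=kernel_size // 2, bias=False)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        w = self.pool(x).squeeze(-1).transpose(1, 2)     # (B, 1, C)
        w = torch.sigmoid(self.conv(w)).transpose(1, 2)  # (B, C, 1)
        return x * w.unsqueeze(-1)                       # re-weight channels

class AestheticScorer(nn.Module):
    def __init__(self) -> None:
        super().__init__()
        # weights=None for brevity; the text uses a pre-trained EfficientNet-B4.
        self.backbone = efficientnet_b4(weights=None).features  # (B, 1792, h, w)
        self.conv = nn.Conv2d(1792, 1792, kernel_size=3, padding=1)
        self.norm = nn.BatchNorm2d(1792)
        self.attn = ECA()
        self.pool = nn.AdaptiveAvgPool2d(1)
        self.head = nn.Sequential(nn.Flatten(), nn.Linear(1792, 1), nn.Sigmoid())

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        f = torch.relu(self.norm(self.conv(self.backbone(x))))
        return self.head(self.pool(self.attn(f)))       # aesthetic score in (0, 1)

score = AestheticScorer()(torch.randn(1, 3, 800, 800))  # 800 x 800 input, as above
```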
In this embodiment, the training steps of the image evaluation model are:
training a classification network: training data are input into the backbone network to extract features, which are classified through the classification network; when training the classification network, the loss value is back-propagated through a cross-entropy function while the parameters of the regression network are kept frozen; regression training is then performed on the data on top of the classification network to extract more aesthetic features, at which point only the regression network is released, freezing the parameters of the backbone network and the classification network.
Specifically, a ten-class classification network is trained by binning aesthetic scores with a step of 0.1; during classification training, the loss value is back-propagated through a cross-entropy function while the parameters of the regression network are not updated.
In the classification training, the batch size was set to 32 and the initial learning rate to 0.001, with the learning rate automatically halved when accuracy did not improve for several consecutive rounds; the Adam optimization algorithm was selected, the running-average coefficients for the gradient and its square were set to (0.98, 0.999), and the weight decay coefficient was set to 0.0001. In the regression training, the batch size was set to 64; if the mean squared error did not decrease after several rounds, the learning rate was likewise automatically halved.
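A hedged sketch of this two-stage schedule is given below, using a toy stand-in model whose `backbone`, `classifier`, and `regressor` submodule names are illustrative. The Adam betas, weight decay, halving-on-plateau scheduling, and ten-class binning follow the text; the patience value is an assumption.

```python
import torch
import torch.nn as nn

# Toy stand-in; the real backbone is EfficientNet-B4 with matching heads.
model = nn.ModuleDict({
    "backbone":   nn.Sequential(nn.Flatten(), nn.Linear(48, 32), nn.ReLU()),
    "classifier": nn.Linear(32, 10),  # ten score bins of width 0.1
    "regressor":  nn.Linear(32, 1),
})

# Stage 1: classification (batch size 32); freeze the regressor so the
# cross-entropy loss never updates its parameters.
for p in model["regressor"].parameters():
    p.requires_grad = False
opt1 = torch.optim.Adam([p for p in model.parameters() if p.requires_grad],
                        lr=1e-3, betas=(0.98, 0.999), weight_decay=1e-4)
sched1 = torch.optim.lr_scheduler.ReduceLROnPlateau(
    opt1, mode="max", factor=0.5, patience=3)  # halve lr when accuracy stalls
# ... training loop; call sched1.step(val_accuracy) once per epoch ...

# Stage 2: regression (batch size 64); release only the regressor and
# freeze the backbone and classifier.
for name in ("backbone", "classifier"):
    for p in model[name].parameters():
        p.requires_grad = False
for p in model["regressor"].parameters():
    p.requires_grad = True
opt2 = torch.optim.Adam(model["regressor"].parameters(), lr=1e-3)
sched2 = torch.optim.lr_scheduler.ReduceLROnPlateau(
    opt2, mode="min", factor=0.5, patience=3)  # halve lr when MSE stalls
# ... training loop; call sched2.step(val_mse) once per epoch ...
```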
S4: inputting the attribute to be edited and the evaluation result into a trained reinforcement learning module to generate a second facial attribute. As shown in fig. 3, the reinforcement learning module includes a feature extraction unit and a gated recurrent unit; the feature extraction unit extracts image features from the initial image and inputs them into the gated recurrent unit; the end of the hidden layer in the gated recurrent unit is connected to a fully connected layer, through which the value of the selected attribute is output; the feature extraction unit is a ResNet18 network.
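A minimal sketch of such a policy network follows: a ResNet18 feature extractor feeding a GRU whose hidden state (size 512, matching the text) drives a fully connected head that outputs logits over attribute values. The number of discrete attribute values is an assumption for illustration.

```python
import torch
import torch.nn as nn
from torchvision.models import resnet18

class PolicyNet(nn.Module):
    def __init__(self, n_values: int = 21) -> None:  # n_values is assumed
        super().__init__()
        backbone = resnet18(weights=None)  # ImageNet-pretrained in the text
        backbone.fc = nn.Identity()        # expose the 512-d features
        self.features = backbone
        self.gru = nn.GRU(input_size=512, hidden_size=512, batch_first=True)
        self.head = nn.Linear(512, n_values)  # logits over attribute values

    def forward(self, img: torch.Tensor, hidden=None):
        f = self.features(img).unsqueeze(1)   # (B, 1, 512): one step per call
        out, hidden = self.gru(f, hidden)
        return self.head(out.squeeze(1)), hidden

logits, h = PolicyNet()(torch.randn(1, 3, 224, 224))
```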
The training step of the reinforcement learning module comprises the following steps:
S41: initializing the facial attributes and generating a plurality of groups of training images according to the selected attributes.
S42: for each group of training images, the value of the corresponding attribute, i.e. the new facial attribute, is calculated through the reinforcement learning module: each group of training images is processed according to a preset reinforcement learning strategy to generate new facial attributes, and new face images corresponding to the new facial attributes are generated. In reinforcement learning, the agent continuously interacts with the environment, defined as the Markov decision process $(S, A, P, r, s_0, \gamma)$, where $S$ and $A$ are the state and action spaces, $P$ is the state transition probability, $r$ is the reward, $s_0$ is the initial state, and the discount factor $\gamma$ defines how strongly the agent is affected by distant future states. The goal is to learn a stochastic policy $\pi$ such that, when actions $a_t \sim \pi(\cdot \mid s_t)$ are taken, the expected reward of the trajectory is maximized.
The method comprises the following specific steps:
Firstly, the attribute value corresponding to the selected feature dimension is set to 0. The selected feature dimensions are then explored separately; the exploration process establishes several exploration trajectories according to preset values. To this end, 5 initial images are obtained by cloning the initial attributes, each image corresponding to the input of one exploration trajectory; each trajectory explores one attribute, computes a new attribute value through exploration, and generates an image. Finally, image quality evaluation yields 5 scores, and the state-dependent baseline $b(s_t)$ is set to the average of the five scores; the reinforcement learning module is thus updated to raise the probability of trajectories with higher aesthetic scores, and the baseline effectively reduces the variance of the learning process, improving training stability. In this step, the discount factor $\gamma$ is set to 1. In addition, the agent receives a reward only at the terminal state; intermediate states of the trajectory are not rewarded. The reward value is the score of the image after the series of adjustments. Gradient descent used the Adam optimizer with a learning rate of 1e-5, without L2 regularization of the parameters. The ResNet18 module is pre-trained on ImageNet. For the gated recurrent unit, the hidden-state size is 512, the same as the output of the ResNet18 module.
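The self-critical baseline just described can be sketched as follows; `rollout` and `score` are illustrative stand-ins for the trajectory generation and the aesthetic evaluation model, and the function simply averages the five terminal scores to form $b(s_t)$.

```python
import torch

def self_critical_advantages(init_attrs: torch.Tensor, rollout, score,
                             n_traj: int = 5) -> torch.Tensor:
    # gamma = 1 and only the terminal state is rewarded, so each
    # trajectory's return is just the aesthetic score of its final image.
    returns = torch.stack([score(rollout(init_attrs.clone()))
                           for _ in range(n_traj)])
    baseline = returns.mean()   # b(s_t): average of the five scores
    return returns - baseline   # advantages; the baseline reduces variance
```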
S43: evaluating the new face image through the image evaluation model, and updating the gradient with a soft gradient strategy according to the evaluation result, iterating until convergence.
In this embodiment, when reinforcement learning is performed, the maximum entropy reinforcement learning framework is used to ensure the randomness of exploration and to prevent premature convergence to a suboptimal strategy. The reinforcement learning target is set as the strategy $\pi^*$:

$$\pi^* = \arg\max_{\pi} \sum_{t} \mathbb{E}_{(s_t, a_t) \sim \rho_\pi} \big[ r(s_t, a_t) + \alpha \mathcal{H}(\pi(\cdot \mid s_t)) \big]$$

where $S$ and $A$ are respectively the set of all possible states encountered by the agent and the set of all possible actions it generates, and $r$ is the reward, with the discount factor satisfying $0 \le \gamma \le 1$. $\mathcal{H}$ is the entropy, and $\alpha$ is a hyperparameter, set to 0.01, that controls the relative importance of $\mathcal{H}$ in the objective.
The soft policy gradient formula is as follows:

$$\nabla_\theta J(\theta) = \mathbb{E}_{(s_t, a_t) \sim \rho_\pi} \big[ \nabla_\theta \log \pi_\theta(a_t \mid s_t) \big( Q(s_t, a_t) - \alpha \log \pi_\theta(a_t \mid s_t) - b(s_t) \big) \big]$$

where $\alpha$ is a temperature hyperparameter that controls the exploration range, $Q(s_t, a_t)$ is the Q-value of the policy, and $b(s_t)$ is a state-dependent baseline, which can be any function as long as it does not vary with the action.
In one embodiment, the self-critical training mode is incorporated into the soft policy gradient update. The Monte Carlo method is used to compute the Q-value for reinforcement learning, where the Q-value is the aesthetic score given by the image quality assessment model. In this embodiment, the reinforcement learning module uses a batch size of 16, with 80 samples per gradient update step.
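A minimal sketch of one such update follows; it treats the Monte Carlo Q-value as the terminal aesthetic score, subtracts the self-critical baseline, and adds the entropy term weighted by α = 0.01 as in the text. The exact loss shape used in the patent is not spelled out, so this is one standard reading of a soft policy gradient with a baseline.

```python
import torch

def soft_pg_loss(log_probs: torch.Tensor,  # log pi_theta(a_t|s_t), one per step
                 q_value: torch.Tensor,    # aesthetic score of the final image
                 baseline: torch.Tensor,   # b(s_t): mean score of sibling rollouts
                 alpha: float = 0.01) -> torch.Tensor:
    # Entropy-regularized advantage Q - alpha * log pi - b(s), detached so
    # that gradients flow only through log_probs (REINFORCE-style update).
    advantage = (q_value - alpha * log_probs.sum() - baseline).detach()
    return -(advantage * log_probs.sum())

probs = torch.tensor([0.4, 0.7], requires_grad=True)
loss = soft_pg_loss(torch.log(probs), torch.tensor(0.82), torch.tensor(0.75))
loss.backward()  # in training, Adam with lr 1e-5 would apply this gradient
```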
S5: inputting the image hidden variable and the new facial attributes into a continuous normalizing flow (CNF) module to generate the hidden variable of the target face image;
S6: inputting the hidden variable of the target face image into the image generation module to generate the final face image.
Example 2
Based on the same inventive concept, an embodiment of the present invention discloses a neural network model for implementing the image editing method in embodiment 1, including: an encoder, a generator, an image evaluation network, and a reinforcement learning network. The encoder converts the image to be edited into a hidden space vector and then performs image reconstruction through the generator; during reconstruction, the generator produces facial attributes according to the hidden space vector and obtains a reconstructed image. The image evaluation network evaluates the reconstructed image. The reinforcement learning network makes selections from the facial attributes generated during reconstruction, optimizes them according to the evaluation result to obtain the optimized facial attributes, and inputs the optimized facial attributes into the generator again to generate the final image.
Example 3
Based on the same inventive concept, the embodiment of the invention discloses a face image editing system based on reinforcement learning, which is characterized by comprising an image acquisition module, an image editing module and an image generation module; the image acquisition module is used for acquiring a face image to be edited; the image editing module is used for selecting editing attributes according to the face image to be edited; the image generation module is used for carrying out preliminary evaluation according to the face image to be edited, optimizing the selected editing attribute according to the evaluation result and obtaining the optimized face attribute; and the method is used for generating an editing result according to the optimized facial attribute.
According to the invention, reinforcement learning controls the facial attributes of the face image, and facial attributes can be independently selected and edited according to different aesthetic requirements to obtain a high-quality face image. During inference, the actions with the highest probability are selected from the softmax layer of the reinforcement learning module instead of being sampled, and the agent automatically sets values for the selected feature dimensions in sequence to obtain the new facial attributes. The image hidden variable and the new facial attributes are input into the continuous normalizing flow module to generate the hidden variable of the target face image, which is then input into the image generation module to obtain the edited face image.
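The inference path just described can be summarized in a short sketch; `encoder`, `policy`, `cnf`, and `generator` are stand-ins for the trained modules, and greedy argmax selection replaces sampling, as the text specifies.

```python
import torch

@torch.no_grad()
def edit_face(img: torch.Tensor, encoder, policy, cnf, generator) -> torch.Tensor:
    w = encoder(img)                       # image hidden variable
    first_face = generator(w)              # reconstructed first face image
    logits, _ = policy(first_face)         # policy output for the edit step
    attrs = logits.softmax(-1).argmax(-1)  # highest-probability attribute values
    w_target = cnf(w, attrs)               # hidden variable of the target face
    return generator(w_target)             # edited (second) face image
```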
In the present specification, the embodiments are described in a progressive manner; each embodiment focuses on its differences from the others, and identical or similar parts among the embodiments may refer to one another. The device disclosed in an embodiment corresponds to the method disclosed in that embodiment, so its description is relatively brief; for relevant details, refer to the description of the method.
The previous description of the disclosed embodiments is provided to enable any person skilled in the art to make or use the present invention. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other embodiments without departing from the spirit or scope of the invention. Thus, the present invention is not intended to be limited to the embodiments shown herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.

Claims (7)

1. The face image editing method based on reinforcement learning is characterized by comprising the following steps:
acquiring a face image to be edited and extracting a first face attribute;
mapping the face image to be edited through an encoding module to obtain an image hidden variable;
acquiring a pre-trained generator;
inputting the image hidden variable into the generator to generate a first face image;
selecting an attribute to be edited from the first facial attribute, and inputting a first face image into a trained image evaluation model to obtain an evaluation result;
inputting the attribute to be edited and the evaluation result to a trained reinforcement learning module to generate a second facial attribute;
inputting the image hidden variable and the second facial attribute into a continuous normalizing flow module to generate a hidden variable of a target face image;
inputting the hidden variable of the target face image into the generator to generate a second face image;
the training step of the reinforcement learning module comprises the following specific steps:
initializing the facial attribute, and generating a plurality of groups of corresponding training images according to the selected attribute;
respectively calculating each group of training images according to a preset reinforcement learning strategy, generating new face attributes, and generating new face images corresponding to the new face attributes;
evaluating the new face image through the image evaluation model, and updating the gradient with a soft gradient strategy according to the evaluation result, iterating until convergence;
the reinforcement learning strategy $\pi^*$ is:

$$\pi^* = \arg\max_{\pi} \sum_{t} \mathbb{E}_{(s_t, a_t) \sim \rho_\pi} \big[ r(s_t, a_t) + \alpha \mathcal{H}(\pi(\cdot \mid s_t)) \big]$$

wherein $s_t$ is the state vector of the t-th iteration and $a_t$ is the action vector of the t-th iteration; $r$ is the reward, with the discount factor satisfying $0 \le \gamma \le 1$; $\mathcal{H}$ is the entropy, and $\alpha$ is a hyperparameter controlling the relative importance of $\mathcal{H}$ in the target; $\rho_\pi$ is the state-action distribution induced by the policy;

the soft gradient strategy calculation formula is:

$$\nabla_\theta J(\theta) = \mathbb{E}_{(s_t, a_t) \sim \rho_\pi} \big[ \nabla_\theta \log \pi_\theta(a_t \mid s_t) \big( Q(s_t, a_t) - \alpha \log \pi_\theta(a_t \mid s_t) - b(s_t) \big) \big]$$

wherein $\alpha$ is a temperature hyperparameter controlling the exploration range, $Q(s_t, a_t)$ is the Q-value of the strategy, and $b(s_t)$ is a state-dependent baseline; $\theta$ in the policy $\pi_\theta$ is a differentiable parameter.
2. The method for editing a face image based on reinforcement learning according to claim 1, wherein the step of inputting the first face image into a trained image evaluation model to obtain an evaluation result comprises:
preprocessing the first face image, inputting the preprocessed first face image into a backbone network for feature extraction to obtain a feature vector;
inputting the feature vector into a channel attention module to obtain a three-dimensional vector, and flattening it into a one-dimensional vector after activation and adaptive average pooling;
and inputting the one-dimensional vector into a regression network, and outputting an evaluation result.
3. The method for editing a face image based on reinforcement learning according to claim 1, wherein the training step of the image evaluation model comprises:
training a classification network: inputting training data into a backbone network to extract features, and classifying them through the classification network; when training the classification network, the loss value is back-propagated through a cross-entropy function while the parameters of the regression network are kept frozen;
regression training is performed on the data based on the classification network to extract more aesthetic features, at which time only the regression network is released, freezing the parameters of the backbone network and the classification network.
4. The face image editing method based on reinforcement learning according to claim 1, wherein the reinforcement learning module comprises a feature extraction unit and a gated recurrent unit;
the feature extraction unit extracts image features from the initial image and inputs them into the gated recurrent unit;
and the end of the hidden layer in the gated recurrent unit is connected to a fully connected layer, through which the value of the selected attribute is output.
5. The face image editing method based on reinforcement learning according to claim 4, wherein the feature extraction unit is a ResNet18 network.
6. A neural network model for implementing the image editing method of any of claims 1-5, comprising: an encoder, a generator, an image evaluation network, and a reinforcement learning network;
the encoder converts the image to be edited into a hidden space vector and then carries out image reconstruction through a generator; during reconstruction, the generator generates facial attributes according to the hidden space vector and obtains a reconstructed image;
the image evaluation network evaluates the reconstructed image;
the reinforcement learning network selects according to the generated facial attribute in the reconstruction process, optimizes according to the evaluation structure, obtains the optimized facial attribute, and inputs the optimized facial attribute to the generator again to generate a final image.
7. A face image editing system based on reinforcement learning, which is characterized by adopting the neural network model in claim 6 and comprising an image acquisition module, an image editing module and an image generation module;
the image acquisition module is used for acquiring a face image to be edited;
the image editing module is used for selecting editing attributes according to the face image to be edited;
the image generation module is used for carrying out preliminary evaluation according to the face image to be edited, optimizing the selected editing attribute according to the evaluation result and obtaining the optimized face attribute; and the method is used for generating an editing result according to the optimized facial attribute.
CN202310908009.1A 2023-07-24 2023-07-24 Face image editing method based on reinforcement learning Active CN116630147B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310908009.1A CN116630147B (en) 2023-07-24 2023-07-24 Face image editing method based on reinforcement learning

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310908009.1A CN116630147B (en) 2023-07-24 2023-07-24 Face image editing method based on reinforcement learning

Publications (2)

Publication Number Publication Date
CN116630147A (en) 2023-08-22
CN116630147B (en) 2024-02-06

Family

ID=87617445

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310908009.1A Active CN116630147B (en) 2023-07-24 2023-07-24 Face image editing method based on reinforcement learning

Country Status (1)

Country Link
CN (1) CN116630147B (en)

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2014001610A1 (en) * 2012-06-25 2014-01-03 Nokia Corporation Method, apparatus and computer program product for human-face features extraction
CN111260754A (en) * 2020-04-27 2020-06-09 腾讯科技(深圳)有限公司 Face image editing method and device and storage medium
CN112800893A (en) * 2021-01-18 2021-05-14 南京航空航天大学 Human face attribute editing method based on reinforcement learning
CN112907725A (en) * 2021-01-22 2021-06-04 北京达佳互联信息技术有限公司 Image generation method, image processing model training method, image processing device, and image processing program
CN113221794A (en) * 2021-05-24 2021-08-06 厦门美图之家科技有限公司 Training data set generation method, device, equipment and storage medium
CN113255551A (en) * 2021-06-04 2021-08-13 广州虎牙科技有限公司 Training, face editing and live broadcasting method of face editor and related device
WO2022135013A1 (en) * 2020-12-24 2022-06-30 百果园技术(新加坡)有限公司 Facial attribute editing method and system, and electronic device and storage medium

Also Published As

Publication number Publication date
CN116630147A (en) 2023-08-22

Similar Documents

Publication Publication Date Title
Liu et al. More control for free! image synthesis with semantic diffusion guidance
CN109036465B (en) Speech emotion recognition method
US11354792B2 (en) System and methods for modeling creation workflows
CN106503654A (en) A kind of face emotion identification method based on the sparse autoencoder network of depth
CN111553467B (en) Method for realizing general artificial intelligence
CN113435211B (en) Text implicit emotion analysis method combined with external knowledge
CN108197533A (en) A kind of man-machine interaction method based on user's expression, electronic equipment and storage medium
US11823490B2 (en) Non-linear latent to latent model for multi-attribute face editing
CN108985464A (en) The continuous feature generation method of face for generating confrontation network is maximized based on information
US20220101144A1 (en) Training a latent-variable generative model with a noise contrastive prior
Dogan et al. Semi-supervised image attribute editing using generative adversarial networks
Zhai et al. Asian female facial beauty prediction using deep neural networks via transfer learning and multi-channel feature fusion
CN108846073A (en) A kind of man-machine emotion conversational system of personalization
WO2021226731A1 (en) Method for imitating human memory to realize universal machine intelligence
CN117409109A (en) Image generation method and data processing method for image generation
Chen et al. CNN-based broad learning with efficient incremental reconstruction model for facial emotion recognition
CN117576257A (en) Method, terminal and storage medium for editing face image through text
Ye et al. Multi-style transfer and fusion of image’s regions based on attention mechanism and instance segmentation
Xia et al. Semantic translation of face image with limited pixels for simulated prosthetic vision
Feng et al. Improved visual story generation with adaptive context modeling
WO2021223042A1 (en) Method for implementing machine intelligence similar to human intelligence
CN116630147B (en) Face image editing method based on reinforcement learning
Tsapatsoulis et al. A fuzzy system for emotion classification based on the MPEG-4 facial definition parameter set
KR20190129698A (en) Electronic apparatus for compressing recurrent neural network and method thereof
Liu et al. Multimodal face aging framework via learning disentangled representation

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant