CN116843785A - Artificial intelligence-based painting image generation method, display terminal and storage medium - Google Patents
- Publication number
- CN116843785A
- Authority
- CN
- China
- Prior art keywords
- image
- model
- artificial intelligence
- generator
- painting
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T11/00—2D [Two Dimensional] image generation
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/764—Arrangements for image or video recognition or understanding using pattern recognition or machine learning using classification, e.g. of video objects
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/10—Image acquisition modality
- G06T2207/10004—Still image; Photographic image
Abstract
The invention discloses an artificial-intelligence-based painting image generation method, a display terminal and a storage medium, wherein the artificial intelligence painting image generation method comprises the following steps: S10: collecting painting data and extracting painting elements; S20: selecting a suitable model and training the model; S30: performing parameter setting and optimally adjusting the parameters; S40: feeding input to the trained model to generate a new image. The invention provides an image generation method, a display terminal and a computer-readable storage medium based on artificial intelligence painting, aiming to improve the accuracy with which a display device captures the painting instructions a user wishes to express in the current environment, so that the artificial intelligence model can generate a painting image that better matches the user's intent for display by the display device.
Description
Technical Field
The invention relates to the technical field of artificial intelligence, in particular to a painting image generation method based on artificial intelligence, a display terminal and a computer readable storage medium.
Background
Artificial intelligence (Artificial Intelligence, AI) is a comprehensive branch of computer science that studies the design principles and implementation methods of various intelligent machines, so that the machines can perceive, reason and make decisions. AI is a broad discipline covering a wide range of fields, such as natural language processing and machine learning/deep learning; as the technology develops, it will be applied in ever more fields and take on increasingly important value.
With the rapid development of AI painting technology, a user can now send painting instructions to an AI, have the AI generate images, and display those images on a digital photo frame.
Conventional AI painting generally relies on an intelligent device such as a computer to issue the corresponding painting instructions to the AI, whereas the display device of an ordinary digital photo frame is far less capable than a computer (its functionality is relatively limited and it is mainly used to display images). Moreover, conventional AI painting instructions are usually entered as text, so it is inconvenient for a user to type text instructions directly into the display device of a digital photo frame, and users with weaker writing skills often find it difficult to translate what they think and feel, intuitively and accurately, into the text expressions needed to issue the corresponding painting instructions.
Disclosure of Invention
The main object of the invention is to provide an image generation method based on artificial intelligence painting, a display terminal and a computer-readable storage medium, aiming to improve the accuracy with which a display device captures the painting instructions a user wishes to express in the current environment, so that the artificial intelligence model can generate a painting image that better matches the user's intent for display by the display device.
In order to achieve the above object, the present invention provides an artificial intelligence based painting image generation method, comprising the steps of:
S10: collecting painting data and extracting painting elements;
S20: selecting a suitable model and training the model;
S30: performing parameter setting and optimally adjusting the parameters;
S40: feeding input to the trained model to generate a new image.
Preferably, the step S20 specifically includes the following steps:
S21: determining an appropriate training data set, the data set comprising real image samples associated with the target task;
S22: selecting a suitable generative model according to the specific image generation task and requirements, wherein the generative model includes a generative adversarial network (GAN);
S23: based on the selected model, building a corresponding neural network architecture including a generator network and a discriminator network; the generator network includes a generator, and the discriminator network includes a discriminator;
S24: initializing model parameters;
S25: defining an appropriate loss function according to the selected model and task type;
S26: adjusting the learning rate and the optimizer hyperparameters according to the complexity of the model and the scale of the data set;
S27: training the model by inputting real image samples, together with generation noise or conditional inputs, into the model.
Preferably, S30 specifically includes the following steps:
S31: selecting appropriate initial parameter values, including the hyperparameters related to the generative model; determining, using a pre-training module, which hyperparameters need to be adjusted; and choosing a suitable learning rate, batch size, number of iterations, regularization strength and other hyperparameters according to the data, the model structure and the specific task;
S32: searching the hyperparameter space using a grid search or a random search method;
S33: dividing the data set into a training set and a validation set for evaluating the performance of different hyperparameter configurations;
S34: adjusting the learning rate, including optimizing the training process of the generative model through a learning-rate decay strategy or by dynamically adjusting the learning rate;
S35: preventing overfitting by controlling the complexity of the model through regularization.
Preferably, S40 includes: after training is completed, generating new painting images through the generator by using the generator network and input noise vectors; diversified images are generated by sampling random noise vectors.
Preferably, the noise vector is obtained by randomly sampling a noise vector from the noise vector space and is used as input.
Preferably, S40 further comprises: inputting the noise vector into the generator model, the generator attempting, through back-propagation, to generate realistic images that confuse the discriminator; the discriminator classifying the generated image against the real image and judging authenticity; and the game between the generator and the discriminator being gradually optimized, finally yielding a generator capable of generating the desired images.
Preferably, the following step is further included after S40:
S50: evaluating and adjusting the generated image, and adjusting the model or parameters as needed to obtain a generation result that better matches expectations.
Preferably, S50 includes: quantifying the quality of the generated image using an evaluation index; the evaluation index comprises a structural similarity index (SSIM), a peak signal-to-noise ratio (PSNR) or a perceptual mean squared error.
In order to achieve the above object, the present invention also provides a display terminal including a memory, a processor, and an artificial intelligence painting-based image generation program stored on the memory and executable on the processor, the artificial intelligence painting-based image generation program implementing the steps of the artificial intelligence painting-based image generation method as described above when executed by the processor.
In order to achieve the above object, the present invention also proposes a computer-readable storage medium having stored thereon an image generation program based on artificial intelligence painting, which, when executed by a processor, implements the steps of the image generation method based on artificial intelligence painting as described above.
The invention has the beneficial effects that:
according to the image generation method based on the artificial intelligence painting, the display terminal and the computer readable storage medium, a user can acquire painting data from the display device, extract painting elements, select a model, train the model, input the trained model, and generate a new image; the imaging style and the imaging conditions are fused to obtain an object image which has the imaging style and accords with the imaging conditions, so that the object image required by a user can be generated, and the pertinence and the practicability of the content of the generated object image are improved; noise is blended into the regularized feature map, so that the change of the object image is promoted, and the diversity of the generated object image is improved on the premise of ensuring the imaging style and the imaging condition; the regularization processing is carried out on the feature images containing noise and the style vectors containing imaging conditions, so that the generation quality of the object image is improved, and a painting image which is more in line with the mind of a user is obtained for display of display equipment.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings that are required in the embodiments or the description of the prior art will be briefly described, and it is obvious that the drawings in the following description are only some embodiments of the present invention, and other drawings may be obtained according to the structures shown in these drawings without inventive effort for a person skilled in the art.
FIG. 1 is a flow chart of an embodiment of a method for generating an artificial intelligence based pictorial image in accordance with the present invention;
FIG. 2 is a structural diagram of a display terminal based on the artificial intelligence painting image generation method of the present invention.
The achievement of the objects, functional features and advantages of the present invention will be further described with reference to the accompanying drawings, in conjunction with the embodiments.
Detailed Description
The following description of the embodiments of the present invention will be made clearly and fully with reference to the accompanying drawings, in which it is evident that the embodiments described are only some, but not all embodiments of the invention. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
It should be noted that all directional indicators (such as up, down, left, right, front and rear) in the embodiments of the present invention are merely used to explain the relative positional relationship, movement, etc. between the components in a particular posture (as shown in the drawings), and if the particular posture changes, the directional indicator changes accordingly.
In the present invention, unless specifically stated and limited otherwise, the terms "connected," "affixed," and the like are to be construed broadly, and for example, "affixed" may be a fixed connection, a removable connection, or an integral body; can be mechanically or electrically connected; either directly or indirectly, through intermediaries, or both, may be in communication with each other or in interaction with each other, unless expressly defined otherwise. The specific meaning of the above terms in the present invention can be understood by those of ordinary skill in the art according to the specific circumstances.
Furthermore, descriptions such as those referred to as "first," "second," and the like, are provided for descriptive purposes only and are not to be construed as indicating or implying a relative importance or implying an order of magnitude of the indicated technical features in the present disclosure. Thus, a feature defining "a first" or "a second" may explicitly or implicitly include at least one such feature. In addition, the technical solutions of the embodiments may be combined with each other, but it is necessary to base that the technical solutions can be realized by those skilled in the art, and when the technical solutions are contradictory or cannot be realized, the combination of the technical solutions should be considered to be absent and not within the scope of protection claimed in the present invention.
The invention provides a display terminal, which comprises a memory, a processor and an image generation program based on artificial intelligence painting that is stored on the memory and executable on the processor; when executed by the processor, the image generation program implements the steps of the image generation method based on artificial intelligence painting described above.
The present invention also proposes a computer-readable storage medium having stored thereon an image generation program based on artificial intelligence painting, which, when executed by a processor, implements the steps of the image generation method based on artificial intelligence painting as described above.
The processor may be a central processing unit (Central Processing Unit, CPU), controller, microcontroller, microprocessor, or other data processing chip in some embodiments. The processor is typically configured to control operation of the artificial intelligence drawing based image generation program.
Referring to fig. 1, the invention provides an artificial intelligence based painting image generation method, which comprises the following steps:
S10: collecting painting data and extracting painting elements;
S20: selecting a suitable model and training the model;
S30: performing parameter setting and optimally adjusting the parameters;
S40: feeding input to the trained model to generate a new image.
In this embodiment, in step S10, collecting painting data includes collecting and sorting a data set for image generation, including real images, line drafts, painting samples and the like, and extracting painting elements from the painting content information and inputting the painting elements to the display terminal. The display terminal may be used to acquire the imaging style and imaging conditions of the object to be imaged; for example, when the user inputs the imaging style and imaging conditions at an input interface of the terminal and clicks an image generation button, the terminal automatically acquires the imaging style and imaging conditions of the object to be imaged.
The imaging style is the representative visual artistic expression form of the image as a whole; for example, Chinese painting, watercolor, sketch, oil painting, woodcut and cartoon all belong to imaging styles. Through the imaging style, an image can exhibit rich and distinctive visual effects. The imaging style of an image can be applied, as a strong artistic form, to fields such as animation and games, and also appears in engineering and industrial design drawings. The imaging style has a wide field of application and rich, varied artistic expression forms, and a computer can use the imaging style to assist in completing original creative work that involves a heavy workload and high difficulty.
The imaging conditions: the imaging conditions may be attributes such as hair color and hair length required for object image generation, for example when cartoon faces are to be generated. According to the embodiment of the invention, an object image satisfying the imaging conditions can be generated; for example, if the imaging conditions are short hair, blue eyes and long bangs, the generated cartoon face will have the characteristics of short hair, blue eyes and long bangs.
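As an illustration (not part of the patent text), such imaging conditions can be encoded as a simple attribute vector and concatenated with the random noise vector to form a conditional input to the generator; the attribute vocabulary, vector sizes and encoding below are assumptions made only for this sketch.

```python
import numpy as np

# Hypothetical attribute vocabulary; the patent does not fix a specific encoding.
ATTRIBUTES = ["short_hair", "long_hair", "blue_eyes", "brown_eyes", "bangs"]

def encode_conditions(conditions):
    """Encode a list of imaging-condition names as a multi-hot vector."""
    vec = np.zeros(len(ATTRIBUTES), dtype=np.float32)
    for name in conditions:
        vec[ATTRIBUTES.index(name)] = 1.0
    return vec

# "short hair, blue eyes, bangs" from the example above
cond = encode_conditions(["short_hair", "blue_eyes", "bangs"])
z = np.random.randn(128).astype(np.float32)      # random noise vector
generator_input = np.concatenate([z, cond])      # conditional input for the generator
```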
Step S20 includes selecting an appropriate generative model, such as a generative adversarial network (GAN) or another model, according to the particular needs; the selected model is then trained iteratively on the data set to learn the patterns and features of image generation.
Step S30 comprises optimizing the hyperparameters of the generative model to obtain better generation results, and adjusting and optimizing parameters such as the number of training iterations, the learning rate and the network structure.
Step S40 includes generating a new image using the trained model and the input noise or other conditions, including random sampling, interpolation and the like, to explore the latent space of the model and generate diverse images.
In this embodiment, preferably, the step S20 specifically includes the following steps:
S21: determining an appropriate training data set, the data set comprising real image samples associated with the target task;
S22: selecting a suitable generative model according to the specific image generation task and requirements, wherein the generative model includes a generative adversarial network (GAN);
S23: based on the selected model, building a corresponding neural network architecture including a generator network and a discriminator network; the generator network includes a generator, and the discriminator network includes a discriminator;
S24: initializing model parameters;
S25: defining an appropriate loss function according to the selected model and task type;
S26: adjusting the learning rate and the optimizer hyperparameters according to the complexity of the model and the scale of the data set;
S27: training the model by inputting real image samples, together with generation noise or conditional inputs, into the model.
In step S22, a generative adversarial network (GAN) model may be selected, and a corresponding neural network architecture is built for it, including a generator network and a discriminator network; the generator network includes a generator, and the discriminator network includes a discriminator. The generator is responsible for converting a random noise vector into a realistic painting image, while the discriminator evaluates the authenticity of a given image.
The adversarial training described in step S20 is performed by alternately optimizing the generator and the discriminator. In each training iteration:
a. training the generator: the generator receives a random noise vector and produces a fake painting image, which is passed to the discriminator;
b. training the discriminator: the discriminator receives the real painting images and the fake images produced by the generator and attempts to distinguish them; the weights and biases of the discriminator are updated by computing the discriminator's loss function;
c. performing adversarial training, which includes minimizing the adversarial loss between the generator and the discriminator, so that the generator produces more realistic painting images while it becomes difficult for the discriminator to distinguish real images from fake ones.
The training iterations are repeated as many times as needed until the generated images reach the desired quality level.
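A minimal sketch of this alternating procedure, assuming the PyTorch generator/discriminator sketched above, binary cross-entropy losses and Adam optimizers (none of these specific choices are mandated by the patent):

```python
import torch
import torch.nn as nn

G, D = Generator(), Discriminator()          # from the architecture sketch above
bce = nn.BCELoss()
opt_g = torch.optim.Adam(G.parameters(), lr=2e-4, betas=(0.5, 0.999))
opt_d = torch.optim.Adam(D.parameters(), lr=2e-4, betas=(0.5, 0.999))

def train_step(real_images):
    batch = real_images.size(0)
    real_labels = torch.ones(batch)
    fake_labels = torch.zeros(batch)

    # b. Train the discriminator on real images and on detached generator fakes.
    fake_images = G(torch.randn(batch, LATENT_DIM)).detach()
    loss_d = bce(D(real_images), real_labels) + bce(D(fake_images), fake_labels)
    opt_d.zero_grad()
    loss_d.backward()
    opt_d.step()

    # a./c. Train the generator to fool the discriminator (minimize the adversarial loss).
    loss_g = bce(D(G(torch.randn(batch, LATENT_DIM))), real_labels)
    opt_g.zero_grad()
    loss_g.backward()
    opt_g.step()
    return loss_d.item(), loss_g.item()
```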
The calculation for generating samples based on a generative adversarial network is as follows:
1. Calculation by the generator: the generator receives as input a random noise vector, generally denoted z. It passes this noise vector through a series of conversion and transformation operations, such as fully connected layers, convolution layers and batch normalization layers, to generate a fake sample. The computation of the generator can be expressed as:
G(z) = x_fake; where G(z) denotes the output of the generator for the noise vector z, and x_fake is the fake sample produced by the generator;
2. Calculation by the discriminator: the discriminator receives as input a data set consisting of real samples and samples generated by the generator, and is trained to discriminate between the two types of samples. The discriminator may employ a convolutional neural network or a fully connected neural network to extract features and classify. The computation of the discriminator can be expressed as:
D(x) = p_real and D(G(z)) = p_fake; where D(x) denotes the discriminator's output for a real sample and p_real is the probability that the discriminator judges the sample to be real, while D(G(z)) denotes the discriminator's output for a generated sample and p_fake is the probability that the discriminator judges the sample to be fake;
3. Loss functions of the generator and discriminator: during training, the generator and the discriminator compete with and optimize against each other. The generator attempts to minimize the discriminator's accuracy on generated samples, while the discriminator attempts to maximize its accuracy in distinguishing real samples from generated ones.
In this embodiment, the cross-entropy loss is used for the loss function calculation:
Cross-entropy loss: the loss function of the generator can be expressed as:
Generator Loss (L_G) = -log(D(G(z)));
the loss function of the discriminator can be expressed as:
Discriminator Loss (L_D) = -[log(D(x)) + log(1 - D(G(z)))];
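For reference (this standard form is not written out in the patent text), the two losses above are the per-sample terms of the usual GAN minimax objective
min_G max_D V(D, G) = E_{x~p_data(x)}[log D(x)] + E_{z~p_z(z)}[log(1 - D(G(z)))],
in which the discriminator is trained to maximize V while the generator is trained to minimize it.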
the learning rate schedule in step S26 includes training processes that may attempt to optimize the generated model using a learning rate decay strategy (e.g., reducing the learning rate on a schedule) or dynamically adjusting the learning rate (e.g., based on guidance of the validation set performance); the super-parameters refer to parameters which need to be set manually in a machine learning and deep learning model, and are not parameters obtained through autonomous learning in a training process.
The hyperparameters include: Learning Rate: controls the step size of each parameter update;
Number of Iterations/Epochs: specifies the total number of training rounds or iterations;
Regularization Parameter: controls the complexity of the model to prevent overfitting;
Weight Initialization Method: defines the policy for initializing network weights, such as random initialization or Xavier initialization.
In this embodiment, further, preferably, S30 specifically includes the following steps:
S31: selecting appropriate initial parameter values, including the hyperparameters related to the generative model; determining, using a pre-training module, which hyperparameters need to be adjusted; and choosing a suitable learning rate, batch size, number of iterations, regularization strength and other hyperparameters according to the data, the model structure and the specific task;
S32: searching the hyperparameter space using a grid search or a random search method;
S33: dividing the data set into a training set and a validation set for evaluating the performance of different hyperparameter configurations;
S34: adjusting the learning rate, including optimizing the training process of the generative model through a learning-rate decay strategy or by dynamically adjusting the learning rate;
S35: preventing overfitting by controlling the complexity of the model through regularization.
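An illustrative sketch of the grid search and train/validation split in steps S31 to S33; the particular hyperparameter grid, the validation fraction and the train_and_score callback are assumptions, not values given by the patent.

```python
import itertools
import random

# Hypothetical hyperparameter grid (S31); the patent names the knobs but not the values.
GRID = {
    "learning_rate": [1e-4, 2e-4, 5e-4],
    "batch_size": [32, 64],
    "num_epochs": [50, 100],
    "weight_decay": [0.0, 1e-5],   # regularization strength
}

def split_dataset(dataset, val_fraction=0.2):
    """S33: divide the data set into a training set and a validation set."""
    data = list(dataset)
    random.shuffle(data)
    n_val = int(len(data) * val_fraction)
    return data[n_val:], data[:n_val]

def grid_search(dataset, train_and_score):
    """S32: try every configuration and keep the one with the best validation score.

    `train_and_score(train_set, val_set, config)` is an assumed callback that trains
    the generative model with `config` and returns a validation score (higher is better).
    """
    train_set, val_set = split_dataset(dataset)
    best_config, best_score = None, float("-inf")
    for values in itertools.product(*GRID.values()):
        config = dict(zip(GRID.keys(), values))
        score = train_and_score(train_set, val_set, config)
        if score > best_score:
            best_config, best_score = config, score
    return best_config, best_score
```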
In this embodiment, preferably, S40 includes: after training is completed, generating new painting images through the generator by using the generator network and input noise vectors; diversified images are generated by sampling random noise vectors.
Preferably, the noise vector is obtained by randomly sampling a noise vector from the noise vector space and is used as input.
Preferably, S40 further comprises: inputting the noise vector into the generator model, the generator attempting, through back-propagation, to generate realistic images that confuse the discriminator; the discriminator classifying the generated image against the real image and judging authenticity; and the game between the generator and the discriminator being gradually optimized, finally yielding a generator capable of generating the desired images.
Generating a new painting image using the generator network and an input noise vector comprises the following steps:
Step A), defining the generator network: a generator network is created that accepts a noise vector as input and outputs a synthetic image. The generator is typically composed of a series of convolution layers, deconvolution layers, batch normalization layers and the like, which gradually convert the noise vector into a realistic image;
Step B), generating a noise vector: a noise vector is randomly generated, typically following a uniform or Gaussian distribution; this noise vector is supplied as input to the generator network;
Step C), forward propagation: the noise vector is set as the input to the generator network and propagated forward through the network; the generator converts the noise vector into a synthetic image;
Step D), post-processing: appropriate post-processing operations, such as adjusting brightness, contrast and size, are applied to the generated image according to the specific application and needs;
Preferably, a step E) is also included, in which the generated image may optionally be further optimized or enhanced, for example by applying filtering, color enhancement or added detail.
By repeating the above steps, multiple painting images of different styles and contents can be generated.
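A minimal sketch of steps B to D using the assumed PyTorch generator defined earlier; the [0, 255] rescale shown as post-processing is only one possibility, and the patent leaves the exact operations open.

```python
import torch

@torch.no_grad()
def generate_images(generator, num_images=8, latent_dim=LATENT_DIM):
    generator.eval()
    z = torch.randn(num_images, latent_dim)   # Step B: sample noise from a Gaussian
    images = generator(z)                     # Step C: forward propagation, outputs in [-1, 1]
    images = (images + 1.0) * 127.5           # Step D: rescale to [0, 255] for display
    return images.clamp(0, 255).to(torch.uint8)

samples = generate_images(G)   # assumes the trained generator G from the sketches above
```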
In this embodiment, further preferably, the following step is included after step S40:
S50: evaluating and adjusting the generated image, and adjusting the model or parameters as needed to obtain a generation result that better matches expectations.
Still further, preferably, step S50 includes: quantifying the quality of the generated image using an evaluation index; the evaluation index comprises a structural similarity index (SSIM), a peak signal-to-noise ratio (PSNR) or a perceptual mean squared error.
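One possible way to compute such indices is sketched below; the patent names the metrics but not a library, so scikit-image is assumed here, and a plain mean squared error stands in for the perceptual variant.

```python
import numpy as np
from skimage.metrics import peak_signal_noise_ratio, structural_similarity

def evaluate_generated(reference, generated):
    """Score a generated image against a reference image (both uint8 RGB arrays)."""
    ssim = structural_similarity(reference, generated, channel_axis=-1)
    psnr = peak_signal_noise_ratio(reference, generated)
    mse = float(np.mean((reference.astype(np.float64) - generated.astype(np.float64)) ** 2))
    return {"ssim": ssim, "psnr": psnr, "mse": mse}
```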
As described in step S40, after the display terminal obtains the drawing image generated by the artificial intelligence model based on each drawing element, the generated drawing image can be displayed on the display screen of the display terminal for people to watch.
The display terminal comprises a processor 10, an operating system 21, a computer program 22, a database 23, a memory 30, a communication interface 40, an input device 50; wherein the operating system 21, the computer program 22, and the database 23 constitute a storage medium; the input device 50 inputs drawing information to the processor 10; the processor 10 controls the storage medium and memory 30 and the communication interface 40, respectively; the communication interface 40 may be externally connected to a device to output imaging information.
The user may input the imaging style and imaging conditions of the object to be imaged on the input device 50 of the terminal; when the input is completed, the terminal forwards the imaging style and imaging conditions of the object to be imaged to the server through the communication interface 40, and the server acquires them. The imaging style can be an expression form such as Chinese painting, watercolor, sketch, oil painting, woodcut or cartoon; the imaging conditions can be attribute information of the object such as hair length, presence or absence of bangs, height or build; and the object can be a person, an animal, or the like.
The present invention also proposes a computer-readable storage medium comprising an artificial intelligence painting-based image generation program which, when executed by a processor, implements the steps of the artificial intelligence painting-based image generation method as described in the above embodiments. It is understood that the computer readable storage medium in this embodiment may be a volatile readable storage medium or a nonvolatile readable storage medium.
The invention provides an image generation method based on artificial intelligence painting, a display terminal and a computer readable storage medium, aiming at improving the accuracy of a display device for acquiring painting instructions which a user wants to express in the current environment, so that an artificial intelligence model can generate painting images which are more in line with the mind of the user for display by the display device.
The foregoing description is only of the preferred embodiments of the present invention and is not intended to limit the scope of the invention, and all equivalent structural changes made by the description of the present invention and the accompanying drawings or direct/indirect application in other related technical fields are included in the scope of the invention.
Claims (10)
1. An artificial-intelligence-based painting image generation method, characterized by comprising the following steps:
S10: collecting painting data and extracting painting elements;
S20: selecting a suitable model and training the model;
S30: performing parameter setting and optimally adjusting the parameters;
S40: feeding input to the trained model to generate a new image.
2. The artificial-intelligence-based painting image generation method according to claim 1, wherein the step S20 specifically comprises the following steps:
S21: determining an appropriate training data set, the data set comprising real image samples associated with the target task;
S22: selecting a suitable generative model according to the specific image generation task and requirements, wherein the generative model includes a generative adversarial network (GAN);
S23: based on the selected model, building a corresponding neural network architecture including a generator network and a discriminator network; the generator network includes a generator, and the discriminator network includes a discriminator;
S24: initializing model parameters;
S25: defining an appropriate loss function according to the selected model and task type;
S26: adjusting the learning rate and the optimizer hyperparameters according to the complexity of the model and the scale of the data set;
S27: training the model by inputting real image samples, together with generation noise or conditional inputs, into the model.
3. The artificial-intelligence-based painting image generation method according to claim 1, wherein S30 specifically comprises the following steps:
S31: selecting appropriate initial parameter values, including the hyperparameters related to the generative model; determining, using a pre-training module, which hyperparameters need to be adjusted; and choosing a suitable learning rate, batch size, number of iterations, regularization strength and other hyperparameters according to the data, the model structure and the specific task;
S32: searching the hyperparameter space using a grid search or a random search method;
S33: dividing the data set into a training set and a validation set for evaluating the performance of different hyperparameter configurations;
S34: adjusting the learning rate, including optimizing the training process of the generative model through a learning-rate decay strategy or by dynamically adjusting the learning rate;
S35: preventing overfitting by controlling the complexity of the model through regularization.
4. The artificial-intelligence-based painting image generation method according to claim 2, wherein S40 comprises: after training is completed, generating new painting images through the generator by using the generator network and input noise vectors; diversified images are generated by sampling random noise vectors.
5. The artificial-intelligence-based painting image generation method according to claim 4, wherein the noise vector is obtained by randomly sampling a noise vector from the noise vector space and is used as input.
6. The artificial-intelligence-based painting image generation method according to claim 5, wherein S40 further comprises: inputting the noise vector into the generator model, the generator attempting, through back-propagation, to generate realistic images that confuse the discriminator; the discriminator classifying the generated image against the real image and judging authenticity; and the game between the generator and the discriminator being gradually optimized, finally yielding a generator capable of generating the desired images.
7. The artificial-intelligence-based painting image generation method according to claim 1, further comprising the following step after S40:
S50: evaluating and adjusting the generated image, and adjusting the model or parameters as needed to obtain a generation result that better matches expectations.
8. The artificial-intelligence-based painting image generation method according to claim 7, wherein S50 comprises: quantifying the quality of the generated image using an evaluation index; the evaluation index comprises a structural similarity index (SSIM), a peak signal-to-noise ratio (PSNR) or a perceptual mean squared error.
9. A display terminal comprising a memory, a processor, and an artificial intelligence painting-based image generation program stored on the memory and executable on the processor, the artificial intelligence painting-based image generation program when executed by the processor implementing the steps of the artificial intelligence painting-based image generation method according to any one of claims 1 to 8.
10. A computer-readable storage medium, on which an artificial intelligence drawing-based image generation program is stored, which when executed by a processor, implements the steps of the artificial intelligence drawing-based image generation method according to any one of claims 1 to 8.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202310866634.4A CN116843785A (en) | 2023-07-13 | 2023-07-13 | Artificial intelligence-based painting image generation method, display terminal and storage medium |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202310866634.4A CN116843785A (en) | 2023-07-13 | 2023-07-13 | Artificial intelligence-based painting image generation method, display terminal and storage medium |
Publications (1)
Publication Number | Publication Date |
---|---|
CN116843785A true CN116843785A (en) | 2023-10-03 |
Family
ID=88163288
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202310866634.4A Pending CN116843785A (en) | 2023-07-13 | 2023-07-13 | Artificial intelligence-based painting image generation method, display terminal and storage medium |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN116843785A (en) |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN117333592A (en) * | 2023-12-01 | 2024-01-02 | 北京妙音数科股份有限公司 | AI digital population type animation drawing system based on big data fusion training model |
CN117333592B (en) * | 2023-12-01 | 2024-03-08 | 北京妙音数科股份有限公司 | AI digital population type animation drawing system based on big data fusion training model |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | |
SE01 | Entry into force of request for substantive examination | |