CN111553335A - Image generation method and apparatus, and storage medium - Google Patents
- Publication number
- CN111553335A (application CN202010329281.0A)
- Authority
- CN
- China
- Prior art keywords
- image
- candidate
- target
- images
- current
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/20—Image preprocessing
- G06V10/22—Image preprocessing by selection of a specific region containing or referencing a pattern; Locating or processing of specific regions to guide the detection or recognition
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/50—Information retrieval; Database structures therefor; File system structures therefor of still image data
- G06F16/58—Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
- G06F16/583—Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content
- G06F16/5846—Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content using extracted text
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q30/00—Commerce
- G06Q30/02—Marketing; Price estimation or determination; Fundraising
- G06Q30/0241—Advertisements
- G06Q30/0276—Advertisement creation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T11/00—2D [Two Dimensional] image generation
- G06T11/60—Editing figures and text; Combining figures or text
Abstract
The invention discloses an image generation method and apparatus, and a storage medium. The method comprises the following steps: acquiring a plurality of candidate images associated with a target image to be generated, wherein the plurality of candidate images are images collected from a data sharing platform that carry text information for promoting a target theme; sequentially performing feature extraction on each of the plurality of candidate images to obtain a plurality of groups of candidate feature pairs, wherein each group of candidate feature pairs comprises a text feature extracted from one candidate image and associated with the target subject, and an image feature extracted from the same candidate image; sequentially inputting the plurality of groups of candidate feature pairs into an image generation model; and acquiring a target image with the target theme according to the output result of the image generation model. The invention solves the technical problem of low image generation efficiency caused by manually designing images.
Description
Technical Field
The present invention relates to the field of computers, and in particular, to an image generation method and apparatus, and a storage medium.
Background
When designing an image such as a poster, it is common for a professional to design the image manually. For example, an operator issues a requirement and asks a designer to produce the poster. However, such manual design of promotional images is time-consuming and limits the number of images that can be produced, resulting in low image generation efficiency.
In view of the above problems, no effective solution has been proposed.
Disclosure of Invention
The embodiment of the invention provides an image generation method and device and a storage medium, which at least solve the technical problem of low image generation efficiency caused by manual image design.
According to an aspect of an embodiment of the present invention, there is provided an image generation method including: acquiring a plurality of candidate images associated with a target image to be generated, wherein the plurality of candidate images are images collected from a data sharing platform that carry text information for promoting a target theme; sequentially performing feature extraction on each of the candidate images to obtain a plurality of groups of candidate feature pairs, wherein each group of candidate feature pairs comprises a text feature extracted from one candidate image and associated with the target subject, and an image feature extracted from the same candidate image; sequentially inputting the plurality of groups of candidate feature pairs into an image generation model, wherein the image generation model is a neural network model obtained by training on a plurality of sample data and used for generating images that promote a specified theme, and the image generation model comprises a generation sub-network model for generating an image and an identification sub-network model for identifying whether the generated image is of the specified theme; and acquiring the target image with the target theme according to the output result of the image generation model.
As an optional implementation manner, after sequentially inputting the plurality of sets of candidate feature pairs into the image generation model, the method further includes: in the case where the candidate feature pair of the ith candidate image is currently input, generating, through the generation sub-network model, an object image matching the target subject by using the text features extracted from the ith candidate image, where i is an integer greater than 1 and less than or equal to N, and N is the number of candidate images; when the identification sub-network model acquires the object image, identifying whether the object image and the currently input ith candidate image are of the same subject; and determining the object image as the target image in the case where the object image and the currently input ith candidate image are of the same subject.
As an optional implementation, sequentially performing feature extraction on each candidate image of the plurality of candidate images to obtain a plurality of sets of candidate feature pairs includes: repeatedly executing the following steps until the plurality of candidate images are traversed: acquiring a current candidate image; identifying the text information associated with the target subject in the current candidate image; performing word segmentation and mapping processing on the text information to obtain a plurality of current word vectors corresponding to the current candidate image, and performing aggregation processing on the plurality of current word vectors to obtain a current sentence vector; determining the current sentence vector as the text feature of the current candidate image; identifying image information in the current candidate image, wherein the image information includes position information and color information of each pixel point; and determining the image feature based on the image information.
As an optional implementation manner, before the acquiring a plurality of candidate images associated with the target image to be generated, the method further includes: acquiring a plurality of sample data, wherein the plurality of sample data comprise first type sample data and second type sample data, the first type sample data are image data with the same theme, and the second type sample data are image data with different themes; and training the initialized image generation model by using the plurality of sample data to obtain the image generation model.
As an optional implementation, the acquiring multiple candidate images associated with the target image to be generated includes: acquiring a search request, wherein the search request carries the keyword of the target theme; and responding to the search request, and searching the plurality of candidate images with the target theme from the data sharing platform.
As an alternative embodiment, the image generation model is constructed based on a stackGAN model.
According to another aspect of the embodiments of the present invention, there is also provided an image generating apparatus including: the system comprises a first acquisition unit, a second acquisition unit and a third acquisition unit, wherein the first acquisition unit is used for acquiring a plurality of candidate images related to a target image to be generated, and the candidate images are images which are collected from a data sharing platform and carry text information for popularizing a target theme; an extracting unit, configured to perform feature extraction on each candidate image of the multiple candidate images in sequence to obtain multiple sets of candidate feature pairs, where each set of candidate feature pair includes a text feature extracted from one candidate image and associated with the target subject, and an image feature extracted from the one candidate image; an input unit, configured to sequentially input the plurality of sets of candidate feature pairs into an image generation model, where the image generation model is a neural network model obtained by training using a plurality of sample data and used for generating an image for promoting a specified subject, and the image generation model includes a generation sub-network model used for generating an image and an identification sub-network model used for identifying whether the generated image is the specified subject; a second obtaining unit, configured to obtain the target image with the target theme according to an output result of the image generation model.
As an optional implementation, the method further includes: a generation unit, configured to, after the plurality of sets of candidate feature pairs are sequentially input into the image generation model and in the case where the candidate feature pair of an ith candidate image is currently input, generate an object image matching the target topic through the generation sub-network model by using the text features extracted from the ith candidate image, where i is an integer greater than 1 and less than or equal to N, and N is the number of candidate images; an identification unit, configured to identify whether the object image and the currently input ith candidate image are of the same subject when the identification sub-network model acquires the object image; and a determining unit, configured to determine the object image as the target image if the object image and the currently input ith candidate image are of the same subject.
As an optional implementation, the input unit includes: a processing module, configured to repeatedly execute the following steps until the plurality of candidate images are traversed: acquiring a current candidate image; identifying the text information associated with the target subject in the current candidate image; performing word segmentation and mapping processing on the text information to obtain a plurality of current word vectors corresponding to the current candidate image, and performing aggregation processing on the plurality of current word vectors to obtain a current sentence vector; determining the current sentence vector as the text feature of the current candidate image; identifying image information in the current candidate image, wherein the image information includes position information and color information of each pixel point; and determining the image feature based on the image information.
According to a further aspect of the embodiments of the present invention, there is also provided a computer-readable storage medium having a computer program stored therein, wherein the computer program is configured to execute the above-mentioned image generation method when running.
In the embodiment of the present invention, after a plurality of candidate images with the same target subject as a target image to be generated are acquired, feature extraction is performed on each of the candidate images to obtain a plurality of sets of candidate feature pairs, and the sets of candidate feature pairs are sequentially input into an image generation model to output the target image with the target subject. In this way, promotional images can be generated automatically, without professionals manually designing large numbers of them, which saves image generation time and improves image generation efficiency, solving the technical problem of low image generation efficiency caused by manually designing images.
Drawings
The accompanying drawings, which are included to provide a further understanding of the invention and are incorporated in and constitute a part of this application, illustrate embodiment(s) of the invention and together with the description serve to explain the invention without limiting the invention. In the drawings:
FIG. 1 is a flow diagram of an alternative image generation method according to an embodiment of the invention;
FIG. 2 is a schematic diagram of an alternative image generation method according to an embodiment of the invention;
FIG. 3 is a schematic diagram of an alternative image generation method according to an embodiment of the invention;
FIG. 4 is a flow diagram of another alternative image generation method according to an embodiment of the present invention;
fig. 5 is a schematic structural diagram of an alternative image generating apparatus according to an embodiment of the present invention.
Detailed Description
In order to make the technical solutions of the present invention better understood, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
It should be noted that the terms "first," "second," and the like in the description and claims of the present invention and in the drawings described above are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used is interchangeable under appropriate circumstances such that the embodiments of the invention described herein are capable of operation in sequences other than those illustrated or described herein. Furthermore, the terms "comprises," "comprising," and "having," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed, but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.
According to an aspect of the embodiments of the present invention, there is provided an image generating method, optionally, as an optional implementation manner, as shown in fig. 1, the image generating method includes:
s102, acquiring a plurality of candidate images associated with a target image to be generated, wherein the plurality of candidate images are images which are collected from a data sharing platform and carry text information for popularizing a target theme;
s104, sequentially carrying out feature extraction on each candidate image of the plurality of candidate images to obtain a plurality of groups of candidate feature pairs, wherein each group of candidate feature pairs comprises character features which are extracted from one candidate image and are associated with a target subject and image features extracted from one candidate image;
s106, sequentially inputting a plurality of groups of candidate feature pairs into an image generation model, wherein the image generation model is a neural network model which is obtained by training a plurality of sample data and is used for generating an image for popularizing a specified subject, and the image generation model comprises a generation sub-network model for generating the image and an identification sub-network model for identifying whether the generated image is the specified subject;
and S108, acquiring a target image with a target theme according to the output result of the image generation model.
Alternatively, in the present embodiment, the above image generation method may be applied, but is not limited, to generating images for promotion, such as advertisement images, poster images, and publicity images. That is to say, in the present embodiment, after a plurality of candidate images with the same target subject as the target image to be generated are acquired, feature extraction is performed on each of them to obtain a plurality of sets of candidate feature pairs, which are sequentially input into the image generation model to output the target image with the target subject. In this way, promotional images can be generated automatically, without professionals manually designing large numbers of them, which saves image generation time and improves image generation efficiency, solving the problem of low image generation efficiency in the related art. The above is only an example, and this embodiment is not limited thereto.
Optionally, in this embodiment, the image generation model may be, but is not limited to being, constructed based on a StackGAN model. The StackGAN model can be, but is not limited to, a tree structure of multiple generators and discriminators, whose different branches produce multi-scale images corresponding to the same scene.
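The tree of generators described above can be illustrated with a toy sketch. This is a stand-in under stated assumptions, with no actual neural networks: each "branch" simply upsamples the previous stage's output, mimicking how StackGAN's branches yield the same scene at increasing scales.

```python
import numpy as np

def generate_multiscale(seed, stages=3, base=4):
    # Toy stand-in for StackGAN's tree of generators: the first branch
    # produces a coarse image, and each subsequent branch outputs the
    # same scene at twice the scale (here via nearest-neighbour repeat,
    # not a learned generator).
    rng = np.random.default_rng(seed)
    img = rng.random((base, base))  # coarsest stage-I output
    outputs = [img]
    for _ in range(stages - 1):
        img = img.repeat(2, axis=0).repeat(2, axis=1)  # next branch, 2x scale
        outputs.append(img)
    return outputs
```

In the real model, each branch is a learned generator with its own discriminator; only the coarse-to-fine multi-scale structure is mirrored here.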
Optionally, in this embodiment, after the image generation model identifies the text information associated with the target topic in a candidate image, it performs word segmentation and mapping on the text information to obtain word vectors corresponding to the candidate image, and aggregates the word vectors into a sentence vector; the sentence vector is then determined as the text feature of the candidate image. After identifying the image information in the candidate image, the image generation model may determine image features from that image information.
For example, the image generation model described above may be as shown in fig. 2. After acquiring a plurality of candidate images, extracting text information in the candidate images, as shown by the bold solid line box in fig. 2. Then, the words are subjected to word segmentation, mapping and aggregation to obtain sentence vectors as character features, as shown by the bold dashed line boxes in fig. 2. Then the character features are input into an image generation model. Then, a candidate target image generated by prediction is obtained, as shown by a bold dashed circle in fig. 2. Inputting the image characteristics determined according to the image information of the candidate image and the candidate target image into the model again, wherein the image characteristics determined according to the image information of the candidate image are shown as a bold solid line circle in fig. 2. And further analyzing to obtain a target image of the target theme.
Optionally, in this embodiment, acquiring multiple candidate images associated with the target image to be generated includes: acquiring a search request, wherein the search request carries a keyword of a target theme; and responding to the search request, and finding a plurality of candidate images with the target theme from the data sharing platform.
It should be noted that the data sharing platform may be, but is not limited to, an application platform such as a shared-space application or an instant messaging application, for example, a WeChat official account. The plurality of candidate images may be, but are not limited to, images published on such an official account. For example, as shown in fig. 3, a certain WeChat official account includes a poster with text information published on December 24, 2019: "[Points redemption] A red you must not miss. Red can be bright and eye-catching or low-key and soft; especially in the Christmas season, an outfit needs a touch of red."
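The keyword-based retrieval of candidate images from such a platform can be sketched as a simple filter. The dictionary shape of the image records is a hypothetical illustration, not part of the patent:

```python
def find_candidates(images, keyword):
    # Respond to a search request carrying the target theme's keyword by
    # returning the shared-platform images whose promotional text mentions it.
    return [img for img in images if keyword in img.get("text", "")]
```

A real implementation would query the platform's search interface rather than filter an in-memory list.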
According to the embodiment provided by the application, after a plurality of candidate images with the same theme as the target image to be generated are obtained, feature extraction is performed on each candidate image to obtain a plurality of groups of candidate feature pairs, and the candidate feature pairs are sequentially input into the image generation model to output the target image with the target theme. In this way, promotional images can be generated automatically, without professionals manually designing large numbers of them, which saves image generation time and improves image generation efficiency, solving the problem of low image generation efficiency in the related art.
As an optional scheme, after sequentially inputting the plurality of sets of candidate feature pairs into the image generation model, the method further includes:
s1, under the condition that the current input is a candidate feature pair of the ith candidate image, generating an object image matched with the target subject by using the character features extracted from the ith candidate image and generating a sub-network model, wherein i is an integer which is larger than 1 and smaller than the same, and N is the number of a plurality of candidate images;
s2, under the condition that the identification sub-network model acquires the object image, identifying whether the object image and the current input ith candidate image are the same subject;
s3, in the case where the ith candidate image input by the object image is of the same subject, the object image is determined as the target image.
Optionally, in this embodiment, before acquiring a plurality of candidate images associated with a target image to be generated, the method further includes: the method comprises the steps of obtaining a plurality of sample data, wherein the plurality of sample data comprise first type sample data and second type sample data, the first type sample data are image data with the same theme, and the second type sample data are image data with different themes; and training the initialized image generation model by using a plurality of sample data to obtain the image generation model.
Optionally, in this embodiment, sequentially performing feature extraction on each candidate image of the multiple candidate images to obtain multiple sets of candidate feature pairs includes:
repeatedly executing the following steps until a plurality of candidate images are traversed:
acquiring a current candidate image;
identifying text information associated with the target subject in the current candidate image; performing word segmentation and mapping processing on the character information to obtain a plurality of current word vectors corresponding to the current candidate image, and performing aggregation processing on the plurality of current word vectors to obtain a group of current sentence vectors; determining the current sentence vector as the character feature of the current candidate image;
identifying image information in the current candidate image, wherein the image information comprises position information and color information of each pixel point; image features are determined from the image information.
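The image-feature half of each candidate pair, built from per-pixel position and color information, might be assembled as follows. This is a sketch; the mapping input and the flat matrix layout are assumptions, not the patent's exact encoding:

```python
import numpy as np

def image_feature(pixels):
    # pixels: mapping (x, y) -> (r, g, b); one row per pixel point,
    # concatenating position information and color information.
    rows = [[x, y, r, g, b] for (x, y), (r, g, b) in sorted(pixels.items())]
    return np.array(rows, dtype=float)
```

Sorting by position makes the row order deterministic regardless of how the pixel mapping was built.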
The following description is specifically made in conjunction with S402 to S410 shown in fig. 4:
First, a search request is acquired, wherein the search request carries a keyword of the target subject (assumed here to be a keyword of a target product). In response to the search request, a plurality of candidate images with the target subject are found on a data sharing platform (such as a WeChat official account). Here, the plurality of candidate images may be historical posters related to the target product, or posters of competing products, as in step S402.
Using an image recognition technique (such as Optical Character Recognition, OCR for short), the text information associated with the target product's theme in each candidate image is recognized (step S404). For example, a title is extracted (such as the title "[Points redemption] A red you must not miss" shown in fig. 3), or body text is extracted from an article (such as the text "Red can be bright and eye-catching or low-key and soft; especially in the Christmas season, an outfit needs a touch of red" shown in fig. 3).
Then, the extracted text information is segmented into words (e.g., using jieba, SnowNLP, FoolNLTK, ansj, or a similar word segmentation tool), as in step S406.
Further, using a word embedding algorithm, each word is mapped into a dense vector of fixed dimension (e.g., 200 dimensions), resulting in a plurality of word vectors. A word embedding maps words in the text space to a numerical vector space. After all the word vectors are obtained, they are aggregated into a sentence vector, which serves as the text feature of the candidate image. Here, each dimension of the sentence vector can be, but is not limited to, the average of the corresponding dimension across the word vectors, as in step S408.
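The aggregation by averaging described in step S408 can be sketched directly, using a tiny hand-made embedding table rather than a trained one:

```python
import numpy as np

def sentence_vector(words, embedding):
    # Each dimension of the sentence vector is the average of the
    # corresponding dimension across the word vectors; words missing
    # from the embedding table are skipped.
    vecs = [embedding[w] for w in words if w in embedding]
    return np.mean(vecs, axis=0)
```

With a trained 200-dimensional embedding, the same averaging yields a 200-dimensional sentence vector per candidate image.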
Then, the text features extracted from the candidate images, together with the extracted image features, are input into a trained image generation model constructed based on StackGAN to obtain the corresponding target image, as in step S410, thereby assisting designers to quickly generate posters for promotion and publicity.
According to the embodiment provided by the application, the candidate images are processed by the trained image generation model constructed based on StackGAN, so that target images are automatically generated from the candidate images without manual design by professionals, which improves image generation efficiency and, in turn, the efficiency of promotion based on the target images.
It should be noted that, for simplicity of description, the above-mentioned method embodiments are described as a series of acts or combination of acts, but those skilled in the art will recognize that the present invention is not limited by the order of acts, as some steps may occur in other orders or concurrently in accordance with the invention. Further, those skilled in the art should also appreciate that the embodiments described in the specification are preferred embodiments and that the acts and modules referred to are not necessarily required by the invention.
According to another aspect of the embodiment of the present invention, there is also provided an image generation apparatus for implementing the above-described image generation method. As shown in fig. 5, the apparatus includes:
1) a first obtaining unit 502, configured to obtain multiple candidate images associated with a target image to be generated, where the multiple candidate images are images collected from a data sharing platform and carrying text information for promoting a target theme;
2) an extracting unit 504, configured to perform feature extraction on each candidate image of the multiple candidate images in sequence to obtain multiple sets of candidate feature pairs, where each set of candidate feature pairs includes a text feature, extracted from one candidate image and associated with the target subject, and an image feature extracted from the same candidate image;
3) an input unit 506, configured to sequentially input the multiple sets of candidate feature pairs into an image generation model, where the image generation model is a neural network model obtained by training on multiple sample data and used to generate an image for promoting a specified subject, and the image generation model includes a generation sub-network model used to generate an image and an identification sub-network model used to identify whether the generated image matches the specified subject;
4) a second obtaining unit 508, configured to obtain a target image with a target topic according to an output result of the image generation model.
For implementation details of this embodiment, reference may be made to the above method embodiments, but the implementation is not limited thereto.
As an optional scheme, the apparatus further includes:
1) a generation unit, configured to, after the multiple sets of candidate feature pairs are sequentially input into the image generation model, generate, through the generation sub-network model and by using the character features extracted from the ith candidate image, an object image matching the target theme, in the case that the currently input candidate feature pair is that of the ith candidate image, where i is an integer greater than 1 and less than or equal to N, and N is the number of the candidate images;
2) the identification unit is used for identifying whether the object image and the currently input ith candidate image are the same subject or not under the condition that the identification sub-network model acquires the object image;
3) a determination unit, configured to determine the object image as the target image in the case that the object image and the currently input ith candidate image are of the same subject.
For implementation details of this embodiment, reference may be made to the above method embodiments, but the implementation is not limited thereto.
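The generate-identify-determine flow of these three units can be sketched as follows, with `generator` and `discriminator` as hypothetical callables standing in for the trained generation and identification sub-network models (assumptions for illustration, not the patented implementation):

```python
def select_target_images(candidate_pairs, generator, discriminator):
    """For each candidate feature pair (text_feature, image_feature),
    the generation sub-network produces an object image; the
    identification sub-network then checks whether the object image
    and the current candidate depict the same subject, and only in
    that case is the object image kept as a target image."""
    targets = []
    for text_feature, image_feature in candidate_pairs:
        object_image = generator(text_feature)            # generation unit
        if discriminator(object_image, image_feature):    # identification unit
            targets.append(object_image)                  # determination unit
    return targets
```

With trained models, `generator` would be the StackGAN generator conditioned on the sentence vector, and `discriminator` the subject-matching classifier; here any pair of callables with those shapes will do.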
As an alternative, the input unit includes:
1) a processing module, configured to repeatedly execute the following steps until the plurality of candidate images have been traversed: acquiring a current candidate image; identifying text information associated with the target subject in the current candidate image; performing word segmentation and mapping processing on the text information to obtain a plurality of current word vectors corresponding to the current candidate image, and performing aggregation processing on the plurality of current word vectors to obtain a group of current sentence vectors; determining the current sentence vector as the character feature of the current candidate image; identifying image information in the current candidate image, where the image information includes the position information and color information of each pixel; and determining the image feature from the image information.
For implementation details of this embodiment, reference may be made to the above method embodiments, but the implementation is not limited thereto.
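The processing module's per-image step can be sketched as follows; whitespace tokenisation and the zero-vector fallback for unknown words are illustrative assumptions, and the (position, color) layout of the image feature is one plausible reading of "position information and color information of each pixel":

```python
import numpy as np

def extract_features(text, pixels, embedding, dim=200):
    """Sketch of the processing module: segment the candidate image's
    text, map each token to a word vector, and average into a sentence
    vector (the character feature); pair each pixel's position with
    its color to form the image feature."""
    tokens = text.split()                                  # word segmentation
    word_vecs = [embedding.get(t, np.zeros(dim)) for t in tokens]
    text_feature = np.mean(word_vecs, axis=0) if word_vecs else np.zeros(dim)

    h, w, _ = pixels.shape                                 # H x W x RGB
    ys, xs = np.mgrid[0:h, 0:w]
    # each row: (x, y, r, g, b) -- position plus color per pixel
    image_feature = np.column_stack(
        [xs.ravel(), ys.ravel(), pixels.reshape(-1, 3)]
    )
    return text_feature, image_feature
```

In practice the text would come from an OCR pass over the candidate image and the tokenizer would be a proper word-segmentation model; both are outside the scope of this sketch.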
According to a further aspect of an embodiment of the present invention, there is also provided a computer-readable storage medium having a computer program stored thereon, wherein the computer program is arranged to perform the steps of any of the above method embodiments when executed.
Optionally, in the present embodiment, the above computer-readable storage medium may be configured to store a computer program for executing the steps of the above method embodiments.
Optionally, in this embodiment, those skilled in the art will understand that all or part of the steps in the methods of the foregoing embodiments may be implemented by a program instructing the hardware associated with a terminal device. The program may be stored in a computer-readable storage medium, and the storage medium may include: a flash disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, an optical disk, and the like.
The above-mentioned serial numbers of the embodiments of the present invention are merely for description and do not represent the merits of the embodiments.
The integrated unit in the above embodiments, if implemented in the form of a software functional unit and sold or used as a separate product, may be stored in the above computer-readable storage medium. Based on such understanding, the technical solution of the present invention may be embodied in the form of a software product, which is stored in a storage medium and includes several instructions for causing one or more computer devices (which may be personal computers, servers, network devices, etc.) to execute all or part of the steps of the method according to the embodiments of the present invention.
In the above embodiments of the present invention, the descriptions of the respective embodiments have respective emphasis, and for parts that are not described in detail in a certain embodiment, reference may be made to related descriptions of other embodiments.
In the several embodiments provided in this application, it should be understood that the disclosed client may be implemented in other manners. The apparatus embodiments described above are merely illustrative; for example, the division into units is merely a division by logical function, and other divisions are possible in actual implementation: multiple units or components may be combined or integrated into another system, or some features may be omitted or not executed. In addition, the mutual couplings, direct couplings, or communication connections shown or discussed may be indirect couplings or communication connections through some interfaces, units, or modules, and may be electrical or in other forms.
The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
In addition, functional units in the embodiments of the present invention may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit. The integrated unit can be realized in a form of hardware, and can also be realized in a form of a software functional unit.
The foregoing is only a preferred embodiment of the present invention. It should be noted that those skilled in the art can make various modifications and refinements without departing from the principle of the present invention, and these modifications and refinements should also be regarded as falling within the protection scope of the present invention.
Claims (10)
1. An image generation method, comprising:
acquiring a plurality of candidate images associated with a target image to be generated, wherein the plurality of candidate images are images which are collected from a data sharing platform and carry text information for popularizing a target theme;
sequentially extracting features of each candidate image of the candidate images to obtain a plurality of groups of candidate feature pairs, wherein each group of candidate feature pairs comprises character features, which are extracted from one candidate image and are associated with the target subject, and image features extracted from the same candidate image;
sequentially inputting the multiple groups of candidate feature pairs into an image generation model, wherein the image generation model is a neural network model which is obtained by training with multiple sample data and is used for generating an image for popularizing a specified theme, and the image generation model comprises a generation sub-network model for generating the image and an identification sub-network model for identifying whether the generated image matches the specified theme;
and acquiring the target image with the target theme according to the output result of the image generation model.
2. The method of claim 1, further comprising, after said sequentially inputting the plurality of sets of candidate feature pairs into an image generation model:
under the condition that the currently input candidate feature pair is that of the ith candidate image, generating an object image matched with the target subject through the generation sub-network model by using the character features extracted from the ith candidate image, wherein i is an integer greater than 1 and less than or equal to N, and N is the number of the candidate images;
under the condition that the identification sub-network model acquires the object image, identifying whether the object image and the currently input ith candidate image are the same subject or not;
determining the object image as the target image in the case that the object image and the currently input ith candidate image are of the same subject.
3. The method of claim 1, wherein the sequentially performing feature extraction on each candidate image of the plurality of candidate images to obtain a plurality of sets of candidate feature pairs comprises:
repeatedly executing the following steps until the plurality of candidate images are traversed:
acquiring a current candidate image;
identifying text information associated with the target subject in the current candidate image; performing word segmentation and mapping processing on the text information to obtain a plurality of current word vectors corresponding to the current candidate image, and performing aggregation processing on the plurality of current word vectors to obtain a group of current sentence vectors; determining the current sentence vector as the character feature of the current candidate image;
identifying image information in the current candidate image, wherein the image information comprises position information and color information of each pixel point; determining the image feature from the image information.
4. The method according to claim 1, wherein before the obtaining of the plurality of candidate images associated with the target image to be generated, further comprising:
acquiring a plurality of sample data, wherein the plurality of sample data comprise first type sample data and second type sample data, the first type sample data are image data with the same theme, and the second type sample data are image data with different themes;
and training the initialized image generation model by using the plurality of sample data to obtain the image generation model.
5. The method according to claim 1, wherein the obtaining a plurality of candidate images associated with a target image to be generated comprises:
acquiring a search request, wherein the search request carries a keyword of the target theme;
and responding to the search request, and finding the candidate images with the target theme from the data sharing platform.
6. The method of any of claims 1 to 5, wherein the image generation model is constructed based on a StackGAN model.
7. An image generation apparatus, comprising:
the system comprises a first acquisition unit, a second acquisition unit and a third acquisition unit, wherein the first acquisition unit is used for acquiring a plurality of candidate images related to a target image to be generated, and the candidate images are images which are collected from a data sharing platform and carry text information for popularizing a target theme;
the extraction unit is used for sequentially carrying out feature extraction on each candidate image of the candidate images to obtain a plurality of groups of candidate feature pairs, wherein each group of candidate feature pairs comprises a character feature which is extracted from one candidate image and is associated with the target subject and an image feature which is extracted from the candidate image;
the input unit is used for sequentially inputting the multiple groups of candidate feature pairs into an image generation model, wherein the image generation model is a neural network model which is obtained by training a plurality of sample data and is used for generating an image for popularizing a specified subject, and the image generation model comprises a generation sub-network model for generating the image and an identification sub-network model for identifying whether the generated image is the specified subject;
and the second acquisition unit is used for acquiring the target image with the target theme according to the output result of the image generation model.
8. The apparatus of claim 7, further comprising:
a generating unit, configured to, after the multiple sets of candidate feature pairs are sequentially input into the image generation model, generate, through the generation sub-network model and by using the character features extracted from the ith candidate image, an object image that matches the target topic, in the case that the currently input candidate feature pair is that of the ith candidate image, where i is an integer greater than 1 and less than or equal to N, and N is the number of the candidate images;
the identification unit is used for identifying whether the object image and the currently input ith candidate image are the same subject or not under the condition that the identification sub-network model acquires the object image;
a determination unit, configured to determine the object image as the target image in the case that the object image and the currently input ith candidate image are of the same subject.
9. The apparatus of claim 7, wherein the input unit comprises:
a processing module, configured to repeatedly perform the following steps until the plurality of candidate images are traversed: acquiring a current candidate image; identifying text information associated with the target subject in the current candidate image; performing word segmentation and mapping processing on the text information to obtain a plurality of current word vectors corresponding to the current candidate image, and performing aggregation processing on the plurality of current word vectors to obtain a group of current sentence vectors; determining the current sentence vector as the character feature of the current candidate image; identifying image information in the current candidate image, wherein the image information comprises position information and color information of each pixel point; determining the image feature from the image information.
10. A computer-readable storage medium comprising a stored program, wherein the program when executed performs the method of any of claims 1 to 6.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010329281.0A CN111553335A (en) | 2020-04-23 | 2020-04-23 | Image generation method and apparatus, and storage medium |
Publications (1)
Publication Number | Publication Date |
---|---|
CN111553335A true CN111553335A (en) | 2020-08-18 |
Family
ID=72005759
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202010329281.0A Pending CN111553335A (en) | 2020-04-23 | 2020-04-23 | Image generation method and apparatus, and storage medium |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN111553335A (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111985565A (en) * | 2020-08-20 | 2020-11-24 | 上海风秩科技有限公司 | Picture analysis method and device, storage medium and electronic equipment |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN107247755A (en) * | 2017-05-27 | 2017-10-13 | 深圳市唯特视科技有限公司 | A kind of personalized image method for generating captions based on context serial memorization network |
CN109271537A (en) * | 2018-08-10 | 2019-01-25 | 北京大学 | A kind of text based on distillation study is to image generating method and system |
CN110163267A (en) * | 2019-05-09 | 2019-08-23 | 厦门美图之家科技有限公司 | A kind of method that image generates the training method of model and generates image |
CN110287484A (en) * | 2019-06-11 | 2019-09-27 | 华东师范大学 | A kind of Chinese language text based on face characteristic describes Face image synthesis method |
Non-Patent Citations (1)
Title |
---|
Han Zhang et al., "StackGAN: Text to Photo-realistic Image Synthesis with Stacked Generative Adversarial Networks" |
Legal Events

Date | Code | Title | Description
---|---|---|---|
| PB01 | Publication | |
| SE01 | Entry into force of request for substantive examination | |
2023-11-20 | TA01 | Transfer of patent application right | Applicant after: SHANGHAI SECOND PICKET NETWORK TECHNOLOGY CO.,LTD., Unit 5B03, 5th Floor, Building 2, No. 277 Longlan Road, Xuhui District, Shanghai, 200000. Applicant before: Shanghai Fengzhi Technology Co.,Ltd., Floors 4, 5 and 6, No. 3, Lane 1473, Zhenguang Road, Putuo District, Shanghai, 200333. |
| RJ01 | Rejection of invention patent application after publication | Application publication date: 2020-08-18 |