CN111553335A - Image generation method and apparatus, and storage medium - Google Patents
- Publication number
- CN111553335A (application CN202010329281.0A)
- Authority
- CN
- China
- Prior art keywords
- image
- candidate
- target
- images
- current
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/20—Image preprocessing
- G06V10/22—Image preprocessing by selection of a specific region containing or referencing a pattern; Locating or processing of specific regions to guide the detection or recognition
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/50—Information retrieval; Database structures therefor; File system structures therefor of still image data
- G06F16/58—Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
- G06F16/583—Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content
- G06F16/5846—Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content using extracted text
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q30/00—Commerce
- G06Q30/02—Marketing; Price estimation or determination; Fundraising
- G06Q30/0241—Advertisements
- G06Q30/0276—Advertisement creation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T11/00—2D [Two Dimensional] image generation
- G06T11/60—Editing figures and text; Combining figures or text
Abstract
The invention discloses an image generation method and apparatus, and a storage medium. The method comprises the following steps: acquiring a plurality of candidate images associated with a target image to be generated, wherein the plurality of candidate images are images collected from a data sharing platform that carry text information for promoting a target theme; sequentially performing feature extraction on each of the plurality of candidate images to obtain a plurality of groups of candidate feature pairs, wherein each group of candidate feature pairs comprises a text feature extracted from one candidate image and associated with the target subject, and an image feature extracted from the same candidate image; sequentially inputting the plurality of groups of candidate feature pairs into an image generation model; and acquiring a target image with the target theme according to the output result of the image generation model. The invention solves the technical problem of low image generation efficiency caused by manually designing images.
Description
Technical Field
The present invention relates to the field of computers, and in particular, to an image generation method and apparatus, and a storage medium.
Background
When designing an image such as a poster, it is common for a professional to design the image manually. For example, an operator issues a requirement and asks a designer to produce the poster. However, such manual design of promotional images is time-consuming and limits the number of images that can be produced, resulting in low image generation efficiency.
In view of the above problems, no effective solution has been proposed.
Disclosure of Invention
The embodiment of the invention provides an image generation method and device and a storage medium, which at least solve the technical problem of low image generation efficiency caused by manual image design.
According to an aspect of an embodiment of the present invention, there is provided an image generation method including: acquiring a plurality of candidate images associated with a target image to be generated, wherein the plurality of candidate images are images collected from a data sharing platform that carry text information for promoting a target theme; sequentially performing feature extraction on each of the candidate images to obtain a plurality of groups of candidate feature pairs, wherein each group of candidate feature pairs comprises a text feature extracted from one candidate image and associated with the target subject, and an image feature extracted from the same candidate image; sequentially inputting the plurality of groups of candidate feature pairs into an image generation model, wherein the image generation model is a neural network model obtained by training on a plurality of sample data and used for generating images that promote a specified theme, and the image generation model comprises a generation sub-network model for generating an image and an identification sub-network model for identifying whether the generated image is of the specified theme; and acquiring the target image with the target theme according to the output result of the image generation model.
As an optional implementation manner, after sequentially inputting the plurality of sets of candidate feature pairs into the image generation model, the method further includes: in the case where the candidate feature pair of the ith candidate image is currently input, generating, through the generation sub-network model, an object image matching the target subject by using the text features extracted from the ith candidate image, where i is an integer greater than 1 and less than or equal to N, and N is the number of candidate images; when the identification sub-network model acquires the object image, identifying whether the object image and the currently input ith candidate image are of the same subject; and determining the object image as the target image in the case where the object image and the currently input ith candidate image are of the same subject.
As an optional implementation, sequentially performing feature extraction on each candidate image of the plurality of candidate images to obtain a plurality of sets of candidate feature pairs includes: repeatedly executing the following steps until the plurality of candidate images are traversed: acquiring a current candidate image; identifying the text information associated with the target subject in the current candidate image; performing word segmentation and mapping processing on the text information to obtain a plurality of current word vectors corresponding to the current candidate image, and performing aggregation processing on the plurality of current word vectors to obtain a current sentence vector; determining the current sentence vector as the text feature of the current candidate image; identifying image information in the current candidate image, wherein the image information includes position information and color information of each pixel point; and determining the image feature based on the image information.
As an optional implementation manner, before the acquiring a plurality of candidate images associated with the target image to be generated, the method further includes: acquiring a plurality of sample data, wherein the plurality of sample data comprise first type sample data and second type sample data, the first type sample data are image data with the same theme, and the second type sample data are image data with different themes; and training the initialized image generation model by using the plurality of sample data to obtain the image generation model.
As an optional implementation, the acquiring multiple candidate images associated with the target image to be generated includes: acquiring a search request, wherein the search request carries the keyword of the target theme; and responding to the search request, and searching the plurality of candidate images with the target theme from the data sharing platform.
As an alternative embodiment, the image generation model is constructed based on a stackGAN model.
According to another aspect of the embodiments of the present invention, there is also provided an image generating apparatus including: the system comprises a first acquisition unit, a second acquisition unit and a third acquisition unit, wherein the first acquisition unit is used for acquiring a plurality of candidate images related to a target image to be generated, and the candidate images are images which are collected from a data sharing platform and carry text information for popularizing a target theme; an extracting unit, configured to perform feature extraction on each candidate image of the multiple candidate images in sequence to obtain multiple sets of candidate feature pairs, where each set of candidate feature pair includes a text feature extracted from one candidate image and associated with the target subject, and an image feature extracted from the one candidate image; an input unit, configured to sequentially input the plurality of sets of candidate feature pairs into an image generation model, where the image generation model is a neural network model obtained by training using a plurality of sample data and used for generating an image for promoting a specified subject, and the image generation model includes a generation sub-network model used for generating an image and an identification sub-network model used for identifying whether the generated image is the specified subject; a second obtaining unit, configured to obtain the target image with the target theme according to an output result of the image generation model.
As an optional implementation, the method further includes: a generation unit, configured to, after the plurality of sets of candidate feature pairs are sequentially input into the image generation model and in the case where the candidate feature pair of an ith candidate image is currently input, generate an object image matching the target topic through the generation sub-network model by using the text features extracted from the ith candidate image, where i is an integer greater than 1 and less than or equal to N, and N is the number of candidate images; an identification unit, configured to identify whether the object image and the currently input ith candidate image are of the same subject when the identification sub-network model acquires the object image; and a determining unit, configured to determine the object image as the target image if the object image and the currently input ith candidate image are of the same subject.
As an optional implementation, the input unit includes: a processing module, configured to repeatedly execute the following steps until the plurality of candidate images are traversed: acquiring a current candidate image; identifying the text information associated with the target subject in the current candidate image; performing word segmentation and mapping processing on the text information to obtain a plurality of current word vectors corresponding to the current candidate image, and performing aggregation processing on the plurality of current word vectors to obtain a current sentence vector; determining the current sentence vector as the text feature of the current candidate image; identifying image information in the current candidate image, wherein the image information includes position information and color information of each pixel point; and determining the image feature based on the image information.
According to a further aspect of the embodiments of the present invention, there is also provided a computer-readable storage medium having a computer program stored therein, wherein the computer program is configured to execute the above-mentioned image generation method when running.
In the embodiment of the present invention, after a plurality of candidate images with the same target subject as a target image to be generated are acquired, feature extraction is performed on each of the candidate images to obtain a plurality of sets of candidate feature pairs, and the sets of candidate feature pairs are sequentially input into an image generation model to output the target image with the target subject. In this way, promotional images can be generated automatically, without professionals manually designing large numbers of them, which saves image generation time and improves image generation efficiency, solving the technical problem of low image generation efficiency caused by manually designing images.
Drawings
The accompanying drawings, which are included to provide a further understanding of the invention and are incorporated in and constitute a part of this application, illustrate embodiment(s) of the invention and together with the description serve to explain the invention without limiting the invention. In the drawings:
FIG. 1 is a flow diagram of an alternative image generation method according to an embodiment of the invention;
FIG. 2 is a schematic diagram of an alternative image generation method according to an embodiment of the invention;
FIG. 3 is a schematic diagram of an alternative image generation method according to an embodiment of the invention;
FIG. 4 is a flow diagram of another alternative image generation method according to an embodiment of the present invention;
fig. 5 is a schematic structural diagram of an alternative image generating apparatus according to an embodiment of the present invention.
Detailed Description
In order to make the technical solutions of the present invention better understood, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
It should be noted that the terms "first," "second," and the like in the description and claims of the present invention and in the drawings described above are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used is interchangeable under appropriate circumstances such that the embodiments of the invention described herein are capable of operation in sequences other than those illustrated or described herein. Furthermore, the terms "comprises," "comprising," and "having," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed, but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.
According to an aspect of the embodiments of the present invention, there is provided an image generating method, optionally, as an optional implementation manner, as shown in fig. 1, the image generating method includes:
s102, acquiring a plurality of candidate images associated with a target image to be generated, wherein the plurality of candidate images are images which are collected from a data sharing platform and carry text information for popularizing a target theme;
s104, sequentially carrying out feature extraction on each candidate image of the plurality of candidate images to obtain a plurality of groups of candidate feature pairs, wherein each group of candidate feature pairs comprises character features which are extracted from one candidate image and are associated with a target subject and image features extracted from one candidate image;
s106, sequentially inputting a plurality of groups of candidate feature pairs into an image generation model, wherein the image generation model is a neural network model which is obtained by training a plurality of sample data and is used for generating an image for popularizing a specified subject, and the image generation model comprises a generation sub-network model for generating the image and an identification sub-network model for identifying whether the generated image is the specified subject;
and S108, acquiring a target image with a target theme according to the output result of the image generation model.
Alternatively, in the present embodiment, the above image generation method may be applied, but is not limited, to generating images for promotion, such as advertisement images, poster images, and publicity images. That is to say, in the present embodiment, after a plurality of candidate images with the same target subject as the target image to be generated are acquired, feature extraction is performed on each of them to obtain a plurality of sets of candidate feature pairs, which are sequentially input into the image generation model to output the target image with the target subject. In this way, promotional images can be generated automatically, without professionals manually designing large numbers of them, which saves image generation time and improves image generation efficiency, solving the problem of low image generation efficiency in the related art. The above is only an example, and this embodiment is not limited thereto.
Optionally, in this embodiment, the image generation model may be, but is not limited to being, constructed based on a StackGAN model. The StackGAN model can be, but is not limited to, a tree structure of multiple generators and discriminators, whose different branches produce multi-scale images corresponding to the same scene.
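The tree of generators described above can be illustrated with a toy sketch. This is a stand-in under stated assumptions, with no actual neural networks: each "branch" simply upsamples the previous stage's output, mimicking how StackGAN's branches yield the same scene at increasing scales.

```python
import numpy as np

def generate_multiscale(seed, stages=3, base=4):
    # Toy stand-in for StackGAN's tree of generators: the first branch
    # produces a coarse image, and each subsequent branch outputs the
    # same scene at twice the scale (here via nearest-neighbour repeat,
    # not a learned generator).
    rng = np.random.default_rng(seed)
    img = rng.random((base, base))  # coarsest stage-I output
    outputs = [img]
    for _ in range(stages - 1):
        img = img.repeat(2, axis=0).repeat(2, axis=1)  # next branch, 2x scale
        outputs.append(img)
    return outputs
```

In the real model, each branch is a learned generator with its own discriminator; only the coarse-to-fine multi-scale structure is mirrored here.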
Optionally, in this embodiment, after the image generation model identifies the text information associated with the target topic in a candidate image, it performs word segmentation and mapping on the text information to obtain word vectors corresponding to the candidate image, and aggregates the word vectors into a sentence vector; the sentence vector is then determined as the text feature of the candidate image. After identifying the image information in the candidate image, the image generation model may determine image features from that image information.
For example, the image generation model described above may be as shown in fig. 2. After acquiring a plurality of candidate images, extracting text information in the candidate images, as shown by the bold solid line box in fig. 2. Then, the words are subjected to word segmentation, mapping and aggregation to obtain sentence vectors as character features, as shown by the bold dashed line boxes in fig. 2. Then the character features are input into an image generation model. Then, a candidate target image generated by prediction is obtained, as shown by a bold dashed circle in fig. 2. Inputting the image characteristics determined according to the image information of the candidate image and the candidate target image into the model again, wherein the image characteristics determined according to the image information of the candidate image are shown as a bold solid line circle in fig. 2. And further analyzing to obtain a target image of the target theme.
Optionally, in this embodiment, acquiring multiple candidate images associated with the target image to be generated includes: acquiring a search request, wherein the search request carries a keyword of a target theme; and responding to the search request, and finding a plurality of candidate images with the target theme from the data sharing platform.
It should be noted that the data sharing platform may be, but is not limited to, an application platform such as a shared-space application or an instant messaging application, for example, a WeChat official account. The plurality of candidate images may be, but are not limited to, images published on such an official account. For example, as shown in fig. 3, a certain WeChat official account includes a poster with text information published on December 24, 2019: "[Points redemption] A red you must not miss. Red can be bright and eye-catching or low-key and soft; especially in the Christmas season, an outfit needs a touch of red."
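The keyword-based retrieval of candidate images from such a platform can be sketched as a simple filter. The dictionary shape of the image records is a hypothetical illustration, not part of the patent:

```python
def find_candidates(images, keyword):
    # Respond to a search request carrying the target theme's keyword by
    # returning the shared-platform images whose promotional text mentions it.
    return [img for img in images if keyword in img.get("text", "")]
```

A real implementation would query the platform's search interface rather than filter an in-memory list.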
According to the embodiment provided by the application, after a plurality of candidate images with the same theme as the target image to be generated are obtained, feature extraction is performed on each candidate image to obtain a plurality of groups of candidate feature pairs, and the candidate feature pairs are sequentially input into the image generation model to output the target image with the target theme. In this way, promotional images can be generated automatically, without professionals manually designing large numbers of them, which saves image generation time and improves image generation efficiency, solving the problem of low image generation efficiency in the related art.
As an optional scheme, after sequentially inputting the plurality of sets of candidate feature pairs into the image generation model, the method further includes:
s1, under the condition that the current input is a candidate feature pair of the ith candidate image, generating an object image matched with the target subject by using the character features extracted from the ith candidate image and generating a sub-network model, wherein i is an integer which is larger than 1 and smaller than the same, and N is the number of a plurality of candidate images;
s2, under the condition that the identification sub-network model acquires the object image, identifying whether the object image and the current input ith candidate image are the same subject;
s3, in the case where the ith candidate image input by the object image is of the same subject, the object image is determined as the target image.
Optionally, in this embodiment, before acquiring a plurality of candidate images associated with a target image to be generated, the method further includes: the method comprises the steps of obtaining a plurality of sample data, wherein the plurality of sample data comprise first type sample data and second type sample data, the first type sample data are image data with the same theme, and the second type sample data are image data with different themes; and training the initialized image generation model by using a plurality of sample data to obtain the image generation model.
Optionally, in this embodiment, sequentially performing feature extraction on each candidate image of the multiple candidate images to obtain multiple sets of candidate feature pairs includes:
repeatedly executing the following steps until a plurality of candidate images are traversed:
acquiring a current candidate image;
identifying text information associated with the target subject in the current candidate image; performing word segmentation and mapping processing on the character information to obtain a plurality of current word vectors corresponding to the current candidate image, and performing aggregation processing on the plurality of current word vectors to obtain a group of current sentence vectors; determining the current sentence vector as the character feature of the current candidate image;
identifying image information in the current candidate image, wherein the image information comprises position information and color information of each pixel point; image features are determined from the image information.
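The image-feature half of each candidate pair, built from per-pixel position and color information, might be assembled as follows. This is a sketch; the mapping input and the flat matrix layout are assumptions, not the patent's exact encoding:

```python
import numpy as np

def image_feature(pixels):
    # pixels: mapping (x, y) -> (r, g, b); one row per pixel point,
    # concatenating position information and color information.
    rows = [[x, y, r, g, b] for (x, y), (r, g, b) in sorted(pixels.items())]
    return np.array(rows, dtype=float)
```

Sorting by position makes the row order deterministic regardless of how the pixel mapping was built.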
The following description is specifically made in conjunction with S402 to S410 shown in fig. 4:
First, a search request is acquired, wherein the search request carries a keyword of the target subject (assumed here to be a keyword of a target product). In response to the search request, a plurality of candidate images with the target subject are found on a data sharing platform (such as a WeChat official account). Here, the plurality of candidate images may be historical posters related to the target product, or posters of competing products, as in step S402.
Using an image recognition technique (such as Optical Character Recognition, OCR for short), the text information associated with the target product's theme in each candidate image is recognized (step S404). For example, a title is extracted (such as the title "[Points redemption] A red you must not miss" shown in fig. 3), or body text is extracted from an article (such as the text "Red can be bright and eye-catching or low-key and soft; especially in the Christmas season, an outfit needs a touch of red" shown in fig. 3).
Then, the extracted text information is segmented into words (e.g., using jieba, SnowNLP, FoolNLTK, ansj, or a similar word segmentation tool), as in step S406.
Further, using a word embedding algorithm, each word is mapped into a dense vector of fixed dimension (e.g., 200 dimensions), resulting in a plurality of word vectors. A word embedding maps words in the text space to a numerical vector space. After all the word vectors are obtained, they are aggregated into a sentence vector, which serves as the text feature of the candidate image. Here, each dimension of the sentence vector can be, but is not limited to, the average of the corresponding dimension across the word vectors, as in step S408.
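The aggregation by averaging described in step S408 can be sketched directly, using a tiny hand-made embedding table rather than a trained one:

```python
import numpy as np

def sentence_vector(words, embedding):
    # Each dimension of the sentence vector is the average of the
    # corresponding dimension across the word vectors; words missing
    # from the embedding table are skipped.
    vecs = [embedding[w] for w in words if w in embedding]
    return np.mean(vecs, axis=0)
```

With a trained 200-dimensional embedding, the same averaging yields a 200-dimensional sentence vector per candidate image.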
Then, the text features extracted from the candidate images, together with the extracted image features, are input into a trained image generation model constructed based on StackGAN to obtain the corresponding target image, as in step S410, thereby assisting designers to quickly generate posters for promotion and publicity.
According to the embodiment provided by the application, the candidate images are processed by the trained image generation model constructed based on StackGAN, so that target images are automatically generated from the candidate images without manual design by professionals, which improves image generation efficiency and, in turn, the efficiency of promotion based on the target images.
It should be noted that, for simplicity of description, the above-mentioned method embodiments are described as a series of acts or combination of acts, but those skilled in the art will recognize that the present invention is not limited by the order of acts, as some steps may occur in other orders or concurrently in accordance with the invention. Further, those skilled in the art should also appreciate that the embodiments described in the specification are preferred embodiments and that the acts and modules referred to are not necessarily required by the invention.
According to another aspect of the embodiment of the present invention, there is also provided an image generation apparatus for implementing the above-described image generation method. As shown in fig. 5, the apparatus includes:
1) a first obtaining unit 502, configured to obtain multiple candidate images associated with a target image to be generated, where the multiple candidate images are images collected from a data sharing platform and carrying text information for promoting a target theme;
2) an extracting unit 504, configured to perform feature extraction on each candidate image of the multiple candidate images in sequence to obtain multiple sets of candidate feature pairs, where each set of candidate feature pairs includes a text feature, extracted from one candidate image and associated with the target subject, and an image feature extracted from the same candidate image;
3) an input unit 506, configured to sequentially input the multiple sets of candidate feature pairs into an image generation model, where the image generation model is a neural network model obtained by training on multiple sample data and used to generate an image for promoting a specified subject, and the image generation model includes a generation sub-network model used to generate an image and an identification sub-network model used to identify whether the generated image matches the specified subject;
4) a second obtaining unit 508, configured to obtain a target image with a target topic according to an output result of the image generation model.
For implementation details of this embodiment, reference may be made to the above method embodiments, but the implementation is not limited thereto.
As an optional scheme, the apparatus further includes:
1) a generation unit, configured to, after the multiple sets of candidate feature pairs are sequentially input into the image generation model, generate, through the generation sub-network model and by using the character features extracted from the ith candidate image, an object image matching the target theme, in the case that the currently input candidate feature pair is that of the ith candidate image, where i is an integer greater than 1 and less than or equal to N, and N is the number of the candidate images;
2) the identification unit is used for identifying whether the object image and the currently input ith candidate image are the same subject or not under the condition that the identification sub-network model acquires the object image;
3) a determination unit, configured to determine the object image as the target image in the case that the object image and the currently input ith candidate image are of the same subject.
For implementation details of this embodiment, reference may be made to the above method embodiments, but the implementation is not limited thereto.
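The generate-identify-determine flow of these three units can be sketched as follows, with `generator` and `discriminator` as hypothetical callables standing in for the trained generation and identification sub-network models (assumptions for illustration, not the patented implementation):

```python
def select_target_images(candidate_pairs, generator, discriminator):
    """For each candidate feature pair (text_feature, image_feature),
    the generation sub-network produces an object image; the
    identification sub-network then checks whether the object image
    and the current candidate depict the same subject, and only in
    that case is the object image kept as a target image."""
    targets = []
    for text_feature, image_feature in candidate_pairs:
        object_image = generator(text_feature)            # generation unit
        if discriminator(object_image, image_feature):    # identification unit
            targets.append(object_image)                  # determination unit
    return targets
```

With trained models, `generator` would be the StackGAN generator conditioned on the sentence vector, and `discriminator` the subject-matching classifier; here any pair of callables with those shapes will do.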
As an alternative, the input unit includes:
1) a processing module, configured to repeatedly execute the following steps until the plurality of candidate images have been traversed: acquiring a current candidate image; identifying text information associated with the target subject in the current candidate image; performing word segmentation and mapping processing on the text information to obtain a plurality of current word vectors corresponding to the current candidate image, and performing aggregation processing on the plurality of current word vectors to obtain a group of current sentence vectors; determining the current sentence vector as the character feature of the current candidate image; identifying image information in the current candidate image, where the image information includes the position information and color information of each pixel; and determining the image feature from the image information.
For implementation details of this embodiment, reference may be made to the above method embodiments, but the implementation is not limited thereto.
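The processing module's per-image step can be sketched as follows; whitespace tokenisation and the zero-vector fallback for unknown words are illustrative assumptions, and the (position, color) layout of the image feature is one plausible reading of "position information and color information of each pixel":

```python
import numpy as np

def extract_features(text, pixels, embedding, dim=200):
    """Sketch of the processing module: segment the candidate image's
    text, map each token to a word vector, and average into a sentence
    vector (the character feature); pair each pixel's position with
    its color to form the image feature."""
    tokens = text.split()                                  # word segmentation
    word_vecs = [embedding.get(t, np.zeros(dim)) for t in tokens]
    text_feature = np.mean(word_vecs, axis=0) if word_vecs else np.zeros(dim)

    h, w, _ = pixels.shape                                 # H x W x RGB
    ys, xs = np.mgrid[0:h, 0:w]
    # each row: (x, y, r, g, b) -- position plus color per pixel
    image_feature = np.column_stack(
        [xs.ravel(), ys.ravel(), pixels.reshape(-1, 3)]
    )
    return text_feature, image_feature
```

In practice the text would come from an OCR pass over the candidate image and the tokenizer would be a proper word-segmentation model; both are outside the scope of this sketch.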
According to a further aspect of an embodiment of the present invention, there is also provided a computer-readable storage medium having a computer program stored thereon, wherein the computer program is arranged to perform the steps of any of the above method embodiments when executed.
Optionally, in the present embodiment, the above computer-readable storage medium may be configured to store a computer program for executing the steps of the above method embodiments.
Optionally, in this embodiment, those skilled in the art will understand that all or part of the steps in the methods of the foregoing embodiments may be implemented by a program instructing the hardware associated with a terminal device. The program may be stored in a computer-readable storage medium, and the storage medium may include: a flash disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, an optical disk, and the like.
The above-mentioned serial numbers of the embodiments of the present invention are merely for description and do not represent the merits of the embodiments.
The integrated unit in the above embodiments, if implemented in the form of a software functional unit and sold or used as a separate product, may be stored in the above computer-readable storage medium. Based on such understanding, the technical solution of the present invention may be embodied in the form of a software product, which is stored in a storage medium and includes several instructions for causing one or more computer devices (which may be personal computers, servers, network devices, etc.) to execute all or part of the steps of the method according to the embodiments of the present invention.
In the above embodiments of the present invention, the descriptions of the respective embodiments have respective emphasis, and for parts that are not described in detail in a certain embodiment, reference may be made to related descriptions of other embodiments.
In the several embodiments provided in this application, it should be understood that the disclosed client may be implemented in other manners. The apparatus embodiments described above are merely illustrative; for example, the division into units is merely a division by logical function, and other divisions are possible in actual implementation: multiple units or components may be combined or integrated into another system, or some features may be omitted or not executed. In addition, the mutual couplings, direct couplings, or communication connections shown or discussed may be indirect couplings or communication connections through some interfaces, units, or modules, and may be electrical or in other forms.
The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
In addition, functional units in the embodiments of the present invention may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit. The integrated unit can be realized in a form of hardware, and can also be realized in a form of a software functional unit.
The foregoing is only a preferred embodiment of the present invention. It should be noted that those skilled in the art can make various modifications and refinements without departing from the principle of the present invention, and these modifications and refinements should also be regarded as falling within the protection scope of the present invention.
Claims (10)
1. An image generation method, comprising:
acquiring a plurality of candidate images associated with a target image to be generated, wherein the plurality of candidate images are images which are collected from a data sharing platform and carry text information for popularizing a target theme;
sequentially extracting features of each candidate image of the candidate images to obtain a plurality of groups of candidate feature pairs, wherein each group of candidate feature pairs comprises character features, which are extracted from one candidate image and are associated with the target subject, and image features extracted from the same candidate image;
sequentially inputting the multiple groups of candidate feature pairs into an image generation model, wherein the image generation model is a neural network model which is obtained by training with multiple sample data and is used for generating an image for popularizing a specified theme, and the image generation model comprises a generation sub-network model for generating the image and an identification sub-network model for identifying whether the generated image matches the specified theme;
and acquiring the target image with the target theme according to the output result of the image generation model.
2. The method of claim 1, further comprising, after said sequentially inputting the plurality of sets of candidate feature pairs into an image generation model:
under the condition that the currently input candidate feature pair is that of the ith candidate image, generating an object image matched with the target subject through the generation sub-network model by using the character features extracted from the ith candidate image, wherein i is an integer greater than 1 and less than or equal to N, and N is the number of the candidate images;
under the condition that the identification sub-network model acquires the object image, identifying whether the object image and the currently input ith candidate image are the same subject or not;
determining the object image as the target image in the case that the object image and the currently input ith candidate image are of the same subject.
3. The method of claim 1, wherein the sequentially performing feature extraction on each candidate image of the plurality of candidate images to obtain a plurality of sets of candidate feature pairs comprises:
repeatedly executing the following steps until the plurality of candidate images are traversed:
acquiring a current candidate image;
identifying text information associated with the target subject in the current candidate image; performing word segmentation and mapping processing on the text information to obtain a plurality of current word vectors corresponding to the current candidate image, and performing aggregation processing on the plurality of current word vectors to obtain a group of current sentence vectors; determining the current sentence vector as the character feature of the current candidate image;
identifying image information in the current candidate image, wherein the image information comprises position information and color information of each pixel point; determining the image feature from the image information.
4. The method according to claim 1, wherein before the obtaining of the plurality of candidate images associated with the target image to be generated, further comprising:
acquiring a plurality of sample data, wherein the plurality of sample data comprise first type sample data and second type sample data, the first type sample data are image data with the same theme, and the second type sample data are image data with different themes;
and training the initialized image generation model by using the plurality of sample data to obtain the image generation model.
5. The method according to claim 1, wherein the obtaining a plurality of candidate images associated with a target image to be generated comprises:
acquiring a search request, wherein the search request carries a keyword of the target theme;
and responding to the search request, and finding the candidate images with the target theme from the data sharing platform.
6. The method of any of claims 1 to 5, wherein the image generation model is constructed based on a StackGAN model.
7. An image generation apparatus, comprising:
the system comprises a first acquisition unit, a second acquisition unit and a third acquisition unit, wherein the first acquisition unit is used for acquiring a plurality of candidate images related to a target image to be generated, and the candidate images are images which are collected from a data sharing platform and carry text information for popularizing a target theme;
the extraction unit is used for sequentially carrying out feature extraction on each candidate image of the candidate images to obtain a plurality of groups of candidate feature pairs, wherein each group of candidate feature pairs comprises a character feature which is extracted from one candidate image and is associated with the target subject and an image feature which is extracted from the candidate image;
the input unit is used for sequentially inputting the multiple groups of candidate feature pairs into an image generation model, wherein the image generation model is a neural network model which is obtained by training a plurality of sample data and is used for generating an image for popularizing a specified subject, and the image generation model comprises a generation sub-network model for generating the image and an identification sub-network model for identifying whether the generated image is the specified subject;
and the second acquisition unit is used for acquiring the target image with the target theme according to the output result of the image generation model.
8. The apparatus of claim 7, further comprising:
a generating unit, configured to, after the multiple sets of candidate feature pairs are sequentially input into the image generation model, generate, through the generation sub-network model and by using the character features extracted from the ith candidate image, an object image that matches the target topic, in the case that the currently input candidate feature pair is that of the ith candidate image, where i is an integer greater than 1 and less than or equal to N, and N is the number of the candidate images;
the identification unit is used for identifying whether the object image and the currently input ith candidate image are the same subject or not under the condition that the identification sub-network model acquires the object image;
a determination unit, configured to determine the object image as the target image in the case that the object image and the currently input ith candidate image are of the same subject.
9. The apparatus of claim 7, wherein the input unit comprises:
a processing module, configured to repeatedly perform the following steps until the plurality of candidate images are traversed: acquiring a current candidate image; identifying text information associated with the target subject in the current candidate image; performing word segmentation and mapping processing on the text information to obtain a plurality of current word vectors corresponding to the current candidate image, and performing aggregation processing on the plurality of current word vectors to obtain a group of current sentence vectors; determining the current sentence vector as the character feature of the current candidate image; identifying image information in the current candidate image, wherein the image information comprises position information and color information of each pixel point; determining the image feature from the image information.
10. A computer-readable storage medium comprising a stored program, wherein the program when executed performs the method of any of claims 1 to 6.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010329281.0A CN111553335A (en) | 2020-04-23 | 2020-04-23 | Image generation method and apparatus, and storage medium |
Publications (1)
Publication Number | Publication Date |
---|---|
CN111553335A true CN111553335A (en) | 2020-08-18 |
Family
ID=72005759
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202010329281.0A Pending CN111553335A (en) | 2020-04-23 | 2020-04-23 | Image generation method and apparatus, and storage medium |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN111553335A (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111985565A (en) * | 2020-08-20 | 2020-11-24 | 上海风秩科技有限公司 | Picture analysis method and device, storage medium and electronic equipment |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN107247755A (en) * | 2017-05-27 | 2017-10-13 | 深圳市唯特视科技有限公司 | A kind of personalized image method for generating captions based on context serial memorization network |
CN109271537A (en) * | 2018-08-10 | 2019-01-25 | 北京大学 | A kind of text based on distillation study is to image generating method and system |
CN110163267A (en) * | 2019-05-09 | 2019-08-23 | 厦门美图之家科技有限公司 | A kind of method that image generates the training method of model and generates image |
CN110287484A (en) * | 2019-06-11 | 2019-09-27 | 华东师范大学 | A kind of Chinese language text based on face characteristic describes Face image synthesis method |
Non-Patent Citations (1)
Title |
---|
Han Zhang et al., "StackGAN: Text to Photo-realistic Image Synthesis with Stacked Generative Adversarial Networks" |
Legal Events

Date | Code | Title | Description
---|---|---|---|
| PB01 | Publication | |
| SE01 | Entry into force of request for substantive examination | |
2023-11-20 | TA01 | Transfer of patent application right | Applicant after: SHANGHAI SECOND PICKET NETWORK TECHNOLOGY CO.,LTD., Unit 5B03, 5th Floor, Building 2, No. 277 Longlan Road, Xuhui District, Shanghai, 200000. Applicant before: Shanghai Fengzhi Technology Co.,Ltd., Floors 4, 5 and 6, No. 3, Lane 1473, Zhenguang Road, Putuo District, Shanghai, 200333. |
| RJ01 | Rejection of invention patent application after publication | Application publication date: 2020-08-18 |