WO2021139242A1

WO2021139242A1 - Presentation file generation method, apparatus, and device and storage medium

Info

Publication number: WO2021139242A1
Application number: PCT/CN2020/118349
Authority: WO
Inventors: 谢静文; 阮晓雯; 徐亮
Original assignee: 平安科技（深圳）有限公司
Priority date: 2020-07-16
Filing date: 2020-09-28
Publication date: 2021-07-15
Also published as: CN111930976A; CN111930976B

Abstract

A presentation file generation method and apparatus, a device, and a storage medium. The method comprises: obtaining at least two keywords in a file to be processed, as well as feature attribute information of the file (S101); dividing the file according to the at least two keywords, so as to obtain at least two text fragments (S102); identifying, in a presentation file template library, a target presentation file template that matches the feature attribute information of said file (S103); and importing the at least two text fragments into the target presentation file template so as to obtain a target presentation file (S104). In the method, a target presentation file may be generated according to the text information inputted by the user, increasing presentation file generation efficiency. The present method relates to image recognition technology in artificial intelligence. The method is suitable for fields such as smart government services and smart education, and is thus conducive to promoting the construction of smart cities.

Description

Method, device, equipment and storage medium for generating presentation

This application claims priority to Chinese patent application No. 202010686330.6, filed with the Chinese Patent Office on July 16, 2020 and entitled "a drug discovery method, equipment, server and readable storage medium", the entire contents of which are incorporated in this application by reference.

Technical field

This application relates to the field of computer technology, and in particular to a method, device, equipment, and storage medium for generating a presentation.

Background technique

With the general promotion of office software, presentations are widely used in all aspects of social life. For example, presentations are used in work reports, corporate publicity, product promotion, wedding celebrations, project bidding, management consulting, and education and training. The inventor found that, at present, presentations are mainly produced by manually filling pictures, text, and other elements into preset templates. However, this method requires high labor costs, and in some cases the template and content cannot be well integrated, so users must repeatedly adjust the template to achieve the desired effect, resulting in low efficiency of presentation generation.

Technical problem

Existing presentation generation methods require high labor costs, and in some cases the template and content cannot be well integrated, requiring users to repeatedly adjust the template to achieve the desired effect, which results in low efficiency of presentation generation. The embodiments of the present application provide a presentation generation method, device, equipment, and storage medium, which can improve the efficiency of generating presentations.

Technical solutions

In the first aspect, an embodiment of the present application provides a presentation generation method. The method includes: obtaining at least two keywords in a file to be processed and characteristic attribute information of the file to be processed, where the characteristic attribute information includes at least one of the field to which the file to be processed belongs, the number of keywords in the file to be processed, and the subject of the file to be processed; dividing the file to be processed according to the at least two keywords to obtain at least two text fragments, one text fragment corresponding to at least one keyword; identifying, from a presentation file template library, a target presentation file template matching the characteristic attribute information of the file to be processed; and importing the at least two text fragments into the target presentation file template to obtain a target presentation file.

In a second aspect, an embodiment of the present application provides a presentation generation device. The device includes: an acquisition module for acquiring at least two keywords in a file to be processed and characteristic attribute information of the file to be processed, where the characteristic attribute information includes at least one of the field to which the file to be processed belongs, the number of keywords in the file to be processed, and the subject of the file to be processed; a division module for dividing the file to be processed according to the at least two keywords to obtain at least two text fragments, one text fragment corresponding to at least one keyword; a recognition module for identifying, from a presentation file template library, a target presentation file template matching the characteristic attribute information of the file to be processed; and an import module for importing the at least two text fragments into the target presentation file template to obtain a target presentation file.

In a third aspect, an embodiment of the present application provides an electronic device, which includes: a processor adapted to implement one or more instructions; and a computer storage medium that stores one or more instructions, the one or more instructions being suitable for being loaded by the processor to execute the following steps: acquiring at least two keywords in a file to be processed and characteristic attribute information of the file to be processed, where the characteristic attribute information includes at least one of the field to which the file to be processed belongs, the number of keywords in the file to be processed, and the subject of the file to be processed; dividing the file to be processed according to the at least two keywords to obtain at least two text fragments, one text fragment corresponding to at least one keyword; identifying, from a presentation file template library, a target presentation file template matching the characteristic attribute information of the file to be processed; and importing the at least two text fragments into the target presentation file template to obtain a target presentation file.

In a fourth aspect, an embodiment of the present application provides a computer-readable storage medium that stores one or more instructions, the one or more instructions being suitable for being loaded by a processor to execute the following steps: obtaining at least two keywords in a file to be processed and characteristic attribute information of the file to be processed, where the characteristic attribute information includes at least one of the field to which the file to be processed belongs, the number of keywords in the file to be processed, and the subject of the file to be processed; dividing the file to be processed according to the at least two keywords to obtain at least two text fragments, one text fragment corresponding to at least one keyword; identifying, from a presentation file template library, a target presentation file template matching the characteristic attribute information of the file to be processed; and importing the at least two text fragments into the target presentation file template to obtain a target presentation file.

Beneficial effect

In the embodiment of the present application, at least two keywords in the file to be processed and the characteristic attribute information of the file to be processed are acquired; the file to be processed is divided according to the at least two keywords to obtain at least two text fragments; a target presentation file template matching the characteristic attribute information of the file to be processed is identified from the presentation file template library; and the at least two text fragments are imported into the target presentation file template to obtain the target presentation file. In this solution, dividing the file to be processed according to the at least two keywords into at least two text fragments makes it possible to generate a presentation corresponding to each text fragment. The target presentation file template matching the characteristic attribute information of the file to be processed is then identified, and the at least two text fragments are imported into it to obtain the target presentation file; that is, the target presentation file includes the presentation corresponding to each text fragment. The whole process of generating the target presentation file does not require human involvement, which can improve the efficiency and flexibility of presentation generation and help ensure the accuracy and relevance of the presentation.

Description of the drawings

FIG. 1 is a schematic flowchart of a method for generating a presentation provided by an embodiment of the present application.

FIG. 2 is a schematic diagram of a method for importing at least two text fragments into a target presentation file template according to an embodiment of the present application.

FIG. 3 is a schematic flowchart of another method for generating a presentation provided by an embodiment of the present application.

FIG. 4 is a schematic structural diagram of a presentation generating device provided by an embodiment of the present application.

FIG. 5 is a schematic structural diagram of an electronic device provided by an embodiment of the present application.

Embodiments of the present invention

Artificial intelligence technology is a comprehensive discipline, covering a wide range of fields, including both hardware-level technology and software-level technology. Basic artificial intelligence technologies generally include technologies such as sensors, dedicated artificial intelligence chips, cloud computing, distributed storage, big data processing technologies, operation/interaction systems, and mechatronics. Artificial intelligence software technology mainly includes computer vision technology, speech processing technology, natural language processing technology, and machine learning/deep learning.

Among them, computer vision (CV) is the science of how to make machines "see". More specifically, it refers to using cameras and computers in place of human eyes to identify, track, and measure targets, and to further process the resulting graphics so that the image becomes more suitable for human observation or for transmission to instruments for detection. As a scientific discipline, computer vision studies related theories and technologies, trying to establish artificial intelligence systems that can obtain information from images or multi-dimensional data. Computer vision technology usually includes image processing, image recognition, image semantic understanding, image retrieval, OCR, video processing, video semantic understanding, video content/behavior recognition, three-dimensional object reconstruction, 3D technology, virtual reality, augmented reality, and simultaneous localization and mapping, and also includes common biometric recognition technologies such as face recognition and fingerprint recognition.

This application relates to image recognition technology in artificial intelligence. Image recognition technology is used to automatically convert images into presentations without manual participation, which can improve the efficiency and accuracy of generating presentations. This application can be applied to fields such as smart government affairs and smart education, and is conducive to promoting the construction of smart cities.

Please refer to FIG. 1, which is a schematic flowchart of a method for generating a presentation provided by an embodiment of the present application. The embodiment of the present application may be executed by an electronic device. The method for generating a presentation includes the following steps S101 to S104.

S101: Acquire at least two keywords in a file to be processed and characteristic attribute information of the file to be processed.

In the embodiment of the present application, the file to be processed is a text file provided by the user for making the target presentation file, and it contains the content information corresponding to each page of the presentation in the target presentation file. The keywords in the file to be processed can be extracted with an LDA model to obtain the at least two keywords, and the title and content of the file to be processed can be analyzed to obtain its characteristic attribute information. The LDA model is a document topic generation model that infers the topic distribution of a document, and text classification can be performed according to that topic distribution. The characteristic attribute information of the file to be processed includes at least one of the field to which the file to be processed belongs, the number of keywords in the file to be processed, and the subject of the file to be processed. The field to which the file to be processed belongs may be, for example, technology, education, political parties, finance, or tourism. The subject of the file to be processed may be, for example, work summary, marriage and love, graduation defense, or induction training.
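As a rough illustration of step S101, the sketch below extracts keywords with a plain term-frequency count. This is a deliberately simple stand-in, not the LDA model named in the text; the function name `extract_keywords`, the `top_k` parameter, and the stop-word list are all assumptions made for the example.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stop-word list (an assumption for this sketch).
STOP_WORDS = {"the", "a", "an", "of", "and", "to", "in", "is", "for", "on", "with"}

def extract_keywords(text, top_k=2):
    """Return the top_k most frequent non-stop-word tokens.

    Term frequency is used here only as a stand-in for the LDA model
    mentioned in the description.
    """
    tokens = re.findall(r"[a-z]+", text.lower())
    counts = Counter(t for t in tokens if t not in STOP_WORDS)
    return [word for word, _ in counts.most_common(top_k)]
```

A production system following the description would replace the counting step with topic inference from a trained LDA model; the surrounding interface (text in, at least two keywords out) could stay the same.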

S102: Divide the file to be processed according to the above at least two keywords to obtain at least two text fragments.

The paragraph in which a keyword is located in the file to be processed can be used as the text fragment corresponding to that keyword; alternatively, the file to be processed can be divided according to the positions of the keywords in it to obtain at least two text fragments. The text fragments can be the same length or different lengths. A text fragment corresponds to at least one keyword; that is, the keyword corresponding to a text fragment can refer to the topic of that text fragment, and the keywords corresponding to different text fragments can be different.

Further, step S102 includes: obtaining the similarity between every two adjacent keywords in the at least two keywords, and dividing the paragraphs of the file to be processed that correspond to keywords whose similarity is greater than a similarity threshold into the same text fragment, thereby obtaining at least two text fragments. Two adjacent keywords can be located in adjacent paragraphs or in the same paragraph. Dividing the file to be processed according to the similarity of keywords can improve the accuracy of the division.

Obtaining the similarity between every two adjacent keywords in the at least two keywords includes: using a distance algorithm to obtain the distance between every two adjacent keywords, and determining the similarity between every two adjacent keywords from that distance. The greater the distance between two adjacent keywords, the smaller their similarity; the smaller the distance, the greater their similarity. The distance algorithm may include at least one of Minkowski distance, Manhattan distance, and Chebyshev distance.
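The similarity-based division can be sketched as follows. The sketch assumes each paragraph has exactly one keyword and that each keyword is already represented as a numeric vector (the description does not say how keywords are embedded); the mapping `1 / (1 + distance)` and the names `minkowski`, `similarity`, and `group_paragraphs` are illustrative choices, not taken from the source.

```python
import math

def minkowski(u, v, p):
    """Minkowski distance; p=1 gives Manhattan, p=math.inf gives Chebyshev."""
    diffs = [abs(a - b) for a, b in zip(u, v)]
    if math.isinf(p):
        return max(diffs)
    return sum(d ** p for d in diffs) ** (1.0 / p)

def similarity(u, v, p=1):
    """Map a distance to (0, 1]: the larger the distance, the smaller the similarity."""
    return 1.0 / (1.0 + minkowski(u, v, p))

def group_paragraphs(paragraphs, keyword_vectors, threshold):
    """Merge adjacent paragraphs whose keywords are more similar than threshold.

    keyword_vectors[i] is the (assumed) vector of the keyword of paragraphs[i].
    """
    fragments = [[paragraphs[0]]]
    for i in range(1, len(paragraphs)):
        if similarity(keyword_vectors[i - 1], keyword_vectors[i]) > threshold:
            fragments[-1].append(paragraphs[i])   # same text fragment
        else:
            fragments.append([paragraphs[i]])     # start a new text fragment
    return fragments
```

Any monotonically decreasing mapping from distance to similarity would satisfy the relationship stated above; the reciprocal form is used only because it is bounded and needs no tuning.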

Optionally, the file to be processed may be divided according to the above at least two keywords to obtain at least two candidate text fragments, and the above at least two text fragments are generated from the at least two candidate text fragments. After the at least two candidate text fragments are obtained, it can be judged whether each candidate text fragment is a new topic sentence. If the candidate text fragment is a new topic sentence, it is determined as a text fragment; if it is not, the keywords of the candidate text fragment are re-acquired, and it is then judged again whether the candidate text fragment, with its re-acquired keywords, is a new topic sentence. Whether each candidate text fragment is a new topic sentence can be judged according to the text meaning of the candidate text fragment and the word meaning of its corresponding keyword. For example, the judgment can be made with the BERT model. BERT is a method of pre-training language representations: a general "language understanding" model is trained on a large text corpus (Wikipedia), and on the basis of this model candidate text fragments can be classified as new topic sentences or not. The text vector of each candidate text fragment can be obtained, and each keyword, together with the text vector of its corresponding candidate text fragment, can be input into the BERT model to obtain a 1/0 indication result. The text vector of each candidate text fragment can be obtained during the training process of the BERT model and is used to describe the global semantic information of the text fragment.

The BERT model determines whether each candidate text fragment is a new topic sentence based on the meaning of each keyword and the text vector of the corresponding candidate text fragment, and outputs a 1/0 indication result. If the BERT model outputs 1, the candidate text fragment is a new topic sentence and is determined as a text fragment; if the BERT model outputs 0, the candidate text fragment is not a new topic sentence, so its keywords are re-acquired and it is judged again whether the candidate text fragment, with its re-acquired keywords, is a new topic sentence.
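The accept-or-retry control flow described here can be sketched without committing to a particular model. In the sketch below, `is_new_topic` stands in for the BERT classifier and `reacquire_keywords` for the keyword extractor; both are injected callables and are assumptions, as is the `max_rounds` cap (the description does not say what happens if a fragment is never accepted).

```python
def confirm_fragment(fragment, keywords, is_new_topic, reacquire_keywords, max_rounds=3):
    """Return the keywords under which `fragment` is accepted, or None.

    is_new_topic(keywords, fragment) must return 1 or 0, mirroring the
    1/0 indication result described for the BERT model.
    max_rounds is an assumed safety cap, not part of the source.
    """
    for _ in range(max_rounds):
        if is_new_topic(keywords, fragment) == 1:
            return keywords                      # candidate becomes a text fragment
        keywords = reacquire_keywords(fragment)  # 0: re-acquire keywords and retry
    return None
```

Passing the classifier in as a parameter keeps the loop testable with a toy function and lets a real fine-tuned model be swapped in later.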

S103: Identify a target presentation file template matching the characteristic attribute information of the file to be processed from the presentation file template library.

The presentation file template library includes a variety of presentation file templates. Each presentation file template includes multiple presentations, and the number of presentations may differ between presentation file templates; and/or the color information and layout information of different presentation file templates may differ. Therefore, different presentation file templates are suitable for different fields or for different themes, or for generating presentation files from text files of different lengths.

After the characteristic attribute information of the file to be processed is obtained, the target presentation file template matching it is identified in the presentation file template library, and the target presentation file corresponding to the file to be processed is generated according to the target presentation file template.

Optionally, when the characteristic attribute information includes the number of keywords in the file to be processed, the number of text fragments in the at least two text fragments can be determined according to the number of keywords in the file to be processed; the number of presentations included in each presentation file template in the presentation file template library is obtained; and a presentation file template in the library whose number of presentations is the same as the number of text fragments is determined as the target presentation file template.

The target presentation file template can be determined in the presentation file template library according to the number of keywords in the file to be processed. First, the number of text fragments in the at least two text fragments is determined according to the number of keywords in the file to be processed. At least one keyword corresponds to one text fragment; that is, the sentences corresponding to the at least one keyword are divided into one text fragment. Then the number of presentations included in each presentation file template in the presentation file template library is obtained, and a presentation file template whose number of presentations equals the number of text fragments is determined as the target presentation file template. If more than one presentation file template in the library contains the same number of presentations as there are text fragments, one of them can be determined as the target presentation file template based on the text content of the file to be processed, user designation, or random selection.
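A minimal sketch of count-based template selection follows. It assumes the library is a mapping from a template name to the number of presentations the template contains; the function name, the `preferred` argument (standing in for user designation), and returning `None` when nothing matches are all assumptions. Tie-breaking by text content is omitted.

```python
import random

def pick_by_slide_count(templates, fragment_count, preferred=None, rng=None):
    """Pick a template whose presentation count equals the number of text fragments.

    templates: dict mapping template name -> number of presentations (assumed layout).
    preferred: optional user-designated template name used to break ties.
    rng: optional random.Random instance for reproducible random selection.
    """
    matches = [name for name, count in templates.items() if count == fragment_count]
    if not matches:
        return None                    # no-match behavior is not specified in the source
    if preferred in matches:
        return preferred               # user designation
    return (rng or random).choice(matches)  # random selection among equal matches
```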

Optionally, when the characteristic attribute information includes the theme of the file to be processed, the theme of each presentation file template in the presentation file template library can be obtained; the matching degree between the theme of each presentation file template in the library and the theme of the file to be processed is determined; and the presentation file template with the greatest matching degree is selected from the library as the target presentation file template.

After the file to be processed is obtained, its theme, such as work summary, marriage and love, graduation defense, or induction training, can be obtained from its text content, that is, by analyzing the title and content of the file to be processed. The theme of each presentation file template in the presentation file template library is then obtained, and the matching degree between the theme of each presentation file template and the theme of the file to be processed is determined. The presentation file template whose theme has the greatest matching degree with the theme of the file to be processed is used as the target presentation file template. Presentation file templates for multiple themes can be stored in the presentation file template library in advance, with one presentation file template corresponding to one theme.
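The description does not define how the matching degree between two themes is computed. As one possible reading, the sketch below scores themes by character-level string similarity using `difflib.SequenceMatcher` from the Python standard library; that scoring choice and the names `matching_degree` and `pick_by_theme` are assumptions.

```python
from difflib import SequenceMatcher

def matching_degree(theme_a, theme_b):
    """Assumed matching degree: case-insensitive string similarity in [0, 1]."""
    return SequenceMatcher(None, theme_a.lower(), theme_b.lower()).ratio()

def pick_by_theme(template_themes, file_theme):
    """Return the template whose theme best matches the file's theme.

    template_themes: dict mapping template name -> theme (one theme per template).
    """
    return max(template_themes,
               key=lambda name: matching_degree(template_themes[name], file_theme))
```

A semantic measure (for example, similarity of sentence embeddings) would likely match paraphrased themes better than string similarity; the selection step with `max` would be unchanged.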

Optionally, when the characteristic attribute information includes the field to which the file to be processed belongs, the attribute information of the presentation file corresponding to the file to be processed can be predicted according to that field, where the attribute information of the presentation file includes its typesetting information and color information; the attribute information of each presentation file template in the presentation file template library is obtained, where the attribute information of each presentation file template includes its typesetting information and color information; and the presentation file template in the library whose attribute information has the greatest matching degree with the attribute information of the presentation file corresponding to the file to be processed is determined as the target presentation file template.

The target presentation file template corresponding to the file to be processed can be determined in the presentation file template library according to the field to which the file to be processed belongs. The attribute information of the presentation file corresponding to the file to be processed, which includes its typesetting information and color information, can be predicted according to that field. The field of the file to be processed can be science and technology, education, political parties, finance, tourism, and so on. If the field is tourism, the typesetting information of the corresponding presentation file is predicted to be a half-and-half layout, in which half of the page displays the scenery and the other half introduces it, or a multi-picture layout, and so on. The color information of the presentation file for a tourism file should be relatively fresh, such as sky blue or green. The attribute information of each presentation file template in the presentation file template library, which includes the typesetting information and color information of the template, is then obtained. The attribute information of each presentation file template is matched against the attribute information of the presentation file corresponding to the file to be processed, and the presentation file template with the greatest matching degree is used as the target presentation file template.
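The field-based selection can be sketched as a lookup followed by attribute matching. The prediction table below contains only the tourism example from the text; the dictionary layout, the attribute value strings, and the choice to score the matching degree as the number of equal attributes are assumptions for illustration.

```python
# Assumed prediction table: field -> predicted typesetting and color information.
# Only the tourism example from the description is filled in.
PREDICTED_ATTRIBUTES = {
    "tourism": {"typesetting": "half picture, half text", "color": "sky blue"},
}

def attribute_match(predicted, template_attrs):
    """Assumed matching degree: how many predicted attributes the template shares."""
    return sum(1 for key, value in predicted.items() if template_attrs.get(key) == value)

def pick_by_field(field, templates):
    """Return the template whose attributes best match those predicted for `field`.

    templates: dict mapping template name -> {"typesetting": ..., "color": ...}.
    """
    predicted = PREDICTED_ATTRIBUTES[field]
    return max(templates, key=lambda name: attribute_match(predicted, templates[name]))
```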

S104: Import the above-mentioned at least two text fragments into the target presentation file template to obtain the target presentation file.

The file to be processed is divided according to its at least two keywords to obtain at least two text fragments, and after the target presentation file template is obtained, the at least two text fragments are imported into the target presentation file template to obtain the target presentation file. The target presentation file template includes multiple presentation templates. The at least two text fragments can be imported into the target presentation file template in a preset order; that is, following the order of the text fragments, each of the at least two text fragments is imported into its corresponding presentation template in the target presentation file template to obtain the presentation corresponding to each text fragment, and the target presentation file is generated from these presentations. The preset order may be derived from the position, in the file to be processed, of the keyword of each text fragment. For example, suppose there are three text fragments, text fragment 1, text fragment 2, and text fragment 3, ordered so that text fragment 1 precedes text fragment 2 and text fragment 2 precedes text fragment 3. The three text fragments are then imported in turn: text fragment 1 into the first presentation template in the target presentation file template, text fragment 2 into the second presentation template, and text fragment 3 into the third presentation template. Alternatively, the presentation template in the target presentation file template can be selected according to the typesetting information and text information of each text fragment. For example, if text fragment 4 contains four subtitles, a presentation template in a four-heading style can be selected from the target presentation file template.

FIG. 2 is a schematic diagram of a method for importing at least two text fragments into a target presentation file template provided by an embodiment of the present application. As shown in FIG. 2, the method includes steps S21 to S23.

S21: Acquire the location information, in the file to be processed, of the keyword of each of the at least two text fragments.

If the text content in the file to be processed has been arranged by the user in the order in which the content is to be displayed, the location, in the file to be processed, of the keyword corresponding to each of the at least two text fragments can be obtained and recorded.

S22: Sort the at least two text fragments according to the location information of the keywords in the file to be processed.

S23: Import at least two sorted text fragments into the target presentation file template in sequence to obtain the target presentation file.

The order of the at least two keywords is obtained from the locations of the keywords in the file to be processed, and the order of the at least two text fragments corresponding to the at least two keywords is determined from the order of the keywords, so as to sort the at least two text fragments.

Each presentation in the target presentation file template has a fixed sequence, and at least two sorted text fragments are sequentially imported into the target presentation file template to obtain the target presentation file.
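Steps S21 to S23 can be sketched as a sort followed by a pairwise import. The sketch assumes each text fragment is a `(keyword, text)` pair, that every keyword occurs in the source text, and that the template is an ordered list of slide placeholders; the function name `import_in_order` is hypothetical.

```python
def import_in_order(source_text, fragments, slide_templates):
    """Sort fragments by keyword position (S21, S22) and fill slides in sequence (S23).

    fragments: list of (keyword, text) pairs, in any order.
    slide_templates: ordered placeholders from the target presentation file template.
    Returns a list of (slide_template, text) pairs.
    """
    # S21/S22: position of each keyword's first occurrence decides the order.
    ordered = sorted(fragments, key=lambda pair: source_text.find(pair[0]))
    # S23: the i-th sorted fragment goes into the i-th presentation of the template.
    return [(slide, text) for slide, (_, text) in zip(slide_templates, ordered)]
```

Note that `str.find` returns -1 for a keyword that does not occur, which would sort that fragment first; a fuller implementation would need to handle that case explicitly.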

Optionally, the text feature information corresponding to each of the at least two text fragments is obtained; the preprocessing mode of each text fragment is determined according to its text feature information, the preprocessing mode including at least one of simplification processing, sentence-splitting processing, correction processing, and typesetting processing; each text fragment is processed according to its preprocessing mode to obtain at least two processed text fragments; and the at least two processed text fragments are imported into the target presentation file template to obtain the target presentation file.

The text feature information corresponding to each of the at least two text segments can be acquired, and the preprocessing mode of each text segment can be determined according to the text feature information corresponding to each text segment. The text feature information of the text fragment includes the text length information of the text fragment, the hierarchical title information of the text fragment, the information whether the text of the text fragment is wrong, etc. The preprocessing method includes simplified processing, split processing, correction processing, and typesetting processing. At least one of. Each text segment is processed according to the preprocessing method of each text segment to obtain at least two processed text segments. For example, according to the text length information of each text fragment, for the text fragment with too long text length, you can simplify the processing according to the method of TextRank (extract keywords) text summary, and delete some of the redundant text to ensure that each text The content of the fragments is not too cumbersome, and the readability of each text fragment is improved. Splitting some sentences that are too long in the text fragment makes each sentence read more smoothly. At the same time, based on the hierarchical title information of the text fragments, the overall content of each text fragment, and the overall content under each hierarchical heading, use seq2seq and Pointer-Generator Network to generate a title for each content in each text fragment. Seq2seq is an algorithm cluster for natural language processing in machine learning. It is mainly used for language translation, image subtitles, conversation models and text summary extraction. Pointer-Generator Network (pointer generation network) is also used for text summary extraction. In addition, the text information in each text segment is also corrected, and the wrong text and wrong punctuation in each text segment are corrected. 
Import the processed at least two text fragments into the target presentation file template to obtain the target presentation file.
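The choice of preprocessing mode described above can be sketched as follows. This is a minimal illustration only: the thresholds `MAX_FRAGMENT_CHARS` and `MAX_SENTENCE_CHARS`, the `TextFeatures` structure, and the rule that typesetting is triggered by the presence of hierarchical titles are all assumptions, since the application does not give concrete values or rules.

```python
import re
from dataclasses import dataclass

# Hypothetical thresholds; the application does not state concrete values.
MAX_FRAGMENT_CHARS = 300
MAX_SENTENCE_CHARS = 80


@dataclass
class TextFeatures:
    length: int            # text length information
    has_headings: bool     # hierarchical title information is present
    has_errors: bool       # the text is known to contain errors
    longest_sentence: int  # length of the longest sentence


def split_sentences(text: str) -> list[str]:
    """Split text into sentences, keeping the closing punctuation mark."""
    return [s.strip() for s in re.findall(r"[^.!?]+[.!?]?", text) if s.strip()]


def extract_features(text: str, has_headings: bool = False,
                     has_errors: bool = False) -> TextFeatures:
    sentences = split_sentences(text)
    return TextFeatures(
        length=len(text),
        has_headings=has_headings,
        has_errors=has_errors,
        longest_sentence=max((len(s) for s in sentences), default=0),
    )


def choose_preprocessing(features: TextFeatures) -> list[str]:
    """Return the preprocessing modes that apply to one text fragment."""
    modes = []
    if features.length > MAX_FRAGMENT_CHARS:
        modes.append("simplify")   # e.g. summarization of an overly long fragment
    if features.longest_sentence > MAX_SENTENCE_CHARS:
        modes.append("split")      # break up overly long sentences
    if features.has_errors:
        modes.append("correct")    # fix wrong characters and punctuation
    if features.has_headings:
        modes.append("typeset")    # assumed trigger for typesetting
    return modes
```

A fragment may receive several modes at once, which matches the statement that the preprocessing mode includes at least one of the four kinds of processing.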

When the at least two text fragments are imported into the target presentation file template, a suitable presentation page can be selected from the target presentation file template according to the typesetting information and text content of each text fragment. For example, if a text fragment contains four subtitles, a presentation page containing four subtitles is selected from the target presentation file template. In addition to the presentation page corresponding to each text fragment in the file to be processed, an opening page is set according to the theme of the file to be processed, a directory page is set according to the content of each text fragment, and an end page is set, thereby completing the target presentation file. After the target presentation file is completed, it is presented to the user, and user instructions to adjust the color, font, and content of each presentation page can be accepted. Personalized adjustments to the shape of icons, the style of wireframes, and the like can also be accepted for each presentation page, finally yielding the final version of the target presentation file.
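The page assembly described above might look like the following sketch. The `Fragment` structure, the `TEMPLATE_LAYOUTS` mapping, and the fallback to the closest subtitle count when no exact layout exists are assumptions made for illustration; the application only states that a page with a matching number of subtitles is selected and that opening, directory, and end pages are added.

```python
from dataclasses import dataclass, field


@dataclass
class Fragment:
    title: str
    subtitles: list[str] = field(default_factory=list)


# Hypothetical template: maps a subtitle count to a page layout name.
TEMPLATE_LAYOUTS = {0: "title_only", 2: "two_columns", 4: "four_blocks"}


def pick_layout(fragment: Fragment, layouts: dict[int, str]) -> str:
    """Pick the layout whose subtitle count equals the fragment's.

    Falling back to the closest count is an assumption, not part of the source.
    """
    n = len(fragment.subtitles)
    if n in layouts:
        return layouts[n]
    closest = min(layouts, key=lambda k: (abs(k - n), k))
    return layouts[closest]


def build_presentation(theme: str, fragments: list[Fragment]) -> list[dict]:
    """Assemble opening page, directory page, one page per fragment, end page."""
    pages = [{"kind": "opening", "title": theme}]
    pages.append({"kind": "directory", "items": [f.title for f in fragments]})
    for f in fragments:
        pages.append({
            "kind": "content",
            "title": f.title,
            "layout": pick_layout(f, TEMPLATE_LAYOUTS),
            "subtitles": f.subtitles,
        })
    pages.append({"kind": "end"})
    return pages
```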

Optionally, the electronic device in this application can be any node device in a blockchain. A blockchain is a new application model of computer technologies such as distributed data storage, peer-to-peer (P2P) transmission, consensus mechanisms, and encryption algorithms. It is essentially a decentralized database: a blockchain consists of a series of transaction records (also called blocks) that are linked and protected by cryptography. This linked distributed ledger allows multiple parties to record transactions effectively, and the transactions can be checked permanently (they cannot be tampered with). The consensus mechanism is the mathematical algorithm by which different nodes in the blockchain network establish trust and acquire rights and interests; that is, it is a mathematical algorithm recognized by all network nodes of the blockchain. This application can use the consensus mechanism of the blockchain when generating the target presentation file from the file to be processed, which can improve the accuracy of generating the target presentation file.

For example, each node device in the blockchain performs consensus verification on the execution results of steps S101 to S104. If the execution result of every step passes the consensus verification, the accuracy of the generated target presentation file can be determined to be relatively high; if the execution result of any step fails the consensus verification, the accuracy can be determined to be relatively low, and the node device may perform steps S101 to S104 again to regenerate the target presentation file. Alternatively, each node device in the blockchain may perform consensus verification only on the target presentation file (that is, only on the execution result of step S104). If the consensus verification passes, the accuracy of the target presentation file is determined to be relatively high; if it fails, the accuracy is determined to be relatively low, and the node device may perform steps S101 to S104 again to regenerate the target presentation file.
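The application does not specify which consensus algorithm is used. As a rough sketch under that caveat, the second variant (verifying only the final result) can be illustrated by having each node report its result and accepting it when more than a quorum of nodes agree on the same SHA-256 digest. The two-thirds quorum, the retry limit, and the function names below are assumptions, not part of the source.

```python
import hashlib
from collections import Counter


def digest(result: str) -> str:
    """Fingerprint a node's result so that results can be compared cheaply."""
    return hashlib.sha256(result.encode("utf-8")).hexdigest()


def consensus_passed(node_results: list[str], quorum: float = 2 / 3) -> bool:
    """True if more than `quorum` of the nodes report the same digest."""
    if not node_results:
        return False
    counts = Counter(digest(r) for r in node_results)
    _, top = counts.most_common(1)[0]
    return top / len(node_results) > quorum


def generate_with_consensus(run_steps, node_count: int, max_rounds: int = 3):
    """Re-run the generation steps until the nodes agree or rounds run out.

    `run_steps(node)` stands in for steps S101 to S104 executed on one node.
    Returns the agreed result, or None if consensus is never reached.
    """
    for _ in range(max_rounds):
        results = [run_steps(node) for node in range(node_count)]
        if consensus_passed(results):
            return Counter(results).most_common(1)[0][0]
    return None
```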

In the embodiment of the present application, at least two keywords in the file to be processed and the characteristic attribute information of the file to be processed are acquired; the file to be processed is divided according to the at least two keywords to obtain at least two text fragments; a target presentation file template matching the characteristic attribute information of the file to be processed is identified from the presentation file template library; and the at least two text fragments are imported into the target presentation file template to obtain the target presentation file. In this solution, dividing the file to be processed according to the at least two keywords yields at least two text fragments, which facilitates generating a presentation page for each text fragment. The content of each text fragment is simplified, split, corrected, or typeset, which streamlines and corrects the content and improves the accuracy of the generated target presentation file. The target presentation file template that matches the characteristic attribute information is identified, and the at least two text fragments are imported into it to obtain the target presentation file, which includes the presentation page corresponding to each text fragment. The entire process requires no human involvement and the result is output directly, which can improve the efficiency and flexibility of presentation generation and help ensure the accuracy and relevance of the presentation.

Please refer to FIG. 3, which is a schematic flowchart of another method for generating a presentation provided by an embodiment of the present application. The method is executed by the electronic device in the embodiment of the present application and includes the following steps S201 to S206.

S201: Acquire at least two keywords in a file to be processed and characteristic attribute information of the file to be processed.

S202: Divide the file to be processed according to at least two keywords to obtain at least two text fragments.

S203: Identify a target presentation file template matching the characteristic attribute information of the file to be processed from the presentation file template library.

In the embodiment of the present application, for the content of steps S201 to S203 of this method, refer to the content shown in FIG. 1, which will not be repeated here.

S204: Obtain the affiliation relationship between keywords of every two text fragments in the at least two text fragments.

S205: Sort at least two text fragments according to the affiliation between the keywords of every two text fragments.

S206: Import at least two sorted text fragments into the target presentation file template in sequence to obtain the target presentation file.

The affiliation relationship between the keywords corresponding to the at least two text fragments can be obtained with a BERT model; the affiliation relationship may refer to a containment relationship or a sequence relationship between keywords. The order of the at least two text fragments can be determined from the affiliation relationship between the keywords of every two text fragments, and the text fragments can be sorted in that order. For example, suppose the file to be processed promotes tourism in a certain place. If the keyword corresponding to text fragment 1 is the historical culture of the place and the keyword corresponding to text fragment 2 is stories of the place during the Republic of China period, then, because the latter concerns the culture of one particular period, the keyword of text fragment 1 is ordered before the keyword of text fragment 2. The at least two text fragments are sorted according to these affiliation relationships and imported into the target presentation file template in that order to obtain the target presentation file, which improves the accuracy of the presentation.
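Once the pairwise affiliation relationships are known, the sorting step amounts to ordering keywords so that every "comes before" pair is respected. The sketch below assumes those pairs are already available as `(before, after)` tuples; in the application they would come from a BERT model, which is not reproduced here. The function name and the tie-breaking by original order are illustrative assumptions.

```python
from collections import defaultdict


def sort_by_affiliation(keywords: list[str],
                        precedes: list[tuple[str, str]]) -> list[str]:
    """Order keywords so that each (before, after) pair is respected.

    Keywords with no ordering constraint between them keep their
    original relative order (an assumption made for determinism).
    """
    index = {k: i for i, k in enumerate(keywords)}
    successors = defaultdict(list)
    indegree = {k: 0 for k in keywords}
    for before, after in precedes:
        successors[before].append(after)
        indegree[after] += 1

    ready = sorted((k for k in keywords if indegree[k] == 0), key=index.get)
    ordered = []
    while ready:
        current = ready.pop(0)
        ordered.append(current)
        for nxt in successors[current]:
            indegree[nxt] -= 1
            if indegree[nxt] == 0:
                ready.append(nxt)
        ready.sort(key=index.get)

    if len(ordered) != len(keywords):
        raise ValueError("affiliation relationships contain a cycle")
    return ordered


keywords = ["Republic-era stories", "local cuisine", "historical culture"]
pairs = [("historical culture", "Republic-era stories")]
print(sort_by_affiliation(keywords, pairs))
# prints ['local cuisine', 'historical culture', 'Republic-era stories']
```

The text fragments can then be imported in the order of their keywords.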

In the embodiment of the present application, at least two keywords in the file to be processed and the characteristic attribute information of the file to be processed are acquired; the file to be processed is divided according to the at least two keywords to obtain at least two text fragments; a target presentation file template matching the characteristic attribute information of the file to be processed is identified from the presentation file template library; and the at least two text fragments are imported into the target presentation file template to obtain the target presentation file. In this solution, dividing the file to be processed according to the at least two keywords yields at least two text fragments, which facilitates generating a presentation page for each text fragment. The content of each text fragment is simplified, split, corrected, or typeset, which streamlines and corrects the content and improves the accuracy of the generated target presentation file. The target presentation file template that matches the characteristic attribute information is identified, and the at least two text fragments are imported into it to obtain the target presentation file, which includes the presentation page corresponding to each text fragment. When the at least two text fragments are imported into the target presentation file template, they are first sorted according to the affiliation relationship between the keywords of every two text fragments and then imported in the sorted order, so that any two presentation pages in the target presentation file have a reasonable sequence, which improves the accuracy of the generated target presentation file. Moreover, the entire process requires no human involvement and the result is output directly, which can improve the efficiency and flexibility of presentation generation and help ensure the accuracy and relevance of the presentation.

Refer to FIG. 4, which is a schematic structural diagram of a presentation generating apparatus provided by an embodiment of the present application. The presentation generating apparatus may be located in the above-mentioned electronic device. In this embodiment, the apparatus includes: an obtaining module 11, configured to obtain at least two keywords in the file to be processed and characteristic attribute information of the file to be processed, the characteristic attribute information including at least one of the field to which the file to be processed belongs, the number of keywords in the file to be processed, and the theme of the file to be processed; a dividing module 12, configured to divide the file to be processed according to the at least two keywords to obtain at least two text fragments, one text fragment corresponding to at least one keyword; and an identification module 13, configured to identify, from the presentation file template library, the target presentation file template that matches the characteristic attribute information of the file to be processed. The identification module 13 includes a first determining unit, a first acquiring unit, and a second determining unit. The first determining unit is configured to determine the number of text fragments in the at least two text fragments according to the number of keywords in the file to be processed; the first acquiring unit is configured to acquire the number of presentation pages included in each presentation file template in the presentation file template library; and the second determining unit is configured to determine, as the target presentation file template, the presentation file template in the library whose number of presentation pages equals the number of text fragments.
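The count-based matching performed by these three units can be sketched in a few lines. The library contents and the function name are hypothetical; the sketch only shows the comparison between the number of text fragments and the number of presentation pages in each template.

```python
def templates_matching_count(fragment_count: int,
                             page_counts: dict[str, int]) -> list[str]:
    """Return the templates whose page count equals the fragment count."""
    return [name for name, pages in page_counts.items()
            if pages == fragment_count]


# Hypothetical library: template name -> number of presentation pages.
library = {"business_blue": 5, "travel_green": 3, "minimal_gray": 3}
print(templates_matching_count(3, library))
# prints ['travel_green', 'minimal_gray']
```

When several templates match, one of the other criteria (theme or field) could be used to choose among them; the application presents these criteria as alternatives and does not prescribe a tie-break.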

Wherein, the identification module 13 further includes: a second acquisition unit, a third determination unit, and a selection unit.

The second obtaining unit is used to obtain the theme of each presentation file template in the presentation file template library; the third determining unit is used to determine the matching degree between the theme of each presentation file template in the library and the theme of the file to be processed; and the selection unit is used to select the presentation file template with the greatest matching degree from the library as the target presentation file template.
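The application does not define how the matching degree between themes is computed. As one possible stand-in, the sketch below uses the Jaccard overlap of the word sets of two theme strings and then selects the template with the largest value. The metric, the function names, and the sample themes are assumptions for illustration only.

```python
def matching_degree(theme_a: str, theme_b: str) -> float:
    """Jaccard overlap of lower-cased word sets, between 0 and 1."""
    a, b = set(theme_a.lower().split()), set(theme_b.lower().split())
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)


def select_by_theme(file_theme: str, template_themes: dict[str, str]) -> str:
    """Return the name of the template whose theme matches best."""
    return max(template_themes,
               key=lambda name: matching_degree(file_theme, template_themes[name]))


# Hypothetical template library: template name -> theme description.
themes = {
    "t1": "annual financial report",
    "t2": "city tourism promotion",
    "t3": "school education",
}
print(select_by_theme("tourism promotion of a historic city", themes))
# prints t2
```

A production system would more plausibly compare themes with a semantic similarity model; word overlap is used here only because it is self-contained.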

Wherein, the identification module 13 further includes: a prediction unit, a third acquisition unit, and a fourth determination unit.

The prediction unit is configured to predict, according to the field to which the file to be processed belongs, the attribute information of the presentation file corresponding to the file to be processed, this attribute information including the typesetting information and color information of that presentation file; the third obtaining unit is used to obtain the attribute information of each presentation file template in the presentation file template library, including the typesetting information and color information of each template; and the fourth determining unit is used to determine, as the target presentation file template, the presentation file template in the library whose attribute information best matches the attribute information of the presentation file corresponding to the file to be processed.
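How the attribute information is predicted from the field is not detailed in the application. The sketch below replaces the prediction with a fixed lookup table and scores each template by the number of attributes (typesetting, color) on which it agrees with the prediction. The table `FIELD_ATTRIBUTES`, the default attributes, and the scoring rule are all hypothetical.

```python
# Hypothetical mapping from field to expected presentation attributes.
FIELD_ATTRIBUTES = {
    "finance": {"typesetting": "tabular", "color": "blue"},
    "tourism": {"typesetting": "image_heavy", "color": "green"},
}


def predict_attributes(field: str) -> dict[str, str]:
    """Stand-in for the prediction step: look the field up in a table."""
    return FIELD_ATTRIBUTES.get(field, {"typesetting": "plain", "color": "gray"})


def attribute_match(expected: dict[str, str], actual: dict[str, str]) -> int:
    """Count the attributes on which the template agrees with the prediction."""
    return sum(1 for key in ("typesetting", "color")
               if expected.get(key) == actual.get(key))


def select_by_attributes(field: str,
                         templates: dict[str, dict[str, str]]) -> str:
    """Return the template whose attributes best match the predicted ones."""
    expected = predict_attributes(field)
    return max(templates,
               key=lambda name: attribute_match(expected, templates[name]))
```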

The import module 14 is configured to import the at least two text fragments into the target presentation file template to obtain the target presentation file.

Wherein, the import module 14 includes: a fourth acquisition unit, a first sorting unit, and a first import unit.

The fourth obtaining unit is configured to obtain the position information, in the file to be processed, of the keyword of each of the at least two text fragments; the first sorting unit is configured to sort the at least two text fragments according to the position information; and the first import unit is configured to import the sorted text fragments into the target presentation file template in order to obtain the target presentation file.
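Position-based sorting can be illustrated as follows, assuming the position information is the character offset at which a keyword first occurs in the file to be processed. That interpretation, the function name, and the choice to place keywords that are not found at the end are assumptions.

```python
def sort_by_position(document: str, fragments: dict[str, str]) -> list[str]:
    """Sort fragment texts by where each fragment's keyword first occurs.

    `fragments` maps a keyword to the text fragment it identifies.
    Keywords not found in the document are placed last (an assumption).
    """
    def position(keyword: str) -> int:
        idx = document.find(keyword)
        return idx if idx >= 0 else len(document)

    return [fragments[k] for k in sorted(fragments, key=position)]


doc = "Overview of history. Then food. Finally transport."
parts = {"transport": "T", "history": "H", "food": "F"}
print(sort_by_position(doc, parts))
# prints ['H', 'F', 'T']
```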

Wherein, the import module 14 further includes: a fifth acquisition unit, a second sorting unit, and a second import unit.

The fifth acquiring unit is configured to acquire the affiliation relationship between the keywords of every two text fragments in the at least two text fragments; the second sorting unit is configured to sort the at least two text fragments according to the affiliation relationship; and the second import unit is used to import the sorted text fragments into the target presentation file template in sequence to obtain the target presentation file.

Wherein, the import module 14 further includes: a sixth acquisition unit, a fifth determination unit, a processing unit, and a third import unit.

The sixth acquiring unit is configured to acquire the text feature information corresponding to each of the at least two text fragments; the fifth determining unit is configured to determine a preprocessing mode for each text fragment according to the text feature information corresponding to that text fragment, the preprocessing mode including at least one of simplification processing, split processing, correction processing, and typesetting processing; the processing unit is configured to process each text fragment according to its preprocessing mode to obtain at least two processed text fragments; and the third import unit is configured to import the processed at least two text fragments into the target presentation file template to obtain the target presentation file.

Please refer to FIG. 5, which is a schematic structural diagram of an electronic device according to an embodiment of the present application. As shown in FIG. 5, the electronic device in this embodiment may include one or more processors 21, one or more input devices 22, one or more output devices 23, and a memory 24. The processor 21, input device 22, output device 23, and memory 24 are connected by a bus 25.

The processor 21 may be a central processing unit (CPU), or another general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, and so on. The general-purpose processor may be a microprocessor or any conventional processor.

The input device 22 may include a touch panel, a fingerprint sensor (used to collect user fingerprint information and fingerprint orientation information), a microphone, and the like. The output device 23 may include a display (LCD, etc.), a speaker, and the like, and may output the data sheet after calibration processing.

The memory 24 may include a read-only memory and a random access memory, and provides instructions and data to the processor 21. A part of the memory 24 may also include a non-volatile random access memory. The memory 24 is used to store a computer program that includes program instructions. The processor 21 is used to execute the program instructions stored in the memory 24 to carry out a method for generating a presentation, that is, to perform the following operations: acquiring at least two keywords in a file to be processed and characteristic attribute information of the file to be processed, where the characteristic attribute information includes at least one of the field to which the file to be processed belongs, the number of keywords in the file to be processed, and the theme of the file to be processed; dividing the file to be processed according to the at least two keywords to obtain at least two text fragments, one text fragment corresponding to at least one keyword; identifying, from the presentation file template library, the target presentation file template matching the characteristic attribute information of the file to be processed; and importing the at least two text fragments into the target presentation file template to obtain the target presentation file.

Optionally, the processor 21 is configured to execute program instructions stored in the memory 24 to perform the following operations: determining the number of text fragments in the at least two text fragments according to the number of keywords in the file to be processed; acquiring the number of presentation pages included in each presentation file template in the presentation file template library; and determining, as the target presentation file template, the presentation file template in the library whose number of presentation pages equals the number of text fragments.

Optionally, the processor 21 is configured to execute program instructions stored in the memory 24 to perform the following operations: obtaining the theme of each presentation file template in the presentation file template library; determining the matching degree between the theme of each presentation file template in the library and the theme of the file to be processed; and selecting the presentation file template with the greatest matching degree from the library as the target presentation file template.

Optionally, the processor 21 is configured to execute program instructions stored in the memory 24 to perform the following operations: predicting, according to the field to which the file to be processed belongs, the attribute information of the presentation file corresponding to the file to be processed, this attribute information including the typesetting information and color information of that presentation file; acquiring the attribute information of each presentation file template in the presentation file template library, including the typesetting information and color information of each template; and determining, as the target presentation file template, the presentation file template in the library whose attribute information best matches the attribute information of the presentation file corresponding to the file to be processed.

Optionally, the processor 21 is configured to execute program instructions stored in the memory 24 to perform the following operations: obtaining the position information, in the file to be processed, of the keyword of each of the at least two text fragments; sorting the at least two text fragments according to the position information; and importing the sorted text fragments into the target presentation file template in sequence to obtain the target presentation file.

Optionally, the processor 21 is configured to execute program instructions stored in the memory 24 to perform the following operations: obtaining the affiliation relationship between the keywords of every two text fragments in the at least two text fragments; sorting the at least two text fragments according to the affiliation relationship; and importing the sorted text fragments into the target presentation file template in sequence to obtain the target presentation file.

Optionally, the processor 21 is configured to execute program instructions stored in the memory 24 to perform the following operations: obtaining the text feature information corresponding to each of the at least two text fragments; determining a preprocessing mode for each text fragment according to its text feature information, the preprocessing mode including at least one of simplification processing, split processing, correction processing, and typesetting processing; processing each text fragment according to its preprocessing mode to obtain at least two processed text fragments; and importing the processed at least two text fragments into the target presentation file template to obtain the target presentation file.

The processor 21, the input device 22, and the output device 23 described in the embodiments of this application can carry out the implementations described in the first and second embodiments of the presentation generation method provided in this application, and can also carry out the implementation of the electronic device described in the embodiments of this application, which will not be repeated here.

In the embodiment of this application, at least two keywords in the file to be processed and the characteristic attribute information of the file to be processed are acquired; the file to be processed is divided according to the at least two keywords to obtain at least two text fragments; a target presentation file template matching the characteristic attribute information of the file to be processed is identified from the presentation file template library; and the at least two text fragments are imported into the target presentation file template to obtain the target presentation file. In this solution, dividing the file to be processed according to the at least two keywords yields at least two text fragments, which facilitates generating a presentation page for each text fragment. The content of each text fragment is simplified, split, corrected, or typeset, which streamlines and corrects the content and improves the accuracy of the generated target presentation file. The target presentation file template that matches the characteristic attribute information is identified, and the at least two text fragments are imported into it to obtain the target presentation file, which includes the presentation page corresponding to each text fragment. When the at least two text fragments are imported into the target presentation file template, they are first sorted according to the affiliation relationship between the keywords of every two text fragments and then imported in the sorted order, so that any two presentation pages in the target presentation file have a reasonable sequence, which improves the accuracy of the generated target presentation file. Moreover, the entire process requires no manual participation and the result is output directly, which can improve the efficiency and flexibility of presentation generation and help ensure the accuracy and relevance of the presentation.

An embodiment of the present application also provides a computer-readable storage medium storing a computer program that includes program instructions. When the program instructions are executed by a processor, the presentation generation method shown in the embodiments of FIG. 1 and FIG. 3 is implemented. The computer-readable storage medium may be non-volatile or volatile.

The computer-readable storage medium may be an internal storage unit of the electronic device described in any of the foregoing embodiments, such as a hard disk or a memory of a control device. The computer-readable storage medium may also be an external storage device of the control device, such as a plug-in hard disk equipped on the control device, a smart memory card (Smart Media Card, SMC), or a secure digital (Secure Digital, SD) card, flash card (Flash Card), etc. Further, the computer-readable storage medium may also include both an internal storage unit of the control device and an external storage device. The computer-readable storage medium is used to store the computer program and other programs and data required by the control device. The computer-readable storage medium can also be used to temporarily store data that has been output or will be output.

As an example, the program instructions stored in the above-mentioned computer-readable storage medium may be deployed and executed on one computer device, on multiple computer devices located at one site, or on multiple computer devices distributed across multiple sites and interconnected by a communication network. Multiple computer devices distributed across multiple sites and interconnected by a communication network can form a blockchain network.

The above are only specific implementations of this application, but the protection scope of this application is not limited thereto. Any change or substitution that a person skilled in the art can readily conceive within the technical scope disclosed in this application shall be covered by the protection scope of this application. Therefore, the protection scope of this application shall be subject to the protection scope of the claims.

Claims

A method for generating presentations, including:

Obtaining at least two keywords in the file to be processed and characteristic attribute information of the file to be processed, where the characteristic attribute information includes at least one of the field to which the file to be processed belongs, the number of keywords in the file to be processed, and the theme of the file to be processed;

Dividing the file to be processed according to the at least two keywords to obtain at least two text fragments, one text fragment corresponding to at least one keyword;

From the presentation file template library, identify the target presentation file template that matches the characteristic attribute information of the file to be processed;

Import the at least two text fragments into the target presentation file template to obtain the target presentation file.
The method according to claim 1, wherein the characteristic attribute information includes the number of keywords in the file to be processed, and the identifying, from the presentation file template library, of the target presentation file template that matches the characteristic attribute information of the file to be processed includes:

Determining the number of text fragments in the at least two text fragments according to the number of keywords in the file to be processed;

Acquiring the number of presentation pages included in each presentation file template in the presentation file template library;

Determining, as the target presentation file template, the presentation file template in the presentation file template library whose number of presentation pages is the same as the number of text fragments.
The method according to claim 1, wherein the characteristic attribute information includes the theme of the file to be processed, and the identifying, from the presentation file template library, of the target presentation file template that matches the characteristic attribute information of the file to be processed includes:

Acquiring the theme of each presentation file template in the presentation file template library;

Respectively determine the matching degree between the theme of each presentation file template in the presentation file template library and the theme of the file to be processed;

Select the presentation file template with the greatest matching degree from the presentation file template library as the target presentation file template.
The method according to claim 1, wherein the characteristic attribute information includes the field to which the file to be processed belongs, and the identifying, from the presentation file template library, of the target presentation file template that matches the characteristic attribute information of the file to be processed includes:

Predicting, according to the field to which the file to be processed belongs, the attribute information of the presentation file corresponding to the file to be processed, where the attribute information of the presentation file corresponding to the file to be processed includes the typesetting information and color information of that presentation file;

Acquiring attribute information of each presentation file template in the presentation file template library, where the attribute information of each presentation file template includes typesetting information and color information of each presentation file template;

Determining, as the target presentation file template, the presentation file template in the presentation file template library whose attribute information has the greatest degree of matching with the attribute information of the presentation file corresponding to the file to be processed.
The method according to claim 1, wherein said importing said at least two text fragments into said target presentation file template to obtain a target presentation file comprises:

Acquiring the position information of the keyword of each of the at least two text fragments in the file to be processed;

Sorting the at least two text fragments according to the position information;

Sequentially importing the sorted at least two text fragments into the target presentation file template to obtain the target presentation file.
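The position-based ordering above might look like the following sketch. It assumes that the position information is the character offset of each keyword's first occurrence in the file, and that fragments are held in a mapping from keyword to fragment text; `sort_by_position` is a hypothetical name.

```python
from typing import Dict, List


def sort_by_position(document: str, fragments: Dict[str, str]) -> List[str]:
    """Order fragment texts by where their keyword first appears in document."""

    def position(keyword: str) -> int:
        index = document.find(keyword)
        # Keywords that are not found are placed at the end.
        return index if index >= 0 else len(document)

    return [fragments[keyword] for keyword in sorted(fragments, key=position)]
```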
The method according to claim 1, wherein said importing said at least two text fragments into said target presentation file template to obtain a target presentation file comprises:

Acquiring the affiliation relationship between keywords of every two text fragments in the at least two text fragments;

Sorting the at least two text fragments according to the affiliation relationship;

Sequentially importing the sorted at least two text fragments into the target presentation file template to obtain the target presentation file.
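One possible reading of the affiliation-based ordering above is that a keyword which belongs to another keyword should appear after it. The sketch below assumes the affiliation relationship is given as a mapping from each keyword to its parent keyword (or `None` for a top-level keyword) and orders the keywords depth-first; this representation is an assumption, not something the claim states.

```python
from typing import Dict, List, Optional


def sort_by_affiliation(parent: Dict[str, Optional[str]]) -> List[str]:
    """Return keywords so that each parent precedes its subordinate keywords."""
    children: Dict[Optional[str], List[str]] = {}
    for keyword, parent_keyword in parent.items():
        children.setdefault(parent_keyword, []).append(keyword)

    ordered: List[str] = []

    def visit(keyword: str) -> None:
        ordered.append(keyword)
        for child in children.get(keyword, []):
            visit(child)

    # Top-level keywords are those whose parent is None.
    for root in children.get(None, []):
        visit(root)
    return ordered
```

The fragments would then be imported in the order of their keywords in the returned list.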
The method according to claim 1, wherein said importing said at least two text fragments into said target presentation file template to obtain a target presentation file comprises:

Acquiring text feature information corresponding to each of the at least two text fragments;

Determining a preprocessing manner for each text fragment according to the text feature information corresponding to each text fragment, where the preprocessing manner includes at least one of simplification processing, split processing, correction processing, and typesetting processing;

Processing each of the text fragments according to the preprocessing manner of each of the text fragments to obtain at least two processed text fragments;

Importing the at least two processed text fragments into the target presentation file template to obtain the target presentation file.
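The choice of preprocessing manner above can be illustrated with a small sketch covering two of the four named manners (typesetting and split). The text features used here (irregular whitespace and length), the limit `MAX_CHARS`, and the function names are assumptions for illustration; simplification and correction are omitted.

```python
import textwrap
from typing import List

MAX_CHARS = 40  # hypothetical limit on how much text one slide area holds


def choose_preprocessing(fragment: str) -> List[str]:
    """Pick preprocessing manners from simple text features of the fragment."""
    normalized = " ".join(fragment.split())
    steps = []
    if fragment != normalized:       # irregular whitespace: needs typesetting
        steps.append("typesetting")
    if len(normalized) > MAX_CHARS:  # too long: needs splitting
        steps.append("split")
    return steps


def apply_preprocessing(fragment: str, steps: List[str]) -> List[str]:
    """Apply the chosen manners; returns one or more processed pieces."""
    text = fragment
    if "typesetting" in steps:
        text = " ".join(text.split())
    if "split" in steps:
        return textwrap.wrap(text, width=MAX_CHARS)
    return [text]
```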
The method according to claim 1, wherein the dividing the file to be processed according to the at least two keywords to obtain at least two text fragments comprises:

Acquiring the similarity between every two adjacent keywords in the at least two keywords;

Dividing, according to the similarity, the paragraphs corresponding to two adjacent keywords in the file to be processed into the same text fragment to obtain the at least two text fragments, wherein the two adjacent keywords may be located in adjacent paragraphs or in the same paragraph of the file to be processed.
The method according to claim 8, wherein said obtaining the similarity between every two adjacent keywords in the at least two keywords comprises:

Calculating the distance between every two adjacent keywords in the at least two keywords, and determining the similarity between every two adjacent keywords in the at least two keywords according to the distance.
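The grouping of paragraphs by keyword similarity, with the similarity derived from a distance, could be sketched as follows. The claims name neither the distance measure nor the grouping rule, so several things here are assumptions: each keyword is represented by a numeric vector, the distance is Euclidean, the similarity is `1 / (1 + distance)`, and paragraphs are merged when the similarity exceeds a hypothetical `threshold`.

```python
import math
from typing import List, Sequence


def similarity(u: Sequence[float], v: Sequence[float]) -> float:
    """Assumed mapping from distance to similarity: closer vectors score higher."""
    return 1.0 / (1.0 + math.dist(u, v))


def group_paragraphs(
    paragraphs: List[str],
    vectors: List[Sequence[float]],
    threshold: float = 0.5,
) -> List[List[str]]:
    """paragraphs[i] is the paragraph of keyword i, vectors[i] its vector.

    Paragraphs of adjacent keywords are placed in the same fragment when
    the keywords' similarity exceeds the threshold.
    """
    if not paragraphs:
        return []
    fragments = [[paragraphs[0]]]
    for i in range(1, len(paragraphs)):
        if similarity(vectors[i - 1], vectors[i]) > threshold:
            fragments[-1].append(paragraphs[i])
        else:
            fragments.append([paragraphs[i]])
    return fragments
```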
A presentation file generation apparatus, comprising:

An obtaining module, configured to obtain at least two keywords in the file to be processed and the characteristic attribute information of the file to be processed, where the characteristic attribute information includes at least one of the field to which the file to be processed belongs, the number of keywords in the file to be processed, and the theme of the file to be processed;

A dividing module, configured to divide the file to be processed according to the at least two keywords to obtain at least two text fragments, each text fragment corresponding to at least one keyword;

A recognition module, configured to identify, from the presentation file template library, the target presentation file template that matches the characteristic attribute information of the file to be processed;

An import module, configured to import the at least two text fragments into the target presentation file template to obtain the target presentation file.
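How the four modules fit together can be shown with a compact sketch. Everything concrete in it is an assumption made for illustration: keywords are detected by substring search against a caller-supplied list, the file is divided on blank lines, the recognition step is passed in as a callable, and the result is a plain dictionary rather than a real presentation file.

```python
from typing import Callable, Dict, List


def generate_presentation(
    text: str,
    candidate_keywords: List[str],
    recognize: Callable[[Dict[str, int]], str],
) -> Dict[str, object]:
    # Obtaining module: keywords present in the file, plus one attribute.
    keywords = [k for k in candidate_keywords if k in text]
    attributes = {"keyword_count": len(keywords)}

    # Dividing module: one fragment per keyword, built from its paragraphs.
    paragraphs = [p for p in text.split("\n\n") if p.strip()]
    fragments = [" ".join(p for p in paragraphs if k in p) for k in keywords]

    # Recognition module: choose a template from the attributes.
    template = recognize(attributes)

    # Import module: place the fragments into the chosen template.
    return {"template": template, "slides": fragments}
```

Passing the recognition step as a parameter keeps the sketch neutral about which matching rule (keyword count, theme, or field) is used.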
An electronic device, including:

A processor, suitable for implementing one or more instructions; and

A computer-readable storage medium storing one or more instructions, and the one or more instructions are suitable for being loaded by the processor and executing the following steps:

Obtaining at least two keywords in the file to be processed, and characteristic attribute information of the file to be processed, where the characteristic attribute information includes at least one of the field to which the file to be processed belongs, the number of keywords in the file to be processed, and the theme of the file to be processed;

Dividing the file to be processed according to the at least two keywords to obtain at least two text fragments, one text fragment corresponding to at least one keyword;

Identifying, from the presentation file template library, the target presentation file template that matches the characteristic attribute information of the file to be processed;

Importing the at least two text fragments into the target presentation file template to obtain the target presentation file.
The electronic device according to claim 11, wherein the characteristic attribute information includes the number of keywords in the file to be processed, and the processor is configured to:

Determining the number of text fragments in the at least two text fragments according to the number of keywords in the file to be processed;

Acquiring the number of presentations included in each presentation file template in the presentation file template library;

Determining, as the target presentation file template, the presentation file template in the presentation file template library whose number of presentations is the same as the number of text fragments.
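The count-based selection above reduces to an equality test, sketched below. It assumes one text fragment per keyword and represents the template library as a mapping from template name to the number of presentations it contains; both are simplifications, and `select_template_by_count` is a hypothetical name.

```python
from typing import Dict, Optional


def select_template_by_count(
    keyword_count: int, template_sizes: Dict[str, int]
) -> Optional[str]:
    """Return a template whose size equals the number of text fragments."""
    fragment_count = keyword_count  # assumption: one fragment per keyword
    for name, size in template_sizes.items():
        if size == fragment_count:
            return name
    return None  # no template of the required size
```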
The electronic device according to claim 11, wherein the characteristic attribute information includes the theme of the file to be processed, and the processor is configured to:

Acquiring the theme of each presentation file template in the presentation file template library;

Determining, respectively, the matching degree between the theme of each presentation file template in the presentation file template library and the theme of the file to be processed;

Selecting the presentation file template with the greatest matching degree from the presentation file template library as the target presentation file template.
The electronic device according to claim 11, wherein the characteristic attribute information includes the field to which the file to be processed belongs, and the processor is configured to:

Predicting, according to the field to which the file to be processed belongs, attribute information of the presentation file corresponding to the file to be processed, where that attribute information includes typesetting information and color information of the presentation file corresponding to the file to be processed;

Acquiring attribute information of each presentation file template in the presentation file template library, where the attribute information of each presentation file template includes typesetting information and color information of each presentation file template;

Determining, as the target presentation file template, the presentation file template in the presentation file template library whose attribute information has the greatest degree of matching with the attribute information of the presentation file corresponding to the file to be processed.
The electronic device according to claim 11, wherein the processor is configured to:

Acquiring the position information of the keyword of each of the at least two text fragments in the file to be processed;

Sorting the at least two text fragments according to the position information;

Sequentially importing the sorted at least two text fragments into the target presentation file template to obtain the target presentation file.
The electronic device according to claim 11, wherein the processor is configured to:

Acquiring the affiliation relationship between keywords of every two text fragments in the at least two text fragments;

Sorting the at least two text fragments according to the affiliation relationship;

Sequentially importing the sorted at least two text fragments into the target presentation file template to obtain the target presentation file.
The electronic device according to claim 11, wherein the processor is configured to:

Acquiring text feature information corresponding to each of the at least two text fragments;

Determining a preprocessing manner for each text fragment according to the text feature information corresponding to each text fragment, where the preprocessing manner includes at least one of simplification processing, split processing, correction processing, and typesetting processing;

Processing each of the text fragments according to the preprocessing manner of each of the text fragments to obtain at least two processed text fragments;

Importing the at least two processed text fragments into the target presentation file template to obtain the target presentation file.
The electronic device according to claim 11, wherein the processor is configured to:

Acquiring the similarity between every two adjacent keywords in the at least two keywords;

Dividing, according to the similarity, the paragraphs corresponding to two adjacent keywords in the file to be processed into the same text fragment to obtain the at least two text fragments, wherein the two adjacent keywords may be located in adjacent paragraphs or in the same paragraph of the file to be processed.
The electronic device according to claim 18, wherein the processor is configured to:

Calculating the distance between every two adjacent keywords in the at least two keywords, and determining the similarity between every two adjacent keywords in the at least two keywords according to the distance.
A computer-readable storage medium, wherein the computer-readable storage medium stores one or more instructions, and the one or more instructions are suitable for being loaded by a processor and executing the following steps:

Obtaining at least two keywords in the file to be processed, and characteristic attribute information of the file to be processed, where the characteristic attribute information includes at least one of the field to which the file to be processed belongs, the number of keywords in the file to be processed, and the theme of the file to be processed;

Dividing the file to be processed according to the at least two keywords to obtain at least two text fragments, one text fragment corresponding to at least one keyword;

Identifying, from the presentation file template library, the target presentation file template that matches the characteristic attribute information of the file to be processed;

Importing the at least two text fragments into the target presentation file template to obtain the target presentation file.