
CN113901778B - Query function generation method, system and device - Google Patents

Publication number
CN113901778B
Authority
CN
China
Prior art keywords
query
target
subdivision
gist
historical
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202111277389.0A
Other languages
Chinese (zh)
Other versions
CN113901778A (en)
Inventor
毛瑞彬
朱菁
潘斌强
杨雯雯
刘金香
孙德旺
武李爱
张俊
杨建明
张大千
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
SHENZHEN SECURITIES INFORMATION CO Ltd
Original Assignee
SHENZHEN SECURITIES INFORMATION CO Ltd
Application filed by SHENZHEN SECURITIES INFORMATION CO Ltd
Priority to CN202111277389.0A
Publication of CN113901778A
Application granted
Publication of CN113901778B

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/10 Text processing
    • G06F40/166 Editing, e.g. inserting or deleting
    • G06F40/186 Templates
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/35 Clustering; Classification
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/237 Lexical tools
    • G06F40/242 Dictionaries
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/279 Recognition of textual entities
    • G06F40/284 Lexical analysis, e.g. tokenisation or collocates
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00 Machine learning
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods

Abstract

The embodiments of the application disclose an inquiry letter generation method, system and device (the "query function" of the title is an inquiry letter) for generating an inquiry letter that targets abnormal audit points in an issuer's prospectus, thereby assisting manual drafting and saving labor and time. The method comprises: obtaining a target generation model, wherein the target generation model is obtained by machine learning training of an initial generation model on historical inquiry letters and historical prospectuses, a rule dictionary for generating a target inquiry letter is stored in the target generation model, each historical inquiry letter corresponds to a historical prospectus and was generated according to that prospectus, and the rule dictionary is obtained by machine learning on the historical inquiry letters and the historical prospectuses combined with manual experience; acquiring a text vector of a target prospectus; and inputting the text vector of the target prospectus into the target generation model, the target generation model outputting, according to the rule dictionary, a target inquiry letter corresponding to the target prospectus.

Description

Query function generation method, system and device
Technical Field
The embodiments of the application relate to the field of algorithm models, and in particular to an inquiry letter generation method, system and device.
Background
The registration-based system is a securities issuance regime centered on information disclosure. To address problems of authenticity, completeness and accuracy in the information disclosed for public offerings, mergers and acquisitions, restructurings and refinancing, the exchange issues an inquiry letter to the issuer, and the issuer and the issuing intermediaries, such as the sponsor, the accounting firm and the law firm, reply. Over multiple rounds of inquiry, the disclosure is supplemented until it is authentic, complete and accurate, so that investors can fully understand the issuer's investment value and risks, capital-market risk is reduced, and the capital market is better able to serve the real economy.
Sending an inquiry letter to the issuer is an important way of exercising the power to review securities issuance under the registration-based system. The inquiry letter questions the abnormal audit points in the issuer's prospectus, and the issuer replies to it, which ensures that the issuer's disclosure is sufficient and protects investors' right to know. However, writing an inquiry letter requires an auditor to read the application materials, consult the relevant information, judge whether the information disclosed in the prospectus is authentic, complete and accurate, and write a report before putting questions to the issuer, the sponsor, the accounting firm or the law firm.
In practice, inquiry letters are written entirely by hand, which means a heavy workload and a large labor cost.
Disclosure of Invention
The embodiments of the application provide an inquiry letter generation method, system and device for generating an inquiry letter that targets abnormal audit points in an issuer's prospectus, thereby assisting manual drafting and saving labor and time.
The inquiry letter generation method provided by the embodiments of the application comprises:
obtaining a target generation model, wherein the target generation model is obtained by machine learning training of an initial generation model on historical inquiry letters and historical prospectuses, a rule dictionary for generating a target inquiry letter is stored in the target generation model, each historical inquiry letter corresponds to a historical prospectus and was generated according to that prospectus, and the rule dictionary is obtained by machine learning on the historical inquiry letters and the historical prospectuses combined with manual experience;
acquiring a text vector of a target prospectus;
and inputting the text vector of the target prospectus into the target generation model, the target generation model outputting, according to the rule dictionary, a target inquiry letter corresponding to the target prospectus.
Optionally, before the obtaining of the target generation model, the method further comprises:
acquiring the historical inquiry letters and the historical prospectuses, and obtaining their text vectors by using a text vector algorithm;
and inputting the text vectors of the historical inquiry letters and the historical prospectuses into the initial generation model as training samples, and performing machine learning training on the initial generation model with the training samples to obtain the target generation model, in which the rule dictionary is stored.
Optionally, obtaining the rule dictionary by machine learning on the historical inquiry letters and the historical prospectuses combined with manual experience comprises:
decomposing each historical inquiry letter into inquiry background paragraphs and inquiry question paragraphs by using a classification algorithm, and identifying the sub-questions contained in the inquiry question paragraphs, wherein one audit point corresponds to at least one inquiry background paragraph and at least one inquiry question paragraph, and the inquiry background paragraph and the inquiry question paragraph corresponding to the same audit point are associated with each other;
determining the audit point corresponding to each sub-question, wherein one audit point corresponds to at least one sub-question, to form a first catalog representing the correspondence between audit points and sub-questions;
acquiring the keywords contained in the sub-questions, and taking the keywords as the inquiry directions of the sub-questions;
clustering the sub-questions corresponding to each audit point by inquiry direction to obtain a second catalog representing the correspondence among audit points, inquiry directions and sub-questions, wherein one audit point corresponds to at least one inquiry direction and one inquiry direction corresponds to at least one sub-question;
inputting the sub-questions into a neural network algorithm model to generate question templates, and obtaining a third catalog representing the correspondence among audit points, inquiry directions and question templates, wherein one audit point corresponds to at least one inquiry direction and one inquiry direction corresponds to at least one question template;
determining a trigger condition for each audit point according to the sub-questions to obtain a fourth catalog representing the correspondence among audit points, inquiry directions, question templates and trigger conditions, wherein the trigger condition is used to judge whether an inquiry should be raised on the audit point;
performing machine learning on the inquiry background paragraphs and the historical prospectuses to obtain a summary rewrite rule for generating inquiry background paragraphs;
wherein the rule dictionary comprises the fourth catalog and the summary rewrite rule.
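As an illustration only, the rule dictionary described above could be held in a structure like the following. This is a minimal sketch, not the patented implementation: the class names, field names and the lookup helper are all hypothetical, and the trigger condition and summary rewrite rules are reduced to plain strings.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class CatalogEntry:
    """One row of the fourth catalog (all field names are hypothetical)."""
    audit_point: str
    inquiry_direction: str
    question_templates: List[str]
    trigger_condition: str  # simplified here to a label such as "rule" or "content"


@dataclass
class RuleDictionary:
    """Fourth catalog plus summary rewrite rules, as listed in the text above."""
    fourth_catalog: List[CatalogEntry] = field(default_factory=list)
    summary_rewrite_rules: List[str] = field(default_factory=list)

    def templates_for(self, audit_point: str) -> Dict[str, List[str]]:
        """Map each inquiry direction of one audit point to its question templates."""
        result: Dict[str, List[str]] = {}
        for entry in self.fourth_catalog:
            if entry.audit_point == audit_point:
                result.setdefault(entry.inquiry_direction, []).extend(
                    entry.question_templates
                )
        return result
```

A nested lookup of this kind reflects the one-to-many relations stated above: one audit point has at least one inquiry direction, and one inquiry direction has at least one question template.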
Optionally, acquiring the keywords contained in the sub-questions and taking the keywords as the inquiry directions of the sub-questions comprises:
performing word-strength-based word segmentation matching on the sub-questions to obtain a preprocessed word segmentation set, and processing the preprocessed word segmentation set with a text vector algorithm and a text classification algorithm to obtain a keyword seed lexicon;
expanding the keyword seed lexicon with a synonym dictionary built from manual experience to obtain a keyword lexicon;
or expanding the keyword seed lexicon by calculating word similarity with a text vector algorithm to obtain the keyword lexicon;
and obtaining the keywords contained in the sub-questions according to the keyword lexicon, and taking the keywords as the inquiry directions of the sub-questions.
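The similarity-based expansion of the seed lexicon can be sketched as follows. The text does not name a similarity measure or a cutoff, so cosine similarity, the 0.9 threshold and the two-dimensional toy vectors are all assumptions made for illustration.

```python
import math
from typing import Dict, List, Set


def cosine(u: List[float], v: List[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def expand_lexicon(seeds: Set[str],
                   vectors: Dict[str, List[float]],
                   threshold: float = 0.9) -> Set[str]:
    """Add every word whose vector is close enough to at least one seed word."""
    expanded = set(seeds)
    for word, vec in vectors.items():
        if word in seeds:
            continue
        if any(s in vectors and cosine(vec, vectors[s]) >= threshold for s in seeds):
            expanded.add(word)
    return expanded


# Toy word vectors, invented for the example.
TOY_VECTORS = {
    "revenue": [1.0, 0.0],
    "income": [0.9, 0.1],
    "lawsuit": [0.0, 1.0],
}
```

With these toy vectors, `expand_lexicon({"revenue"}, TOY_VECTORS)` adds "income" and leaves "lawsuit" out.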
Optionally, inputting the sub-questions into the neural network algorithm model to generate question templates comprises:
obtaining text vectors of the sub-questions by using a text vector algorithm, and inputting the text vectors of the sub-questions as training samples into an initial text recognition model for machine learning training to obtain a target text recognition model, in which a method for recognizing the entity text, attribute text and attribute value text in the sub-questions is stored;
and replacing the entity text, the attribute text and the attribute value text in the sub-questions with placeholders to obtain the question templates.
Optionally, determining the trigger condition of the audit point according to the sub-questions comprises:
dividing the trigger conditions of the audit point into two types, a rule mode and a content mode, wherein the trigger condition of the rule mode comprises:
presetting an attribute value threshold according to the attribute value in the sub-question, and determining to trigger the audit point if the attribute value in the historical prospectus does not satisfy the attribute value threshold, wherein the attribute value in the sub-question of the historical inquiry letter corresponds to the attribute value of the corresponding audit point in the historical prospectus;
and the trigger condition of the content mode comprises:
performing machine learning training with the historical inquiry letters and the historical prospectuses as training samples to obtain a binary classification model for judging whether an abnormal audit point exists;
and determining to trigger the audit point when the target prospectus is input into the binary classification model and the model concludes that an abnormal audit point exists.
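The two trigger modes can be sketched as two small predicates. This is an illustrative reading, not the patented logic: the function names, the `higher_is_normal` flag, the 0.5 cutoff and the idea that the classifier returns a probability are all assumptions.

```python
from typing import Callable, Sequence


def rule_mode_triggered(value: float,
                        threshold: float,
                        higher_is_normal: bool = True) -> bool:
    """Rule mode: trigger the audit point when the value fails the preset threshold."""
    if higher_is_normal:
        satisfied = value >= threshold
    else:
        satisfied = value <= threshold
    return not satisfied


def content_mode_triggered(text_vector: Sequence[float],
                           classifier: Callable[[Sequence[float]], float],
                           cutoff: float = 0.5) -> bool:
    """Content mode: trigger when a binary classifier flags an abnormal audit point.

    `classifier` stands in for the trained binary classification model and is
    assumed to return the probability that an abnormal audit point exists.
    """
    return classifier(text_vector) >= cutoff
```

For example, with a hypothetical threshold of 40.0 for an overseas revenue share, a value of 23.57 triggers the audit point and a value of 44.71 does not.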
Optionally, performing machine learning on the inquiry background paragraphs and the historical prospectuses to obtain the summary rewrite rule for generating inquiry background paragraphs comprises:
performing machine learning training with the inquiry background paragraphs and the historical prospectuses as training samples to obtain the correspondence between the inquiry background paragraphs and the content of the historical prospectuses;
and determining, according to the correspondence, the summary rewrite rule for generating inquiry background paragraphs, wherein the summary rewrite rule is used to summarize and rewrite the text paragraphs of the historical prospectus that correspond to an inquiry background paragraph so as to generate the inquiry background paragraph.
The inquiry letter generation system provided by the embodiments of the application comprises:
an acquisition unit, configured to obtain a target generation model, wherein the target generation model is obtained by machine learning training of an initial generation model on historical inquiry letters and historical prospectuses, a rule dictionary for generating a target inquiry letter is stored in the target generation model, each historical inquiry letter corresponds to a historical prospectus and was generated according to that prospectus, and the rule dictionary is obtained by machine learning on the historical inquiry letters and the historical prospectuses combined with manual experience;
the acquisition unit being further configured to acquire a text vector of a target prospectus;
and an input unit, configured to input the text vector of the target prospectus into the target generation model, the target generation model outputting, according to the rule dictionary, a target inquiry letter corresponding to the target prospectus.
Optionally, the acquisition unit is further configured to acquire, before the target generation model is obtained, the historical inquiry letters and the historical prospectuses, and to obtain their text vectors by using a text vector algorithm;
and to input the text vectors of the historical inquiry letters and the historical prospectuses into the initial generation model as training samples, and perform machine learning training on the initial generation model with the training samples to obtain the target generation model, in which the rule dictionary is stored.
Optionally, the acquisition unit is specifically configured to decompose each historical inquiry letter into inquiry background paragraphs and inquiry question paragraphs by using a classification algorithm, and identify the sub-questions contained in the inquiry question paragraphs, wherein one audit point corresponds to at least one inquiry background paragraph and at least one inquiry question paragraph, and the inquiry background paragraph and the inquiry question paragraph corresponding to the same audit point are associated with each other;
determine the audit point corresponding to each sub-question, wherein one audit point corresponds to at least one sub-question, to form a first catalog representing the correspondence between audit points and sub-questions;
acquire the keywords contained in the sub-questions, and take the keywords as the inquiry directions of the sub-questions;
cluster the sub-questions corresponding to each audit point by inquiry direction to obtain a second catalog representing the correspondence among audit points, inquiry directions and sub-questions, wherein one audit point corresponds to at least one inquiry direction and one inquiry direction corresponds to at least one sub-question;
input the sub-questions into a neural network algorithm model to generate question templates, and obtain a third catalog representing the correspondence among audit points, inquiry directions and question templates, wherein one audit point corresponds to at least one inquiry direction and one inquiry direction corresponds to at least one question template;
determine a trigger condition for each audit point according to the sub-questions to obtain a fourth catalog representing the correspondence among audit points, inquiry directions, question templates and trigger conditions, wherein the trigger condition is used to judge whether an inquiry should be raised on the audit point;
and perform machine learning on the inquiry background paragraphs and the historical prospectuses to obtain a summary rewrite rule for generating inquiry background paragraphs;
wherein the rule dictionary comprises the fourth catalog and the summary rewrite rule.
Optionally, the acquisition unit is further specifically configured to perform word-strength-based word segmentation matching on the sub-questions to obtain a preprocessed word segmentation set, and process the preprocessed word segmentation set with a text vector algorithm and a text classification algorithm to obtain a keyword seed lexicon;
expand the keyword seed lexicon with a synonym dictionary built from manual experience to obtain a keyword lexicon;
or expand the keyword seed lexicon by calculating word similarity with a text vector algorithm to obtain the keyword lexicon;
and obtain the keywords contained in the sub-questions according to the keyword lexicon, and take the keywords as the inquiry directions of the sub-questions.
Optionally, the acquisition unit is further specifically configured to obtain text vectors of the sub-questions by using a text vector algorithm, and input the text vectors of the sub-questions as training samples into an initial text recognition model for machine learning training to obtain a target text recognition model, in which a method for recognizing the entity text, attribute text and attribute value text in the sub-questions is stored;
and replace the entity text, the attribute text and the attribute value text in the sub-questions with placeholders to obtain the question templates.
Optionally, the acquisition unit is further specifically configured to determine the trigger condition of the audit point according to the sub-questions, wherein the trigger conditions of the audit point are divided into two types, a rule mode and a content mode, and the trigger condition of the rule mode comprises:
presetting an attribute value threshold according to the attribute value in the sub-question, and determining to trigger the audit point if the attribute value in the historical prospectus does not satisfy the attribute value threshold, wherein the attribute value in the sub-question of the historical inquiry letter corresponds to the attribute value of the corresponding audit point in the historical prospectus;
and the trigger condition of the content mode comprises:
performing machine learning training with the historical inquiry letters and the historical prospectuses as training samples to obtain a binary classification model for judging whether an abnormal audit point exists;
and determining to trigger the audit point when the target prospectus is input into the binary classification model and the model concludes that an abnormal audit point exists.
Optionally, the acquisition unit is further specifically configured to perform machine learning training with the inquiry background paragraphs and the historical prospectuses as training samples to obtain the correspondence between the inquiry background paragraphs and the content of the historical prospectuses;
and determine, according to the correspondence, the summary rewrite rule for generating inquiry background paragraphs, wherein the summary rewrite rule is used to summarize and rewrite the text paragraphs of the historical prospectus that correspond to an inquiry background paragraph so as to generate the inquiry background paragraph.
The embodiments of the application also provide an inquiry letter generation device, comprising:
a central processing unit, a memory and an input/output interface;
wherein the memory is a transient memory or a persistent memory;
and the central processing unit is configured to communicate with the memory and to execute the instruction operations in the memory so as to perform the aforementioned inquiry letter generation method.
The embodiments of the application also provide a computer-readable storage medium comprising instructions which, when run on a computer, cause the computer to execute the aforementioned inquiry letter generation method.
As can be seen from the above technical solutions, the embodiments of the application have the following advantages:
the target generation model for generating the target inquiry letter is obtained by machine learning on the historical inquiry letters and the historical prospectuses, and the rule dictionary for generating the target inquiry letter is stored in the target generation model, so that once the target prospectus is input into the target generation model, the model can output the target inquiry letter, reducing the labor and time that purely manual drafting requires.
Drawings
FIG. 1 is a schematic diagram of an embodiment of the inquiry letter generation method according to the embodiments of the application;
FIG. 2 is a schematic diagram of another embodiment of the inquiry letter generation method according to the embodiments of the application;
FIG. 3 is a schematic diagram of an embodiment of the inquiry letter generation system according to the embodiments of the application;
FIG. 4 is a schematic diagram of an embodiment of the inquiry letter generation device according to the embodiments of the application.
Detailed Description
The embodiments of the application provide an inquiry letter generation method, system and device for generating an inquiry letter that targets abnormal audit points in an issuer's prospectus, thereby assisting manual drafting and saving labor and time.
In practice, inquiry letters are written entirely by hand, which means a heavy workload and a large labor cost. A method is therefore needed that uses a computer to generate inquiry letters automatically and so assists manual drafting, and the embodiments of the application achieve this technical effect. In the embodiments of the application, for the abnormal audit points that appear in information disclosure documents such as an issuer's prospectus, suitable inquiry directions and sub-questions are recommended, question templates are filled with entities, attributes and attribute values to generate question paragraphs, and inquiry background paragraphs are generated, so that a complete inquiry letter is finally produced. The generated inquiry letter can be sent to the issuer after only light manual polishing, or even with no manual polishing at all.
It should be noted that, in the embodiments of the application, "inquiry letter" refers to any text data that raises questions about an issuer's publicly disclosed text data, including but not limited to the form of an inquiry letter, and "prospectus" refers to any text data related to the issuance of securities, including but not limited to the form of a prospectus. These names are adopted for convenience of expression and clarity of the technical solution.
Referring to FIG. 1, an embodiment of the inquiry letter generation method according to the embodiments of the application comprises steps 101 to 103.
101. Obtain a target generation model.
The target generation model is established in advance and is used to generate a target inquiry letter. The target inquiry letter corresponds to the target prospectus and is used to raise inquiries about the abnormal audit points in the target prospectus. Specifically, the target generation model is obtained by machine learning training of an initial generation model on historical inquiry letters and historical prospectuses; a rule dictionary for generating the target inquiry letter is stored in the target generation model, and the rule dictionary is obtained by machine learning on the historical inquiry letters and the historical prospectuses combined with manual experience.
The rule dictionary includes rules for generating the inquiry question paragraphs of the target inquiry letter and rules for generating its inquiry background paragraphs. In the stage of building the target generation model, each historical inquiry letter is first decomposed into inquiry background paragraphs and inquiry question paragraphs by a classification algorithm, according to the text features of the letter. The text features can be represented by text vectors: a text vector algorithm, such as an embedding algorithm, converts the historical inquiry letter into text vectors that correspond to its semantics, and the letter is decomposed into inquiry background paragraphs and inquiry question paragraphs according to those vectors. Other text semantic analysis methods that achieve the same or a similar effect may also be used for the decomposition, which is not limited here.
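The decomposition step can be illustrated with a deliberately simple stand-in. The text calls for an embedding algorithm followed by a classification algorithm without naming either, so the sketch below swaps in bag-of-words counts and a nearest-centroid rule; the function names and the four training sentences are invented for the example.

```python
import math
from collections import Counter
from typing import Dict, List, Tuple


def bow(text: str) -> Counter:
    """Bag-of-words vector, used here as a stand-in for a learned text embedding."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


def train_centroids(samples: List[Tuple[str, str]]) -> Dict[str, Counter]:
    """Sum the vectors of each label's paragraphs into one centroid per label."""
    centroids: Dict[str, Counter] = {}
    for text, label in samples:
        centroids.setdefault(label, Counter()).update(bow(text))
    return centroids


def classify(paragraph: str, centroids: Dict[str, Counter]) -> str:
    """Assign the label whose centroid is most similar to the paragraph."""
    vec = bow(paragraph)
    return max(centroids, key=lambda label: cosine(vec, centroids[label]))


# Invented labeled paragraphs for illustration.
TRAINING = [
    ("according to the prospectus the issuer reported overseas sales", "background"),
    ("the application documents show that the issuer has three subsidiaries", "background"),
    ("please explain the reason for the decrease", "question"),
    ("please disclose whether the pricing is fair", "question"),
]
```

Splitting a letter then amounts to calling `classify` on each paragraph with `train_centroids(TRAINING)`; a production system would replace both the vectors and the classifier with trained models.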
After the inquiry question paragraphs are obtained, the sub-questions they contain are identified. Similarly, a text vector algorithm, such as an embedding algorithm, may be used to further decompose the inquiry question paragraphs into sub-questions, or other text semantic analysis methods that achieve the same or a similar effect may be used. One audit point corresponds to at least one inquiry background paragraph and at least one inquiry question paragraph, and the inquiry background paragraph and the inquiry question paragraph corresponding to the same audit point are associated with each other.
The audit point corresponding to each sub-question in the inquiry question paragraphs is then determined, and a first catalog representing the correspondence between audit points and sub-questions is formed.
The keywords contained in the sub-questions are obtained by using a word segmentation algorithm and taken as the inquiry directions of the sub-questions. The word segmentation algorithm may be the BPE algorithm or another algorithm that achieves the same or a similar effect, which is not limited here. Specifically, the word segmentation algorithm is first used to remove the non-key phrases from the sub-questions to obtain a preprocessed word segmentation set; the text vectors and tf-idf values of the words in the preprocessed word segmentation set are then calculated, and which words are keywords is judged in combination with the grammatical role of each word in its sentence, giving a keyword seed lexicon. The keyword seed lexicon can be obtained by building a keyword classification model: for example, the text vector of each word in the preprocessed word segmentation set, its tf-idf value and its grammatical role in the sentence are taken as the input of the keyword classification model, and the output is whether the word is a keyword. The keyword classification model can be implemented by choosing an RNN, a CNN or a Transformer as the encoder and stacking a sigmoid classifier on top. Because different historical inquiry letters and historical prospectuses are written by different people, their terminology may differ in style, so the keywords in the keyword seed lexicon cover only a narrow range, and the seed lexicon needs to be expanded to form the keyword lexicon. The keyword seed lexicon is expanded either with a synonym dictionary built from manual experience or by calculating word similarity with a text vector algorithm.
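The tf-idf feature mentioned above can be computed as in the sketch below. The text does not say which tf-idf variant is used, so the plain form is assumed here: term frequency is the count divided by the document length, and inverse document frequency is log(N / df). Each sub-question is treated as one already-segmented document.

```python
import math
from collections import Counter
from typing import Dict, List


def tf_idf(documents: List[List[str]]) -> List[Dict[str, float]]:
    """Return one {term: tf-idf score} mapping per tokenized document."""
    n_docs = len(documents)
    # Document frequency: in how many documents each term appears at least once.
    doc_freq = Counter(term for doc in documents for term in set(doc))
    scores = []
    for doc in documents:
        counts = Counter(doc)
        scores.append({
            term: (count / len(doc)) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return scores
```

A word that occurs in every sub-question scores zero under this variant, which is why tf-idf is a reasonable signal for separating generic phrasing from candidate keywords before the classifier makes the final decision.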
The keywords contained in the subdivision questions are then obtained according to the keyword lexicon and used as the query directions of the subdivision questions. The subdivision questions corresponding to each audit point are clustered by query direction to obtain a second catalog representing the correspondence among audit points, query directions and subdivision questions, where one audit point corresponds to at least one query direction and one query direction corresponds to at least one subdivision question.
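One possible shape for the second catalog is a nested mapping from audit point to query direction to subdivision questions. The sketch below assumes that a query direction is simply a lexicon keyword found in the question text, so the "clustering" is reduced to grouping by keyword match; `build_second_catalog`, the sample records and the lexicon are all hypothetical.

```python
from collections import defaultdict

def build_second_catalog(records, keyword_lexicon):
    """Group subdivision questions by audit point, then by query direction."""
    catalog = defaultdict(lambda: defaultdict(list))
    for audit_point, question in records:
        lowered = question.lower()
        # Every lexicon keyword found in the question counts as a query direction.
        for keyword in keyword_lexicon:
            if keyword in lowered:
                catalog[audit_point][keyword].append(question)
    # Convert to plain dicts so the result is easy to inspect or serialize.
    return {point: dict(directions) for point, directions in catalog.items()}

# Hypothetical (audit point, subdivision question) pairs from the first catalog.
records = [
    ("Revenue", "Please explain the decrease in overseas revenue."),
    ("Revenue", "Please explain how overseas revenue is recognized."),
    ("Inventory", "Please explain the increase in inventory write-downs."),
]
lexicon = ["overseas revenue", "inventory"]
second_catalog = build_second_catalog(records, lexicon)
```

The nested structure directly encodes the stated constraints: each audit point key holds at least one query direction, and each query direction holds at least one subdivision question.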
The subdivision questions are input into a neural network model to generate question templates, yielding a third catalog representing the correspondence among audit points, query directions and question templates, where one audit point corresponds to at least one query direction and one query direction corresponds to at least one question template.
Specifically, a text vector algorithm is used to obtain text vectors of the subdivision questions, and these text vectors are input as training samples into an initial text recognition model for machine learning training to obtain a target text recognition model. The target text recognition model stores the method for recognizing entity text, attribute text and attribute value text in subdivision questions; replacing the entity text, attribute text and attribute value text in a subdivision question with placeholders yields a question template.
For example, the entities in a subdivision question mainly include dates, company names, personal names, amounts and quantities; an attribute describes a property of an entity; and an attribute value is the numerical value of that attribute, such as a year, an amount or a proportion. Because the entities are closely tied to the content of the historical prospectus, and the entities, attributes and attribute values in a subdivision question are derived from the prospectus, a question template can be generated first, and a query function for a given prospectus can then be generated simply by filling the question template with entities, attributes and attribute values. For example, suppose a subdivision question in a historical query function reads "In 2018, 2019 and 2020, the company's overseas sales revenue accounted for 43.49%, 44.71% and 23.57% of main business revenue, respectively. Please explain the reason for the decrease in main business revenue." After the entities, attributes and attribute values are removed and placeholders are filled in, the question template is "[ year ] year, [ POOSITMBI ], [ POOSITMBI ] and [ POOSITMBI ] respectively. Please explain the reason for the decrease in main business revenue."
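The placeholder substitution can be illustrated with a deliberately simplified stand-in. The embodiment uses a trained text recognition model to find entities, attributes and attribute values; the sketch below swaps in two regular expressions that only recognize years and percentages, and the placeholder names `[YEAR]` and `[ATTRIBUTE_VALUE]` are assumptions made for this example.

```python
import re

# Simplified recognizers standing in for the trained text recognition model.
YEAR = re.compile(r"\b(?:19|20)\d{2}\b")
PERCENT = re.compile(r"\d+(?:\.\d+)?%")

def to_template(question):
    """Replace recognized percentages and years with placeholders."""
    # Percentages are replaced first so their digits are never read as years.
    template = PERCENT.sub("[ATTRIBUTE_VALUE]", question)
    template = YEAR.sub("[YEAR]", template)
    return template

question = (
    "In 2018, 2019 and 2020, overseas sales accounted for 43.49%, 44.71% "
    "and 23.57% of main business revenue, respectively."
)
template = to_template(question)
```

A real implementation would also need to recognize company names, personal names and amounts, which is where a learned sequence-labeling model becomes necessary.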
Then, the triggering condition of each audit point is determined according to the subdivision questions, yielding a fourth catalog representing the correspondence among audit points, query directions, question templates and triggering conditions, where a triggering condition is used to judge whether an inquiry should be raised for the audit point.
Specifically, the triggering conditions of an audit point are of two types, a rule mode and a content mode. The triggering condition of the rule mode comprises:
An attribute value threshold is preset according to the attribute values in the subdivision questions, and the audit point is triggered if an attribute value in the historical prospectus does not meet the attribute value threshold. The attribute value threshold can be obtained through formulas, regular expressions and the four basic arithmetic operations, and the attribute values in the subdivision questions of the historical query function correspond to the attribute values in the historical prospectus for the same audit point.
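A rule-mode trigger can be sketched as a comparison of extracted attribute values against a preset threshold. The helper `rule_triggered`, the comparison operator table and the threshold of 30.0 are hypothetical; the source only states that the audit point is triggered when an attribute value does not meet the threshold.

```python
import operator

# Supported comparison operators for a threshold condition.
OPS = {">=": operator.ge, "<=": operator.le, ">": operator.gt, "<": operator.lt}

def rule_triggered(values, op, threshold):
    """Return True when any value fails the condition `value <op> threshold`."""
    compare = OPS[op]
    return any(not compare(value, threshold) for value in values)

# Hypothetical rule: the overseas sales share should stay at or above 30%.
overseas_share = [43.49, 44.71, 23.57]
triggered = rule_triggered(overseas_share, ">=", 30.0)
```

Here the 2020 value of 23.57 fails the condition, so the audit point would be triggered and the corresponding question templates become candidates for filling.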
The triggering condition of the content mode comprises:
Machine learning training is performed with the historical query functions and historical prospectuses as training samples to obtain a binary classification model for judging whether an audit point is abnormal; the audit point is triggered when the target prospectus is input into the binary classification model and the model concludes that the audit point is abnormal.
Machine learning is performed on the query background paragraphs and the historical prospectuses to obtain summary rewrite rules for generating query background paragraphs.
Specifically, when content in a prospectus meets the triggering condition of an audit point, a query background paragraph needs to be generated in the query function. The query background paragraphs and the historical prospectuses are therefore used as training samples for machine learning training to obtain the correspondence between the query background paragraphs and the content of the historical prospectuses;
a summary rewrite rule for generating the query background paragraph is then determined according to this correspondence, where the summary rewrite rule is used to summarize and rewrite the text paragraphs in the historical prospectus that correspond to a query background paragraph in order to generate that query background paragraph.
For example, when an audit point is triggered, the paragraphs in the prospectus that trigger the audit point are first identified, and the query direction corresponding to the audit point is used as the keyword of the query background paragraph. Sentences containing the keyword are preferentially extracted from the prospectus, the anaphoric expressions in those sentences are identified and replaced with their antecedents so that the sentences read more smoothly, and the query background paragraph is thereby generated.
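The sentence-selection part of this example can be sketched as follows. The sketch covers only keyword-based extraction; resolving anaphoric expressions to their antecedents requires a coreference model and is left out. The function name, the naive sentence splitter and the sample prospectus text are assumptions.

```python
import re

def extract_background(prospectus_text, keywords, max_sentences=3):
    """Pick up to `max_sentences` sentences that mention any keyword."""
    # Naive sentence split on terminal punctuation followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", prospectus_text.strip())
    selected = [
        sentence for sentence in sentences
        if any(keyword.lower() in sentence.lower() for keyword in keywords)
    ]
    return " ".join(selected[:max_sentences])

# Hypothetical prospectus excerpt and query direction.
text = (
    "The company sells products in Asia. Overseas sales fell sharply in 2020. "
    "Inventory remained stable. Management expects overseas sales to recover."
)
background = extract_background(text, ["overseas sales"])
```

The extracted sentences keep their original order, which tends to preserve readability before any rewriting step is applied.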
The rule dictionary is determined to comprise the fourth catalog and the summary rewrite rules, and the target query function corresponding to the target prospectus is generated according to the rule dictionary.
102. A text vector of the target prospectus is acquired.
Since the input of the target generation model is a text vector, the text vector of the target prospectus needs to be acquired by using a text vector algorithm, so that the target generation model can generate the target query function corresponding to the target prospectus according to that text vector.
103. The text vector of the target prospectus is input into the target generation model, and the target generation model outputs the target query function corresponding to the target prospectus according to the rule dictionary.
After the text vector of the target prospectus is input into the target generation model, the rule dictionary is matched against the text vector of the target prospectus. When the triggering condition of an audit point is met, the rule dictionary is used to perform text-vector-based similarity matching and keyword matching on the paragraphs of the target prospectus that trigger the audit point. If a paragraph matches a question template, the question template is filled with the entities, attributes and attribute values to generate a subdivision question, and at least one subdivision question forms a query question paragraph of the target query function. A query background paragraph is generated according to the summary rewrite rules: specifically, the query direction corresponding to the audit point is used as the keyword of the query background paragraph, sentences containing the keyword are preferentially extracted from the prospectus, and the anaphoric expressions in those sentences are identified and replaced with their antecedents so that the sentences read more smoothly. The audit points, query question paragraphs and query background paragraphs correspond to one another, and one target query function comprises at least one group of an audit point, a query question paragraph and a query background paragraph.
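Filling a matched question template with entities, attributes and attribute values can be illustrated with ordinary string formatting. The template wording, the slot names and the helper `fill_template` are hypothetical and chosen only for this sketch.

```python
def fill_template(template, slots):
    """Fill a question template; raises KeyError if a slot value is missing."""
    return template.format(**slots)

# Hypothetical template with named slots for entity, attribute and value.
template = (
    "In {year}, the {attribute} of {entity} was {attribute_value}. "
    "Please explain the reason for the change."
)
slots = {
    "year": "2020",
    "entity": "the company",
    "attribute": "overseas sales share",
    "attribute_value": "23.57%",
}
subdivision_question = fill_template(template, slots)
```

Raising an error on a missing slot is a deliberate choice in this sketch: a partially filled question is more harmful in a regulatory letter than no question at all.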
In this embodiment of the application, the target generation model used to generate the target query function is obtained by machine learning on the historical query functions and historical prospectuses, and the rule dictionary for generating the target query function is stored in the target generation model. After the target prospectus is input into the target generation model, the model can therefore output the target query function, which reduces the labor and time cost of purely manual drafting.
Referring to fig. 2, another implementation of a query function generating method according to an embodiment of the present application includes steps 201 to 204.
201. Text vectors of the historical query function and the historical prospectus are obtained.
Because the method provided by this embodiment establishes the target generation model by machine learning, the historical query functions and historical prospectuses are used as training samples for machine learning training of the initial generation model. Since the input of the initial generation model is the text vectors of the historical query functions and historical prospectuses, these text vectors need to be obtained first. Specifically, a text vector algorithm, such as an embedding algorithm, converts the text of a historical query function into a text vector corresponding to its semantics and converts the text of a historical prospectus into a text vector corresponding to its semantics.
202. The text vectors of the historical query function and the historical prospectus are input as training samples into the initial generation model, machine learning training is performed on the initial generation model by using the training samples to obtain the target generation model, and the rule dictionary is stored in the target generation model.
Specifically, the rule dictionary includes rules for generating the query question paragraphs of the target query function and rules for generating the query background paragraphs of the target query function. In the stage of building the target generation model, the historical query function is first decomposed into query background paragraphs and query question paragraphs by a classification algorithm according to the text features of the historical query function. The text features may be represented by text vectors: a text vector algorithm, such as an embedding algorithm, converts the text of the historical query function into a text vector corresponding to its semantics, and the historical query function is decomposed into query background paragraphs and query question paragraphs according to that text vector. Other text semantic analysis methods capable of achieving the same or a similar effect may also be used for this decomposition, which is not limited herein.
After the query question paragraphs are obtained, the subdivision questions contained in them are identified. Similarly, a text vector algorithm, such as an embedding algorithm, or another text semantic analysis method capable of achieving the same or a similar effect may be used to further decompose the query question paragraphs into subdivision questions. One audit point corresponds to at least one query background paragraph and at least one query question paragraph, and the query background paragraph and query question paragraph corresponding to the same audit point are associated with each other.
Next, the audit point corresponding to each subdivision question is determined, where one audit point corresponds to at least one subdivision question in a query question paragraph, and a first catalog representing the correspondence between audit points and subdivision questions is formed.
Keywords contained in the subdivision questions are obtained by using a word segmentation algorithm and are used as the query directions of the subdivision questions. The word segmentation algorithm may be a BPE algorithm or any other algorithm capable of achieving the same or a similar effect, and is not limited herein. Specifically, a word segmentation algorithm is first used to remove non-key phrases from the subdivision questions to obtain a preprocessed word segmentation set; the text vectors and tf-idf values of the words in the preprocessed word segmentation set are then calculated, and whether a word is a keyword is judged according to its grammatical role in the sentence, yielding a keyword seed lexicon. The keyword seed lexicon can be obtained by constructing a keyword classification model: for example, the text vectors and tf-idf values of the words in the preprocessed word segmentation set, together with the grammatical roles of the words in their sentences, are used as inputs of the keyword classification model, and the model outputs whether each word is a keyword. The keyword classification model can be implemented by selecting an RNN, CNN or Transformer as the encoder and stacking a sigmoid classifier on top. Because different historical query functions and historical prospectuses are written by different people, their terminology may differ, so the keyword seed lexicon covers only a narrow range of keywords and needs to be expanded into a keyword lexicon. The keyword seed lexicon can be expanded with a synonym dictionary based on manual experience, or by calculating word similarity with a text vector algorithm.
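The similarity-based expansion of the keyword seed lexicon can be sketched with cosine similarity over word vectors. The three-dimensional vectors, the 0.9 threshold and the function names are assumptions for illustration; in practice the vectors would come from whatever embedding algorithm is in use.

```python
import math

def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def expand_lexicon(seeds, vectors, threshold=0.9):
    """Add every word whose vector is close enough to some seed keyword."""
    expanded = set(seeds)
    for word, vector in vectors.items():
        if any(
            cosine(vector, vectors[seed]) >= threshold
            for seed in seeds
            if seed in vectors
        ):
            expanded.add(word)
    return expanded

# Hypothetical word vectors; real ones would have hundreds of dimensions.
vectors = {
    "revenue": [0.9, 0.1, 0.0],
    "income": [0.85, 0.15, 0.05],
    "inventory": [0.0, 0.2, 0.9],
}
keyword_lexicon = expand_lexicon(["revenue"], vectors)
```

With these sample vectors, "income" is close enough to the seed "revenue" to be added, while "inventory" is not.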
The keywords contained in the subdivision questions are then obtained according to the keyword lexicon and used as the query directions of the subdivision questions. The subdivision questions corresponding to each audit point are clustered by query direction to obtain a second catalog representing the correspondence among audit points, query directions and subdivision questions, where one audit point corresponds to at least one query direction and one query direction corresponds to at least one subdivision question.
The subdivision questions are input into a neural network model to generate question templates, yielding a third catalog representing the correspondence among audit points, query directions and question templates, where one audit point corresponds to at least one query direction and one query direction corresponds to at least one question template.
Specifically, a text vector algorithm is used to obtain text vectors of the subdivision questions, and these text vectors are input as training samples into an initial text recognition model for machine learning training to obtain a target text recognition model. The target text recognition model stores the method for recognizing entity text, attribute text and attribute value text in subdivision questions; replacing the entity text, attribute text and attribute value text in a subdivision question with placeholders yields a question template.
For example, the entities in a subdivision question mainly include dates, company names, personal names, amounts and quantities; an attribute describes a property of an entity; and an attribute value is the numerical value of that attribute, such as a year, an amount or a proportion. Because the entities are closely tied to the content of the historical prospectus, and the entities, attributes and attribute values in a subdivision question are derived from the prospectus, a question template can be generated first, and a query function for a given prospectus can then be generated simply by filling the question template with entities, attributes and attribute values. For example, suppose a subdivision question in a historical query function reads "In 2018, 2019 and 2020, the company's overseas sales revenue accounted for 43.49%, 44.71% and 23.57% of main business revenue, respectively. Please explain the reason for the decrease in main business revenue." After the entities, attributes and attribute values are removed and placeholders are filled in, the question template is "[ year ] year, [ POOSITMBI ], [ POOSITMBI ] and [ POOSITMBI ] respectively. Please explain the reason for the decrease in main business revenue."
Then, the triggering condition of each audit point is determined according to the subdivision questions, yielding a fourth catalog representing the correspondence among audit points, query directions, question templates and triggering conditions, where a triggering condition is used to judge whether an inquiry should be raised for the audit point.
Specifically, the triggering conditions of an audit point are of two types, a rule mode and a content mode. The triggering condition of the rule mode comprises:
An attribute value threshold is preset according to the attribute values in the subdivision questions, and the audit point is triggered if an attribute value in the historical prospectus does not meet the attribute value threshold. The attribute value threshold can be obtained through formulas, regular expressions and the four basic arithmetic operations, and the attribute values in the subdivision questions of the historical query function correspond to the attribute values in the historical prospectus for the same audit point.
The triggering condition of the content mode comprises:
Machine learning training is performed with the historical query functions and historical prospectuses as training samples to obtain a binary classification model for judging whether an audit point is abnormal; the audit point is triggered when the target prospectus is input into the binary classification model and the model concludes that the audit point is abnormal.
Machine learning is performed on the query background paragraphs and the historical prospectuses to obtain summary rewrite rules for generating query background paragraphs.
Specifically, when content in a prospectus meets the triggering condition of an audit point, a query background paragraph needs to be generated in the query function. The query background paragraphs and the historical prospectuses are therefore used as training samples for machine learning training to obtain the correspondence between the query background paragraphs and the content of the historical prospectuses;
a summary rewrite rule for generating the query background paragraph is then determined according to this correspondence, where the summary rewrite rule is used to summarize and rewrite the text paragraphs in the historical prospectus that correspond to a query background paragraph in order to generate that query background paragraph.
For example, when an audit point is triggered, the paragraphs in the prospectus that trigger the audit point are first identified, and the query direction corresponding to the audit point is used as the keyword of the query background paragraph. Sentences containing the keyword are preferentially extracted from the prospectus, the anaphoric expressions in those sentences are identified and replaced with their antecedents so that the sentences read more smoothly, and the query background paragraph is thereby generated.
The rule dictionary is determined to comprise the fourth catalog and the summary rewrite rules, and the target query function corresponding to the target prospectus is generated according to the rule dictionary.
203. A text vector of the target prospectus is acquired.
Since the input of the target generation model is a text vector, the text vector of the target prospectus needs to be acquired by using a text vector algorithm, so that the target generation model can generate the target query function corresponding to the target prospectus according to that text vector.
204. The text vector of the target prospectus is input into the target generation model, and the target generation model outputs the target query function corresponding to the target prospectus according to the rule dictionary.
After the text vector of the target prospectus is input into the target generation model, the rule dictionary is matched against the text vector of the target prospectus. When the triggering condition of an audit point is met, the rule dictionary is used to perform text-vector-based similarity matching and keyword matching on the paragraphs of the target prospectus that trigger the audit point. If a paragraph matches a question template, the question template is filled with the entities, attributes and attribute values to generate a subdivision question, and at least one subdivision question forms a query question paragraph of the target query function. A query background paragraph is generated according to the summary rewrite rules: specifically, the query direction corresponding to the audit point is used as the keyword of the query background paragraph, sentences containing the keyword are preferentially extracted from the prospectus, and the anaphoric expressions in those sentences are identified and replaced with their antecedents so that the sentences read more smoothly. The audit points, query question paragraphs and query background paragraphs correspond to one another, and one target query function comprises at least one group of an audit point, a query question paragraph and a query background paragraph.
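The final structure, in which each group ties one audit point to a query background paragraph and a query question paragraph, can be modeled with a small data class. The class name `QuerySection`, the rendering layout and the sample content are hypothetical; the source does not prescribe an output format.

```python
from dataclasses import dataclass, field

@dataclass
class QuerySection:
    """One group: an audit point with its background and subdivision questions."""
    audit_point: str
    background: str
    questions: list = field(default_factory=list)

def render_query_function(sections):
    """Render all groups as numbered plain text, one group after another."""
    lines = []
    for number, section in enumerate(sections, start=1):
        lines.append(f"{number}. {section.audit_point}")
        lines.append(section.background)
        for index, question in enumerate(section.questions, start=1):
            lines.append(f"({index}) {question}")
    return "\n".join(lines)

# Hypothetical single-group query function.
sections = [
    QuerySection(
        audit_point="Overseas revenue",
        background="Overseas sales fell sharply in 2020.",
        questions=["Please explain the reason for the decrease."],
    )
]
letter = render_query_function(sections)
```

Keeping the three parts in one record makes the stated correspondence explicit, so a background paragraph can never be emitted without the audit point and questions it belongs to.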
In this embodiment of the application, the target generation model used to generate the target query function is obtained by machine learning on the historical query functions and historical prospectuses, and the rule dictionary for generating the target query function is stored in the target generation model. After the target prospectus is input into the target generation model, the model can therefore output the target query function, which reduces the labor and time cost of purely manual drafting.
Referring to fig. 3, an implementation of a query function generating system provided by an embodiment of the present application includes:
An obtaining unit 301, configured to obtain a target generation model, where the target generation model is obtained by performing machine learning training on an initial generation model by using a historical query function and a historical prospectus, a rule dictionary for generating the target query function is stored in the target generation model, the historical query function corresponds to the historical prospectus and is generated according to the historical prospectus, and the rule dictionary is obtained by machine learning on the historical query function and the historical prospectus combined with manual experience;
the obtaining unit 301 being further configured to obtain a text vector of the target prospectus;
and an input unit 302, configured to input the text vector of the target prospectus into the target generation model, where the target generation model outputs a target query function corresponding to the target prospectus according to the rule dictionary.
The functions and processes executed by each unit in the query function generating system in this embodiment are similar to those executed by the query function generating system in fig. 1 to 2, and are not repeated here.
Fig. 4 is a schematic structural diagram of a query function generating device provided by the present application. The device 400 may include one or more central processing units (CPUs) 401 and a memory 405, and the memory 405 stores one or more application programs or data.
Wherein the memory 405 may be volatile storage or persistent storage. The program stored in the memory 405 may include one or more modules, each of which may include a series of instruction operations in the query function generating device. Still further, the central processor 401 may be arranged to communicate with the memory 405, and execute a series of instruction operations in the memory 405 on the query function generation device 400.
The query function generation device 400 may also include one or more power supplies 402, one or more wired or wireless network interfaces 403, one or more input/output interfaces 404, and/or one or more operating systems, such as Windows Server, Mac OS X™, Unix™, Linux™, FreeBSD™, etc.
The central processor 401 may perform the operations performed by the query function generating system in the embodiments shown in fig. 1 to 2, and detailed descriptions thereof are omitted herein.
It will be clear to those skilled in the art that, for convenience and brevity of description, specific working procedures of the above-described systems, apparatuses and units may refer to corresponding procedures in the foregoing method embodiments, which are not repeated herein.
In the several embodiments provided in the present application, it should be understood that the disclosed systems, devices, and methods may be implemented in other manners. For example, the apparatus embodiments described above are merely illustrative, e.g., the division of the units is merely a logical function division, and there may be additional divisions when actually implemented, e.g., multiple units or components may be combined or integrated into another system, or some features may be omitted or not performed. Alternatively, the coupling or direct coupling or communication connection shown or discussed with each other may be an indirect coupling or communication connection via some interfaces, devices or units, which may be in electrical, mechanical or other form.
The units described as separate units may or may not be physically separate, and units shown as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, each functional unit in the embodiments of the present application may be integrated in one processing unit, or each unit may exist alone physically, or two or more units may be integrated in one unit. The integrated units may be implemented in hardware or in software functional units.
If the integrated units are implemented in the form of software functional units and sold or used as stand-alone products, they may be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the present application may be embodied, in essence or in whole or in part, in the form of a software product stored in a storage medium, including instructions for causing a computer device (which may be a personal computer, a server, a network device, etc.) to perform all or part of the steps of the methods according to the embodiments of the present application. The aforementioned storage medium includes: a USB flash drive, a removable hard disk, a read-only memory (ROM), a random-access memory (RAM), a magnetic disk, an optical disk, or any other medium capable of storing program code.

Claims (9)

1. A query function generation method, comprising:
Decomposing a historical query function into a query background paragraph and a query question paragraph by using a classification algorithm, and identifying subdivision problems contained in the query question paragraphs, wherein one audit gist corresponds to at least one query background paragraph and at least one query question paragraph, and the query background paragraph and the query question paragraph corresponding to the same audit gist have an association relation;
determining auditing points corresponding to the subdivision problems, wherein one auditing point corresponds to at least one subdivision problem, and forming a first catalog representing the correspondence between the auditing points and the subdivision problems;
Acquiring keywords contained in the subdivision questions, and taking the keywords as query directions of the subdivision questions;
clustering the subdivision questions corresponding to each audit gist according to query directions to obtain a second catalog representing the corresponding relation among the audit gist, the query directions and the subdivision questions, wherein one audit gist corresponds to at least one query direction and one query direction corresponds to at least one subdivision question;
Inputting the subdivision questions into a neural network algorithm model to generate a question template, and obtaining a third catalog representing the corresponding relation among the auditing gist, the inquiring directions and the question template, wherein one auditing gist corresponds to at least one inquiring direction, and one inquiring direction corresponds to at least one question template;
determining a triggering condition of the auditing gist according to the subdivision problem to obtain a fourth catalog representing the corresponding relation among the auditing gist, the inquiring direction, the problem template and the triggering condition, wherein the triggering condition is used for judging whether the auditing gist is to be inquired or not;
machine learning is carried out on the inquiry background paragraph and the historical stranding book to obtain a summary rewrite rule for generating the inquiry background paragraph;
The rule dictionary comprises the fourth catalog and the abstract rewrite rules;
Obtaining a target generation model, wherein the target generation model is obtained by performing machine learning training on an initial generation model by the historical query function and a historical stranding book, the target generation model stores the rule dictionary for generating the target query function, the historical query function has a corresponding relation with the historical stranding book, and the historical query function is generated according to the historical stranding book;
Acquiring a text vector of a target stranding book;
Inputting the text vector of the target poster into the target generation model, and outputting a target query function corresponding to the target poster by the target generation model according to the rule dictionary;
The inputting the text vector of the target endorsement into the target generation model comprises: after the text vector of the target stranding book is input into the target generation model, matching the rule dictionary with the text vector of the target stranding book, when the triggering condition of any one of the auditing key points is met, matching the paragraphs in the target stranding book triggering the auditing key points based on the similarity of the text vector and the keyword, and if the paragraphs are matched with any one of the problem templates, filling the entity, the attribute and the attribute value in the problem template so as to obtain subdivision questions of question paragraphs forming a target query function, and determining the question paragraphs of the target query function based on the subdivision questions of the question paragraphs forming the target query function;
When the triggering condition of any auditing gist is met, acquiring the inquiring direction corresponding to the auditing gist, picking a statement containing that inquiring direction from the target stranding book, and replacing the inquiring direction corresponding to the auditing gist in the statement with a corresponding antecedent to obtain the query background paragraph of the target query function;
And forming the target query function based on any one of the auditing gists, the query question paragraph of the target query function, the query background paragraph of the target query function, and the corresponding relation among the three.
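The generation step of claim 1 can be illustrated with a minimal sketch. It is not the patented implementation: the `RuleEntry` fields, the `facts` dictionary, the sample rule, and the use of plain keyword containment in place of text-vector similarity matching are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass
class RuleEntry:
    """One row of the fourth catalog (all field names are hypothetical)."""
    gist: str                        # auditing gist
    direction: str                   # inquiring direction, used as a keyword
    template: str                    # problem template with placeholders
    trigger: Callable[[Dict], bool]  # triggering condition


def generate_questions(rules: List[RuleEntry],
                       paragraphs: List[Dict]) -> List[Tuple[str, str, str]]:
    """Return (gist, background statement, filled question) triples."""
    results = []
    for rule in rules:
        for para in paragraphs:
            # Keyword containment stands in for vector-similarity matching.
            if rule.direction not in para["text"]:
                continue
            if not rule.trigger(para["facts"]):
                continue
            # Fill entity, attribute and attribute value into the template;
            # str.format ignores keys the template does not use.
            question = rule.template.format(**para["facts"])
            results.append((rule.gist, para["text"], question))
    return results


rules = [RuleEntry(
    gist="profitability",
    direction="gross margin",
    template="Please explain why the {attribute} of {entity} is {value}.",
    trigger=lambda f: f["attribute"] == "gross margin" and f["raw"] < 0.10,
)]
paragraphs = [
    {"text": "The gross margin of Product A fell to 8% in 2020.",
     "facts": {"entity": "Product A", "attribute": "gross margin",
               "value": "8%", "raw": 0.08}},
    {"text": "The company has 120 employees.",
     "facts": {"entity": "the company", "attribute": "headcount",
               "value": "120", "raw": 120}},
]
letter = generate_questions(rules, paragraphs)
```

Each returned triple keeps the auditing gist, the statement picked from the stranding book, and the filled question together, which mirrors the three-way correspondence the claim uses to assemble the target query function.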
2. The query function generation method according to claim 1, wherein before the acquisition of the target generation model, the method further comprises:
acquiring the historical query function and the historical stranding book, and acquiring text vectors of the historical query function and the historical stranding book by using a text vector algorithm;
and inputting the text vectors of the historical query function and the historical stranding book as training samples into the initial generation model, performing machine learning training on the initial generation model by using the training samples to obtain the target generation model, wherein the rule dictionary is stored in the target generation model.
3. The query function generation method according to claim 1, wherein the acquiring keywords contained in the subdivision questions and taking the keywords as query directions of the subdivision questions comprises:
word segmentation matching based on word granularity is carried out on the subdivision problem to obtain a preprocessed word segmentation set, and a text vector algorithm and a text classification algorithm are used for processing the preprocessed word segmentation set to obtain a keyword seed word stock;
expanding the keyword seed word stock by using a synonym dictionary compiled from manual experience to obtain a keyword word stock;
Or the keyword seed word stock is expanded by calculating the similarity of words through a text vector algorithm, so as to obtain the keyword word stock;
And obtaining keywords contained in the subdivision questions according to the keyword word stock, and taking the keywords as query directions of the subdivision questions.
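The second expansion option in claim 3, growing the keyword seed word stock by word similarity, might look like the sketch below. The toy three-dimensional vectors, the cosine measure and the 0.9 threshold are assumptions; the claim does not name a particular text vector algorithm or similarity threshold.

```python
import math
from typing import Dict, List, Set


def cosine(u: List[float], v: List[float]) -> float:
    """Cosine similarity of two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def expand_lexicon(seeds: Set[str],
                   vectors: Dict[str, List[float]],
                   threshold: float = 0.9) -> Set[str]:
    """Add every word whose vector is close enough to some seed word."""
    lexicon = set(seeds)
    for word, vec in vectors.items():
        if any(cosine(vec, vectors[s]) >= threshold
               for s in seeds if s in vectors):
            lexicon.add(word)
    return lexicon


# Hypothetical word vectors; in practice they would come from a trained model.
vectors = {
    "revenue": [0.9, 0.1, 0.0],
    "income": [0.85, 0.15, 0.05],
    "lawsuit": [0.0, 0.2, 0.95],
}
lexicon = expand_lexicon({"revenue"}, vectors)
```

Here "income" joins the word stock because its vector is nearly parallel to that of the seed "revenue", while "lawsuit" stays out.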
4. The query function generation method of claim 1, wherein said inputting the subdivision problem into a neural network algorithm model to generate a problem template comprises:
a text vector algorithm is used for obtaining text vectors of the subdivision problems, the text vectors of the subdivision problems are used as training samples to be input into an initial text recognition model for machine learning training, a target text recognition model is obtained, and the recognition method of entity texts, attribute texts and attribute value texts in the subdivision problems is stored in the target text recognition model;
And replacing the entity text, the attribute text and the attribute value text in the subdivision problem with placeholders to obtain the problem template.
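The placeholder replacement in claim 4 can be sketched as follows, assuming the entity, attribute and attribute value texts have already been recognized by the target text recognition model. The `spans` dictionary and the brace-style placeholder syntax are hypothetical choices for illustration.

```python
from typing import Dict


def to_template(question: str, spans: Dict[str, str]) -> str:
    """Replace each recognized text span with a named placeholder."""
    template = question
    # Replace longer spans first so a short span cannot split a longer one.
    for name, text in sorted(spans.items(), key=lambda kv: -len(kv[1])):
        template = template.replace(text, "{" + name + "}")
    return template


question = "Please explain why the gross margin of Product A is 8%."
spans = {"entity": "Product A", "attribute": "gross margin", "value": "8%"}
template = to_template(question, spans)
```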
5. The query function generation method as claimed in claim 4, wherein said determining a trigger condition of said audit gist from said subdivision problem comprises:
The triggering conditions of the auditing gist are divided into two types, a rule mode and a content mode, wherein the triggering conditions of the rule mode comprise:
Presetting an attribute value threshold according to the attribute value of the subdivision problem, and determining to trigger the auditing gist if the attribute value in the historical stranding book does not meet the attribute value threshold, wherein the attribute value in the subdivision problem in the historical query function has a corresponding relation with the attribute value of the auditing gist corresponding to the historical stranding book;
The triggering conditions of the content mode include:
performing machine learning training by taking the historical query function and the historical stranding book as training samples to obtain a binary classification model for judging whether an abnormal auditing gist exists;
and when the target stranding book is input into the binary classification model and the model concludes that an abnormal auditing gist exists, determining to trigger the auditing gist.
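The two trigger modes of claim 5 can be contrasted in a short sketch. The threshold bounds and the keyword-based `toy_classifier` are placeholders: the claim leaves the preset thresholds open, and the real content-mode model is trained on historical query functions and stranding books.

```python
from typing import Callable, Optional


def rule_mode_triggered(value: float,
                        lower: Optional[float] = None,
                        upper: Optional[float] = None) -> bool:
    """Rule mode: trigger when the value falls outside the preset range."""
    if lower is not None and value < lower:
        return True
    if upper is not None and value > upper:
        return True
    return False


def content_mode_triggered(text: str,
                           classifier: Callable[[str], bool]) -> bool:
    """Content mode: trigger when the classifier flags the text as abnormal."""
    return classifier(text)


def toy_classifier(text: str) -> bool:
    # Stand-in for a trained binary classification model.
    return "litigation" in text.lower()


low_margin = rule_mode_triggered(0.08, lower=0.10)
normal_margin = rule_mode_triggered(0.25, lower=0.10)
flagged = content_mode_triggered("Pending litigation with a supplier.",
                                 toy_classifier)
```

Rule mode suits gists tied to a numeric attribute value, while content mode covers gists that can only be judged from the text as a whole.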
6. The query function generation method as claimed in claim 1, wherein said performing machine learning on the query background paragraph and the historical stranding book to obtain a summary rewrite rule for generating the query background paragraph comprises:
Taking the query background paragraph and the historical stranding book as training samples to carry out machine learning training to obtain the corresponding relation between the query background paragraph and the content in the historical stranding book;
And determining a summary rewrite rule for generating the query background paragraph according to the corresponding relation, wherein the summary rewrite rule is used for performing summary rewriting on the text paragraph in the historical stranding book that has the corresponding relation with the query background paragraph, so as to generate the query background paragraph.
7. An inquiry function generating system, comprising:
The acquisition unit is used for acquiring a target generation model, wherein the target generation model is obtained by performing machine learning training on an initial generation model by a historical query function and a historical stranding book, a rule dictionary for generating the target query function is stored in the target generation model, the historical query function has a corresponding relation with the historical stranding book, and the historical query function is generated according to the historical stranding book;
the acquisition unit is also used for acquiring the text vector of the target stranding book;
The input unit is used for inputting the text vector of the target stranding book into the target generation model, and the target generation model outputs a target query function corresponding to the target stranding book according to the rule dictionary;
The obtaining unit is specifically configured to decompose the historical query function into a query background paragraph and a query question paragraph by using a classification algorithm, and identify a subdivision problem included in the query question paragraph, where an audit gist corresponds to at least one query background paragraph and at least one query question paragraph, and the query background paragraph and the query question paragraph corresponding to the same audit gist have an association relationship;
Confirming auditing points corresponding to the subdivision problems, wherein one auditing point corresponds to at least one subdivision problem, and forming a first catalog representing the correspondence between the auditing points and the subdivision problems;
Acquiring keywords contained in the subdivision questions, and taking the keywords as query directions of the subdivision questions;
clustering the subdivision questions corresponding to each audit gist according to query directions to obtain a second catalog representing the corresponding relation among the audit gist, the query directions and the subdivision questions, wherein one audit gist corresponds to at least one query direction and one query direction corresponds to at least one subdivision question;
Inputting the subdivision questions into a neural network algorithm model to generate a question template, and obtaining a third catalog representing the corresponding relation among the auditing gist, the inquiring directions and the question template, wherein one auditing gist corresponds to at least one inquiring direction, and one inquiring direction corresponds to at least one question template;
determining a triggering condition of the auditing gist according to the subdivision problem to obtain a fourth catalog representing the corresponding relation among the auditing gist, the inquiring direction, the problem template and the triggering condition, wherein the triggering condition is used for judging whether the auditing gist is to be inquired or not;
machine learning the query background paragraph and the historical stranding book to obtain a summary rewrite rule for generating the query background paragraph;
The rule dictionary includes the fourth catalog and the summary rewrite rule;
The input unit is specifically configured to input the text vector of the target stranding book into the target generation model, match the rule dictionary against the text vector of the target stranding book, perform text-vector similarity matching and keyword matching on a paragraph in the target stranding book that triggers the auditing gist when the triggering condition of any auditing gist is satisfied, fill an entity, an attribute and an attribute value into the problem template if the paragraph matches any problem template, thereby obtaining the subdivision problems that form the query question paragraph of the target query function, and determine the query question paragraph of the target query function based on those subdivision problems;
When the triggering condition of any auditing gist is met, acquiring the inquiring direction corresponding to the auditing gist, picking a statement containing that inquiring direction from the target stranding book, and replacing the inquiring direction corresponding to the auditing gist in the statement with a corresponding antecedent to obtain the query background paragraph of the target query function;
And forming the target query function based on any one of the auditing gists, the query question paragraph of the target query function, the query background paragraph of the target query function, and the corresponding relation among the three.
8. An inquiry function generating apparatus, comprising:
a central processing unit, a memory and an input/output interface;
The memory is transient storage or persistent storage;
The central processor is configured to communicate with the memory and to execute instruction operations in the memory to perform the method of any of claims 1 to 6.
9. A computer readable storage medium comprising instructions which, when run on a computer, cause the computer to perform the method of any one of claims 1 to 6.
CN202111277389.0A 2021-10-29 2021-10-29 Query function generation method, system and device Active CN113901778B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111277389.0A CN113901778B (en) 2021-10-29 2021-10-29 Query function generation method, system and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111277389.0A CN113901778B (en) 2021-10-29 2021-10-29 Query function generation method, system and device

Publications (2)

Publication Number Publication Date
CN113901778A CN113901778A (en) 2022-01-07
CN113901778B CN113901778B (en) 2024-11-05

Family

ID=79027698

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111277389.0A Active CN113901778B (en) 2021-10-29 2021-10-29 Query function generation method, system and device

Country Status (1)

Country Link
CN (1) CN113901778B (en)

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109543007A (en) * 2018-10-16 2019-03-29 深圳壹账通智能科技有限公司 Put question to data creation method, device, computer equipment and storage medium
CN112417155A (en) * 2020-11-27 2021-02-26 浙江大学 Court trial query generation method, device and medium based on pointer-generation Seq2Seq model

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108446286B (en) * 2017-02-16 2023-04-25 阿里巴巴集团控股有限公司 Method, device and server for generating natural language question answers

Also Published As

Publication number Publication date
CN113901778A (en) 2022-01-07

Similar Documents

Publication Publication Date Title
CN109190110B (en) Named entity recognition model training method and system and electronic equipment
CN107491531B (en) Chinese network comment sensibility classification method based on integrated study frame
CN109685056B (en) Method and device for acquiring document information
CN112182246B (en) Method, system, medium, and application for creating an enterprise representation through big data analysis
US11163956B1 (en) System and method for recognizing domain specific named entities using domain specific word embeddings
CN109508373B (en) Method and device for calculating enterprise public opinion index and computer readable storage medium
CN108170715B (en) Text structuralization processing method
CN110134959B (en) Named entity recognition model training method and equipment, and information extraction method and equipment
US11966698B2 (en) System and method for automatically tagging customer messages using artificial intelligence models
CN113987112B (en) Table information extraction method and device, storage medium and electronic equipment
CN109558541A (en) A kind of method, apparatus and computer storage medium of information processing
Dwivedi et al. Sentiment analytics for crypto pre and post covid: topic modeling
Lu et al. Credit rating change modeling using news and financial ratios
CN112990973A (en) Online shop portrait construction method and system
CN114037545A (en) Client recommendation method, device, equipment and storage medium
CN113919437A (en) Method, device, equipment and storage medium for generating client portrait
Haryono et al. Aspect-based sentiment analysis of financial headlines and microblogs using semantic similarity and bidirectional long short-term memory
Gupta et al. A two-staged NLP-based framework for assessing the sentiments on Indian supreme court judgments
CN113901778B (en) Query function generation method, system and device
CN112989053A (en) Periodical recommendation method and device
Doughman et al. Time-aware word embeddings for three Lebanese news archives
CN110717029A (en) Information processing method and system
Hott et al. Evaluating contextualized embeddings for topic modeling in public bidding domain
CN116402056A (en) Document information processing method and device and electronic equipment
CN111798214B (en) System and method for generating job skill label

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant