CN114757189A - Event extraction method and device, intelligent terminal and storage medium - Google Patents
- Publication number: CN114757189A
- Application number: CN202210661693.3A
- Authority: CN (China)
- Prior art keywords: event, vector, event type, argument, extracted
- Legal status: Granted (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed)
Classifications
- G06F40/279—Recognition of textual entities
- G06F40/289—Phrasal analysis, e.g. finite state techniques or chunking
- G06F16/35—Clustering; Classification
- G06F18/214—Generating training patterns; Bootstrap methods, e.g. bagging or boosting
- G06F40/126—Character encoding
Abstract
The invention discloses an event extraction method and device, an intelligent terminal and a storage medium. The method comprises the following steps: obtaining a sentence to be extracted, and performing word encoding and position encoding on each word to obtain the corresponding word embedding vector and position embedding vector; adding the word embedding vector and the position embedding vector to obtain a first input vector, inputting the first input vector into an encoder, and outputting a contextualized expression vector through the encoder; inputting the contextualized expression vector into a multi-label event type classifier to determine the event type embedding vectors corresponding to the sentence to be extracted, and obtaining the corresponding event type comprehensive vector; adding the contextualized expression vector and the event type comprehensive vector to obtain a second input vector, and inputting the second input vector into an event argument classifier to obtain the event arguments corresponding to the sentence to be extracted; and constructing argument combinations from the event arguments, performing event classification on each argument combination, and determining the target event type of each argument combination. The invention helps improve the efficiency of event extraction.
Description
Technical Field
The invention relates to the technical field of natural language processing, and in particular to an event extraction method and device, an intelligent terminal and a storage medium.
Background
With the development of science and technology, natural language processing has found wide application, and event extraction in particular is used ever more broadly. An event is an occurrence, typically described at the sentence level, in which one or more actions take place and one or more roles participate within a particular time period and location. Event extraction structures such events; the goal of structuring is to determine the event type to which an event belongs and to extract the event participants.
In the prior art, event extraction depends on preset trigger words. The problem with this approach is that it pays no attention to the event type information of a sentence, and the preset trigger words do not necessarily correspond fully to the sentence, which hinders improving the accuracy of event extraction.
Thus, there is a need for improvement and development of the prior art.
Disclosure of Invention
The invention mainly aims to provide an event extraction method and device, an intelligent terminal and a storage medium, so as to solve the problems that in the prior art the event extraction process depends on preset trigger words, pays no attention to the event type information of sentences, and uses trigger words that do not necessarily correspond fully to the sentences, which hinders improving the accuracy of event extraction.
To achieve the above object, a first aspect of the present invention provides an event extraction method, which includes:
obtaining a sentence to be extracted, and performing word encoding and position encoding on each word in the sentence to be extracted to obtain a word embedding vector and a position embedding vector corresponding to the sentence to be extracted;
adding the word embedding vector and the position embedding vector to obtain a first input vector, inputting the first input vector into a pre-trained encoder, and outputting a contextualized expression vector of the sentence to be extracted through the encoder;
inputting the contextualized expression vector into a pre-trained multi-label event type classifier, determining the event type embedding vectors corresponding to the sentence to be extracted through the multi-label event type classifier, and obtaining an event type comprehensive vector corresponding to the sentence to be extracted according to the event type embedding vectors;
adding the contextualized expression vector and the event type comprehensive vector to obtain a second input vector, inputting the second input vector into a pre-trained event argument classifier, and obtaining the event arguments corresponding to the sentence to be extracted through the event argument classifier;
and constructing argument combinations according to the event arguments, performing event classification on each argument combination, and determining a target event type corresponding to each argument combination, wherein the target event type corresponding to an argument combination is either a non-event type or one of the event types corresponding to the sentence to be extracted.
Optionally, the sentence to be extracted corresponds to a plurality of event types, and the dimension of the event type comprehensive vector is the same as the dimension of the contextualized expression vector; the inputting of the contextualized expression vector into a pre-trained multi-label event type classifier, determining the event type embedding vectors corresponding to the sentence to be extracted through the multi-label event type classifier, and obtaining an event type comprehensive vector corresponding to the sentence to be extracted according to the event type embedding vectors includes:
inputting the contextualized expression vector into the pre-trained multi-label event type classifier, and determining the event type embedding vectors corresponding to the sentence to be extracted through the multi-label event type classifier;
and obtaining the event type comprehensive vector corresponding to the sentence to be extracted according to the event type embedding vectors.
Optionally, the obtaining an event type comprehensive vector corresponding to the sentence to be extracted according to the event type embedding vectors includes:
obtaining the event probability corresponding to each event type, wherein each event probability is determined when the event type embedding vectors corresponding to the sentence to be extracted are determined through the multi-label event type classifier;
taking each event type whose event probability is larger than a preset probability threshold as a to-be-processed event type, and taking the event type embedding vector corresponding to each to-be-processed event type as a to-be-processed embedding vector;
and performing a weighted summation of the to-be-processed embedding vectors corresponding to the to-be-processed event types to obtain the event type comprehensive vector, wherein either the weighting coefficients corresponding to the to-be-processed embedding vectors are all equal, or the event probability corresponding to each to-be-processed event type is used as the weighting coefficient of its to-be-processed embedding vector.
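The thresholding and weighted summation described above can be sketched as follows (a minimal NumPy sketch; the function name, the 0.5 threshold, and the toy values are illustrative assumptions, not taken from the patent):

```python
import numpy as np

def event_type_comprehensive_vector(probs, type_embeddings, threshold=0.5,
                                    weight_by_prob=True):
    """Aggregate the embedding vectors of every event type whose predicted
    probability exceeds the threshold into one event type comprehensive vector.

    probs:           (num_types,) event probabilities from the multi-label classifier
    type_embeddings: (num_types, d) one embedding vector per event type
    """
    mask = probs > threshold                  # event types to be processed
    if not mask.any():
        return np.zeros(type_embeddings.shape[1])
    selected = type_embeddings[mask]          # to-be-processed embedding vectors
    if weight_by_prob:
        weights = probs[mask]                 # event probabilities as weighting coefficients
    else:
        weights = np.ones(mask.sum())         # equal weighting coefficients
    return weights @ selected                 # weighted summation

probs = np.array([0.9, 0.2, 0.7])
emb = np.array([[1.0, 0.0],
                [0.0, 1.0],
                [1.0, 1.0]])
v = event_type_comprehensive_vector(probs, emb, weight_by_prob=False)
```

With equal weights, only the first and third embeddings pass the threshold and are summed; with probability weights, each surviving embedding is scaled by its event probability before summation.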
Optionally, the obtaining an event type comprehensive vector corresponding to the sentence to be extracted according to the event type embedding vectors includes:
obtaining a weight matrix, a projection matrix and an event type embedding matrix, wherein the event type embedding matrix is formed from the event type embedding vectors, the weight matrix is a matrix with m rows and 1 column, the projection matrix and the event type embedding matrix are both matrices with m rows and d columns, m is the number of event types corresponding to the sentence to be extracted, and d is the dimension of an event type embedding vector;
calculating the event type comprehensive vector corresponding to the sentence to be extracted according to the weight matrix, the projection matrix, the event type embedding matrix and the contextualized expression vector;
wherein each element of the weight matrix is the event probability corresponding to the respective event type, or every element of the weight matrix is 1.
Optionally, the i-th event type comprehensive vector is the product of the transpose of a target matrix and the event type embedding matrix, the target matrix is the Hadamard product of a target product matrix and the weight matrix, and the target product matrix is the product of the i-th contextualized expression vector and the projection matrix.
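Under the stated shapes (projection and embedding matrices m×d, weight matrix m×1), this optional matrix form can be sketched as below; the variable names and toy values are illustrative assumptions:

```python
import numpy as np

def comprehensive_vector(h_i, P, w, E):
    """Event type comprehensive vector for the i-th contextualized expression
    vector, in the matrix form described above.

    h_i: (d,)   i-th contextualized expression vector
    P:   (m, d) projection matrix
    w:   (m, 1) weight matrix (event probabilities, or all ones)
    E:   (m, d) event type embedding matrix, one row per event type
    """
    target_product = (P @ h_i).reshape(-1, 1)  # (m, 1) product of expression vector and projection matrix
    target = target_product * w                # (m, 1) Hadamard product with the weight matrix
    return (target.T @ E).ravel()              # (d,)  transpose of target matrix times embedding matrix

P = np.array([[1.0, 0.0],
              [0.0, 1.0]])
E = np.array([[1.0, 1.0],
              [2.0, 0.0]])
h = np.array([2.0, 3.0])
w = np.ones((2, 1))
s = comprehensive_vector(h, P, w, E)
```

The transpose makes the (m, 1) target matrix a (1, m) row, so the final product is a (1, d) row whose dimension matches the contextualized expression vector, as the earlier optional clause requires.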
Optionally, the constructing argument combinations according to the event arguments, performing event classification on each argument combination, and determining a target event type corresponding to each argument combination includes:
obtaining the word attribute corresponding to each event argument, and combining the event arguments according to their word attributes to obtain a plurality of argument combinations, wherein each argument combination includes a plurality of event arguments and the word attributes of the event arguments within one argument combination are all different;
and performing event classification on each argument combination and determining the target event type corresponding to each argument combination.
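The patent does not spell out the combination algorithm; one natural reading of "one argument per word attribute, attributes all different" is a Cartesian product over attribute groups, sketched here with illustrative role names:

```python
from itertools import product

def build_argument_combinations(arguments):
    """Group the extracted event arguments by their word attribute (role) and
    enumerate combinations that take exactly one argument per attribute, so the
    word attributes inside any one combination are all different.

    arguments: list of (event_argument, word_attribute) pairs
    """
    by_attr = {}
    for arg, attr in arguments:
        by_attr.setdefault(attr, []).append(arg)
    roles = sorted(by_attr)  # fixed role order for reproducible output
    return [dict(zip(roles, combo))
            for combo in product(*(by_attr[r] for r in roles))]

args = [("police", "subject"), ("thief", "object"),
        ("bystander", "object"), ("street", "location")]
combos = build_argument_combinations(args)  # 1 subject x 2 objects x 1 location
```

Each resulting combination would then be passed to the argument combination event type classifier, which may label it with one of the sentence's event types or with the non-event type.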
Optionally, the performing event classification on each argument combination and determining a target event type corresponding to each argument combination includes:
inputting the argument combinations into a pre-trained argument combination event type classifier, performing event classification on each argument combination through the classifier, and obtaining the target event type corresponding to each argument combination.
A second aspect of the present invention provides an event extraction device, which includes:
a sentence processing module, configured to obtain a sentence to be extracted, and perform word encoding and position encoding on each word in the sentence to be extracted to obtain a word embedding vector and a position embedding vector corresponding to the sentence to be extracted;
an embedding vector processing module, configured to add the word embedding vector and the position embedding vector to obtain a first input vector, input the first input vector into a pre-trained encoder, and output a contextualized expression vector of the sentence to be extracted through the encoder;
an event type determining module, configured to input the contextualized expression vector into a pre-trained multi-label event type classifier, determine the event type embedding vectors corresponding to the sentence to be extracted through the multi-label event type classifier, and obtain an event type comprehensive vector corresponding to the sentence to be extracted according to the event type embedding vectors;
an event argument extraction module, configured to add the contextualized expression vector and the event type comprehensive vector to obtain a second input vector, input the second input vector into a pre-trained event argument classifier, and obtain the event arguments corresponding to the sentence to be extracted through the event argument classifier;
and an event argument processing module, configured to construct argument combinations according to the event arguments, perform event classification on each argument combination and determine the target event type corresponding to each argument combination, wherein the target event type corresponding to an argument combination is either a non-event type or one of the event types corresponding to the sentence to be extracted.
A third aspect of the present invention provides an intelligent terminal, where the intelligent terminal includes a memory, a processor, and an event extraction program stored in the memory and executable on the processor, and the event extraction program implements any one of the steps of the event extraction method when executed by the processor.
A fourth aspect of the present invention provides a computer-readable storage medium, on which an event extraction program is stored, where the event extraction program, when executed by a processor, implements any one of the steps of the event extraction method.
As can be seen from the above, in the scheme of the present invention, a sentence to be extracted is obtained, and word encoding and position encoding are performed on each word in the sentence to be extracted to obtain a word embedding vector and a position embedding vector corresponding to the sentence to be extracted; the word embedding vector and the position embedding vector are added to obtain a first input vector, which is input into a pre-trained encoder that outputs a contextualized expression vector of the sentence to be extracted; the contextualized expression vector is input into a pre-trained multi-label event type classifier, which determines the event type embedding vectors corresponding to the sentence to be extracted, and an event type comprehensive vector corresponding to the sentence to be extracted is obtained according to the event type embedding vectors; the contextualized expression vector and the event type comprehensive vector are added to obtain a second input vector, which is input into a pre-trained event argument classifier that yields the event arguments corresponding to the sentence to be extracted; finally, argument combinations are constructed according to the event arguments, event classification is performed on each argument combination, and a target event type is determined for each argument combination, the target event type of an argument combination being either a non-event type or one of the event types corresponding to the sentence to be extracted.
Compared with prior-art event extraction schemes that depend on preset trigger words, this scheme obtains the event types of the sentence to be extracted without setting any trigger words and fuses those event types, as indication information, with the sentence information, which helps improve the accuracy of event extraction. Meanwhile, all event types act together in a single pass of event argument extraction, after which the obtained event arguments are permuted and combined to realize event type classification, so that multiple passes of argument extraction are unnecessary; this improves the efficiency of event extraction and allows the events to be extracted better by combining all event types present in the original sentence.
Drawings
In order to more clearly illustrate the technical solutions in the embodiments of the present invention, the drawings required to be used in the embodiments or the prior art description will be briefly described below, and it is obvious that the drawings in the following description are only some embodiments of the present invention, and for those skilled in the art, other drawings can be obtained according to the drawings without inventive labor.
Fig. 1 is a schematic flowchart of an event extraction method according to an embodiment of the present invention;
FIG. 2 is a schematic diagram of a specific flow chart of event extraction according to an embodiment of the present invention;
FIG. 3 is a flowchart illustrating an event extraction according to an embodiment of the present invention;
FIG. 4 is a flowchart illustrating an event extraction according to an embodiment of the present invention;
FIG. 5 is a flowchart illustrating an event extraction according to an embodiment of the present invention;
fig. 6 is a schematic structural diagram of an event extraction device according to an embodiment of the present invention;
fig. 7 is a schematic block diagram of an internal structure of an intelligent terminal according to an embodiment of the present invention.
Detailed Description
In the following description, for purposes of explanation and not limitation, specific details are set forth such as particular system structures, techniques, etc. in order to provide a thorough understanding of the embodiments of the invention. It will be apparent, however, to one skilled in the art that the present invention may be practiced in other embodiments that depart from these specific details. In other instances, detailed descriptions of well-known systems, devices, circuits, and methods are omitted so as not to obscure the description of the present invention with unnecessary detail.
It will be understood that the terms "comprises" and/or "comprising," when used in this specification and the appended claims, specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof.
It is also to be understood that the terminology used in the description of the invention herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the invention. As used in this specification and the appended claims, the singular forms "a", "an", and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise.
It should be further understood that the term "and/or" as used in this specification and the appended claims refers to and includes any and all possible combinations of one or more of the associated listed items.
As used in this specification and the appended claims, the term "if" may be interpreted contextually as "when …" or "upon" or "in response to a determination" or "in response to a detection". Similarly, the phrase "if it is determined" or "if a [ described condition or event ] is detected" may be interpreted depending on the context to mean "upon determining" or "in response to determining" or "upon detecting [ described condition or event ]" or "in response to detecting [ described condition or event ]".
The technical solutions in the embodiments of the present invention are clearly and completely described below with reference to the drawings of the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be obtained by a person skilled in the art without making any creative effort based on the embodiments in the present invention, belong to the protection scope of the present invention.
In the following description, numerous specific details are set forth in order to provide a thorough understanding of the present invention, but the present invention may be practiced in other ways than those specifically described and will be readily apparent to those of ordinary skill in the art without departing from the spirit of the present invention, and therefore the present invention is not limited to the specific embodiments disclosed below.
With the development of science and technology, natural language processing has found wide application, and event extraction in particular is used ever more broadly. An event is an occurrence, typically described at the sentence level, in which one or more actions take place and one or more roles participate within a particular time period and location. Event extraction structures such events; the goal of structuring is to determine the event type to which an event belongs and to extract the event participants, such as related entities, related times, related numerical values, and the like.
In the prior art, event extraction depends on preset trigger words. The problem with this approach is that it pays no attention to the event type information of a sentence, and the preset trigger words do not necessarily correspond fully to the sentence, which hinders improving the accuracy of event extraction.
Meanwhile, owing to the complexity and diversity of language, a single sentence may involve multiple events. Prior-art event extraction methods give little consideration to the multi-event case, which easily leads to incomplete extraction and further hinders accuracy.
In one application scenario, event argument extraction has to be performed in multiple steps according to the event types, which reduces the efficiency of event extraction. In another application scenario, a direct association between event type identification and event argument extraction is lacking, and during argument extraction the event type information and the sentence's semantic information are not fused in depth, which affects the accuracy of event extraction.
To solve at least one of the above problems, in the scheme of the invention a sentence to be extracted is obtained, and word encoding and position encoding are performed on each word in the sentence to be extracted to obtain a word embedding vector and a position embedding vector corresponding to the sentence to be extracted; the word embedding vector and the position embedding vector are added to obtain a first input vector, which is input into a pre-trained encoder that outputs a contextualized expression vector of the sentence to be extracted; the contextualized expression vector is input into a pre-trained multi-label event type classifier, which determines the event type embedding vectors corresponding to the sentence to be extracted, and an event type comprehensive vector corresponding to the sentence to be extracted is obtained according to the event type embedding vectors; the contextualized expression vector and the event type comprehensive vector are added to obtain a second input vector, which is input into a pre-trained event argument classifier that yields the event arguments corresponding to the sentence to be extracted; finally, argument combinations are constructed according to the event arguments, event classification is performed on each argument combination, and a target event type is determined for each argument combination, the target event type of an argument combination being either a non-event type or one of the event types corresponding to the sentence to be extracted.
Compared with prior-art event extraction schemes that depend on preset trigger words, this scheme obtains the event types of the sentence to be extracted without setting any trigger words and fuses those event types, as indication information, with the sentence information, which helps improve the accuracy of event extraction. Meanwhile, all event types act together in a single pass of event argument extraction, after which the obtained event arguments are permuted and combined to realize event type classification, so that multiple passes of argument extraction are unnecessary; this improves the efficiency of event extraction and allows the events to be extracted better by combining all event types present in the original sentence.
Exemplary method
As shown in fig. 1, an embodiment of the present invention provides an event extraction method, and specifically, the method includes the following steps:
step S100, obtaining sentences to be extracted, and carrying out word coding and position coding on each word in the sentences to be extracted to obtain word embedded vectors and position embedded vectors corresponding to the sentences to be extracted.
Specifically, the statement to be extracted is a statement that needs to be event extracted, and the statement to be extracted may correspond to one event type or a plurality of event types. It should be noted that one event type corresponds to one event type embedding vector.
Specifically, the sentence to be extracted includes a plurality of words, each word forms a word vector after being subjected to word encoding, the position encoding is to encode position information of the word in the sentence to be extracted, and the position vector can be formed after the position encoding. It should be noted that the position vector represents position information of each word, the position information is an appearance sequence of the word in the sentence to be extracted, and in an application scenario, the position information is an appearance sequence of the word in the sentence to be extracted according to a sequence from left to right.
After word vectors corresponding to the words are obtained, the word vectors are arranged according to the sequence of the words in the sentence to be extracted, and word embedded vectors corresponding to the sentence to be extracted are obtained; and then, arranging the position vectors according to the sequencing sequence of the words in the sentence to be extracted to obtain the position embedded vector corresponding to the sentence to be extracted. The term may be a chinese character, a word in a foreign language, or a character (e.g., arabic numeral), and is not limited in this respect.
Step S200, adding the word embedding vector and the position embedding vector to obtain a first input vector, inputting the first input vector into a pre-trained encoder, and outputting the contextualized expression vector of the sentence to be extracted through the encoder.
The word embedding vector and the position embedding vector have the same number of elements and the same vector dimension. Specifically, in the present embodiment, the word embedding vector is composed of a plurality of word vectors sorted, the position embedding vector is composed of a plurality of position vectors sorted, and the number of word vectors in the word embedding vector is the same as the number of position vectors in the position embedding vector.
In step S200, the adding of the word embedding vector and the position embedding vector means adding each word vector in the word embedding vector and a position vector corresponding to each word vector, wherein a vector dimension of each word vector is equal to a vector dimension of the corresponding position vector. Therefore, the vector dimension of the word embedding vector and the vector dimension of the position embedding vector are equal to the vector dimension corresponding to the input sentence to be extracted. In this embodiment, the word embedding vector and the position embedding vector are added in a point-by-point addition manner.
Further, in this embodiment, the vector dimensions of each word vector and each position vector are the same, which facilitates the addition calculation and improves the efficiency of event extraction.
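The construction of the first input vector described above can be sketched as follows. This is an illustrative sketch only: the vocabulary, embedding tables, dimension d, and example tokens are all hypothetical stand-ins for the trained embedding layers the patent assumes.

```python
import numpy as np

# Minimal sketch (hypothetical vocabulary and dimensions): each word in the
# sentence to be extracted is mapped to a word vector, each position to a
# position vector, and the two sequences are added point by point to form
# the first input vector. All vectors share the same dimension d.
rng = np.random.default_rng(0)
d = 8                                        # embedding dimension (illustrative)
sentence = ["[CLS]", "he", "was", "killed"]  # start delimiter + words

word_table = {w: rng.normal(size=d) for w in set(sentence)}  # word embedding layer
pos_table = rng.normal(size=(len(sentence), d))              # position embedding layer

word_embedding = np.stack([word_table[w] for w in sentence])  # (seq_len, d)
position_embedding = pos_table                                # (seq_len, d)

# Point-by-point addition: requires identical element counts and dimensions.
first_input = word_embedding + position_embedding             # (seq_len, d)
```

Because the addition is element-wise, it is only well-defined when the two embedding sequences have the same length and the same per-vector dimension, as the text requires.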
In this embodiment, the pre-trained encoder is a pre-trained Transformer language model encoder (i.e. a Transformer encoder), and is not limited in particular. The input of the encoder (i.e. the first input vector) is composed of two parts, the first part is a word vector (word embedding) of a sentence text word after passing through a word embedding layer, and the second part is a position vector (position embedding) of position information after passing through a position embedding layer. After the first input vector is encoded by the self-attention mechanism of the Transformer language model, a contextualized representation (contextualized expression) vector corresponding to the sentence to be extracted is output. The contextualized expression vector is the result of the model mapping the input data to the same dimensional space through an attention mechanism.
It should be noted that the Transformer language model is a machine learning model based on attention mechanism, and can process all words or symbols in a text in parallel, and simultaneously, combine the context with a word far away by using the attention mechanism, and by processing all words in parallel, let each word notice other words in a sentence in multiple processing steps. The input item of the Transformer encoder is a first input vector, and the output item is a contextualized expression vector of the statement to be extracted, wherein the contextualized expression vector of the statement to be extracted comprises contextualized expression vectors of all terms in the statement to be extracted.
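The parallel, all-pairs attention described above can be illustrated with a single untrained attention head. This is a sketch of the mechanism only, not the patent's pre-trained encoder: the random weight matrices and shapes are assumptions.

```python
import numpy as np

# Single-head, single-layer self-attention sketch (untrained random weights):
# every position attends to every other position, so each output row is a
# contextualized expression vector mixing information from the whole sentence.
rng = np.random.default_rng(1)
seq_len, d = 5, 16
x = rng.normal(size=(seq_len, d))          # first input vector (word + position)

Wq, Wk, Wv = (rng.normal(size=(d, d)) for _ in range(3))
Q, K, V = x @ Wq, x @ Wk, x @ Wv

scores = Q @ K.T / np.sqrt(d)              # (seq_len, seq_len) attention scores
weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
weights /= weights.sum(axis=-1, keepdims=True)   # softmax over positions

contextualized = weights @ V               # (seq_len, d) contextualized vectors
```

Each row of `weights` sums to 1 over all positions, which is how a distant word can still contribute to the contextualized expression of the current word.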
Fig. 2 is a schematic diagram illustrating a specific flow of event extraction according to an embodiment of the present invention. As shown in fig. 2, the sentence to be extracted is "the present him to bag hd and kill", and it is encoded to obtain a word embedding vector and a position embedding vector. It should be noted that [CLS] represents a start delimiter of a sentence; the start delimiter can also be treated as a word of the sentence to be extracted for position coding and word coding, to obtain the position vector and word vector corresponding to the start delimiter. As shown in fig. 2, in this embodiment, word encoding is performed on the sentence to be extracted to obtain a plurality of word vectors, and the word vectors are arranged in order to obtain the corresponding word embedding vector; position coding is performed on the sentence to be extracted to obtain position vectors, which are sorted to obtain the corresponding position embedding vector. It should be noted that, in the present embodiment, the sentence to be extracted includes 7 words and 1 start delimiter, so that the obtained word embedding vector includes 8 word vectors; in a specific use process, the number of word vectors in the word embedding vector is determined according to actual requirements, and is not specifically limited herein. The word embedding vector and the position embedding vector are added point by point to obtain a first input vector, and the first input vector is input into a Transformer encoder to obtain the corresponding contextualized expression vectors (h_[CLS], h_1, h_2, ...), wherein h_[CLS] is the contextualized expression vector of the specific flag bit [CLS] of the pre-trained language model (namely the Transformer language model), which may also be referred to as the word hidden vector of the flag bit [CLS]; h_1 represents the contextualized expression vector (i.e., word hidden vector) of the 1st word in the sentence to be extracted, h_2 represents the contextualized expression vector of the 2nd word in the sentence to be extracted, and so on, which are not described in detail.
Step S300, inputting the contextualized expression vector into a pre-trained multi-label event type classifier, determining an event type embedding vector corresponding to the statement to be extracted through the multi-label event type classifier, and acquiring an event type comprehensive vector corresponding to the statement to be extracted according to the event type embedding vector.
In this embodiment, the multi-label event type classifier may be a fully connected layer. For the text classification task, the contextualized expression vector of a sentence is input into a multi-label event type classifier constructed from a fully connected layer to perform multi-label classification, and the multi-label event type classifier can output a plurality of event types and their corresponding probabilities. It should be noted that multi-label classification means that each input may carry more than one label.
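A fully connected multi-label head of this kind can be sketched as below. The weights, the use of the [CLS] vector as classifier input, and the dimensions are hypothetical; the point is that a sigmoid (rather than a softmax) yields one independent probability per event type, so several labels can fire for the same sentence.

```python
import numpy as np

# Sketch of a multi-label event type classifier: one fully connected layer
# over a contextualized vector, followed by an element-wise sigmoid, giving
# an independent probability for each event type.
rng = np.random.default_rng(2)
d, num_event_types = 16, 4

h_cls = rng.normal(size=d)                   # contextualized [CLS] vector (assumed input)
W = rng.normal(size=(num_event_types, d))    # fully connected layer weights
b = np.zeros(num_event_types)                # bias

logits = W @ h_cls + b
event_probs = 1.0 / (1.0 + np.exp(-logits))  # sigmoid: one probability per type
```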
In this embodiment, a statement to be extracted is subjected to multi-event extraction as an example, where the statement to be extracted corresponds to multiple event types, one event type comprehensive vector corresponds to one contextualized expression vector, and the dimension of the event type comprehensive vector is the same as that of the contextualized expression vector, so that each contextualized expression vector is respectively summed with one corresponding event type comprehensive vector.
Specifically, the step S300 includes: inputting the contextualized expression vector into a pre-trained multi-label event type classifier, and determining an event type embedding vector corresponding to the statement to be extracted through the multi-label event type classifier; and acquiring an event type comprehensive vector corresponding to the statement to be extracted according to the event type embedded vector.
In an application scenario, the multi-tag event type classifier directly outputs the corresponding event type (and the event probability corresponding to each event type), and then obtains the corresponding event type embedding vector according to the event type. Specifically, the contextualized expression vector is input into a pre-trained multi-label event type classifier, and the event type corresponding to the statement to be extracted is determined through the multi-label event type classifier; acquiring event type embedded vectors corresponding to the statements to be extracted according to the event types, wherein the event types correspond to the event type embedded vectors one to one; and acquiring an event type comprehensive vector corresponding to the statement to be extracted according to the event type embedded vector.
In this embodiment, event extraction is divided into two stages, and multiple event extraction (multiple event extraction) is performed. As shown in fig. 2, the first stage in this embodiment corresponds to the left side in fig. 2, and event type classification is performed by using an encoder and a multi-tag event type classifier, and finally an event type comprehensive vector is obtained; the second stage corresponds to the right side in fig. 2, single-step event extraction is performed according to the event type comprehensive vector and the contextualized expression vector obtained in the first stage, event arguments in the statement to be extracted are obtained, and finally, the target event type corresponding to each argument combination is determined.
The event type embedding vector is a vector representation corresponding to each event type. Fig. 3 is a specific flow diagram of event extraction provided by an embodiment of the present invention. As shown in fig. 3, the multi-label event type classifier processes the contextualized expression vector and obtains a plurality of event type embedding vectors corresponding to the sentence to be extracted, e.g., the event type embedding vector t_a corresponding to event type a, the event type embedding vector t_b corresponding to event type b, the event type embedding vector t_c corresponding to event type c, and so on. Each event type corresponds to an event probability, that is, the probability that the sentence to be extracted corresponds to that event type. For example, in fig. 3 the event probability p_b corresponding to event type b is 0.8, and the event probability p_c corresponding to event type c is 0.6. It should be noted that the event type embedding vector (event type embedding) is a trainable vector and may be used as an event indication vector; alternatively, a corresponding event type comprehensive vector is obtained from the event type embedding vectors and used as the event indication vector, which is fused with the word vectors to assist argument extraction in the second stage. Therefore, the two stages are associated and the scheme is extensible: as event types increase, the overall architecture of the model is not affected.
Fig. 4 is a schematic diagram of a specific process of event extraction according to an embodiment of the present invention. As shown in fig. 3 and fig. 4, each contextualized expression vector corresponds to the same event type comprehensive vector c, and each contextualized expression vector is added to the event type comprehensive vector c.
In one application scenario, a probability threshold is preset. For example, it may be set to 0.5, but is not specifically limited. For each event type, if the corresponding probability is greater than 0.5, that event type is considered to exist in the sentence to be extracted. For example, in fig. 3 and fig. 4, the event probabilities of event type b and event type c are greater than 0.5, so event type b and event type c are taken as the to-be-processed event types, and the corresponding event type comprehensive vector is obtained from the to-be-processed event types, which reduces the amount of calculation and improves the efficiency of event extraction.
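The threshold step can be sketched directly, using the probabilities from the example above (types and values are those of the illustration in the text):

```python
# Keep only event types whose probability exceeds the preset threshold; the
# survivors become the to-be-processed event types for the second stage.
event_probs = {"a": 0.2, "b": 0.8, "c": 0.6}  # stage-one event probabilities
threshold = 0.5

to_be_processed = {t: p for t, p in event_probs.items() if p > threshold}
# to_be_processed now contains only event types b and c
```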
Specifically, the obtaining of the integrated vector of the event type corresponding to the statement to be extracted according to the embedded vector of the event type includes:
acquiring event probability corresponding to each event type, wherein the event probability is determined when the event type embedding vector corresponding to the statement to be extracted is determined through the multi-label event type classifier;
taking an event type with the event probability larger than a preset probability threshold value as a to-be-processed event type, and taking an event type embedded vector corresponding to the to-be-processed event type as a to-be-processed embedded vector;
and performing weighted summation on the embedding vectors to be processed corresponding to the event types to be processed to obtain the event type comprehensive vector, wherein the weighting coefficients corresponding to the embedding vectors to be processed are equal, or the event probability corresponding to the event types to be processed is taken as the weighting coefficient of the embedding vector to be processed corresponding to the event types to be processed.
In an application scenario, as shown in fig. 3, the event probability corresponding to each to-be-processed event type is used as the weight coefficient of the corresponding to-be-processed embedding vector, and the event type comprehensive vector is obtained by calculation. Specifically, the event type comprehensive vector c may be calculated according to the following formula (1):

c = p_b · t_b + p_c · t_c    (1)

wherein t_b and t_c respectively represent the event type embedding vectors (i.e., the to-be-processed embedding vectors) corresponding to event types b and c, and p_b and p_c respectively represent the event probabilities corresponding to event types b and c. In this embodiment, only two to-be-processed event types are taken as an example for explanation, which is not specifically limited.
In another application scenario, the same weight coefficient may be set for each to-be-processed embedding vector. It should be noted that the weight coefficient of each to-be-processed embedding vector may be preset to the same number (for example, all set to 1), or may be adjusted according to the number of to-be-processed embedding vectors while keeping the coefficients equal to each other. As shown in fig. 4, the weight coefficient may be set to the reciprocal of the number of to-be-processed embedding vectors; when the sentence to be extracted corresponds to 2 to-be-processed event types (i.e., 2 to-be-processed embedding vectors), the weight coefficient may be set to one half, and the corresponding event type comprehensive vector c is calculated according to the following formula (2):

c = (t_b + t_c) / 2    (2)

wherein t_b and t_c respectively represent the event type embedding vectors corresponding to event types b and c. In this way, the average value of the to-be-processed embedding vectors is used as the event type comprehensive vector, which improves the accuracy and efficiency of event extraction.
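Both weighting schemes amount to one line of vector arithmetic. The embedding vectors below are illustrative; the probabilities are the example values used in the text (0.8 for type b, 0.6 for type c).

```python
import numpy as np

# Sketch of formulas (1) and (2): formula (1) weights each to-be-processed
# embedding vector by its event probability; formula (2) averages them with
# equal weights (the reciprocal of the number of vectors).
t_b = np.array([1.0, 0.0, 2.0, 0.0])  # event type embedding vector of type b (illustrative)
t_c = np.array([0.0, 2.0, 0.0, 4.0])  # event type embedding vector of type c (illustrative)
p_b, p_c = 0.8, 0.6                   # event probabilities from stage one

c_weighted = p_b * t_b + p_c * t_c    # formula (1): probability-weighted sum
c_average = (t_b + t_c) / 2           # formula (2): equal weights of 1/2
```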
Fig. 5 is a schematic diagram of a specific event extraction flow provided in this embodiment. In an application scenario, as shown in fig. 5, the number of event type comprehensive vectors is the same as the number of contextualized expression vectors corresponding to the sentence to be extracted; the value of the event type comprehensive vector corresponding to each contextualized expression vector is determined according to that contextualized expression vector, so the event type comprehensive vectors corresponding to different contextualized expression vectors may differ. It should be noted that, in fig. 5, c_1 represents the 1st event type comprehensive vector, which corresponds to the 1st contextualized expression vector h_1, and c_i represents the i-th event type comprehensive vector, which corresponds to the i-th contextualized expression vector h_i. Therefore, all event types corresponding to the sentence to be extracted are comprehensively considered, and the accuracy of event extraction is improved.
Specifically, a corresponding event type comprehensive vector is learned for each character in the sentence to be extracted. Obtaining the event type comprehensive vector corresponding to the sentence to be extracted according to the event type embedding vector includes: acquiring a weight matrix, a projection matrix, and an event type embedding matrix, wherein the event type embedding matrix is obtained from the event type embedding vectors, the weight matrix is a matrix with m rows and 1 column, the projection matrix and the event type embedding matrix are both matrices with m rows and d columns, m is the number of event types corresponding to the sentence to be extracted, and d is the dimension of the event type embedding vector; and calculating the event type comprehensive vector corresponding to the sentence to be extracted according to the weight matrix, the projection matrix, the event type embedding matrix, and the contextualized expression vector; wherein each element in the weight matrix is the event probability corresponding to the respective event type, or each element in the weight matrix is 1.
Specifically, the i-th event type comprehensive vector is the product of the transpose of the target matrix and the event type embedding matrix; the target matrix is obtained as the Hadamard product of the target product matrix and the weight matrix; and the target product matrix is the product of the projection matrix and the i-th contextualized expression vector.
As shown in fig. 5, in one application scenario, the event type comprehensive vector corresponding to the i-th character may be calculated according to the following formula (3):

c_i = ((P · h_i) ⊙ w)ᵀ · T    (3)

wherein P represents the projection matrix of shape (m, d); h_i is the vectorized expression (i.e., contextualized expression vector) corresponding to the i-th character among the contextualized expression vectors output by the Transformer language model; w represents the weight matrix; ᵀ represents matrix transposition; T represents the event type embedding matrix; and ⊙ represents the Hadamard product. As shown in fig. 5, i can be a positive integer or [CLS]; the maximum value of i is determined according to the number of characters of the sentence to be extracted, and is not specifically limited herein. It should be noted that, for a word in natural language, the specific values of the elements of its d-dimensional vector expression are determined according to actual requirements, and are not specifically limited herein.
In particular, the projection matrix P is obtained by random initialization, wherein each row represents the projection parameters of one event type; the event type embedding matrix T is composed of the event type embedding vectors corresponding to the sentence to be extracted; and the weight matrix w is formed by the event probabilities corresponding to the event types, or each element in w is 1.

In an application scenario, the sentence to be extracted corresponds to 5 event types, the vector dimension corresponding to each event type (i.e., of each event type embedding vector) is 512, and event extraction needs to be performed on a sentence with a length of 20 characters. Then P is a projection matrix of shape (5, 512), and T is an event type embedding matrix of shape (5, 512) composed of the event type embedding vectors corresponding to the 5 event types. For the 1st character, the corresponding contextualized expression vector (i.e., word hidden vector) h_1 output by the Transformer encoder is a matrix of shape (512, 1), and the corresponding event type comprehensive vector c_1 obtained according to the above formula (3) has shape (1, 512).
As shown in formula (3), in this embodiment, the projection matrix is first multiplied by the i-th contextualized expression vector h_i to obtain the target product matrix; the target product matrix is then Hadamard-multiplied by the weight matrix (which may be the probability distribution over all event types) to obtain the target matrix; and the product of the transpose of the target matrix and the event type embedding matrix is taken as the event type comprehensive vector corresponding to that contextualized expression vector. In this way, the weight parameters (i.e., the elements in the weight matrix) allow the model to learn a better weighting for each word and for different contexts, which optimizes the event type embedding effect and improves the accuracy of event extraction.
It should be noted that a Hadamard product is a matrix operation in which the elements at corresponding positions of two matrices are multiplied one by one. In another application scenario, each element in the weight matrix is 1, so that Hadamard multiplication by the weight matrix is equivalent to multiplying each element of the original matrix by 1, and the weight matrix can be omitted. Therefore, when each element in the weight matrix is 1, the event type comprehensive vector corresponding to the i-th character can be calculated by the following formula (4):

c_i = (P · h_i)ᵀ · T    (4)
In this way, the amount of computation can be reduced, thereby improving the efficiency of event extraction.
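The shape bookkeeping of this computation can be checked with a short sketch, using the dimensions from the example in the text (m = 5 event types, d = 512); the matrix contents are random placeholders for the learned parameters.

```python
import numpy as np

# Sketch of formula (3), c_i = ((P @ h_i) * w).T @ T, and its unweighted
# special case (formula (4)) where every element of w is 1.
rng = np.random.default_rng(3)
m, d = 5, 512                    # event types and embedding dimension (as in the text)

P = rng.normal(size=(m, d))      # projection matrix, randomly initialized
T = rng.normal(size=(m, d))      # event type embedding matrix
h_i = rng.normal(size=(d, 1))    # contextualized vector of the i-th character
w = rng.uniform(size=(m, 1))     # weight matrix: event probabilities (or all ones)

c_i = ((P @ h_i) * w).T @ T      # formula (3): (1, m) @ (m, d) -> shape (1, d)
c_i_unweighted = (P @ h_i).T @ T # formula (4): weight matrix of all ones omitted
```

Dropping the all-ones weight matrix removes one element-wise multiply per character, which is the computation saving the text refers to.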
It should be noted that, in the process of calculating the event type comprehensive vector, the weighting weight of each event type embedding vector may be taken from the attention weight, so as to improve the accuracy of event extraction.
And step S400, adding the contextualized expression vector and the event type comprehensive vector to obtain a second input vector, inputting the second input vector into a pre-trained event argument classifier, and obtaining the event argument corresponding to the statement to be extracted through the event argument classifier.
In this embodiment, for multiple types of events extracted in the first stage, after all event type embedding vectors corresponding to the multiple types of events are weighted and corresponding event type integrated vectors are obtained, the event type integrated vectors and the contextualized expression vectors obtained in the first stage are added to be used as second input vectors during argument extraction, and argument extraction is performed. And combining and event pairing the extracted arguments to obtain the event arguments corresponding to each detected event type. Moreover, for the N detected event types, all event arguments can be extracted in one step without extracting each event type, so that the efficiency of event extraction is improved.
Specifically, in this embodiment, the second input vector is input into a pre-trained event argument classifier, and an event argument (i.e., an event parameter) corresponding to the statement to be extracted is obtained.
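The second-stage input construction and the one-step argument extraction can be sketched as a per-token classification. The linear classifier and the label inventory below are hypothetical stand-ins; the text does not fix the event argument classifier's internal form.

```python
import numpy as np

# Sketch: each contextualized expression vector is added to its event type
# comprehensive vector, and the resulting second input vector feeds a
# per-token event argument classifier (here an untrained linear layer over
# an assumed set of argument labels).
rng = np.random.default_rng(4)
seq_len, d, num_labels = 6, 16, 7

H = rng.normal(size=(seq_len, d))   # contextualized expression vectors
C = rng.normal(size=(seq_len, d))   # event type comprehensive vectors (one per token)

second_input = H + C                # one addition per token

W = rng.normal(size=(d, num_labels))              # argument classifier (sketch)
label_ids = (second_input @ W).argmax(axis=-1)    # one argument label per token
```

Because every token is classified in the same pass, all event arguments for all N detected event types come out of a single step, matching the single-step extraction described above.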
And S500, constructing argument combinations according to the event arguments, classifying the events of the argument combinations and determining the target event types corresponding to the argument combinations.
One argument combination includes at least one event argument, and the target event type corresponding to an argument combination is either a non-event type or one of the event types corresponding to the sentence to be extracted.
In this embodiment, the step S500 specifically includes: acquiring word attributes corresponding to the event arguments, and combining the event arguments according to the word attributes corresponding to the event arguments to obtain a plurality of argument combinations, wherein each argument combination comprises a plurality of event arguments, and the word attributes corresponding to the event arguments in one argument combination are different; and classifying the events of the argument combinations and determining the target event types corresponding to the argument combinations.
Further, the event classification for each argument combination and the determination of the target event type corresponding to each argument combination include: and inputting the argument combination into a pre-trained argument combination event type classifier, performing event classification on each event argument combination through the argument combination event type classifier, and acquiring a target event type corresponding to each argument combination.
The event argument (i.e., event parameter) is a character or a character combination having a certain meaning of the event argument. One event argument corresponds to one word attribute, such as subject, predicate, object, complement, etc., and event arguments of different word attributes are combined. In the present embodiment, as shown in fig. 2 to fig. 5, the extracted subjects, predicates, and objects are taken as an example for explanation, where S1, S2, and S3 represent the extracted 3 subjects, P1, P2, and P3 represent the extracted 3 predicates, and O1, O2, and O3 represent the extracted 3 objects. The event-arguments are combined according to word attributes, for example, one argument combination includes at least a subject, a predicate, and an object. The argument combination event type classifier is a classifier trained in advance for classifying events of argument combinations. It should be noted that the multi-label event type classifier is a classifier for determining a target event type that may exist in a statement, and the function of the multi-label event type classifier is different from that of the argument combination event type classifier.
In an application scenario, one argument combination comprises a plurality of event arguments, wherein the word attributes corresponding to each event argument are different, and the number of the event arguments in one argument combination is the same as the number of the kinds of the word attributes. And after the corresponding argument combinations are obtained, event classification is carried out on each argument combination through a pre-trained argument combination event type classifier, and a target event type corresponding to each argument combination is determined. In this embodiment, the argument combination event type classifier used is an SPO event type classifier trained in advance for classifying argument combinations composed of principal and predicate guests, but is not limited specifically. Therefore, the event arguments obtained by extraction are combined and classified, and each target event type and the corresponding argument combination can be better determined.
It should be noted that there are many ways to combine and categorize event arguments, which are not specifically limited herein. In an application scenario, for an argument combination, a multi-classifier and a beam search may be used for event classification, or an argument combination (or an event argument) is predicted to be scored through a ranking algorithm, and event types are ranked according to the scores, and a corresponding target event type is determined.
Specifically, all extracted event arguments are exhaustively permuted and combined to obtain the corresponding argument combinations. For each argument combination, the first-word hidden vectors of its arguments are superposed or averaged, and the result is input into the classifier to classify the argument combination. If the argument combination is not a reasonable combination, the classifier classifies it as a non-event and it is discarded.
For example, two subjects S1 and S2 and two predicates P1 and P2 are detected in a paragraph; in this case there are four possible combinations of event elements: S1P1, S1P2, S2P1, and S2P2. Taking S1P1 as an example, the first-word hidden vector of S1 is h_S1 and the first-word hidden vector of P1 is h_P1; then h_S1 + h_P1 is input into the classifier as a feature. If the classifier judges that the combination is not an event, the S1P1 combination is discarded; otherwise, the corresponding event type is kept and taken as a target event type.
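The enumeration-and-filter step above can be sketched as follows. The hidden vectors and the linear classifier are untrained placeholders, and reserving label 0 for "non-event" is an assumption for illustration.

```python
import itertools
import numpy as np

# Sketch: subjects and predicates are exhaustively paired, the first-word
# hidden vectors of the two arguments are summed into a feature, and a
# (hypothetical, untrained) classifier either assigns an event type or
# discards the pair as a non-event.
rng = np.random.default_rng(5)
d, num_types = 8, 3                      # label 0 is reserved for "non-event"

subjects = {"S1": rng.normal(size=d), "S2": rng.normal(size=d)}
predicates = {"P1": rng.normal(size=d), "P2": rng.normal(size=d)}
W = rng.normal(size=(d, num_types + 1))  # argument combination event type classifier

kept = {}
for (s, hs), (p, hp) in itertools.product(subjects.items(), predicates.items()):
    feature = hs + hp                    # superposed first-word hidden vectors
    event_type = int((feature @ W).argmax())
    if event_type != 0:                  # classified as non-event: discard
        kept[(s, p)] = event_type
```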
In one application scenario, for each argument combination, the first-word hidden vectors of its arguments are superposed or averaged; the result is compared by vector distance with the event type embedding vectors corresponding to all event types, the combination is assigned to the event type with the closest distance, and that event type is taken as the target event type.
Therefore, in this embodiment, when multi-event extraction is performed, the trigger words required by traditional event extraction do not need to be marked, and multiple event types and their corresponding parameters can be extracted from one sentence. Meanwhile, event type classification and event argument extraction are combined, and a trainable event type comprehensive vector is set as indication information to assist parameter extraction in the second stage. In the second stage, the event type comprehensive vector and the contextualized expression vector are added to realize the embedding of the event type, so that the two stages are associated; at the same time the scheme is extensible, and the overall architecture of the model is not affected as event types increase. In the single-step extraction process of the second stage, for the N detected events, all event arguments can be extracted in one step, making the process more efficient. In the second stage, the weighting weights are taken from the event probabilities corresponding to the event types in the first stage, dynamically combining the information of the two stages and improving the event extraction effect.
In an application scenario, the encoder, the multi-label event type classifier, the event argument classifier and the argument combination event type classifier may be trained respectively in advance through respective training data, the training data may include input data and artificial labeling data, output data is obtained according to the input data in a training process, a loss value is calculated according to the output data and the artificial labeling data, and parameters of the encoder, the event argument classifier and the argument combination event type classifier are adjusted according to the loss value until a training condition is met (a preset iteration number is reached or the loss value is smaller than a corresponding threshold value). For example, when the argument combination event type classifier is trained, the training data comprises a plurality of argument combinations and artificially labeled real target event types corresponding to the argument combinations. And inputting the argument combinations in the training data into an argument combination event type classifier, obtaining output target event types corresponding to the argument combinations, calculating loss values according to the output target event types and the real target event types, and adjusting parameters of the argument combination event type classifier according to the loss values until preset training conditions are met.
In this embodiment, the encoder, the multi-label event type classifier, the event argument classifier, and the argument combination event type classifier belong to the same event extraction model, that is, the event extraction model includes four parts, namely, the encoder, the multi-label event type classifier, the event argument classifier, and the argument combination event type classifier, and the whole event extraction model is directly trained end to end according to training data including artificial labeling data, which is beneficial to reducing training amount and improving training accuracy.
Specifically, in this embodiment, the event extraction model is trained according to the following steps:
inputting training sentences in training data into an event extraction model, and outputting training target event types corresponding to the training sentences through the event extraction model, wherein the training data comprises a plurality of groups of training sentence information, and each group of training sentence information comprises a training sentence and a real target event type corresponding to the training sentence;
and adjusting model parameters of the event extraction model according to the training target event type and the real target event type corresponding to the training sentences, and continuing to execute the step of inputting the training sentences in the training data into the event extraction model until preset training conditions are met to obtain the trained event extraction model.
In the training process of the event extraction model, the specific data processing process of each component is similar to the specific processing process in the event extraction process, and is not described again here.
In an application scenario, the event extraction model can also calculate loss values in three parts and train the loss values, namely, corresponding multi-label event type classification loss, event argument classification loss and argument combination event type classification loss are calculated respectively for a multi-label event type classifier, an event argument classifier and an argument combination event type classifier, the three losses are summed to obtain final loss, the final loss is used for calculating gradient, and gradient propagation is carried out on the three parts to train the model. It should be noted that, at this time, the training data corresponding to the event extraction model includes multiple sets of training statement information, and each set of training statement information includes a training statement, a true event type (and a true event probability corresponding to each true event type, or a true event type embedded vector), a true event argument, and a true target event type, so as to calculate three corresponding loss values respectively.
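The three-part loss described above can be sketched numerically. The binary cross-entropy form and the example predictions/labels are illustrative assumptions; the text only specifies that the three classifier losses are computed separately and summed into the final loss used for the gradient.

```python
import numpy as np

# Sketch: compute a loss per classifier and sum them into the final loss.
def binary_cross_entropy(p, y):
    """Mean BCE between predicted probabilities p and 0/1 labels y."""
    p = np.clip(p, 1e-7, 1 - 1e-7)
    return float(-np.mean(y * np.log(p) + (1 - y) * np.log(1 - p)))

# Hypothetical predictions and labels for one training sentence.
type_loss = binary_cross_entropy(np.array([0.8, 0.6, 0.1]), np.array([1, 1, 0]))
argument_loss = binary_cross_entropy(np.array([0.9, 0.2]), np.array([1, 0]))
combination_loss = binary_cross_entropy(np.array([0.7]), np.array([1]))

final_loss = type_loss + argument_loss + combination_loss  # summed for the gradient
```

Summing the losses lets one backward pass propagate gradients through all three classifiers (and the shared encoder) at once, which is what makes the end-to-end training of the whole event extraction model possible.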
Therefore, in the event extraction method provided by the embodiment of the invention, the event type of the statement to be extracted is acquired without setting trigger words, and the event type is fused with the statement information as indication information, which improves the accuracy of event extraction. Meanwhile, all event types act jointly so that event arguments are extracted only once, and the obtained event arguments are then arranged and combined to realize event type classification; since event arguments do not need to be extracted multiple times, the efficiency of event extraction is improved, and the event types in the original statement to be processed can be combined to better extract events.
Exemplary device
As shown in fig. 6, corresponding to the event extraction method, an embodiment of the present invention further provides an event extraction device, where the event extraction device includes:
The statement processing module 610 is configured to obtain a statement to be extracted, and perform word encoding and position encoding on each word in the statement to be extracted to obtain a word embedding vector and a position embedding vector corresponding to the statement to be extracted.
The embedding vector processing module 620 is configured to add the word embedding vector and the position embedding vector to obtain a first input vector, input the first input vector into a pre-trained encoder, and output a contextualized expression vector of the statement to be extracted through the encoder.
The event type determining module 630 is configured to input the contextualized expression vector into a pre-trained multi-label event type classifier, determine, through the multi-label event type classifier, an event type embedding vector corresponding to the statement to be extracted, and obtain, according to the event type embedding vector, an event type comprehensive vector corresponding to the statement to be extracted.
The event argument extraction module 640 is configured to add the contextualized expression vector and the event type comprehensive vector to obtain a second input vector, input the second input vector into a pre-trained event argument classifier, and obtain, through the event argument classifier, the event arguments corresponding to the statement to be extracted.
The event argument processing module 650 is configured to construct argument combinations according to the event arguments, perform event classification on each argument combination, and determine a target event type corresponding to each argument combination, where the target event type corresponding to one argument combination is either a non-event or an event type corresponding to the statement to be extracted.
Specifically, in this embodiment, the specific functions of the event extraction device and each module thereof may refer to the corresponding descriptions in the event extraction method, and are not described herein again.
The event extraction device is not limited to the division described above; in practical applications, its functions may be allocated to different modules as required.
Based on the above embodiment, the present invention further provides an intelligent terminal, a schematic block diagram of which may be as shown in fig. 7. The intelligent terminal comprises a processor, a memory, a network interface and a display screen which are connected through a system bus. The processor of the intelligent terminal provides computing and control capabilities. The memory of the intelligent terminal comprises a non-volatile storage medium and an internal memory. The non-volatile storage medium stores an operating system and an event extraction program. The internal memory provides an environment for running the operating system and the event extraction program in the non-volatile storage medium. The network interface of the intelligent terminal is used to connect to and communicate with an external terminal through a network. When executed by the processor, the event extraction program implements the steps of any one of the event extraction methods described above. The display screen of the intelligent terminal may be a liquid crystal display screen or an electronic ink display screen.
It will be understood by those skilled in the art that the block diagram of fig. 7 is only a block diagram of part of the structure related to the solution of the present invention and does not limit the intelligent terminal to which the solution of the present invention is applied; a specific intelligent terminal may include more or fewer components than those shown in the figure, combine some components, or have a different arrangement of components.
In one embodiment, an intelligent terminal is provided, where the intelligent terminal includes a memory, a processor, and an event extraction program stored in the memory and executable on the processor, and the event extraction program, when executed by the processor, performs the following operations:
acquiring sentences to be extracted, and performing word coding and position coding on each word in the sentences to be extracted to obtain word embedded vectors and position embedded vectors corresponding to the sentences to be extracted;
adding the word embedding vector and the position embedding vector to obtain a first input vector, inputting the first input vector into a pre-trained encoder, and outputting a contextualized expression vector of the sentence to be extracted through the encoder;
inputting the contextualized expression vector into a pre-trained multi-label event type classifier, determining an event type embedding vector corresponding to the statement to be extracted through the multi-label event type classifier, and acquiring an event type comprehensive vector corresponding to the statement to be extracted according to the event type embedding vector;
adding the contextualized expression vector and the event type comprehensive vector to obtain a second input vector, inputting the second input vector into a pre-trained event argument classifier, and obtaining an event argument corresponding to the statement to be extracted through the event argument classifier;
and constructing argument combinations according to the event arguments, performing event classification on each argument combination, and determining a target event type corresponding to each argument combination, wherein the target event type corresponding to one argument combination is either a non-event or an event type corresponding to the statement to be extracted.
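The five operations above can be illustrated end to end with a toy numerical sketch. Every size, weight, threshold and role grouping below is a hypothetical placeholder; the encoder and the three classifiers are stubbed with random projections rather than the trained models described in the embodiments.

```python
import numpy as np
from itertools import product

rng = np.random.default_rng(0)
d, n_tokens, n_types = 8, 5, 4                    # hypothetical sizes

# Operations 1-2: add word and position embeddings; the encoder is stubbed out,
# so the sum itself stands in for the contextualized expression vectors
word_emb = rng.normal(size=(n_tokens, d))
pos_emb = rng.normal(size=(n_tokens, d))
context = word_emb + pos_emb

# Operation 3: multi-label event type classifier (random weights, sigmoid per type)
W_type = rng.normal(size=(d, n_types))
type_probs = 1.0 / (1.0 + np.exp(-(context.mean(axis=0) @ W_type)))
type_emb = rng.normal(size=(n_types, d))          # one embedding vector per event type
active = type_probs > 0.5                         # event types kept for this statement
comp = (type_probs[active][:, None] * type_emb[active]).sum(axis=0)  # comprehensive vector

# Operation 4: second input vector = contextualized vectors + comprehensive vector
second_input = context + comp                     # comp broadcasts over the tokens

# Operation 5: argument extraction and combination classification (both stubbed)
arg_scores = second_input @ rng.normal(size=(d,))
arguments = [i for i, s in enumerate(arg_scores) if s > 0]  # token indices as "arguments"
roles = {"subject": [0, 2], "object": [1]}        # hypothetical word-attribute grouping
combos = list(product(*roles.values()))           # one argument per attribute per combination
labels = ["event" if c else "non-event" for c in combos]    # stub combination classifier
```

The sketch only demonstrates the data flow: one pass of argument extraction, followed by enumeration and classification of argument combinations, without any trigger words.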
The embodiment of the present invention further provides a computer-readable storage medium, where an event extraction program is stored on the computer-readable storage medium, and when the event extraction program is executed by a processor, the event extraction program implements the steps of any one of the event extraction methods provided in the embodiments of the present invention.
It should be understood that, the sequence numbers of the steps in the foregoing embodiments do not imply an execution sequence, and the execution sequence of each process should be determined by functions and internal logic of the process, and should not constitute any limitation to the implementation process of the embodiments of the present invention.
It should be clear to those skilled in the art that, for convenience and simplicity of description, the foregoing division of the functional units and modules is only used for illustration, and in practical applications, the above functions may be distributed as different functional units and modules according to needs, that is, the internal structure of the apparatus may be divided into different functional units or modules to implement all or part of the above described functions. Each functional unit and module in the embodiments may be integrated in one processing unit, or each unit may exist alone physically, or two or more units are integrated in one unit, and the integrated unit may be implemented in a form of hardware, or in a form of software functional unit. In addition, specific names of the functional units and modules are only used for distinguishing one functional unit from another, and are not used for limiting the protection scope of the present invention. For the specific working processes of the units and modules in the above-mentioned apparatus, reference may be made to the corresponding processes in the foregoing method embodiments, which are not described herein again.
In the above embodiments, the description of each embodiment has its own emphasis, and reference may be made to the related description of other embodiments for parts that are not described or recited in any embodiment.
Those of ordinary skill in the art will appreciate that the elements and algorithm steps of the examples described in connection with the embodiments disclosed herein may be implemented as electronic hardware, or combinations of computer software and electronic hardware. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the technical solution. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present invention.
In the embodiments provided in the present invention, it should be understood that the disclosed system/terminal device and method may be implemented in other ways. For example, the above-described system/terminal device embodiments are merely illustrative, and for example, the division of the above modules or units is only one logical function division, and may be implemented by another division manner in practice, for example, multiple units or components may be combined or integrated into another system, or some features may be omitted, or not executed.
The integrated modules/units described above may be stored in a computer-readable storage medium if implemented in the form of software functional units and sold or used as separate products. Based on such an understanding, all or part of the flow in the methods of the embodiments of the present invention may also be implemented by a computer program, which may be stored in a computer-readable storage medium; when the computer program is executed by a processor, the steps of the method embodiments described above may be implemented. The computer program comprises computer program code, which may be in the form of source code, object code, an executable file, some intermediate form, and the like. The computer-readable medium may include: any entity or device capable of carrying the computer program code, a recording medium, a USB flash drive, a removable hard disk, a magnetic disk, an optical disk, a computer memory, a read-only memory (ROM), a random access memory (RAM), an electrical carrier signal, a telecommunications signal, a software distribution medium, and the like. It should be noted that the contents of the computer-readable storage medium may be increased or decreased as required by legislation and patent practice in the jurisdiction.
The above-mentioned embodiments are only used to illustrate the technical solutions of the present invention, not to limit them. Although the present invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; such modifications and substitutions do not cause the corresponding technical solutions to depart from the spirit and scope of the embodiments of the present invention, and they should be construed as being included therein.
Claims (10)
1. An event extraction method, characterized in that the event extraction method comprises:
obtaining sentences to be extracted, and performing word coding and position coding on each word in the sentences to be extracted to obtain word embedded vectors and position embedded vectors corresponding to the sentences to be extracted;
adding the word embedding vector and the position embedding vector to obtain a first input vector, inputting the first input vector into a pre-trained encoder, and outputting a contextualized expression vector of the sentence to be extracted through the encoder;
inputting the contextualized expression vector into a pre-trained multi-label event type classifier, determining an event type embedded vector corresponding to the statement to be extracted through the multi-label event type classifier, and acquiring an event type comprehensive vector corresponding to the statement to be extracted according to the event type embedded vector;
adding the contextualized expression vector and the event type comprehensive vector to obtain a second input vector, inputting the second input vector into a pre-trained event argument classifier, and obtaining event arguments corresponding to the statement to be extracted through the event argument classifier;
and constructing argument combinations according to the event arguments, performing event classification on the argument combinations, and determining target event types corresponding to the argument combinations, wherein the target event type corresponding to one argument combination is either a non-event or an event type corresponding to the statement to be extracted.
2. The event extraction method according to claim 1, wherein the statement to be extracted corresponds to a plurality of event types, the dimension of the event type comprehensive vector is the same as the dimension of the contextualized expression vector, and the inputting of the contextualized expression vector into a pre-trained multi-label event type classifier, determining an event type embedding vector corresponding to the statement to be extracted through the multi-label event type classifier, and obtaining an event type comprehensive vector corresponding to the statement to be extracted according to the event type embedding vector comprises:
inputting the contextualized expression vector into a pre-trained multi-label event type classifier, and determining an event type embedding vector corresponding to the statement to be extracted through the multi-label event type classifier;
and acquiring an event type comprehensive vector corresponding to the statement to be extracted according to the event type embedded vector.
3. The event extraction method according to claim 2, wherein the obtaining of the event type comprehensive vector corresponding to the statement to be extracted according to the event type embedding vector comprises:
acquiring event probability corresponding to each event type, wherein the event probability is determined when the event type embedding vector corresponding to the statement to be extracted is determined through the multi-label event type classifier;
taking an event type with an event probability larger than a preset probability threshold value as a to-be-processed event type, and taking an event type embedding vector corresponding to the to-be-processed event type as a to-be-processed embedding vector;
and performing weighted summation on the embedding vectors to be processed corresponding to the event types to be processed to obtain the event type comprehensive vector, wherein the weighting coefficients corresponding to the embedding vectors to be processed are equal, or the event probability corresponding to the event types to be processed is taken as the weighting coefficient of the embedding vector to be processed corresponding to the event types to be processed.
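The two weighting schemes of claim 3 can be illustrated with a small numerical sketch; the probabilities, embedding vectors and threshold below are hypothetical placeholders, not values from the patent.

```python
import numpy as np

# Hypothetical per-type event probabilities and embedding vectors (dimension d = 4)
type_probs = np.array([0.9, 0.3, 0.7])
type_emb = np.array([[1., 0., 0., 0.],
                     [0., 1., 0., 0.],
                     [0., 0., 1., 0.]])
threshold = 0.5                                   # preset probability threshold (illustrative)

keep = type_probs > threshold                     # event types to be processed

# Option A: equal weighting coefficients (here, coefficient 1 for each kept type)
comp_equal = type_emb[keep].sum(axis=0)

# Option B: the event probability serves as the weighting coefficient
comp_prob = (type_probs[keep][:, None] * type_emb[keep]).sum(axis=0)
```

With these placeholder values only the first and third event types pass the threshold, so both options sum exactly two embedding vectors, with different coefficients.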
4. The event extraction method according to claim 2, wherein the obtaining of the event type comprehensive vector corresponding to the statement to be extracted according to the event type embedding vector comprises:
obtaining a weight matrix, a projection matrix and an event type embedded matrix, wherein the event type embedded matrix is obtained according to the event type embedding vectors, the weight matrix is a matrix with m rows and 1 column, the projection matrix and the event type embedded matrix are both matrices with m rows and d columns, m is the number of event types corresponding to the statement to be extracted, and d is the dimension of the event type embedding vector;
calculating and obtaining an event type comprehensive vector corresponding to the statement to be extracted according to the weight matrix, the projection matrix, the event type embedded matrix and the contextualized expression vector;
each element in the weight matrix is an event probability corresponding to each event type, or each element in the weight matrix is 1.
5. The event extraction method according to claim 4, wherein the event type comprehensive vector is the product of the transpose of a target matrix and the event type embedded matrix, the target matrix is obtained as the Hadamard product of a target product matrix and the weight matrix, and the target product matrix is the product of the projection matrix and the contextualized expression vector.
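Under one reading of the matrix shapes given in claim 4, the computation of claim 5 can be sketched as follows. This is an illustrative interpretation with random placeholder values, not the claimed implementation itself.

```python
import numpy as np

# Hypothetical shapes following claim 4: m event types, embedding dimension d
m, d = 3, 4
rng = np.random.default_rng(1)
P = rng.normal(size=(m, d))          # projection matrix, m x d
E = rng.normal(size=(m, d))          # event type embedded matrix, m x d
w = np.array([[0.9], [0.1], [0.7]])  # weight matrix, m x 1 (event probabilities, or all ones)
h = rng.normal(size=(d, 1))          # contextualized expression vector, d x 1

target_product = P @ h               # m x 1: projection matrix applied to the expression vector
target = target_product * w          # Hadamard product with the weight matrix -> target matrix
comp = (target.T @ E).ravel()        # transpose times the embedded matrix -> comprehensive vector
```

Dimensionally, the result is a d-dimensional vector: each event type's embedding is scaled by its weight times its projected score and the scaled embeddings are summed, which is consistent with the weighted summation of claim 3.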
6. The event extraction method according to any one of claims 1 to 5, wherein the constructing of argument combinations according to the event arguments, the event classification of each argument combination, and the determination of the target event type corresponding to each argument combination, comprises:
acquiring word attributes corresponding to the event arguments, and combining the event arguments according to the word attributes corresponding to the event arguments to obtain a plurality of argument combinations, wherein each argument combination comprises a plurality of event arguments, and the word attributes corresponding to the event arguments in one argument combination are different;
and performing event classification on each argument combination and determining a target event type corresponding to each argument combination.
7. The event extraction method according to any one of claims 1 to 5, wherein the performing event classification on each argument combination and determining a target event type corresponding to each argument combination comprises:
inputting each argument combination into a pre-trained argument combination event type classifier, performing event classification on each argument combination through the argument combination event type classifier, and determining a target event type corresponding to each argument combination.
8. An event extraction device, characterized in that the event extraction device comprises:
the sentence processing module is used for acquiring sentences to be extracted, and performing word coding and position coding on each word in the sentences to be extracted to obtain word embedded vectors and position embedded vectors corresponding to the sentences to be extracted;
the embedded vector processing module is used for adding the word embedded vector and the position embedded vector to obtain a first input vector, inputting the first input vector into a pre-trained encoder, and outputting a contextualized expression vector of the statement to be extracted through the encoder;
the event type determining module is used for inputting the contextualized expression vector into a pre-trained multi-label event type classifier, determining an event type embedded vector corresponding to the statement to be extracted through the multi-label event type classifier, and acquiring an event type comprehensive vector corresponding to the statement to be extracted according to the event type embedded vector;
an event argument extraction module, configured to add the contextualized expression vector and the event type comprehensive vector to obtain a second input vector, input the second input vector to a pre-trained event argument classifier, and obtain, by using the event argument classifier, an event argument corresponding to the statement to be extracted;
the event argument processing module is used for constructing argument combinations according to the event arguments, performing event classification on each argument combination, and determining a target event type corresponding to each argument combination, wherein the target event type corresponding to one argument combination is either a non-event or an event type corresponding to the statement to be extracted.
9. An intelligent terminal, characterized in that the intelligent terminal comprises a memory, a processor and an event extraction program stored on the memory and operable on the processor, the event extraction program, when executed by the processor, implementing the steps of the event extraction method according to any one of claims 1-7.
10. A computer-readable storage medium, in which an event extraction program is stored, which when executed by a processor implements the steps of the event extraction method according to any one of claims 1 to 7.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210661693.3A CN114757189B (en) | 2022-06-13 | 2022-06-13 | Event extraction method and device, intelligent terminal and storage medium |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210661693.3A CN114757189B (en) | 2022-06-13 | 2022-06-13 | Event extraction method and device, intelligent terminal and storage medium |
Publications (2)
Publication Number | Publication Date |
---|---|
CN114757189A true CN114757189A (en) | 2022-07-15 |
CN114757189B CN114757189B (en) | 2022-10-18 |
Family
ID=82337169
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202210661693.3A Active CN114757189B (en) | 2022-06-13 | 2022-06-13 | Event extraction method and device, intelligent terminal and storage medium |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN114757189B (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN115186820A (en) * | 2022-09-07 | 2022-10-14 | 粤港澳大湾区数字经济研究院(福田) | Event coreference resolution method, device, terminal and computer readable storage medium |
Citations (11)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN107122416A (en) * | 2017-03-31 | 2017-09-01 | 北京大学 | A kind of Chinese event abstracting method |
CN110704598A (en) * | 2019-09-29 | 2020-01-17 | 北京明略软件系统有限公司 | Statement information extraction method, extraction device and readable storage medium |
CN111222305A (en) * | 2019-12-17 | 2020-06-02 | 共道网络科技有限公司 | Information structuring method and device |
CN111475617A (en) * | 2020-03-30 | 2020-07-31 | 招商局金融科技有限公司 | Event body extraction method and device and storage medium |
CN111753522A (en) * | 2020-06-29 | 2020-10-09 | 深圳壹账通智能科技有限公司 | Event extraction method, device, equipment and computer readable storage medium |
CN112163416A (en) * | 2020-10-09 | 2021-01-01 | 北京理工大学 | Event joint extraction method for merging syntactic and entity relation graph convolution network |
CN112597366A (en) * | 2020-11-25 | 2021-04-02 | 中国电子科技网络信息安全有限公司 | Encoder-Decoder-based event extraction method |
CN113312464A (en) * | 2021-05-28 | 2021-08-27 | 北京航空航天大学 | Event extraction method based on conversation state tracking technology |
CN114385793A (en) * | 2022-03-23 | 2022-04-22 | 粤港澳大湾区数字经济研究院(福田) | Event extraction method and related device |
CN114490953A (en) * | 2022-04-18 | 2022-05-13 | 北京北大软件工程股份有限公司 | Training event extraction model, event extraction method and target event extraction model |
CN114519344A (en) * | 2022-01-25 | 2022-05-20 | 浙江大学 | Discourse element sub-graph prompt generation and guide-based discourse-level multi-event extraction method |
Non-Patent Citations (3)
Title |
---|
MENGYUAN ZHOU et al.: "Sequential Attention Module for Natural Language Processing", https://arxiv.org/abs/2109.03009 |
SHUN ZHENG et al.: "Doc2EDAG: An End-to-End Document-level Framework for Chinese Financial Event Extraction", https://arxiv.org/abs/1904.07535 |
LIU Yiting: "Research on Domain-Specific Event Extraction", China Master's Theses Full-text Database, Information Science and Technology Series |
Also Published As
Publication number | Publication date |
---|---|
CN114757189B (en) | 2022-10-18 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN110969020B (en) | CNN and attention mechanism-based Chinese named entity identification method, system and medium | |
US20220382553A1 (en) | Fine-grained image recognition method and apparatus using graph structure represented high-order relation discovery | |
CN111325243B (en) | Visual relationship detection method based on regional attention learning mechanism | |
CN113407660B (en) | Unstructured text event extraction method | |
EP4152211A1 (en) | Neural network model training method, image classification method, text translation method and apparatus, and device | |
CN112733866A (en) | Network construction method for improving text description correctness of controllable image | |
CN112784066B (en) | Knowledge graph-based information feedback method, device, terminal and storage medium | |
CN111291165A (en) | Method and device for embedding training word vector into model | |
CN113378938B (en) | Edge transform graph neural network-based small sample image classification method and system | |
CN114490953B (en) | Method for training event extraction model, method, device and medium for extracting event | |
CN110245353B (en) | Natural language expression method, device, equipment and storage medium | |
CN113282714B (en) | Event detection method based on differential word vector representation | |
CN111966811A (en) | Intention recognition and slot filling method and device, readable storage medium and terminal equipment | |
CN114612767A (en) | Scene graph-based image understanding and expressing method, system and storage medium | |
CN114048290A (en) | Text classification method and device | |
CN115168592B (en) | Statement emotion analysis method, device and equipment based on aspect categories | |
CN114757189B (en) | Event extraction method and device, intelligent terminal and storage medium | |
CN111611796A (en) | Hypernym determination method and device for hyponym, electronic device and storage medium | |
CN113806747B (en) | Trojan horse picture detection method and system and computer readable storage medium | |
CN115588193A (en) | Visual question-answering method and device based on graph attention neural network and visual relation | |
CN114925702A (en) | Text similarity recognition method and device, electronic equipment and storage medium | |
CN114880427A (en) | Model based on multi-level attention mechanism, event argument extraction method and system | |
Lhasiw et al. | A bidirectional LSTM model for classifying Chatbot messages | |
CN114494777A (en) | Hyperspectral image classification method and system based on 3D CutMix-transform | |
CN113989566A (en) | Image classification method and device, computer equipment and storage medium |
Legal Events
Date | Code | Title | Description
---|---|---|---
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||