CN114780720A - Text entity relation classification method based on small sample learning - Google Patents
Text entity relation classification method based on small sample learning

- Publication number: CN114780720A
- Application number: CN202210318340.3A
- Authority: CN (China)
- Prior art keywords: distance, vector, small sample, relation, prototype
- Legal status: Pending (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed)
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/35—Clustering; Classification
- G06F16/355—Class or cluster creation or modification
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/33—Querying
- G06F16/3331—Query processing
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
- G06F18/241—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
- G06F18/2415—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on parametric or probabilistic models, e.g. based on likelihood ratio or false acceptance rate versus a false rejection rate
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/20—Natural language analysis
- G06F40/205—Parsing
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
Abstract
The invention discloses a text entity relation classification method based on small sample (few-shot) learning, which comprises the following steps: 1) extracting semantic features of instance vectors in the data set, using a convolutional neural network as the instance encoder; 2) in the small sample learning scenario, designing a prototype-level attention mechanism module that gives each sample a weight and represents the prototype of each relation as a weighted sum; 3) in the small sample learning scenario, replacing the measurement function: a distance-level attention mechanism module extracts feature coefficients from the support set vectors by convolution operations, and the product of the Manhattan distance formula and the feature coefficients serves as the new measurement function to calculate the distance between each relation prototype in the support set and a query instance; 4) realizing small sample relation classification using a softmax function.
Description
Technical Field
The invention relates to a text entity relation classification method based on small sample learning, and belongs to the technical field of text data identification.
Background
Relation classification, as one of the important subtasks of knowledge extraction, is receiving more and more attention. For unstructured text data, the task of relation classification is to extract the semantic relation between two or more entities from the text. For the current relation classification problem, most mature techniques achieve excellent experimental results by improving traditional neural network models (such as recurrent neural networks, convolutional neural networks, and the like). However, the data sets selected in these experiments consist of simple short sentences whose categories are predefined, and the samples of each relation are distributed relatively uniformly. In practical applications, the data size is small and the distribution of samples is uneven.
The advent of distant supervision methods provided a solution for small-scale data sets: large-scale training data is obtained by aligning plain text with existing knowledge graphs. The basic assumption of distant supervision is that if two entities have some relation in the knowledge graph, then any sentence containing both entities expresses this relation. But this assumption is too strong, so distant supervision data sets contain a large number of falsely labeled samples. Meanwhile, the long-tail distribution of relations and entity pairs still exists in real scenarios, where few usable samples are available.
In fact, humans can learn knowledge quickly from few samples, drawing inferences about new cases from a single instance, and this ability is also desirable for deep learning. Researchers therefore proposed the small sample (few-shot) learning task, which effectively addresses scarce and unevenly distributed data by combining a small sample learning mechanism with relation classification. The text entity relation classification method based on small sample learning presented here introduces two attention mechanism modules on top of a small sample learning network framework to realize relation classification. Currently, for each relation present in the support set, small sample relation classification usually obtains the relation prototype by averaging the instance vectors. Since the amount of data is small in the small sample learning scenario, when one instance lies far from the others in the mapping space, the averaged prototype may deviate greatly; once a large amount of noisy data exists, the final relation classification effect is strongly affected. Meanwhile, for a relation feature vector, only some of the dimensions clearly discriminate the final classification result, and once the instance vectors extracted from the support set suffer from feature sparsity, the final classification result deviates further. Therefore, for these two problems of relation classification in the small sample learning scenario, how to improve the prototype representation of relation instances, how to solve the feature sparsity of relation feature vectors, and how to meet the representation requirements under uneven sample distribution are the important problems to be solved.
Disclosure of Invention
The invention aims to: aiming at the problems and defects in the prior art, the invention provides a text entity relation classification method based on small sample learning, where small sample learning is a training approach proposed specifically for scenarios in which data is scarce and cannot meet the requirements of model training. For the problem that the prototype vector representation carries errors when the data set is scarce, a prototype-level attention mechanism module is used; for the problem that the measurement function cannot highlight the important dimensions of a vector, a distance-level attention mechanism module is used and the original distance formula is replaced.
The technical scheme is as follows: a text entity relation classification method based on small sample learning comprises the following steps:
Step 1: adopt a CNN network as the instance encoder to encode the support set sentences and query set sentences in the given data set, converting them into low-dimensional instance vectors so as to obtain the entity pair features of the extracted corpus.
Step 2: in the small sample learning scenario, a relation prototype is conventionally obtained for each relation in the support set by directly averaging the instance vectors; here, a prototype-level attention mechanism module instead gives a weight to each instance, thereby obtaining a weighted prototype vector.
Step 3: splice the support set instances encoded in step 1 into a vector matrix, and extract semantic features of the important dimensions in the support set instances through a distance-level attention mechanism module, thereby obtaining the distance-level attention weight β.
Step 4: adopt the Manhattan distance formula as the new base formula and multiply the distance-level attention weight β obtained in step 3 with it, giving a new distance formula that serves as the measurement function. Using this formula, the distances between the query instances in the query set and the prototypes obtained in step 2 can be measured.
Step 5: compare the distances between the query instance and the prototypes according to the distance formula obtained in step 4, and carry out relation classification using the softmax function.
In step 1, using a CNN network as the instance encoder to encode the support set sentences and query set sentences in a given data set includes the following steps:
1-1. Convert the corpus in the input data set into low-dimensional word vectors by means of Glove word embedding and entity position embedding. Glove word embedding (WF) converts the input corpus into a co-occurrence matrix, represented with dimension $d_w$; the sentence obtained after Glove word embedding is represented as a vector list $(x_0, x_1, x_2, \ldots, x_i)$, where $x_i$ denotes the $i$-th word embedding. Entity position embedding (PF) computes, for each word embedding vector $x_m$ ($m \in [0, i]$) in the list, the distances to the head and tail entities in the sentence, represented with dimension $2 \times d_p$. Finally, Glove word embedding and entity position embedding are concatenated, expressed as $\{e_1,\ldots,e_n\}=\{[WF_1;PF_1],\ldots,[WF_n;PF_n]\}$, thus forming the embedding vector sequence of the sentence.
1-2. Further process the sentence embedding vector sequence obtained in step 1-1 with the CNN network to extract its semantic features; the network consists of a convolutional layer and a max-pooling layer. The convolutional layer extracts features from the sentence embedding vector sequence using a convolution sliding window of length m, and the resulting feature vector sequence is processed by a ReLU activation function to obtain the hidden sentence embedding; the max-pooling layer then processes the hidden sentence embedding produced by the convolutional layer. Finally, the instance vector of the whole sentence is obtained.
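For illustration only, the following is a minimal PyTorch sketch of such an instance encoder; the class name, the hidden size $d_h = 230$, and the window length m = 3 are assumptions of the sketch (the embodiment below fixes only $d_w = 50$ and $d_p = 5$), and relative distances are assumed pre-shifted to non-negative indices:

```python
import torch
import torch.nn as nn

class InstanceEncoder(nn.Module):
    """Sketch of step 1: word + position embeddings -> convolution -> ReLU -> max-pooling."""
    def __init__(self, vocab_size, max_len, d_w=50, d_p=5, d_h=230, m=3):
        super().__init__()
        self.word_emb = nn.Embedding(vocab_size, d_w)   # Glove-initialised in practice
        self.pos_head = nn.Embedding(2 * max_len, d_p)  # embeds distance to the head entity
        self.pos_tail = nn.Embedding(2 * max_len, d_p)  # embeds distance to the tail entity
        self.conv = nn.Conv1d(d_w + 2 * d_p, d_h, kernel_size=m, padding=m // 2)

    def forward(self, tokens, dist_head, dist_tail):
        # all inputs: (batch, seq_len) integer indices; distances already shifted >= 0
        e = torch.cat([self.word_emb(tokens),
                       self.pos_head(dist_head),
                       self.pos_tail(dist_tail)], dim=-1)  # {e_1..e_n} = [WF; PF]
        h = torch.relu(self.conv(e.transpose(1, 2)))       # hidden embedding, (batch, d_h, seq_len)
        return h.max(dim=2).values                         # max-pool over positions -> (batch, d_h)
```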
The step 2 of giving a weight to each instance by using the prototype-level attention mechanism module to obtain a weighted prototype vector includes the following steps:
2-1. Using a Gaussian function as the activation function, find the weight $\gamma_{ij}$ assigned to each sample instance, and finally take the weighted sum over the instances of each relation with these weights to obtain the prototype-level relation vector representation $c_i$ of the whole relation, respectively expressed as:

$$\gamma_{ij}=\frac{\exp\!\left(-\lVert x_{ij}-q\rVert^{2}/2\sigma_i^{2}\right)}{\sum_{k=1}^{K}\exp\!\left(-\lVert x_{ik}-q\rVert^{2}/2\sigma_i^{2}\right)},\qquad c_i=\sum_{j=1}^{K}\gamma_{ij}\,x_{ij}$$

wherein $q$ denotes a query set instance, $M$ denotes the relation types present in the data set, $K$ denotes the number of instances under each relation type, $x_{ij}$ denotes the $j$-th support set instance under relation $i$, and $\sigma_i$ denotes the parameter of the Gaussian function.
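Continuing the sketch above, a minimal realisation of this prototype-level attention under the Gaussian-kernel formula reconstructed here; normalising the Gaussian weights with a softmax is an assumption of the sketch:

```python
def prototype_attention(support, query, sigma):
    # support: (K, d_h) encoded instances x_ij of relation i; query: (d_h,); sigma: sigma_i
    sq_dist = ((support - query) ** 2).sum(dim=1)              # ||x_ij - q||^2, shape (K,)
    gamma = torch.softmax(-sq_dist / (2 * sigma ** 2), dim=0)  # Gaussian weights gamma_ij
    return (gamma.unsqueeze(1) * support).sum(dim=0)           # prototype c_i, shape (d_h,)
```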
The step 3 of extracting semantic features of important dimensions in the support set instance through the distance level attention mechanism module, thereby obtaining the weight β of the distance level attention mechanism, includes the following steps:
3-1. Divide the support set instance sentences according to relation, process the K support set instances $[x_{i1}, x_{i2}, \ldots, x_{iK}]$ of each relation $r_i$ with the instance encoder of step 1, and splice them into a $K \times d_h \times 1$ vector matrix, where K denotes the number of instances per relation class and $d_h$ denotes the hidden-layer dimension.
3-2. The vector matrix passes through a module formed by interleaving three convolutional layers with three ReLU function layers, which extracts the semantic features of the non-zero dimensions in the support set instances, the dimension becoming $1 \times d_h \times 1$, so as to obtain the distance-level attention weight β. The more useful a feature dimension is, the higher its corresponding β value. Since some dimension values of the instance vectors are 0, the important non-zero dimensions need to be highlighted so that they exert their effect in relation classification.
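A sketch of one way to realise this module with the 1 → 32 → 64 → 1 channel progression described in the embodiment (step 3-1 of the detailed description); the kernel and stride choices are assumptions, and K is assumed odd (e.g. the usual K = 5) so the padded heights line up:

```python
class DistanceAttention(nn.Module):
    """Sketch of step 3: three conv+ReLU stages map the (K, d_h) support matrix
    of one relation to a per-dimension weight vector beta of shape (d_h,)."""
    def __init__(self, K):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(1, 32, (K, 1), padding=(K // 2, 0)), nn.ReLU(),   # K x d_h x 32 stage
            nn.Conv2d(32, 64, (K, 1), padding=(K // 2, 0)), nn.ReLU(),  # K x d_h x 64 stage
            nn.Conv2d(64, 1, (K, 1), stride=(K, 1)), nn.ReLU(),         # 1 x d_h x 1 stage
        )

    def forward(self, support):                # support: (K, d_h)
        x = support.unsqueeze(0).unsqueeze(0)  # (batch=1, channel=1, K, d_h)
        return self.net(x).view(-1)            # beta: (d_h,)
```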
Establishing the measurement formula in step 4 comprises the following steps:
4-1. The Manhattan distance formula is selected as the distance formula: compared with the Euclidean distance it reduces numerical error in the calculation, greatly improves the operation speed, and remains usable for distance measurement on high-dimensional data, which helps ensure a good classification effect.
4-2. Multiply the distance-level attention mechanism weight β obtained in step 3-2 with the distance formula selected in step 4-1 to obtain the distance function $d(x, y)$ used as the new measurement function:

$$d(x,y)=\sum_{i=1}^{n}\beta_i\,\lvert x_i-y_i\rvert$$

wherein $n$ denotes the dimension, and $x_i$ and $y_i$ respectively denote the values of $x$ and $y$ in dimension $i$.
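A one-line sketch of this metric; treating β as a per-dimension weight vector multiplied into each $\lvert x_i - y_i\rvert$ term is an assumption consistent with β's $1 \times d_h \times 1$ shape:

```python
def weighted_manhattan(beta, x, y):
    # d(x, y) = sum_i beta_i * |x_i - y_i|; beta, x, y all have shape (d_h,)
    return (beta * (x - y).abs()).sum()
```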
The relation classification by using the softmax function established in the step 5 comprises the following steps:
5-1. According to the prototype-level relation vector representation $c_i$ obtained in step 2-1 and the distance function $d(x, y)$, compute the distance $d(x, c_i)$ between an instance in the query set and $c_i$, where $x$ denotes the vector obtained after the query set instance passes through the instance encoding layer.
5-2. For each instance in the query set, determine which relation in the relation set R it specifically belongs to, concretely expressed as:

$$p(y=r_i\mid x)=\frac{\exp\!\left(-d(x,c_i)\right)}{\sum_{j=1}^{M}\exp\!\left(-d(x,c_j)\right)}$$

wherein the conditional probability $p(y=r_i\mid x)$ is the probability of query set instance $x$ under relation $r_i$.
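A sketch of this classification step, feeding negative distances to the softmax so that a smaller distance gives a higher probability (the standard prototypical-network convention, assumed here):

```python
def classify(query, prototypes, beta):
    # prototypes: list of M prototype vectors c_i, each of shape (d_h,)
    d = torch.stack([weighted_manhattan(beta, query, c) for c in prototypes])
    probs = torch.softmax(-d, dim=0)  # p(y = r_i | x)
    return int(probs.argmax()), probs
```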
A computer device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, wherein the processor implements the text entity relationship classification method based on small sample learning as described above when executing the computer program.
A computer readable storage medium storing a computer program for executing the text entity relationship classification method based on small sample learning as described above.
Compared with the prior art, the invention has the advantages that:
(1) The relation classification method adopts a prototype-level attention mechanism to obtain the prototypes, which eliminates the influence of extreme data on the representation of the whole relation prototype. A Gaussian function is adopted as the activation function: unlike common activation functions, it is better at handling data with small differences, and the shape of the Gaussian curve matches the long-tail distribution of small sample data, making it more suitable for the small sample relation classification task;
(2) The relation classification method adopts a distance-level attention mechanism to obtain the feature coefficients, which emphasizes the important dimensions of the relation vector so that they take effect in the final relation classification, alleviating the feature sparsity problem;
(3) The relation classification method adopts the Manhattan distance formula as the measurement formula. Unlike the conventional Euclidean distance formula, it mitigates the failure of Euclidean distance on high-dimensional data, effectively reduces numerical error in the calculation, and greatly improves the operation speed.
Drawings
FIG. 1 is a general framework diagram of a method of an embodiment of the invention;
FIG. 2 is a flow diagram of a prototype level attention mechanism module of an embodiment of the present invention;
FIG. 3 is a diagram of a distance level attention mechanism module framework according to an embodiment of the present invention.
Detailed Description
The present invention is further illustrated by the following examples, which are intended to be purely exemplary and are not intended to limit the scope of the invention, as various equivalent modifications of the invention will occur to those skilled in the art upon reading the present disclosure and fall within the scope of the appended claims.
As shown in fig. 1, the text entity relationship classification method based on small sample learning includes the following steps:
1-1. Select the FewRel data set, which is widely used in the field of small sample relation classification;
1-2. Use the predefined relation-type sets of the FewRel data set. The following are some of the entity relation types set in the training set of this embodiment, where the specified entity relation types all follow the types specified in the original data set:
| Number | Entity relation type | Number | Entity relation type |
|---|---|---|---|
| 1 | P931 | 6 | P6 |
| 2 | P4552 | 7 | |
| 3 | P140 | 8 | P449 |
| 4 | P1923 | 9 | P1435 |
| 5 | P150 | 10 | P175 |
1-3. Convert the input corpus into low-dimensional word vectors by Glove word embedding together with entity position embedding. Glove word embedding converts the input corpus into a co-occurrence matrix, represented with dimension $d_w$; the sentence obtained after Glove word embedding is represented as a vector list $(x_0, x_1, x_2, \ldots, x_i)$, where $x_i$ denotes the $i$-th word embedding. Entity position embedding computes the distance between each word embedding vector and the head and tail entities in the sentence, represented with dimension $2 \times d_p$. Finally, combining Glove word embedding with entity position embedding yields the final embedded representation of each word.
For example, for the input sentence "The name of East Midlands Airport at one point changed to Nottingham", the head entity is "East Midlands Airport" and the tail entity is "Nottingham". For the word "one", the relative distance to the head entity is 5 and the relative distance to the tail entity is -4, so the position embedding of "one" is represented as [5, -4]. After the whole sentence passes through the embedding layer, the word embedding of each word is denoted WF with dimension set to 50, the position embedding is denoted PF with dimension set to 5, and the embedding vector sequence of the whole sentence is finally represented as $\{e_1,\ldots,e_n\}=\{[WF_1;PF_1],\ldots,[WF_n;PF_n]\}$.
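A small sketch of computing such relative positions; measuring from the nearest boundary of each entity span is an assumption, and the exact sign/anchor convention that reproduces the [5, -4] of this example may differ:

```python
def relative_positions(n_tokens, head_span, tail_span):
    """Relative distance of every token to the head and tail entity spans.
    Spans are (start, end) token indices, inclusive; inside a span the distance is 0."""
    def dist(i, start, end):
        return i - start if i < start else (i - end if i > end else 0)
    return [(dist(i, *head_span), dist(i, *tail_span)) for i in range(n_tokens)]
```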
1-4. The CNN network further processes the sentence embedding vector sequence to extract semantic features; it can be divided into a convolutional layer and a max-pooling layer. The convolutional layer extracts features using a convolution sliding window of length m and passes the resulting vector sequence through a ReLU activation function; the max-pooling layer processes the hidden sentence embedding obtained from the convolutional layer. Finally, the instance vector of the whole sentence is obtained.
For example, for the sentence embedding vector sequence $\{e_1,\ldots,e_n\}$ obtained in 1-3, set the sliding-window length to m and extract the semantic features within each window by a convolution operation with kernel W and bias b, expressed as:

$$h_j = \mathrm{ReLU}\!\left(W \cdot e_{j:j+m-1} + b\right)$$

The hidden sentence embedding $[h_1, h_2, \ldots, h_n]$ obtained from the convolutional layer is then processed by the max-pooling operation as follows:

$$[x]_i = \max\left\{[h_1]_i, \ldots, [h_n]_i\right\}$$
Step 2: obtain each relation prototype by utilizing the prototype-level attention mechanism module, which specifically comprises the following steps:
2-1. Introduce the prototype-level attention mechanism to obtain the prototype of each relation; the specific calculation flow is shown in FIG. 2. First, a judgment process determines whether an input sentence belongs to the support set. Then the separated query set instances are combined with all support set instances under a relation, and a Gaussian function is used as the activation function to obtain the weight $\gamma_{ij}$ given to each support set instance. Finally, the weighted sum over the instances of each relation gives the prototype representation $c_i$ of the whole relation, respectively expressed as:

$$\gamma_{ij}=\frac{\exp\!\left(-\lVert x_{ij}-q\rVert^{2}/2\sigma_i^{2}\right)}{\sum_{k=1}^{K}\exp\!\left(-\lVert x_{ik}-q\rVert^{2}/2\sigma_i^{2}\right)},\qquad c_i=\sum_{j=1}^{K}\gamma_{ij}\,x_{ij}$$
for example, in this embodiment, an original network is selected as a frame, and a relation original of a support set vector in the input frame can be obtained by a formula in 2-1.
3-1. The support set instances are partitioned by relation. As shown in the framework of FIG. 3, the K support set instances $[x_{i1}, x_{i2}, \ldots, x_{iK}]$ of each relation $r_i$ are first spliced into a $K \times d_h \times 1$ vector matrix, and features are extracted through a convolution module composed of three convolutional layers each combined with a ReLU layer, finally obtaining the weight coefficient β of the vector matrix. The dimension of the input tensor changes successively to $K \times d_h \times 32$, $K \times d_h \times 64$, and $1 \times d_h \times 1$.
3-2. Multiply the weight coefficient obtained in step 3-1 with the Manhattan distance formula to obtain the final measurement formula, expressed as:

$$d(x,y)=\sum_{i=1}^{n}\beta_i\,\lvert x_i-y_i\rvert$$
Step 4: for the query instances and the prototypes obtained by the prototype-level attention mechanism module in step 2, measure the distance between each query instance and each relation prototype using the distance formula obtained in step 3, and finally complete the relation classification of the query instances by applying the softmax function.
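Tying the sketches above together, a toy 2-way 5-shot episode on random stand-in inputs; the sizes, σ = 1.0, and the use of a separate β per relation are illustrative assumptions:

```python
torch.manual_seed(0)
N, K, seq_len = 2, 5, 16
encoder = InstanceEncoder(vocab_size=100, max_len=seq_len)
dist_attn = DistanceAttention(K=K)

def fake_inputs(batch):
    # random token ids and pre-shifted position indices standing in for real sentences
    tokens = torch.randint(0, 100, (batch, seq_len))
    pos = torch.randint(0, 2 * seq_len, (batch, seq_len))
    return tokens, pos, pos.clone()

support_vecs = [encoder(*fake_inputs(K)) for _ in range(N)]  # N tensors of shape (K, d_h)
q = encoder(*fake_inputs(1))[0]                              # query instance vector, (d_h,)

dists = torch.stack([
    weighted_manhattan(dist_attn(S), q, prototype_attention(S, q, sigma=1.0))
    for S in support_vecs
])
probs = torch.softmax(-dists, dim=0)
print("predicted relation index:", int(probs.argmax()))
```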
For example, for the example sentence in 1-3, after comparing it with all relation prototypes of the support set in the prototypical network framework, the value computed for relation P931 is found to be the largest, so the relation between the entity pair <East Midlands Airport, Nottingham> in the example sentence of 1-3 is classified as P931.
According to the embodiment, the invention realizes a text entity relation classification method based on small sample learning, with the small sample relation classification scenario set as required. The invention adopts a dual attention mechanism to improve the small sample relation classification task. The prototype-level attention mechanism module represents a relation prototype by giving different weights to relation instances, eliminating the influence of individual extreme instances on the prototype representation; the distance-level attention mechanism highlights the dimensions in the feature space that have a larger influence on relation classification, and introduces the Manhattan formula as a new distance function to realize higher-performance relation classification.
It will be apparent to those skilled in the art that the steps of the text entity relationship classification method based on small sample learning according to the embodiment of the present invention described above can be implemented by a general-purpose computing device, they can be centralized on a single computing device or distributed on a network formed by a plurality of computing devices, and they can alternatively be implemented by program code executable by a computing device, so that they can be stored in a storage device and executed by a computing device, and in some cases, the steps shown or described can be executed in a sequence different from that described herein, or they can be separately manufactured into various integrated circuit modules, or a plurality of modules or steps in them can be manufactured into a single integrated circuit module. Thus, embodiments of the invention are not limited to any specific combination of hardware and software.
Claims (8)
1. A text entity relation classification method based on small sample learning, characterized by comprising the following steps:
step 1: adopting a CNN network as an instance encoder to encode support set sentences and query set sentences in a given data set, converting them into low-dimensional instance vectors, and obtaining the entity pair features of the extracted corpus;
step 2: in the small sample learning scenario, utilizing a prototype-level attention mechanism module to give a weight to each instance, thereby obtaining a weighted prototype vector;
step 3: splicing the support set instances obtained by the encoding in step 1 into a vector matrix, and extracting semantic features of important dimensions in the support set instances through a distance-level attention mechanism module, thereby obtaining a weight β of the distance-level attention mechanism;
step 4: multiplying the distance-level attention mechanism weight β obtained in step 3 by a Manhattan distance formula to obtain a new distance formula as a measurement function, and utilizing the measurement function to measure the distance between the query instances in the query set and the prototype vectors obtained in step 2;
step 5: comparing the distances between the query instance and the prototype vectors according to the distance formula obtained in step 4, and carrying out relation classification utilizing a softmax function.
2. The text entity relation classification method based on small sample learning according to claim 1, wherein in step 1, using a CNN network as an instance encoder to encode the support set sentences and query set sentences in a given data set comprises the following steps:
1-1. converting the corpus in the input data set into low-dimensional word vectors by means of Glove word embedding and entity position embedding; the sentence obtained after Glove word embedding is represented as a vector list $(x_0, x_1, x_2, \ldots, x_i)$, wherein $x_i$ denotes the $i$-th word embedding; entity position embedding computes, for each word embedding vector $x_m$ ($m \in [0, i]$) in the list, the distance to the head and tail entities in the sentence; Glove word embedding and entity position embedding are concatenated, expressed as $\{e_1,\ldots,e_n\}=\{[WF_1;PF_1],\ldots,[WF_n;PF_n]\}$, thus forming the embedding vector sequence of the sentence;
1-2. further processing the sentence embedding vector sequence obtained in step 1-1 with the CNN network to extract its semantic features, the network consisting of a convolutional layer and a max-pooling layer: the convolutional layer extracts features from the sentence embedding vector sequence using a convolution sliding window of length m and processes the resulting vector sequence with a ReLU activation function to obtain the hidden sentence embedding; the max-pooling layer processes the hidden sentence embedding obtained from the convolutional layer; finally, the instance vector of the whole sentence is obtained.
3. The text entity relation classification method based on small sample learning according to claim 1, wherein step 2, giving a weight to each instance by using the prototype-level attention mechanism module to obtain a weighted prototype vector, comprises the following steps:
2-1. using a Gaussian function as the activation function, finding the weight $\gamma_{ij}$ given to each sample instance, and finally taking the weighted sum over the instances of each relation with these weights to obtain the prototype-level relation vector representation $c_i$ of the whole relation, respectively expressed as:

$$\gamma_{ij}=\frac{\exp\!\left(-\lVert x_{ij}-q\rVert^{2}/2\sigma_i^{2}\right)}{\sum_{k=1}^{K}\exp\!\left(-\lVert x_{ik}-q\rVert^{2}/2\sigma_i^{2}\right)},\qquad c_i=\sum_{j=1}^{K}\gamma_{ij}\,x_{ij}$$

wherein $q$ denotes a query set instance, $M$ denotes the relation types present in the data set, $K$ denotes the number of instances under each relation type, $x_{ij}$ denotes the $j$-th support set instance under relation $i$, and $\sigma_i$ denotes the parameter of the Gaussian function.
4. The text entity relation classification method based on small sample learning according to claim 1, wherein step 3, extracting semantic features of important dimensions in the support set instances through the distance-level attention mechanism module so as to obtain the weight β of the distance-level attention mechanism, comprises the following steps:
3-1. dividing the instance sentences in the support set according to relation, processing the K support set instances $[x_{i1}, x_{i2}, \ldots, x_{iK}]$ of each relation $r_i$ with the instance encoder of step 1, and splicing them into a $K \times d_h \times 1$ vector matrix, wherein K denotes the number of instances per relation class and $d_h$ denotes the hidden-layer dimension;
3-2. passing the vector matrix through a module formed by interleaving three convolutional layers with three ReLU function layers to extract the semantic features of the non-zero dimensions in the support set instances, the dimension becoming $1 \times d_h \times 1$, so as to obtain the distance-level attention mechanism weight β.
5. The text entity relation classification method based on small sample learning according to claim 1, wherein establishing the measurement formula in step 4 comprises the following steps:
4-1. selecting the Manhattan distance formula as the distance formula;
4-2. multiplying the distance-level attention mechanism weight β by the distance formula selected in 4-1 to obtain the distance function $d(x, y)$ as the new measurement function, expressed as:

$$d(x,y)=\sum_{i=1}^{n}\beta_i\,\lvert x_i-y_i\rvert$$

wherein $n$ denotes the dimension, and $x_i$ and $y_i$ respectively denote the values of $x$ and $y$ in dimension $i$.
6. The text entity relation classification method based on small sample learning according to claim 1, wherein the relation classification using the softmax function in step 5 comprises the following steps:
5-1. according to the prototype-level relation vector representation $c_i$ and the distance function $d(x, y)$, computing the distance $d(x, c_i)$ between an instance in the query set and $c_i$, wherein $x$ denotes the vector obtained after the query set instance passes through the instance encoding layer;
5-2. judging, for each instance in the query set, which relation in the relation set R it specifically belongs to, concretely expressed as:

$$p(y=r_i\mid x)=\frac{\exp\!\left(-d(x,c_i)\right)}{\sum_{j=1}^{M}\exp\!\left(-d(x,c_j)\right)}$$
7. A computer device, characterized by: the computer device comprises a memory, a processor and a computer program stored on the memory and capable of running on the processor, wherein the processor implements the text entity relation classification method based on small sample learning according to any one of claims 1-6 when executing the computer program.
8. A computer-readable storage medium characterized by: the computer readable storage medium stores a computer program for executing the text entity relationship classification method based on small sample learning according to any one of claims 1 to 6.
Priority Applications (1)

| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN202210318340.3A | 2022-03-29 | 2022-03-29 | Text entity relation classification method based on small sample learning |

Applications Claiming Priority (1)

| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN202210318340.3A | 2022-03-29 | 2022-03-29 | Text entity relation classification method based on small sample learning |
Publications (1)

| Publication Number | Publication Date |
|---|---|
| CN114780720A | 2022-07-22 |
Family

ID=82424921

Family Applications (1)

| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| CN202210318340.3A (Pending; published as CN114780720A) | Text entity relation classification method based on small sample learning | 2022-03-29 | 2022-03-29 |

Country Status (1)

| Country | Link |
|---|---|
| CN | CN114780720A (en) |
Cited By (1)

| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN118114671A | 2024-03-01 | 2024-05-31 | 南京审计大学 | Gaussian function-based text data set small sample named entity recognition method and system |
Patent Citations (4)

| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US20210391080A1 | 2018-12-29 | 2021-12-16 | New H3C Big Data Technologies Co., Ltd. | Entity Semantic Relation Classification |
| CN112015902A | 2020-09-14 | 2020-12-01 | 中国人民解放军国防科技大学 | Least-order text classification method under metric-based meta-learning framework |
| CN113505225A | 2021-07-08 | 2021-10-15 | 东北大学 | Small sample medical relation classification method based on multilayer attention mechanism |
| CN114020907A | 2021-11-01 | 2022-02-08 | 深圳市中科明望通信软件有限公司 | Information extraction method and device, storage medium and electronic equipment |
Non-Patent Citations (2)

| Title |
|---|
| Xiaohan Zhao et al., "Improving Long-tail Relation Extraction with Knowledge-aware Hierarchical Attention," IEEE, 31 December 2021, pp. 1-4. |
| Hu Han et al., "A Survey of Few-Shot Relation Classification" (小样本关系分类研究综述), Journal of Chinese Information Processing (中文信息学报), vol. 36, no. 2, 28 February 2022, pp. 1-11. |
Legal Events

| Date | Code | Title |
|---|---|---|
| | PB01 | Publication |
| | SE01 | Entry into force of request for substantive examination |