
CN112836056B - Text classification method based on network feature fusion - Google Patents

Text classification method based on network feature fusion Download PDF

Info

Publication number
CN112836056B
CN112836056B (application CN202110266934.XA)
Authority
CN
China
Prior art keywords
features
text
network
representing
output
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202110266934.XA
Other languages
Chinese (zh)
Other versions
CN112836056A (en)
Inventor
覃晓
廖兆琪
元昌安
乔少杰
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Chengdu University of Information Technology
Nanning Normal University
Original Assignee
Chengdu University of Information Technology
Nanning Normal University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Chengdu University of Information Technology and Nanning Normal University
Priority to CN202110266934.XA
Publication of CN112836056A
Application granted
Publication of CN112836056B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F 16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F 16/35 Clustering; Classification
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/04 Architecture, e.g. interconnection topology
    • G06N 3/044 Recurrent networks, e.g. Hopfield networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/04 Architecture, e.g. interconnection topology
    • G06N 3/045 Combinations of networks

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Biophysics (AREA)
  • Evolutionary Computation (AREA)
  • Biomedical Technology (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Computational Linguistics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Databases & Information Systems (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a text classification method based on network feature fusion. To address the problems that a traditional convolutional neural network cannot attend to the contextual meaning of text, and that a traditional recurrent neural network suffers from short-term memory and vanishing gradients, a model fusing the Res2Net and BiLSTM networks is proposed, which effectively overcomes these shortcomings and classifies text more accurately. The method uses the multi-scale residual network Res2Net to extract the local features of the text and the bidirectional long short-term memory network BiLSTM to extract its contextual features, and adds a traditional machine learning method, the conditional random field (CRF), after the BiLSTM layer to predict the dependencies between labels, so that the text is classified correctly. Through this fusion, the method effectively improves the accuracy of text classification without greatly increasing the number of network parameters.

Description

Text classification method based on network feature fusion
Technical Field
The invention belongs to the technical field of deep learning and natural language processing, and particularly relates to the design of a text classification method based on network feature fusion.
Background
With the large-scale use of the internet in today's society, information resources on the network are growing at an exponential rate, and among the various forms of information, unstructured text is one of the most important information resources. How to obtain the most useful information from this massive amount of text is an urgent problem. Text classification, using an efficient and concise algorithm or model, helps people manage and organize complicated text information and quickly and accurately obtain the information they need. However, traditional machine learning text classification algorithms require a large amount of preprocessing, such as manually designed features, which increases complexity. Extracting text features with a deep learning model can markedly speed up text classification, avoids most manual preprocessing, and achieves a better classification effect than traditional text classification.
Among the many deep learning network models, the traditional convolutional neural network can handle high-dimensional, non-linear mapping problems: it can take preprocessed word vectors as input and perform sentence-level classification. However, the traditional convolutional neural network focuses on the local features of the input vector and ignores the contextual meaning of words, which affects the accuracy of text classification. From a context-aware point of view, this can be addressed with a recurrent neural network. The traditional recurrent neural network takes the previous output into account when computing the current output, forming a memory-like function for time-series problems; specifically, the network state at the previous moment is applied to the network state at the next moment. However, although this memory captures the context of the text, the traditional recurrent neural network only attends to the network state at the previous moment and involves a large number of derivative operations over the time series during solving, so it cannot memorize information over long sequences and suffers from vanishing gradients.
Disclosure of Invention
The invention aims to solve the problems that a traditional convolutional neural network cannot attend to the contextual meaning of a text and that a traditional recurrent neural network suffers from short-term memory and vanishing gradients. It provides a text classification method based on network feature fusion, adopting a model that fuses Res2Net (a multi-scale residual network) and BiLSTM (a bidirectional long short-term memory network), which effectively solves these problems and classifies text better.
The technical scheme of the invention is as follows: a text classification method based on network feature fusion comprises the following steps:
S1, preprocessing a text to be classified, and processing the preprocessed text data set into a word vector set through a word vector representation method;
S2, splicing the word vector set into a matrix, inputting the matrix into a Res2Net network for training, and outputting the local features of the text data set;
S3, inputting the word vector set into a BiLSTM network for training, and outputting the contextual features of the text data set;
S4, scoring the contextual features of the text data set with a CRF conditional random field scoring mechanism, and selecting the tag sequence set with the highest score as the optimal contextual feature sequence set of the text data set;
S5, splicing and fusing the local features and the optimal contextual features of the text data set to obtain fusion features;
S6, inputting the fusion features into a softmax classifier for classification to obtain the text classification result.
Further, the method for preprocessing the text to be classified in step S1 specifically includes: removing useless symbols, keeping only the Chinese characters in the text data set, and removing stop words.
Further, the Res2Net network in step S2 includes a first 1×1 convolutional layer, a 3×3 convolutional layer and a second 1×1 convolutional layer connected in sequence. Each convolutional layer includes a ReLU activation function, and a residual connection is added before the ReLU activation of the second 1×1 convolutional layer.
The number of channels of the first 1×1 convolutional layer is n. The feature map of the input matrix is equally divided into s groups of features according to the number of channels; if the number of channels of each group is w, then n = s×w, and each group after the division is denoted x_i, where i ∈ {1,2,...,s}.
In the 3×3 convolutional layer, except for the first group, each group of features x_i undergoes a corresponding convolution operation k_i(·), whose output is denoted y_i. Starting from the second group, before each convolution operation k_i(·), the output y_{i-1} of the previous group is residually connected with the current feature x_i and used as the input of k_i(·), until the last group of features.
The second 1×1 convolutional layer concatenates the outputs y_i of the 3×3 convolutional layer along the channel dimension, fuses the multi-scale features and outputs the local features of the text data set.
Further, the objective function of the Res2Net network in step S2 is:

$$y_i = \begin{cases} x_i, & i = 1 \\ k_i\left(x_i + y_{i-1}\right), & 1 < i \le s \end{cases}$$

wherein x_i represents the i-th group of equally divided features, k_i(·) represents the convolution operation applied to the i-th group of features, and y_i represents the output of the i-th group of features after the convolution operation.
Further, the basic expressions of the BiLSTM network in step S3 are:

$$\overrightarrow{h_t} = f\left(\overrightarrow{W} x_t + \overrightarrow{V}\,\overrightarrow{h_{t-1}} + \overrightarrow{b}\right)$$

$$\overleftarrow{h_t} = f\left(\overleftarrow{W} x_t + \overleftarrow{V}\,\overleftarrow{h_{t+1}} + \overleftarrow{b}\right)$$

$$y_t = g\left(U\left[\overrightarrow{h_t};\ \overleftarrow{h_t}\right] + c\right)$$

wherein $\overrightarrow{h_t}$ represents the forward LSTM hidden state of the current layer, $\overrightarrow{W}$ represents the forward LSTM input gate weight matrix, $\overrightarrow{V}$ represents the forward LSTM current input unit state weight matrix, $\overrightarrow{h_{t-1}}$ represents the previous forward LSTM hidden state, $\overrightarrow{b}$ represents the forward LSTM input unit bias term, $\overleftarrow{h_t}$ represents the backward LSTM hidden state of the current layer, $\overleftarrow{W}$ represents the backward LSTM input gate weight matrix, $\overleftarrow{V}$ represents the backward LSTM current input unit state weight matrix, $\overleftarrow{h_{t+1}}$ represents the next backward LSTM hidden state, $\overleftarrow{b}$ represents the backward LSTM input unit bias term, U represents the matrix that splices the forward and backward output units, c represents the total output unit bias term, $x_t$ represents the input value of the BiLSTM hidden layer, f(·) represents the activation function used when computing the BiLSTM hidden layer, g(·) represents the activation function used when computing the BiLSTM output layer, and $y_t$ represents the output value of the BiLSTM network.
Further, in step S4, the formula for scoring the contextual features of the text data set with the CRF conditional random field scoring mechanism is:

$$S(X, y) = \sum_{i=0}^{n} A_{tag_i,\, tag_{i+1}} + \sum_{i=1}^{n} P_{v_i,\, tag_i}$$

wherein S(X, y) represents the score of the output tag sequence y corresponding to the input word vector sequence X of the BiLSTM network, $A_{tag_i, tag_{i+1}}$ represents the probability of transferring from the i-th label $tag_i$ to the (i+1)-th label $tag_{i+1}$, and $P_{v_i, tag_i}$ represents the quantized probability of mapping the i-th word $v_i$ in the input word vector sequence X to the i-th label $tag_i$.

The scores of the multiple output tag sequences y of the input word vector sequence X are normalized to obtain:

$$p(y \mid X) = \frac{e^{S(X, y)}}{\sum_{\tilde{y} \in Y_X} e^{S(X, \tilde{y})}}$$

where p(y|X) represents the normalized score over the output tag sequences y of the input word vector sequence X, $\tilde{y}$ represents a particular output tag sequence belonging to all possible output tag sequences, and $Y_X$ represents the set of all possible output tag sequences. Optimizing the log-likelihood function then yields:

$$\log\left(p(y \mid X)\right) = S(X, y) - \log\left(\sum_{\tilde{y} \in Y_X} e^{S(X, \tilde{y})}\right)$$
further, in step S5, a concat () method in the tensoflow frame is used to splice and fuse the local feature and the optimal context feature of the text data set, so as to obtain a fused feature.
Further, step S6 specifically includes: storing the fusion features and using them as the input of a first fully connected layer; introducing a dropout mechanism between the first and second fully connected layers, which discards part of the trained parameters at each iteration so that the weight updates no longer depend on a fixed subset of features, thereby preventing overfitting; and finally inputting the iteration result into a softmax classifier for classification to obtain the text classification result.
Further, the probability $P(y^{(i)} = j \mid x^{(i)}; \theta)$ that the softmax classifier classifies the text x into category j in step S6 is:

$$P\left(y^{(i)} = j \mid x^{(i)}; \theta\right) = \frac{e^{\theta_j^{T} x^{(i)}}}{\sum_{l=1}^{k} e^{\theta_l^{T} x^{(i)}}}$$

wherein $x^{(i)}$ represents the input of each category, $y^{(i)}$ represents the probability value of each category j, θ represents the training model parameters with the goal of maximizing the likelihood function exp(·), $\theta_j^{T}$ represents the training parameters used when training each category j to maximize the likelihood function exp(·), and k represents the number of training model parameters θ.
The invention has the beneficial effects that:
(1) The invention uses a multi-network feature fusion method to extract features of the text from all aspects, makes up for the shortcomings of a traditional single network in extracting text features, and improves the accuracy of text classification.
(2) The invention adopts the Res2Net residual network to extract the local features of the text data set; compared with a traditional CNN, this network extracts local text features better through multi-scale feature learning.
(3) Compared with traditional RNN and LSTM networks, the method can extract the contextual features of the text while also attending to the influence of the information after the current word on the whole sentence, so that the contextual features of the text are extracted more accurately.
(4) The invention also adopts traditional machine learning methods, namely the CRF conditional random field and the softmax classifier. The CRF scores the vectors output by the BiLSTM network and reorders the labels to obtain a more reasonably ordered output; the softmax classifier scores each classified sample, converts the score into a probability value, and determines the category of the text according to the final probability value.
(5) The invention solves the problem of the low feature-extraction accuracy of a traditional single network by fusing deep learning network features and combining traditional machine learning methods, providing a solid basis for better text classification.
Drawings
Fig. 1 is a flowchart of a text classification method based on network feature fusion according to an embodiment of the present invention.
Fig. 2 is a general architecture diagram of a text classification method based on network feature fusion according to an embodiment of the present invention.
Detailed Description
Exemplary embodiments of the present invention will now be described in detail with reference to the accompanying drawings. It is to be understood that the embodiments shown and described in the drawings are merely exemplary and are intended to illustrate the principles and spirit of the invention, rather than to limit the scope of the invention.
The embodiment of the invention provides a text classification method based on network feature fusion which, as shown in fig. 1 and fig. 2, comprises the following steps S1-S6:
S1, preprocessing a text to be classified, and processing the preprocessed text data set into a word vector set through a word vector representation method.
In the embodiment of the present invention, the method for preprocessing the text to be classified specifically includes: removing useless symbols, keeping only the Chinese characters in the text data set, and removing stop words.
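As an illustration only (not part of the claimed method), a minimal Python sketch of this preprocessing and word-vector step is given below; it assumes the jieba tokenizer, the gensim Word2Vec implementation, a 128-dimensional vector size and a user-supplied stop-word list, none of which are prescribed by the embodiment.

import re
import jieba                        # assumed tokenizer, not prescribed by the embodiment
from gensim.models import Word2Vec  # assumed word-vector implementation

def preprocess(texts, stopwords):
    """Keep only Chinese characters, tokenize, and drop stop words."""
    corpus = []
    for text in texts:
        text = re.sub(r"[^\u4e00-\u9fa5]", "", text)  # remove useless symbols, keep Chinese only
        tokens = [w for w in jieba.lcut(text) if w not in stopwords]
        corpus.append(tokens)
    return corpus

def build_word_vectors(corpus, dim=128):
    """Train word vectors and map each tokenized text to a list of vectors."""
    w2v = Word2Vec(sentences=corpus, vector_size=dim, window=5, min_count=1)
    return [[w2v.wv[w] for w in sent] for sent in corpus]

# usage sketch with two hypothetical sentences and a tiny stop-word list
texts = ["这是一条用于测试的文本！", "文本分类方法基于网络特征融合。"]
corpus = preprocess(texts, stopwords={"的", "是"})
vectors = build_word_vectors(corpus)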
S2, splicing the word vector set into a matrix, inputting the matrix into a Res2Net network for training, and outputting the local features of the text data set.
As shown in fig. 2, the Res2Net network includes a first 1×1 convolutional layer, a 3×3 convolutional layer and a second 1×1 convolutional layer connected in sequence; each convolutional layer includes a ReLU activation function, and a residual connection is added before the ReLU activation of the second 1×1 convolutional layer.
The number of channels of the first 1×1 convolutional layer is n. The feature map of the input matrix is equally divided into s groups of features according to the number of channels; if the number of channels of each group is w, then n = s×w, and each group after the division is denoted x_i, where i ∈ {1,2,...,s}. As shown in fig. 2, s = 4 in the embodiment of the present invention.
In the 3×3 convolutional layer, except for the first group, each group of features x_i undergoes a corresponding convolution operation k_i(·), whose output is denoted y_i. Starting from the second group, before each convolution operation k_i(·), the output y_{i-1} of the previous group is residually connected with the current feature x_i and used as the input of k_i(·), until the last group of features.
The second 1×1 convolutional layer concatenates the outputs y_i of the 3×3 convolutional layer along the channel dimension, fuses the multi-scale features and outputs the local features of the text data set.
In summary, the objective function of the Res2Net network is:

$$y_i = \begin{cases} x_i, & i = 1 \\ k_i\left(x_i + y_{i-1}\right), & 1 < i \le s \end{cases}$$

wherein x_i represents the i-th group of equally divided features, k_i(·) represents the convolution operation applied to the i-th group of features, and y_i represents the output of the i-th group of features after the convolution operation.
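To make the multi-scale structure concrete, the following illustrative sketch implements one Res2Net-style block in TensorFlow/Keras along the lines described above (first 1×1 convolution, channel split into s groups, grouped 3×3 convolutions with residual connections, channel concatenation, second 1×1 convolution, and a residual connection before the final ReLU). The use of 1-D convolutions over the word-vector matrix, the channel count n = 64 and s = 4 are assumptions for illustration rather than values fixed by the patent.

import tensorflow as tf

def res2net_block(x, n=64, s=4):
    """Res2Net-style multi-scale block over a (batch, length, channels) text matrix."""
    w = n // s                                              # channels per group, n = s * w
    out = tf.keras.layers.Conv1D(n, 1, padding="same", activation="relu")(x)
    groups = tf.split(out, num_or_size_splits=s, axis=-1)   # x_1 ... x_s

    y = [groups[0]]                                          # first group: no convolution
    for i in range(1, s):                                    # from the second group onward
        inp = groups[i] + y[i - 1]                           # residual connection with previous output
        y.append(tf.keras.layers.Conv1D(w, 3, padding="same", activation="relu")(inp))

    out = tf.concat(y, axis=-1)                              # channel-wise concatenation of y_i
    out = tf.keras.layers.Conv1D(n, 1, padding="same")(out)  # second 1x1 conv, no activation yet
    return tf.keras.layers.ReLU()(out + x)                   # residual added before the final ReLU
                                                             # (assumes the input already has n channels)

# usage sketch: a batch of 2 texts, 50 words, 64-dimensional word vectors
local_features = res2net_block(tf.random.normal([2, 50, 64]))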
S3, inputting the word vector set into a BiLSTM network for training, and outputting the contextual features of the text data set.
A traditional LSTM network can only learn the information before the current word and cannot utilize the information after it; therefore, the embodiment of the present invention uses a bidirectional LSTM (BiLSTM) network to also capture the information after the current word.
As shown in fig. 2, the basic expressions of the BiLSTM network are:

$$\overrightarrow{h_t} = f\left(\overrightarrow{W} x_t + \overrightarrow{V}\,\overrightarrow{h_{t-1}} + \overrightarrow{b}\right)$$

$$\overleftarrow{h_t} = f\left(\overleftarrow{W} x_t + \overleftarrow{V}\,\overleftarrow{h_{t+1}} + \overleftarrow{b}\right)$$

$$y_t = g\left(U\left[\overrightarrow{h_t};\ \overleftarrow{h_t}\right] + c\right)$$

wherein $\overrightarrow{h_t}$ represents the forward LSTM hidden state of the current layer, $\overrightarrow{W}$ represents the forward LSTM input gate weight matrix, $\overrightarrow{V}$ represents the forward LSTM current input unit state weight matrix, $\overrightarrow{h_{t-1}}$ represents the previous forward LSTM hidden state, $\overrightarrow{b}$ represents the forward LSTM input unit bias term, $\overleftarrow{h_t}$ represents the backward LSTM hidden state of the current layer, $\overleftarrow{W}$ represents the backward LSTM input gate weight matrix, $\overleftarrow{V}$ represents the backward LSTM current input unit state weight matrix, $\overleftarrow{h_{t+1}}$ represents the next backward LSTM hidden state, $\overleftarrow{b}$ represents the backward LSTM input unit bias term, U represents the matrix that splices the forward and backward output units, c represents the total output unit bias term, $x_t$ represents the input value of the BiLSTM hidden layer, f(·) represents the activation function used when computing the BiLSTM hidden layer, g(·) represents the activation function used when computing the BiLSTM output layer, and $y_t$ represents the output value of the BiLSTM network.
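A minimal TensorFlow/Keras sketch of this step is given below for illustration; the hidden size and merge mode are assumed values, and the Bidirectional wrapper realizes the forward and backward passes and the splicing expressed by U and g(·) in the formulas above.

import tensorflow as tf

# word_vectors: (batch, sentence_length, embedding_dim) word vector set from step S1
word_vectors = tf.random.normal([2, 50, 64])

# bidirectional LSTM: the forward pass reads x_1..x_T, the backward pass reads x_T..x_1,
# and the two hidden states are concatenated at every time step (return_sequences=True
# keeps one output per word, as required for the CRF scoring in step S4)
bilstm = tf.keras.layers.Bidirectional(
    tf.keras.layers.LSTM(units=128, return_sequences=True),
    merge_mode="concat",
)
context_features = bilstm(word_vectors)   # shape (2, 50, 256)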
S4, scoring the contextual features of the text data set with a CRF conditional random field scoring mechanism, and selecting the tag sequence set with the highest score as the optimal contextual feature sequence set of the text data set.
In the embodiment of the invention, the formula for scoring the contextual features of the text data set with the CRF conditional random field scoring mechanism is:

$$S(X, y) = \sum_{i=0}^{n} A_{tag_i,\, tag_{i+1}} + \sum_{i=1}^{n} P_{v_i,\, tag_i}$$

wherein S(X, y) represents the score of the output tag sequence y corresponding to the input word vector sequence X of the BiLSTM network, $A_{tag_i, tag_{i+1}}$ represents the probability of transferring from the i-th label $tag_i$ to the (i+1)-th label $tag_{i+1}$, and $P_{v_i, tag_i}$ represents the quantized probability of mapping the i-th word $v_i$ in the input word vector sequence X to the i-th label $tag_i$.

The scores of the multiple output tag sequences y of the input word vector sequence X are normalized to obtain:

$$p(y \mid X) = \frac{e^{S(X, y)}}{\sum_{\tilde{y} \in Y_X} e^{S(X, \tilde{y})}}$$

where p(y|X) represents the normalized score over the output tag sequences y of the input word vector sequence X, $\tilde{y}$ represents a particular output tag sequence belonging to all possible output tag sequences, and $Y_X$ represents the set of all possible output tag sequences. Optimizing the log-likelihood function then yields:

$$\log\left(p(y \mid X)\right) = S(X, y) - \log\left(\sum_{\tilde{y} \in Y_X} e^{S(X, \tilde{y})}\right)$$
because the label sequence output by the BilSTM network is based on the maximum probability value obtained by softmax, the word order problem of the label is not considered, and the output word order is unreasonable, the maximum likelihood probability log (p (y | X)) of p (y | X) is obtained in step S4, and through the probability, the CRF considers the sequentiality between the output label sequences and adds a constraint rule to the finally predicted label to make the predicted label word order reasonable.
S5, splicing and fusing the local features and the optimal contextual features of the text data set to obtain the fusion features.
In the embodiment of the invention, the concat() method in the TensorFlow framework is used to splice and fuse the local features and the optimal contextual features of the text data set to obtain the fusion features.
S6, inputting the fusion features into a softmax classifier for classification to obtain the text classification result.
In the embodiment of the invention, the fusion features are stored and used as the input of a first fully connected layer; a dropout mechanism is introduced between the first and second fully connected layers, discarding part of the trained parameters at each iteration so that the weight updates no longer depend on a fixed subset of features, which prevents overfitting; finally, the iteration result is input into the softmax classifier for classification to obtain the text classification result.
In the embodiment of the invention, the probability $P(y^{(i)} = j \mid x^{(i)}; \theta)$ that the softmax classifier classifies the text x into category j is:

$$P\left(y^{(i)} = j \mid x^{(i)}; \theta\right) = \frac{e^{\theta_j^{T} x^{(i)}}}{\sum_{l=1}^{k} e^{\theta_l^{T} x^{(i)}}}$$

wherein $x^{(i)}$ represents the input of each category, $y^{(i)}$ represents the probability value of each category j, θ represents the training model parameters with the goal of maximizing the likelihood function exp(·), $\theta_j^{T}$ represents the training parameters used when training each category j to maximize the likelihood function exp(·), and k represents the number of training model parameters θ.
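Putting steps S5 and S6 together, a minimal Keras sketch could look as follows; the pooling, layer widths, dropout rate and number of categories are illustrative assumptions, and the two feature tensors are random stand-ins for the outputs of the Res2Net and BiLSTM/CRF branches sketched above.

import tensorflow as tf

num_classes = 10       # assumed number of text categories
drop_rate = 0.5        # assumed dropout rate

# stand-ins for the outputs of the Res2Net and BiLSTM branches
local_features = tf.random.normal([2, 50, 64])
context_features = tf.random.normal([2, 50, 256])

# pool each branch to a fixed-length vector before fusion (an assumed design choice)
local_vec = tf.keras.layers.GlobalMaxPooling1D()(local_features)
context_vec = tf.keras.layers.GlobalMaxPooling1D()(context_features)

# step S5: splice and fuse the two feature sets with concat()
fused = tf.concat([local_vec, context_vec], axis=-1)

# step S6: two fully connected layers with dropout in between, then softmax classification
hidden = tf.keras.layers.Dense(256, activation="relu")(fused)   # first fully connected layer
hidden = tf.keras.layers.Dropout(drop_rate)(hidden)             # discard part of the parameters each iteration
logits = tf.keras.layers.Dense(num_classes)(hidden)             # second fully connected layer
probabilities = tf.nn.softmax(logits)                           # P(y = j | x; theta)
predicted_class = tf.argmax(probabilities, axis=-1)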
It will be appreciated by those of ordinary skill in the art that the embodiments described herein are intended to assist the reader in understanding the principles of the invention and should not be construed as limiting the scope of the invention to these specifically recited embodiments and examples. Those skilled in the art can make various other specific variations and combinations based on the teachings of the present invention without departing from the spirit of the invention, and these variations and combinations remain within the scope of the invention.

Claims (8)

1. A text classification method based on network feature fusion is characterized by comprising the following steps:
S1, preprocessing a text to be classified, and processing the preprocessed text data set into a word vector set by a word vector representation method;
S2, splicing the word vector set into a matrix, inputting the matrix into a Res2Net network for training, and outputting the local features of the text data set;
S3, inputting the word vector set into a BiLSTM network for training, and outputting the contextual features of the text data set;
S4, scoring the contextual features of the text data set by adopting a CRF conditional random field scoring mechanism, and selecting the tag sequence set with the highest score as the optimal contextual feature sequence set of the text data set;
S5, splicing and fusing the local features and the optimal contextual features of the text data set to obtain fusion features;
S6, inputting the fusion features into a softmax classifier for classification to obtain a text classification result;
in the step S4, the formula for scoring the contextual features of the text data set by adopting the CRF conditional random field scoring mechanism is:

$$S(X, y) = \sum_{i=0}^{n} A_{tag_i,\, tag_{i+1}} + \sum_{i=1}^{n} P_{v_i,\, tag_i}$$

wherein S(X, y) represents the score of the output tag sequence y corresponding to the input word vector sequence X of the BiLSTM network, $A_{tag_i, tag_{i+1}}$ represents the probability of transferring from the i-th label $tag_i$ to the (i+1)-th label $tag_{i+1}$, and $P_{v_i, tag_i}$ represents the quantized probability of mapping the i-th word $v_i$ in the input word vector sequence X to the i-th label $tag_i$;

the scores of the multiple output tag sequences y of the input word vector sequence X are normalized to obtain:

$$p(y \mid X) = \frac{e^{S(X, y)}}{\sum_{\tilde{y} \in Y_X} e^{S(X, \tilde{y})}}$$

where p(y|X) represents the normalized score over the output tag sequences y of the input word vector sequence X, $\tilde{y}$ represents a particular output tag sequence belonging to all possible output tag sequences, and $Y_X$ represents the set of all possible output tag sequences; the log-likelihood function is optimized to obtain:

$$\log\left(p(y \mid X)\right) = S(X, y) - \log\left(\sum_{\tilde{y} \in Y_X} e^{S(X, \tilde{y})}\right)$$
2. the text classification method according to claim 1, wherein the method for preprocessing the text to be classified in step S1 specifically comprises: removing useless symbols, keeping the text dataset only containing Chinese, and removing stop words.
3. The text classification method according to claim 1, wherein the Res2Net network in step S2 includes a first 1×1 convolutional layer, a 3×3 convolutional layer and a second 1×1 convolutional layer connected in sequence; each convolutional layer includes a ReLU activation function, and a residual connection is added before the ReLU activation of the second 1×1 convolutional layer;
the number of channels of the first 1×1 convolutional layer is n; the feature map of the input matrix is equally divided into s groups of features according to the number of channels; if the number of channels of each group is w, then n = s×w, and each group after the division is denoted x_i, where i ∈ {1,2,...,s};
in the 3×3 convolutional layer, except for the first group, each group of features x_i undergoes a corresponding convolution operation k_i(·), whose output is denoted y_i; starting from the second group, before each convolution operation k_i(·), the output y_{i-1} of the previous group is residually connected with the current feature x_i and used as the input of k_i(·), until the last group of features;
the second 1×1 convolutional layer concatenates the outputs y_i of the 3×3 convolutional layer along the channel dimension, fuses the multi-scale features and outputs the local features of the text data set.
4. The text classification method according to claim 3, wherein the objective function of the Res2Net network in step S2 is:

$$y_i = \begin{cases} x_i, & i = 1 \\ k_i\left(x_i + y_{i-1}\right), & 1 < i \le s \end{cases}$$

wherein x_i represents the i-th group of equally divided features, k_i(·) represents the convolution operation applied to the i-th group of features, and y_i represents the output of the i-th group of features after the convolution operation.
5. The text classification method according to claim 1, wherein the basic expressions of the BiLSTM network in step S3 are:

$$\overrightarrow{h_t} = f\left(\overrightarrow{W} x_t + \overrightarrow{V}\,\overrightarrow{h_{t-1}} + \overrightarrow{b}\right)$$

$$\overleftarrow{h_t} = f\left(\overleftarrow{W} x_t + \overleftarrow{V}\,\overleftarrow{h_{t+1}} + \overleftarrow{b}\right)$$

$$y_t = g\left(U\left[\overrightarrow{h_t};\ \overleftarrow{h_t}\right] + c\right)$$

wherein $\overrightarrow{h_t}$ represents the forward LSTM hidden state of the current layer, $\overrightarrow{W}$ represents the forward LSTM input gate weight matrix, $\overrightarrow{V}$ represents the forward LSTM current input unit state weight matrix, $\overrightarrow{h_{t-1}}$ represents the previous forward LSTM hidden state, $\overrightarrow{b}$ represents the forward LSTM input unit bias term, $\overleftarrow{h_t}$ represents the backward LSTM hidden state of the current layer, $\overleftarrow{W}$ represents the backward LSTM input gate weight matrix, $\overleftarrow{V}$ represents the backward LSTM current input unit state weight matrix, $\overleftarrow{h_{t+1}}$ represents the next backward LSTM hidden state, $\overleftarrow{b}$ represents the backward LSTM input unit bias term, U represents the matrix that splices the forward and backward output units, c represents the total output unit bias term, $x_t$ represents the input value of the BiLSTM hidden layer, f(·) represents the activation function used when computing the BiLSTM hidden layer, g(·) represents the activation function used when computing the BiLSTM output layer, and $y_t$ represents the output value of the BiLSTM network.
6. The text classification method according to claim 1, wherein in step S5, the concat() method in the TensorFlow framework is used to splice and fuse the local features and the optimal contextual features of the text data set to obtain the fusion features.
7. The text classification method according to claim 1, wherein step S6 specifically comprises: storing the fusion features and using them as the input of a first fully connected layer; introducing a dropout mechanism between the first fully connected layer and a second fully connected layer, discarding part of the trained parameters at each iteration so that the weight updates no longer depend on a fixed subset of features, thereby preventing overfitting; and finally inputting the iteration result into a softmax classifier for classification to obtain the text classification result.
8. The method according to claim 1, wherein in step S6, the probability $P(y^{(i)} = j \mid x^{(i)}; \theta)$ that the softmax classifier classifies the text x into category j is:

$$P\left(y^{(i)} = j \mid x^{(i)}; \theta\right) = \frac{e^{\theta_j^{T} x^{(i)}}}{\sum_{l=1}^{k} e^{\theta_l^{T} x^{(i)}}}$$

wherein $x^{(i)}$ represents the input of each category, $y^{(i)}$ represents the probability value of each category j, θ represents the training model parameters with the goal of maximizing the likelihood function exp(·), $\theta_j^{T}$ represents the training parameters used when training each category j to maximize the likelihood function exp(·), and k represents the number of training model parameters θ.
CN202110266934.XA 2021-03-12 2021-03-12 Text classification method based on network feature fusion Active CN112836056B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110266934.XA CN112836056B (en) 2021-03-12 2021-03-12 Text classification method based on network feature fusion

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110266934.XA CN112836056B (en) 2021-03-12 2021-03-12 Text classification method based on network feature fusion

Publications (2)

Publication Number Publication Date
CN112836056A CN112836056A (en) 2021-05-25
CN112836056B true CN112836056B (en) 2023-04-18

Family

ID=75930136

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110266934.XA Active CN112836056B (en) 2021-03-12 2021-03-12 Text classification method based on network feature fusion

Country Status (1)

Country Link
CN (1) CN112836056B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113901801B (en) * 2021-09-14 2024-05-07 燕山大学 Text content safety detection method based on deep learning

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110580458A (en) * 2019-08-25 2019-12-17 天津大学 music score image recognition method combining multi-scale residual error type CNN and SRU
CN111444726A (en) * 2020-03-27 2020-07-24 河海大学常州校区 Method and device for extracting Chinese semantic information of long-time and short-time memory network based on bidirectional lattice structure
CN111783462A (en) * 2020-06-30 2020-10-16 大连民族大学 Chinese named entity recognition model and method based on dual neural network fusion
WO2020215870A1 (en) * 2019-04-22 2020-10-29 京东方科技集团股份有限公司 Named entity identification method and apparatus
CN112163089A (en) * 2020-09-24 2021-01-01 中国电子科技集团公司第十五研究所 Military high-technology text classification method and system fusing named entity recognition
CN112464663A (en) * 2020-12-01 2021-03-09 小牛思拓(北京)科技有限公司 Multi-feature fusion Chinese word segmentation method

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11544259B2 (en) * 2018-11-29 2023-01-03 Koninklijke Philips N.V. CRF-based span prediction for fine machine learning comprehension

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2020215870A1 (en) * 2019-04-22 2020-10-29 京东方科技集团股份有限公司 Named entity identification method and apparatus
CN110580458A (en) * 2019-08-25 2019-12-17 天津大学 music score image recognition method combining multi-scale residual error type CNN and SRU
CN111444726A (en) * 2020-03-27 2020-07-24 河海大学常州校区 Method and device for extracting Chinese semantic information of long-time and short-time memory network based on bidirectional lattice structure
CN111783462A (en) * 2020-06-30 2020-10-16 大连民族大学 Chinese named entity recognition model and method based on dual neural network fusion
CN112163089A (en) * 2020-09-24 2021-01-01 中国电子科技集团公司第十五研究所 Military high-technology text classification method and system fusing named entity recognition
CN112464663A (en) * 2020-12-01 2021-03-09 小牛思拓(北京)科技有限公司 Multi-feature fusion Chinese word segmentation method

Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
Classification Method for Tibetan Texts Based on In-depth Learning; Lili Wang et al.; 《2019 IEEE 8th Joint International Information Technology and Artificial Intelligence Conference (ITAIC)》; 2019-08-05; pp. 29-34 *
Word2vec-CNN-BiLSTM short text sentiment classification; 王立荣; 《福建电脑》; 2020-01-25 (No. 01); pp. 15-20 *
Text sentiment analysis based on feature fusion of CNN and BiLSTM networks; 李洋 et al.; 《计算机应用》; 2018-11-10 (No. 11); pp. 1231-1235 *
Text sentiment classification based on a deep bidirectional long short-term memory network; 刘建兴 et al.; 《桂林电子科技大学学报》; 2018-05-14 (No. 02); pp. 40-44 *

Also Published As

Publication number Publication date
CN112836056A (en) 2021-05-25

Similar Documents

Publication Publication Date Title
CN109657239B (en) Chinese named entity recognition method based on attention mechanism and language model learning
CN107729309B (en) Deep learning-based Chinese semantic analysis method and device
CN109086267B (en) Chinese word segmentation method based on deep learning
CN111274394B (en) Method, device and equipment for extracting entity relationship and storage medium
CN110245229B (en) Deep learning theme emotion classification method based on data enhancement
CN111709242B (en) Chinese punctuation mark adding method based on named entity recognition
CN111291566B (en) Event main body recognition method, device and storage medium
CN111046179B (en) Text classification method for open network question in specific field
CN109902177A (en) Text emotion analysis method based on binary channels convolution Memory Neural Networks
CN116127953B (en) Chinese spelling error correction method, device and medium based on contrast learning
CN110276069A (en) A kind of Chinese braille mistake automatic testing method, system and storage medium
CN110263325A (en) Chinese automatic word-cut
CN111274804A (en) Case information extraction method based on named entity recognition
CN111339260A (en) BERT and QA thought-based fine-grained emotion analysis method
CN112163089B (en) High-technology text classification method and system integrating named entity recognition
CN114417851B (en) Emotion analysis method based on keyword weighted information
CN112612871A (en) Multi-event detection method based on sequence generation model
CN112434686B (en) End-to-end misplaced text classification identifier for OCR (optical character) pictures
CN111159345A (en) Chinese knowledge base answer obtaining method and device
CN109840328A (en) Deep learning comment on commodity text emotion trend analysis method
CN110751234A (en) OCR recognition error correction method, device and equipment
CN115238693A (en) Chinese named entity recognition method based on multi-word segmentation and multi-layer bidirectional long-short term memory
CN112836056B (en) Text classification method based on network feature fusion
CN114048314A (en) Natural language steganalysis method
CN116522165B (en) Public opinion text matching system and method based on twin structure

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant