
CN117422065A - Natural language data processing system based on reinforcement learning algorithm - Google Patents

Natural language data processing system based on reinforcement learning algorithm

Info

Publication number: CN117422065A
Authority: CN (China)
Prior art keywords: feature, classification, news, training, vector
Prior art date
Legal status: Withdrawn (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Application number: CN202311323531.XA
Other languages: Chinese (zh)
Inventors: 汪宇通, 曹铭登, 丁利钦, 沈雅萍
Current Assignee: Hangzhou Xinxiang Infinite Technology Co ltd (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Original Assignee: Hangzhou Xinxiang Infinite Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hangzhou Xinxiang Infinite Technology Co ltd
Priority to: CN202311323531.XA
Publication of: CN117422065A

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F 40/00: Handling natural language data
    • G06F 40/20: Natural language analysis
    • G06F 40/279: Recognition of textual entities
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F 16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F 16/30: Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F 16/35: Clustering; Classification
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F 40/00: Handling natural language data
    • G06F 40/20: Natural language analysis
    • G06F 40/205: Parsing
    • G06F 40/216: Parsing using statistical methods
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00: Computing arrangements based on biological models
    • G06N 3/02: Neural networks
    • G06N 3/08: Learning methods
    • G06N 3/09: Supervised learning
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00: Computing arrangements based on biological models
    • G06N 3/02: Neural networks
    • G06N 3/08: Learning methods
    • G06N 3/092: Reinforcement learning
    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D: CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D 10/00: Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • Health & Medical Sciences (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Data Mining & Analysis (AREA)
  • Biophysics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biomedical Technology (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Evolutionary Computation (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Probability & Statistics with Applications (AREA)
  • Databases & Information Systems (AREA)
  • Image Analysis (AREA)

Abstract

The application relates to the field of natural language data processing, and particularly discloses a natural language data processing system based on a reinforcement learning algorithm. The system first collects a news data set to be classified, wherein the news data set to be classified comprises news text data and news picture data, and the features of the news text data and the news picture data are respectively extracted and analyzed through trained convolutional neural network models. The model is then trained and learned using a reinforcement learning algorithm: during training, the system interacts with the environment, selects classification actions according to the current state, and updates the parameters of the policy network according to reward signals so as to optimize classification performance. In this way, faster and more accurate classification of news data can be achieved.

Description

Natural language data processing system based on reinforcement learning algorithm
Technical Field
The present application relates to the field of natural language data processing, and more particularly, to a natural language data processing system based on reinforcement learning algorithms.
Background
Reinforcement learning is a learning paradigm in which an agent interacts with the environment to achieve a learning goal. Reinforcement learning has achieved remarkable results in recent years, and its applications in the field of natural language processing are increasing; for example, it has been applied to text classification, text generation, machine translation, and the like. Among these, text classification has long been a classical problem in natural language processing, and research on classifying texts such as news is relatively mature; however, a great many details, as well as the additional difficulties encountered during commercialization, still need to be considered. Because news classification involves many sub-scenarios, the methods used in different classification scenarios may be completely different, and news data includes text data, picture data, and the like. Moreover, conventional news classification systems typically rely on labeled training data, which means that a significant amount of manual annotation work is required, and this is costly and time-consuming.
Accordingly, a natural language data processing system based on reinforcement learning algorithms is desired to achieve faster and more accurate classification of news datasets.
Disclosure of Invention
The present application has been made in order to solve the above technical problems. Embodiments of the present application provide a natural language data processing system based on a reinforcement learning algorithm, which first collects a news data set to be classified, wherein the news data set to be classified comprises news text data and news picture data, and the features of the news text data and the news picture data are respectively extracted and analyzed through trained convolutional neural network models. The model is then trained and learned using a reinforcement learning algorithm: during training, the system interacts with the environment, selects classification actions according to the current state, and updates the parameters of the policy network according to reward signals so as to optimize classification performance. In this way, faster and more accurate classification of news data can be achieved.
According to a first aspect of the present application, there is provided a natural language data processing system based on a reinforcement learning algorithm, comprising:
the data acquisition unit is used for acquiring a news data set to be classified, wherein the news data set to be classified comprises: news text data and news picture data;
The context coding unit is used for enabling the news text data to pass through a trained context coder comprising an embedded layer so as to obtain semantic feature vectors;
the multi-scale convolution coding unit is used for enabling the semantic feature vector to pass through a trained multi-scale neighborhood feature extraction module so as to obtain a multi-scale text understanding feature vector;
the picture convolutional coding unit is used for enabling the news picture data to pass through a convolutional neural network which is completed through training and serves as a filter so as to obtain picture understanding feature vectors;
the feature fusion unit is used for fusing the multi-scale text understanding feature vector and the picture understanding feature vector to obtain a classification feature vector;
the feature optimization unit is used for carrying out displacement order based on interpolation vectors on the classification feature vectors so as to obtain optimized classification feature vectors;
and the result generation unit is used for enabling the optimized classification feature vector to pass through a multi-label classifier to obtain a classification result, wherein the classification result is used for representing classification labels corresponding to the news data set to be classified.
With reference to the first aspect of the present application, in a natural language data processing system based on a reinforcement learning algorithm of the first aspect of the present application, the context encoding unit includes: the segmentation processing subunit is used for carrying out segmentation processing on the news text data to obtain segment sequences corresponding to each news; the word segmentation processing subunit is used for carrying out word segmentation processing on the segment sequence to obtain a word sequence; an embedding vectorization subunit, configured to map each word in the word sequence into a word embedding vector by using an embedding layer of a sequence encoder of the Clip model to obtain a sequence of word embedding vectors; a context coding subunit, configured to perform global context semantic coding on the sequence of word embedding vectors using a Bert model based on a converter of a sequence encoder of the Clip model to obtain a plurality of feature vectors; and the cascading subunit is used for cascading the plurality of feature vectors to obtain the semantic feature vectors.
With reference to the first aspect of the present application, in a natural language data processing system based on a reinforcement learning algorithm of the first aspect of the present application, the multi-scale convolution encoding unit includes: the first scale convolution coding unit is used for respectively carrying out one-dimensional convolution coding on the semantic feature vectors by using a first convolution layer of the multi-scale neighborhood feature extraction module according to the following first convolution formula so as to obtain the first scale feature vectors; wherein the first convolution formula is:
Cov(X) = Σ_{a=1}^{w} F(a) · G(X-a)

wherein a is the width of the first convolution kernel in the X direction, F(a) is a first convolution kernel parameter vector, G(X-a) is a local vector matrix operated with the first convolution kernel function, w is the size of the first convolution kernel, X represents the semantic feature vector, and Cov(X) represents one-dimensional convolution encoding of the semantic feature vector; the second scale convolution encoding unit is used for carrying out one-dimensional convolution encoding on the semantic feature vector by using a second convolution layer of the multi-scale neighborhood feature extraction module according to the following second convolution formula so as to obtain a second scale feature vector; wherein the second convolution formula is:

Cov(X) = Σ_{b=1}^{m} F(b) · G(X-b)

wherein b is the width of the second convolution kernel in the X direction, F(b) is a second convolution kernel parameter vector, G(X-b) is a local vector matrix operated with the second convolution kernel function, m is the size of the second convolution kernel, X represents the semantic feature vector, and Cov(X) represents one-dimensional convolution encoding of the semantic feature vector; and a concatenation unit, configured to concatenate the first scale feature vector and the second scale feature vector to obtain the multi-scale text understanding feature vector.
In the natural language data processing system based on the reinforcement learning algorithm, the natural language data processing system based on the reinforcement learning algorithm further comprises a training module for training the context encoder including the embedded layer, the multi-scale neighborhood feature extraction module and the convolutional neural network as a filter;
wherein, training module includes:
the training data acquisition unit is used for acquiring training data, wherein the training data comprises the news text data, the news picture data and classification tag values corresponding to the news data set to be classified;
the training context coding unit is used for enabling the news text data to pass through a context coder comprising an embedded layer to obtain training semantic feature vectors;
the training multi-scale convolution coding unit is used for enabling the training semantic feature vector to pass through a multi-scale neighborhood feature extraction module to obtain a training multi-scale text understanding feature vector;
the training picture convolutional coding unit is used for enabling the news picture data to pass through a convolutional neural network serving as a filter so as to obtain training picture understanding feature vectors;
the training feature fusion unit is used for fusing the training multi-scale text understanding feature vector and the training picture understanding feature vector to obtain a training classification feature vector;
The training feature optimization unit is used for carrying out displacement order based on interpolation vectors on the training classification feature vectors so as to obtain training optimization classification feature vectors;
the classification loss unit is used for enabling the training optimization classification feature vector to pass through a classifier to obtain a classification loss function value;
and the training unit is used for training the context encoder containing the embedded layer, the multi-scale neighborhood feature extraction module and the convolution neural network serving as the filter by using the classification loss function value.
With reference to the first aspect of the present application, in a natural language data processing system based on a reinforcement learning algorithm of the first aspect of the present application, the classification loss unit is configured to: processing the training optimized classification feature vector by using the classifier according to the following calculation formula to obtain the classification result; wherein the calculation formula is: O = softmax{(Wn, Bn) : ... : (W1, B1) | X}, where W1 to Wn are weight matrices, B1 to Bn are bias vectors, and X is the training optimized classification feature vector; and calculating a cross entropy value between the classification result and a true value as the classification loss function value.
According to a second aspect of the present application, there is provided a natural language data processing method based on a reinforcement learning algorithm, including:
Collecting a news data set to be classified, wherein the news data set to be classified comprises: news text data and news picture data;
passing the news text data through a trained context encoder comprising an embedded layer to obtain semantic feature vectors;
the semantic feature vector passes through a trained multi-scale neighborhood feature extraction module to obtain a multi-scale text understanding feature vector;
the news picture data is passed through a convolutional neural network which is completed through training and serves as a filter so as to obtain picture understanding feature vectors;
fusing the multi-scale text understanding feature vector and the picture understanding feature vector to obtain a classification feature vector;
performing displacement order based on interpolation vectors on the classification feature vectors to obtain optimized classification feature vectors;
and the optimized classification feature vector passes through a multi-label classifier to obtain a classification result, wherein the classification result is used for representing classification labels corresponding to the news data set to be classified.
In the natural language data processing method based on the reinforcement learning algorithm, the natural language data processing method based on the reinforcement learning algorithm further comprises a training module for training the context encoder including the embedded layer, the multi-scale neighborhood feature extraction module and the convolutional neural network as a filter;
Wherein, training module includes:
acquiring training data, wherein the training data comprises the news text data, the news picture data and classification tag values corresponding to the news data set to be classified;
passing the news text data through a context encoder comprising an embedded layer to obtain training semantic feature vectors;
the training semantic feature vector passes through a multi-scale neighborhood feature extraction module to obtain a training multi-scale text understanding feature vector;
the news picture data is passed through a convolutional neural network serving as a filter to obtain training picture understanding feature vectors;
fusing the training multi-scale text understanding feature vector and the training picture understanding feature vector to obtain a training classification feature vector;
performing displacement order based on interpolation vectors on the training classification feature vectors to obtain training optimization classification feature vectors;
the training optimization classification feature vector passes through a classifier to obtain a classification loss function value;
training the context encoder including the embedded layer, the multi-scale neighborhood feature extraction module, and the convolutional neural network as a filter with the class loss function value.
Compared with the prior art, the natural language data processing system based on the reinforcement learning algorithm provided by the present application first collects a news data set to be classified, wherein the news data set to be classified comprises news text data and news picture data, and the features of the news text data and the news picture data are respectively extracted and analyzed through trained convolutional neural network models. The model is then trained and learned using a reinforcement learning algorithm: during training, the system interacts with the environment, selects classification actions according to the current state, and updates the parameters of the policy network according to reward signals so as to optimize classification performance. In this way, faster and more accurate classification of news data can be achieved.
Drawings
The foregoing and other objects, features and advantages of the present application will become more apparent from the following more particular description of embodiments of the present application, as illustrated in the accompanying drawings. The accompanying drawings are included to provide a further understanding of embodiments of the application and are incorporated in and constitute a part of this specification, illustrate the application and not constitute a limitation to the application. In the drawings, like reference numerals generally refer to like parts or steps.
FIG. 1 illustrates a schematic block diagram of a natural language data processing system based on a reinforcement learning algorithm in accordance with an embodiment of the present application.
FIG. 2 illustrates a schematic block diagram of a context encoding unit in a natural language data processing system based on a reinforcement learning algorithm according to an embodiment of the present application.
FIG. 3 illustrates a schematic block diagram of a multi-scale convolutional encoding unit in a natural language data processing system based on a reinforcement learning algorithm in accordance with an embodiment of the present application.
Fig. 4 illustrates a schematic block diagram of a picture convolutional encoding unit in a natural language data processing system based on a reinforcement learning algorithm according to an embodiment of the present application.
FIG. 5 illustrates a schematic block diagram of a feature optimization unit in a natural language data processing system based on a reinforcement learning algorithm in accordance with an embodiment of the present application.
FIG. 6 illustrates a schematic block diagram of training modules in a natural language data processing system based on a reinforcement learning algorithm in accordance with an embodiment of the present application.
FIG. 7 illustrates a flow chart of a natural language data processing method based on a reinforcement learning algorithm according to an embodiment of the present application.
Fig. 8 illustrates a schematic diagram of a system architecture of a natural language data processing method based on a reinforcement learning algorithm according to an embodiment of the present application.
Fig. 9 illustrates a block diagram of an electronic device according to an embodiment of the present application.
Detailed Description
Hereinafter, example embodiments according to the present application will be described in detail with reference to the accompanying drawings. It should be apparent that the described embodiments are only some of the embodiments of the present application and not all of the embodiments of the present application, and it should be understood that the present application is not limited by the example embodiments described herein.
Exemplary System
FIG. 1 illustrates a schematic block diagram of a natural language data processing system based on a reinforcement learning algorithm in accordance with an embodiment of the present application. As shown in fig. 1, the natural language data processing system 100 based on the reinforcement learning algorithm according to the embodiment of the present application includes: a data acquisition unit 110, configured to acquire a news data set to be classified, where the news data set to be classified includes: news text data and news picture data; a context encoding unit 120, configured to pass the news text data through a trained context encoder including an embedded layer to obtain a semantic feature vector; the multi-scale convolution encoding unit 130 is configured to pass the semantic feature vector through a trained multi-scale neighborhood feature extraction module to obtain a multi-scale text understanding feature vector; a picture convolutional encoding unit 140, configured to pass the news picture data through a convolutional neural network that is completed through training and serves as a filter to obtain a picture understanding feature vector; a feature fusion unit 150, configured to fuse the multi-scale text understanding feature vector and the picture understanding feature vector to obtain a classification feature vector; a feature optimization unit 160, configured to perform displacement order based on interpolation vectors on the classification feature vectors to obtain optimized classification feature vectors; the result generating unit 170 is configured to pass the optimized classification feature vector through a multi-tag classifier to obtain a classification result, where the classification result is used to represent a classification tag corresponding to the news data set to be classified.
In this embodiment of the present application, the data obtaining unit 110 is configured to collect a news data set to be categorized, where the news data set to be categorized includes: news text data and news picture data. It should be understood that news classification refers to the task of classifying input news data according to the category to which it belongs. Moreover, news data may contain not only text data but also image data. Therefore, when classifying the news data set, the classification can be made more accurate by using both the news text data and the news picture data. Specifically, in the technical scheme of the application, the news text data and the news picture data in the news data set to be classified are collected.
In this embodiment of the present application, the context encoding unit 120 is configured to pass the news text data through a trained context encoder including an embedded layer to obtain a semantic feature vector. It should be understood that the news text data is usually long text, and it is difficult to extract its semantic features efficiently and accurately; by performing segmentation processing on the news text data, the segment sequence corresponding to each news item can be obtained, which facilitates the subsequent mining of the semantic features of the news and further improves the accuracy of semantic understanding of the news text data. In addition, considering that certain correlation exists between the segment sequences, in order to accurately acquire the global semantic understanding feature distribution information of the segment sequences in the high-dimensional feature space, word segmentation processing is further carried out on the segment sequences to obtain word sequences. Further, in order to prevent the word sequence confusion phenomenon after the word segmentation processing, the word sequence can be encoded by a trained context encoder containing an embedded layer to obtain global semantic distribution characteristics in the news text data. Specifically, in the technical scheme of the application, the news text data is passed through a trained context encoder comprising an embedded layer to obtain the semantic feature vector.
FIG. 2 illustrates a schematic block diagram of a context encoding unit in a natural language data processing system based on a reinforcement learning algorithm according to an embodiment of the present application. As shown in fig. 2, the context encoding unit 120 includes: a segmentation processing subunit 121, configured to perform segmentation processing on the news text data to obtain a segment sequence corresponding to each news; a word segmentation processing subunit 122, configured to perform word segmentation processing on the segment sequence to obtain a word sequence; an embedding vectorization subunit 123, configured to map each word in the word sequence into a word embedding vector by using an embedding layer of a sequence encoder of the Clip model to obtain a sequence of word embedding vectors; a context encoding subunit 124, configured to perform global-based context semantic encoding on the sequence of word embedded vectors using a Bert model based on a converter of the sequence encoder of the Clip model to obtain a plurality of feature vectors; a concatenation subunit 125, configured to concatenate the plurality of feature vectors to obtain the semantic feature vector.
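By way of a non-limiting illustration, the following sketch shows how the segmentation, word segmentation, embedding, and context-encoding steps of the context encoding unit 120 could be wired together. It is written in PyTorch; the tokenization helper, the vocabulary, and the generic Transformer encoder used here are stand-ins assumed for illustration, not the Clip/Bert encoders recited above.

```python
import torch
import torch.nn as nn

class ContextEncoder(nn.Module):
    """Illustrative stand-in for the context encoder with an embedding layer (unit 120)."""
    def __init__(self, vocab_size=30000, embed_dim=256, num_layers=2, num_heads=4):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, embed_dim)           # embedding layer
        layer = nn.TransformerEncoderLayer(embed_dim, num_heads, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers)        # converter-style encoder

    def forward(self, token_ids):                                      # token_ids: (batch, seq_len)
        word_vectors = self.embedding(token_ids)                       # sequence of word embedding vectors
        feature_vectors = self.encoder(word_vectors)                   # global context semantic encoding
        return feature_vectors.flatten(start_dim=1)                    # cascade into one semantic feature vector

def segment_and_tokenize(news_text, vocab):
    """Hypothetical helper: split a news article into segments, then into word ids."""
    segments = [s for s in news_text.split("\n") if s.strip()]         # segmentation processing
    words = [w for seg in segments for w in seg.split()]               # word segmentation processing
    return torch.tensor([[vocab.get(w, 0) for w in words]])            # sequence of word ids
```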
In this embodiment of the present application, the multi-scale convolutional encoding unit 130 is configured to pass the semantic feature vector through a trained multi-scale neighborhood feature extraction module to obtain a multi-scale text understanding feature vector. It should be considered that a certain association exists not only between adjacent words or sentences in the news text data, but also between non-adjacent words and sentences, and between words and sentences with different spans. That is, when the high-dimensional semantic distribution features of the news text data are extracted, not only the global semantic distribution features of the news text data are extracted, but also the multi-scale neighborhood high-dimensional semantic understanding associated features of the news text data are extracted according to different scales. Specifically, in the technical scheme of the application, the semantic feature vector is passed through a trained multi-scale neighborhood feature extraction module to obtain a multi-scale text understanding feature vector. The multi-scale neighborhood feature extraction module can use two one-dimensional convolution kernels with different scales to carry out convolution coding on the semantic feature vector so as to obtain two feature vectors with different scales, and then the two feature vectors with different scales are cascaded to obtain the multi-scale neighborhood high-dimensional semantic understanding associated features of the news text data.
FIG. 3 illustrates a schematic block diagram of a multi-scale convolutional encoding unit in a natural language data processing system based on a reinforcement learning algorithm in accordance with an embodiment of the present application. As shown in fig. 3, the multi-scale convolutional encoding unit 130 includes: a first scale convolution encoding unit 131, configured to perform one-dimensional convolution encoding on the semantic feature vectors by using a first convolution layer of the multi-scale neighborhood feature extraction module according to the following first convolution formula, so as to obtain the first scale feature vectors;
wherein the first convolution formula is:
Cov(X) = Σ_{a=1}^{w} F(a) · G(X-a)

wherein a is the width of the first convolution kernel in the X direction, F(a) is a first convolution kernel parameter vector, G(X-a) is a local vector matrix operated with the first convolution kernel function, w is the size of the first convolution kernel, X represents the semantic feature vector, and Cov(X) represents one-dimensional convolution encoding of the semantic feature vector; a second scale convolution encoding unit 132, configured to perform one-dimensional convolution encoding on the semantic feature vector by using a second convolution layer of the multi-scale neighborhood feature extraction module according to the following second convolution formula to obtain the second scale feature vector;

wherein the second convolution formula is:

Cov(X) = Σ_{b=1}^{m} F(b) · G(X-b)

wherein b is the width of the second convolution kernel in the X direction, F(b) is a second convolution kernel parameter vector, G(X-b) is a local vector matrix operated with the second convolution kernel function, m is the size of the second convolution kernel, X represents the semantic feature vector, and Cov(X) represents one-dimensional convolution encoding of the semantic feature vector; and a concatenation unit 133, configured to concatenate the first scale feature vector and the second scale feature vector to obtain the multi-scale text understanding feature vector.
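A minimal sketch of the multi-scale neighborhood feature extraction module follows, assuming two one-dimensional convolution layers with kernel sizes w and m (here 3 and 5, chosen arbitrarily) and concatenation of the two resulting scale feature vectors; the channel sizes are placeholders.

```python
import torch
import torch.nn as nn

class MultiScaleNeighborhoodExtractor(nn.Module):
    """Two 1-D convolutions with different kernel sizes, concatenated (unit 130)."""
    def __init__(self, hidden=32, w=3, m=5):
        super().__init__()
        self.conv_w = nn.Conv1d(1, hidden, kernel_size=w, padding=w // 2)  # first scale (size w)
        self.conv_m = nn.Conv1d(1, hidden, kernel_size=m, padding=m // 2)  # second scale (size m)

    def forward(self, semantic_vec):                     # semantic_vec: (batch, length)
        x = semantic_vec.unsqueeze(1)                    # add a channel dimension
        f_w = torch.relu(self.conv_w(x)).flatten(1)      # first scale feature vector
        f_m = torch.relu(self.conv_m(x)).flatten(1)      # second scale feature vector
        return torch.cat([f_w, f_m], dim=-1)             # multi-scale text understanding feature vector
```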
In this embodiment of the present application, the picture convolutional encoding unit 140 is configured to pass the news picture data through a trained convolutional neural network serving as a filter to obtain a picture understanding feature vector. It should be understood that the news picture data in the news data set is image data, which can be encoded by a convolutional neural network having excellent performance in the image field. In addition, considering that each position in the feature map contains local feature information of the position, when the high-dimensional semantic understanding distribution feature of the news picture data is extracted, we pay more attention to the feature representation of the whole picture, and the feature map dimension-reducing feature vector obtained after the news picture data is encoded by the convolutional neural network can compress the local feature information into the whole feature representation. Therefore, the distribution characteristics of the news picture data based on global high-dimensional semantic understanding can be obtained, and the accuracy of the global semantic understanding of the news picture data is improved. Specifically, in the technical scheme of the application, the news picture data is passed through a convolutional neural network which is completed through training and serves as a filter so as to obtain a picture understanding feature vector.
Fig. 4 illustrates a schematic block diagram of a picture convolutional encoding unit in a natural language data processing system based on a reinforcement learning algorithm according to an embodiment of the present application. As shown in fig. 4, the picture convolutional encoding unit 140 includes: an encoding subunit 141, configured to convolutionally encode the news picture data using the convolutional neural network as a filter to obtain a picture understanding feature map; and a dimension reduction subunit 142, configured to perform global average pooling on each feature matrix of the picture understanding feature map along the channel dimension to obtain the picture understanding feature vector. Wherein the encoding subunit 141 is configured to: use each layer of the convolutional neural network model to respectively perform the following operations on the input data in the forward pass of layers: performing three-dimensional convolution processing on the input data based on the convolutional neural network serving as a filter to obtain a convolution feature map; performing mean pooling based on a local feature matrix on the convolution feature map to obtain a pooled feature map; and performing non-linear activation on the pooled feature map to obtain an activated feature map; wherein the output of the last layer of the convolutional neural network serving as the filter is the picture understanding feature map, and the input of the first layer of the convolutional neural network serving as the filter is the news picture data.
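The picture-side "filter" described above could plausibly be realized as stacked convolution, local mean pooling, and activation blocks followed by global average pooling over each channel of the last feature map, as in the sketch below; 2-D convolutions and the listed channel widths are assumptions made for ordinary news images.

```python
import torch
import torch.nn as nn

class PictureFilterCNN(nn.Module):
    """Convolution -> mean pooling -> activation blocks, then global average pooling (unit 140)."""
    def __init__(self, in_channels=3, widths=(16, 32, 64)):
        super().__init__()
        blocks, prev = [], in_channels
        for c in widths:
            blocks += [nn.Conv2d(prev, c, kernel_size=3, padding=1),  # convolution feature map
                       nn.AvgPool2d(kernel_size=2),                   # local mean pooling
                       nn.ReLU()]                                      # non-linear activation
            prev = c
        self.features = nn.Sequential(*blocks)
        self.gap = nn.AdaptiveAvgPool2d(1)                             # global mean over each channel

    def forward(self, images):                                         # images: (batch, 3, H, W)
        fmap = self.features(images)                                   # picture understanding feature map
        return self.gap(fmap).flatten(1)                               # picture understanding feature vector
```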
In this embodiment of the present application, the feature fusion unit 150 is configured to fuse the multi-scale text understanding feature vector and the picture understanding feature vector to obtain a classification feature vector. It should be appreciated that text and pictures are two different information expressions that can complement and enrich each other, and fusing the multi-scale text understanding feature vector and the picture understanding feature vector can provide a more comprehensive and diversified news semantic understanding feature information representation, which helps to better understand and describe the high-dimensional semantic understanding distribution features of the news data to be classified. Specifically, in the technical scheme of the application, the multi-scale text understanding feature vector and the picture understanding feature vector are fused to obtain a classification feature vector.
In a specific embodiment of the present application, the feature fusion unit 150 includes: fusing the multi-scale text understanding feature vector and the picture understanding feature vector using a cascading function to obtain the classification feature vector, wherein the cascading function is formulated as:
f(X_i, X_j) = ReLU(W_f [θ(X_i), φ(X_j)])

wherein W_f, θ(X_i) and φ(X_j) all represent point convolutions of the input, ReLU is the activation function, [·,·] represents the splicing (concatenation) operation, X_i represents the feature values of the respective positions in the multi-scale text understanding feature vector, and X_j represents the feature values of the respective positions in the picture understanding feature vector.
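A hedged sketch of the cascade fusion f(X_i, X_j) = ReLU(W_f [θ(X_i), φ(X_j)]) is given below, with θ, φ and W_f realized as point (per-position linear) convolutions; the feature dimensions are placeholders.

```python
import torch
import torch.nn as nn

class CascadeFusion(nn.Module):
    """f(Xi, Xj) = ReLU(Wf [theta(Xi), phi(Xj)]) with point convolutions (unit 150)."""
    def __init__(self, text_dim=128, pic_dim=64, out_dim=128):
        super().__init__()
        self.theta = nn.Linear(text_dim, out_dim)   # point convolution of the text feature
        self.phi = nn.Linear(pic_dim, out_dim)      # point convolution of the picture feature
        self.w_f = nn.Linear(2 * out_dim, out_dim)  # W_f applied to the spliced features

    def forward(self, text_vec, pic_vec):
        spliced = torch.cat([self.theta(text_vec), self.phi(pic_vec)], dim=-1)  # [theta(Xi), phi(Xj)]
        return torch.relu(self.w_f(spliced))        # classification feature vector
```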
In this embodiment, the feature optimization unit 160 is configured to perform displacement order based on interpolation vectors on the classification feature vectors to obtain optimized classification feature vectors. Considering that feature redundancy exists in the classification feature vector, and that the feature values at different positions in the classification feature vector differ in spatial-position significance, that is, they contribute differently to the final classification judgment, the classification result obtained by passing the classification feature vector through the classifier is poor in accuracy, and its robustness is difficult to guarantee.
In view of the above technical problems, in the technical solution of the present application, vector granularity segmentation is first performed on the classification feature vector, that is, a feature value set of the classification feature vector is subjected to subgroup processing to obtain a plurality of classification local features. In a specific example of the present application, vector segmentation is performed on the classification feature vector to obtain a plurality of classification local feature vectors. Considering that after vector segmentation, each classified local feature vector loses information due to sparsification of feature dimensions, after vector segmentation, the classified local feature vectors are respectively passed through an upsampling module based on linear interpolation to obtain a plurality of dense classified local feature vectors, that is, the characteristic densification of the classified local features is performed through linear interpolation. Further, a displacement offset value of each of the densely classified local feature vectors in a feature vector set formed by the plurality of densely classified local feature vectors is calculated, the displacement offset value representing a quantization outlier of each of the densely classified local feature vectors based on a feature distribution pattern. It will be appreciated that the higher the quantization outliers, the poorer the contribution of each densely classified local feature vector to the final classification decision. Then, the displacement characteristic values of the dense classification local characteristic vectors are arranged into displacement order input vectors, and then the displacement order characteristic vectors are obtained through a displacement order characteristic extractor comprising a one-dimensional convolution layer and a Sigmoid activation layer, namely, high-dimensional implicit mode characteristics among quantized outliers are captured through one-dimensional convolution coding and Sigmoid coding, and class probability domain mapping is carried out through a Sigmoid activation function. Finally, taking the characteristic value of each position in the displacement order characteristic vector as a weight, respectively weighting each classified local characteristic vector to obtain a plurality of weighted classified local characteristic vectors, and carrying out vector splicing on the plurality of weighted classified local characteristic vectors to obtain the optimized classified characteristic vector.
In this way, the displacement order based on the interpolation vector is utilized to perform feature selection, specifically, local features with distinguishing ability in the classification feature vector are enhanced according to the difference of the quantized outliers, and local features with weak distinguishing ability in the classification feature vector are restrained.
FIG. 5 illustrates a schematic block diagram of a feature optimization unit in a natural language data processing system based on a reinforcement learning algorithm in accordance with an embodiment of the present application. As shown in fig. 5, the feature optimization unit 160 includes: a vector segmentation subunit 161, configured to perform vector segmentation on the classification feature vector to obtain a plurality of classification local feature vectors; a sampling subunit 162, configured to pass the plurality of classified local feature vectors through an upsampling module based on linear interpolation to obtain a plurality of densely classified local feature vectors, respectively; a Euclidean distance calculating subunit 163, configured to calculate, for each of the plurality of dense classification local feature vectors, the Euclidean distances between that dense classification local feature vector and all other dense classification local feature vectors to obtain a plurality of Euclidean distance values, and to calculate the sum of the plurality of Euclidean distance values as the displacement feature value of that dense classification local feature vector; a feature activation subunit 164, configured to arrange the displacement feature values of the densely classified local feature vectors into displacement order input vectors, and obtain displacement order feature vectors through a displacement order feature extractor including a one-dimensional convolution layer and a Sigmoid activation layer; a vector weighting subunit 165, configured to respectively weight each classified local feature vector, using the feature value of each position in the displacement order feature vector as a weight, so as to obtain a plurality of weighted classified local feature vectors; and a vector stitching subunit 166, configured to perform vector stitching on the plurality of weighted classified local feature vectors to obtain the optimized classified feature vector.
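The interpolation-vector-based displacement ordering described above could be sketched as follows: the classification feature vector is split into local parts, each part is densified by linear interpolation, pairwise Euclidean distances give the displacement feature values, and a one-dimensional convolution with Sigmoid activation produces the weights. The number of parts, the upsampling factor, and the extractor size are assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class DisplacementOrderOptimizer(nn.Module):
    """Interpolation-vector-based displacement ordering sketch (unit 160); sizes are assumptions."""
    def __init__(self, num_parts=8, dense_scale=2):
        super().__init__()
        self.num_parts = num_parts                                     # assumes dim divisible by num_parts
        self.dense_scale = dense_scale
        self.extractor = nn.Conv1d(1, 1, kernel_size=3, padding=1)     # displacement order feature extractor

    def forward(self, clf_vec):                                        # clf_vec: (batch, dim)
        parts = torch.chunk(clf_vec, self.num_parts, dim=-1)           # classification local feature vectors
        dense = [F.interpolate(p.unsqueeze(1), scale_factor=self.dense_scale,
                               mode="linear", align_corners=False).squeeze(1)
                 for p in parts]                                       # linear-interpolation upsampling
        stack = torch.stack(dense, dim=1)                              # (batch, parts, dense_dim)
        displacement = torch.cdist(stack, stack).sum(dim=-1)           # sum of Euclidean distances per part
        weights = torch.sigmoid(self.extractor(displacement.unsqueeze(1))).squeeze(1)
        weighted = [w.unsqueeze(-1) * p for w, p in zip(weights.unbind(dim=1), parts)]
        return torch.cat(weighted, dim=-1)                             # optimized classification feature vector
```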
In this embodiment of the present application, the result generating unit 170 is configured to pass the optimized classification feature vector through a multi-tag classifier to obtain a classification result, where the classification result is used to represent a classification tag corresponding to the news data set to be classified.
It should be appreciated that training of the context encoder including the embedded layer, the multi-scale neighborhood feature extraction module, and the convolutional neural network as a filter is required prior to news classification using the above-described data-based neural network model. That is, in the natural language data processing system based on the reinforcement learning algorithm according to the embodiment of the present application, the training module is further configured to train the context encoder including the embedded layer, the multi-scale neighborhood feature extraction module, and the convolutional neural network as the filter.
FIG. 6 illustrates a schematic block diagram of training modules in a natural language data processing system based on a reinforcement learning algorithm in accordance with an embodiment of the present application. As shown in fig. 6, the training module 200 includes: a training data obtaining unit 210, configured to obtain training data, where the training data includes the news text data, the news picture data, and a classification tag value corresponding to the news data set to be classified; a training context encoding unit 220, configured to pass the news text data through a context encoder including an embedded layer to obtain training semantic feature vectors; a training multi-scale convolution encoding unit 230, configured to pass the training semantic feature vector through a multi-scale neighborhood feature extraction module to obtain a training multi-scale text understanding feature vector; a training picture convolutional encoding unit 240 for passing the news picture data through a convolutional neural network as a filter to obtain training picture understanding feature vectors; the training feature fusion unit 250 is configured to fuse the training multi-scale text understanding feature vector and the training picture understanding feature vector to obtain a training classification feature vector; a training feature optimization unit 260, configured to perform displacement order based on interpolation vectors on the training classification feature vectors to obtain training optimization classification feature vectors; a classification loss unit 270, configured to pass the training optimized classification feature vector through a classifier to obtain a classification loss function value; a training unit 280, configured to train the context encoder including the embedded layer, the multi-scale neighborhood feature extraction module, and the convolutional neural network as a filter with the classification loss function value.
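As a non-limiting sketch, one supervised step of the training flow of FIG. 6 could be tied together as follows, reusing the illustrative module classes sketched earlier in this description; the reward-driven policy-network update of the reinforcement learning stage is not shown here.

```python
import torch

def train_step(context_encoder, multi_scale, picture_cnn, fusion, feature_optimizer,
               classifier, optimizer, token_ids, images, labels):
    """One supervised training step over the pipeline of FIG. 6 (illustrative only)."""
    semantic = context_encoder(token_ids)                  # training semantic feature vector
    text_feat = multi_scale(semantic)                      # training multi-scale text understanding vector
    pic_feat = picture_cnn(images)                         # training picture understanding vector
    clf_vec = fusion(text_feat, pic_feat)                  # training classification feature vector
    optimized = feature_optimizer(clf_vec)                 # displacement-order optimized feature vector
    loss = torch.nn.functional.cross_entropy(classifier(optimized), labels)  # classification loss value
    optimizer.zero_grad()
    loss.backward()                                        # update encoder, extractor and CNN jointly
    optimizer.step()
    return loss.item()
```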
In this embodiment, the classification loss unit 270 is configured to pass the training optimized classification feature vector through a classifier to obtain a classification loss function value.
In a specific embodiment of the present application, the classification loss unit 270 is configured to: processing the training optimized classification feature vector by using the classifier according to the following calculation formula to obtain the classification result; wherein the calculation formula is: O = softmax{(Wn, Bn) : ... : (W1, B1) | X}, where W1 to Wn are weight matrices, B1 to Bn are bias vectors, and X is the training optimized classification feature vector; and calculating a cross entropy value between the classification result and a true value as the classification loss function value.
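For the classifier and the classification loss, a minimal sketch is shown below, assuming a stack of fully connected layers (W1, B1) ... (Wn, Bn) with the softmax folded into PyTorch's cross-entropy loss; the hidden sizes and class count are placeholders.

```python
import torch
import torch.nn as nn

class MultiLabelClassifier(nn.Module):
    """Stacked fully connected layers (W1,B1 ... Wn,Bn) followed by softmax (units 170 / 270)."""
    def __init__(self, in_dim=128, hidden_dims=(128, 64), num_classes=10):
        super().__init__()
        dims = [in_dim, *hidden_dims]
        self.hidden = nn.Sequential(*[nn.Sequential(nn.Linear(a, b), nn.ReLU())
                                      for a, b in zip(dims[:-1], dims[1:])])
        self.head = nn.Linear(dims[-1], num_classes)

    def forward(self, x):
        return self.head(self.hidden(x))                   # logits; softmax is folded into the loss below

def classification_loss(classifier, optimized_vec, true_labels):
    """Cross entropy between the classification result and the true label value."""
    logits = classifier(optimized_vec)
    return nn.functional.cross_entropy(logits, true_labels)  # softmax + cross entropy value
```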
In summary, the natural language data processing system 100 based on the reinforcement learning algorithm according to the embodiment of the present application has been illustrated. It first collects a news data set to be classified, wherein the news data set to be classified comprises news text data and news picture data, and the features of the news text data and the news picture data are respectively extracted and analyzed through trained convolutional neural network models. The model is then trained and learned using a reinforcement learning algorithm: during training, the system interacts with the environment, selects classification actions according to the current state, and updates the parameters of the policy network according to reward signals so as to optimize classification performance. In this way, faster and more accurate classification of news data can be achieved.
As described above, the natural language data processing system 100 based on the reinforcement learning algorithm according to the embodiment of the present application may be implemented in various terminal devices, for example, a server or the like in which the reinforcement learning algorithm-based natural language data processing algorithm is deployed. In one example, the natural language data processing system 100 based on the reinforcement learning algorithm may be integrated into the terminal device as a software module and/or a hardware module. For example, the reinforcement learning algorithm-based natural language data processing system 100 may be a software module in the operating system of the terminal device, or may be an application developed for the terminal device; of course, the reinforcement learning algorithm-based natural language data processing system 100 can also be one of a plurality of hardware modules of the terminal device.
Alternatively, in another example, the reinforcement learning algorithm-based natural language data processing system 100 and the terminal device may be separate devices, and the reinforcement learning algorithm-based natural language data processing system 100 may be connected to the terminal device through a wired and/or wireless network and transmit interactive information in an agreed data format.
Exemplary method
FIG. 7 illustrates a flow chart of a natural language data processing method based on a reinforcement learning algorithm according to an embodiment of the present application. Fig. 8 illustrates a schematic diagram of a system architecture of a natural language data processing method based on a reinforcement learning algorithm according to an embodiment of the present application. As shown in fig. 7 and 8, a natural language data processing method based on a reinforcement learning algorithm according to an embodiment of the present application includes: S110, collecting a news data set to be classified, wherein the news data set to be classified comprises: news text data and news picture data; S120, passing the news text data through a trained context encoder comprising an embedded layer to obtain semantic feature vectors; S130, passing the semantic feature vector through a trained multi-scale neighborhood feature extraction module to obtain a multi-scale text understanding feature vector; S140, passing the news picture data through a convolutional neural network which is completed through training and serves as a filter so as to obtain picture understanding feature vectors; S150, fusing the multi-scale text understanding feature vector and the picture understanding feature vector to obtain a classification feature vector; S160, performing displacement order based on interpolation vectors on the classification feature vectors so as to obtain optimized classification feature vectors; and S170, passing the optimized classification feature vector through a multi-label classifier to obtain a classification result, wherein the classification result is used for representing classification labels corresponding to the news data set to be classified.
Here, it will be understood by those skilled in the art that the specific functions and operations of the respective steps in the above-described reinforcement learning algorithm-based natural language data processing method have been described in detail in the above description of the reinforcement learning algorithm-based natural language data processing system with reference to fig. 1, and thus, repetitive descriptions thereof will be omitted.
Exemplary electronic device
Next, an electronic device according to an embodiment of the present application is described with reference to fig. 9.
Fig. 9 illustrates a block diagram of an electronic device according to an embodiment of the present application.
As shown in fig. 9, the electronic device 10 includes one or more processors 11 and a memory 12.
The processor 11 may be a Central Processing Unit (CPU) or other form of processing unit having data processing and/or instruction execution capabilities, and may control other components in the electronic device 10 to perform desired functions.
Memory 12 may include one or more computer program products that may include various forms of computer-readable storage media, such as volatile memory and/or non-volatile memory. The volatile memory may include, for example, Random Access Memory (RAM) and/or cache memory (cache), and the like. The non-volatile memory may include, for example, Read-Only Memory (ROM), hard disk, flash memory, and the like. One or more computer program instructions may be stored on the computer-readable storage medium and can be executed by the processor 11 to implement the reinforcement learning algorithm based natural language data processing system and/or other desired functions of the various embodiments of the present application described above.
In one example, the electronic device 10 may further include: an input device 13 and an output device 14, which are interconnected by a bus system and/or other forms of connection mechanisms (not shown).
The input means 13 may comprise, for example, a keyboard, a mouse, etc.
The output device 14 can output various information including a decoded value and the like to the outside. The output means 14 may include, for example, a display, speakers, a printer, and a communication network and remote output devices connected thereto, etc.
The basic principles of the present application have been described above in connection with specific embodiments, however, it should be noted that the advantages, benefits, effects, etc. mentioned in the present application are merely examples and not limiting, and these advantages, benefits, effects, etc. are not to be considered as necessarily possessed by the various embodiments of the present application. Furthermore, the specific details disclosed herein are for purposes of illustration and understanding only, and are not intended to be limiting, as the application is not intended to be limited to the details disclosed herein as such.
It should be understood that the specific examples herein are intended only to facilitate a better understanding of the embodiments of the present application by those skilled in the art and are not intended to limit the scope of the embodiments of the present application. In addition, in various embodiments of the present application, the sequence number of each process does not mean the sequence of execution, and the execution sequence of each process should be determined by its functions and internal logic, and should not constitute any limitation on the implementation process of the embodiments of the present application.
The foregoing is merely specific embodiments of the present application, but the scope of the present application is not limited thereto, and any person skilled in the art can easily think about changes or substitutions within the technical scope of the present application, and the changes and substitutions are intended to be covered by the scope of the present application. Therefore, the protection scope of the present application shall be subject to the protection scope of the claims.

Claims (10)

1. A natural language data processing system based on a reinforcement learning algorithm, comprising:
the data acquisition unit is used for acquiring a news data set to be classified, wherein the news data set to be classified comprises: news text data and news picture data;
the context coding unit is used for enabling the news text data to pass through a trained context coder comprising an embedded layer so as to obtain semantic feature vectors;
the multi-scale convolution coding unit is used for enabling the semantic feature vector to pass through a trained multi-scale neighborhood feature extraction module so as to obtain a multi-scale text understanding feature vector;
the picture convolutional coding unit is used for enabling the news picture data to pass through a convolutional neural network which is completed through training and serves as a filter so as to obtain picture understanding feature vectors;
The feature fusion unit is used for fusing the multi-scale text understanding feature vector and the picture understanding feature vector to obtain a classification feature vector;
the feature optimization unit is used for carrying out displacement order based on interpolation vectors on the classification feature vectors so as to obtain optimized classification feature vectors;
and the result generation unit is used for enabling the optimized classification feature vector to pass through a multi-label classifier to obtain a classification result, wherein the classification result is used for representing classification labels corresponding to the news data set to be classified.
2. The reinforcement learning algorithm-based natural language data processing system of claim 1, wherein the context encoding unit comprises:
the segmentation processing subunit is used for carrying out segmentation processing on the news text data to obtain segment sequences corresponding to each news;
the word segmentation processing subunit is used for carrying out word segmentation processing on the segment sequence to obtain a word sequence;
an embedding vectorization subunit, configured to map each word in the word sequence into a word embedding vector by using an embedding layer of a sequence encoder of the Clip model to obtain a sequence of word embedding vectors;
a context coding subunit, configured to perform global context semantic coding on the sequence of word embedding vectors using a Bert model based on a converter of a sequence encoder of the Clip model to obtain a plurality of feature vectors;
And the cascading subunit is used for cascading the plurality of feature vectors to obtain the semantic feature vectors.
3. The reinforcement learning algorithm-based natural language data processing system of claim 2, wherein the multi-scale convolution encoding unit comprises:
the first scale convolution coding unit is used for respectively carrying out one-dimensional convolution coding on the semantic feature vectors by using a first convolution layer of the multi-scale neighborhood feature extraction module according to the following first convolution formula so as to obtain the first scale feature vectors;
wherein the first convolution formula is:
Cov(X) = Σ_{a=1}^{w} F(a) · G(X − a)
wherein a is the width of the first convolution kernel in the X direction, F(a) is the first convolution kernel parameter vector, G(X − a) is the local vector matrix operated on with the first convolution kernel function, w is the size of the first convolution kernel, X represents the semantic feature vector, and Cov(X) represents the one-dimensional convolution encoding of the semantic feature vector;
the second scale convolution encoding unit is used for carrying out one-dimensional convolution encoding on the semantic feature vector by using a second convolution layer of the multi-scale neighborhood feature extraction module according to the following second convolution formula so as to obtain a second scale feature vector;
wherein the second convolution formula is:
Cov(X) = Σ_{b=1}^{m} F(b) · G(X − b)
wherein b is the width of the second convolution kernel in the X direction, F(b) is the second convolution kernel parameter vector, G(X − b) is the local vector matrix operated on with the second convolution kernel function, m is the size of the second convolution kernel, X represents the semantic feature vector, and Cov(X) represents the one-dimensional convolution encoding of the semantic feature vector;
and the cascading unit is used for cascading the first scale feature vector and the second scale feature vector to obtain the multi-scale text understanding feature vector.
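A hedged sketch of the two-branch multi-scale neighborhood feature extraction of claim 3, assuming the semantic feature vector is laid out as a one-dimensional sequence of token features; the kernel sizes standing in for w and m, and the channel count, are illustrative only.

```python
import torch
import torch.nn as nn

class MultiScaleExtractor(nn.Module):
    """Two 1-D convolution branches with different kernel sizes (claim 3), cascaded at the end."""
    def __init__(self, channels=256, w=3, m=5):
        super().__init__()
        # Cov(X) = sum_a F(a) * G(X - a), realised here as 1-D convolutions of width w and m
        self.conv_w = nn.Conv1d(channels, channels, kernel_size=w, padding=w // 2)
        self.conv_m = nn.Conv1d(channels, channels, kernel_size=m, padding=m // 2)

    def forward(self, x):                               # x: (batch, channels, seq_len)
        first_scale = self.conv_w(x).flatten(1)         # first-scale feature vector
        second_scale = self.conv_m(x).flatten(1)        # second-scale feature vector
        return torch.cat([first_scale, second_scale], dim=1)  # multi-scale text understanding vector

x = torch.randn(1, 256, 32)
multi_scale_vector = MultiScaleExtractor()(x)            # shape (1, 2 * 256 * 32)
```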
4. A natural language data processing system based on a reinforcement learning algorithm according to claim 3, wherein the picture convolution encoding unit comprises:
the coding subunit is used for performing convolutional coding on the news picture data by using the convolutional neural network serving as a filter so as to obtain a picture understanding characteristic diagram;
and the dimension reduction subunit is used for carrying out global mean value pooling on each feature matrix of the picture understanding feature map along the channel dimension so as to obtain the picture understanding feature vector.
5. The reinforcement learning algorithm-based natural language data processing system of claim 4, wherein the encoding subunit is configured to: perform the following operations on the input data in the forward pass through the layers of the convolutional neural network model:
performing three-dimensional convolution processing on the input data based on the convolutional neural network serving as a filter to obtain a convolution feature map;
carrying out mean pooling treatment based on a local feature matrix on the convolution feature map to obtain a pooled feature map; and
non-linear activation is carried out on the pooled feature map so as to obtain an activated feature map;
the output of the last layer of the convolutional neural network serving as the filter is the picture understanding feature map, and the input of the first layer of the convolutional neural network serving as the filter is the news picture data.
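A sketch, for orientation only, of the image branch in claims 4 and 5: each layer applies convolution, local mean pooling, and a non-linear activation, and the final feature map is reduced to a picture understanding feature vector by global mean pooling per channel. Ordinary 2-D convolutions stand in for the three-dimensional convolution processing named in claim 5, and the channel counts and depth are assumptions.

```python
import torch
import torch.nn as nn

class PictureFilter(nn.Module):
    """Stand-in for the trained convolutional neural network used as a filter (claims 4-5)."""
    def __init__(self):
        super().__init__()
        self.layers = nn.Sequential(
            nn.Conv2d(3, 32, kernel_size=3, padding=1),   # convolution on the news picture data
            nn.AvgPool2d(2),                              # mean pooling over a local feature matrix
            nn.ReLU(),                                    # non-linear activation
            nn.Conv2d(32, 64, kernel_size=3, padding=1),
            nn.AvgPool2d(2),
            nn.ReLU(),
        )

    def forward(self, images):                    # images: (batch, 3, H, W)
        feature_map = self.layers(images)         # picture understanding feature map
        return feature_map.mean(dim=(2, 3))       # global mean pooling along spatial dims -> vector

picture_vector = PictureFilter()(torch.randn(1, 3, 224, 224))   # shape (1, 64)
```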
6. The reinforcement learning algorithm-based natural language data processing system of claim 5, wherein the feature fusion unit comprises:
fusing the multi-scale text understanding feature vector and the picture understanding feature vector using a cascading function to obtain the classification feature vector, wherein the cascading function is formulated as:
f(X_i, X_j) = ReLU(W_f[θ(X_i), φ(X_j)])
wherein W_f, θ(X_i) and φ(X_j) all represent point convolutions of the input, ReLU is the activation function, [·,·] represents the splicing operation, X_i represents the feature value of each position in the multi-scale text understanding feature vector, and X_j represents the feature value of each position in the picture understanding feature vector.
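A minimal sketch of the cascading fusion f(X_i, X_j) = ReLU(W_f[θ(X_i), φ(X_j)]) of claim 6. Linear layers stand in for the point (1×1) convolutions θ, φ and the fusion weight W_f, since the two are equivalent when applied to flat feature vectors; all dimensions are assumed.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class CascadeFusion(nn.Module):
    def __init__(self, text_dim, image_dim, out_dim):
        super().__init__()
        self.theta = nn.Linear(text_dim, out_dim)    # point convolution theta(X_i)
        self.phi = nn.Linear(image_dim, out_dim)     # point convolution phi(X_j)
        self.w_f = nn.Linear(2 * out_dim, out_dim)   # fusion weight W_f

    def forward(self, text_vec, image_vec):
        spliced = torch.cat([self.theta(text_vec), self.phi(image_vec)], dim=-1)  # [theta, phi] splicing
        return F.relu(self.w_f(spliced))             # f(X_i, X_j) = ReLU(W_f[theta(X_i), phi(X_j)])

fused = CascadeFusion(512, 64, 256)(torch.randn(1, 512), torch.randn(1, 64))  # classification feature vector
```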
7. The reinforcement learning algorithm-based natural language data processing system of claim 6, wherein the feature optimization unit comprises:
the vector segmentation subunit is used for carrying out vector segmentation on the classification feature vector to obtain a plurality of classified local feature vectors;
the sampling subunit is used for respectively passing the plurality of classified local feature vectors through an up-sampling module based on linear interpolation to obtain a plurality of dense classified local feature vectors;
a Euclidean distance calculating subunit configured to calculate, for each of the plurality of dense classified local feature vectors, the Euclidean distances between that dense classified local feature vector and all other dense classified local feature vectors to obtain a plurality of Euclidean distance values, and to calculate the sum of the plurality of Euclidean distance values as the displacement feature value of that dense classified local feature vector;
a feature activation subunit, configured to arrange the displacement feature values of the dense classified local feature vectors into a displacement order input vector, and to obtain a displacement order feature vector through a displacement order feature extractor that includes a one-dimensional convolution layer and a Sigmoid activation layer;
the vector weighting subunit is used for taking the feature value of each position in the displacement order feature vector as a weight and respectively weighting each classified local feature vector to obtain a plurality of weighted classified local feature vectors;
and the vector splicing subunit is used for carrying out vector splicing on the plurality of weighted classified local feature vectors to obtain the optimized classification feature vector.
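A sketch of the feature optimization of claim 7 under stated assumptions: the classification feature vector is split into local segments, each segment is up-sampled by linear interpolation, a displacement feature value is taken as the sum of Euclidean distances to all other dense segments, a small Conv1d + Sigmoid extractor turns the displacement values into per-segment weights, and the weighted segments are spliced back together. The segment count, up-sampling factor, and extractor width are illustrative choices, not values from the source.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def optimize_classification_vector(v, num_segments=8, upsample=2):
    # 1. vector segmentation into classified local feature vectors
    segments = v.view(num_segments, -1)                                   # (S, d)
    # 2. up-sampling each segment by linear interpolation -> dense local vectors
    dense = F.interpolate(segments.unsqueeze(1), scale_factor=upsample,
                          mode='linear', align_corners=False).squeeze(1)  # (S, d * upsample)
    # 3. displacement feature value: sum of Euclidean distances to all other dense segments
    displacement = torch.cdist(dense, dense).sum(dim=1)                   # (S,)
    # 4. displacement order feature extractor: 1-D convolution + Sigmoid over the displacement vector
    extractor = nn.Sequential(nn.Conv1d(1, 1, kernel_size=3, padding=1), nn.Sigmoid())
    weights = extractor(displacement.view(1, 1, -1)).view(-1)             # (S,) per-segment weights
    # 5. weight each local segment and splice back into the optimized classification feature vector
    return (segments * weights.unsqueeze(1)).reshape(-1)

optimized = optimize_classification_vector(torch.randn(256))
```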
8. The reinforcement learning algorithm-based natural language data processing system of claim 7, further comprising a training module for training the context encoder including an embedding layer, the multi-scale neighborhood feature extraction module, and the convolutional neural network serving as a filter;
wherein the training module includes:
the training data acquisition unit is used for acquiring training data, wherein the training data comprises the news text data, the news picture data and classification tag values corresponding to the news data set to be classified;
the training context coding unit is used for enabling the news text data to pass through a context encoder comprising an embedding layer to obtain a training semantic feature vector;
The training multi-scale convolution coding unit is used for enabling the training semantic feature vector to pass through a multi-scale neighborhood feature extraction module to obtain a training multi-scale text understanding feature vector;
the training picture convolutional coding unit is used for enabling the news picture data to pass through a convolutional neural network serving as a filter so as to obtain a training picture understanding feature vector;
the training feature fusion unit is used for fusing the training multi-scale text understanding feature vector and the training picture understanding feature vector to obtain a training classification feature vector;
the training feature optimization unit is used for performing displacement ordering based on interpolation vectors on the training classification feature vector so as to obtain a training optimized classification feature vector;
the classification loss unit is used for enabling the training optimized classification feature vector to pass through a classifier to obtain a classification loss function value;
and the training unit is used for training the context encoder containing the embedding layer, the multi-scale neighborhood feature extraction module, and the convolutional neural network serving as the filter by using the classification loss function value.
9. The reinforcement learning algorithm-based natural language data processing system of claim 8, wherein the classification loss unit is configured to: process the training optimized classification feature vector by using the classifier according to the following calculation formula to obtain the classification result;
wherein the calculation formula is: O = softmax{(W_n, B_n) : … : (W_1, B_1) | X}, where W_1 to W_n are weight matrices, B_1 to B_n are bias vectors, and X is the training optimized classification feature vector; and
and calculating a cross entropy value between the classification result and a true value as the classification loss function value.
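For orientation only, a sketch of the training step described in claims 8 and 9: the training features pass through a softmax classifier, the cross entropy between the classification result and the true label serves as the classification loss function value, and one optimizer jointly updates all trained components. The full network is collapsed here into a single hypothetical linear stand-in to keep the example short; layer sizes and the learning rate are assumptions.

```python
import torch
import torch.nn as nn

# Hypothetical stand-in for the encoder + fusion network that produces the
# training optimized classification feature vector.
feature_net = nn.Linear(128, 256)
classifier = nn.Linear(256, 10)                        # (W_1, B_1) ... (W_n, B_n) collapsed to one layer
optimizer = torch.optim.Adam(list(feature_net.parameters()) + list(classifier.parameters()), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()                        # cross entropy against the true classification tag

raw_features = torch.randn(16, 128)                    # toy batch of fused training features
labels = torch.randint(0, 10, (16,))                   # classification tag values from the training data

logits = classifier(feature_net(raw_features))         # softmax is applied inside CrossEntropyLoss
loss = loss_fn(logits, labels)                         # classification loss function value
optimizer.zero_grad()
loss.backward()
optimizer.step()                                       # jointly updates all trained components
```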
10. A natural language data processing method based on a reinforcement learning algorithm, characterized by comprising the following steps:
collecting a news data set to be classified, wherein the news data set to be classified comprises: news text data and news picture data;
passing the news text data through a trained context encoder comprising an embedding layer to obtain a semantic feature vector;
passing the semantic feature vector through a trained multi-scale neighborhood feature extraction module to obtain a multi-scale text understanding feature vector;
passing the news picture data through a trained convolutional neural network serving as a filter to obtain a picture understanding feature vector;
fusing the multi-scale text understanding feature vector and the picture understanding feature vector to obtain a classification feature vector;
performing displacement ordering based on interpolation vectors on the classification feature vector to obtain an optimized classification feature vector;
and passing the optimized classification feature vector through a multi-label classifier to obtain a classification result, wherein the classification result is used for representing classification labels corresponding to the news data set to be classified.
CN202311323531.XA 2023-10-13 2023-10-13 Natural language data processing system based on reinforcement learning algorithm Withdrawn CN117422065A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311323531.XA CN117422065A (en) 2023-10-13 2023-10-13 Natural language data processing system based on reinforcement learning algorithm

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202311323531.XA CN117422065A (en) 2023-10-13 2023-10-13 Natural language data processing system based on reinforcement learning algorithm

Publications (1)

Publication Number Publication Date
CN117422065A true CN117422065A (en) 2024-01-19

Family

ID=89522117

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202311323531.XA Withdrawn CN117422065A (en) 2023-10-13 2023-10-13 Natural language data processing system based on reinforcement learning algorithm

Country Status (1)

Country Link
CN (1) CN117422065A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117952650A (en) * 2024-01-30 2024-04-30 和源顺(湖州)工艺品有限公司 Handicraft e-commerce sales management system based on big data

Similar Documents

Publication Publication Date Title
CN115203380B (en) Text processing system and method based on multi-mode data fusion
CN108615036B (en) Natural scene text recognition method based on convolution attention network
CN108959246B (en) Answer selection method and device based on improved attention mechanism and electronic equipment
CN109062901B (en) Neural network training method and device and name entity recognition method and device
CN112733866A (en) Network construction method for improving text description correctness of controllable image
CN111160348A (en) Text recognition method for natural scene, storage device and computer equipment
CN113449528B (en) Address element extraction method and device, computer equipment and storage medium
CN114254071B (en) Querying semantic data from unstructured documents
CN115951883B (en) Service component management system of distributed micro-service architecture and method thereof
CN116245513A (en) Automatic operation and maintenance system and method based on rule base
CN117893859A (en) Multi-mode text image classification method and device, electronic equipment and storage medium
CN115311598A (en) Video description generation system based on relation perception
CN111145914A (en) Method and device for determining lung cancer clinical disease library text entity
CN117422065A (en) Natural language data processing system based on reinforcement learning algorithm
CN117194605A (en) Hash encoding method, terminal and medium for multi-mode medical data deletion
Belharbi et al. Deep neural networks regularization for structured output prediction
CN115565177A (en) Character recognition model training method, character recognition device, character recognition equipment and medium
CN114692624A (en) Information extraction method and device based on multitask migration and electronic equipment
CN113704466B (en) Text multi-label classification method and device based on iterative network and electronic equipment
CN117079292A (en) Method and device for extracting identity card information, computer equipment and storage medium
CN112487231B (en) Automatic image labeling method based on double-image regularization constraint and dictionary learning
CN115270792A (en) Medical entity identification method and device
CN116089605A (en) Text emotion analysis method based on transfer learning and improved word bag model
CN115731600A (en) Small target detection method based on semi-supervision and feature fusion
CN114140858A (en) Image processing method, device, equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
WW01 Invention patent application withdrawn after publication
Application publication date: 20240119