
CN116628202A - Intention recognition method, electronic device, and storage medium - Google Patents


Info

Publication number
CN116628202A
CN116628202A CN202310594584.9A
Authority
CN
China
Prior art keywords
word
sentence
classification probability
training
classification
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310594584.9A
Other languages
Chinese (zh)
Inventor
李志韬
叶童
王健宗
程宁
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Ping An Technology Shenzhen Co Ltd
Original Assignee
Ping An Technology Shenzhen Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Ping An Technology Shenzhen Co Ltd filed Critical Ping An Technology Shenzhen Co Ltd
Priority to CN202310594584.9A
Publication of CN116628202A
Legal status: Pending


Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 - Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 - Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/35 - Clustering; Classification
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 - Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 - Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33 - Querying
    • G06F16/332 - Query formulation
    • G06F16/3329 - Natural language query formulation or dialogue systems
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 - Pattern recognition
    • G06F18/20 - Analysing
    • G06F18/21 - Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 - Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 - Pattern recognition
    • G06F18/20 - Analysing
    • G06F18/24 - Classification techniques
    • G06F18/241 - Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/2415 - Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on parametric or probabilistic models, e.g. based on likelihood ratio or false acceptance rate versus a false rejection rate
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 - Handling natural language data
    • G06F40/30 - Semantic analysis
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 - Computing arrangements based on biological models
    • G06N3/02 - Neural networks
    • G06N3/04 - Architecture, e.g. interconnection topology
    • G06N3/045 - Combinations of networks
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 - Computing arrangements based on biological models
    • G06N3/02 - Neural networks
    • G06N3/08 - Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Computation (AREA)
  • Mathematical Physics (AREA)
  • Computational Linguistics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Evolutionary Biology (AREA)
  • Molecular Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Software Systems (AREA)
  • Databases & Information Systems (AREA)
  • Computing Systems (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Probability & Statistics with Applications (AREA)
  • Human Computer Interaction (AREA)
  • Machine Translation (AREA)

Abstract

The application relates to the technical fields of artificial intelligence, intelligent customer service, and financial transactions, and in particular to an intention recognition method, an electronic device, and a storage medium. The intention recognition method comprises: obtaining target information to be recognized and a pre-trained intention recognition model, the intention recognition model comprising an attention coding layer, an evidence neural network, and an output layer; parsing the target information with the attention coding layer to obtain a global semantic representation and a mean semantic representation; predicting, with the evidence neural network, on each of the two representations to obtain a global classification probability corresponding to the global semantic representation and a mean classification probability corresponding to the mean semantic representation; and integrating the two probabilities with the output layer to obtain an intention recognition result, the intention recognition result representing the dialogue intention category corresponding to the target information. The method can improve the calibration effect of uncertainty calibration and thereby improve the accuracy of intention recognition.

Description

Intention recognition method, electronic device, and storage medium
Technical Field
The application relates to the technical fields of artificial intelligence, intelligent customer service, and financial transactions, and in particular to an intention recognition method, an electronic device, and a storage medium.
Background
Intent recognition refers to analyzing the core needs of a user based on the user's input and outputting the semantic content most relevant to that input. Its accuracy largely determines both the accuracy of search and the intelligence of a dialogue system. Ideally, a model's output is accurate and well calibrated. In business scenarios such as intelligent customer service and financial transactions, a conversation robot or an intelligent customer-service assistant typically performs intent recognition on the dialogue between the user and the machine, so that a response action matching the user's intention can be made.
Uncertainty calibration of an artificial intelligence model refers to correcting the accuracy of the model's predicted uncertainty. When intent recognition is performed on user input and the model produces a low-confidence prediction, the system should not take a resource-wasting response action; instead, for example, the low-confidence prediction should be deferred to a human expert for decision. Therefore, to prevent the artificial intelligence model from responding unreasonably to possibly erroneous recognition results, the related art performs uncertainty calibration on the model. However, improving the accuracy of intent recognition by improving the calibration effect of uncertainty calibration remains a major challenge in the industry.
Disclosure of Invention
The present application aims to solve at least one of the technical problems existing in the prior art. Therefore, the application provides an intention recognition method, electronic equipment and a storage medium, which can improve the calibration effect of uncertainty calibration and further improve the accuracy of intention recognition.
An intention recognition method according to an embodiment of the first aspect of the present application includes:
acquiring target information to be identified and a pre-trained intention recognition model, wherein the intention recognition model comprises an attention coding layer, an evidence neural network and an output layer;
analyzing the target information based on the attention coding layer to obtain global semantic representation and mean semantic representation;
respectively carrying out prediction processing on the global semantic representation and the mean semantic representation based on the evidence neural network to obtain global classification probability corresponding to the global semantic representation and mean classification probability corresponding to the mean semantic representation;
and integrating the global classification probability and the average classification probability based on the output layer to obtain an intention recognition result, wherein the intention recognition result is used for representing a dialogue intention category corresponding to the target information.
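The four claim steps above can be illustrated with a minimal Python sketch. All names here are illustrative assumptions rather than the patent's implementation: the encoder outputs are passed in as plain vectors, `evidence_net` stands in for the evidence neural network, and the output layer's "integration processing" is read as a simple arithmetic mean of the two probability distributions, which the claim does not specify:

```python
import math

def softmax(zs):
    """Convert raw scores into a probability distribution."""
    m = max(zs)
    es = [math.exp(z - m) for z in zs]
    s = sum(es)
    return [e / s for e in es]

def mean_vector(vectors):
    """Element-wise average of a list of equal-length vectors."""
    t = len(vectors)
    return [sum(v[d] for v in vectors) / t for d in range(len(vectors[0]))]

def recognize_intent(token_vectors, cls_vector, evidence_net):
    """Sketch of the four steps: encode, predict twice, integrate."""
    # Global semantic representation: the [CLS]-position vector.
    # Mean semantic representation: average of the per-token vectors.
    global_repr = cls_vector
    mean_repr = mean_vector(token_vectors)

    # Evidence neural network yields a class-probability vector for each.
    p_global = softmax(evidence_net(global_repr))
    p_mean = softmax(evidence_net(mean_repr))

    # Output layer: integrate the two distributions (simple averaging here,
    # an assumption; the patent only says "integration processing").
    p = [(a + b) / 2 for a, b in zip(p_global, p_mean)]
    return p.index(max(p)), p
```

A toy linear `evidence_net` is enough to exercise the pipeline: the returned distribution sums to one and the index of its maximum is the predicted dialogue intention category.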
According to some embodiments of the present application, before the obtaining the target information to be identified and the pre-trained intent recognition model, the method further includes pre-training the intent recognition model, and specifically includes:
selecting a training text sentence from a preset training data set;
extracting semantic features of the training text sentence based on the attention coding layer to obtain a sentence semantic feature vector and a word semantic mean vector;
respectively carrying out predictive training on the sentence meaning feature vector and the word meaning mean vector based on the evidence neural network to obtain sentence meaning classification probability corresponding to the sentence meaning feature vector and word meaning classification probability corresponding to the word meaning mean vector;
performing classification deviation analysis based on the sentence meaning classification probability and the word meaning classification probability to obtain training deviation data;
and iteratively updating the intention recognition model based on the training deviation data until the training deviation data accords with a preset error condition, so as to obtain the intention recognition model after pre-training.
According to some embodiments of the application, the semantic feature extraction is performed on the training text sentence based on the attention encoding layer to obtain a sentence meaning feature vector and a word meaning mean vector, including:
performing word segmentation processing on the training text sentence to obtain a training word sequence, wherein the training word sequence comprises a plurality of word segmentation elements;
extracting semantic features of each word segmentation element based on an attention mechanism of the attention coding layer to obtain sentence meaning feature vectors and a plurality of word meaning feature vectors;
and carrying out mean analysis based on the word sense feature vectors to obtain the word sense mean vector.
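As a sketch of these three sub-steps in Python: whitespace splitting below is only a stand-in assumption for a real word segmenter (for Chinese text a trained segmenter would be used), and the word sense feature vectors would in practice come from the attention coding layer. The mean analysis itself is just an element-wise average:

```python
def segment(sentence):
    """Stand-in for word segmentation: split a sentence into word
    segmentation elements (whitespace split is an assumption)."""
    return sentence.split()

def word_sense_mean(word_vectors):
    """Mean analysis: average the word sense feature vectors element-wise
    to obtain the word sense mean vector."""
    t = len(word_vectors)
    return [sum(v[d] for v in word_vectors) / t for d in range(len(word_vectors[0]))]
```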
According to some embodiments of the application, the training word sequence includes a start sequence bit and an element sequence bit, the start sequence bit is configured with a classification identifier, the element sequence bit is configured with each word segmentation element, the semantic feature extraction is performed on each word segmentation element based on the attention mechanism of the attention encoding layer to obtain sentence meaning feature vectors and a plurality of word meaning feature vectors, and the method includes:
global semantic extraction is carried out on all the word segmentation elements based on a self-attention mechanism of the attention coding layer, and sentence-meaning feature vectors are obtained based on the classification identifiers;
and carrying out local semantic extraction on each word segmentation element based on the attention coding layer, and obtaining a plurality of word sense feature vectors based on each word segmentation element.
According to some embodiments of the application, the classifying bias parsing based on the sentence meaning classifying probability and the word meaning classifying probability to obtain training bias data includes:
constructing a classification loss function based on the sentence meaning classification probability and the word meaning classification probability;
and determining the output value of the classification loss function as the training deviation data.
According to some embodiments of the application, the constructing a classification loss function based on the sentence meaning classification probability and the word meaning classification probability includes:
acquiring an evidence mapping analytic expression corresponding to the evidence neural network;
and constructing the classification loss function based on the word sense classification probability, the sentence sense classification probability and the evidence mapping analytic type.
According to some embodiments of the application, the constructing the classification loss function based on the word sense classification probability, the sentence sense classification probability, and the evidence mapping resolution includes:
constructing a first cross entropy sub-function based on the sentence meaning classification probability and the evidence mapping analytic type;
constructing a second cross entropy sub-function based on the word sense classification probability and the evidence mapping analytic type;
and integrating the first cross entropy sub-function and the second cross entropy sub-function to obtain the classification loss function.
According to some embodiments of the application, the integrating the first cross entropy sub-function with the second cross entropy sub-function to obtain the classification loss function includes:
performing relative entropy analysis based on the word sense classification probability and the sentence sense classification probability to obtain semantic information divergence;
and integrating the first cross entropy sub-function, the second cross entropy sub-function and the semantic information divergence to obtain the classification loss function.
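The patent does not give the closed form of this loss, so the following Python sketch is only one plausible instantiation: a first cross-entropy sub-function on the sentence meaning classification probability, a second on the word sense classification probability, and a relative-entropy (KL-divergence) term as the semantic information divergence. The weighting `lam` is an assumption:

```python
import math

def cross_entropy(p, onehot_label, eps=1e-12):
    """Cross entropy between a predicted distribution and a one-hot label."""
    return -sum(y * math.log(max(q, eps)) for y, q in zip(onehot_label, p))

def kl_divergence(p, q, eps=1e-12):
    """Relative entropy KL(p || q) between two distributions."""
    return sum(a * math.log(max(a, eps) / max(b, eps)) for a, b in zip(p, q))

def classification_loss(p_sentence, p_word, onehot_label, lam=1.0):
    """Sketch of the integrated loss: first cross entropy sub-function
    (sentence) + second cross entropy sub-function (word) + semantic
    information divergence. `lam` is an assumed weighting factor."""
    return (cross_entropy(p_sentence, onehot_label)
            + cross_entropy(p_word, onehot_label)
            + lam * kl_divergence(p_word, p_sentence))
```

Intuitively, the divergence term pushes the word-level and sentence-level predictions toward agreement, so disagreeing predictions are penalized more than agreeing ones.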
In a second aspect, an embodiment of the present application provides an electronic device, including: a memory, a processor, the memory storing a computer program, the processor implementing the method for identifying intent according to any one of the embodiments of the first aspect of the present application when executing the computer program.
In a third aspect, embodiments of the present application provide a computer-readable storage medium storing a computer program that is executed by a processor to implement the intent recognition method according to any one of the embodiments of the first aspect of the present application.
The intention recognition method, the electronic device and the storage medium according to the embodiments of the application at least have the following beneficial effects:
according to the intention recognition method, target information to be recognized and a pre-trained intention recognition model are required to be acquired first, the intention recognition model comprises an attention coding layer, an evidence neural network and an output layer, analysis processing is carried out on the target information based on the attention coding layer to obtain global semantic representation and mean semantic representation, prediction processing is carried out on the global semantic representation and the mean semantic representation respectively based on the evidence neural network to obtain global classification probability corresponding to the global semantic representation and mean classification probability corresponding to the mean semantic representation, integration processing is carried out on the global classification probability and the mean classification probability based on the output layer to obtain an intention recognition result, and the intention recognition result is used for representing dialogue intention types corresponding to the target information. After the intention recognition model obtains the target information to be recognized, the calibration effect of uncertainty calibration can be improved through three layers of processing of the attention coding layer, the evidence neural network and the output layer, and then the accuracy of intention recognition is improved. Under the business scenes of intelligent customer service, financial transaction and the like, the intention recognition is usually realized by utilizing a conversation robot and an intelligent customer service assistant based on the conversation between a user and a machine, so that a response action matched with the intention of the user is made, and the application of the intention recognition method is beneficial to achieving a better man-machine interaction effect.
Additional aspects and advantages of the application will be set forth in part in the description which follows, and in part will be obvious from the description, or may be learned by practice of the application.
Drawings
The foregoing and/or additional aspects and advantages of the application will become apparent and may be better understood from the following description of embodiments taken in conjunction with the accompanying drawings in which:
FIG. 1 is an alternative flow chart of a method for identifying intent of an embodiment of the present application;
FIG. 2 is another alternative flow chart of an intent recognition method in accordance with an embodiment of the present application;
FIG. 3 is an alternative flowchart of step S202 in FIG. 2;
FIG. 4 is an alternative flowchart of step S302 in FIG. 3;
FIG. 5 is an alternative flowchart of step S204 in FIG. 2;
fig. 6 is an alternative flowchart of step S501 in fig. 5;
fig. 7 is an alternative flowchart of step S602 in fig. 6;
fig. 8 is an alternative flowchart of step S703 in fig. 7;
fig. 9 is a schematic diagram of a hardware structure of an electronic device according to an embodiment of the present application.
Detailed Description
Embodiments of the present application are described in detail below, examples of which are illustrated in the accompanying drawings, wherein like or similar reference numerals refer to like or similar elements or elements having like or similar functions throughout. The embodiments described below by referring to the drawings are illustrative only and are not to be construed as limiting the application.
In the description of the present application, "several" means one or more, "a plurality of" means two or more, and greater than, less than, exceeding, etc. are understood to exclude the stated number, while above, below, within, etc. are understood to include it. The descriptions "first" and "second" are only for distinguishing technical features and should not be construed as indicating or implying relative importance, the number of technical features indicated, or the precedence of the technical features indicated.
In the description of the present specification, reference to the terms "one embodiment," "some embodiments," "illustrative embodiments," "examples," "specific examples," or "some examples," etc., means that a particular feature, structure, material, or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of the application. In this specification, schematic representations of the above terms do not necessarily refer to the same embodiments or examples. Furthermore, the particular features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples.
In the description of the present application, unless explicitly defined otherwise, terms such as arrangement, installation, connection, etc. should be construed broadly and the specific meaning of the terms in the present application can be determined reasonably by a person skilled in the art in combination with the specific content of the technical solution. In addition, the following description of specific steps does not represent limitations on the order of steps or logic performed, and the order of steps and logic performed between steps should be understood and appreciated with reference to what is described in the embodiments.
Intent recognition refers to analyzing the core needs of a user based on the user's input and outputting the information most relevant to the query. Its accuracy largely determines both the accuracy of search and the intelligence of a dialogue system. For example, in a recommendation task, incorrect identification of a movie, music, or office-related requirement makes it difficult to find content that meets the user's needs, resulting in a poor user experience. In the related art, the difficulty of intent recognition is that incorrect recognition makes it hard for the machine to understand what the user actually means, so irrelevant responses frequently occur; accurate dialogue intent recognition is therefore a very challenging task.
Uncertainty calibration of an artificial intelligence model refers to correcting the accuracy of the model's predicted uncertainty. When intent recognition is performed on user input and the model produces a low-confidence prediction, the system should not take a resource-wasting response action; instead, for example, the low-confidence prediction should be deferred to a human expert for decision. Therefore, to prevent the artificial intelligence model from responding unreasonably to possibly erroneous recognition results, the related art performs uncertainty calibration on the model. However, improving the accuracy of intent recognition by improving the calibration effect of uncertainty calibration remains a major challenge in the industry.
The present application aims to solve at least one of the technical problems existing in the prior art. Therefore, the application provides an intention recognition method, electronic equipment and a storage medium, which can improve the calibration effect of uncertainty calibration and further improve the accuracy of intention recognition.
The following is a further description based on the accompanying drawings.
An alternative flow chart of the method of the present application for identifying intent as shown with reference to fig. 1 may include, but is not limited to, steps S101 to S104 described below.
Step S101, target information to be identified and a pre-trained intention identification model are obtained, wherein the intention identification model comprises an attention coding layer, an evidence neural network and an output layer;
step S102, analyzing and processing the target information based on the attention coding layer to obtain global semantic representation and mean semantic representation;
step S103, respectively carrying out prediction processing on the global semantic representation and the mean semantic representation based on the evidence neural network to obtain global classification probability corresponding to the global semantic representation and mean classification probability corresponding to the mean semantic representation;
step S104, the global classification probability and the mean classification probability are integrated based on the output layer to obtain an intention recognition result, wherein the intention recognition result is used for representing the dialogue intention category corresponding to the target information.
According to the intention recognition method, target information to be recognized and a pre-trained intention recognition model are acquired first, the intention recognition model comprising an attention coding layer, an evidence neural network and an output layer. The target information is parsed by the attention coding layer to obtain a global semantic representation and a mean semantic representation; the evidence neural network performs prediction on each representation to obtain a global classification probability corresponding to the global semantic representation and a mean classification probability corresponding to the mean semantic representation; and the output layer integrates the two probabilities to obtain an intention recognition result, which represents the dialogue intention category corresponding to the target information. After the intention recognition model receives the target information to be recognized, the three-layer processing of the attention coding layer, the evidence neural network and the output layer improves the calibration effect of uncertainty calibration and, in turn, the accuracy of intention recognition.
In business scenarios such as intelligent customer service and financial transactions, a conversation robot or an intelligent customer-service assistant typically performs intention recognition on the dialogue between a user and the machine, so that a response action matching the user's intention can be made; in such scenarios, the accuracy of intention recognition determines the quality of human-machine interaction. The more accurate the recognition, the more reliably the machine can act on the recognized intention, for example by providing a service. Applying the present intention recognition method to these scenarios can therefore greatly improve the human-machine interaction effect and conveniently support conversation robots and intelligent customer-service assistants.
In step S101 of some embodiments of the present application, it is required to obtain target information to be identified and a pre-trained intent recognition model, where the intent recognition model includes an attention coding layer, an evidence neural network, and an output layer. The target information refers to text information input by the user. The target information to be identified contains semantic content to be identified, and the purpose of intention identification is to analyze the core requirement of the user and output the information most relevant to query input according to the target information input by the user. It should be noted that the intention recognition model is an artificial intelligence model for carrying out intention recognition on target information. It should be appreciated that the types of intent recognition models are various, such as intent recognition models based on convolutional neural networks (Convolutional Neural Networks, CNN), intent recognition models based on Long Short-Term Memory (LSTM), intent recognition models based on LSTM in combination with Attention mechanisms, and the like, and are not limited to the specific embodiments set forth above.
In some exemplary embodiments of the present application, the intent recognition model includes an attention coding layer, an evidence neural network and an output layer, where the attention coding layer processes target information input by a user through an attention mechanism, so as to generate global semantic representation and mean semantic representation, the evidence neural network predicts the global semantic representation and the mean semantic representation respectively, so as to obtain global classification probability corresponding to the global semantic representation and mean classification probability corresponding to the mean semantic representation, and the output layer is used to integrate the global classification probability and the mean classification probability, so as to obtain an intent recognition result.
Referring to fig. 2, the intention recognition method according to some embodiments of the present application further includes pre-training an intention recognition model, including, but not limited to, steps S201 to S205 described below, before step S101.
Step S201, selecting a training text sentence from a preset training data set;
step S202, semantic feature extraction is carried out on training text sentences based on an attention coding layer, and sentence semantic feature vectors and word semantic mean vectors are obtained;
step S203, respectively carrying out predictive training on the sentence characteristic vector and the word sense mean vector based on the evidence neural network to obtain sentence sense classification probability corresponding to the sentence sense characteristic vector and word sense classification probability corresponding to the word sense mean vector;
step S204, classifying deviation analysis is carried out based on sentence meaning classifying probability and word meaning classifying probability, and training deviation data are obtained;
step S205, iteratively updating the intention recognition model based on the training deviation data until the training deviation data accords with a preset error condition, and obtaining a pre-trained intention recognition model.
In step S201 of some embodiments of the present application, a training text sentence is selected from a preset training data set. It should be noted that, the training data set refers to a data set for pre-training the intent recognition model, which includes a plurality of candidate training text sentences, and in the process of pre-training the intent recognition model, the training text sentences need to be selected from the training data set as training materials. It should be noted that the training data set is a preset text sentence data set, and the purpose of the pre-training of the intent recognition model is to promote the calibration effect of the uncertainty calibration of the intent recognition model, so that the plurality of alternative training text sentences contains both some obvious intent and some ambiguous intent.
In step S202 of some embodiments of the present application, semantic feature extraction is performed on the training text sentence based on the attention encoding layer, so as to obtain a sentence meaning feature vector and a word meaning mean vector. In some exemplary embodiments of the application, the attention encoding layer may consist of an artificial intelligence model for semantic recognition, such as a transform-based bi-directional encoder (Bidirectional Encoder Representations from Transformer, BERT). Note that the attention encoding layer may perform semantic feature extraction on each word segmentation element in the training text sentence to obtain a word sense feature vector corresponding to each word segmentation element and a sentence sense feature vector of the training text sentence as a whole. It should be appreciated that artificial intelligence models for semantic recognition may use conventional semantic recognition models. In some embodiments, the ability of the attention encoding layer to extract semantic features from each word segmentation element in the training text sentence may also be obtained by performing semantic training on the artificial intelligence model in the attention encoding layer before step S202.
In some more specific embodiments of the present application, before the training text sentence is input into the attention encoding layer, a classification identifier [CLS] is added to the initial sequence bit of the entire training word sequence x_i as a start marker, so the input form of the training text sentence can be expressed as x_i = {[CLS], w_1, w_2, ..., w_t}. It is clear that, for the training text sentence x_i = {[CLS], w_1, w_2, ..., w_t} supplied at the input end of the attention encoding layer, the semantic feature extraction performed by the attention encoding layer yields a semantic feature sequence h_i = {h_CLS, h_1, h_2, ..., h_t} at the output end, where h_CLS is obtained from the initial sequence bit corresponding to the classification identifier [CLS], i.e., it is the sentence meaning feature vector of the whole training text sentence x_i and can be used for downstream classification tasks, while the vectors h_1, h_2, ..., h_t are the word sense feature vectors corresponding to the word segmentation elements w_1, w_2, ..., w_t of the training text sentence x_i. It should be noted that semantic feature extraction on the training text sentence based on the attention encoding layer to obtain the sentence meaning feature vector and the word sense mean vector can be implemented in various ways, which may include, but are not limited to, the specific examples mentioned above.
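As an illustrative sketch (not the patent's implementation), the split of an encoder output sequence h_i = {h_CLS, h_1, ..., h_t} into the sentence meaning feature vector and the word sense feature vectors can be expressed as follows; the sequence length, feature dimension, and random encoder outputs are assumptions for illustration:

```python
import numpy as np

# Hypothetical encoder output for x_i = {[CLS], w_1, ..., w_t}:
# one d-dimensional vector per sequence position (here t = 4 tokens, d = 8).
t, d = 4, 8
rng = np.random.default_rng(0)
h = rng.normal(size=(t + 1, d))  # h_i = {h_CLS, h_1, ..., h_t}

h_cls = h[0]        # sentence meaning feature vector (from the [CLS] position)
word_vecs = h[1:]   # word sense feature vectors h_1, ..., h_t

assert h_cls.shape == (d,)
assert word_vecs.shape == (t, d)
```

The [CLS] position is read off before the per-token vectors precisely because it occupies the initial sequence bit of the input.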
In step S203 of some embodiments of the present application, prediction training is performed on the sentence meaning feature vector and the word sense mean vector based on the evidence neural network to obtain a sentence meaning classification probability corresponding to the sentence meaning feature vector and a word sense classification probability corresponding to the word sense mean vector. It should be noted that the evidence neural network (Evidential Neural Network, ENN) was developed based on the evidence framework of Dempster-Shafer evidence theory (DST) and subjective logic (SL). Its function is to output the probability distribution over the intention categories according to the sentence meaning feature vector or the word sense mean vector.
In some more specific embodiments of the present application, assume that a certain semantic feature vector h_i is input into the evidence neural network ENN, f(·) represents the output of an ENN classification layer, and g(·) represents a function that makes the output of f non-negative; the ENN classification layer can then be expressed as f(h_i) = W_e · h_i + b and g(f(h_i)) = σ(f(h_i)) = e_i, where W_e represents the weights of the ENN classification layer, b the bias term, and σ the sigmoid function. If e_ik represents the evidence of sample h_i on the k-th category, the evidence output by the ENN classification layer is e_i = g(f(h_i)). It is noted that the evidence e_i is defined as a measure of the amount of support collected from the data for classifying a sample into a class; therefore e_ik, the evidence of training text sentence x_i on the k-th category, can reflect the probability distribution of sample x_i over the k-th category, and the probability predicted on the k-th intent category is the mean of the corresponding Dirichlet distribution, calculated as p_ik = α_ik / S_i, where α_ik = e_ik + 1 and S_i = Σ_{k=1}^{K} α_ik.
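A minimal numerical sketch of such an ENN classification layer, with the weight matrix, bias, and feature vector randomly initialized purely for illustration (the sigmoid non-negativity mapping follows the text; evidential models in the literature also use softplus or ReLU):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

K, d = 3, 8                      # K intent categories, feature dimension d
rng = np.random.default_rng(1)
W_e = rng.normal(size=(K, d))    # weights of the ENN classification layer
b = rng.normal(size=K)           # bias term
h_i = rng.normal(size=d)         # a sentence meaning feature vector

e_i = sigmoid(W_e @ h_i + b)     # evidence e_i = g(f(h_i)), non-negative
alpha = e_i + 1.0                # Dirichlet parameters alpha_ik = e_ik + 1
S_i = alpha.sum()                # Dirichlet strength
p_i = alpha / S_i                # predicted probability: mean of the Dirichlet

assert np.all(e_i >= 0)
assert np.isclose(p_i.sum(), 1.0)
```

Because each α_ik is at least 1, the predicted distribution p_i is always a proper probability distribution, with more total evidence pulling it further from uniform.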
In some exemplary embodiments of the present application, the sentence meaning feature vector characterizes the overall sentence meaning across the word segmentation elements of the training text sentence, so inputting the sentence meaning feature vector into the ENN classification layer yields the corresponding sentence meaning classification probability, which reflects the intention classification probability of the training text sentence. The word sense mean vector characterizes the semantic information obtained by summing and averaging the components of the training text sentence; in some embodiments, the word sense mean vector is obtained by summing and averaging the word sense feature vectors corresponding to the word segmentation elements, and inputting the word sense mean vector into the ENN classification layer yields the corresponding word sense classification probability, which reflects the intention classification probability of the training text sentence in the mean dimension.
In steps S204 to S205 of some embodiments of the present application, classification deviation analysis is performed based on the sentence meaning classification probability and the word meaning classification probability to obtain training deviation data, and the intention recognition model is then iteratively updated based on the training deviation data until the training deviation data meets a preset error condition, so as to obtain the pre-trained intention recognition model. It is emphasized that the sentence meaning feature vector reflects the semantic information conveyed by the training text sentence as a whole, while the word sense mean vector reflects the semantic information obtained by summing and averaging its constituent parts; both can represent the sentence information of the training text sentence. The closer the sentence meaning feature vector is to the word sense mean vector, the more confident the attention encoding layer is in its semantic recognition of the training text sentence; conversely, a large difference between them indicates that the attention encoding layer has difficulty delivering a confident semantic recognition result. Therefore, in order to improve the calibration effect of the uncertainty calibration of the intention recognition model, classification deviation analysis is performed based on the sentence meaning classification probability and the word meaning classification probability to obtain training deviation data, and the intention recognition model is iteratively updated based on the training deviation data until the training deviation data meets the preset error condition, thereby obtaining the pre-trained intention recognition model.
Through steps S201 to S205 of the embodiment of the present application, as the iterative updating proceeds, the recognition accuracy of the sentence meaning classification probability and the word meaning classification probability improves. Under the condition that both are reliable, the closer the sentence meaning classification probability is to the word meaning classification probability, the more confident the attention encoding layer is in its semantic recognition of the training text sentence; a large difference between them indicates that the attention encoding layer has difficulty delivering a confident semantic recognition result. Therefore, the calibration effect of the uncertainty calibration of the attention encoding layer can be improved, and the accuracy of intention recognition improved.
Referring to fig. 3, step S202 may include, but is not limited to, steps S301 to S303 described below, according to an intention recognition method of some embodiments of the present application.
Step S301, word segmentation processing is carried out on the training text sentence, a training word sequence is obtained, and the training word sequence comprises a plurality of word segmentation elements;
step S302, semantic feature extraction is carried out on each word segmentation element based on the attention mechanism of the attention coding layer, and sentence meaning feature vectors and a plurality of word meaning feature vectors are obtained;
Step S303, carrying out mean analysis based on the plurality of word sense feature vectors to obtain a word sense mean vector.
In step S301 of some embodiments of the present application, word segmentation is performed on the training text sentence to obtain a training word sequence, where the training word sequence includes a plurality of word segmentation elements. In some embodiments, the training text sentence may already be in a preset word sequence format, such as x_i = {w_1, w_2, ..., w_t}; in other embodiments, the training text sentence is a text sentence extracted from various text materials, for example, "xxxx, xxx, xxxx". Text sentences extracted from various text materials have the advantages of broad source material and a low cost of constructing the training data set, with the disadvantage that they cannot be directly input into the attention encoding layer. Therefore, when the training text sentence is a text sentence extracted from various text materials, in some embodiments of the present application, word segmentation processing needs to be performed on the training text sentence to obtain a training word sequence comprising a plurality of word segmentation elements. Word segmentation is the process of recombining a continuous character sequence into a word sequence according to a certain specification. Because spaces and punctuation marks naturally delimit each vocabulary item in the expression habits of languages such as English and German, segmentation of such sentences often uses spaces or other punctuation as the basis. Chinese word segmentation refers to segmenting a sequence of Chinese characters into individual words; because Chinese characters are square characters and a continuous subsequence in Chinese expression habits may contain multiple words, Chinese word segmentation is more difficult than English word segmentation.
It should be appreciated that the word segmentation process for training text sentences may be implemented in a variety of ways and is not limited to the specific embodiments set forth above.
In step S302 of some embodiments of the present application, semantic feature extraction is performed on each word segmentation element based on the attention mechanism of the attention encoding layer, so as to obtain a sentence meaning feature vector and a plurality of word meaning feature vectors. It is emphasized that the attention encoding layer may perform semantic feature extraction on each word segmentation element in the training text sentence to obtain a word sense feature vector corresponding to each word segmentation element and a sentence sense feature vector of the training text sentence as a whole. In some embodiments, the intention recognition is performed on the training text sentence based on the attention encoding layer, and the obtained training recognition result is the word sense feature vector extracted for each word segmentation element and the sentence sense feature vector extracted for the whole training text sentence. Wherein, since the training word sequence comprises a plurality of word segmentation elements, the number of word sense feature vectors extracted for each word segmentation element is also a plurality.
In some more specific embodiments of the present application, before the training text sentence is input into the attention encoding layer, a start marker character [CLS] is added to the entire training word sequence x_i, so the input form of the whole model can be expressed as x_i = {[CLS], w_1, w_2, ..., w_t}. It should be noted that, for the training word sequence x_i with the start marker [CLS] added, the sequence h_i = {h_CLS, h_1, h_2, ..., h_t} obtained at the output end of the attention encoding layer under its semantic feature extraction is such that h_CLS extracts the overall meaning of w_1, w_2, ..., w_t; therefore the vector corresponding to h_CLS can be taken as the sentence meaning feature vector of the whole sentence x_i and used for downstream classification tasks, while the vectors h_1, h_2, ..., h_t are the word sense feature vectors corresponding respectively to the word segmentation elements w_1, w_2, ..., w_t (the correspondence between word segmentation elements and word sense feature vectors is represented in the index numbers). It should be noted that semantic feature extraction on each word segmentation element based on the attention mechanism of the attention encoding layer to obtain the sentence meaning feature vector and the plurality of word sense feature vectors can be implemented in various ways, which may include, but are not limited to, the specific examples mentioned above.
In step S303 of some embodiments of the present application, mean analysis is performed based on the plurality of word sense feature vectors to obtain the word sense mean vector. It should be noted that semantic feature extraction is performed on each word segmentation element based on the self-attention mechanism of the attention encoding layer to obtain the sentence meaning feature vector and the plurality of word sense feature vectors. Semantic recognition of each word segmentation element in the training word sequence yields word sense feature vectors reflecting the semantics of each word segmentation element, and the word sense mean vector is obtained by summing and averaging the word sense feature vectors. In some more specific embodiments, if the input form of the training text sentence is expressed as x_i = {[CLS], w_1, w_2, ..., w_t}, the number of word segmentation elements is t, and the word sense feature vectors obtained by extracting semantic features from each word segmentation element based on the self-attention mechanism of the attention encoding layer are h_1, h_2, ..., h_t, then the analytical formula of the word sense mean vector h̄_i can be expressed as: h̄_i = (1/t) · (h_1 + h_2 + ... + h_t).
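The mean analysis of step S303 reduces, with toy values assumed purely for illustration, to a single averaging operation over the word sense feature vectors:

```python
import numpy as np

t, d = 4, 3
h = np.arange(t * d, dtype=float).reshape(t, d)  # word sense vectors h_1..h_t
h_mean = h.mean(axis=0)                          # (1/t) * (h_1 + ... + h_t)
# h_mean -> [4.5, 5.5, 6.5]

assert np.allclose(h_mean, h.sum(axis=0) / t)
```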
The sentence meaning feature vector and the word sense mean vector are obtained by the method shown in steps S301 to S303 of the embodiment of the present application, and the reliability with which the attention encoding layer extracts the sentence meaning feature vector and the word sense mean vector can be improved as the iterative updating proceeds. It must be clear that the reliability of the sentence meaning feature vector and the word sense mean vector is a precondition for uncertainty calibration. Under the condition that both are reliable, the closer the sentence meaning feature vector is to the word sense mean vector, the more confident the attention encoding layer is in its semantic recognition of the training text sentence; a large difference between them indicates that the attention encoding layer has difficulty delivering a confident semantic recognition result.
Referring to fig. 4, according to some embodiments of the present application, the training word sequence includes a start sequence bit configured with a classification identifier and an element sequence bit configured with respective word segmentation elements, and step S302 may include, but is not limited to, steps S401 to S402 described below.
Step S401, global semantic extraction is carried out on all word segmentation elements based on a self-attention mechanism of an attention coding layer, and sentence meaning feature vectors are obtained based on a classification identifier;
step S402, carrying out local semantic extraction on each word segmentation element based on the attention coding layer, and obtaining a plurality of word sense feature vectors based on each word segmentation element.
In steps S401 to S402 of some embodiments of the present application, global semantic extraction is performed on all word segmentation elements based on the self-attention mechanism of the attention encoding layer, the sentence meaning feature vector is obtained based on the classification identifier, local semantic extraction is performed on each word segmentation element based on the attention encoding layer, and a plurality of word sense feature vectors are obtained based on each word segmentation element. It should be noted that, in some more specific embodiments of the present application, before the training text sentence is input into the attention encoding layer, a classification identifier [CLS] is added to the initial sequence bit of the entire training word sequence x_i as a start marker, and the word segmentation elements w_1, w_2, ..., w_t are added to the element sequence bits of the training word sequence x_i, so the input form of the training text sentence can be expressed as x_i = {[CLS], w_1, w_2, ..., w_t}. It is clear that, for the training text sentence x_i = {[CLS], w_1, w_2, ..., w_t} supplied at the input end of the attention encoding layer, the semantic feature extraction performed by the attention encoding layer yields a semantic feature sequence h_i = {h_CLS, h_1, h_2, ..., h_t} at the output end, where h_CLS is obtained from the initial sequence bit corresponding to the classification identifier [CLS], i.e., it is the sentence meaning feature vector of the whole sentence x_i and can be used for downstream classification tasks, while the vectors h_1, h_2, ..., h_t are the word sense feature vectors corresponding to the word segmentation elements w_1, w_2, ..., w_t of the training text sentence x_i.
It should be noted that, the implementation manner of extracting semantic features from the training text sentence based on the attention encoding layer to obtain the sentence meaning feature vector and the word meaning mean vector is various, and may include, but not limited to, the specific examples mentioned above.
Through steps S401 to S402 of the embodiment of the present application, reliable sentence meaning feature vectors and a plurality of word sense feature vectors are obtained, which helps improve the calibration effect of the uncertainty calibration of the intention recognition model, thereby improving the accuracy of intention recognition.
Referring to fig. 5, step S204 may include, but is not limited to, steps S501 to S502 described below, according to some embodiments of the present application.
Step S501, constructing a classification loss function based on sentence meaning classification probability and word meaning classification probability;
Step S502, determining an output value of the classification loss function as training deviation data.
In steps S501 to S502 of some embodiments of the present application, a classification loss function is constructed based on the sentence meaning classification probability and the word meaning classification probability, and then the output value of the classification loss function is determined as the training deviation data. In some exemplary embodiments of the present application, the sentence meaning feature vector characterizes the overall sentence meaning across the word segmentation elements of the training text sentence, so inputting the sentence meaning feature vector into the ENN classification layer yields the corresponding sentence meaning classification probability, which reflects the intention classification probability of the training text sentence. The word sense mean vector characterizes the semantic information obtained by summing and averaging the components of the training text sentence; in some embodiments, the word sense mean vector is obtained by summing and averaging the word sense feature vectors corresponding to the word segmentation elements, and inputting the word sense mean vector into the ENN classification layer yields the corresponding word sense classification probability, which reflects the intention classification probability of the training text sentence in the mean dimension.
It is emphasized that the sentence meaning feature vector reflects the semantic information conveyed by the training text sentence as a whole, while the word sense mean vector reflects the semantic information obtained by summing and averaging its constituent parts; both can represent the sentence information of the training text sentence. The closer the sentence meaning feature vector is to the word sense mean vector, the more confident the attention encoding layer is in its semantic recognition of the training text sentence; a large difference between them indicates that the attention encoding layer has difficulty delivering a confident semantic recognition result. It is clear that the classification loss function is used to calculate the difference between the sentence meaning classification probability and the word meaning classification probability; in the process of continuous iterative training, the output value of the classification loss function is determined as the training deviation data, i.e., in each iteration round the intention recognition model is iteratively updated based on the training deviation data until the training deviation data meets the preset error condition, so as to obtain the pre-trained intention recognition model.
Therefore, through the classification loss function constructed in steps S501 to S502 of the embodiment of the present application, and by determining the output value of the classification loss function as the training deviation data, the intention recognition model can be trained iteratively, so that the calibration effect of the uncertainty calibration of the intention recognition model can be improved and the accuracy of intention recognition improved.
Referring to fig. 6, step S501 may include, but is not limited to, steps S601 to S602 described below, according to some embodiments of the present application.
Step S601, obtaining an evidence mapping analytic expression corresponding to an evidence neural network;
step S602, a classification loss function is constructed based on the word sense classification probability, the sentence sense classification probability and the evidence mapping analytic expression.
In steps S601 to S602 of some embodiments of the present application, the evidence mapping analytic expression corresponding to the evidence neural network is first obtained, and then the classification loss function is constructed based on the word sense classification probability, the sentence sense classification probability and the evidence mapping analytic expression. It should be noted that, for the K-class classification problem, the ENN provides a belief mass b_k for any class k ∈ {1, 2, ..., K} and an overall uncertainty u, which satisfy the condition u + Σ_{k=1}^{K} b_k = 1, with b_k and u non-negative. Evidence is defined as a measure of the amount of support collected from the data for classifying a sample into a class. Let e_ik represent the evidence of sample x_i on the k-th category; the belief mass b_ik and the uncertainty u_i are then calculated as: b_ik = e_ik / S_i and u_i = K / S_i, where S_i = Σ_{k=1}^{K} (e_ik + 1).
This assignment of belief mass (i.e., the subjective opinion) is assumed to follow a Dirichlet distribution with parameters α_ik = e_ik + 1, where S_i = Σ_{k=1}^{K} α_ik is known as the Dirichlet strength.
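Under these definitions, the belief masses and the uncertainty always sum to one, which the following toy computation (evidence values chosen purely for illustration) confirms:

```python
import numpy as np

e = np.array([4.0, 1.0, 0.0])   # evidence collected for K = 3 classes
K = e.size
alpha = e + 1.0                 # alpha_k = e_k + 1
S = alpha.sum()                 # Dirichlet strength, here 8.0

b = e / S                       # belief mass per class
u = K / S                       # overall uncertainty

assert np.isclose(b.sum() + u, 1.0)  # u + sum_k b_k == 1
```

Note that a sample with little total evidence yields a large u, so the uncertainty shrinks only as supporting evidence accumulates.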
Thus, the density function of the Dirichlet distribution with K parameters α = [α_1, α_2, ..., α_K] is expressed as: D(p | α) = (1 / B(α)) · Π_{k=1}^{K} p_k^{α_k − 1} for p ∈ S_K, and 0 otherwise, where S_K is the K-dimensional unit simplex, S_K = {p | Σ_{k=1}^{K} p_k = 1 and 0 ≤ p_1, ..., p_K ≤ 1}, and B(α) is the K-dimensional multivariate beta function.
In some more specific embodiments of the present application, assume that a certain semantic feature vector h_i is input into the evidence neural network ENN, f(·) represents the output of an ENN classification layer, and g(·) represents a function that makes the output of f non-negative; the ENN classification layer can then be expressed as f(h_i) = W_e · h_i + b and g(f(h_i)) = σ(f(h_i)) = e_i, where W_e represents the weights of the ENN classification layer, b the bias term, and σ the sigmoid function. If e_ik represents the evidence of sample h_i on the k-th category, the evidence output by the ENN classification layer is e_i = g(f(h_i)). It is noted that the evidence e_i is defined as a measure of the amount of support collected from the data for classifying a sample into a class; therefore e_ik, the evidence of training text sentence x_i on the k-th category, can reflect the probability distribution of sample x_i over the k-th category, and the probability predicted on the k-th intent category is the mean of the corresponding Dirichlet distribution, calculated as p_ik = α_ik / S_i with α_ik = e_ik + 1. Therefore, integrating the above formulas, the evidence mapping analytic expression corresponding to the evidence neural network can be obtained as: L_ENN(x_i) = Σ_{k=1}^{K} y_ik · (ψ(S_i) − ψ(α_ik)), where y_ik is the label of x_i on the k-th category and ψ(·) is the digamma function.
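As a hedged sketch of how such an evidence-based loss behaves, the following uses the type-II maximum-likelihood variant Σ_k y_ik · (log S_i − log α_ik) from the evidential deep learning literature — an assumed concrete form for illustration, not necessarily the exact expression of the model — and shows that it rewards evidence placed on the correct class:

```python
import math

def enn_loss(evidence, y_onehot):
    # Type-II maximum-likelihood Dirichlet loss:
    # sum_k y_k * (log S - log alpha_k), with alpha_k = e_k + 1.
    alpha = [e + 1.0 for e in evidence]
    S = sum(alpha)
    return sum(y * (math.log(S) - math.log(a)) for y, a in zip(y_onehot, alpha))

y = [1.0, 0.0, 0.0]                       # true class is class 0
loss_good = enn_loss([9.0, 0.0, 0.0], y)  # evidence on the correct class
loss_bad = enn_loss([0.0, 9.0, 0.0], y)   # evidence on a wrong class

assert loss_good < loss_bad
```

The loss is small when the Dirichlet mean concentrates probability on the labeled class, and large when the evidence points elsewhere.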
Through the evidence mapping analytic expression of steps S601 to S602 of the embodiment of the present application used to construct the classification loss function, and by determining the output value of the classification loss function as the training deviation data, the intention recognition model can be trained iteratively, so that the calibration effect of the uncertainty calibration of the intention recognition model can be improved and the accuracy of intention recognition improved.
Referring to fig. 7, step S602 includes:
step S701, constructing a first cross entropy sub-function based on sentence meaning classification probability and an evidence mapping analytic expression;
step S702, constructing a second cross entropy sub-function based on the word sense classification probability and the evidence mapping analytic expression;
in step S703, the first cross entropy sub-function and the second cross entropy sub-function are integrated to obtain a classification loss function.
In steps S701 to S703 of some embodiments of the present application, a first cross entropy sub-function is constructed based on the sentence meaning classification probability and the evidence mapping analytic expression, a second cross entropy sub-function is then constructed based on the word sense classification probability and the evidence mapping analytic expression, and the first and second cross entropy sub-functions are further integrated to obtain the classification loss function. It should be noted that, in some more specific embodiments, the evidence mapping analytic expression corresponding to the evidence neural network is: L_ENN(x_i) = Σ_{k=1}^{K} y_ik · (ψ(S_i) − ψ(α_ik)), where y_ik is the label of x_i on the k-th category and ψ(·) is the digamma function.
On this basis, substituting the sentence meaning classification probability, derived from the evidence e_ik^CLS, into the evidence mapping analytic expression L_ENN constructs the corresponding first cross entropy sub-function Loss1.
Similarly, substituting the word sense classification probability, derived from the evidence of the word sense mean vector, into the evidence mapping analytic expression L_ENN constructs the corresponding second cross entropy sub-function Loss2.
Therefore, after the first cross entropy sub-function and the second cross entropy sub-function are constructed, the first cross entropy sub-function and the second cross entropy sub-function are further integrated, and then the classification loss function can be obtained.
Through the classification loss function constructed in steps S701 to S703 of the embodiment of the present application, and by determining the output value of the classification loss function as the training deviation data, the intention recognition model can be trained iteratively, so that the calibration effect of the uncertainty calibration of the intention recognition model can be improved and the accuracy of intention recognition improved.
Referring to fig. 8, step S703 may include, but is not limited to, steps S801 to S802 described below, according to some embodiments of the present application.
Step S801, carrying out relative entropy analysis based on word sense classification probability and sentence sense classification probability to obtain semantic information divergence;
step S802, integrating the first cross entropy sub-function, the second cross entropy sub-function and the semantic information divergence to obtain a classification loss function.
In steps S801 to S802 of the embodiment of the present application, relative entropy analysis is performed based on the word sense classification probability and the sentence sense classification probability to obtain the semantic information divergence, and then the first cross entropy sub-function, the second cross entropy sub-function and the semantic information divergence are integrated to obtain the classification loss function. It is emphasized that the sentence meaning feature vector reflects the semantic information conveyed by the training text sentence as a whole, while the word sense mean vector reflects the semantic information obtained by summing and averaging its constituent parts; both can represent the sentence information of the training text sentence. The closer the sentence meaning feature vector is to the word sense mean vector, the more confident the attention encoding layer is in its semantic recognition of the training text sentence; a large difference between them indicates that the attention encoding layer has difficulty delivering a confident semantic recognition result. It is clear that the classification loss function is used to calculate the difference between the sentence meaning classification probability and the word meaning classification probability; in the process of continuous iterative training, the output value of the classification loss function is determined as the training deviation data, i.e., in each iteration round the intention recognition model is iteratively updated based on the training deviation data until the training deviation data meets the preset error condition, so as to obtain the pre-trained intention recognition model.
In some preferred embodiments of the present application, calculating the KL divergence between the classification probability distributions underlying the first cross entropy sub-function and the second cross entropy sub-function enables the intention recognition model to tend toward a training result with high accuracy and a good calibration effect, and can effectively alleviate overfitting: Loss3 = KL(p_i^CLS ∥ p_i^mean) = Σ_{k=1}^{K} p_ik^CLS · log(p_ik^CLS / p_ik^mean), where p_i^CLS and p_i^mean denote the sentence meaning classification probability and the word meaning classification probability, respectively.
Therefore, in some more specific embodiments of the present application, the first cross entropy sub-function, the second cross entropy sub-function and the semantic information divergence are integrated to obtain a classification loss function, which is specifically:
Total_loss = α · Loss1 + β · Loss2 + γ · Loss3
where α, β and γ are hyperparameters. It should be noted that there are generally two types of parameters in a machine learning model. One type needs to be learned and estimated from the data and is called a model parameter (Parameter), i.e., a parameter of the model itself; for example, the weighting coefficient (slope) and the deviation term (intercept) of a linear regression line are model parameters. The other type consists of tuning parameters (Tuning Parameters) in machine learning algorithms, which need to be set flexibly according to existing experience and are called hyperparameters, such as the regularization coefficient λ or the depth of the tree in a decision tree model. A hyperparameter is also a parameter and has the properties of a parameter, such as being unknown, i.e., it is not a known constant; it is a manually configurable setting that is assigned a suitable value based on existing experience rather than being obtained through system learning.
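Putting the pieces together, a toy computation of Loss3 as the KL divergence between the two classification probability distributions and of the weighted total loss can be sketched as follows (all numeric values, including the sub-loss values and hyperparameter settings, are illustrative assumptions):

```python
import math

def kl_divergence(p, q):
    # KL(p || q) for categorical distributions with strictly positive entries.
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))

p_cls = [0.7, 0.2, 0.1]    # sentence meaning classification probability
p_mean = [0.6, 0.3, 0.1]   # word meaning classification probability

loss1, loss2 = 0.42, 0.57  # placeholder cross entropy sub-function values
loss3 = kl_divergence(p_cls, p_mean)

a, b, g = 1.0, 1.0, 0.1    # hyperparameters alpha, beta, gamma
total = a * loss1 + b * loss2 + g * loss3

assert loss3 > 0.0         # KL divergence is non-negative, zero only if equal
```

Driving loss3 toward zero pulls the two branch predictions together, which is exactly the agreement the uncertainty calibration relies on.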
In the embodiments of the present application, the classification loss function is constructed in steps S801 to S802, and the output value of the classification loss function is determined as the training deviation data, so that the intention recognition model can be trained iteratively; this can improve the calibration effect of the uncertainty calibration of the intention recognition model, and thereby improve the accuracy of intention recognition.
In step S102 of some embodiments of the present application, the attention encoding layer is used to parse the target information to obtain a global semantic representation and a mean semantic representation. It should be noted that the attention encoding layer processes the target information input by the user through an attention mechanism and parses the target information, thereby generating the global semantic representation and the mean semantic representation. The target information refers to text information input by the user. In the attention encoding layer, the target information in text form can be subjected to word segmentation processing to obtain a corresponding target word sequence composed of a plurality of target word elements. Overall semantic recognition is then performed on the target word sequence to obtain a global semantic representation reflecting the meaning of the sentence as a whole; meanwhile, local semantic recognition is performed on each target word element in the target word sequence to obtain an element semantic vector reflecting the semantics of that element, and the mean semantic representation can be obtained by summing and averaging the element semantic vectors.
It should be noted that the global semantic representation reflects the semantic information conveyed by the target information as a whole, while the mean semantic representation reflects the semantic information obtained by summing and averaging all components of the target information; both can represent the sentence information of the target information. When the global semantic representation is closer to the mean semantic representation, the attention encoding layer has a more certain result for the semantic recognition of the target information; when the difference between the two is larger, the attention encoding layer has difficulty producing a certain result for the semantic recognition of the target information.
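As an illustrative sketch only (the vectors below are toy placeholders, not representations produced by the model described in this application), the mean semantic representation and the gap between it and a global representation can be computed as follows:

```python
import math

def mean_semantic_representation(token_vectors):
    # Element-wise sum of all element semantic vectors, divided by their count,
    # yielding the mean semantic representation of the sentence.
    dim = len(token_vectors[0])
    n = len(token_vectors)
    return [sum(vec[d] for vec in token_vectors) / n for d in range(dim)]

def semantic_distance(global_repr, mean_repr):
    # Euclidean distance between the two representations; a larger gap suggests
    # a less certain sentence-level semantic recognition result.
    return math.sqrt(sum((g - m) ** 2 for g, m in zip(global_repr, mean_repr)))
```

For example, two toy token vectors [1.0, 0.0] and [0.0, 1.0] average to [0.5, 0.5]; a global representation far from that mean would signal uncertainty.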
It should be noted that the Bidirectional Encoder Representations from Transformers (BERT) model is an artificial intelligence model built on the Attention mechanism of the Transformer; through extensive pre-training on unlabeled text, the BERT model obtains representations (Representations) containing the intrinsic semantic information of the text. The core of the BERT model is the Transformer, and the core of the Transformer is the Attention mechanism, whose function is to have the neural network focus on a portion of the input. The Attention mechanism of the BERT model mainly involves three concepts: Query, Key, and Value, where Query refers to the target word, i.e., the word to be generated; Value refers to the original value representation of each word of the context in the input information; and Key refers to the key vector representation of each word of the context in the input information. The semantic representation of the target word can be generated by computing the similarity between the Query and each Key and using it to weight the corresponding Value, as in the following formula: Attention(Query, Source) = Σ_i Similarity(Query, Key_i) · Value_i, where Source = <Key, Value>. Through the above formula, the semantic representation (Attention value) of the target word can be obtained.
According to the Self-Attention mechanism of the BERT model, every word in the input information is passed through three weight matrices W_Q, W_K, and W_V for a linear transformation, generating the Query, Key, and Value vectors of each word. Each time self-attention processing is performed centered on one word, the dot product of that word's Query vector with the Key vector of every word is computed, the weights are normalized by Softmax, and the semantic information of all words is then fused by these weights to obtain the semantic representation of each word. It should be noted that, due to text length limitations, the BERT model is not particularly effective at processing long text. Therefore, some preferred embodiments of the present application use a hyperparameter μ to improve the prediction effect of the BERT model on long text input. Let n denote the length of the input text and d denote the dimension of the vectors, with Q = xW_Q, K = xW_K, and V = xW_V, where x represents the input matrix, i.e., the output of the upper model. The specific formula is then: Attention(Q, K, V) = μ·Softmax(QK^T/√d)·V. To avoid the Attention mechanism being weakened when the BERT model processes long text, the effect of the hyperparameter μ is to weight the Attention values, thereby improving the ability of the BERT model to process long text.
It should be noted that the BERT model has the following advantages. First, its representational capacity can be fully trained, and the feature extraction capability of the Transformer is stronger than that of a bidirectional LSTM. Second, the BERT model can obtain sentence-level semantic representations, which are at a higher level than word representations. Third, the BERT model can combine the pre-training model with the downstream task model; that is to say, the BERT model can still be used when performing downstream tasks, without needing to modify the model. Fourth, the fine-tuning cost of the BERT model is small. Therefore, in some preferred embodiments of the present application, the BERT model is selected as the attention encoding layer of the intention recognition model.
In step S103 of some embodiments of the present application, the global semantic representation and the mean semantic representation are respectively subjected to prediction processing based on the evidence neural network, so as to obtain a global classification probability corresponding to the global semantic representation and a mean classification probability corresponding to the mean semantic representation. It should be noted that the evidence neural network (Evidential Neural Network, ENN) was developed based on the evidence framework of Dempster-Shafer Theory (DST) and Subjective Logic (SL). In some exemplary embodiments of the present application, the pre-trained evidence neural network performs prediction processing on the global semantic representation and the mean semantic representation, so as to obtain the global classification probability corresponding to the global semantic representation and the mean classification probability corresponding to the mean semantic representation. It should be emphasized that the global semantic representation reflects the semantic information conveyed by the target information as a whole, and the mean semantic representation reflects the semantic information obtained by summing and averaging the components of the target information; both can represent the sentence information of the target information. The corresponding global classification probability can be obtained after the global semantic representation is predicted by the evidence neural network, and the corresponding mean classification probability can be obtained after the mean semantic representation is predicted by the evidence neural network.
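As an illustrative, non-authoritative sketch of the evidential step (this follows a common formulation in the evidential deep learning literature and may differ from the exact network used here): non-negative evidence values are mapped to Dirichlet parameters, from which expected class probabilities and an uncertainty mass can be read off:

```python
def evidential_prediction(evidence):
    # Map non-negative evidence e_k to Dirichlet parameters alpha_k = e_k + 1.
    # The expected class probability is p_k = alpha_k / S with S = sum(alpha),
    # and the uncertainty mass is u = K / S for K classes: more total evidence
    # means lower uncertainty.
    K = len(evidence)
    alpha = [e + 1.0 for e in evidence]
    S = sum(alpha)
    probs = [a / S for a in alpha]
    uncertainty = K / S
    return probs, uncertainty
```

With zero evidence for every class, the prediction degenerates to a uniform distribution with maximal uncertainty, which is what makes this family of networks useful for uncertainty calibration.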
In step S104, the global classification probability and the mean classification probability are integrated based on the output layer to obtain an intention recognition result, wherein the intention recognition result is used to represent the dialogue intent category corresponding to the target information. It should be noted that the process of integrating the global classification probability with the mean classification probability may be performed in various manners, for example, summing the global classification probability and the mean classification probability, averaging them, or weighting and summing them to obtain a comprehensive classification probability, and then classifying the comprehensive classification probability by softmax logistic regression (Softmax Regression). It should be noted that the closer the global classification probability is to the mean classification probability, the more accurately the intention recognition model can predict the recognition result; if the difference between the global classification probability and the mean classification probability is large, a prediction result with lower confidence is obtained when the intention recognition model performs intention recognition on the information input by the user. For a prediction result with low confidence, a response action that wastes resources should not be made. Therefore, through the processing of the attention encoding layer and the evidence neural network, the situation in which the artificial intelligence model forcibly gives an unreasonable response action for a possibly erroneous recognition result can be avoided, the calibration effect of uncertainty calibration is improved, and the quality of intention recognition is improved.
It should be understood that the integration processing of the global classification probability and the mean classification probability based on the output layer to obtain the intention recognition result may be completed in various ways, and is not limited to the specific embodiments described above.
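As one illustrative way to realize such an integration (a weighted sum followed by softmax classification; the weights below are hypothetical placeholders, and this is only one of the various manners mentioned above):

```python
import math

def softmax(xs):
    # Numerically stable softmax over a list of scores.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def integrate_and_classify(global_prob, mean_prob, w_global=0.5, w_mean=0.5):
    # Weight and sum the two classification probabilities to obtain a
    # comprehensive classification probability, normalize it with softmax,
    # and take the argmax as the dialogue intent category index.
    combined = [w_global * g + w_mean * m for g, m in zip(global_prob, mean_prob)]
    scores = softmax(combined)
    best = max(range(len(scores)), key=lambda i: scores[i])
    return best, scores
```

The returned index identifies the dialogue intent category; the scores could additionally be compared against a confidence threshold before triggering a response action.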
Fig. 9 shows an electronic device 900 provided by an embodiment of the application. The electronic device 900 includes: a processor 901, a memory 902 and a computer program stored on the memory 902 and executable on the processor 901, the computer program when run for performing the above-described intention recognition method.
The processor 901 and the memory 902 may be connected by a bus or other means.
The memory 902, as a non-transitory computer readable storage medium, may be used to store a non-transitory software program as well as a non-transitory computer executable program, such as the intent recognition method described in embodiments of the present application. The processor 901 implements the above-described intention recognition method by running a non-transitory software program and instructions stored in the memory 902.
The memory 902 may include a storage program area and a storage data area, where the storage program area may store an operating system and at least one application program required for a function, and the storage data area may store information for performing the above-described intention recognition method. In addition, the memory 902 may include high-speed random access memory, and may also include non-transitory memory, such as at least one disk storage device, flash memory device, or other non-transitory solid state storage device. In some implementations, the memory 902 optionally includes memory located remotely from the processor 901, and such remote memory may be connected to the electronic device 900 through a network. Examples of such networks include, but are not limited to, the Internet, intranets, local area networks, mobile communication networks, and combinations thereof.
The non-transitory software programs and instructions required to implement the intent recognition method described above are stored in the memory 902 and when executed by the one or more processors 901 perform the intent recognition method described above, for example, performing method steps S101 through S104 in fig. 1, method steps S201 through S205 in fig. 2, method steps S301 through S303 in fig. 3, method steps S401 through S402 in fig. 4, method steps S501 through S502 in fig. 5, method steps S601 through S602 in fig. 6, method steps S701 through S703 in fig. 7, and method steps S801 through S802 in fig. 8.
The embodiment of the application also provides a computer readable storage medium which stores computer executable instructions for executing the intention recognition method.
In an embodiment, the computer-readable storage medium stores computer-executable instructions that are executed by one or more control processors, for example, to perform method steps S101 through S104 in fig. 1, method steps S201 through S205 in fig. 2, method steps S301 through S303 in fig. 3, method steps S401 through S402 in fig. 4, method steps S501 through S502 in fig. 5, method steps S601 through S602 in fig. 6, method steps S701 through S703 in fig. 7, and method steps S801 through S802 in fig. 8.
The above-described apparatus embodiments are merely illustrative, wherein the units illustrated as separate components may or may not be physically separate; they may be located in one place, or may be distributed over a plurality of network elements. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
Those of ordinary skill in the art will appreciate that all or some of the steps and systems of the methods disclosed above may be implemented as software, firmware, hardware, or suitable combinations thereof. Some or all of the physical components may be implemented as software executed by a processor, such as a central processing unit, a digital signal processor, or a microprocessor, or as hardware, or as an integrated circuit, such as an application-specific integrated circuit. Such software may be distributed on computer-readable media, which may include computer storage media (or non-transitory media) and communication media (or transitory media). As is well known to those of ordinary skill in the art, the term computer storage media includes both volatile and nonvolatile, removable and non-removable media implemented in any method or technology for storage of information such as computer-readable instructions, data structures, program modules, or other data. Computer storage media include, but are not limited to, RAM, ROM, EEPROM, flash memory or other memory technology, CD-ROM, digital versatile disks (DVD) or other optical disk storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other medium which can be used to store the desired information and which can be accessed by a computer. Furthermore, as is well known to those of ordinary skill in the art, communication media typically embody computer-readable instructions, data structures, program modules, or other data in a modulated data signal such as a carrier wave or other transport mechanism, and may include any information delivery media. It should also be appreciated that the various embodiments provided by the embodiments of the present application may be combined arbitrarily to achieve different technical effects.
While the preferred embodiment of the present application has been described in detail, the present application is not limited to the above embodiments, and those skilled in the art can make various equivalent modifications or substitutions without departing from the spirit and scope of the present application, and these equivalent modifications or substitutions are included in the scope of the present application as defined in the appended claims.

Claims (10)

1. An intent recognition method, comprising:
acquiring target information to be identified and a pre-trained intention identification model, wherein the intention identification model comprises an attention coding layer, an evidence neural network and an output layer;
analyzing the target information based on the attention coding layer to obtain global semantic representation and mean semantic representation;
respectively carrying out prediction processing on the global semantic representation and the mean semantic representation based on the evidence neural network to obtain global classification probability corresponding to the global semantic representation and mean classification probability corresponding to the mean semantic representation;
and integrating the global classification probability and the average classification probability based on the output layer to obtain an intention recognition result, wherein the intention recognition result is used for representing a dialogue intention category corresponding to the target information.
2. The method according to claim 1, wherein before the target information to be identified and the pre-trained intent recognition model are obtained, further comprising pre-training the intent recognition model, specifically comprising:
selecting a training text sentence from a preset training data set;
extracting semantic features of the training text sentence based on the attention coding layer to obtain a sentence semantic feature vector and a word semantic mean vector;
respectively carrying out predictive training on the sentence meaning feature vector and the word meaning mean vector based on the evidence neural network to obtain sentence meaning classification probability corresponding to the sentence meaning feature vector and word meaning classification probability corresponding to the word meaning mean vector;
performing classification deviation analysis based on the sentence meaning classification probability and the word meaning classification probability to obtain training deviation data;
and iteratively updating the intention recognition model based on the training deviation data until the training deviation data accords with a preset error condition, so as to obtain the intention recognition model after pre-training.
3. The method according to claim 2, wherein the semantic feature extraction is performed on the training text sentence based on the attention encoding layer to obtain a sentence meaning feature vector and a word meaning mean vector, and the method comprises:
Performing word segmentation processing on the training text sentence to obtain a training word sequence, wherein the training word sequence comprises a plurality of word segmentation elements;
extracting semantic features of each word segmentation element based on an attention mechanism of the attention coding layer to obtain sentence meaning feature vectors and a plurality of word meaning feature vectors;
and carrying out mean analysis based on the word sense feature vectors to obtain the word sense mean vector.
4. The method of claim 3, wherein the training word sequence includes a start sequence bit and an element sequence bit, the start sequence bit is configured with a classification identifier, the element sequence bit is configured with each word segmentation element, the semantic feature extraction is performed on each word segmentation element based on an attention mechanism of the attention encoding layer to obtain a sentence meaning feature vector and a plurality of word meaning feature vectors, and the method comprises:
global semantic extraction is carried out on all the word segmentation elements based on a self-attention mechanism of the attention coding layer, and sentence-meaning feature vectors are obtained based on the classification identifiers;
and carrying out local semantic extraction on each word segmentation element based on the attention coding layer, and obtaining a plurality of word sense feature vectors based on each word segmentation element.
5. The method of claim 2, wherein the performing the classification bias analysis based on the sentence meaning classification probability and the word meaning classification probability to obtain training bias data comprises:
constructing a classification loss function based on the sentence meaning classification probability and the word meaning classification probability;
and determining the output value of the classification loss function as the training deviation data.
6. The method of claim 5, wherein constructing a classification loss function based on the sentence sense classification probability and the word sense classification probability comprises:
acquiring an evidence mapping analytic expression corresponding to the evidence neural network;
and constructing the classification loss function based on the word sense classification probability, the sentence sense classification probability and the evidence mapping analytic type.
7. The method of claim 6, wherein constructing the classification loss function based on the word sense classification probability, the sentence sense classification probability, and the evidence mapping resolution comprises:
constructing a first cross entropy sub-function based on the sentence meaning classification probability and the evidence mapping analytic type;
constructing a second cross entropy sub-function based on the word sense classification probability and the evidence mapping analytic type;
And integrating the first cross entropy sub-function and the second cross entropy sub-function to obtain the classification loss function.
8. The method of claim 7, wherein said integrating the first cross entropy sub-function with the second cross entropy sub-function results in the classification loss function, comprising:
performing relative entropy analysis based on the word sense classification probability and the sentence sense classification probability to obtain semantic information divergence;
and integrating the first cross entropy sub-function, the second cross entropy sub-function and the semantic information divergence to obtain the classification loss function.
9. An electronic device, comprising: a memory and a processor, the memory storing a computer program, wherein the processor implements the intention recognition method according to any one of claims 1 to 8 when executing the computer program.
10. A computer-readable storage medium, characterized in that the computer-readable storage medium stores a computer program that is executed by a processor to implement the intention recognition method according to any one of claims 1 to 8.
CN202310594584.9A 2023-05-24 2023-05-24 Intention recognition method, electronic device, and storage medium Pending CN116628202A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310594584.9A CN116628202A (en) 2023-05-24 2023-05-24 Intention recognition method, electronic device, and storage medium


Publications (1)

Publication Number Publication Date
CN116628202A true CN116628202A (en) 2023-08-22

Family

ID=87620859

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310594584.9A Pending CN116628202A (en) 2023-05-24 2023-05-24 Intention recognition method, electronic device, and storage medium

Country Status (1)

Country Link
CN (1) CN116628202A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN118246460A (en) * 2024-05-29 2024-06-25 北京中关村科金技术有限公司 Intelligent customer service-oriented AI dialogue semantic recognition method and system
CN118246460B (en) * 2024-05-29 2024-09-03 北京中关村科金技术有限公司 Intelligent customer service-oriented AI dialogue semantic recognition method and system


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination