
CN111428468A - Method, device, equipment and storage medium for predicting single sentence smoothness - Google Patents

Method, device, equipment and storage medium for predicting single sentence smoothness

Info

Publication number
CN111428468A
CN111428468A (application CN202010138555.8A)
Authority
CN
China
Prior art keywords
sentence
layer
smoothness
model
single sentence
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202010138555.8A
Other languages
Chinese (zh)
Inventor
黄嘉鑫
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Ping An Life Insurance Company of China Ltd
Original Assignee
Ping An Life Insurance Company of China Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Ping An Life Insurance Company of China Ltd filed Critical Ping An Life Insurance Company of China Ltd
Priority to CN202010138555.8A priority Critical patent/CN111428468A/en
Publication of CN111428468A publication Critical patent/CN111428468A/en
Pending legal-status Critical Current

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 - Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 - Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/35 - Clustering; Classification
    • G06F16/355 - Class or cluster creation or modification
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 - Computing arrangements based on biological models
    • G06N3/02 - Neural networks
    • G06N3/04 - Architecture, e.g. interconnection topology
    • G06N3/045 - Combinations of networks
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 - Computing arrangements based on biological models
    • G06N3/02 - Neural networks
    • G06N3/08 - Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Biophysics (AREA)
  • Evolutionary Computation (AREA)
  • Biomedical Technology (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Computational Linguistics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Databases & Information Systems (AREA)
  • Machine Translation (AREA)

Abstract

The invention discloses a method for predicting the smoothness of a single sentence, which comprises the following steps: acquiring a single sentence to be judged and determining the application scenario corresponding to the sentence; inputting the sentence into the corresponding preset sentence smoothness model, wherein the sentence smoothness model is a neural network model with a Bert model as its input layer and a CNN model as its classifier; converting the sentence into a sentence vector through the Bert model and inputting the vector into the CNN model; processing it in sequence through the convolutional layer, pooling layer, Flatten layer, connection layer, Dropout layer and fully connected layer of the CNN model to obtain a global feature vector for the sentence; and inputting the global feature vector into the output layer of the CNN model, computing the Sigmoid function, and outputting the predicted sentence smoothness value for the sentence. The invention also discloses a device and equipment for predicting single sentence smoothness and a computer-readable storage medium. The invention realizes the prediction and scoring of single sentence smoothness in various application scenarios and improves the intelligence of machine conversation.

Description

Method, device, equipment and storage medium for predicting single sentence smoothness
Technical Field
The invention relates to the technical field of artificial intelligence, in particular to a method, a device, equipment and a storage medium for predicting the smoothness of a single sentence.
Background
Sentence smoothness refers to whether the semantics of a sentence are fluent, whether its expression conforms to grammar, and whether its word usage is appropriate; in other words, whether the sentence reads naturally, with no confused, disordered or awkward passages. From the perspective of expression, if an article contains many ill-formed sentences and unsmooth language, it will read poorly even if its ideas are novel, its structure ingenious and its rhetorical devices well applied. In the field of natural language processing, measuring sentence smoothness is therefore an important task: in scenarios such as automatic question-answering robots or intelligent customer service, it can assist in identifying and screening valid user queries, improving response accuracy.
The traditional method for measuring the smoothness of a single sentence is based on a language model: the probability of each next word or character given the preceding words is estimated, iterating until the sentence is complete, and the probability of the whole sentence is then obtained by multiplying these probabilities together; this probability serves as the smoothness score. However, building such a language model requires accumulating a large amount of high-quality text. Although open corpora such as Wikipedia can be crawled for training in general scenarios, this approach does not transfer to specific scenarios and cannot adapt to proper nouns, so the resulting smoothness estimates are unreliable.
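For illustration only, a minimal sketch of this traditional approach (a bigram model with add-alpha smoothing; the function names, the smoothing scheme and the sentence markers are assumptions of this sketch, not details from the invention):

```python
import math
from collections import Counter

def train_bigram_counts(corpus):
    """Count context unigrams and bigrams over a tokenized corpus
    (corpus: an iterable of token lists; illustrative only)."""
    unigrams, bigrams = Counter(), Counter()
    for sentence in corpus:
        tokens = ["<s>"] + sentence + ["</s>"]
        unigrams.update(tokens[:-1])             # contexts only
        bigrams.update(zip(tokens, tokens[1:]))
    return unigrams, bigrams

def sentence_log_prob(sentence, unigrams, bigrams, vocab_size, alpha=1.0):
    """Sum log P(w_i | w_{i-1}) with add-alpha smoothing; a higher
    (less negative) score is read as a smoother sentence."""
    tokens = ["<s>"] + sentence + ["</s>"]
    logp = 0.0
    for prev, cur in zip(tokens, tokens[1:]):
        p = (bigrams[(prev, cur)] + alpha) / (unigrams[prev] + alpha * vocab_size)
        logp += math.log(p)
    return logp
```

As the paragraph above notes, such a model is only as good as the corpus its counts come from, which is exactly the limitation the invention targets.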
Disclosure of Invention
The invention mainly aims to provide a method, a device, equipment and a storage medium for predicting the smoothness of a single sentence, so as to solve the technical problem that traditional single sentence smoothness prediction methods cannot adapt to specific scenarios.
In order to achieve the above object, the present invention provides a method for predicting the smoothness of a single sentence, comprising the following steps:
acquiring a single sentence to be judged and determining an application scene corresponding to the single sentence;
inputting the single sentence into a preset sentence smoothness model corresponding to the application scene, wherein the sentence smoothness model is a neural network model, a Bert model is used as an input layer of the neural network model, and a CNN model is used as a classifier of the neural network model;
converting the single sentence into a sentence vector through the Bert model and inputting the sentence vector into the CNN model, wherein the CNN model comprises, in order: a convolutional layer, a pooling layer, a Flatten layer, a connection layer, a Dropout layer, a fully connected layer, and an output layer formed by a Sigmoid function;
processing through the convolutional layer, the pooling layer, the Flatten layer, the connection layer, the Dropout layer and the fully connected layer in sequence to obtain a global feature vector for the single sentence;
and inputting the global feature vector into the output layer, computing the Sigmoid function, and outputting the predicted sentence smoothness value for the single sentence, wherein the smoothness of the single sentence is directly proportional to the predicted value.
Optionally, the obtaining of the global feature vector for the single sentence after processing through the convolutional layer, the pooling layer, the Flatten layer, the connection layer, the Dropout layer, and the fully connected layer in sequence includes:
extracting local feature vectors from the sentence vector through the convolutional layer, inputting them into the pooling layer for dimensionality reduction to obtain a plurality of low-dimensional local feature vectors, and compressing these into a plurality of one-dimensional local feature vectors through the Flatten layer;
regularizing the plurality of one-dimensional local feature vectors through the Dropout layer to obtain a plurality of processed optimal local feature vectors;
and inputting the plurality of optimal local feature vectors into the fully connected layer for feature combination to obtain the global feature vector for the single sentence.
Optionally, before the step of obtaining the single sentence to be judged and determining the application scenario corresponding to the single sentence, the method further includes:
acquiring a normal statement under a specified application scene as a positive sample;
adjusting the expression order of the words in each positive sample to obtain corresponding negative samples;
and repeatedly adjusting the expression order of the words in each positive sample until the ratio of positive samples to negative samples reaches a preset ratio, wherein the preset ratio is any one of 1:2, 1:3 and 1:4.
Optionally, the adjusting of the expression order of the words in each positive sample to obtain corresponding negative samples includes:
randomly reordering the sentence in each positive sample by character or word and reconstructing it into a new sentence to obtain a corresponding negative sample; and/or
randomly replacing characters or words of the sentence in each positive sample with characters or words from a preset dictionary to obtain a corresponding negative sample, wherein the total length of the randomly replaced words is less than half the length of the corresponding sentence.
Optionally, before the step of obtaining the single sentence to be judged and determining the application scenario corresponding to the single sentence, the method further includes:
setting the sample labels of all positive samples to be 1 and the sample labels of all negative samples to be-1;
inputting each positive sample and each negative sample with a sample label into the neural network model for training, and judging whether a cross entropy loss function corresponding to the neural network model converges or not;
and if the cross entropy loss function corresponding to the neural network model is converged, stopping training to obtain the sentence smoothness model, otherwise, adjusting the learning weight of the neural network model and continuing training.
Further, in order to achieve the above object, the present invention provides a single sentence smoothness prediction apparatus, including:
the acquisition module is used for acquiring a single sentence to be judged and determining an application scene corresponding to the single sentence;
the input module is used for inputting the single sentence into a preset sentence smoothness model corresponding to the application scene, wherein the sentence smoothness model is a neural network model, a Bert model is used as an input layer of the neural network model, and a CNN model is used as a classifier of the neural network model;
a preprocessing module, configured to convert the single sentence into a sentence vector through the Bert model and input the sentence vector into the CNN model, where the CNN model comprises, in order: a convolutional layer, a pooling layer, a Flatten layer, a connection layer, a Dropout layer, a fully connected layer, and an output layer formed by a Sigmoid function;
a feature acquisition module, configured to obtain a global feature vector for the single sentence after processing through the convolutional layer, the pooling layer, the Flatten layer, the connection layer, the Dropout layer and the fully connected layer in sequence;
and a smoothness output module, configured to input the global feature vector into the output layer, perform the Sigmoid function calculation, and output the predicted sentence smoothness value for the single sentence, where the smoothness of the single sentence is directly proportional to the predicted value.
Optionally, the feature obtaining module includes:
a feature extraction unit, configured to extract local feature vectors from the sentence vector through the convolutional layer, input them into the pooling layer for dimensionality reduction to obtain a plurality of low-dimensional local feature vectors, and compress these into a plurality of one-dimensional local feature vectors through the Flatten layer;
a feature optimization unit, configured to regularize the plurality of one-dimensional local feature vectors through the Dropout layer to obtain a plurality of processed optimal local feature vectors;
and a feature combination unit, configured to input the plurality of optimal local feature vectors into the fully connected layer for feature combination to obtain the global feature vector for the single sentence.
Optionally, the single sentence smoothness prediction apparatus further includes:
a sample processing module, configured to acquire normal sentences in a specified application scenario as positive samples, adjust the expression order of the words in each positive sample to obtain corresponding negative samples, and repeatedly adjust the expression order until the ratio of positive samples to negative samples reaches a preset ratio, wherein the preset ratio is any one of 1:2, 1:3 and 1:4.
Further, in order to achieve the above object, the present invention further provides single sentence smoothness prediction equipment, which includes a memory, a processor, and a single sentence smoothness prediction program stored in the memory and operable on the processor, where the single sentence smoothness prediction program, when executed by the processor, implements the steps of the single sentence smoothness prediction method according to any one of the above embodiments.
Further, in order to achieve the above object, the present invention provides a computer-readable storage medium, on which a single sentence smoothness prediction program is stored, where the single sentence smoothness prediction program, when executed by a processor, implements the steps of the single sentence smoothness prediction method according to any one of the above embodiments.
The sentence smoothness model used by the invention is a neural network model formed by fusing a Bert model and a convolutional neural network model: the Bert model is the input layer of the neural network model and the CNN model is its classifier. The Bert model converts the single sentence to be predicted into a sentence vector, the sentence vector is input into the CNN model for prediction, and finally the predicted sentence smoothness value for the single sentence is output. This Bert-based convolutional neural network method for predicting single sentence smoothness makes full use of the pre-trained Bert model, and because the Bert model generalizes well, the method can be adapted to any specific scenario. On the one hand, the method does not need to learn a probability model from a large corpus the way a traditional language model does; on the other hand, words that never appeared in its training set are less of a problem, that is, the sentence smoothness model generalizes more easily. The method can therefore effectively reduce the difficulty and cost of corpus acquisition, greatly improve the efficiency of model development and application, improve the generalization of the model, make it easier to extend to various specific scenarios, and adapt well to proper nouns.
Drawings
Fig. 1 is a schematic structural diagram of the operating environment of a single sentence smoothness prediction device according to an embodiment of the present application;
FIG. 2 is a flowchart illustrating a method for predicting the smoothness of a single sentence according to a first embodiment of the present invention;
FIG. 3 is a flowchart illustrating an embodiment of step S140 in FIG. 2;
FIG. 4 is a flowchart illustrating a method for predicting the smoothness of a single sentence according to a second embodiment of the present invention;
FIG. 5 is a flowchart illustrating a method for predicting the smoothness of a single sentence according to a third embodiment of the present invention;
fig. 6 is a functional block diagram of an embodiment of a single sentence smoothness prediction apparatus according to the present invention.
The implementation, functional features and advantages of the objects of the present invention will be further explained with reference to the accompanying drawings.
Detailed Description
It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention.
The invention provides a device for predicting the smoothness of single sentences.
Referring to fig. 1, fig. 1 is a schematic structural diagram of the operating environment of a single sentence smoothness prediction device according to an embodiment of the present application.
As shown in fig. 1, the single sentence smoothness prediction device includes: a processor 1001, such as a CPU, a communication bus 1002, a user interface 1003, a network interface 1004, and a memory 1005. The communication bus 1002 is used to enable connective communication between these components. The user interface 1003 may include a display (Display) and an input unit such as a keyboard (Keyboard), and the network interface 1004 may optionally include a standard wired interface and a wireless interface (e.g., a WI-FI interface). The memory 1005 may be a high-speed RAM memory or a non-volatile memory (e.g., a magnetic disk memory). The memory 1005 may alternatively be a storage device separate from the processor 1001.
Those skilled in the art will appreciate that the hardware architecture shown in fig. 1 does not constitute a limitation of the single sentence smoothness prediction device, which may include more or fewer components than those shown, combine some components, or arrange components differently.
As shown in fig. 1, the memory 1005, which is a kind of computer-readable storage medium, may include an operating system, a network communication module, a user interface module, and a computer program. The operating system is a program for managing and controlling the device and its software resources, and supports the operation of the single sentence smoothness prediction program and other software and/or programs.
In the hardware structure shown in fig. 1, the network interface 1004 is mainly used for accessing a network, and the user interface 1003 is mainly used for detecting confirmation instructions, editing instructions, and the like. The processor 1001 may be configured to call the single sentence smoothness prediction program stored in the memory 1005 and perform the operations of the following embodiments of the single sentence smoothness prediction method.
Based on the hardware structure of the single sentence smoothness prediction equipment, various embodiments of the single sentence smoothness prediction method are provided.
Referring to fig. 2, fig. 2 is a flowchart illustrating a method for predicting the smoothness of a single sentence according to a first embodiment of the present invention. In this embodiment, the method for predicting the smoothness of a single sentence includes the following steps:
step S110, obtaining a single sentence to be judged and determining an application scene corresponding to the single sentence;
This embodiment applies to human-machine dialogue, for example in scenarios such as automatic question-answering robots or intelligent customer service, where it can assist in identifying and screening valid user questions and thereby improve the accuracy of machine responses.
In this embodiment, a corresponding sentence smoothness model is set up for each distinct application scenario of human-machine dialogue, so while the dialogue takes place, the application scenario corresponding to the user question (the single sentence) must also be determined. For example, when a user uses the APP of a certain service and starts a human-machine dialogue by clicking a dialogue button in a certain scenario, the machine on the one hand acquires the user's question and transcribes it into a single sentence in text format, and on the other hand acquires the application scenario of the current dialogue, such as a product service consultation scenario.
Step S120, inputting the single sentence into a preset sentence smoothness model corresponding to the application scene, wherein the sentence smoothness model is a neural network model, a Bert model is used as an input layer of the neural network model, and a CNN model is used as a classifier of the neural network model;
in this embodiment, the corresponding sentence smoothness model is obtained by pre-training for different application scenarios, wherein the sentence smoothness model adopts a neural network model structure and is specifically formed by fusing the following two models:
(1) Bert model
The Bert model is a Transformer-based bidirectional encoder representation. It differs from other language representation models in that it is designed to pre-train deep bidirectional representations by jointly conditioning on context in all layers. The pre-trained BERT representation can therefore be fine-tuned with just one additional output layer to build state-of-the-art models for a wide range of tasks, such as question answering and language inference, without substantial task-specific architecture modifications.
(2) CNN model
Convolutional Neural Networks (CNN) are a class of feed-forward neural networks that involve convolution calculations and have a deep structure, and they are among the representative algorithms of deep learning. A convolutional neural network has feature learning ability and can perform translation-invariant classification of input information according to its hierarchical structure.
A neural network model generally includes an input layer and a classifier; in this embodiment, the sentence smoothness model uses the Bert model as the input layer of the neural network model and the CNN model as its classifier.
Step S130, converting the single sentence into a sentence vector through the Bert model and inputting the sentence vector into the CNN model, wherein the CNN model comprises, in order: a convolutional layer, a pooling layer, a Flatten layer, a connection layer, a Dropout layer, a fully connected layer, and an output layer formed by a Sigmoid function;
In this embodiment, to process a single sentence it must first be converted into a sentence vector through the Bert model; the sentence vector corresponding to the single sentence is then input into the CNN model for processing. A pre-trained Bert model is preferably used for this conversion, and the resulting sentence vector serves as the input to the CNN model, so that the smoothness calculation for the single sentence is ultimately carried out by the CNN model.
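As a hedged sketch of this step only: the patent does not name a particular Bert implementation, so the HuggingFace transformers library and the bert-base-chinese checkpoint below are assumptions.

```python
import torch
from transformers import BertTokenizer, BertModel  # assumed tooling, not named by the patent

tokenizer = BertTokenizer.from_pretrained("bert-base-chinese")
bert = BertModel.from_pretrained("bert-base-chinese")
bert.eval()

def sentence_to_vectors(sentence: str) -> torch.Tensor:
    """Encode one sentence into per-token vectors (seq_len x 768),
    the kind of matrix a downstream text-CNN can convolve over."""
    inputs = tokenizer(sentence, return_tensors="pt", truncation=True, max_length=64)
    with torch.no_grad():
        outputs = bert(**inputs)
    return outputs.last_hidden_state.squeeze(0)
```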
The CNN model used in this embodiment sequentially includes:
(1) Convolutional layer, whose parameters consist of a set of learnable filters; during the feed-forward pass each filter is convolved with the input, computing the dot product between the filter and the input. The convolutional layer convolves the output of the input layer, thereby extracting higher-level features.
(2) Pooling layers, also known as downsampling, are used to reduce data throughput while retaining useful information. By reducing the size of the model, the calculation speed is improved, and the robustness of the extracted features is improved.
(3) The Flatten layer is specifically used for flattening data input into the layer, namely converting multidimensional data output by the previous layer into one-dimensional data.
(4) Connection layer, used to concatenate the multidimensional features output by the pooling layer with the one-dimensional features output by the Flatten layer.
(5) Dropout layer, a regularization method for preventing the CNN model from overfitting. Its principle: in each training iteration, the neurons of a layer (N in total) are randomly dropped with probability P, and that iteration's data is trained on the network formed by the remaining (1-P) × N neurons; in this embodiment the probability P is preferably 0.5.
(6) Fully connected layer, used to fully connect the output features of the convolutional layer and the connection layer.
(7) Output layer, composed of a Sigmoid function, which maps data to the [0,1] interval (related squashing functions map to [-1,1]).
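Putting the layer list together, one plausible reading of the classifier is the multi-branch text-CNN sketched below. PyTorch is an assumed choice, the four kernel heights 2 to 5 follow the embodiment described later, and the connection layer is read here as concatenation of the flattened branch outputs; none of these details beyond the layer list itself are fixed by the patent.

```python
import torch
import torch.nn as nn

class SmoothnessCNN(nn.Module):
    """A sketch of the classifier under the assumptions stated above:
    four convolution branches (kernel heights 2-5), max-pooling,
    Flatten, concatenation ('connection' layer), Dropout(p=0.5),
    a fully connected layer, and a Sigmoid output."""
    def __init__(self, emb_dim=768, n_filters=64, heights=(2, 3, 4, 5)):
        super().__init__()
        self.convs = nn.ModuleList(
            nn.Conv2d(1, n_filters, kernel_size=(h, emb_dim)) for h in heights
        )
        self.dropout = nn.Dropout(p=0.5)
        self.fc = nn.Linear(n_filters * len(heights), 1)

    def forward(self, x):                       # x: (batch, seq_len, emb_dim)
        x = x.unsqueeze(1)                      # add a channel dimension
        pooled = []
        for conv in self.convs:
            c = torch.relu(conv(x)).squeeze(3)  # (batch, n_filters, seq_len-h+1)
            p = torch.max(c, dim=2).values      # max-pool over time
            pooled.append(p.flatten(1))         # Flatten each branch
        feats = torch.cat(pooled, dim=1)        # connection layer: concatenate
        feats = self.dropout(feats)             # regularize
        return torch.sigmoid(self.fc(feats))    # smoothness value in (0, 1)
```

Max-pooling each branch over time yields fixed-size features regardless of sentence length, which is what makes the concatenation well-defined.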
Step S140, processing through the convolutional layer, the pooling layer, the Flatten layer, the connection layer, the Dropout layer and the fully connected layer in sequence to obtain a global feature vector for the single sentence;
In this embodiment, within the convolutional neural network, feature extraction and the various kinds of feature processing, such as dimensionality reduction and regularization, are performed on the input vector mainly through the convolutional layer, pooling layer, Flatten layer, connection layer, Dropout layer and fully connected layer, so as to obtain the global feature vector for the single sentence.
Step S150, inputting the global feature vector into the output layer, computing the Sigmoid function, and outputting the predicted sentence smoothness value for the single sentence, wherein the smoothness of the single sentence is directly proportional to the predicted value.
In this embodiment, the output layer is formed by a Sigmoid function, and an expression of the Sigmoid function is as follows:
$$g(s) = \frac{1}{1 + e^{-s}}$$
where s is the output of the previous layer of the CNN model. The Sigmoid function has the following properties: when s = 0, g(s) = 0.5; when s >> 0, g(s) approaches 1; and when s << 0, g(s) approaches 0. Thus g(s) maps the linear output of the previous stage to a probability value in [0,1]. Here, g(s) is the prediction output of the CNN model, that is, the sentence smoothness value of this embodiment; the larger the value, the smoother the sentence.
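A quick numeric check of these properties (plain Python, purely illustrative):

```python
import math

def g(s: float) -> float:
    """Sigmoid: g(s) = 1 / (1 + e^(-s))."""
    return 1.0 / (1.0 + math.exp(-s))

print(g(0.0))   # 0.5 exactly
print(g(8.0))   # ~0.99966: approaches 1 for large positive s
print(g(-8.0))  # ~0.00034: approaches 0 for large negative s
```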
The sentence smoothness model used in this embodiment is a neural network model, and is specifically formed by fusing a Bert model and a convolutional neural network model, the Bert model is an input layer of the neural network model, the CNN model is a classifier of the neural network model, the Bert model converts a single sentence to be predicted into a sentence vector, inputs the sentence vector into the CNN model for prediction, and finally outputs a predicted sentence smoothness value of the single sentence.
The Bert-based convolutional neural network method for predicting single sentence smoothness provided by this embodiment makes full use of the pre-trained Bert model, and because the Bert model generalizes well, the method can be applied to any specific scenario. On the one hand, the method does not need to learn a probability model from a large corpus the way a traditional language model does; on the other hand, words that never appeared in its training set are less of a problem, that is, the sentence smoothness model of this embodiment generalizes more easily. It can therefore effectively reduce the difficulty and cost of corpus acquisition, greatly improve the efficiency of model development and application, improve the generalization of the model, make it easier to extend to various specific scenarios, and adapt well to proper nouns.
Referring to fig. 3, fig. 3 is a flowchart illustrating an embodiment of step S140 in fig. 2. In this embodiment, the step S140 further includes:
Step S1401, extracting local feature vectors from the sentence vector through the convolutional layer, inputting them into the pooling layer for dimensionality reduction to obtain a plurality of low-dimensional local feature vectors, and compressing these into a plurality of one-dimensional local feature vectors through the Flatten layer;
The Convolutional layer is composed of several convolution units, and the parameters of each unit are optimized by the back-propagation algorithm. The purpose of the convolution operation is to extract different features of the input: a first convolutional layer may only extract low-level features such as edges, lines and corners, while deeper networks can iteratively extract more complex features from these. The convolutional layer's role here is to extract local features; it can capture local information similar to n-grams within a sentence, and the representation of the whole sentence is formed by integrating n-gram features of different sizes. This embodiment does not limit the number of convolutional layers, and more local features can be extracted with more layers; in one embodiment, however, four convolutional layers are preferably used, performing feature extraction on the input sentence vector with convolution kernels of heights 2, 3, 4 and 5 respectively, specifically extracting the local feature vectors within the sentence vector.
The pooling layer is specifically composed of nonlinear pooling functions of various different forms, and the pooling layer can continuously reduce the space size of data, so that the number of parameters and the calculation amount are reduced, and overfitting is controlled to a certain extent. In this embodiment, dimension reduction is performed through the pooling layer, and then the high-dimensional local feature vector extracted by the convolutional layer can be optimized to a low-dimensional local feature vector.
The Flatten layer is used to flatten high-dimensional input, that is, to convert the input multidimensional data into one-dimensional data for convenient subsequent processing. In this embodiment, the plurality of low-dimensional local feature vectors are compressed into a plurality of one-dimensional local feature vectors by the Flatten layer.
Step S1402, regularizing the plurality of one-dimensional local feature vectors by the Dropout layer to obtain a plurality of processed optimal local feature vectors;
the Dropout layer is a normalization means applied in a deep learning environment, and is realized by randomly eliminating neurons (total number is N) in each layer with probability P in one iteration during training, and training data in the current iteration by using a network formed by the remaining (1-P) × N neurons, wherein the probability P is preferably 0.5 in the embodiment.
To improve the expression or classification capability of CNN models, the most straightforward approach is to use deeper networks and more neurons, however complex networks also mean easier overfitting. Therefore, a Dropout layer is accessed to carry out regularization processing on the plurality of one-dimensional local feature vectors to obtain a plurality of processed optimal local feature vectors, and further, overfitting of the CNN model is prevented.
Step S1403, inputting the plurality of optimal local feature vectors into the fully connected layer for feature combination to obtain the global feature vector for the single sentence.
In this embodiment, each neuron in the fully connected layer is fully connected to all neurons in the previous layer to integrate the extracted features. The fully connected layer may integrate local information with category distinctiveness in the convolutional layer or the pooling layer. The output value of the last fully connected layer is transferred to an output layer for classification.
In this embodiment, a plurality of optimal local feature vectors are input into the fully-connected layer to perform feature combination, so as to obtain global feature vectors, and then the global feature vectors are delivered to the output layer to be classified.
Referring to fig. 4, fig. 4 is a flowchart illustrating a method for predicting the smoothness of a single sentence according to a second embodiment of the present invention. Based on the first embodiment, in this embodiment, before the step S110, the method further includes:
step S210, acquiring a normal sentence in a specified application scene as a positive sample;
in this embodiment, different corpora are used in different application scenarios, for example, a corpus set M is used in an application scenario a, and a corpus set P is used in an application scenario B.
In this embodiment, to learn the features of sentence smoothness, two types of samples are used:
(1) positive sample
In this embodiment, the normal single sentence in the corpus is used as the positive sample, and the normal sentence needs to be manually set in advance and has the best sentence smoothness.
(2) Negative sample
Negative samples have lower sentence smoothness than positive samples. In this embodiment, they are obtained by adjusting the expression order of the words in each positive sample. For example, for the positive sample "ask for a question, what is the price of the product?", a corresponding negative sample may be "ask for a question, what the price of the product is", or "what the price of the product is asked for", and so on.
Step S220, adjusting the expression order of the words in each positive sample to obtain corresponding negative samples;
In this embodiment, in order to form enough negative samples to simulate the various possibilities that appear in an actual application scenario, the expression order of words in the same positive sample must be adjusted repeatedly, forming a plurality of different negative samples.
Optionally, in a specific embodiment, negative samples are constructed in one or both of the following ways:
Mode one: construction by shuffling
In this mode, the sentence in each positive sample is randomly reordered by character or word and reconstructed into a new sentence, yielding a corresponding negative sample.
For example, for the positive sample "ask for a question, what is the price of the product?", the shuffled sentence may be "ask for, what price the product is", or "what price the product asks for", and so on.
Mode two: construction by word substitution
In this mode, characters or words of the sentence in each positive sample are randomly replaced with characters or words from a preset dictionary, yielding a corresponding negative sample, where the total length of the randomly replaced words is less than half the length of the corresponding sentence.
For example, for the positive sample "ask for a question, what is the price of the product?", the sentence after word replacement may be "how much do you get, what is the price of the product?" or "ask for a question, what is the unit price of the product?".
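A minimal sketch of the two construction modes, assuming character-level operations on Chinese text; the dictionary argument and the replacement budget below are illustrative assumptions of this sketch:

```python
import random

def shuffle_negative(sentence: str) -> str:
    """Mode one: randomly reorder the characters of a positive sample."""
    chars = list(sentence)
    random.shuffle(chars)
    return "".join(chars)

def substitute_negative(sentence: str, dictionary: list) -> str:
    """Mode two: randomly replace characters with entries from a preset
    dictionary (assumed to be a list of characters), keeping the
    replaced total strictly under half the sentence length."""
    chars = list(sentence)
    max_replace = max(1, len(chars) // 2 - 1)   # stay under half
    k = random.randint(1, max_replace)
    for i in random.sample(range(len(chars)), k):
        chars[i] = random.choice(dictionary)
    return "".join(chars)
```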
By constructing a plurality of different negative samples for each positive sample, this embodiment achieves broad coverage of the ways a question may be phrased in the application scenario, which improves the accuracy with which the model recognizes sentence smoothness.
Step S230, repeatedly adjusting the expression order of the words in each positive sample until the ratio of positive samples to negative samples reaches a preset ratio, wherein the preset ratio is any one of 1:2, 1:3 and 1:4.
In this embodiment, samples for training the sentence smoothness model are divided into two types, namely positive samples and negative samples, and the proportion between the positive samples and the negative samples is preferably any one of 1:2, 1:3 and 1:4, so that various sequences of words in the same single sentence in an actual application scene can be simulated, the model data processing amount can be reduced, and the model calculation speed can be increased.
Referring to fig. 5, fig. 5 is a flowchart illustrating a method for predicting the smoothness of a single sentence according to a third embodiment of the present invention. Based on the second embodiment, in this embodiment, before the step S210, a neural network model needs to be generated through machine learning to calculate the smoothness of a single sentence, and the specific implementation steps include:
step S310, setting the sample labels of all positive samples to be 1 and the sample labels of all negative samples to be-1;
step S320, inputting each positive sample and each negative sample with the sample label into the neural network model for training, and judging whether the cross entropy loss function corresponding to the neural network model is converged;
and step S330, if the cross entropy loss function corresponding to the neural network model is converged, stopping training to obtain the sentence smoothness model, otherwise, adjusting the learning weight of the neural network model and continuing training.
In this embodiment, the positive and negative samples in the above embodiment are used to train the neural network model, and then a sentence smoothness model capable of judging sentence smoothness of a single sentence is generated when training is completed.
In this embodiment, the sample labels of all positive samples are set to 1, and the sample labels of all negative samples are set to-1, and meanwhile, the neural network model used for machine learning in this embodiment is specifically formed by fusing the following two models:
(1) Bert model
The Bert model is a Transformer-based bidirectional encoder representation. It differs from other language representation models in that it is designed to pre-train deep bidirectional representations by jointly conditioning on context in all layers. The pre-trained BERT representation can therefore be fine-tuned with just one additional output layer to build state-of-the-art models for a wide range of tasks, such as question answering and language inference, without substantial task-specific architecture modifications.
(2) CNN model
Convolutional Neural Networks (CNN) are a class of feed-forward neural networks that involve convolution calculations and have a deep structure, and they are among the representative algorithms of deep learning. A convolutional neural network has feature learning ability and can perform translation-invariant classification of input information according to its hierarchical structure.
A neural network model generally includes an input layer and a classifier; in this embodiment, the sentence smoothness model uses the Bert model as the input layer of the neural network model and the CNN model as its classifier.
In this embodiment, the cross entropy loss function serves as the criterion for deciding whether training is complete: if the cross entropy loss function corresponding to the neural network model has converged, training stops, and the trained neural network model is the sentence smoothness model; otherwise, the learning weights of the current neural network model are adjusted and training continues.
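A hedged sketch of this training procedure (PyTorch again; the patent labels negatives as -1, which the sketch maps to 0 so that standard binary cross-entropy applies, and convergence is approximated by a small change in epoch loss; both choices, like the loader interface, are assumptions):

```python
import torch
import torch.nn as nn

def train_until_converged(model, loader, lr=1e-3, tol=1e-4, max_epochs=50):
    """Train on (sentence_vectors, labels) batches until the cross
    entropy loss stops changing meaningfully; labels arrive in {1, -1}."""
    optimizer = torch.optim.Adam(model.parameters(), lr=lr)
    criterion = nn.BCELoss()
    prev_loss = float("inf")
    for epoch in range(max_epochs):
        total = 0.0
        for vectors, labels in loader:
            targets = (labels.float() + 1) / 2  # map {-1, 1} -> {0, 1}
            optimizer.zero_grad()
            preds = model(vectors).squeeze(1)
            loss = criterion(preds, targets)
            loss.backward()
            optimizer.step()                    # adjusts the learning weights
            total += loss.item()
        if abs(prev_loss - total) < tol:        # treated as convergence
            break
        prev_loss = total
    return model
```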
Through machine learning training, this embodiment obtains single sentence smoothness models for various application scenarios, and these models can then be used to predict single sentence smoothness in each of those scenarios.
The invention also provides a device for predicting the smoothness of single sentences.
Referring to fig. 6, fig. 6 is a functional module diagram of a single sentence smoothness prediction apparatus according to an embodiment of the present invention. In this embodiment, the single sentence smoothness prediction apparatus includes:
the acquiring module 10 is configured to acquire a single sentence to be judged and determine an application scenario corresponding to the single sentence;
an input module 20, configured to input the single sentence into a preset sentence smoothness model corresponding to the application scenario, where the sentence smoothness model is a neural network model, a Bert model is used as an input layer of the neural network model, and a CNN model is used as a classifier of the neural network model;
a preprocessing module 30, configured to convert the single sentence into a sentence vector through the Bert model and input the sentence vector into the CNN model, where the CNN model comprises, in order: a convolutional layer, a pooling layer, a Flatten layer, a connection layer, a Dropout layer, a fully connected layer, and an output layer formed by a Sigmoid function;
a feature acquisition module 40, configured to obtain a global feature vector for the single sentence after processing through the convolutional layer, the pooling layer, the Flatten layer, the connection layer, the Dropout layer and the fully connected layer in sequence;
and a smoothness output module 50, configured to input the global feature vector into the output layer, perform the Sigmoid function calculation, and output the predicted sentence smoothness value for the single sentence, where the smoothness of the single sentence is directly proportional to the predicted value.
Optionally, in a specific embodiment, the feature obtaining module includes:
a feature extraction unit, configured to extract local feature vectors from the sentence vector through the convolutional layer, input them into the pooling layer for dimensionality reduction to obtain a plurality of low-dimensional local feature vectors, and compress these into a plurality of one-dimensional local feature vectors through the Flatten layer;
a feature optimization unit, configured to regularize the plurality of one-dimensional local feature vectors through the Dropout layer to obtain a plurality of processed optimal local feature vectors;
and a feature combination unit, configured to input the plurality of optimal local feature vectors into the fully connected layer for feature combination to obtain the global feature vector for the single sentence.
Optionally, in a specific embodiment, the single sentence smoothness prediction apparatus further includes:
a sample processing module, configured to acquire normal sentences in a specified application scenario as positive samples, adjust the expression order of the words in each positive sample to obtain corresponding negative samples, and repeatedly adjust the expression order until the ratio of positive samples to negative samples reaches a preset ratio, wherein the preset ratio is any one of 1:2, 1:3 and 1:4.
Since the embodiments of the single sentence smoothness prediction apparatus are based on the same description as the embodiments of the single sentence smoothness prediction method of the present invention, they are not described again in detail.
The Bert-based convolutional neural network method for predicting single sentence smoothness provided by this embodiment makes full use of the pre-trained Bert model, and because the Bert model generalizes well, it can be applied to any specific scenario. On the one hand, the method does not need to learn a probability model from a large corpus the way a traditional language model does; on the other hand, words that never appeared in its training set are less of a problem, that is, the sentence smoothness model of this embodiment generalizes more easily. It can therefore effectively reduce the difficulty and cost of corpus acquisition, greatly improve the efficiency of model development and application, improve the generalization of the model, make it easier to extend to various specific scenarios, and adapt well to proper nouns.
The invention also provides a computer readable storage medium.
In this embodiment, the computer-readable storage medium stores a single sentence smoothness prediction program, and when the program is executed by a processor, the steps of the single sentence smoothness prediction method described in any one of the above embodiments are implemented. For the method implemented when the program is executed, reference may be made to the various embodiments of the single sentence smoothness prediction method of the present invention, which are therefore not described again in detail.
Through the above description of the embodiments, those skilled in the art will clearly understand that the method of the above embodiments can be implemented by software plus a necessary general hardware platform, and certainly can also be implemented by hardware, but in many cases, the former is a better implementation manner. Based on such understanding, the technical solutions of the present invention may be embodied in the form of a software product, which is stored in a storage medium (e.g., ROM/RAM), and includes instructions for causing a terminal (e.g., a mobile phone, a computer, a server, or a network device) to execute the method according to the embodiments of the present invention.
The present invention is described in connection with the accompanying drawings, but the present invention is not limited to the above embodiments, which are only illustrative and not restrictive, and those skilled in the art can make various changes without departing from the spirit and scope of the invention as defined by the appended claims, and all changes that come within the meaning and range of equivalency of the specification and drawings that are obvious from the description and the attached claims are intended to be embraced therein.

Claims (10)

1. A method for predicting the smoothness of a single sentence, characterized by comprising the following steps:
acquiring a single sentence to be judged and determining an application scene corresponding to the single sentence;
inputting the single sentence into a preset sentence smoothness model corresponding to the application scene, wherein the sentence smoothness model is a neural network model, a Bert model is used as an input layer of the neural network model, and a CNN model is used as a classifier of the neural network model;
converting the single sentence into a sentence vector through the Bert model and inputting the sentence vector into the CNN model, wherein the CNN model comprises, in order: a convolutional layer, a pooling layer, a Flatten layer, a connection layer, a Dropout layer, a fully connected layer, and an output layer formed by a Sigmoid function;
processing through the convolutional layer, the pooling layer, the Flatten layer, the connection layer, the Dropout layer and the fully connected layer in sequence to obtain a global feature vector for the single sentence;
and inputting the global feature vector into the output layer, computing the Sigmoid function, and outputting the predicted sentence smoothness value for the single sentence, wherein the smoothness of the single sentence is directly proportional to the predicted value.
2. The method for predicting the smoothness of a single sentence according to claim 1, wherein the obtaining of the global feature vector in the single sentence after the processing of the convolutional layer, the pooling layer, the Flatten layer, the connection layer, the Dropout layer, and the full connection layer in sequence comprises:
extracting local feature vectors from the sentence vector through the convolutional layer, inputting them into the pooling layer for dimensionality reduction to obtain a plurality of low-dimensional local feature vectors, and compressing these into a plurality of one-dimensional local feature vectors through the Flatten layer;
regularizing the plurality of one-dimensional local feature vectors through the Dropout layer to obtain a plurality of processed optimal local feature vectors;
and inputting the plurality of optimal local feature vectors into the fully connected layer for feature combination to obtain the global feature vector for the single sentence.
3. The method for predicting the smoothness of a single sentence according to claim 1, wherein before the step of obtaining the single sentence to be judged and determining the application scenario corresponding to the single sentence, the method further comprises:
acquiring a normal statement under a specified application scene as a positive sample;
adjusting the expression order of the words in each positive sample to obtain corresponding negative samples;
and repeatedly adjusting the expression order of the words in each positive sample until the ratio of positive samples to negative samples reaches a preset ratio, wherein the preset ratio is any one of 1:2, 1:3 and 1:4.
4. The method for predicting the smoothness of a single sentence according to claim 3, wherein the adjusting of the expression order of the words in each positive sample to obtain corresponding negative samples comprises:
randomly reordering the sentence in each positive sample by character or word and reconstructing it into a new sentence to obtain a corresponding negative sample; and/or
randomly replacing characters or words of the sentence in each positive sample with characters or words from a preset dictionary to obtain a corresponding negative sample, wherein the total length of the randomly replaced words is less than half the length of the corresponding sentence.
5. The method for predicting the smoothness of a single sentence according to claim 3 or 4, wherein before the step of obtaining the single sentence to be judged and determining the application scenario corresponding to the single sentence, the method further comprises:
setting the sample labels of all positive samples to be 1 and the sample labels of all negative samples to be-1;
inputting each positive sample and each negative sample with a sample label into the neural network model for training, and judging whether a cross entropy loss function corresponding to the neural network model converges or not;
and if the cross entropy loss function corresponding to the neural network model is converged, stopping training to obtain the sentence smoothness model, otherwise, adjusting the learning weight of the neural network model and continuing training.
6. A single sentence smoothness prediction apparatus, characterized by comprising:
the acquisition module is used for acquiring a single sentence to be judged and determining an application scene corresponding to the single sentence;
the input module is used for inputting the single sentence into a preset sentence smoothness model corresponding to the application scene, wherein the sentence smoothness model is a neural network model, a Bert model is used as an input layer of the neural network model, and a CNN model is used as a classifier of the neural network model;
a preprocessing module, configured to convert the single sentence into a sentence vector through the Bert model and input the sentence vector into the CNN model, wherein the CNN model comprises, in order: a convolutional layer, a pooling layer, a Flatten layer, a connection layer, a Dropout layer, a fully connected layer, and an output layer formed by a Sigmoid function;
a feature acquisition module, configured to obtain a global feature vector for the single sentence after processing through the convolutional layer, the pooling layer, the Flatten layer, the connection layer, the Dropout layer and the fully connected layer in sequence;
and a smoothness output module, configured to input the global feature vector into the output layer, perform the Sigmoid function calculation, and output the predicted sentence smoothness value for the single sentence, wherein the smoothness of the single sentence is directly proportional to the predicted value.
7. The apparatus for predicting the smoothness of a single sentence according to claim 6, wherein the feature obtaining module comprises:
a feature extraction unit, configured to extract local feature vectors from the sentence vector through the convolutional layer, input them into the pooling layer for dimensionality reduction to obtain a plurality of low-dimensional local feature vectors, and compress these into a plurality of one-dimensional local feature vectors through the Flatten layer;
a feature optimization unit, configured to regularize the plurality of one-dimensional local feature vectors through the Dropout layer to obtain a plurality of processed optimal local feature vectors;
and a feature combination unit, configured to input the plurality of optimal local feature vectors into the fully connected layer for feature combination to obtain the global feature vector for the single sentence.
8. The single sentence smoothness prediction apparatus according to claim 6, further comprising:
a sample processing module, configured to acquire normal sentences in a specified application scenario as positive samples, adjust the expression order of the words in each positive sample to obtain corresponding negative samples, and repeatedly adjust the expression order until the ratio of positive samples to negative samples reaches a preset ratio, wherein the preset ratio is any one of 1:2, 1:3 and 1:4.
9. Single sentence smoothness prediction equipment, characterized in that it comprises a memory, a processor and a single sentence smoothness prediction program stored on the memory and operable on the processor, wherein the single sentence smoothness prediction program, when executed by the processor, implements the steps of the single sentence smoothness prediction method according to any one of claims 1 to 5.
10. A computer-readable storage medium, characterized in that a single sentence smoothness prediction program is stored thereon, and when executed by a processor, the program implements the steps of the single sentence smoothness prediction method according to any one of claims 1 to 5.
CN202010138555.8A 2020-03-03 2020-03-03 Method, device, equipment and storage medium for predicting single sentence smoothness Pending CN111428468A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010138555.8A CN111428468A (en) 2020-03-03 2020-03-03 Method, device, equipment and storage medium for predicting single sentence smoothness

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010138555.8A CN111428468A (en) 2020-03-03 2020-03-03 Method, device, equipment and storage medium for predicting single sentence smoothness

Publications (1)

Publication Number Publication Date
CN111428468A true CN111428468A (en) 2020-07-17

Family

ID=71547516

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010138555.8A Pending CN111428468A (en) 2020-03-03 2020-03-03 Method, device, equipment and storage medium for predicting single sentence smoothness

Country Status (1)

Country Link
CN (1) CN111428468A (en)

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112446322A (en) * 2020-11-24 2021-03-05 杭州网易云音乐科技有限公司 Eyeball feature detection method, device, equipment and computer-readable storage medium
CN112446322B (en) * 2020-11-24 2024-01-23 杭州网易云音乐科技有限公司 Eyeball characteristic detection method, device, equipment and computer readable storage medium
CN113010635A (en) * 2021-02-19 2021-06-22 网易(杭州)网络有限公司 Text error correction method and device
CN114386396A (en) * 2021-12-17 2022-04-22 北京达佳互联信息技术有限公司 Language model training method, language model prediction device and electronic equipment
CN114386396B (en) * 2021-12-17 2024-10-25 北京达佳互联信息技术有限公司 Language model training method, prediction method, device and electronic equipment
CN114330276A (en) * 2022-01-04 2022-04-12 四川新网银行股份有限公司 Short message template generation method and system based on deep learning and electronic device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination