CN113589946B - Data processing method and device and electronic equipment - Google Patents
- Publication number
- CN113589946B (application CN202010366764.8A)
- Authority
- CN
- China
- Prior art keywords
- input
- information
- deep learning
- learning model
- sequence
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G06F3/01—Input arrangements or combined input and output arrangements for interaction between user and computer
- G06F3/02—Input arrangements using manually operated switches, e.g. using keyboards or dials
- G06F3/023—Arrangements for converting discrete items of information into a coded form, e.g. arrangements for interpreting keyboard generated codes as alphanumeric codes, operand codes or instruction codes
- G06F3/0233—Character input methods
- G06F3/0237—Character input methods using prediction or retrieval techniques
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G06F3/01—Input arrangements or combined input and output arrangements for interaction between user and computer
- G06F3/02—Input arrangements using manually operated switches, e.g. using keyboards or dials
- G06F3/023—Arrangements for converting discrete items of information into a coded form, e.g. arrangements for interpreting keyboard generated codes as alphanumeric codes, operand codes or instruction codes
- G06F3/0233—Character input methods
- G06F3/0236—Character input methods using selection techniques to select from displayed items
Landscapes
- Engineering & Computer Science (AREA)
- General Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Human Computer Interaction (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Machine Translation (AREA)
Abstract
The embodiment of the invention provides a data processing method, a data processing device and an electronic device. The method comprises: acquiring an input sequence and input association information; and performing long sentence prediction based on the input sequence and the input association information by using a deep learning model, and outputting sentence candidates. Because the deep learning model combines the input sequence with the input association information when predicting long sentences, the prediction accuracy can be improved, which in turn improves user input efficiency.
Description
Technical Field
The present invention relates to the field of data processing technologies, and in particular, to a data processing method, a data processing device, and an electronic device.
Background
With the development of computer technology, electronic devices such as mobile phones and tablet computers are becoming more popular, and great convenience is brought to life, study and work of people. These electronic devices are typically installed with an input method application (input method for short) so that a user can input information using the input method.
During user input, the input method can predict various types of candidates that match the input sequence, such as sentence candidates, name candidates and associations, for the user to choose from, thereby improving user input efficiency. However, in the prior art, the prediction accuracy of sentence candidates is not high, so the input requirements of users are not well met and user input efficiency is not effectively improved.
Disclosure of Invention
The embodiment of the invention provides a data processing method, which is used for improving the user input efficiency by improving the accuracy of long sentence prediction.
Correspondingly, the embodiment of the invention also provides a data processing device and electronic equipment, which are used for ensuring the realization and application of the method.
In order to solve the above problems, an embodiment of the present invention discloses a data processing method, which specifically includes: acquiring an input sequence and input association information; and carrying out long sentence prediction based on the input sequence and the input association information by adopting a deep learning model, and outputting sentence candidates.
Optionally, the deep learning model includes an encoding module and a decoding module; performing long sentence prediction based on the input sequence and the input association information by using the deep learning model, and outputting sentence candidates, includes: encoding the input association information with the encoding module to obtain encoded information; decoding the encoded information with the decoding module, selecting, at each moment, the X words with the highest probability from the word candidate set corresponding to that moment, outputting them, and performing prediction for the next moment based on the X words; wherein the word candidate set is obtained by screening words in a preset word set according to the input sequence, and X is a positive integer; and forming X sentence candidates from the words output at each moment.
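The decoding procedure described above resembles a beam search restricted to a per-moment word candidate set. The following is a minimal sketch under stated assumptions, not the patented implementation: `next_word_probs` is a hypothetical stand-in for the decoding module (it would normally also consume the encoded information), `candidates_for` is a hypothetical callback that returns the word candidate set for a partial sentence, and `max_steps` is an assumed stopping bound.

```python
import math
from typing import Callable, Dict, List, Set, Tuple

Prefix = Tuple[str, ...]


def beam_decode(
    next_word_probs: Callable[[Prefix], Dict[str, float]],  # hypothetical decoder stand-in
    candidates_for: Callable[[Prefix], Set[str]],           # hypothetical candidate-set lookup
    beam_width: int,                                        # X in the text
    max_steps: int,                                         # assumed stopping bound
) -> List[Prefix]:
    """Keep the X most probable partial sentences at each moment."""
    beams: List[Tuple[float, Prefix]] = [(0.0, ())]
    for _ in range(max_steps):
        expanded: List[Tuple[float, Prefix]] = []
        for log_p, words in beams:
            probs = next_word_probs(words)
            # Only words in the candidate set for this moment may be emitted.
            for word in candidates_for(words):
                p = probs.get(word, 0.0)
                if p > 0.0:
                    expanded.append((log_p + math.log(p), words + (word,)))
        if not expanded:
            break  # no hypothesis can be extended any further
        # Simplification: hypotheses that cannot be extended are dropped
        # as soon as any other hypothesis can be.
        expanded.sort(key=lambda item: item[0], reverse=True)
        beams = expanded[:beam_width]
    return [words for _, words in beams]
```

Scores are accumulated as log-probabilities so that long sentences do not underflow; the returned list holds at most X sentence candidates, each as a tuple of words.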
Optionally, the input sequence includes a pinyin sequence, and the method further includes a step of screening words in a preset word set according to the input sequence to obtain a word candidate set: at the i-th moment in the decoding process, for a target word in the preset word set, determining the length M of the target word; determining the number N of syllables in the pinyin sequence that have been matched before the i-th moment, and matching the M syllables that follow the N-th syllable in the pinyin sequence against the pinyin of the M characters in the target word; if the pinyin of the M characters in the target word prefix-matches the pinyin of the M syllables in the pinyin sequence, adding the target word to the word candidate set corresponding to the i-th moment; wherein i is a positive integer with a value range of 1 to N.
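The screening step can be illustrated with a short sketch. It assumes one possible reading of the matching condition: each typed syllable must be a prefix of the pinyin of the corresponding character, which also admits abbreviated input. The `lexicon` mapping (word to per-character pinyin) and the function name are hypothetical and introduced only for illustration.

```python
from typing import Dict, List, Sequence, Set


def build_candidate_set(
    syllables: Sequence[str],        # the pinyin sequence, split into syllables
    matched: int,                    # N: syllables already matched before this moment
    lexicon: Dict[str, List[str]],   # hypothetical preset word set: word -> pinyin per character
) -> Set[str]:
    """Return the words whose pinyin fits the next M unmatched syllables."""
    candidates: Set[str] = set()
    for word, char_pinyin in lexicon.items():
        m = len(char_pinyin)                       # M: length of the target word
        window = syllables[matched:matched + m]    # M syllables after the N-th
        if len(window) < m:
            continue  # not enough input left to cover this word
        # Assumed rule: every typed syllable is a prefix of the character's pinyin.
        if all(py.startswith(typed) for typed, py in zip(window, char_pinyin)):
            candidates.add(word)
    return candidates
```

For example, with the syllables `["wo", "men", "hao"]` and nothing matched yet, both a one-character word pronounced "wo" and a two-character word pronounced "wo men" qualify; once one syllable has been matched, only words starting at "men" remain.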
Optionally, performing long sentence prediction based on the input sequence and the input association information by using a deep learning model, and outputting sentence candidates, includes: inputting the input sequence and the input association information into the deep learning model as model input information of one dimension; extracting features from the model input information of one dimension by using the deep learning model to obtain feature information of one dimension; and performing long sentence prediction according to the feature information of one dimension, and outputting sentence candidates.
Optionally, the performing long sentence prediction based on the input sequence and the input association information by using a deep learning model, and outputting sentence candidates, includes: the input sequence and the input association information are respectively used as model input information of two dimensions and are input into the deep learning model; respectively extracting features of model input information of two dimensions by adopting the deep learning model to obtain feature information of the two dimensions; splicing the characteristic information of the two dimensions; and carrying out long sentence prediction according to the spliced characteristic information, and outputting sentence candidates.
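The difference between the one-dimension and two-dimension variants can be sketched as follows. This is only an illustration of where the inputs are merged: the extractor callables are hypothetical stand-ins for learned feature-extraction layers, and `length_features` is a toy extractor that exists purely to make the example runnable.

```python
from typing import Callable, List, Sequence

Vector = List[float]
Extractor = Callable[[Sequence[str]], Vector]


def one_dimension_features(
    input_sequence: Sequence[str],
    association_info: Sequence[str],
    extract: Extractor,
) -> Vector:
    """Merge both inputs into a single model input, then extract features once."""
    return extract(list(input_sequence) + list(association_info))


def two_dimension_features(
    input_sequence: Sequence[str],
    association_info: Sequence[str],
    extract_sequence: Extractor,
    extract_info: Extractor,
) -> Vector:
    """Extract features from each input separately, then splice (concatenate) them."""
    return extract_sequence(input_sequence) + extract_info(association_info)


def length_features(tokens: Sequence[str]) -> Vector:
    """Toy extractor (assumption): token count and total character count."""
    return [float(len(tokens)), float(sum(len(t) for t in tokens))]
```

In the first variant the model sees one combined input; in the second, the spliced vector keeps the two sources in separate positions, so the prediction layers can weight them differently.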
Optionally, the method further comprises the step of training a deep learning model: collecting corpus; performing word granularity division on the corpus to obtain training data; and training the deep learning model by adopting the training data.
Optionally, the method further comprises the step of training the deep learning model: collecting a plurality of sets of training data, each set of training data comprising: history input association information, a history input sequence, sentences input by a user under the conditions of the history input association information and the history input sequence; for a group of training data, taking a history input sequence and history input associated information in the group of training data as model input information of one dimension, and inputting the model input information into the deep learning model; extracting the characteristics of the model input information of one dimension by adopting the deep learning model to obtain the characteristic information of one dimension; performing long sentence prediction according to the feature information of the dimension, and outputting sentence candidates; and adjusting the weight of the deep learning model according to the sentence candidates and sentences input by the user in the set of training data.
Optionally, the method further comprises the step of training the deep learning model: collecting a plurality of sets of training data, each set of training data comprising: history input association information, a history input sequence, sentences input by a user under the conditions of the history input association information and the history input sequence; for a group of training data, taking a history input sequence and history input associated information in the group of training data as model input information of two dimensions, and inputting the model input information into the deep learning model; extracting the characteristics of the model input information of the two dimensions by adopting the deep learning model to obtain the characteristic information of the two dimensions; splicing the characteristic information of the two dimensions; predicting long sentences according to the spliced characteristic information, and outputting sentence candidates; and adjusting the weight of the deep learning model according to the sentence candidates and sentences input by the user in the set of training data.
The embodiment of the invention also discloses a data processing device, which specifically comprises: the acquisition module is used for acquiring an input sequence and input associated information; and the prediction module is used for predicting long sentences based on the input sequence and the input association information by adopting a deep learning model and outputting sentence candidates.
Optionally, the deep learning model includes an encoding module and a decoding module; the prediction module comprises: the first long sentence prediction sub-module is used for encoding the input associated information by adopting the encoding module to obtain encoded information; the decoding module is adopted to decode the coding information, the first X words with highest probability are selected from the corresponding word candidate set at each moment, and output and prediction at the next moment are carried out based on the X words; the word candidate set is obtained by screening words in a preset word set according to the input sequence, and X is a positive integer; and forming X sentence candidates by adopting the words output at each moment.
Optionally, the input sequence includes a pinyin sequence, and the apparatus further includes: a screening module, configured to: at the i-th moment in the decoding process, for a target word in a preset word set, determine the length M of the target word; determine the number N of syllables in the pinyin sequence that have been matched before the i-th moment, and match the M syllables that follow the N-th syllable in the pinyin sequence against the pinyin of the M characters in the target word; and, if the pinyin of the M characters in the target word prefix-matches the pinyin of the M syllables in the pinyin sequence, add the target word to the word candidate set corresponding to the i-th moment; wherein i is a positive integer with a value range of 1 to N.
Optionally, the prediction module includes: the second long sentence prediction sub-module is used for taking the input sequence and the input association information as model input information of one dimension and inputting the model input information into the deep learning model; extracting the characteristics of the model input information of one dimension by adopting the deep learning model to obtain the characteristic information of one dimension; and carrying out long sentence prediction according to the characteristic information of one dimension, and outputting sentence candidates.
Optionally, the prediction module includes: the third long sentence prediction sub-module is used for respectively taking the input sequence and the input association information as model input information of two dimensions and inputting the model input information into the deep learning model; respectively extracting features of model input information of two dimensions by adopting the deep learning model to obtain feature information of the two dimensions; splicing the characteristic information of the two dimensions; and carrying out long sentence prediction according to the spliced characteristic information, and outputting sentence candidates.
Optionally, the apparatus further comprises: the first model training module is used for collecting corpus; performing word granularity division on the corpus to obtain training data; and training the deep learning model by adopting the training data.
Optionally, the apparatus further comprises: and a second model training module for collecting a plurality of sets of training data, each set of training data comprising: history input association information, a history input sequence, sentences input by a user under the conditions of the history input association information and the history input sequence; for a group of training data, taking a history input sequence and history input associated information in the group of training data as model input information of one dimension, and inputting the model input information into the deep learning model; extracting the characteristics of the model input information of one dimension by adopting the deep learning model to obtain the characteristic information of one dimension; performing long sentence prediction according to the feature information of the dimension, and outputting sentence candidates; and adjusting the weight of the deep learning model according to the sentence candidates and sentences input by the user in the set of training data.
Optionally, the apparatus further comprises: and a third model training module for collecting a plurality of sets of training data, each set of training data comprising: history input association information, a history input sequence, sentences input by a user under the conditions of the history input association information and the history input sequence; for a group of training data, taking a history input sequence and history input associated information in the group of training data as model input information of two dimensions, and inputting the model input information into the deep learning model; extracting the characteristics of the model input information of the two dimensions by adopting the deep learning model to obtain the characteristic information of the two dimensions; splicing the characteristic information of the two dimensions; predicting long sentences according to the spliced characteristic information, and outputting sentence candidates; and adjusting the weight of the deep learning model according to the sentence candidates and sentences input by the user in the set of training data.
The embodiment of the invention also discloses a readable storage medium, which enables the electronic device to execute the data processing method according to any one of the embodiments of the invention when the instructions in the storage medium are executed by the processor of the electronic device.
The embodiment of the invention also discloses an electronic device, which comprises a memory and one or more programs, wherein the one or more programs are stored in the memory and configured to be executed by one or more processors, and the one or more programs comprise instructions for: acquiring an input sequence and input association information; and carrying out long sentence prediction based on the input sequence and the input association information by adopting a deep learning model, and outputting sentence candidates.
Optionally, the deep learning model includes an encoding module and a decoding module; performing long sentence prediction based on the input sequence and the input association information by using the deep learning model, and outputting sentence candidates, includes: encoding the input association information with the encoding module to obtain encoded information; decoding the encoded information with the decoding module, selecting, at each moment, the X words with the highest probability from the word candidate set corresponding to that moment, outputting them, and performing prediction for the next moment based on the X words; wherein the word candidate set is obtained by screening words in a preset word set according to the input sequence, and X is a positive integer; and forming X sentence candidates from the words output at each moment.
Optionally, the input sequence includes a pinyin sequence, and the one or more programs further include instructions for screening words in a preset word set according to the input sequence to obtain a word candidate set: at the i-th moment in the decoding process, for a target word in the preset word set, determining the length M of the target word; determining the number N of syllables in the pinyin sequence that have been matched before the i-th moment, and matching the M syllables that follow the N-th syllable in the pinyin sequence against the pinyin of the M characters in the target word; if the pinyin of the M characters in the target word prefix-matches the pinyin of the M syllables in the pinyin sequence, adding the target word to the word candidate set corresponding to the i-th moment; wherein i is a positive integer with a value range of 1 to N.
Optionally, performing long sentence prediction based on the input sequence and the input association information by using a deep learning model, and outputting sentence candidates, includes: inputting the input sequence and the input association information into the deep learning model as model input information of one dimension; extracting features from the model input information of one dimension by using the deep learning model to obtain feature information of one dimension; and performing long sentence prediction according to the feature information of one dimension, and outputting sentence candidates.
Optionally, the performing long sentence prediction based on the input sequence and the input association information by using a deep learning model, and outputting sentence candidates, includes: the input sequence and the input association information are respectively used as model input information of two dimensions and are input into the deep learning model; respectively extracting features of model input information of two dimensions by adopting the deep learning model to obtain feature information of the two dimensions; splicing the characteristic information of the two dimensions; and carrying out long sentence prediction according to the spliced characteristic information, and outputting sentence candidates.
Optionally, instructions for training the deep learning model are also included: collecting corpus; performing word granularity division on the corpus to obtain training data; and training the deep learning model by adopting the training data.
Optionally, instructions for training the deep learning model are also included: collecting a plurality of sets of training data, each set of training data comprising: history input association information, a history input sequence, sentences input by a user under the conditions of the history input association information and the history input sequence; for a group of training data, taking a history input sequence and history input associated information in the group of training data as model input information of one dimension, and inputting the model input information into the deep learning model; extracting the characteristics of the model input information of one dimension by adopting the deep learning model to obtain the characteristic information of one dimension; performing long sentence prediction according to the feature information of the dimension, and outputting sentence candidates; and adjusting the weight of the deep learning model according to the sentence candidates and sentences input by the user in the set of training data.
Optionally, instructions for training the deep learning model are also included: collecting a plurality of sets of training data, each set of training data comprising: history input association information, a history input sequence, sentences input by a user under the conditions of the history input association information and the history input sequence; for a group of training data, taking a history input sequence and history input associated information in the group of training data as model input information of two dimensions, and inputting the model input information into the deep learning model; extracting the characteristics of the model input information of the two dimensions by adopting the deep learning model to obtain the characteristic information of the two dimensions; splicing the characteristic information of the two dimensions; predicting long sentences according to the spliced characteristic information, and outputting sentence candidates; and adjusting the weight of the deep learning model according to the sentence candidates and sentences input by the user in the set of training data.
The embodiment of the invention has the following advantages:
In the embodiment of the invention, an input sequence and input association information can be acquired; a deep learning model is then used to perform long sentence prediction based on the input sequence and the input association information, and sentence candidates are output. Because the deep learning model combines the input sequence with the input association information when predicting long sentences, the prediction accuracy can be improved, which in turn improves user input efficiency.
Drawings
FIG. 1 is a flow chart of steps of an embodiment of a data processing method of the present invention;
FIG. 2 is a flow chart of the steps of an embodiment of a model training method of the present invention;
FIG. 3 is a flow chart of steps of an alternative embodiment of a data processing method of the present invention;
FIG. 4 is a flow chart of steps of yet another embodiment of a model training method of the present invention;
FIG. 5 is a flow chart of steps of yet another alternative embodiment of a data processing method of the present invention;
FIG. 6 is a flow chart of steps of yet another embodiment of a model training method of the present invention;
FIG. 7 is a flowchart illustrating steps of yet another alternative embodiment of a data processing method of the present invention;
FIG. 8 is a block diagram of an embodiment of a data processing apparatus of the present invention;
FIG. 9 is a block diagram of an alternative embodiment of a data processing apparatus of the present invention;
FIG. 10 is a block diagram of an electronic device for data processing, according to an exemplary embodiment;
FIG. 11 is a schematic structural view of an electronic device for data processing according to another exemplary embodiment of the present invention.
Detailed Description
In order that the above-recited objects, features and advantages of the present invention will become more readily apparent, a more particular description of the invention will be rendered by reference to the appended drawings and appended detailed description.
Referring to FIG. 1, a flowchart of the steps of an embodiment of a data processing method of the present invention is shown. The method may specifically include the following steps:
Step 102, acquiring an input sequence and input association information.
In the embodiment of the invention, long sentence prediction can be performed in the process of inputting an input sequence by a user, and corresponding sentence candidates can be generated.
The embodiment of the invention can be applied to long sentence prediction in scenes with various input modes. For example, the method can be applied to long sentence prediction in a stroke input scene, in a pinyin input scene, or in a voice input scene, and so on; embodiments of the invention are not limited in this regard.
In addition, the embodiment of the invention can also be applied to long sentence prediction in multiple language scenes. For example, the method can be applied to long sentence prediction in Chinese input scenes; the method can be applied to long sentence prediction in English input scenes; the method can also be applied to the long sentence prediction in Korean input scenes; etc., and embodiments of the invention are not limited in this regard.
Correspondingly, the input sequence may include a stroke sequence, a pinyin sequence, a foreign language string, etc., which the embodiments of the present invention are not limited to.
The input sequence and the input association information input by the user can be acquired in the process of inputting by the user by using the input method; and then predicting corresponding sentence candidates based on the acquired input sequence and the input association information.
The input association information may include information related to the input, such as preceding-text (context) information and input environment information; embodiments of the present invention are not limited in this regard.
In one example of the present invention, a manner of performing long sentence prediction based on the obtained input sequence and the input association information may refer to the following step 104:
Step 104, performing long sentence prediction based on the input sequence and the input association information by using a deep learning model, and outputting sentence candidates.
In the embodiment of the invention, the deep learning model can be trained in advance; and then, carrying out long sentence prediction by adopting the trained deep learning model. The training process of the deep learning model is described in the following embodiments. The input association information and the input sequence can be input into a deep learning model, long sentence prediction is performed by the deep learning model based on the input association information and the input sequence, and sentence candidates are output. The sentence candidates output by the deep learning model may be one or more, which is not limited in the embodiment of the present invention.
In summary, in the embodiment of the present invention, an input sequence and input association information may be obtained; a deep learning model is then used to perform long sentence prediction based on the input sequence and the input association information, and sentence candidates are output. Because the deep learning model combines the input sequence with the input association information when predicting long sentences, the prediction accuracy can be improved, which in turn improves user input efficiency.
In one embodiment of the present invention, the deep learning model may be trained in several ways; correspondingly, long sentence prediction with the deep learning model may also be performed in several ways. Each way of training the deep learning model, and the corresponding way of performing long sentence prediction with the trained model, is described below.
One way to train the deep learning model may be as follows:
Referring to FIG. 2, a flowchart illustrating steps of one embodiment of a model training method of the present invention is shown.
Step 202, collecting corpus.
In the embodiment of the invention, the corpus can be collected, and then training data is generated according to the collected corpus, so that the deep learning model is trained by adopting the training data. The corpus collection method can include various ways, for example, sentences input by a user in an input method can be collected as the corpus; for example, collecting texts, abstracts and the like in each webpage as corpus; the embodiments of the present invention are not limited in this regard.
Step 204, performing word granularity division on the corpus to obtain training data.
In the embodiment of the invention, the corpus can be divided to generate training data.
One way to divide the corpus and generate training data may be: and carrying out word granularity division on the corpus to obtain training data. Wherein, the corpus can be divided into a plurality of words by taking words as granularity; two words that are adjacent and semantically related are then employed as a set of training data.
In the embodiment of the invention, the granularity of the words can be determined based on natural language processing; the corpus is then divided into a plurality of words with word granularity. Or determining the granularity of the words based on the screen-on operation of the user in the input process; then dividing the corpus into a plurality of words by taking words as granularity; the embodiments of the present invention are not limited in this regard.
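The pairing described above can be illustrated with a minimal sketch. It assumes the corpus has already been divided into words, and it pairs every two adjacent words; the check that the two words are semantically related is not specified in the text and is omitted here. The function name `make_word_pairs` and the sample sentence are hypothetical.

```python
from typing import List, Tuple


def make_word_pairs(words: List[str]) -> List[Tuple[str, str]]:
    """Pair each word with the word that follows it.

    Assumption: adjacency alone decides a pair; the semantic-relatedness
    test mentioned in the text is left out of this sketch.
    """
    return list(zip(words, words[1:]))


# Hypothetical sentence that has already been divided at word granularity.
segmented = ["今天", "天气", "很", "好"]
pairs = make_word_pairs(segmented)
```

Each pair then serves as one set of training data: the first word is the model input and the second word is the expected output.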
Step 206, training the deep learning model by adopting the training data.
The training of the deep learning model using a set of training data is described below as an example.
In one embodiment of the present invention, when the training data is obtained by performing word granularity division on the corpus, each set of training data may include two words. The first word in the set of training data can be input into the deep learning model, and the deep learning model performs forward calculation to obtain word candidates. The word candidates are then compared with the second word in the set of training data, and the weights of the deep learning model are adjusted accordingly. The deep learning model is trained with a plurality of sets of training data in this way until the set end condition is satisfied.
The following describes a method for long sentence prediction using the deep learning model trained in steps 202 to 206.
Referring to FIG. 3, a flowchart of the steps of an alternative embodiment of a data processing method of the present invention is shown, which may specifically include the following steps:
Step 302, acquiring an input sequence and input association information.
Wherein the input sequence may comprise a single code or a plurality of codes, and embodiments of the present invention are not limited in this respect. The input association information may include preceding-context information and/or input environment information, and may of course also include other information, which is not limited by the embodiment of the present invention; the preceding-context information may include interactive information and/or content in an edit box.
In one example of the invention, the deep learning model may include an encoding module and a decoding module. The method for predicting long sentences and outputting sentence candidates based on the input sequence and the input association information by adopting a deep learning model can refer to steps 304-308:
Step 304, encoding the input association information by adopting the encoding module to obtain encoded information.
Step 306, adopting the decoding module to decode the encoded information, selecting the first X words with highest probability from the corresponding word candidate set at each moment, outputting the first X words based on the X words, and predicting the next moment; the word candidate set is obtained by screening words in a preset word set according to the input sequence.
Step 308, adopting the words output at each moment to form X sentence candidates.
In the embodiment of the invention, the input association information can be input into the encoding module of the deep learning model; the encoding module encodes the input association information and outputs the corresponding encoded information to the decoding module. The decoding module then decodes the encoded information to obtain sentence candidates.
In the decoding process, the decoding module can, at the first moment (moment t), perform a beam search over the preset word set of the deep learning model according to the input association information and select the X most probable words for the next moment (moment t+1). Then, on the one hand, the X words can be output; on the other hand, based on the X words and the input association information, a beam search is performed over the preset word set and the X most probable words for moment t+2 are selected. This process is repeated until the end of the sentence, after which the X best sentence candidates can be output, where X is a positive integer.
In order to improve the accuracy of sentence candidates output by the deep learning model and reduce the calculated amount of the deep learning model; before selecting the most probable X words from the preset word set at each moment, the embodiment of the invention can screen the words in the preset word set according to the input sequence to obtain the word candidate set. Then, each moment can select the first X words with highest probability from the corresponding word candidate set.
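The decoding procedure above is commonly known as beam search. The following is a minimal sketch of it, not the implementation of the embodiment: `next_word_probs` is a hypothetical stand-in for the decoding module (it returns a probability for each word given the words produced so far), `allowed` is a hypothetical stand-in for the screening by input sequence, and `beam_width` plays the role of X. The end-of-sentence token `</s>` and the toy probability table are likewise assumptions made for illustration.

```python
import math
from typing import Callable, Dict, List, Sequence, Tuple

Beam = Tuple[List[str], float]  # (words so far, cumulative log-probability)


def beam_search(
    next_word_probs: Callable[[Sequence[str]], Dict[str, float]],
    allowed: Callable[[Sequence[str], str], bool],
    beam_width: int,
    max_steps: int,
    end_token: str = "</s>",
) -> List[Beam]:
    """Keep the beam_width best partial sentences at every moment."""
    beams: List[Beam] = [([], 0.0)]
    finished: List[Beam] = []
    for _ in range(max_steps):
        expanded: List[Beam] = []
        for words, score in beams:
            for word, prob in next_word_probs(words).items():
                # Screening step: skip words outside the word candidate set.
                if prob <= 0.0 or not allowed(words, word):
                    continue
                expanded.append((words + [word], score + math.log(prob)))
        if not expanded:
            break
        expanded.sort(key=lambda beam: beam[1], reverse=True)
        beams = []
        for candidate in expanded[:beam_width]:
            if candidate[0][-1] == end_token:
                finished.append(candidate)
            else:
                beams.append(candidate)
        if not beams:
            break
    finished.extend(beams)
    finished.sort(key=lambda beam: beam[1], reverse=True)
    return finished[:beam_width]


def toy_probs(prefix: Sequence[str]) -> Dict[str, float]:
    """Hypothetical decoder output used only to exercise the search."""
    table = {
        (): {"我": 0.6, "你": 0.4},
        ("我",): {"好": 0.7, "们": 0.3},
        ("你",): {"好": 0.9, "们": 0.1},
    }
    return table.get(tuple(prefix), {"</s>": 1.0})


unfiltered = beam_search(toy_probs, lambda prefix, word: True, beam_width=2, max_steps=4)
filtered = beam_search(toy_probs, lambda prefix, word: word != "我", beam_width=2, max_steps=4)
```

In the second call the screening function removes one word before ranking, so the beam is filled only from words that remain in the candidate set, which is the effect the screening step is intended to have.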
Taking an input sequence as a pinyin sequence as an example, a mode of screening words in a preset word set according to the input sequence to obtain a word candidate set is described; reference may be made to the following sub-step 22-sub-step 26:
Sub-step 22, at the ith moment in the decoding process, for a target word in the preset word set, determining the length M of the target word.
Sub-step 24, determining the number N of syllables in the pinyin sequence that have been matched before the ith moment, and matching the pinyin of the M syllables after the Nth syllable in the pinyin sequence against the pinyin of the M characters in the target word.
Sub-step 26, if the pinyin of the M characters in the target word prefix-matches the pinyin of the M syllables after the Nth syllable in the pinyin sequence, adding the target word into the word candidate set corresponding to the ith moment.
In the embodiment of the invention, the preset word set of the deep learning model comprises a plurality of words, and the lengths of different words can be the same or different. At the ith moment in the decoding process, one word at a time can be selected from the preset word set as the target word, and the pinyin sequence is then matched against the target word. When the target word matches the pinyin sequence, the target word can be added into the word candidate set corresponding to the ith moment; when the target word does not match the pinyin sequence, the target word can be filtered out.
One way to match the pinyin sequence to the target word may be: determining the length M of the target word and performing phonetic notation on the target word; and determining the syllable number N which is matched before the ith moment in the pinyin sequence. And then adopting the pinyin of M syllables after the Nth syllable in the pinyin sequence to match the pinyin of M words in the target word.
Since most users are accustomed to entering only the first code character, or the first few code characters, of the target text, the embodiment of the invention can provide sentence candidates related to the target input even when the user has not entered a complete input sequence, thereby improving the user's input efficiency. To this end, it can be determined whether the pinyin of the M characters in the target word prefix-matches the pinyin of the M syllables after the Nth syllable in the pinyin sequence. If so, the target word can be added into the word candidate set corresponding to the ith moment, so that a comprehensive and accurate word candidate set is screened out, which in turn improves the comprehensiveness and accuracy of the words predicted by the decoding module at each moment.
One way to determine whether the pinyin of the M characters in the target word prefix-matches the pinyin of the M syllables after the Nth syllable in the pinyin sequence is as follows. Starting with the first of these syllables, it is determined whether the syllable contains only an initial, or both an initial and a final. If the syllable contains both an initial and a final, it is judged whether the pinyin of the first character of the target word exactly matches the syllable. If it does, the pinyin of the second character of the target word is matched against the next syllable of the pinyin sequence in the same manner; otherwise, the target word is filtered out. If the syllable contains only an initial, it is judged whether the initial of the pinyin of the first character of the target word matches the syllable. If it does, the pinyin of the second character of the target word is matched against the next syllable of the pinyin sequence in the same manner; otherwise, the target word is filtered out.
Matching proceeds in this manner until the pinyin of the (M-1)th character of the target word has been matched against the (N+M-1)th syllable of the pinyin sequence. For the (N+M)th syllable of the pinyin sequence, it is judged whether the pinyin of the Mth character in the target word prefix-matches the syllable; that is, whether the pinyin of the (N+M)th syllable matches the beginning of the pinyin of the Mth character. If it does, the target word is added into the word candidate set corresponding to the ith moment. If it does not, the target word is filtered out.
A word candidate set corresponding to the ith moment is thereby obtained, where i is a positive integer with a value range of 1 to N. Then, when the decoding module selects the X most probable words at the ith moment, it can select them from the word candidate set corresponding to the ith moment, which improves both the accuracy and the efficiency of word selection.
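The screening of sub-steps 22 to 26 can be sketched as follows. This is an illustration under stated assumptions rather than the method of the embodiment: the table of initials (including `y` and `w`, which input methods often treat as initials), the rule that a word is rejected when fewer than M syllables remain, and the four-word lexicon are all hypothetical, as are the function names.

```python
from typing import Dict, List, Sequence

# Assumed table of pinyin initials; "y" and "w" are included for convenience.
TWO_LETTER_INITIALS = ("zh", "ch", "sh")
ONE_LETTER_INITIALS = set("bpmfdtnlgkhjqxrzcsyw")


def initial_of(pinyin: str) -> str:
    """Return the initial of a full pinyin syllable, or '' if it has none."""
    for initial in TWO_LETTER_INITIALS:
        if pinyin.startswith(initial):
            return initial
    return pinyin[0] if pinyin[:1] in ONE_LETTER_INITIALS else ""


def is_initial_only(syllable: str) -> bool:
    return syllable in TWO_LETTER_INITIALS or syllable in ONE_LETTER_INITIALS


def word_matches(word_pinyin: Sequence[str], syllables: Sequence[str], n_matched: int) -> bool:
    """Match the M characters of a word against the M syllables after the Nth."""
    m = len(word_pinyin)
    window = syllables[n_matched:n_matched + m]
    if m == 0 or len(window) < m:
        return False  # assumption: too few remaining syllables means no match
    for k, (char_pinyin, syllable) in enumerate(zip(word_pinyin, window)):
        if k == m - 1:
            # Last character: the typed syllable only has to be a prefix.
            if not char_pinyin.startswith(syllable):
                return False
        elif is_initial_only(syllable):
            if initial_of(char_pinyin) != syllable:
                return False
        elif char_pinyin != syllable:
            return False
    return True


def build_candidate_set(lexicon: Dict[str, List[str]], syllables: Sequence[str], n_matched: int) -> List[str]:
    return [word for word, py in lexicon.items() if word_matches(py, syllables, n_matched)]


# Hypothetical preset word set, each word annotated with per-character pinyin.
lexicon = {"今天": ["jin", "tian"], "天气": ["tian", "qi"], "明天": ["ming", "tian"], "金": ["jin"]}
first_moment = build_candidate_set(lexicon, ["j", "t", "q"], n_matched=0)
after_one_syllable = build_candidate_set(lexicon, ["j", "t", "q"], n_matched=1)
partial_final = build_candidate_set(lexicon, ["jin", "ti"], n_matched=0)
```

With the abbreviated input `j t q`, the first call keeps the words whose characters begin with `j` (and then `t`), and the second call, made after one syllable has been consumed, keeps only the word that fits `t q`.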
In addition, error correction can be performed on the input sequence, and the words in the preset word set can then be screened using the error-corrected input sequence to obtain the word candidate set; in this way, sentence candidates that meet the user's needs can be provided even when the user mistypes.
In summary, in the embodiment of the present invention, an input sequence and input association information may be acquired, and then the input association information is encoded by using the encoding module to obtain encoded information; the decoding module is adopted to decode the coding information, the first X words with highest probability are selected from the corresponding word candidate set at each moment, the output is carried out based on the X words and the prediction of the next moment is carried out, and then the words output at each moment are adopted to form X sentence candidates; the word candidate set is obtained by screening words in a preset word set according to the input sequence, and then the accuracy of the predicted words at each moment is improved through screening while generating; thereby further improving the accuracy of the predicted sentence candidates.
Wherein, another way of training the deep learning model can be as follows:
Referring to FIG. 4, a flowchart of the steps of yet another embodiment of the model training method of the present invention is shown.
Step 402, collecting multiple sets of training data, each set of training data comprising: history input association information, a history input sequence, and sentences input by a user under the conditions of the history input association information and the history input sequence.
In the embodiment of the invention, the following can be collected: input sequences historically entered by the user, the input association information present when the user entered each input sequence, and the sentence the user entered after entering that input sequence under that input association information.
For convenience of the following explanation, the input sequence of the user history input may be referred to as a history input sequence, and the input related information when the user inputs the history input sequence may be referred to as history input related information. Wherein, a history input sequence, the history input associated information corresponding to the history input sequence, and sentences input by the user under the conditions of the history input associated information and the history input sequence can be used as a set of training data.
Then, the deep learning model is trained by adopting the plurality of sets of collected training data; the following describes training the deep learning model using one set of training data.
Step 404, for a set of training data, using the history input sequence and the history input association information in the set of training data as model input information of one dimension, and inputting the model input information into the deep learning model.
Step 406, extracting the characteristics of the model input information of one dimension by adopting the deep learning model to obtain the characteristic information of one dimension.
Step 408, performing long sentence prediction according to the feature information of the dimension, and outputting sentence candidates.
Step 410, adjusting the weight of the deep learning model according to the sentence candidate and the sentence input by the user in the set of training data.
In the embodiment of the invention, for each set of training data, the history input sequence and the history input associated information in the set of training data can be used as model input information of one dimension and input into the deep learning model. Wherein the history input sequence and the history input association information may be encoded separately; for example, the historical input sequence may be encoded to obtain a corresponding syllable vector, and the historical input association information may be encoded to obtain a corresponding word vector. And then sequentially inputting the coded history input sequence and history input related information into a deep learning model. The embodiment of the invention does not limit the sequence of inputting the history input sequence and the history input associated information into the deep learning model.
The deep learning model may then perform forward calculations based on the historical input sequence and the historical input association information, outputting sentence candidates. The deep learning model can extract the characteristic information of the history input sequence and of the history input association information in its first A layers to obtain the characteristic information of one dimension; the characteristic information of that dimension is then input into the subsequent B layers for calculation, and sentence candidates are output. A and B are both positive integers, and their values are not limited in this embodiment of the invention.
Comparing sentence candidates output by the deep learning model with sentences input by a user in the training data set, and adjusting the weight of the deep learning model; further, the deep learning model is trained by using a plurality of sets of training data in this way until the set end condition is satisfied.
The following describes a method for long sentence prediction using the deep learning model trained in steps 402 to 410.
Referring to fig. 5, a flowchart of the steps of yet another alternative embodiment of a data processing method of the present invention is shown.
Step 502, acquiring an input sequence and input association information.
This step 502 is similar to the step 302 described above, and will not be described again.
Step 504, taking the input sequence and the input association information as model input information of one dimension, and inputting the model input information into the deep learning model.
Step 506, extracting the characteristics of the model input information of one dimension by adopting the deep learning model to obtain the characteristic information of one dimension.
Step 508, carrying out long sentence prediction according to the characteristic information of the dimension, and outputting sentence candidates.
In the embodiment of the present invention, in the training process of the deep learning model, the step 402-step 410 uses the history input sequence and the history input association information as the model input information of one dimension, and inputs the model input information into the deep learning model. Correspondingly, when long sentence prediction is performed by using the trained deep learning model, the obtained input sequence and the input association information can be used as model input information of the same dimension and input into the trained deep learning model. Wherein the input sequence and the input association information may be encoded separately; for example, the input sequence may be encoded to obtain a corresponding syllable vector, and the input association information may be encoded to obtain a corresponding word vector. And then sequentially inputting the encoded input sequence and the input association information into a deep learning model. The sequence of inputting the encoded input sequence and the encoded input related information into the deep learning model is the same as the sequence of inputting the encoded historical input sequence and the encoded historical input related information into the deep learning model. The input sequence is encoded in the same manner as the history input sequence, and the input related information is encoded in the same manner as the history input related information.
The input sequence and the input association information are used as model input information of the same dimension, after the input sequence and the input association information are input into a deep learning model, the deep learning model can extract the characteristic information of the input sequence and the input association information, and the characteristic information of one dimension is obtained. And then, the deep learning model predicts long sentences according to the characteristic information and outputs sentence candidates.
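As a minimal sketch of treating the two inputs as model input information of one dimension, the encoded input association information and the encoded input sequence can be joined into a single sequence before being fed to the model. The integer identifiers below stand in for the syllable vectors and word vectors, and the separator identifier, the lookup tables, and the function name are all hypothetical. The only point taken from the text is that the same encoding and the same ordering must be used in training and in prediction.

```python
from typing import Dict, List, Sequence


def build_single_input(
    syllables: Sequence[str],
    context_words: Sequence[str],
    syllable_ids: Dict[str, int],
    word_ids: Dict[str, int],
    sep_id: int = 0,
) -> List[int]:
    """Encode both inputs and join them: context, separator, then syllables.

    Assumption: a separator id marks the boundary between the two parts;
    the text does not say how, or whether, the boundary is marked.
    """
    encoded_context = [word_ids[word] for word in context_words]
    encoded_syllables = [syllable_ids[syllable] for syllable in syllables]
    return encoded_context + [sep_id] + encoded_syllables


# Hypothetical lookup tables with disjoint id ranges.
syllable_ids = {"jin": 1, "tian": 2}
word_ids = {"你": 101, "好": 102}
model_input = build_single_input(["jin", "tian"], ["你", "好"], syllable_ids, word_ids)
```

Calling the same function for both the historical data and the live input keeps the order and the encoding identical between training and prediction.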
In summary, in the embodiment of the present invention, after an input sequence and input association information are acquired, the input sequence and the input association information may be used as model input information of the same dimension and input into the deep learning model; the deep learning model extracts the characteristics of the model input information of one dimension to obtain the characteristic information of one dimension, then carries out long sentence prediction according to the characteristic information of one dimension and outputs sentence candidates. This increases the amount of data in the single-dimension model input information fed to the deep learning model, which further improves the accuracy of predicting sentence candidates and thereby the user input efficiency and the user experience.
Wherein, another way of training the deep learning model can be as follows:
Referring to FIG. 6, a flowchart of the steps of yet another embodiment of the model training method of the present invention is shown.
Step 602, collecting multiple sets of training data, each set of training data including: history input association information, a history input sequence, and sentences input by a user under the conditions of the history input association information and the history input sequence.
Step 602 is similar to step 402 described above and will not be described again.
Then, the deep learning model is trained by adopting the plurality of sets of collected training data; the following describes training the deep learning model using one set of training data.
Step 604, for a set of training data, taking the history input sequence and the history input association information in the set of training data as two-dimensional model input information, and inputting the two-dimensional model input information into the deep learning model.
Step 606, respectively extracting the characteristics of the model input information of the two dimensions by adopting the deep learning model to obtain the characteristic information of the two dimensions.
Step 608, splicing the characteristic information of the two dimensions.
Step 610, long sentence prediction is performed according to the spliced characteristic information, and sentence candidates are output.
Step 612, adjusting the weight of the deep learning model according to the sentence candidate and the sentence input by the user in the set of training data.
In the embodiment of the invention, for each set of training data, the history input sequence and the history input associated information in the set of training data can be respectively used as model input information of one dimension and input into the deep learning model. That is, the history input sequence and history input related information in the set of training data are input as model input information in two dimensions to the deep learning model. Wherein the history input sequence and the history input association information may be encoded separately; for example, the historical input sequence can be encoded to obtain a corresponding syllable vector, and the historical input associated information can be encoded to obtain a corresponding word vector; and then using the syllable vector as model input information of one dimension, and using the word vector as model input information of the other dimension to respectively input the syllable vector into the deep learning model.
The deep learning model may then perform forward calculations based on the historical input sequence and the historical input association information, outputting sentence candidates. The deep learning model can extract the characteristic information of the history input sequence and the characteristic information of the history input association information separately in its first A layers, obtaining the characteristic information of two dimensions. The characteristic information of the history input sequence and the characteristic information of the history input association information can then be spliced to obtain the characteristic information of one dimension, and the spliced characteristic information is input into the subsequent B layers for calculation.
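The separate extraction followed by splicing can be sketched as below. Mean pooling is used here only as a placeholder for the feature extraction performed by the model's first layers, which the text does not specify; the function names and the sample vectors are hypothetical. Splicing is shown as plain concatenation of the two feature vectors.

```python
from typing import List, Sequence

Vector = List[float]


def mean_pool(vectors: Sequence[Vector]) -> Vector:
    """Placeholder feature extractor: average the vectors element-wise."""
    count = len(vectors)
    return [sum(column) / count for column in zip(*vectors)]


def splice(first: Vector, second: Vector) -> Vector:
    """Concatenate two feature vectors into one."""
    return first + second


# Hypothetical syllable vectors (two syllables, width 2) and
# word vectors (one context word, width 3).
syllable_vectors = [[1.0, 3.0], [3.0, 5.0]]
word_vectors = [[0.0, 2.0, 4.0]]

sequence_features = mean_pool(syllable_vectors)   # features of the input sequence
context_features = mean_pool(word_vectors)        # features of the association information
spliced = splice(sequence_features, context_features)
```

Because the two branches are extracted independently, their feature widths need not be equal; only the spliced vector has to match the input width expected by the later layers.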
Comparing sentence candidates output by the deep learning model with sentences input by a user in the training data set, and adjusting the weight of the deep learning model; further, the deep learning model is trained by using a plurality of sets of training data in this way until the set end condition is satisfied.
The following describes a method for long sentence prediction using the deep learning model trained in steps 602 to 612.
Referring to fig. 7, a flowchart of the steps of yet another alternative embodiment of a data processing method of the present invention is shown.
Step 702, acquiring an input sequence and input association information.
This step 702 is similar to the step 302 described above, and will not be described again.
Step 704, the input sequence and the input association information are respectively used as model input information of two dimensions and are input into the deep learning model.
Step 706, respectively extracting the characteristics of the model input information of the two dimensions by adopting the deep learning model to obtain the characteristic information of the two dimensions.
Step 708, splicing the characteristic information of the two dimensions.
Step 710, predicting long sentences according to the spliced characteristic information, and outputting sentence candidates.
In the embodiment of the present invention, in the training process of the deep learning model, steps 602 to 612 use the historical input sequence and the historical input association information as model input information of two dimensions and input them into the deep learning model. Correspondingly, when long sentence prediction is performed using the trained deep learning model, the acquired input sequence and input association information can be used as model input information of two dimensions and input into the trained deep learning model. The input sequence and the input association information may be encoded separately; for example, the input sequence may be encoded to obtain a corresponding syllable vector, and the input association information may be encoded to obtain a corresponding word vector. The encoded input sequence is then taken as model input information of one dimension and the encoded input association information as model input information of the other dimension, and both are input into the deep learning model. The deep learning model can extract the characteristics of the model input information of the two dimensions respectively to obtain the characteristic information of the two dimensions, and splice the characteristic information of the two dimensions; the deep learning model then carries out long sentence prediction according to the spliced characteristic information and outputs sentence candidates.
In summary, in the embodiment of the present invention, after an input sequence and input association information are acquired, the input sequence and the input association information may be respectively used as model input information of two dimensions and input into the deep learning model; then, respectively extracting features of model input information of two dimensions by adopting the deep learning model to obtain feature information of the two dimensions, and splicing the feature information of the two dimensions; then, long sentence prediction is carried out according to the spliced characteristic information, and sentence candidates are output; the dimension of model input information input into the deep learning model can be increased, the accuracy of predicting sentence candidates is further improved, and therefore user input efficiency and user experience are further improved.
In addition, the deep learning model can output a candidate score for each sentence candidate together with the sentence candidate itself, so that the sentence candidates can subsequently be sorted according to their candidate scores and displayed in the sorted order.
When there is a large amount of input association information, only part of it may be input into the deep learning model to obtain the plurality of sentence candidates output by the model, so as to reduce the computational load of the deep learning model. In that case, after the sentence candidates are obtained, they can be ranked based on the complete input association information together with their candidate scores, which improves the accuracy of the ranking. Of course, sentence candidates may also be ranked based on the complete input association information alone, which is not limited in this embodiment of the present invention.
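A minimal sketch of such ranking follows. How the complete input association information is turned into a number is not described in the text, so `context_score` is a hypothetical callable supplied by the caller, and the additive combination with a `weight` factor is likewise an assumption chosen for illustration.

```python
from typing import Callable, List, Optional, Sequence, Tuple


def rank_candidates(
    candidates: Sequence[Tuple[str, float]],
    context_score: Optional[Callable[[str], float]] = None,
    weight: float = 1.0,
) -> List[str]:
    """Sort sentence candidates by model score plus an optional context bonus."""

    def combined(item: Tuple[str, float]) -> float:
        sentence, model_score = item
        bonus = context_score(sentence) if context_score is not None else 0.0
        return model_score + weight * bonus

    ordered = sorted(candidates, key=combined, reverse=True)
    return [sentence for sentence, _ in ordered]


# Hypothetical sentence candidates with candidate scores from the model.
scored = [("今天天气很好", 0.5), ("今天天气很冷", 0.4)]

by_model_score = rank_candidates(scored)
# Hypothetical context signal: the complete context mentions cold weather.
with_context = rank_candidates(scored, lambda s: 1.0 if "冷" in s else 0.0, weight=0.2)
```

Without a context function the order follows the candidate scores alone; with it, a candidate that fits the complete context better can move ahead of one the model scored slightly higher.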
It should be noted that, for simplicity of description, the method embodiments are shown as a series of acts, but it should be understood by those skilled in the art that the embodiments are not limited by the order of acts, as some steps may occur in other orders or concurrently in accordance with the embodiments. Further, those skilled in the art will appreciate that the embodiments described in the specification are presently preferred embodiments, and that the acts are not necessarily required by the embodiments of the invention.
Referring to FIG. 8, a block diagram of an embodiment of a data processing apparatus of the present invention is shown, which may include the following modules:
An obtaining module 802, configured to obtain an input sequence and input association information;
And the prediction module 804 is used for predicting long sentences based on the input sequence and the input association information by adopting a deep learning model and outputting sentence candidates.
Referring to FIG. 9, a block diagram of an alternative embodiment of a data processing apparatus of the present invention is shown.
In an alternative embodiment of the present invention, the deep learning model includes an encoding module and a decoding module; the prediction module 804 includes:
A first long sentence prediction sub-module 8042, configured to encode the input association information by using the encoding module, so as to obtain encoded information; the decoding module is adopted to decode the coding information, the first X words with highest probability are selected from the corresponding word candidate set at each moment, and output and prediction at the next moment are carried out based on the X words; the word candidate set is obtained by screening words in a preset word set according to the input sequence, and X is a positive integer; and forming X sentence candidates by adopting the words output at each moment.
In an alternative embodiment of the present invention, the input sequence includes a pinyin sequence, and the apparatus further includes:
A screening module 806, configured to determine, at an ith moment in the decoding process, for a target word in the preset word set, the length M of the target word; determine the number N of syllables in the pinyin sequence that have been matched before the ith moment, and match the pinyin of the M syllables after the Nth syllable in the pinyin sequence against the pinyin of the M characters in the target word; and, if the pinyin of the M characters in the target word prefix-matches the pinyin of the M syllables after the Nth syllable in the pinyin sequence, add the target word into the word candidate set corresponding to the ith moment; wherein i is a positive integer with a value range of 1 to N.
In an alternative embodiment of the present invention, the prediction module 804 includes:
A second long sentence prediction submodule 8044, configured to input the input sequence and the input association information as model input information of one dimension into the deep learning model; extracting the characteristics of the model input information of one dimension by adopting the deep learning model to obtain the characteristic information of one dimension; and carrying out long sentence prediction according to the characteristic information of one dimension, and outputting sentence candidates.
In an alternative embodiment of the present invention, the prediction module 804 includes:
A third long sentence prediction sub-module 8046, configured to input the input sequence and the input association information into the deep learning model as two-dimensional model input information respectively; respectively extracting features of model input information of two dimensions by adopting the deep learning model to obtain feature information of the two dimensions; splicing the characteristic information of the two dimensions; and carrying out long sentence prediction according to the spliced characteristic information, and outputting sentence candidates.
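As a non-limiting illustration, the difference between the one-dimension and two-dimension input arrangements described above may be sketched as follows. The names `embed`, `extract_features`, `one_dimension_features`, and `two_dimension_features`, the feature size `DIM`, the `<sep>` token, and the averaging feature extractor are all hypothetical; a real deep learning model would use learned layers instead.

```python
from typing import List, Sequence

DIM = 4  # hypothetical feature size


def embed(token: str) -> List[float]:
    """Deterministic toy embedding standing in for a learned embedding table."""
    code = sum(map(ord, token))
    return [((code * (k + 1)) % 7) / 7.0 for k in range(DIM)]


def extract_features(tokens: Sequence[str]) -> List[float]:
    """Toy feature extractor: the mean of the token embeddings."""
    if not tokens:
        return [0.0] * DIM
    vectors = [embed(t) for t in tokens]
    return [sum(column) / len(vectors) for column in zip(*vectors)]


def one_dimension_features(
    input_sequence: Sequence[str], association_info: Sequence[str]
) -> List[float]:
    """Merge both inputs into one sequence, then extract features once."""
    merged = list(association_info) + ["<sep>"] + list(input_sequence)
    return extract_features(merged)


def two_dimension_features(
    input_sequence: Sequence[str], association_info: Sequence[str]
) -> List[float]:
    """Extract features from each input separately, then splice the results."""
    return extract_features(input_sequence) + extract_features(association_info)
```

In this sketch the one-dimension arrangement yields a vector of length `DIM`, while the two-dimension arrangement yields a spliced vector of length `2 * DIM` in which each half depends on only one of the inputs.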
In an alternative embodiment of the present invention, the apparatus further comprises:
A first model training module 808, configured to collect a corpus; perform word granularity division on the corpus to obtain training data; and train the deep learning model by using the training data.
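As a non-limiting illustration, word granularity division of a corpus may be sketched with forward maximum matching against a vocabulary. The source does not specify the segmentation technique; forward maximum matching, the names `segment` and `build_training_pairs`, the `max_word_len` parameter, and the choice of (preceding words, next word) pairs as training data are assumptions made for this sketch.

```python
from typing import List, Sequence, Set, Tuple


def segment(sentence: str, vocabulary: Set[str], max_word_len: int = 4) -> List[str]:
    """Split a sentence into words by forward maximum matching."""
    words = []
    i = 0
    while i < len(sentence):
        # Try the longest possible word first, falling back to one character.
        for size in range(min(max_word_len, len(sentence) - i), 0, -1):
            piece = sentence[i:i + size]
            if size == 1 or piece in vocabulary:
                words.append(piece)
                i += size
                break
    return words


def build_training_pairs(
    corpus: Sequence[str], vocabulary: Set[str]
) -> List[Tuple[List[str], str]]:
    """Turn each segmented sentence into (preceding words, next word) pairs."""
    pairs = []
    for sentence in corpus:
        words = segment(sentence, vocabulary)
        for k in range(1, len(words)):
            pairs.append((words[:k], words[k]))
    return pairs
```

Each pair asks the model to predict the next word from the words before it, which is one plausible way to obtain word-level training data from raw sentences.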
In an alternative embodiment of the present invention, the apparatus further comprises:
A second model training module 810 for collecting a plurality of sets of training data, each set of training data comprising: history input association information, a history input sequence, sentences input by a user under the conditions of the history input association information and the history input sequence; for a group of training data, taking a history input sequence and history input associated information in the group of training data as model input information of one dimension, and inputting the model input information into the deep learning model; extracting the characteristics of the model input information of one dimension by adopting the deep learning model to obtain the characteristic information of one dimension; performing long sentence prediction according to the feature information of the dimension, and outputting sentence candidates; and adjusting the weight of the deep learning model according to the sentence candidates and sentences input by the user in the set of training data.
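As a non-limiting illustration, the weight adjustment described above may be sketched as one gradient step on a cross-entropy loss. The sketch simplifies the task to predicting a single word from a feature vector; the names `softmax` and `train_step`, the linear scoring layer, the learning rate, and the choice of cross-entropy are all assumptions, since the source states only that the weights are adjusted according to the sentence candidates and the sentence input by the user.

```python
import math
from typing import List


def softmax(scores: List[float]) -> List[float]:
    """Convert raw scores to probabilities in a numerically stable way."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def train_step(
    weights: List[List[float]],
    features: List[float],
    target_index: int,
    learning_rate: float = 0.5,
) -> float:
    """Update the weights in place and return the loss before the update.

    weights[j] is the hypothetical weight row that scores output word j;
    target_index identifies the word the user actually input.
    """
    scores = [sum(w * x for w, x in zip(row, features)) for row in weights]
    probs = softmax(scores)
    loss = -math.log(probs[target_index])
    for j, row in enumerate(weights):
        # Gradient of cross-entropy with respect to score j.
        grad = probs[j] - (1.0 if j == target_index else 0.0)
        for k, x in enumerate(features):
            row[k] -= learning_rate * grad * x
    return loss
```

Calling `train_step` repeatedly on the same example lowers the returned loss, which mirrors the idea that the model's prediction is pulled toward the sentence the user actually input.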
In an alternative embodiment of the present invention, the apparatus further comprises:
A third model training module 812 for collecting multiple sets of training data, each set of training data comprising: history input association information, a history input sequence, sentences input by a user under the conditions of the history input association information and the history input sequence; for a group of training data, taking a history input sequence and history input associated information in the group of training data as model input information of two dimensions, and inputting the model input information into the deep learning model; extracting the characteristics of the model input information of the two dimensions by adopting the deep learning model to obtain the characteristic information of the two dimensions; splicing the characteristic information of the two dimensions; predicting long sentences according to the spliced characteristic information, and outputting sentence candidates; and adjusting the weight of the deep learning model according to the sentence candidates and sentences input by the user in the set of training data.
In summary, in the embodiments of the present invention, an input sequence and input association information may be obtained; a deep learning model is then used to perform long sentence prediction based on the input sequence and the input association information, and sentence candidates are output. Because long sentence prediction is performed by a deep learning model that combines the input sequence with the input association information, the prediction accuracy can be improved, thereby improving the user's input efficiency.
For the device embodiments, since they are substantially similar to the method embodiments, the description is relatively simple, and reference is made to the description of the method embodiments for relevant points.
Fig. 10 is a block diagram illustrating a configuration of an electronic device 1000 for data processing according to an example embodiment. For example, electronic device 1000 may be a mobile phone, computer, digital broadcast terminal, messaging device, game console, tablet device, medical device, exercise device, personal digital assistant, or the like.
Referring to fig. 10, an electronic device 1000 may include one or more of the following components: a processing component 1002, a memory 1004, a power component 1006, a multimedia component 1008, an audio component 1010, an input/output (I/O) interface 1012, a sensor component 1014, and a communication component 1016.
The processing component 1002 generally controls overall operation of the electronic device 1000, such as operations associated with display, telephone calls, data communications, camera operations, and recording operations. The processing component 1002 may include one or more processors 1020 to execute instructions to perform all or part of the steps of the methods described above. Further, the processing component 1002 can include one or more modules that facilitate interaction between the processing component 1002 and other components. For example, the processing component 1002 may include a multimedia module to facilitate interaction between the multimedia component 1008 and the processing component 1002.
The memory 1004 is configured to store various types of data to support operations at the electronic device 1000. Examples of such data include instructions for any application or method operating on the electronic device 1000, contact data, phonebook data, messages, pictures, videos, and so forth. The memory 1004 may be implemented by any type or combination of volatile or nonvolatile memory devices, such as Static Random Access Memory (SRAM), electrically erasable programmable read-only memory (EEPROM), erasable programmable read-only memory (EPROM), programmable read-only memory (PROM), read-only memory (ROM), magnetic memory, flash memory, magnetic disk, or optical disk.
The power component 1006 provides power to the various components of the electronic device 1000. Power component 1006 can include a power management system, one or more power sources, and other components associated with generating, managing, and distributing power for electronic device 1000.
The multimedia component 1008 includes a screen that provides an output interface between the electronic device 1000 and the user. In some embodiments, the screen may include a Liquid Crystal Display (LCD) and a Touch Panel (TP). If the screen includes a touch panel, the screen may be implemented as a touch screen to receive input signals from a user. The touch panel includes one or more touch sensors to sense touches, swipes, and gestures on the touch panel. The touch sensor may sense not only the boundary of a touch or slide action, but also the duration and pressure associated with the touch or slide operation. In some embodiments, the multimedia component 1008 includes a front-facing camera and/or a rear-facing camera. When the electronic device 1000 is in an operational mode, such as a shooting mode or a video mode, the front-facing camera and/or the rear-facing camera may receive external multimedia data. Each front-facing camera and rear-facing camera may be a fixed optical lens system or have focal length and optical zoom capabilities.
The audio component 1010 is configured to output and/or input audio signals. For example, the audio component 1010 includes a Microphone (MIC) configured to receive external audio signals when the electronic device 1000 is in an operational mode, such as a call mode, a recording mode, and a voice recognition mode. The received audio signals may be further stored in memory 1004 or transmitted via communication component 1016. In some embodiments, the audio component 1010 further comprises a speaker for outputting audio signals.
The I/O interface 1012 provides an interface between the processing component 1002 and peripheral interface modules, which may be a keyboard, a click wheel, buttons, and the like. These buttons may include, but are not limited to: a home button, a volume button, a start button, and a lock button.
The sensor component 1014 includes one or more sensors for providing status assessments of various aspects of the electronic device 1000. For example, the sensor component 1014 may detect an on/off state of the electronic device 1000 and the relative positioning of components, such as a display and keypad of the electronic device 1000. The sensor component 1014 may also detect a change in position of the electronic device 1000 or of a component of the electronic device 1000, the presence or absence of user contact with the electronic device 1000, the orientation or acceleration/deceleration of the electronic device 1000, and a change in temperature of the electronic device 1000. The sensor component 1014 may include a proximity sensor configured to detect the presence of nearby objects in the absence of any physical contact. The sensor component 1014 may also include a light sensor, such as a CMOS or CCD image sensor, for use in imaging applications. In some embodiments, the sensor component 1014 can also include an acceleration sensor, a gyroscopic sensor, a magnetic sensor, a pressure sensor, or a temperature sensor.
The communication component 1016 is configured to facilitate wired or wireless communication between the electronic device 1000 and other devices. The electronic device 1000 may access a wireless network based on a communication standard, such as WiFi, 2G, or 3G, or a combination thereof. In one exemplary embodiment, the communication component 1016 receives a broadcast signal or broadcast-related information from an external broadcast management system via a broadcast channel. In an exemplary embodiment, the communication component 1016 further includes a Near Field Communication (NFC) module to facilitate short-range communications. For example, the NFC module may be implemented based on Radio Frequency Identification (RFID) technology, Infrared Data Association (IrDA) technology, ultra-wideband (UWB) technology, Bluetooth (BT) technology, and other technologies.
In an exemplary embodiment, the electronic device 1000 may be implemented by one or more application-specific integrated circuits (ASICs), digital signal processors (DSPs), digital signal processing devices (DSPDs), programmable logic devices (PLDs), field-programmable gate arrays (FPGAs), controllers, microcontrollers, microprocessors, or other electronic elements for executing the methods described above.
In an exemplary embodiment, a non-transitory computer-readable storage medium including instructions is also provided, such as the memory 1004 including instructions, the instructions being executable by the processor 1020 of the electronic device 1000 to perform the above-described method. For example, the non-transitory computer-readable storage medium may be ROM, random access memory (RAM), CD-ROM, magnetic tape, floppy disk, optical data storage device, etc.
A non-transitory computer-readable storage medium, wherein instructions in the storage medium, when executed by a processor of an electronic device, cause the electronic device to perform a data processing method, the method comprising: acquiring an input sequence and input association information; and performing long sentence prediction based on the input sequence and the input association information by using a deep learning model, and outputting sentence candidates.
Optionally, the deep learning model includes an encoding module and a decoding module; the performing long sentence prediction based on the input sequence and the input association information by using a deep learning model, and outputting sentence candidates, includes: encoding the input association information by using the encoding module to obtain encoded information; decoding the encoded information by using the decoding module, selecting, at each moment, the X words with the highest probability from the corresponding word candidate set, and performing output and next-moment prediction based on the X words; wherein the word candidate set is obtained by screening words in a preset word set according to the input sequence, and X is a positive integer; and forming X sentence candidates from the words output at each moment.
Optionally, the input sequence includes a pinyin sequence, and the method further includes a step of screening words in a preset word set according to the input sequence to obtain a word candidate set: at the i-th moment in the decoding process, determining, for a target word in the preset word set, the length M of the target word; determining the number N of syllables in the pinyin sequence that have been matched before the i-th moment, and matching the M syllables following the N-th syllable in the pinyin sequence against the pinyin corresponding to the M characters in the target word; if the pinyin corresponding to the M characters in the target word matches the prefix of the pinyin corresponding to the M syllables in the pinyin sequence, adding the target word to the word candidate set corresponding to the i-th moment; wherein i is a positive integer, and the value range is 1-N.
Optionally, the performing long sentence prediction based on the input sequence and the input association information by using a deep learning model, and outputting sentence candidates, includes: inputting the input sequence and the input association information into the deep learning model as model input information of one dimension; extracting features of the model input information of one dimension by using the deep learning model to obtain feature information of one dimension; and performing long sentence prediction according to the feature information of one dimension, and outputting sentence candidates.
Optionally, the performing long sentence prediction based on the input sequence and the input association information by using a deep learning model, and outputting sentence candidates, includes: the input sequence and the input association information are respectively used as model input information of two dimensions and are input into the deep learning model; respectively extracting features of model input information of two dimensions by adopting the deep learning model to obtain feature information of the two dimensions; splicing the characteristic information of the two dimensions; and carrying out long sentence prediction according to the spliced characteristic information, and outputting sentence candidates.
Optionally, the method further comprises the step of training a deep learning model: collecting corpus; performing word granularity division on the corpus to obtain training data; and training the deep learning model by adopting the training data.
Optionally, the method further comprises the step of training the deep learning model: collecting a plurality of sets of training data, each set of training data comprising: history input association information, a history input sequence, sentences input by a user under the conditions of the history input association information and the history input sequence; for a group of training data, taking a history input sequence and history input associated information in the group of training data as model input information of one dimension, and inputting the model input information into the deep learning model; extracting the characteristics of the model input information of one dimension by adopting the deep learning model to obtain the characteristic information of one dimension; performing long sentence prediction according to the feature information of the dimension, and outputting sentence candidates; and adjusting the weight of the deep learning model according to the sentence candidates and sentences input by the user in the set of training data.
Optionally, the method further comprises the step of training the deep learning model: collecting a plurality of sets of training data, each set of training data comprising: history input association information, a history input sequence, sentences input by a user under the conditions of the history input association information and the history input sequence; for a group of training data, taking a history input sequence and history input associated information in the group of training data as model input information of two dimensions, and inputting the model input information into the deep learning model; extracting the characteristics of the model input information of the two dimensions by adopting the deep learning model to obtain the characteristic information of the two dimensions; splicing the characteristic information of the two dimensions; predicting long sentences according to the spliced characteristic information, and outputting sentence candidates; and adjusting the weight of the deep learning model according to the sentence candidates and sentences input by the user in the set of training data.
Fig. 11 is a schematic diagram of an electronic device 1100 for data processing according to another exemplary embodiment of the present invention. The electronic device 1100 may be a server, which may vary widely in configuration or performance, and may include one or more central processing units (CPUs) 1122 (e.g., one or more processors), memory 1132, and one or more storage media 1130 (e.g., one or more mass storage devices) that store applications 1142 or data 1144. The memory 1132 and the storage medium 1130 may be transitory or persistent storage. The program stored on the storage medium 1130 may include one or more modules (not shown), each of which may include a series of instruction operations on the server. Still further, the central processing unit 1122 may be configured to communicate with the storage medium 1130 and to execute, on the server, the series of instruction operations in the storage medium 1130.
The server may also include one or more power supplies 1126, one or more wired or wireless network interfaces 1150, one or more input/output interfaces 1158, one or more keyboards 1156, and/or one or more operating systems 1141, such as Windows Server™, Mac OS X™, Unix™, Linux™, FreeBSD™, etc.
An electronic device comprising a memory, and one or more programs, wherein the one or more programs are stored in the memory and configured to be executed by one or more processors, the one or more programs comprising instructions for: acquiring an input sequence and input association information; and carrying out long sentence prediction based on the input sequence and the input association information by adopting a deep learning model, and outputting sentence candidates.
Optionally, the deep learning model includes an encoding module and a decoding module; the performing long sentence prediction based on the input sequence and the input association information by using a deep learning model, and outputting sentence candidates, includes: encoding the input association information by using the encoding module to obtain encoded information; decoding the encoded information by using the decoding module, selecting, at each moment, the X words with the highest probability from the corresponding word candidate set, and performing output and next-moment prediction based on the X words; wherein the word candidate set is obtained by screening words in a preset word set according to the input sequence, and X is a positive integer; and forming X sentence candidates from the words output at each moment.
Optionally, the input sequence includes a pinyin sequence, and the one or more programs further include instructions for screening words in a preset word set according to the input sequence to obtain a word candidate set: at the i-th moment in the decoding process, determining, for a target word in the preset word set, the length M of the target word; determining the number N of syllables in the pinyin sequence that have been matched before the i-th moment, and matching the M syllables following the N-th syllable in the pinyin sequence against the pinyin corresponding to the M characters in the target word; if the pinyin corresponding to the M characters in the target word matches the prefix of the pinyin corresponding to the M syllables in the pinyin sequence, adding the target word to the word candidate set corresponding to the i-th moment; wherein i is a positive integer, and the value range is 1-N.
Optionally, the performing long sentence prediction based on the input sequence and the input association information by using a deep learning model, and outputting sentence candidates, includes: inputting the input sequence and the input association information into the deep learning model as model input information of one dimension; extracting features of the model input information of one dimension by using the deep learning model to obtain feature information of one dimension; and performing long sentence prediction according to the feature information of one dimension, and outputting sentence candidates.
Optionally, the performing long sentence prediction based on the input sequence and the input association information by using a deep learning model, and outputting sentence candidates, includes: the input sequence and the input association information are respectively used as model input information of two dimensions and are input into the deep learning model; respectively extracting features of model input information of two dimensions by adopting the deep learning model to obtain feature information of the two dimensions; splicing the characteristic information of the two dimensions; and carrying out long sentence prediction according to the spliced characteristic information, and outputting sentence candidates.
Optionally, instructions for training the deep learning model are also included: collecting corpus; performing word granularity division on the corpus to obtain training data; and training the deep learning model by adopting the training data.
Optionally, instructions for training the deep learning model are also included: collecting a plurality of sets of training data, each set of training data comprising: history input association information, a history input sequence, sentences input by a user under the conditions of the history input association information and the history input sequence; for a group of training data, taking a history input sequence and history input associated information in the group of training data as model input information of one dimension, and inputting the model input information into the deep learning model; extracting the characteristics of the model input information of one dimension by adopting the deep learning model to obtain the characteristic information of one dimension; performing long sentence prediction according to the feature information of the dimension, and outputting sentence candidates; and adjusting the weight of the deep learning model according to the sentence candidates and sentences input by the user in the set of training data.
Optionally, instructions for training the deep learning model are also included: collecting a plurality of sets of training data, each set of training data comprising: history input association information, a history input sequence, sentences input by a user under the conditions of the history input association information and the history input sequence; for a group of training data, taking a history input sequence and history input associated information in the group of training data as model input information of two dimensions, and inputting the model input information into the deep learning model; extracting the characteristics of the model input information of the two dimensions by adopting the deep learning model to obtain the characteristic information of the two dimensions; splicing the characteristic information of the two dimensions; predicting long sentences according to the spliced characteristic information, and outputting sentence candidates; and adjusting the weight of the deep learning model according to the sentence candidates and sentences input by the user in the set of training data.
In this specification, the embodiments are described in a progressive manner; each embodiment focuses on its differences from the other embodiments, and for identical or similar parts, the embodiments may be referred to one another.
Embodiments of the present invention are described with reference to flowchart illustrations and/or block diagrams of methods, terminal devices (systems), and computer program products according to embodiments of the invention. It will be understood that each flow and/or block of the flowchart illustrations and/or block diagrams, and combinations of flows and/or blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing terminal device to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing terminal device, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
While preferred embodiments of the present invention have been described, additional variations and modifications in those embodiments may occur to those skilled in the art once they learn of the basic inventive concepts. It is therefore intended that the following claims be interpreted as including the preferred embodiment and all such alterations and modifications as fall within the scope of the embodiments of the invention.
Finally, it is further noted that relational terms such as first and second, and the like are used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Moreover, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or terminal that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or terminal. Without further limitation, an element defined by the phrase "comprising one … …" does not exclude the presence of other like elements in a process, method, article, or terminal device that comprises the element.
The data processing method, data processing apparatus, and electronic device provided by the present invention have been described in detail above. Specific examples are used herein to illustrate the principles and embodiments of the present invention, and the above description of the embodiments is intended only to assist in understanding the method and core idea of the present invention. Meanwhile, those skilled in the art may, in accordance with the idea of the present invention, make changes to the specific embodiments and the scope of application. In view of the above, the content of this description should not be construed as limiting the present invention.
Claims (8)
1. A method of data processing, comprising:
acquiring an input sequence and input association information, wherein the input sequence comprises a pinyin sequence;
encoding the input association information by using an encoding module of a deep learning model to obtain encoded information;
decoding the encoded information by using a decoding module of the deep learning model, selecting, at each moment, the X words with the highest probability from the corresponding word candidate set, and performing output and next-moment prediction based on the X words; wherein the word candidate set is obtained by screening words in a preset word set according to the input sequence, and X is a positive integer;
forming X sentence candidates from the words output at each moment;
at the i-th moment in the decoding process, determining, for a target word in the preset word set, the length M of the target word;
determining the number N of syllables in the pinyin sequence that have been matched before the i-th moment, and matching the M syllables following the N-th syllable in the pinyin sequence against the pinyin corresponding to the M characters in the target word;
if the pinyin corresponding to the M characters in the target word matches the prefix of the pinyin corresponding to the M syllables in the pinyin sequence, adding the target word to the word candidate set corresponding to the i-th moment;
wherein i is a positive integer, and the value range is 1-N.
2. The method of claim 1, wherein performing long sentence prediction based on the input sequence and the input association information by using the deep learning model, and outputting sentence candidates, comprises:
inputting the input sequence and the input association information into the deep learning model as model input information of one dimension;
extracting features of the model input information of one dimension by using the deep learning model to obtain feature information of one dimension;
and performing long sentence prediction according to the feature information of one dimension, and outputting sentence candidates.
3. The method of claim 1, wherein performing long sentence prediction based on the input sequence and the input association information by using the deep learning model, and outputting sentence candidates, comprises:
inputting the input sequence and the input association information into the deep learning model as model input information of two dimensions, respectively;
extracting features of the model input information of the two dimensions respectively by using the deep learning model to obtain feature information of the two dimensions;
splicing the feature information of the two dimensions;
and performing long sentence prediction according to the spliced feature information, and outputting sentence candidates.
4. The method of claim 1, further comprising the step of training a deep learning model:
Collecting corpus;
performing word granularity division on the corpus to obtain training data;
and training the deep learning model by adopting the training data.
5. The method of claim 2, further comprising a step of training the deep learning model:
collecting a plurality of sets of training data, each set of training data comprising: historical input association information, a historical input sequence, and the sentence input by the user given that historical input association information and historical input sequence;
for one set of training data, taking the historical input sequence and the historical input association information in the set as model input information of one dimension, and inputting it into the deep learning model;
extracting features from the model input information of one dimension by using the deep learning model, to obtain feature information of one dimension;
performing long sentence prediction according to the feature information of one dimension, and outputting sentence candidates;
and adjusting the weights of the deep learning model according to the sentence candidates and the sentence input by the user in the set of training data.
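The weight adjustment in claim 5 can be illustrated with a much simpler stand-in. The sketch below swaps the deep learning model for a structured-perceptron-style linear scorer, which is an assumption made purely so the update is easy to follow: when the predicted candidate differs from the sentence the user actually entered, weights tied to the user's sentence are raised and weights tied to the wrong prediction are lowered. All names are hypothetical.

```python
from collections import defaultdict
from typing import DefaultDict, List, Tuple

Weights = DefaultDict[Tuple[str, str], float]


def score(weights: Weights, context: List[str], sentence: List[str]) -> float:
    """Sum the weights of every (context token, sentence word) pair."""
    return sum(weights[(c, w)] for c in context for w in sentence)


def predict(weights: Weights, context: List[str],
            candidates: List[List[str]]) -> List[str]:
    """Return the highest-scoring sentence candidate (first one wins ties)."""
    return max(candidates, key=lambda s: score(weights, context, s))


def train_step(weights: Weights, context: List[str], candidates: List[List[str]],
               user_sentence: List[str], lr: float = 1.0) -> List[str]:
    """Predict, then adjust weights if the prediction differs from the user's sentence."""
    predicted = predict(weights, context, candidates)
    if predicted != user_sentence:
        for c in context:
            for w in user_sentence:
                weights[(c, w)] += lr  # reward what the user actually typed
            for w in predicted:
                weights[(c, w)] -= lr  # penalize the wrong candidate
    return predicted
```

Here `context` plays the role of the single-dimension model input (historical input sequence plus historical association information). A deep learning model would instead compute a loss between its output and the user's sentence and update its weights by backpropagation.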
6. A data processing apparatus, comprising:
an acquisition module, configured to acquire an input sequence and input association information, wherein the input sequence comprises a pinyin sequence;
a prediction module, configured to: encode the input association information by using an encoding module of a deep learning model to obtain encoded information; decode the encoded information by using a decoding module of the deep learning model, selecting, at each moment, the X words with the highest probability from the word candidate set corresponding to that moment, outputting them, and predicting the next moment based on the X words, wherein the word candidate set is obtained by screening words in a preset word set according to the input sequence, and X is a positive integer; and form X sentence candidates from the words output at each moment;
a screening module, configured to: at the ith moment in the decoding process, for a target word in the preset word set, determine the length M of the target word; determine the number N of syllables in the pinyin sequence that have been matched before the ith moment, and match the pinyin of the M syllables following the Nth syllable in the pinyin sequence against the pinyin of the M characters in the target word; and if the pinyin of the M characters in the target word prefix-matches the pinyin of the M syllables in the pinyin sequence, add the target word to the word candidate set corresponding to the ith moment; wherein i is a positive integer with a value range of 1 to N.
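The screening step can be sketched as a small filter over a hypothetical lexicon that maps each word to the pinyin of its characters. Two points are assumptions of this sketch rather than statements of the claim: "prefix match" is read as "each typed syllable is a prefix of the corresponding character's full pinyin" (so partially typed syllables still match), and a word is rejected when fewer than M syllables remain.

```python
from typing import Dict, List


def matches_remaining_pinyin(word_pinyin: List[str], syllables: List[str],
                             matched: int) -> bool:
    """Check a word of length M against the M syllables after the first `matched` ones."""
    m = len(word_pinyin)                        # M: length of the target word
    window = syllables[matched:matched + m]     # the M syllables after the Nth
    if m == 0 or len(window) < m:
        return False                            # not enough typed syllables left
    return all(full.startswith(typed) for full, typed in zip(word_pinyin, window))


def screen_candidates(lexicon: Dict[str, List[str]], syllables: List[str],
                      matched: int) -> List[str]:
    """Build the word candidate set for the current decoding moment."""
    return [word for word, pinyin in lexicon.items()
            if matches_remaining_pinyin(pinyin, syllables, matched)]
```

With the typed sequence `ni hao shi jie` and nothing matched yet (N = 0), words pronounced `ni` or `ni hao` survive screening; once two syllables are matched (N = 2), only words matching `shi jie` remain.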
7. An electronic device comprising a memory, and one or more programs, wherein the one or more programs are stored in the memory and configured to be executed by one or more processors, the one or more programs comprising instructions for:
acquiring an input sequence and input association information, wherein the input sequence comprises a pinyin sequence;
encoding the input association information by using an encoding module of a deep learning model to obtain encoded information;
decoding the encoded information by using a decoding module of the deep learning model, selecting, at each moment, the X words with the highest probability from the word candidate set corresponding to that moment, outputting them, and predicting the next moment based on the X words; wherein the word candidate set is obtained by screening words in a preset word set according to the input sequence, and X is a positive integer;
forming X sentence candidates from the words output at each moment;
at the ith moment in the decoding process, for a target word in the preset word set, determining the length M of the target word;
determining the number N of syllables in the pinyin sequence that have been matched before the ith moment, and matching the pinyin of the M syllables following the Nth syllable in the pinyin sequence against the pinyin of the M characters in the target word;
if the pinyin of the M characters in the target word prefix-matches the pinyin of the M syllables in the pinyin sequence, adding the target word to the word candidate set corresponding to the ith moment;
wherein i is a positive integer with a value range of 1 to N.
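Taken together, the decoding steps resemble a beam search constrained by the typed pinyin. The sketch below is one possible reading, not the patented implementation: `score` is a hypothetical stand-in for the decoder's log-probability of the next word, the lexicon is a toy dictionary, and keeping the top X hypotheses overall at each step is an assumption about how "the first X words" are selected.

```python
from typing import Callable, Dict, List, Tuple

# (log-probability, words so far, number of syllables matched so far)
Hypothesis = Tuple[float, List[str], int]


def candidates_at(lexicon: Dict[str, List[str]], syllables: List[str],
                  matched: int) -> List[str]:
    """Screen the lexicon against the syllables that follow the matched prefix."""
    out = []
    for word, pinyin in lexicon.items():
        window = syllables[matched:matched + len(pinyin)]
        if pinyin and len(window) == len(pinyin) and all(
                p.startswith(s) for p, s in zip(pinyin, window)):
            out.append(word)
    return out


def beam_decode(syllables: List[str], lexicon: Dict[str, List[str]],
                score: Callable[[List[str], str], float],
                beam_width: int) -> List[str]:
    """Return up to `beam_width` sentence candidates that cover all syllables."""
    beam: List[Hypothesis] = [(0.0, [], 0)]
    finished: List[Hypothesis] = []
    while beam:
        expanded: List[Hypothesis] = []
        for logp, words, matched in beam:
            for word in candidates_at(lexicon, syllables, matched):
                expanded.append((logp + score(words, word), words + [word],
                                 matched + len(lexicon[word])))
        expanded.sort(key=lambda h: h[0], reverse=True)
        beam = []
        for hyp in expanded[:beam_width]:       # keep the top X at this moment
            (finished if hyp[2] == len(syllables) else beam).append(hyp)
    finished.sort(key=lambda h: h[0], reverse=True)
    return ["".join(words) for _, words, _ in finished[:beam_width]]
```

Each hypothesis tracks its own count of matched syllables, so the candidate set at a given moment depends on which words that hypothesis has already consumed.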
8. A readable storage medium, wherein instructions in the storage medium, when executed by a processor of an electronic device, enable the electronic device to perform the data processing method according to any one of claims 1-5.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010366764.8A CN113589946B (en) | 2020-04-30 | 2020-04-30 | Data processing method and device and electronic equipment |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010366764.8A CN113589946B (en) | 2020-04-30 | 2020-04-30 | Data processing method and device and electronic equipment |
Publications (2)
Publication Number | Publication Date |
---|---|
CN113589946A (en) | 2021-11-02 |
CN113589946B (en) | 2024-07-26 |
Family
ID=78237674
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202010366764.8A Active CN113589946B (en) | 2020-04-30 | 2020-04-30 | Data processing method and device and electronic equipment |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN113589946B (en) |
Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110427617A (en) * | 2019-07-22 | 2019-11-08 | 阿里巴巴集团控股有限公司 | The generation method and device of pushed information |
CN110673748A (en) * | 2019-09-27 | 2020-01-10 | 北京百度网讯科技有限公司 | Method and device for providing candidate long sentences in input method |
Family Cites Families (12)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN100530171C (en) * | 2005-01-31 | 2009-08-19 | 日电(中国)有限公司 | Dictionary learning method and device |
CN102866782B (en) * | 2011-07-06 | 2015-05-20 | 哈尔滨工业大学 | Input method and input method system for improving sentence generating efficiency |
CN105718070A (en) * | 2016-01-16 | 2016-06-29 | 上海高欣计算机系统有限公司 | Pinyin long sentence continuous type-in input method and Pinyin long sentence continuous type-in input system |
CN107688397B (en) * | 2016-08-03 | 2022-10-21 | 北京搜狗科技发展有限公司 | Input method, system and device for inputting |
KR102426435B1 (en) * | 2016-11-29 | 2022-07-29 | 삼성전자주식회사 | Apparatus and method for providing a sentence based on a user input |
CN107329585A (en) * | 2017-06-28 | 2017-11-07 | 北京百度网讯科技有限公司 | Method and apparatus for inputting word |
CN109032375B (en) * | 2018-06-29 | 2022-07-19 | 北京百度网讯科技有限公司 | Candidate text sorting method, device, equipment and storage medium |
CN110874145A (en) * | 2018-08-30 | 2020-03-10 | 北京搜狗科技发展有限公司 | Input method and device and electronic equipment |
CN110908523B (en) * | 2018-09-14 | 2024-08-20 | 北京搜狗科技发展有限公司 | Input method and device |
KR20180118096A (en) * | 2018-10-22 | 2018-10-30 | 김기주 | The method of changing a part of character in the characters which is allocated on the button of keyboard |
CN110187780B (en) * | 2019-06-10 | 2023-07-21 | 北京百度网讯科技有限公司 | Long text prediction method, long text prediction device, long text prediction equipment and storage medium |
CN110286778B (en) * | 2019-06-27 | 2023-08-15 | 北京金山安全软件有限公司 | Chinese deep learning input method, device and electronic equipment |
-
2020
- 2020-04-30 CN CN202010366764.8A patent/CN113589946B/en active Active
Also Published As
Publication number | Publication date |
---|---|
CN113589946A (en) | 2021-11-02 |
Similar Documents
Publication | Title | Publication Date |
---|---|---|
CN107291690B (en) | Punctuation adding method and device and punctuation adding device | |
CN107247519B (en) | Input method and device | |
CN108628813B (en) | Processing method and device for processing | |
CN111831806B (en) | Semantic integrity determination method, device, electronic equipment and storage medium | |
CN107274903B (en) | Text processing method and device for text processing | |
CN107564526B (en) | Processing method, apparatus and machine-readable medium | |
CN110069624B (en) | Text processing method and device | |
CN112036195A (en) | Machine translation method, device and storage medium | |
CN110069143B (en) | Information error correction preventing method and device and electronic equipment | |
CN113539233B (en) | Voice processing method and device and electronic equipment | |
CN112735396A (en) | Speech recognition error correction method, device and storage medium | |
CN107422872B (en) | Input method, input device and input device | |
CN110858099B (en) | Candidate word generation method and device | |
CN110908523B (en) | Input method and device | |
CN113589946B (en) | Data processing method and device and electronic equipment | |
CN111324214B (en) | Statement error correction method and device | |
CN113589954B (en) | Data processing method and device and electronic equipment | |
CN109979435B (en) | Data processing method and device for data processing | |
CN116484828A (en) | Similar case determining method, device, apparatus, medium and program product | |
CN112989819B (en) | Chinese text word segmentation method, device and storage medium | |
CN111382566B (en) | Site theme determining method and device and electronic equipment | |
CN113589949A (en) | Input method and device and electronic equipment | |
CN113589948B (en) | Data processing method and device and electronic equipment | |
CN113589947B (en) | Data processing method and device and electronic equipment | |
CN111103986B (en) | User word stock management method and device, and user word stock input method and device |
Legal Events
Code | Title | Description |
---|---|---|
PB01 | Publication | |
SE01 | Entry into force of request for substantive examination | |
GR01 | Patent grant | |