
WO2022222224A1 - Deep learning model-based data augmentation method and apparatus, device, and medium - Google Patents

Deep learning model-based data augmentation method and apparatus, device, and medium Download PDF

Info

Publication number
WO2022222224A1
Authority
WO
WIPO (PCT)
Prior art keywords
data
original
parameter list
replacement
model
Prior art date
Application number
PCT/CN2021/096475
Other languages
French (fr)
Chinese (zh)
Inventor
李鹏宇
李剑锋
陈又新
肖京
Original Assignee
平安科技(深圳)有限公司
Priority date
Filing date
Publication date
Application filed by 平安科技(深圳)有限公司
Publication of WO2022222224A1

Links

Images

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/279 Recognition of textual entities
    • G06F40/289 Phrasal analysis, e.g. finite state techniques or chunking
    • G06F40/295 Named entity recognition
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33 Querying
    • G06F16/3331 Query processing
    • G06F16/334 Query execution
    • G06F16/3344 Query execution using natural language analysis
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/237 Lexical tools
    • G06F40/247 Thesauruses; Synonyms

Definitions

  • the present application relates to the field of artificial intelligence, and in particular, to a data enhancement method, apparatus, device and medium based on a deep learning model.
  • NER: Named Entity Recognition.
  • The data augmentation model for a named entity recognition model mainly replaces the entity words in the training data through different data augmentation methods and the parameters corresponding to those methods.
  • For example, the entity words in the training data are subjected to synonym replacement, random insertion, random position exchange, and random deletion to increase the scale and diversity of the training data.
  • The augmentation effect of the data augmentation model on the training data is inseparable from the model parameters. However, the model parameters of existing data augmentation models are determined by experience or by the parameter optimization method of grid search, with little interaction with the named entity recognition model. As a result, the data augmentation model has a poor augmentation effect on the training data.
  • the present application provides a data enhancement method, device, equipment and medium based on a deep learning model.
  • It aims to solve the problem that the model parameters of the data enhancement model are determined by experience or by the parameter optimization method of grid search, resulting in a poor enhancement effect of the data enhancement model on the training data.
  • a data augmentation method based on a deep learning model including:
  • Use the multiple training sets for training to obtain multiple recognition models, and use the original test data as a test set to test the multiple recognition models, so as to determine whether any of the multiple recognition models satisfies the convergence condition.
  • Data augmentation is performed on the original training data by using the target data augmentation parameter list to obtain a training set of a named entity recognition model.
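The claimed loop can be sketched as follows. This is an illustrative Python sketch with hypothetical function and parameter names; the `train_and_score` placeholder stands in for training and testing a real recognition model, which the patent does not specify in code:

```python
import random

def augment(data, params):
    # Placeholder transform: tag each sentence with the parameter list
    # that produced it (a stand-in for real augmentation methods).
    return [f"{s} [aug:{params['p_syn']:.2f}]" for s in data]

def train_and_score(train_set, test_set):
    # Placeholder for "train a recognition model and test it"; here the
    # score simply rewards larger training sets, standing in for a real
    # F1-style metric on the test set.
    return min(1.0, len(train_set) / 20.0)

def search_augmentation_params(orig_train, orig_test, n_lists=5, target=0.9, seed=0):
    """Sketch of steps S20-S60: sample parameter lists, build training sets,
    score one model per list, and return the best list once one converges."""
    rng = random.Random(seed)
    best_params, best_score = None, -1.0
    for _ in range(n_lists):
        params = {"p_syn": rng.uniform(0.1, 0.5)}      # S20: random init
        constructed = augment(orig_train, params)      # S30: construct data
        train_set = orig_train + constructed           # S30: mix
        score = train_and_score(train_set, orig_test)  # S40: train + test
        if score > best_score:
            best_params, best_score = params, score
        if score >= target:                            # S50: convergence check
            return params                              # S60: target list
    return best_params
```

The convergence check here is a simple score threshold; the patent's actual convergence condition on the recognition models is not detailed at this point in the text.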
  • a data enhancement device based on a deep learning model comprising:
  • an acquisition module used for acquiring the manually marked original training data and original test data, and acquiring the original parameter list, where the original parameter list is composed of the data enhancement method and the enhancement parameters corresponding to the data enhancement method;
  • an initialization module for randomly initializing the enhanced parameters in the original parameter list according to the artificial fish swarm algorithm to obtain multiple optimized parameter lists
  • a conversion module configured to convert the original training data using each of the optimized parameter lists to obtain corresponding artificially constructed data, and mix the original training data with the corresponding artificially constructed data to obtain multiple training sets;
  • a test module configured to use the multiple training sets for training to obtain multiple recognition models, and to use the original test data as a test set to test the multiple recognition models, so as to determine whether any of the multiple recognition models satisfies the convergence condition;
  • an output module configured to, if there is a model satisfying the convergence condition among the multiple recognition models, output the optimized parameter list corresponding to that model as a target data enhancement parameter list;
  • An enhancement module configured to perform data enhancement on the original training data by using the target data enhancement parameter list to obtain a training set of a named entity recognition model.
  • a computer device comprising a memory, a processor, and computer-readable instructions stored in the memory and executable on the processor, the processor implementing the following steps when executing the computer-readable instructions:
  • Use the multiple training sets for training to obtain multiple recognition models, and use the original test data as a test set to test the multiple recognition models, so as to determine whether any of the multiple recognition models satisfies the convergence condition.
  • Data augmentation is performed on the original training data by using the target data augmentation parameter list to obtain a training set of a named entity recognition model.
  • One or more readable storage media storing computer-readable instructions that, when executed by one or more processors, cause the one or more processors to perform the following steps:
  • Use the multiple training sets for training to obtain multiple recognition models, and use the original test data as a test set to test the multiple recognition models, so as to determine whether any of the multiple recognition models satisfies the convergence condition.
  • Data augmentation is performed on the original training data by using the target data augmentation parameter list to obtain a training set of a named entity recognition model.
  • An artificial fish swarm algorithm suitable for the coexistence of discrete and continuous values is used to randomly initialize the enhancement parameters in the original parameter list, and the recognition effect of the recognition model is taken as the optimization target and integrated into the formulation of the data enhancement strategy. In this way, a data enhancement list with a better effect is obtained at a smaller cost, thereby improving the data enhancement effect of the list on the data.
  • FIG. 1 is a schematic diagram of an application environment of a data enhancement method based on a deep learning model in an embodiment of the present application
  • FIG. 2 is a schematic flowchart of a data enhancement method based on a deep learning model in an embodiment of the present application
  • FIG. 3 is another schematic flowchart of a data enhancement method based on a deep learning model in an embodiment of the present application
  • FIG. 4 is an implementation flowchart of step S30 in FIG. 2;
  • FIG. 5 is an implementation flowchart of step S33 in FIG. 4;
  • FIG. 6 is another implementation flowchart of step S30 in FIG. 2;
  • FIG. 7 is an implementation flowchart of step S50 in FIG. 2;
  • FIG. 8 is an implementation flowchart of step S52 in FIG. 7;
  • FIG. 9 is a schematic structural diagram of a data enhancement device based on a deep learning model in an embodiment of the present application.
  • FIG. 10 is a schematic structural diagram of a computer device in an embodiment of the present application.
  • The data enhancement method based on the deep learning model provided by the embodiments of the present application can be applied in the application environment shown in FIG. 1, in which the terminal device communicates with the server through the network.
  • the server obtains the manually labeled original training data and original test data sent by the user through the terminal device, and obtains the original parameter list sent by the user through the terminal device.
  • the original parameter list consists of the data enhancement method and the enhancement parameters corresponding to the data enhancement method.
  • The enhancement parameters in the original parameter list are randomly initialized to obtain multiple optimized parameter lists, and each optimized parameter list is used to transform the original training data to obtain the corresponding artificially constructed data.
  • The original training data is mixed with the corresponding artificially constructed data to obtain multiple training sets; the multiple training sets are used to train multiple recognition models, and the original test data is used as the test set to test the multiple recognition models, so as to determine whether any of them satisfies the convergence condition.
  • If so, the optimized parameter list corresponding to the model that satisfies the convergence condition is output as the target data enhancement parameter list, and the target data enhancement parameter list is used to perform data enhancement on the original training data to obtain the training set of the named entity recognition model.
  • The artificial fish swarm algorithm, which is suitable for the coexistence of discrete and continuous values, is used to randomly initialize the enhancement parameters in the original parameter list, and the recognition effect of the recognition model is integrated into the formulation of the data augmentation strategy as the optimization objective. A data augmentation list with a better effect is thus obtained at a small cost, which ensures the data diversity of the training set of the named entity recognition model and expands the scale of the training set, thereby improving the recognition accuracy of the named entity recognition model and realizing the training data enhancement of named entity recognition.
  • The relevant data used or produced by the deep learning model-based data enhancement method is stored in the database of the server. In this embodiment, the database is deployed on a blockchain network, which stores the data used and generated by the method, such as the original training data, the original test data, the original parameter list, the artificially constructed data, the optimized parameter lists, and the related data of the multiple recognition models.
  • The blockchain referred to in this application is a new application mode of computer technologies such as distributed data storage, point-to-point transmission, consensus mechanisms, and encryption algorithms.
  • A blockchain is essentially a decentralized database: a series of data blocks linked by cryptographic methods.
  • Each data block contains a batch of network transaction information, which is used to verify the validity of the information (anti-counterfeiting) and to generate the next block.
  • The blockchain can include an underlying blockchain platform, a platform product service layer, and an application service layer. Deploying the database on the blockchain can improve the security of data storage.
  • the terminal device can be, but not limited to, various personal computers, notebook computers, smart phones, tablet computers and portable wearable devices.
  • the server can be implemented as an independent server or a server cluster composed of multiple servers.
  • a data enhancement method based on a deep learning model is provided, and the method is applied to the server in FIG. 1 as an example for description, including the following steps:
  • The original parameter list in this embodiment is a data enhancement model.
  • The data enhancement model is composed of data enhancement methods and the enhancement parameters corresponding to those methods.
  • The data enhancement performance of the model depends on the enhancement methods and their corresponding enhancement parameters.
  • Therefore, before using the data enhancement model, it is necessary to optimize the parameters of the existing model to improve its enhancement performance on the training data, so as to ensure the quality of the subsequent training data.
  • After obtaining the manually labeled original training data and original test data, and obtaining the original parameter list, the artificial fish swarm algorithm, which converges quickly and is suitable for the coexistence of discrete and continuous values, is used as the framework to randomly initialize the enhancement parameters, obtaining multiple optimized parameter lists.
  • Each optimized parameter list is used to transform the original training data to obtain the corresponding artificially constructed data, and the original training data and the corresponding artificially constructed data are randomly shuffled together to obtain multiple training sets.
  • each optimization parameter list is used to convert the original training data, and L pieces of corresponding artificially constructed data are obtained, and each piece of artificially constructed data corresponds to an optimization parameter list.
  • the original training data is mixed with each artificially constructed data to obtain L training sets.
  • S40 Use multiple training sets to train to obtain multiple recognition models, and use the original test data as a test set to test the multiple recognition models.
  • S50 Determine, according to the test result, whether there is a model that satisfies the convergence condition in the plurality of identification models.
  • After testing the multiple recognition models with the original test data as the test set, it is determined, according to the recognition effect of each recognition model on each entity word in the test set (that is, the test result), whether there is a model satisfying the convergence condition among the multiple recognition models.
  • the multiple recognition models may be traditional entity recognition models.
  • the optimization parameter list corresponding to the training set is determined to be the data enhancement list that meets the data enhancement requirements, and the corresponding optimization parameter list is output as the target data enhancement parameter list.
  • S70 Perform data enhancement on the original training data by using the target data enhancement parameter list to obtain a training set of the named entity recognition model.
  • After the corresponding optimized parameter list is output as the target data enhancement parameter list, the target data enhancement parameter list is used to perform data enhancement on the original training data, and the enhanced data is then randomly shuffled together with the original training data.
  • By mixing them into the training set of the named entity recognition model, a relatively accurate named entity recognition model can be obtained, thereby ensuring its recognition accuracy.
  • The artificial fish swarm algorithm is a swarm intelligence optimization algorithm, similar to particle swarm optimization, which regards particles as fish trying to reach the position with the highest food concentration in a body of water, thereby improving their living conditions.
  • The particles, i.e. the artificial fish, are the enhancement parameters in the original parameter list to be randomly initialized.
  • The food concentration is the cost function or loss function of the recognition model.
  • The swimming of the artificial fish during the operation of the algorithm is the process by which the enhancement parameters in the original parameter list gradually approach the optimal position, making the cost function or loss function approach its lowest value.
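As a rough illustration of this analogy, the prey behaviour of an artificial fish over a mixed search space can be sketched as follows. The toy objective and the parameter bounds are assumptions; a full artificial fish swarm algorithm also includes swarm and follow behaviours, crowding factors, and step-size control, which are omitted here:

```python
import random

def afsa_minimize(loss, n_fish=10, visual=0.3, steps=50, seed=0):
    """Minimal artificial-fish-swarm sketch for a mixed search space:
    one continuous parameter (a replacement probability in [0, 1]) and
    one discrete parameter (a category index in {0, 1, 2}).
    Each fish performs prey behaviour: probe a random neighbour within
    its visual range and move there only if the 'food concentration'
    improves, i.e. the loss decreases."""
    rng = random.Random(seed)
    fish = [[rng.random(), rng.randrange(3)] for _ in range(n_fish)]
    for _ in range(steps):
        for f in fish:
            cand = [min(1.0, max(0.0, f[0] + rng.uniform(-visual, visual))),
                    rng.randrange(3)]  # discrete dimension: resample a category
            if loss(cand) < loss(f):   # move only toward higher food concentration
                f[0], f[1] = cand
    return min(fish, key=loss)         # best fish found by the swarm
```

Because moves are accepted only when the loss decreases, each fish's loss is monotonically non-increasing, mirroring the text's description of the parameters gradually approaching the position where the cost function is lowest.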
  • The parameters α1 to α5 and the value p_syn in the original parameter list are combined to form a search space in which continuous values and discrete values coexist.
  • the original parameter list includes the data enhancement method and the corresponding enhancement parameters.
  • the artificial fish swarm algorithm is used to iteratively optimize the enhancement parameters of the original parameter list to obtain an optimized parameter list.
  • The original training data is processed to obtain artificially constructed data, which is then mixed with the original training data to obtain a high-quality training set at a lower cost, ensuring the recognition accuracy of the named entity recognition model.
  • the manually labeled original training data and original test data are obtained, and the original parameter list is obtained.
  • the original parameter list is composed of the data enhancement method and the enhancement parameters corresponding to the data enhancement method.
  • The enhancement parameters in the original parameter list are randomly initialized to obtain multiple optimized parameter lists; each optimized parameter list is used to transform the original training data to obtain the corresponding artificially constructed data, and the original training data is mixed with the corresponding artificially constructed data.
  • the optimization parameter list corresponding to the model that satisfies the convergence condition is output as the target data enhancement parameter list, and the original training data is enhanced by using the target data enhancement parameter list.
  • an artificial fish swarm algorithm suitable for the coexistence of discrete values and continuous values is used to randomly initialize the enhanced parameters in the original parameter list, and the recognition effect of the recognition model is used as the optimization target.
  • This embodiment can support the extension of the data enhancement methods and obtain different data enhancement lists according to users' needs, so that a larger amount of model training data can be obtained, further ensuring the accuracy of the model.
  • In an embodiment, the data enhancement method includes a synonym replacement method. As shown in FIG. 3, after step S50, that is, after determining according to the test result whether there is a model that satisfies the convergence condition among the multiple recognition models, the method further includes the following steps:
  • If there is no model satisfying the convergence condition, the optimized parameter lists do not sufficiently augment the original training data.
  • The enhancement parameters in the original parameter list are randomly initialized again according to the artificial fish swarm algorithm, so as to train multiple recognition models according to the re-initialized optimized parameter lists and test them, until the target data enhancement parameter list is obtained.
  • When the enhancement parameters in the original parameter list are randomly initialized again according to the artificial fish swarm algorithm, it is necessary to record and count the number of repeated random initializations of the enhancement parameters in the original parameter list.
  • S90 Determine whether the number of random initializations for the enhancement parameters of the original parameter list is less than a preset number of times.
  • If the number of random initializations of the enhancement parameters of the original parameter list is less than the preset number, the target data enhancement parameter list has not yet been determined. Steps S30-S70 then need to be performed repeatedly: multiple new recognition models are retrained according to the randomly initialized optimized parameter lists and tested, until the target data enhancement parameter list is obtained, and the training set of the named entity recognition model is obtained by augmenting the original training data with the target data enhancement parameter list.
  • The enhancement parameters in the original parameter list are again randomly initialized according to the artificial fish swarm algorithm to obtain multiple re-initialized optimized parameter lists, and the initializations are counted to determine whether the number of random initializations of the enhancement parameters of the original parameter list is less than the preset number, which determines whether to randomly initialize the enhancement parameters again.
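The retry logic of steps S80-S90 amounts to a bounded re-initialization loop, which can be sketched as follows. The helper `run_once` is a hypothetical stand-in for one full S20-S60 pass, returning a parameter list when a model converges and None otherwise:

```python
def search_with_budget(run_once, max_inits=5):
    """Sketch of steps S80-S90: re-initialize and retry until a parameter
    list converges or the preset number of initializations is reached.

    run_once(attempt) stands for one S20-S60 pass; it returns the target
    data enhancement parameter list, or None when no recognition model
    met the convergence condition on this attempt."""
    for attempt in range(1, max_inits + 1):
        result = run_once(attempt)
        if result is not None:
            return result, attempt  # converged within the budget
    return None, max_inits          # budget exhausted without convergence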
  • the data enhancement method includes a synonym replacement method. As shown in FIG. 4 , in step S30, each optimization parameter list is used to convert the original training data, which specifically includes the following steps:
  • S31 Determine enhancement parameters corresponding to the synonym replacement method in the optimization parameter list, where the enhancement parameters corresponding to the synonym replacement method include entity word category replacement probability and entity word replacement category.
  • the data enhancement method in the optimization parameter list includes a synonym replacement method
  • the enhancement parameters corresponding to the synonym replacement method are determined in the optimized parameter list, wherein the enhancement parameters corresponding to the synonym replacement method include the entity word category replacement probability and the entity word replacement category.
  • S32 Acquire a preset synonym dictionary pre-built by the user according to requirements.
  • In the preset synonym dictionary, entity words of the same entity category whose synonymous relationship is not prohibited are used as synonyms of each other.
  • The dictionary is pre-built by users according to their needs and includes entity words of different entity categories.
  • Entity words of the same entity category are used as synonyms of each other; the preset synonym dictionary also prohibits the synonymous relationship between specific entity words, and entity words whose synonymous relationship is prohibited cannot be used as synonyms of each other.
  • The scale of the preset synonym dictionary is increased by relaxing the judgment conditions for synonyms, treating entity words of the same entity category as synonyms: if replacing word A in a sentence with word B still yields a sentence that is semantically and grammatically reasonable, then word B belongs to the same entity category as word A and is a synonym of word A. The entity words of the same category are collected to form the preset synonym dictionary.
  • For example, in the sentence "Sun Wukong is pressed under the Five Elements Mountain", Sun Wukong can be replaced by other person names such as the Bull Demon King; then Sun Wukong, the Tathagata Buddha, and the Bull Demon King are synonyms of each other.
  • the quality of the preset synonym dictionary is improved by prohibiting the synonymous relationship between specific words.
  • Otherwise, the semantics or grammar of the sentence would change unreasonably.
  • For example, in the sentence "Sun Wukong is pressed under the Five Elements Mountain", if the Five Elements Mountain is replaced by the Yellow River, the sentence becomes "Sun Wukong is pressed under the Yellow River", which is not reasonable. Therefore, the synonymous relationship between the Five Elements Mountain and the Yellow River is prohibited in the preset synonym dictionary, and during synonym replacement they cannot replace each other as synonyms.
  • The above explanation of synonyms, based on the sentence "Sun Wukong is pressed under the Five Elements Mountain" and using the Tathagata Buddha, the Bull Demon King, and the Yellow River as entity words, is only an exemplary description. In other embodiments, other sentences and entity words can also be used as examples.
  • The synonyms in the preset synonym dictionary can exist in the form of Table 2. Table 2 includes four columns: the first column is the serial number, the second and third columns are two different words, word A and word B, and the fourth column is the replacement relationship between word A and word B. If word B can replace word A, word A and word B are synonyms of each other; if word B cannot replace word A, they are not synonyms of each other.
  • The content of the preset synonym dictionary is shown in Table 2 below:
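Since the rows of Table 2 are not reproduced on this page, the following sketch uses hypothetical rows in the same four-column spirit (the serial number is implied by list position; the entity words are taken from the example above):

```python
# Hypothetical rows in the spirit of Table 2: (word A, word B, replaceable?).
SYNONYM_TABLE = [
    ("Sun Wukong", "Bull Demon King", True),
    ("Sun Wukong", "Tathagata Buddha", True),
    ("Five Elements Mountain", "Yellow River", False),  # prohibited pair
]

def build_synonym_dict(rows):
    """Turn the table into a symmetric lookup of allowed synonyms,
    dropping pairs whose synonymous relationship is prohibited."""
    allowed = {}
    for word_a, word_b, ok in rows:
        if ok:
            allowed.setdefault(word_a, set()).add(word_b)
            allowed.setdefault(word_b, set()).add(word_a)
    return allowed
```

The symmetric insertion reflects the text's statement that allowed words are synonyms "of each other", while prohibited pairs simply never enter the lookup.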
  • S33 Perform synonym replacement on entity words in the original training data according to the preset synonym dictionary, entity word category replacement probability, and entity word replacement category.
  • The entity word category replacement probability is the probability with which an entity word belonging to a replacement category is replaced.
  • The enhancement parameters corresponding to the synonym replacement method, including the entity word category replacement probability and the entity word replacement category, are determined in the optimized parameter list, and a preset synonym dictionary pre-built by the user according to requirements is obtained.
  • In the preset synonym dictionary, entity words of the same entity category whose synonymous relationship is not prohibited are regarded as synonyms of each other, and the entity words in the original training data are replaced by synonyms according to the preset synonym dictionary, the entity word category replacement probability, and the entity word replacement category. This refines the step of using each optimized parameter list to transform the original training data.
  • By relaxing the judgment conditions for synonyms, the scale of the preset synonym dictionary is expanded and the diversity of the artificially constructed data is improved; by prohibiting specific synonymous relationships, the quality of the preset synonym dictionary is continuously improved, thereby ensuring the quality of the artificially constructed data.
  • In step S33, that is, performing synonym replacement on the entity words in the original training data according to the preset synonym dictionary, the entity word category replacement probability, and the entity word replacement category, the following steps are specifically included:
  • S331 Determine whether the category of each entity word in the original training data belongs to the entity word replacement category.
  • It is determined whether the category of each entity word in the original training data belongs to the entity word replacement category. If the category of an entity word belongs to the entity word replacement category, the entity word needs to be replaced by a synonym, and all synonyms of the entity word are obtained from the preset synonym dictionary for the subsequent replacement.
  • S333 Determine whether the synonymous relationship between the entity word and the synonym of the entity word is prohibited.
  • If so, the synonym is skipped, that is, the synonym is not used as the replacement for the entity word.
  • For example, suppose the entity word replacement category includes three categories: person name, place name, and institution name, and the entity word category replacement probability is 30%.
  • Each person's name in a sentence of the original training data then has a 30% probability of being replaced with one of its synonyms in the preset synonym dictionary; if a synonym's relationship with the person's name is prohibited, that synonym is skipped and another synonym is used to replace the person's name.
  • In other embodiments, the entity word replacement categories and the entity word category replacement probabilities can also take other values.
  • If the category of an entity word in the original training data does not belong to the entity word replacement category, the entity word does not need to be replaced by a synonym, and the other data augmentation methods in the optimized parameter list are performed instead.
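Steps S331-S333 can be sketched as follows. This is an illustrative implementation, not the patent's own code; the function and parameter names are assumptions, and tokens are treated one-to-one with category labels for simplicity:

```python
import random

def synonym_replace(tokens, categories, syn_dict, prohibited,
                    replace_cats, p_cat, rng=None):
    """Sketch of steps S331-S333: for each entity word whose category is in
    `replace_cats`, replace it with probability `p_cat` (the entity word
    category replacement probability) by a synonym from `syn_dict`,
    skipping any synonym whose pairing with the word is prohibited."""
    rng = rng or random.Random(0)
    out = []
    for tok, cat in zip(tokens, categories):
        if cat in replace_cats and rng.random() < p_cat:   # S331: category check
            # S333: filter out prohibited synonymous relationships.
            options = [s for s in syn_dict.get(tok, [])
                       if (tok, s) not in prohibited and (s, tok) not in prohibited]
            out.append(rng.choice(options) if options else tok)
        else:
            out.append(tok)                                # not a replacement category
    return out
```

When every synonym of a word is prohibited, the word is left unchanged, mirroring the text's rule that prohibited synonyms are skipped rather than used.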
  • In an embodiment, the data enhancement method further includes a random replacement method, a random deletion method, a random exchange method, and a long sentence construction method. As shown in FIG. 6, after step S33, that is, after performing synonym replacement on the entity words in the original training data,
  • the method further includes the following steps:
  • S34 In the optimization parameter list, determine the random replacement probability of the random replacement method, and determine the random deletion probability of the random deletion method.
  • The data enhancement method further includes a random replacement method and a random deletion method. It is necessary to determine the random replacement probability of the random replacement method and the random deletion probability of the random deletion method in the optimized parameter list, so that the original training data can be transformed according to the random replacement probability and the random deletion probability.
  • S35 Determine the random exchange probability of the random exchange method, and determine the sentence length set by the method of constructing a long sentence.
  • the data enhancement method further includes a random exchange method and a long sentence construction method.
  • In the optimized parameter list, it is also necessary to determine the random exchange probability of the random exchange method and the sentence length set by the long sentence construction method, so that the original training data can be transformed according to the random exchange probability and the set sentence length.
  • S36 Perform entity word replacement for each sentence in the original training data according to the random replacement probability, and perform the same sentence entity word exchange for each sentence in the original training data according to the random exchange probability.
  • For example, the random replacement probability of the random replacement method is β2, and the random exchange probability of the random exchange method is β3.
  • The dictionary may be a preset synonym dictionary.
  • The rule for selecting tokens from the dictionary is: obey a uniform random distribution, excluding the other tokens in the original training data that are themselves to be randomly replaced.
  • The i-th token and the j-th token exchange positions with probability β3.
  • S37 Perform entity word deletion on each sentence in the original training data according to the random deletion probability to obtain processing data.
  • After entity word replacement is performed for each sentence in the original training data according to the random replacement probability, and same-sentence entity word exchange is performed for each sentence according to the random exchange probability, entity word deletion is performed on each sentence of the original training data according to the random deletion probability to obtain the processing data.
  • Specifically, each token of each sentence is replaced with any other token in the dictionary with probability β2; then, within each sentence, the positions of the i-th token and the j-th token are exchanged with probability β3; finally, each token of each sentence is deleted with probability β4 to obtain the processing data.
  • For example, the sentence length set by the long sentence construction method is 100, and the sentence lengths of all sentences in the processing data are counted to obtain the 90th percentile of sentence length.
  • Sentences whose length does not exceed the 90th percentile are spliced pairwise into a longer spliced sentence (the order of the two sentences is random), and the part of the spliced sentence whose length exceeds 100 is deleted, so that each sentence in the processing data has a sentence length of at most 100.
  • It should be noted that setting the sentence length to 100 and splicing pairwise the sentences whose length is less than or equal to the 90th percentile is only an exemplary illustration.
  • The sentence length set by the long sentence construction method can also be other values, and sentences at other sentence-length percentiles can also be spliced pairwise, which will not be repeated here.
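The per-sentence transformations described above (random token replacement with probability β2, same-sentence position exchange with probability β3, random deletion with probability β4, and pairwise splicing truncated to a set length) can be sketched as follows. The function names and default values are illustrative assumptions, not part of the original disclosure:

```python
import random

def augment_sentence(tokens, dictionary, p_replace=0.1, p_swap=0.1, p_delete=0.1):
    """Apply random replacement, same-sentence swap, and random deletion."""
    # Random replacement: each token is replaced by a uniformly chosen
    # dictionary token with probability p_replace (beta2 in the text).
    out = [random.choice(dictionary) if random.random() < p_replace else t
           for t in tokens]
    # Random exchange: swap the positions of two tokens i and j with
    # probability p_swap (beta3 in the text).
    if len(out) >= 2 and random.random() < p_swap:
        i, j = random.sample(range(len(out)), 2)
        out[i], out[j] = out[j], out[i]
    # Random deletion: drop each token with probability p_delete (beta4),
    # keeping at least one token so the sentence never becomes empty.
    kept = [t for t in out if random.random() >= p_delete]
    return kept if kept else out[:1]

def build_long_sentences(sentences, max_len=100):
    """Splice sentences pairwise in random order, truncating to max_len."""
    pool = list(sentences)
    random.shuffle(pool)
    return [(a + b)[:max_len] for a, b in zip(pool[0::2], pool[1::2])]
```

With all probabilities set to zero the sentence passes through unchanged, which makes the sketch easy to sanity-check before tuning the parameters.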
  • In this embodiment, the random replacement probability of the random replacement method and the random deletion probability of the random deletion method are determined in the optimization parameter list, together with the random exchange probability of the random exchange method and the sentence length set by the long sentence construction method. Entity word replacement is performed for each sentence in the original training data according to the random replacement probability, same-sentence entity word exchange is performed according to the random exchange probability, and entity word deletion is performed according to the random deletion probability to obtain the processing data; the sentences in the processing data are then spliced so that they reach the set sentence length. This further refines the step of transforming the original training data with each optimization parameter list, and the use of multiple data enhancement methods further increases the diversity of the artificially constructed data, ensuring the accuracy of the recognition model training set.
  • the data enhancement method includes a synonym replacement method. As shown in FIG. 7 , in step S50, it is determined according to the test result whether there is a convergence model in the multiple recognition models, which specifically includes the following steps:
  • S51 Determine the highest recognition score for each word in the test set in the multiple recognition models.
  • the highest recognition score of the multiple recognition models for recognizing each word in the test set is determined.
  • the recognition score of each word in the test set by the recognition model is determined by the following formula:
  • score t is the score of the recognition model for the recognition of the t-th word in the test set
  • recall is the recall rate of the entity word
  • precision is the precision of the recognition model in recalling the entity words.
  • the recognition scores of the three recognition models A, B, and C for the t-th word in the test set are 0.6, 0.8 and 0.9 respectively, then among the three recognition models A, B, and C, the highest recognition score for recognizing the t-th word in the test set is 0.9.
  • the number of recognition models is 3, and the recognition scores for the t-th word in the test set are 0.6, 0.8, and 0.9, respectively, for exemplary illustration. In other embodiments, the number of recognition models may also be other numerical values. The recognition score for the t-th word in the test set may also be other numerical values, which will not be repeated here.
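The exact score formula is not reproduced in this text, but a score built from recall and precision is conventionally the F1 measure (their harmonic mean). Under that assumption, the per-word scoring and the selection of the highest score among the models can be sketched as:

```python
def f1_score(recall: float, precision: float) -> float:
    """Assumed combination of recall and precision: the F1 measure."""
    if recall + precision == 0.0:
        return 0.0
    return 2.0 * recall * precision / (recall + precision)

def highest_recognition_score(model_scores):
    """Return the highest recognition score among the models' scores
    for a given word in the test set."""
    return max(model_scores)

# Example matching the text: models A, B, C score 0.6, 0.8 and 0.9,
# so the highest recognition score for this word is 0.9.
best = highest_recognition_score([0.6, 0.8, 0.9])
```

The F1 form is an assumption made for illustration; any other monotone combination of recall and precision would plug into `highest_recognition_score` the same way.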
  • If the highest recognition score satisfies the convergence condition, the recognition model corresponding to it is a convergence model, and the optimization parameter list corresponding to the convergence model can be used as the target data enhancement parameter list.
  • Determine whether the highest recognition score satisfies the convergence condition. If the highest recognition score does not satisfy the convergence condition, it means that the recognition effect of no recognition model meets the requirements; it is then determined that there is no convergence model among the multiple recognition models, the optimization parameter lists of this round are not usable, and iterative optimization with the artificial fish swarm algorithm needs to be performed again.
  • the recognition model corresponding to the highest recognition score is the convergence model.
  • In this embodiment, the recognition effect of the recognition model on the test set is used as the food concentration (fitness) of the artificial fish swarm algorithm and as the target of the parameter optimization of the data enhancement model, so an effective data enhancement strategy is obtained at a small cost.
  • the data enhancement method includes a synonym replacement method. As shown in FIG. 8 , in step S52, it is determined whether the highest recognition score satisfies the convergence condition, which specifically includes the following steps:
  • S522 Determine the first highest recognition score for recognizing the t-th word in the test set among the multiple recognition models
  • S523 Determine the second highest recognition score for recognizing the t-1th word in the test set among the multiple recognition models
  • S524 subtract the second highest recognition score from the first highest recognition score to obtain the highest recognition score difference
  • S525 Determine whether the ratio of the difference between the highest recognition score and the second highest recognition score is less than the convergence parameter
  • maxscore_t is the highest recognition score for the t-th word in the test set among the multiple recognition models, that is, the first highest recognition score;
  • maxscore_{t-1} is the highest recognition score for the (t-1)-th word in the test set among the multiple recognition models, that is, the second highest recognition score;
  • ε is a convergence parameter configured by the user (it can be 0.01, for example).
  • If (maxscore_t - maxscore_{t-1}) / maxscore_{t-1} is less than the convergence parameter ε, it is determined that the highest recognition score satisfies the convergence condition; if it is not less than ε, it is determined that the highest recognition score does not satisfy the convergence condition.
  • In this embodiment, the convergence parameter configured by the user is obtained, the first highest recognition score for recognizing the t-th word in the test set among the multiple recognition models is determined, the second highest recognition score for recognizing the (t-1)-th word is determined, the second highest recognition score is subtracted from the first to obtain the highest-recognition-score difference, and it is determined whether the ratio of this difference to the second highest recognition score is less than the convergence parameter.
  • If the ratio is less than the convergence parameter, it is determined that the highest recognition score satisfies the convergence condition; if it is not less than the convergence parameter, it is determined that the highest recognition score does not satisfy the convergence condition. This clarifies the specific process of determining whether the highest recognition score satisfies the convergence condition, and provides a basis for judging model convergence from the highest recognition score.
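The convergence test described above (the ratio of the highest-score difference to the second highest score compared against a user-configured parameter such as 0.01) can be sketched as follows; the function name is an illustrative assumption:

```python
def satisfies_convergence(maxscore_t: float, maxscore_prev: float,
                          epsilon: float = 0.01) -> bool:
    """True when the relative improvement of the highest recognition
    score, (maxscore_t - maxscore_prev) / maxscore_prev, falls below
    the user-configured convergence parameter epsilon."""
    if maxscore_prev == 0.0:
        # Assumption: with no usable baseline score, convergence
        # cannot be judged yet.
        return False
    return (maxscore_t - maxscore_prev) / maxscore_prev < epsilon
```

For instance, an improvement from 0.9 to 0.901 is a relative gain of about 0.11%, below the example threshold of 1%, so the condition is satisfied; a jump from 0.9 to 0.95 is not.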
  • a data enhancement apparatus based on a deep learning model corresponds to the data enhancement method based on the deep learning model in the above-mentioned embodiment.
  • the data enhancement device based on the deep learning model includes an acquisition module 901 , an initialization module 902 , a conversion module 903 , a test module 904 , an output module 905 and an enhancement module 906 .
  • the detailed description of each functional module is as follows:
  • Obtaining module 901 is used to obtain the original training data and original test data marked manually, and obtain the original parameter list, and the original parameter list is formed by the data enhancement method and the corresponding enhancement parameters of the data enhancement method;
  • An initialization module 902 configured to randomly initialize the enhanced parameters in the original parameter list according to the artificial fish swarm algorithm to obtain multiple optimized parameter lists;
  • The conversion module 903 is configured to convert the original training data by using each of the optimized parameter lists to obtain corresponding artificially constructed data, and mix the original training data with the corresponding artificially constructed data to obtain multiple training sets;
  • The testing module 904 is configured to use the multiple training sets to train and obtain multiple recognition models, and use the original test data as a test set to test the multiple recognition models, so as to determine whether there is a model satisfying the convergence condition among the multiple recognition models;
  • the output module 905 is configured to output the optimization parameter list corresponding to the model satisfying the convergence condition as the target data enhancement parameter list if there is a model satisfying the convergence condition in the plurality of identification models;
  • An enhancement module 906 configured to perform data enhancement on the original training data by using the target data enhancement parameter list to obtain a training set of a named entity recognition model.
  • In an embodiment, the data enhancement device based on the deep learning model further includes a loop module 907; after it is determined whether there is a model satisfying the convergence condition among the plurality of recognition models, the loop module 907 is specifically configured to:
  • randomly initialize the enhancement parameters in the original parameter list again according to the artificial fish swarm algorithm to obtain multiple optimization parameter lists after re-initialization, and count the number of initializations;
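Taken together, the modules above describe a search loop: initialize candidate parameter lists, train and score one recognition model per candidate, and re-initialize when no candidate converges. A minimal sketch follows, with the artificial fish swarm update abstracted into plain random re-initialization (the patent does not specify its internals here) and all names hypothetical:

```python
import random

def optimize_augmentation_params(param_bounds, train, test, build_model,
                                 epsilon=0.01, max_rounds=10, swarm_size=5):
    """Hypothetical parameter-search loop: random initialization of each
    candidate parameter list (a stand-in for the artificial fish swarm
    update), one model trained and scored per candidate, stopping when
    the best score's relative improvement falls below epsilon."""
    best_prev = None
    for _ in range(max_rounds):
        # Re-initialize the swarm of candidate optimization parameter lists.
        candidates = [
            {name: random.uniform(lo, hi)
             for name, (lo, hi) in param_bounds.items()}
            for _ in range(swarm_size)
        ]
        # Train one recognition model per candidate and score it on the test set.
        scores = [build_model(p, train).score(test) for p in candidates]
        best = max(scores)
        if best_prev is not None and (best - best_prev) / best_prev < epsilon:
            return candidates[scores.index(best)]  # target parameter list
        best_prev = best
    return candidates[scores.index(best)]
```

`build_model` is assumed to return any object with a `score(test)` method; the real system would train a named entity recognition model on data augmented with the candidate parameters.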
  • the data enhancement method includes a synonym replacement method, and the conversion module 903 is specifically used for:
  • Synonym replacement is performed on the entity words in the original training data according to the entity word category replacement probability and the entity word replacement category.
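For the category-conditioned synonym replacement just described, a sketch follows; the span representation, the per-category probability dictionary, and the function name are illustrative assumptions, not part of the original disclosure:

```python
import random

def synonym_replace(tokens, entity_spans, synonyms_by_category, p_category):
    """Replace entity words with same-category synonyms.

    tokens: the sentence as a token list.
    entity_spans: list of (token_index, category) pairs for entity words.
    synonyms_by_category: category -> candidate synonym list.
    p_category: category -> replacement probability (the entity word
    category replacement probability in the text).
    """
    out = list(tokens)
    for idx, cat in entity_spans:
        # Replace this entity word with probability set for its category.
        if random.random() < p_category.get(cat, 0.0):
            pool = synonyms_by_category.get(cat, [])
            if pool:
                out[idx] = random.choice(pool)
    return out
```

Setting a category's probability to 0 leaves its entity words untouched, while 1.0 always replaces them, which makes the behavior easy to verify.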
  • conversion module 903 is specifically also used for:
  • the data enhancement method also includes a random replacement method, a random deletion method, a random exchange method and a long sentence construction method, and the conversion module 903 is specifically also used for:
  • test module 904 is specifically used for:
  • the recognition model corresponding to the highest recognition score is the convergence model
  • the convergence model that satisfies the convergence condition does not exist in the plurality of recognition models.
  • The test module 904 is specifically also used for:
  • the ratio of the difference between the highest recognition score and the second highest recognition score is not less than the convergence parameter, it is determined that the highest recognition score does not satisfy the convergence condition.
  • Each module in the above-mentioned deep learning model-based data augmentation apparatus may be implemented in whole or in part by software, hardware, and combinations thereof.
  • the above modules can be embedded in or independent of the processor in the computer device in the form of hardware, or stored in the memory in the computer device in the form of software, so that the processor can call and execute the operations corresponding to the above modules.
  • a computer device is provided, and the computer device may be a server, and its internal structure diagram may be as shown in FIG. 10 .
  • the computer device includes a processor, memory, a network interface, and a database connected by a system bus. Among them, the processor of the computer device is used to provide computing and control capabilities.
  • the memory of the computer device includes a non-volatile storage medium, an internal memory.
  • the non-volatile storage medium stores an operating system, computer readable instructions and a database.
  • the internal memory provides an environment for the execution of the operating system and computer-readable instructions in the non-volatile storage medium.
  • The database of the computer device is used to store the original training data, the original test data, the original parameter list, the artificially constructed data, the optimized parameter lists, the multiple recognition models, and other related data used or produced by the data enhancement method.
  • the network interface of the computer device is used to communicate with an external terminal through a network connection.
  • the computer-readable instructions when executed by a processor, implement a deep learning model-based data augmentation method.
  • A computer device is provided, including a memory, a processor, and computer-readable instructions stored in the memory and executable on the processor; when executing the computer-readable instructions, the processor implements the steps of the above data enhancement method based on a deep learning model.
  • a computer-readable storage medium on which computer-readable instructions are stored, and when the computer-readable instructions are executed by a processor, implement the steps of the above-mentioned deep learning model-based data enhancement method.
  • Nonvolatile memory may include read only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), or flash memory.
  • Volatile memory may include random access memory (RAM) or external cache memory.
  • RAM is available in various forms, such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), Synchlink DRAM (SLDRAM), Rambus direct RAM (RDRAM), direct Rambus dynamic RAM (DRDRAM), and Rambus dynamic RAM (RDRAM), etc.


Abstract

A deep learning model-based data augmentation method and apparatus, a device, and a medium. The method comprises: obtaining manually labeled original training data and original test data, and obtaining an original parameter list (S10); randomly initializing, according to an artificial fish swarm algorithm, augmentation parameters in the original parameter list to obtain multiple optimization parameter lists (S20); using each optimization parameter list to transform the original training data to obtain corresponding artificially constructed data, and mixing the original training data with the corresponding artificially constructed data to obtain multiple training sets (S30); using the multiple training sets to carry out training separately to obtain multiple recognition models, and using the original test data as a test set to test the multiple recognition models (S40); determining, according to the test results, whether there is a model satisfying a convergence condition among the multiple recognition models (S50); if yes, outputting the optimization parameter list corresponding to the model satisfying the convergence condition as a target data augmentation parameter list (S60); and using the target data augmentation parameter list to perform data augmentation on the original training data to obtain a training set of a named entity recognition model (S70). The method uses the artificial fish swarm algorithm as the framework and integrates the model recognition effect as an optimization target into the formulation of the data augmentation strategy, thereby improving data augmentation effect on the data.

Description

Data Enhancement Method, Apparatus, Device and Medium Based on Deep Learning Model
This application claims priority to the Chinese invention patent application filed with the China Patent Office on April 19, 2021, with application number 202110420110.3 and the invention title "Data Enhancement Method, Apparatus, Device and Medium Based on Deep Learning Model", the entire contents of which are incorporated herein by reference.
Technical Field
The present application relates to the field of artificial intelligence, and in particular, to a data enhancement method, apparatus, device and medium based on a deep learning model.
Background Art
With the development of intelligent technology, the demand for named entity recognition (NER) tasks is growing in application fields of natural language processing such as question answering systems and machine translation systems. As a result, performing the named entity recognition task with a named entity recognition model trained on entity data has become an increasingly common recognition approach. To improve a named entity recognition model's entity recognition rate on the text to be recognized, the training data or the model algorithm is usually enhanced, so as to enhance the accuracy of the named entity recognition model.
Technical Problem
The inventors found that, in the prior art, the data enhancement model for a named entity recognition model mainly replaces the entity words in the training data through different data enhancement methods and their corresponding parameters, for example, performing synonym replacement, random insertion, random position exchange, and random deletion on the entity words in the training data with certain probabilities, so as to increase the scale and diversity of the training data. The enhancement effect of the data enhancement model on the training data is inseparable from the model parameters, but the model parameters of existing data enhancement models are determined by experience or by grid-search parameter optimization, with little interaction with the named entity recognition model, so the data enhancement model has a poor enhancement effect on the training data.
Technical Solution
The present application provides a data enhancement method, apparatus, device and medium based on a deep learning model, to solve the problem in the prior art that the model parameters of a data enhancement model are determined by experience or by grid-search parameter optimization, resulting in a poor data enhancement effect of the data enhancement model.
一种基于深度学习模型的数据增强方法,包括:A data augmentation method based on a deep learning model, including:
获取经过人工标注的原始训练数据和原始测试数据,并获取原参数列表,所述原参数列表由数据增强方法和所述数据增强方法对应的增强参数构成;Obtain the manually marked original training data and original test data, and obtain the original parameter list, where the original parameter list is composed of a data enhancement method and an enhancement parameter corresponding to the data enhancement method;
根据人工鱼群算法对所述原参数列表中的增强参数进行随机初始化,以获得多个优化参数列表;Randomly initialize the enhanced parameters in the original parameter list according to the artificial fish swarm algorithm to obtain multiple optimized parameter lists;
利用每一所述优化参数列表对所述原始训练数据进行转换,以获得对应的人工构造数据,并将所述原始训练数据与所述对应的人工构造数据进行混合,以获得多个训练集;Transform the original training data using each of the optimized parameter lists to obtain corresponding artificially constructed data, and mix the original training data with the corresponding artificially constructed data to obtain multiple training sets;
利用所述多个训练集训分别练获得多个识别模型,并将所述原始测试数据作为测试集对所述多个识别模型进行测试,以确定所述多个识别模型中是否存在满足收敛条件的模型;Use the plurality of training sets to train to obtain a plurality of recognition models, and use the original test data as a test set to test the plurality of recognition models, so as to determine whether the plurality of recognition models meet the convergence conditions. Model;
若所述多个识别模型中存在所述满足收敛条件的模型,则输出所述满足收敛条件的模型所对应的优化参数列表,作为目标数据增强参数列表;If there is a model that satisfies the convergence condition in the plurality of identification models, outputting an optimization parameter list corresponding to the model that satisfies the convergence condition as a target data enhancement parameter list;
利用所述目标数据增强参数列表对所述原始训练数据进行数据增强,以获得命名实体识别模型的训练集。Data augmentation is performed on the original training data by using the target data augmentation parameter list to obtain a training set of a named entity recognition model.
A data enhancement apparatus based on a deep learning model, comprising:
An acquisition module, configured to acquire manually annotated original training data and original test data, and acquire an original parameter list, where the original parameter list is composed of data enhancement methods and enhancement parameters corresponding to the data enhancement methods;
An initialization module, configured to randomly initialize the enhancement parameters in the original parameter list according to the artificial fish swarm algorithm to obtain multiple optimized parameter lists;
A conversion module, configured to transform the original training data using each of the optimized parameter lists to obtain corresponding artificially constructed data, and mix the original training data with the corresponding artificially constructed data to obtain multiple training sets;
A test module, configured to use the multiple training sets to separately train and obtain multiple recognition models, and use the original test data as a test set to test the multiple recognition models, so as to determine whether there is a model satisfying the convergence condition among the multiple recognition models;
An output module, configured to, if there is a model satisfying the convergence condition among the multiple recognition models, output the optimization parameter list corresponding to that model as a target data enhancement parameter list;
An enhancement module, configured to perform data enhancement on the original training data using the target data enhancement parameter list to obtain a training set for a named entity recognition model.
A computer device, comprising a memory, a processor, and computer-readable instructions stored in the memory and executable on the processor, where the processor implements the following steps when executing the computer-readable instructions:
Obtaining manually annotated original training data and original test data, and obtaining an original parameter list, where the original parameter list is composed of data enhancement methods and enhancement parameters corresponding to the data enhancement methods;
Randomly initializing the enhancement parameters in the original parameter list according to the artificial fish swarm algorithm to obtain multiple optimized parameter lists;
Transforming the original training data using each of the optimized parameter lists to obtain corresponding artificially constructed data, and mixing the original training data with the corresponding artificially constructed data to obtain multiple training sets;
Using the multiple training sets to separately train and obtain multiple recognition models, and using the original test data as a test set to test the multiple recognition models, so as to determine whether there is a model satisfying the convergence condition among the multiple recognition models;
If there is a model satisfying the convergence condition among the multiple recognition models, outputting the optimization parameter list corresponding to that model as a target data enhancement parameter list;
Performing data enhancement on the original training data using the target data enhancement parameter list to obtain a training set for a named entity recognition model.
One or more readable storage media storing computer-readable instructions that, when executed by one or more processors, cause the one or more processors to perform the following steps:
Obtaining manually annotated original training data and original test data, and obtaining an original parameter list, where the original parameter list is composed of data enhancement methods and enhancement parameters corresponding to the data enhancement methods;
Randomly initializing the enhancement parameters in the original parameter list according to the artificial fish swarm algorithm to obtain multiple optimized parameter lists;
Transforming the original training data using each of the optimized parameter lists to obtain corresponding artificially constructed data, and mixing the original training data with the corresponding artificially constructed data to obtain multiple training sets;
Using the multiple training sets to separately train and obtain multiple recognition models, and using the original test data as a test set to test the multiple recognition models, so as to determine whether there is a model satisfying the convergence condition among the multiple recognition models;
If there is a model satisfying the convergence condition among the multiple recognition models, outputting the optimization parameter list corresponding to that model as a target data enhancement parameter list;
Performing data enhancement on the original training data using the target data enhancement parameter list to obtain a training set for a named entity recognition model.
Technical Effect
In the present application, an artificial fish swarm algorithm suitable for the coexistence of discrete and continuous values is used to randomly initialize the enhancement parameters in the original parameter list, and the recognition effect of the recognition model is integrated as an optimization target into the formulation of the data enhancement strategy, so that a data enhancement list with a good effect is obtained at a small cost, thereby improving the data enhancement effect of the data enhancement list on the data.
The details of one or more embodiments of the present application are set forth in the accompanying drawings and the description below; other features and advantages of the present application will become apparent from the description, the drawings and the claims.
Brief Description of Drawings
In order to describe the technical solutions of the embodiments of the present application more clearly, the following briefly introduces the drawings used in the description of the embodiments. Obviously, the drawings in the following description are only some embodiments of the present application; for those of ordinary skill in the art, other drawings can be obtained from these drawings without creative effort.
FIG. 1 is a schematic diagram of an application environment of a data enhancement method based on a deep learning model in an embodiment of the present application;
FIG. 2 is a schematic flowchart of a data enhancement method based on a deep learning model in an embodiment of the present application;
FIG. 3 is another schematic flowchart of a data enhancement method based on a deep learning model in an embodiment of the present application;
FIG. 4 is an implementation flowchart of step S30 in FIG. 2;
FIG. 5 is an implementation flowchart of step S33 in FIG. 4;
FIG. 6 is another implementation flowchart of step S30 in FIG. 2;
FIG. 7 is an implementation flowchart of step S50 in FIG. 2;
FIG. 8 is an implementation flowchart of step S52 in FIG. 7;
FIG. 9 is a schematic structural diagram of a data enhancement apparatus based on a deep learning model in an embodiment of the present application;
FIG. 10 is a schematic structural diagram of a computer device in an embodiment of the present application.
具体实施方式Detailed ways
下面将结合本申请实施例中的附图,对本申请实施例中的技术方案进行清楚、完整地描述,显然,所描述的实施例是本申请一部分实施例,而不是全部的实施例。基于本申请中的实施例,本领域普通技术人员在没有作出创造性劳动前提下所获得的所有其他实施例,都属于本申请保护的范围。The technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present application. Obviously, the described embodiments are part of the embodiments of the present application, not all of the embodiments. Based on the embodiments in the present application, all other embodiments obtained by those of ordinary skill in the art without creative work fall within the protection scope of the present application.
本申请实施例提供的基于深度学习模型的数据增强方法,可应用在如图1的应用环境中,其中,终端设备通过网络与服务器进行通信。服务器通过获取用户通过终端设备发送的,经过人工标注的原始训练数据和原始测试数据,并获取用户通过终端设备发送的原参数列表,原参数列表由数据增强方法和数据增强方法对应的增强参数构成,根据人工鱼群算法对原参数列表中的增强参数进行随机初始化,以获得多个优化参数列表,利用每一优化参数列表对原始训练数据进行转换,以获得对应的人工构造数据,并将原始训练数据与对应的人工构造数据进行混合,以获得多个训练集,利用多个训练集分别训练获得多个识别模型,并将原始测试数据作为测试集对多个识别模型进行测试,以确定多个识别模型中是否存在满足收敛条件的模型,若多个识别模型中存在满足收敛条件的模型,则输出满足收敛条件的模型所对应的优化参数列表,作为目标数据增强参数列表,利用目标数据增强参数列表对原始训练数据进行数据增强,以获得命名实体识别模型的训练集,采用了适合离散值和连续值共存情形的人工鱼群算法随机初始化原参数列表中的增强参数,并把识别模型的识别效果作为优化目标融合到了数据增强策略的制定中,以较小的代价得到一个效果较好的数据增强列表,从而保证了命名实体识别模型训练集的数据多样性,扩大了训练集的规模,进而提升命名实体识别模型的识别准确性,进而实现了训练数据增强以及命名实体识别的人工智能化。The data enhancement method based on the deep learning model provided by the embodiments of the present application can be applied in the application environment shown in FIG. 1, in which the terminal device communicates with the server through the network. The server obtains the manually labeled original training data and original test data sent by the user through the terminal device, and obtains the original parameter list sent by the user through the terminal device; the original parameter list consists of data enhancement methods and the enhancement parameters corresponding to the data enhancement methods. The enhancement parameters in the original parameter list are randomly initialized according to the artificial fish swarm algorithm to obtain multiple optimized parameter lists; each optimized parameter list is used to transform the original training data to obtain the corresponding artificially constructed data, and the original training data is mixed with the corresponding artificially constructed data to obtain multiple training sets. Multiple recognition models are obtained by training on the multiple training sets respectively, and the original test data is used as the test set to test the multiple recognition models, so as to determine whether there is a model among them that satisfies the convergence condition. If such a model exists, the optimized parameter list corresponding to the model that satisfies the convergence condition is output as the target data enhancement parameter list, and the target data enhancement parameter list is used to perform data enhancement on the original training data to obtain the training set of the named entity recognition model. The artificial fish swarm algorithm, which is suitable for the coexistence of discrete values and continuous values, is used to randomly initialize the enhancement parameters in the original parameter list, and the recognition effect of the recognition model is integrated into the formulation of the data enhancement strategy as the optimization objective, so that a data enhancement list with a better effect is obtained at a small cost. This ensures the data diversity of the training set of the named entity recognition model, expands the scale of the training set, improves the recognition accuracy of the named entity recognition model, and thus realizes artificial intelligence for training data enhancement and named entity recognition.
其中,基于深度学习模型的数据增强方法用到或者生成的相关数据存储在服务器的数据库中,本实施例中的数据库存储于区块链网络中,用于存储实现基于深度学习模型的数据增强方法所用到、生成的数据,如原始训练数据、原始测试数据、原参数列表、人工构造数据、优化参数列表和多个识别模型等相关数据。本申请所指区块链是分布式数据存储、点对点传输、共识机制、加密算法等计算机技术的新型应用模式。区块链(Blockchain),本质上是一个去中心化的数据库,是一串使用密码学方法相关联产生的数据块,每一个数据块中包含了一批次网络交易的信息,用于验证其信息的有效性(防伪)和生成下一个区块。区块链可以包括区块链底层平台、平台产品服务层以及应用服务层等。将数据库部署于区块链可提高数据存储的安全性。The relevant data used or generated by the deep learning model-based data enhancement method is stored in the database of the server; in this embodiment, the database is stored in a blockchain network and is used to store the data used and generated in implementing the deep learning model-based data enhancement method, such as the original training data, the original test data, the original parameter list, the artificially constructed data, the optimized parameter lists and the multiple recognition models. The blockchain referred to in this application is a new application mode of computer technologies such as distributed data storage, point-to-point transmission, consensus mechanisms and encryption algorithms. A blockchain is essentially a decentralized database, a chain of data blocks generated in association using cryptographic methods; each data block contains a batch of network transaction information, which is used to verify the validity of its information (anti-counterfeiting) and to generate the next block. A blockchain may include an underlying blockchain platform, a platform product service layer, an application service layer, and the like. Deploying the database on the blockchain can improve the security of data storage.
其中,终端设备可以但不限于各种个人计算机、笔记本电脑、智能手机、平板电脑和便携式可穿戴设备。服务器可以用独立的服务器或者是多个服务器组成的服务器集群来实现。Wherein, the terminal device can be, but not limited to, various personal computers, notebook computers, smart phones, tablet computers and portable wearable devices. The server can be implemented as an independent server or a server cluster composed of multiple servers.
在一实施例中,如图2所示,提供一种基于深度学习模型的数据增强方法,以该方法应用在图1中的服务器为例进行说明,包括如下步骤:In one embodiment, as shown in FIG. 2, a data enhancement method based on a deep learning model is provided, and the method is applied to the server in FIG. 1 as an example for description, including the following steps:
S10:获取经过人工标注的原始训练数据和原始测试数据,并获取原参数列表,原参数列表由数据增强方法和数据增强方法对应的增强参数构成。S10: Obtain the manually labeled original training data and original test data, and obtain the original parameter list, where the original parameter list is composed of the data enhancement method and the enhancement parameters corresponding to the data enhancement method.
可以理解的是,本实施例中的原参数列表为数据增强模型,数据增强模型由数据增强方法和数据增强方法对应的增强参数构成,数据增强模型对数据的增强性能取决于模型内的数据增强方法和数据增强方法对应的增强参数。因此,在利用数据增强模型之前,需要对已有的数据增强模型进行参数寻优,以提高数据增强模型对训练数据的增强性能,从而保证后续利用训练数据训练获得的命名实体识别模型的识别精度。It can be understood that the original parameter list in this embodiment is a data enhancement model, which is composed of data enhancement methods and the enhancement parameters corresponding to the data enhancement methods; the enhancement performance of the data enhancement model depends on the data enhancement methods in the model and their corresponding enhancement parameters. Therefore, before using the data enhancement model, it is necessary to optimize the parameters of the existing data enhancement model to improve its enhancement performance on the training data, so as to ensure the recognition accuracy of the named entity recognition model subsequently trained on the training data.
对已有的数据增强模型进行参数寻优,需要获取已有的数据增强模型,即获取模型的原参数列表,并获取经过人工标注的原始训练数据和原始测试数据。To optimize the parameters of an existing data augmentation model, it is necessary to obtain the existing data augmentation model, that is, to obtain the original parameter list of the model, and to obtain the manually labeled original training data and original test data.
S20:根据人工鱼群算法对原参数列表中的增强参数进行随机初始化,以获得多个优化参数列表。S20: Randomly initialize the enhanced parameters in the original parameter list according to the artificial fish swarm algorithm to obtain multiple optimized parameter lists.
在获取经过人工标注的原始训练数据和原始测试数据,并获取原参数列表之后,以收敛速度快、适合离散值和连续值共存情形的人工鱼群算法为框架,对原参数列表中的增强参数进行随机初始化,以获得多个优化参数列表。After obtaining the manually labeled original training data and original test data and obtaining the original parameter list, the enhancement parameters in the original parameter list are randomly initialized within the framework of the artificial fish swarm algorithm, which converges quickly and is suitable for the coexistence of discrete values and continuous values, so as to obtain multiple optimized parameter lists.
S30:利用每一优化参数列表对原始训练数据进行转换,以获得对应的人工构造数据,并将原始训练数据与对应的人工构造数据进行混合,以获得多个训练集。S30: Transform the original training data by using each optimization parameter list to obtain corresponding artificially constructed data, and mix the original training data with the corresponding artificially constructed data to obtain multiple training sets.
在获得多个优化参数列表之后,利用每一优化参数列表对原始训练数据进行转换,以获得对应的人工构造数据,并将原始训练数据与对应的人工构造数据进行随机打乱,以混合获得多个训练集。After obtaining the multiple optimized parameter lists, each optimized parameter list is used to transform the original training data to obtain the corresponding artificially constructed data, and the original training data and the corresponding artificially constructed data are randomly shuffled and mixed to obtain multiple training sets.
例如,在获得L个优化参数列表之后,利用每一优化参数列表对原始训练数据进行转换,则获得L份对应的人工构造数据,每份人工构造数据对应一个优化参数列表;在获得L份对应的人工构造数据之后,将原始训练数据分别与每份人工构造数据进行混合,以获得L个训练集。For example, after obtaining L optimized parameter lists, each optimized parameter list is used to transform the original training data, so that L pieces of corresponding artificially constructed data are obtained, each piece corresponding to one optimized parameter list; after obtaining the L pieces of artificially constructed data, the original training data is mixed with each piece of artificially constructed data to obtain L training sets.
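上述“转换—混合—打乱”过程可以用如下示意性代码表示(仅为假设性草图,函数名与数据形态均非原文内容)。The mixing step described above can be sketched as follows (a hypothetical illustration; the function name and data shapes are assumptions, not part of the original):

```python
import random

def build_training_sets(original, constructed_sets, seed=0):
    """将原始训练数据与每份人工构造数据混合并随机打乱,得到L个训练集。
    original: 原始训练数据样本列表; constructed_sets: L份人工构造数据。"""
    rng = random.Random(seed)
    training_sets = []
    for constructed in constructed_sets:
        mixed = list(original) + list(constructed)  # 原始数据 + 对应的人工构造数据
        rng.shuffle(mixed)                          # 随机打乱以混合
        training_sets.append(mixed)
    return training_sets
```

每个训练集都包含全部原始数据与对应一份人工构造数据,打乱顺序以避免构造样本集中出现在训练末尾。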
S40:利用多个训练集分别训练获得多个识别模型,并将原始测试数据作为测试集对多个识别模型进行测试。S40: Use the multiple training sets to train separately to obtain multiple recognition models, and use the original test data as a test set to test the multiple recognition models.
在获得多个训练集之后,利用多个训练集分别训练获得多个识别模型,并将原始测试数据作为测试集,利用测试集对多个识别模型进行测试,获得每一识别模型对测试集中各实体词的识别效果(识别得分),作为测试结果。After obtaining the multiple training sets, multiple recognition models are obtained by training on the multiple training sets respectively, and the original test data is used as the test set to test the multiple recognition models, so as to obtain each recognition model's recognition effect (recognition score) on the entity words in the test set as the test result.
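原文未指明“识别得分”的具体计算方式;命名实体识别中一种常见选择是实体级F1,下面给出一个假设性示意。The original text does not specify how the recognition score is computed; a common choice in named entity recognition is entity-level F1, sketched hypothetically below:

```python
def entity_f1(gold, pred):
    """实体级F1示意:gold/pred 为 (起点, 终点, 实体类别) 三元组集合。
    该得分可作为每一识别模型在测试集上的识别效果(识别得分)。"""
    gold, pred = set(gold), set(pred)
    if not gold or not pred:
        return 0.0
    tp = len(gold & pred)                 # 预测正确的实体数
    precision = tp / len(pred)
    recall = tp / len(gold)
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)
```

该得分越高,说明对应训练集(即对应的优化参数列表)的增强效果越好。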
S50:根据测试结果确定多个识别模型中是否存在满足收敛条件的模型。S50: Determine, according to the test result, whether there is a model that satisfies the convergence condition in the plurality of identification models.
在将原始测试数据作为测试集对多个识别模型进行测试之后,根据各识别模型对测试集中各实体词的识别效果,即测试结果,确定多个识别模型中是否存在满足收敛条件的模型。其中,多个识别模型可以是传统的实体识别模型。After testing the multiple recognition models with the original test data as the test set, according to the recognition effect of each recognition model on each entity word in the test set, that is, the test result, it is determined whether there is a model satisfying the convergence condition among the multiple recognition models. Wherein, the multiple recognition models may be traditional entity recognition models.
S60:若多个识别模型中存在满足收敛条件的模型,则输出满足收敛条件的模型所对应的优化参数列表,作为目标数据增强参数列表。S60: If a model that satisfies the convergence condition exists in the plurality of identification models, output an optimization parameter list corresponding to the model that satisfies the convergence condition as a target data enhancement parameter list.
在确定多个识别模型中是否存在满足收敛条件的模型之后,若多个识别模型中存在满足收敛条件的模型,表示多个识别模型中已有识别模型的识别效果满足用户要求;对应地,该满足收敛条件的模型所使用的训练集满足要求,进而可确定该训练集对应的优化参数列表为满足数据增强需求的数据增强列表,则输出该模型所对应的优化参数列表,作为目标数据增强参数列表。After determining whether there is a model that satisfies the convergence condition among the multiple recognition models, if such a model exists, it means that the recognition effect of at least one of the recognition models meets the user's requirements; correspondingly, the training set used by the model that satisfies the convergence condition meets the requirements, and the optimized parameter list corresponding to this training set is a data enhancement list that meets the data enhancement requirements. The optimized parameter list corresponding to this model is therefore output as the target data enhancement parameter list.
S70:利用目标数据增强参数列表对原始训练数据进行数据增强,以获得命名实体识别模型的训练集。S70: Perform data enhancement on the original training data by using the target data enhancement parameter list to obtain a training set of the named entity recognition model.
在输出满足收敛条件的模型所对应的优化参数列表,作为目标数据增强参数列表之后,利用目标数据增强参数列表对原始训练数据进行数据增强,进而将数据增强后的增强数据与原始训练数据进行随机打乱,以混合获得命名实体识别模型的训练集,由此可训练获得较为准确的命名实体识别模型,从而保证了命名实体识别模型的识别精度。After outputting the optimized parameter list corresponding to the model that satisfies the convergence condition as the target data enhancement parameter list, the target data enhancement parameter list is used to perform data enhancement on the original training data, and the enhanced data is then randomly shuffled together with the original training data to obtain the training set of the named entity recognition model. A relatively accurate named entity recognition model can thus be obtained, which ensures the recognition accuracy of the named entity recognition model.
需要理解的是,人工鱼群算法是一种粒子群优化算法,把粒子看做试图达到水域中食物浓度最高的位置、从而提升自身生活状态的鱼。在本实施例中,粒子、人工鱼就是进行随机初始化的原参数列表中的增强参数,食物浓度就是识别模型的代价函数或损失函数,人工鱼在算法运行过程中的游动过程,就是原参数列表中的增强参数逐渐逼近最优位置、使代价函数或损失函数取值逼近最低值的过程。It should be understood that the artificial fish swarm optimization algorithm is a particle swarm optimization algorithm, which regards particles as fish trying to reach the position with the highest food concentration in the waters, thereby improving their living conditions. In this embodiment, the particles and artificial fish are the enhancement parameters in the original parameter list for random initialization, the food concentration is the cost function or loss function of the recognition model, and the swimming process of the artificial fish during the algorithm operation is the original parameter The enhancement parameters in the list gradually approach the optimal position, and the process of making the cost function or loss function approach the lowest value.
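按上述对应关系,人工鱼的一次“觅食”行为可以示意如下(假设性草图:fitness代表识别效果,即“食物浓度”,参数已归一化到[0,1];函数名与步长等均为假设,非原文实现)。Following the correspondence above, one "prey" step of an artificial fish can be sketched as follows (a hypothetical sketch: fitness stands for the recognition effect, i.e. the "food concentration", and parameters are normalized to [0,1]; the names and step sizes are assumptions, not the original implementation):

```python
import random

def prey_step(fish, fitness, step=0.1, tries=5, seed=0):
    """觅食行为示意:在当前位置附近随机试探若干次,
    向“食物浓度”(适应度,即识别效果)更高的位置移动。"""
    rng = random.Random(seed)
    best, best_fit = fish, fitness(fish)
    for _ in range(tries):
        cand = [min(1.0, max(0.0, x + rng.uniform(-step, step))) for x in fish]
        cand_fit = fitness(cand)
        if cand_fit > best_fit:            # 只向更优位置游动
            best, best_fit = cand, cand_fit
    return best, best_fit
```

迭代执行该步骤,即对应“人工鱼逐渐逼近最优位置、使代价函数取值逼近最低值”的过程。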
其中,由数据增强方法和数据增强方法对应的增强参数,构成的原参数列表可以如表1所示:Among them, the original parameter list formed by the data enhancement method and the enhancement parameters corresponding to the data enhancement method can be shown in Table 1:
表1Table 1
Figure PCTCN2021096475-appb-000001
本实施例中,如表1所示,以人工鱼群算法为框架,将原参数列表中的连续值β1至β5,以及离散值p_syn联合,构成一个混合了连续值和离散值的原参数列表,原参数列表中包括数据增强方法和对应的增强参数;使用人工鱼群算法,对这个原参数列表的增强参数进行迭代优化,从而获得一个优化参数列表,使用优化参数列表对原始训练数据进行处理获得人工构造数据,进而将人工构造数据与原始训练数据进行混合,以较低的代价得到了较高质量的训练集,保证了命名实体识别模型的识别精度。In this embodiment, as shown in Table 1, with the artificial fish swarm algorithm as the framework, the continuous values β1 to β5 and the discrete value p_syn in the original parameter list are combined to form an original parameter list that mixes continuous and discrete values; the original parameter list includes the data enhancement methods and the corresponding enhancement parameters. The artificial fish swarm algorithm is used to iteratively optimize the enhancement parameters of this original parameter list to obtain an optimized parameter list, which is used to process the original training data to obtain artificially constructed data; the artificially constructed data is then mixed with the original training data, so that a high-quality training set is obtained at a low cost, ensuring the recognition accuracy of the named entity recognition model.
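表1的具体内容在原文中以图片形式给出;下面以假设的参数空间(β1至β5为连续值,p_syn为按0.1量化的各类别离散替换概率)示意连续值与离散值共存的参数列表如何随机初始化。The concrete content of Table 1 is given as an image in the original; the sketch below uses an assumed parameter space (β1–β5 continuous, p_syn as per-category discrete probabilities quantized in steps of 0.1) to illustrate how a parameter list mixing continuous and discrete values can be randomly initialized:

```python
import random

NUM_BETA = 5        # 连续增强参数β1..β5的个数(假设)
NUM_CLASSES = 3     # p_syn 覆盖的实体类别数K(假设)

def random_parameter_list(rng):
    """随机初始化一条“人工鱼”:连续值与离散值共存的增强参数列表。"""
    betas = [rng.random() for _ in range(NUM_BETA)]          # 连续值
    p_syn = [rng.choice([i / 10 for i in range(11)])         # 离散值
             for _ in range(NUM_CLASSES)]
    return {"beta": betas, "p_syn": p_syn}
```

多次调用即可得到多个互相独立的优化参数列表候选,作为鱼群的初始位置。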
本实施例中,通过获取经过人工标注的原始训练数据和原始测试数据,并获取原参数列表,原参数列表由数据增强方法和数据增强方法对应的增强参数构成,根据人工鱼群算法对原参数列表中的增强参数进行随机初始化,以获得多个优化参数列表,利用每一优化参数列表对原始训练数据进行转换,以获得对应的人工构造数据,并将原始训练数据与对应的人工构造数据进行混合,以获得多个训练集,利用多个训练集分别训练获得多个识别模型,并将原始测试数据作为测试集对多个识别模型进行测试,以确定多个识别模型中是否存在满足收敛条件的模型,若多个识别模型中存在满足收敛条件的模型,则输出满足收敛条件的模型所对应的优化参数列表,作为目标数据增强参数列表,利用目标数据增强参数列表对原始训练数据进行数据增强,以获得命名实体识别模型的训练集;本申请中,采用了适合离散值和连续值共存情形的人工鱼群算法随机初始化原参数列表中的增强参数,并把识别模型的识别效果作为优化目标融合到了数据增强策略的制定中,以较小的代价得到一个效果较好的数据增强列表,从而保证了命名实体识别模型训练集的数据多样性,扩大了训练集的规模,进而提升命名实体识别模型的识别准确性。In this embodiment, the manually labeled original training data and original test data are obtained, and the original parameter list is obtained; the original parameter list is composed of data enhancement methods and the enhancement parameters corresponding to the data enhancement methods. The enhancement parameters in the original parameter list are randomly initialized according to the artificial fish swarm algorithm to obtain multiple optimized parameter lists; each optimized parameter list is used to transform the original training data to obtain the corresponding artificially constructed data, and the original training data is mixed with the corresponding artificially constructed data to obtain multiple training sets. Multiple recognition models are obtained by training on the multiple training sets respectively, and the original test data is used as the test set to test the multiple recognition models, so as to determine whether there is a model among them that satisfies the convergence condition. If such a model exists, the optimized parameter list corresponding to the model that satisfies the convergence condition is output as the target data enhancement parameter list, and the target data enhancement parameter list is used to perform data enhancement on the original training data to obtain the training set of the named entity recognition model. In this application, an artificial fish swarm algorithm suitable for the coexistence of discrete values and continuous values is used to randomly initialize the enhancement parameters in the original parameter list, and the recognition effect of the recognition model is integrated into the formulation of the data enhancement strategy as the optimization target, so that a data enhancement list with a better effect is obtained at a small cost, thereby ensuring the data diversity of the training set of the named entity recognition model, expanding the scale of the training set, and improving the recognition accuracy of the named entity recognition model.
此外,由于目标数据增强参数列表中各个数据增强方法的增强参数为自动寻优得到,本实施例中可以支持数据增强方法的扩展,根据用户的需求获得不同的数据增强列表,从而可以构造出更多的模型训练数据,进一步保证了模型的精度。In addition, since the enhancement parameters of each data enhancement method in the target data enhancement parameter list are obtained by automatic optimization, this embodiment can support the extension of the data enhancement methods and obtain different data enhancement lists according to users' needs, so that more model training data can be constructed, further ensuring the accuracy of the model.
在一实施例中,如图3所示,步骤S50之后,即根据测试结果确定多个识别模型中是否存在满足收敛条件的模型之后,所述方法还具体包括如下步骤:In one embodiment, as shown in FIG. 3, after step S50, that is, after determining according to the test result whether there is a model that satisfies the convergence condition among the multiple recognition models, the method further includes the following steps:
S80:若多个识别模型中不存在满足收敛条件的模型,则再次根据人工鱼群算法对原参数列表中的增强参数进行随机初始化,以获得随机初始化后的多个优化参数列表,并进行计数。S80: If there is no model that satisfies the convergence condition among the multiple identification models, randomly initialize the enhanced parameters in the original parameter list again according to the artificial fish swarm algorithm to obtain multiple optimized parameter lists after random initialization, and count them .
在确定多个识别模型中是否存在满足收敛条件的模型之后,若多个识别模型中不存在满足收敛条件的模型,表示多个识别模型的识别效果均没有满足用户要求,本次随机初始化得到的优化参数列表对原始训练数据的增强效果不足。此时,再次根据人工鱼群算法对原参数列表中的增强参数进行随机初始化,以根据随机初始化后的优化参数列表训练获得多个识别模型,并对多个识别模型进行测试,直至获得目标数据增强参数列表。同时,再次根据人工鱼群算法对原参数列表中的增强参数进行随机初始化时,需要对重复随机初始化原参数列表中增强参数的次数进行计数。After determining whether there is a model that satisfies the convergence condition among the multiple recognition models, if no such model exists, it means that the recognition effects of the multiple recognition models all fail to meet the user's requirements, and the optimized parameter lists obtained by this round of random initialization do not sufficiently enhance the original training data. At this point, the enhancement parameters in the original parameter list are randomly initialized again according to the artificial fish swarm algorithm, so that multiple recognition models are trained according to the newly initialized optimized parameter lists and tested, until the target data enhancement parameter list is obtained. Meanwhile, each time the enhancement parameters in the original parameter list are randomly initialized again according to the artificial fish swarm algorithm, the number of times the random initialization has been repeated needs to be counted.
S90:确定对原参数列表的增强参数进行随机初始化的次数是否小于预设次数。S90: Determine whether the number of random initializations for the enhancement parameters of the original parameter list is less than a preset number of times.
S100:若对原参数列表的增强参数进行随机初始化的次数不小于预设次数,则停止对原参数列表中的增强参数进行随机初始化。S100: If the number of random initializations of the enhanced parameters in the original parameter list is not less than the preset number of times, stop random initialization of the enhanced parameters in the original parameter list.
在确定对原参数列表的增强参数进行随机初始化的次数是否小于预设次数之后,若对原参数列表的增强参数进行随机初始化的次数不小于预设次数,表示迭代次数过多,为减少计算负担,需要停止对原参数列表中的增强参数进行随机初始化;此时,可以输出最接近收敛条件的模型所对应的优化参数列表,作为目标数据增强参数列表,进而利用目标数据增强参数列表对原始训练数据进行数据增强,以获得命名实体识别模型的训练集。After determining whether the number of random initializations of the enhancement parameters of the original parameter list is less than the preset number of times, if the number is not less than the preset number, it means that there have been too many iterations; in order to reduce the computational burden, the random initialization of the enhancement parameters in the original parameter list should be stopped. At this point, the optimized parameter list corresponding to the model closest to the convergence condition can be output as the target data enhancement parameter list, and the target data enhancement parameter list is then used to perform data enhancement on the original training data to obtain the training set of the named entity recognition model.
S110:若对原参数列表的增强参数进行随机初始化的次数小于预设次数,则重复执行步骤S30-步骤S70。S110: If the number of random initializations for the enhanced parameters of the original parameter list is less than the preset number of times, repeat steps S30 to S70.
在确定对原参数列表的增强参数进行随机初始化的次数是否小于预设次数之后,若对原参数列表的增强参数进行随机初始化的次数小于预设次数,此时还未确定目标数据增强参数列表,则需要重复执行上述步骤S30-S70,即需要根据随机初始化后的优化参数列表重新训练获得多个新的识别模型,并对多个新的识别模型进行测试,以获得目标数据增强参数列表,并利用目标数据增强参数列表获得命名实体识别模型的训练集。After determining whether the number of random initializations of the enhancement parameters of the original parameter list is less than the preset number of times, if the number of random initializations of the enhancement parameters of the original parameter list is less than the preset number of times, the target data enhancement parameter list has not yet been determined at this time, Then the above steps S30-S70 need to be repeatedly performed, that is, multiple new recognition models need to be retrained according to the randomly initialized optimization parameter list, and multiple new recognition models are tested to obtain the target data enhancement parameter list, and The training set of the named entity recognition model is obtained by augmenting the parameter list with the target data.
本实施例中,在确定多个识别模型中是否存在满足收敛条件的模型之后,若多个识别模型中不存在满足收敛条件的模型,则再次根据人工鱼群算法对原参数列表中的增强参数进行随机初始化,以获得随机初始化后的多个优化参数列表,并进行计数;确定对原参数列表的增强参数进行随机初始化的次数是否小于预设次数,若小于预设次数,则重复执行步骤S30-步骤S70,进一步明确了当无识别模型收敛时需要执行的操作。通过多次采用人工鱼群算法,并以多个识别模型的识别效果作为目标对原参数列表的参数进行寻优,以获得用户满意的优化参数列表,保证了优化参数列表的参数性能,进而保证了对数据的增强效果。In this embodiment, after determining whether there is a model that satisfies the convergence condition among the multiple recognition models, if no such model exists, the enhancement parameters in the original parameter list are randomly initialized again according to the artificial fish swarm algorithm to obtain multiple optimized parameter lists after random initialization, and the initializations are counted; it is then determined whether the number of random initializations of the enhancement parameters of the original parameter list is less than the preset number of times, and if so, steps S30 to S70 are repeated, which further clarifies the operations to be performed when no recognition model converges. By applying the artificial fish swarm algorithm multiple times and optimizing the parameters of the original parameter list with the recognition effects of the multiple recognition models as the objective, an optimized parameter list satisfactory to the user is obtained, which ensures the parameter performance of the optimized parameter list and thus the enhancement effect on the data.
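步骤S30–S70与上述重复初始化逻辑构成的外层循环可以示意如下(假设性草图:evaluate、reinit等函数名与收敛条件的形式均为假设,非原文实现)。The outer loop formed by steps S30–S70 and the re-initialization logic above can be sketched as follows (a hypothetical sketch; the names evaluate/reinit and the form of the convergence test are assumptions, not the original implementation):

```python
def optimize_augmentation(init_lists, evaluate, reinit, target_score, max_rounds):
    """外层寻优循环示意:逐一评估每组优化参数列表对应识别模型的得分,
    满足收敛条件则立即输出该列表作为目标数据增强参数列表;
    否则重新随机初始化,直到达到预设次数,返回迄今最接近收敛的列表。"""
    best_params, best_score = None, float("-inf")
    lists = init_lists
    for _ in range(max_rounds):
        for params in lists:
            score = evaluate(params)       # 训练识别模型并在测试集上评估
            if score > best_score:
                best_params, best_score = params, score
            if score >= target_score:      # 收敛条件
                return params, score
        lists = reinit()                   # 再次随机初始化增强参数
    return best_params, best_score         # 达到预设次数,输出最接近收敛的列表
```

即便始终未达到收敛条件,循环也会在预设次数后终止,并输出迄今得分最高(最接近收敛条件)的优化参数列表,与上文描述一致。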
在一实施例中,数据增强方法包括同义词替换方法,如图4所示,步骤S30中,即利用每一优化参数列表对原始训练数据进行转换,具体包括如下步骤:In one embodiment, the data enhancement method includes a synonym replacement method. As shown in FIG. 4 , in step S30, each optimization parameter list is used to convert the original training data, which specifically includes the following steps:
S31:在优化参数列表中确定同义词替换方法对应的增强参数,同义词替换方法对应的增强参数包括实体词类别替换概率和实体词替换类别。S31: Determine enhancement parameters corresponding to the synonym replacement method in the optimization parameter list, where the enhancement parameters corresponding to the synonym replacement method include entity word category replacement probability and entity word replacement category.
本实施例中,优化参数列表中的数据增强方法包括同义词替换方法,在优化参数列表中确定同义词替换方法所对应的增强参数,其中,同义词替换方法对应的增强参数包括实体词类别替换概率和实体词替换类别。In this embodiment, the data enhancement method in the optimization parameter list includes a synonym replacement method, and the enhancement parameter corresponding to the synonym replacement method is determined in the optimization parameter list, wherein the enhancement parameter corresponding to the synonym replacement method includes entity word category replacement probability and entity Word replacement category.
S32:获取用户根据需求预先构建的预设同义词词典,预设同义词词典中,将同一实体类别中未被禁止同义关系的实体词作为彼此的同义词。S32: Acquire a preset synonym dictionary pre-built by the user according to requirements. In the preset synonym dictionary, entity words in the same entity category whose synonymous relationship is not prohibited are used as synonyms for each other.
在利用每一优化参数列表中的数据增强方法和对应的增强参数,对原始训练数据进行转换之前,还需要获取预设同义词词典,作为转换原始训练数据中实体词的来源,其中,预设同义词词典为用户根据需求预先构建的、包括不同实体类别实体词的词典,在预设同义词词典中,将同一实体类别的实体词作为彼此的同义词,且在预设同义词词典中,还禁止了特定实体词之间的同义关系,被禁止同义关系的实体词不能作为彼此的同义词。Before converting the original training data by using the data enhancement method and the corresponding enhancement parameters in each optimization parameter list, it is also necessary to obtain a preset synonym dictionary as the source of the entity words in the original training data for conversion, wherein the preset synonyms The dictionary is a dictionary that is pre-built by users according to the needs and includes entity words of different entity categories. In the preset synonym dictionary, entity words of the same entity category are used as synonyms for each other, and in the preset synonym dictionary, specific entities are also prohibited. Synonymous relationship between words, entity words that are prohibited from synonymous relationship cannot be used as synonyms of each other.
本实施例中,通过放松同义词的判定条件,来提升预设同义词词典的实体词规模,将同一个实体类别的实体词作为同义词,即如果将句子中的词语A替换为词语B后得到的新句子,语义和语法上仍然合理,则词语B与词语A为同一实体类别,那么词语B就是词语A的一个同义词,将同一个类别的实体词集合,形成预设同义词词典。例如,孙悟空被压在五行山下,这个句子里,孙悟空可以被替换为如来佛祖、牛魔王等人名,则孙悟空、如来佛祖、牛魔王为彼此的同义词。In this embodiment, the scale of the entity words of the preset synonym dictionary is increased by relaxing the judgment conditions of synonyms, and the entity words of the same entity category are used as synonyms, that is, if the word A in the sentence is replaced by the word B, the new Sentence, semantics and grammar are still reasonable, then word B and word A are the same entity category, then word B is a synonym of word A, and the entity words of the same category are collected to form a preset synonym dictionary. For example, Sun Wukong is pressed under the Five Elements Mountain. In this sentence, Sun Wukong can be replaced by the names of Buddha, Bull Demon, etc., then Sun Wukong, Buddha Buddha and Bull Demon are synonyms for each other.
本实施例中,通过禁止特定词语之间的同义关系,来提升预设同义词词典的质量。在日常使用中,部分实体词虽属于同一个实体类别,但作为同义词替换到句子后,会导致句子语法改变,此时需要禁止双方的同义关系,即两个实体词不是同义词。例如,孙悟空被压在五行山下,这个句子里,若将五行山替换为黄河,则“孙悟空被压在黄河下”为病句,因此,五行山与黄河在预设同义词词典中被禁止同义关系,在进行同义词替换时,不能作为彼此的同义词进行替换。In this embodiment, the quality of the preset synonym dictionary is improved by prohibiting the synonymous relationship between specific words. In daily use, although some entity words belong to the same entity category, substituting one for another in a sentence changes the grammar of the sentence; in this case, the synonymous relationship between the two needs to be prohibited, that is, the two entity words are not synonyms. For example, in the sentence "Sun Wukong is pressed under the Five Elements Mountain", if the Five Elements Mountain is replaced by the Yellow River, the resulting sentence "Sun Wukong is pressed under the Yellow River" is ungrammatical. Therefore, the synonymous relationship between the Five Elements Mountain and the Yellow River is prohibited in the preset synonym dictionary, and when performing synonym replacement, they cannot be substituted for each other as synonyms.
本实施例中,上述以孙悟空被压在五行山下为句子,并以如来佛祖、牛魔王和黄河为实体词对同义词进行解释,仅为示例性说明,在其他实施例中,还可以以其他句子和实体词为例进行说明。In this embodiment, the above sentences are based on the Sun Wukong being pressed under the Five Elements Mountain, and the synonyms are explained by using Tathagata Buddha, Niu Demon King and the Yellow River as entity words, which are only exemplary descriptions. In other embodiments, other sentences can also be used. and entity words as an example.
其中,预设同义词词典中的同义词可以以表2中的形式存在,其中,表2包括四列,第一列为序号,第二列和第三列为不同的词语:词语A和词语B,第四列为词语A和词语B之间替换的关系,若词语B可以替换词语A,则表示词语A和词语B彼此互为同义词,若词语B不可以替换词语A,则表示词语A和词语B不是彼此的同义词。预设同义词词典的内容如下表2所示:Among them, the synonyms in the preset synonym dictionary can exist in the form of Table 2, wherein, Table 2 includes four columns, the first column is the serial number, the second column and the third column are different words: word A and word B, The fourth column is the replacement relationship between word A and word B. If word B can replace word A, it means that word A and word B are synonyms with each other. If word B cannot replace word A, it means word A and word A B are not synonyms for each other. The content of the preset synonym dictionary is shown in Table 2 below:
表2Table 2
Figure PCTCN2021096475-appb-000002
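As a minimal illustrative sketch (not part of the patent text), the preset synonym dictionary of Table 2 could be represented as word pairs carrying a replaceable flag, so that prohibited synonymous relationships such as the one between 五行山 (Five Elements Mountain) and 黄河 (Yellow River) are excluded during replacement. The class name and the example entries below are assumptions made for illustration:

```python
class SynonymDictionary:
    """Sketch of the preset synonym dictionary of Table 2: each entry
    records whether word B may replace word A; a prohibited synonymous
    relationship is stored with replaceable=False."""

    def __init__(self):
        self._entries = {}  # (word_a, word_b) -> replaceable flag

    def add(self, word_a, word_b, replaceable=True):
        # Synonymy is symmetric, so store both directions.
        self._entries[(word_a, word_b)] = replaceable
        self._entries[(word_b, word_a)] = replaceable

    def prohibit(self, word_a, word_b):
        """Prohibit the synonymous relationship between two entity words."""
        self.add(word_a, word_b, replaceable=False)

    def is_replaceable(self, word_a, word_b):
        return self._entries.get((word_a, word_b), False)

    def synonyms(self, word):
        """All words still allowed to replace `word`."""
        return [b for (a, b), ok in self._entries.items() if a == word and ok]


dictionary = SynonymDictionary()
dictionary.add("孙悟空", "牛魔王")     # same entity category, replacement allowed
dictionary.prohibit("五行山", "黄河")  # same category, but substitution breaks grammar
```

A lookup such as `dictionary.synonyms("孙悟空")` then returns only the candidates whose synonymous relationship has not been prohibited.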
S33:根据预设同义词词典、实体词类别替换概率和实体词替换类别,对原始训练数据中的实体词进行同义词替换。S33: Perform synonym replacement on entity words in the original training data according to the preset synonym dictionary, entity word category replacement probability, and entity word replacement category.
在获得预设同义词词典、实体词类别替换概率和实体词替换类别之后,根据预设同义词词典、实体词类别替换概率和实体词替换类别,对原始训练数据中的实体词进行同义词替换,获得进行同义词替换之后的数据,进而根据优化参数列表中其他数据增强方法及对应的增强参数,对进行同义词替换之后的数据进行数据处理,以获得人工构造数据。其中,实体词类别替换概率即为实体词替换类别的替换概率,在优化参数列表中,各个实体词类别被替换的概率分布为p_syn=[p_(syn1),p_(syn2),…,p_(synK)],基于预设同义词词典,以p_(syn,k)的概率,将原始训练数据中的k类实体词,替换成预设同义词词典的同义词。After obtaining the preset synonym dictionary, the entity word category replacement probability and the entity word replacement category, synonym replacement is performed on the entity words in the original training data accordingly to obtain the data after synonym replacement; this data is then processed according to the other data enhancement methods and corresponding enhancement parameters in the optimization parameter list to obtain the artificially constructed data. The entity word category replacement probability is the replacement probability of each entity word replacement category: in the optimization parameter list, the probability distribution of each entity word category being replaced is p_syn=[p_(syn1), p_(syn2), …, p_(synK)], and based on the preset synonym dictionary, the k-th category of entity words in the original training data is replaced with synonyms from the preset synonym dictionary with probability p_(syn,k).
本实施例中,通过在优化参数列表中确定同义词替换方法对应的增强参数,同义词替换方法对应的增强参数包括实体词类别替换概率和实体词替换类别,获取用户根据需求预先构建的预设同义词词典,预设同义词词典中,将同一实体类别中未被禁止同义关系的实体词作为彼此的同义词,根据预设同义词词典、实体词类别替换概率和实体词替换类别,对原始训练数据中的实体词进行同义词替换,细化了利用每一优化参数列表对原始训练数据进行转换的步骤,通过放松实体词同义词的判定条件,扩大预设同义词词典规模,提升了人工构造数据的多样性,并通过建设基于同义关系禁止的机制,持续提升预设同义词词典的质量,从而保证了人工构造数据的质量。In this embodiment, the enhancement parameters corresponding to the synonym replacement method, namely the entity word category replacement probability and the entity word replacement category, are determined in the optimization parameter list, and a preset synonym dictionary pre-built by the user according to requirements is obtained; in the preset synonym dictionary, entity words in the same entity category whose synonymous relationship is not prohibited are regarded as synonyms of each other, and synonym replacement is performed on the entity words in the original training data according to the preset synonym dictionary, the entity word category replacement probability and the entity word replacement category. This refines the step of converting the original training data with each optimization parameter list; by relaxing the criteria for judging entity word synonyms, the scale of the preset synonym dictionary is expanded and the diversity of the artificially constructed data is improved, and by building a mechanism based on prohibiting synonymous relationships, the quality of the preset synonym dictionary is continuously improved, thereby ensuring the quality of the artificially constructed data.
在一实施例中,如图5所示,步骤S33中,即根据预设同义词词典、实体词类别替换概率和实体词替换类别,对原始训练数据中的实体词进行同义词替换,具体包括如下步骤:In one embodiment, as shown in FIG. 5, in step S33, synonym replacement is performed on the entity words in the original training data according to the preset synonym dictionary, the entity word category replacement probability and the entity word replacement category, which specifically includes the following steps:
S331:确定原始训练数据中各实体词的类别是否属于实体词替换类别。S331: Determine whether the category of each entity word in the original training data belongs to the entity word replacement category.
在获得预设同义词词典、实体词类别替换概率和实体词替换类别之后,需要确定原始训练数据中各实体词的类别,以确定原始训练数据中各实体词是否属于实体词替换类别。After obtaining the preset synonym dictionary, the entity word category replacement probability and the entity word replacement category, the category of each entity word in the original training data needs to be determined, so as to determine whether each entity word in the original training data belongs to the entity word replacement category.
S332:若原始训练数据中实体词的类别属于实体词替换类别,则在预设同义词词典中查找实体词的同义词。S332: If the category of the entity word in the original training data belongs to the entity word replacement category, search for the synonym of the entity word in the preset synonym dictionary.
在确定原始训练数据中各实体词是否属于实体词替换类别之后,若原始训练数据中一实体词的类别属于实体词替换类别,表示需要对原始训练数据的该实体词进行同义词替换,则需要在预设同义词词典中查找该实体词所有的同义词,以便后续进行替换。After determining whether each entity word in the original training data belongs to the entity word replacement category, if the category of an entity word in the original training data belongs to the entity word replacement category, it means that synonym replacement needs to be performed on that entity word, and all synonyms of the entity word need to be looked up in the preset synonym dictionary for subsequent replacement.
S333:确定实体词与实体词的同义词之间是否被禁止同义关系。S333: Determine whether the synonymous relationship between the entity word and the synonym of the entity word is prohibited.
在预设同义词词典中查找到该实体词的同义词之后,确定该实体词与各同义词之间是否被禁止同义关系。After the synonyms of the entity word are found in the preset synonym dictionary, it is determined whether the synonymous relationship between the entity word and each synonym is prohibited.
S334:若实体词与实体词的同义词之间未被禁止同义关系,则以实体词类别替换概率从预设同义词词典中选择一同义词作为替换词,以将实体词替换为替换词。S334: If the synonymous relationship between the entity word and the synonym of the entity word is not prohibited, select a synonym from the preset synonym dictionary as the replacement word with the entity word category replacement probability to replace the entity word with the replacement word.
在确定实体词与实体词的同义词之间是否被禁止同义关系之后,若实体词与实体词的同义词之间未被禁止同义关系,则以实体词类别替换概率将实体词替换为对应的同义词。After determining whether the synonymous relationship between the entity word and its synonym is prohibited, if the synonymous relationship is not prohibited, the entity word is replaced with the corresponding synonym with the entity word category replacement probability.
S335:若实体词与实体词的同义词之间被禁止同义关系,则不将该同义词作为实体词的替换词。S335: If the synonymous relationship between the entity word and the synonym of the entity word is prohibited, the synonym is not used as a replacement word for the entity word.
在确定实体词与对应的同义词之间是否被禁止同义关系之后,若实体词与实体词的同义词之间被禁止同义关系,则跳过该同义词,即不将该同义词作为实体词的替换词。After determining whether the synonymous relationship between the entity word and the corresponding synonym is prohibited, if the synonymous relationship is prohibited, the synonym is skipped, that is, the synonym is not used as a replacement word for the entity word.
例如,实体词替换类别包括人名、地名、机构名3类,实体词类别替换概率为p_syn=[0.30,0.60,0.10],即根据同义词替换方法中,原训练数据中的人名被替换的概率为0.30,地名被替换的概率为0.6,机构名被替换的概率为0.1,若预设同义词词典内该人名的同义词均未被禁止同义关系,则原训练数据中一句话中的每一个人名,都有30%的概率被替换为该人名在预设同义词词典内的同义词;若预设同义词词典内该人名的某一同义词被禁止同义关系,则跳过该同义词不进行替换,使用其他同义词替换掉该人名。For example, the entity word replacement categories include three categories: person names, place names and organization names, and the entity word category replacement probability is p_syn=[0.30, 0.60, 0.10]; that is, under the synonym replacement method, a person name in the original training data is replaced with probability 0.30, a place name with probability 0.60, and an organization name with probability 0.10. If none of the synonyms of a person name in the preset synonym dictionary has a prohibited synonymous relationship, each person name in a sentence of the original training data has a 30% probability of being replaced with one of its synonyms in the preset synonym dictionary; if a certain synonym of the person name is prohibited, that synonym is skipped and another synonym is used to replace the person name.
本实施例中,实体词替换类别包括人名、地名、机构名3类,实体词类别替换概率为p_syn=[0.30,0.60,0.10],仅为示例性说明,在其他实施例中,实体词替换类别和实体词类别替换概率还可以是其他。In this embodiment, the entity word replacement categories (person names, place names and organization names) and the entity word category replacement probability p_syn=[0.30, 0.60, 0.10] are only exemplary illustrations; in other embodiments, the entity word replacement categories and the entity word category replacement probabilities may be different.
S336:若原始训练数据中各实体词的类别不属于实体词替换类别,则不进行同义词替换。S336: If the category of each entity word in the original training data does not belong to the category of entity word replacement, no synonym replacement is performed.
在确定原始训练数据中各实体词是否属于实体词替换类别之后,若原始训练数据中某实体词的类别不属于实体词替换类别,表示不需要对原始训练数据的该实体词进行同义词替换,可以执行优化参数列表中的其他数据增强方法。After determining whether each entity word in the original training data belongs to the entity word replacement category, if the category of an entity word does not belong to the entity word replacement category, it means that no synonym replacement needs to be performed on that entity word, and the other data enhancement methods in the optimization parameter list can be executed.
本实施例中,通过确定原始训练数据中各实体词的类别是否属于实体词替换类别,若原始训练数据中实体词的类别属于实体词替换类别,则查找实体词在预设同义词词典中对应的同义词,确定实体词与对应的同义词之间是否被禁止同义关系,若实体词与对应的同义词之间未被禁止同义关系,则将实体词,以实体词类别替换概率替换为对应的同义词,明确了根据预设同义词词典、实体词类别替换概率和实体词替换类别,对原始训练数据中的实体词进行同义词替换的步骤,为人工构造数据的获取提供了基础。In this embodiment, it is determined whether the category of each entity word in the original training data belongs to the entity word replacement category; if it does, the corresponding synonyms of the entity word are looked up in the preset synonym dictionary, and it is determined whether the synonymous relationship between the entity word and the corresponding synonym is prohibited; if the synonymous relationship is not prohibited, the entity word is replaced with the corresponding synonym with the entity word category replacement probability. This clarifies the step of performing synonym replacement on the entity words in the original training data according to the preset synonym dictionary, the entity word category replacement probability and the entity word replacement category, and provides a basis for obtaining the artificially constructed data.
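Steps S331 to S336 above can be sketched as follows. This is an illustrative outline under assumed interfaces: the function name, the shape of the `p_syn` dictionary, and the callback signatures for dictionary lookup and prohibition checking are not taken from the patent text.

```python
import random


def replace_entity_words(words, categories, replace_categories, p_syn,
                         synonym_lookup, is_prohibited, rng=None):
    """Sketch of steps S331-S336: per-category probabilistic synonym
    replacement with prohibited synonymous relationships skipped.

    words / categories : parallel lists of entity words and their categories
    replace_categories : the entity word replacement categories (S331)
    p_syn              : category -> replacement probability
    synonym_lookup     : word -> candidate synonyms in the dictionary (S332)
    is_prohibited      : (word, synonym) -> True if the pair is banned (S333)
    """
    rng = rng or random.Random()
    result = []
    for word, category in zip(words, categories):
        if category not in replace_categories:       # S336: category not selected
            result.append(word)
            continue
        if rng.random() < p_syn.get(category, 0.0):  # S334: replace with p_syn,k
            candidates = [s for s in synonym_lookup(word)
                          if not is_prohibited(word, s)]  # S335: skip banned pairs
            if candidates:
                result.append(rng.choice(candidates))
                continue
        result.append(word)
    return result
```

With `p_syn = {"人名": 0.30}`, each person name passes the category check and is then replaced by an allowed synonym with probability 0.30, matching the worked example above.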
在一实施例中,数据增强方法还包括随机替换方法、随机删除方法、随机交换方法和构造长句方法,如图6所示,步骤S33之后,即对原始训练数据中的实体词进行同义词替换之后,所述方法还具体包括如下步骤:In one embodiment, the data enhancement methods further include a random replacement method, a random deletion method, a random exchange method and a long sentence construction method. As shown in FIG. 6, after step S33, that is, after synonym replacement is performed on the entity words in the original training data, the method further specifically includes the following steps:
S34:在优化参数列表中,确定随机替换方法的随机替换概率,并确定随机删除方法的随机删除概率。S34: In the optimization parameter list, determine the random replacement probability of the random replacement method, and determine the random deletion probability of the random deletion method.
本实施例中,数据增强方法还包括随机替换方法和随机删除方法,需要在优化参数列表中,确定随机替换方法的随机替换概率,并确定随机删除方法的随机删除概率,以根据随机替换概率、随机删除概率对原始训练数据进行转换处理。In this embodiment, the data enhancement methods further include a random replacement method and a random deletion method; the random replacement probability of the random replacement method and the random deletion probability of the random deletion method need to be determined in the optimization parameter list, so that the original training data can be transformed according to the random replacement probability and the random deletion probability.
S35:确定随机交换方法的随机交换概率,并确定构造长句方法所设定的句长。S35: Determine the random exchange probability of the random exchange method, and determine the sentence length set by the method of constructing a long sentence.
本实施例中,数据增强方法还包括随机交换方法和构造长句方法,在优化参数列表中,还需要确定随机交换方法的随机交换概率,并确定构造长句方法所设定的句长,以根据随机交换概率和构造长句方法所设定的句长对原始训练数据进行转换处理。In this embodiment, the data enhancement methods further include a random exchange method and a long sentence construction method; in the optimization parameter list, the random exchange probability of the random exchange method and the sentence length set by the long sentence construction method also need to be determined, so that the original training data can be transformed according to the random exchange probability and the set sentence length.
S36:根据随机替换概率对原始训练数据中的每一句子进行实体词替换,并根据随机交换概率对原始训练数据中的每一句子进行同句实体词交换。S36: Perform entity word replacement for each sentence in the original training data according to the random replacement probability, and perform the same sentence entity word exchange for each sentence in the original training data according to the random exchange probability.
在确定随机替换方法的随机替换概率,并确定随机交换方法的随机交换概率之后,以随机替换概率对原始训练数据中的每一句子进行实体词替换,并根据随机交换概率对原始训练数据中的每一句子进行同句实体词交换。After the random replacement probability of the random replacement method and the random exchange probability of the random exchange method are determined, entity word replacement is performed on each sentence in the original training data with the random replacement probability, and same-sentence entity word exchange is performed on each sentence in the original training data with the random exchange probability.
例如,在优化参数列表中,随机替换方法的随机替换概率为β2,随机交换方法的随机交换概率为β3,在原始训练数据中每个句子的每一个token(实体词),有β2的概率被替换为词典(可以是预设同义词词典)中的任意一个其他token,其中,从词典中选择token的规则为:服从均匀随机分布、排除原始训练数据中其他待随机替换的token。同时,原始训练数据的每个句子中,第i个token和第j个token,有β3的概率进行位置交换。For example, in the optimization parameter list, the random replacement probability of the random replacement method is β2 and the random exchange probability of the random exchange method is β3. Each token (entity word) of each sentence in the original training data has a probability β2 of being replaced with any other token in the dictionary (which may be the preset synonym dictionary), where the rule for selecting a token from the dictionary is: obey a uniform random distribution and exclude the other tokens in the original training data that are to be randomly replaced. Meanwhile, in each sentence of the original training data, the i-th token and the j-th token exchange positions with probability β3.
S37:根据随机删除概率对原始训练数据中的每一句子进行实体词删除,以获得处理数据。S37: Perform entity word deletion on each sentence in the original training data according to the random deletion probability to obtain processing data.
在根据随机替换概率对原始训练数据中的每一句子进行实体词替换,并根据随机交换概率对原始训练数据中的每一句子进行同句实体词交换之后,根据随机删除概率对原始训练数据中的每一句子进行实体词删除,以获得处理数据。After entity word replacement is performed on each sentence in the original training data according to the random replacement probability and same-sentence entity word exchange is performed according to the random exchange probability, entity word deletion is performed on each sentence in the original training data according to the random deletion probability to obtain the processed data.
例如,在原始训练数据中,以β2的概率将每个句子的每一个token替换为词典中的任意一个其他token,然后在每个句子中,以β3的概率将第i个token和第j个token进行位置交换,再以β4的概率删除每个句子的每一个token,获得处理数据。For example, in the original training data, each token of each sentence is replaced with any other token in the dictionary with probability β2; then, in each sentence, the positions of the i-th token and the j-th token are exchanged with probability β3; finally, each token of each sentence is deleted with probability β4 to obtain the processed data.
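A minimal sketch of the replace → swap → delete pipeline in the example above. The function name is illustrative, and the swap step is simplified to one randomly chosen pair per sentence (the text applies β3 to each pair); it also assumes the dictionary contains more than one token.

```python
import random


def augment_sentence(tokens, vocab, beta2, beta3, beta4, rng=None):
    """Random replacement (beta2), random swap (beta3) and random
    deletion (beta4) on one tokenized sentence, as sketched above."""
    rng = rng or random.Random()
    # Replacement: each token is replaced with probability beta2 by a
    # uniformly chosen *different* token from the dictionary.
    tokens = [rng.choice([v for v in vocab if v != t])
              if rng.random() < beta2 else t
              for t in tokens]
    # Swap: with probability beta3, exchange the positions of one
    # randomly chosen pair (i, j) (simplified from the per-pair rule).
    if len(tokens) > 1 and rng.random() < beta3:
        i, j = rng.sample(range(len(tokens)), 2)
        tokens[i], tokens[j] = tokens[j], tokens[i]
    # Deletion: each token is dropped with probability beta4,
    # keeping at least one token so the sentence does not vanish.
    kept = [t for t in tokens if rng.random() >= beta4]
    return kept if kept else [rng.choice(tokens)]
```

Setting all three probabilities to zero leaves the sentence unchanged, which makes the transformation easy to unit-test.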
S38:对处理数据中每一句子进行拼接处理,以使处理完成后的句子长度为句长。S38: Perform splicing processing on each sentence in the processing data, so that the sentence length after the processing is completed is the sentence length.
在获得处理数据之后,对处理数据中每一句子进行拼接处理,以使处理完成后的句子长度为句长。After the processing data is obtained, splicing processing is performed on each sentence in the processing data, so that the sentence length after the processing is completed is the sentence length.
例如,构造长句方法所设定的句长为100,统计处理数据中每一句子的句子长度,得到句子长度的第90百分位数,将句子长度小于或者等于第90百分位数的句子进行两两配对,以拼接成一个较长的拼接句子(两个句子的顺序随机),然后删除拼接句子中长度超过100的部分,使得处理数据中每一句子的句子长度为句长100。For example, the sentence length set by the long sentence construction method is 100; the length of each sentence in the processed data is counted to obtain the 90th percentile of the sentence lengths, sentences whose length is less than or equal to the 90th percentile are paired two by two and spliced into a longer sentence (the order of the two sentences is random), and the part of the spliced sentence exceeding length 100 is deleted, so that each sentence in the processed data has the set sentence length of 100.
本实施例中,构造长句方法所设定的句长为100、将句子长度小于或者等于第90百分位数的句子进行两两配对拼接仅为示例性说明,在其他实施例中,构造长句方法所设定的句长还可以是其他数值,还可以将句子长度为其他百分位的句子进行两两配对拼接,在此不再赘述。In this embodiment, setting the sentence length to 100 and pairwise splicing of sentences whose length is less than or equal to the 90th percentile are only exemplary illustrations; in other embodiments, the sentence length set by the long sentence construction method may be another value, and sentences at other length percentiles may also be paired and spliced, which will not be repeated here.
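The long-sentence construction step in the example above could be sketched as follows; the index-based percentile computation and the shuffle-based pairing strategy are simplifying assumptions, not details given in the text.

```python
import random


def build_long_sentences(sentences, max_len=100, percentile=0.9, rng=None):
    """Pair up sentences no longer than the given length percentile,
    concatenate each pair in random order, and truncate to max_len."""
    rng = rng or random.Random()
    lengths = sorted(len(s) for s in sentences)
    # Index-based percentile of the sentence-length distribution.
    threshold = lengths[int(percentile * (len(lengths) - 1))]
    short = [s for s in sentences if len(s) <= threshold]
    rng.shuffle(short)                      # random pairing and random order
    spliced = []
    for a, b in zip(short[::2], short[1::2]):
        spliced.append((a + b)[:max_len])   # drop the part beyond max_len
    return spliced
```

Sentences above the percentile threshold are left out of the pairing, mirroring the rule that only the shorter sentences are spliced.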
本实施例中,在对原始训练数据中的实体词进行同义词替换之后,通过在优化参数列表中,确定随机替换方法的随机替换概率,并确定随机删除方法的随机删除概率,确定随机交换方法的随机交换概率,并确定构造长句方法所设定的句长,根据随机替换概率对原始训练数据中的每一句子进行实体词替换,并根据随机交换概率对原始训练数据中的每一句子进行同句实体词交换,根据随机删除概率对原始训练数据中的每一句子进行实体词删除,以获得处理数据,对处理数据中每一句子进行拼接处理,以使处理完成后的句子长度为句长,进一步细化了利用每一优化参数列表对原始训练数据进行转换的步骤,采用多种数据增强方法对原始训练数据进行转换,进一步增加了人工构造数据的多样性,保证了识别模型训练集的准确性。In this embodiment, after synonym replacement is performed on the entity words in the original training data, the random replacement probability of the random replacement method, the random deletion probability of the random deletion method, the random exchange probability of the random exchange method and the sentence length set by the long sentence construction method are determined in the optimization parameter list; entity word replacement is performed on each sentence in the original training data according to the random replacement probability, same-sentence entity word exchange is performed according to the random exchange probability, and entity word deletion is performed according to the random deletion probability to obtain the processed data, and each sentence in the processed data is spliced so that the processed sentences have the set sentence length. This further refines the step of converting the original training data with each optimization parameter list; converting the original training data with multiple data enhancement methods further increases the diversity of the artificially constructed data and ensures the accuracy of the training set of the recognition model.
在一实施例中,数据增强方法包括同义词替换方法,如图7所示,步骤S50中,即根据测试结果确定多个识别模型中是否存在收敛模型,具体包括如下步骤:In one embodiment, the data enhancement method includes a synonym replacement method. As shown in FIG. 7 , in step S50, it is determined according to the test result whether there is a convergence model in the multiple recognition models, which specifically includes the following steps:
S51:确定多个识别模型中对测试集中各个词进行识别的最高识别得分。S51: Determine the highest recognition score for each word in the test set in the multiple recognition models.
在将原始测试数据作为测试集对多个识别模型进行测试之后,确定多个识别模型中对测试集中各个词进行识别的最高识别得分。After the multiple recognition models are tested on the original test data as the test set, the highest recognition score of the multiple recognition models for recognizing each word in the test set is determined.
其中,识别模型对测试集中各个词进行识别的得分通过如下公式确定:Among them, the recognition score of each word in the test set by the recognition model is determined by the following formula:
Figure PCTCN2021096475-appb-000003
其中,score_t为识别模型对测试集中第t词进行识别的得分,recall为对实体词的召回率,precision为识别模型召回实体词的精度。Among them, score_t is the score of the recognition model for recognizing the t-th word in the test set, recall is the recall rate for entity words, and precision is the precision with which the recognition model recalls entity words.
例如,识别模型的数量为3,在将原始测试数据作为测试集对A、B、C三个识别模型进行测试之后,A、B、C三个识别模型对测试集中第t词的识别得分,分别为0.6、0.8和0.9,则A、B、C三个识别模型中,对测试集中第t词进行识别的最高识别得分为0.9。For example, if the number of recognition models is 3, after using the original test data as the test set to test the three recognition models A, B, and C, the recognition scores of the three recognition models A, B, and C for the t-th word in the test set, are 0.6, 0.8 and 0.9 respectively, then among the three recognition models A, B, and C, the highest recognition score for recognizing the t-th word in the test set is 0.9.
本实施例中,识别模型的数量为3,对测试集中第t词的识别得分分别为0.6、0.8和0.9仅为示例性说明,在其他实施例中,识别模型的数量还可以是其他数值,对测试集中第t词的识别得分还可以是其他数值,在此不再赘述。In this embodiment, the number of recognition models is 3, and the recognition scores for the t-th word in the test set are 0.6, 0.8, and 0.9, respectively, for exemplary illustration. In other embodiments, the number of recognition models may also be other numerical values. The recognition score for the t-th word in the test set may also be other numerical values, which will not be repeated here.
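The per-word score in the formula above combines recall and precision; the sketch below assumes the standard F1 (harmonic-mean) form, which is a common choice for this combination but is an assumption here, together with the max-over-models selection from the worked example:

```python
def recognition_score(recall, precision):
    """Per-word recognition score from recall and precision; the standard
    F1 (harmonic mean) form is assumed here for illustration."""
    if recall + precision == 0:
        return 0.0
    return 2 * recall * precision / (recall + precision)


def highest_recognition_score(model_scores):
    """Highest recognition score for one word across the candidate
    recognition models, e.g. the maximum of [0.6, 0.8, 0.9] is 0.9
    as in the example above."""
    return max(model_scores)
```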
S52:确定最高识别得分是否满足收敛条件。S52: Determine whether the highest recognition score satisfies the convergence condition.
在确定多个识别模型中对测试集中各个词进行识别的最高识别得分之后,确定多个识别模型中对测试集中各个词进行识别的最高识别得分是否满足收敛条件。After determining the highest recognition score for recognizing each word in the test set in the multiple recognition models, it is determined whether the highest recognition score for recognizing each word in the test set in the multiple recognition models satisfies the convergence condition.
S53:若最高识别得分满足收敛条件,则确定多个识别模型中存在满足收敛条件的收敛模型,最高识别得分对应的识别模型为收敛模型。S53: If the highest recognition score satisfies the convergence condition, it is determined that there is a convergence model satisfying the convergence condition among the plurality of recognition models, and the recognition model corresponding to the highest recognition score is the convergence model.
在确定最高识别得分是否满足收敛条件之后,若最高识别得分满足收敛条件,表示已有识别模型的识别效果满足要求,则确定多个识别模型中存在满足收敛条件的收敛模型,最高识别得分对应的识别模型为收敛模型,收敛模型对应的优化参数列表可以作为目标数据增强参数列表。After determining whether the highest recognition score satisfies the convergence condition, if the highest recognition score satisfies the convergence condition, it means that the recognition effect of an existing recognition model meets the requirements; it is then determined that a convergence model satisfying the convergence condition exists among the multiple recognition models, the recognition model corresponding to the highest recognition score is the convergence model, and the optimization parameter list corresponding to the convergence model can be used as the target data enhancement parameter list.
S54:若最高识别得分不满足收敛条件,则确定多个识别模型中不存在满足收敛条件的收敛模型。S54: If the highest recognition score does not satisfy the convergence condition, determine that there is no convergence model satisfying the convergence condition among the multiple recognition models.
在确定最高识别得分是否满足收敛条件之后,若最高识别得分不满足收敛条件,表示没有识别模型的识别效果满足要求,则确定多个识别模型中不存在满足收敛条件的收敛模型,本轮的优化参数列表不可用,需要利用人工鱼群算法进行重新迭代优化。After determining whether the highest recognition score satisfies the convergence condition, if the highest recognition score does not satisfy the convergence condition, it means that no recognition model achieves the required recognition effect; it is then determined that no convergence model satisfying the convergence condition exists among the multiple recognition models, the optimization parameter lists of this round are unusable, and iterative optimization needs to be performed again with the artificial fish swarm algorithm.
本实施例中,通过确定多个识别模型中对测试集中各个词进行识别的最高识别得分,确定最高识别得分是否满足收敛条件,若最高识别得分满足收敛条件,则确定多个识别模型中存在满足收敛条件的收敛模型,最高识别得分对应的识别模型为收敛模型,若最高识别得分不满足收敛条件,则确定多个识别模型中不存在满足收敛条件的收敛模型,明确了确定多个识别模型中是否存在收敛模型的判断过程,以识别模型对测试集的识别效果作为人工鱼群算法的浓度,将识别模型对测试集的识别效果作为数据增强模型参数优化的目标,以较小的代价得到一个效果较好的数据增强策略。In this embodiment, the highest recognition score for recognizing each word in the test set among the multiple recognition models is determined, and it is checked whether the highest recognition score satisfies the convergence condition; if it does, it is determined that a convergence model exists among the multiple recognition models and the recognition model corresponding to the highest recognition score is the convergence model, and if it does not, it is determined that no convergence model exists. This clarifies the process of judging whether a convergence model exists among the multiple recognition models: the recognition effect of the recognition models on the test set serves as the concentration in the artificial fish swarm algorithm and as the objective of the data enhancement parameter optimization, so that an effective data enhancement strategy can be obtained at a relatively small cost.
在一实施例中,数据增强方法包括同义词替换方法,如图8所示,步骤S52中,即确定最高识别得分是否满足收敛条件,具体包括如下步骤:In one embodiment, the data enhancement method includes a synonym replacement method. As shown in FIG. 8 , in step S52, it is determined whether the highest recognition score satisfies the convergence condition, which specifically includes the following steps:
S521:确定用户配置的收敛参数。S521: Determine the convergence parameters configured by the user.
S522:确定多个识别模型中对测试集中第t词进行识别的第一最高识别得分;S522: Determine the first highest recognition score for recognizing the t-th word in the test set among the multiple recognition models;
S523:确定多个识别模型中对测试集中第t-1词进行识别的第二最高识别得分;S523: Determine the second highest recognition score for recognizing the t-1th word in the test set among the multiple recognition models;
S524:将第一最高识别得分减去第二最高识别得分,得到最高识别得分差;S524: subtract the second highest recognition score from the first highest recognition score to obtain the highest recognition score difference;
S525:确定最高识别得分差与第二最高识别得分的比是否小于收敛参数;S525: Determine whether the ratio of the difference between the highest recognition score and the second highest recognition score is less than the convergence parameter;
S526:若最高识别得分差与第二最高识别得分的比小于收敛参数,则确定最高识别得分满足收敛条件;S526: If the ratio of the difference between the highest recognition score and the second highest recognition score is less than the convergence parameter, then determine that the highest recognition score satisfies the convergence condition;
S527:若最高识别得分差与第二最高识别得分的比不小于收敛参数,则确定最高识别得分不满足收敛条件。S527: If the ratio of the difference between the highest recognition score and the second highest recognition score is not less than the convergence parameter, determine that the highest recognition score does not satisfy the convergence condition.
在确定多个识别模型中对测试集中各个词进行识别的最高识别得分之后,通过如下公式确定多个识别模型的最高识别得分是否满足收敛条件:After determining the highest recognition score of each word in the test set among the multiple recognition models, determine whether the highest recognition score of the multiple recognition models satisfies the convergence condition by the following formula:
(maxscore_t − maxscore_{t-1}) / maxscore_{t-1} < α
其中,maxscore_t为多个识别模型中对测试集第t词的最高识别得分,即第一最高识别得分,maxscore_{t-1}为多个识别模型中对测试集第t-1词的最高识别得分,即第二最高识别得分,α为用户配置的收敛参数(可以为0.01)。Among them, maxscore_t is the highest recognition score of the t-th word in the test set among the multiple recognition models, that is, the first highest recognition score; maxscore_{t-1} is the highest recognition score of the (t-1)-th word in the test set, that is, the second highest recognition score; and α is a convergence parameter configured by the user (it can be 0.01).
在上述公式中,将第一最高识别得分maxscore_t与第二最高识别得分maxscore_{t-1}之间的最高识别得分差maxscore_t − maxscore_{t-1},除以第二最高识别得分,得到收敛值(maxscore_t − maxscore_{t-1}) / maxscore_{t-1};若该收敛值小于收敛参数α,则确定最高识别得分满足收敛条件;若该收敛值不小于收敛参数α,则确定最高识别得分不满足收敛条件。In the above formula, the highest recognition score difference maxscore_t − maxscore_{t-1} between the first highest recognition score maxscore_t and the second highest recognition score maxscore_{t-1} is divided by the second highest recognition score to obtain the convergence value (maxscore_t − maxscore_{t-1}) / maxscore_{t-1}. If this convergence value is less than the convergence parameter α, it is determined that the highest recognition score satisfies the convergence condition; if it is not less than α, it is determined that the highest recognition score does not satisfy the convergence condition.
本实施例中,确定用户配置的收敛参数,确定多个识别模型中对测试集中第t词进行识别的第一最高识别得分,确定多个识别模型中对测试集中第t-1词进行识别的第二最高识别得分,将第一最高识别得分减去第二最高识别得分,得到最高识别得分差,确定最高识别得分差与第二最高识别得分的比是否小于收敛参数,若最高识别得分差与第二最高识别得分的比小于收敛参数,则确定最高识别得分满足收敛条件;若最高识别得分差与第二最高识别得分的比不小于收敛参数,则确定最高识别得分不满足收敛条件,明确了确定最高识别得分是否满足收敛条件的具体过程,为根据最高识别得分确定模型是否收敛提供了判断基础。In this embodiment, the convergence parameter configured by the user is determined; the first highest recognition score for recognizing the t-th word in the test set and the second highest recognition score for recognizing the (t-1)-th word are determined among the multiple recognition models; the second highest recognition score is subtracted from the first highest recognition score to obtain the highest recognition score difference; and it is checked whether the ratio of the highest recognition score difference to the second highest recognition score is less than the convergence parameter. If the ratio is less than the convergence parameter, it is determined that the highest recognition score satisfies the convergence condition; if not, it is determined that the highest recognition score does not satisfy the convergence condition. This clarifies the specific process of determining whether the highest recognition score satisfies the convergence condition and provides a judgment basis for determining whether the model has converged according to the highest recognition score.
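The convergence test of steps S521–S527 follows directly from the formula above; the function name below is illustrative:

```python
def satisfies_convergence(maxscore_t, maxscore_prev, alpha=0.01):
    """Steps S524-S527: the highest recognition score difference divided
    by the second highest recognition score must be smaller than the
    user-configured convergence parameter alpha."""
    ratio = (maxscore_t - maxscore_prev) / maxscore_prev
    return ratio < alpha
```

For example, moving from 0.900 to 0.901 gives a relative improvement of about 0.11%, below the default α of 0.01, so the score is considered converged.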
应理解,上述实施例中各步骤的序号的大小并不意味着执行顺序的先后,各过程的执行顺序应以其功能和内在逻辑确定,而不应对本申请实施例的实施过程构成任何限定。It should be understood that the size of the sequence numbers of the steps in the above embodiments does not mean the sequence of execution, and the execution sequence of each process should be determined by its function and internal logic, and should not constitute any limitation to the implementation process of the embodiments of the present application.
在一实施例中,提供一种基于深度学习模型的数据增强装置,该基于深度学习模型的数据增强装置与上述实施例中基于深度学习模型的数据增强方法一一对应。如图9所示,该基于深度学习模型的数据增强装置包括获取模块901、初始化模块902、转换模块903、测试模块904、输出模块905和增强模块906。各功能模块详细说明如下:In one embodiment, a data enhancement apparatus based on a deep learning model is provided, and the data enhancement apparatus based on a deep learning model corresponds to the data enhancement method based on the deep learning model in the above-mentioned embodiment. As shown in FIG. 9 , the data enhancement device based on the deep learning model includes an acquisition module 901 , an initialization module 902 , a conversion module 903 , a test module 904 , an output module 905 and an enhancement module 906 . The detailed description of each functional module is as follows:
获取模块901,用于获取经过人工标注的原始训练数据和原始测试数据,并获取原参数列表,所述原参数列表由数据增强方法和所述数据增强方法对应的增强参数构成;The obtaining module 901 is used to obtain the manually annotated original training data and original test data, and to obtain the original parameter list, where the original parameter list consists of the data enhancement methods and the enhancement parameters corresponding to the data enhancement methods;
初始化模块902,用于根据人工鱼群算法对所述原参数列表中的增强参数进行随机初始化,以获得多个优化参数列表;An initialization module 902, configured to randomly initialize the enhanced parameters in the original parameter list according to the artificial fish swarm algorithm to obtain multiple optimized parameter lists;
转换模块903,用于利用每一所述优化参数列表对所述原始训练数据进行转换,以获得对应的人工构造数据,并将所述原始训练数据与所述对应的人工构造数据进行混合,以获得多个训练集;The conversion module 903 is configured to convert the original training data using each of the optimized parameter lists to obtain corresponding artificially constructed data, and to mix the original training data with the corresponding artificially constructed data to obtain multiple training sets;
测试模块904,用于利用所述多个训练集分别训练获得多个识别模型,并将所述原始测试数据作为测试集对所述多个识别模型进行测试,以确定所述多个识别模型中是否存在满足收敛条件的模型;The testing module 904 is configured to train multiple recognition models respectively using the multiple training sets, and to test the multiple recognition models using the original test data as a test set, so as to determine whether any of the multiple recognition models satisfies the convergence condition;
输出模块905,用于若所述多个识别模型中存在所述满足收敛条件的模型,则输出所述满足收敛条件的模型所对应的优化参数列表,作为目标数据增强参数列表;The output module 905 is configured to output the optimization parameter list corresponding to the model satisfying the convergence condition as the target data enhancement parameter list if there is a model satisfying the convergence condition in the plurality of identification models;
增强模块906,用于利用所述目标数据增强参数列表对所述原始训练数据进行数据增强,以获得命名实体识别模型的训练集。An enhancement module 906, configured to perform data enhancement on the original training data by using the target data enhancement parameter list to obtain a training set of a named entity recognition model.
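作为对初始化模块902的示意(假设性实现:以各增强参数取值区间内的均匀随机采样近似人工鱼群算法的初始位置生成,参数名与取值范围均为示例),随机初始化可表示如下。As a hypothetical sketch of the initialization module 902, uniform sampling within each augmentation parameter's range stands in for the artificial fish swarm algorithm's initial placement; parameter names and ranges below are illustrative assumptions:

```python
import random

def init_param_lists(param_space, n_lists, seed=0):
    """在各增强参数的取值区间内随机初始化,得到多个优化参数列表;
    完整的人工鱼群算法还会在后续迭代中更新这些候选位置。"""
    rng = random.Random(seed)
    return [
        {name: round(rng.uniform(lo, hi), 3)
         for name, (lo, hi) in param_space.items()}
        for _ in range(n_lists)
    ]

# 原参数列表:数据增强方法及其对应增强参数的取值范围(示例值)
param_space = {
    "synonym_replace_prob": (0.0, 0.5),  # 实体词类别替换概率
    "random_replace_prob":  (0.0, 0.3),  # 随机替换概率
    "random_delete_prob":   (0.0, 0.3),  # 随机删除概率
    "random_swap_prob":     (0.0, 0.3),  # 随机交换概率
    "target_sentence_len":  (20, 120),   # 构造长句方法设定的句长
}
param_lists = init_param_lists(param_space, n_lists=4)
```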
进一步地,所述基于深度学习模型的数据增强装置还包括循环模块907,所述确定所述多个识别模型中是否存在满足收敛条件的模型之后,所述循环模块907具体用于:Further, the data enhancement device based on the deep learning model further includes a loop module 907, after determining whether there is a model satisfying the convergence condition in the plurality of identification models, the loop module 907 is specifically used for:
若所述多个识别模型中不存在所述满足收敛条件的模型,则再次根据人工鱼群算法对所述原参数列表中的增强参数进行随机初始化,以获得所述随机初始化后的多个优化参数列表,并进行计数;If none of the multiple recognition models satisfies the convergence condition, the enhancement parameters in the original parameter list are randomly initialized again according to the artificial fish swarm algorithm to obtain multiple randomly re-initialized optimized parameter lists, and the initializations are counted;
确定对所述原参数列表中的增强参数进行随机初始化的次数是否小于预设次数;Determine whether the number of random initializations of the enhanced parameters in the original parameter list is less than a preset number of times;
若对所述原参数列表中的增强参数进行随机初始化的次数不小于所述预设次数,则停止对所述原参数列表中的增强参数进行所述随机初始化;If the number of random initializations for the enhanced parameters in the original parameter list is not less than the preset number of times, stop performing the random initialization on the enhanced parameters in the original parameter list;
若对所述原参数列表中的增强参数进行随机初始化的次数小于所述预设次数,则根据所述随机初始化后的优化参数列表训练获得多个新的识别模型,并对所述新的多个识别模型进行测试,以获得所述目标数据增强参数列表,并利用所述目标数据增强参数列表获得所述命名实体识别模型的训练集。If the number of times the enhancement parameters in the original parameter list have been randomly initialized is less than the preset number, multiple new recognition models are trained according to the randomly re-initialized optimized parameter lists, and the multiple new recognition models are tested to obtain the target data enhancement parameter list, which is then used to obtain the training set of the named entity recognition model.
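循环模块907的"重新初始化—训练—测试—计数"流程可示意如下(假设性实现:run_round 代表一轮完整的随机初始化、训练与测试,返回目标数据增强参数列表或 None)。A hypothetical sketch of loop module 907, where `run_round` stands in for one full round of re-initialization, training, and testing:

```python
def search_with_retries(run_round, max_rounds):
    """每轮重新随机初始化、训练并测试;得到满足收敛条件的参数列表即返回;
    否则计数,随机初始化次数达到预设次数后停止。"""
    for count in range(1, max_rounds + 1):
        result = run_round(count)      # 一轮:随机初始化 -> 训练 -> 测试
        if result is not None:
            return result, count       # 目标数据增强参数列表与所用轮数
    return None, max_rounds            # 达到预设次数,停止随机初始化

# 示例:假设第 3 轮才出现满足收敛条件的模型
target, rounds = search_with_retries(
    lambda n: {"synonym_replace_prob": 0.2} if n == 3 else None,
    max_rounds=5)
```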
进一步地,所述数据增强方法包括同义词替换方法,所述转换模块903具体用于:Further, the data enhancement method includes a synonym replacement method, and the conversion module 903 is specifically used for:
在所述优化参数列表中确定所述同义词替换方法对应的增强参数,所述同义词替换方法对应的增强参数包括实体词类别替换概率和实体词替换类别;Determine the enhancement parameters corresponding to the synonym replacement method in the optimization parameter list, and the enhancement parameters corresponding to the synonym replacement method include entity word category replacement probability and entity word replacement category;
获取用户根据需求预先构建的预设同义词词典,所述预设同义词词典中,将同一实体类别中未被禁止同义关系的实体词作为彼此的同义词;A preset synonym dictionary pre-built by the user according to requirements is obtained, where in the preset synonym dictionary, entity words in the same entity category between which the synonymous relationship is not prohibited are treated as synonyms of each other;
根据所述预设同义词词典、所述实体词类别替换概率和所述实体词替换类别,对所述原始训练数据中的实体词进行同义词替换。According to the preset synonym dictionary, the entity word category replacement probability and the entity word replacement category, synonym replacement is performed on the entity words in the original training data.
进一步地,所述转换模块903具体还用于:Further, the conversion module 903 is specifically also used for:
确定所述原始训练数据中各实体词的类别是否属于所述实体词替换类别;determining whether the category of each entity word in the original training data belongs to the entity word replacement category;
若所述原始训练数据中实体词的类别属于所述实体词替换类别,则在所述预设同义词词典中查找所述实体词的同义词;If the category of the entity word in the original training data belongs to the entity word replacement category, search for the synonym of the entity word in the preset synonym dictionary;
确定所述实体词与所述实体词的同义词之间是否被禁止同义关系;determining whether a synonymous relationship is prohibited between the entity word and a synonym of the entity word;
若所述实体词与所述实体词的同义词之间未被禁止同义关系,则以所述实体词类别替换概率,从所述预设同义词词典中选择一所述同义词作为替换词,以将所述实体词替换为所述替换词。If the synonymous relationship between the entity word and its synonym is not prohibited, one of the synonyms is selected from the preset synonym dictionary as a replacement word with the entity word category replacement probability, so as to replace the entity word with the replacement word.
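上述同义词替换步骤可示意如下(假设性实现:词典、类别标签与示例句均为说明用;预设同义词词典按实体类别组织,被禁止同义关系以词对集合表示)。A hypothetical sketch of the synonym-replacement steps above; the dictionary, labels, and example sentence are illustrative, and prohibited synonymous relations are modeled as a set of word pairs:

```python
import random

def synonym_replace(tokens, labels, synonym_dict, forbidden,
                    replace_categories, replace_prob, seed=0):
    """同义词替换示意:仅当实体词类别属于实体词替换类别、且与候选同义词
    之间未被禁止同义关系时,才以实体词类别替换概率进行替换。"""
    rng = random.Random(seed)
    out = []
    for word, label in zip(tokens, labels):
        if label in replace_categories and rng.random() < replace_prob:
            candidates = [w for w in synonym_dict.get(label, ())
                          if w != word and (word, w) not in forbidden]
            if candidates:
                # 从预设同义词词典中选择一同义词作为替换词
                word = rng.choice(candidates)
        out.append(word)
    return out

# 示例:同一实体类别(LOC)中未被禁止同义关系的实体词互为同义词
tokens = ["张三", "去", "北京"]
labels = ["PER", "O", "LOC"]
lexicon = {"LOC": ["北京", "上海", "深圳"]}
print(synonym_replace(tokens, labels, lexicon, set(), {"LOC"}, 1.0))
```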
进一步地,所述数据增强方法还包括随机替换方法、随机删除方法、随机交换方法和构造长句方法,所述转换模块903具体还用于:Further, the data enhancement method also includes a random replacement method, a random deletion method, a random exchange method and a long sentence construction method, and the conversion module 903 is specifically also used for:
在所述优化参数列表中,确定所述随机替换方法的随机替换概率,并确定所述随机删除方法的随机删除概率;In the optimization parameter list, determine the random replacement probability of the random replacement method, and determine the random deletion probability of the random deletion method;
确定所述随机交换方法的随机交换概率,并确定所述构造长句方法所设定的句长;Determine the random exchange probability of the random exchange method, and determine the sentence length set by the method of constructing a long sentence;
根据所述随机替换概率对所述原始训练数据中的每一句子进行实体词替换,并根据所述随机交换概率对所述原始训练数据中的每一句子进行同句实体词交换;Perform entity word replacement for each sentence in the original training data according to the random replacement probability, and perform same-sentence entity word exchange for each sentence in the original training data according to the random exchange probability;
根据所述随机删除概率对所述原始训练数据中的每一句子进行实体词删除,以获得处理数据;Perform entity word deletion on each sentence in the original training data according to the random deletion probability to obtain processing data;
对所述处理数据中每一句子进行拼接处理,以使处理完成后的句子长度为所述句长。Splicing processing is performed on the sentences in the processed data, so that each sentence after processing has the set sentence length.
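随机替换、同句交换、随机删除与构造长句四种方法可合并示意如下(假设性实现:句子以词列表表示,vocab 为假设的替换候选词表)。A hypothetical sketch combining the random replacement, same-sentence exchange, random deletion, and long-sentence construction steps; sentences are token lists and `vocab` is an assumed replacement vocabulary:

```python
import random

def augment_sentences(sentences, vocab, p_replace, p_swap, p_delete,
                      target_len, seed=0):
    """按概率对每一句子做替换、同句交换与删除,再将处理后的句子
    拼接为长度为 target_len 的长句。"""
    rng = random.Random(seed)
    processed = []
    for sent in sentences:
        # 随机替换:以随机替换概率将词替换为词表中的词
        sent = [rng.choice(vocab) if rng.random() < p_replace else w
                for w in sent]
        # 同句交换:以随机交换概率交换句内两个位置上的词
        if len(sent) >= 2 and rng.random() < p_swap:
            i, j = rng.sample(range(len(sent)), 2)
            sent[i], sent[j] = sent[j], sent[i]
        # 随机删除:以随机删除概率删除词,得到处理数据
        sent = [w for w in sent if rng.random() >= p_delete]
        processed.append(sent)
    # 构造长句:对处理数据中的句子进行拼接,使每段达到设定句长
    long_sents, buf = [], []
    for sent in processed:
        buf.extend(sent)
        while len(buf) >= target_len:
            long_sents.append(buf[:target_len])
            buf = buf[target_len:]
    if buf:
        long_sents.append(buf)
    return long_sents
```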
进一步地,所述测试模块904具体用于:Further, the test module 904 is specifically used for:
确定所述多个识别模型中对所述测试集中各个词进行识别的最高识别得分;determining the highest recognition score for recognizing each word in the test set in the plurality of recognition models;
确定所述最高识别得分是否满足所述收敛条件;determining whether the highest recognition score satisfies the convergence condition;
若所述最高识别得分满足所述收敛条件,则确定所述多个识别模型中存在所述满足收敛条件的收敛模型,所述最高识别得分对应的识别模型为所述收敛模型;If the highest recognition score satisfies the convergence condition, it is determined that there is a convergence model satisfying the convergence condition in the plurality of recognition models, and the recognition model corresponding to the highest recognition score is the convergence model;
若所述最高识别得分不满足所述收敛条件,则确定所述多个识别模型中不存在所述满足收敛条件的收敛模型。If the highest recognition score does not satisfy the convergence condition, it is determined that the convergence model that satisfies the convergence condition does not exist in the plurality of recognition models.
进一步地,所述测试模块904具体还用于:Further, the test module 904 is specifically further configured to:
确定用户配置的收敛参数;Determine the convergence parameters configured by the user;
确定所述多个识别模型中对所述测试集中第t词进行识别的第一最高识别得分;determining the first highest recognition score for recognizing the t-th word in the test set in the plurality of recognition models;
确定所述多个识别模型中对所述测试集中第t-1词进行识别的第二最高识别得分;determining the second highest recognition score for recognizing the t-1th word in the test set in the plurality of recognition models;
将所述第一最高识别得分减去所述第二最高识别得分,得到最高识别得分差;Subtracting the second highest recognition score from the first highest recognition score to obtain the highest recognition score difference;
确定所述最高识别得分差与所述第二最高识别得分的比是否小于所述收敛参数;determining whether the ratio of the highest recognition score difference to the second highest recognition score is less than the convergence parameter;
若所述最高识别得分差与所述第二最高识别得分的比小于所述收敛参数,则确定所述最高识别得分满足所述收敛条件;If the ratio of the difference between the highest recognition score and the second highest recognition score is less than the convergence parameter, it is determined that the highest recognition score satisfies the convergence condition;
若所述最高识别得分差与所述第二最高识别得分的比不小于所述收敛参数,则确定所述最高识别得分不满足所述收敛条件。If the ratio of the difference between the highest recognition score and the second highest recognition score is not less than the convergence parameter, it is determined that the highest recognition score does not satisfy the convergence condition.
关于基于深度学习模型的数据增强装置的具体限定可以参见上文中对于基于深度学习模型的数据增强方法的限定,在此不再赘述。上述基于深度学习模型的数据增强装置中的各个模块可全部或部分通过软件、硬件及其组合来实现。上述各模块可以硬件形式内嵌于或独立于计算机设备中的处理器中,也可以以软件形式存储于计算机设备中的存储器中,以便于处理器调用执行以上各个模块对应的操作。For the specific definition of the data augmentation apparatus based on the deep learning model, reference may be made to the above definition of the data augmentation method based on the deep learning model, which will not be repeated here. Each module in the above-mentioned deep learning model-based data augmentation apparatus may be implemented in whole or in part by software, hardware, and combinations thereof. The above modules can be embedded in or independent of the processor in the computer device in the form of hardware, or stored in the memory in the computer device in the form of software, so that the processor can call and execute the operations corresponding to the above modules.
在一个实施例中,提供了一种计算机设备,该计算机设备可以是服务器,其内部结构图可以如图10所示。该计算机设备包括通过系统总线连接的处理器、存储器、网络接口和数据库。其中,该计算机设备的处理器用于提供计算和控制能力。该计算机设备的存储器包括非易失性存储介质、内存储器。该非易失性存储介质存储有操作系统、计算机可读指令和数据库。该内存储器为非易失性存储介质中的操作系统和计算机可读指令的运行提供环境。该计算机设备的数据库用于存储原始训练数据、原始测试数据、原参数列表、人工构造数据、优化参数列表和多个识别模型等数据增强方法用到或者生产的相关数据。该计算机设备的网络接口用于与外部的终端通过网络连接通信。该计算机可读指令被处理器执行时以实现一种基于深度学习模型的数据增强方法。In one embodiment, a computer device is provided. The computer device may be a server, and its internal structure may be as shown in FIG. 10. The computer device includes a processor, a memory, a network interface, and a database connected through a system bus. The processor of the computer device is configured to provide computing and control capabilities. The memory of the computer device includes a non-volatile storage medium and an internal memory. The non-volatile storage medium stores an operating system, computer-readable instructions, and a database. The internal memory provides an environment for running the operating system and the computer-readable instructions in the non-volatile storage medium. The database of the computer device is configured to store related data used or produced by the data enhancement method, such as the original training data, the original test data, the original parameter list, the artificially constructed data, the optimized parameter lists, and the multiple recognition models. The network interface of the computer device is configured to communicate with an external terminal through a network connection. The computer-readable instructions, when executed by a processor, implement a data enhancement method based on a deep learning model.
在一个实施例中,提供了一种计算机设备,包括存储器、处理器及存储在存储器上并可在处理器上运行的计算机可读指令,处理器执行计算机可读指令时实现上述基于深度学习模型的数据增强方法的步骤。In one embodiment, a computer device is provided, including a memory, a processor, and computer-readable instructions stored in the memory and executable on the processor, where the processor, when executing the computer-readable instructions, implements the steps of the above data enhancement method based on a deep learning model.
在一个实施例中,提供了一种计算机可读存储介质,其上存储有计算机可读指令,计算机可读指令被处理器执行时实现上述基于深度学习模型的数据增强方法的步骤。In one embodiment, a computer-readable storage medium is provided, on which computer-readable instructions are stored, and when the computer-readable instructions are executed by a processor, implement the steps of the above-mentioned deep learning model-based data enhancement method.
本领域普通技术人员可以理解实现上述实施例方法中的全部或部分流程,是可以通过计算机可读指令来指令相关的硬件来完成,所述的计算机可读指令可存储于一非易失性计算机可读取存储介质中,该计算机可读指令在执行时,可包括如上述各方法的实施例的流程。其中,本申请所提供的各实施例中所使用的对存储器、存储、数据库或其它介质的任何引用,均可包括非易失性和/或易失性存储器。非易失性存储器可包括只读存储器(ROM)、可编程ROM(PROM)、电可编程ROM(EPROM)、电可擦除可编程ROM(EEPROM)或闪存。易失性存储器可包括随机存取存储器(RAM)或者外部高速缓冲存储器。作为说明而非局限,RAM以多种形式可得,诸如静态RAM(SRAM)、动态RAM(DRAM)、同步DRAM(SDRAM)、双数据率SDRAM(DDRSDRAM)、增强型SDRAM(ESDRAM)、同步链路(Synchlink)DRAM(SLDRAM)、存储器总线(Rambus)直接RAM(RDRAM)、直接存储器总线动态RAM(DRDRAM)、以及存储器总线动态RAM(RDRAM)等。Those of ordinary skill in the art can understand that all or part of the processes in the methods of the above embodiments can be implemented by instructing the relevant hardware through computer-readable instructions, and the computer-readable instructions can be stored in a non-volatile computer-readable storage medium. When executed, the computer-readable instructions may include the processes of the above method embodiments. Any reference to memory, storage, a database, or other media used in the embodiments provided in this application may include non-volatile and/or volatile memory. Non-volatile memory may include read-only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), or flash memory. Volatile memory may include random access memory (RAM) or external cache memory. By way of illustration and not limitation, RAM is available in various forms, such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), Synchlink DRAM (SLDRAM), Rambus direct RAM (RDRAM), direct Rambus dynamic RAM (DRDRAM), and Rambus dynamic RAM (RDRAM).
所属领域的技术人员可以清楚地了解到,为了描述的方便和简洁,仅以上述各功能单元、模块的划分进行举例说明,实际应用中,可以根据需要而将上述功能分配由不同的功能单元、模块完成,即将所述装置的内部结构划分成不同的功能单元或模块,以完成以上描述的全部或者部分功能。Those skilled in the art can clearly understand that, for convenience and brevity of description, only the division of the above functional units and modules is illustrated as an example; in practical applications, the above functions can be allocated to different functional units or modules as required, that is, the internal structure of the apparatus is divided into different functional units or modules to complete all or part of the functions described above.
以上所述实施例仅用以说明本申请的技术方案,而非对其限制;尽管参照前述实施例对本申请进行了详细的说明,本领域的普通技术人员应当理解:其依然可以对前述各实施例所记载的技术方案进行修改,或者对其中部分技术特征进行等同替换;而这些修改或者替换,并不使相应技术方案的本质脱离本申请各实施例技术方案的精神和范围,均应包含在本申请的保护范围之内。The above embodiments are only used to illustrate the technical solutions of the present application, not to limit them. Although the present application has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art should understand that the technical solutions described in the foregoing embodiments can still be modified, or some of their technical features can be equivalently replaced; such modifications or replacements do not cause the essence of the corresponding technical solutions to depart from the spirit and scope of the technical solutions of the embodiments of the present application, and shall all fall within the protection scope of the present application.

Claims (20)

  1. 一种基于深度学习模型的数据增强方法,其中,包括:A data augmentation method based on a deep learning model, comprising:
    获取经过人工标注的原始训练数据和原始测试数据,并获取原参数列表,所述原参数列表由数据增强方法和所述数据增强方法对应的增强参数构成;Obtain the manually marked original training data and original test data, and obtain the original parameter list, where the original parameter list is composed of a data enhancement method and an enhancement parameter corresponding to the data enhancement method;
    根据人工鱼群算法对所述原参数列表中的增强参数进行随机初始化,以获得多个优化参数列表;Randomly initialize the enhanced parameters in the original parameter list according to the artificial fish swarm algorithm to obtain multiple optimized parameter lists;
    利用每一所述优化参数列表对所述原始训练数据进行转换,以获得对应的人工构造数据,并将所述原始训练数据与所述对应的人工构造数据进行混合,以获得多个训练集;Transform the original training data using each of the optimized parameter lists to obtain corresponding artificially constructed data, and mix the original training data with the corresponding artificially constructed data to obtain multiple training sets;
利用所述多个训练集分别训练获得多个识别模型,并将所述原始测试数据作为测试集对所述多个识别模型进行测试,以确定所述多个识别模型中是否存在满足收敛条件的模型;Training multiple recognition models respectively using the multiple training sets, and testing the multiple recognition models using the original test data as a test set, so as to determine whether any of the multiple recognition models satisfies the convergence condition;
    若所述多个识别模型中存在所述满足收敛条件的模型,则输出所述满足收敛条件的模型所对应的优化参数列表,作为目标数据增强参数列表;If there is a model that satisfies the convergence condition in the plurality of identification models, outputting an optimization parameter list corresponding to the model that satisfies the convergence condition as a target data enhancement parameter list;
    利用所述目标数据增强参数列表对所述原始训练数据进行数据增强,以获得命名实体识别模型的训练集。Data augmentation is performed on the original training data by using the target data augmentation parameter list to obtain a training set of a named entity recognition model.
  2. 如权利要求1所述的基于深度学习模型的数据增强方法,其中,所述确定所述多个识别模型中是否存在满足收敛条件的模型之后,所述方法还包括:The data enhancement method based on a deep learning model according to claim 1, wherein after determining whether there is a model that satisfies the convergence condition in the plurality of recognition models, the method further comprises:
若所述多个识别模型中不存在所述满足收敛条件的模型,则再次根据人工鱼群算法对所述原参数列表中的增强参数进行随机初始化,以获得所述随机初始化后的多个优化参数列表,并进行计数;If none of the multiple recognition models satisfies the convergence condition, the enhancement parameters in the original parameter list are randomly initialized again according to the artificial fish swarm algorithm to obtain multiple randomly re-initialized optimized parameter lists, and the initializations are counted;
    确定对所述原参数列表中的增强参数进行随机初始化的次数是否小于预设次数;Determine whether the number of random initializations of the enhanced parameters in the original parameter list is less than a preset number of times;
    若对所述原参数列表中的增强参数进行随机初始化的次数不小于所述预设次数,则停止对所述原参数列表中的增强参数进行所述随机初始化;If the number of random initializations for the enhanced parameters in the original parameter list is not less than the preset number of times, stop performing the random initialization on the enhanced parameters in the original parameter list;
若对所述原参数列表中的增强参数进行随机初始化的次数小于所述预设次数,则根据所述随机初始化后的优化参数列表训练获得多个新的识别模型,并对所述新的多个识别模型进行测试,以获得所述目标数据增强参数列表,并利用所述目标数据增强参数列表获得所述命名实体识别模型的训练集。If the number of times the enhancement parameters in the original parameter list have been randomly initialized is less than the preset number, multiple new recognition models are trained according to the randomly re-initialized optimized parameter lists, and the multiple new recognition models are tested to obtain the target data enhancement parameter list, which is then used to obtain the training set of the named entity recognition model.
  3. 如权利要求1所述的基于深度学习模型的数据增强方法,其中,所述数据增强方法包括同义词替换方法,所述利用每一所述优化参数列表对所述原始训练数据进行转换,包括:The data enhancement method based on a deep learning model according to claim 1, wherein the data enhancement method includes a synonym replacement method, and the conversion of the original training data by using each of the optimized parameter lists includes:
    在所述优化参数列表中确定所述同义词替换方法对应的增强参数,所述同义词替换方法对应的增强参数包括实体词类别替换概率和实体词替换类别;Determine the enhancement parameters corresponding to the synonym replacement method in the optimization parameter list, and the enhancement parameters corresponding to the synonym replacement method include entity word category replacement probability and entity word replacement category;
获取用户根据需求预先构建的预设同义词词典,所述预设同义词词典中,将同一实体类别中未被禁止同义关系的实体词作为彼此的同义词;A preset synonym dictionary pre-built by the user according to requirements is obtained, where in the preset synonym dictionary, entity words in the same entity category between which the synonymous relationship is not prohibited are treated as synonyms of each other;
    根据所述预设同义词词典、所述实体词类别替换概率和所述实体词替换类别,对所述原始训练数据中的实体词进行同义词替换。According to the preset synonym dictionary, the entity word category replacement probability and the entity word replacement category, synonym replacement is performed on the entity words in the original training data.
4. 如权利要求3所述的基于深度学习模型的数据增强方法,其中,所述根据所述预设同义词词典、所述实体词类别替换概率和所述实体词替换类别,对所述原始训练数据中的实体词进行同义词替换,包括:The data enhancement method based on a deep learning model according to claim 3, wherein the performing synonym replacement on the entity words in the original training data according to the preset synonym dictionary, the entity word category replacement probability, and the entity word replacement category comprises:
    确定所述原始训练数据中各实体词的类别是否属于所述实体词替换类别;determining whether the category of each entity word in the original training data belongs to the entity word replacement category;
    若所述原始训练数据中实体词的类别属于所述实体词替换类别,则在所述预设同义词词典中查找所述实体词的同义词;If the category of the entity word in the original training data belongs to the entity word replacement category, search for the synonym of the entity word in the preset synonym dictionary;
    确定所述实体词与所述实体词的同义词之间是否被禁止同义关系;determining whether a synonymous relationship is prohibited between the entity word and a synonym of the entity word;
若所述实体词与所述实体词的同义词之间未被禁止同义关系,则以所述实体词类别替换概率,从所述预设同义词词典中选择一所述同义词作为替换词,以将所述实体词替换为所述替换词。If the synonymous relationship between the entity word and its synonym is not prohibited, one of the synonyms is selected from the preset synonym dictionary as a replacement word with the entity word category replacement probability, so as to replace the entity word with the replacement word.
5. 如权利要求4所述的基于深度学习模型的数据增强方法,其中,所述数据增强方法还包括随机替换方法、随机删除方法、随机交换方法和构造长句方法,所述对所述原始训练数据中的实体词进行同义词替换之后,所述方法还包括:The data enhancement method based on a deep learning model according to claim 4, wherein the data enhancement method further comprises a random replacement method, a random deletion method, a random exchange method, and a long-sentence construction method; after the synonym replacement is performed on the entity words in the original training data, the method further comprises:
    在所述优化参数列表中,确定所述随机替换方法的随机替换概率,并确定所述随机删除方法的随机删除概率;In the optimization parameter list, determine the random replacement probability of the random replacement method, and determine the random deletion probability of the random deletion method;
    确定所述随机交换方法的随机交换概率,并确定所述构造长句方法所设定的句长;Determine the random exchange probability of the random exchange method, and determine the sentence length set by the method of constructing a long sentence;
    根据所述随机替换概率对所述原始训练数据中的每一句子进行实体词替换,并根据所述随机交换概率对所述原始训练数据中的每一句子进行同句实体词交换;Perform entity word replacement for each sentence in the original training data according to the random replacement probability, and perform same-sentence entity word exchange for each sentence in the original training data according to the random exchange probability;
    根据所述随机删除概率对所述原始训练数据中的每一句子进行实体词删除,以获得处理数据;Perform entity word deletion on each sentence in the original training data according to the random deletion probability to obtain processing data;
对所述处理数据中每一句子进行拼接处理,以使处理完成后的句子长度为所述句长。Splicing processing is performed on the sentences in the processed data, so that each sentence after processing has the set sentence length.
  6. 如权利要求1-5任一项所述的基于深度学习模型的数据增强方法,其中,所述确定所述多个识别模型中是否存在收敛模型,包括:The data enhancement method based on a deep learning model according to any one of claims 1-5, wherein the determining whether there is a convergence model in the plurality of identification models comprises:
    确定所述多个识别模型中对所述测试集中各个词进行识别的最高识别得分;determining the highest recognition score for recognizing each word in the test set in the plurality of recognition models;
    确定所述最高识别得分是否满足所述收敛条件;determining whether the highest recognition score satisfies the convergence condition;
    若所述最高识别得分满足所述收敛条件,则确定所述多个识别模型中存在所述满足收敛条件的收敛模型,所述最高识别得分对应的识别模型为所述收敛模型;If the highest recognition score satisfies the convergence condition, it is determined that there is a convergence model satisfying the convergence condition in the plurality of recognition models, and the recognition model corresponding to the highest recognition score is the convergence model;
    若所述最高识别得分不满足所述收敛条件,则确定所述多个识别模型中不存在所述满足收敛条件的收敛模型。If the highest recognition score does not satisfy the convergence condition, it is determined that the convergence model that satisfies the convergence condition does not exist in the plurality of recognition models.
  7. 如权利要求6所述的基于深度学习模型的数据增强方法,其中,所述确定所述最高识别得分是否满足所述收敛条件,包括:The data enhancement method based on a deep learning model according to claim 6, wherein the determining whether the highest recognition score satisfies the convergence condition comprises:
    确定用户配置的收敛参数;Determine the convergence parameters configured by the user;
    确定所述多个识别模型中对所述测试集中第t词进行识别的第一最高识别得分;determining the first highest recognition score for recognizing the t-th word in the test set in the plurality of recognition models;
    确定所述多个识别模型中对所述测试集中第t-1词进行识别的第二最高识别得分;determining the second highest recognition score for recognizing the t-1th word in the test set in the plurality of recognition models;
    将所述第一最高识别得分减去所述第二最高识别得分,得到最高识别得分差;Subtracting the second highest recognition score from the first highest recognition score to obtain the highest recognition score difference;
    确定所述最高识别得分差与所述第二最高识别得分的比是否小于所述收敛参数;determining whether the ratio of the highest recognition score difference to the second highest recognition score is less than the convergence parameter;
    若所述最高识别得分差与所述第二最高识别得分的比小于所述收敛参数,则确定所述最高识别得分满足所述收敛条件;If the ratio of the difference between the highest recognition score and the second highest recognition score is less than the convergence parameter, it is determined that the highest recognition score satisfies the convergence condition;
    若所述最高识别得分差与所述第二最高识别得分的比不小于所述收敛参数,则确定所述最高识别得分不满足所述收敛条件。If the ratio of the difference between the highest recognition score and the second highest recognition score is not less than the convergence parameter, it is determined that the highest recognition score does not satisfy the convergence condition.
  8. 一种基于深度学习模型的数据增强装置,其中,包括:A data enhancement device based on a deep learning model, comprising:
    获取模块,用于获取经过人工标注的原始训练数据和原始测试数据,并获取原参数列表,所述原参数列表由数据增强方法和所述数据增强方法对应的增强参数构成;an acquisition module, configured to acquire the manually marked original training data and original test data, and acquire an original parameter list, where the original parameter list is composed of a data enhancement method and an enhancement parameter corresponding to the data enhancement method;
    初始化模块,用于根据人工鱼群算法对所述原参数列表中的增强参数进行随机初始化,以获得多个优化参数列表;an initialization module for randomly initializing the enhanced parameters in the original parameter list according to the artificial fish swarm algorithm to obtain multiple optimized parameter lists;
    转换模块,用于利用每一所述优化参数列表对所述原始训练数据进行转换,以获得对应的人工构造数据,并将所述原始训练数据与所述对应的人工构造数据进行混合,以获得多个训练集;a conversion module, configured to convert the original training data using each of the optimized parameter lists to obtain corresponding artificially constructed data, and mix the original training data with the corresponding artificially constructed data to obtain multiple training sets;
    测试模块,用于利用所述多个训练集分别训练获得多个识别模型,并将所述原始测试数据作为测试集对所述多个识别模型进行测试,以确定所述多个识别模型中是否存在满足收敛条件的模型;A test module, configured to use the multiple training sets to train to obtain multiple recognition models, and use the original test data as a test set to test the multiple recognition models to determine whether the multiple recognition models are There is a model that satisfies the convergence condition;
    输出模块,用于若所述多个识别模型中存在所述满足收敛条件的模型,则输出所述满足收敛条件的模型所对应的优化参数列表,作为目标数据增强参数列表;an output module, configured to output an optimization parameter list corresponding to the model that satisfies the convergence condition if there is a model that satisfies the convergence condition in the plurality of identification models, as a target data enhancement parameter list;
    增强模块,用于利用所述目标数据增强参数列表对所述原始训练数据进行数据增强, 以获得命名实体识别模型的训练集。An enhancement module, configured to perform data enhancement on the original training data by using the target data enhancement parameter list to obtain a training set of a named entity recognition model.
  9. 一种计算机设备,包括存储器、处理器以及存储在所述存储器中并可在所述处理器上运行的计算机可读指令,其中,所述处理器执行所述计算机可读指令时实现如下步骤:A computer device comprising a memory, a processor, and computer-readable instructions stored in the memory and executable on the processor, wherein the processor implements the following steps when executing the computer-readable instructions:
    获取经过人工标注的原始训练数据和原始测试数据,并获取原参数列表,所述原参数列表由数据增强方法和所述数据增强方法对应的增强参数构成;Obtain the manually marked original training data and original test data, and obtain the original parameter list, where the original parameter list is composed of a data enhancement method and an enhancement parameter corresponding to the data enhancement method;
    根据人工鱼群算法对所述原参数列表中的增强参数进行随机初始化,以获得多个优化参数列表;Randomly initialize the enhanced parameters in the original parameter list according to the artificial fish swarm algorithm to obtain multiple optimized parameter lists;
    利用每一所述优化参数列表对所述原始训练数据进行转换,以获得对应的人工构造数据,并将所述原始训练数据与所述对应的人工构造数据进行混合,以获得多个训练集;Transform the original training data using each of the optimized parameter lists to obtain corresponding artificially constructed data, and mix the original training data with the corresponding artificially constructed data to obtain multiple training sets;
利用所述多个训练集分别训练获得多个识别模型,并将所述原始测试数据作为测试集对所述多个识别模型进行测试,以确定所述多个识别模型中是否存在满足收敛条件的模型;Training multiple recognition models respectively using the multiple training sets, and testing the multiple recognition models using the original test data as a test set, so as to determine whether any of the multiple recognition models satisfies the convergence condition;
    若所述多个识别模型中存在所述满足收敛条件的模型,则输出所述满足收敛条件的模型所对应的优化参数列表,作为目标数据增强参数列表;If there is a model that satisfies the convergence condition in the plurality of identification models, outputting an optimization parameter list corresponding to the model that satisfies the convergence condition as a target data enhancement parameter list;
    利用所述目标数据增强参数列表对所述原始训练数据进行数据增强,以获得命名实体识别模型的训练集。Data augmentation is performed on the original training data by using the target data augmentation parameter list to obtain a training set of a named entity recognition model.
  10. 如权利要求9所述的计算机设备,其中,所述确定所述多个识别模型中是否存在满足收敛条件的模型之后,所述处理器执行所述计算机可读指令时还实现如下步骤:The computer device according to claim 9, wherein after the determining whether there is a model satisfying the convergence condition among the plurality of identification models, the processor further implements the following steps when executing the computer-readable instructions:
若所述多个识别模型中不存在所述满足收敛条件的模型,则再次根据人工鱼群算法对所述原参数列表中的增强参数进行随机初始化,以获得所述随机初始化后的多个优化参数列表,并进行计数;If none of the multiple recognition models satisfies the convergence condition, the enhancement parameters in the original parameter list are randomly initialized again according to the artificial fish swarm algorithm to obtain multiple randomly re-initialized optimized parameter lists, and the initializations are counted;
    确定对所述原参数列表中的增强参数进行随机初始化的次数是否小于预设次数;Determine whether the number of random initializations of the enhanced parameters in the original parameter list is less than a preset number of times;
    若对所述原参数列表中的增强参数进行随机初始化的次数不小于所述预设次数,则停止对所述原参数列表中的增强参数进行所述随机初始化;If the number of random initializations for the enhanced parameters in the original parameter list is not less than the preset number of times, stop performing the random initialization on the enhanced parameters in the original parameter list;
    若对所述原参数列表中的增强参数进行随机初始化的次数小于所述预设次数,则根据所述随机初始化后的优化参数列表训练获得多个新的识别模型,并对所述新的多个识别模型进行测试,以获得所述目标数据增强参数列表,并利用所述目标数据增强参数列表获得所述命名实体识别模型的训练集。If the number of random initializations for the enhanced parameters in the original parameter list is less than the preset number of times, multiple new recognition models are obtained by training according to the randomly initialized optimized parameter list, and the new multiple Each recognition model is tested to obtain the target data enhancement parameter list, and a training set of the named entity recognition model is obtained by using the target data enhancement parameter list.
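The re-initialization loop of claim 10 (retry the random initialization, count the attempts, stop at a preset limit) can be sketched as follows; `evaluate_round` is a hypothetical callback standing in for one full initialize-train-test round, returning a converged parameter list or `None`:

```python
def optimize_with_retries(evaluate_round, max_rounds=5):
    """Re-run the random-initialization round up to max_rounds times,
    stopping early as soon as a round yields a converged parameter list."""
    for attempt in range(1, max_rounds + 1):
        params = evaluate_round(attempt)
        if params is not None:
            return params, attempt  # converged on this attempt
    return None, max_rounds  # preset limit reached without convergence
```

For example, `optimize_with_retries(lambda a: {"prob": 0.3} if a == 3 else None)` stops on the third round and returns that round's parameter list together with the attempt count.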
  11. The computer device according to claim 9, wherein the data augmentation methods include a synonym replacement method, and transforming the original training data using each of the optimized parameter lists comprises:
    determining, in the optimized parameter list, the augmentation parameters corresponding to the synonym replacement method, the augmentation parameters corresponding to the synonym replacement method including an entity word category replacement probability and an entity word replacement category;
    obtaining a preset synonym dictionary pre-built by a user according to requirements, wherein in the preset synonym dictionary, entity words of the same entity category whose synonymy is not prohibited are treated as synonyms of one another;
    performing synonym replacement on the entity words in the original training data according to the preset synonym dictionary, the entity word category replacement probability, and the entity word replacement category.
  12. The computer device according to claim 11, wherein performing synonym replacement on the entity words in the original training data according to the preset synonym dictionary, the entity word category replacement probability, and the entity word replacement category comprises:
    determining whether the category of each entity word in the original training data belongs to the entity word replacement category;
    if the category of an entity word in the original training data belongs to the entity word replacement category, looking up synonyms of the entity word in the preset synonym dictionary;
    determining whether synonymy between the entity word and the synonyms of the entity word is prohibited;
    if synonymy between the entity word and a synonym of the entity word is not prohibited, selecting, with the entity word category replacement probability, one of the synonyms from the preset synonym dictionary as a replacement word, and replacing the entity word with the replacement word.
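The dictionary-based synonym replacement of claims 11 and 12 (same-category synonyms, a per-category replacement probability, and a prohibited-pair check) might look like the sketch below; the data layout and all names are assumptions for illustration, not the patent's implementation:

```python
import random

def synonym_replace(tokens, entity_types, synonym_dict, forbidden,
                    replace_types, replace_prob, rng=None):
    """Replace entity words whose category is in replace_types with a synonym
    of the same category, with probability replace_prob, skipping pairs whose
    synonymy is prohibited (the `forbidden` set of (word, synonym) pairs)."""
    rng = rng or random.Random(0)
    out = []
    for word, etype in zip(tokens, entity_types):
        # Candidate synonyms: same category, not the word itself, not forbidden.
        candidates = [s for s in synonym_dict.get(etype, [])
                      if s != word and (word, s) not in forbidden]
        if etype in replace_types and candidates and rng.random() < replace_prob:
            out.append(rng.choice(candidates))
        else:
            out.append(word)
    return out
```

A prohibited pair simply drops out of the candidate list, so the original entity word is kept whenever no permitted synonym remains.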
  13. The computer device according to claim 12, wherein the data augmentation methods further include a random replacement method, a random deletion method, a random swap method, and a long-sentence construction method, and after the synonym replacement is performed on the entity words in the original training data, the processor, when executing the computer-readable instructions, further implements the following steps:
    determining, in the optimized parameter list, the random replacement probability of the random replacement method and the random deletion probability of the random deletion method;
    determining the random swap probability of the random swap method, and determining the sentence length set by the long-sentence construction method;
    performing entity word replacement on each sentence in the original training data according to the random replacement probability, and performing same-sentence entity word swapping on each sentence in the original training data according to the random swap probability;
    performing entity word deletion on each sentence in the original training data according to the random deletion probability, to obtain processed data;
    splicing the sentences in the processed data so that the length of each processed sentence equals the set sentence length.
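The random replacement, same-sentence swap, random deletion, and long-sentence splicing steps of claim 13 can be combined in one pass, roughly as below. This is a token-level placeholder with assumed names: real code would operate on labeled entity words and keep the NER labels aligned, and the splicing here produces sentences of at least, rather than exactly, the set length:

```python
import random

def augment_sentences(sentences, vocab, p_replace, p_swap, p_delete,
                      target_len, rng=None):
    """Apply random token replacement, one same-sentence swap, and random
    deletion to each sentence, then splice sentences together until each
    output reaches at least target_len tokens."""
    rng = rng or random.Random(0)
    processed = []
    for sent in sentences:
        toks = list(sent)
        # Random replacement: each token swapped for a vocabulary word.
        toks = [rng.choice(vocab) if rng.random() < p_replace else t for t in toks]
        # Random same-sentence swap of two positions.
        if len(toks) >= 2 and rng.random() < p_swap:
            i, j = rng.sample(range(len(toks)), 2)
            toks[i], toks[j] = toks[j], toks[i]
        # Random deletion; keep at least one token per sentence.
        toks = [t for t in toks if rng.random() >= p_delete] or toks[:1]
        processed.append(toks)
    # Long-sentence construction: splice until the target length is reached.
    long_sents, buf = [], []
    for toks in processed:
        buf.extend(toks)
        if len(buf) >= target_len:
            long_sents.append(buf)
            buf = []
    if buf:
        long_sents.append(buf)
    return long_sents
```

With all probabilities set to zero the function reduces to pure splicing, which makes the long-sentence step easy to check in isolation.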
  14. The computer device according to any one of claims 9-13, wherein determining whether a converged model exists among the plurality of recognition models comprises:
    determining the highest recognition score with which the plurality of recognition models recognize the words in the test set;
    determining whether the highest recognition score satisfies the convergence condition;
    if the highest recognition score satisfies the convergence condition, determining that a converged model satisfying the convergence condition exists among the plurality of recognition models, the recognition model corresponding to the highest recognition score being the converged model;
    if the highest recognition score does not satisfy the convergence condition, determining that no converged model satisfying the convergence condition exists among the plurality of recognition models.
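Claim 14's convergence check reduces to comparing the best of the per-model recognition scores against a threshold; a minimal sketch (treating the convergence condition as "score at least a threshold", which is an assumption) is:

```python
def find_converged_model(scores, threshold):
    """Return the index of the highest-scoring model if that score satisfies
    the convergence condition (score >= threshold), else None."""
    if not scores:
        return None
    best_idx = max(range(len(scores)), key=scores.__getitem__)
    return best_idx if scores[best_idx] >= threshold else None
```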
  15. One or more readable storage media storing computer-readable instructions, wherein the computer-readable instructions, when executed by one or more processors, cause the one or more processors to perform the following steps:
    obtaining manually annotated original training data and original test data, and obtaining an original parameter list, the original parameter list being composed of data augmentation methods and augmentation parameters corresponding to the data augmentation methods;
    randomly initializing the augmentation parameters in the original parameter list according to an artificial fish swarm algorithm, to obtain a plurality of optimized parameter lists;
    transforming the original training data using each of the optimized parameter lists to obtain corresponding artificially constructed data, and mixing the original training data with the corresponding artificially constructed data to obtain a plurality of training sets;
    training a plurality of recognition models, one on each of the plurality of training sets, and testing the plurality of recognition models using the original test data as a test set, to determine whether a model satisfying a convergence condition exists among the plurality of recognition models;
    if a model satisfying the convergence condition exists among the plurality of recognition models, outputting the optimized parameter list corresponding to the model satisfying the convergence condition as a target data augmentation parameter list;
    performing data augmentation on the original training data using the target data augmentation parameter list, to obtain a training set for a named entity recognition model.
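The random initialization step of claim 15 amounts to drawing one candidate augmentation-parameter list per individual ("fish") in the swarm. The sketch below covers only that initialization; a full artificial fish swarm algorithm would additionally iterate the prey, swarm, and follow behaviors over this population, and the parameter names and ranges shown are illustrative assumptions:

```python
import random

def init_parameter_lists(param_ranges, n_fish, seed=0):
    """Draw one candidate parameter list per artificial fish, each parameter
    sampled uniformly from its (low, high) range."""
    rng = random.Random(seed)
    return [{name: rng.uniform(lo, hi)
             for name, (lo, hi) in param_ranges.items()}
            for _ in range(n_fish)]
```

Each returned dictionary is one optimized-parameter-list candidate to be evaluated by training and testing a recognition model.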
  16. The readable storage medium according to claim 15, wherein after determining whether a model satisfying the convergence condition exists among the plurality of recognition models, the computer-readable instructions, when executed by the one or more processors, cause the one or more processors to further perform the following steps:
    if no model satisfying the convergence condition exists among the plurality of recognition models, randomly initializing the augmentation parameters in the original parameter list again according to the artificial fish swarm algorithm, to obtain a plurality of re-initialized optimized parameter lists, and counting the initializations;
    determining whether the number of times the augmentation parameters in the original parameter list have been randomly initialized is less than a preset number;
    if the number of random initializations of the augmentation parameters in the original parameter list is not less than the preset number, stopping the random initialization of the augmentation parameters in the original parameter list;
    if the number of random initializations of the augmentation parameters in the original parameter list is less than the preset number, training a plurality of new recognition models according to the re-initialized optimized parameter lists, testing the plurality of new recognition models to obtain the target data augmentation parameter list, and obtaining the training set of the named entity recognition model using the target data augmentation parameter list.
  17. The readable storage medium according to claim 15, wherein the data augmentation methods include a synonym replacement method, and transforming the original training data using each of the optimized parameter lists comprises:
    determining, in the optimized parameter list, the augmentation parameters corresponding to the synonym replacement method, the augmentation parameters corresponding to the synonym replacement method including an entity word category replacement probability and an entity word replacement category;
    obtaining a preset synonym dictionary pre-built by a user according to requirements, wherein in the preset synonym dictionary, entity words of the same entity category whose synonymy is not prohibited are treated as synonyms of one another;
    performing synonym replacement on the entity words in the original training data according to the preset synonym dictionary, the entity word category replacement probability, and the entity word replacement category.
  18. The readable storage medium according to claim 17, wherein performing synonym replacement on the entity words in the original training data according to the preset synonym dictionary, the entity word category replacement probability, and the entity word replacement category comprises:
    determining whether the category of each entity word in the original training data belongs to the entity word replacement category;
    if the category of an entity word in the original training data belongs to the entity word replacement category, looking up synonyms of the entity word in the preset synonym dictionary;
    determining whether synonymy between the entity word and the synonyms of the entity word is prohibited;
    if synonymy between the entity word and a synonym of the entity word is not prohibited, selecting, with the entity word category replacement probability, one of the synonyms from the preset synonym dictionary as a replacement word, and replacing the entity word with the replacement word.
  19. The readable storage medium according to claim 18, wherein the data augmentation methods further include a random replacement method, a random deletion method, a random swap method, and a long-sentence construction method, and after the synonym replacement is performed on the entity words in the original training data, the computer-readable instructions, when executed by the one or more processors, cause the one or more processors to further perform the following steps:
    determining, in the optimized parameter list, the random replacement probability of the random replacement method and the random deletion probability of the random deletion method;
    determining the random swap probability of the random swap method, and determining the sentence length set by the long-sentence construction method;
    performing entity word replacement on each sentence in the original training data according to the random replacement probability, and performing same-sentence entity word swapping on each sentence in the original training data according to the random swap probability;
    performing entity word deletion on each sentence in the original training data according to the random deletion probability, to obtain processed data;
    splicing the sentences in the processed data so that the length of each processed sentence equals the set sentence length.
  20. The readable storage medium according to any one of claims 15-19, wherein determining whether a converged model exists among the plurality of recognition models comprises:
    determining the highest recognition score with which the plurality of recognition models recognize the words in the test set;
    determining whether the highest recognition score satisfies the convergence condition;
    if the highest recognition score satisfies the convergence condition, determining that a converged model satisfying the convergence condition exists among the plurality of recognition models, the recognition model corresponding to the highest recognition score being the converged model;
    if the highest recognition score does not satisfy the convergence condition, determining that no converged model satisfying the convergence condition exists among the plurality of recognition models.
PCT/CN2021/096475 2021-04-19 2021-05-27 Deep learning model-based data augmentation method and apparatus, device, and medium WO2022222224A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202110420110.3A CN113158652B (en) 2021-04-19 2021-04-19 Data enhancement method, device, equipment and medium based on deep learning model
CN202110420110.3 2021-04-19

Publications (1)

Publication Number Publication Date
WO2022222224A1 true WO2022222224A1 (en) 2022-10-27

Family

ID=76868692

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/096475 WO2022222224A1 (en) 2021-04-19 2021-05-27 Deep learning model-based data augmentation method and apparatus, device, and medium

Country Status (2)

Country Link
CN (1) CN113158652B (en)
WO (1) WO2022222224A1 (en)


Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116911305A (en) * 2023-09-13 2023-10-20 中博信息技术研究院有限公司 Chinese address recognition method based on fusion model

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110543906A (en) * 2019-08-29 2019-12-06 彭礼烨 Skin type automatic identification method based on data enhancement and Mask R-CNN model
US20200226212A1 (en) * 2019-01-15 2020-07-16 International Business Machines Corporation Adversarial Training Data Augmentation Data for Text Classifiers
CN111738004A (en) * 2020-06-16 2020-10-02 中国科学院计算技术研究所 Training method of named entity recognition model and named entity recognition method
CN111738007A (en) * 2020-07-03 2020-10-02 北京邮电大学 Chinese named entity identification data enhancement algorithm based on sequence generation countermeasure network
CN111832294A (en) * 2020-06-24 2020-10-27 平安科技(深圳)有限公司 Method and device for selecting marking data, computer equipment and storage medium
CN112257441A (en) * 2020-09-15 2021-01-22 浙江大学 Named entity identification enhancement method based on counterfactual generation

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109145965A (en) * 2018-08-02 2019-01-04 深圳辉煌耀强科技有限公司 Cell recognition method and device based on a random forest classification model
US11568307B2 (en) * 2019-05-20 2023-01-31 International Business Machines Corporation Data augmentation for text-based AI applications
CN110516835A (en) * 2019-07-05 2019-11-29 电子科技大学 Multi-variable grey model optimization method based on an artificial fish swarm algorithm


Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116244445A (en) * 2022-12-29 2023-06-09 中国航空综合技术研究所 Aviation text data labeling method and labeling system thereof
CN116244445B (en) * 2022-12-29 2023-12-12 中国航空综合技术研究所 Aviation text data labeling method and labeling system thereof
CN116451690A (en) * 2023-03-21 2023-07-18 麦博(上海)健康科技有限公司 Medical field named entity identification method
CN116501979A (en) * 2023-06-30 2023-07-28 北京水滴科技集团有限公司 Information recommendation method, information recommendation device, computer equipment and computer readable storage medium

Also Published As

Publication number Publication date
CN113158652A (en) 2021-07-23
CN113158652B (en) 2024-03-19

Similar Documents

Publication Publication Date Title
WO2022222224A1 (en) Deep learning model-based data augmentation method and apparatus, device, and medium
WO2022007823A1 (en) Text data processing method and device
CN112765312B (en) Knowledge graph question-answering method and system based on graph neural network embedded matching
CN110990559B (en) Method and device for classifying text, storage medium and processor
WO2020143320A1 (en) Method and apparatus for acquiring word vectors of text, computer device, and storage medium
CN112115267A (en) Training method, device and equipment of text classification model and storage medium
CN113536795B (en) Method, system, electronic device and storage medium for entity relation extraction
CN109710921B (en) Word similarity calculation method, device, computer equipment and storage medium
CN117194637A (en) Multi-level visual evaluation report generation method and device based on large language model
CN112380837A (en) Translation model-based similar sentence matching method, device, equipment and medium
CN116051388A (en) Automatic photo editing via language request
WO2021164302A1 (en) Sentence vector generation method, apparatus, device and storage medium
CN109117474A (en) Calculation method, device and the storage medium of statement similarity
Cheng et al. A hierarchical multimodal attention-based neural network for image captioning
CN112861543A (en) Deep semantic matching method and system for matching research and development supply and demand description texts
CN112016311A (en) Entity identification method, device, equipment and medium based on deep learning model
CN112307048A (en) Semantic matching model training method, matching device, equipment and storage medium
CN113469338B (en) Model training method, model training device, terminal device and storage medium
Cruz et al. A resource for studying chatino verbal morphology
WO2022116444A1 (en) Text classification method and apparatus, and computer device and medium
Matthews et al. Generalized robust counterparts for constraints with bounded and unbounded uncertain parameters
WO2021237928A1 (en) Training method and apparatus for text similarity recognition model, and related device
CN113220996A (en) Scientific and technological service recommendation method, device, equipment and storage medium based on knowledge graph
CN112446205A (en) Sentence distinguishing method, device, equipment and storage medium
CN116610795B (en) Text retrieval method and device

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 21937450

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 21937450

Country of ref document: EP

Kind code of ref document: A1