Disclosure of Invention
The invention aims to provide an iteration method and apparatus for an identity verification model, to solve the problem that model verification becomes inaccurate when the distribution of test data diverges from the distribution of training data across scenarios or over time.
According to a first aspect of the present invention, there is provided an iteration method for an identity verification model, comprising:
desensitizing received data, and extracting and screening feature data from the desensitized data;
selecting hyperparameters from the feature data according to the state of the identity verification model, and training the identity verification model; and
evaluating the trained identity verification model, and bringing the iterated identity verification model online after it passes the evaluation.
Further, in the method of the present invention, after desensitizing the received data, the method further comprises: starting a timer after the received data is desensitized, and, when the elapsed time is greater than or equal to a preset backflow time interval, determining whether the current service is idle;
if so, performing data backflow so as to execute the step of extracting and screening feature data from the desensitized data.
Further, in the method of the present invention, extracting and screening feature data from the desensitized data comprises:
preprocessing the desensitized data;
extracting feature data from the preprocessed data; and
post-processing the extracted feature data and filtering out abnormal feature data to obtain training samples.
Further, in the method of the present invention, selecting hyperparameters from the feature data according to the state of the identity verification model and training the identity verification model comprises:
selecting hyperparameters from the training samples according to the state of the identity verification model, and training the identity verification model.
Further, in the method of the present invention, post-processing the extracted feature data and filtering out abnormal feature data to obtain training samples comprises:
using a clustering algorithm to filter out, as abnormal feature data, data whose distance from the cluster center exceeds a preset distance; and
sampling the data remaining after the abnormal feature data is filtered out, according to a preset sample selection strategy, to obtain training samples.
Further, in the method of the present invention, evaluating the trained identity verification model comprises:
evaluating the trained identity verification model according to an algorithm evaluation set and an algorithm accuracy metric.
Further, in the method of the present invention, preprocessing the desensitized data comprises:
if the desensitized data is image data, decoding, rescaling, and normalizing the image data.
According to a second aspect of the present invention, there is provided an iteration apparatus for an identity verification model, comprising:
a desensitization processing module, configured to desensitize received data;
a feature data screening module, configured to extract and screen feature data from the desensitized data;
a model training module, configured to select hyperparameters from the feature data according to the state of the identity verification model and to train the identity verification model; and
a model online module, configured to evaluate the trained identity verification model and to bring the iterated identity verification model online after it passes the evaluation.
Further, in the apparatus of the present invention, the desensitization processing module comprises:
a judging unit, configured to start a timer after the received data is desensitized and, when the elapsed time is greater than or equal to a preset backflow time interval, to determine whether the current service is idle; and
a data backflow unit, configured to perform data backflow when the current service is idle, so as to execute the step of extracting and screening feature data from the desensitized data.
Further, in the apparatus of the present invention, the feature data screening module comprises:
a preprocessing unit, configured to preprocess the desensitized data;
a feature data extraction unit, configured to extract feature data from the preprocessed data; and
a training sample acquisition unit, configured to post-process the extracted feature data and filter out abnormal feature data to obtain training samples.
Further, in the apparatus of the present invention, the model training module is configured to:
select hyperparameters from the training samples according to the state of the identity verification model, and train the identity verification model.
Further, in the apparatus of the present invention, the training sample acquisition unit is configured to:
use a clustering algorithm to filter out, as abnormal feature data, data whose distance from the cluster center exceeds a preset distance; and
sample the data remaining after the abnormal feature data is filtered out, according to a preset sample selection strategy, to obtain training samples.
Further, in the apparatus of the present invention, the model online module is configured to:
evaluate the trained identity verification model according to an algorithm evaluation set and an algorithm accuracy metric.
Further, in the apparatus of the present invention, the preprocessing unit is configured to:
if the desensitized data is image data, decode, rescale, and normalize the image data.
According to a third aspect of the present invention there is provided a storage medium storing computer program instructions for execution in accordance with the method of the present invention.
According to a fourth aspect of the present invention there is provided a computing device comprising: a memory for storing computer program instructions and a processor for executing the computer program instructions, wherein the computer program instructions, when executed by the processor, trigger the computing device to perform the method of the invention.
According to the iteration method and apparatus for an identity verification model of the present invention, the received data is desensitized, and feature data is extracted and screened from the preprocessed data; this addresses the problem that the distributions of test data and training data drift apart over time and the resulting change in the data degrades the performance of the deep learning algorithm. Hyperparameters are selected from the feature data according to the state of the identity verification model, and the identity verification model is trained; this addresses the problem that the distributions of test data and training data differ across scenarios and the resulting change in the data degrades the performance of the deep learning algorithm. The method provided by the invention allows the updated model to adapt to each scenario, brings out the full performance of the algorithm, and improves the accuracy of the algorithm's identity verification.
Detailed Description
The invention is described in further detail below with reference to the accompanying drawings.
Fig. 1 is a flowchart of an iteration method for an identity verification model according to an embodiment of the present invention. As shown in Fig. 1, the method includes:
101. Desensitizing the received data, and extracting and screening feature data from the desensitized data.
In this embodiment, desensitization is applied to the received data, that is, data from different scenarios. It can be understood as removing sensitive information from the received data, which reduces the risk of user information being stolen.
102. Selecting hyperparameters from the feature data according to the state of the identity verification model, and training the identity verification model.
The state of the identity verification model comprises various numerical information, such as loss function values and gradient values.
The hyperparameters selected from the feature data include the learning rate, momentum, and the like.
103. Evaluating the trained identity verification model, and bringing the iterated identity verification model online after it passes the evaluation.
Because the performance of an automatically trained model has not yet been evaluated, it is unknown whether the model meets the performance standard for going online; automated evaluation therefore needs to be configured, with corresponding metrics set for the pass/fail decision.
Different business scenarios, such as payment and login, may use different performance metrics, so the trained identity verification model is brought online only after it is evaluated as qualified against the performance metrics of the required business scenario.
In the method of this embodiment, the received data is desensitized, and feature data is extracted and screened from the preprocessed data; this addresses the problem that the distributions of test data and training data drift apart over time and the resulting change in the data degrades the performance of the deep learning algorithm. Hyperparameters are selected from the feature data according to the state of the identity verification model, and the identity verification model is trained; this addresses the problem that the distributions of test data and training data differ across scenarios and the resulting change in the data degrades the performance of the deep learning algorithm. The method provided by the invention allows the updated model to adapt to each scenario, brings out the full performance of the algorithm, and improves the accuracy of the algorithm's identity verification.
Fig. 2 is a flowchart of an iteration method for an identity verification model according to an embodiment of the present invention. As shown in Fig. 2, the method includes:
201. Desensitizing the received data.
In this embodiment, the desensitization is part of a desensitize-then-backflow operation performed on the received data.
Regarding data desensitization: scenario data may contain sensitive user information, including the user's face, user name, transaction information, and the like, so the data must be desensitized before backflow. Desensitization includes, but is not limited to: (1) watermark encryption of images; (2) anonymization of key information such as user names; and (3) deletion of transaction information. After desensitization, the data is stored on the local device to await backflow.
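As an illustration of items (2) and (3) above, the following is a minimal sketch and not the implementation described in this embodiment. The record layout, the field names in `TRANSACTION_FIELDS`, and the choice of keyed SHA-256 hashing for anonymizing the user name are all assumptions made for the example; watermark encryption of images, item (1), is not covered.

```python
import hashlib
import hmac

# Hypothetical field names; the source does not define a record schema.
TRANSACTION_FIELDS = ("transaction_id", "amount", "payee")


def desensitize(record: dict, secret_key: bytes) -> dict:
    """Return a copy of `record` with the user name replaced by a keyed hash
    and all transaction fields deleted. The input record is not modified."""
    # Item (3): drop transaction information entirely.
    cleaned = {k: v for k, v in record.items() if k not in TRANSACTION_FIELDS}
    # Item (2): replace the user name with a keyed digest (assumed approach).
    if "user_name" in cleaned:
        digest = hmac.new(secret_key, cleaned["user_name"].encode("utf-8"), hashlib.sha256)
        cleaned["user_name"] = digest.hexdigest()
    return cleaned
```

Because the same key always maps a given user name to the same digest, records of one user can still be grouped after desensitization without exposing the name itself.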
Generally, after the received data is desensitized, the method further comprises: starting a timer after desensitization, and, when the elapsed time is greater than or equal to a preset backflow time interval, determining whether the current service is idle;
if so, performing data backflow so as to execute the subsequent step, namely extracting and screening feature data from the desensitized data.
In addition, data backflow can be performed in real time or asynchronously; the method of this embodiment uses asynchronous backflow. This is mainly because automated iteration does not place high demands on data timeliness, whereas real-time backflow consumes substantial bandwidth and degrades the user experience of the main service. Asynchronous backflow allows the backflow time interval to be set flexibly and uses spare bandwidth to return data while the main service is idle.
Different time intervals may be used for different scenarios in automated deployment; specific time intervals are not enumerated in this embodiment.
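The timing and idle check described above can be sketched as follows. This is only an illustration under stated assumptions: the class name `BackflowScheduler`, the `is_idle` callback, and the injectable `clock` are hypothetical and do not come from the source, which does not specify how idleness is detected.

```python
import time
from typing import Callable


class BackflowScheduler:
    """Decides when desensitized data may be returned to the model server."""

    def __init__(self, interval_s: float, is_idle: Callable[[], bool],
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.interval_s = interval_s      # preset backflow time interval
        self.is_idle = is_idle            # hypothetical probe of the main service
        self.clock = clock
        self.started_at = clock()         # timing starts once desensitization is done

    def should_backflow(self) -> bool:
        # Both conditions must hold: the interval has elapsed and the service is idle.
        elapsed = self.clock() - self.started_at
        return elapsed >= self.interval_s and self.is_idle()

    def mark_backflow_done(self) -> None:
        # Restart the timer for the next asynchronous round.
        self.started_at = self.clock()
```

Passing the clock in as a parameter is a design choice for testability; a deployment would normally leave the default in place and poll `should_backflow()` periodically.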
202. Preprocessing the desensitized data, and extracting feature data from the preprocessed data.
The preprocessing of the desensitized data in step 202 may include:
if the desensitized data is image data, decoding, rescaling, and normalizing the image data.
In general, step 202 includes the following sub-steps:
The scenario data is returned to the model server.
It will be appreciated that the model server is provided on the side that receives the scenario data; it may be co-located with the device that receives the scenario data or deployed at a different end.
The model server preprocesses the scenario data; when the scenario data is image data, preprocessing includes image decoding, image rescaling, image normalization, and the like.
The model server then performs automatic feature extraction on the preprocessed images.
The whole flow of the method is completed automatically by the model server, and algorithm developers cannot access users' image data or desensitized information. The preprocessed data is used for the subsequent feature data screening, mainly to ensure the uniformity of the subsequent feature data.
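The rescaling and normalization steps can be illustrated with a small sketch. It assumes the image has already been decoded into a grayscale grid of 8-bit pixel values (decoding itself would normally be delegated to an image library and is not shown), uses nearest-neighbor rescaling, and divides by 255 to normalize; none of these specific choices is stated in the source.

```python
from typing import List

Image = List[List[float]]  # decoded grayscale image as rows of pixel values


def resize_nearest(img: Image, out_h: int, out_w: int) -> Image:
    """Rescale by picking, for each output pixel, the nearest source pixel."""
    in_h, in_w = len(img), len(img[0])
    return [
        [img[r * in_h // out_h][c * in_w // out_w] for c in range(out_w)]
        for r in range(out_h)
    ]


def normalize(img: Image, max_value: float = 255.0) -> Image:
    """Map pixel values from [0, max_value] to [0, 1]."""
    return [[p / max_value for p in row] for row in img]


def preprocess(img: Image, size: int = 2) -> Image:
    """Rescale to size x size, then normalize, so all inputs share one format."""
    return normalize(resize_nearest(img, size, size))
```

Bringing every image to the same size and value range is what provides the uniformity that the subsequent feature extraction relies on.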
203. Post-processing the extracted feature data and filtering out abnormal feature data to obtain training samples.
The post-processing of the feature data and the filtering of abnormal feature data in step 203 to obtain training samples include the following sub-steps:
2031. Using a clustering algorithm, filtering out, as abnormal feature data, data whose distance from the cluster center exceeds a preset distance.
After feature extraction of the scenario data is complete, the feature data must be post-processed to filter noise and select samples.
Because the scenario data contains some noise, a clustering algorithm is used to detect abnormal data. Specifically, feature data whose distance from the cluster center exceeds a certain distance is determined to be an abnormal feature. Abnormal features are not used as subsequent training samples.
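The distance-based filtering can be sketched as follows. For brevity the example treats all features as a single cluster whose center is their mean, and uses Euclidean distance; the source names neither the clustering algorithm nor the distance measure, so both are assumptions, as are the function names.

```python
import math
from typing import List, Sequence, Tuple

Vector = Sequence[float]


def centroid(points: Sequence[Vector]) -> List[float]:
    """Mean of the feature vectors, used here as the single cluster center."""
    n = len(points)
    return [sum(p[i] for p in points) / n for i in range(len(points[0]))]


def filter_outliers(points: Sequence[Vector],
                    max_distance: float) -> Tuple[List[Vector], List[Vector]]:
    """Split features into (kept, dropped) by distance from the cluster center."""
    center = centroid(points)
    kept: List[Vector] = []
    dropped: List[Vector] = []
    for p in points:
        # Features farther than the preset distance are treated as abnormal.
        (kept if math.dist(p, center) <= max_distance else dropped).append(p)
    return kept, dropped
```

With several clusters, the same test would be applied against the center of the cluster each feature is assigned to.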
2032. Sampling the data remaining after the abnormal feature data is filtered out, according to a preset sample selection strategy, to obtain training samples.
Because the returned scenario data contains a very large number of samples, most of which are easy samples for the existing model, adding the scenario data directly to the training data does more harm than good to model training. The sample selection strategy is mainly to sample according to the algorithm to which the features correspond. Taking a face comparison algorithm as an example, the algorithm samples along the user ID dimension, for example sampling 10 features per user.
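A per-user sampling strategy of this kind might look like the sketch below. It reads the example in the text as a cap of 10 features per user ID, which is an interpretation; the function name, the `(user_id, feature)` input layout, and the use of uniform random sampling are assumptions.

```python
import random
from collections import defaultdict
from typing import Dict, Hashable, Iterable, List, Tuple


def sample_per_user(features: Iterable[Tuple[Hashable, List[float]]],
                    per_user: int = 10, seed: int = 0) -> Dict[Hashable, list]:
    """Group features by user ID and keep at most `per_user` of them per user."""
    rng = random.Random(seed)  # fixed seed keeps the selection reproducible
    by_user: Dict[Hashable, list] = defaultdict(list)
    for user_id, feat in features:
        by_user[user_id].append(feat)
    return {
        user: feats if len(feats) <= per_user else rng.sample(feats, per_user)
        for user, feats in by_user.items()
    }
```

Capping the count per user keeps frequent users from dominating the training set while still retaining every user who appears in the returned data.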
204. Selecting hyperparameters from the feature data according to the state of the identity verification model, and training the identity verification model.
Specifically, hyperparameters are selected from the training samples according to the state of the identity verification model, and the identity verification model is trained.
Traditional model training often requires many hyperparameters to be chosen manually, which is inefficient and labor-intensive. The method therefore uses an automated training approach based on reinforcement learning.
When hyperparameters are selected from the training samples, reinforcement learning adjusts the model's hyperparameters (including the learning rate, momentum, and the like) according to the model's state (including loss function values, gradient values, and the like) during training. This approach can find a better set of hyperparameters within a huge hyperparameter search space and has been shown to be more effective than manual tuning.
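To make the idea of adjusting a hyperparameter from the training state concrete, the sketch below uses an epsilon-greedy bandit, which is a deliberately simplified stand-in for reinforcement learning and not the method of this embodiment. The toy loss `w**2`, the learning-rate multipliers in `ACTIONS`, the clamp bounds, and the reward definition (decrease in loss) are all assumptions chosen for the example.

```python
import random
from typing import Dict, Tuple

ACTIONS = (0.5, 1.0, 2.0)  # hypothetical multipliers applied to the learning rate


def train_with_bandit(steps: int = 50, lr: float = 0.01, epsilon: float = 0.2,
                      seed: int = 0) -> Tuple[float, float, float]:
    """Minimize the toy loss w**2 while a bandit adapts the learning rate.

    Returns (initial_loss, final_loss, final_lr).
    """
    rng = random.Random(seed)
    value: Dict[float, float] = {a: 0.0 for a in ACTIONS}  # mean reward per action
    count: Dict[float, int] = {a: 0 for a in ACTIONS}
    w = 5.0
    initial_loss = loss = w * w
    for _ in range(steps):
        # Explore with probability epsilon, otherwise pick the best-valued action.
        if rng.random() < epsilon:
            action = rng.choice(ACTIONS)
        else:
            action = max(ACTIONS, key=lambda a: value[a])
        lr = min(max(lr * action, 1e-4), 0.9)  # clamp keeps the toy update stable
        grad = 2.0 * w                          # state signal: gradient of w**2
        w -= lr * grad
        new_loss = w * w
        reward = loss - new_loss                # state signal: decrease in loss
        count[action] += 1
        value[action] += (reward - value[action]) / count[action]
        loss = new_loss
    return initial_loss, loss, lr
```

A fuller treatment would condition the action on the observed state (for example, recent loss and gradient magnitudes) and would also tune momentum; the bandit above only shows the feedback loop between training signals and a hyperparameter.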
205. Evaluating the trained identity verification model, and bringing the iterated identity verification model online after it passes the evaluation.
Evaluating the trained identity verification model includes:
evaluating the trained identity verification model according to an algorithm evaluation set and an algorithm accuracy metric.
Regarding automated model evaluation: because the performance of an automatically trained model has not yet been evaluated, it is unknown whether the model meets the performance standard for going online; automated evaluation therefore needs to be configured, with corresponding metrics set for the pass/fail decision. Automated model evaluation mainly covers two aspects: selection of the evaluation set and the algorithm accuracy metric.
Regarding the algorithm evaluation set: the algorithm evaluation set comprises a basic test set and a scenario test set. The basic test set mainly tests the generalization ability of the algorithm, and the scenario test set mainly tests the performance of the algorithm in the corresponding business scenario.
Regarding the algorithm accuracy metric: the choice of metric depends on the specific business scenario and the specific algorithm type. For example, for a face comparison algorithm, the false recognition rate is a major performance metric. Different business scenarios, such as payment and login, may have different performance targets. If the accuracy of the automatically trained model meets the target, the automated online deployment process is carried out.
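The evaluation gate can be sketched as a check of the false recognition rate on both test sets against a per-scenario limit. The threshold values in `MAX_FALSE_RECOGNITION_RATE` are invented for illustration (the source gives no numbers), and the pair-based input format and function names are likewise assumptions.

```python
from typing import Dict, Sequence, Tuple

# Hypothetical per-scenario limits; the source does not specify any values.
MAX_FALSE_RECOGNITION_RATE: Dict[str, float] = {"payment": 0.001, "login": 0.01}

# Each pair is (model_says_same_person, actually_same_person).
Pairs = Sequence[Tuple[bool, bool]]


def false_recognition_rate(pairs: Pairs) -> float:
    """Fraction of different-person pairs that the model wrongly accepts."""
    impostor_predictions = [pred for pred, truth in pairs if not truth]
    if not impostor_predictions:
        return 0.0
    return sum(impostor_predictions) / len(impostor_predictions)


def passes_evaluation(base_set: Pairs, scene_set: Pairs, scenario: str) -> bool:
    """The model may go online only if both test sets meet the scenario's limit."""
    limit = MAX_FALSE_RECOGNITION_RATE[scenario]
    return (false_recognition_rate(base_set) <= limit
            and false_recognition_rate(scene_set) <= limit)
```

Keeping the limits in a per-scenario table reflects the point that payment and login can demand different performance targets from the same model.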
According to the method provided by this embodiment of the invention, the received data is desensitized, and feature data is extracted and screened from the preprocessed data; this addresses the problem that the distributions of test data and training data drift apart over time and the resulting change in the data degrades the performance of the deep learning algorithm. Hyperparameters are selected from the feature data according to the state of the identity verification model, and the identity verification model is trained; this addresses the problem that the distributions of test data and training data differ across scenarios and the resulting change in the data degrades the performance of the deep learning algorithm. The method provided by the invention allows the updated model to adapt to each scenario, brings out the full performance of the algorithm, and improves the accuracy of the algorithm's identity verification.
As shown in Fig. 3, the five steps described above can be summarized as the following five processes:
301. Scenario data desensitization and backflow: different scenarios collect different data; after desensitization, the data is returned to the model server.
302. Scenario data feature extraction: the model server uses the current version of the model to automatically extract features from the returned data for subsequent model training.
303. Scenario data screening: because the returned data features contain noise and not all samples are needed by the model, a data filter is set up to filter the data.
304. Automated model training: automated machine learning techniques automatically select the model hyperparameters and train the model.
305. Automated model evaluation and online deployment: automated evaluation of the model is completed by setting up performance evaluation experiments, which determine whether the model goes online.
Through these five stages, the method can automatically uncover the shortcomings of the current algorithm from the desensitized scenario data and then compensate for them through automated model optimization iterations. As a result, the updated model adapts better to each scenario, and the full performance of the algorithm is brought out. In addition, by configuring a reasonable time interval, the method can update the model automatically, thereby counteracting the effect of time on the data distribution.
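The way the five stages chain together can be sketched as a single function that takes each stage as a callable. This is only an outline under assumptions: the function name `run_iteration` and its parameters are hypothetical, and each stage is left abstract because the source does not fix its implementation.

```python
from typing import Any, Callable, Optional


def run_iteration(raw_data: Any,
                  desensitize: Callable[[Any], Any],
                  extract_features: Callable[[Any], Any],
                  screen: Callable[[Any], Any],
                  train: Callable[[Any], Any],
                  evaluate: Callable[[Any], bool]) -> Optional[Any]:
    """Run one iteration; return the new model only if it passes evaluation."""
    safe_data = desensitize(raw_data)        # 301: desensitize before backflow
    features = extract_features(safe_data)   # 302: features from the current model
    samples = screen(features)               # 303: drop noise, select samples
    candidate = train(samples)               # 304: automated training
    # 305: the candidate goes online only when the evaluation passes.
    return candidate if evaluate(candidate) else None
```

Returning `None` on a failed evaluation leaves the currently deployed model untouched, so a poor iteration never replaces a working one.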
Fig. 4 is a schematic structural diagram of an iteration apparatus for an identity verification model according to an embodiment of the present invention. As shown in Fig. 4, the apparatus includes:
a desensitization processing module 41, configured to desensitize received data;
In this embodiment, desensitization is applied to the received data, that is, data from different scenarios. It can be understood as removing sensitive information from the received data, which reduces the risk of user information being stolen.
a feature data screening module 42, configured to extract and screen feature data from the desensitized data;
a model training module 43, configured to select hyperparameters from the feature data according to the state of the identity verification model and to train the identity verification model;
The state of the identity verification model comprises various numerical information, such as loss function values and gradient values.
The hyperparameters selected from the feature data include the learning rate, momentum, and the like.
a model online module 44, configured to evaluate the trained identity verification model and to bring the iterated identity verification model online after it passes the evaluation.
Because the performance of an automatically trained model has not yet been evaluated, it is unknown whether the model meets the performance standard for going online; automated evaluation therefore needs to be configured, with corresponding metrics set for the pass/fail decision.
Different business scenarios, such as payment and login, may use different performance metrics, so the trained identity verification model is brought online only after it is evaluated as qualified against the performance metrics of the required business scenario.
In the iteration apparatus for an identity verification model, the desensitization processing module desensitizes the received data, and the feature data screening module extracts and screens feature data from the preprocessed data; this addresses the problem that the distributions of test data and training data drift apart over time and the resulting change in the data degrades the performance of the deep learning algorithm. The model training module selects hyperparameters from the feature data according to the state of the identity verification model and trains the identity verification model; this addresses the problem that the distributions of test data and training data differ across scenarios and the resulting change in the data degrades the performance of the deep learning algorithm. The method provided by the invention allows the updated model to adapt to each scenario, brings out the full performance of the algorithm, and improves the accuracy of the algorithm's identity verification.
Fig. 5 is a schematic structural diagram of an iteration apparatus for an identity verification model according to an embodiment of the present invention. As shown in Fig. 5, the apparatus includes:
a desensitization processing module 51, which includes:
a judging unit, configured to start a timer after the received data is desensitized and, when the elapsed time is greater than or equal to a preset backflow time interval, to determine whether the current service is idle; and
a data backflow unit, configured to perform data backflow when the current service is idle, so as to execute the step of extracting and screening feature data from the desensitized data;
a feature data screening module 52, configured to preprocess the desensitized data and extract feature data from the preprocessed data;
wherein the feature data screening module 52 includes:
a preprocessing unit 521, configured to preprocess the desensitized data;
a feature data extraction unit 522, configured to extract feature data from the preprocessed data; and
a training sample acquisition unit 523, configured to post-process the extracted feature data and filter out abnormal feature data to obtain training samples.
In one embodiment of the present invention, the training sample acquisition unit 523 is configured to:
use a clustering algorithm to filter out, as abnormal feature data, data whose distance from the cluster center exceeds a preset distance; and
sample the data remaining after the abnormal feature data is filtered out, according to a preset sample selection strategy, to obtain training samples.
a model training module 53, configured to select hyperparameters from the feature data according to the state of the identity verification model and to train the identity verification model;
a model online module 54, configured to:
evaluate the trained identity verification model according to an algorithm evaluation set and an algorithm accuracy metric.
In one embodiment of the invention, the preprocessing unit is configured to:
if the desensitized data is image data, decode, rescale, and normalize the image data.
The apparatus shown in fig. 4 and fig. 5 of the present embodiment is an implementation apparatus of the method shown in fig. 1 and fig. 2 of the present embodiment, and the specific principle thereof is the same as that of the method shown in fig. 1 and fig. 2 of the present embodiment, and will not be repeated here.
In one embodiment of the invention, there is also provided a storage medium storing computer program instructions for execution in accordance with the method of an embodiment of the invention.
In one typical configuration of the invention, the computing devices each include one or more processors (CPUs), input/output interfaces, network interfaces, and memory.
The memory may include volatile memory in computer-readable media, such as random access memory (RAM), and/or non-volatile memory, such as read-only memory (ROM) or flash memory (flash RAM). Memory is an example of computer-readable media.
In one embodiment of the present invention, there is also provided a computing device including: a memory for storing computer program instructions and a processor for executing the computer program instructions, wherein the computer program instructions, when executed by the processor, trigger the computing device to perform the method of embodiments of the invention.
Computer-readable storage media include both permanent and non-permanent, removable and non-removable media, and information storage may be implemented by any method or technology. The information may be computer-readable instructions, data structures, program modules, or other data. Examples of computer storage media include, but are not limited to, phase-change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technology, compact disc read-only memory (CD-ROM), digital versatile discs (DVD) or other optical storage, magnetic cassettes, magnetic tape storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information accessible by a computing device.
It should be noted that the present invention may be implemented in software and/or a combination of software and hardware, e.g., using Application Specific Integrated Circuits (ASIC), a general purpose computer or any other similar hardware device. In some embodiments, the software program of the present invention may be executed by a processor to implement the above steps or functions. Likewise, the software programs of the present invention (including associated data structures) may be stored on a computer readable recording medium, such as RAM memory, magnetic or optical drive or diskette and the like. In addition, some steps or functions of the present invention may be implemented in hardware, for example, as circuitry that cooperates with the processor to perform various steps or functions.
It will be evident to those skilled in the art that the invention is not limited to the details of the foregoing illustrative embodiments, and that the present invention may be embodied in other specific forms without departing from the spirit or essential characteristics thereof. The present embodiments are, therefore, to be considered in all respects as illustrative and not restrictive, the scope of the invention being indicated by the appended claims rather than by the foregoing description, and all changes which come within the meaning and range of equivalency of the claims are therefore intended to be embraced therein. Any reference sign in a claim should not be construed as limiting the claim concerned. Furthermore, it is evident that the word "comprising" does not exclude other elements or steps, and that the singular does not exclude a plurality. A plurality of units or means recited in the apparatus claims can also be implemented by means of one unit or means in software or hardware. The terms first, second, etc. are used to denote a name, but not any particular order.