CN111832591B - Machine learning model training method and device
- Publication number
- CN111832591B (application number CN201910327485.8A)
- Authority
- CN
- China
- Prior art keywords
- sample data
- model
- global
- model training
- value
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/214—Generating training patterns; Bootstrap methods, e.g. bagging or boosting
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N20/00—Machine learning
Abstract
The present disclosure provides a model training method and apparatus. At the global model device, a training sample data set is divided into a plurality of independent training sample data subsets, a plurality of global sub-models are each trained independently with one of the training sample data subsets, and the plurality of global sub-models are subjected to model fusion to obtain a global model. When a local model is to be trained, the local model training device sends unlabeled sample data to the global model side, label values for the sample data are obtained using the global model, the user's local model is then trained locally using the sample data and the corresponding label values, and the trained local model is deployed locally at the user to provide model prediction service. With this model training method and apparatus, leakage of the training data can be prevented.
Description
Technical Field
The present disclosure relates generally to the field of computer technology, and more particularly, to machine learning model training methods and apparatus.
Background
In some machine learning applications, training a machine learning model may involve sensitive data. For example, training a model that detects whether a picture contains a face requires a large amount of face data, and training a model for medical diagnosis requires a large amount of personal privacy data.
It has been found that, using reverse-engineering techniques, the training data used in model training can be reconstructed from the prediction results of a machine learning model. Conventional model training methods therefore carry a high risk of disclosing personal privacy data: for example, an attacker may obtain a large number of model prediction results through a large number of queries and then reconstruct the training data from those prediction results, thereby obtaining the personal privacy data that served as model training data.
Disclosure of Invention
In view of the foregoing, the present disclosure provides a model training method and apparatus. With this model training method and apparatus, leakage of the training data can be prevented.
According to one aspect of the present disclosure, there is provided a method for model training, comprising: transmitting at least one first sample data to a global model apparatus to obtain a label value of the at least one first sample data based on a global model at the global model apparatus, the first sample data being unlabeled sample data; and training a local model locally at the user using the at least one first sample data and the corresponding label values, wherein the global model comprises at least one global sub-model, each global sub-model being trained with an independent second sample data set.
Optionally, in one example of the above aspect, the second sample data sets are obtained by dividing a sample data set or are acquired by different data acquisition devices.
Optionally, in one example of the above aspect, the method may further include: collecting the at least one first sample data locally at the user.
Optionally, in one example of the above aspect, the at least one first sample data is public sample data.
Optionally, in one example of the above aspect, the label value of each of the at least one first sample data is obtained by inputting the first sample data into each of the at least one global sub-model for prediction and fusing the obtained predicted values of the global sub-models.
Optionally, in one example of the above aspect, the predicted value of each global sub-model is a predicted value after noise addition processing.
According to another aspect of the present disclosure, there is provided a method for model training, comprising: receiving at least one first sample data from a user's local side, the first sample data being unlabeled sample data; providing the at least one first sample data to a global model to obtain a label value of the at least one first sample data; and transmitting the obtained label value of the at least one first sample data to the user's local side to train a local model using the at least one first sample data and the corresponding label values locally at the user, wherein the global model comprises at least one global sub-model, each global sub-model being trained with an independent second sample data set.
Optionally, in one example of the above aspect, the second sample data sets are obtained by dividing a sample data set or are acquired by different data acquisition devices.
Optionally, in one example of the above aspect, providing the at least one first sample data to the global model to obtain the label value of the at least one first sample data includes: inputting each first sample data of the at least one first sample data into each of the at least one global sub-model for prediction; and fusing the obtained predicted values of the global sub-models for each first sample data to obtain the label value of that sample data.
Optionally, in one example of the above aspect, providing the at least one first sample data to the global model to obtain the label value of the at least one first sample data further includes: performing noise addition processing on the obtained predicted values of the global sub-models, wherein fusing the obtained predicted values of the global sub-models for each first sample data to obtain the label value of that sample data includes: fusing the noise-added predicted values obtained for each first sample data to obtain the label value of that sample data.
According to another aspect of the present disclosure, there is provided an apparatus for model training, comprising: a sample data transmitting unit configured to transmit at least one first sample data to a global model apparatus to obtain a label value of the at least one first sample data based on a global model at the global model apparatus, the first sample data being unlabeled sample data; a label value receiving unit configured to receive the label value of the at least one first sample data; and a local model training unit configured to train a local model locally at the user using the at least one first sample data and the corresponding label values, wherein the global model comprises at least one global sub-model, each global sub-model being trained with an independent second sample data set.
Optionally, in one example of the above aspect, the apparatus may further include: a sample data acquisition unit configured to collect the at least one first sample data locally at the user.
Optionally, in one example of the above aspect, the at least one first sample data is public sample data.
According to another aspect of the present disclosure, there is provided an apparatus for model training, comprising: a sample data receiving unit configured to receive at least one first sample data from a user's local side, the first sample data being unlabeled sample data; a label value acquisition unit configured to provide the at least one first sample data to a global model to obtain a label value of the at least one first sample data; and a label value transmitting unit configured to transmit the obtained label value of the at least one first sample data to the user's local side to train a local model using the at least one first sample data and the corresponding label values locally at the user, wherein the global model comprises at least one global sub-model, each global sub-model being trained with an independent second sample data set.
Optionally, in one example of the above aspect, the apparatus may further include: at least one global sub-model training unit configured to train each of the at least one global sub-model using an independent second sample data set.
Optionally, in one example of the above aspect, the label value acquisition unit includes: a prediction module configured to input each first sample data of the at least one first sample data into each of the at least one global sub-model for prediction; and a data fusion module configured to fuse the obtained predicted values of the global sub-models for each first sample data to obtain the label value of that sample data.
Optionally, in one example of the above aspect, the apparatus may further include: a noise addition module configured to perform noise addition processing on the obtained predicted values of the global sub-models, wherein the data fusion module is configured to: fuse the noise-added predicted values obtained for each first sample data to obtain the label value of that sample data.
According to another aspect of the present disclosure, there is provided a system for model training, comprising: the above apparatus for model training on the user's local side; and the above apparatus for model training on the remote side.
According to another aspect of the present disclosure, there is provided a computing device comprising: at least one processor, and a memory coupled to the at least one processor, the memory storing instructions that, when executed by the at least one processor, cause the at least one processor to perform the method for user local model training as described above.
According to another aspect of the disclosure, there is provided a machine-readable storage medium storing executable instructions that, when executed, cause the machine to perform a method for user local model training as described above.
According to another aspect of the present disclosure, there is provided a computing device comprising: at least one processor, and a memory coupled to the at least one processor, the memory storing instructions that, when executed by the at least one processor, cause the at least one processor to perform the method for global model training as described above.
According to another aspect of the disclosure, there is provided a machine-readable storage medium storing executable instructions that, when executed, cause the machine to perform a method for global model training as described above.
Drawings
A further understanding of the nature and advantages of the present disclosure may be realized by reference to the following drawings. In the drawings, similar components or features may have the same reference numerals.
FIG. 1 illustrates a block diagram of a system for model training according to an embodiment of the present disclosure;
FIG. 2 illustrates a flow chart of a method for model training according to an embodiment of the present disclosure;
FIG. 3 illustrates a flowchart of a process for obtaining a label value for first sample data, according to an embodiment of the present disclosure;
FIG. 4 illustrates a block diagram of one example of a local model training device, according to an embodiment of the present disclosure;
FIG. 5 illustrates a block diagram of one example of a global model apparatus according to an embodiment of the present disclosure;
FIG. 6 illustrates a block diagram of one implementation example of a label value acquisition unit, according to an embodiment of the present disclosure;
FIG. 7 illustrates a block diagram of a computing device for local model training in accordance with an embodiment of the present disclosure;
FIG. 8 illustrates a block diagram of a computing device for model training on the remote side according to an embodiment of the present disclosure.
Detailed Description
The subject matter described herein will now be discussed with reference to example embodiments. It should be appreciated that these embodiments are discussed only to enable a person skilled in the art to better understand and thereby practice the subject matter described herein, and are not limiting of the scope, applicability, or examples set forth in the claims. Changes may be made in the function and arrangement of elements discussed without departing from the scope of the disclosure. Various examples may omit, replace, or add various procedures or components as desired. For example, the described methods may be performed in a different order than described, and various steps may be added, omitted, or combined. In addition, features described with respect to some examples may be combined in other examples as well.
As used herein, the term "comprising" and variations thereof are open-ended terms, meaning "including, but not limited to". The term "based on" means "based at least in part on". The terms "one embodiment" and "an embodiment" mean "at least one embodiment". The term "another embodiment" means "at least one other embodiment". The terms "first," "second," and the like may refer to different or the same objects. Other definitions, whether explicit or implicit, may be included below. Unless the context clearly indicates otherwise, the definition of a term is consistent throughout this specification.
Embodiments of the present disclosure propose a new model training scheme. In this scheme, a sample data set for model training is divided into a plurality of independent sample data subsets, and a plurality of global sub-models are then trained independently, each with one of the independent sample data subsets; alternatively, each global sub-model in the global model is trained by a separate training party that holds its own sample data subset. The obtained global sub-models are then subjected to fusion processing to obtain a global model. Here, "performing fusion processing on the global sub-models to obtain the global model" means that the global model is composed of the plurality of global sub-models, and that, when prediction is performed using the global model, the prediction result of the global model is obtained by fusing the prediction results of the respective global sub-models through a certain mechanism. Further, public sample data is collected locally at the user, the public sample data being unlabeled sample data. The collected public sample data is transmitted to the global model side, a label value of the public sample data (i.e., a predicted value of the global model) is obtained using the global model, a user local model is trained locally at the user using the public sample data and the corresponding label values, and the trained local model is deployed locally at the user, for example on a mobile phone, to provide model prediction service. By combining the global model with the local model in this way, the training data for the global model is distributed, which avoids the possibility of all the data being leaked at once; training a local model speeds up model prediction; and because the local model only uses public data, leakage of private data is avoided.
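As an illustration of the partitioning and independent training described above, the following Python sketch (a hypothetical example rather than the claimed implementation; the use of scikit-learn classifiers, the number of partitions, and the function names are assumptions) splits a labeled training set into disjoint subsets and trains one global sub-model per subset:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def train_global_submodels(X, y, num_partitions=5, seed=0):
    """Divide the labeled sample data set into independent subsets and
    train one global sub-model per subset, each in isolation."""
    rng = np.random.default_rng(seed)
    indices = rng.permutation(len(X))
    submodels = []
    for part in np.array_split(indices, num_partitions):
        # Each sub-model only ever sees its own sample data subset.
        submodel = LogisticRegression(max_iter=1000)
        submodel.fit(X[part], y[part])
        submodels.append(submodel)
    return submodels
```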
Methods and apparatuses for model training according to embodiments of the present disclosure will be described in detail below with reference to the accompanying drawings.
Fig. 1 shows a block diagram of a system for model training (hereinafter referred to as model training system 10) according to an embodiment of the present disclosure. As shown in fig. 1, model training system 10 includes a local model training device 110 and a global model device 120.
When performing model training, the local model training device 110 transmits at least one first sample data to the global model device 120, the at least one first sample data being unlabeled sample data, i.e., data having no label value. In one example of the present disclosure, the at least one first sample data may be public data collected locally at the user, i.e., the first sample data used for local model training is not the user's own private data.
The global model device 120 has a global model. The global model includes at least one global sub-model, each of which is trained using an independent second sample data set. In the present disclosure, a second sample data set may be a sample data subset obtained by dividing a sample data set used for global model training, or may be acquired by a different data acquisition device. The second sample data sets are independent of one another, and the corresponding global sub-models are trained in independent training environments, such as on independent training devices. When the global model is used for prediction, the prediction result of the global model is obtained by fusing the prediction results of all the global sub-models through a certain mechanism.
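A minimal sketch of such a fusion mechanism is given below (an illustrative assumption; the disclosure does not fix a particular mechanism, and averaging of class probabilities is only one possible choice): the global model merely wraps the independently trained sub-models, and its prediction is the fused sub-model output.

```python
import numpy as np

class GlobalModel:
    """Global model composed of independently trained global sub-models;
    its prediction is obtained by fusing the sub-models' predictions."""

    def __init__(self, submodels):
        self.submodels = submodels

    def predict_proba(self, X):
        # Fusion mechanism (assumed here): average the class-probability
        # predictions of all sub-models; majority voting would also fit.
        return np.mean([m.predict_proba(X) for m in self.submodels], axis=0)

    def predict(self, X):
        return np.argmax(self.predict_proba(X), axis=1)
```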
Upon receiving the at least one first sample data transmitted by the local model training device 110, the global model device 120 obtains a corresponding label value for each first sample data based on the global model. How label values are obtained based on the global model will be described in detail later with reference to the drawings.
Then, the global model device 120 transmits the obtained label value of each first sample data to the local model training device 110. The local model training device 110 uses the at least one first sample data and the corresponding label values to train the local model locally at the user. The trained local model is then deployed locally for subsequent model prediction.
Fig. 2 shows a flowchart of a method for model training according to an embodiment of the present disclosure.
As shown in FIG. 2, at block 210, the local model training device 110 collects at least one first sample data. The collected at least one first sample data is then transmitted to the global model device 120 at block 220.
At the global model device 120, at block 230, the label values of the respective first sample data are predicted based on the global model.
FIG. 3 shows a flowchart of a process for obtaining a label value for first sample data according to an embodiment of the present disclosure.
As shown in FIG. 3, at block 310, each of the at least one first sample data is input into each global sub-model in the global model for prediction, so as to obtain the corresponding predicted values.
Next, at block 320, noise addition processing is performed on the predicted values obtained from the respective global sub-models for each first sample data. Here, the noise may be, for example, Gaussian noise or Laplace noise. For example, the noise may be generated according to the data distribution of the at least one first sample data, using, for example, the sample data mean or variance.
Then, at block 330, the predicted values obtained from the respective global sub-models for each first sample data are fused to obtain the label value of that sample data.
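The following sketch puts blocks 310-330 together for a batch of unlabeled first sample data (for illustration only; the Laplace noise, the noise scale derived from the sample statistics, and the averaging fusion are assumptions, since the disclosure leaves the noise distribution and the fusion mechanism open):

```python
import numpy as np

def label_first_sample_data(submodels, X_first, seed=0):
    """Blocks 310-330: predict each first sample with every global
    sub-model, add noise to each sub-model's predicted values, and fuse
    the noisy predictions into a label value per sample."""
    rng = np.random.default_rng(seed)
    # Assumed noise scale derived from the first sample data statistics.
    noise_scale = 0.05 * float(np.std(X_first))
    noisy_predictions = []
    for submodel in submodels:
        probs = submodel.predict_proba(X_first)                   # block 310
        noise = rng.laplace(0.0, noise_scale, size=probs.shape)   # block 320
        noisy_predictions.append(probs + noise)
    fused = np.mean(noisy_predictions, axis=0)                    # block 330
    return np.argmax(fused, axis=1)  # label value for each first sample
```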
Here, it is to be noted that, in other examples of the present disclosure, the process of obtaining the label value of the first sample data shown in FIG. 3 may not include the operation of block 320.
After the label values of the respective first sample data are obtained as described above, the global model device 120 sends the obtained label values to the local model training device 110 at block 240.
Upon receiving the label values of the respective first sample data, at block 250, the local model training device 110 trains the local model locally at the user using each first sample data and the corresponding label value.
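On the user's local side, the local model can then simply be fit to the public first sample data and the label values returned by the global model device; a minimal sketch follows (assuming a scikit-learn-style local model, which is an illustrative choice rather than part of the disclosure):

```python
from sklearn.tree import DecisionTreeClassifier

def train_local_model(X_first, label_values):
    """Train the user's local model on the public first sample data and
    the label values obtained from the global model side."""
    local_model = DecisionTreeClassifier(max_depth=5)
    local_model.fit(X_first, label_values)
    return local_model  # deployed locally at the user, e.g. on a phone

# Example usage (hypothetical data and helper names from the sketches above):
#   submodels = train_global_submodels(X_private, y_private)
#   label_values = label_first_sample_data(submodels, X_public)
#   local_model = train_local_model(X_public, label_values)
```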
With the method and apparatus for model training according to the embodiments of the present disclosure, model training is performed by combining the global model with the local model. On the one hand, when the global model is trained, the training data is divided into a plurality of parts, each of which is kept in an independent environment (for example, with an independent training party), so that leakage of all the training data at once is avoided. On the other hand, training a local model using unlabeled sample data and the corresponding label values obtained from the global model speeds up model prediction.
With the method and apparatus for model training according to the embodiments of the present disclosure, adding noise to the predicted values of the respective global sub-models also prevents the global sub-models from being reverse-engineered. For example, a malicious user could otherwise perform model prediction on a large number of input samples to obtain a large number of prediction results and, when the number of input samples is large enough, use those prediction results to recover the global sub-models and the training sample data they were trained on, so that the global sub-models and the training sample data are leaked. After noise is added, a malicious user cannot recover the global sub-models, which ensures the security of the global sub-models and the training sample data.
In addition, with the method and apparatus for model training according to the embodiments of the present disclosure, leakage of the user's local private data is avoided, since the first sample data used in local model training is public data collected locally at the user.
A method for model training according to an embodiment of the present disclosure is described above with reference to fig. 1 to 3, and an apparatus for model training according to an embodiment of the present disclosure will be described below with reference to fig. 4 to 6.
FIG. 4 shows a block diagram of the local model training device 110 according to an embodiment of the present disclosure. As shown in FIG. 4, the local model training device 110 includes a sample data transmitting unit 111, a label value receiving unit 113, and a local model training unit 115.
The sample data transmitting unit 111 is configured to transmit at least one first sample data to the global model device 120 to obtain a label value of the at least one first sample data based on the global model at the global model device 120, the first sample data being unlabeled sample data. The global model includes at least one global sub-model, each of which is trained using an independent second sample data set. For the operation of the sample data transmitting unit 111, reference may be made to the operation of block 220 described above with reference to FIG. 2.
The label value receiving unit 113 is configured to receive the label value of the at least one first sample data. For the operation of the label value receiving unit 113, reference may be made to the operation of block 240 described above with reference to FIG. 2.
The local model training unit 115 is configured to train the local model locally at the user using the at least one first sample data and the corresponding label values. For the operation of the local model training unit 115, reference may be made to the operation of block 250 described above with reference to FIG. 2.
Further, in another example of the present disclosure, the local model training device 110 may also include a sample data acquisition unit (not shown). The sample data acquisition unit is configured to acquire at least one first sample data locally at a user.
FIG. 5 shows a block diagram of the global model device 120 according to an embodiment of the present disclosure. As shown in FIG. 5, the global model device 120 includes a sample data receiving unit 121, a label value acquisition unit 123, and a label value transmitting unit 125.
The sample data receiving unit 121 is configured to receive at least one first sample data from a user's local side, the first sample data being unlabeled sample data. For the operation of the sample data receiving unit 121, reference may be made to the operation of block 220 described above with reference to FIG. 2.
The label value acquisition unit 123 is configured to provide the at least one first sample data to the global model to obtain the label value of the at least one first sample data. For the operation of the label value acquisition unit 123, reference may be made to the operation of block 230 described above with reference to FIG. 2 and the operations described with reference to FIG. 3.
The label value transmitting unit 125 is configured to transmit the obtained label value of the at least one first sample data to the user's local side to train the local model using the at least one first sample data and the corresponding label values locally at the user. For the operation of the label value transmitting unit 125, reference may be made to the operation of block 240 described above with reference to FIG. 2.
FIG. 6 shows a block diagram of one implementation example of the label value acquisition unit 123 according to an embodiment of the present disclosure. As shown in FIG. 6, the label value acquisition unit 123 includes a prediction module 124 and a data fusion module 126.
The prediction module 124 is configured to input each first sample data of the at least one first sample data into each of the at least one global sub-model for prediction.
The data fusion module 126 is configured to fuse the predicted values of the respective global sub-models obtained for each first sample data to obtain the label value of that sample data.
In another example of the present disclosure, the label value acquisition unit 123 may further include a noise addition module (not shown). The noise addition module is configured to perform noise addition processing on the predicted values obtained from the respective global sub-models. The data fusion module then fuses the noise-added predicted values obtained for each first sample data to obtain the label value of that sample data.
Embodiments of a method for model training and an apparatus for model training according to the present disclosure are described above with reference to fig. 1 to 6. The above means for model training may be implemented in hardware, or in software, or a combination of hardware and software.
FIG. 7 illustrates a hardware block diagram of a computing device 700 for local model training according to an embodiment of the present disclosure. As shown in FIG. 7, computing device 700 may include at least one processor 710, a memory 720, a storage 730, and a communication interface 740, which are connected together via a bus 760. The at least one processor 710 executes at least one computer-readable instruction (i.e., an element described above as being implemented in software) stored or encoded in the memory.
In one embodiment, computer-executable instructions are stored in the memory that, when executed, cause the at least one processor 710 to: transmit at least one first sample data to a global model apparatus to obtain a label value of the at least one first sample data based on a global model at the global model apparatus, the first sample data being unlabeled sample data; and train a local model locally at the user using the at least one first sample data and the corresponding label values, wherein the global model comprises at least one global sub-model, each global sub-model being trained with an independent second sample data set.
It should be appreciated that the computer-executable instructions stored in the memory, when executed, cause the at least one processor 710 to perform the various operations and functions described above in connection with fig. 1-6 in various embodiments of the present disclosure.
FIG. 8 illustrates a hardware block diagram of a computing device 800 for model training on the remote side (i.e., the global model apparatus described above) according to an embodiment of the present disclosure. As shown in FIG. 8, computing device 800 may include at least one processor 810, a memory 820, a storage 830, and a communication interface 840, which are connected together via a bus 860. The at least one processor 810 executes at least one computer-readable instruction (i.e., an element described above as being implemented in software) stored or encoded in the memory.
In one embodiment, computer-executable instructions are stored in the memory that, when executed, cause the at least one processor 810 to: receive at least one first sample data from a user's local side, the first sample data being unlabeled sample data; provide the at least one first sample data to a global model to obtain a label value of the at least one first sample data; and transmit the obtained label value of the at least one first sample data to the user's local side to train a local model using the at least one first sample data and the corresponding label values locally at the user, wherein the global model comprises at least one global sub-model, each global sub-model being trained with an independent second sample data set.
It should be appreciated that the computer-executable instructions stored in the memory, when executed, cause the at least one processor 810 to perform the various operations and functions described above in connection with fig. 1-6 in various embodiments of the present disclosure.
In this disclosure, computing devices 700/800 may include, but are not limited to: personal computers, server computers, workstations, desktop computers, laptop computers, notebook computers, mobile computing devices, smart phones, tablet computers, cellular phones, personal digital assistants (PDAs), handsets, messaging devices, wearable computing devices, consumer electronic devices, and the like.
According to one embodiment, a program product, such as a machine-readable medium, is provided. The machine-readable medium may have instructions (i.e., the elements described above implemented in software) that, when executed by a machine, cause the machine to perform the various operations and functions described above in connection with fig. 1-5 in various embodiments of the disclosure. In particular, a system or apparatus provided with a readable storage medium having stored thereon software program code implementing the functions of any of the above embodiments may be provided, and a computer or processor of the system or apparatus may be caused to read out and execute instructions stored in the readable storage medium.
In this case, the program code itself read from the readable medium may implement the functions of any of the above-described embodiments, and thus the machine-readable code and the readable storage medium storing the machine-readable code form part of the present invention.
Examples of readable storage media include floppy disks, hard disks, magneto-optical disks, optical disks (e.g., CD-ROMs, CD-R, CD-RWs, DVD-ROMs, DVD-RAMs, DVD-RWs), magnetic tapes, nonvolatile memory cards, and ROMs. Alternatively, the program code may be downloaded from a server computer or cloud by a communications network.
It will be appreciated by those skilled in the art that various changes and modifications can be made to the embodiments disclosed above without departing from the spirit of the invention. Accordingly, the scope of the invention should be limited only by the attached claims.
It should be noted that not all the steps and units in the above flowcharts and the system configuration diagrams are necessary, and some steps or units may be omitted according to actual needs. The order of execution of the steps is not fixed and may be determined as desired. The apparatus structures described in the above embodiments may be physical structures or logical structures, that is, some units may be implemented by the same physical entity, or some units may be implemented by multiple physical entities, or may be implemented jointly by some components in multiple independent devices.
In the above embodiments, the hardware units or modules may be implemented mechanically or electrically. For example, a hardware unit, module or processor may include permanently dedicated circuitry or logic (e.g., a dedicated processor, FPGA or ASIC) to perform the corresponding operations. The hardware unit or processor may also include programmable logic or circuitry (e.g., a general purpose processor or other programmable processor) that may be temporarily configured by software to perform the corresponding operations. The particular implementation (mechanical, or dedicated permanent, or temporarily set) may be determined based on cost and time considerations.
The detailed description set forth above in connection with the appended drawings describes exemplary embodiments, but does not represent all embodiments that may be implemented or fall within the scope of the claims. The term "exemplary" used throughout this specification means "serving as an example, instance, or illustration," and does not mean "preferred" or "advantageous over other embodiments. The detailed description includes specific details for the purpose of providing an understanding of the described technology. However, the techniques may be practiced without these specific details. In some instances, well-known structures and devices are shown in block diagram form in order to avoid obscuring the concepts of the described embodiments.
The previous description of the disclosure is provided to enable any person skilled in the art to make or use the disclosure. Various modifications to the disclosure will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other variations without departing from the scope of the disclosure. Thus, the disclosure is not limited to the examples and designs described herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.
Claims (17)
1. A method for model training, comprising:
transmitting at least one first sample data to a global model apparatus to obtain a label value of the at least one first sample data based on a global model at the global model apparatus, the first sample data being unlabeled sample data and being public data collected locally at a user; and
training a local model locally at the user using the at least one first sample data and the corresponding label values,
wherein the global model comprises at least one global sub-model, each global sub-model is trained using second sample data sets that are independent of each other, and the second sample data sets are obtained by dividing a sample data set or are acquired by different data acquisition devices.
2. The method of claim 1, further comprising:
the at least one first sample data is collected locally at the user.
3. The method of claim 1, wherein the label value of each of the at least one first sample data is obtained by inputting the first sample data into each of the at least one global sub-model for prediction and fusing the obtained predicted values of the respective global sub-models.
4. The method of claim 3, wherein the predicted values of the respective global sub-models are predicted values after noise addition processing.
5. A method for model training, comprising:
transmitting, via a local model training device, at least one first sample data collected locally at a user to a global model training device, the first sample data being unlabeled sample data and being public data collected locally at the user;
providing, via the global model training device, the at least one first sample data to a global model to obtain a label value of the at least one first sample data, and transmitting the obtained label value of the at least one first sample data to the local model training device; and
training, via the local model training device, a local model locally at the user using the at least one first sample data and the corresponding label values,
wherein the global model comprises at least one global sub-model, each global sub-model is trained using second sample data sets that are independent of each other, and the second sample data sets are obtained by dividing a sample data set or are acquired by different data acquisition devices.
6. The method of claim 5, wherein providing, via the global model training device, the at least one first sample data to the global model to obtain the label value of the at least one first sample data comprises:
inputting, via the global model training device, each first sample data of the at least one first sample data into each of the at least one global sub-model for prediction; and
fusing, via the global model training device, the obtained predicted values of the global sub-models for each first sample data to obtain the label value of that sample data.
7. The method of claim 6, wherein providing, via the global model training device, the at least one first sample data to the global model to obtain the label value of the at least one first sample data further comprises:
performing, via the global model training device, noise addition processing on the obtained predicted values of the respective global sub-models,
wherein fusing, via the global model training device, the obtained predicted values of the global sub-models for each first sample data to obtain the label value of that sample data comprises:
fusing, via the global model training device, the noise-added predicted values obtained for each first sample data to obtain the label value of that sample data.
8. An apparatus for model training, comprising:
a sample data transmitting unit configured to transmit at least one first sample data to a global model apparatus to obtain a label value of the at least one first sample data based on a global model at the global model apparatus, the first sample data being unlabeled sample data and being public data collected locally at a user;
a label value receiving unit configured to receive the label value of the at least one first sample data; and
a local model training unit configured to train a local model locally at the user using the at least one first sample data and the corresponding label values,
wherein the global model comprises at least one global sub-model, each global sub-model is trained using second sample data sets that are independent of each other, and the second sample data sets are obtained by dividing a sample data set or are acquired by different data acquisition devices.
9. The apparatus of claim 8, further comprising:
a sample data acquisition unit configured to acquire the at least one first sample data locally at a user.
10. A system for model training, comprising:
a local model training device comprising an apparatus for model training according to claim 8 or 9; and
a global model training device, the global model training device comprising:
a sample data receiving unit configured to receive at least one first sample data from a user's local side, the first sample data being unlabeled sample data and being public data collected locally at the user;
a label value acquisition unit configured to provide the at least one first sample data to a global model to obtain a label value of the at least one first sample data; and
a label value transmitting unit configured to transmit the obtained label value of the at least one first sample data to the user's local side to train a local model using the at least one first sample data and the corresponding label values locally at the user.
11. The system of claim 10, wherein the global model training device further comprises:
at least one global sub-model training unit configured to train each of the at least one global sub-model using second sample data sets that are independent of each other.
12. The system of claim 10, wherein the label value acquisition unit comprises:
a prediction module configured to input each first sample data of the at least one first sample data into each of the at least one global sub-model for prediction; and
a data fusion module configured to fuse the obtained predicted values of the global sub-models for each first sample data to obtain the label value of that sample data.
13. The system of claim 12, wherein the label value acquisition unit further comprises:
a noise addition module configured to perform noise addition processing on the obtained predicted values of the respective global sub-models,
wherein the data fusion module is configured to: fuse the noise-added predicted values obtained for each first sample data to obtain the label value of that sample data.
14. A computing device, comprising:
At least one processor, and
A memory coupled to the at least one processor, the memory storing instructions that, when executed by the at least one processor, cause the at least one processor to perform the method of any of claims 1-4.
15. A machine-readable storage medium storing executable instructions that, when executed, cause the machine to perform the method of any one of claims 1 to 4.
16. A computing device, comprising:
At least one processor, and
A memory coupled to the at least one processor, the memory storing instructions that, when executed by the at least one processor, cause the at least one processor to perform the method of any of claims 5 to 7.
17. A machine-readable storage medium storing executable instructions that, when executed, cause the machine to perform the method of any of claims 5 to 7.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910327485.8A CN111832591B (en) | 2019-04-23 | 2019-04-23 | Machine learning model training method and device |
Publications (2)
Publication Number | Publication Date |
---|---|
CN111832591A CN111832591A (en) | 2020-10-27 |
CN111832591B true CN111832591B (en) | 2024-06-04 |
Family
ID=72912298
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201910327485.8A Active CN111832591B (en) | 2019-04-23 | 2019-04-23 | Machine learning model training method and device |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN111832591B (en) |
Families Citing this family (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN114548416A (en) * | 2020-11-26 | 2022-05-27 | 华为技术有限公司 | Data model training method and device |
CN113420322B (en) * | 2021-05-24 | 2023-09-01 | 阿里巴巴新加坡控股有限公司 | Model training and desensitizing method and device, electronic equipment and storage medium |
CN113689000A (en) * | 2021-08-25 | 2021-11-23 | 深圳前海微众银行股份有限公司 | Federal learning model training method and device, electronic equipment and storage medium |
Citations (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN104573720A (en) * | 2014-12-31 | 2015-04-29 | 北京工业大学 | Distributed training method for kernel classifiers in wireless sensor network |
CN107169573A (en) * | 2017-05-05 | 2017-09-15 | 第四范式(北京)技术有限公司 | Using composite machine learning model come the method and system of perform prediction |
WO2018033890A1 (en) * | 2016-08-19 | 2018-02-22 | Linear Algebra Technologies Limited | Systems and methods for distributed training of deep learning models |
CN107967491A (en) * | 2017-12-14 | 2018-04-27 | 北京木业邦科技有限公司 | Machine learning method, device, electronic equipment and the storage medium again of plank identification |
CN108289115A (en) * | 2017-05-10 | 2018-07-17 | 腾讯科技(深圳)有限公司 | A kind of information processing method and system |
CN108491720A (en) * | 2018-03-20 | 2018-09-04 | 腾讯科技(深圳)有限公司 | A kind of application and identification method, system and relevant device |
CN108764065A (en) * | 2018-05-04 | 2018-11-06 | 华中科技大学 | A kind of method of pedestrian's weight identification feature fusion assisted learning |
Family Cites Families (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US11443226B2 (en) * | 2017-05-17 | 2022-09-13 | International Business Machines Corporation | Training a machine learning model in a distributed privacy-preserving environment |
Also Published As
Publication number | Publication date |
---|---|
CN111832591A (en) | 2020-10-27 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||