CN114548300B - Method and device for explaining service processing result of service processing model - Google Patents
- Publication number: CN114548300B
- Application number: CN202210181701.4A
- Authority
- CN
- China
- Prior art keywords: samples, sample, disturbance, hidden, model
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/214—Generating training patterns; Bootstrap methods, e.g. bagging or boosting
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
- G06F18/241—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N20/00—Machine learning
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q20/00—Payment architectures, schemes or protocols
- G06Q20/38—Payment protocols; Details thereof
- G06Q20/40—Authorisation, e.g. identification of payer or payee, verification of customer or shop credentials; Review and approval of payers, e.g. check credit lines or negative lists
- G06Q20/401—Transaction verification
- G06Q20/4014—Identity check for transactions
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q20/00—Payment architectures, schemes or protocols
- G06Q20/38—Payment protocols; Details thereof
- G06Q20/40—Authorisation, e.g. identification of payer or payee, verification of customer or shop credentials; Review and approval of payers, e.g. check credit lines or negative lists
- G06Q20/401—Transaction verification
- G06Q20/4016—Transaction verification involving fraud or risk level assessment in transaction processing
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q40/00—Finance; Insurance; Tax strategies; Processing of corporate or income taxes
- G06Q40/03—Credit; Loans; Processing thereof
Abstract
Embodiments of this specification provide a method and device for explaining the service processing result of a service processing model. The method comprises: inputting a sample to be explained into a pre-trained generation model to obtain a first number of disturbance samples, where the sample to be explained and the disturbance samples each contain a plurality of feature dimensions; inputting the sample to be explained and the first number of disturbance samples into a service processing model implemented as a neural network, which outputs a first service processing result for the sample to be explained and a second service processing result for each disturbance sample; screening a second number of disturbance samples from the first number of disturbance samples, using consistency between the second and first service processing results as the screening condition; and computing the difference between the second number of disturbance samples and the sample to be explained in each feature dimension, then explaining the first service processing result from those per-dimension differences. The method reduces computational complexity and improves efficiency.
Description
This application is a divisional of the application filed December 20, 2019, with application number 201911326360.X, entitled "Method and device for explaining the service processing result of a service processing model".
Technical Field
One or more embodiments of the present specification relate to the field of computers, and more particularly, to a method and apparatus for interpreting business process results of a business process model.
Background
Machine learning is now widely used in retail, technology, healthcare, science, and other fields. Whether a classification model or a regression model produces a result or decision, the decision process itself is invisible and unintelligible to people. The decision process of a service processing model implemented as a neural network differs sharply from that of rules, which people accept and understand more easily: a rule-based decision follows an understandable, traceable decision path, whereas the model's decision is a black-box process that exposes only inputs and outputs, so the process is opaque to the user and, even when a decision is wrong, it cannot be traced. This untraceable, uncontrollable black-box nature is why such models are blocked from certain fields, particularly financial fields such as insurance and banking, where security requirements are high and stability and controllability are required.
In the prior art, methods for explaining the service processing result of a service processing model generally have high computational complexity and low efficiency.
An improved scheme is therefore desired that reduces computational complexity and improves efficiency when explaining the service processing results of a service processing model.
Disclosure of Invention
One or more embodiments of the present disclosure describe a method and apparatus for interpreting a service processing result of a service processing model, which can reduce computational complexity and improve efficiency.
In a first aspect, a method for interpreting a service processing result of a service processing model is provided, the method comprising:
Inputting a sample to be interpreted into a pre-trained generation model based on a variational autoencoder (VAE) to obtain a first number of disturbance samples, wherein the sample to be interpreted and the disturbance samples each comprise a plurality of feature dimensions;
Respectively inputting the sample to be interpreted and the first number of disturbance samples into a business processing model realized by a neural network, and outputting a first business processing result corresponding to the sample to be interpreted and a second business processing result corresponding to each disturbance sample by the business processing model;
Screening a second number of disturbance samples from the first number of disturbance samples by taking the consistency of the second service processing result and the first service processing result as screening conditions;
And counting the differences between the second number of disturbance samples and the sample to be explained in each characteristic dimension, and explaining the first business processing result according to the differences in each characteristic dimension.
In one possible implementation, the sample to be interpreted corresponds to a target user;
And the service processing result output by the service processing model is used for indicating whether to intercept the preset behavior of the target user.
In one possible implementation, the business process model includes a deep neural network (DNN).
In one possible implementation, the generative model is trained by:
inputting a training sample into the generation model, and outputting a training disturbance sample through the generation model;
Inputting the training samples and the training disturbance samples into the business processing model, wherein the business processing model comprises a plurality of hidden layers;
For a target hidden layer among the plurality of hidden layers, obtaining a target hidden vector of the training sample and a disturbance hidden vector of the training disturbance sample at the target hidden layer, and determining the cross entropy between the target hidden vector and the disturbance hidden vector;
Determining a reconstruction error according to the cross entropy;
and training the generated model with the aim of minimizing the reconstruction error.
Further, the target hidden layer is any hidden layer in the plurality of hidden layers;
The determining a reconstruction error according to the cross entropy comprises:
and summing the cross entropies corresponding to all the hidden layers in the plurality of hidden layers respectively, thereby determining the reconstruction error.
In one possible implementation, the generation model includes an encoder, a decoder, and a sampling unit;
The encoder is configured to receive an input sample and output the mean and variance of the Gaussian distribution followed by the hidden vector corresponding to the input sample;
the sampling unit is configured to sample from the Gaussian distribution defined by the mean and variance output by the encoder, obtaining a first hidden vector;
the decoder is configured to decode the first hidden vector to obtain a first disturbance sample.
Further, the encoder comprises a deep neural network (DNN), a multi-layer perceptron (MLP), or a convolutional neural network (CNN).
Further, inputting the sample to be interpreted into the pre-trained VAE-based generation model to obtain a first number of disturbance samples comprises:
inputting the sample to be interpreted as the input sample into the pre-trained encoder, which outputs the mean and variance of the Gaussian distribution followed by the corresponding hidden vector;
the sampling unit sampling a first number of first hidden vectors from the Gaussian distribution defined by that mean and variance;
the decoder decoding the first number of first hidden vectors to obtain a first number of first disturbance samples.
In a possible implementation manner, the counting the differences between the second number of disturbance samples and the sample to be interpreted in each characteristic dimension, and the interpreting the first service processing result according to the differences in each characteristic dimension includes:
Computing, in each feature dimension, the variance of the differences between the second number of disturbance samples and the sample to be explained, and determining from the per-dimension variances the importance of each feature dimension in the basis for the first service processing result.
In a second aspect, there is provided an apparatus for interpreting a service processing result of a service processing model, the apparatus comprising:
a generating unit configured to input a sample to be explained into a pre-trained generation model based on a variational autoencoder (VAE) to obtain a first number of disturbance samples, where the sample to be explained and the disturbance samples each contain a plurality of feature dimensions;
The business processing unit is used for respectively inputting the samples to be interpreted and the first number of disturbance samples obtained by the generating unit into a business processing model realized by a neural network, and outputting a first business processing result corresponding to the samples to be interpreted and a second business processing result corresponding to each disturbance sample by the business processing model;
a screening unit configured to screen a second number of disturbance samples from the first number of disturbance samples obtained by the generating unit, using consistency between the second service processing result obtained by the service processing unit and the first service processing result as the screening condition;
And the interpretation unit is used for counting the difference between the second number of disturbance samples obtained by the screening unit and the samples to be interpreted in each characteristic dimension, and interpreting the first service processing result obtained by the service processing unit according to the difference in each characteristic dimension.
In a third aspect, there is provided a computer readable storage medium having stored thereon a computer program which, when executed in a computer, causes the computer to perform the method of the first aspect.
In a fourth aspect, there is provided a computing device comprising a memory having executable code stored therein and a processor which, when executing the executable code, implements the method of the first aspect.
With the method and device provided by the embodiments of this specification, a sample to be explained is first input into a pre-trained generation model based on a variational autoencoder (VAE) to obtain a first number of disturbance samples, where the sample to be explained and the disturbance samples each contain a plurality of feature dimensions. The sample to be explained and the first number of disturbance samples are then input into a service processing model implemented as a neural network, which outputs a first service processing result for the sample to be explained and a second service processing result for each disturbance sample. Next, using consistency between the second and first service processing results as the screening condition, a second number of disturbance samples are screened from the first number of disturbance samples. Finally, the differences between the second number of disturbance samples and the sample to be explained in each feature dimension are computed, and the first service processing result is explained from those per-dimension differences. In other words, the embodiments construct a generation model that produces multiple disturbance samples for the sample to be explained; these disturbance samples are neighboring pseudo-samples of it. The disturbance samples for which the service processing model gives the same result as for the sample to be explained are kept, and the model explanation is derived from the kept samples. The method provides sample-level explanation of an existing service processing model: each time the model outputs a service processing result, it can give the basis for that decision.
The method can reduce the calculation complexity and improve the efficiency.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings required for the description of the embodiments will be briefly described below, it being obvious that the drawings in the following description are only some embodiments of the present invention, and that other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
FIG. 1 is a schematic illustration of an implementation scenario of an embodiment disclosed herein;
FIG. 2 illustrates a method flow diagram for interpreting business process results of a business process model, in accordance with one embodiment;
FIG. 3 illustrates a training process schematic of generating a model according to one embodiment;
FIG. 4 shows an overall process diagram that explains business process results of a business process model, according to one embodiment;
fig. 5 shows a schematic block diagram of an apparatus for interpreting a business process result of a business process model, according to one embodiment.
Detailed Description
The following describes the scheme provided in the present specification with reference to the drawings.
Fig. 1 is a schematic diagram of an implementation scenario of an embodiment disclosed in the present specification. The implementation scenario involves interpreting the business process results of the business process model. Referring to fig. 1, a sample to be interpreted includes a plurality of feature dimensions, and when the sample to be interpreted is input into a service processing model, the service processing model outputs a corresponding service processing result.
It will be appreciated that the embodiments of this specification give a sample-level interpretation: the same feature has different importance for different samples to be interpreted. For example, suppose each input sample of the service processing model contains N feature dimensions, feature 1 through feature N. For one sample to be interpreted, the most important basis of the corresponding service processing result may be feature 1, while for another sample to be interpreted it may be feature 2.
As an example, a typical implementation scenario is a financial one in which the service processing model identifies users with impersonated identities and intercepts preset actions of such users. On some online financial platforms, people may use someone else's account to consume or borrow; this is called identity impersonation. Identity impersonation is very likely accompanied by financial risk, so the corresponding behavior must be intercepted. Given the sensitivity of financial scenarios, however, the interception model must satisfy both high performance requirements and high interpretability requirements.
Fig. 2 shows a flow chart of a method for interpreting the results of a service processing model according to one embodiment; the method may be based on the implementation scenario shown in fig. 1. As shown in fig. 2, the method comprises the following steps. Step 21: input a sample to be interpreted into a pre-trained generation model based on a variational autoencoder (VAE) to obtain a first number of disturbance samples, where the sample to be interpreted and the disturbance samples each contain a plurality of feature dimensions. Step 22: input the sample to be interpreted and the first number of disturbance samples into a service processing model implemented as a neural network, which outputs a first service processing result for the sample to be interpreted and a second service processing result for each disturbance sample. Step 23: screen a second number of disturbance samples from the first number of disturbance samples, using consistency between the second and first service processing results as the screening condition. Step 24: compute the differences between the second number of disturbance samples and the sample to be interpreted in each feature dimension, and interpret the first service processing result from those per-dimension differences. Specific implementations of these steps are described below.
First, in step 21, a sample to be interpreted is input into a pre-trained VAE-based generation model, and a first number of disturbance samples are obtained, where the sample to be interpreted and the disturbance samples each include a plurality of feature dimensions. It will be appreciated that the first number may be predetermined.
In one example, the sample to be interpreted corresponds to a target user;
And the service processing result output by the service processing model is used for indicating whether to intercept the preset behavior of the target user.
It will be appreciated that the plurality of feature dimensions, i.e., the plurality of features, may include user profile features such as gender, age, education, and occupation, as well as historical behavior features such as consumption amount and violation records.
VAE: a typical representative of generative models in machine learning, combining probabilistic graphical models with deep learning.
Generative model: machine learning models are generally divided into discriminative models and generative models; a discriminative model directly models the posterior probability, while a generative model directly models the joint probability of samples and labels.
In one example, the generative model includes an encoder, a decoder, and a sampling unit;
The encoder is configured to receive an input sample and output the mean and variance of the Gaussian distribution followed by the hidden vector corresponding to the input sample;
the sampling unit is configured to sample from the Gaussian distribution defined by the mean and variance output by the encoder, obtaining a first hidden vector;
the decoder is configured to decode the first hidden vector to obtain a first disturbance sample.
Further, the encoder comprises a deep neural network (DNN), a multi-layer perceptron (MLP), or a convolutional neural network (CNN).
Further, step 21 specifically includes:
Inputting the sample to be interpreted as the input sample into the pre-trained encoder, which outputs the mean and variance of the Gaussian distribution followed by the corresponding hidden vector;
the sampling unit sampling a first number of first hidden vectors from the Gaussian distribution defined by that mean and variance;
the decoder decoding the first number of first hidden vectors to obtain a first number of first disturbance samples.
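As an illustrative sketch only (the patent specifies no architecture), the generation step above can be written with toy, randomly initialized linear encoder/decoder weights; all sizes, weight names, and the linear maps themselves are assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy linear "encoder": maps a d-dim sample to the mean and log-variance
# of the Gaussian distribution followed by the hidden vector (assumed form).
d, h = 4, 2                                   # feature dim, hidden dim (assumed)
W_mu = rng.normal(size=(h, d))
W_logvar = rng.normal(size=(h, d)) * 0.1
W_dec = rng.normal(size=(d, h))               # toy linear "decoder"

def generate_perturbations(x, first_number):
    """Sample `first_number` hidden vectors from N(mu, sigma^2) via the
    reparameterization trick and decode each into a disturbance sample."""
    mu, logvar = W_mu @ x, W_logvar @ x
    sigma = np.exp(0.5 * logvar)
    eps = rng.normal(size=(first_number, h))
    z = mu + sigma * eps                      # first hidden vectors, shape (n, h)
    return z @ W_dec.T                        # disturbance samples, shape (n, d)

x = np.array([0.5, -1.0, 0.3, 2.0])           # made-up sample to be interpreted
samples = generate_perturbations(x, first_number=8)
print(samples.shape)  # (8, 4)
```

Because the hidden vectors are drawn around the mean learned for x, the decoded disturbance samples stay in the neighborhood of the sample to be interpreted, which is what the screening step relies on.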
Then, in step 22, the sample to be interpreted and the first number of disturbance samples are respectively input into the service processing model implemented as a neural network, which outputs a first service processing result for the sample to be interpreted and a second service processing result for each disturbance sample. It will be appreciated that for some disturbance samples the second service processing result is the same as the first, while for others it differs.
In one example, the service processing model comprises a deep neural network (DNN). A DNN can meet high performance requirements, and service constraints can be flexibly added to the network, but a DNN lacks interpretability. The embodiments of this specification can provide sample-level interpretation for a DNN.
Next, in step 23, a second number of disturbance samples are screened from the first number of disturbance samples, using consistency between the second service processing result and the first service processing result as the screening condition. It can be appreciated that the screened second number of disturbance samples serve as the basis for explaining the first service processing result.
Finally, in step 24, the differences between the second number of disturbance samples and the sample to be interpreted in each feature dimension are computed, and the first service processing result is interpreted from those per-dimension differences. It will be appreciated that the larger the difference in a feature dimension, the less important that feature dimension is for obtaining the first service processing result, and the smaller the difference, the more important it is.
In one example, the variance of the differences between the second number of disturbance samples and the sample to be interpreted is computed in each feature dimension, and the importance of each feature dimension in the basis for the first service processing result is determined from those per-dimension variances. In this example the variance indicates the difference in a feature dimension; other indicators of per-dimension difference may also be used.
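The variance-based importance step can be sketched as follows; the function name and the toy numbers are assumptions for illustration, not from the patent:

```python
import numpy as np

def feature_importance(x, kept_perturbations):
    """Per-dimension variance of the deltas between the screened disturbance
    samples and the sample to be interpreted. A dimension that tolerated only
    small changes (small variance) is more important to the result."""
    deltas = kept_perturbations - x           # shape (m, d)
    var = deltas.var(axis=0)                  # variance per feature dimension
    order = np.argsort(var)                   # most important (smallest) first
    return var, order

x = np.zeros(3)                               # made-up sample to be interpreted
kept = np.array([[0.01,  1.0, 0.30],          # made-up screened perturbations
                 [-0.02, -1.2, 0.25],
                 [0.015,  0.9, 0.35]])
var, order = feature_importance(x, kept)
print(list(order))  # [0, 2, 1]: dimension 0 changed least, so it ranks first
```

Dimension 1 swung between roughly -1.2 and 1.0 without flipping the (hypothetical) result, so it contributes least to the decision; dimension 0 barely moved, marking it as the key basis.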
In one example, the generative model is trained by:
inputting a training sample into the generation model, and outputting a training disturbance sample through the generation model;
Inputting the training samples and the training disturbance samples into the business processing model, wherein the business processing model comprises a plurality of hidden layers;
For a target hidden layer among the plurality of hidden layers, obtaining a target hidden vector of the training sample and a disturbance hidden vector of the training disturbance sample at the target hidden layer, and determining the cross entropy between the target hidden vector and the disturbance hidden vector;
Determining a reconstruction error according to the cross entropy;
and training the generated model with the aim of minimizing the reconstruction error.
Further, the target hidden layer is any hidden layer in the plurality of hidden layers;
The determining a reconstruction error according to the cross entropy comprises:
and summing the cross entropies corresponding to all the hidden layers in the plurality of hidden layers respectively, thereby determining the reconstruction error.
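A minimal sketch of this reconstruction error follows. The patent does not say how a cross entropy between raw hidden vectors is made well defined; softmax-normalizing the activations first is an assumption of this sketch, as are all the toy activation values:

```python
import numpy as np

def softmax(v):
    e = np.exp(v - v.max())                   # stabilized softmax
    return e / e.sum()

def reconstruction_error(hidden_x, hidden_x_prime):
    """Sum, over all hidden layers, of the cross entropy between the target
    hidden vector (from training sample x) and the disturbance hidden vector
    (from x'). Vectors are softmax-normalized here so H(p, q) is defined."""
    total = 0.0
    for hx, hp in zip(hidden_x, hidden_x_prime):
        p, q = softmax(hx), softmax(hp)
        total += -(p * np.log(q)).sum()       # cross entropy H(p, q)
    return total

# Two hidden layers of made-up activations for x and x'.
hx  = [np.array([1.0, 2.0, 0.5]), np.array([0.20, -0.10])]
hxp = [np.array([1.1, 1.9, 0.4]), np.array([0.25, -0.20])]
err = reconstruction_error(hx, hxp)
identical = reconstruction_error(hx, hx)
print(err > identical)  # True: H(p, q) is minimized when q equals p
```

Minimizing this sum pushes the generated disturbance sample x' to trigger the same hidden-layer responses as x in the service processing model, which is what makes the perturbations behave as neighbors of the sample.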
FIG. 3 illustrates the training process of the generation model according to one embodiment. Referring to fig. 3, the generation model includes an encoder, a decoder, and a sampling unit. The training sample x passes through the encoder, which learns the mean mu and variance sigma of the Gaussian distribution followed by the hidden vector; the sampling unit then draws a hidden vector from that Gaussian distribution, and the decoder turns it into a disturbance sample x'. Both x' and the training sample x are input into the service processing model, and the cross entropy between the hidden-layer outputs of the service processing model for x and x' is used as the reconstruction error to train the generation model.
Fig. 4 shows an overall process diagram for interpreting the service processing results of the service processing model according to one embodiment. Referring to fig. 4, the generation model produces disturbance samples, and an interpretation is derived from them. For the sample to be interpreted x, n disturbance samples are obtained through the pre-trained generation model. These disturbance samples are input into the service processing model, and the disturbance samples whose service processing result is consistent with that of the sample to be interpreted are selected, yielding m valid disturbance samples after screening. The deltas are obtained by computing the difference between each of these m disturbance samples and the sample to be interpreted. Intuitively, a disturbance sample is equivalent to adding a disturbance delta to the original features x without changing the service processing result of the model; therefore a feature that can change by a large amplitude is relatively unimportant, while a feature that changes only by a small amplitude is more important. The final interpretation counts the variance of each dimension of these deltas as the basis of the interpretation.
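The end-to-end flow of fig. 4 can be sketched in a few lines. The linear "business model", the noise scales, and all numbers below are illustrative assumptions standing in for the trained DNN and VAE:

```python
import numpy as np

rng = np.random.default_rng(1)

def business_model(batch):
    """Toy stand-in for the DNN service processing model: a fixed linear
    classifier (purely illustrative, not the patent's model)."""
    w = np.array([3.0, 0.1, 0.1])
    return (batch @ w > 0).astype(int)

x = np.array([0.4, -0.2, 0.5])                 # sample to be interpreted
n = 200                                        # first number of perturbations
# Stand-in for the VAE: Gaussian neighbors of x (dimension 0 varied least).
perturbations = x + rng.normal(scale=[0.05, 0.8, 0.8], size=(n, 3))

# Screen: keep perturbations whose result matches the result for x.
first_result = business_model(x[None, :])[0]
kept = perturbations[business_model(perturbations) == first_result]

# Interpret: variance of the deltas per feature dimension.
var = (kept - x).var(axis=0)
most_important = int(np.argmin(var))
print(most_important)  # 0: the dimension that tolerated the smallest change
```

Because dimension 0 carries almost all the classifier's weight, only small changes to it survive the consistency screen, so its delta variance is smallest and it is reported as the main basis of the decision.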
According to the method provided by the embodiments of this specification, a generation model is constructed to generate a plurality of disturbance samples for the sample to be interpreted; these disturbance samples are neighborhood pseudo samples of the sample to be interpreted. The disturbance samples for which the business processing model gives the same business processing result as for the sample to be interpreted are screened out, and the model interpretation is derived from the screened disturbance samples. The method provides sample-level interpretation of an existing business processing model: each time the model outputs a business processing result, it can also give the decision basis for that result. The method reduces computational complexity and improves efficiency, and the generated disturbance samples better conform to the distribution of the sample to be interpreted.
According to an embodiment of another aspect, there is further provided an apparatus for interpreting a service processing result of a service processing model, the apparatus being configured to perform the method for interpreting a service processing result of a service processing model provided in the embodiments of the present specification. FIG. 5 shows a schematic block diagram of an apparatus for interpreting a business processing result of a business processing model, according to one embodiment. As shown in FIG. 5, the apparatus 500 includes:
A generating unit 51, configured to input samples to be interpreted into a pre-trained generation model based on a variational automatic encoder VAE, to obtain a first number of disturbance samples, where the samples to be interpreted and the disturbance samples each include a plurality of feature dimensions;
A service processing unit 52, configured to input the samples to be interpreted and the first number of disturbance samples obtained by the generating unit 51 into a service processing model implemented by a neural network, and output, through the service processing model, a first service processing result corresponding to the samples to be interpreted, and a second service processing result corresponding to each disturbance sample;
A screening unit 53, configured to screen a second number of disturbance samples from the first number of disturbance samples obtained by the generating unit 51, taking consistency between the second service processing result obtained by the service processing unit 52 and the first service processing result as the screening condition;
and an interpretation unit 54, configured to count differences between the second number of disturbance samples obtained by the screening unit 53 and the samples to be interpreted in each feature dimension, and interpret the first service processing result obtained by the service processing unit according to the differences in each feature dimension.
Optionally, as an embodiment, the sample to be interpreted corresponds to a target user;
And the service processing result output by the service processing model is used for indicating whether to intercept the preset behavior of the target user.
Optionally, as an embodiment, the service processing model includes a deep neural network DNN.
Alternatively, as an embodiment, the generative model is trained by:
inputting a training sample into the generation model, and outputting a training disturbance sample through the generation model;
Inputting the training samples and the training disturbance samples into the business processing model, wherein the business processing model comprises a plurality of hidden layers;
For a target hidden layer among the plurality of hidden layers, acquiring a target hidden vector of the training sample and a disturbance hidden vector of the training disturbance sample at the target hidden layer, respectively, and determining the cross entropy between the target hidden vector and the disturbance hidden vector;
Determining a reconstruction error according to the cross entropy;
and training the generation model with the aim of minimizing the reconstruction error.
Further, the target hidden layer is any hidden layer in the plurality of hidden layers;
The determining a reconstruction error according to the cross entropy comprises:
and summing the cross entropies respectively corresponding to each of the plurality of hidden layers to determine the reconstruction error.
Optionally, as an embodiment, the generating model includes an encoder, a decoder, and a sampling unit;
The encoder is configured to receive an input sample and output the mean and variance of the Gaussian distribution obeyed by the hidden vector corresponding to the input sample;
The sampling unit is configured to sample, from the Gaussian distribution corresponding to the mean and variance output by the encoder, a first hidden vector;
the decoder is configured to decode the first hidden vector to obtain a first disturbance sample.
Further, the encoder includes: deep neural network DNN, multi-layer perceptron MLP, or convolutional neural network CNN.
Further, the generating unit 51 is specifically configured to:
Inputting the sample to be interpreted as an input sample into the pre-trained encoder, and outputting, through the encoder, the mean and variance of the Gaussian distribution obeyed by the hidden vector corresponding to the input sample;
The sampling unit samples, from the Gaussian distribution corresponding to the mean and variance output by the encoder, a first number of first hidden vectors;
and the decoder decodes the first number of first hidden vectors to obtain a first number of first disturbance samples.
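Taken together, the generating unit's three steps (encode, sample a first number of first hidden vectors, decode) can be sketched as below. The `encode` and `decode` callables are hypothetical stand-ins for the two halves of the trained generation model, and the reparameterization-style draw mu + sigma * eps is one standard way to realize the sampling unit.

```python
import numpy as np

def generate_perturbations(x, encode, decode, n, rng=None):
    """Generate n disturbance samples for the sample to be interpreted x
    (a sketch; `encode` returns the Gaussian mean and standard deviation,
    `decode` maps a hidden vector back to sample space)."""
    rng = np.random.default_rng(rng)
    mu, sigma = encode(x)                       # Gaussian parameters of the hidden vector
    eps = rng.standard_normal((n, mu.shape[0]))
    z = mu + sigma * eps                        # a first number of first hidden vectors
    return np.stack([decode(zi) for zi in z])   # a first number of first disturbance samples
```

Each row of the returned array is one disturbance sample, ready to be fed to the business processing model for screening.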
Optionally, as an embodiment, the interpretation unit 54 is specifically configured to count the variances, in each feature dimension, of the differences between the second number of disturbance samples and the sample to be interpreted, and to determine the importance of each feature dimension in the basis for the first service processing result according to the variances in each feature dimension.
According to an embodiment of another aspect, there is also provided a computer-readable storage medium having stored thereon a computer program which, when executed in a computer, causes the computer to perform the method described in connection with fig. 2.
According to an embodiment of yet another aspect, there is also provided a computing device including a memory having executable code stored therein and a processor that, when executing the executable code, implements the method described in connection with fig. 2.
Those skilled in the art will appreciate that in one or more of the examples described above, the functions described in the present invention may be implemented in hardware, software, firmware, or any combination thereof. When implemented in software, these functions may be stored on or transmitted over as one or more instructions or code on a computer-readable medium.
The foregoing specific embodiments further describe the objectives, technical solutions, and advantages of the present invention in detail. It should be understood that the above are merely specific embodiments of the present invention and are not intended to limit its protection scope; any modification, equivalent replacement, or improvement made on the basis of the technical solutions of the present invention shall fall within the protection scope of the present invention.
Claims (18)
1. A method of interpreting a business process result of a business process model, the method comprising:
Inputting a sample to be interpreted into a pre-trained generation model to obtain a first number of disturbance samples, wherein the sample to be interpreted and the disturbance samples both comprise a plurality of characteristic dimensions;
Respectively inputting the sample to be interpreted and the first number of disturbance samples into a business processing model realized by a neural network, and outputting a first business processing result corresponding to the sample to be interpreted and a second business processing result corresponding to each disturbance sample by the business processing model;
Screening a second number of disturbance samples from the first number of disturbance samples by taking the consistency of the second service processing result and the first service processing result as screening conditions;
counting the differences of the second number of disturbance samples and the samples to be explained in each characteristic dimension, and explaining the first business processing result according to the differences in each characteristic dimension;
Wherein the sample to be interpreted corresponds to a target user;
And the service processing result output by the service processing model is used for indicating whether to intercept the preset behavior of the target user.
2. The method of claim 1, wherein the business process model comprises a deep neural network DNN.
3. The method of claim 1, wherein the generative model is trained by:
inputting a training sample into the generation model, and outputting a training disturbance sample through the generation model;
Inputting the training samples and the training disturbance samples into the business processing model, wherein the business processing model comprises a plurality of hidden layers;
For a target hidden layer among the plurality of hidden layers, acquiring a target hidden vector of the training sample and a disturbance hidden vector of the training disturbance sample at the target hidden layer, respectively, and determining the cross entropy between the target hidden vector and the disturbance hidden vector;
Determining a reconstruction error according to the cross entropy;
and training the generation model with the aim of minimizing the reconstruction error.
4. A method as claimed in claim 3, wherein the target hidden layer is any one of the plurality of hidden layers;
The determining a reconstruction error according to the cross entropy comprises:
and summing the cross entropies respectively corresponding to each of the plurality of hidden layers to determine the reconstruction error.
5. The method of claim 1, wherein the generation model comprises an encoder, a decoder, and a sampling unit;
The encoder is used for receiving input samples, and outputting the mean value and the variance of Gaussian distribution obeyed by hidden vectors corresponding to the input samples through the encoder;
The sampling unit is used for sampling from each hidden vector of Gaussian distribution corresponding to the mean value and the variance output by the encoder to obtain a first hidden vector;
the decoder is configured to decode the first hidden vector to obtain a first disturbance sample.
6. The method of claim 5, wherein the encoder comprises: deep neural network DNN, multi-layer perceptron MLP, or convolutional neural network CNN.
7. The method of claim 5, wherein said inputting the sample to be interpreted into a pre-trained generative model results in a first number of perturbation samples, comprising:
Inputting a sample to be interpreted as an input sample into the pre-trained encoder, and outputting the mean value and the variance of Gaussian distribution obeyed by the hidden vector corresponding to the input sample through the encoder;
The sampling unit samples the first hidden vectors of the Gaussian distribution corresponding to the mean value and the variance output by the encoder to obtain a first number of first hidden vectors;
the decoder decodes the first number of first hidden vectors to obtain a first number of first disturbance samples.
8. The method of claim 1, wherein the counting the differences in each characteristic dimension between the second number of perturbation samples and the sample to be interpreted, and interpreting the first business process result according to the differences in each characteristic dimension, comprises:
and counting the variances, in each characteristic dimension, of the differences between the second number of disturbance samples and the sample to be interpreted, and determining the importance of each characteristic dimension in the basis for the first business processing result according to the variances in each characteristic dimension.
9. An apparatus for interpreting a business process result of a business process model, the apparatus comprising:
The generation unit is used for inputting the sample to be interpreted into a pre-trained generation model to obtain a first number of disturbance samples, wherein the sample to be interpreted and the disturbance samples both comprise a plurality of characteristic dimensions;
The business processing unit is used for respectively inputting the samples to be interpreted and the first number of disturbance samples obtained by the generating unit into a business processing model realized by a neural network, and outputting a first business processing result corresponding to the samples to be interpreted and a second business processing result corresponding to each disturbance sample by the business processing model;
The screening unit is configured to screen a second number of disturbance samples from the first number of disturbance samples obtained by the generating unit, taking consistency between the second service processing result obtained by the service processing unit and the first service processing result as the screening condition;
The interpretation unit is used for counting the difference between the second number of disturbance samples obtained by the screening unit and the sample to be interpreted in each characteristic dimension, and interpreting the first service processing result obtained by the service processing unit according to the difference in each characteristic dimension;
Wherein the sample to be interpreted corresponds to a target user;
And the service processing result output by the service processing model is used for indicating whether to intercept the preset behavior of the target user.
10. The apparatus of claim 9, wherein the traffic processing model comprises a deep neural network DNN.
11. The apparatus of claim 9, wherein the generative model is trained by:
inputting a training sample into the generation model, and outputting a training disturbance sample through the generation model;
Inputting the training samples and the training disturbance samples into the business processing model, wherein the business processing model comprises a plurality of hidden layers;
For a target hidden layer among the plurality of hidden layers, acquiring a target hidden vector of the training sample and a disturbance hidden vector of the training disturbance sample at the target hidden layer, respectively, and determining the cross entropy between the target hidden vector and the disturbance hidden vector;
Determining a reconstruction error according to the cross entropy;
and training the generation model with the aim of minimizing the reconstruction error.
12. The apparatus of claim 11, wherein the target hidden layer is any one of the plurality of hidden layers;
The determining a reconstruction error according to the cross entropy comprises:
and summing the cross entropies respectively corresponding to each of the plurality of hidden layers to determine the reconstruction error.
13. The apparatus of claim 9, wherein the generation model comprises an encoder, a decoder, and a sampling unit;
The encoder is used for receiving input samples, and outputting the mean value and the variance of Gaussian distribution obeyed by hidden vectors corresponding to the input samples through the encoder;
The sampling unit is used for sampling from each hidden vector of Gaussian distribution corresponding to the mean value and the variance output by the encoder to obtain a first hidden vector;
the decoder is configured to decode the first hidden vector to obtain a first disturbance sample.
14. The apparatus of claim 13, wherein the encoder comprises: deep neural network DNN, multi-layer perceptron MLP, or convolutional neural network CNN.
15. The apparatus of claim 13, wherein the generating unit is specifically configured to:
Inputting a sample to be interpreted as an input sample into the pre-trained encoder, and outputting the mean value and the variance of Gaussian distribution obeyed by the hidden vector corresponding to the input sample through the encoder;
The sampling unit samples the first hidden vectors of the Gaussian distribution corresponding to the mean value and the variance output by the encoder to obtain a first number of first hidden vectors;
the decoder decodes the first number of first hidden vectors to obtain a first number of first disturbance samples.
16. The apparatus of claim 9, wherein the interpretation unit is specifically configured to count the variances, in each feature dimension, of the differences between the second number of disturbance samples and the sample to be interpreted, and to determine the importance of each feature dimension in the basis for the first service processing result according to the variances in each feature dimension.
17. A computer readable storage medium having stored thereon a computer program which, when executed in a computer, causes the computer to perform the method of any of claims 1-8.
18. A computing device comprising a memory having executable code stored therein and a processor, which when executing the executable code, implements the method of any of claims 1-8.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210181701.4A CN114548300B (en) | 2019-12-20 | 2019-12-20 | Method and device for explaining service processing result of service processing model |
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210181701.4A CN114548300B (en) | 2019-12-20 | 2019-12-20 | Method and device for explaining service processing result of service processing model |
CN201911326360.XA CN111062442B (en) | 2019-12-20 | 2019-12-20 | Method and device for explaining service processing result of service processing model |
Related Parent Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201911326360.XA Division CN111062442B (en) | 2019-12-20 | 2019-12-20 | Method and device for explaining service processing result of service processing model |
Publications (2)
Publication Number | Publication Date |
---|---|
CN114548300A CN114548300A (en) | 2022-05-27 |
CN114548300B true CN114548300B (en) | 2024-05-28 |
Family
ID=70301299
Family Applications (2)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201911326360.XA Active CN111062442B (en) | 2019-12-20 | 2019-12-20 | Method and device for explaining service processing result of service processing model |
CN202210181701.4A Active CN114548300B (en) | 2019-12-20 | 2019-12-20 | Method and device for explaining service processing result of service processing model |
Family Applications Before (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201911326360.XA Active CN111062442B (en) | 2019-12-20 | 2019-12-20 | Method and device for explaining service processing result of service processing model |
Country Status (1)
Country | Link |
---|---|
CN (2) | CN111062442B (en) |
Families Citing this family (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112052957B (en) * | 2020-09-02 | 2023-08-04 | 平安科技(深圳)有限公司 | Method and device for acquiring interpretability parameters of deep learning model |
CN113377640B (en) * | 2021-06-23 | 2022-07-08 | 杭州网易云音乐科技有限公司 | Method, medium, device and computing equipment for explaining model under business scene |
CN115618748B (en) * | 2022-11-29 | 2023-05-02 | 支付宝(杭州)信息技术有限公司 | Model optimization method, device, equipment and storage medium |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2003081527A1 (en) * | 2002-03-26 | 2003-10-02 | Council Of Scientific And Industrial Research | Improved performance of artificial neural network models in the presence of instrumental noise and measurement errors |
CN107895160A (en) * | 2017-12-21 | 2018-04-10 | 曙光信息产业(北京)有限公司 | Human face detection and tracing device and method |
CN109903053A (en) * | 2019-03-01 | 2019-06-18 | 成都新希望金融信息有限公司 | A kind of anti-fraud method carrying out Activity recognition based on sensing data |
CN110175646A (en) * | 2019-05-27 | 2019-08-27 | 浙江工业大学 | Multichannel confrontation sample testing method and device based on image transformation |
CN110334806A (en) * | 2019-05-29 | 2019-10-15 | 广东技术师范大学 | A method of adversarial sample generation based on generative adversarial network |
Family Cites Families (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US10447733B2 (en) * | 2014-06-11 | 2019-10-15 | Accenture Global Services Limited | Deception network system |
US10387655B2 (en) * | 2017-02-15 | 2019-08-20 | International Business Machines Corporation | Method, system and product for using a predictive model to predict if inputs reach a vulnerability of a program |
WO2019051113A1 (en) * | 2017-09-06 | 2019-03-14 | BigML, Inc. | Prediction characterization for black box machine learning models |
CN108090507A (en) * | 2017-10-19 | 2018-05-29 | 电子科技大学 | A kind of medical imaging textural characteristics processing method based on integrated approach |
US11386342B2 (en) * | 2018-04-20 | 2022-07-12 | H2O.Ai Inc. | Model interpretation |
CN111542841A (en) * | 2018-06-08 | 2020-08-14 | 北京嘀嘀无限科技发展有限公司 | System and method for content identification |
CN108960434B (en) * | 2018-06-28 | 2021-07-20 | 第四范式(北京)技术有限公司 | Method and device for analyzing data based on machine learning model interpretation |
CN110033094A (en) * | 2019-02-22 | 2019-07-19 | 阿里巴巴集团控股有限公司 | A kind of model training method and device based on disturbance sample |
CN110110139B (en) * | 2019-04-19 | 2021-06-22 | 北京奇艺世纪科技有限公司 | Method and device for explaining recommendation result and electronic equipment |
Non-Patent Citations (1)
Title |
---|
User adversarial identification based on ensemble semi-supervised learning in the Internet context; Zhang Weijian; China Master's Theses Full-text Database (Social Sciences II); 2019-08-15; H123-266 *
Also Published As
Publication number | Publication date |
---|---|
CN111062442A (en) | 2020-04-24 |
CN114548300A (en) | 2022-05-27 |
CN111062442B (en) | 2022-04-12 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
WO2018212710A1 (en) | Predictive analysis methods and systems | |
WO2019190886A1 (en) | Digital watermarking of machine learning models | |
Lopes et al. | Effective network intrusion detection via representation learning: A Denoising AutoEncoder approach | |
CN114548300B (en) | Method and device for explaining service processing result of service processing model | |
EP3916597B1 (en) | Detecting malware with deep generative models | |
US10929756B1 (en) | Systems and methods for configuring and implementing an interpretive surrogate machine learning model | |
Demidov et al. | Application model of modern artificial neural network methods for the analysis of information systems security | |
Wei et al. | Toward identifying APT malware through API system calls | |
US11886597B2 (en) | Detection of common patterns in user generated content with applications in fraud detection | |
Zhao et al. | Natural backdoor attacks on deep neural networks via raindrops | |
Ravi et al. | Hybrid classification and regression models via particle swarm optimization auto associative neural network based nonlinear PCA | |
Salcedo-Sanz et al. | An island grouping genetic algorithm for fuzzy partitioning problems | |
US20220198255A1 (en) | Training a semantic parser using action templates | |
US11997137B2 (en) | Webpage phishing detection using deep reinforcement learning | |
Qayoom et al. | A novel approach for credit card fraud transaction detection using deep reinforcement learning scheme | |
CN117522403A (en) | GCN abnormal customer early warning method and device based on subgraph fusion | |
Lijun et al. | An intuitionistic calculus to complex abnormal event recognition on data streams | |
CN112950222A (en) | Resource processing abnormity detection method and device, electronic equipment and storage medium | |
CN114618167A (en) | Anti-cheating detection model construction method and anti-cheating detection method | |
Gunes et al. | Detecting Direction of Pepper Stem by Using CUDA‐Based Accelerated Hybrid Intuitionistic Fuzzy Edge Detection and ANN | |
Wu et al. | English text recognition deep learning framework to automatically identify fake news | |
Demir et al. | Subnetwork ensembling and data augmentation: Effects on calibration | |
US12061622B1 (en) | Apparatus and method for communications associated with one or more data sets | |
US12223531B2 (en) | Method and an apparatus for a personalized user interface | |
Camino | Machine Learning Techniques for Suspicious Transaction Detection and Analysis |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||