CN114548300B - Method and device for explaining service processing result of service processing model - Google Patents
- Publication number: CN114548300B
- Application number: CN202210181701.4A
- Authority
- CN
- China
- Prior art keywords: samples, sample, disturbance, hidden, model
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/214—Generating training patterns; Bootstrap methods, e.g. bagging or boosting
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
- G06F18/241—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N20/00—Machine learning
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q20/00—Payment architectures, schemes or protocols
- G06Q20/38—Payment protocols; Details thereof
- G06Q20/40—Authorisation, e.g. identification of payer or payee, verification of customer or shop credentials; Review and approval of payers, e.g. check credit lines or negative lists
- G06Q20/401—Transaction verification
- G06Q20/4014—Identity check for transactions
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q20/00—Payment architectures, schemes or protocols
- G06Q20/38—Payment protocols; Details thereof
- G06Q20/40—Authorisation, e.g. identification of payer or payee, verification of customer or shop credentials; Review and approval of payers, e.g. check credit lines or negative lists
- G06Q20/401—Transaction verification
- G06Q20/4016—Transaction verification involving fraud or risk level assessment in transaction processing
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q40/00—Finance; Insurance; Tax strategies; Processing of corporate or income taxes
- G06Q40/03—Credit; Loans; Processing thereof
Abstract
Embodiments of this specification provide a method and device for explaining the service processing result of a service processing model. The method comprises: inputting a sample to be explained into a pre-trained generation model to obtain a first number of disturbance samples, where the sample to be explained and the disturbance samples each contain a plurality of feature dimensions; inputting the sample to be explained and the first number of disturbance samples into a service processing model implemented as a neural network, which outputs a first service processing result for the sample to be explained and a second service processing result for each disturbance sample; screening a second number of disturbance samples from the first number of disturbance samples, using consistency between the second and first service processing results as the screening condition; and computing the difference between the second number of disturbance samples and the sample to be explained in each feature dimension, then explaining the first service processing result from those per-dimension differences. The method reduces computational complexity and improves efficiency.
Description
This application is a divisional of the application filed December 20, 2019, with application number 201911326360.X, entitled "Method and device for explaining the service processing result of a service processing model".
Technical Field
One or more embodiments of the present specification relate to the field of computers, and more particularly, to a method and apparatus for interpreting business process results of a business process model.
Background
Machine learning is now widely used in retail, technology, healthcare, science, and other fields. Whether a classification model or a regression model produces a result or decision, the decision process itself is invisible and unintelligible to people. The decision process of a service processing model implemented as a neural network differs sharply from that of rules, which people accept and understand more easily: a rule-based decision follows an understandable, traceable decision path, whereas the model's decision is a black-box process that exposes only inputs and outputs, so the process is opaque to the user and, even when a decision is wrong, it cannot be traced. This untraceable, uncontrollable black-box nature is why such models are blocked from certain fields, particularly financial fields such as insurance and banking, where security requirements are high and stability and controllability are required.
In the prior art, methods for explaining the service processing result of a service processing model generally have high computational complexity and low efficiency.
An improved scheme is therefore desired that reduces computational complexity and improves efficiency when explaining the service processing results of a service processing model.
Disclosure of Invention
One or more embodiments of the present disclosure describe a method and apparatus for interpreting a service processing result of a service processing model, which can reduce computational complexity and improve efficiency.
In a first aspect, a method for interpreting a service processing result of a service processing model is provided, the method comprising:
Inputting a sample to be interpreted into a pre-trained generation model based on a variational autoencoder (VAE) to obtain a first number of disturbance samples, wherein the sample to be interpreted and the disturbance samples each comprise a plurality of feature dimensions;
Respectively inputting the sample to be interpreted and the first number of disturbance samples into a business processing model realized by a neural network, and outputting a first business processing result corresponding to the sample to be interpreted and a second business processing result corresponding to each disturbance sample by the business processing model;
Screening a second number of disturbance samples from the first number of disturbance samples by taking the consistency of the second service processing result and the first service processing result as screening conditions;
And counting the differences between the second number of disturbance samples and the sample to be explained in each characteristic dimension, and explaining the first business processing result according to the differences in each characteristic dimension.
In one possible implementation, the sample to be interpreted corresponds to a target user;
And the service processing result output by the service processing model is used for indicating whether to intercept the preset behavior of the target user.
In one possible implementation, the business process model includes a deep neural network (DNN).
In one possible implementation, the generative model is trained by:
inputting a training sample into the generation model, and outputting a training disturbance sample through the generation model;
Inputting the training samples and the training disturbance samples into the business processing model, wherein the business processing model comprises a plurality of hidden layers;
For a target hidden layer among the plurality of hidden layers, obtaining a target hidden vector of the training sample and a disturbance hidden vector of the training disturbance sample at the target hidden layer, and determining the cross entropy between the target hidden vector and the disturbance hidden vector;
Determining a reconstruction error according to the cross entropy;
and training the generated model with the aim of minimizing the reconstruction error.
Further, the target hidden layer is any hidden layer in the plurality of hidden layers;
The determining a reconstruction error according to the cross entropy comprises:
and summing the cross entropies corresponding to all the hidden layers in the plurality of hidden layers respectively, thereby determining the reconstruction error.
In one possible implementation, the generation model includes an encoder, a decoder, and a sampling unit;
The encoder is configured to receive an input sample and output the mean and variance of the Gaussian distribution followed by the hidden vector corresponding to the input sample;
the sampling unit is configured to sample from the Gaussian distribution defined by the mean and variance output by the encoder, obtaining a first hidden vector;
the decoder is configured to decode the first hidden vector to obtain a first disturbance sample.
Further, the encoder comprises a deep neural network (DNN), a multi-layer perceptron (MLP), or a convolutional neural network (CNN).
Further, inputting the sample to be interpreted into the pre-trained VAE-based generation model to obtain a first number of disturbance samples comprises:
inputting the sample to be interpreted as the input sample into the pre-trained encoder, which outputs the mean and variance of the Gaussian distribution followed by the corresponding hidden vector;
the sampling unit sampling a first number of first hidden vectors from the Gaussian distribution defined by that mean and variance;
the decoder decoding the first number of first hidden vectors to obtain a first number of first disturbance samples.
In a possible implementation manner, the counting the differences between the second number of disturbance samples and the sample to be interpreted in each characteristic dimension, and the interpreting the first service processing result according to the differences in each characteristic dimension includes:
Computing, in each feature dimension, the variance of the differences between the second number of disturbance samples and the sample to be explained, and determining from the per-dimension variances the importance of each feature dimension in the basis for the first service processing result.
In a second aspect, there is provided an apparatus for interpreting a service processing result of a service processing model, the apparatus comprising:
a generating unit configured to input a sample to be explained into a pre-trained generation model based on a variational autoencoder (VAE) to obtain a first number of disturbance samples, where the sample to be explained and the disturbance samples each contain a plurality of feature dimensions;
The business processing unit is used for respectively inputting the samples to be interpreted and the first number of disturbance samples obtained by the generating unit into a business processing model realized by a neural network, and outputting a first business processing result corresponding to the samples to be interpreted and a second business processing result corresponding to each disturbance sample by the business processing model;
a screening unit configured to screen a second number of disturbance samples from the first number of disturbance samples obtained by the generating unit, using consistency between the second service processing result obtained by the service processing unit and the first service processing result as the screening condition;
And the interpretation unit is used for counting the difference between the second number of disturbance samples obtained by the screening unit and the samples to be interpreted in each characteristic dimension, and interpreting the first service processing result obtained by the service processing unit according to the difference in each characteristic dimension.
In a third aspect, there is provided a computer readable storage medium having stored thereon a computer program which, when executed in a computer, causes the computer to perform the method of the first aspect.
In a fourth aspect, there is provided a computing device comprising a memory having executable code stored therein and a processor which, when executing the executable code, implements the method of the first aspect.
With the method and device provided by the embodiments of this specification, a sample to be explained is first input into a pre-trained generation model based on a variational autoencoder (VAE) to obtain a first number of disturbance samples, where the sample to be explained and the disturbance samples each contain a plurality of feature dimensions. The sample to be explained and the first number of disturbance samples are then input into a service processing model implemented as a neural network, which outputs a first service processing result for the sample to be explained and a second service processing result for each disturbance sample. Next, using consistency between the second and first service processing results as the screening condition, a second number of disturbance samples are screened from the first number of disturbance samples. Finally, the differences between the second number of disturbance samples and the sample to be explained in each feature dimension are computed, and the first service processing result is explained from those per-dimension differences. In other words, the embodiments construct a generation model that produces multiple disturbance samples for the sample to be explained; these disturbance samples are neighboring pseudo-samples of it. The disturbance samples for which the service processing model gives the same result as for the sample to be explained are kept, and the model explanation is derived from the kept samples. The method provides sample-level explanation of an existing service processing model: each time the model outputs a service processing result, it can give the basis for that decision.
The method can reduce the calculation complexity and improve the efficiency.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings required for the description of the embodiments will be briefly described below, it being obvious that the drawings in the following description are only some embodiments of the present invention, and that other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
FIG. 1 is a schematic illustration of an implementation scenario of an embodiment disclosed herein;
FIG. 2 illustrates a method flow diagram for interpreting business process results of a business process model, in accordance with one embodiment;
FIG. 3 illustrates a training process schematic of generating a model according to one embodiment;
FIG. 4 shows an overall process diagram that explains business process results of a business process model, according to one embodiment;
fig. 5 shows a schematic block diagram of an apparatus for interpreting a business process result of a business process model, according to one embodiment.
Detailed Description
The following describes the scheme provided in the present specification with reference to the drawings.
Fig. 1 is a schematic diagram of an implementation scenario of an embodiment disclosed in the present specification. The implementation scenario involves interpreting the business process results of the business process model. Referring to fig. 1, a sample to be interpreted includes a plurality of feature dimensions, and when the sample to be interpreted is input into a service processing model, the service processing model outputs a corresponding service processing result.
It will be appreciated that the embodiments of this specification give a sample-level interpretation: the same feature has different importance for different samples to be interpreted. For example, suppose each input sample of the service processing model contains N feature dimensions, feature 1 through feature N. For one sample to be interpreted, the most important basis of the corresponding service processing result may be feature 1, while for another sample to be interpreted it may be feature 2.
As an example, a typical implementation scenario is a financial one in which the service processing model identifies users with impersonated identities and intercepts preset actions of such users. On some online financial platforms, people may use someone else's account to consume or borrow; this is called identity impersonation. Identity impersonation is very likely accompanied by financial risk, so the corresponding behavior must be intercepted. Given the sensitivity of financial scenarios, however, the interception model must satisfy both high performance requirements and high interpretability requirements.
Fig. 2 shows a flow chart of a method for interpreting the results of a service processing model according to one embodiment; the method may be based on the implementation scenario shown in fig. 1. As shown in fig. 2, the method comprises the following steps. Step 21: input a sample to be interpreted into a pre-trained generation model based on a variational autoencoder (VAE) to obtain a first number of disturbance samples, where the sample to be interpreted and the disturbance samples each contain a plurality of feature dimensions. Step 22: input the sample to be interpreted and the first number of disturbance samples into a service processing model implemented as a neural network, which outputs a first service processing result for the sample to be interpreted and a second service processing result for each disturbance sample. Step 23: screen a second number of disturbance samples from the first number of disturbance samples, using consistency between the second and first service processing results as the screening condition. Step 24: compute the differences between the second number of disturbance samples and the sample to be interpreted in each feature dimension, and interpret the first service processing result from those per-dimension differences. Specific implementations of these steps are described below.
First, in step 21, a sample to be interpreted is input into a pre-trained VAE-based generation model, and a first number of disturbance samples are obtained, where the sample to be interpreted and the disturbance samples each include a plurality of feature dimensions. It will be appreciated that the first number may be predetermined.
In one example, the sample to be interpreted corresponds to a target user;
And the service processing result output by the service processing model is used for indicating whether to intercept the preset behavior of the target user.
It will be appreciated that the plurality of feature dimensions, i.e., the plurality of features, may include user profile features such as gender, age, education, and occupation, as well as historical behavior features such as consumption amount and violation records.
VAE: a typical representative of generative models in machine learning, combining probabilistic graphical models with deep learning.
Generative model: machine learning models are generally divided into discriminative models and generative models; a discriminative model directly models the posterior probability, while a generative model directly models the joint probability of samples and labels.
In one example, the generative model includes an encoder, a decoder, and a sampling unit;
The encoder is configured to receive an input sample and output the mean and variance of the Gaussian distribution followed by the hidden vector corresponding to the input sample;
the sampling unit is configured to sample from the Gaussian distribution defined by the mean and variance output by the encoder, obtaining a first hidden vector;
the decoder is configured to decode the first hidden vector to obtain a first disturbance sample.
Further, the encoder comprises a deep neural network (DNN), a multi-layer perceptron (MLP), or a convolutional neural network (CNN).
Further, step 21 specifically includes:
Inputting the sample to be interpreted as the input sample into the pre-trained encoder, which outputs the mean and variance of the Gaussian distribution followed by the corresponding hidden vector;
the sampling unit sampling a first number of first hidden vectors from the Gaussian distribution defined by that mean and variance;
the decoder decoding the first number of first hidden vectors to obtain a first number of first disturbance samples.
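As an illustrative sketch only (the patent specifies no architecture), the generation step above can be written with toy, randomly initialized linear encoder/decoder weights; all sizes, weight names, and the linear maps themselves are assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy linear "encoder": maps a d-dim sample to the mean and log-variance
# of the Gaussian distribution followed by the hidden vector (assumed form).
d, h = 4, 2                                   # feature dim, hidden dim (assumed)
W_mu = rng.normal(size=(h, d))
W_logvar = rng.normal(size=(h, d)) * 0.1
W_dec = rng.normal(size=(d, h))               # toy linear "decoder"

def generate_perturbations(x, first_number):
    """Sample `first_number` hidden vectors from N(mu, sigma^2) via the
    reparameterization trick and decode each into a disturbance sample."""
    mu, logvar = W_mu @ x, W_logvar @ x
    sigma = np.exp(0.5 * logvar)
    eps = rng.normal(size=(first_number, h))
    z = mu + sigma * eps                      # first hidden vectors, shape (n, h)
    return z @ W_dec.T                        # disturbance samples, shape (n, d)

x = np.array([0.5, -1.0, 0.3, 2.0])           # made-up sample to be interpreted
samples = generate_perturbations(x, first_number=8)
print(samples.shape)  # (8, 4)
```

Because the hidden vectors are drawn around the mean learned for x, the decoded disturbance samples stay in the neighborhood of the sample to be interpreted, which is what the screening step relies on.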
Then, in step 22, the sample to be interpreted and the first number of disturbance samples are respectively input into the service processing model implemented as a neural network, which outputs a first service processing result for the sample to be interpreted and a second service processing result for each disturbance sample. It will be appreciated that for some disturbance samples the second service processing result is the same as the first, while for others it differs.
In one example, the service processing model comprises a deep neural network (DNN). A DNN can meet high performance requirements, and service constraints can be flexibly added to the network, but a DNN lacks interpretability. The embodiments of this specification can provide sample-level interpretation for a DNN.
Next, in step 23, a second number of disturbance samples are screened from the first number of disturbance samples, using consistency between the second service processing result and the first service processing result as the screening condition. It can be appreciated that the screened second number of disturbance samples serve as the basis for explaining the first service processing result.
Finally, in step 24, the differences between the second number of disturbance samples and the sample to be interpreted in each feature dimension are computed, and the first service processing result is interpreted from those per-dimension differences. It will be appreciated that the larger the difference in a feature dimension, the less important that feature dimension is for obtaining the first service processing result, and the smaller the difference, the more important it is.
In one example, the variance of the differences between the second number of disturbance samples and the sample to be interpreted is computed in each feature dimension, and the importance of each feature dimension in the basis for the first service processing result is determined from those per-dimension variances. In this example the variance indicates the difference in a feature dimension; other indicators of per-dimension difference may also be used.
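The variance-based importance step can be sketched as follows; the function name and the toy numbers are assumptions for illustration, not from the patent:

```python
import numpy as np

def feature_importance(x, kept_perturbations):
    """Per-dimension variance of the deltas between the screened disturbance
    samples and the sample to be interpreted. A dimension that tolerated only
    small changes (small variance) is more important to the result."""
    deltas = kept_perturbations - x           # shape (m, d)
    var = deltas.var(axis=0)                  # variance per feature dimension
    order = np.argsort(var)                   # most important (smallest) first
    return var, order

x = np.zeros(3)                               # made-up sample to be interpreted
kept = np.array([[0.01,  1.0, 0.30],          # made-up screened perturbations
                 [-0.02, -1.2, 0.25],
                 [0.015,  0.9, 0.35]])
var, order = feature_importance(x, kept)
print(list(order))  # [0, 2, 1]: dimension 0 changed least, so it ranks first
```

Dimension 1 swung between roughly -1.2 and 1.0 without flipping the (hypothetical) result, so it contributes least to the decision; dimension 0 barely moved, marking it as the key basis.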
In one example, the generative model is trained by:
inputting a training sample into the generation model, and outputting a training disturbance sample through the generation model;
Inputting the training samples and the training disturbance samples into the business processing model, wherein the business processing model comprises a plurality of hidden layers;
For a target hidden layer among the plurality of hidden layers, obtaining a target hidden vector of the training sample and a disturbance hidden vector of the training disturbance sample at the target hidden layer, and determining the cross entropy between the target hidden vector and the disturbance hidden vector;
Determining a reconstruction error according to the cross entropy;
and training the generated model with the aim of minimizing the reconstruction error.
Further, the target hidden layer is any hidden layer in the plurality of hidden layers;
The determining a reconstruction error according to the cross entropy comprises:
and summing the cross entropies corresponding to all the hidden layers in the plurality of hidden layers respectively, thereby determining the reconstruction error.
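A minimal sketch of this reconstruction error follows. The patent does not say how a cross entropy between raw hidden vectors is made well defined; softmax-normalizing the activations first is an assumption of this sketch, as are all the toy activation values:

```python
import numpy as np

def softmax(v):
    e = np.exp(v - v.max())                   # stabilized softmax
    return e / e.sum()

def reconstruction_error(hidden_x, hidden_x_prime):
    """Sum, over all hidden layers, of the cross entropy between the target
    hidden vector (from training sample x) and the disturbance hidden vector
    (from x'). Vectors are softmax-normalized here so H(p, q) is defined."""
    total = 0.0
    for hx, hp in zip(hidden_x, hidden_x_prime):
        p, q = softmax(hx), softmax(hp)
        total += -(p * np.log(q)).sum()       # cross entropy H(p, q)
    return total

# Two hidden layers of made-up activations for x and x'.
hx  = [np.array([1.0, 2.0, 0.5]), np.array([0.20, -0.10])]
hxp = [np.array([1.1, 1.9, 0.4]), np.array([0.25, -0.20])]
err = reconstruction_error(hx, hxp)
identical = reconstruction_error(hx, hx)
print(err > identical)  # True: H(p, q) is minimized when q equals p
```

Minimizing this sum pushes the generated disturbance sample x' to trigger the same hidden-layer responses as x in the service processing model, which is what makes the perturbations behave as neighbors of the sample.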
FIG. 3 illustrates the training process of the generation model according to one embodiment. Referring to fig. 3, the generation model includes an encoder, a decoder, and a sampling unit. The training sample x passes through the encoder, which learns the mean mu and variance sigma of the Gaussian distribution followed by the hidden vector; the sampling unit then draws a hidden vector from that Gaussian distribution, and the decoder turns it into a disturbance sample x'. Both x' and the training sample x are input into the service processing model, and the cross entropy between the hidden-layer outputs of the service processing model for x and x' is used as the reconstruction error to train the generation model.
Fig. 4 shows an overall process diagram for interpreting the service processing results of the service processing model according to one embodiment. Referring to fig. 4, the generation model produces disturbance samples, and an interpretation is derived from them. For the sample to be interpreted x, n disturbance samples are obtained through the pre-trained generation model. These disturbance samples are input into the service processing model, and the disturbance samples whose service processing result is consistent with that of the sample to be interpreted are selected, yielding m valid disturbance samples after screening. The deltas are obtained by computing the difference between each of these m disturbance samples and the sample to be interpreted. Intuitively, a disturbance sample is equivalent to adding a disturbance delta to the original features x without changing the service processing result of the model; therefore a feature that can change by a large amplitude is relatively unimportant, while a feature that changes only by a small amplitude is more important. The final interpretation counts the variance of each dimension of these deltas as the basis of the interpretation.
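The end-to-end flow of fig. 4 can be sketched in a few lines. The linear "business model", the noise scales, and all numbers below are illustrative assumptions standing in for the trained DNN and VAE:

```python
import numpy as np

rng = np.random.default_rng(1)

def business_model(batch):
    """Toy stand-in for the DNN service processing model: a fixed linear
    classifier (purely illustrative, not the patent's model)."""
    w = np.array([3.0, 0.1, 0.1])
    return (batch @ w > 0).astype(int)

x = np.array([0.4, -0.2, 0.5])                 # sample to be interpreted
n = 200                                        # first number of perturbations
# Stand-in for the VAE: Gaussian neighbors of x (dimension 0 varied least).
perturbations = x + rng.normal(scale=[0.05, 0.8, 0.8], size=(n, 3))

# Screen: keep perturbations whose result matches the result for x.
first_result = business_model(x[None, :])[0]
kept = perturbations[business_model(perturbations) == first_result]

# Interpret: variance of the deltas per feature dimension.
var = (kept - x).var(axis=0)
most_important = int(np.argmin(var))
print(most_important)  # 0: the dimension that tolerated the smallest change
```

Because dimension 0 carries almost all the classifier's weight, only small changes to it survive the consistency screen, so its delta variance is smallest and it is reported as the main basis of the decision.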
According to the method provided by the embodiments of this specification, a generation model is constructed to generate a plurality of disturbance samples for the sample to be interpreted; these disturbance samples are neighborhood pseudo samples of the sample to be interpreted. The disturbance samples for which the business processing model gives the same business processing result as for the sample to be interpreted are screened out, and the model interpretation is derived from the screened disturbance samples. The method provides sample-level interpretation of an existing business processing model: each time the model outputs a business processing result, it can also give the decision basis for that result. The method reduces computational complexity and improves efficiency, and the generated disturbance samples better conform to the distribution of the sample to be interpreted.
According to an embodiment of another aspect, there is further provided an apparatus for interpreting a service processing result of a service processing model, the apparatus being configured to perform the method for interpreting a service processing result of a service processing model provided in the embodiments of the present specification. FIG. 5 shows a schematic block diagram of an apparatus for interpreting a business processing result of a business processing model, according to one embodiment. As shown in FIG. 5, the apparatus 500 includes:
A generating unit 51, configured to input samples to be interpreted into a pre-trained generation model based on a variational automatic encoder VAE, to obtain a first number of disturbance samples, where the samples to be interpreted and the disturbance samples each include a plurality of feature dimensions;
A service processing unit 52, configured to input the samples to be interpreted and the first number of disturbance samples obtained by the generating unit 51 into a service processing model implemented by a neural network, and output, through the service processing model, a first service processing result corresponding to the samples to be interpreted, and a second service processing result corresponding to each disturbance sample;
A screening unit 53, configured to screen a second number of disturbance samples from the first number of disturbance samples obtained by the generating unit 51, taking consistency between the second service processing result obtained by the service processing unit 52 and the first service processing result as the screening condition;
and an interpretation unit 54, configured to count differences between the second number of disturbance samples obtained by the screening unit 53 and the samples to be interpreted in each feature dimension, and interpret the first service processing result obtained by the service processing unit according to the differences in each feature dimension.
Optionally, as an embodiment, the sample to be interpreted corresponds to a target user;
And the service processing result output by the service processing model is used for indicating whether to intercept the preset behavior of the target user.
Optionally, as an embodiment, the service processing model includes a deep neural network DNN.
Alternatively, as an embodiment, the generative model is trained by:
inputting a training sample into the generation model, and outputting a training disturbance sample through the generation model;
Inputting the training samples and the training disturbance samples into the business processing model, wherein the business processing model comprises a plurality of hidden layers;
For a target hidden layer among the plurality of hidden layers, acquiring a target hidden vector of the training sample and a disturbance hidden vector of the training disturbance sample at the target hidden layer, respectively, and determining the cross entropy between the target hidden vector and the disturbance hidden vector;
Determining a reconstruction error according to the cross entropy;
and training the generation model with the aim of minimizing the reconstruction error.
Further, the target hidden layer is any hidden layer in the plurality of hidden layers;
The determining a reconstruction error according to the cross entropy comprises:
and summing the cross entropies respectively corresponding to each of the plurality of hidden layers to determine the reconstruction error.
Optionally, as an embodiment, the generating model includes an encoder, a decoder, and a sampling unit;
The encoder is configured to receive an input sample and output the mean and variance of the Gaussian distribution obeyed by the hidden vector corresponding to the input sample;
The sampling unit is configured to sample, from the Gaussian distribution corresponding to the mean and variance output by the encoder, a first hidden vector;
the decoder is configured to decode the first hidden vector to obtain a first disturbance sample.
Further, the encoder includes: deep neural network DNN, multi-layer perceptron MLP, or convolutional neural network CNN.
Further, the generating unit 51 is specifically configured to:
Inputting the sample to be interpreted as an input sample into the pre-trained encoder, and outputting, through the encoder, the mean and variance of the Gaussian distribution obeyed by the hidden vector corresponding to the input sample;
The sampling unit samples, from the Gaussian distribution corresponding to the mean and variance output by the encoder, a first number of first hidden vectors;
and the decoder decodes the first number of first hidden vectors to obtain a first number of first disturbance samples.
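Taken together, the generating unit's three steps (encode, sample a first number of first hidden vectors, decode) can be sketched as below. The `encode` and `decode` callables are hypothetical stand-ins for the two halves of the trained generation model, and the reparameterization-style draw mu + sigma * eps is one standard way to realize the sampling unit.

```python
import numpy as np

def generate_perturbations(x, encode, decode, n, rng=None):
    """Generate n disturbance samples for the sample to be interpreted x
    (a sketch; `encode` returns the Gaussian mean and standard deviation,
    `decode` maps a hidden vector back to sample space)."""
    rng = np.random.default_rng(rng)
    mu, sigma = encode(x)                       # Gaussian parameters of the hidden vector
    eps = rng.standard_normal((n, mu.shape[0]))
    z = mu + sigma * eps                        # a first number of first hidden vectors
    return np.stack([decode(zi) for zi in z])   # a first number of first disturbance samples
```

Each row of the returned array is one disturbance sample, ready to be fed to the business processing model for screening.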
Optionally, as an embodiment, the interpretation unit 54 is specifically configured to count the variances, in each feature dimension, of the differences between the second number of disturbance samples and the sample to be interpreted, and to determine the importance of each feature dimension in the basis for the first service processing result according to the variances in each feature dimension.
According to an embodiment of another aspect, there is also provided a computer-readable storage medium having stored thereon a computer program which, when executed in a computer, causes the computer to perform the method described in connection with fig. 2.
According to an embodiment of yet another aspect, there is also provided a computing device including a memory having executable code stored therein and a processor that, when executing the executable code, implements the method described in connection with fig. 2.
Those skilled in the art will appreciate that in one or more of the examples described above, the functions described in the present invention may be implemented in hardware, software, firmware, or any combination thereof. When implemented in software, these functions may be stored on or transmitted over as one or more instructions or code on a computer-readable medium.
The foregoing specific embodiments further describe the objectives, technical solutions, and advantages of the present invention in detail. It should be understood that the above are merely specific embodiments of the present invention and are not intended to limit its protection scope; any modification, equivalent replacement, or improvement made on the basis of the technical solutions of the present invention shall fall within the protection scope of the present invention.
Claims (18)
1. A method of interpreting a business process result of a business process model, the method comprising:
Inputting a sample to be interpreted into a pre-trained generation model to obtain a first number of disturbance samples, wherein the sample to be interpreted and the disturbance samples both comprise a plurality of characteristic dimensions;
Respectively inputting the sample to be interpreted and the first number of disturbance samples into a business processing model realized by a neural network, and outputting a first business processing result corresponding to the sample to be interpreted and a second business processing result corresponding to each disturbance sample by the business processing model;
Screening a second number of disturbance samples from the first number of disturbance samples by taking the consistency of the second service processing result and the first service processing result as screening conditions;
counting the differences of the second number of disturbance samples and the samples to be explained in each characteristic dimension, and explaining the first business processing result according to the differences in each characteristic dimension;
Wherein the sample to be interpreted corresponds to a target user;
And the service processing result output by the service processing model is used for indicating whether to intercept the preset behavior of the target user.
2. The method of claim 1, wherein the business process model comprises a deep neural network DNN.
3. The method of claim 1, wherein the generative model is trained by:
inputting a training sample into the generation model, and outputting a training disturbance sample through the generation model;
Inputting the training samples and the training disturbance samples into the business processing model, wherein the business processing model comprises a plurality of hidden layers;
For a target hidden layer among the plurality of hidden layers, acquiring a target hidden vector of the training sample and a disturbance hidden vector of the training disturbance sample at the target hidden layer, respectively, and determining the cross entropy between the target hidden vector and the disturbance hidden vector;
Determining a reconstruction error according to the cross entropy;
and training the generation model with the aim of minimizing the reconstruction error.
4. A method as claimed in claim 3, wherein the target hidden layer is any one of the plurality of hidden layers;
The determining a reconstruction error according to the cross entropy comprises:
and summing the cross entropies respectively corresponding to each of the plurality of hidden layers to determine the reconstruction error.
5. The method of claim 1, wherein the generation model comprises an encoder, a decoder, and a sampling unit;
The encoder is used for receiving input samples, and outputting the mean value and the variance of Gaussian distribution obeyed by hidden vectors corresponding to the input samples through the encoder;
The sampling unit is used for sampling from each hidden vector of Gaussian distribution corresponding to the mean value and the variance output by the encoder to obtain a first hidden vector;
the decoder is configured to decode the first hidden vector to obtain a first disturbance sample.
6. The method of claim 5, wherein the encoder comprises: deep neural network DNN, multi-layer perceptron MLP, or convolutional neural network CNN.
7. The method of claim 5, wherein said inputting the sample to be interpreted into a pre-trained generative model results in a first number of perturbation samples, comprising:
Inputting a sample to be interpreted as an input sample into the pre-trained encoder, and outputting the mean value and the variance of Gaussian distribution obeyed by the hidden vector corresponding to the input sample through the encoder;
The sampling unit samples the first hidden vectors of the Gaussian distribution corresponding to the mean value and the variance output by the encoder to obtain a first number of first hidden vectors;
the decoder decodes the first number of first hidden vectors to obtain a first number of first disturbance samples.
8. The method of claim 1, wherein the counting the differences in each characteristic dimension between the second number of perturbation samples and the sample to be interpreted, and interpreting the first business process result according to the differences in each characteristic dimension, comprises:
and counting the variances, in each characteristic dimension, of the differences between the second number of disturbance samples and the sample to be interpreted, and determining the importance of each characteristic dimension in the basis for the first business processing result according to the variances in each characteristic dimension.
9. An apparatus for interpreting a business process result of a business process model, the apparatus comprising:
The generation unit is used for inputting the sample to be interpreted into a pre-trained generation model to obtain a first number of disturbance samples, wherein the sample to be interpreted and the disturbance samples both comprise a plurality of characteristic dimensions;
The business processing unit is used for respectively inputting the samples to be interpreted and the first number of disturbance samples obtained by the generating unit into a business processing model realized by a neural network, and outputting a first business processing result corresponding to the samples to be interpreted and a second business processing result corresponding to each disturbance sample by the business processing model;
The screening unit is configured to screen a second number of disturbance samples from the first number of disturbance samples obtained by the generating unit, taking consistency between the second service processing result obtained by the service processing unit and the first service processing result as the screening condition;
The interpretation unit is used for counting the difference between the second number of disturbance samples obtained by the screening unit and the sample to be interpreted in each characteristic dimension, and interpreting the first service processing result obtained by the service processing unit according to the difference in each characteristic dimension;
Wherein the sample to be interpreted corresponds to a target user;
And the service processing result output by the service processing model is used for indicating whether to intercept the preset behavior of the target user.
10. The apparatus of claim 9, wherein the traffic processing model comprises a deep neural network DNN.
11. The apparatus of claim 9, wherein the generative model is trained by:
inputting a training sample into the generation model, and outputting a training disturbance sample through the generation model;
Inputting the training samples and the training disturbance samples into the business processing model, wherein the business processing model comprises a plurality of hidden layers;
For a target hidden layer among the plurality of hidden layers, acquiring a target hidden vector of the training sample and a disturbance hidden vector of the training disturbance sample at the target hidden layer, respectively, and determining the cross entropy between the target hidden vector and the disturbance hidden vector;
Determining a reconstruction error according to the cross entropy;
and training the generation model with the aim of minimizing the reconstruction error.
12. The apparatus of claim 11, wherein the target hidden layer is any one of the plurality of hidden layers;
The determining a reconstruction error according to the cross entropy comprises:
and summing the cross entropies respectively corresponding to each of the plurality of hidden layers to determine the reconstruction error.
13. The apparatus of claim 9, wherein the generation model comprises an encoder, a decoder, and a sampling unit;
The encoder is used for receiving input samples, and outputting the mean value and the variance of Gaussian distribution obeyed by hidden vectors corresponding to the input samples through the encoder;
The sampling unit is used for sampling from each hidden vector of Gaussian distribution corresponding to the mean value and the variance output by the encoder to obtain a first hidden vector;
the decoder is configured to decode the first hidden vector to obtain a first disturbance sample.
14. The apparatus of claim 13, wherein the encoder comprises: deep neural network DNN, multi-layer perceptron MLP, or convolutional neural network CNN.
15. The apparatus of claim 13, wherein the generating unit is specifically configured to:
Inputting a sample to be interpreted as an input sample into the pre-trained encoder, and outputting the mean value and the variance of Gaussian distribution obeyed by the hidden vector corresponding to the input sample through the encoder;
The sampling unit samples the first hidden vectors of the Gaussian distribution corresponding to the mean value and the variance output by the encoder to obtain a first number of first hidden vectors;
the decoder decodes the first number of first hidden vectors to obtain a first number of first disturbance samples.
16. The apparatus of claim 9, wherein the interpretation unit is specifically configured to count the variances, in each feature dimension, of the differences between the second number of disturbance samples and the sample to be interpreted, and to determine the importance of each feature dimension in the basis for the first service processing result according to the variances in each feature dimension.
17. A computer readable storage medium having stored thereon a computer program which, when executed in a computer, causes the computer to perform the method of any of claims 1-8.
18. A computing device comprising a memory having executable code stored therein and a processor, which when executing the executable code, implements the method of any of claims 1-8.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210181701.4A CN114548300B (en) | 2019-12-20 | 2019-12-20 | Method and device for explaining service processing result of service processing model |
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210181701.4A CN114548300B (en) | 2019-12-20 | 2019-12-20 | Method and device for explaining service processing result of service processing model |
CN201911326360.XA CN111062442B (en) | 2019-12-20 | 2019-12-20 | Method and device for explaining service processing result of service processing model |
Related Parent Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201911326360.XA Division CN111062442B (en) | 2019-12-20 | 2019-12-20 | Method and device for explaining service processing result of service processing model |
Publications (2)
Publication Number | Publication Date |
---|---|
CN114548300A CN114548300A (en) | 2022-05-27 |
CN114548300B true CN114548300B (en) | 2024-05-28 |
Family
ID=70301299
Family Applications (2)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201911326360.XA Active CN111062442B (en) | 2019-12-20 | 2019-12-20 | Method and device for explaining service processing result of service processing model |
CN202210181701.4A Active CN114548300B (en) | 2019-12-20 | 2019-12-20 | Method and device for explaining service processing result of service processing model |
Family Applications Before (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201911326360.XA Active CN111062442B (en) | 2019-12-20 | 2019-12-20 | Method and device for explaining service processing result of service processing model |
Country Status (1)
Country | Link |
---|---|
CN (2) | CN111062442B (en) |
Families Citing this family (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112052957B (en) * | 2020-09-02 | 2023-08-04 | 平安科技(深圳)有限公司 | Method and device for acquiring interpretability parameters of deep learning model |
CN113377640B (en) * | 2021-06-23 | 2022-07-08 | 杭州网易云音乐科技有限公司 | Method, medium, device and computing equipment for explaining model under business scene |
CN115618748B (en) * | 2022-11-29 | 2023-05-02 | 支付宝(杭州)信息技术有限公司 | Model optimization method, device, equipment and storage medium |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2003081527A1 (en) * | 2002-03-26 | 2003-10-02 | Council Of Scientific And Industrial Research | Improved performance of artificial neural network models in the presence of instrumental noise and measurement errors |
CN107895160A (en) * | 2017-12-21 | 2018-04-10 | 曙光信息产业(北京)有限公司 | Human face detection and tracing device and method |
CN109903053A (en) * | 2019-03-01 | 2019-06-18 | 成都新希望金融信息有限公司 | A kind of anti-fraud method carrying out Activity recognition based on sensing data |
CN110175646A (en) * | 2019-05-27 | 2019-08-27 | 浙江工业大学 | Multichannel confrontation sample testing method and device based on image transformation |
CN110334806A (en) * | 2019-05-29 | 2019-10-15 | 广东技术师范大学 | A method of adversarial sample generation based on generative adversarial network |
Family Cites Families (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US10447733B2 (en) * | 2014-06-11 | 2019-10-15 | Accenture Global Services Limited | Deception network system |
US10387655B2 (en) * | 2017-02-15 | 2019-08-20 | International Business Machines Corporation | Method, system and product for using a predictive model to predict if inputs reach a vulnerability of a program |
WO2019051113A1 (en) * | 2017-09-06 | 2019-03-14 | BigML, Inc. | Prediction characterization for black box machine learning models |
CN108090507A (en) * | 2017-10-19 | 2018-05-29 | 电子科技大学 | A kind of medical imaging textural characteristics processing method based on integrated approach |
US11386342B2 (en) * | 2018-04-20 | 2022-07-12 | H2O.Ai Inc. | Model interpretation |
CN111542841A (en) * | 2018-06-08 | 2020-08-14 | 北京嘀嘀无限科技发展有限公司 | System and method for content identification |
CN108960434B (en) * | 2018-06-28 | 2021-07-20 | 第四范式(北京)技术有限公司 | Method and device for analyzing data based on machine learning model interpretation |
CN110033094A (en) * | 2019-02-22 | 2019-07-19 | 阿里巴巴集团控股有限公司 | A kind of model training method and device based on disturbance sample |
CN110110139B (en) * | 2019-04-19 | 2021-06-22 | 北京奇艺世纪科技有限公司 | Method and device for explaining recommendation result and electronic equipment |
Non-Patent Citations (1)
Title |
---|
User adversarial identification based on ensemble semi-supervised learning in the Internet context; Zhang Weijian; China Master's Theses Full-text Database (Social Sciences II); 2019-08-15; H123-266 *
Also Published As
Publication number | Publication date |
---|---|
CN111062442A (en) | 2020-04-24 |
CN114548300A (en) | 2022-05-27 |
CN111062442B (en) | 2022-04-12 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
WO2018212710A1 (en) | Predictive analysis methods and systems | |
WO2019190886A1 (en) | Digital watermarking of machine learning models | |
Lopes et al. | Effective network intrusion detection via representation learning: A Denoising AutoEncoder approach | |
CN114548300B (en) | Method and device for explaining service processing result of service processing model | |
EP3916597B1 (en) | Detecting malware with deep generative models | |
US10929756B1 (en) | Systems and methods for configuring and implementing an interpretive surrogate machine learning model | |
Demidov et al. | Application model of modern artificial neural network methods for the analysis of information systems security | |
Wei et al. | Toward identifying APT malware through API system calls | |
US11886597B2 (en) | Detection of common patterns in user generated content with applications in fraud detection | |
Zhao et al. | Natural backdoor attacks on deep neural networks via raindrops | |
Ravi et al. | Hybrid classification and regression models via particle swarm optimization auto associative neural network based nonlinear PCA | |
Salcedo-Sanz et al. | An island grouping genetic algorithm for fuzzy partitioning problems | |
US20220198255A1 (en) | Training a semantic parser using action templates | |
US11997137B2 (en) | Webpage phishing detection using deep reinforcement learning | |
Qayoom et al. | A novel approach for credit card fraud transaction detection using deep reinforcement learning scheme | |
CN117522403A (en) | GCN abnormal customer early warning method and device based on subgraph fusion | |
Lijun et al. | An intuitionistic calculus to complex abnormal event recognition on data streams | |
CN112950222A (en) | Resource processing abnormity detection method and device, electronic equipment and storage medium | |
CN114618167A (en) | Anti-cheating detection model construction method and anti-cheating detection method | |
Gunes et al. | Detecting Direction of Pepper Stem by Using CUDA‐Based Accelerated Hybrid Intuitionistic Fuzzy Edge Detection and ANN | |
Wu et al. | English text recognition deep learning framework to automatically identify fake news | |
Demir et al. | Subnetwork ensembling and data augmentation: Effects on calibration | |
US12061622B1 (en) | Apparatus and method for communications associated with one or more data sets | |
US12223531B2 (en) | Method and an apparatus for a personalized user interface | |
Camino | Machine Learning Techniques for Suspicious Transaction Detection and Analysis |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||