
CN114548300B - Method and device for explaining service processing result of service processing model - Google Patents

Method and device for explaining service processing result of service processing model Download PDF

Info

Publication number
CN114548300B
Authority
CN
China
Prior art keywords
samples
sample
disturbance
hidden
model
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202210181701.4A
Other languages
Chinese (zh)
Other versions
CN114548300A (en)
Inventor
唐才智
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Alipay Hangzhou Information Technology Co Ltd
Original Assignee
Alipay Hangzhou Information Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Alipay Hangzhou Information Technology Co Ltd filed Critical Alipay Hangzhou Information Technology Co Ltd
Priority to CN202210181701.4A
Publication of CN114548300A
Application granted
Publication of CN114548300B
Legal status: Active
Anticipated expiration

Links

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/21Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/24Classification techniques
    • G06F18/241Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00Machine learning
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06QINFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q20/00Payment architectures, schemes or protocols
    • G06Q20/38Payment protocols; Details thereof
    • G06Q20/40Authorisation, e.g. identification of payer or payee, verification of customer or shop credentials; Review and approval of payers, e.g. check credit lines or negative lists
    • G06Q20/401Transaction verification
    • G06Q20/4014Identity check for transactions
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06QINFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q20/00Payment architectures, schemes or protocols
    • G06Q20/38Payment protocols; Details thereof
    • G06Q20/40Authorisation, e.g. identification of payer or payee, verification of customer or shop credentials; Review and approval of payers, e.g. check credit lines or negative lists
    • G06Q20/401Transaction verification
    • G06Q20/4016Transaction verification involving fraud or risk level assessment in transaction processing
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06QINFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q40/00Finance; Insurance; Tax strategies; Processing of corporate or income taxes
    • G06Q40/03Credit; Loans; Processing thereof

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Business, Economics & Management (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Accounting & Taxation (AREA)
  • General Engineering & Computer Science (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Computation (AREA)
  • Software Systems (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Finance (AREA)
  • Computing Systems (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • General Business, Economics & Management (AREA)
  • Mathematical Physics (AREA)
  • Strategic Management (AREA)
  • Biophysics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Biomedical Technology (AREA)
  • Computer Security & Cryptography (AREA)
  • Health & Medical Sciences (AREA)
  • Evolutionary Biology (AREA)
  • Molecular Biology (AREA)
  • General Health & Medical Sciences (AREA)
  • Computational Linguistics (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Economics (AREA)
  • Technology Law (AREA)
  • Marketing (AREA)
  • Medical Informatics (AREA)
  • Development Economics (AREA)
  • Complex Calculations (AREA)

Abstract

The embodiment of the specification provides a method and a device for explaining a service processing result of a service processing model, wherein the method comprises the following steps: inputting a sample to be interpreted into a pre-trained generation model to obtain a first number of disturbance samples, wherein the sample to be interpreted and the disturbance samples both comprise a plurality of characteristic dimensions; respectively inputting a sample to be interpreted and a first number of disturbance samples into a business processing model realized through a neural network, and outputting a first business processing result corresponding to the sample to be interpreted and a second business processing result corresponding to each disturbance sample; screening a second number of disturbance samples from the first number of disturbance samples by taking the consistency of the second service processing result and the first service processing result as screening conditions; and counting the differences of the second number of disturbance samples and the samples to be explained in each characteristic dimension, and explaining the first business processing result according to the differences in each characteristic dimension. The method can reduce the calculation complexity and improve the efficiency.

Description

Method and device for explaining service processing result of service processing model
The present application is a divisional application of the Chinese invention application with application date December 20, 2019, application number 201911326360.X, and invention title "Method and device for explaining the service processing result of a service processing model".
Technical Field
One or more embodiments of the present specification relate to the field of computers, and more particularly, to a method and apparatus for interpreting business process results of a business process model.
Background
Machine learning is now widely used in retail, technology, healthcare, science, and other fields. Whether a classification model or a regression model produces a result or decision, the decision process as a whole is invisible or unintelligible to people. The decision process of a business processing model implemented by a neural network differs greatly from rules, which are easier to accept and understand: a rule-based decision corresponds to an understandable and traceable decision path, whereas the decision of the business processing model is more of a black-box process in which only the input and the output are exposed to the user; the decision process is opaque to the user and, even when a decision is wrong, it cannot be traced. The untraceable and uncontrollable nature of these black boxes is what keeps them from operating in certain specific fields, in particular financial fields such as insurance and banking, where security requirements are high and stability and controllability are required.
In the prior art, methods for explaining the service processing result of a service processing model generally have high computational complexity and low efficiency.
Therefore, an improved scheme is desired that can reduce the computational complexity and improve the efficiency when interpreting the business processing result of a business processing model.
Disclosure of Invention
One or more embodiments of the present disclosure describe a method and apparatus for interpreting a service processing result of a service processing model, which can reduce computational complexity and improve efficiency.
In a first aspect, a method for interpreting a service processing result of a service processing model is provided, the method comprising:
Inputting samples to be interpreted into a pre-trained generation model based on a variational automatic encoder (variational autoencoders, VAE) to obtain a first number of disturbance samples, wherein the samples to be interpreted and the disturbance samples both comprise a plurality of characteristic dimensions;
Respectively inputting the sample to be interpreted and the first number of disturbance samples into a business processing model realized by a neural network, and outputting a first business processing result corresponding to the sample to be interpreted and a second business processing result corresponding to each disturbance sample by the business processing model;
Screening a second number of disturbance samples from the first number of disturbance samples by taking the consistency of the second service processing result and the first service processing result as screening conditions;
And counting the differences between the second number of disturbance samples and the sample to be explained in each characteristic dimension, and explaining the first business processing result according to the differences in each characteristic dimension.
In one possible implementation, the sample to be interpreted corresponds to a target user;
And the service processing result output by the service processing model is used for indicating whether to intercept the preset behavior of the target user.
In one possible implementation, the business process model includes a deep neural network (deep neural networks, DNN).
In one possible implementation, the generative model is trained by:
inputting a training sample into the generation model, and outputting a training disturbance sample through the generation model;
Inputting the training samples and the training disturbance samples into the business processing model, wherein the business processing model comprises a plurality of hidden layers;
Aiming at a target hidden layer in the hidden layers, acquiring a target hidden vector and a disturbance hidden vector of the training sample and the training disturbance sample in the target hidden layer respectively; determining cross entropy between the target hidden vector and the disturbance hidden vector;
Determining a reconstruction error according to the cross entropy;
and training the generated model with the aim of minimizing the reconstruction error.
Further, the target hidden layer is any hidden layer in the plurality of hidden layers;
The determining a reconstruction error according to the cross entropy comprises:
and summing the cross entropies corresponding to all the hidden layers in the plurality of hidden layers respectively, thereby determining the reconstruction error.
In one possible implementation, the generation model includes an encoder, a decoder, and a sampling unit;
The encoder is used for receiving input samples, and outputting the mean value and the variance of Gaussian distribution obeyed by hidden vectors corresponding to the input samples through the encoder;
The sampling unit is used for sampling from each hidden vector of Gaussian distribution corresponding to the mean value and the variance output by the encoder to obtain a first hidden vector;
the decoder is configured to decode the first hidden vector to obtain a first disturbance sample.
Further, the encoder includes: deep neural network DNN, multi-layer perceptron (multi-Layer perceptron, MLP) or convolutional neural network (convolutional neural networks, CNN).
Further, the inputting the sample to be interpreted into a pre-trained generation model based on a variational automatic encoder VAE, to obtain a first number of disturbance samples, including:
Inputting a sample to be interpreted as an input sample into the pre-trained encoder, and outputting the mean value and the variance of Gaussian distribution obeyed by the hidden vector corresponding to the input sample through the encoder;
The sampling unit samples the first hidden vectors of the Gaussian distribution corresponding to the mean value and the variance output by the encoder to obtain a first number of first hidden vectors;
the decoder decodes the first number of first hidden vectors to obtain a first number of first disturbance samples.
In a possible implementation manner, the counting the differences between the second number of disturbance samples and the sample to be interpreted in each characteristic dimension, and the interpreting the first service processing result according to the differences in each characteristic dimension includes:
And counting the variances of the second number of disturbance samples and the samples to be explained in each characteristic dimension, and determining the importance of each characteristic dimension in the basis of obtaining the first service processing result according to the variances in each characteristic dimension.
In a second aspect, there is provided an apparatus for interpreting a service processing result of a service processing model, the apparatus comprising:
the generation unit is used for inputting the sample to be interpreted into a pre-trained generation model based on a variational autoencoder (VAE) to obtain a first number of disturbance samples, wherein the sample to be interpreted and the disturbance samples both comprise a plurality of characteristic dimensions;
The business processing unit is used for respectively inputting the samples to be interpreted and the first number of disturbance samples obtained by the generating unit into a business processing model realized by a neural network, and outputting a first business processing result corresponding to the samples to be interpreted and a second business processing result corresponding to each disturbance sample by the business processing model;
The screening unit is used for screening a second number of disturbance samples from the first number of disturbance samples obtained by the generating unit, by taking the consistency of the second service processing result obtained by the service processing unit with the first service processing result as the screening condition;
And the interpretation unit is used for counting the difference between the second number of disturbance samples obtained by the screening unit and the samples to be interpreted in each characteristic dimension, and interpreting the first service processing result obtained by the service processing unit according to the difference in each characteristic dimension.
In a third aspect, there is provided a computer readable storage medium having stored thereon a computer program which, when executed in a computer, causes the computer to perform the method of the first aspect.
In a fourth aspect, there is provided a computing device comprising a memory having executable code stored therein and a processor which, when executing the executable code, implements the method of the first aspect.
With the method and apparatus provided by the embodiments of the present specification, a sample to be interpreted is first input into a pre-trained generation model based on a variational autoencoder (VAE) to obtain a first number of disturbance samples, where the sample to be interpreted and the disturbance samples each comprise a plurality of characteristic dimensions; the sample to be interpreted and the first number of disturbance samples are then respectively input into a business processing model implemented by a neural network, and the business processing model outputs a first business processing result corresponding to the sample to be interpreted and a second business processing result corresponding to each disturbance sample; next, with the consistency of the second business processing result with the first business processing result as the screening condition, a second number of disturbance samples are screened from the first number of disturbance samples; finally, the differences between the second number of disturbance samples and the sample to be interpreted in each characteristic dimension are counted, and the first business processing result is interpreted according to the differences in each characteristic dimension. As can be seen from the above, in the embodiments of the present specification, a generation model is constructed to generate a plurality of disturbance samples for the sample to be interpreted; the disturbance samples are neighborhood pseudo samples of the sample to be interpreted; the disturbance samples for which the business processing model gives the same business processing result as for the sample to be interpreted are screened out, and the model interpretation is derived from the screened disturbance samples. The method can provide a sample-level interpretation of an existing business processing model, that is, each time the model outputs a business processing result it can also give the decision basis for that result. The method can reduce the computational complexity and improve the efficiency.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings required for the description of the embodiments will be briefly described below, it being obvious that the drawings in the following description are only some embodiments of the present invention, and that other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
FIG. 1 is a schematic illustration of an implementation scenario of an embodiment disclosed herein;
FIG. 2 illustrates a method flow diagram for interpreting business process results of a business process model, in accordance with one embodiment;
FIG. 3 illustrates a training process schematic of generating a model according to one embodiment;
FIG. 4 shows an overall process diagram that explains business process results of a business process model, according to one embodiment;
fig. 5 shows a schematic block diagram of an apparatus for interpreting a business process result of a business process model, according to one embodiment.
Detailed Description
The following describes the scheme provided in the present specification with reference to the drawings.
Fig. 1 is a schematic diagram of an implementation scenario of an embodiment disclosed in the present specification. The implementation scenario involves interpreting the business process results of the business process model. Referring to fig. 1, a sample to be interpreted includes a plurality of feature dimensions, and when the sample to be interpreted is input into a service processing model, the service processing model outputs a corresponding service processing result.
It will be appreciated that the embodiments of the present description can give a sample-level interpretation, with the same feature having different importance for different samples to be interpreted. For example, each input sample of the service processing model includes N feature dimensions, namely feature 1, feature 2, ..., feature N. For one sample to be interpreted, the most important basis for the corresponding service processing result may be feature 1, while for another sample to be interpreted the most important basis may be feature 2.
As an example, a typical implementation scenario is a financial scenario in which the business processing model is used to identify users with impersonated identities and to intercept preset behaviors of the users identified as impersonated. On some online financial platforms, some people use other people's accounts to consume or borrow, which is called identity impersonation. Identity impersonation is very likely to be accompanied by financial risk, and the corresponding behavior needs to be intercepted; however, given the sensitivity of financial scenarios, the performance requirements and the interpretability requirements for the interception model are necessarily high. The model is therefore required to satisfy both high performance and interpretability.
Fig. 2 shows a flow chart of a method of interpreting the business processing result of a business processing model according to one embodiment, which may be based on the implementation scenario shown in fig. 1. As shown in fig. 2, the method for explaining the service processing result of the service processing model in this embodiment includes the following steps: step 21, inputting a sample to be interpreted into a pre-trained generation model based on a variational autoencoder (variational autoencoders, VAE) to obtain a first number of disturbance samples, wherein the sample to be interpreted and the disturbance samples both comprise a plurality of characteristic dimensions; step 22, inputting the sample to be interpreted and the first number of disturbance samples into a service processing model realized by a neural network respectively, and outputting, by the service processing model, a first service processing result corresponding to the sample to be interpreted and a second service processing result corresponding to each disturbance sample; step 23, screening a second number of disturbance samples from the first number of disturbance samples by taking the consistency of the second service processing result with the first service processing result as the screening condition; and step 24, counting the differences between the second number of disturbance samples and the sample to be interpreted in each characteristic dimension, and interpreting the first service processing result according to the differences in each characteristic dimension. Specific implementations of the above steps are described below.
First, in step 21, a sample to be interpreted is input into a pre-trained VAE-based generation model, and a first number of disturbance samples are obtained, where the sample to be interpreted and the disturbance samples each include a plurality of feature dimensions. It will be appreciated that the first number may be predetermined.
In one example, the sample to be interpreted corresponds to a target user;
And the service processing result output by the service processing model is used for indicating whether to intercept the preset behavior of the target user.
It will be appreciated that the plurality of feature dimensions, i.e., the plurality of features, may include user portrait features, such as gender, age, education background, and occupation, and may also include historical behavior features, such as consumption amount and violation records.
VAE: a typical representative of generative models in machine learning, combining probabilistic graphical models with deep learning.
Generating a model: the machine learning model is generally divided into a discriminant model, which is a type of machine learning model directly modeling for posterior probability, and a generative model, which is a model directly modeling the joint probability of a sample and a label.
In one example, the generative model includes an encoder, a decoder, and a sampling unit;
The encoder is used for receiving input samples, and outputting the mean value and the variance of Gaussian distribution obeyed by hidden vectors corresponding to the input samples through the encoder;
The sampling unit is used for sampling from each hidden vector of Gaussian distribution corresponding to the mean value and the variance output by the encoder to obtain a first hidden vector;
the decoder is configured to decode the first hidden vector to obtain a first disturbance sample.
Further, the encoder includes: deep neural network DNN, multi-layer perceptron (multi-Layer perceptron, MLP) or convolutional neural network (convolutional neural networks, CNN).
Further, step 21 specifically includes:
Inputting a sample to be interpreted as an input sample into the pre-trained encoder, and outputting the mean value and the variance of Gaussian distribution obeyed by the hidden vector corresponding to the input sample through the encoder;
The sampling unit samples the first hidden vectors of the Gaussian distribution corresponding to the mean value and the variance output by the encoder to obtain a first number of first hidden vectors;
the decoder decodes the first number of first hidden vectors to obtain a first number of first disturbance samples.
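A minimal sketch of this generation step follows, assuming a PyTorch implementation. The class and function names (PerturbationVAE, generate_perturbations), layer sizes, and latent dimension are illustrative assumptions and are not specified by the patent; only the overall structure (encoder producing a mean and variance, a sampling unit drawing hidden vectors from the corresponding Gaussian, and a decoder producing disturbance samples) follows the description.

```python
import torch
import torch.nn as nn

class PerturbationVAE(nn.Module):
    """Generation model: encoder -> (mu, log_var), sampling unit, decoder."""
    def __init__(self, feature_dim: int, latent_dim: int = 16):
        super().__init__()
        # Encoder (an MLP here; the description also allows DNN or CNN encoders)
        self.encoder = nn.Sequential(nn.Linear(feature_dim, 64), nn.ReLU())
        self.mu_head = nn.Linear(64, latent_dim)
        self.log_var_head = nn.Linear(64, latent_dim)
        # Decoder maps a hidden vector back to the feature space
        self.decoder = nn.Sequential(nn.Linear(latent_dim, 64), nn.ReLU(),
                                     nn.Linear(64, feature_dim))

    def encode(self, x):
        h = self.encoder(x)
        return self.mu_head(h), self.log_var_head(h)

    def sample(self, mu, log_var, n_samples: int):
        # Sampling unit: draw hidden vectors from N(mu, sigma^2)
        std = torch.exp(0.5 * log_var)
        eps = torch.randn(n_samples, mu.shape[-1])
        return mu + eps * std

    def decode(self, z):
        return self.decoder(z)

def generate_perturbations(vae: PerturbationVAE, x: torch.Tensor, first_number: int):
    """Step 21: encode the sample to be interpreted, draw `first_number` hidden
    vectors from the learned Gaussian, and decode them into disturbance samples."""
    with torch.no_grad():
        mu, log_var = vae.encode(x)
        z = vae.sample(mu, log_var, first_number)
        return vae.decode(z)
```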
Then, in step 22, the sample to be interpreted and the first number of disturbance samples are respectively input into a service processing model implemented through a neural network, and a first service processing result corresponding to the sample to be interpreted and a second service processing result corresponding to each disturbance sample are output through the service processing model. It will be appreciated that some of the perturbation samples correspond to the same second business process results as the first business process results, while other perturbation samples correspond to different second business process results than the first business process results.
In one example, the business processing model includes a deep neural network (deep neural networks, DNN). In general, a DNN can meet higher performance requirements and flexibly incorporate some business constraints into the network, but a DNN lacks interpretability. In the embodiments of the present specification, sample-level interpretation can be performed for the DNN.
Next, in step 23, with the consistency of the second service processing result with the first service processing result as the screening condition, a second number of disturbance samples are screened from the first number of disturbance samples. It can be appreciated that the second number of disturbance samples screened out can serve as the interpretation basis for the first service processing result.
Finally, in step 24, the differences between the second number of disturbance samples and the sample to be interpreted in each characteristic dimension are counted, and the first service processing result is interpreted according to the differences in each characteristic dimension. It will be appreciated that the larger the difference in a feature dimension, the less important that feature dimension is for obtaining the first business processing result, and the smaller the difference in a feature dimension, the more important that feature dimension is for obtaining the first business processing result.
In one example, the variances of the second number of disturbance samples and the samples to be interpreted in each feature dimension are counted, and the importance of each feature dimension in the basis of obtaining the first service processing result is determined according to the variances in each feature dimension. In this example, the variance indicates the difference in the feature dimension, and it is understood that other indicators may be employed to indicate the difference in the feature dimension.
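As a small sketch of this statistic (NumPy, illustrative names): the description says larger differences that leave the result unchanged indicate less important features, so the importance score below is taken as inversely related to the per-dimension variance of the differences. The inverse-variance formula is our assumption; the patent only requires determining importance "according to the variances".

```python
import numpy as np

def feature_importance(perturbations: np.ndarray, x: np.ndarray, eps: float = 1e-8):
    """perturbations: (second_number, n_features) screened disturbance samples.
    x: (n_features,) sample to be interpreted.
    Returns per-feature variances and an importance score
    (smaller variance -> more important, per the intuition in the description)."""
    deltas = perturbations - x            # difference in each feature dimension
    variances = deltas.var(axis=0)        # variance per feature dimension
    importance = 1.0 / (variances + eps)  # illustrative inverse-variance score
    return variances, importance
```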
In one example, the generative model is trained by:
inputting a training sample into the generation model, and outputting a training disturbance sample through the generation model;
Inputting the training samples and the training disturbance samples into the business processing model, wherein the business processing model comprises a plurality of hidden layers;
Aiming at a target hidden layer in the hidden layers, acquiring a target hidden vector and a disturbance hidden vector of the training sample and the training disturbance sample in the target hidden layer respectively; determining cross entropy between the target hidden vector and the disturbance hidden vector;
Determining a reconstruction error according to the cross entropy;
and training the generated model with the aim of minimizing the reconstruction error.
Further, the target hidden layer is any hidden layer in the plurality of hidden layers;
The determining a reconstruction error according to the cross entropy comprises:
and summing the cross entropies corresponding to all the hidden layers in the plurality of hidden layers respectively, thereby determining the reconstruction error.
FIG. 3 illustrates a training process diagram of the generation model, according to one embodiment. Referring to fig. 3, the generation model includes an encoder, a decoder, and a sampling unit. A training sample x passes through the encoder, which learns the mean mu and variance sigma of the Gaussian distribution followed by the hidden vector; the sampling unit then draws a hidden vector from that Gaussian distribution, and the decoder maps the hidden vector to a disturbance sample x'. The disturbance sample x' is input into the business processing model, and the training sample x is also input into the business processing model; the cross entropy between the hidden-layer outputs of the business processing model for x and x' is used as the reconstruction error, and the generation model is trained on this basis.
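A sketch of this training loss under stated assumptions: `business_model.hidden_activations` is a hypothetical helper returning the list of hidden-layer outputs, and the hidden vectors are normalised with a softmax before the cross entropy is taken, since the description does not fix how they become distributions.

```python
import torch
import torch.nn.functional as F

def reconstruction_error(business_model, x, x_prime):
    """Sum, over all hidden layers, of the cross entropy between the hidden
    vector of the training sample x and that of the disturbance sample x'."""
    hs_x = business_model.hidden_activations(x)         # [h_1, ..., h_L] for x
    hs_xp = business_model.hidden_activations(x_prime)  # [h_1, ..., h_L] for x'
    loss = 0.0
    for h_x, h_xp in zip(hs_x, hs_xp):
        p = F.softmax(h_x, dim=-1)           # target hidden vector as a distribution
        log_q = F.log_softmax(h_xp, dim=-1)  # disturbance hidden vector
        loss = loss + -(p * log_q).sum(dim=-1).mean()  # cross entropy H(p, q)
    return loss

# Training loop sketch: minimise the reconstruction error w.r.t. the VAE only,
# keeping the business processing model fixed.
# optimizer = torch.optim.Adam(vae.parameters(), lr=1e-3)
# mu, log_var = vae.encode(x)
# x_prime = vae.decode(vae.sample(mu, log_var, 1))
# loss = reconstruction_error(business_model, x, x_prime)
# loss.backward(); optimizer.step()
```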
Fig. 4 shows an overall process diagram for interpreting the business processing result of the business processing model, according to one embodiment. Referring to fig. 4, disturbance samples are generated using the generation model, and an interpretation is obtained from the disturbance samples. For the sample x to be interpreted, n disturbance samples are obtained through the pre-trained generation model. The disturbance samples are input into the business processing model, and the disturbance samples whose business processing results are consistent with that of the sample to be interpreted are selected, yielding m valid disturbance samples after screening. The difference delta between each of the m disturbance samples and the sample to be interpreted is then computed. Intuitively, a disturbance sample is equivalent to adding a disturbance delta to the original features x of the sample to be interpreted without changing the business processing result of the business processing model; therefore, a feature whose value changes greatly is relatively unimportant, while a feature whose value changes little is more important. The final interpretation counts the variance of each dimension of these deltas as the basis of interpretation.
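Putting the pieces together, here is a sketch of the overall flow in fig. 4, reusing the illustrative `generate_perturbations` and `feature_importance` helpers sketched above; `business_model.predict` is a hypothetical method assumed to return a discrete business processing result (e.g., an intercept/pass label), and the value of n is an arbitrary example.

```python
import torch

def explain(business_model, vae, x: torch.Tensor):
    # Step 21: n disturbance samples from the generation model
    perturbations = generate_perturbations(vae, x, first_number=1000)
    # Step 22: business processing results for x and for every disturbance sample
    y_x = business_model.predict(x.unsqueeze(0))   # first business processing result
    y_p = business_model.predict(perturbations)    # second business processing results
    # Step 23: keep only disturbance samples whose result is consistent with y_x
    keep = (y_p == y_x)
    valid = perturbations[keep]                    # m valid disturbance samples
    # Step 24: per-dimension differences and variance-based interpretation
    variances, importance = feature_importance(valid.numpy(), x.numpy())
    return variances, importance
```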
With the method provided by the embodiments of the present specification, a generation model is constructed to generate a plurality of disturbance samples for the sample to be interpreted; the disturbance samples are neighborhood pseudo samples of the sample to be interpreted; the disturbance samples for which the business processing model gives the same business processing result as for the sample to be interpreted are screened out, and the model interpretation is derived from the screened disturbance samples. The method can provide a sample-level interpretation of an existing business processing model, that is, each time the model outputs a business processing result it can also give the decision basis for that result. The method can reduce the computational complexity and improve the efficiency. Moreover, the disturbance samples generated by the method better conform to the distribution of the sample to be interpreted.
According to an embodiment of another aspect, there is further provided an apparatus for interpreting a service processing result of a service processing model, where the apparatus is configured to perform a method for interpreting a service processing result of a service processing model provided in an embodiment of the present specification. Fig. 5 shows a schematic block diagram of an apparatus for interpreting a business process result of a business process model, according to one embodiment. As shown in fig. 5, the apparatus 500 includes:
A generating unit 51, configured to input samples to be interpreted into a pre-trained generation model based on a variational automatic encoder VAE, to obtain a first number of disturbance samples, where the samples to be interpreted and the disturbance samples each include a plurality of feature dimensions;
A service processing unit 52, configured to input the samples to be interpreted and the first number of disturbance samples obtained by the generating unit 51 into a service processing model implemented by a neural network, and output, through the service processing model, a first service processing result corresponding to the samples to be interpreted, and a second service processing result corresponding to each disturbance sample;
A screening unit 53, configured to screen a second number of disturbance samples from the first number of disturbance samples obtained by the generating unit 51, with the second service processing result obtained by the service processing unit 52 being consistent with the first service processing result as a screening condition;
and an interpretation unit 54, configured to count differences between the second number of disturbance samples obtained by the screening unit 53 and the samples to be interpreted in each feature dimension, and interpret the first service processing result obtained by the service processing unit according to the differences in each feature dimension.
Optionally, as an embodiment, the sample to be interpreted corresponds to a target user;
And the service processing result output by the service processing model is used for indicating whether to intercept the preset behavior of the target user.
Optionally, as an embodiment, the service processing model includes a deep neural network DNN.
Alternatively, as an embodiment, the generative model is trained by:
inputting a training sample into the generation model, and outputting a training disturbance sample through the generation model;
Inputting the training samples and the training disturbance samples into the business processing model, wherein the business processing model comprises a plurality of hidden layers;
Aiming at a target hidden layer in the hidden layers, acquiring a target hidden vector and a disturbance hidden vector of the training sample and the training disturbance sample in the target hidden layer respectively; determining cross entropy between the target hidden vector and the disturbance hidden vector;
Determining a reconstruction error according to the cross entropy;
and training the generated model with the aim of minimizing the reconstruction error.
Further, the target hidden layer is any hidden layer in the plurality of hidden layers;
The determining a reconstruction error according to the cross entropy comprises:
and summing the cross entropies corresponding to all the hidden layers in the plurality of hidden layers respectively, thereby determining the reconstruction error.
Optionally, as an embodiment, the generating model includes an encoder, a decoder, and a sampling unit;
The encoder is used for receiving input samples, and outputting the mean value and the variance of Gaussian distribution obeyed by hidden vectors corresponding to the input samples through the encoder;
The sampling unit is used for sampling from each hidden vector of Gaussian distribution corresponding to the mean value and the variance output by the encoder to obtain a first hidden vector;
the decoder is configured to decode the first hidden vector to obtain a first disturbance sample.
Further, the encoder includes: deep neural network DNN, multi-layer perceptron MLP, or convolutional neural network CNN.
Further, the generating unit 51 is specifically configured to:
Inputting a sample to be interpreted as an input sample into the pre-trained encoder, and outputting the mean value and the variance of Gaussian distribution obeyed by the hidden vector corresponding to the input sample through the encoder;
The sampling unit samples the first hidden vectors of the Gaussian distribution corresponding to the mean value and the variance output by the encoder to obtain a first number of first hidden vectors;
the decoder decodes the first number of first hidden vectors to obtain a first number of first disturbance samples.
Optionally, as an embodiment, the interpretation unit 54 is specifically configured to count variances of the second number of disturbance samples and the samples to be interpreted in each feature dimension, and determine importance of each feature dimension in the basis of obtaining the first service processing result according to the variances in each feature dimension.
According to an embodiment of another aspect, there is also provided a computer-readable storage medium having stored thereon a computer program which, when executed in a computer, causes the computer to perform the method described in connection with fig. 2.
According to an embodiment of yet another aspect, there is also provided a computing device including a memory having executable code stored therein and a processor that, when executing the executable code, implements the method described in connection with fig. 2.
Those skilled in the art will appreciate that in one or more of the examples described above, the functions described in the present invention may be implemented in hardware, software, firmware, or any combination thereof. When implemented in software, these functions may be stored on or transmitted over as one or more instructions or code on a computer-readable medium.
The foregoing embodiments illustrate the objects, technical solutions, and advantageous effects of the present invention in further detail and are not to be construed as limiting its scope; any modifications, equivalents, improvements, and the like made on the basis of the teachings of the invention are intended to fall within the scope of the invention.

Claims (18)

1. A method of interpreting a business process result of a business process model, the method comprising:
Inputting a sample to be interpreted into a pre-trained generation model to obtain a first number of disturbance samples, wherein the sample to be interpreted and the disturbance samples both comprise a plurality of characteristic dimensions;
Respectively inputting the sample to be interpreted and the first number of disturbance samples into a business processing model realized by a neural network, and outputting a first business processing result corresponding to the sample to be interpreted and a second business processing result corresponding to each disturbance sample by the business processing model;
Screening a second number of disturbance samples from the first number of disturbance samples by taking the consistency of the second service processing result and the first service processing result as screening conditions;
counting the differences of the second number of disturbance samples and the samples to be explained in each characteristic dimension, and explaining the first business processing result according to the differences in each characteristic dimension;
Wherein the sample to be interpreted corresponds to a target user;
And the service processing result output by the service processing model is used for indicating whether to intercept the preset behavior of the target user.
2. The method of claim 1, wherein the business process model comprises a deep neural network DNN.
3. The method of claim 1, wherein the generative model is trained by:
inputting a training sample into the generation model, and outputting a training disturbance sample through the generation model;
Inputting the training samples and the training disturbance samples into the business processing model, wherein the business processing model comprises a plurality of hidden layers;
Aiming at a target hidden layer in the hidden layers, acquiring a target hidden vector and a disturbance hidden vector of the training sample and the training disturbance sample in the target hidden layer respectively; determining cross entropy between the target hidden vector and the disturbance hidden vector;
Determining a reconstruction error according to the cross entropy;
and training the generated model with the aim of minimizing the reconstruction error.
4. A method as claimed in claim 3, wherein the target hidden layer is any one of the number of hidden layers;
The determining a reconstruction error according to the cross entropy comprises:
and summing the cross entropies corresponding to all the hidden layers in the plurality of hidden layers respectively, thereby determining the reconstruction error.
5. The method of claim 1, wherein the generation model comprises an encoder, a decoder, and a sampling unit;
The encoder is used for receiving input samples, and outputting the mean value and the variance of Gaussian distribution obeyed by hidden vectors corresponding to the input samples through the encoder;
The sampling unit is used for sampling from each hidden vector of Gaussian distribution corresponding to the mean value and the variance output by the encoder to obtain a first hidden vector;
the decoder is configured to decode the first hidden vector to obtain a first disturbance sample.
6. The method of claim 5, wherein the encoder comprises: deep neural network DNN, multi-layer perceptron MLP, or convolutional neural network CNN.
7. The method of claim 5, wherein said inputting the sample to be interpreted into a pre-trained generative model results in a first number of perturbation samples, comprising:
Inputting a sample to be interpreted as an input sample into the pre-trained encoder, and outputting the mean value and the variance of Gaussian distribution obeyed by the hidden vector corresponding to the input sample through the encoder;
The sampling unit samples the first hidden vectors of the Gaussian distribution corresponding to the mean value and the variance output by the encoder to obtain a first number of first hidden vectors;
the decoder decodes the first number of first hidden vectors to obtain a first number of first disturbance samples.
8. The method of claim 1, wherein the counting the differences in each characteristic dimension between the second number of perturbation samples and the sample to be interpreted, and interpreting the first business process result according to the differences in each characteristic dimension, comprises:
And counting the variances of the second number of disturbance samples and the samples to be explained in each characteristic dimension, and determining the importance of each characteristic dimension in the basis of obtaining the first service processing result according to the variances in each characteristic dimension.
9. An apparatus for interpreting a business process result of a business process model, the apparatus comprising:
The generation unit is used for inputting the sample to be interpreted into a pre-trained generation model to obtain a first number of disturbance samples, wherein the sample to be interpreted and the disturbance samples both comprise a plurality of characteristic dimensions;
The business processing unit is used for respectively inputting the samples to be interpreted and the first number of disturbance samples obtained by the generating unit into a business processing model realized by a neural network, and outputting a first business processing result corresponding to the samples to be interpreted and a second business processing result corresponding to each disturbance sample by the business processing model;
The screening unit is used for screening a second number of disturbance samples from the first number of disturbance samples obtained by the generating unit, by taking the consistency of the second service processing result obtained by the service processing unit with the first service processing result as the screening condition;
The interpretation unit is used for counting the difference between the second number of disturbance samples obtained by the screening unit and the sample to be interpreted in each characteristic dimension, and interpreting the first service processing result obtained by the service processing unit according to the difference in each characteristic dimension;
Wherein the sample to be interpreted corresponds to a target user;
And the service processing result output by the service processing model is used for indicating whether to intercept the preset behavior of the target user.
10. The apparatus of claim 9, wherein the traffic processing model comprises a deep neural network DNN.
11. The apparatus of claim 9, wherein the generative model is trained by:
inputting a training sample into the generation model, and outputting a training disturbance sample through the generation model;
Inputting the training samples and the training disturbance samples into the business processing model, wherein the business processing model comprises a plurality of hidden layers;
Aiming at a target hidden layer in the hidden layers, acquiring a target hidden vector and a disturbance hidden vector of the training sample and the training disturbance sample in the target hidden layer respectively; determining cross entropy between the target hidden vector and the disturbance hidden vector;
Determining a reconstruction error according to the cross entropy;
and training the generated model with the aim of minimizing the reconstruction error.
12. The apparatus of claim 11, wherein the target hidden layer is any one of the number of hidden layers;
The determining a reconstruction error according to the cross entropy comprises:
and summing the cross entropies corresponding to all the hidden layers in the plurality of hidden layers respectively, thereby determining the reconstruction error.
13. The apparatus of claim 9, wherein the generation model comprises an encoder, a decoder, and a sampling unit;
The encoder is used for receiving input samples, and outputting the mean value and the variance of Gaussian distribution obeyed by hidden vectors corresponding to the input samples through the encoder;
The sampling unit is used for sampling from each hidden vector of Gaussian distribution corresponding to the mean value and the variance output by the encoder to obtain a first hidden vector;
the decoder is configured to decode the first hidden vector to obtain a first disturbance sample.
14. The apparatus of claim 13, wherein the encoder comprises: deep neural network DNN, multi-layer perceptron MLP, or convolutional neural network CNN.
15. The apparatus of claim 13, wherein the generating unit is specifically configured to:
Inputting a sample to be interpreted as an input sample into the pre-trained encoder, and outputting the mean value and the variance of Gaussian distribution obeyed by the hidden vector corresponding to the input sample through the encoder;
The sampling unit samples the first hidden vectors of the Gaussian distribution corresponding to the mean value and the variance output by the encoder to obtain a first number of first hidden vectors;
the decoder decodes the first number of first hidden vectors to obtain a first number of first disturbance samples.
16. The apparatus of claim 9, wherein the interpretation unit is specifically configured to count variances of the second number of disturbance samples and the samples to be interpreted in each feature dimension, and determine importance of each feature dimension in the basis of obtaining the first service processing result according to the variances in each feature dimension.
17. A computer readable storage medium having stored thereon a computer program which, when executed in a computer, causes the computer to perform the method of any of claims 1-8.
18. A computing device comprising a memory having executable code stored therein and a processor, which when executing the executable code, implements the method of any of claims 1-8.
CN202210181701.4A 2019-12-20 2019-12-20 Method and device for explaining service processing result of service processing model Active CN114548300B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210181701.4A CN114548300B (en) 2019-12-20 2019-12-20 Method and device for explaining service processing result of service processing model

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202210181701.4A CN114548300B (en) 2019-12-20 2019-12-20 Method and device for explaining service processing result of service processing model
CN201911326360.XA CN111062442B (en) 2019-12-20 2019-12-20 Method and device for explaining service processing result of service processing model

Related Parent Applications (1)

Application Number Title Priority Date Filing Date
CN201911326360.XA Division CN111062442B (en) 2019-12-20 2019-12-20 Method and device for explaining service processing result of service processing model

Publications (2)

Publication Number Publication Date
CN114548300A CN114548300A (en) 2022-05-27
CN114548300B true CN114548300B (en) 2024-05-28

Family

ID=70301299

Family Applications (2)

Application Number Title Priority Date Filing Date
CN201911326360.XA Active CN111062442B (en) 2019-12-20 2019-12-20 Method and device for explaining service processing result of service processing model
CN202210181701.4A Active CN114548300B (en) 2019-12-20 2019-12-20 Method and device for explaining service processing result of service processing model

Family Applications Before (1)

Application Number Title Priority Date Filing Date
CN201911326360.XA Active CN111062442B (en) 2019-12-20 2019-12-20 Method and device for explaining service processing result of service processing model

Country Status (1)

Country Link
CN (2) CN111062442B (en)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112052957B (en) * 2020-09-02 2023-08-04 平安科技(深圳)有限公司 Method and device for acquiring interpretability parameters of deep learning model
CN113377640B (en) * 2021-06-23 2022-07-08 杭州网易云音乐科技有限公司 Method, medium, device and computing equipment for explaining model under business scene
CN115618748B (en) * 2022-11-29 2023-05-02 支付宝(杭州)信息技术有限公司 Model optimization method, device, equipment and storage medium

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2003081527A1 (en) * 2002-03-26 2003-10-02 Council Of Scientific And Industrial Research Improved performance of artificial neural network models in the presence of instrumental noise and measurement errors
CN107895160A (en) * 2017-12-21 2018-04-10 曙光信息产业(北京)有限公司 Human face detection and tracing device and method
CN109903053A (en) * 2019-03-01 2019-06-18 成都新希望金融信息有限公司 A kind of anti-fraud method carrying out Activity recognition based on sensing data
CN110175646A (en) * 2019-05-27 2019-08-27 浙江工业大学 Multichannel confrontation sample testing method and device based on image transformation
CN110334806A (en) * 2019-05-29 2019-10-15 广东技术师范大学 A method of adversarial sample generation based on generative adversarial network

Family Cites Families (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10447733B2 (en) * 2014-06-11 2019-10-15 Accenture Global Services Limited Deception network system
US10387655B2 (en) * 2017-02-15 2019-08-20 International Business Machines Corporation Method, system and product for using a predictive model to predict if inputs reach a vulnerability of a program
WO2019051113A1 (en) * 2017-09-06 2019-03-14 BigML, Inc. Prediction characterization for black box machine learning models
CN108090507A (en) * 2017-10-19 2018-05-29 电子科技大学 A kind of medical imaging textural characteristics processing method based on integrated approach
US11386342B2 (en) * 2018-04-20 2022-07-12 H2O.Ai Inc. Model interpretation
CN111542841A (en) * 2018-06-08 2020-08-14 北京嘀嘀无限科技发展有限公司 System and method for content identification
CN108960434B (en) * 2018-06-28 2021-07-20 第四范式(北京)技术有限公司 Method and device for analyzing data based on machine learning model interpretation
CN110033094A (en) * 2019-02-22 2019-07-19 阿里巴巴集团控股有限公司 A kind of model training method and device based on disturbance sample
CN110110139B (en) * 2019-04-19 2021-06-22 北京奇艺世纪科技有限公司 Method and device for explaining recommendation result and electronic equipment

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2003081527A1 (en) * 2002-03-26 2003-10-02 Council Of Scientific And Industrial Research Improved performance of artificial neural network models in the presence of instrumental noise and measurement errors
CN107895160A (en) * 2017-12-21 2018-04-10 曙光信息产业(北京)有限公司 Human face detection and tracing device and method
CN109903053A (en) * 2019-03-01 2019-06-18 成都新希望金融信息有限公司 A kind of anti-fraud method carrying out Activity recognition based on sensing data
CN110175646A (en) * 2019-05-27 2019-08-27 浙江工业大学 Multichannel confrontation sample testing method and device based on image transformation
CN110334806A (en) * 2019-05-29 2019-10-15 广东技术师范大学 A method of adversarial sample generation based on generative adversarial network

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Adversarial user identification based on ensemble semi-supervised learning in the Internet context; Zhang Weijian; China Masters' Theses Full-text Database (Social Sciences II); 2019-08-15; H123-266 *

Also Published As

Publication number Publication date
CN111062442A (en) 2020-04-24
CN114548300A (en) 2022-05-27
CN111062442B (en) 2022-04-12

Similar Documents

Publication Publication Date Title
WO2018212710A1 (en) Predictive analysis methods and systems
WO2019190886A1 (en) Digital watermarking of machine learning models
Lopes et al. Effective network intrusion detection via representation learning: A Denoising AutoEncoder approach
CN114548300B (en) Method and device for explaining service processing result of service processing model
EP3916597B1 (en) Detecting malware with deep generative models
US10929756B1 (en) Systems and methods for configuring and implementing an interpretive surrogate machine learning model
Demidov et al. Application model of modern artificial neural network methods for the analysis of information systems security
Wei et al. Toward identifying APT malware through API system calls
US11886597B2 (en) Detection of common patterns in user generated content with applications in fraud detection
Zhao et al. Natural backdoor attacks on deep neural networks via raindrops
Ravi et al. Hybrid classification and regression models via particle swarm optimization auto associative neural network based nonlinear PCA
Salcedo-Sanz et al. An island grouping genetic algorithm for fuzzy partitioning problems
US20220198255A1 (en) Training a semantic parser using action templates
US11997137B2 (en) Webpage phishing detection using deep reinforcement learning
Qayoom et al. A novel approach for credit card fraud transaction detection using deep reinforcement learning scheme
CN117522403A (en) GCN abnormal customer early warning method and device based on subgraph fusion
Lijun et al. An intuitionistic calculus to complex abnormal event recognition on data streams
CN112950222A (en) Resource processing abnormity detection method and device, electronic equipment and storage medium
CN114618167A (en) Anti-cheating detection model construction method and anti-cheating detection method
Gunes et al. Detecting Direction of Pepper Stem by Using CUDA‐Based Accelerated Hybrid Intuitionistic Fuzzy Edge Detection and ANN
Wu et al. English text recognition deep learning framework to automatically identify fake news
Demir et al. Subnetwork ensembling and data augmentation: Effects on calibration
US12061622B1 (en) Apparatus and method for communications associated with one or more data sets
US12223531B2 (en) Method and an apparatus for a personalized user interface
Camino Machine Learning Techniques for Suspicious Transaction Detection and Analysis

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant