CN115063143A - Account data processing method and device, computer equipment and storage medium - Google Patents
- Publication number
- CN115063143A (application CN202210666462.1A)
- Authority
- CN
- China
- Prior art keywords
- account
- result
- merged
- abnormity
- abnormal
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q20/00—Payment architectures, schemes or protocols
- G06Q20/38—Payment protocols; Details thereof
- G06Q20/40—Authorisation, e.g. identification of payer or payee, verification of customer or shop credentials; Review and approval of payers, e.g. check credit lines or negative lists
- G06Q20/401—Transaction verification
- G06Q20/4016—Transaction verification involving fraud or risk level assessment in transaction processing
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q20/00—Payment architectures, schemes or protocols
- G06Q20/38—Payment protocols; Details thereof
- G06Q20/40—Authorisation, e.g. identification of payer or payee, verification of customer or shop credentials; Review and approval of payers, e.g. check credit lines or negative lists
- G06Q20/401—Transaction verification
- G06Q20/4014—Identity check for transactions
Landscapes
- Business, Economics & Management (AREA)
- Engineering & Computer Science (AREA)
- Accounting & Taxation (AREA)
- Computer Security & Cryptography (AREA)
- Finance (AREA)
- Strategic Management (AREA)
- Physics & Mathematics (AREA)
- General Business, Economics & Management (AREA)
- General Physics & Mathematics (AREA)
- Theoretical Computer Science (AREA)
- Management, Administration, Business Operations System, And Electronic Commerce (AREA)
Abstract
The application relates to an account data processing method, an account data processing device, computer equipment and a storage medium. The method comprises the following steps: acquiring business transaction data of a target account; inputting the business transaction data of the target account into at least one base classifier to obtain an account anomaly identification result corresponding to each base classifier, wherein the base classifiers are classification models with mutually independent errors, and each account anomaly identification result represents the probability that the target account exhibits abnormal interaction behavior in a resource interaction system; merging the account anomaly identification results corresponding to the base classifiers to obtain a merged account anomaly identification result, wherein the merged account anomaly identification result represents whether the target account has abnormal business transactions; and fusing the account anomaly identification results corresponding to the base classifiers with the merged account anomaly identification result to obtain an account anomaly scoring result corresponding to the target account. The method can improve the accuracy of account data processing and the update timeliness of the account data processing model.
Description
Technical Field
The present application relates to the field of computer technologies, and in particular, to an account data processing method and apparatus, a computer device, and a storage medium.
Background
With the development of computer technology, risk monitoring technologies have emerged. Risk monitoring refers to the surveillance and control activities carried out according to a project risk management plan, the risks that actually occur in the project, and changes in the project's development. It is a project risk management task built on the staged, progressive, and controllable nature of project risk.
In risk monitoring, existing methods mainly rely on a deployed anti-fraud model. Developers write and deploy the anti-fraud model based on requirement reports from business personnel, documents issued by public security departments, past experience, and the like. After the model completes its recognition, the results must be distributed to each branch outlet for manual verification before the risk is finally identified. The whole process is tightly coupled and subjective, incurs high time and labor costs, and a deviation in any link can greatly reduce the accuracy of risk detection.
Disclosure of Invention
In view of the above, it is necessary to provide an account data processing method, an apparatus, a computer device, a computer readable storage medium and a computer program product capable of identifying telecom fraud clients.
In a first aspect, the application provides an account data processing method. The method comprises the following steps: acquiring business transaction data of a target account, wherein the business transaction data of the target account is business data generated after the target account executes resource interaction operation in a resource interaction system; inputting the business transaction data of the target account into at least one base classifier to obtain account abnormity identification results corresponding to the base classifiers, wherein the base classifiers are classification models with mutually independent errors, and the account abnormity identification results are used for representing the probability of abnormal interaction behaviors of the target account in the resource interaction system; merging the account abnormity identification results corresponding to each base classifier to obtain a merged account abnormity identification result, wherein the merged account abnormity identification result represents whether the target account has abnormal business transaction; and fusing the account abnormity identification results corresponding to the base classifiers and the combined account abnormity identification result to obtain an account abnormity scoring result corresponding to the target account.
In one embodiment, the merging the account anomaly identification results corresponding to each base classifier to obtain a merged account anomaly identification result includes: inputting the account abnormity identification results corresponding to each base classifier into a voting algorithm to obtain a combined result matrix corresponding to the account abnormity identification results, wherein the dimension of the combined result matrix is the same as the number of the base classifiers; and selecting an abnormal judgment result corresponding to the account abnormal recognition result of which the vote quantity exceeds a preset threshold value from the merged result matrix as the merged account abnormal recognition result.
In one embodiment, the inputting the account abnormal recognition result corresponding to each base classifier into a voting algorithm to obtain a merged result matrix corresponding to the account abnormal recognition result includes: determining columns corresponding to the non-voting matrix based on the number of the base classifiers, and determining rows corresponding to the non-voting matrix based on the number of the account abnormity identification results; multiplying the data of the rows and the columns corresponding to the non-voting matrix one by one to obtain a non-voting matrix corresponding to the number of the base classifiers; and inputting the non-voting matrix into the voting algorithm to obtain a combined result matrix corresponding to the account abnormity identification result.
In one embodiment, the selecting, from the merged result matrix, an abnormality judgment result corresponding to the account abnormality identification result whose vote number exceeds a preset threshold as the merged account abnormality identification result includes: determining a preset threshold value of the vote obtained by each element in the merged result matrix according to the number of the base classifiers, wherein the preset threshold value is smaller than the number of the base classifiers; and if the vote quantity obtained by the elements in the merged result matrix exceeds the preset threshold value, outputting an abnormal judgment result corresponding to the account abnormal recognition result of the elements as the merged account abnormal recognition result.
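The threshold-based selection described in this embodiment can be illustrated with a minimal Python sketch. The function name `merge_by_vote` and the default threshold of half the number of base classifiers are assumptions made for illustration only; the text requires only that the preset threshold be smaller than the number of base classifiers.

```python
def merge_by_vote(votes, threshold=None):
    """Merge per-classifier 0/1 decisions into one decision.

    votes: one 0/1 decision per base classifier (1 = abnormal).
    threshold: number of votes a label must exceed. Assumption: it
    defaults to half the number of classifiers, which is always
    smaller than the number of classifiers.
    Returns the winning label, or None if no label exceeds the threshold.
    """
    n = len(votes)
    if n == 0:
        raise ValueError("at least one base classifier is required")
    if threshold is None:
        threshold = n / 2  # assumed absolute-majority rule
    if not threshold < n:
        raise ValueError("threshold must be smaller than the number of classifiers")
    for label in (0, 1):
        # Count the votes this label obtained and compare with the threshold.
        if sum(1 for v in votes if v == label) > threshold:
            return label
    return None
```

With three base classifiers voting `[1, 1, 0]`, label 1 obtains two votes, which exceeds the assumed threshold of 1.5, so the merged result is 1. When no label exceeds the threshold, this sketch returns `None` rather than forcing a decision.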
In one embodiment, the fusing the account anomaly identification result corresponding to each base classifier and the merged account anomaly identification result to obtain the account anomaly scoring result corresponding to the target account includes: acquiring the weight corresponding to each base classifier and the number of the base classifiers, wherein the weight corresponding to each base classifier is determined according to the error corresponding to each base classifier; adjusting the account abnormity identification result corresponding to each base classifier based on the weight corresponding to each base classifier to obtain each adjusted identification result; and multiplying the adjusted recognition results and the merged account abnormity recognition results to obtain an account abnormity scoring result corresponding to the target account.
In one embodiment, the obtaining an account abnormality scoring result corresponding to the target account by multiplying the adjusted recognition result by the merged account abnormality recognition result includes: multiplying each adjusted recognition result with the merged account abnormity recognition result to obtain a sub-abnormity scoring result corresponding to each adjusted recognition result; and accumulating the sub-abnormal scoring results corresponding to each adjusted identification result to obtain the account abnormal scoring result corresponding to the target account.
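The weighting, multiplication and accumulation steps of the two embodiments above can be sketched in Python as follows. The text states only that each weight is determined from the error of the corresponding base classifier; the rule used in `weights_from_errors` (weight proportional to one minus the error, normalized to sum to one) is an assumption for illustration, as are the function names.

```python
def weights_from_errors(errors):
    """Derive one weight per base classifier from its error.

    Assumption (not stated in the text): weight is proportional to
    (1 - error) and the weights are normalized to sum to 1.
    """
    accuracies = [1.0 - e for e in errors]
    total = sum(accuracies)
    if total <= 0:
        raise ValueError("at least one classifier must have error below 1")
    return [a / total for a in accuracies]


def anomaly_score(probabilities, weights, merged_result):
    """Fuse per-classifier probabilities with the merged 0/1 result."""
    if len(probabilities) != len(weights):
        raise ValueError("one weight is required per base classifier")
    # Adjust each identification result by its classifier's weight.
    adjusted = [w * p for w, p in zip(weights, probabilities)]
    # Multiply each adjusted result by the merged result (sub-scores).
    sub_scores = [a * merged_result for a in adjusted]
    # Accumulate the sub-scores into the account anomaly score.
    return sum(sub_scores)
```

Under this sketch, a merged result of 0 yields a score of 0 regardless of the individual probabilities, which follows directly from the multiplication step described above.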
In a second aspect, the application also provides an account data processing device. The device comprises: a business transaction data acquisition module, configured to acquire business transaction data of a target account, where the business transaction data of the target account is business data generated after the target account executes a resource interaction operation in a resource interaction system; an account anomaly identification result obtaining module, configured to input the business transaction data of the target account into at least one base classifier to obtain an account anomaly identification result corresponding to each base classifier, where the base classifiers are classification models with mutually independent errors, and the account anomaly identification result represents the probability that the target account exhibits abnormal interaction behavior in the resource interaction system; a merged account anomaly identification result module, configured to merge the account anomaly identification results corresponding to the base classifiers to obtain a merged account anomaly identification result, where the merged account anomaly identification result represents whether the target account has abnormal business transactions; and an account anomaly scoring result module, configured to fuse the account anomaly identification results corresponding to the base classifiers with the merged account anomaly identification result to obtain the account anomaly scoring result corresponding to the target account.
In one embodiment, the merged account anomaly identification result module is further configured to input the account anomaly identification results corresponding to each of the base classifiers to a voting algorithm to obtain a merged result matrix corresponding to the account anomaly identification results, where the dimension of the merged result matrix is the same as the number of the base classifiers; and selecting an abnormal judgment result corresponding to the account abnormal recognition result of which the vote quantity exceeds a preset threshold value from the merged result matrix as the merged account abnormal recognition result.
In one embodiment, the merged account anomaly identification result module is further configured to determine the columns of a non-voting matrix based on the number of the base classifiers, and determine the rows of the non-voting matrix based on the number of the account anomaly identification results; multiply the data of the rows and the columns of the non-voting matrix one by one to obtain the non-voting matrix corresponding to the number of the base classifiers; and input the non-voting matrix into the voting algorithm to obtain the merged result matrix corresponding to the account anomaly identification results.
In one embodiment, the merged account anomaly identification result module is further configured to determine a preset threshold of the vote obtained by each element in the merged result matrix from the number of the base classifiers, where the preset threshold is set to be smaller than the number of the base classifiers; and if the vote quantity obtained by the elements in the merged result matrix exceeds the preset threshold value, outputting an abnormal judgment result corresponding to the account abnormal recognition result of the elements as the merged account abnormal recognition result.
In one embodiment, the account anomaly scoring result module is further configured to obtain weights corresponding to the base classifiers and the number of the base classifiers, where the weights corresponding to the base classifiers are determined according to errors corresponding to the base classifiers; adjusting the account abnormity identification result corresponding to each base classifier based on the weight corresponding to each base classifier to obtain each adjusted identification result; and multiplying the adjusted recognition results and the merged account abnormity recognition results to obtain an account abnormity scoring result corresponding to the target account.
In one embodiment, the account anomaly scoring result module is further configured to multiply each adjusted identification result by the merged account anomaly identification result to obtain a sub-anomaly scoring result corresponding to each adjusted identification result; and accumulating the sub-abnormal scoring results corresponding to each adjusted identification result to obtain the account abnormal scoring result corresponding to the target account.
In a third aspect, the present application also provides a computer device. The computer device comprises a memory storing a computer program and a processor implementing the following steps when executing the computer program: acquiring business transaction data of a target account, wherein the business transaction data of the target account is business data generated after the target account executes resource interaction operation in a resource interaction system; inputting the business transaction data of the target account into at least one base classifier to obtain account abnormity identification results corresponding to the base classifiers, wherein the base classifiers are classification models with mutually independent errors, and the account abnormity identification results are used for representing the probability of abnormal interaction behaviors of the target account in the resource interaction system; merging the account abnormity identification results corresponding to each base classifier to obtain merged account abnormity identification results, wherein the merged account abnormity identification results represent whether the target account has abnormal business transactions; and fusing the account abnormity identification results corresponding to the base classifiers and the merged account abnormity identification results to obtain the account abnormity scoring result corresponding to the target account.
In a fourth aspect, the present application further provides a computer-readable storage medium. The computer-readable storage medium having stored thereon a computer program which, when executed by a processor, performs the steps of: acquiring business transaction data of a target account, wherein the business transaction data of the target account is generated after the target account executes resource interaction operation in a resource interaction system; inputting the business transaction data of the target account into at least one base classifier to obtain account abnormity identification results corresponding to the base classifiers, wherein the base classifiers are classification models with mutually independent errors, and the account abnormity identification results are used for representing the probability of abnormal interaction behaviors of the target account in the resource interaction system; merging the account abnormity identification results corresponding to each base classifier to obtain merged account abnormity identification results, wherein the merged account abnormity identification results represent whether the target account has abnormal business transactions; and fusing the account abnormity identification results corresponding to the base classifiers and the merged account abnormity identification results to obtain the account abnormity scoring result corresponding to the target account.
In a fifth aspect, the present application further provides a computer program product. The computer program product comprising a computer program which when executed by a processor performs the steps of:
acquiring business transaction data of a target account, wherein the business transaction data of the target account is business data generated after the target account executes resource interaction operation in a resource interaction system; inputting the business transaction data of the target account into at least one base classifier to obtain account abnormity identification results corresponding to the base classifiers, wherein the base classifiers are classification models with mutually independent errors, and the account abnormity identification results are used for representing the probability of abnormal interaction behaviors of the target account in the resource interaction system; merging the account abnormity identification results corresponding to each base classifier to obtain merged account abnormity identification results, wherein the merged account abnormity identification results represent whether the target account has abnormal business transactions; and fusing the account abnormity identification results corresponding to the base classifiers and the merged account abnormity identification results to obtain the account abnormity scoring result corresponding to the target account.
According to the account data processing method, the account data processing device, the computer equipment, the storage medium and the computer program product, the business transaction data of the target account is obtained, and the business transaction data of the target account is the business data generated after the target account executes the resource interaction operation in the resource interaction system; inputting the business transaction data of the target account into at least one base classifier to obtain an account abnormity identification result corresponding to each base classifier, wherein each base classifier is a classification model with mutually independent errors, and the account abnormity identification result is used for representing the probability that the target account has abnormal interaction behavior in the resource interaction system; merging the account abnormity identification results corresponding to the base classifiers to obtain merged account abnormity identification results, wherein the merged account abnormity identification results represent whether the target account has abnormal business transactions or not; and fusing the account abnormity identification results corresponding to the base classifiers and the combined account abnormity identification results to obtain an account abnormity scoring result corresponding to the target account.
By exploiting the high accuracy of ensemble learning, the data is classified by different models and the results are finally combined, so that the business transaction behavior of the target account is classified and scored. Different models learn from and fit the user data to extract effective information, which improves the accuracy of the anomaly checking system for the target account, reduces labor cost, and improves the update timeliness of the target account scoring model.
Drawings
FIG. 1 is a diagram of an application environment of a method for processing account data in one embodiment;
FIG. 2 is a schematic flow chart diagram illustrating a method for processing account data in one embodiment;
FIG. 3 is a flowchart illustrating a method for obtaining merged account anomaly recognition results according to an embodiment;
FIG. 4 is a flowchart illustrating a method for obtaining merged account anomaly recognition results according to another embodiment;
FIG. 5 is a flowchart illustrating a method for obtaining merged account anomaly recognition results according to yet another embodiment;
FIG. 6 is a flowchart illustrating a method for obtaining account anomaly scoring results according to one embodiment;
FIG. 7 is a flowchart illustrating a method for obtaining account anomaly scoring results according to another embodiment;
FIG. 8 is a flow chart illustrating an ensemble learning identification process of an account data processing method according to an embodiment;
FIG. 9 is a schematic diagram of scoring logic for a method of account data processing in one embodiment;
FIG. 10 is a block diagram showing the structure of an account data processing apparatus according to one embodiment;
FIG. 11 is a diagram illustrating an internal structure of a computer device in one embodiment.
Detailed Description
In order to make the objects, technical solutions and advantages of the present application more apparent, the present application is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the present application and are not intended to limit the present application.
The account data processing method provided by the embodiment of the application can be applied to the application environment shown in fig. 1. The terminal 102 acquires data, the server 104 receives the data of the terminal 102 in response to an instruction of the terminal 102 and performs calculation on the acquired data, and the server 104 transmits the calculation result of the data back to the terminal 102 and is displayed by the terminal 102. Wherein the terminal 102 communicates with the server 104 via a network. The data storage system may store data that the server 104 needs to process. The data storage system may be integrated on the server 104, or may be located on the cloud or other network server. The server 104 acquires the service transaction data of the target account from the terminal 102, wherein the service transaction data of the target account is the service data generated after the target account performs resource interaction operation in the resource interaction system; inputting the business transaction data of the target account into at least one base classifier to obtain account abnormity identification results corresponding to the base classifiers, wherein the base classifiers are classification models with mutually independent errors, and the account abnormity identification results are used for representing the probability of abnormal interaction behaviors of the target account in the resource interaction system; merging the account abnormity identification results corresponding to each base classifier to obtain a merged account abnormity identification result, wherein the merged account abnormity identification result represents whether the target account has abnormal business transaction or not; and fusing the account abnormity identification results corresponding to the base classifiers and the combined account abnormity identification results to obtain an account abnormity scoring result corresponding to the target account. 
The terminal 102 may be, but is not limited to, a personal computer, a notebook computer, a smart phone, a tablet computer, an internet of things device, or a portable wearable device. The internet of things device may be a smart speaker, a smart television, a smart air conditioner, a smart car-mounted device, and the like. The portable wearable device may be a smart watch, a smart bracelet, a head-mounted device, and the like. The server 104 may be implemented as a stand-alone server or as a server cluster composed of multiple servers.
In one embodiment, as shown in fig. 2, an account data processing method is provided, which is described by taking the application of the method to the server in fig. 1 as an example, and includes the following steps:
Step 202, acquiring business transaction data of a target account.
The target account may be an account that is scored by the account data processing method. Several scoring criteria are possible, for example a specific score, whether the score passes, or a degree of safety.
The business transaction data may be transaction data generated when business transaction is performed in an account to be scored, and the transaction data may be transaction records, transaction amounts, transaction time, transaction places, and the like.
Specifically, the server responds to an instruction from the terminal, acquires the business transaction data of the target account from the terminal, and stores it in a storage unit. When the server needs to process any transaction data record in the business transaction data of the target account, it loads the record from the storage unit into volatile storage for the central processing unit to compute on. The business transaction data may be a single item or multiple items input at the same time, and each item of business transaction data may include at least one transaction data record.
For example, the server 104 responds to an instruction of the terminal 102, obtains the business transaction data of the target account from the terminal 102, and stores it in a storage unit in the server 104. The server 104 obtains 10 items of business transaction data for the target account, and each item contains 20 transaction data records.
Step 204, inputting the business transaction data of the target account into at least one base classifier to obtain an account anomaly identification result corresponding to each base classifier.
The base classifier may be any one of a number of classifiers and may be run with a boosting algorithm, in which base classifiers are fitted step by step to approximate the true value. Boosting is a serial algorithm: it can reduce bias but cannot reduce variance, because essentially all samples are used in every training round, so chance effects cannot be eliminated, while each round moves closer to the true value. The base classifiers are classification models whose errors are mutually independent, and the account anomaly identification result represents the probability that the target account exhibits abnormal interaction behavior in the resource interaction system.
The account abnormality identification result may be a calculation result obtained by calculating the service transaction data of the target account through the base classifier, and the expression mode of the account abnormality identification result is determined according to the output result of the base classifier.
Specifically, multiple base classifiers are capable of processing account data, and together they form a base classifier set. When the business transaction data of any target account needs to be processed, the server selects from the set at least one base classifier suited to that data. When several base classifiers are to be selected, base classifiers with mutually independent errors are chosen following the principle of ensemble learning, and the results of the several classification models are combined. After the required base classifiers are determined, the business transaction data of the target account is input into them (into all of them simultaneously if several were selected) to obtain the account anomaly identification result corresponding to each base classifier; each result expresses the anomaly status of the target account as a probability. The base classifier may be a support vector machine (SVM), a random forest, a linear regression model, or a logistic regression model.
For example, the required base classifiers (random forest and linear regression) are determined from a base classifier set A with a plurality of base classifiers, the business transaction data of the target account is simultaneously input into the two base classifiers of the random forest and the linear regression, account abnormity identification results corresponding to the two base classifiers of the random forest and the linear regression are obtained, and the two account abnormity identification results are both expressed by probability for the abnormity condition of the target account.
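The pattern of feeding the same business transaction data to several base classifiers and collecting one probability per classifier can be sketched as below. This is only an illustration: the two functions `night_share` and `large_share` are hypothetical stand-ins that return a value between 0 and 1, not trained random forest or linear regression models, and the record fields `hour` and `amount` are likewise assumed.

```python
from typing import Callable, Dict, List

Record = Dict[str, float]  # one transaction data record (hypothetical fields)
BaseClassifier = Callable[[List[Record]], float]


def night_share(records: List[Record]) -> float:
    """Stand-in classifier: share of transactions made before 06:00."""
    return sum(1 for r in records if r["hour"] < 6) / len(records)


def large_share(records: List[Record]) -> float:
    """Stand-in classifier: share of transactions above an assumed limit."""
    return sum(1 for r in records if r["amount"] > 50_000) / len(records)


def identify(records: List[Record],
             classifiers: Dict[str, BaseClassifier]) -> Dict[str, float]:
    """Run every selected base classifier on the same transaction data."""
    if not records:
        raise ValueError("at least one transaction data record is required")
    # One account anomaly identification result (a probability) per classifier.
    return {name: clf(records) for name, clf in classifiers.items()}
```

In practice each callable would wrap a trained model; the dictionary returned by `identify` corresponds to the per-classifier identification results that the later merging and fusing steps consume.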
Step 206, merging the account anomaly identification results corresponding to the base classifiers to obtain a merged account anomaly identification result.
The merged account abnormality identification result may be obtained by processing the account abnormality identification result corresponding to each base classifier by using a voting algorithm, where the voting result is generally 0 or 1, and the output result is generally yes or no.
Specifically, a matching voting algorithm is selected from a voting algorithm set according to the account anomaly identification results corresponding to the base classifiers. The selection rule is to traverse the voting algorithms in the set, compare the data type each one accepts with the data type of the account anomaly identification results, and select the algorithm with the highest degree of match. Once the voting algorithm is selected, the results are merged by voting, and the result with the most votes is taken as the final result. The voting algorithm collects the outputs of the base classifiers into a multidimensional matrix whose dimension equals the number of base classifiers: each column holds one base classifier's decisions on all the data, and each row holds the different base classifiers' decisions on the same sample. If a class label receives more than half of the votes, that label is predicted; otherwise, the prediction is rejected. In this method, the classes indicate whether abnormal behavior exists: 0 means no abnormal behavior and 1 means abnormal behavior. For example, the algorithm checks whether class 0 receives more than 3 votes among the decisions of four base classifiers; if so, it judges that no abnormal behavior exists, and otherwise that abnormal behavior exists.
For example, an absolute voting algorithm is selected from voting algorithm set B as the merging algorithm. The account anomaly identification results corresponding to two base classifiers, a random forest and a linear regression, are input into the absolute voting algorithm and merged to obtain the merged account anomaly identification result. If the output is 1, the target account exhibits abnormal behavior.
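The absolute voting rule described above (predict a label only when it receives more than half of the votes, otherwise reject) can be sketched in a few lines. This is a minimal illustration, not the patented implementation; the function name `absolute_majority_vote` and the use of `None` to represent a rejected prediction are assumptions.

```python
from collections import Counter
from typing import Optional, Sequence


def absolute_majority_vote(labels: Sequence[int]) -> Optional[int]:
    """Return the label holding more than half of the votes, else None (rejected)."""
    if not labels:
        return None
    label, count = Counter(labels).most_common(1)[0]
    # Strictly more than half of all votes is required.
    return label if count > len(labels) / 2 else None


# Hypothetical 0/1 outputs of three base classifiers for one target account.
votes = [1, 1, 0]
merged = absolute_majority_vote(votes)  # label 1 holds 2 of 3 votes
```

Note that with only two base classifiers a 1-to-1 split has no absolute majority, so under this rule the prediction would be rejected.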
Step 208: fuse the account anomaly identification results corresponding to the base classifiers with the merged account anomaly identification result to obtain an account anomaly scoring result corresponding to the target account.
The account anomaly scoring result may be a numerical measure of the degree of abnormality of the target account: the higher the score, the higher the degree of abnormality indicated for the target account. If the score exceeds a preset threshold, the target account is regarded as a dangerous account and is further checked manually.
Specifically, the account anomaly identification results corresponding to the base classifiers, the merged account anomaly identification result, the number of base classifiers, and the weight of each base classifier are input into an anomaly scoring formula to calculate the account anomaly scoring result for the target account. Target accounts are then classified according to preset threshold ranges for the various levels. If a target account is classified as a dangerous account, its transaction data are further checked manually according to the specific score. Behaviors that cause a target account to become a dangerous account include rapid inflow and rapid outflow of resources in the account, concentrated inflow with dispersed outflow, dispersed inflow with concentrated outflow, small-amount probing, frequent transfers at night, large-amount cash withdrawals at ATMs in key border areas, and the like. The anomaly scoring formula is as follows:
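A form of the scoring formula that is consistent with the symbol definitions and with the worked example of steps 606 and 704 (a score of 0 when flag is 0, and hx + iy + jz when flag is 1) can be sketched as follows. The name Score and the absence of any normalization by n are assumptions; the original formula may differ in scaling.

```latex
\mathrm{Score} = \mathrm{flag} \cdot \sum_{i=1}^{n} \delta_i \, \theta_i
```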
where n is the number of base classifiers, θ_i is the probability (between 0 and 1), as judged by the i-th base classifier, that the customer is involved in a case, δ_i is the weight of the i-th base classifier, and flag is the merged result of the base classifiers (0 or 1).
For example, the account anomaly identification results a corresponding to the base classifiers, the merged account anomaly identification result b, the number c of base classifiers, and the weights d of the base classifiers are input into the anomaly scoring formula to obtain an account anomaly scoring result e for the target account, which is determined by a, b, c and d.
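The classification of target accounts by preset threshold ranges can be illustrated with a small lookup. The level names and boundary values below are purely hypothetical placeholders, since the text does not state the actual ranges; only the idea of mapping a score to a level by threshold range comes from the description.

```python
from bisect import bisect_right

# Hypothetical level boundaries and names; the real ranges are not given in the text.
LEVEL_BOUNDS = [0.3, 0.6]
LEVEL_NAMES = ["normal", "suspicious", "dangerous"]


def risk_level(score: float) -> str:
    """Map an anomaly score to a level using the preset threshold ranges."""
    # bisect_right counts how many boundaries are <= score.
    return LEVEL_NAMES[bisect_right(LEVEL_BOUNDS, score)]


def needs_manual_check(score: float) -> bool:
    """Accounts that fall in the highest level are routed to manual checking."""
    return risk_level(score) == "dangerous"
```

With these placeholder bounds, a score of 0.795 would fall in the highest level and be sent for manual checking, while a score of 0 (merged result 0) stays in the lowest level.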
In the above account data processing method, business transaction data of a target account are acquired, the business transaction data being generated after the target account performs a resource interaction operation in the resource interaction system; the business transaction data of the target account are input into at least one base classifier to obtain the account anomaly identification results corresponding to the base classifiers, where the base classifiers are classification models whose errors are mutually independent, and each account anomaly identification result represents the probability that the target account exhibits abnormal interaction behavior in the resource interaction system; the account anomaly identification results corresponding to the base classifiers are merged to obtain a merged account anomaly identification result, which represents whether the target account has abnormal business transactions; and the account anomaly identification results corresponding to the base classifiers are fused with the merged account anomaly identification result to obtain the account anomaly scoring result corresponding to the target account.
By exploiting the high accuracy of ensemble learning, the data are classified by different models, the results are then merged, and the business transaction behavior of the target account is classified and scored. Because different models learn from and fit the user data and extract effective information, the accuracy of the target account anomaly checking system is improved, labor cost is reduced, and the target account scoring model can be updated in a more timely manner.
In an embodiment, as shown in fig. 3, merging the account anomaly identification results corresponding to each base classifier to obtain a merged account anomaly identification result includes:
The voting algorithm may be, for example, an absolute voting algorithm, which requires more than half of the valid votes for approval: when multiple classifiers predict a class, that class is output only if it receives more than half of the total votes.
The merged result matrix may be a matrix that indicates whether the target account is abnormal, obtained by applying the absolute voting algorithm to an intermediate matrix formed from the account anomaly identification results corresponding to the base classifiers.
Specifically, the results of the base classifiers are output separately and combined into a multidimensional intermediate matrix whose dimension equals the number of base classifiers: each column holds the discrimination results of one base classifier on all data, and each row holds the discrimination results of the different base classifiers on the same sample. The combined multidimensional intermediate matrix is input into the absolute voting algorithm for voting to obtain the merged result matrix corresponding to the account anomaly identification results. The calculation formula of the absolute voting algorithm is as follows:
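The absolute majority voting rule is commonly written in the following form, which matches the symbol definitions given for it. The superscript notation h_i^j(x), denoting the output of base classifier h_i for class C_j on sample x (1 if h_i votes for C_j, otherwise 0), is an assumption.

```latex
H(x) =
\begin{cases}
C_j, & \text{if } \sum_{i=1}^{T} h_i^{j}(x) > \dfrac{1}{2} \sum_{k=1}^{N} \sum_{i=1}^{T} h_i^{k}(x), \\
\text{reject}, & \text{otherwise.}
\end{cases}
```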
where C_j is the classification category, h_i is the i-th base classifier, T is the number of classifiers, and N is the number of classes. That is, if the votes for class j among the T classifiers exceed half of the total votes, class j is predicted; otherwise, the prediction is rejected. For example, the classification in this method is whether abnormal behavior exists, where 0 indicates no abnormal behavior and 1 indicates abnormal behavior. It is determined whether class 0 receives more than 3 votes among the discrimination results of five base classifiers; if so, it is determined that no abnormal behavior exists, and otherwise that abnormal behavior exists.
For example, the account anomaly identification results corresponding to 5 base classifiers are formed into a 5-dimensional intermediate matrix, the dimension being equal to the number of base classifiers. Each column holds the discrimination results of one base classifier on all data, and each row holds the discrimination results of the different base classifiers on the same sample. The 5-dimensional intermediate matrix is input into the absolute voting algorithm for voting to obtain the merged result matrix corresponding to the account anomaly identification results, from which it can be determined whether the target account is abnormal.
Step 304: select, from the merged result matrix, the anomaly determination result corresponding to the account anomaly identification result whose number of votes exceeds a preset threshold as the merged account anomaly identification result.
The anomaly determination result may be a result of determining whether the account anomaly identification result has an anomaly after voting by an absolute voting algorithm, and generally, the output mode of the anomaly determination result may be 0 or 1, where if the output is 0, the target account has no anomaly, and if the output is 1, the target account has an anomaly.
Specifically, a vote threshold is preset. If the number of votes obtained exceeds the threshold, the target account is determined to be an abnormal account; otherwise, it is determined to be a normal account. The account anomaly identification results whose votes exceed the threshold are selected from the merged result matrix and output as 1, which serves as the anomaly determination result. If no account anomaly identification result in the merged result matrix exceeds the vote threshold, 0 is output. This output is the merged account anomaly identification result, which is therefore also 0 or 1: 0 indicates a normal account and 1 indicates an abnormal account.
For example, the number of base classifiers is 5, so the vote threshold set according to the number of base classifiers is 3: an account whose number of votes exceeds 3 is an abnormal account. If an account anomaly identification result with more than 3 votes is selected from the merged result matrix, the anomaly determination result is 1, indicating that the account is abnormal, and this anomaly determination result is used as the merged account anomaly identification result. If no account anomaly identification result in the merged result matrix has more than 3 votes, the anomaly determination result is 0, and the merged account anomaly identification result is also 0.
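The threshold comparison of step 304 can be sketched as follows, assuming each row of the matrix holds the 0/1 votes of the five base classifiers for one account. The function name and the `inclusive` switch are hypothetical; the switch is included only because this embodiment speaks of votes that exceed the threshold, whereas a later embodiment uses greater than or equal to.

```python
from typing import List


def merge_by_threshold(matrix: List[List[int]], threshold: int,
                       inclusive: bool = False) -> List[int]:
    """Return one merged 0/1 result per row (rows: accounts, columns: classifiers)."""
    flags = []
    for row in matrix:
        abnormal_votes = sum(row)  # number of classifiers voting 1 (abnormal)
        if inclusive:
            hit = abnormal_votes >= threshold
        else:
            hit = abnormal_votes > threshold
        flags.append(1 if hit else 0)
    return flags


# Hypothetical votes of 5 base classifiers for three accounts.
intermediate = [
    [1, 1, 1, 1, 0],  # 4 abnormal votes
    [1, 1, 1, 0, 0],  # 3 abnormal votes
    [0, 0, 1, 0, 0],  # 1 abnormal vote
]
strict = merge_by_threshold(intermediate, threshold=3)
loose = merge_by_threshold(intermediate, threshold=3, inclusive=True)
```

The second account shows why the comparison operator matters: with exactly 3 votes it is flagged only under the inclusive reading.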
In this embodiment, the account anomaly identification results are merged by the absolute voting algorithm to obtain the number of votes corresponding to each account anomaly identification result, so that the account anomaly identification results can be judged further and the randomness of relying solely on the probability output by a single base classifier is eliminated.
In one embodiment, as shown in fig. 4, the inputting the account anomaly recognition result corresponding to each base classifier into a voting algorithm to obtain a merged result matrix corresponding to the account anomaly recognition result includes:
The non-voting matrix may be a matrix formed according to the number of base classifiers and the account anomaly identification results that has not yet been input into the absolute voting algorithm for voting.
Specifically, the number of columns of the non-voting matrix is determined based on the number of the base classifiers, and each base classifier is filled into the column corresponding to the non-voting matrix in any order; and determining the number of rows of the non-voting matrix based on the number of the account abnormity identification results, and filling the account abnormity identification results into the rows corresponding to the non-voting matrix in any order.
For example, suppose there are 3 base classifiers X, Y and Z, and correspondingly 3 account anomaly identification results x, y and z. The rows of the non-voting matrix then correspond to the account anomaly identification results and the columns to the base classifiers. The base classifiers X, Y and Z may be filled into the columns of the non-voting matrix in any order, for example X, Y, Z or X, Z, Y.
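One way to picture the construction is to treat each base classifier's outputs as a column and transpose them into rows. In this sketch, the function name, the dictionary layout, and the reading that each row holds all classifiers' results for one item are assumptions.

```python
from typing import Dict, List, Sequence


def build_non_voting_matrix(outputs: Dict[str, List[int]],
                            column_order: Sequence[str]) -> List[List[int]]:
    """Place each classifier's results in one column, in the given column order."""
    columns = [outputs[name] for name in column_order]
    if len({len(col) for col in columns}) != 1:
        raise ValueError("every base classifier must score the same number of items")
    # zip(*columns) transposes the columns into rows.
    return [list(row) for row in zip(*columns)]


# Hypothetical 0/1 results of base classifiers X, Y and Z on three items.
outputs = {"X": [1, 0, 1], "Y": [1, 1, 0], "Z": [0, 0, 1]}
matrix_xzy = build_non_voting_matrix(outputs, ["X", "Z", "Y"])
```

Changing the column order only permutes the columns, so any row-wise vote count is unaffected, which is consistent with the statement that the order may be arbitrary.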
Step 404: fuse the data of the corresponding rows and columns of the non-voting matrix one by one to obtain the non-voting matrix corresponding to the number of base classifiers.
Specifically, the data corresponding to the rows and columns of the non-voting matrix established based on the base classifier and the account anomaly identification result are fused by using the corresponding relationship, so as to obtain the non-voting matrix with the same number of dimensions as the base classifier, wherein the fusion can be the mathematical calculation such as multiplication and weighting between the corresponding data.
For example, the columns of the non-voting matrix established from 5 base classifiers and the rows of the non-voting matrix established from the number of account anomaly identification results are fused by multiplication to obtain a five-dimensional non-voting matrix.
Step 406: input the non-voting matrix into the voting algorithm to obtain the merged result matrix corresponding to the account anomaly identification results.
Specifically, the non-voting matrix composed of the account anomaly identification results corresponding to the multiple base classifiers is input into the absolute voting algorithm for voting to obtain the merged result matrix corresponding to the account anomaly identification results. The algorithm into which the non-voting matrix is input is not limited to the absolute voting algorithm; a soft voting algorithm, a hard voting algorithm, a weighted sum method, a stacking algorithm, or the like may also be selected.
For example, a five-dimensional non-voting matrix formed by 5 base classifiers and account anomaly identification results corresponding to the 5 base classifiers is input into a stacking algorithm for processing, so as to obtain a combined result matrix corresponding to the account anomaly identification results.
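To illustrate how the alternative merging algorithms mentioned above can differ, the sketch below contrasts hard voting with soft (optionally weighted) voting on the same probabilities. The function names and the 0.5 cutoff are assumptions, and stacking is omitted because it requires training a second-level model.

```python
from typing import Optional, Sequence


def hard_vote(probs: Sequence[float], cutoff: float = 0.5) -> int:
    """Each classifier casts a 0/1 vote; return 1 if more than half vote 1."""
    votes = [1 if p >= cutoff else 0 for p in probs]
    return 1 if sum(votes) > len(votes) / 2 else 0


def soft_vote(probs: Sequence[float],
              weights: Optional[Sequence[float]] = None,
              cutoff: float = 0.5) -> int:
    """Average the probabilities (weighted if weights are given), then threshold."""
    if weights is None:
        weights = [1.0] * len(probs)
    avg = sum(w * p for w, p in zip(weights, probs)) / sum(weights)
    return 1 if avg >= cutoff else 0


# Hypothetical anomaly probabilities from three base classifiers.
probs = [0.9, 0.45, 0.4]
hard = hard_vote(probs)   # only one of three classifiers votes 1
soft = soft_vote(probs)   # the mean probability is about 0.583
```

Here one very confident classifier is outvoted under hard voting but carries the decision under soft voting, which is the kind of trade-off that would guide the choice of algorithm.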
In this embodiment, by building the multidimensional matrix and inputting it into the voting algorithm for voting, a target account whose account anomaly identification result indicates a possible anomaly can be verified further, which reduces judgment errors and improves the accuracy of the system.
In one embodiment, as shown in fig. 5, selecting an abnormality judgment result corresponding to an account abnormality identification result whose vote number exceeds a preset threshold from the merged result matrix as the merged account abnormality identification result includes:
Specifically, according to the number of base classifiers, the voting algorithm used to merge the corresponding elements is determined from the voting algorithm set, and the vote threshold corresponding to each element of the merged result matrix is then determined; the preset threshold must be smaller than the number of base classifiers. If the number of votes obtained by any element is greater than or equal to the threshold, the target account is abnormal; if the number of votes obtained by every element is smaller than the threshold, the target account is not abnormal.
For example, for an account data processing service that uses 5 base classifiers, the absolute voting algorithm is selected from the voting algorithm set as the merging algorithm. Given the nature of the absolute voting algorithm, and because the vote threshold cannot be greater than the number of base classifiers, 3 votes is selected as the vote threshold.
Specifically, after the non-voting matrix has been voted on by the corresponding voting algorithm, if the number of votes obtained by any element of the resulting merged result matrix exceeds the corresponding preset threshold, the account anomaly identification result containing that element is output, and the anomaly determination result derived from it is output as the merged account anomaly identification result. Both the anomaly determination result and the merged account anomaly identification result are expressed as 0 or 1, where 0 indicates that the target account is normal and 1 indicates that the target account is abnormal.
For example, after the five-dimensional non-voting matrix is merged by the absolute voting algorithm, a five-dimensional merged result matrix is obtained. Element H in this matrix obtains 5 votes, which exceeds the preset vote threshold of 3, so the anomaly determination result corresponding to the account anomaly identification result of element H is 1, and the merged account anomaly identification result is also output as 1.
In this embodiment, a preset vote threshold is set for each element in the combination result matrix, and the number of votes obtained by the element is compared with the preset threshold, so that the identification efficiency of the abnormal target account can be further improved.
In one embodiment, as shown in fig. 6, fusing the account anomaly identification results corresponding to the base classifiers and the merged account anomaly identification result to obtain an account anomaly scoring result corresponding to the target account, includes:
Specifically, the server obtains the base classifiers that need to be used, counts the number of the base classifiers, and obtains the weight corresponding to each base classifier at the same time. The weights corresponding to the base classifiers are determined according to the errors corresponding to the base classifiers, and the errors of any base classifier are independent.
For example, the server obtains the 4 base classifiers to be used (a support vector machine (SVM), a random forest, a linear regression, and a logistic regression model) and the weight corresponding to each base classifier (support vector machine: 0.4, random forest: 0.2, linear regression: 0.35, logistic regression model: 0.05).
Step 604: adjust the account anomaly identification results corresponding to the base classifiers based on the weights corresponding to the base classifiers to obtain the adjusted recognition results.
The adjusted recognition result may be the result obtained by applying the weight of each base classifier to the account anomaly identification result of that base classifier.
Specifically, each base classifier has a weight corresponding to each base classifier and an account anomaly identification result obtained through calculation, so that for the same base classifier, the weight corresponding to the classifier and the account anomaly identification result are jointly adjusted to obtain an adjusted identification result determined based on the weight and the account anomaly identification result, and the same operation is performed on any one base classifier. Wherein the adjustment can be a plurality of different mathematical calculation modes.
For example, the weights corresponding to base classifiers H, I and J are h, i and j, and the account anomaly identification results calculated by the respective base classifiers are x, y and z. The adjustment multiplies each weight by the corresponding account anomaly identification result, so the adjusted recognition results of the base classifiers are hx, iy and jz.
Step 606: multiply the adjusted recognition results by the merged account anomaly identification result to obtain the account anomaly scoring result corresponding to the target account.
Specifically, the adjusted recognition results corresponding to the base classifiers are multiplied by the merged account anomaly recognition results one by one to obtain products corresponding to the base classifiers, and then fusion is carried out to further obtain the account anomaly scoring results corresponding to the target account.
For example, the adjusted recognition result corresponding to each base classifier is hx, iy, jz, and the merged account anomaly recognition result is 0 or 1. If the merged account abnormity identification result is 0, the account abnormity scoring result corresponding to the target account is 0 and belongs to a normal account; and if the merged account abnormity identification result is 1, the account abnormity scoring result corresponding to the target account is hx + iy + jz.
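The weighting, multiplication, and summation described in this embodiment can be sketched as one small function. The function name and the probability values are hypothetical; the weights are the illustrative ones given earlier for the four base classifiers (0.4, 0.2, 0.35, 0.05).

```python
from typing import Sequence


def anomaly_score(probs: Sequence[float], weights: Sequence[float], flag: int) -> float:
    """Weight each classifier's probability, gate it by the merged 0/1 flag, and sum."""
    if len(probs) != len(weights):
        raise ValueError("one weight is required per base classifier")
    if flag not in (0, 1):
        raise ValueError("the merged result must be 0 or 1")
    adjusted = [w * p for w, p in zip(weights, probs)]  # adjusted recognition results
    sub_scores = [flag * a for a in adjusted]           # gated by the merged result
    return sum(sub_scores)


weights = [0.4, 0.2, 0.35, 0.05]  # SVM, random forest, linear regression, logistic regression
probs = [0.9, 0.8, 0.7, 0.6]      # hypothetical anomaly probabilities
score_abnormal = anomaly_score(probs, weights, flag=1)  # approximately 0.795
score_normal = anomaly_score(probs, weights, flag=0)    # 0 when the merged result is 0
```

Because the merged result acts as a gate, an account that the vote judges normal always scores 0 regardless of the individual probabilities.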
In this embodiment, by obtaining the weight corresponding to each base classifier and adjusting the corresponding account anomaly identification result according to the obtained weight, the weights of the account anomaly identification results corresponding to different base classifiers can be changed, so that the output result can be weighted according to the service requirement.
In one embodiment, as shown in fig. 7, the obtaining an account anomaly scoring result corresponding to the target account based on multiplying the adjusted recognition result by the merged account anomaly recognition result includes:
Step 702: multiply each adjusted recognition result by the merged account anomaly identification result to obtain the sub-anomaly scoring result corresponding to each adjusted recognition result.
Specifically, if the merged account abnormality recognition result is 0 and the result obtained by multiplying each adjusted recognition result by the merged account abnormality recognition result is 0, the sub-abnormality scoring result corresponding to the adjusted recognition result is 0; and if the merged account abnormity identification result is 1, the result obtained by multiplying each adjusted identification result by the merged account abnormity identification result is still each adjusted identification result, and the sub-abnormity scoring result corresponding to the adjusted identification result is each adjusted identification result.
For example, if the merged account abnormality recognition result is 0, and the result obtained by multiplying each adjusted recognition result hx, iy, jz by the merged account abnormality recognition result is 0, the sub-abnormality scoring result corresponding to the adjusted recognition result is 0; and if the merged account abnormity identification result is 1, the result obtained by multiplying each adjusted identification result hx, iy and jz by the merged account abnormity identification result is still each adjusted identification result, and the sub-abnormity scoring result corresponding to the adjusted identification result is each adjusted identification result hx, iy and jz.
Step 704: accumulate the sub-anomaly scoring results corresponding to the adjusted recognition results to obtain the account anomaly scoring result corresponding to the target account.
Specifically, if the merged account abnormality recognition result is 1, summing the sub-abnormality scoring results corresponding to each adjusted recognition result, and obtaining a sum, namely the account abnormality scoring result corresponding to the target account; and if the account abnormity identification result after combination is 0, the account abnormity scoring result corresponding to the target account is 0.
For example, if the merged account abnormality recognition result is 1, the sub-abnormality scoring results hx, iy, and jz corresponding to each adjusted recognition result are summed, and the account abnormality scoring result corresponding to the target account is hx + iy + jz.
In this embodiment, summing the sub-anomaly scoring results integrates the scores produced by all base classifiers using their different algorithms, so that the account anomaly scoring result serves as a meaningful reference.
In one embodiment, the business transaction data of the target account needs to be preprocessed, wherein the preprocessing steps comprise data cleaning, data encoding and data normalization.
(1) Data cleaning: the user data to be used are cleaned, for example by handling missing values and outliers. Commonly used methods include deleting data, filling missing values, leaving data unprocessed, and true value conversion.
Deleting data: the row records or column fields that contain missing values are deleted directly, which reduces the influence of incomplete records on the data as a whole and improves the accuracy of the data. However, this method is not applicable to all scenarios, because deletion reduces the data features available. It should not be used when a large proportion of the records in the data set are incomplete, or when the missing values show an obvious distribution pattern or characteristic;
Filling missing values: missing data are supplemented by a suitable method to form complete data records, which is very important for subsequent data processing, analysis and modeling;
Leaving data unprocessed: in the data preprocessing stage, the missing values in the data set are not processed;
True value conversion: the existence of missing values is acknowledged and the missingness is treated as part of the data distribution, so that both the actual values of a variable and its missingness participate in subsequent data processing and model calculation as input dimensions.
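Three of the cleaning strategies listed above can be sketched on a list of records in which a missing value is represented by `None`. The field name `amount`, the function names, and the choice of the mean as the fill value are assumptions for illustration only.

```python
from statistics import mean
from typing import Dict, List, Optional

Record = Dict[str, Optional[float]]


def drop_missing(records: List[Record], field: str) -> List[Record]:
    """Deleting data: discard records whose field is missing."""
    return [r for r in records if r[field] is not None]


def fill_missing(records: List[Record], field: str) -> List[Record]:
    """Filling missing values: replace missing entries with the mean of known ones."""
    fill = mean(r[field] for r in records if r[field] is not None)
    return [{**r, field: fill if r[field] is None else r[field]} for r in records]


def add_missing_indicator(records: List[Record], field: str) -> List[Record]:
    """True value conversion: expose missingness as an extra 0/1 input dimension."""
    return [{**r, field + "_missing": 1 if r[field] is None else 0} for r in records]


# Hypothetical transaction records with one missing amount.
records = [{"amount": 100.0}, {"amount": None}, {"amount": 300.0}]
dropped = drop_missing(records, "amount")
filled = fill_missing(records, "amount")
flagged = add_missing_indicator(records, "amount")
```

Note that `fill_missing` raises `statistics.StatisticsError` if every value of the field is missing, in which case one of the other strategies would be needed.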
(2) Data encoding: to use some features in models such as logistic regression and support vector machines, those features must be converted into numerical types. Available methods include ordinal encoding, one-hot encoding, and binary encoding.
Ordinal encoding: mainly used for categorical features whose values have an inherent order. A feature with m categories is mapped to the integers in [0, m-1]; for example, for education level, "bachelor", "master" and "doctor" are encoded as 0, 1 and 2 respectively;
One-hot encoding: also known as one-bit-effective encoding; an N-bit status register is used to encode N states, each state has its own register bit, and only one bit is active at any time.
Binary encoding: elements of the sample matrix are represented by 0 or 1 according to whether they are above or below a given threshold.
In this method, part of the text data is encoded by one-hot encoding for model training, so as to improve how well the model fits the data.
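The three encodings can be illustrated with plain Python. The function names are hypothetical, and mapping values above the threshold to 1 in `binarize` is an assumption, since the text does not say which side of the threshold maps to which value.

```python
from typing import List, Sequence


def ordinal_encode(value: str, ordered_categories: Sequence[str]) -> int:
    """Map a category to its position in the ordered category list (0 to m-1)."""
    return list(ordered_categories).index(value)


def one_hot_encode(value: str, categories: Sequence[str]) -> List[int]:
    """Return a vector with a single 1 at the position of the category."""
    return [1 if c == value else 0 for c in categories]


def binarize(values: Sequence[float], threshold: float) -> List[int]:
    """Return 1 for values above the threshold and 0 otherwise (assumed direction)."""
    return [1 if v > threshold else 0 for v in values]


education = ["bachelor", "master", "doctor"]
code = ordinal_encode("master", education)
vector = one_hot_encode("master", education)
bits = binarize([50.0, 500.0, 5000.0], threshold=500.0)
```

Ordinal encoding preserves the order of the categories in a single number, whereas one-hot encoding avoids implying any order at the cost of one dimension per category.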
(3) Data normalization: the features are scaled to roughly the same numerical range so as to eliminate the dimensional influence between features and make different indicators comparable. Common methods include linear function normalization and zero-mean normalization.
In this method, the attributes related to monetary amounts are normalized using zero-mean normalization, which eliminates the dimensional influence caused by excessively large transaction amounts.
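The two normalization methods named above can be sketched as follows. The function names are hypothetical, and the use of the population standard deviation and the handling of constant features are assumptions.

```python
from statistics import mean, pstdev
from typing import List, Sequence


def min_max_normalize(values: Sequence[float]) -> List[float]:
    """Linear function normalization: rescale values to the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0] * len(values)  # constant feature: no spread to rescale
    return [(v - lo) / (hi - lo) for v in values]


def z_score_normalize(values: Sequence[float]) -> List[float]:
    """Zero-mean normalization: subtract the mean and divide by the standard deviation."""
    mu = mean(values)
    sigma = pstdev(values)  # population standard deviation (assumed)
    if sigma == 0:
        return [0.0] * len(values)
    return [(v - mu) / sigma for v in values]


# Hypothetical transaction amounts.
amounts = [10.0, 20.0, 30.0]
scaled = min_max_normalize(amounts)
standardized = z_score_normalize(amounts)
```

After zero-mean normalization the feature has mean 0 and unit standard deviation, so a very large transaction amount no longer dominates features measured on smaller scales.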
In one embodiment, the ensemble learning identification process is illustrated in fig. 8, and the scoring logic diagram is illustrated in fig. 9.
It should be understood that, although the steps in the flowcharts of the above embodiments are shown in the sequence indicated by the arrows, the steps are not necessarily performed in that sequence. Unless explicitly stated herein, the execution of the steps is not strictly limited to the order shown, and the steps may be performed in other orders. Moreover, at least some of the steps in the flowcharts of the above embodiments may include multiple sub-steps or stages, which are not necessarily performed at the same time but may be performed at different times. These sub-steps or stages are not necessarily performed sequentially; they may be performed in turn or alternately with other steps, or with at least some of the sub-steps or stages of other steps.
Based on the same inventive concept, the embodiment of the application also provides an account data processing device for realizing the account data processing method. The implementation scheme for solving the problem provided by the device is similar to the implementation scheme described in the method, so specific limitations in one or more embodiments of the account data processing device provided below can be referred to the limitations of the account data processing method in the foregoing, and details are not described herein again.
In one embodiment, as shown in fig. 10, there is provided an account data processing apparatus including: the system comprises a business transaction data acquisition module, an account abnormity identification result acquisition module, a combined account abnormity identification result module and an account abnormity scoring result module, wherein:
the service transaction data acquisition module 1002 is configured to acquire service transaction data of a target account, where the service transaction data of the target account is service data generated after a resource interaction operation is performed on the target account in a resource interaction system;
an account anomaly identification result obtaining module 1004, configured to input the service transaction data of the target account into at least one base classifier, so as to obtain an account anomaly identification result corresponding to each base classifier, where each base classifier is a classification model with mutually independent errors, and the account anomaly identification result is used to represent a probability that an abnormal interaction behavior exists in the resource interaction system for the target account;
a merged account exception identification result module 1006, configured to merge account exception identification results corresponding to the base classifiers to obtain a merged account exception identification result, where the merged account exception identification result indicates whether the target account has an exception service transaction;
and the account anomaly scoring result module 1008 is configured to fuse the account anomaly identification results corresponding to the base classifiers and the merged account anomaly identification result to obtain an account anomaly scoring result corresponding to the target account.
In one embodiment, the merged account anomaly identification result module is further configured to input the account anomaly identification results corresponding to each base classifier into a voting algorithm to obtain a merged result matrix corresponding to the account anomaly identification results, where the dimension of the merged result matrix is the same as the number of the base classifiers; and selecting an abnormal discrimination result corresponding to the account abnormal recognition result of which the vote quantity exceeds a preset threshold value from the merged result matrix as a merged account abnormal recognition result.
In one embodiment, the merged account anomaly identification result module is further configured to determine a column corresponding to the non-voting matrix based on the number of the base classifiers, and determine a row corresponding to the non-voting matrix based on the number of the account anomaly identification results; multiplying the data of the rows and the columns corresponding to the non-voting matrix one by one to obtain a non-voting matrix corresponding to the number of the base classifiers; and inputting the non-voting matrix into a voting algorithm to obtain a combined result matrix corresponding to the account abnormity identification result.
In one embodiment, the merged account anomaly identification result module is further configured to determine a preset threshold value of votes obtained by each element in the merged result matrix from the number of the base classifiers, where the preset threshold value is set to be smaller than the number of the base classifiers; and if the vote quantity obtained by the elements in the merging result matrix exceeds a preset threshold value, outputting an abnormality judgment result corresponding to the account abnormality identification result of the element as the merged account abnormality identification result.
In one embodiment, the account anomaly scoring result module is further configured to: obtain the weight corresponding to each base classifier and the number of base classifiers, where the weight corresponding to each base classifier is determined according to the error of that base classifier; adjust the account anomaly identification result corresponding to each base classifier based on its weight to obtain the adjusted identification results; and multiply each adjusted identification result by the merged account anomaly identification result to obtain the account anomaly scoring result corresponding to the target account.
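In symbols, and assuming that "adjusting" means scaling each probability by its weight, the per-classifier quantities can be written as follows. Here $p_i$ is the anomaly probability from base classifier $i$, $w_i$ its weight, $e_i$ its error, $m \in \{0, 1\}$ the merged result, and $n$ the number of base classifiers. The normalized-accuracy weight is only one hypothetical way to derive a weight from an error; the text does not give the formula.

```latex
\tilde{p}_i = w_i \, p_i, \qquad
s_i = \tilde{p}_i \, m, \qquad
w_i = \frac{1 - e_i}{\sum_{j=1}^{n} (1 - e_j)} \quad \text{(assumed form)}
```

With this choice the weights sum to 1, so a classifier with a smaller error contributes proportionally more to the score.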
In one embodiment, the account anomaly scoring result module is further configured to: multiply each adjusted identification result by the merged account anomaly identification result to obtain a sub-anomaly scoring result corresponding to each adjusted identification result; and accumulate the sub-anomaly scoring results to obtain the account anomaly scoring result corresponding to the target account.
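The multiply-then-accumulate step can be sketched as below. The weighting rule in `weights_from_errors` (normalized accuracy) is an assumption introduced for illustration, since the text only says that weights are determined from classifier errors; all function names are hypothetical.

```python
from typing import List, Sequence


def weights_from_errors(errors: Sequence[float]) -> List[float]:
    """Hypothetical weighting: lower error gives a larger weight, normalized to sum to 1."""
    accuracies = [1.0 - e for e in errors]
    total = sum(accuracies)
    if total <= 0:
        raise ValueError("at least one base classifier must have an error below 1")
    return [a / total for a in accuracies]


def anomaly_score(
    probabilities: Sequence[float],
    errors: Sequence[float],
    merged_result: int,
) -> float:
    """Weight each result, multiply by the merged result, and accumulate."""
    if len(probabilities) != len(errors):
        raise ValueError("one error value is required per base classifier")
    weights = weights_from_errors(errors)
    adjusted = [w * p for w, p in zip(weights, probabilities)]  # adjusted results
    sub_scores = [a * merged_result for a in adjusted]          # sub-anomaly scores
    return sum(sub_scores)                                      # accumulated score
```

Because the merged result acts as a 0/1 gate in this sketch, an account that the vote judged normal receives a score of 0, while an account judged abnormal receives the weighted sum of the base classifiers' probabilities.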
The modules of the above account data processing device may be implemented wholly or partially in software, hardware, or a combination thereof. The modules may be embedded in, or independent of, a processor in the computer device in hardware form, or may be stored in a memory of the computer device in software form, so that the processor can invoke them and execute the operations corresponding to each module.
In one embodiment, a computer device is provided, which may be a server, and whose internal structure may be as shown in fig. 11. The computer device includes a processor, a memory, and a network interface connected by a system bus. The processor of the computer device provides computing and control capabilities. The memory of the computer device includes a non-volatile storage medium and an internal memory. The non-volatile storage medium stores an operating system, a computer program, and a database. The internal memory provides an environment for running the operating system and the computer program stored in the non-volatile storage medium. The database of the computer device is used for storing server data. The network interface of the computer device is used for communicating with an external terminal through a network connection. The computer program, when executed by the processor, implements an account data processing method.
Those skilled in the art will appreciate that the architecture shown in fig. 11 is merely a block diagram of some of the structures associated with the disclosed aspects and is not intended to limit the computing devices to which the disclosed aspects apply. A particular computing device may include more or fewer components than those shown, may combine certain components, or may have a different arrangement of components.
In one embodiment, a computer device is further provided, which includes a memory and a processor, the memory stores a computer program, and the processor implements the steps of the above method embodiments when executing the computer program.
In an embodiment, a computer-readable storage medium is provided, in which a computer program is stored which, when being executed by a processor, carries out the steps of the above-mentioned method embodiments.
In one embodiment, a computer program product or computer program is provided that includes computer instructions stored in a computer-readable storage medium. The computer instructions are read by a processor of a computer device from a computer-readable storage medium, and the computer instructions are executed by the processor to cause the computer device to perform the steps in the above-mentioned method embodiments.
It should be noted that the user information (including but not limited to user device information, user personal information, etc.) and data (including but not limited to data for analysis, stored data, presented data, etc.) referred to in the present application are information and data authorized by the user or fully authorized by all parties.
It will be understood by those skilled in the art that all or part of the processes of the methods in the embodiments described above can be implemented by a computer program instructing the relevant hardware. The computer program can be stored in a non-volatile computer-readable storage medium and, when executed, can include the processes of the method embodiments described above. Any reference to memory, database, or other medium used in the embodiments provided herein may include at least one of non-volatile and volatile memory. Non-volatile memory may include Read-Only Memory (ROM), magnetic tape, floppy disk, flash memory, optical memory, high-density embedded non-volatile memory, Resistive Random Access Memory (ReRAM), Magnetic Random Access Memory (MRAM), Ferroelectric Random Access Memory (FRAM), Phase Change Memory (PCM), graphene memory, and the like. Volatile memory may include Random Access Memory (RAM), external cache memory, and the like. By way of illustration and not limitation, RAM can take many forms, such as Static Random Access Memory (SRAM) or Dynamic Random Access Memory (DRAM). The databases referred to in the embodiments provided herein may include at least one of relational and non-relational databases. The non-relational database may include, but is not limited to, a blockchain-based distributed database. The processors referred to in the embodiments provided herein may be, without limitation, general-purpose processors, central processing units, graphics processors, digital signal processors, programmable logic devices, data processing logic devices based on quantum computing, and the like.
The technical features of the above embodiments can be combined arbitrarily. For the sake of brevity, not all possible combinations of these technical features are described; however, as long as a combination contains no contradiction, it should be considered to fall within the scope of the present specification.
The above embodiments express only several implementations of the present application. Although their description is specific and detailed, it should not be construed as limiting the scope of the present application. It should be noted that a person skilled in the art can make several variations and modifications without departing from the concept of the present application, all of which fall within the scope of protection of the present application. Therefore, the protection scope of the present application shall be subject to the appended claims.
Claims (10)
1. An account data processing method, characterized in that the method comprises:
acquiring business transaction data of a target account, wherein the business transaction data of the target account is business data generated after the target account executes a resource interaction operation in a resource interaction system;
inputting the business transaction data of the target account into at least one base classifier to obtain an account anomaly identification result corresponding to each base classifier, wherein the base classifiers are classification models with mutually independent errors, and the account anomaly identification result represents the probability that the target account exhibits abnormal interaction behavior in the resource interaction system;
merging the account anomaly identification results corresponding to the base classifiers to obtain a merged account anomaly identification result, wherein the merged account anomaly identification result represents whether the target account has an abnormal business transaction;
and fusing the account anomaly identification results corresponding to the base classifiers with the merged account anomaly identification result to obtain an account anomaly scoring result corresponding to the target account.
2. The method according to claim 1, wherein the merging the account anomaly identification results corresponding to each of the base classifiers to obtain a merged account anomaly identification result includes:
inputting the account anomaly identification result corresponding to each base classifier into a voting algorithm to obtain a merged result matrix corresponding to the account anomaly identification results, wherein the dimension of the merged result matrix is the same as the number of the base classifiers;
and selecting, from the merged result matrix, an anomaly discrimination result corresponding to the account anomaly identification result whose vote count exceeds a preset threshold as the merged account anomaly identification result.
3. The method of claim 2, wherein the inputting the account anomaly identification result corresponding to each base classifier into a voting algorithm to obtain a merged result matrix corresponding to the account anomaly identification result comprises:
determining the columns of a non-voting matrix based on the number of the base classifiers, and determining the rows of the non-voting matrix based on the number of the account anomaly identification results;
performing one-to-one fusion on the data of the rows and the columns of the non-voting matrix to obtain a non-voting matrix corresponding to the number of the base classifiers;
and inputting the non-voting matrix into the voting algorithm to obtain the merged result matrix corresponding to the account anomaly identification results.
4. The method according to claim 2, wherein the selecting, from the merged result matrix, an anomaly discrimination result corresponding to the account anomaly identification result whose vote count exceeds a preset threshold as the merged account anomaly identification result comprises:
determining, according to the number of the base classifiers, the preset threshold for the votes obtained by each element in the merged result matrix, wherein the preset threshold is smaller than the number of the base classifiers;
and if the vote count obtained by an element in the merged result matrix exceeds the preset threshold, outputting the anomaly discrimination result corresponding to the account anomaly identification result of that element as the merged account anomaly identification result.
5. The method according to claim 1, wherein the fusing the account anomaly identification results corresponding to the base classifiers with the merged account anomaly identification result to obtain the account anomaly scoring result corresponding to the target account comprises:
acquiring the weight corresponding to each base classifier and the number of the base classifiers, wherein the weight corresponding to each base classifier is determined according to the error corresponding to that base classifier;
adjusting the account anomaly identification result corresponding to each base classifier based on the weight corresponding to that base classifier to obtain adjusted identification results;
and multiplying each adjusted identification result by the merged account anomaly identification result to obtain the account anomaly scoring result corresponding to the target account.
6. The method according to claim 5, wherein the multiplying each adjusted identification result by the merged account anomaly identification result to obtain the account anomaly scoring result corresponding to the target account comprises:
multiplying each adjusted identification result by the merged account anomaly identification result to obtain a sub-anomaly scoring result corresponding to each adjusted identification result;
and accumulating the sub-anomaly scoring results corresponding to the adjusted identification results to obtain the account anomaly scoring result corresponding to the target account.
7. An account data processing apparatus, characterized in that the apparatus comprises:
a business transaction data acquisition module, configured to acquire business transaction data of a target account, wherein the business transaction data of the target account is business data generated after the target account executes a resource interaction operation in a resource interaction system;
an account anomaly identification result obtaining module, configured to input the business transaction data of the target account into at least one base classifier to obtain an account anomaly identification result corresponding to each base classifier, wherein the base classifiers are classification models with mutually independent errors, and the account anomaly identification result represents the probability that the target account exhibits abnormal interaction behavior in the resource interaction system;
a merged account anomaly identification result module, configured to merge the account anomaly identification results corresponding to the base classifiers to obtain a merged account anomaly identification result, wherein the merged account anomaly identification result represents whether the target account has an abnormal business transaction;
and an account anomaly scoring result module, configured to fuse the account anomaly identification results corresponding to the base classifiers with the merged account anomaly identification result to obtain an account anomaly scoring result corresponding to the target account.
8. A computer device comprising a memory and a processor, the memory storing a computer program, characterized in that the processor, when executing the computer program, implements the steps of the method of any of claims 1 to 6.
9. A computer-readable storage medium, on which a computer program is stored, which, when being executed by a processor, carries out the steps of the method of any one of claims 1 to 6.
10. A computer program product comprising a computer program, characterized in that the computer program realizes the steps of the method of any one of claims 1 to 6 when executed by a processor.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210666462.1A CN115063143A (en) | 2022-06-14 | 2022-06-14 | Account data processing method and device, computer equipment and storage medium |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210666462.1A CN115063143A (en) | 2022-06-14 | 2022-06-14 | Account data processing method and device, computer equipment and storage medium |
Publications (1)
Publication Number | Publication Date |
---|---|
CN115063143A (en) | 2022-09-16 |
Family
ID=83200768
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210666462.1A Pending CN115063143A (en) | 2022-06-14 | 2022-06-14 | Account data processing method and device, computer equipment and storage medium |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN115063143A (en) |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN117807406A (en) * | 2024-03-01 | 2024-04-02 | 深圳市拜特科技股份有限公司 | Enterprise account management method, system, equipment and storage medium of payment platform |
CN117807406B (en) * | 2024-03-01 | 2024-04-30 | 深圳市拜特科技股份有限公司 | Enterprise account management method, system, equipment and storage medium of payment platform |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
EP3985578A1 (en) | Method and system for automatically training machine learning model | |
CN110852881B (en) | Risk account identification method and device, electronic equipment and medium | |
CN107633030B (en) | Credit evaluation method and device based on data model | |
CN110807700A (en) | Unsupervised fusion model personal credit scoring method based on government data | |
US20150269669A1 (en) | Loan risk assessment using cluster-based classification for diagnostics | |
CN111861697B (en) | Loan multi-head data-based user portrait generation method and system | |
CN112990386B (en) | User value clustering method and device, computer equipment and storage medium | |
CN112241805A (en) | Defect prediction using historical inspection data | |
CN114997916A (en) | Prediction method, system, electronic device and storage medium of potential user | |
CN115080868A (en) | Product pushing method, product pushing device, computer equipment, storage medium and program product | |
CN116150663A (en) | Data classification method, device, computer equipment and storage medium | |
CN115063143A (en) | Account data processing method and device, computer equipment and storage medium | |
Lee et al. | An entropy decision model for selection of enterprise resource planning system | |
Fu et al. | Applying DEA–BPN to enhance the explanatory power of performance measurement | |
CN116304251A (en) | Label processing method, device, computer equipment and storage medium | |
CN115907954A (en) | Account identification method and device, computer equipment and storage medium | |
CN114170000A (en) | Credit card user risk category identification method, device, computer equipment and medium | |
Lee et al. | Application of machine learning in credit risk scorecard | |
CN118569981B (en) | Customer repayment risk prediction method and system based on consumption portraits | |
CN118260347B (en) | Data acquisition and analysis method and system based on artificial intelligence | |
CN117078441B (en) | Method, apparatus, computer device and storage medium for identifying claims fraud | |
CN117808441B (en) | Bid information checking method and system | |
CN113239024B (en) | Bank abnormal data detection method based on outlier detection | |
US11397783B1 (en) | Ranking similar users based on values and personal journeys | |
CN118536083A (en) | User credit scoring method, apparatus, computer device, readable storage medium and program product |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
 | PB01 | Publication | |
 | SE01 | Entry into force of request for substantive examination | |