
CN110309219A - Method and device for generating a credit scoring model - Google Patents


Info

Publication number
CN110309219A
Authority
CN
China
Prior art keywords
indicator combination
ship data
index
user
credit scoring
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201910541021.7A
Other languages
Chinese (zh)
Inventor
顾浙君
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Jiqi Wulian Science And Technology (shanghai) Co Ltd
Original Assignee
Jiqi Wulian Science And Technology (shanghai) Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Jiqi Wulian Science And Technology (shanghai) Co Ltd
Priority to CN201910541021.7A
Publication of CN110309219A
Legal status: Pending

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 - Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 - Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/25 - Integrating or interfacing systems involving database management systems
    • G06F16/254 - Extract, transform and load [ETL] procedures, e.g. ETL data flows in data warehouses
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 - Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 - Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/28 - Databases characterised by their database models, e.g. relational or object models
    • G06F16/284 - Relational databases
    • G06F16/285 - Clustering or classification
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F17/00 - Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/10 - Complex mathematical operations
    • G06F17/18 - Complex mathematical operations for evaluating statistical data, e.g. average values, frequency distributions, probability functions, regression analysis
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06Q - INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 - Administration; Management
    • G06Q10/06 - Resources, workflows, human or project management; Enterprise or organisation planning; Enterprise or organisation modelling
    • G06Q10/063 - Operations research, analysis or management
    • G06Q10/0639 - Performance analysis of employees; Performance analysis of enterprise or organisation operations
    • G06Q10/06393 - Score-carding, benchmarking or key performance indicator [KPI] analysis
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06Q - INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 - Administration; Management
    • G06Q10/08 - Logistics, e.g. warehousing, loading or distribution; Inventory or stock management

Landscapes

  • Engineering & Computer Science (AREA)
  • Business, Economics & Management (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Human Resources & Organizations (AREA)
  • Databases & Information Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Economics (AREA)
  • General Engineering & Computer Science (AREA)
  • Operations Research (AREA)
  • Strategic Management (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Development Economics (AREA)
  • Pure & Applied Mathematics (AREA)
  • Marketing (AREA)
  • Mathematical Optimization (AREA)
  • Educational Administration (AREA)
  • Mathematical Physics (AREA)
  • General Business, Economics & Management (AREA)
  • Mathematical Analysis (AREA)
  • Tourism & Hospitality (AREA)
  • Computational Mathematics (AREA)
  • Quality & Reliability (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Probability & Statistics with Applications (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Evolutionary Biology (AREA)
  • Software Systems (AREA)
  • Algebra (AREA)
  • Game Theory and Decision Science (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

This application discloses a method and a device for generating a credit scoring model. The method comprises: obtaining target freight data of a user; processing the target freight data according to a preset processing rule to obtain an indicator combination; screening the indicator combination with a preset statistical model; and performing a preset operation on the indicators in the screening result to generate a credit scoring model. The device comprises an acquisition module, a processing module, a screening module and a generation module. The application addresses the technical problem that existing measurement methods have unsatisfactory applicability and produce results with poor stability and accuracy.

Description

Method and device for generating a credit scoring model
Technical field
This application relates to the field of quantitative credit scoring, and in particular to a method and a device for generating a credit scoring model.
Background
Academia and the financial industry have studied scoring methods for quantifying credit risk extensively, but the vast majority of this work starts from the position of large financial institutions such as commercial banks and takes large companies as the credit customer group under study. This differs significantly from the freight industry, where small and micro enterprises are the mainstream. Many indicators widely used in scoring methods are detached from the actual circumstances of the freight industry, in terms of both the conditions and the content of data acquisition.
For example, large companies can mine credit scoring indicators from externally obtained securities-market data, or form indicators from high-quality internal financial data, whereas small and micro enterprises in the freight industry generally lack data sources from which credit scoring indicators can be built. Measurement methods tied to the characteristics of those data sources, such as CreditMetrics, CreditRisk+, KMV and the Z-score method, also differ fundamentally, so their applicability is unsatisfactory.
No effective solution has yet been proposed for the problem in the related art that measurement methods have unsatisfactory applicability and produce results with poor stability and accuracy.
Summary of the invention
The main purpose of this application is to provide a method and a device for generating a credit scoring model, so as to solve the problem that measurement methods have unsatisfactory applicability and produce results with poor stability and accuracy.
To achieve the above purpose, according to one aspect of this application, a method for generating a credit scoring model is provided.
The method for generating a credit scoring model according to this application comprises: obtaining target freight data of a user; processing the target freight data according to a preset processing rule to obtain an indicator combination; screening the indicator combination with a preset statistical model; and performing a preset operation on the indicators in the screening result to generate a credit scoring model.
Further, obtaining the target freight data of the user comprises: extracting the freight data of the user from a database; and performing ETL processing on the freight data to obtain the target freight data.
Further, processing the target freight data according to the preset processing rule to obtain the indicator combination comprises: collecting and judging user repayment records against a preset time standard; dividing users into two classes according to the judgment result; determining the observation period nodes of the classification result according to the time nodes in the preset time standard; and aggregating the target freight data over different time lengths by observation period node to obtain an indicator combination of freight performance.
Further, processing the target freight data according to the preset processing rule to obtain the indicator combination also comprises: performing a relative-value comparison on the indicators in the indicator combination of freight performance to obtain a first derived indicator combination; and combining at least two indicators in the indicator combination of freight performance to obtain a second derived indicator combination.
Further, screening the indicator combination with the preset statistical model comprises one or more of the following: screening the indicator combination with an analysis of variance model; screening the indicator combination with a logistic regression model; screening the indicator combination with a random forest model; screening the indicator combination with a neural network model.
Further, screening the indicator combination with the preset statistical model also comprises: obtaining a first indicator combination through a significance test or a multicollinearity test; and selecting from the first indicator combination by highest AR value to obtain a target indicator combination.
Further, performing the preset operation on the indicators in the screening result to generate the credit scoring model comprises: assigning weights to the indicators in the screening result; and performing cross-validation or hold-out sample validation on the weighted indicators to generate the credit scoring model.
To achieve the above purpose, according to another aspect of this application, a device for generating a credit scoring model is provided.
The device for generating a credit scoring model according to this application comprises: an acquisition module, for obtaining target freight data of a user; a processing module, for processing the target freight data according to a preset processing rule to obtain an indicator combination; a screening module, for screening the indicator combination with a preset statistical model; and a generation module, for performing a preset operation on the indicators in the screening result to generate a credit scoring model.
Further, the acquisition unit is configured to: extract the freight data of the user from a database; and perform ETL processing on the freight data to obtain the target freight data.
Further, the processing module is configured to: collect and judge user repayment records against a preset time standard; divide users into two classes according to the judgment result; determine the observation period nodes of the classification result according to the time nodes in the preset time standard; and aggregate the target freight data over different time lengths by observation period node to obtain an indicator combination of freight performance.
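For illustration only, the four modules can be pictured as stages of a single pipeline. The following is a minimal Python sketch under stated assumptions: the class name `ScoringPipeline`, the `Record` type and the callable signatures are hypothetical and do not come from the application, which does not prescribe any programming interface.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

Record = Dict[str, float]  # hypothetical: one row of indicator values


@dataclass
class ScoringPipeline:
    """Hypothetical wiring of the four modules; each stage is supplied by the caller."""

    acquire: Callable[[], List[Record]]                 # acquisition module
    process: Callable[[List[Record]], List[Record]]     # processing module
    screen: Callable[[List[Record]], List[str]]         # screening module
    generate: Callable[[List[Record], List[str]], Dict[str, float]]  # generation module

    def run(self) -> Dict[str, float]:
        data = self.acquire()                       # target freight data
        indicators = self.process(data)             # indicator combination
        selected = self.screen(indicators)          # names of retained indicators
        return self.generate(indicators, selected)  # indicator name -> weight
```

Each stage is passed in as a plain function, so any of the concrete steps described in the embodiments can be substituted without changing the overall flow.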
The embodiments of this application combine data processing with indicator screening: target freight data of a user are obtained; the target freight data are processed according to a preset processing rule to obtain an indicator combination; the indicator combination is screened with a preset statistical model; and a preset operation is performed on the indicators in the screening result to generate a credit scoring model. Target freight data are thus collected, indicators are derived from them by data processing, and the indicators are screened with specific measurement methods. As a result, the measurement algorithm becomes applicable to the freight industry and the stability and accuracy of the credit scoring model improve, which solves the technical problem that measurement methods have unsatisfactory applicability and produce results with poor stability and accuracy.
Brief description of the drawings
The accompanying drawings, which form a part of this application, are provided for a further understanding of the application, so that its other features, objects and advantages become more apparent. The drawings of the illustrative embodiments and their descriptions serve to explain the application and do not constitute an undue limitation on it. In the drawings:
Fig. 1 is a schematic diagram of the method for generating a credit scoring model according to the first embodiment of this application;
Fig. 2 is a schematic diagram of the method for generating a credit scoring model according to the second embodiment of this application;
Fig. 3 is a schematic diagram of the method for generating a credit scoring model according to the third embodiment of this application;
Fig. 4 is a schematic diagram of the method for generating a credit scoring model according to the fourth embodiment of this application;
Fig. 5 is a schematic diagram of the method for generating a credit scoring model according to the fifth embodiment of this application;
Fig. 6 is a schematic diagram of the method for generating a credit scoring model according to the sixth embodiment of this application;
Fig. 7 is a schematic diagram of the method for generating a credit scoring model according to the seventh embodiment of this application;
Fig. 8 is a schematic diagram of the device for generating a credit scoring model according to the first embodiment of this application.
Detailed description of the embodiments
To enable those skilled in the art to better understand the solution of this application, the technical solutions in the embodiments are described clearly and completely below with reference to the drawings. Obviously, the described embodiments are only some, not all, of the embodiments of this application. All other embodiments obtained by a person of ordinary skill in the art on the basis of these embodiments without creative effort shall fall within the scope of protection of this application.
It should be noted that the terms "first", "second" and the like in the description, the claims and the above drawings are used to distinguish similar objects, not to describe a particular order or sequence. It should be understood that data so used are interchangeable where appropriate, for the purposes of the embodiments described herein. In addition, the terms "comprise" and "have" and any variations of them are intended to cover non-exclusive inclusion: a process, method, system, product or device that contains a series of steps or units is not necessarily limited to the steps or units expressly listed, but may include other steps or units that are not expressly listed or that are inherent to the process, method, product or device.
In this application, the orientation or positional relationships indicated by terms such as "upper", "lower", "left", "right", "front", "rear", "top", "bottom", "inner", "outer", "middle", "vertical", "horizontal", "transverse" and "longitudinal" are based on the orientation or positional relationships shown in the drawings. These terms are used mainly to better describe the invention and its embodiments, and are not intended to require that the indicated device, element or component have a particular orientation or be constructed and operated in a particular orientation.
Moreover, besides indicating an orientation or positional relationship, some of these terms may also carry other meanings; for example, the term "upper" may in some cases indicate a dependency or connection relationship. A person of ordinary skill in the art can understand the specific meaning of these terms in the present invention according to the circumstances.
In addition, the terms "install", "arrange", "provided with", "connect", "connected" and "socketed" shall be understood broadly. For example, a connection may be fixed, detachable or of one-piece construction; it may be mechanical or electrical; it may be direct, indirect through an intermediary, or an internal connection between two devices, elements or components. A person of ordinary skill in the art can understand the specific meaning of these terms in the present invention according to the circumstances.
It should be noted that, where no conflict arises, the embodiments of this application and the features in the embodiments may be combined with one another. The application is described in detail below with reference to the drawings and in conjunction with the embodiments.
According to an embodiment of the present invention, a method for generating a credit scoring model is provided. As shown in Fig. 1, the method comprises the following steps S100 to S106:
Step S100: obtain the target freight data of the user;
Specifically, as shown in Fig. 2, obtaining the target freight data of the user comprises:
Step S200: extract the freight data of the user from a database;
Step S202: perform ETL processing on the freight data to obtain the target freight data.
The freight data of freight companies are extracted from the G7 database. The freight data include, but are not limited to: the customer's operating tenure; the mileage, task routes, total route time and driving time within it as recorded by G7 devices; the resources invested in freight (such as vehicles, drivers and monitoring devices); the resource shares in a cross-sectional comparison; and the consumption amounts recorded for G7's ETC and fuel card products.
ETL processing is performed on the freight data: the data are aggregated with the freight company as the dimension, fields with an excessively high missing rate or repetition rate are screened out, and the missing values and outliers in the retained fields are checked and corrected, for example by median filling.
After processing, data that meet the requirements are selected and duplicate data are deleted; certain abnormal data can also be supplemented. This makes the acquired target freight data more concise and improves data quality, which helps reduce the computation required for later data processing.
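The cleaning part of the ETL step can be sketched as follows. This is a minimal illustration rather than the applicant's implementation: the function name `clean_freight_rows`, the row layout and the 0.5 missing-rate threshold are assumptions, and the sketch covers only duplicate removal, dropping of sparse fields and median filling.

```python
from statistics import median
from typing import Dict, List, Optional

Row = Dict[str, Optional[float]]  # hypothetical: one company-level record, None = missing


def clean_freight_rows(rows: List[Row], max_missing_rate: float = 0.5) -> List[Dict[str, float]]:
    """Drop exact duplicate rows, drop sparse fields, median-fill the remaining gaps."""
    # 1. Remove exact duplicates, keeping the first occurrence.
    seen, unique = set(), []
    for row in rows:
        key = tuple(sorted(row.items()))  # field names are unique, so values are never compared
        if key not in seen:
            seen.add(key)
            unique.append(row)
    if not unique:
        return []

    def missing_rate(field: str) -> float:
        return sum(row.get(field) is None for row in unique) / len(unique)

    # 2. Keep fields that are not too sparse (and have at least one observed value).
    fields = sorted({name for row in unique for name in row})
    kept = [f for f in fields if missing_rate(f) <= max_missing_rate and missing_rate(f) < 1.0]

    # 3. Fill the remaining missing values with the field median.
    medians = {f: median(row[f] for row in unique if row.get(f) is not None) for f in kept}
    return [
        {f: (row[f] if row.get(f) is not None else medians[f]) for f in kept}
        for row in unique
    ]
```

A median is used for filling because the text names it explicitly; the threshold would in practice be chosen from the data.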
Step S102: process the target freight data according to the preset processing rule to obtain an indicator combination;
Specifically, as shown in Fig. 3, processing the target freight data according to the preset processing rule to obtain the indicator combination comprises:
Step S300: collect and judge user repayment records against a preset time standard;
Step S302: divide users into two classes according to the judgment result;
Step S304: determine the observation period nodes of the classification result according to the time nodes in the preset time standard;
Step S306: aggregate the target freight data over different time lengths by observation period node to obtain an indicator combination of freight performance.
User repayment records are collected. Each record contains the user's repayment time, and from the repayment time and a preset overdue threshold it can be determined whether a repayment is overdue. Overdue users are assigned to the target sample and users who are not overdue to the other sample, so that users are divided into two classes. With reference to the time nodes in the preset time standard, the observation period nodes of the target sample and of the other sample can then be determined. Finally, the indicator combination of freight performance is obtained by aggregation.
The repayment performance of users (freight companies) is collected, and the criteria for the target sample and the other sample are set from the repayment records obtained. For example, customers overdue by more than 90 days form the target sample, and customers with 90 or more days of records who are overdue by less than 90 days form the other sample. The different observation period nodes of these sample customers are obtained from the time nodes corresponding to the criteria;
According to the observation period node, the target freight data are aggregated over different time lengths into indicators that reflect freight performance, for example the number of vehicles operated in the last 2 months.
This yields the indicators and lays the groundwork for building the credit scoring model.
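The classification and aggregation described above can be sketched as follows. This is a minimal Python illustration under stated assumptions: the function names, the event layout (a date paired with a numeric value) and the use of day-based look-back windows are hypothetical; only the 90-day example threshold comes from the text.

```python
from datetime import date, timedelta
from typing import Dict, List, Tuple


def label_customer(max_days_overdue: int, threshold_days: int = 90) -> str:
    """Return 'target' for customers overdue by more than the threshold, else 'other'."""
    return "target" if max_days_overdue > threshold_days else "other"


def window_totals(
    events: List[Tuple[date, float]],
    observation: date,
    window_days: List[int],
) -> Dict[int, float]:
    """Sum event values over each look-back window that ends at the observation node."""
    totals = {}
    for days in window_days:
        start = observation - timedelta(days=days)
        # Count events after the window start, up to and including the observation date.
        totals[days] = sum(value for day, value in events if start < day <= observation)
    return totals
```

Calling `window_totals` with several window lengths for the same observation node produces one indicator per time length, which mirrors aggregating the target freight data "over different time lengths".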
Preferably, as shown in Fig. 4, processing the target freight data according to the preset processing rule to obtain the indicator combination also comprises:
Step S400: perform a relative-value comparison on the indicators in the indicator combination of freight performance to obtain a first derived indicator combination;
Step S402: combine at least two indicators in the indicator combination of freight performance to obtain a second derived indicator combination.
Indicators aggregated over different time lengths can be compared further in relative terms, for example year-over-year or period-over-period, to derive more indicators and obtain the first derived indicator combination, such as the period-over-period change in mileage over the last 2 months;
Two or more different indicators can be combined to derive further indicators and obtain the second derived indicator combination, for example the growth rate of the average mileage per vehicle over 2 months;
Combining indicators and comparing their relative values further expands the number of indicators in the indicator combination, making the indicators more complete and supporting the applicability of the credit scoring model that is ultimately generated.
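The two kinds of derived indicator can be sketched in a few lines of Python. The function names and the exact formulas are assumptions chosen for illustration; the text itself only names the comparison types and gives examples.

```python
def period_over_period(current: float, previous: float) -> float:
    """Relative change of an indicator versus the preceding period."""
    if previous == 0:
        raise ValueError("previous period value must be non-zero")
    return (current - previous) / previous


def per_vehicle(total: float, vehicles: int) -> float:
    """Combine two indicators: a total (e.g. mileage) divided by the number of vehicles."""
    if vehicles <= 0:
        raise ValueError("vehicle count must be positive")
    return total / vehicles


# Hypothetical values: growth of average mileage per vehicle between two months.
growth = period_over_period(per_vehicle(1200.0, 10), per_vehicle(1000.0, 10))
```

Here `period_over_period` corresponds to a first derived indicator and the composition of both functions to a second derived indicator.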
Step S104: screen the indicator combination with the preset statistical model;
Specifically, as shown in Fig. 5, screening the indicator combination with the preset statistical model comprises one or more of the following:
Step S500: screen the indicator combination with an analysis of variance model;
Step S502: screen the indicator combination with a logistic regression model;
Step S504: screen the indicator combination with a random forest model;
Step S506: screen the indicator combination with a neural network model.
The indicator combinations of the target sample and the other sample are screened with analysis of variance, logistic regression, random forest, an artificial neural network, random forest combined with logistic regression, or any one or combination of these statistical methods.
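Of the screening options listed, analysis of variance is the simplest to show in full. The sketch below computes the one-way ANOVA F statistic of each indicator between the target sample and the other sample and ranks indicators by it. The function names and the dictionary layout are assumptions; the text does not specify how the F statistic is turned into a screening decision.

```python
from statistics import mean
from typing import Dict, List


def anova_f(groups: List[List[float]]) -> float:
    """One-way ANOVA F statistic: between-group mean square over within-group mean square."""
    k = len(groups)
    n = sum(len(g) for g in groups)
    grand = mean(x for g in groups for x in g)
    ss_between = sum(len(g) * (mean(g) - grand) ** 2 for g in groups)
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    # Raises ZeroDivisionError if every group is constant (no within-group variance).
    return (ss_between / (k - 1)) / (ss_within / (n - k))


def rank_by_f(target: Dict[str, List[float]], other: Dict[str, List[float]]) -> List[str]:
    """Order indicator names by how strongly they separate the two samples."""
    return sorted(target, key=lambda name: anova_f([target[name], other[name]]), reverse=True)
```

A larger F value means the indicator's mean differs more between the two samples relative to its spread within each sample, which is the sense in which such an indicator is more useful for scoring.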
Preferably, as shown in Fig. 6, screening the indicator combination with the preset statistical model also comprises:
Step S600: obtain a first indicator combination through a significance test or a multicollinearity test;
Step S602: select from the first indicator combination by highest AR value to obtain a target indicator combination.
Analysis of variance, logistic regression, random forest, an artificial neural network, random forest combined with logistic regression, or any one or combination of these statistical methods, call for different verification approaches. When tests such as a significance test or a multicollinearity test are required, the first indicator combination is obtained through the significance test, or, because of the influence of multicollinearity, the indicators need to be further divided into different indicator combinations;
Among the different indicator combinations that may result, the one with the highest AR value is selected to obtain the target indicator combination. Screening the indicators in this way makes the resulting credit scoring model fit the data conditions of the freight industry more closely and allows it to measure credit risk effectively. Compared with qualitative methods, and with other measurement methods that require the sample data to follow a normal distribution and impose many strict assumptions, it markedly improves the stability and accuracy of the results.
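The text does not define the AR value. The sketch below assumes it denotes the accuracy ratio commonly used in credit scoring, AR = 2 × AUC − 1, and replaces the division into alternative indicator combinations with a simpler greedy rule: indicators are taken in descending order of |AR| and skipped when they are strongly correlated with one already chosen. The function names and the 0.8 correlation limit are assumptions.

```python
from math import sqrt
from typing import Dict, List


def accuracy_ratio(scores: List[float], labels: List[int]) -> float:
    """AR = 2 * AUC - 1, with AUC from pairwise comparison (ties count one half)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0 for p in pos for q in neg)
    return 2.0 * wins / (len(pos) * len(neg)) - 1.0


def pearson(x: List[float], y: List[float]) -> float:
    """Pearson correlation; returns 0.0 for a constant column (an assumption of this sketch)."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy) if sxx > 0 and syy > 0 else 0.0


def select_indicators(
    columns: Dict[str, List[float]], labels: List[int], max_abs_corr: float = 0.8
) -> List[str]:
    """Greedy selection: highest |AR| first, skipping indicators collinear with chosen ones."""
    ranked = sorted(columns, key=lambda n: abs(accuracy_ratio(columns[n], labels)), reverse=True)
    chosen: List[str] = []
    for name in ranked:
        if all(abs(pearson(columns[name], columns[c])) <= max_abs_corr for c in chosen):
            chosen.append(name)
    return chosen
```

Pairwise correlation is only a rough proxy for multicollinearity; a variance inflation factor check would be a closer match where a regression library is available.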
Step S106: perform the preset operation on the indicators in the screening result to generate the credit scoring model.
Specifically, as shown in Fig. 7, performing the preset operation on the indicators in the screening result to generate the credit scoring model comprises:
Step S700: assign weights to the indicators in the screening result;
Step S702: perform cross-validation or hold-out sample validation on the weighted indicators to generate the credit scoring model.
Cross-validation or hold-out sample validation is applied to the final result to determine the credit scoring formula to be adopted, that is, the credit scoring model;
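As a hedged illustration of the validation step, the sketch below builds the index splits for k-fold cross-validation in plain Python. The function name `k_fold_indices` and the use of contiguous folds are assumptions; the text does not state the number of folds or how samples are assigned to them.

```python
from typing import List, Tuple


def k_fold_indices(n: int, k: int) -> List[Tuple[List[int], List[int]]]:
    """Split range(n) into k contiguous folds; each fold serves once as the validation set."""
    if not 2 <= k <= n:
        raise ValueError("k must satisfy 2 <= k <= n")
    base, extra = divmod(n, k)  # the first `extra` folds get one additional sample
    folds, start = [], 0
    for i in range(k):
        size = base + (1 if i < extra else 0)
        validation = list(range(start, start + size))
        training = [j for j in range(n) if j < start or j >= start + size]
        folds.append((training, validation))
        start += size
    return folds
```

Samples would normally be shuffled before splitting. Hold-out validation corresponds to using a single such split, with the weights fitted on the training part and checked on the reserved part.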
An example of the credit scoring model is:
Y = 0.012·X1 + 0.014·X2 + 0.033·X3 + 0.006·X4 + 0.999·X5
This credit scoring model fits the data conditions of the freight industry more closely.
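Evaluating the example formula is a plain weighted sum. In the Python sketch below the weights are copied from the example above; the names `WEIGHTS` and `credit_score` are hypothetical, and because the text does not say which indicators X1 to X5 stand for, the inputs are treated as already-computed indicator values.

```python
from typing import Sequence

# Weights taken from the example formula; the meaning of X1..X5 is not given in the text.
WEIGHTS = (0.012, 0.014, 0.033, 0.006, 0.999)


def credit_score(indicators: Sequence[float], weights: Sequence[float] = WEIGHTS) -> float:
    """Return Y as the weighted sum of the indicator values."""
    if len(indicators) != len(weights):
        raise ValueError("one indicator value is required per weight")
    return sum(w * x for w, x in zip(weights, indicators))
```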
As can be seen from the above description, the present invention achieves the following technical effects:
The embodiments of this application combine data processing with indicator screening: target freight data of a user are obtained; the target freight data are processed according to a preset processing rule to obtain an indicator combination; the indicator combination is screened with a preset statistical model; and a preset operation is performed on the indicators in the screening result to generate a credit scoring model. Target freight data are thus collected, indicators are derived from them by data processing, and the indicators are screened with specific measurement methods. As a result, the measurement algorithm becomes applicable to the freight industry and the stability and accuracy of the credit scoring model improve, which solves the technical problem that measurement methods have unsatisfactory applicability and produce results with poor stability and accuracy.
It should be noted that the steps shown in the flowcharts of the drawings may be executed in a computer system, for example as a set of computer-executable instructions, and that, although a logical order is shown in the flowcharts, in some cases the steps shown or described may be executed in an order different from the one given here.
According to an embodiment of the present invention, a device for implementing the above method for generating a credit scoring model is also provided. As shown in Fig. 8, the device comprises:
An acquisition module 10, for obtaining the target freight data of the user;
The acquisition unit 10 is configured to:
Extract the freight data of the user from a database;
Perform ETL processing on the freight data to obtain the target freight data.
The freight data of freight companies are extracted from the G7 database. The freight data include, but are not limited to: the customer's operating tenure; the mileage, task routes, total route time and driving time within it as recorded by G7 devices; the resources invested in freight (such as vehicles, drivers and monitoring devices); the resource shares in a cross-sectional comparison; and the consumption amounts recorded for G7's ETC and fuel card products.
ETL processing is performed on the freight data: the data are aggregated with the freight company as the dimension, fields with an excessively high missing rate or repetition rate are screened out, and the missing values and outliers in the retained fields are checked and corrected, for example by median filling.
After processing, data that meet the requirements are selected and duplicate data are deleted; certain abnormal data can also be supplemented. This makes the acquired target freight data more concise and improves data quality, which helps reduce the computation required for later data processing.
A processing module 20, for processing the target freight data according to the preset processing rule to obtain an indicator combination;
The processing module 20 is configured to:
Collect and judge user repayment records against a preset time standard;
Divide users into two classes according to the judgment result;
Determine the observation period nodes of the classification result according to the time nodes in the preset time standard;
Aggregate the target freight data over different time lengths by observation period node to obtain an indicator combination of freight performance.
User repayment records are collected. Each record contains the user's repayment time, and from the repayment time and a preset overdue threshold it can be determined whether a repayment is overdue. Overdue users are assigned to the target sample and users who are not overdue to the other sample, so that users are divided into two classes. With reference to the time nodes in the preset time standard, the observation period nodes of the target sample and of the other sample can then be determined. Finally, the indicator combination of freight performance is obtained by aggregation.
The repayment performance of users (freight companies) is collected, and the criteria for the target sample and the other sample are set from the repayment records obtained. For example, customers overdue by more than 90 days form the target sample, and customers with 90 or more days of records who are overdue by less than 90 days form the other sample. The different observation period nodes of these sample customers are obtained from the time nodes corresponding to the criteria;
According to the observation period node, the target freight data are aggregated over different time lengths into indicators that reflect freight performance, for example the number of vehicles operated in the last 2 months.
This yields the indicators and lays the groundwork for building the credit scoring model.
Preferably, processing the target shipping data according to the preset processing rule to obtain the indicator combination further includes:
performing a relative-value comparison operation on the indicators in the indicator combination of shipping performance to obtain a first derived indicator combination;
combining at least two indicators in the indicator combination of shipping performance to obtain a second derived indicator combination.
Indicators aggregated over different time lengths can be further compared by relative value, such as year-on-year or period-on-period comparison, to derive more indicators and obtain the first derived indicator combination, for example the period-on-period change in mileage over the last 2 months;
Two or more different indicators can be combined to derive further indicators and obtain the second derived indicator combination, for example the growth rate of the average mileage per vehicle over 2 months;
Relative-value comparison and combination of indicators further expand the number of indicators in the indicator combination, making the indicators more complete and supporting the applicability of the credit scoring model that is ultimately generated.
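A small sketch of the two kinds of derived indicators described above, under assumed data: the figures, dictionary keys and helper names are hypothetical. The first derived indicator is a period-on-period change of one base indicator; the second combines two base indicators (mileage and vehicle count) before comparing periods.

```python
def relative_change(current, previous):
    """Period-on-period change; 0.25 means +25%. Returns None if previous is 0."""
    if previous == 0:
        return None
    return (current - previous) / previous


def per_vehicle(total, vehicle_count):
    """Average of a total per vehicle. Returns None if there are no vehicles."""
    if vehicle_count == 0:
        return None
    return total / vehicle_count


# Hypothetical base indicators for two consecutive 2-month windows.
mileage = {"previous_2m": 40000.0, "last_2m": 50000.0}
vehicles = {"previous_2m": 8, "last_2m": 10}

# First derived indicator: period-on-period change in mileage.
mileage_change = relative_change(mileage["last_2m"], mileage["previous_2m"])

# Second derived indicator: growth of average mileage per vehicle.
avg_previous = per_vehicle(mileage["previous_2m"], vehicles["previous_2m"])
avg_last = per_vehicle(mileage["last_2m"], vehicles["last_2m"])
avg_mileage_growth = relative_change(avg_last, avg_previous)

print(mileage_change, avg_mileage_growth)
```

In this made-up case total mileage grows by 25% while mileage per vehicle is flat, which shows why a combined indicator can carry information that neither base indicator carries alone.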
Screening module 30, configured to screen the indicator combination by a preset statistical model;
Specifically, as shown in Fig. 5, screening the indicator combination by the preset statistical model includes one or more of the following:
screening the indicator combination using an analysis-of-variance model;
screening the indicator combination using a logistic regression model;
screening the indicator combination using a random forest model;
screening the indicator combination using a neural network model.
Analysis of variance, logistic regression, random forest, an artificial neural network, random forest combined with logistic regression, or any combination of one or more of these statistical methods is used to screen the indicator combinations of the target samples and the other samples.
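As one illustration of the screening options listed above, the sketch below applies a one-way analysis of variance to each indicator and keeps those whose values differ clearly between target samples and other samples. The indicator names, sample values and the choice of cut-off are assumptions for the example; 7.71 is approximately the 5% critical value of the F distribution with (1, 4) degrees of freedom, which matches the toy sample sizes used here and would need to be chosen differently for real data.

```python
from statistics import fmean


def anova_f(groups):
    """One-way ANOVA F statistic for a list of groups of numeric values."""
    all_values = [v for g in groups for v in g]
    grand_mean = fmean(all_values)
    k, n = len(groups), len(all_values)
    ss_between = sum(len(g) * (fmean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((v - fmean(g)) ** 2 for g in groups for v in g)
    if ss_within == 0:
        return float("inf")  # groups are perfectly separated or constant
    return (ss_between / (k - 1)) / (ss_within / (n - k))


# Hypothetical indicator values: (target samples, other samples).
indicators = {
    "vehicles_last_2m": ([1, 2, 3], [7, 8, 9]),
    "mileage_change": ([4, 5, 6], [5, 4, 6]),
}

F_CRITICAL = 7.71  # approx. 5% critical value for F(1, 4); assumption for this toy data

kept = [
    name
    for name, (target, other) in indicators.items()
    if anova_f([target, other]) > F_CRITICAL
]
print(kept)
```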
Preferably, screening the indicator combination by the preset statistical model further includes:
obtaining a first indicator combination through a significance test or a multicollinearity test;
screening the first indicator combination by selecting the highest AR value to obtain a target indicator combination.
Analysis of variance, logistic regression, random forest, an artificial neural network, random forest combined with logistic regression, and combinations of one or more of these statistical methods require different verification methods. When a significance test, a multicollinearity test or a similar test is required, the first indicator combination is obtained through the significance test, or, because of multicollinearity, the indicators need to be further divided into different indicator combinations;
Among the different indicator combinations that may result, the one with the highest AR value is selected as the target indicator combination. Indicator screening makes the resulting credit scoring model fit the data of the freight industry more closely and measure credit risk effectively. Compared with qualitative methods, it clearly improves the stability and accuracy of the results, and it is more widely applicable than measurement methods that require the sample data to follow a normal distribution and impose many strict assumptions.
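The selection by AR value can be sketched as below. The text does not define AR; this example assumes it is the accuracy ratio commonly used in credit scoring, computed as 2 × AUC − 1, and assumes that a higher score indicates a higher risk of being a target (overdue) sample. The combination names and scores are hypothetical.

```python
def accuracy_ratio(scores, is_target):
    """AR = 2 * AUC - 1, where AUC is the share of (target, other) pairs in
    which the target sample has the higher score; ties count as half."""
    pos = [s for s, t in zip(scores, is_target) if t]
    neg = [s for s, t in zip(scores, is_target) if not t]
    if not pos or not neg:
        raise ValueError("need at least one target and one other sample")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    auc = wins / (len(pos) * len(neg))
    return 2 * auc - 1


# 1 marks a target sample (overdue), 0 marks an other sample.
is_target = [1, 1, 0, 0]

# Hypothetical risk scores produced by two candidate indicator combinations.
candidate_scores = {
    "combination_a": [0.9, 0.8, 0.3, 0.1],
    "combination_b": [0.9, 0.2, 0.3, 0.1],
}

ar_values = {
    name: accuracy_ratio(scores, is_target)
    for name, scores in candidate_scores.items()
}
best = max(ar_values, key=ar_values.get)
print(ar_values, best)
```

The pairwise AUC computation is quadratic in the sample size; it is used here only because it makes the definition easy to read.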
Generation module 40, configured to perform a preset operation on the indicators in the screening result to generate the credit scoring model.
Specifically, performing the preset operation on the indicators in the screening result to generate the credit scoring model includes:
assigning weights to the indicators in the screening result;
performing cross-validation or hold-out sample validation on the weighted indicators to generate the credit scoring model.
Cross-validation or hold-out sample validation is applied to the final result to determine the credit scoring formula to be adopted, i.e. the credit scoring model;
An example credit scoring model is:
Y = 0.012·X1 + 0.014·X2 + 0.033·X3 + 0.006·X4 + 0.999·X5
This credit scoring model fits the data of the freight industry more closely.
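The example formula and the validation step can be sketched as follows. The weights are those of the example formula; everything else (function names, the contiguous fold layout, the input vector) is an assumption for illustration, and the text does not say how the weights are fitted or how the folds are formed.

```python
# Weights of X1..X5 from the example formula in the description.
WEIGHTS = (0.012, 0.014, 0.033, 0.006, 0.999)


def credit_score(x):
    """Weighted sum Y of the five screened indicators X1..X5."""
    if len(x) != len(WEIGHTS):
        raise ValueError("expected one value per weighted indicator")
    return sum(w * v for w, v in zip(WEIGHTS, x))


def k_fold_indices(n, k):
    """Yield (train, test) index lists for k contiguous folds over n samples."""
    fold_sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = list(range(start, start + size))
        train = [i for i in range(n) if i < start or i >= start + size]
        yield train, test
        start += size


score = credit_score((1, 1, 1, 1, 1))  # hypothetical indicator values
folds = list(k_fold_indices(5, 2))
print(round(score, 3), folds)
```

With `k = 2` the split degenerates into a simple hold-out arrangement in which each half serves once as the reserved sample.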
It can be seen from the above description that the present invention achieves the following technical effects:
In the embodiments of the present application, data processing and indicator screening are used: the target shipping data of a user are obtained; the target shipping data are processed according to a preset processing rule to obtain an indicator combination; the indicator combination is screened by a preset statistical model; and a preset operation is performed on the indicators in the screening result to generate a credit scoring model. This achieves the purpose of collecting target shipping data, obtaining indicators through data processing, and screening the indicators with specific measurement methods, so that the measurement algorithm is applicable to the freight industry and the stability and accuracy of the credit scoring model are improved. This in turn solves the technical problem that measurement methods have poor applicability and produce results of poor stability and accuracy.
Obviously, those skilled in the art should understand that the modules or steps of the present invention described above can be implemented by a general-purpose computing device. They can be concentrated on a single computing device or distributed over a network formed by multiple computing devices. Optionally, they can be implemented as program code executable by a computing device, so that they can be stored in a storage device and executed by the computing device; alternatively, they can each be made into individual integrated circuit modules, or multiple modules or steps among them can be made into a single integrated circuit module. Thus, the present invention is not limited to any specific combination of hardware and software.
The foregoing are merely preferred embodiments of the present application and are not intended to limit it. For those skilled in the art, various modifications and changes may be made to the present application. Any modification, equivalent replacement, improvement, etc. made within the spirit and principles of the present application shall fall within its scope of protection.

Claims (10)

1. A method for generating a credit scoring model, characterized by comprising:
obtaining target shipping data of a user;
processing the target shipping data according to a preset processing rule to obtain an indicator combination;
screening the indicator combination by a preset statistical model;
performing a preset operation on the indicators in the screening result to generate a credit scoring model.
2. The generation method according to claim 1, characterized in that obtaining the target shipping data of the user comprises:
extracting the shipping data of the user from a database;
performing ETL processing on the shipping data to obtain the target shipping data.
3. The generation method according to claim 1, characterized in that processing the target shipping data according to the preset processing rule to obtain the indicator combination comprises:
judging the collected repayment records of users against a preset time standard;
dividing the users into two classes according to the judgment result;
determining the observation-period nodes of the classification result according to the time nodes in the preset time standard;
aggregating the target shipping data over different time lengths according to the observation-period nodes to obtain an indicator combination of shipping performance.
4. The generation method according to claim 3, characterized in that processing the target shipping data according to the preset processing rule to obtain the indicator combination further comprises:
performing a relative-value comparison operation on the indicators in the indicator combination of shipping performance to obtain a first derived indicator combination;
combining at least two indicators in the indicator combination of shipping performance to obtain a second derived indicator combination.
5. The generation method according to claim 1, characterized in that screening the indicator combination by the preset statistical model comprises one or more of the following:
screening the indicator combination using an analysis-of-variance model;
screening the indicator combination using a logistic regression model;
screening the indicator combination using a random forest model;
screening the indicator combination using a neural network model.
6. The generation method according to claim 1, characterized in that screening the indicator combination by the preset statistical model further comprises:
obtaining a first indicator combination through a significance test or a multicollinearity test;
screening the first indicator combination by selecting the highest AR value to obtain a target indicator combination.
7. The generation method according to claim 1, characterized in that performing the preset operation on the indicators in the screening result to generate the credit scoring model comprises:
assigning weights to the indicators in the screening result;
performing cross-validation or hold-out sample validation on the weighted indicators to generate the credit scoring model.
8. A device for generating a credit scoring model, characterized by comprising:
an acquisition module, configured to obtain target shipping data of a user;
a processing module, configured to process the target shipping data according to a preset processing rule to obtain an indicator combination;
a screening module, configured to screen the indicator combination by a preset statistical model;
a generation module, configured to perform a preset operation on the indicators in the screening result to generate a credit scoring model.
9. The generating device according to claim 8, characterized in that the acquisition module comprises:
extracting the shipping data of the user from a database;
performing ETL processing on the shipping data to obtain the target shipping data.
10. The generating device according to claim 8, characterized in that the processing module comprises:
judging the collected repayment records of users against a preset time standard;
dividing the users into two classes according to the judgment result;
determining the observation-period nodes of the classification result according to the time nodes in the preset time standard;
aggregating the target shipping data over different time lengths according to the observation-period nodes to obtain an indicator combination of shipping performance.
CN201910541021.7A 2019-06-20 2019-06-20 The generation method and device of credit scoring model Pending CN110309219A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910541021.7A CN110309219A (en) 2019-06-20 2019-06-20 The generation method and device of credit scoring model

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910541021.7A CN110309219A (en) 2019-06-20 2019-06-20 The generation method and device of credit scoring model

Publications (1)

Publication Number Publication Date
CN110309219A true CN110309219A (en) 2019-10-08

Family

ID=68077080

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910541021.7A Pending CN110309219A (en) 2019-06-20 2019-06-20 The generation method and device of credit scoring model

Country Status (1)

Country Link
CN (1) CN110309219A (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110837979A (en) * 2019-11-18 2020-02-25 吉旗物联科技(上海)有限公司 Safe driving risk prediction method and device based on random forest
CN113609121A (en) * 2021-08-17 2021-11-05 平安资产管理有限责任公司 Target data processing method, device, equipment and medium based on artificial intelligence
CN114266641A (en) * 2021-09-27 2022-04-01 东方微银科技股份有限公司 Scoring model construction method based on logistic regression and rules

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102542425A (en) * 2010-12-22 2012-07-04 时艳春 Road cargo logistics information management system and method
CN104867039A (en) * 2015-06-09 2015-08-26 南京大学 Vehicle member derivative credit evaluation method under influence of many factors
CN105354210A (en) * 2015-09-23 2016-02-24 深圳市爱贝信息技术有限公司 Mobile game payment account behavior data processing method and apparatus
CN108596757A (en) * 2018-04-23 2018-09-28 大连火眼征信管理有限公司 A kind of personal credit file method and system of intelligences combination
CN109345372A (en) * 2018-09-06 2019-02-15 江西汉辰金融科技集团有限公司 Credit-graded approach, system and computer readable storage medium

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110837979A (en) * 2019-11-18 2020-02-25 吉旗物联科技(上海)有限公司 Safe driving risk prediction method and device based on random forest
CN110837979B (en) * 2019-11-18 2022-02-15 吉旗物联科技(上海)有限公司 Safe driving risk prediction method and device based on random forest
CN113609121A (en) * 2021-08-17 2021-11-05 平安资产管理有限责任公司 Target data processing method, device, equipment and medium based on artificial intelligence
CN114266641A (en) * 2021-09-27 2022-04-01 东方微银科技股份有限公司 Scoring model construction method based on logistic regression and rules

Similar Documents

Publication Publication Date Title
CN110309219A (en) The generation method and device of credit scoring model
Golden et al. Implementing vehicle routing algorithms
CN109615524B (en) Money laundering crime group partner identification method, money laundering crime group partner identification device, computer equipment and storage medium
CN109583796A (en) A kind of data digging system and method for Logistics Park OA operation analysis
CN104737167A (en) Profiling data with source tracking
CN112734219B (en) Vehicle transportation running behavior analysis method and system
CN105528447B (en) The layer-by-layer method summarized when rejecting of a kind of pair of specific data
CN102750367A (en) Big data checking system and method thereof on cloud platform
CN105631612A (en) System and method of evaluating individual performance and capability of public servant based on big data
CN106651732A (en) Highway different-vehicle card-change toll-dodging vehicle screening method and system
JP2002032773A (en) Device and method for processing map data
CN109243173A (en) Track of vehicle analysis method and system based on road high definition bayonet data
CN108510396A (en) It insures method, apparatus, computer equipment and the storage medium of verification
KR100865157B1 (en) System and method of quantitative estimation for research and development projects
CN116503166A (en) Tracking method and tracking system for transaction funds on Ethernet chain
CN108021361A (en) A kind of the highway fee evasion of falling card vehicle screening method and device
CN108108448A (en) A kind of method and system for generating national road portrait
CN106294834A (en) Connected transaction based on taxpayer's interests related network is evaded the tax Activity recognition method
CN104391986B (en) Business reclassification apparatus and method
CN110503229A (en) Method, apparatus and calculating equipment for vehicle routing optimization
Apparao et al. Financial statement fraud detection by data mining
CN110223104A (en) A kind of client model building system based on big data
CN111583007B (en) Tax risk management and control method and device
Puentes Flexible Funding for Transit: Who Uses It?
CN112926991A (en) Cascade group severity grade dividing method and system

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20191008
