CN110570111A - Enterprise risk prediction method, model training method, device and equipment - Google Patents

Info

Publication number
CN110570111A
Authority
CN
China
Prior art keywords
node
enterprise
network graph
target
nodes
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201910815269.8A
Other languages
Chinese (zh)
Inventor
钱隽夫
曾威龙
王膂
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Advanced New Technologies Co Ltd
Advantageous New Technologies Co Ltd
Original Assignee
Alibaba Group Holding Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Alibaba Group Holding Ltd
Priority to CN201910815269.8A
Publication of CN110570111A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/044 Recurrent networks, e.g. Hopfield networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/04 Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem"
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/06 Resources, workflows, human or project management; Enterprise or organisation planning; Enterprise or organisation modelling
    • G06Q10/063 Operations research, analysis or management
    • G06Q10/0635 Risk analysis of enterprise or organisation activities

Landscapes

  • Engineering & Computer Science (AREA)
  • Business, Economics & Management (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Human Resources & Organizations (AREA)
  • Economics (AREA)
  • Strategic Management (AREA)
  • General Physics & Mathematics (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Evolutionary Computation (AREA)
  • Biomedical Technology (AREA)
  • Tourism & Hospitality (AREA)
  • Operations Research (AREA)
  • General Business, Economics & Management (AREA)
  • Marketing (AREA)
  • Game Theory and Decision Science (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Quality & Reliability (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Development Economics (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Educational Administration (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The embodiments of this specification provide an enterprise risk prediction method, a model training method, a device, and computer equipment. A relational network graph of enterprises is constructed based on the equity structure, and multiple types of sub-network graphs are extracted based on the relationships between enterprises and shareholders. A sample is constructed from the node set of a sub-network graph, and the characteristic attribute values and characterization vector of the sample are calculated from the characteristic attribute values and characterization vectors of the nodes in the sample. A model is trained with the characteristic attribute values and characterization vectors of the samples to obtain an enterprise risk prediction model, which is then used to predict enterprise risk. This reduces the computational complexity of the model, and because both the node attributes and the relational network structure are considered during prediction, the prediction result is more accurate.

Description

Enterprise risk prediction method, model training method, device and equipment
Technical Field
The present disclosure relates to the field of risk identification technologies, and in particular to an enterprise risk prediction method, a model training method, a device, and computer equipment.
Background
Identifying enterprise risks, such as whether an enterprise is at risk of fraud, money laundering, or breach of contract, is highly important. Traditional enterprise risk assessment generally analyzes each enterprise entity independently, based on its basic business and industry attributes, operating situation, cash flow, and the like. However, the information available to an independent analysis is limited, and the relationships between enterprises are complex: various equity structures may exist between enterprises, forming a huge relational network that includes enterprises and their shareholders, where a shareholder may be an enterprise or a natural person. The risk condition of each enterprise and shareholder in the network can affect the risk condition of a given enterprise. It is therefore critical to make full use of the relational network formed between enterprises, so that large-scale enterprise risk mining can be performed at the lowest possible labor cost and enterprise risk can be predicted more accurately.
Disclosure of Invention
In view of this, the specification provides an enterprise risk prediction method, a model training method, a device, and computer equipment.
According to a first aspect of embodiments herein, there is provided a method for enterprise risk prediction, the method comprising:
Extracting a sub-network graph associated with a target enterprise node to be predicted from a relational network graph of enterprises, and constructing a node set from the nodes in the sub-network graph, wherein the relational network graph is constructed based on an equity structure;
Calculating the characteristic attribute values of the node set and the characterization vector of the node set based on the characteristic attribute values of the nodes in the node set, the characterization vectors of the nodes, and the degree of correlation between each node and the target enterprise node;
And inputting the characteristic attribute values and the characterization vector of the node set into a pre-trained enterprise risk prediction model to predict the risk of the target enterprise node, wherein the enterprise risk prediction model is obtained by training based on the characteristic attributes, the characterization vectors, and the risk labels of the nodes in the relational network graph.
According to a second aspect of embodiments herein, there is provided a model training method, the method comprising:
Extracting a sub-network graph associated with a target node from a relational network graph of enterprises, and taking the node set in the sub-network graph as a sample, wherein the relational network graph is constructed based on an equity structure;
Calculating the characteristic attribute value of the sample and the characterization vector of the sample based on the characteristic attribute value of each node in the node set, the characterization vector of each node, and the degree of correlation between each node and the target node;
And taking the characteristic attribute value and the characterization vector of the sample as input, taking the risk label of the target node as output, and training a preset reference model until a preset loss function is converged to obtain an enterprise risk prediction model.
According to a third aspect of embodiments herein, there is provided a model training apparatus, the apparatus comprising:
A sample construction module, configured to extract a sub-network graph associated with a target node from a relational network graph of enterprises and take the node set in the sub-network graph as a sample, wherein the relational network graph is constructed based on an equity structure;
A calculation module, configured to calculate the characteristic attribute value of the sample and the characterization vector of the sample based on the characteristic attribute value of each node in the node set, the characterization vector of each node, and the degree of correlation between each node and the target node;
And the training module is used for training a preset reference model by taking the characteristic attribute value and the characterization vector of the sample as input and the risk label of the target node as output until a preset loss function is converged to obtain an enterprise risk prediction model.
According to a fourth aspect of embodiments herein, there is provided an enterprise risk prediction device, the device comprising:
A construction module, configured to extract a sub-network graph associated with a target enterprise node to be predicted from a relational network graph of enterprises and construct a node set from the nodes in the sub-network graph, wherein the relational network graph is constructed based on an equity structure;
A calculation module, configured to calculate the characteristic attribute values of the node set and the characterization vector of the node set based on the characteristic attribute values of the nodes in the node set, the characterization vectors of the nodes, and the degree of correlation between each node and the target enterprise node;
And a prediction module, configured to input the characteristic attribute values and the characterization vector of the node set into a pre-trained enterprise risk prediction model and predict the risk of the target enterprise node, wherein the enterprise risk prediction model is obtained by training based on the characteristic attributes, the characterization vectors, and the risk labels of the nodes in the relational network graph.
According to a fifth aspect of embodiments herein, there is provided a computer apparatus comprising a memory, a processor, and a computer program stored on the memory and executable on the processor, the processor implementing the method of any of the embodiments when executing the program.
By applying the scheme of the embodiments of this specification, a relational network graph of enterprises is constructed based on the equity structure, and multiple types of sub-network graphs are extracted based on the relationships between enterprises and shareholders. A sample is constructed from the node set of a sub-network graph, and the characteristic attribute values and characterization vector of the sample are calculated from the characteristic attribute values and characterization vectors of the nodes in the sample. A model is trained with the characteristic attribute values and characterization vectors of the samples to obtain an enterprise risk prediction model, which is then used to predict enterprise risk. This reduces the computational complexity of the model, and because both the node attributes and the relational network structure are considered during prediction, the prediction result is more accurate.
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the specification.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the present specification and, together with the description, serve to explain the principles of the specification.
Fig. 1 is a flowchart of a model training method according to an embodiment of the present disclosure.
Fig. 2A is a relational network graph of enterprises according to an embodiment of the present disclosure.
Figs. 2B-2D are sub-network graphs extracted from the enterprise relational network graph, according to an embodiment of the present disclosure.
Fig. 3 is a flowchart of an enterprise risk prediction method according to an embodiment of the present disclosure.
Fig. 4 is a block diagram of the logical structure of a model training apparatus according to an embodiment of the present disclosure.
Fig. 5 is a block diagram of the logical structure of an enterprise risk prediction device according to an embodiment of the present disclosure.
Fig. 6 is a schematic block diagram of a computer device for implementing the methods of the present description, according to an embodiment of the present description.
Detailed Description
Reference will now be made in detail to the exemplary embodiments, examples of which are illustrated in the accompanying drawings. When the following description refers to the accompanying drawings, like numbers in different drawings represent the same or similar elements unless otherwise indicated. The embodiments described in the following exemplary embodiments do not represent all embodiments consistent with the present specification. Rather, they are merely examples of apparatus and methods consistent with certain aspects of the specification, as detailed in the appended claims.
The terminology used in the description herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the description. As used in this specification and the appended claims, the singular forms "a", "an", and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. It should also be understood that the term "and/or" as used herein refers to and encompasses any and all possible combinations of one or more of the associated listed items.
It should be understood that although the terms first, second, third, etc. may be used herein to describe various information, the information should not be limited by these terms. These terms are only used to distinguish one type of information from another. For example, the first information may also be referred to as second information, and similarly, the second information may also be referred to as first information, without departing from the scope of the present specification. The word "if" as used herein may be interpreted as "when", "upon", or "in response to a determination", depending on the context.
Identifying enterprise risks, such as whether an enterprise is at risk of fraud, money laundering, or breach of contract, is highly important. The relationships between enterprises are complex, and various equity structures may exist between enterprises, forming a huge relational network that includes enterprises and shareholders, where a shareholder may be an enterprise or a natural person. The risk condition of each enterprise and shareholder can affect the risk condition of a given enterprise. It is therefore critical to make full use of the relational network formed between enterprises for risk prediction.
When risk prediction is performed using such a relational network, some technologies calculate the characterization vectors of all nodes in the relational network graph directly with models such as DeepWalk or Node2vec, and then classify the characterization vectors by clustering, label propagation, or similar methods to determine the risk of each node. This approach considers only the relational structure between the nodes in the relational network graph and ignores the characteristic attributes of the nodes, so the prediction result is inaccurate. Other technologies use a graph neural network: in an iterative process, each node in the relational network takes into account the information of its neighbor nodes, that is, associated entities, propagated through the network, and an output score is obtained through a feed-forward neural network; after the model converges, the network output value of each node is the final result. When the network relationships are complex and the number of nodes is large, this iterative computation is complicated and the amount of computation is excessive. In addition, graph neural network models are weakly interpretable, so the final prediction result cannot be explained.
To consider both the structural relationships among the nodes in the relational network graph and the characteristic attributes of the nodes when predicting enterprise risk, and thereby improve prediction accuracy, reduce the model's computation, and improve its interpretability and efficiency, an embodiment of this specification provides a method for training an enterprise risk prediction model. As shown in Fig. 1, the method may include:
S102, extracting a sub-network graph associated with a target node from a relational network graph of enterprises, and taking the node set in the sub-network graph as a sample, wherein the relational network graph is constructed based on an equity structure;
S104, calculating the characteristic attribute value of the sample and the characterization vector of the sample based on the characteristic attribute value of each node in the node set, the characterization vector of each node, and the degree of correlation between each node and the target node;
S106, taking the characteristic attribute value and the characterization vector of the sample as input and the risk label of the target node as output, and training a preset reference model until a preset loss function converges, to obtain an enterprise risk prediction model.
Among the relationships between enterprises, the equity relationship is particularly important. The equity structure of an enterprise reflects the associations between enterprises, between enterprises and shareholders, and between shareholders, and it is also an important factor influencing enterprise risk. Therefore, when training the enterprise risk prediction model, an enterprise relational network graph can be constructed based on equity relationships. The nodes of the relational network graph comprise various entities, such as enterprises and shareholders, where a shareholder may be an enterprise or a natural person, and each node in the graph is assigned a unique ID. The relational network graph may be a directed graph in which each enterprise points to its shareholders, and each edge represents the degree of correlation between the enterprise and the shareholder. In some embodiments, the degree of correlation may be expressed by the share ratio, or by the value of the enterprise's shares held by the shareholder, which is not limited in this application.
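The graph construction described above can be sketched as follows. This is a minimal illustration, not the patent's implementation: the function name `build_equity_graph`, the node names, and the share ratios are all hypothetical, and the share ratio is assumed to be the edge weight.

```python
from collections import defaultdict


def build_equity_graph(holdings):
    """Build a directed graph: enterprise -> {shareholder: share ratio}.

    `holdings` is an iterable of (enterprise, shareholder, ratio) tuples.
    The share ratio is used as the degree of correlation on each edge.
    """
    graph = defaultdict(dict)
    for enterprise, shareholder, ratio in holdings:
        if not 0.0 < ratio <= 1.0:
            raise ValueError(f"share ratio out of range: {ratio}")
        graph[enterprise][shareholder] = ratio
        # Make sure shareholders with no outgoing edge are still nodes.
        graph.setdefault(shareholder, {})
    return dict(graph)


# Hypothetical holdings, for illustration only.
holdings = [
    ("enterprise_A", "enterprise_B", 0.10),
    ("enterprise_A", "enterprise_C", 0.20),
    ("enterprise_A", "enterprise_D", 0.30),
    ("enterprise_A", "person_a", 0.15),
    ("enterprise_A", "person_b", 0.25),
    ("enterprise_E", "person_a", 0.40),
]
graph = build_equity_graph(holdings)

# Assign each node a unique integer ID, as the description requires.
node_ids = {name: idx for idx, name in enumerate(sorted(graph))}
```

A nested dictionary keeps the example dependency-free; a graph library could be substituted without changing the idea.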
The risk label is a label describing the risk of an enterprise; it may also be a risk level of the enterprise, such as high risk, medium risk, low risk, or no risk, where each level corresponds to a range of risk coefficients. The embodiments of this specification can obtain a risk prediction model through semi-supervised training on a relational network graph that contains only a small number of risk labels. Fig. 2A shows a relational network graph in an embodiment of this specification, in which each enterprise points to its shareholders, nodes represent enterprises or natural persons, and edges represent share ratios.
In the relational network graph, the nodes that influence the risk of a given enterprise are generally those that have some association with it. Nodes that are neither directly nor indirectly connected to the enterprise, or that lie far from it in the graph, generally have little influence on it; considering them would increase the amount of computation while contributing little to the accuracy of risk prediction. Therefore, the embodiments of this specification extract sub-network graphs associated with the target node based on the relationships between the nodes in the relational network graph, form a node set from the nodes in each sub-network graph, and use the node set as a sample. In some embodiments, the sub-network graph may be formed by the neighboring nodes of the target node. The neighboring nodes of the target node are the shareholders of the enterprise, that is, the nodes connected to the target node by an edge, and they represent a first-order association. As shown in Fig. 2B, the neighboring nodes of enterprise A may be extracted from the relational network graph shown in Fig. 2A to form a sub-network graph. The nodes in the sub-network graph then form a node set that is used as a sample, and each node in the node set is adjacent to the target node. For example, the node set consisting of enterprise B, enterprise C, enterprise D, natural person a, and natural person b in the sub-network graph may be used as a sample.
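A first-order sample of this kind might be extracted as in the sketch below. The graph layout (enterprise mapped to its shareholders and share ratios), the function name `neighbor_sample`, and all values are assumptions made for illustration.

```python
def neighbor_sample(graph, target):
    """Return the first-order sample: the target's direct shareholders.

    `graph` maps each enterprise to {shareholder: share ratio}; the result
    keeps the share ratio so later steps can use it as the correlation.
    """
    return dict(graph.get(target, {}))


# Hypothetical graph: enterprise A and its five shareholders.
graph = {
    "enterprise_A": {
        "enterprise_B": 0.10,
        "enterprise_C": 0.20,
        "enterprise_D": 0.30,
        "person_a": 0.15,
        "person_b": 0.25,
    },
    "enterprise_D": {"enterprise_E": 0.60},
}

sample = neighbor_sample(graph, "enterprise_A")
```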
In some embodiments, the sub-network graph may also be obtained by a random walk in the relational network graph that starts from the target node: the walk generates a path, and the nodes on that path form the sub-network graph. The nodes in this sub-network graph have a direct or indirect equity relationship with the target node and represent a multi-level association. As shown in Fig. 2C, a random walk may be started from enterprise A in the relational network graph shown in Fig. 2A to obtain a walk path, and the nodes on the path form a sub-network graph. For example, if the walk path is "enterprise A - enterprise D - enterprise E - enterprise I", the set of nodes on the path can be used as a sample.
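The random-walk extraction could look roughly like the following sketch. It assumes the walk follows enterprise-to-shareholder edges with uniform probability and stops early at a node with no shareholders; the source does not specify these details, and the function name, walk length, and graph are hypothetical.

```python
import random


def random_walk_sample(graph, target, walk_length, seed=None):
    """Walk from `target` along enterprise -> shareholder edges.

    Assumption: the next node is chosen uniformly among the current node's
    shareholders, and the walk stops early when there are none.
    """
    rng = random.Random(seed)
    path = [target]
    current = target
    for _ in range(walk_length):
        successors = list(graph.get(current, {}))
        if not successors:
            break
        current = rng.choice(successors)
        path.append(current)
    return path


# Hypothetical graph containing the path A - D - E - I from the example.
graph = {
    "enterprise_A": {"enterprise_B": 0.10, "enterprise_D": 0.30, "person_a": 0.15},
    "enterprise_D": {"enterprise_E": 0.60},
    "enterprise_E": {"enterprise_I": 0.50, "person_a": 0.40},
}

path = random_walk_sample(graph, "enterprise_A", walk_length=3, seed=0)
```

Passing a seed makes a walk reproducible, which helps when the same samples must be rebuilt for training and for prediction.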
In some embodiments, the sub-network graph may be formed by the nodes in the relational network graph that share a connected node with the target node. Such nodes are enterprises that have a shareholder in common with the target node, and they represent a second-order association. As shown in Fig. 2D, the nodes that share a node with enterprise A, that is, the enterprises that have a shareholder in common with enterprise A, may be extracted from the relational network graph shown in Fig. 2A to form a sub-network graph. For example, enterprise E, enterprise F, and enterprise G, which share natural person a as a shareholder with enterprise A, may form the node set used as the sample.
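As an illustration of the second-order case, the sketch below collects every enterprise that shares at least one shareholder with the target. The function name `common_shareholder_sample` and the graph contents are hypothetical.

```python
def common_shareholder_sample(graph, target):
    """Return enterprises that have at least one shareholder in common
    with `target` (a second-order association)."""
    target_holders = set(graph.get(target, {}))
    return sorted(
        node
        for node, holders in graph.items()
        if node != target and target_holders & set(holders)
    )


# Hypothetical graph: E, F and G share natural person a with enterprise A.
graph = {
    "enterprise_A": {"person_a": 0.15, "enterprise_D": 0.30},
    "enterprise_E": {"person_a": 0.40},
    "enterprise_F": {"person_a": 0.05},
    "enterprise_G": {"person_a": 0.70},
    "enterprise_H": {"person_c": 0.90},
}

sample = common_shareholder_sample(graph, "enterprise_A")
```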
It should be noted that the above are only three exemplary sub-network graphs extracted according to the relationships between enterprises and shareholders. The sub-network graph may also be formed by third-order or fourth-order associated nodes, which is not limited in the embodiments of this specification. When extracting sub-network graphs, one or more types may be extracted at a time to construct node set samples. The more types of sub-network graphs are extracted, the more influencing factors are covered during training, and the more accurate the trained model becomes.
After the sample formed by the node set is obtained, one or more characteristic attributes of each node in the node set can be extracted, and the value of each characteristic attribute obtained. The characteristic attributes may be attributes that influence enterprise risk, and they may differ depending on whether the node is an enterprise or a natural person. For example, in some scenarios, the characteristic attributes of an enterprise node may be the enterprise's registration time and registered capital, and the characteristic attributes of a natural person node may be the person's assets, such as equity assets. After the characteristic attribute value of each characteristic attribute of each node in the sample is obtained, the characteristic attribute value of the sample can be calculated according to the degree of correlation between each node and the target node. In some embodiments, the calculation proceeds as follows: the characteristic attribute value of each node in the node set is obtained first; a statistical value of the characteristic attribute over the node set is then calculated from the degree of correlation between each node and the target node and the characteristic attribute value of each node; and this statistical value is used as the characteristic attribute value of the sample.
For example, assume that a sub-network graph of the target node V0 is extracted from the relational network, that the sample formed by the node set of the sub-network graph is X = {V1, V2, V3, V4, V5}, and that the share ratios of the nodes are 10%, 20%, 30%, 15%, and 25%, respectively. If one of the characteristic attributes of the nodes is a1, with corresponding characteristic attribute values a1_V1, a1_V2, a1_V3, a1_V4, and a1_V5, then the value of characteristic attribute a1 for the sample can be calculated as a1_X = a1_V1 × 10% + a1_V2 × 20% + a1_V3 × 30% + a1_V4 × 15% + a1_V5 × 25%. The values of the other characteristic attributes can be obtained in the same way.
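The weighted statistic in this example can be written as a short function. The share ratios come from the example above; the function name `sample_attribute_value` and the attribute values (standing in for something like registered capital) are made up for illustration.

```python
def sample_attribute_value(values, ratios):
    """Share-ratio-weighted sum of one characteristic attribute over a sample.

    `values` maps node -> attribute value; `ratios` maps node -> share ratio
    (the degree of correlation with the target node).
    """
    if values.keys() != ratios.keys():
        raise ValueError("values and ratios must cover the same nodes")
    return sum(values[node] * ratios[node] for node in values)


# Share ratios from the worked example in the text.
ratios = {"V1": 0.10, "V2": 0.20, "V3": 0.30, "V4": 0.15, "V5": 0.25}
# Hypothetical values of attribute a1 for each node.
a1_values = {"V1": 100.0, "V2": 200.0, "V3": 300.0, "V4": 400.0, "V5": 500.0}

a1_x = sample_attribute_value(a1_values, ratios)
```

Repeating the call once per characteristic attribute yields the full attribute vector of the sample.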
The characteristic attribute value of the sample captures only the influence of the node attributes in the relational network on enterprise risk. To predict enterprise risk more accurately, the influence of the relational structure between the nodes in the relational network graph can also be considered. Because nodes themselves cannot take part in mathematical operations, the nodes can first be vectorized: the characterization vector of each node in the node set is calculated, and the characterization vector of the node set sample is then calculated from the characterization vector of each node and the degree of correlation between each node and the target node. In some embodiments, a network embedding learning algorithm may be used to calculate the characterization vector of each node in the node set; for example, a DeepWalk model or a Node2vec model may be used.
In some embodiments, the characterization vector of the sample is calculated from the characterization vector of each node in the node set and the degree of correlation between each node and the target node as follows: an average characterization vector and a maximum characterization vector of the node set are calculated first, and the two are then concatenated to obtain the characterization vector of the sample. For example, let the node set in the sub-network graph of the target node V0 be X = {V1, V2, V3, V4, V5}, with characterization vectors e1, e2, e3, e4, and e5 and share ratios 10%, 20%, 30%, 15%, and 25%. The average characterization vector is 0.1e1 + 0.2e2 + 0.3e3 + 0.15e4 + 0.25e5, and the maximum characterization vector is the maximum over the weighted vectors, that is, Max{0.1e1, 0.2e2, 0.3e3, 0.15e4, 0.25e5}. The average characterization vector and the maximum characterization vector are then concatenated to form the characterization vector of the sample.
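A minimal sketch of this pooling step follows. It assumes that Max{...} means an element-wise maximum over the share-weighted vectors, which the source does not state explicitly; the function name `sample_vector` and the two-dimensional vectors are hypothetical.

```python
def sample_vector(vectors, ratios):
    """Concatenate the weighted-average and weighted-max pooled vectors.

    Each node vector is first scaled by its share ratio. The "average" is the
    sum of the scaled vectors, as in the text's example; the "maximum" is
    taken element-wise (an assumption about the meaning of Max{...}).
    """
    weighted = [[ratios[n] * x for x in vectors[n]] for n in vectors]
    mean_vec = [sum(column) for column in zip(*weighted)]
    max_vec = [max(column) for column in zip(*weighted)]
    return mean_vec + max_vec


ratios = {"V1": 0.10, "V2": 0.20, "V3": 0.30, "V4": 0.15, "V5": 0.25}
# Hypothetical 2-dimensional node embeddings e1..e5.
vectors = {
    "V1": [1.0, -2.0],
    "V2": [0.5, 0.5],
    "V3": [-1.0, 1.0],
    "V4": [2.0, 0.0],
    "V5": [0.0, 4.0],
}

x_vec = sample_vector(vectors, ratios)  # length is twice the embedding size
```

In practice the node vectors would come from an embedding model such as DeepWalk or Node2vec rather than being written by hand.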
After the characteristic attribute value and the characterization vector of the sample are calculated, they can be used as the input of the model, with the label of the target node as the output, and the preset reference model is trained until the preset loss function converges, yielding the trained enterprise risk prediction model. In some embodiments, the reference model may be a Wide & Deep model. The Wide & Deep model combines a Wide model and a Deep model, and has both the memorization capability of the Wide model and the generalization capability of the Deep model. Memorization refers to discovering correlations between entities or features from historical data, and generalization refers to propagating those correlations to discover new feature combinations that rarely or never appear in the historical data. The Wide model is a linear model, and the Deep model is a deep neural network model. The embodiments of this specification adopt the Wide & Deep model as the reference model: the characteristic attributes of the nodes in the relational network are input to the Wide model and the characterization vectors to the Deep model, so that the influence of both the node attributes and the relational structure of the nodes on enterprise risk is considered. Assuming that the target node is V0 and that the sample formed by the node set in the sub-network graph of the target node is x, x = {V1, V2, V3, V4, V5, …}, the prediction model function is as follows:
where f_v0(x) is the risk label of the target node, u(x) is the characteristic attribute value of the sample, g_v(x) is the characterization vector of the sample, and σ and the other symbols in the formula are model parameters.
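Since the formula itself is not reproduced here, the sketch below shows only the general shape of a Wide & Deep forward pass, under stated assumptions: the wide part is a linear function of the sample's characteristic attribute values u(x), the deep part is a one-hidden-layer network over the characterization vector g_v(x), and the two logits are summed and passed through a sigmoid. The parameter names, layer sizes, and numeric values are hypothetical and are not taken from the patent.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def dense(weights, bias, inputs):
    """One fully connected layer: weights is a list of rows."""
    return [sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(weights, bias)]


def wide_deep_predict(u, g, params):
    """Assumed form: sigmoid(wide_logit + deep_logit + bias)."""
    # Wide part: linear model over the characteristic attribute values.
    wide_logit = sum(w * x for w, x in zip(params["w_wide"], u))
    # Deep part: one ReLU hidden layer over the characterization vector.
    hidden = [max(0.0, h) for h in dense(params["W_hidden"], params["b_hidden"], g)]
    deep_logit = sum(w * h for w, h in zip(params["w_deep"], hidden))
    return sigmoid(wide_logit + deep_logit + params["bias"])


# Hypothetical parameters: 2 attributes, 4-dim characterization vector.
params = {
    "w_wide": [0.2, -0.1],
    "W_hidden": [[0.5, -0.5, 0.1, 0.0], [0.3, 0.2, -0.4, 0.1]],
    "b_hidden": [0.0, 0.1],
    "w_deep": [0.7, -0.3],
    "bias": -0.2,
}
u = [0.5, 1.0]            # characteristic attribute values of the sample
g = [0.2, 1.2, 0.3, 1.0]  # characterization vector of the sample

risk_score = wide_deep_predict(u, g, params)
```

Because the wide logit is a plain weighted sum, each attribute's contribution to the score can be read directly from its weight, which is one reason such a model is easier to explain than a graph neural network.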
In addition, in order to verify whether the trained model reaches the standard, a preset loss function can be adopted to verify the trained model. The principle of the construction of the loss function is that the risk label results of two similar nodes in the relationship network should be relatively close. For example, the following loss function may be constructed:
where x_i is a labeled sample and x_j is an unlabeled sample that is randomly sampled during training;
A_i,j denotes the degree of correlation between the nodes corresponding to x_i and x_j.
The left half part of the formula can be adopted for calculating samples with labels, and the right half part of the formula can be adopted for calculating samples without labels. And when the loss function approaches convergence, the enterprise risk prediction model is considered to be trained completely.
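A loss built on this principle could be sketched as follows. The sketch assumes a cross-entropy term over labeled samples plus a smoothness penalty A_i,j · (f(x_i) − f(x_j))² over sampled pairs; the exact terms, the weighting factor `lam` and the function name are assumptions for illustration, not the formula of the specification.

```python
import math

def semi_supervised_loss(labeled, pairs, lam=1.0):
    """Assumed two-part loss.

    labeled: list of (f(x_i), y_i) with y_i in {0, 1}
    pairs:   list of (f(x_i), f(x_j), A_ij), x_j being an unlabeled sample
    lam:     hypothetical weight of the smoothness penalty
    """
    eps = 1e-12  # guards against log(0)
    supervised = -sum(
        y * math.log(p + eps) + (1 - y) * math.log(1 - p + eps)
        for p, y in labeled
    )
    # Strongly correlated nodes are pushed towards similar predictions.
    smoothness = sum(a * (pi - pj) ** 2 for pi, pj, a in pairs)
    return supervised + lam * smoothness

labeled = [(0.9, 1), (0.2, 0)]
close = semi_supervised_loss(labeled, [(0.9, 0.85, 0.8)])
far = semi_supervised_loss(labeled, [(0.9, 0.10, 0.8)])
```

With the same labeled samples, the pair whose predictions disagree (`far`) yields the larger loss, which is the behaviour the construction principle asks for.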
It should be noted that, when constructing the samples, different types of sub-network graphs of the target node may be extracted to construct the samples. When training the model, the samples of each type may then be trained separately to obtain a plurality of prediction models; alternatively, the samples of different types may be trained with a single model to obtain one prediction model. The choice may be made according to actual requirements.
In this method, the Wide & Deep model is selected, and the node set of the sub-network graph associated with the target node is used as a sample. The characteristic attribute value and the characterization vector of the sample are obtained from the characteristic attribute values and the characterization vectors of the nodes in the node set and are then input into the Wide & Deep model to train it. Both the characteristic attributes of the nodes and the association structure between the nodes are thus comprehensively considered. Because the adopted model is essentially a linear model, the computational complexity of the model is reduced and its interpretability is improved.
The embodiment of the specification further provides an enterprise risk prediction method, which is used for predicting the risk of an enterprise through an enterprise relationship network diagram based on an equity structure and a preset enterprise risk prediction model. Specifically, as shown in fig. 3, the method may include the following steps:
S302, extracting a sub-network graph associated with a target enterprise node to be predicted from a relational network graph of an enterprise, and acquiring the nodes in the sub-network graph to construct a node set, wherein the relational network graph is constructed based on an equity structure;
S304, calculating the characteristic attribute values of the node set and the characterization vector of the node set based on the characteristic attribute values of the nodes in the node set, the characterization vectors of the nodes, and the correlation degree between each node and the target enterprise node;
S306, inputting the characteristic attribute values and the characterization vector of the node set into a pre-trained enterprise risk prediction model to predict the risk of the target enterprise node, wherein the enterprise risk prediction model is obtained by training based on the characteristic attributes, the characterization vectors and the risk labels of the nodes in the relational network graph.
The equity structure of an enterprise reflects well the associations between enterprises, between enterprises and shareholders, and between shareholders, and the equity relationship is also an important factor influencing enterprise risk. In order to predict the risk of the enterprise more accurately and efficiently, the embodiments of this specification predict the risk of the enterprise through a relationship network graph based on the equity structure and a pre-trained enterprise risk prediction model. The enterprise relational network graph can be a directed graph constructed based on the equity structure, with edges pointing from an enterprise to its shareholders; the nodes in the graph represent enterprises or natural persons, and the edges connecting the nodes represent the degree of correlation between the nodes, namely the relationship between an enterprise and its shareholder. In some embodiments, the degree of correlation may be expressed by the share ratio, or by the equity assets that the shareholder holds in the enterprise, which is not limited in this application. The enterprise risk prediction model in the embodiments of this specification may be obtained by training based on the feature attributes, the characterization vectors and the risk labels of the nodes in the enterprise relationship network graph, and is used to characterize the association between the characterization vectors and feature attributes of a part of the nodes in the relationship network graph and the risk label of a given enterprise node. In some embodiments, the risk prediction model may be a model obtained by the model training method described above; its training process is consistent with the training process described above and is not repeated here.
When the risk of a target enterprise node in the relational network graph needs to be predicted, the sub-network graph associated with the target enterprise node can be extracted first, and the nodes in the sub-network graph are then extracted to form a node set. In some embodiments, the sub-network graph may be formed of the nodes adjacent to the target enterprise node. These nodes are the shareholders of the enterprise, that is, the nodes having connecting edges with the target enterprise node, and they represent a first-order association. In some embodiments, the sub-network graph may also be obtained by a random walk in the relationship network graph starting from the target enterprise node: a random walk path is generated, and the nodes on the path form the sub-network graph. The nodes in this sub-network graph have a direct or indirect equity relationship with the target enterprise node and embody a multi-level association. In some embodiments, the sub-network graph may be formed of nodes extracted from the relational network graph that have common connection nodes with the target enterprise node. These nodes represent enterprises that have common shareholders with the target enterprise, and they represent a second-order association. Because different types of sub-network graphs capture nodes associated with the target enterprise node at different levels and in different dimensions, multiple types of sub-network graphs may be used together when predicting the risk of the target enterprise node.
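The three kinds of sub-network graph described above can be sketched in a few lines. The example assumes the graph is stored as a dictionary mapping each enterprise to its shareholders and share ratios; the toy graph `holders`, the function names and the `max_steps` limit are hypothetical and only serve the illustration.

```python
import random

# Hypothetical equity graph: enterprise -> {shareholder: share ratio}.
holders = {
    "E1": {"P1": 0.6, "E2": 0.4},
    "E2": {"P2": 1.0},
    "E3": {"P1": 0.5, "P3": 0.5},
}

def neighbor_subgraph(graph, target):
    """First-order association: the direct shareholders of the target."""
    return set(graph.get(target, {}))

def random_walk_subgraph(graph, target, max_steps=5, rng=random):
    """Multi-level association: nodes on one random holding path."""
    path, current = [], target
    for _ in range(max_steps):       # bound the walk in case of cross-holdings
        nxt = graph.get(current)
        if not nxt:                  # node without recorded shareholders: stop
            break
        current = rng.choice(sorted(nxt))
        path.append(current)
    return path

def sibling_subgraph(graph, target):
    """Second-order association: enterprises sharing a shareholder."""
    mine = set(graph.get(target, {}))
    return {e for e, hs in graph.items() if e != target and mine & set(hs)}

neighbors = neighbor_subgraph(holders, "E1")
walk = random_walk_subgraph(holders, "E1", rng=random.Random(0))
siblings = sibling_subgraph(holders, "E1")
```

Here the walk chooses the next shareholder uniformly; weighting the choice by share ratio would be an equally plausible variant.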
After the node set is obtained, one or more feature attributes of each node in the node set can be extracted, and feature attribute values of the feature attributes are obtained, wherein the feature attributes can be feature attributes having influence on enterprise risk. For a node being an enterprise or a natural person, the corresponding characteristic attributes may be different. For example, in some scenarios, the characteristic attribute may be enterprise registration time, registration capital for an enterprise node, and the characteristic attribute may be a property of a natural person, i.e., an equity property, for a natural person node. After the characteristic attribute value corresponding to each characteristic attribute of each node in the sample is obtained, the characteristic attribute value of the sample can be obtained by calculation according to the correlation degree between each node and the target enterprise node. In some embodiments, the process of calculating the characteristic attribute values for the set of nodes is as follows: the characteristic attribute value of each node in the node set can be obtained first, then the statistical value of the characteristic attribute of the node set is calculated based on the correlation degree of each node and the target enterprise node and the characteristic attribute value of each node, and then the statistical value is used as the characteristic attribute value of the node set.
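One way to turn per-node attribute values into a statistic of the node set is sketched below. The choice of a correlation-weighted mean and a weighted maximum, as well as the function name, are assumptions; the text only requires some statistic based on the attribute values and the correlation degrees.

```python
def weighted_feature_stats(values, weights):
    """values: attribute value per node; weights: correlation degree per node."""
    total = sum(weights)
    if total == 0:
        raise ValueError("weights must not sum to zero")
    products = [v * w for v, w in zip(values, weights)]
    return {
        "weighted_mean": sum(products) / total,
        "weighted_max": max(products),
    }

# Illustrative registered capital of three shareholder nodes and their share ratios.
stats = weighted_feature_stats([100.0, 20.0, 50.0], [0.5, 0.3, 0.2])
```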
Calculating the characteristic attribute value of the node set considers only the influence of the characteristic attributes of the nodes in the relational network on enterprise risk. To predict enterprise risk more accurately, the influence of the relational structure between the nodes in the relational network graph can also be considered. Because mathematical operations cannot be performed on the nodes directly, the nodes can first be vectorized: the characterization vector of each node in the node set is calculated, and the characterization vector of the node set sample is then calculated from the characterization vectors of the nodes and the correlation degree between each node and the target enterprise node. In some embodiments, a network embedding learning algorithm may be used to calculate the characterization vector of each node in the node set; for example, a DeepWalk model or a Node2vec model may be used.
In some embodiments, the characterization vector of the node set is calculated from the characterization vector of each node in the node set and the correlation degree between each node and the target enterprise node as follows: first, an average characterization vector and a maximum characterization vector of the node set are calculated based on the characterization vector of each node and the correlation degree between each node and the target enterprise node; the average characterization vector and the maximum characterization vector are then concatenated to obtain the characterization vector of the node set.
After the feature attributes and the characterization vector of the node set are calculated, they can be input into a pre-trained enterprise risk prediction model to predict the risk of the target enterprise node. In some embodiments, the reference model may have been trained separately on samples formed by the node sets extracted from different types of sub-network graphs, yielding a plurality of models. For example, one prediction model is trained on node set samples extracted from sub-network graphs formed by adjacent nodes, one on node set samples extracted from sub-network graphs obtained by random walk, and one on node set samples extracted from sub-network graphs formed by nodes with common connection nodes, giving three prediction models in total. At prediction time, the node sets formed by the same types of sub-network graphs of the target enterprise node to be predicted are extracted, the characteristic attributes and characterization vectors of the node sets are calculated and substituted into the corresponding prediction models, and the predicted values of the three models are averaged to obtain the final result. Alternatively, the samples extracted from the three types of sub-network graphs can be used to train a single model, and the three types of node sets of the target enterprise node are then input into that model for prediction.
In the embodiment of the description, a relationship network diagram of an enterprise is constructed based on a share right structure, then a plurality of types of sub-network diagrams are extracted based on the relationship between the enterprise and shareholders, a node set in a sub-network is used for constructing a sample, the characteristic attribute and the characteristic vector of the sample are calculated based on the characteristic attribute and the characteristic vector of each node of the sample, the model is trained by using the characteristic attribute and the characteristic vector of the sample, an enterprise risk prediction model is obtained, and the trained model is a linear model, so that not only is the calculation complexity reduced, but also the attribute and the relationship network structure of the node are comprehensively considered during prediction, and the prediction result is more accurate.
To further explain the enterprise risk prediction method of the embodiments of the present disclosure, a specific embodiment is explained below.
To use the relationship network graph of an enterprise to predict enterprise risk more accurately while reducing the amount of calculation and improving prediction efficiency, one example of this specification provides an enterprise risk prediction method, in which an enterprise risk prediction model is trained based on the relationship network graph and enterprise risk is predicted by the model. The specific process is as follows:
Training of enterprise risk prediction model
1. Acquiring the equity structure of an enterprise and constructing a relationship network diagram
The equity structure may include three elements: enterprise, shareholder and share ratio. Each enterprise has at least one shareholder, which may be a natural person, a government organization or another enterprise. A directed relational network graph is constructed based on the equity structure. The nodes of the graph comprise various entities such as enterprises and natural persons, and each node V is assigned a unique node_id. The edges of the graph point from enterprises to shareholders and represent the share ratio held by the shareholder.
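A minimal sketch of this construction step, assuming the equity structure is available as (enterprise, shareholder, share ratio) records; the record values and the function name `build_equity_graph` are hypothetical.

```python
from collections import defaultdict

# Hypothetical (enterprise, shareholder, share ratio) records.
records = [
    ("E1", "P1", 0.6),
    ("E1", "E2", 0.4),
    ("E2", "P2", 1.0),
]

def build_equity_graph(rows):
    """Return (node_ids, edges) with edges[enterprise][shareholder] = ratio."""
    node_ids, edges = {}, defaultdict(dict)
    for enterprise, shareholder, ratio in rows:
        for name in (enterprise, shareholder):
            node_ids.setdefault(name, len(node_ids))  # assign a unique node_id
        edges[enterprise][shareholder] = ratio         # edge: enterprise -> shareholder
    return node_ids, dict(edges)

node_ids, edges = build_equity_graph(records)
```

Natural persons appear only as edge targets in this representation, so they have an entry in `node_ids` but no outgoing edges.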
2. Extracting sub-network graphs from the constructed relationship network graph
For each enterprise node in the relationship network graph, the sub-network graph associated with the node in the relationship network graph can be divided into three categories:
(1) The neighbor nodes of the current node, namely all of its shareholders, which embody a first-order association.
(2) The shareholding path of the enterprise, namely a path generated by a random walk on the graph starting from the enterprise node, which represents a holding path from the enterprise to some (direct or indirect) shareholder and embodies a multi-level association.
(3) The sibling nodes of the current node, namely other enterprises having the same shareholder as the enterprise, which embody a second-order association.
For each enterprise node in the relational network graph, the above types of sub-network graph can be extracted, and the node set in each sub-network graph is used as a sample. Each sample can be represented by the nodes in the set together with the weight of each node relative to the current node, where the weight can be the (direct or indirect) share ratio. For example, for the current node v_k, a sample drawn from it may be represented as x_k = {(v_k,1, a_k,1), (v_k,2, a_k,2), ...}, where each pair (v_k,m, a_k,m) is a node in the sub-network graph and its weight: v_k,m represents the node and a_k,m represents the share ratio. For each type of sub-network graph, a series of samples S = [(x_1, y_1), (x_2, y_2), ..., (x_k), (x_k+1), ...] can be obtained, where y is the label for the prediction task and may be absent. The sample sets formed by the three types of sub-network graph are denoted S_neighbor, S_path and S_brother, respectively.
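The (node, weight) representation can be illustrated for a holding path. The sketch assumes that the indirect share ratio of a shareholder further along the path is the product of the share ratios on the edges leading to it; this multiplication rule, the toy graph and the function name are assumptions, since the text does not state how indirect ratios are computed.

```python
def path_sample(graph, path_nodes, target):
    """Build [(node, weight), ...] for a holding path starting at target."""
    sample, current, ratio = [], target, 1.0
    for node in path_nodes:
        ratio *= graph[current][node]   # cumulative (indirect) share ratio
        sample.append((node, ratio))
        current = node
    return sample

# Hypothetical graph: E1 is 40% held by E2, which is 50% held by P2.
graph = {"E1": {"E2": 0.4}, "E2": {"P2": 0.5}}
x = path_sample(graph, ["E2", "P2"], "E1")
```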
3. Computing a value of a characteristic attribute of a sample
The characteristic attributes can be factors that influence enterprise risk, such as the maximum registered capital, the average registration time and the shareholder assets of the enterprise. They can be discrete or continuous, and can differ between enterprise nodes and natural person nodes. For each sample x, a feature u(x) = [u_1(x), u_2(x), ..., u_k(x), ..., u_K(x)] is extracted, where u_k(x) = u_k({(v_1, a_1), (v_2, a_2), ...}); that is, the value of each characteristic attribute of the sample is a statistical value of the characteristic attribute values of the nodes in the sample.
4. Computing a characterization vector for a sample
First, each node in the sample is mapped to a characterization vector by a DeepWalk model; the characterization vector of node v_k is denoted e_k. The characterization vector of a sample is represented as:
g_v(x) = g({(v_1, a_1), (v_2, a_2), ..., (v_k, a_k), ...})
       = concat(avg_pooling({a_k e_k}), max_pooling({a_k e_k}))
where avg_pooling refers to the element-wise average of a series of vectors, and
max_pooling refers to the element-wise maximum of a series of vectors.
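The pooling formula can be sketched directly. The node vectors below are made-up values standing in for DeepWalk output, and the function name `sample_embedding` is hypothetical.

```python
def sample_embedding(embeddings, weights):
    """concat(avg_pooling({a_k e_k}), max_pooling({a_k e_k}))."""
    scaled = [[w * x for x in e] for e, w in zip(embeddings, weights)]
    dims = list(zip(*scaled))                 # regroup values by dimension
    avg = [sum(d) / len(d) for d in dims]     # element-wise average
    mx = [max(d) for d in dims]               # element-wise maximum
    return avg + mx                           # concatenation

e = [[1.0, -2.0], [3.0, 4.0]]   # illustrative node vectors e_k
a = [0.5, 0.5]                  # share ratios a_k
g = sample_embedding(e, a)
```

The result has twice the dimension of a single node vector, regardless of how many nodes the sample contains, which lets node sets of different sizes share one model input shape.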
5. Determining a prediction function
The Wide & Deep model can be used as the reference model for training. Assuming a binary classification model that only needs to predict whether or not a risk exists, for a node v_i whose corresponding sample is x, the prediction function is as follows:
where the coefficients in the formula are parameters of the model, u(x) is the characteristic attribute of the sample, and g_v(x) is the characterization vector of the sample.
6. Model training
The three sample sets S_neighbor, S_path and S_brother are modeled and trained separately, and the loss function is
where x_i is a labeled sample and x_j is an unlabeled sample that is randomly sampled during training;
A_i,j denotes some measure of similarity between the nodes corresponding to x_i and x_j.
This is equivalent to adding a penalty term as a constraint, on the premise that the label results of similar nodes should be relatively close.
Predicting enterprise risks by adopting trained enterprise risk prediction model
For each enterprise node to be predicted, the following basic steps are performed:
(1) Extract the sub-network graphs from the relational network graph of the enterprise and, for each type of sub-network graph, extract the node set formed by the nodes in the sub-network graph.
(2) Calculate the characteristic attribute and the characterization vector of the node set from the characteristic attributes and characterization vectors of its nodes.
(3) Substitute them into the prediction function f_v(x) to obtain a prediction result.
(4) Average the prediction results to obtain the final risk prediction value.
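The prediction steps can be summarized in a short sketch that averages the outputs of one model per sub-network type. The stand-in models return fixed scores and the dictionary keys are hypothetical; trained models and real node-set features would take their place.

```python
def ensemble_risk(models, inputs):
    """Average the scores of the per-sub-network-type models.

    models: {kind: callable(u, g) -> score}
    inputs: {kind: (u, g)} computed for the target enterprise node
    """
    scores = [models[kind](*inputs[kind]) for kind in models]
    return sum(scores) / len(scores)

# Stand-in models with fixed outputs; real ones would be trained.
models = {
    "neighbor": lambda u, g: 0.2,
    "path": lambda u, g: 0.4,
    "brother": lambda u, g: 0.9,
}
inputs = {kind: ([1.0], [0.0, 0.0]) for kind in models}
final_risk = ensemble_risk(models, inputs)
```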
The technical features in the above embodiments can be combined arbitrarily as long as the combinations involve no conflict or contradiction. For reasons of space, the combinations are not described one by one; any combination of the technical features in the above embodiments nevertheless falls within the scope disclosed in this specification.
Fig. 4 shows an enterprise risk prediction model training apparatus according to an embodiment of the present disclosure. The apparatus 40 may include:
A sample construction module 41, configured to extract a sub-network graph associated with a target node from a relationship network graph of an enterprise, and use a node set in the sub-network graph as a sample, where the relationship network graph is constructed based on a share right structure;
A calculating module 42, configured to calculate a characteristic attribute value of the sample and a characteristic vector of the sample based on a characteristic attribute value of each node in the node set, a characteristic vector of each node, and a correlation between each node and the target node;
And the training module 43 is configured to train a preset reference model with the characteristic attribute value and the characterization vector of the sample as inputs and the risk label of the target node as an output until a preset loss function converges to obtain an enterprise risk prediction model.
In one embodiment, the reference model is a Wide & Deep model.
In one embodiment, extracting the sub-network graph associated with the target node from the relationship network graph of the enterprise comprises:
Extracting adjacent nodes of the target node from the relational network graph to form the sub-network graph; and/or
Taking the target node as a starting point, randomly walking in the relational network graph to generate a path, wherein nodes passed by the path form the sub-network graph; and/or
Extracting nodes which have common connection nodes with the target node from the relational network graph to form the sub-network graph.
In one embodiment, the degree of correlation is a share ratio.
In one embodiment, the step of calculating the characteristic attribute value of the sample comprises:
Acquiring a characteristic attribute value of each node in the node set;
Calculating a statistical value of the characteristic attribute of each node based on the correlation degree of each node and the target node and the characteristic attribute value;
And taking the statistical value as a characteristic attribute value of the sample.
In one embodiment, the step of calculating the characterization vector of the sample comprises:
Calculating an average characterization vector and a maximum characterization vector of the node set based on the characterization vector of each node in the node set and the correlation degree of each node and the target node;
And connecting the average characterization vector with the maximum characterization vector to obtain the characterization vector of the sample.
In one embodiment, the characterization vector of the node is obtained based on a network-embedded learning model.
Fig. 5 shows an enterprise risk prediction device according to an embodiment of the present disclosure. The device 50 may include:
A construction module 51, configured to extract a sub-network graph associated with a target enterprise node to be predicted from a relationship network graph of an enterprise, and obtain the nodes in the sub-network graph to construct a node set, where the relationship network graph is constructed based on an equity structure;
A calculating module 52, configured to calculate the characteristic attribute values of the node set and the characterization vector of the node set based on the characteristic attribute value of each node in the node set, the characterization vector of each node, and the correlation between each node and the target enterprise node;
And the predicting module 53 is configured to input the feature attribute values and the characterization vectors of the node set into a pre-trained enterprise risk prediction model, and predict the risk of the target enterprise node, where the enterprise risk prediction model is obtained by training based on the feature attributes, the characterization vectors, and the risk labels of the nodes in the relational network graph.
In one embodiment, the degree of correlation is a share ratio.
In one embodiment, extracting the sub-network graph associated with the target enterprise node to be predicted from the relationship network graph of the enterprise and the shareholder comprises:
Extracting adjacent nodes of the target enterprise node to be predicted from the relational network graph to form the sub-network graph; and/or
Taking the target enterprise node to be predicted as a starting point, randomly walking in the relational network graph to generate a path, wherein nodes passed by the path form the sub-network graph; and/or
Extracting nodes which have common connection nodes with the target enterprise node to be predicted from the relational network graph to form the sub-network graph.
In one embodiment, the feature attributes include: registered capital, registered time, enterprise assets, and/or stockholder assets of the enterprise.
In one embodiment, the step of calculating the characteristic attribute value of the node set includes:
Acquiring a characteristic attribute value of each node in the node set;
Calculating a statistical value of the characteristic attribute of each node based on the correlation degree of each node and the target enterprise node and the characteristic attribute value;
And taking the statistical value as a characteristic attribute value of the node set.
In one embodiment, the step of calculating the characterization vector of the node set comprises:
Calculating to obtain an average characterization vector and a maximum characterization vector of the node set based on the characterization vector of each node in the node set and the correlation degree between each node and the target enterprise node;
And connecting the average characterization vector with the maximum characterization vector to obtain the characterization vector of the node set.
In one embodiment, the characterization vector of the node is obtained based on a network-embedded learning model.
In one embodiment, the enterprise risk prediction model is trained as follows:
Extracting a sub-network graph associated with a target node from a relational network graph of an enterprise, and taking a node set in the sub-network graph as a sample, wherein the relational network graph is constructed based on a share right structure;
Calculating to obtain the characteristic attribute value of the sample and the characteristic vector of the sample based on the characteristic attribute value of each node in the node set, the characteristic vector of each node and the correlation degree of each node and the target node;
And taking the characteristic attribute value and the characterization vector of the sample as input, taking the risk label of the target node as output, and training a preset reference model until a preset loss function is converged to obtain an enterprise risk prediction model.
In one embodiment, the reference model is a Wide & Deep model.
For the specific implementation of the functions and actions of each module in the device, refer to the implementation of the corresponding steps in the method; details are not repeated here.
Since the device embodiments substantially correspond to the method embodiments, reference may be made to the relevant parts of the description of the method embodiments. The device embodiments described above are merely illustrative: the modules described as separate parts may or may not be physically separate, and the parts displayed as modules may or may not be physical modules; they may be located in one place or distributed over a plurality of network modules. Some or all of the modules can be selected according to actual needs to achieve the purpose of the solution in this specification. One of ordinary skill in the art can understand and implement it without inventive effort.
The device embodiments in this specification can be applied to computer equipment, such as a server or an intelligent terminal. The device embodiments may be implemented by software, by hardware, or by a combination of hardware and software. Taking the software implementation as an example, the device, as a logical device, is formed by the processor of the equipment in which it is located reading the corresponding computer program instructions from the non-volatile memory into the memory and running them. In terms of hardware, fig. 6 shows a hardware structure diagram of the computer device in which the apparatus of this specification is located. In addition to the processor 602, the memory 604, the network interface 606 and the non-volatile memory 608 shown in fig. 6, the server or electronic device in which the apparatus is located may also include other hardware according to the actual function of the computer device, which is not described again. The non-volatile memory 608 stores a computer program executable by the processor 602, and the processor 602 executes the computer program to implement the method steps of any of the above embodiments.
Accordingly, the embodiments of the present specification also provide a computer storage medium, in which a program is stored, and the program, when executed by a processor, implements the method in any of the above embodiments.
This application may take the form of a computer program product embodied on one or more storage media (including, but not limited to, disk storage, CD-ROM, optical storage, etc.) having program code embodied therein. Computer-usable storage media include permanent and non-permanent, removable and non-removable media, and information storage may be implemented by any method or technology. The information may be computer readable instructions, data structures, modules of a program, or other data. Examples of the storage medium of the computer include, but are not limited to: phase change memory (PRAM), Static Random Access Memory (SRAM), Dynamic Random Access Memory (DRAM), other types of Random Access Memory (RAM), Read Only Memory (ROM), Electrically Erasable Programmable Read Only Memory (EEPROM), flash memory or other memory technologies, compact disc read only memory (CD-ROM), Digital Versatile Discs (DVD) or other optical storage, magnetic tape storage or other magnetic storage devices, or any other non-transmission medium, may be used to store information that may be accessed by a computing device.
Other embodiments of the disclosure will be apparent to those skilled in the art from consideration of the specification and practice of the disclosure disclosed herein. This disclosure is intended to cover any variations, uses, or adaptations of the disclosure following, in general, the principles of the disclosure and including such departures from the present disclosure as come within known or customary practice within the art to which the disclosure pertains. It is intended that the specification and examples be considered as exemplary only, with a true scope and spirit of the disclosure being indicated by the following claims.
It will be understood that the present disclosure is not limited to the precise arrangements described above and shown in the drawings, and that various modifications and changes may be made without departing from the scope thereof. The scope of the present disclosure is limited only by the appended claims.
The above description is only exemplary of the present disclosure and should not be taken as limiting the disclosure; any modification, equivalent replacement, or improvement made within the spirit and principle of the present disclosure should be included in the scope of the present disclosure.

Claims (19)

1. A method of enterprise risk prediction, the method comprising:
Extracting a sub-network graph associated with a target enterprise node to be predicted from a relational network graph of an enterprise, and acquiring the nodes in the sub-network graph to construct a node set, wherein the relational network graph is constructed based on an equity structure;
Calculating the characteristic attribute values of the node set and the characterization vector of the node set based on the characteristic attribute values of the nodes in the node set, the characterization vectors of the nodes, and the correlation degree between each node and the target enterprise node;
And inputting the characteristic attribute values and the characterization vectors of the node set into a pre-trained enterprise risk prediction model to predict the risk of the target enterprise node, wherein the enterprise risk prediction model is obtained by training based on the characteristic attributes, the characterization vectors and the risk labels of the nodes in the relational network graph.
2. The enterprise risk prediction method of claim 1, wherein the correlation degree is a share ratio.
3. The enterprise risk prediction method of claim 1, wherein extracting the sub-network graph associated with the target enterprise node to be predicted from the relationship network graph of the enterprise and the shareholder comprises:
Extracting adjacent nodes of the target enterprise node to be predicted from the relational network graph to form the sub-network; and/or
Taking the target enterprise node to be predicted as a starting point, randomly walking in the relational network graph to generate a path, wherein nodes passed by the path form the sub-network graph; and/or
And extracting nodes which have common connection nodes with the target enterprise node to be predicted from the relational network graph to form the sub-network graph.
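The three extraction options in claim 3 can be illustrated with a minimal Python sketch. The toy graph, the node names, and the helper functions (`build_adjacency`, `adjacent_nodes`, `random_walk_nodes`, `common_neighbour_nodes`) are hypothetical and not taken from the patent; the walk length and the use of a single walk are also assumptions.

```python
import random

# Hypothetical toy equity graph: each undirected edge carries a share ratio.
EDGES = {
    ("A", "B"): 0.6,
    ("A", "C"): 0.4,
    ("B", "D"): 0.3,
    ("C", "D"): 0.7,
    ("D", "E"): 1.0,
}

def build_adjacency(edges):
    """Return {node: set(neighbours)} for an undirected edge dictionary."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj

def adjacent_nodes(adj, target):
    """Option 1: nodes directly connected to the target node."""
    return set(adj.get(target, set()))

def random_walk_nodes(adj, target, walk_length, rng):
    """Option 2: nodes visited by one random walk starting at the target."""
    visited = {target}
    current = target
    for _ in range(walk_length):
        neighbours = sorted(adj.get(current, ()))
        if not neighbours:
            break
        current = rng.choice(neighbours)
        visited.add(current)
    return visited

def common_neighbour_nodes(adj, target):
    """Option 3: nodes that share at least one neighbour with the target."""
    result = set()
    for mid in adj.get(target, ()):
        result |= adj.get(mid, set())
    result.discard(target)
    return result

adj = build_adjacency(EDGES)
neighbours = adjacent_nodes(adj, "A")            # {"B", "C"}
walk = random_walk_nodes(adj, "A", walk_length=3, rng=random.Random(0))
co_connected = common_neighbour_nodes(adj, "A")  # {"D"}
```

Because the claim joins the options with "and/or", the node sets returned by the three helpers could be used individually or merged with a set union.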
4. The enterprise risk prediction method of claim 1, wherein the characteristic attributes comprise: registered capital, registration time, enterprise assets, and/or shareholder assets of the enterprise.
5. The enterprise risk prediction method of claim 1, wherein calculating the characteristic attribute values for the set of nodes comprises:
acquiring the characteristic attribute value of each node in the node set;
calculating a statistical value of the characteristic attribute of each node based on the characteristic attribute value and the correlation degree between each node and the target enterprise node; and
taking the statistical value as the characteristic attribute value of the node set.
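Claim 5 does not fix which statistic is used. The sketch below assumes a correlation-weighted mean, with the share ratio as the weight; the function names, attribute names, and numbers are hypothetical.

```python
def weighted_mean(values, weights):
    """Mean of `values` weighted by `weights` (weights must not sum to 0)."""
    total = sum(weights)
    if total == 0:
        raise ValueError("weights must not sum to zero")
    return sum(v * w for v, w in zip(values, weights)) / total

def aggregate_attributes(node_attrs, relevance):
    """Collapse per-node attributes into one value per attribute.

    node_attrs: {node: {attribute_name: value}}
    relevance:  {node: correlation degree with the target, e.g. share ratio}
    """
    nodes = sorted(node_attrs)
    weights = [relevance[n] for n in nodes]
    names = sorted(next(iter(node_attrs.values())))
    return {
        name: weighted_mean([node_attrs[n][name] for n in nodes], weights)
        for name in names
    }

# Hypothetical node set around a target enterprise.
NODE_ATTRS = {
    "B": {"registered_capital": 100.0, "years_registered": 10.0},
    "C": {"registered_capital": 300.0, "years_registered": 2.0},
}
RELEVANCE = {"B": 0.6, "C": 0.4}

set_attrs = aggregate_attributes(NODE_ATTRS, RELEVANCE)
```

Other statistics (weighted sum, weighted maximum, and so on) would fit the same interface by swapping out `weighted_mean`.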
6. The enterprise risk prediction method of claim 1, wherein computing the characterization vector for the set of nodes comprises:
calculating an average characterization vector and a maximum characterization vector of the node set based on the characterization vector of each node in the node set and the correlation degree between each node and the target enterprise node; and
concatenating the average characterization vector with the maximum characterization vector to obtain the characterization vector of the node set.
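A minimal sketch of the pooling in claim 6 follows. The claim does not say how the correlation degree enters the maximum, so this sketch assumes each node vector is first scaled by its weight, after which a weighted mean and an element-wise maximum are taken; `pooled_vector` and the sample numbers are hypothetical.

```python
def pooled_vector(vectors, weights):
    """Concatenate a weighted-mean vector and an element-wise max vector.

    vectors: list of equal-length node characterization vectors
    weights: correlation degree of each node with the target node
    """
    total = sum(weights)
    if total == 0:
        raise ValueError("weights must not sum to zero")
    # Assumption: scale every vector by its node's correlation degree.
    scaled = [[w * x for x in vec] for vec, w in zip(vectors, weights)]
    mean_vec = [sum(col) / total for col in zip(*scaled)]
    max_vec = [max(col) for col in zip(*scaled)]
    return mean_vec + max_vec

VECTORS = [[1.0, 2.0], [3.0, 0.0]]
WEIGHTS = [0.5, 0.5]
set_vector = pooled_vector(VECTORS, WEIGHTS)  # [2.0, 1.0, 1.5, 1.0]
```

The result has twice the dimension of a single node vector, which is what the downstream model would receive alongside the aggregated attribute values.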
7. The enterprise risk prediction method of claim 1, wherein the characterization vectors of the nodes are derived based on a network embedding learning model.
8. The enterprise risk prediction method of claim 1, wherein the enterprise risk prediction model is trained as follows:
extracting a sub-network graph associated with a target node from a relational network graph of an enterprise, and taking the node set in the sub-network graph as a sample, wherein the relational network graph is constructed based on an equity structure;
calculating the characteristic attribute value of the sample and the characterization vector of the sample based on the characteristic attribute value of each node in the node set, the characterization vector of each node, and the correlation degree between each node and the target node; and
taking the characteristic attribute value and the characterization vector of the sample as input and the risk label of the target node as output, training a preset reference model until a preset loss function converges to obtain the enterprise risk prediction model.
9. The enterprise risk prediction method of claim 8, wherein the reference model is a Wide & Deep model.
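The "train until a preset loss function converges" step of claim 8 can be sketched as below. This is not a Wide & Deep model: a plain logistic regression trained by batch gradient descent stands in for the reference model so that the example stays self-contained. The samples, labels, learning rate, and tolerance are hypothetical; each sample row is assumed to be the aggregated attribute values concatenated with the pooled characterization vector.

```python
import math

def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def predict(weights, bias, x):
    """Predicted risk probability for one sample."""
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)

def log_loss(weights, bias, samples, labels):
    """Mean binary cross-entropy over all samples."""
    eps = 1e-12
    total = 0.0
    for x, y in zip(samples, labels):
        p = predict(weights, bias, x)
        total -= y * math.log(p + eps) + (1 - y) * math.log(1.0 - p + eps)
    return total / len(samples)

def train(samples, labels, lr=0.5, tol=1e-6, max_epochs=10000):
    """Gradient descent that stops once the loss change falls below `tol`."""
    n = len(samples)
    weights = [0.0] * len(samples[0])
    bias = 0.0
    loss = log_loss(weights, bias, samples, labels)
    for _ in range(max_epochs):
        grad_w = [0.0] * len(weights)
        grad_b = 0.0
        for x, y in zip(samples, labels):
            err = predict(weights, bias, x) - y
            for i, xi in enumerate(x):
                grad_w[i] += err * xi
            grad_b += err
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
        new_loss = log_loss(weights, bias, samples, labels)
        converged = abs(loss - new_loss) < tol
        loss = new_loss
        if converged:
            break
    return weights, bias, loss

# Hypothetical samples and risk labels (1 = risky target node).
SAMPLES = [[0.9, 0.1], [0.8, 0.3], [0.2, 0.7], [0.1, 0.9]]
LABELS = [1, 1, 0, 0]
weights, bias, final_loss = train(SAMPLES, LABELS)
```

At prediction time, `predict(weights, bias, x)` would be applied to the feature row built for the target enterprise node; a Wide & Deep model would replace `predict` and the gradient step but keep the same convergence loop.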
10. A method of enterprise risk prediction model training, the method comprising:
extracting a sub-network graph associated with a target node from a relational network graph of an enterprise, and taking the node set in the sub-network graph as a sample, wherein the relational network graph is constructed based on an equity structure;
calculating the characteristic attribute value of the sample and the characterization vector of the sample based on the characteristic attribute value of each node in the node set, the characterization vector of each node, and the correlation degree between each node and the target node; and
taking the characteristic attribute value and the characterization vector of the sample as input and the risk label of the target node as output, training a preset reference model until a preset loss function converges to obtain an enterprise risk prediction model.
11. The method of claim 10, wherein the reference model is a Wide & Deep model.
12. The method of claim 10, wherein extracting the sub-network graph associated with the target node from the relational network graph of the enterprise comprises:
extracting the adjacent nodes of the target node from the relational network graph to form the sub-network graph; and/or
taking the target node as a starting point and randomly walking in the relational network graph to generate a path, wherein the nodes traversed by the path form the sub-network graph; and/or
extracting, from the relational network graph, nodes that share common connected nodes with the target node to form the sub-network graph.
13. The method of claim 10, wherein the correlation degree is a share ratio.
14. The method for training an enterprise risk prediction model according to claim 10, wherein the step of calculating the characteristic attribute value of the sample comprises:
acquiring the characteristic attribute value of each node in the node set;
calculating a statistical value of the characteristic attribute of each node based on the characteristic attribute value and the correlation degree between each node and the target node; and
taking the statistical value as the characteristic attribute value of the sample.
15. The method for training an enterprise risk prediction model according to claim 10, wherein the step of calculating the characterization vector of the sample comprises:
calculating an average characterization vector and a maximum characterization vector of the node set based on the characterization vector of each node in the node set and the correlation degree between each node and the target node; and
concatenating the average characterization vector with the maximum characterization vector to obtain the characterization vector of the sample.
16. The method of claim 10, wherein the characterization vectors of the nodes are derived based on a network embedding learning model.
17. An enterprise risk prediction device, the device comprising:
a construction module, configured to extract a sub-network graph associated with a target enterprise node to be predicted from a relational network graph of an enterprise and to construct a node set from the nodes in the sub-network graph, wherein the relational network graph is constructed based on an equity structure;
a calculation module, configured to calculate the characteristic attribute values of the node set and the characterization vector of the node set based on the characteristic attribute values of each node in the node set, the characterization vector of each node, and the correlation degree between each node and the target enterprise node; and
a prediction module, configured to input the characteristic attribute values and the characterization vector of the node set into a pre-trained enterprise risk prediction model and predict the risk of the target enterprise node, wherein the enterprise risk prediction model is trained based on the characteristic attributes, the characterization vectors, and the risk labels of the nodes in the relational network graph.
18. A model training apparatus, the apparatus comprising:
a sample construction module, configured to extract a sub-network graph associated with a target node from a relational network graph of an enterprise and to take the node set in the sub-network graph as a sample, wherein the relational network graph is constructed based on an equity structure;
a calculation module, configured to calculate the characteristic attribute value of the sample and the characterization vector of the sample based on the characteristic attribute value of each node in the node set, the characterization vector of each node, and the correlation degree between each node and the target node; and
a training module, configured to train a preset reference model, taking the characteristic attribute value and the characterization vector of the sample as input and the risk label of the target node as output, until a preset loss function converges to obtain an enterprise risk prediction model.
19. An apparatus comprising a memory, a processor, and a computer program stored on the memory and executable on the processor, wherein the processor implements the method of any one of claims 1 to 16 when executing the program.
CN201910815269.8A 2019-08-30 2019-08-30 Enterprise risk prediction method, model training method, device and equipment Pending CN110570111A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910815269.8A CN110570111A (en) 2019-08-30 2019-08-30 Enterprise risk prediction method, model training method, device and equipment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910815269.8A CN110570111A (en) 2019-08-30 2019-08-30 Enterprise risk prediction method, model training method, device and equipment

Publications (1)

Publication Number Publication Date
CN110570111A true CN110570111A (en) 2019-12-13

Family

ID=68776948

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910815269.8A Pending CN110570111A (en) 2019-08-30 2019-08-30 Enterprise risk prediction method, model training method, device and equipment

Country Status (1)

Country Link
CN (1) CN110570111A (en)

Cited By (19)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111126476A (en) * 2019-12-19 2020-05-08 支付宝(杭州)信息技术有限公司 Homogeneous risk unit feature set generation method, device, equipment and medium
CN111160662A (en) * 2019-12-31 2020-05-15 咪咕文化科技有限公司 Risk prediction method, electronic equipment and storage medium
CN111178615A (en) * 2019-12-24 2020-05-19 成都数联铭品科技有限公司 Construction method and system of enterprise risk identification model
CN111382843A (en) * 2020-03-06 2020-07-07 浙江网商银行股份有限公司 Method and device for establishing upstream and downstream relation recognition model of enterprise and relation mining
CN111489168A (en) * 2020-04-17 2020-08-04 支付宝(杭州)信息技术有限公司 Target object risk identification method and device and processing equipment
CN111507543A (en) * 2020-05-28 2020-08-07 支付宝(杭州)信息技术有限公司 Model training method and device for predicting business relation between entities
CN111583037A (en) * 2020-04-30 2020-08-25 支付宝(杭州)信息技术有限公司 Method and device for determining risk associated object and server
CN112036642A (en) * 2020-08-31 2020-12-04 中国平安人寿保险股份有限公司 Information prediction method, device, equipment and medium based on artificial intelligence
CN112348318A (en) * 2020-10-19 2021-02-09 深圳前海微众银行股份有限公司 Method and device for training and applying supply chain risk prediction model
CN112379913A (en) * 2020-11-20 2021-02-19 上海复深蓝软件股份有限公司 Software optimization method, device, equipment and storage medium based on risk identification
CN112541698A (en) * 2020-12-22 2021-03-23 北京中数智汇科技股份有限公司 Method and system for identifying enterprise risks based on external characteristics of enterprise
CN113222610A (en) * 2021-05-07 2021-08-06 支付宝(杭州)信息技术有限公司 Risk identification method and device
CN113313333A (en) * 2020-02-26 2021-08-27 阿里巴巴集团控股有限公司 Risk judgment method, device and medium for relational network topology
CN114066081A (en) * 2021-11-23 2022-02-18 北京恒通慧源大数据技术有限公司 Enterprise risk prediction method and device based on graph attention network and electronic equipment
CN114118526A (en) * 2021-10-29 2022-03-01 中国建设银行股份有限公司 Enterprise risk prediction method, device, equipment and storage medium
CN114202261A (en) * 2022-02-18 2022-03-18 北京科技大学 Village-level industrial park fire risk directed graph depicting method and device
CN114282731A (en) * 2021-12-27 2022-04-05 深圳前海微众银行股份有限公司 Risk model training method, device, equipment and storage medium
CN114757767A (en) * 2020-12-29 2022-07-15 航天信息股份有限公司 Identification method and device for associated enterprise, electronic equipment and storage medium
CN116094827A (en) * 2023-01-18 2023-05-09 支付宝(杭州)信息技术有限公司 Safety risk identification method and system based on topology enhancement

Cited By (28)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111126476A (en) * 2019-12-19 2020-05-08 支付宝(杭州)信息技术有限公司 Homogeneous risk unit feature set generation method, device, equipment and medium
CN111178615A (en) * 2019-12-24 2020-05-19 成都数联铭品科技有限公司 Construction method and system of enterprise risk identification model
CN111178615B (en) * 2019-12-24 2023-10-27 成都数联铭品科技有限公司 Method and system for constructing enterprise risk identification model
CN111160662A (en) * 2019-12-31 2020-05-15 咪咕文化科技有限公司 Risk prediction method, electronic equipment and storage medium
CN113313333A (en) * 2020-02-26 2021-08-27 阿里巴巴集团控股有限公司 Risk judgment method, device and medium for relational network topology
CN111382843A (en) * 2020-03-06 2020-07-07 浙江网商银行股份有限公司 Method and device for establishing upstream and downstream relation recognition model of enterprise and relation mining
CN111382843B (en) * 2020-03-06 2023-10-20 浙江网商银行股份有限公司 Method and device for establishing enterprise upstream and downstream relationship identification model and mining relationship
CN111489168A (en) * 2020-04-17 2020-08-04 支付宝(杭州)信息技术有限公司 Target object risk identification method and device and processing equipment
CN111583037A (en) * 2020-04-30 2020-08-25 支付宝(杭州)信息技术有限公司 Method and device for determining risk associated object and server
CN111507543A (en) * 2020-05-28 2020-08-07 支付宝(杭州)信息技术有限公司 Model training method and device for predicting business relation between entities
CN111507543B (en) * 2020-05-28 2022-05-17 支付宝(杭州)信息技术有限公司 Model training method and device for predicting business relation between entities
CN112036642A (en) * 2020-08-31 2020-12-04 中国平安人寿保险股份有限公司 Information prediction method, device, equipment and medium based on artificial intelligence
CN112348318A (en) * 2020-10-19 2021-02-09 深圳前海微众银行股份有限公司 Method and device for training and applying supply chain risk prediction model
CN112348318B (en) * 2020-10-19 2024-04-23 深圳前海微众银行股份有限公司 Training and application method and device of supply chain risk prediction model
CN112379913A (en) * 2020-11-20 2021-02-19 上海复深蓝软件股份有限公司 Software optimization method, device, equipment and storage medium based on risk identification
CN112379913B (en) * 2020-11-20 2022-01-07 上海复深蓝软件股份有限公司 Software optimization method, device, equipment and storage medium based on risk identification
CN112541698A (en) * 2020-12-22 2021-03-23 北京中数智汇科技股份有限公司 Method and system for identifying enterprise risks based on external characteristics of enterprise
CN114757767A (en) * 2020-12-29 2022-07-15 航天信息股份有限公司 Identification method and device for associated enterprise, electronic equipment and storage medium
CN114757767B (en) * 2020-12-29 2024-09-20 航天信息股份有限公司 Method and device for identifying associated enterprises, electronic equipment and storage medium
CN113222610B (en) * 2021-05-07 2022-08-23 支付宝(杭州)信息技术有限公司 Risk identification method and device
CN113222610A (en) * 2021-05-07 2021-08-06 支付宝(杭州)信息技术有限公司 Risk identification method and device
CN114118526A (en) * 2021-10-29 2022-03-01 中国建设银行股份有限公司 Enterprise risk prediction method, device, equipment and storage medium
CN114066081B (en) * 2021-11-23 2022-04-26 北京恒通慧源大数据技术有限公司 Enterprise risk prediction method and device based on graph attention network and electronic equipment
CN114066081A (en) * 2021-11-23 2022-02-18 北京恒通慧源大数据技术有限公司 Enterprise risk prediction method and device based on graph attention network and electronic equipment
CN114282731A (en) * 2021-12-27 2022-04-05 深圳前海微众银行股份有限公司 Risk model training method, device, equipment and storage medium
CN114202261B (en) * 2022-02-18 2022-05-31 北京科技大学 Village-level industrial park fire risk directed graph depicting method and device
CN114202261A (en) * 2022-02-18 2022-03-18 北京科技大学 Village-level industrial park fire risk directed graph depicting method and device
CN116094827A (en) * 2023-01-18 2023-05-09 支付宝(杭州)信息技术有限公司 Safety risk identification method and system based on topology enhancement

Similar Documents

Publication Publication Date Title
CN110570111A (en) Enterprise risk prediction method, model training method, device and equipment
Mukhametzyanov et al. A sensitivity analysis in MCDM problems: A statistical approach
Sharma et al. Survey of stock market prediction using machine learning approach
Tsinaslanidis et al. A prediction scheme using perceptually important points and dynamic time warping
Lee et al. Shap value-based feature importance analysis for short-term load forecasting
Masood et al. Clustering techniques in bioinformatics
Ramezanian Estimation of the profiles in posteriori ELECTRE TRI: A mathematical programming model
CN101901251B (en) Complex network cluster structure analysis and identification method based on Markov process metastability
Abdul-Rahman et al. Advanced machine learning algorithms for house price prediction: case study in Kuala Lumpur
Khosravi et al. Performance evaluation of machine learning regressors for estimating real estate house prices
Kumar et al. Community-enhanced Link Prediction in Dynamic Networks
Wu et al. An uncertainty-oriented cost-sensitive credit scoring framework with multi-objective feature selection
Gautam et al. Adaptive discretization using golden section to aid outlier detection for software development effort estimation
Ullah et al. Adaptive data balancing method using stacking ensemble model and its application to non-technical loss detection in smart grids
Helal et al. Leader‐based community detection algorithm for social networks
Ghimire et al. Machine learning-based prediction models for budget forecast in capital construction
Kumar et al. Comparative analysis of SOM neural network with K-means clustering algorithm
CN115730248A (en) Machine account detection method, system, equipment and storage medium
CN109472370B (en) Method and device for classifying maintenance plants
CN112884028A (en) System resource adjusting method, device and equipment
Lezama et al. Electrical load pattern shape clustering using ant colony optimization
Sánchez-Silva et al. Risk assessment and management of civil infrastructure networks: a systems approach
CN113987280B (en) Method and device for training graph model aiming at dynamic graph
Kangane et al. Analysis of different regression models for real estate price prediction
Naumann-Woleske et al. Exploration of the Parameter Space in Macroeconomic Models

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
TA01 Transfer of patent application right

Effective date of registration: 20200924

Address after: Cayman Enterprise Centre, 27 Hospital Road, George Town, Grand Cayman Islands

Applicant after: Innovative advanced technology Co.,Ltd.

Address before: Cayman Enterprise Centre, 27 Hospital Road, George Town, Grand Cayman Islands

Applicant before: Advanced innovation technology Co.,Ltd.

Effective date of registration: 20200924

Address after: Cayman Enterprise Centre, 27 Hospital Road, George Town, Grand Cayman Islands

Applicant after: Advanced innovation technology Co.,Ltd.

Address before: A four-storey 847 mailbox in Grand Cayman Capital Building, British Cayman Islands

Applicant before: Alibaba Group Holding Ltd.

TA01 Transfer of patent application right