WO2021027331A1 - Graph data-based full relationship calculation method and apparatus, device, and storage medium - Google Patents
Graph data-based full relationship calculation method and apparatus, device, and storage medium Download PDFInfo
- Publication number
- WO2021027331A1 WO2021027331A1 PCT/CN2020/087619 CN2020087619W WO2021027331A1 WO 2021027331 A1 WO2021027331 A1 WO 2021027331A1 CN 2020087619 W CN2020087619 W CN 2020087619W WO 2021027331 A1 WO2021027331 A1 WO 2021027331A1
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- node
- data
- identifier
- degree
- relationship
- Prior art date
Links
Images
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/90—Details of database functions independent of the retrieved data types
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/90—Details of database functions independent of the retrieved data types
- G06F16/901—Indexing; Data structures therefor; Storage structures
- G06F16/9024—Graphs; Linked lists
Definitions
- This application relates to the field of big data technology, and in particular to a method, device, device, and storage medium for calculating a full relationship based on graph data.
- Graph data mining is an important method in relationship mining and group profiling.
- Graph data is composed of nodes and edges.
- the nodes in the graph data are used to represent the connected subjects, and the edges in the graph data are used to represent the association between the subjects.
- a node is associated with other nodes through the edges connected to it.
- One of the typical applications in graph calculation is to find the full relationship of a certain node and analyze the statistical characteristics. Among them, the calculation of the second-degree relationship and the third-degree relationship has become a difficult point in the graph calculation due to the large amount of calculation and computational resource consumption.
- the current typical environment for graph computing is the GraphX environment in the Spark project open sourced by the Apache Software Foundation.
- GraphX uses a memory computing strategy to achieve fast iterative calculations; however, the inventor realized that memory computing consumes a huge amount of memory resources, and it is difficult to support the processing of massive data.
- index calculation of second-degree and third-degree associated nodes using GraphX to calculate graphs on 50 million nodes and 400 million edges will consume 2000GB of memory. The calculation efficiency of the full relationship is low. On social networks, such calculation efficiency is difficult to meet the application of billions to billions of nodes.
- This application provides a method, device, device, and storage medium for calculating a full-scale relationship based on graph data, which are used to merge node attributes into node identifiers using bit operations to avoid duplication of node data, reduce memory resource consumption, and improve computing efficiency .
- the first aspect of the embodiments of the present application provides a full-scale relationship calculation method based on graph data, including: obtaining preprocessed graph data, the preprocessed graph data including node data and edge data of each node; Perform a bit operation to generate the synthetic node ID of each node; divide the node data and the edge data with each node data as the center to generate multiple data groups, each data group includes a synthetic node of the current node ID and the edge data connected to the current node; send a single node ID list of each node to all adjacent nodes, the single node ID list is used to store the composite node ID of the adjacent node; according to each node The received single node identification list generates the second-degree relationship of each node.
- a second aspect of the embodiments of the present application provides a full-scale relationship calculation device based on graph data, including: a first obtaining unit configured to obtain preprocessed graph data, the preprocessed graph data including node data of each node and Edge data; an operation generating unit for performing bit operations on the node data to generate a synthetic node identifier for each node; a division generating unit for centering the node data and the edge data on each node data Divide, generate multiple data groups, each data group includes a composite node ID of the current node and edge data connected to the current node; the first sending unit is used to send the single node ID list of each node to all Adjacent nodes, the single node identification list is used to store the composite node identification of the adjacent nodes; the first generating unit is used to generate the second degree of each node according to the single node identification list received by each node relationship.
- the third aspect of the embodiments of the present application provides a computer device, including: one or more processors; a memory; one or more computer programs, wherein the one or more computer programs are stored in the memory and Is configured to be executed by the one or more processors, and the one or more computer programs are configured to execute a full relationship calculation method based on graph data, wherein the full relationship calculation method based on graph data includes :
- the preprocessed graph data including node data and edge data of each node;
- the second-degree relationship of each node is generated.
- a fourth aspect of the embodiments of the present application provides a computer-readable storage medium having a computer program stored on the computer-readable storage medium, and when the computer program is executed by a processor, a method for calculating a full relationship based on graph data is implemented, wherein , The method for calculating the full relationship based on graph data includes the following steps:
- the preprocessed graph data including node data and edge data of each node;
- the second-degree relationship of each node is generated.
- preprocessed graph data is obtained, and the preprocessed graph data includes node data and edge data of each node; bit operations are performed on the node data to generate a synthetic node identifier for each node ; Divide the node data and the edge data with each node data as the center to generate multiple data groups, each data group includes a synthetic node identifier of the current node and the edge data connected to the current node; The single-node identification list of each node is sent to all adjacent nodes, and the single-node identification list is used to store the composite node identification of the adjacent nodes; and each node is generated according to the single-node identification list received by each node. The second-degree relationship of the node. Use bit operation to merge node attributes into node identifiers, and eliminate self-connection, avoid node data duplication, reduce memory resource consumption, and improve calculation efficiency.
- FIG. 1 is a schematic diagram of an embodiment of a method for calculating a full relationship based on graph data in an embodiment of the application;
- FIG. 2 is a schematic diagram of another embodiment of a method for calculating a full relationship based on graph data in an embodiment of the application;
- Fig. 3 is a schematic diagram of an embodiment of a full relation calculation device based on graph data in an embodiment of the application;
- FIG. 4 is a schematic diagram of another embodiment of a full relationship calculation device based on graph data in an embodiment of the application;
- FIG. 5 is a schematic diagram of an embodiment of a full relationship calculation device based on graph data in an embodiment of the application.
- This application provides a method, device, device, and storage medium for calculating a full-scale relationship based on graph data, which are used to merge node attributes into node identifiers using bit operations to avoid duplication of node data, reduce memory resource consumption, and improve computing efficiency .
- FIG. 1 a flowchart of a method for calculating a full relationship based on graph data provided by an embodiment of the present application, which specifically includes:
- preprocessed graph data where the preprocessed graph data includes node data and edge data of each node.
- the server obtains the preprocessed graph data, and the preprocessed graph data includes node data and edge data of each node. Specifically, the server loads the preprocessed graph data through Spark.
- the preprocessed graph data is composed of node data and edge data.
- the node data in the graph data is used to indicate the subject that is connected, and the edge data is used to indicate the association between the subjects. .
- a node is related to other nodes through its connected edges.
- the node data of a target node includes the target node attribute and a target node identifier
- the target edge data includes a target edge identifier and two node identifiers connected to the target edge.
- the node attribute contains several label data of the node.
- the node label data may include an ID number, a mobile phone number, and three Boolean variables A, B, and C.
- variable A is used to indicate the gender of the user, 1 is used to indicate male, 0 is used to indicate female;
- variable B is used to indicate whether it is dishonest, 1 is used to indicate failure, and 0 is used to indicate no failure;
- variable C is used to indicate whether there is a university degree, with 1 means a university degree, and 0 means no university degree.
- the preprocessed graph data means that a large amount of original graph data has been deduplicated, and the graph data that meets the requirements have been selected.
- the execution subject of this application may be a full-scale relational computing device based on graph data, or may be a terminal or a server, which is not specifically limited here.
- the embodiment of the present application takes the server as the execution subject as an example for description.
- the server performs bit operations on the node data to generate a synthetic node ID for each node. Specifically, the server determines multiple nodes in the node data; the server obtains the node attributes of each node and the initial node identifier corresponding to each node; the server obtains preset rules, which include the total storage bits for each node identifier Number and the starting and ending ordinal number of the storage location occupied by each variable; the server performs bit operations on the node attributes and initial node identification of each node according to the total number of storage bits for each node identification and the starting and ending ordinal number of the storage location occupied by each variable ; The server generates a synthetic node ID for each node.
- the preset rules include: (1) the total number of storage bits for each node identification; (2) the starting and ending ordinal numbers of the storage bits occupied by each variable.
- storage of ID card number, mobile phone number, and the aforementioned three Boolean variables A, B, and C are taken as examples for description.
- the preset rules are: (1) Each node allocates 101 bits of storage; (2) 1-61 bits are used to store the ID number, the 62nd-98th bits store the mobile phone number, and the 99th bit stores the variable A, the 100th bit stores variable B, and the 101st bit stores variable C.
- the server saves a lot of memory resources by using bit storage instead of individually assigning double integer storage space to each variable.
- the initial node identification is collected by the user in the preprocessing stage of the graph data and stored on the external memory (hard disk) of the computer. After starting the calculation, the server loads the initial node identifier into the internal memory. The data identified by the initial node belongs to the original data, and its collection occurs before the bit operation.
- each data group including a synthetic node identifier of the current node and edge data connected to the current node.
- the server divides the node data and the edge data with each node data as the center, and generates multiple data groups.
- Each data group includes a synthetic node identifier of the current node and the edge data connected to the current node.
- the graph data contains node data and edge data. After being loaded into the memory, in order to perform distributed calculations, the graph data needs to be divided into small processing units, which are referred to as "data groups" here. By establishing a node-centric “data group”, it is ensured that each node has only one copy, which will only appear once, and avoiding multiple node data replication.
- a data group contains the node ID and all the edge data on the node.
- One edge data contains the composite node ID of the node and another node connected to it, but does not include any attribute data of the connected node.
- the server sends the single node identification list of each node to all adjacent nodes, and the single node identification list is used to store the composite node identification of each node's adjacent nodes.
- the single node identification list is collected based on the edge data on the data group. Because the edge data contains the synthetic node identification of the current node and the once-associated node connected to the current node, all connected nodes can be collected by traversing all the edge data.
- the single node identifier list contains only the composite node identifiers of adjacent nodes.
- node a and nodes b, c, and d are adjacent nodes. Then first collect the identification list of all adjacent nodes (ie, single-node identification list), where the node identification is the four letters a, b, c, d (here is a simple example, in actual situations it contains hundreds of millions of nodes, you need Use more complex node identification to indicate, for example, when the device is a node, use the production information of the device, such as the date of generation, serial number, etc., as the node identification). For node a, the obtained node identification list is the list of [b, c, d]; then, node a passes the list of [b, c, d] to the three nodes b, c, d, respectively.
- the server generates the second-degree relationship of each node according to the single node identification list received by each node. Specifically, the server receives the list of single node identifiers of each node; the server determines its own synthetic node identifier for each node; the server separately receives the same node identifier as its own synthetic node identifier in the list of single node identifiers received by each node Delete; the server generates a second-degree relationship for each node, and the second-degree relationship is used to indicate that there is an interval between the second-degree associated node and the current node.
- node a is connected to nodes b, c, and d, and nodes b, c, and d are not directly connected, then after node b receives the list [b, c, d] sent by node a, Delete b from the list, and there are two nodes [c, d] left in the list. These two nodes are the second-degree relationship of b.
- a first-degree relationship refers to the connection between two nodes, that is, adjacent nodes; a second-degree relationship refers to a node between two nodes.
- node b and node c are separated by node a, so node b and node c are second-degree related, that is, node b and node c have a second-degree relationship.
- the method further includes: the server obtains the node attribute of each node according to the second-degree relationship of each node, and performs statistical analysis according to the node attribute of each node to generate an analysis result.
- the server reads the second-degree relationship of each node; the server determines the node attribute of each node from the second-degree relationship; the server separates the node attribute of each node from the synthetic node identification according to preset rules; Perform statistical analysis on the node attributes of each node to generate analysis results.
- FIG. 2 another flowchart of the full relationship calculation method based on graph data provided by the embodiment of the present application, which specifically includes:
- the preprocessed graph data includes node data and edge data of each node.
- the server obtains the preprocessed graph data, and the preprocessed graph data includes the node data and edge data of each node. Specifically, the server loads the preprocessed graph data through Spark.
- the preprocessed graph data is composed of node data and edge data.
- the node data in the graph data is used to indicate the subject that is connected, and the edge data is used to indicate the association between the subjects. .
- a node is related to other nodes through its connected edges.
- the node data of a target node includes the target node attribute and a target node identifier
- the target edge data includes a target edge identifier and two node identifiers connected to the target edge.
- the node attribute contains several label data of the node.
- the node label data may include an ID number, a mobile phone number, and three Boolean variables A, B, and C.
- variable A is used to indicate the gender of the user, 1 is used to indicate male, 0 is used to indicate female;
- variable B is used to indicate whether it is dishonest, 1 is used to indicate failure, and 0 is used to indicate no failure;
- variable C is used to indicate whether there is a university degree, with 1 means a university degree, and 0 means no university degree.
- the preprocessed graph data means that a large amount of original graph data has been deduplicated, and the graph data that meets the requirements have been selected.
- the execution subject of this application may be a full-scale relational computing device based on graph data, or may be a terminal or a server, which is not specifically limited here.
- the embodiment of the present application takes the server as the execution subject as an example for description.
- the server performs bit operations on the node data to generate a synthetic node ID for each node. Specifically, the server determines multiple nodes in the node data; the server obtains the node attributes of each node and the initial node identifier corresponding to each node; the server obtains preset rules, which include the total storage bits for each node identifier Number and the starting and ending ordinal number of the storage location occupied by each variable; the server performs bit operations on the node attributes and initial node identification of each node according to the total number of storage bits for each node identification and the starting and ending ordinal number of the storage location occupied by each variable ; The server generates a synthetic node ID for each node.
- the preset rules include: (1) the total number of storage bits for each node identification; (2) the starting and ending ordinal numbers of the storage bits occupied by each variable.
- storage of ID card number, mobile phone number, and the aforementioned three Boolean variables A, B, and C are taken as examples for description.
- the preset rules are: (1) Each node allocates 101 bits of storage; (2) 1-61 bits are used to store the ID number, the 62nd-98th bits store the mobile phone number, and the 99th bit stores the variable A, the 100th bit stores variable B, and the 101st bit stores variable C.
- the server saves a lot of memory resources by using bit storage instead of individually assigning double integer storage space to each variable.
- the initial node identification is collected by the user in the preprocessing stage of the graph data and stored on the external memory (hard disk) of the computer. After starting the calculation, the server loads the initial node identifier into the internal memory. The data identified by the initial node belongs to the original data, and its collection occurs before the bit operation.
- each data group including a synthetic node identifier of the current node and the edge data connected to the current node.
- the server divides the node data and edge data with each node data as the center, and generates multiple data groups.
- Each data group includes a synthetic node identifier of the current node and the edge data connected to the current node.
- the graph data contains node data and edge data. After being loaded into the memory, in order to perform distributed calculations, the graph data needs to be divided into small processing units, which are referred to as "data groups" here. By establishing a node-centric “data group”, it is ensured that each node has only one copy, which will only appear once, and avoiding multiple node data replication.
- a data group contains the node ID and all the edge data on the node.
- One edge data contains the composite node ID of the node and another node connected to it, but does not include any attribute data of the connected node.
- the server sends the single node identification list of each node to all adjacent nodes, and the single node identification list is used to store the composite node identification of the adjacent nodes of each node.
- the single node identification list is collected based on the edge data on the data group. Because the edge data contains the synthetic node identification of the current node and the once-associated node connected to the current node, all connected nodes can be collected by traversing all the edge data.
- the single node identifier list contains only the composite node identifiers of adjacent nodes.
- node a and nodes b, c, and d are adjacent nodes. Then first collect the identification list of all adjacent nodes (ie, single-node identification list), where the node identification is the four letters a, b, c, d (here is a simple example, in actual situations it contains hundreds of millions of nodes, you need Use more complex node identification to indicate, for example, when the device is a node, use the production information of the device, such as the date of generation, serial number, etc., as the node identification). For node a, the obtained node identification list is the list of [b, c, d]; then, node a passes the list of [b, c, d] to the three nodes b, c, d, respectively.
- each node According to the single node identification list received by each node, generate a second-degree relationship of each node.
- the server generates the second-degree relationship of each node according to the single node identification list received by each node. Specifically, the server receives the list of single node identifiers of each node; the server determines its own synthetic node identifier for each node; the server separately receives the same node identifier as its own synthetic node identifier in the list of single node identifiers received by each node Delete; the server generates a second-degree relationship for each node, and the second-degree relationship is used to indicate that there is an interval between the second-degree associated node and the current node.
- node a is connected to nodes b, c, and d, and nodes b, c, and d are not directly connected, then after node b receives the list [b, c, d] sent by node a, Delete b from the list, and there are two nodes [c, d] left in the list. These two nodes are the second-degree relationship of b.
- the first-degree relationship refers to the connection between two nodes, that is, adjacent nodes;
- the second-degree relationship refers to the distance between two nodes by one node.
- node b and node c are separated by node a, so node b and node c are second-degree related, that is, node b and node c have a second-degree relationship.
- the server generates a second-degree relationship identifier list according to the second-degree relationship of each node, and the second-degree relationship identifier list is used to store the second-degree relationship of each node.
- the server sends the second-degree relationship identifier list of each node to all neighboring nodes, and the second-degree relationship identifier list is used to store the composite node identifier of the neighboring nodes of each node.
- the three-degree relationship identifier list is used to store the three-degree relationship of each node, and the third-degree relationship is used for Indicates that there is a first-degree associated node and a second-degree associated node between the three-degree associated node and the current node.
- the server generates a three-degree relationship identifier list for each node according to the second-degree relationship identifier list received by each node from neighboring nodes.
- the three-degree relationship identifier list is used to store the three-degree relationship of each node, and the third-degree relationship is used to indicate Between the third-degree associated node and the current node, there is a first-degree associated node and a second-degree associated node.
- the server obtains the second-degree relationship identifier list of each node; the server determines each node's own synthetic node identifier; the server separately receives the second-degree relationship identifier list received by each node that is the same as its own synthetic node identifier
- the node ID is deleted; the server determines the three-degree relationship of each node, and the three-degree relationship is used to indicate the interval between two nodes (that is, the interval between the three-degree associated node and the current node is a one-degree associated node and a two-degree associated node.
- Degree-related nodes generate a three-degree relationship identifier list, and the three-degree relationship identifier list is used to store the three-degree relationship of each node.
- node a is connected to nodes b, c, and d
- node e is connected to node b
- nodes b, c, and d are not directly connected
- node b receives the list [b, c, d] sent by node a, Delete b from the list, and there are two nodes [c, d] left in the list.
- the node b continues to send the list [c, d] to the node e, and then obtains the three-degree relationship list of the node e, that is, the node e and the node c, and the e and d form a three-degree relationship.
- node b will also send the node ID list [c, d] to node a.
- the list [c, d] also exists in the one-degree relationship list [b, c, d] of the node a, c and d need to be excluded and do not form a three-degree relationship with a.
- the method further includes: the server obtains the node attributes of each node according to the three-degree relationship identification list of each node, and performs statistical analysis according to the node attributes of each node.
- the separation here refers to the process of reading the node attributes from the synthetic node identifier. It corresponds to the previous data reading process and follows the unified node attribute preset rules. Using the aforementioned example, use 101 bits to store the ID number, mobile phone number, and three Boolean variables of A, B, and C.
- the separated node attributes are statistically analyzed according to business needs, which improves the calculation efficiency. For example, count all the friends worth recommending in the three-degree relationship, or count all the users with good credit, and so on.
- the method further includes: the server obtains the original graph data of each node; the server performs deduplication and verification processing on the original graph data; and the server generates preprocessed graph data that meets the requirements.
- An embodiment of the relational computing device includes:
- the first obtaining unit 301 is configured to obtain preprocessed graph data, where the preprocessed graph data includes node data and edge data of each node;
- the operation generating unit 302 is configured to perform a bit operation on the node data to generate a synthetic node identifier of each node;
- the dividing and generating unit 303 is configured to divide the node data and the edge data with each node data as the center to generate multiple data groups, each data group including a composite node identifier of the current node and connection with the current node Edge data;
- the first sending unit 304 is configured to send a single node identification list of each node to all adjacent nodes, and the single node identification list is used to store the composite node identification of the adjacent nodes;
- the first generating unit 305 is configured to generate the second degree relationship of each node according to the single node identification list received by each node.
- FIG. 4 another embodiment of the full relation calculation device based on graph data in the embodiment of the present application includes:
- the first obtaining unit 301 is configured to obtain preprocessed graph data, where the preprocessed graph data includes node data and edge data of each node;
- the operation generating unit 302 is configured to perform a bit operation on the node data to generate a synthetic node identifier of each node;
- the dividing and generating unit 303 is configured to divide the node data and the edge data with each node data as the center to generate multiple data groups, each data group including a composite node identifier of the current node and connection with the current node Edge data;
- the first sending unit 304 is configured to send a single node identification list of each node to all adjacent nodes, and the single node identification list is used to store the composite node identification of the adjacent nodes;
- the first generating unit 305 is configured to generate the second degree relationship of each node according to the single node identification list received by each node.
- the operation generating unit 302 is specifically configured to:
- the full relationship calculation device based on graph data further includes:
- the statistical analysis unit 306 is configured to obtain the node attribute of each node according to the second-degree relationship of each node, and perform statistical analysis according to the node attribute of each node to generate an analysis result.
- the statistical analysis unit 306 is specifically used to:
- the first generating unit 305 is specifically configured to:
- Receive the single-node identification list of each node determine the connection status of each node; delete the synthetic node identification that is the same as itself in the received single-node identification list; generate the second degree relationship of each node, the second degree The relationship is used to indicate the second degree associated node and the current node, and there is an interval of one degree associated node.
- the full relationship calculation device based on graph data further includes:
- the second generating unit 307 is configured to generate a second-degree relationship identifier list according to the second-degree relationship of each node, and the second-degree relationship identifier list is used to store the second-degree relationship of each node;
- the second sending unit 308 is configured to send the second-degree relationship identifier list of each node to all adjacent nodes;
- the third generating unit 309 is configured to generate a third-degree relationship identifier list of each node according to the second-degree relationship identifier list sent by neighboring nodes received by each node, and the third-degree relationship identifier list is used to store the three-degree relationship identifier list of each node.
- Degree relationship the three-degree relationship is used to indicate that there is an interval between a first-degree associated node and a second-degree associated node between the third-degree associated node and the current node.
- the full relationship calculation device based on graph data further includes:
- the second obtaining unit 310 is configured to obtain the original graph data of each node
- the processing unit 311 is configured to perform deduplication processing and verification processing on the original image data
- the fourth generating unit 312 is used to generate preprocessed image data that meets the requirements.
- FIG. 5 is a schematic structural diagram of a full relational computing device based on graph data provided by an embodiment of the present application.
- the full relational computing device 500 based on graph data may have relatively large differences due to different configurations or performances, and may include one or One or more processors (central processing units, CPU) 501 (for example, one or more processors) and a memory 509, and one or more storage media 508 (for example, one or one storage device with a large amount of storage) storing application programs 507 or data 506.
- the memory 509 and the storage medium 508 may be short-term storage or persistent storage.
- the program stored in the storage medium 508 may include one or more modules (not shown in the figure), and each module may include a series of instruction operations on the full relational computing device based on graph data. Further, the processor 501 may be configured to communicate with the storage medium 508, and execute a series of instruction operations in the storage medium 508 on the full relationship computing device 500 based on graph data.
- the full relational computing device 500 based on graph data may also include one or more power supplies 502, one or more wired or wireless network interfaces 503, one or more input and output interfaces 504, and/or, one or more operating systems 505 , Such as Windows Serve, MacOS X, Unix, Linux, FreeBSD, etc.
- operating systems 505 Such as Windows Serve, MacOS X, Unix, Linux, FreeBSD, etc.
- FIG. 5 does not constitute a limitation on the full relationship computing device based on graph data, and may include more or less components than shown in the figure. Or combine certain components, or different component arrangements.
- the processor 501 can execute the first acquisition unit 301, the operation generation unit 302, the division generation unit 303, the first generation unit 305, the statistical analysis unit 306, the second generation unit 307, the third generation unit 309, and the second generation unit 301 in the above embodiments. Functions of the acquiring unit 310, the processing unit 311, and the fourth generating unit 312.
- the processor 501 is the control center of the full relationship calculation device based on graph data, and can perform processing in accordance with the set full relationship calculation method based on graph data.
- the processor 501 uses various interfaces and lines to connect various parts of the entire graph data-based full relational computing device, by running or executing software programs and/or modules stored in the memory 509, and calling data stored in the memory 509, Perform various functions and processing data of full relational computing equipment based on graph data.
- the storage medium 508 and the memory 509 are both carriers for storing data.
- the storage medium 508 may refer to an internal memory with a small storage capacity but high speed, and the storage 509 may have a large storage capacity but a slow storage speed. External memory.
- the memory 509 may be used to store software programs and modules.
- the processor 501 executes various functional applications and data processing of the full relational computing device 500 based on graph data by running the software programs and modules stored in the memory 509.
- the memory 509 may mainly include a program storage area and a data storage area.
- the storage program area may store an operating system and at least one application program required by a function (for example, sending a single node identification list of each node to all adjacent nodes, The single node identifier list is used to store the composite node identifiers of adjacent nodes, etc.; the storage data area can store data created according to the use of the full relationship computing device based on graph data (such as the composite node identifier of each node).
- the memory 509 may include a high-speed random access memory, and may also include a non-volatile memory, such as at least one magnetic disk storage device, a flash memory device, or other volatile solid-state storage devices.
- a non-volatile memory such as at least one magnetic disk storage device, a flash memory device, or other volatile solid-state storage devices.
- the computer may be a general-purpose computer, a special-purpose computer, a computer network, or other programmable devices.
- the computer instructions may be stored in a computer-readable storage medium, or transmitted from one computer-readable storage medium to another computer-readable storage medium, the storage medium being a volatile storage medium or a non-volatile storage medium,
- the computer instructions can be sent from one website site, computer, server, or data center to another website site, through wired (such as coaxial cable, optical fiber, twisted pair) or wireless (such as infrared, wireless, microwave, etc.) Computer, server or data center for transmission.
- the computer-readable storage medium may be any available medium that can be stored by a computer or a data storage device such as a server or data center integrated with one or more available media.
- the usable medium may be a magnetic medium (for example, a floppy disk, a hard disk, and a magnetic tape), an optical medium (for example, an optical disc), or a semiconductor medium (for example, a solid state disk (SSD)).
- the disclosed system, device, and method may be implemented in other ways.
- the device embodiments described above are only illustrative.
- the division of the units is only a logical function division, and there may be other divisions in actual implementation, for example, multiple units or components can be combined or It can be integrated into another system, or some features can be ignored or not implemented.
- the displayed or discussed mutual coupling or direct coupling or communication connection may be indirect coupling or communication connection through some interfaces, devices or units, and may be in electrical, mechanical or other forms.
- the units described as separate components may or may not be physically separated, and the components displayed as units may or may not be physical units, that is, they may be located in one place, or they may be distributed on multiple network units. Some or all of the units may be selected according to actual needs to achieve the objectives of the solutions of the embodiments.
- the functional units in the embodiments of the present application may be integrated into one processing unit, or each unit may exist alone physically, or two or more units may be integrated into one unit.
- the above-mentioned integrated unit can be implemented in the form of hardware or software functional unit.
- the integrated unit is implemented in the form of a software functional unit and sold or used as an independent product, it can be stored in a computer readable storage medium.
- the technical solution of this application essentially or the part that contributes to the existing technology or all or part of the technical solution can be embodied in the form of a software product, and the computer software product is stored in a storage medium , Including several instructions to make a computer device (which can be a personal computer, a server, or a network device, etc.) execute all or part of the steps of the method described in each embodiment of the present application.
- the aforementioned storage media include: U disk, mobile hard disk, read-only memory (read-only memory, ROM), random access memory (random access memory, RAM), magnetic disk or optical disk and other media that can store program code .
Landscapes
- Engineering & Computer Science (AREA)
- Databases & Information Systems (AREA)
- Theoretical Computer Science (AREA)
- Data Mining & Analysis (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Software Systems (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
Description
Claims (20)
- 一种基于图数据的全量关系计算方法,其中,包括:A calculation method for full relationship based on graph data, including:获取预处理图数据,所述预处理图数据包括每个节点的节点数据和边数据;Acquiring preprocessed graph data, the preprocessed graph data including node data and edge data of each node;对所述节点数据进行位运算,生成每个节点的合成节点标识;Performing bit operations on the node data to generate a synthetic node identifier for each node;将所述节点数据和所述边数据以每个节点数据为中心进行划分,生成多个数据组,每个数据组包括一个当前节点的合成节点标识以及与当前节点连接的边数据;Dividing the node data and the edge data with each node data as the center to generate a plurality of data groups, each data group including a synthetic node identifier of the current node and the edge data connected to the current node;将每个节点的单节点标识列表发送给所有的相邻节点,所述单节点标识列表用于存储所述相邻节点的合成节点标识;Sending a single node identification list of each node to all adjacent nodes, where the single node identification list is used to store the composite node identification of the adjacent nodes;根据每个节点接收到的所述单节点标识列表,生成各个节点的二度关系。According to the single node identification list received by each node, the second-degree relationship of each node is generated.
- 根据权利要求1所述的基于图数据的全量关系计算方法,其中,所述对所述节点数据进行位运算,生成每个节点的合成节点标识包括:The method for calculating a full relationship based on graph data according to claim 1, wherein said performing bit operation on said node data to generate a synthetic node identifier for each node comprises:确定所述节点数据中的多个节点;Determine multiple nodes in the node data;获取每个节点的节点属性和每个节点对应的初始节点标识;Obtain the node attributes of each node and the initial node ID corresponding to each node;获取预置规则,所述预置规则包括每个节点标识总共的存储比特位数和每个变量占据的存储位的起止序数;Acquiring a preset rule, the preset rule including a total number of storage bits for each node identifier and a starting and ending sequence number of storage bits occupied by each variable;按照所述每个节点标识总共的存储比特位数和所述每个变量占据的存储位的起止序数,将每个节点的节点属性和所述初始节点标识进行位运算;Perform a bit operation on the node attributes of each node and the initial node identifier according to the total number of storage bits for each node identifier and the starting and ending ordinal numbers of the storage positions occupied by each variable;生成每个节点的合成节点标识。Generate a synthetic node ID for each node.
- 根据权利要求1所述的基于图数据的全量关系计算方法,其中,在所述根据每个节点接收到的所述单节点标识列表,生成各个节点的二度关系之后,所述方法还包括:The method for calculating a full relationship based on graph data according to claim 1, wherein, after said generating the two-degree relationship of each node according to the single node identification list received by each node, the method further comprises:根据每个节点的二度关系获取每个节点的节点属性,并根据每个节点的节点属性进行统计分析,生成分析结果。Obtain the node attributes of each node according to the two-degree relationship of each node, and perform statistical analysis according to the node attributes of each node to generate analysis results.
- 根据权利要求3所述的基于图数据的全量关系计算方法,其中,所述根据每个节点的二度关系获取每个节点的节点属性,并根据每个节点的节点属性进行统计分析,生成分析结果包括:The method for calculating a full relationship based on graph data according to claim 3, wherein the node attribute of each node is obtained according to the second-degree relationship of each node, and statistical analysis is performed according to the node attribute of each node to generate an analysis The results include:读取每个节点的二度关系;Read the second-degree relationship of each node;从所述二度关系中确定每个节点的节点属性;Determine the node attribute of each node from the two-degree relationship;按照预置规则将所述每个节点的节点属性从所述合成节点标识中分离;Separating the node attribute of each node from the synthetic node identifier according to a preset rule;对所述每个节点的节点属性进行统计分析,生成分析结果。Perform statistical analysis on the node attributes of each node to generate analysis results.
- 根据权利要求1所述的基于图数据的全量关系计算方法,其中,所述根据每个节点接收到的所述单节点标识列表,生成各个节点的二度关系包括:The method for calculating a full relationship based on graph data according to claim 1, wherein said generating the second-degree relationship of each node according to the single node identification list received by each node comprises:接收每个节点的单节点标识列表;Receive a single node identification list of each node;确定每个节点的自身的合成节点标识;Determine the synthetic node ID of each node;分别将每个节点接收到的单节点标识列表中与自身的合成节点标识相同的节点标识删除;Respectively delete the node ID that is the same as its own synthetic node ID from the single node ID list received by each node;生成每个节点的二度关系,所述二度关系用于指示二度关联节点和当前节点之间,间隔一个一度关联节点。A second-degree relationship is generated for each node, and the second-degree relationship is used to indicate that there is an interval between the second-degree associated node and the current node by a first-degree associated node.
- 根据权利要求1所述的基于图数据的全量关系计算方法,其中,在所述根据每个节点接收到的所述单节点标识列表,生成各个节点的二度关系之后,所述方法还包括:The method for calculating a full relationship based on graph data according to claim 1, wherein, after said generating the two-degree relationship of each node according to the single node identification list received by each node, the method further comprises:根据所述各个节点的二度关系生成二度关系标识列表,所述二度关系标识列表用于存储各个节点的二度关系;Generating a second-degree relationship identifier list according to the second-degree relationship of each node, where the second-degree relationship identifier list is used to store the second-degree relationship of each node;将每个节点的二度关系标识列表发送给所有的相邻节点;Send the second-degree relationship identifier list of each node to all adjacent nodes;根据每个节点接收到的相邻节点发送的二度关系标识列表生成每个节点的三度关系标识列表,所述三度关系标识列表用于存储各个节点的三度关系,所述三度关系用于指示三度关联节点和当前节点之间,间隔一个一度关联节点和一个二度关联节点。A three-degree relationship identifier list for each node is generated according to the two-degree relationship identifier list sent by neighboring nodes received by each node. The three-degree relationship identifier list is used to store the three-degree relationship of each node. Used to indicate the interval between a three-degree associated node and the current node, a first-degree associated node and a second-degree associated node.
- 根据权利要求1-6中任一所述的基于图数据的全量关系计算方法,其中,在所述获取预处理图数据,所述预处理图数据包括每个节点的节点数据和边数据之前,所述方法还包括:The method for calculating a full relationship based on graph data according to any one of claims 1-6, wherein, before said acquiring preprocessed graph data, said preprocessed graph data including node data and edge data of each node, The method also includes:获取每个节点的原始图数据;Obtain the original graph data of each node;对所述原始图数据进行去重处理和校验处理;Performing de-duplication processing and verification processing on the original image data;生成符合要求的预处理图数据。Generate preprocessed map data that meets the requirements.
- 一种基于图数据的全量关系计算装置,其中,包括:A full-scale relational calculation device based on graph data, which includes:第一获取单元,用于获取预处理图数据,所述预处理图数据包括每个节点的节点数据和边数据;The first acquiring unit is configured to acquire preprocessed graph data, where the preprocessed graph data includes node data and edge data of each node;运算生成单元,用于对所述节点数据进行位运算,生成每个节点的合成节点标识;An operation generating unit, configured to perform bit operation on the node data to generate a synthetic node identifier for each node;划分生成单元,用于将所述节点数据和所述边数据以每个节点数据为中心进行划分,生成多个数据组,每个数据组包括一个当前节点的合成节点标识以及与当前节点连接的边数据;The dividing and generating unit is used to divide the node data and the edge data with each node data as the center to generate a plurality of data groups, each data group including a synthetic node identifier of the current node and a connection to the current node Edge data第一发送单元,用于将每个节点的单节点标识列表发送给所有的相邻节点,所述单节点标识列表用于存储所述相邻节点的合成节点标识;The first sending unit is configured to send a single node identification list of each node to all adjacent nodes, where the single node identification list is used to store the composite node identification of the adjacent nodes;第一生成单元,用于根据每个节点接收到的所述单节点标识列表,生成各个节点的二度关系。The first generating unit is configured to generate the second-degree relationship of each node according to the single node identification list received by each node.
- 一种计算机设备,其中,包括:A computer device, which includes:一个或多个处理器;One or more processors;存储器;Memory一个或多个计算机程序,其中所述一个或多个计算机程序被存储在所述存储器中并被配置为由所述一个或多个处理器执行,所述一个或多个计算机程序配置用于执行一种基于图数据的全量关系计算方法,其中,所述基于图数据的全量关系计算方法包括:One or more computer programs, wherein the one or more computer programs are stored in the memory and configured to be executed by the one or more processors, and the one or more computer programs are configured to execute A method for calculating a full relationship based on graph data, wherein the method for calculating a full relationship based on graph data includes:获取预处理图数据,所述预处理图数据包括每个节点的节点数据和边数 据;Acquiring preprocessed graph data, the preprocessed graph data including node data and edge data of each node;对所述节点数据进行位运算,生成每个节点的合成节点标识;Performing bit operations on the node data to generate a synthetic node identifier for each node;将所述节点数据和所述边数据以每个节点数据为中心进行划分,生成多个数据组,每个数据组包括一个当前节点的合成节点标识以及与当前节点连接的边数据;Dividing the node data and the edge data with each node data as the center to generate a plurality of data groups, each data group including a synthetic node identifier of the current node and the edge data connected to the current node;将每个节点的单节点标识列表发送给所有的相邻节点,所述单节点标识列表用于存储所述相邻节点的合成节点标识;Sending a single node identification list of each node to all adjacent nodes, where the single node identification list is used to store the composite node identification of the adjacent nodes;根据每个节点接收到的所述单节点标识列表,生成各个节点的二度关系。According to the single node identification list received by each node, the second-degree relationship of each node is generated.
- 根据权利要求9所述的计算机设备,其中,所述对所述节点数据进行位运算,生成每个节点的合成节点标识包括:The computer device according to claim 9, wherein said performing a bit operation on said node data to generate a synthetic node identifier of each node comprises:确定所述节点数据中的多个节点;Determine multiple nodes in the node data;获取每个节点的节点属性和每个节点对应的初始节点标识;Obtain the node attributes of each node and the initial node ID corresponding to each node;获取预置规则,所述预置规则包括每个节点标识总共的存储比特位数和每个变量占据的存储位的起止序数;Acquiring a preset rule, the preset rule including a total number of storage bits for each node identifier and a starting and ending sequence number of storage bits occupied by each variable;按照所述每个节点标识总共的存储比特位数和所述每个变量占据的存储位的起止序数,将每个节点的节点属性和所述初始节点标识进行位运算;Perform a bit operation on the node attributes of each node and the initial node identifier according to the total number of storage bits for each node identifier and the starting and ending ordinal numbers of the storage positions occupied by each variable;生成每个节点的合成节点标识。Generate a synthetic node ID for each node.
- 根据权利要求9所述的计算机设备,其中,在所述根据每个节点接收到的所述单节点标识列表,生成各个节点的二度关系之后,所述方法还包括:The computer device according to claim 9, wherein, after said generating the two-degree relationship of each node according to the single node identification list received by each node, the method further comprises:根据每个节点的二度关系获取每个节点的节点属性,并根据每个节点的节点属性进行统计分析,生成分析结果。Obtain the node attributes of each node according to the two-degree relationship of each node, and perform statistical analysis according to the node attributes of each node to generate analysis results.
- 根据权利要求11所述的计算机设备,所述根据每个节点的二度关系获取每个节点的节点属性,并根据每个节点的节点属性进行统计分析,生成分析结果包括:11. The computer device according to claim 11, wherein said acquiring the node attribute of each node according to the two-degree relationship of each node, and performing statistical analysis according to the node attribute of each node, and generating the analysis result comprises:读取每个节点的二度关系;Read the second-degree relationship of each node;从所述二度关系中确定每个节点的节点属性;Determine the node attribute of each node from the two-degree relationship;按照预置规则将所述每个节点的节点属性从所述合成节点标识中分离;Separating the node attribute of each node from the synthetic node identifier according to a preset rule;对所述每个节点的节点属性进行统计分析,生成分析结果。Perform statistical analysis on the node attributes of each node to generate analysis results.
- 根据权利要求9所述的计算机设备,其中,所述根据每个节点接收到的所述单节点标识列表,生成各个节点的二度关系包括:The computer device according to claim 9, wherein the generating the second-degree relationship of each node according to the single node identification list received by each node comprises:接收每个节点的单节点标识列表;Receive a single node identification list of each node;确定每个节点的自身的合成节点标识;Determine the synthetic node ID of each node;分别将每个节点接收到的单节点标识列表中与自身的合成节点标识相同的节点标识删除;Respectively delete the node ID that is the same as its own synthetic node ID from the single node ID list received by each node;生成每个节点的二度关系,所述二度关系用于指示二度关联节点和当前节点之间,间隔一个一度关联节点。A second-degree relationship is generated for each node, and the second-degree relationship is used to indicate that there is an interval between the second-degree associated node and the current node by a first-degree associated node.
- 根据权利要求9所述的计算机设备,其中,在所述根据每个节点接收到的所述单节点标识列表,生成各个节点的二度关系之后,所述方法还包括:The computer device according to claim 9, wherein, after said generating the two-degree relationship of each node according to the single node identification list received by each node, the method further comprises:根据所述各个节点的二度关系生成二度关系标识列表,所述二度关系标识列表用于存储各个节点的二度关系;Generating a second-degree relationship identifier list according to the second-degree relationship of each node, where the second-degree relationship identifier list is used to store the second-degree relationship of each node;将每个节点的二度关系标识列表发送给所有的相邻节点;Send the second-degree relationship identifier list of each node to all adjacent nodes;根据每个节点接收到的相邻节点发送的二度关系标识列表生成每个节点的三度关系标识列表,所述三度关系标识列表用于存储各个节点的三度关系,所述三度关系用于指示三度关联节点和当前节点之间,间隔一个一度关联节点和一个二度关联节点。A three-degree relationship identifier list for each node is generated according to the two-degree relationship identifier list sent by neighboring nodes received by each node. The three-degree relationship identifier list is used to store the three-degree relationship of each node. Used to indicate the interval between a three-degree associated node and the current node, a first-degree associated node and a second-degree associated node.
- 根据权利要求9-14任一所述计算机设备,其中,在所述获取预处理图数据,所述预处理图数据包括每个节点的节点数据和边数据之前,所述方法还包括:The computer device according to any one of claims 9-14, wherein, before said acquiring preprocessed graph data, the preprocessed graph data including node data and edge data of each node, the method further comprises:获取每个节点的原始图数据;Obtain the original graph data of each node;对所述原始图数据进行去重处理和校验处理;Performing de-duplication processing and verification processing on the original image data;生成符合要求的预处理图数据。Generate preprocessed map data that meets the requirements.
- 一种计算机可读存储介质,其中,所述计算机可读存储介质上存储有计算机程序,该计算机程序被处理器执行时实现基于图数据的全量关系计算方法,其中,所述基于图数据的全量关系计算方法包括以下步骤:A computer-readable storage medium, wherein a computer program is stored on the computer-readable storage medium, and when the computer program is executed by a processor, a method for calculating a full relationship based on graph data is implemented, wherein the full amount of graph data is The relationship calculation method includes the following steps:获取预处理图数据,所述预处理图数据包括每个节点的节点数据和边数据;Acquiring preprocessed graph data, the preprocessed graph data including node data and edge data of each node;对所述节点数据进行位运算,生成每个节点的合成节点标识;Performing bit operations on the node data to generate a synthetic node identifier for each node;将所述节点数据和所述边数据以每个节点数据为中心进行划分,生成多个数据组,每个数据组包括一个当前节点的合成节点标识以及与当前节点连接的边数据;Dividing the node data and the edge data with each node data as the center to generate a plurality of data groups, each data group including a synthetic node identifier of the current node and the edge data connected to the current node;将每个节点的单节点标识列表发送给所有的相邻节点,所述单节点标识列表用于存储所述相邻节点的合成节点标识;Sending a single node identification list of each node to all adjacent nodes, where the single node identification list is used to store the composite node identification of the adjacent nodes;根据每个节点接收到的所述单节点标识列表,生成各个节点的二度关系。According to the single node identification list received by each node, the second-degree relationship of each node is generated.
- 根据权利要求16所述的计算机可读存储介质,其中,所述对所述节点数据进行位运算,生成每个节点的合成节点标识包括:The computer-readable storage medium according to claim 16, wherein the performing a bit operation on the node data to generate a synthetic node identifier for each node comprises:确定所述节点数据中的多个节点;Determine multiple nodes in the node data;获取每个节点的节点属性和每个节点对应的初始节点标识;Obtain the node attributes of each node and the initial node ID corresponding to each node;获取预置规则,所述预置规则包括每个节点标识总共的存储比特位数和每个变量占据的存储位的起止序数;Acquiring a preset rule, the preset rule including a total number of storage bits for each node identifier and a starting and ending sequence number of storage bits occupied by each variable;按照所述每个节点标识总共的存储比特位数和所述每个变量占据的存储位的起止序数,将每个节点的节点属性和所述初始节点标识进行位运算;Perform a bit operation on the node attributes of each node and the initial node identifier according to the total number of storage bits for each node identifier and the starting and ending ordinal numbers of the storage positions occupied by each variable;生成每个节点的合成节点标识。Generate a synthetic node ID for each node.
- 根据权利要求16所述的计算机可读存储介质,其中,在所述根据每个节点接收到的所述单节点标识列表,生成各个节点的二度关系之后,所述方法还包括:The computer-readable storage medium according to claim 16, wherein, after said generating the two-degree relationship of each node according to the single node identification list received by each node, the method further comprises:根据每个节点的二度关系获取每个节点的节点属性,并根据每个节点的节点属性进行统计分析,生成分析结果。Obtain the node attributes of each node according to the two-degree relationship of each node, and perform statistical analysis according to the node attributes of each node to generate analysis results.
- 根据权利要求18所述的计算机可读存储介质,其中,所述根据每个节点的二度关系获取每个节点的节点属性,并根据每个节点的节点属性进行统计分析,生成分析结果包括:18. The computer-readable storage medium according to claim 18, wherein the obtaining the node attribute of each node according to the two-degree relationship of each node, and performing statistical analysis according to the node attribute of each node, and generating the analysis result comprises:读取每个节点的二度关系;Read the second-degree relationship of each node;从所述二度关系中确定每个节点的节点属性;Determine the node attribute of each node from the two-degree relationship;按照预置规则将所述每个节点的节点属性从所述合成节点标识中分离;Separating the node attribute of each node from the synthetic node identifier according to a preset rule;对所述每个节点的节点属性进行统计分析,生成分析结果。Perform statistical analysis on the node attributes of each node to generate analysis results.
- 根据权利要求16所述的计算机可读存储介质,其中,所述根据每个节点接收到的所述单节点标识列表,生成各个节点的二度关系包括:The computer-readable storage medium according to claim 16, wherein the generating the second-degree relationship of each node according to the single node identification list received by each node comprises:接收每个节点的单节点标识列表;Receive a single node identification list of each node;确定每个节点的自身的合成节点标识;Determine the synthetic node ID of each node;分别将每个节点接收到的单节点标识列表中与自身的合成节点标识相同的节点标识删除;Respectively delete the node ID that is the same as its own synthetic node ID from the single node ID list received by each node;生成每个节点的二度关系,所述二度关系用于指示二度关联节点和当前节点之间,间隔一个一度关联节点。A second-degree relationship is generated for each node, and the second-degree relationship is used to indicate that there is an interval between the second-degree associated node and the current node by a first-degree associated node.
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910751784.4 | 2019-08-15 | ||
CN201910751784.4A CN110609924A (en) | 2019-08-15 | 2019-08-15 | Method, device and equipment for calculating total quantity relation based on graph data and storage medium |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2021027331A1 true WO2021027331A1 (en) | 2021-02-18 |
Family
ID=68889757
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/CN2020/087619 WO2021027331A1 (en) | 2019-08-15 | 2020-04-28 | Graph data-based full relationship calculation method and apparatus, device, and storage medium |
Country Status (2)
Country | Link |
---|---|
CN (1) | CN110609924A (en) |
WO (1) | WO2021027331A1 (en) |
Families Citing this family (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110609924A (en) * | 2019-08-15 | 2019-12-24 | 深圳壹账通智能科技有限公司 | Method, device and equipment for calculating total quantity relation based on graph data and storage medium |
CN111984828B (en) * | 2020-06-30 | 2024-07-23 | 联想(北京)有限公司 | Neighbor node retrieval method and device |
CN112528090B (en) * | 2020-12-11 | 2023-08-04 | 北京百度网讯科技有限公司 | Storage method and storage device for graph data |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN104952032A (en) * | 2015-06-19 | 2015-09-30 | 清华大学 | Graph processing method and device as well as rasterization representation and storage method |
CN106919628A (en) * | 2015-12-28 | 2017-07-04 | 阿里巴巴集团控股有限公司 | A kind for the treatment of method and apparatus of diagram data |
US20180203897A1 (en) * | 2017-01-18 | 2018-07-19 | Oracle International Corporation | Fast graph query engine optimized for typical real-world graph instances whose small portion of vertices have extremely large degree |
CN109446362A (en) * | 2018-09-05 | 2019-03-08 | 北京费马科技有限公司 | Chart database structure, diagram data storage method, device based on external memory |
CN110609924A (en) * | 2019-08-15 | 2019-12-24 | 深圳壹账通智能科技有限公司 | Method, device and equipment for calculating total quantity relation based on graph data and storage medium |
-
2019
- 2019-08-15 CN CN201910751784.4A patent/CN110609924A/en active Pending
-
2020
- 2020-04-28 WO PCT/CN2020/087619 patent/WO2021027331A1/en active Application Filing
Patent Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN104952032A (en) * | 2015-06-19 | 2015-09-30 | 清华大学 | Graph processing method and device as well as rasterization representation and storage method |
CN106919628A (en) * | 2015-12-28 | 2017-07-04 | 阿里巴巴集团控股有限公司 | A kind for the treatment of method and apparatus of diagram data |
US20180203897A1 (en) * | 2017-01-18 | 2018-07-19 | Oracle International Corporation | Fast graph query engine optimized for typical real-world graph instances whose small portion of vertices have extremely large degree |
CN109446362A (en) * | 2018-09-05 | 2019-03-08 | 北京费马科技有限公司 | Chart database structure, diagram data storage method, device based on external memory |
CN110609924A (en) * | 2019-08-15 | 2019-12-24 | 深圳壹账通智能科技有限公司 | Method, device and equipment for calculating total quantity relation based on graph data and storage medium |
Also Published As
Publication number | Publication date |
---|---|
CN110609924A (en) | 2019-12-24 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US11868359B2 (en) | Dynamically assigning queries to secondary query processing resources | |
US12013856B2 (en) | Burst performance of database queries according to query size | |
WO2021027331A1 (en) | Graph data-based full relationship calculation method and apparatus, device, and storage medium | |
CN110543586B (en) | Multi-user identity fusion method, device, equipment and storage medium | |
US11727004B2 (en) | Context dependent execution time prediction for redirecting queries | |
US20240126817A1 (en) | Graph data query | |
CN113687964B (en) | Data processing method, device, electronic equipment, storage medium and program product | |
CA3119167A1 (en) | Approach for a controllable trade-off between cost and availability of indexed data in a cloud log aggregation solution such as splunk or sumo | |
Ho et al. | Distributed graph database for large-scale social computing | |
Hu et al. | Output-optimal massively parallel algorithms for similarity joins | |
Elagib et al. | Big data analysis solutions using MapReduce framework | |
CN112925859B (en) | Data storage method and device | |
CN111404932A (en) | Method for accessing medical institution system to smart medical cloud service platform | |
WO2021082928A1 (en) | Data reduction method and apparatus, computing device, and storage medium | |
US10162830B2 (en) | Systems and methods for dynamic partitioning in distributed environments | |
US20210365300A9 (en) | Systems and methods for dynamic partitioning in distributed environments | |
EP3926453A1 (en) | Partitioning method and apparatus therefor | |
US20200104425A1 (en) | Techniques for lossless and lossy large-scale graph summarization | |
US11500931B1 (en) | Using a graph representation of join history to distribute database data | |
CN112612832B (en) | Node analysis method, device, equipment and storage medium | |
CN111723089A (en) | Method and device for processing data based on columnar storage format | |
Cao et al. | LogKV: Exploiting key-value stores for event log processing | |
US11249901B2 (en) | Ownership-based garbage collection of data | |
CN110309206B (en) | Order information acquisition method and system | |
US11768814B2 (en) | Data transmissions between two databases |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
121 | Ep: the epo has been informed by wipo that ep was designated in this application |
Ref document number: 20852001 Country of ref document: EP Kind code of ref document: A1 |
|
NENP | Non-entry into the national phase |
Ref country code: DE |
|
122 | Ep: pct application non-entry in european phase |
Ref document number: 20852001 Country of ref document: EP Kind code of ref document: A1 |
|
32PN | Ep: public notification in the ep bulletin as address of the adressee cannot be established |
Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 03.08.2022) |
|
122 | Ep: pct application non-entry in european phase |
Ref document number: 20852001 Country of ref document: EP Kind code of ref document: A1 |