CN111159178B - Data map path navigation method based on big data SQL analysis - Google Patents
Data map path navigation method based on big data SQL analysis
- Publication number
- CN111159178B
- Authority
- CN
- China
- Prior art keywords
- data
- data table
- path
- relation
- map
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/22—Indexing; Data structures therefor; Storage structures
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/22—Indexing; Data structures therefor; Storage structures
- G06F16/2282—Tablespace storage structures; Management thereof
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/24—Querying
- G06F16/242—Query formulation
- G06F16/2433—Query languages
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/24—Querying
- G06F16/248—Presentation of query results
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02D—CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
- Y02D10/00—Energy efficient computing, e.g. low power processors, power management or thermal management
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- Data Mining & Analysis (AREA)
- Databases & Information Systems (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Computational Linguistics (AREA)
- Software Systems (AREA)
- Mathematical Physics (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
The invention relates to the field of big data processing, and in particular to a data map path navigation method based on big data SQL analysis, comprising the following steps: parsing the SQL in the data query scripts and database execution logs from each data application system to obtain the association relations between data tables and data fields; establishing a data table relation library and storing the parsed association relations between data tables and data fields in it; connecting the data tables and data fields through these association relations to form a data network, compiling a data topological relation diagram from it, and constructing a data service map from the topological relation diagram; and setting a starting point data table and an end point data table in the data service map, calculating the required data path from the data table relation library, and displaying the data path on the data service map. The beneficial effect of the invention is that navigation and query of the data are realized.
Description
Technical Field
The invention relates to the field of big data processing, in particular to a data map path navigation method based on big data SQL analysis.
Background
A database index is an ordered data structure in a database management system that speeds up querying and updating the data in a database table. An index is a directory built from the values of certain fields in order to make table lookups more efficient.
However, a database index does not capture the association relations between data tables and data fields, and the path an index follows cannot be displayed visually.
Disclosure of Invention
To solve the above problems, the invention provides a data map path navigation method based on big data SQL analysis.
A data map path navigation method based on big data SQL analysis comprises the following steps:
parsing the SQL in the data query scripts and database execution logs from each data application system to obtain the association relations between data tables and data fields;
establishing a data table relation library, and storing the parsed association relations between data tables and data fields in the data table relation library;
connecting the data tables and data fields through the association relations between them to form a data network, compiling a data topological relation diagram from it, and constructing a data service map from the topological relation diagram;
setting a starting point data table and an end point data table in the data service map, calculating the required data path from the data table relation library, and displaying the data path on the data service map.
Preferably, parsing the SQL in the data query scripts and database execution logs from each data application system to obtain the association relations between data tables and data fields comprises:
for any SQL statement, finding a Boolean relational expression whose left and right sides are both expressions containing fields;
determining that the parent query of the expression is a table join query, and obtaining the table association type;
continuing to resolve the expressions on the left and right sides of the Boolean expression down to the sets of fields of the original base tables, to obtain the set of base tables to which the fields belong.
Preferably, connecting the data tables and data fields through the association relations between them to form a data network, compiling a data topological relation diagram, and constructing the data service map from the topological relation diagram comprises:
connecting the data tables and data fields through the association relations between them to form a data network, in which each data table is represented by a point element, each relation between data tables is represented by a line connecting two points, and the expression associating the specific data fields of the two tables is displayed on the line, thereby obtaining a data topological relation diagram; and constructing a data service map covering the full service domain from the data topological relation diagram.
Preferably, setting a starting point data table and an end point data table in the data service map, calculating the required data path from the data table relation library, and displaying the data path on the data service map comprises:
calculating the path between the starting point data table and the end point data table that passes through the fewest associated data table nodes, to obtain the shortest path.
Preferably, setting a starting point data table and an end point data table in the data service map, calculating the required data path from the data table relation library, and displaying the data path on the data service map comprises:
calculating the path between the starting point data table and the end point data table whose association query executes fastest, to obtain the fastest path.
Preferably, setting a starting point data table and an end point data table in the data service map, calculating the required data path from the data table relation library, and displaying the data path on the data service map comprises:
calculating the path formed by the association between the starting point data table and the end point data table that is used most often in the historical execution records, to obtain the most common path.
Preferably, setting a starting point data table and an end point data table in the data service map, calculating the required data path from the data table relation library, and displaying the data path on the data service map comprises:
calculating an optimal path between the starting point data table and the end point data table by jointly considering execution efficiency, table relation complexity, and historical usage frequency.
The beneficial effects of the invention are as follows. The SQL in the data query scripts and database execution logs is parsed to obtain the association relations between data tables and data fields. A data table relation library is established, and the parsed association relations are stored in it. The data tables and data fields are connected through these association relations to form a data network, from which a data topological relation diagram is compiled and a data service map is constructed. A starting point data table and an end point data table are set in the data service map, and the required data path is calculated from the data table relation library and displayed on the data service map. Navigation and query of the data are thereby realized.
Drawings
The invention will be described in further detail with reference to the drawings and the detailed description.
FIG. 1 is a schematic flowchart of steps S1-S4 of a data map path navigation method based on big data SQL analysis according to an embodiment of the invention;
FIG. 2 is a schematic flowchart of step S1 of the data map path navigation method based on big data SQL analysis according to an embodiment of the invention.
Detailed Description
The technical scheme of the present invention will be further described with reference to the accompanying drawings, but the present invention is not limited to these examples.
The basic idea of the invention is as follows: parse the SQL in the data query scripts and database execution logs to obtain the association relations between data tables and data fields; establish a data table relation library and store the parsed association relations in it; connect the data tables and data fields through these association relations to form a data network and compile a data topological relation diagram; construct a data service map from the topological relation diagram; and set a starting point data table and an end point data table in the data service map, calculate the required data path from the data table relation library, and display it on the data service map, thereby realizing navigation and query of the data.
Based on the above ideas, the invention provides a data map path navigation method based on big data SQL analysis, as shown in FIG. 1, comprising the following steps:
S1: Parse the SQL in the data query scripts and database execution logs to obtain the association relations between data tables and data fields. This step specifically comprises:
S11: for any SQL statement, finding a Boolean relational expression whose left and right sides are both expressions containing fields;
S12: determining that the parent query of the expression is a table join query, and obtaining the table association type;
S13: continuing to resolve the expressions on the left and right sides of the Boolean expression down to the sets of fields of the original base tables, to obtain the set of base tables to which the fields belong.
Specifically, for any SQL statement, a Boolean relational expression is found whose left and right sides are both expressions containing fields. The parent query of the expression is determined to be a table join query, and the table association type is obtained. The expressions on both sides of the Boolean expression are then resolved further, down to the sets of fields of the original base tables, to obtain the set of base tables to which the fields belong.
The fields contained in the expressions at the two ends of the Boolean relational expression come from different data tables; an association between two fields of the same data table is not classed as an association relation between tables. If one end of the relation is a result computed from several fields, whether from the same table or from different tables, and that result is associated with the other end, the relation is counted as a combined association relation. The association types further include: left association (the returned records include all records in the left table and the records in the right table whose join fields are equal), right association (the returned records include all records in the right table and the records in the left table whose join fields are equal), equi-join (only rows whose join fields are equal in both tables are returned), and so on.
By using deep SQL parsing and determining the table association type, the relation between the two sides is reflected accurately, since different join types return different data results. This improves the precision and accuracy of the analysis of data tables and data relations, and the parsed data elements are finer-grained.
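As a non-limiting illustration of steps S11-S13, the following Python sketch extracts join conditions from a single SQL statement. It is only a sketch under stated assumptions: the names `extract_relations`, `TABLE_RE`, and `JOIN_RE` are hypothetical and do not appear in the original text, and regular expressions stand in for the deep SQL parser the description refers to. It handles only explicit `JOIN ... ON a.x = b.y` clauses with simple table names, not subqueries, computed expressions, or joins written in a `WHERE` clause.

```python
import re

# Negative lookahead so SQL keywords are never mistaken for a table alias.
NOT_KEYWORD = r"(?!(?:LEFT|RIGHT|INNER|FULL|OUTER|JOIN|ON|WHERE)\b)"
# "FROM orders o" or "JOIN customers AS c": captures (base table, optional alias).
TABLE_RE = re.compile(
    rf"\b(?:FROM|JOIN)\s+(\w+)(?:\s+(?:AS\s+)?{NOT_KEYWORD}(\w+))?",
    re.IGNORECASE,
)
# "[LEFT|RIGHT|FULL|INNER] JOIN t [AS a] ON x.f = y.g": captures the join type
# and the qualified field on each side of the Boolean relational expression.
JOIN_RE = re.compile(
    r"\b(?:(LEFT|RIGHT|FULL|INNER)\s+(?:OUTER\s+)?)?JOIN\s+\w+"
    r"(?:\s+(?:AS\s+)?\w+)?\s+ON\s+(\w+)\.(\w+)\s*=\s*(\w+)\.(\w+)",
    re.IGNORECASE,
)


def extract_relations(sql):
    """Return one dict per join condition, with aliases resolved to base tables."""
    # Map every alias (or bare table name) to its base table (step S13).
    alias_to_table = {}
    for table, alias in TABLE_RE.findall(sql):
        alias_to_table[(alias or table).lower()] = table.lower()

    relations = []
    for kind, l_alias, l_field, r_alias, r_field in JOIN_RE.findall(sql):
        relations.append({
            # A plain JOIN is treated as an inner (equi) join (step S12).
            "type": (kind or "INNER").upper(),
            "left": (alias_to_table.get(l_alias.lower(), l_alias.lower()),
                     l_field.lower()),
            "right": (alias_to_table.get(r_alias.lower(), r_alias.lower()),
                      r_field.lower()),
        })
    return relations
```

Applied to a query that left-joins `orders` to `customers` and then joins `regions`, the function returns two relations whose aliases have been replaced by the base table names. A production implementation would more plausibly walk the syntax tree produced by a real SQL parser.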
S2: Establish a data table relation library, and store the parsed association relations between data tables and data fields in the data table relation library.
The data table relation library stores the parsed association relations between data tables and data fields. As the number of samples accumulates, the association relations between data tables and data fields are gradually consolidated and the data table relation library is updated.
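One possible shape for such a relation library is sketched below using Python's built-in `sqlite3` module. The schema, the table name `table_relation`, the `seen_count` column, and the function names are all assumptions made for illustration; the original text does not specify how the library is stored. The counter shows one way the "accumulation of samples" could be recorded: each time the same relation is parsed again, its count is incremented rather than a duplicate row being stored.

```python
import sqlite3


def open_relation_library(path=":memory:"):
    """Create (if needed) and open the hypothetical relation library."""
    conn = sqlite3.connect(path)
    conn.execute("""
        CREATE TABLE IF NOT EXISTS table_relation (
            left_table  TEXT NOT NULL,
            left_field  TEXT NOT NULL,
            right_table TEXT NOT NULL,
            right_field TEXT NOT NULL,
            join_type   TEXT NOT NULL,
            seen_count  INTEGER NOT NULL DEFAULT 1,
            PRIMARY KEY (left_table, left_field,
                         right_table, right_field, join_type)
        )""")
    return conn


def record_relation(conn, left, right, join_type):
    """Store one parsed relation, or bump its counter if already known.

    left and right are (table, field) pairs.
    """
    key = (*left, *right, join_type)
    cursor = conn.execute(
        "UPDATE table_relation SET seen_count = seen_count + 1 "
        "WHERE left_table = ? AND left_field = ? AND right_table = ? "
        "AND right_field = ? AND join_type = ?",
        key,
    )
    if cursor.rowcount == 0:  # relation not seen before
        conn.execute(
            "INSERT INTO table_relation VALUES (?, ?, ?, ?, ?, 1)", key)
    conn.commit()
```

The `seen_count` column could later feed the historical usage frequency used when ranking paths, although that link is an assumption of this sketch.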
S3: Connect the data tables and data fields through the association relations between them to form a data network, compile a data topological relation diagram from it, and construct a data service map from the topological relation diagram.
The data tables and data fields are connected through the association relations between them to form a data network, in which each data table is represented by a point element, each relation between data tables is represented by a line connecting two points, and the expression associating the specific fields of the two tables is displayed on the line. The resulting topological diagram of data table relations shows, for any given data table, all of its table relations and its adjacent tables; a data service map covering the full service domain can be constructed from this diagram.
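The point-and-line structure described above corresponds to an undirected labelled graph. The following sketch, with the hypothetical function name `build_topology` and an assumed input format of four-element tuples, builds such a graph as a nested dictionary: tables are nodes, and each edge carries the list of field-association expressions that would be shown on the connecting line. Rendering the graph on screen is outside the scope of the sketch.

```python
from collections import defaultdict


def build_topology(relations):
    """Build {table: {adjacent_table: [association expressions]}}.

    relations is an iterable of
    (left_table, left_field, right_table, right_field) tuples.
    """
    graph = defaultdict(dict)
    for left_table, left_field, right_table, right_field in relations:
        label = f"{left_table}.{left_field} = {right_table}.{right_field}"
        # The relation is navigable in both directions, so add both edges.
        graph[left_table].setdefault(right_table, []).append(label)
        graph[right_table].setdefault(left_table, []).append(label)
    return graph
```

Looking up `graph[table]` then yields the adjacent tables of any given data table together with the expressions linking them.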
S4: Set a starting point data table and an end point data table in the data service map, calculate the required data path from the data table relation library, and display the data path on the data service map.
In one embodiment, obtaining the data path includes: calculating the path between the starting point data table and the end point data table that passes through the fewest associated data table nodes, to obtain the shortest path.
Among all paths that can associate the starting point data table with the end point data table, the path that passes through the fewest intermediate association table nodes is taken as the shortest path. For example, if the start table is directly associated with the end table, the number of intermediate nodes is 0 and that path is the shortest path; if every path from the start to the end passes through intermediate nodes, the path with the fewest intermediate nodes is the shortest path.
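Finding the path with the fewest intermediate tables is a fewest-hops search on an unweighted graph, for which breadth-first search is a standard choice. The original text does not name an algorithm, so the sketch below is one plausible realisation; the function name `shortest_path` and the adjacency-dictionary input format are assumptions.

```python
from collections import deque


def shortest_path(graph, start, end):
    """Return the path with the fewest tables from start to end, or None.

    graph maps each table name to an iterable of adjacent table names.
    """
    if start == end:
        return [start]
    previous = {start: None}  # also serves as the visited set
    queue = deque([start])
    while queue:
        table = queue.popleft()
        for neighbor in graph.get(table, ()):
            if neighbor in previous:
                continue
            previous[neighbor] = table
            if neighbor == end:
                # Walk the predecessor links back to the start table.
                path = [end]
                while previous[path[-1]] is not None:
                    path.append(previous[path[-1]])
                return path[::-1]
            queue.append(neighbor)
    return None  # the two tables are not connected
```

A directly associated pair yields a two-element path with zero intermediate nodes, matching the example in the text.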
In one embodiment, obtaining the data path includes: calculating the path between the starting point data table and the end point data table whose association query executes fastest, to obtain the fastest path.
When the start table and the end table can be associated through different association relations, the execution speed of the different association queries is affected by factors such as differing business logic complexity, the database operating environment, and the number of association nodes; the path with the highest execution speed is selected as the fastest path.
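If each association is given a measured or estimated execution cost, the fastest path can be found with Dijkstra's algorithm. The sketch below assumes that the cost of a path is the sum of per-join costs in seconds, which is a simplification introduced here and not stated in the original; real query times depend on the optimizer and the runtime environment. The function name `fastest_path` and the input format are likewise hypothetical.

```python
import heapq


def fastest_path(graph, start, end):
    """Return (total_cost, path) for the cheapest path, or None.

    graph maps each table to {adjacent_table: cost_in_seconds}; costs
    must be non-negative for Dijkstra's algorithm to be valid.
    """
    best = {start: 0.0}
    previous = {}
    heap = [(0.0, start)]
    while heap:
        cost, table = heapq.heappop(heap)
        if table == end:
            path = [end]
            while path[-1] != start:
                path.append(previous[path[-1]])
            return cost, path[::-1]
        if cost > best.get(table, float("inf")):
            continue  # stale heap entry; a cheaper route was already found
        for neighbor, seconds in graph.get(table, {}).items():
            new_cost = cost + seconds
            if new_cost < best.get(neighbor, float("inf")):
                best[neighbor] = new_cost
                previous[neighbor] = table
                heapq.heappush(heap, (new_cost, neighbor))
    return None
```

Note that the fastest path need not be the shortest one: a direct but slow join can lose to a two-step route through a well-indexed intermediate table.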
In one embodiment, obtaining the data path includes: calculating the path formed by the association between the starting point data table and the end point data table that is used most often in the historical execution records, to obtain the most common path.
The most common path is the association between the starting point and end point data tables that is used most often in the historical execution records.
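Assuming that each historical execution record has already been reduced to the ordered list of tables it joined (an assumption of this sketch; the original does not describe the record format), the most common path can be found by simple counting. The function name `most_common_path` is hypothetical.

```python
from collections import Counter


def most_common_path(history, start, end):
    """Return (path, use_count) for the most used start-to-end path, or None.

    history is an iterable of table-name lists, one per executed query.
    """
    counts = Counter(
        tuple(path) for path in history
        if path and path[0] == start and path[-1] == end
    )
    if not counts:
        return None
    path, uses = counts.most_common(1)[0]
    return list(path), uses
```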
In one embodiment, obtaining the data path includes: calculating an optimal path between the starting point data table and the end point data table by jointly considering execution efficiency, table relation complexity, and historical usage frequency.
Among all paths from the start table to the end table, weight proportions are set for execution speed, number of nodes passed, business logic complexity, and historical usage frequency; the weights of user feedback collected while paths are in use, such as ratings, likes, and reviews, are added; and an intelligent recommendation algorithm model calculates the path with the highest overall score, which is used as the recommended path.
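The original does not disclose the recommendation model, so the following is only a minimal sketch of one way to combine the listed factors: each metric is min-max normalised across the candidate paths so that 1.0 is always the best value, and the normalised values are combined as a weighted sum. The function name `recommend_path`, the metric keys, and the example weights are all hypothetical, and user feedback such as ratings or likes is omitted; it could be added as a further weighted term.

```python
def recommend_path(candidates, weights):
    """Return (best_name, scores) using a weighted sum of normalised metrics.

    candidates maps a path name to a dict with the keys "seconds",
    "nodes", "complexity" (lower is better) and "uses" (higher is better).
    weights maps the same four keys to their weight proportions.
    """
    lower_is_better = {"seconds": True, "nodes": True,
                       "complexity": True, "uses": False}

    def normalised(metric):
        values = {name: m[metric] for name, m in candidates.items()}
        low, high = min(values.values()), max(values.values())
        if high == low:  # every candidate ties on this metric
            return {name: 1.0 for name in values}
        if lower_is_better[metric]:
            return {n: (high - v) / (high - low) for n, v in values.items()}
        return {n: (v - low) / (high - low) for n, v in values.items()}

    parts = {metric: normalised(metric) for metric in lower_is_better}
    scores = {
        name: sum(weights[metric] * parts[metric][name] for metric in parts)
        for name in candidates
    }
    return max(scores, key=scores.get), scores
```

With weights that favour execution speed and usage history, a slightly longer but faster and more frequently used path can outrank the direct one.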
In practical applications, data service personnel can select the path type they actually need by entering a data start point, a data end point, data waypoints, and so on in the data map, and can create data models and data query scripts efficiently and conveniently. Path navigation over data such as tables and fields is thus realized on the basis of the data map constructed with the SQL parsing technique.
Those skilled in the art may make various modifications or additions to the described embodiments or substitutions thereof without departing from the spirit of the invention or exceeding the scope of the invention as defined in the accompanying claims.
Claims (7)
1. A data map path navigation method based on big data SQL analysis, characterized by comprising the following steps:
parsing the SQL in the data query scripts and database execution logs from each data application system to obtain the association relations between data tables and data fields;
establishing a data table relation library, and storing the parsed association relations between data tables and data fields in the data table relation library;
connecting the data tables and data fields through the association relations between them to form a data network, compiling a data topological relation diagram from it, and constructing a data service map from the topological relation diagram;
setting a starting point data table and an end point data table in the data service map, calculating the required data path from the data table relation library, and displaying the data path on the data service map.
2. The data map path navigation method based on big data SQL analysis according to claim 1, wherein parsing the SQL in the data query scripts and database execution logs from each data application system to obtain the association relations between data tables and data fields comprises:
for any SQL statement, finding a Boolean relational expression whose left and right sides are both expressions containing fields;
determining that the parent query of the expression is a table join query, and obtaining the table association type;
continuing to resolve the expressions on the left and right sides of the Boolean expression down to the sets of fields of the original base tables, to obtain the set of base tables to which the fields belong.
3. The data map path navigation method based on big data SQL analysis according to claim 1, wherein connecting the data tables and data fields through the association relations between them to form a data network, compiling a data topological relation diagram, and constructing a data service map from the topological relation diagram comprises:
connecting the data tables and data fields through the association relations between them to form a data network, in which each data table is represented by a point element, each relation between data tables is represented by a line connecting two points, and the expression associating the specific data fields of the two tables is displayed on the line, thereby obtaining a data topological relation diagram; and constructing a data service map covering the full service domain from the data topological relation diagram.
4. The data map path navigation method based on big data SQL analysis according to claim 1, wherein setting a starting point data table and an end point data table in the data service map, calculating the required data path from the data table relation library, and displaying the data path on the data service map comprises:
calculating the path between the starting point data table and the end point data table that passes through the fewest associated data table nodes, to obtain the shortest path.
5. The data map path navigation method based on big data SQL analysis according to claim 1, wherein setting a starting point data table and an end point data table in the data service map, calculating the required data path from the data table relation library, and displaying the data path on the data service map comprises:
calculating the path between the starting point data table and the end point data table whose association query executes fastest, to obtain the fastest path.
6. The data map path navigation method based on big data SQL analysis according to claim 1, wherein setting a starting point data table and an end point data table in the data service map, calculating the required data path from the data table relation library, and displaying the data path on the data service map comprises:
calculating the path formed by the association between the starting point data table and the end point data table that is used most often in the historical execution records, to obtain the most common path.
7. The data map path navigation method based on big data SQL analysis according to claim 1, wherein setting a starting point data table and an end point data table in the data service map, calculating the required data path from the data table relation library, and displaying the data path on the data service map comprises:
calculating an optimal path between the starting point data table and the end point data table by jointly considering execution efficiency, table relation complexity, and historical usage frequency.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201911271405.8A CN111159178B (en) | 2019-12-12 | 2019-12-12 | Data map path navigation method based on big data SQL analysis |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201911271405.8A CN111159178B (en) | 2019-12-12 | 2019-12-12 | Data map path navigation method based on big data SQL analysis |
Publications (2)
Publication Number | Publication Date |
---|---|
CN111159178A CN111159178A (en) | 2020-05-15 |
CN111159178B true CN111159178B (en) | 2023-06-13 |
Family
ID=70557077
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201911271405.8A Active CN111159178B (en) | 2019-12-12 | 2019-12-12 | Data map path navigation method based on big data SQL analysis |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN111159178B (en) |
Families Citing this family (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112948651B (en) * | 2021-03-31 | 2022-07-29 | 重庆市规划设计研究院 | Efficient OD data visualization method and system |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN105574048A (en) * | 2014-10-28 | 2016-05-11 | 西安造新电子信息科技有限公司 | Method for performing abstraction and path finding for network data |
CN106227892A (en) * | 2016-08-24 | 2016-12-14 | 深圳市卓讯信息技术有限公司 | Method and device for intelligently analyzing database table relations to generate an E-R diagram |
CN109766345A (en) * | 2019-01-10 | 2019-05-17 | 深圳前海微众银行股份有限公司 | Metadata processing method, device, equipment, and readable storage medium |
Family Cites Families (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US8948596B2 (en) * | 2011-07-01 | 2015-02-03 | CetusView Technologies, LLC | Neighborhood node mapping methods and apparatus for ingress mitigation in cable communication systems |
US9430360B2 (en) * | 2014-02-07 | 2016-08-30 | Wipro Limited | System and method for automatically testing performance of high-volume web navigation tree services |
US10691682B2 (en) * | 2017-10-04 | 2020-06-23 | EMC IP Holding Company LLC | Storing and processing JSON documents in a SQL database table |
-
2019
- 2019-12-12 CN CN201911271405.8A patent/CN111159178B/en active Active
Non-Patent Citations (1)
Title |
---|
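Zhou Aihua; Qiu Hongbin; Gao Kunlun; Hu Bin; Chai Bo. Research on power grid topology analysis technology based on graph database. Electric Power Information and Communication Technology. 2018, (08), full text. * |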
Also Published As
Publication number | Publication date |
---|---|
CN111159178A (en) | 2020-05-15 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US11372851B2 (en) | Systems and methods for rapid data analysis | |
JP4397978B2 (en) | Binding ordering method using concentration | |
US8065262B2 (en) | Computer-implemented multidimensional database processing method and system | |
CN107016072B (en) | Knowledge inference system and method based on social network knowledge graph | |
US9652497B2 (en) | Processing queries using hybrid access paths | |
EP0434586A2 (en) | Attribute-based method and apparatus of classification and retrieval | |
CN106874426B (en) | RDF (resource description framework) streaming data keyword real-time searching method based on Storm | |
CN109308303B (en) | Multi-table connection online aggregation method based on Markov chain | |
CN104573039A (en) | Keyword search method of relational database | |
Li et al. | An approach for approximate subgraph matching in fuzzy RDF graph | |
CN1323366C (en) | Methods and apparatus for query rewrite with auxiliary attributes in query processing operations | |
CN107193882A (en) | Why not query answer methods based on figure matching on RDF data | |
CN102855332A (en) | Graphic configuration management database based on graphic database | |
US11573987B2 (en) | System for detecting data relationships based on sample data | |
CN111159178B (en) | Data map path navigation method based on big data SQL analysis | |
CN103077216A (en) | Sub-graph matching device and sub-graph matching method | |
CN110737779A (en) | Knowledge graph construction method and device, storage medium and electronic equipment | |
CN105320700A (en) | Database dynamic query form generation method | |
TWI567574B (en) | A clustering method for mining relevance of search keywords and websites and a system thereof | |
CN111552856A (en) | Microblog public opinion propagation path analysis method | |
Boehm et al. | Squash: a tool for analyzing, tuning and refactoring relational database applications | |
Abilasha et al. | A genetic algorithm based heuristic search on graphs with weighted multiple attributes | |
CN114185929B (en) | Method and device for acquiring visual configuration for data query | |
Guzewicz | ExpRalytics: Expressive and Efficient Analytics for RDF Graphs | |
EP4462272A2 (en) | System for detecting data relationships based on sample data |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||