CN111159178B - Data map path navigation method based on big data SQL analysis - Google Patents


Info

Publication number
CN111159178B
Authority
CN
China
Prior art keywords
data
data table
path
relation
map
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201911271405.8A
Other languages
Chinese (zh)
Other versions
CN111159178A (en)
Inventor
王仲锋
杨春晨
丁雪花
李冰
纪德良
石佳
解林超
阳东
王永平
于亚丰
汪娟玉
胡如一
姜震
蒋斌
徐宏伟
王澍
姜小建
吕旭芬
谭程文
吴美娟
方豪强
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhejiang Huayun Information Technology Co Ltd
Original Assignee
Zhejiang Huayun Information Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Zhejiang Huayun Information Technology Co Ltd
Priority to CN201911271405.8A
Publication of CN111159178A
Application granted
Publication of CN111159178B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/22 Indexing; Data structures therefor; Storage structures
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/22 Indexing; Data structures therefor; Storage structures
    • G06F16/2282 Tablespace storage structures; Management thereof
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/242 Query formulation
    • G06F16/2433 Query languages
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/248 Presentation of query results
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Computational Linguistics (AREA)
  • Software Systems (AREA)
  • Mathematical Physics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention relates to the field of big data processing, and in particular to a data map path navigation method based on big data SQL analysis, which comprises the following steps: parsing the SQL in the data query scripts and database execution logs from each data application system to obtain the association relations between data tables and data fields; establishing a data table relation library and storing the parsed association relations between data tables and data fields in it; connecting the data tables and data fields through these association relations to form a data network, so as to draw a data topological relation diagram, and constructing a data service map based on that diagram; and setting a start point data table and an end point data table in the data service map, calculating according to the data table relation library, obtaining the required data path, and displaying it on the data service map. The beneficial effect of the invention is that navigation and query of data are realized.

Description

Data map path navigation method based on big data SQL analysis
Technical Field
The invention relates to the field of big data processing, in particular to a data map path navigation method based on big data SQL analysis.
Background
A database index is an ordered data structure in a database management system that facilitates fast querying and updating of the data in database tables. It acts as a directory built from the values of certain fields in order to make table lookups more efficient.
However, a database index does not capture the association relations between data tables and data fields, and the path followed through an index cannot be displayed visually.
Disclosure of Invention
To solve the above problems, the invention provides a data map path navigation method based on big data SQL analysis.
A data map path navigation method based on big data SQL analysis comprises the following steps:
analyzing the data query script and the database execution log from each data application system through SQL to obtain the association relation between the data table and the data field;
establishing a data table relation library, and storing the association relation between the parsed data table and the parsed data field into the data table relation library;
connecting the data table and the data field through the association relation between the data table and the data field to form a data network so as to compile a data topological relation diagram, and constructing a data service map by relying on the topological relation diagram;
setting a starting point data table and an end point data table in the data service map, calculating according to the data table relation library, obtaining a required data path and displaying the data path on the data service map.
Preferably, parsing the SQL in the data query scripts and database execution logs from each data application system to obtain the association relations between data tables and data fields includes:
for any SQL statement, finding a Boolean relational expression whose left and right sides are both expressions containing fields;
determining that the parent query of the expression is a table join query, and obtaining the table association type;
continuing to resolve the left and right side expressions of the Boolean expression down to the field sets of the original base tables, to obtain the set of base tables to which the fields belong.
Preferably, connecting the data tables and data fields through the association relations between them to form a data network, so as to draw a data topological relation diagram, and constructing the data service map based on the topological relation diagram includes:
connecting the data tables and data fields into a data network through the association relations between them, wherein each data table is represented by a point element, a relation between two data tables is represented by a connecting line between the two points, and the expression that associates the specific data fields of the two data tables is displayed on the connecting line, thereby obtaining a data topological relation diagram; and constructing a data service map covering the full service domain according to the data topological relation diagram.
Preferably, setting a start point data table and an end point data table in the data service map, calculating according to the data table relation library, obtaining the required data path, and displaying the data path on the data service map includes:
calculating the path that passes through the fewest associated data table nodes between the start point data table and the end point data table, to obtain the shortest path.
Preferably, setting a start point data table and an end point data table in the data service map, calculating according to the data table relation library, obtaining the required data path, and displaying the data path on the data service map includes:
calculating the path whose association query executes fastest between the start point data table and the end point data table, to obtain the fastest path.
Preferably, setting a start point data table and an end point data table in the data service map, calculating according to the data table relation library, obtaining the required data path, and displaying the data path on the data service map includes:
calculating the path formed by the association mode used most often between the start point data table and the end point data table in the historical execution records, to obtain the most common path.
Preferably, setting a start point data table and an end point data table in the data service map, calculating according to the data table relation library, obtaining the required data path, and displaying the data path on the data service map includes:
calculating according to the combined execution efficiency, table relation complexity, and historical usage frequency between the start point data table and the end point data table, to obtain the optimal path.
The beneficial effects of the invention are as follows. The SQL in the data query scripts and database execution logs is parsed to obtain the association relations between data tables and data fields. A data table relation library is established, and the parsed association relations are stored in it. The data tables and data fields are connected through these association relations to form a data network, from which a data topological relation diagram is drawn, and a data service map is constructed based on that diagram. A start point data table and an end point data table are set in the data service map, the required data path is calculated according to the data table relation library and displayed on the data service map, and navigation and query of data are thereby realized.
Drawings
The invention will be described in further detail with reference to the drawings and the detailed description.
FIG. 1 is a schematic flowchart of steps S1-S4 of a data map path navigation method based on big data SQL analysis according to an embodiment of the invention;
FIG. 2 is a schematic flowchart of step S1 of a data map path navigation method based on big data SQL analysis according to an embodiment of the invention.
Detailed Description
The technical scheme of the present invention will be further described with reference to the accompanying drawings, but the present invention is not limited to these examples.
The basic idea of the invention is as follows: parse the SQL in the data query scripts and database execution logs to obtain the association relations between data tables and data fields; establish a data table relation library and store the parsed association relations in it; connect the data tables and data fields through these association relations to form a data network and draw a data topological relation diagram; construct a data service map based on the topological relation diagram; set a start point data table and an end point data table in the data service map; and calculate according to the data table relation library to obtain the required data path and display it on the data service map, thereby realizing navigation and query of data.
Based on the above ideas, the invention provides a data map path navigation method based on big data SQL analysis, as shown in FIG. 1, comprising the following steps:
S1: Parse the SQL (structured query language) in the data query scripts and database execution logs to obtain the association relations between data tables and data fields. This step specifically includes:
S11: for any SQL statement, finding a Boolean relational expression whose left and right sides are both expressions containing fields;
S12: determining that the parent query of the expression is a table join query, and obtaining the table association type;
S13: continuing to resolve the left and right side expressions of the Boolean expression down to the field sets of the original base tables, to obtain the set of base tables to which the fields belong.
Specifically, for any SQL statement, a Boolean relational expression is found whose left and right sides are both expressions containing fields. The parent query of the expression is determined to be a table join query, and the table association type is obtained. The expressions on the two sides of the Boolean expression are then resolved further, down to the field sets of the original base tables, to obtain the set of base tables to which the fields belong.
The fields contained in the expressions at the two ends of the Boolean relational expression must come from different data tables; an association between two fields of the same data table is not counted as an association relation between tables. If one end of the relation is the result of a calculation over several fields, whether from the same table or from different tables, and that result is associated with the other end, the relation is counted as a combined association relation. The association relations further include: left join (returns all records in the left table together with the records in the right table whose join fields are equal), right join (returns all records in the right table together with the records in the left table whose join fields are equal), equi-join (returns only the rows whose join fields are equal in both tables), and so on.
Because different join types return different data results, determining the table association type through deep SQL parsing allows the relation between the two sides to be reflected accurately. This improves the accuracy and precision of the analysis of data tables and data relations, and yields data elements at a finer granularity.
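Steps S11 to S13 can be illustrated with a minimal Python sketch. It is not the parser described in the patent: it assumes a deliberately simplified SQL shape (explicit `JOIN ... ON a.x = b.y` clauses with optional aliases, no subqueries or multi-field expressions), and the names `extract_relations`, `TABLE_RE`, and `JOIN_RE` are hypothetical. A real implementation would more plausibly walk the syntax tree produced by a full SQL parser.

```python
import re

# Hypothetical, simplified patterns. They only cover statements of the form
#   FROM t1 [AS] a [LEFT|RIGHT|INNER] JOIN t2 [AS] b ON a.x = b.y
TABLE_RE = re.compile(
    r"\b(?:FROM|JOIN)\s+(\w+)"
    r"(?:\s+(?:AS\s+)?(?!(?:ON|LEFT|RIGHT|INNER|JOIN|WHERE)\b)(\w+))?",
    re.IGNORECASE,
)
JOIN_RE = re.compile(
    r"\b(?:(LEFT|RIGHT|INNER)\s+(?:OUTER\s+)?)?JOIN\s+\w+(?:\s+(?:AS\s+)?\w+)?"
    r"\s+ON\s+(\w+)\.(\w+)\s*=\s*(\w+)\.(\w+)",
    re.IGNORECASE,
)

def extract_relations(sql):
    """Return one dict per join condition: association type plus the
    (base table, field) pair found on each side of the Boolean expression."""
    # Map each alias (or bare table name) back to its base table (step S13).
    alias_to_table = {}
    for table, alias in TABLE_RE.findall(sql):
        alias_to_table[(alias or table).lower()] = table.lower()

    relations = []
    for match in JOIN_RE.finditer(sql):
        # A plain JOIN is treated as an equi-join (INNER) in this sketch.
        join_type = (match.group(1) or "INNER").upper()
        l_alias, l_field, r_alias, r_field = (g.lower() for g in match.groups()[1:])
        relations.append({
            "type": join_type,
            "left": (alias_to_table.get(l_alias, l_alias), l_field),
            "right": (alias_to_table.get(r_alias, r_alias), r_field),
        })
    return relations

sql = ("SELECT c.name, o.total FROM customers c "
       "LEFT JOIN orders o ON c.id = o.customer_id")
print(extract_relations(sql))
```

The example tables `customers` and `orders` are invented for illustration only.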
S2: Establish a data table relation library, and store the parsed association relations between data tables and data fields in the data table relation library.
The data table relation library is used to store the parsed association relations between data tables and data fields. As the number of parsed samples grows, the association relations between data tables and data fields accumulate gradually, and the data table relation library is updated accordingly.
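One possible shape for such a relation library is sketched below using Python's standard `sqlite3` module. The schema, the table name `table_relation`, the `seen_count` column, and both function names are assumptions made for illustration; the patent does not specify how the library is stored.

```python
import sqlite3

def open_relation_library(path=":memory:"):
    """Open (or create) a hypothetical relation library backed by SQLite."""
    conn = sqlite3.connect(path)
    conn.execute(
        """
        CREATE TABLE IF NOT EXISTS table_relation (
            left_table  TEXT NOT NULL,
            left_field  TEXT NOT NULL,
            right_table TEXT NOT NULL,
            right_field TEXT NOT NULL,
            join_type   TEXT NOT NULL,
            seen_count  INTEGER NOT NULL DEFAULT 1,
            PRIMARY KEY (left_table, left_field, right_table, right_field, join_type)
        )
        """
    )
    return conn

def record_relation(conn, left, right, join_type):
    """Store one parsed relation; if it is already known, bump its counter
    so that frequently observed relations accumulate over time."""
    key = (*left, *right, join_type)
    cursor = conn.execute(
        "UPDATE table_relation SET seen_count = seen_count + 1 "
        "WHERE left_table = ? AND left_field = ? AND right_table = ? "
        "AND right_field = ? AND join_type = ?",
        key,
    )
    if cursor.rowcount == 0:
        conn.execute(
            "INSERT INTO table_relation "
            "(left_table, left_field, right_table, right_field, join_type) "
            "VALUES (?, ?, ?, ?, ?)",
            key,
        )
    conn.commit()

conn = open_relation_library()
record_relation(conn, ("customers", "id"), ("orders", "customer_id"), "LEFT")
record_relation(conn, ("customers", "id"), ("orders", "customer_id"), "LEFT")
print(conn.execute("SELECT * FROM table_relation").fetchall())
```

Keeping a per-relation counter is one simple way to support the later "most common path" calculation, although the patent does not say this is how usage history is kept.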
S3: Connect the data tables and data fields through the association relations between them to form a data network, so as to draw a data topological relation diagram, and construct a data service map based on the topological relation diagram.
The data tables and data fields are connected into a data network through the association relations between them: each data table is represented by a point element, a relation between two data tables is represented by a connecting line between the two points, and the expression that associates the specific fields of the two tables is displayed on the connecting line. The resulting data table relation topology diagram shows, for any given data table, all of its table relations and its topologically adjacent tables, and a data service map covering the full service domain can be constructed on the basis of this diagram.
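The topology described above maps naturally onto a labelled undirected graph. The following Python sketch assumes relations arrive as `(left_table, left_field, right_table, right_field)` tuples; that input format and the function names `build_topology` and `adjacent_tables` are illustrative assumptions rather than part of the patent.

```python
from collections import defaultdict

def build_topology(relations):
    """Build an undirected graph: nodes are tables, and each edge carries
    the list of field-level expressions that associate the two tables."""
    graph = defaultdict(dict)
    for left_table, left_field, right_table, right_field in relations:
        label = f"{left_table}.{left_field} = {right_table}.{right_field}"
        graph[left_table].setdefault(right_table, []).append(label)
        graph[right_table].setdefault(left_table, []).append(label)
    return graph

def adjacent_tables(graph, table):
    """List the tables directly connected to the given table."""
    return sorted(graph.get(table, {}))

relations = [
    ("customers", "id", "orders", "customer_id"),
    ("orders", "id", "order_items", "order_id"),
]
topology = build_topology(relations)
print(adjacent_tables(topology, "orders"))
print(topology["customers"]["orders"])
```

The edge labels correspond to the expressions that the patent says are displayed on the connecting lines.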
S4: Set a start point data table and an end point data table in the data service map, calculate according to the data table relation library, obtain the required data path, and display the data path on the data service map.
In one embodiment, obtaining the data path includes: calculating the path that passes through the fewest associated data table nodes between the start point data table and the end point data table, to obtain the shortest path.
Among all paths that can associate the start point data table with the end point data table, the path that passes through the fewest intermediate association table nodes is taken as the shortest path. For example, if the start table is directly associated with the end table, the number of intermediate nodes is 0 and that direct path is the shortest path; if every path from the start point to the end point passes through intermediate nodes, the path with the fewest intermediate nodes is the shortest path.
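Finding the path with the fewest intermediate tables is a fewest-hops search on an unweighted graph, for which breadth-first search is a standard choice. The patent does not name an algorithm, so the Python sketch below is one plausible realization; the adjacency-dict input and the name `shortest_path` are assumptions.

```python
from collections import deque

def shortest_path(graph, start, end):
    """Breadth-first search: return the path from start to end that passes
    through the fewest tables, or None if the tables are not connected."""
    if start == end:
        return [start]
    previous = {start: None}          # table -> table we reached it from
    queue = deque([start])
    while queue:
        table = queue.popleft()
        for neighbor in graph.get(table, ()):
            if neighbor in previous:
                continue              # already reached by a path at least as short
            previous[neighbor] = table
            if neighbor == end:
                # Walk the predecessor chain back to the start.
                path = [end]
                while previous[path[-1]] is not None:
                    path.append(previous[path[-1]])
                return path[::-1]
            queue.append(neighbor)
    return None

# Hypothetical table graph: each key lists its directly associated tables.
graph = {
    "a": ["b", "c"],
    "b": ["a", "d"],
    "c": ["a", "d"],
    "d": ["b", "c", "e"],
    "e": ["d"],
}
print(shortest_path(graph, "a", "e"))
```

When several paths tie for the fewest hops, this sketch returns whichever one the search reaches first.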
In one embodiment, obtaining the data path includes: calculating the path whose association query executes fastest between the start point data table and the end point data table, to obtain the fastest path.
When the start table and the end table can be associated through different association relations, the execution speed of the corresponding association queries varies with factors such as business logic complexity, the database operating environment, and the number of associated nodes. The path whose query executes fastest is selected as the fastest path.
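If each association is given an estimated execution cost, the fastest path can be approximated as a least-cost path. The sketch below uses Dijkstra's algorithm over per-edge costs in seconds; treating the cost of a path as the sum of its per-join costs is a simplifying assumption of this example, not something the patent states, and the function name and input format are hypothetical.

```python
import heapq

def fastest_path(graph, start, end):
    """Dijkstra's algorithm over estimated per-join execution times.

    graph maps each table to {neighbor: estimated_seconds}.
    Returns (total_seconds, path) or None if no path exists.
    """
    best = {start: 0.0}               # lowest known cost to reach each table
    previous = {}                     # table -> predecessor on the best path
    heap = [(0.0, start)]
    while heap:
        cost, table = heapq.heappop(heap)
        if table == end:
            path = [end]
            while path[-1] != start:
                path.append(previous[path[-1]])
            return cost, path[::-1]
        if cost > best.get(table, float("inf")):
            continue                  # stale heap entry, a cheaper route exists
        for neighbor, seconds in graph.get(table, {}).items():
            new_cost = cost + seconds
            if new_cost < best.get(neighbor, float("inf")):
                best[neighbor] = new_cost
                previous[neighbor] = table
                heapq.heappush(heap, (new_cost, neighbor))
    return None

# Hypothetical timings: the direct join is slower than going through "c".
graph = {
    "a": {"b": 5.0, "c": 1.0},
    "c": {"b": 1.5},
    "b": {},
}
print(fastest_path(graph, "a", "b"))
```

In practice the edge costs would have to come from measured timings, for example from the database execution logs, which this sketch does not model.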
In one embodiment, obtaining the data path includes: calculating the path formed by the association mode used most often between the start point data table and the end point data table in the historical execution records, to obtain the most common path.
That is, the most common path is the way of associating the start point and end point data tables that has been used most often in the historical execution records.
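Assuming the historical execution records can be reduced to a list of table sequences, one per executed query, the most common path is a simple frequency count. That record format and the function name `most_common_path` are assumptions made for this Python sketch.

```python
from collections import Counter

def most_common_path(history, start, end):
    """Return the path between start and end that appears most often in the
    historical records, or None if no recorded path connects them."""
    counts = Counter(
        tuple(path)
        for path in history
        if path and path[0] == start and path[-1] == end
    )
    if not counts:
        return None
    path, _count = counts.most_common(1)[0]
    return list(path)

# Hypothetical history: each entry is the table sequence one query joined.
history = [
    ["a", "b", "d"],
    ["a", "c", "d"],
    ["a", "b", "d"],
    ["a", "d", "e"],
]
print(most_common_path(history, "a", "d"))
```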
In one embodiment, obtaining the data path includes: calculating according to the combined execution efficiency, table relation complexity, and historical usage frequency between the start point data table and the end point data table, to obtain the optimal path.
Among all paths from the start table to the end table, weights are set for execution speed, number of nodes passed, business logic complexity, and historical usage frequency. Weights derived from users' scores, likes, and comments during path use are then added, and an intelligent recommendation algorithm model calculates the path with the highest comprehensive score, which is used as the recommended path.
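The patent does not describe the recommendation model, so the following Python sketch substitutes a plain weighted sum of min-max normalized metrics to show how such a comprehensive score could be assembled. The metric names (`seconds`, `hops`, `complexity`, `uses`, `rating`), the weights, and the function `recommend_path` are all hypothetical.

```python
# Hypothetical metric names; True means "lower is better".
METRICS = {
    "seconds": True,      # execution time
    "hops": True,         # number of tables passed through
    "complexity": True,   # business logic complexity
    "uses": False,        # historical usage frequency
    "rating": False,      # aggregated user feedback
}

def _normalise(values, lower_is_better):
    """Min-max scale values to [0, 1] so that 1 is always the best."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [1.0] * len(values)
    if lower_is_better:
        return [(hi - v) / (hi - lo) for v in values]
    return [(v - lo) / (hi - lo) for v in values]

def recommend_path(candidates, weights):
    """Return (path, score) for the candidate with the highest weighted
    score, or None when there are no candidates."""
    if not candidates:
        return None
    scores = [0.0] * len(candidates)
    for name, lower_is_better in METRICS.items():
        column = _normalise([c[name] for c in candidates], lower_is_better)
        for i, value in enumerate(column):
            scores[i] += weights[name] * value
    best = max(range(len(candidates)), key=scores.__getitem__)
    return candidates[best]["path"], scores[best]

weights = {"seconds": 0.3, "hops": 0.2, "complexity": 0.1, "uses": 0.2, "rating": 0.2}
candidates = [
    {"path": ["a", "b", "d"], "seconds": 2.0, "hops": 1,
     "complexity": 1, "uses": 10, "rating": 4.5},
    {"path": ["a", "c", "e", "d"], "seconds": 5.0, "hops": 2,
     "complexity": 3, "uses": 2, "rating": 3.0},
]
print(recommend_path(candidates, weights))
```

A linear score is easy to explain and tune by hand; a learned recommendation model, as the patent's wording suggests, would replace the fixed weights.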
In practical application, data service personnel can enter a data start point, a data end point, data waypoints, and so on in the data map and select the type of path they actually need. They can then create data models and data query scripts efficiently and conveniently, and navigate paths between data such as tables and fields on the basis of the data map constructed by the SQL parsing technology.
Those skilled in the art may make various modifications or additions to the described embodiments or substitutions thereof without departing from the spirit of the invention or exceeding the scope of the invention as defined in the accompanying claims.

Claims (7)

1. A data map path navigation method based on big data SQL analysis, characterized by comprising the following steps:
analyzing the data query script and the database execution log from each data application system through SQL to obtain the association relation between the data table and the data field;
establishing a data table relation library, and storing the association relation between the parsed data table and the parsed data field into the data table relation library;
connecting the data table and the data field through the association relation between the data table and the data field to form a data network so as to compile a data topological relation diagram, and constructing a data service map by relying on the topological relation diagram;
setting a starting point data table and an end point data table in the data service map, calculating according to the data table relation library, obtaining a required data path and displaying the data path on the data service map.
2. The method for navigating a data map path based on big data SQL analysis according to claim 1, wherein the step of analyzing the data query script and the database execution log from each data application system through SQL to obtain the association relation between the data table and the data field comprises:
for any SQL statement, finding a Boolean relational expression whose left and right sides are both expressions containing fields;
determining that the parent query of the expression is a table join query, and obtaining the table association type;
continuing to resolve the left and right side expressions of the Boolean expression down to the field sets of the original base tables, to obtain the set of base tables to which the fields belong.
3. The method for navigating a data map path based on big data SQL analysis according to claim 1, wherein connecting the data table and the data field through the association relation between the data table and the data field to form a data network, so as to compile a data topological relation diagram, and constructing a data service map based on the topological relation diagram comprises:
connecting the data tables and data fields into a data network through the association relations between them, wherein each data table is represented by a point element, a relation between two data tables is represented by a connecting line between the two points, and the expression that associates the specific data fields of the two data tables is displayed on the connecting line, thereby obtaining a data topological relation diagram; and constructing a data service map covering the full service domain according to the data topological relation diagram.
4. The method for navigating a data map path based on big data SQL analysis according to claim 1, wherein setting a start point data table and an end point data table in the data service map, calculating according to the data table relation library, obtaining a required data path, and displaying the data path on the data service map comprises:
calculating the path that passes through the fewest associated data table nodes between the start point data table and the end point data table, to obtain the shortest path.
5. The method for navigating a data map path based on big data SQL analysis according to claim 1, wherein setting a start point data table and an end point data table in the data service map, calculating according to the data table relation library, obtaining a required data path, and displaying the data path on the data service map comprises:
calculating the path whose association query executes fastest between the start point data table and the end point data table, to obtain the fastest path.
6. The method for navigating a data map path based on big data SQL analysis according to claim 1, wherein setting a start point data table and an end point data table in the data service map, calculating according to the data table relation library, obtaining a required data path, and displaying the data path on the data service map comprises:
calculating the path formed by the association mode used most often between the start point data table and the end point data table in the historical execution records, to obtain the most common path.
7. The method for navigating a data map path based on big data SQL analysis according to claim 1, wherein setting a start point data table and an end point data table in the data service map, calculating according to the data table relation library, obtaining a required data path, and displaying the data path on the data service map comprises:
calculating according to the combined execution efficiency, table relation complexity, and historical usage frequency between the start point data table and the end point data table, to obtain the optimal path.
CN201911271405.8A 2019-12-12 2019-12-12 Data map path navigation method based on big data SQL analysis Active CN111159178B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911271405.8A CN111159178B (en) 2019-12-12 2019-12-12 Data map path navigation method based on big data SQL analysis

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201911271405.8A CN111159178B (en) 2019-12-12 2019-12-12 Data map path navigation method based on big data SQL analysis

Publications (2)

Publication Number Publication Date
CN111159178A CN111159178A (en) 2020-05-15
CN111159178B CN111159178B (en) 2023-06-13

Family

ID=70557077

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911271405.8A Active CN111159178B (en) 2019-12-12 2019-12-12 Data map path navigation method based on big data SQL analysis

Country Status (1)

Country Link
CN (1) CN111159178B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112948651B (en) * 2021-03-31 2022-07-29 重庆市规划设计研究院 Efficient OD data visualization method and system

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105574048A (en) * 2014-10-28 2016-05-11 西安造新电子信息科技有限公司 Method for performing abstraction and path finding for network data
CN106227892A (en) * 2016-08-24 2016-12-14 深圳市卓讯信息技术有限公司 A kind of intellectual analysis database table relation generates the method and device of E R figure
CN109766345A (en) * 2019-01-10 2019-05-17 深圳前海微众银行股份有限公司 Metadata processing method and device, equipment, readable storage medium storing program for executing

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8948596B2 (en) * 2011-07-01 2015-02-03 CetusView Technologies, LLC Neighborhood node mapping methods and apparatus for ingress mitigation in cable communication systems
US9430360B2 (en) * 2014-02-07 2016-08-30 Wipro Limited System and method for automatically testing performance of high-volume web navigation tree services
US10691682B2 (en) * 2017-10-04 2020-06-23 EMC IP Holding Company LLC Storing and processing JSON documents in a SQL database table

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
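周爱华; 裘洪彬; 高昆仑; 胡斌; 柴博. Research on power grid topology analysis technology based on graph databases (基于图数据库的电网拓扑分析技术研究). Electric Power Information and Communication Technology (电力信息与通信技术), 2018, (08), full text. *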

Also Published As

Publication number Publication date
CN111159178A (en) 2020-05-15

Similar Documents

Publication Publication Date Title
US11372851B2 (en) Systems and methods for rapid data analysis
JP4397978B2 (en) Binding ordering method using concentration
US8065262B2 (en) Computer-implemented multidimensional database processing method and system
CN107016072B (en) Knowledge inference system and method based on social network knowledge graph
US9652497B2 (en) Processing queries using hybrid access paths
EP0434586A2 (en) Attribute-based method and apparatus of classification and retrieval
CN106874426B (en) RDF (resource description framework) streaming data keyword real-time searching method based on Storm
CN109308303B (en) Multi-table connection online aggregation method based on Markov chain
CN104573039A (en) Keyword search method of relational database
Li et al. An approach for approximate subgraph matching in fuzzy RDF graph
CN1323366C (en) Methods and apparatus for query rewrite with auxiliary attributes in query processing operations
CN107193882A (en) Why not query answer methods based on figure matching on RDF data
CN102855332A (en) Graphic configuration management database based on graphic database
US11573987B2 (en) System for detecting data relationships based on sample data
CN111159178B (en) Data map path navigation method based on big data SQL analysis
CN103077216A (en) Sub-graph matching device and sub-graph matching method
CN110737779A (en) Knowledge graph construction method and device, storage medium and electronic equipment
CN105320700A (en) Database dynamic query form generation method
TWI567574B (en) A clustering method for mining relevance of search keywords and websites and a system thereof
CN111552856A (en) Microblog public opinion propagation path analysis method
Boehm et al. Squash: a tool for analyzing, tuning and refactoring relational database applications
Abilasha et al. A genetic algorithm based heuristic search on graphs with weighted multiple attributes
CN114185929B (en) Method and device for acquiring visual configuration for data query
Guzewicz ExpRalytics: Expressive and Efficient Analytics for RDF Graphs
EP4462272A2 (en) System for detecting data relationships based on sample data

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant