CN108874819B - Data mining method for database - Google Patents

Info

Publication number
CN108874819B
Authority
CN
China
Prior art keywords
data
ontology
database
network
nodes
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201710329637.9A
Other languages
Chinese (zh)
Other versions
CN108874819A (en)
Inventor
雷晓军
周京
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shanghai Alcohol Information Technology Co ltd
Original Assignee
Shanghai Alcohol Information Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shanghai Alcohol Information Technology Co ltd
Priority to CN201710329637.9A
Publication of CN108874819A
Application granted
Publication of CN108874819B
Active
Anticipated expiration

Landscapes

  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

A data mining method for a database comprises: converting the data schema of an existing relational database into a proprietary ontology to form a proprietary ontology library; converting the data in the existing relational database into an RDF (Resource Description Framework) knowledge graph corresponding to the proprietary ontology; and then performing node operations on the semantic network formed by the proprietary ontology to obtain the data in the RDF knowledge graph corresponding to the nodes. The invention simplifies the data mining process so that non-IT staff can obtain the data themselves, which greatly improves labor productivity.

Description

Data mining method for database
Technical Field
The invention relates to the field of semantic search and big data, and in particular to a data mining method for a database.
Background
The combination of computers and the Internet has created a vast amount of information, and it did not take long for us to feel overwhelmed by it. Even while we struggle with this unprecedented volume, we keep producing new information, so the total grows geometrically. We therefore want computers to process this mass of information effectively, so that we can make better use of it and escape the flood of information.
Computer information processing was initially limited to data with a simple structure: the volume of data could be large, but its structure remained relatively simple. As computer hardware capacity grew rapidly and computers were applied to more complex problems, the structural complexity of data increased greatly. As the Internet accumulated data in different ways, data from different sources began to be gathered together, making data processing more complex still.
Databases make our daily work concise and efficient. As databases are used more intensively, the database ecosystem becomes increasingly complicated, and more and more databases need to be integrated or merged to deliver greater benefit. Because databases today are designed bottom-up, a database that has grown very complex becomes a legacy system whose lower layers are a huge black hole that is difficult for anyone to reach into. When such complex, heterogeneous databases need to be integrated or merged with similar databases, the task becomes extremely laborious, close to a "mission impossible".
When people think of search, they think of entering "search terms" and receiving relevant results matched against text or against the textual descriptions of images. Such text is referred to as unstructured data. Structured data (that is, row data stored in a database, which can be logically expressed as a two-dimensional table) is a different matter: as a matter of course, the request is handed to a DBA (Database Administrator) or other IT personnel, who write query statements in the query language of the relational database, such as SQL (Structured Query Language), and then return the data and the corresponding reports. For example, a project at a health management company wants to know which men aged 50-60 and women aged 45-55 in its managed population have blood glucose indicators close to the diabetic range. The project manager passes this request to the DBA, who writes the corresponding SQL query, extracts the relevant data from the database, and hands it over for browsing and analysis. If a problem is found or further data is needed, for example the same data classified by profession, the manager must raise a new request and the DBA must query and extract the data again. This process is cumbersome and prone to human error.
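To make the health management example concrete, the following is a minimal sketch of the kind of queries the DBA would have to write by hand. The `person` table, its column names, the sample rows, and the 6.1 to 7.0 mmol/L fasting glucose band used for "close to diabetes" are all assumptions chosen for illustration; none of them come from the patent.

```python
import sqlite3

# Hypothetical single-table schema; names and rows are illustrative only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE person (id INTEGER PRIMARY KEY, sex TEXT, age INTEGER, "
    "fasting_glucose REAL, profession TEXT)"
)
conn.executemany(
    "INSERT INTO person (sex, age, fasting_glucose, profession) VALUES (?, ?, ?, ?)",
    [
        ("M", 55, 6.5, "teacher"),
        ("M", 42, 6.6, "driver"),   # outside the male age band
        ("F", 50, 6.3, "nurse"),
        ("F", 50, 5.0, "teacher"),  # glucose outside the assumed band
    ],
)

# Assumed filter: the age bands from the text plus an illustrative glucose band.
WHERE_CLAUSE = """
WHERE fasting_glucose >= 6.1 AND fasting_glucose < 7.0
  AND ((sex = 'M' AND age BETWEEN 50 AND 60)
    OR (sex = 'F' AND age BETWEEN 45 AND 55))
"""

# First request: list the matching people.
first_pass = conn.execute(
    "SELECT id, sex, age, fasting_glucose, profession FROM person "
    + WHERE_CLAUSE + " ORDER BY id"
).fetchall()

# Follow-up request ("classify by profession") needs a second hand-written query.
by_profession = conn.execute(
    "SELECT profession, COUNT(*) FROM person "
    + WHERE_CLAUSE + " GROUP BY profession ORDER BY profession"
).fetchall()
```

Every follow-up question requires another query of this kind, which is the round trip between manager and DBA that the text describes.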
Disclosure of Invention
The invention provides a data mining method for a database that simplifies the data mining process, enables non-IT staff to obtain the data, and greatly improves labor productivity.
To achieve the above object, the present invention provides a data mining method for a database, comprising the following steps:
Step S1: converting the data schema of an existing relational database into a proprietary ontology to form a proprietary ontology library;
Step S2: converting the data in the existing relational database into an RDF knowledge graph corresponding to the proprietary ontology;
Step S3: performing node operations on the semantic network formed by the proprietary ontology to acquire the data in the RDF knowledge graph corresponding to the nodes.
Step S1 specifically includes the following steps:
S1.1: extracting the data schema of the relational database;
S1.2: converting the data schema into a proprietary ontology, wherein a table in the relational database represents an entity in the ontology, and the fields of the table are the attributes of that entity;
S1.3: after experts in the proprietary domain edit the proprietary ontology, generating an expert-level proprietary ontology and storing it in the proprietary ontology library.
In step S2, the data originally stored in the tables of the relational database forms a semantic network graph in the RDF knowledge graph.
Step S3 specifically includes the following steps:
S3.1: the classes and attributes of the proprietary ontology in the proprietary ontology library form a semantic network graph;
S3.2: selecting a plurality of nodes on the semantic network to generate a sub-network;
S3.3: selecting the data corresponding to the nodes from the RDF knowledge graph according to the sub-network to obtain the search data.
Generating a sub-network in step S3.2 specifically includes: selecting a plurality of nodes on the semantic network and filtering out the nodes that are not selected, so that the selected nodes form a sub-network.
After a sub-network has been generated, either the semantic network is reset to its initial state so that a new sub-network can be generated, or further nodes are selected on the basis of the current sub-network to generate a new sub-network.
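The select, filter, reset, and continue behaviour of step S3.2 can be sketched as follows. This is only an illustration under stated assumptions: the class name `SemanticNetwork`, its methods, and the example node names are hypothetical, and the patent does not prescribe any particular data structure.

```python
class SemanticNetwork:
    """Toy semantic network: named nodes joined by undirected edges."""

    def __init__(self, edges):
        self.edges = {frozenset(edge) for edge in edges}
        self.all_nodes = set().union(*self.edges)
        self.selected = set()

    def select(self, *nodes):
        """Add nodes to the current selection and return the resulting sub-network."""
        unknown = set(nodes) - self.all_nodes
        if unknown:
            raise KeyError(f"unknown nodes: {sorted(unknown)}")
        self.selected |= set(nodes)
        return self.sub_network()

    def sub_network(self):
        """Filter out unselected nodes; keep only edges with both ends selected."""
        kept_edges = {edge for edge in self.edges if edge <= self.selected}
        return set(self.selected), kept_edges

    def reset(self):
        """Return the network to its initial state (nothing selected)."""
        self.selected.clear()


net = SemanticNetwork([
    ("Patient", "age"),
    ("Patient", "sex"),
    ("Patient", "Visit"),
    ("Visit", "glucose"),
])
nodes, edges = net.select("Patient", "age")  # first sub-network
nodes2, edges2 = net.select("sex")           # continue from the current sub-network
net.reset()                                  # or start over from the initial state
```

Keeping the selection as state on the network object makes both options in the text, resetting or refining the current sub-network, a single method call.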
The invention applies the proprietary ontology to data mining and converts structured data into a knowledge graph, so that semantic search can be carried out through keywords. This simplifies the data mining process, allows non-IT staff to obtain the data themselves, and greatly improves labor productivity.
Drawings
Fig. 1 is a flowchart of a data mining method for a database according to the present invention.
Fig. 2 is a specific schematic diagram of a data mining method for a database according to the present invention.
Detailed Description
The preferred embodiment of the present invention is described in detail below with reference to fig. 1 and 2.
Ontologies and proprietary ontologies emerged in the computer science and artificial intelligence communities to deal with this kind of complex data processing. They are the foundation of the third-generation Internet, the Semantic Web, and the cornerstone of semantic search; the third-generation Internet and semantic search are in turn the basis of big data processing. Soon after ontologies were introduced into computing, the concept was also applied to database design and development, and database design changed from the bottom-up approach of the past to a top-down approach: first, the concepts and entities of the domain, the compositional relationships among them, and their specific attributes are determined and designed, and a proprietary domain ontology is established; the data of the database is then organized closely around that ontology. Database design, development, and maintenance of this kind emphasize the completeness of the concepts and entities and the ability of domain experts to work with them directly. Moreover, any evolution of the database is expressed first in the ontology and only then implemented in the underlying data system. An ontology-driven database fundamentally changes how databases are used: database integration and merging become a matter of maintaining and updating ontologies, while changes to the lower layers of the database are automated.
Following this top-down concept, as shown in fig. 1, the present invention provides a data mining method for a database, comprising the following steps:
Step S1: converting the data schema of an existing relational database into a proprietary ontology to form a proprietary ontology library;
Step S2: converting the data in the existing relational database into an RDF knowledge graph corresponding to the proprietary ontology;
Step S3: performing node operations on the semantic network formed by the proprietary ontology to acquire the data in the RDF knowledge graph corresponding to the nodes.
As shown in fig. 2, step S1 specifically includes the following steps:
S1.1: extracting the data schema of the relational database;
a relational database consists of a series of tables in which the data is stored; the tables are defined by the data schema, which is established by the database administrator (DBA);
S1.2: converting the data schema into a proprietary ontology;
the proprietary ontology is established by experts in the proprietary domain;
generally, a table in the relational database represents an entity in the ontology, and the fields of the table are the attributes of that entity; some fields are foreign keys, that is, they hold the primary key of another table; from the ontology perspective, this indicates that the two entities are related, with one entity serving as an attribute value of the other; applying the same rule to all tables of the database roughly converts the data schema into a proprietary ontology, and any existing proprietary ontology also takes part in the conversion;
S1.3: after experts in the proprietary domain edit the proprietary ontology, generating an expert-level proprietary ontology and storing it in the proprietary ontology library;
editing here means adding, modifying, and deleting.
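Steps S1.1 and S1.2 can be sketched as follows, assuming an SQLite database purely for illustration. The function name `schema_to_ontology`, the dictionary layout of the draft ontology, and the example tables are hypothetical; the sketch covers only the rough automatic conversion (table to entity, field to attribute, foreign key to relation), and the expert editing of S1.3 would follow as a manual step.

```python
import sqlite3


def schema_to_ontology(conn):
    """Draft ontology: each table becomes an entity, each plain column an
    attribute, and each foreign-key column a relation to another entity."""
    tables = [row[0] for row in conn.execute(
        "SELECT name FROM sqlite_master "
        "WHERE type = 'table' AND name NOT LIKE 'sqlite_%' ORDER BY name"
    )]
    ontology = {}
    for table in tables:
        # foreign_key_list rows: (id, seq, table, from, to, on_update, on_delete, match)
        foreign_keys = {
            row[3]: row[2]
            for row in conn.execute(f'PRAGMA foreign_key_list("{table}")')
        }
        # table_info rows: (cid, name, type, notnull, dflt_value, pk)
        columns = [row[1] for row in conn.execute(f'PRAGMA table_info("{table}")')]
        ontology[table] = {
            "attributes": [c for c in columns if c not in foreign_keys],
            "relations": {c: foreign_keys[c] for c in columns if c in foreign_keys},
        }
    return ontology


conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE patient (id INTEGER PRIMARY KEY, name TEXT, age INTEGER);
CREATE TABLE visit (id INTEGER PRIMARY KEY, glucose REAL,
                    patient_id INTEGER REFERENCES patient(id));
""")
draft_ontology = schema_to_ontology(conn)
```

Here the `visit` entity ends up with a relation `patient_id` pointing at `patient`, which mirrors the statement that a foreign key marks one entity as an attribute value of another.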
Step S2 specifically proceeds as follows:
the data in the relational database is originally stored in tables, where the fields of a table indicate the position of each value; the data is now extracted and becomes the values of the corresponding attributes of the entities in the proprietary ontology. In other words, in the relational database the data is arranged in tables, whereas in the RDF knowledge graph the data directly forms a semantic network graph.
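The conversion in step S2 can be illustrated with a minimal sketch that turns table rows into subject, predicate, object triples. Plain Python tuples stand in for RDF here to keep the example dependency-free; a real system would more likely use an RDF library and proper IRIs. The function name `rows_to_triples`, the `table/key` subject naming, and the example tables are assumptions.

```python
import sqlite3


def rows_to_triples(conn, table, key="id", relations=None):
    """Turn every row of `table` into triples. `relations` maps a foreign-key
    column to the table it references, so its value becomes a node, not a literal."""
    relations = relations or {}
    cursor = conn.execute(f'SELECT * FROM "{table}"')
    names = [description[0] for description in cursor.description]
    triples = []
    for row in cursor:
        record = dict(zip(names, row))
        subject = f"{table}/{record[key]}"
        triples.append((subject, "rdf:type", table))
        for column, value in record.items():
            if column == key or value is None:
                continue  # the key is already in the subject; skip NULLs
            if column in relations:
                triples.append((subject, column, f"{relations[column]}/{value}"))
            else:
                triples.append((subject, column, value))
    return triples


conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE patient (id INTEGER PRIMARY KEY, age INTEGER);
CREATE TABLE visit (id INTEGER PRIMARY KEY, glucose REAL,
                    patient_id INTEGER REFERENCES patient(id));
INSERT INTO patient VALUES (1, 52);
INSERT INTO visit VALUES (10, 6.4, 1);
""")
visit_triples = rows_to_triples(conn, "visit", relations={"patient_id": "patient"})
```

The foreign-key value is emitted as a reference to another node (`patient/1`), which is what links the rows of separate tables into one graph.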
As shown in fig. 2, step S3 specifically includes the following steps:
S3.1: the classes and attributes of the proprietary ontology in the proprietary ontology library form a semantic network graph;
because a proprietary ontology can have a large number of classes and a correspondingly large number of attributes, the semantic network graph contains many nodes with complex relationships; the network graph is therefore rendered on the computer interface using existing JavaScript technology so that the nodes formed by the classes and attributes can be clicked; when a node representing a class or an attribute is clicked, that node and the relationships connected to it are highlighted and become the focus of attention;
S3.2: selecting a plurality of nodes on the semantic network to generate a sub-network;
a plurality of nodes on the semantic network are clicked and the nodes that are not clicked are filtered out; the clicked nodes form a sub-network, which represents a subset of the whole data;
different selections of nodes generate different sub-networks; after a sub-network has been generated, either the semantic network is reset to its initial state so that a new sub-network can be generated, or further nodes are selected on the basis of the current sub-network to generate a new sub-network;
S3.3: selecting the data corresponding to the nodes from the RDF knowledge graph according to the sub-network to obtain the search data.
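Step S3.3 can be sketched as a lookup over the triples of the knowledge graph, driven by the node names in the sub-network. The function name `search_data`, the triple layout, and the sample data are assumptions; for simplicity the sketch also assumes that attribute names are unique across classes, which a real ontology would handle with qualified names.

```python
def search_data(triples, sub_network_nodes):
    """For every instance of a selected class, collect the values of the
    selected attributes."""
    instances = {
        subject
        for subject, predicate, obj in triples
        if predicate == "rdf:type" and obj in sub_network_nodes
    }
    result = {}
    for subject, predicate, obj in triples:
        if (subject in instances and predicate != "rdf:type"
                and predicate in sub_network_nodes):
            result.setdefault(subject, {})[predicate] = obj
    return result


# Hypothetical knowledge graph as (subject, predicate, object) tuples.
triples = [
    ("patient/1", "rdf:type", "patient"),
    ("patient/1", "age", 52),
    ("patient/1", "sex", "M"),
    ("patient/2", "rdf:type", "patient"),
    ("patient/2", "age", 47),
    ("patient/2", "sex", "F"),
    ("visit/10", "rdf:type", "visit"),
    ("visit/10", "glucose", 6.4),
]
found = search_data(triples, {"patient", "age"})
```

Selecting the class node `patient` and the attribute node `age` returns only the ages of patients; the unselected `sex` attribute and the `visit` class are filtered out, matching the sub-network semantics described above.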
The invention applies the proprietary ontology to data mining and converts structured data into a knowledge graph, so that semantic search can be carried out through keywords. This simplifies the data mining process, allows non-IT staff to obtain the data themselves, and greatly improves labor productivity.
While the present invention has been described in detail with reference to the preferred embodiments, it should be understood that the above description should not be taken as limiting the invention. Various modifications and alterations to this invention will become apparent to those skilled in the art upon reading the foregoing description. Accordingly, the scope of the invention should be determined from the following claims.

Claims (5)

1. A data mining method for a database, comprising the steps of:
step S1, converting the data schema of an existing relational database into a proprietary ontology to form a proprietary ontology library;
step S2, converting the data in the existing relational database into an RDF knowledge graph corresponding to the proprietary ontology;
step S3, performing node operations on the semantic network formed by the proprietary ontology to acquire the data in the RDF knowledge graph corresponding to the nodes;
wherein step S1 specifically includes the following steps:
S1.1, extracting the data schema of the relational database;
S1.2, converting the data schema into a proprietary ontology;
a table in the relational database represents an entity in the ontology, and the fields of the table are the attributes of that entity;
S1.3, after experts in the proprietary domain edit the proprietary ontology, generating an expert-level proprietary ontology and storing it in the proprietary ontology library.
2. The data mining method for a database according to claim 1, wherein in step S2 the data originally stored in the tables of the relational database forms a semantic network graph in the RDF knowledge graph.
3. The data mining method for a database according to claim 1, wherein step S3 specifically comprises the steps of:
S3.1, the classes and attributes of the proprietary ontology in the proprietary ontology library form a semantic network graph;
S3.2, selecting a plurality of nodes on the semantic network to generate a sub-network;
S3.3, selecting the data corresponding to the nodes from the RDF knowledge graph according to the sub-network to obtain the search data.
4. The data mining method for a database according to claim 3, wherein generating a sub-network in step S3.2 comprises: selecting a plurality of nodes on the semantic network and filtering out the nodes that are not selected, the selected nodes forming a sub-network.
5. The data mining method for a database according to claim 4, wherein after a sub-network is generated, a new sub-network is generated either by resetting the semantic network to its initial state or by continuing to select nodes on the basis of the current sub-network.
CN201710329637.9A 2017-05-11 2017-05-11 Data mining method for database Active CN108874819B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201710329637.9A CN108874819B (en) 2017-05-11 2017-05-11 Data mining method for database

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201710329637.9A CN108874819B (en) 2017-05-11 2017-05-11 Data mining method for database

Publications (2)

Publication Number Publication Date
CN108874819A (en) 2018-11-23
CN108874819B (en) 2021-09-03

Family

ID=64319551

Family Applications (1)

Application Number Status Publication Priority Date Filing Date Title
CN201710329637.9A Active CN108874819B (en) 2017-05-11 2017-05-11 Data mining method for database

Country Status (1)

Country Link
CN (1) CN108874819B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107330007A (en) * 2017-06-12 2017-11-07 南京邮电大学 A kind of Method for Ontology Learning based on multi-data source

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104102713A (en) * 2014-07-16 2014-10-15 百度在线网络技术(北京)有限公司 Method and device for displaying recommendation results
CN104462501A (en) * 2014-12-19 2015-03-25 北京奇虎科技有限公司 Knowledge graph construction method and device based on structural data
CN104866593A (en) * 2015-05-29 2015-08-26 中国电子科技集团公司第二十八研究所 Database searching method based on knowledge graph
CN105183869A (en) * 2015-09-16 2015-12-23 分众(中国)信息技术有限公司 Building knowledge mapping database and construction method thereof
CN106202564A (en) * 2016-08-02 2016-12-07 浪潮软件股份有限公司 Ontology relationship data searching framework based on elastic search
CN106294481A (en) * 2015-06-05 2017-01-04 阿里巴巴集团控股有限公司 A kind of air navigation aid based on collection of illustrative plates and device

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8051104B2 (en) * 1999-09-22 2011-11-01 Google Inc. Editing a network of interconnected concepts

Also Published As

Publication number Publication date
CN108874819A (en) 2018-11-23

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant