CN116594987A - Database analysis system and method based on big data - Google Patents

Info

Publication number
CN116594987A
Authority
CN
China
Prior art keywords
data
database
analysis
mining
module
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310720127.XA
Other languages
Chinese (zh)
Inventor
王莉
周亮
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guangdong Nanhua Vocational College Of Industry And Commerce
Original Assignee
Guangdong Nanhua Vocational College Of Industry And Commerce
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Guangdong Nanhua Vocational College Of Industry And Commerce
Priority to CN202310720127.XA
Publication of CN116594987A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/21 Design, administration or maintenance of databases
    • G06F16/217 Database tuning
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/242 Query formulation
    • G06F16/2433 Query languages
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2458 Special types of queries, e.g. statistical queries, fuzzy queries or distributed queries
    • G06F16/2465 Query processing support for facilitating data mining operations in structured databases
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/28 Databases characterised by their database models, e.g. relational or object models
    • G06F16/284 Relational databases
    • G06F16/285 Clustering or classification
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N5/00 Computing arrangements using knowledge-based models
    • G06N5/01 Dynamic search techniques; Heuristics; Dynamic trees; Branch-and-bound
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F2216/00 Indexing scheme relating to additional aspects of information retrieval not explicitly covered by G06F16/00 and subgroups
    • G06F2216/03 Data mining
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Computational Linguistics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Fuzzy Systems (AREA)
  • Probability & Statistics with Applications (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Computation (AREA)
  • Computing Systems (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

A database analysis system and method based on big data improve the accuracy and comprehensiveness of database analysis by collecting massive data. A database monitoring module monitors the collected database running state and SQL execution parameter information in real time and raises alarms; a database performance analysis module gathers statistics on and analyzes the database performance parameters, and investigates and optimizes abnormal data; and a database security analysis module evaluates and analyzes the data security of the database and provides early warning of, and precautionary measures against, security threats. By collecting, processing and analyzing massive data, the system provides users with an efficient and accurate database analysis service. Data mining algorithms are applied to discover relationships, regularities and trends in the data, and knowledge and experience in the domain of the data are analyzed to obtain the regularities and trends of the data and to discover the knowledge and value it contains.

Description

Database analysis system and method based on big data
Technical Field
The invention relates to the field of databases, in particular to a database analysis system and method based on big data.
Background
With the rapid development of the Internet and the Internet of Things, enterprises and organizations need to understand consumers and market conditions in order to make better decisions and plans. Although conventional database analysis systems can meet certain requirements, their analysis efficiency and accuracy need to be improved when facing large amounts of data and a rapidly changing market environment. Unstructured data is more difficult to analyze and process than structured data, and the capacity of conventional database analysis systems to process it needs further improvement. Their processing capacity is also still limited when handling massive data, which may lead to low data analysis efficiency. A database analysis system and method based on big data are therefore proposed as an improvement.
Disclosure of Invention
The purpose of the invention is to address the problems of the prior art described above. To achieve this purpose, the invention provides the following technical solution: a database analysis system based on big data comprises a data collection module, a data screening module, a data analysis module, a data visualization module and a data mining module, wherein the data collection module is used for collecting massive database information from various channels by various means;
the data collection module is in data connection with the data screening module, and the data screening module is used for cleaning and processing the collected data, removing invalid data, and performing deduplication and format unification operations;
the data screening module is in data connection with the data analysis module, and the data analysis module is used for analyzing the screened data, finding regularities and trends in it, and providing diagnosis and analysis results for the database;
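The analytical formula is: [formula not reproduced in this text]
wherein: $\mu_x$, $\mu_y$ are the mean square errors of the original data $x$ and of the screened data; $c_1$, $c_2$ are data format constants for the original and screened data; $\sigma_x$, $\sigma_y$ are the respective variances of the original and screened data; and the screened data is the data obtained by screening the original data $x$.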
The data analysis module is in data connection with the data visualization module, and the data visualization module is used for visually displaying the analysis results and presenting them to the user intuitively in the form of charts and reports;
the data visualization module is in data connection with the data mining module, and the data mining module is used for mining and analyzing the data in the database and exploring potential associations and patterns, thereby assisting the user in business analysis and decision-making.
The invention further comprises a database monitoring module, a database performance analysis module and a database security analysis module, wherein the database monitoring module is in data connection with the database performance analysis module, and the database performance analysis module is in data connection with the database security analysis module.
As a preferred technical solution of the invention, the database monitoring module monitors the collected database running state and SQL execution parameter information in real time and raises alarms.
As a preferred technical solution of the invention, the database performance analysis module gathers statistics on and analyzes the performance parameters of the database, and investigates and optimizes abnormal data.
As a preferred technical solution of the invention, the database security analysis module evaluates and analyzes the data security of the database and provides early warning of, and precautionary measures against, security threats.
The database analysis method based on big data comprises the following steps: S1, data collection: massive database information, including data table structures, data records and SQL statements, is collected from various channels by various means, and the required data can also be acquired from various websites by Web crawlers;
S2, data screening: the collected data is cleaned and processed, invalid data is removed, and deduplication and format unification operations are performed to ensure the accuracy and consistency of the data, wherein the data screening comprises condition screening, filter screening, database query statement screening and data mining algorithm screening;
S3, data analysis: the cleaned data is analyzed to find regularities and trends in it and to provide diagnosis and analysis results for the database, wherein the data analysis comprises database performance analysis, SQL statement analysis, database architecture analysis, database security analysis and data mining algorithm analysis;
S4, data visualization: the analysis results are displayed visually, wherein the display modes comprise chart display, report display and dynamic display;
S5, data mining: the data in the database is mined and analyzed to uncover potential associations and patterns, thereby assisting the user in business analysis and decision-making, wherein the data mining process comprises statistic mining, data distribution mining, time series mining, data mining and domain knowledge mining.
As a preferred technical solution of the invention, the condition screening in S2: the data is screened according to known conditions such as date, region and index conditions;
the filter screening: the data is screened by using the AutoFilter or Advanced Filter function of Excel spreadsheet software and setting the screening conditions;
the database query statement screening: the data in the database is screened according to conditions through SQL query statements;
the database running state is calculated by the following formula: [formula not reproduced in this text]
wherein: $\theta$ is the data operation monitoring target; $v_1$, $v_2$ are the running values of the data; $\Sigma_1$, $\Sigma_2$ are the data mean vector and covariance;
the data mining algorithm screening: the data is screened intelligently by using clustering, classification and association rule data mining algorithms to find regularities and patterns in the data.
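The cleaning, deduplication, format unification and SQL-based condition screening described for S2 could be sketched as follows. This is a minimal illustration only: the table name `metrics`, the record layout (date, region, value) and both function names are assumptions made for the example and do not come from the patent.

```python
import sqlite3

def screen_records(rows):
    """Clean raw (date, region, value) rows: drop invalid rows,
    unify the format, and remove duplicates."""
    cleaned = set()
    for date, region, value in rows:
        if date is None or region is None or value is None:
            continue  # remove invalid (incomplete) records
        # Unify format: dashes in dates, trimmed upper-case region, float value.
        cleaned.add((date.strip().replace("/", "-"), region.strip().upper(), float(value)))
    return sorted(cleaned)  # the set removed duplicates; sort for stable output

def query_by_condition(rows, region, start_date):
    """Condition screening through a parameterised SQL query on an in-memory table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE metrics (day TEXT, region TEXT, value REAL)")
    conn.executemany("INSERT INTO metrics VALUES (?, ?, ?)", rows)
    result = conn.execute(
        "SELECT day, region, value FROM metrics "
        "WHERE region = ? AND day >= ? ORDER BY day",
        (region, start_date),
    ).fetchall()
    conn.close()
    return result

raw = [
    ("2023/06/01", " south ", 10),
    ("2023-06-01", "SOUTH", 10.0),  # duplicate once the format is unified
    ("2023-06-02", "south", None),  # invalid: missing value
    ("2023-06-03", "north", 7),
    ("2023-06-04", "south", 12),
]
clean = screen_records(raw)
selected = query_by_condition(clean, "SOUTH", "2023-06-02")
```

Dates are stored as ISO-style strings here so that a plain string comparison in the `WHERE` clause orders them correctly; a production system would more likely use a proper date type.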
As a preferred technical solution of the invention, the database performance analysis in S3: performance bottlenecks of the database are identified by analyzing its performance indicators, such as response time, processing capacity and I/O indicators, and the database is optimized accordingly;
the SQL statement analysis: potential problems in SQL statements are identified by analyzing their execution plan information, and the SQL statements are optimized;
the database architecture analysis: deficiencies in the database design are identified by analyzing the architecture of the database, its relational model and the data types of its fields, and the data model design is optimized;
the database security analysis: security problems of the database are identified by analyzing its security settings and permission controls, and the security policy of the database is optimized;
the data mining algorithm analysis: regularities and patterns in the data are found by applying data mining algorithms, and the database management strategy is optimized.
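As one possible illustration of the SQL statement analysis above, the sketch below inspects an execution plan with SQLite's `EXPLAIN QUERY PLAN` before and after adding an index. The choice of SQLite, the `orders` table and the helper names `query_plan` and `is_full_scan` are assumptions for the example; the patent does not name a database engine.

```python
import sqlite3

def query_plan(conn, sql, params=()):
    """Return the human-readable 'detail' column of EXPLAIN QUERY PLAN."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return [row[3] for row in rows]

def is_full_scan(plan):
    """A plan step starting with SCAN means the whole table is read."""
    return any(step.startswith("SCAN") for step in plan)

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, amount REAL)"
)
conn.executemany(
    "INSERT INTO orders (customer_id, amount) VALUES (?, ?)",
    [(i % 50, float(i)) for i in range(1000)],
)

sql = "SELECT amount FROM orders WHERE customer_id = ?"
plan_before = query_plan(conn, sql, (7,))  # no index yet: full table scan

# Optimisation step: index the column used in the WHERE clause.
conn.execute("CREATE INDEX idx_orders_customer ON orders (customer_id)")
plan_after = query_plan(conn, sql, (7,))   # the planner can now search the index
```

The exact wording of the plan text differs between SQLite versions, so the helper only checks for the leading `SCAN` keyword rather than matching a full string.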
As a preferred technical solution of the invention, the statistic mining in S5: the distribution and central tendency of the data are mined by calculating statistics of the data such as the mean, median, mode and standard deviation; the data distribution mining: the distribution of the data is mined by drawing histograms and probability distribution plots, so that the range and changing trend of the data are understood; the time series mining: regular and trending characteristics are identified by mining the periodicity, trend and seasonality of the time series data.
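The statistic mining and data distribution mining described above can be illustrated with Python's standard `statistics` module. The sample values (hypothetical query response times in milliseconds) and the function names `summarize` and `histogram` are assumptions made for this sketch.

```python
import statistics
from collections import Counter

def summarize(values):
    """Central tendency and spread of a sample."""
    return {
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "mode": statistics.mode(values),
        "stdev": statistics.stdev(values),  # sample standard deviation
    }

def histogram(values, bin_width):
    """Count values per bin; each key is the lower edge of its bin."""
    counts = Counter((v // bin_width) * bin_width for v in values)
    return dict(sorted(counts.items()))

response_ms = [12, 15, 15, 18, 22, 15, 40, 95]  # hypothetical measurements
summary = summarize(response_ms)
bins = histogram(response_ms, 20)
```

In this sample the mean (29) sits well above the median (16.5), which is the kind of skew a histogram makes visible: most values fall in the first bin while a single value lands in the last.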
As a preferred technical solution of the invention, the data mining in S6: relationships, regularities and trends in the data are discovered by applying data mining algorithms, including clustering, association rules and decision trees; the domain knowledge mining: the regularities and trends of the data are obtained by analyzing knowledge and experience in the domain of the data, and the knowledge and value in the data are discovered.
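Of the algorithms named above, association rules are the simplest to show compactly. The following is a minimal sketch restricted to rules between pairs of items, using the usual support and confidence measures; the thresholds, the function name `association_rules` and the sample sessions (sets of tables touched together) are assumptions for illustration, not part of the patent.

```python
from itertools import combinations

def association_rules(transactions, min_support, min_confidence):
    """Return sorted (antecedent, consequent, support, confidence) tuples
    for item pairs that meet both thresholds."""
    n = len(transactions)
    item_count, pair_count = {}, {}
    for transaction in transactions:
        items = sorted(set(transaction))
        for item in items:
            item_count[item] = item_count.get(item, 0) + 1
        for pair in combinations(items, 2):
            pair_count[pair] = pair_count.get(pair, 0) + 1
    rules = []
    for (a, b), count in pair_count.items():
        support = count / n  # share of transactions containing both items
        if support < min_support:
            continue
        for lhs, rhs in ((a, b), (b, a)):
            confidence = count / item_count[lhs]  # estimate of P(rhs | lhs)
            if confidence >= min_confidence:
                rules.append((lhs, rhs, support, confidence))
    return sorted(rules)

# Hypothetical sessions: which tables were queried together.
sessions = [
    {"orders", "customers"},
    {"orders", "customers", "products"},
    {"orders", "products"},
    {"customers"},
]
rules = association_rules(sessions, min_support=0.5, min_confidence=0.6)
```

A full implementation would also handle larger itemsets (for example with the Apriori algorithm); the pairwise version is kept here only to show how support and confidence are computed.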
Compared with the prior art, the invention has the following beneficial effects:
in the solution of the invention:
1. By collecting massive data, the accuracy and comprehensiveness of database analysis are improved. The data mining module mines and analyzes the data in the database and uncovers potential associations and patterns, thereby assisting the user in business analysis and decision-making.
2. The database monitoring module monitors the collected database running state and SQL execution parameter information in real time and raises alarms. The database performance analysis module gathers statistics on and analyzes the performance parameters of the database, and investigates and optimizes abnormal data. The database security analysis module evaluates and analyzes the data security of the database and provides early warning of, and precautionary measures against, security threats. By collecting, processing and analyzing massive data, an efficient and accurate database analysis service is provided to users.
3. By applying data mining algorithms, including clustering, association rules and decision trees, relationships, regularities and trends in the data are discovered; through domain knowledge mining, the regularities and trends of the data are obtained by analyzing knowledge and experience in the domain of the data, and the knowledge and value in the data are discovered.
Drawings
FIG. 1 is a schematic diagram of the system framework provided by the present invention;
FIG. 2 is a schematic flow chart of the method provided by the invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the invention clearer, the technical solutions of the embodiments are described clearly and completely below with reference to the accompanying drawings. The described embodiments are evidently some, but not all, of the embodiments of the invention.
Thus, the following detailed description of the embodiments is not intended to limit the scope of the invention as claimed, but merely represents some embodiments of the invention. All other embodiments obtained by those skilled in the art based on the embodiments of the invention without creative effort fall within the protection scope of the invention. It should be noted that the embodiments of the invention, and the features and technical solutions of those embodiments, may be combined with one another where no conflict arises. Like reference numerals and letters denote like items in the following figures, so once an item is defined in one figure, it need not be further defined or explained in subsequent figures.
Example 1: referring to FIGS. 1-2, a database analysis system based on big data comprises a data collection module, a data screening module, a data analysis module, a data visualization module and a data mining module, wherein the data collection module is used for collecting massive database information from various channels by various means; the data collection module is in data connection with the data screening module, and the data screening module is used for cleaning and processing the collected data, removing invalid data, and performing deduplication and format unification operations; the data screening module is in data connection with the data analysis module, and the data analysis module is used for analyzing the screened data, finding regularities and trends in it, and providing diagnosis and analysis results for the database.
The analytical formula is: [formula not reproduced in this text]
wherein: $\mu_x$, $\mu_y$ are the mean square errors of the original data $x$ and of the screened data; $c_1$, $c_2$ are data format constants for the original and screened data; $\sigma_x$, $\sigma_y$ are the respective variances of the original and screened data; and the screened data is the data obtained by screening the original data $x$.
The data analysis module is in data connection with the data visualization module, and the data visualization module is used for visually displaying the analysis results and presenting them to the user intuitively in the form of charts and reports;
the data visualization module is in data connection with the data mining module, and the data mining module is used for mining and analyzing the data in the database and exploring potential associations and patterns, thereby assisting the user in business analysis and decision-making. The system further comprises a database monitoring module, a database performance analysis module and a database security analysis module, wherein the database monitoring module is in data connection with the database performance analysis module, and the database performance analysis module is in data connection with the database security analysis module.
The database monitoring module monitors the collected database running state and SQL execution parameter information in real time and raises alarms, and the database performance analysis module gathers statistics on and analyzes the database performance parameters, and investigates and optimizes abnormal data. The database security analysis module evaluates and analyzes the data security of the database and provides early warning of, and precautionary measures against, security threats.
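The real-time monitoring and alarm behaviour of the database monitoring module could, under simple assumptions, be reduced to comparing each collected metric against a configured limit. The metric names, the `Threshold` class and the `check_metrics` function below are hypothetical and serve only to make the idea concrete.

```python
from dataclasses import dataclass

@dataclass
class Threshold:
    metric: str   # name of the monitored running-state parameter
    limit: float  # an alarm is raised when the value exceeds this limit

def check_metrics(snapshot, thresholds):
    """Return one alert message for every metric above its limit."""
    alerts = []
    for threshold in thresholds:
        value = snapshot.get(threshold.metric)
        if value is not None and value > threshold.limit:
            alerts.append(
                f"ALERT {threshold.metric}={value} exceeds {threshold.limit}"
            )
    return alerts

thresholds = [
    Threshold("active_connections", 200),
    Threshold("slow_queries_per_min", 5),
]
# One collected snapshot of the running state (hypothetical values).
snapshot = {"active_connections": 250, "slow_queries_per_min": 1}
alerts = check_metrics(snapshot, thresholds)
```

In a running system this check would be called on every collection cycle, with the returned alerts forwarded to whatever notification channel is in use.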
Example 2: the database analysis method based on big data comprises the following steps: S1, data collection: massive database information, including data table structures, data records and SQL statements, is collected from various channels by various means, and the required data can also be acquired from various websites by Web crawlers; S2, data screening: the collected data is cleaned and processed, invalid data is removed, and deduplication and format unification operations are performed to ensure the accuracy and consistency of the data, wherein the data screening comprises condition screening, filter screening, database query statement screening and data mining algorithm screening;
S3, data analysis: the cleaned data is analyzed to find regularities and trends in it and to provide diagnosis and analysis results for the database, wherein the data analysis comprises database performance analysis, SQL statement analysis, database architecture analysis, database security analysis and data mining algorithm analysis; S4, data visualization: the analysis results are displayed visually, wherein the display modes comprise chart display, report display and dynamic display;
S5, data mining: the data in the database is mined and analyzed to uncover potential associations and patterns, thereby assisting the user in business analysis and decision-making, wherein the data mining process comprises statistic mining, data distribution mining, time series mining, data mining and domain knowledge mining.
Condition screening in S2: the data is screened according to known conditions such as date, region and index conditions; filter screening: the data is screened by using the AutoFilter or Advanced Filter function of Excel spreadsheet software and setting the screening conditions; database query statement screening: the data in the database is screened according to conditions through SQL query statements; data mining algorithm screening: the data is screened intelligently by using clustering, classification and association rule data mining algorithms to find regularities and patterns in the data.
Database performance analysis in S3: performance bottlenecks of the database are identified by analyzing its performance indicators, such as response time, processing capacity and I/O indicators, and the database is optimized accordingly;
SQL statement analysis: potential problems in SQL statements are identified by analyzing their execution plan information, and the SQL statements are optimized;
database architecture analysis: deficiencies in the database design are identified by analyzing the architecture of the database, its relational model and the data types of its fields, and the data model design is optimized;
database security analysis: security problems of the database are identified by analyzing its security settings and permission controls, and the security policy of the database is optimized;
data mining algorithm analysis: regularities and patterns in the data are found by applying data mining algorithms, and the database management strategy is optimized.
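For the database performance analysis step, one simple way to surface a bottleneck is to compare the average response time of each statement against the typical statement. The sketch below flags statements whose mean is more than a chosen multiple of the median of all per-statement means; the factor of 3, the statement labels and the function name `find_bottlenecks` are assumptions for illustration.

```python
from collections import defaultdict
from statistics import mean, median

def find_bottlenecks(samples, factor=3.0):
    """Flag statements whose mean response time exceeds `factor` times
    the median of all per-statement means."""
    by_statement = defaultdict(list)
    for statement, elapsed_ms in samples:
        by_statement[statement].append(elapsed_ms)
    averages = {s: mean(times) for s, times in by_statement.items()}
    baseline = median(averages.values())
    return sorted(s for s, avg in averages.items() if avg > factor * baseline)

# Hypothetical (statement label, response time in ms) measurements.
samples = [
    ("SELECT_BY_ID", 2.0), ("SELECT_BY_ID", 3.0),
    ("INSERT_ORDER", 4.0), ("INSERT_ORDER", 6.0),
    ("REPORT_JOIN", 120.0), ("REPORT_JOIN", 180.0),
]
slow = find_bottlenecks(samples)
```

A relative baseline such as the median avoids hard-coding an absolute time limit, at the cost of always treating the workload's own typical statement as "normal".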
Statistic mining in S5: the distribution and central tendency of the data are mined by calculating statistics of the data such as the mean, median, mode and standard deviation; data distribution mining: the distribution of the data is mined by drawing histograms and probability distribution plots, so that the range and changing trend of the data are understood; time series mining: regular and trending characteristics are identified by mining the periodicity, trend and seasonality of the time series data.
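The time series mining described above (trend, periodicity, seasonality) can be illustrated with two elementary tools: a moving average to expose the trend, and the sample autocorrelation to guess the period. This is a sketch under stated assumptions; the series (hypothetical query counts repeating every four samples) and all function names are invented for the example.

```python
def moving_average(series, window):
    """Sliding-window average; yields len(series) - window + 1 points."""
    return [
        sum(series[i:i + window]) / window
        for i in range(len(series) - window + 1)
    ]

def autocorrelation(series, lag):
    """Sample autocorrelation at the given lag."""
    n = len(series)
    mean = sum(series) / n
    var = sum((x - mean) ** 2 for x in series)
    cov = sum((series[i] - mean) * (series[i + lag] - mean) for i in range(n - lag))
    return cov / var

def dominant_period(series, max_lag):
    """Lag in 2..max_lag with the highest autocorrelation (a seasonality guess)."""
    return max(range(2, max_lag + 1), key=lambda lag: autocorrelation(series, lag))

load = [10, 20, 30, 20] * 6          # hypothetical load, period of 4 samples
trend = moving_average(load, 4)      # averaging over one full period flattens it
period = dominant_period(load, 8)
```

Averaging over exactly one period removes the seasonal swing, which is why the trend comes out flat here; a rising or falling trend line would indicate a longer-term change in load.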
Data mining in S6: relationships, regularities and trends in the data are discovered by applying data mining algorithms, including clustering, association rules and decision trees; domain knowledge mining: the regularities and trends of the data are obtained by analyzing knowledge and experience in the domain of the data, and the knowledge and value in the data are discovered.
Working principle: in use, massive database information, including data table structures, data records and SQL statements, is collected from various channels by various means, and the required data can also be acquired from various websites by Web crawlers. The collected data is cleaned and processed, invalid data is removed, and deduplication and format unification operations are performed to ensure the accuracy and consistency of the data; the data screening comprises condition screening, filter screening, database query statement screening and data mining algorithm screening. By analyzing the cleaned data, regularities and trends are found, and diagnosis and analysis results for the database are provided.
The analytical formula is: [formula not reproduced in this text]
wherein: $\mu_x$, $\mu_y$ are the mean square errors of the original data $x$ and of the screened data; $c_1$, $c_2$ are data format constants for the original and screened data; $\sigma_x$, $\sigma_y$ are the respective variances of the original and screened data; and the screened data is the data obtained by screening the original data $x$.
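The symbols defined for the analytical formula (two location terms $\mu_x$, $\mu_y$, two spread terms $\sigma_x$, $\sigma_y$ and two constants $c_1$, $c_2$) resemble those of the structural similarity index (SSIM). Purely as an assumption, writing the screened data as $y$ and adding a cross term $\sigma_{xy}$ that the source does not define, a formula of that family would read:

```latex
\[
S(x, y) =
\frac{(2\mu_x\mu_y + c_1)\,(2\sigma_{xy} + c_2)}
     {(\mu_x^2 + \mu_y^2 + c_1)\,(\sigma_x^2 + \sigma_y^2 + c_2)}
\]
```

In SSIM, $\mu$ denotes a mean, $\sigma^2$ a variance and $\sigma_{xy}$ a covariance, and the value approaches 1 as the two data sets become more alike. Whether the patent intends exactly this form cannot be confirmed from the text.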
The data analysis comprises database performance analysis, SQL statement analysis, database architecture analysis, database security analysis and data mining algorithm analysis.
The analysis results are displayed visually, and the display modes comprise chart display, report display and dynamic display. By mining and analyzing the data in the database, potential associations and patterns are uncovered, thereby assisting the user in business analysis and decision-making; the data mining process comprises statistic mining, data distribution mining, time series mining, data mining and domain knowledge mining.
Condition screening: the data is screened according to known conditions such as date, region and index conditions; filter screening: the data is screened by using the AutoFilter or Advanced Filter function of Excel spreadsheet software and setting the screening conditions; database query statement screening: the data in the database is screened according to conditions through SQL query statements; data mining algorithm screening: the data is screened intelligently by using clustering, classification and association rule data mining algorithms to find regularities and patterns in the data.
Performance bottlenecks of the database are identified by analyzing its performance indicators, such as response time, processing capacity and I/O indicators, and the database is optimized accordingly; potential problems in SQL statements are identified by analyzing their execution plan information, and the SQL statements are optimized; deficiencies in the database design are identified by analyzing the architecture of the database, its relational model and the data types of its fields, and the data model design is optimized; security problems of the database are identified by analyzing its security settings and permission controls, and the security policy of the database is optimized; regularities and patterns in the data are found by applying data mining algorithms, and the database management strategy is optimized.
Statistic mining: the distribution and central tendency of the data are mined by calculating statistics of the data such as the mean, median, mode and standard deviation; data distribution mining: the distribution of the data is mined by drawing histograms and probability distribution plots, so that the range and changing trend of the data are understood; time series mining: regular and trending characteristics are identified by mining the periodicity, trend and seasonality of the time series data; data mining: relationships, regularities and trends in the data are discovered by applying data mining algorithms, including clustering, association rules and decision trees; domain knowledge mining: the regularities and trends of the data are obtained by analyzing knowledge and experience in the domain of the data, and the knowledge and value in the data are discovered.
The above embodiments are only intended to illustrate the invention and not to limit the technical solutions described herein. Although the invention has been described in detail in this specification with reference to the above embodiments, it is not limited to those specific embodiments, and modifications or equivalent substitutions may be made to it; all technical solutions and improvements thereof that do not depart from the spirit and scope of the invention are intended to fall within the scope of the appended claims.

Claims (10)

1. A database analysis system based on big data, characterized by comprising a data collection module, a data screening module, a data analysis module, a data visualization module and a data mining module, wherein the data collection module is used for collecting massive database information from various channels by various means;
the data collection module is in data connection with the data screening module, and the data screening module is used for cleaning and processing the collected data, removing invalid data, and performing deduplication and format unification operations;
the data screening module is in data connection with the data analysis module, and the data analysis module is used for analyzing the screened data, finding regularities and trends in it, and providing diagnosis and analysis results for the database;
the analytical formula is: [formula not reproduced in this text]
wherein: $\mu_x$, $\mu_y$ are the mean square errors of the original data $x$ and of the screened data; $c_1$, $c_2$ are data format constants for the original and screened data; $\sigma_x$, $\sigma_y$ are the respective variances of the original and screened data; and the screened data is the data obtained by screening the original data $x$;
the data analysis module is in data connection with the data visualization module, and the data visualization module is used for visually displaying the analysis results and presenting them to the user intuitively in the form of charts and reports;
the data visualization module is in data connection with the data mining module, and the data mining module is used for mining and analyzing the data in the database and exploring potential associations and patterns, thereby assisting the user in business analysis and decision-making.
2. The big data based database analysis system of claim 1, further comprising a database monitor module, a database performance analysis module, and a database security analysis module, wherein the database monitor module is in data connection with the database performance analysis module, and the database performance analysis module is in data connection with the database security analysis module.
3. The big data based database analysis system of claim 2, wherein the database monitoring module monitors, in real time, the collected database running state and SQL execution parameter information and raises alarms; the calculation formula for the database running state is:
wherein: θ is the data operation monitoring target; v_1 and v_2 are the running values of the data; Σ_1 and Σ_2 are the data mean vector and covariance.
4. The big data based database analysis system of claim 3, wherein the database performance analysis module compiles statistics on and analyzes the performance parameters of the database, and investigates and optimizes abnormal data.
5. The big data based database analysis system of claim 4, wherein the database security analysis module evaluates and analyzes the data security of the database and provides security threat early warnings and countermeasures.
6. A database analysis method based on big data, characterized by comprising the following steps: S1, data collection: collecting massive database information from multiple channels by multiple means, wherein the database information comprises data table structures, data records and SQL statements, and the required data can also be acquired from websites by a Web crawler;
S2, data screening: cleaning and processing the collected data, removing invalid data, de-duplicating and unifying formats to ensure the accuracy and consistency of the data, wherein the data screening comprises condition screening, filter screening, database query statement screening and data mining algorithm screening;
S3, data analysis: analyzing the cleaned data to find rules and trends in the data and providing diagnosis and analysis results for the database, wherein the data analysis comprises database performance analysis, SQL statement analysis, database architecture analysis, database security analysis and data mining algorithm analysis;
S4, data visualization: displaying the analysis results visually, wherein the display modes comprise chart display, report display and dynamic display;
S5, data mining: mining and analyzing the data in the database to uncover potential associations and patterns, thereby assisting the user in business analysis and decision-making, wherein the data mining comprises statistics mining, data distribution mining, time series mining, data mining and domain knowledge mining.
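As an illustration of the screening step S2 (removing invalid data, de-duplicating and unifying formats), the following is a minimal Python sketch. The record fields `table` and `date`, the accepted date formats and the function name `screen_records` are assumptions made for the example; the claims do not specify them.

```python
from datetime import datetime

# Hypothetical input date formats; the claims do not name any.
DATE_FORMATS = ("%Y-%m-%d", "%Y/%m/%d", "%d.%m.%Y")


def screen_records(records):
    """Drop invalid rows, unify formats, and de-duplicate (illustrative only)."""
    seen = set()
    cleaned = []
    for rec in records:
        name = (rec.get("table") or "").strip().lower()
        raw_date = (rec.get("date") or "").strip()
        if not name or not raw_date:
            continue  # invalid: a required field is missing
        parsed = None
        for fmt in DATE_FORMATS:
            try:
                parsed = datetime.strptime(raw_date, fmt)
                break
            except ValueError:
                continue
        if parsed is None:
            continue  # invalid: date is not in any accepted format
        key = (name, parsed.date().isoformat())  # unified representation
        if key in seen:
            continue  # duplicate after format unification
        seen.add(key)
        cleaned.append({"table": key[0], "date": key[1]})
    return cleaned


sample = [
    {"table": " Orders ", "date": "2023/06/18"},
    {"table": "orders", "date": "2023-06-18"},   # duplicate once unified
    {"table": "", "date": "2023-06-18"},         # invalid: empty table name
    {"table": "users", "date": "not a date"},    # invalid: unparseable date
    {"table": "Users", "date": "18.06.2023"},
]
screened = screen_records(sample)
```

De-duplication is applied after format unification here, so that records differing only in letter case or date notation are treated as the same record.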
7. The big data based database analysis method of claim 6, wherein, in S2, the condition screening comprises: screening the data according to known conditions, such as date, region and index conditions;
the filter screening comprises: using the automatic filter or advanced filter function of Excel spreadsheet software and setting screening conditions to screen the data;
the database query statement screening comprises: screening the data in the database according to conditions by means of SQL database query statements;
the data mining algorithm screening comprises: intelligently screening the data by using clustering, classification and association rule data mining algorithms to find rules and patterns in the data.
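The database query statement screening described above can be sketched with Python's built-in `sqlite3` module. SQLite, the `metrics` table, its columns and the sample rows are assumptions chosen for illustration; the claims do not name a database product or schema.

```python
import sqlite3

# In-memory database with a hypothetical metrics table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE metrics (day TEXT, region TEXT, response_ms REAL)")
conn.executemany(
    "INSERT INTO metrics VALUES (?, ?, ?)",
    [
        ("2023-06-01", "south", 120.0),
        ("2023-06-10", "south", 480.0),
        ("2023-06-15", "north", 95.0),
        ("2023-06-18", "south", 110.0),
    ],
)


def screen_by_conditions(conn, start_day, region, max_response_ms):
    """Screen rows by date, region and an index threshold using a parameterized query."""
    sql = (
        "SELECT day, region, response_ms FROM metrics "
        "WHERE day >= ? AND region = ? AND response_ms <= ? "
        "ORDER BY day"
    )
    return conn.execute(sql, (start_day, region, max_response_ms)).fetchall()


rows = screen_by_conditions(conn, "2023-06-05", "south", 200.0)
```

Placeholders (`?`) are used rather than string formatting so that the screening conditions cannot alter the structure of the query.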
8. The big data based database analysis method of claim 7, wherein, in S3, the database performance analysis comprises: finding the performance bottlenecks of the database by analyzing performance indicators of the database, including response time, processing capacity and I/O, and optimizing the database;
the SQL statement analysis comprises: finding potential problems in an SQL statement by analyzing its execution plan information, and optimizing the SQL statement;
the database architecture analysis comprises: finding defects in the database design by analyzing the system structure, the relational model and the field data types of the database, and optimizing the data model design;
the database security analysis comprises: finding security problems in the database by analyzing its security settings and permission controls, and optimizing the security policy of the database;
the data mining algorithm analysis comprises: finding rules and patterns in the data by applying data mining algorithms, and optimizing the database management strategy.
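The SQL statement analysis above relies on execution plan information. One possible way to inspect such a plan, assuming SQLite as a stand-in database, is its `EXPLAIN QUERY PLAN` statement; the `orders` table, the index name and the helper functions below are hypothetical. The exact wording of the plan text differs between SQLite versions, so the sketch only checks whether a step begins with `SCAN` (full table scan) rather than matching the whole string.

```python
import sqlite3


def plan_details(conn, sql, params=()):
    """Return the textual detail (last column) of each EXPLAIN QUERY PLAN row."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return [row[-1] for row in rows]


def has_full_scan(details):
    """Heuristic: a plan step starting with SCAN reads the whole table."""
    return any(d.startswith("SCAN") for d in details)


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, total REAL)"
)
query = "SELECT total FROM orders WHERE customer_id = ?"

# Without an index on customer_id, the filter requires a full table scan.
before = plan_details(conn, query, (42,))

# After adding an index, the same statement can use an index search instead.
conn.execute("CREATE INDEX idx_orders_customer ON orders (customer_id)")
after = plan_details(conn, query, (42,))
```

Comparing `before` and `after` shows how a potential problem (a full scan) found in the plan can be addressed by an optimization (an index).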
9. The big data based database analysis method of claim 8, wherein, in S5, the statistics mining comprises: mining the distribution and central tendency of the data by calculating statistics of the data, including the mean, median, mode and standard deviation; the data distribution mining comprises: mining the distribution of the data by drawing histograms and probability distribution plots, so as to grasp the range and trend of the data; the time series mining comprises: finding regularities and trends by mining the periodicity, trend and seasonality of time series data.
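The statistics named in the statistics mining step can be computed with Python's standard `statistics` module. The sample values and the function name `summarize` are hypothetical, and the choice of the population standard deviation (`pstdev`) over the sample standard deviation is an assumption, since the claim does not say which is meant.

```python
import statistics


def summarize(values):
    """Compute the central tendency and spread statistics named in the claim."""
    return {
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "mode": statistics.mode(values),
        # Population standard deviation; use statistics.stdev for the sample version.
        "stdev": statistics.pstdev(values),
    }


# Hypothetical measurements, e.g. query counts per minute.
summary = summarize([2, 4, 4, 4, 5, 5, 7, 9])
```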
10. The big data based database analysis method of claim 9, wherein, in S5, the data mining comprises: finding relationships, rules and trends in the data by applying data mining algorithms, including clustering, association rules and decision trees; the domain knowledge mining comprises: obtaining the rules and trends of the data by analyzing knowledge and experience from the domain of the data, and finding the knowledge and value in the data.
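Clustering is one of the data mining algorithms named above. As a minimal sketch, the following pure-Python one-dimensional k-means groups numeric values such as query response times. The algorithm choice, the deterministic initialisation and the sample values are assumptions made for illustration and are not taken from the claims.

```python
def kmeans_1d(values, k, iterations=20):
    """Cluster numbers into k groups with a simple one-dimensional k-means."""
    ordered = sorted(values)
    # Deterministic start: spread the initial centroids across the sorted values.
    centroids = [
        ordered[(i * (len(ordered) - 1)) // max(k - 1, 1)] for i in range(k)
    ]
    clusters = [[] for _ in range(k)]
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for v in values:
            # Assign each value to its nearest centroid.
            nearest = min(range(k), key=lambda i: abs(v - centroids[i]))
            clusters[nearest].append(v)
        # Recompute each centroid as the mean of its cluster (keep it if empty).
        updated = [
            sum(c) / len(c) if c else centroids[i] for i, c in enumerate(clusters)
        ]
        if updated == centroids:
            break  # converged: assignments no longer change the centroids
        centroids = updated
    return centroids, clusters


# Hypothetical response times in milliseconds: a fast group and a slow group.
centroids, clusters = kmeans_1d([10, 12, 11, 200, 210, 205], k=2)
```

On this sample the two groups separate into fast and slow responses, which is the kind of pattern the mining step is meant to surface.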
CN202310720127.XA 2023-06-18 2023-06-18 Database analysis system and method based on big data Pending CN116594987A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310720127.XA CN116594987A (en) 2023-06-18 2023-06-18 Database analysis system and method based on big data

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310720127.XA CN116594987A (en) 2023-06-18 2023-06-18 Database analysis system and method based on big data

Publications (1)

Publication Number Publication Date
CN116594987A true CN116594987A (en) 2023-08-15

Family

ID=87595721

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310720127.XA Pending CN116594987A (en) 2023-06-18 2023-06-18 Database analysis system and method based on big data

Country Status (1)

Country Link
CN (1) CN116594987A (en)

Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20020078039A1 (en) * 2000-12-18 2002-06-20 Ncr Corporation By Paul M. Cereghini Architecture for distributed relational data mining systems
US20110161030A1 (en) * 2009-12-31 2011-06-30 Semiconductor Manufacturing International (Shanghai) Corporation Method And Device For Monitoring Measurement Data In Semiconductor Process
CN102622441A (en) * 2012-03-09 2012-08-01 山东大学 Automatic performance identification tuning system based on Oracle database
KR20170079648A (en) * 2015-12-30 2017-07-10 대한민국(국민안전처 국립재난안전연구원장) Analysis system for predicting future risks
KR101765292B1 (en) * 2016-06-21 2017-08-04 어니컴 주식회사 Apparatus and method for providing data analysis tool based on purpose
CN109272155A (en) * 2018-09-11 2019-01-25 郑州向心力通信技术股份有限公司 A kind of corporate behavior analysis system based on big data
CN109977661A (en) * 2019-04-09 2019-07-05 福建奇点时空数字科技有限公司 A kind of network safety protection method and system based on big data platform
CN111949502A (en) * 2020-08-14 2020-11-17 中国工商银行股份有限公司 Database early warning method and device, computing equipment and medium
CN115640158A (en) * 2022-10-28 2023-01-24 合肥长月科技有限公司 Detection analysis method and device based on database

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
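王莉;张勇: "Design and Implementation of an Image Database Architecture Based on a Big Data Platform" (基于大数据平台的图像数据库架构的设计与实现), Software Engineering (软件工程), vol. 22, no. 02 *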

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
WD01 Invention patent application deemed withdrawn after publication

Application publication date: 20230815