A kind of database overall automation optimizes prior-warning device and disposal route thereof
Technical field
The present invention relates to a kind of data base tool and disposal route thereof, particularly relate to a kind of database overall automation and optimize prior-warning device and disposal route thereof.
Background technology
Along with the crucial IT business systems grow of enterprise comes into one's own, also when the river rises the boat goes up for IT demand.Require that IT system should have better high reliability, more fast, more easy-to-use and more adapt to the Change and Development of business.The online processing transactions amount of modern enterprise IT system is large, and data interaction is frequent, all very high to the availability of system, requirement of real-time, requires very strict to performance index.
On the other hand, the business of enterprise is the needs adapting to numerous customer groups, change again original business endlessly, release newborn business, the development need that IT system developing department just constantly must issue new application version, change IT system database object (table, index etc.) coordinates business.Along with the fast development of business event, business datum amount maintains sustained and rapid growth, thus brings impact to performance.
How to ensure the superior performance of operation system, and in constantly changing still retention stablize, this be numerous enterprises now faced by a very important problem.In reality, the operation system that technological frame is fixed, database schema is stable, the quality overwhelming majority of its runnability depends on the performance change of database, and a set of entirety rational index scheme is to database performance boost and stablely play vital effect.
Index object in database is a kind of structure sorted to row or the value of multiple row in database table, makes index of reference can customizing messages in fast access database table.As the catalogue effect of same books is the same, reader can be helped to navigate to particular content fast, be the very effective and strong means that database overall performance promotes.In a database, the two-part information of index stores: the key assignments in table and corresponding rowid(its comprise the physical address of data line), rowid can unique identification one line item.Suitable row reasonably make index of reference, read and write by means of only a small amount of IO, just from the table of huge data volume, location requirement can find the particular location of certain block of data in table, thus quick obtaining is to required data, reduces the time of data search and the expense of database resource.
But index is also a double-edged sword, if use irrational index on inappropriate row, then negative effect can be brought.On the one hand, use the index on improper row, query optimizer just repeatedly must be retrieved from index, draw all rowid, and then from table, the number of times of as much is read by rowid, just getting required data, using search index slower than not making index of reference on the contrary like this; On the other hand, there is maintenance cost in index itself.For the frequent table inserting, delete, if create more index, or create index in the field of often carrying out renewal rewards theory, the maintenance of index can affect the efficiency that these DML operate (comprising the operations such as insertion, deletion, renewal).Therefore, in order to ensure the rationality of index, needing to take into account simultaneously and considering query manipulation and DML operation.
For database, how to create and to safeguard a set of rational index scheme, needing to have the overall situation in mind, totally weigh, can not only catch a SQL statement, table, field just unilateral determining whether create one or more index.
For establishment and the maintenance of database index, a main point the following two kinds situation performs at present:
1), at applied system design, exploitation, test phase, or before Added Business is formally reached the standard grade, designer, developer etc. rely on self understanding to business and the effect of model measurement, estimate the indexs such as the data volume of corresponding service table, variable quantity, data value distribution, access frequency, then create initial index based on these assessed values batch in corresponding traffic table;
2), in system commencement of commercial operation process, when a certain bar SQL statement execution efficiency of discovery is low:
Situation 1) maintainer relies on this wall scroll SQL statement of personal experience's Manual analysis, as judged to lack caused by suitable index, namely on original index scheme basis, newly-increased create required for index; Situation 2) maintainer uses the tuningadvisor instrument provided as Oracle, wall scroll SQL sets out thus, provides limited Optimizing Suggestions according to the statistical information of related service table, index as newly-increased in instrument suggestion, maintainer is comprehensive descision then, then creates corresponding index.
Situation 2, mainly revising the initial index scheme that situation 1 creates, solving the performance issue caused because lacking suitable index in operational process.Process flow diagram as shown in Figure 1, comprises the steps: 1) real-time monitoring data storehouse wait event, whether have and extremely wait for generation; 2) according to DBD database diagnostics report, the TOPSQL that resource consumption is relatively given prominence to whether is had; 3) if flow process 1,2 finds poor efficiency SQL, manual or use as OracleSQLtuningadvisor tool analysis SQL and the traffic table be associated and index distribution situation, thus determine whether that the index needing establishment suitable is to promote SQL efficiency; 4) according to optimiged index scheme, increase index newly or revise existing index; 5) executive plan making SQL original was lost efficacy, and regenerated executive plan; 6) executive plan that this SQL is new, corresponding cost expense and wait event is checked; 7) whether validation problem solves; 8) if do not solved, so iterative analysis, until Resolving probiems.
System formally reach the standard grade before the index scheme of initial creation, due to artificially assess error, emulation testing and the factor such as the gap of actual production, data volume constantly increases, access habits changes, business demand changes, application program change, the optimization (or revise) of later stage to index scheme is essential, and database index optimization is a process continuing for a long time to carry out.Therefore there is following shortcoming in existing database index creation and maintenance:
1), foresight is not possessed.Optimiged index operates, and is all that core business is partly or entirely affected often, belongs to passive behavior after performance issue occurs.
2), lack the overall situation to consider.After performance issue occurs, technician navigates to concrete SQL statement, only analyze for this single SQL statement, optimiged index scheme formulated by its relevant table and field, fail to consider that this index changes the performance impact to other SQL on same table comprehensively, but this risk is outwardness, be a kind ofly turn round and look at the behavior that head ignores tail.
3), index only increases.Technician is when Analytical Index prioritization scheme, because worry deletion, merging index have the larger performance that may have influence on other SQL, so consider from minimum influence, often only dare to increase required index newly again on the original index basis of table, will cause like this index scheme of whole database more and more huger, become increasingly complex, even can reduce the efficiency of database conversely.
4), high to the dependence of people.Even if use OracleSQLtuningadvisor instrument to analyze SQL statement, because its shortage is to the collection of the key messages such as visitation frequency, data volume variation tendency and analysis, still cannot ensure that it is advised rationally effective, still maintainer is needed again in conjunction with service integration analysis, to professional knowledge Existence dependency.Further, for shortening the affected duration of business, maintainer is needed to deal with fast, so the technical merit also Existence dependency to maintainer.If can not meet the dependence of these two aspects simultaneously, maintainer attempts the establishment index of formula, may not only can't resolve original problem, also may draw new performance issue.
Summary of the invention
Technical matters to be solved by this invention is to provide a kind of database overall automation and optimizes prior-warning device and disposal route thereof, index early warning can be carried out according to data variation and visitation frequency, reduced data storehouse index scheme, effectively avoid the service feature problem caused because index is unreasonable, ensure that database maintains Effec-tive Function.
The present invention solves the problems of the technologies described above the technical scheme adopted to be to provide a kind of database overall automation optimization prior-warning device, and comprise task scheduling module: for arranging index monitor task, configuration needs the traffic table monitoring index; Information acquisition module: be connected to target database for timing, the execution frequency gathering all SQL, SQL of corresponding service table, the predicate information called, and be recorded to information acquisition storehouse; Results analyses module: according to the data collected, statistical study is carried out to the selectivity of predicate each in all SQL, frequency of usage and combination, calculate the index scheme of target database in real time, and be recorded to interpretation of result storehouse.
Above-mentioned database overall automation optimizes prior-warning device, wherein, described task scheduling module comprises task creation, task is cancelled, task upgrades, task is deleted and job enquiry, and described task scheduling module is by once to perform or timer-triggered scheduler module repeatedly circulates and performs image data library information.
Above-mentioned database overall automation optimizes prior-warning device, and wherein, described information acquisition module comprises SQL filtering module, predicate locating module, predicate filtering module, predicate fractionation module and frequency acquisition module; Described results analyses module comprises predicative analysis module and frequency analysis module, and described results analyses module increases newly index according to the visitation frequency change of traffic table and field, merge and/or deletes, and calculates the index scheme of target database in real time.
The present invention also provides a kind of above-mentioned database overall automation to optimize the disposal route of prior-warning device for solving the problems of the technologies described above, and comprise the steps: a) to arrange index monitor task, configuration needs the traffic table monitoring index; B) timing is connected to target database, the execution frequency gathering all SQL, SQL of corresponding service table, the predicate information called; C) according to the index scheme of the data real-time update target database collected.
Above-mentioned database overall automation optimizes early-warning processing method, wherein, described step b) in gatherer process as follows: b1) input user name in target database/table name set, and the beginning and ending time of sampling, the frequency of collection; B2) from target database, relevant SQL is obtained according to inputting user name/table name set; B3) obtain successively in every bar SQL statement according to SQL_ID and access predicate field and filter predicates field and gather; B4) selectivity of predicate is calculated according to the statistical information of traffic table; B5) the execution frequency of SQL is added up.
Above-mentioned database overall automation optimizes early-warning processing method, wherein, line number is less than to the traffic table of default line number, does not create index, directly skip subsequent step.
Above-mentioned database overall automation optimizes early-warning processing method, wherein, for the predicate comprising key word " and " or " or ", is split into independent predicate respectively by about it, and recurrence split process; Filter when collection for the predicate field of index of reference cannot be made; For the predicate comprising function, set up the index based on function.
Above-mentioned database overall automation optimizes early-warning processing method, wherein, described step c) comprise following process: c1) input the information collected: user name/table name, predicate, the frequency and selectivity are also added up; C2) by same SQL and frequency collating account for total amount before 60% predicate combine, for exist SQL that is identical or the identical predicate of part first judge whether create composite index, recycling index; C3) if do not need to create composite index, then order of the field is decided by traversal Ranking evaluation index utilization factor; C4) if single-row index cost is lower than composite index, then carry out single-row index analysis, add up each predicate field frequency of usage in all SQL and with descending sort, calculate the frequency of each field; C5) analyze predicate from high to low by the predicate frequency, predicate selectivity being greater than to 0.2 creates b-tree indexed or the b-tree indexed based on function; C6) iterative analysis predicate, until terminate, contrast difference done in last and existing index, exports the index needing to increase, delete and/or merge.
Above-mentioned database overall automation optimizes early-warning processing method, wherein, selectivity is less than to the predicate of 0.2, if the frequency of calling of this predicate accounts for total amount more than 40%, sets up bitmap index, otherwise abandon this predicate.
Above-mentioned database overall automation optimizes early-warning processing method, and wherein, the selectivity Selectivity of multiple predicate combination is calculated as follows:
Selectivity=1-(1-Dist
1/Rows)×(1-Dist
2/Rows)×(1-Dist
3/Rows)×...(1-Dist
n/Rows);
Multiple predicate on average performs frequency sum Exec_Freq and is calculated as follows:
Exec_Freq=SQL_Freq
1+SQL_Freq
2+SQL_Freq
3+...SQL_Freq
n;
The selectivity of predicate and the assessed value V of the predicate frequency are calculated as follows::
If the assessed value V of the selectivity of predicate and the predicate frequency is greater than predetermined threshold value, then b-tree indexed or the b-tree indexed based on function are created to predicate, wherein, Rows is the number of data lines of traffic table, Dist1, Dist2, Dist3 ... Distn is respectively the value number of each predicate, SQL_Freq1, SQL_Freq2, SQL_Freq3 ... on average the frequency is performed in each SQL unit interval that SQL_Freqn is respectively this predicate practical.
The present invention contrasts prior art following beneficial effect: database overall automation provided by the invention optimizes prior-warning device and disposal route thereof, three-tier architecture device is adopted to realize task scheduling, information acquisition and interpretation of result, and by the Data acquisition and issuance continued, index early warning (newly-increased, merge, deletion) can be sent according to table and the visitation frequency of field, data variation, thus make database indexing schemes more simplify rationally, database keeps Effec-tive Function, effectively can also avoid the service feature problem caused because index is unreasonable.
Accompanying drawing explanation
Fig. 1 is that early warning process flow diagram is optimized in existing database robotization of overall importance;
Fig. 2 is that database overall automation of the present invention optimizes prior-warning device block schematic illustration;
Fig. 3 is that database overall automation of the present invention optimizes early warning treatment scheme schematic diagram;
Fig. 4 is that database overall automation of the present invention optimizes early warning information collecting flowchart schematic diagram;
Fig. 5 is that database overall automation of the present invention optimizes early warning interpretation of result schematic flow sheet;
Fig. 6 is that database overall automation of the present invention optimizes cpu performance contrast schematic diagram before and after early warning;
Fig. 7 is that database overall automation of the present invention optimizes IO performance comparison schematic diagram before and after early warning.
Embodiment
Below in conjunction with drawings and Examples, the invention will be further described.
Fig. 2 is that database overall automation of the present invention optimizes prior-warning device block schematic illustration.
Refer to Fig. 2, database overall automation provided by the invention is optimized prior-warning device and is comprised following three layers:
Task allotment layer, adopts task scheduling module to arrange index monitor task, and configuration needs the traffic table monitoring index; The management of this layer of primary responsibility task, mainly contains the operations such as the establishment of task, renewal, deletion, inquiry and scheduling.Definition monitor task, definable once performs or repeatedly circulates executions, by information acquisition and analytic statistics information, proposes optimiged index suggestion, or circulates and execute the task monitor database index demand and send early warning.
Information acquisition mold layer, adopts information acquisition module timing to be connected to target database, the execution frequency gathering all SQL, SQL of corresponding service table, the predicate information called, and is recorded to information acquisition storehouse;
Interpretation of result layer, adopt results analyses module according to the data collected to the use of predicate each in all SQL, combine and add up, calculate the index scheme of target database in real time, and be recorded to interpretation of result storehouse.
Predicate in database, refer to the key that the key-value pair existed in the WHERE condition of SQL is answered, as user_id and expire_time in " selectuser_namefromuserswhereuser_id=1234andexpire_timeG reatT.GreaT.GTsysdate " is predicate, also can be regarded as the part field of the table used by where condition.The present invention is by the autonomous three-tier architecture device created, a set of standard mechanisms such as task scheduling, information acquisition, interpretation of result, from the database overall situation, to existing database indexing schemes, overall Optimizing Suggestions can be proposed, and by the Data acquisition and issuance continued, index early warning (newly-increased, merge, deletion) can be sent according to table and the visitation frequency of field, data variation, data analysis is relied on to export Optimizing Suggestions, to professional skill, the technical capability not Existence dependency of maintainer completely.After using the present invention, database indexing schemes is more simplified rationally, and database keeps Effec-tive Function, effectively can also avoid the service feature problem caused because index is unreasonable.
Fig. 3 is that database overall automation of the present invention optimizes early warning treatment scheme schematic diagram.
Please continue see Fig. 3, overall realization flow is as follows:
1) first index of definition monitor task, configuration needs the monitoring table of index or the set of table.
2) allocate storehouse by query task, judge respective table whether configuration index monitor task, configure, skip, otherwise creation task strategy.
3) after task creation, automatically initiate monitoring process, timing is connected to target database, gathers the information such as the execution frequency of all SQL, SQL of respective table, the predicate that calls, and is recorded to information acquisition storehouse.
4) at interpretation of result layer, according to the data collected, calculate optimum index scheme, and be recorded to interpretation of result storehouse.
5) by optimum index scheme and current index project plan comparison, export analysis result to maintainer, and terminate this subtask.
Monitor task can be configured to single and run, and closes after end of run, exports the suggestion of this optimiged index to maintainer; Monitor task also can be configured to periodic cyclic and perform, and when finding that original index scheme needs to optimize, namely sending early warning to maintainer, avoiding the generation of performance issue.
Fig. 4 is that database overall automation of the present invention optimizes early warning information collecting flowchart schematic diagram.
Please continue see Fig. 4, the collection of this layer of responsible relevant information, whole collecting flowchart is as follows:
1) task scheduling process fech connection information from task allotment storehouse, is connected to target database.
2) return Success Flag after target data successful connection, if failure, return error code and error message.
3) input data are obtained: user name/table name set in task allotment storehouse, and the beginning and ending time of sampling, the frequency etc. of collection.
4) from Database Dynamic performance views, relevant SQL is obtained, with SQL_ID unique identification according to inputting user name/table name set.
5) from statistical information, obtain the line number of table, if be less than the traffic table (this numerical value is adjustable, acquiescence value 10000) of 10000 row, then need not consider index creation, directly skip subsequent step.
6) get ACCESS predicate and FILTER predicate respectively according to SQL_ID and gather.
7) if comprise key word " and ", " or " in predicate, so independent predicate must be split into respectively by about it, and recurrence split process.
8) in time there is following situation, field cannot make index of reference, that is, needs to be filtered when collection:
When predicate comprises function time, need to consider especially, if need to make index of reference, so must set up the index based on function, common function has:
9) selectivity of predicate is very important index, lowly optionally arranges, and the b-tree indexed of establishment is nugatory, also affects the efficiency of DML operation.The selectivity of predicate is calculated according to the statistical information of traffic table.
10) the execution frequency of SQL is added up.Performing frequency is another important references index, determines the cost of index creation.
11) each image data, phase analysis result are stored in information acquisition storehouse.
Fig. 5 is that database overall automation of the present invention optimizes early warning interpretation of result schematic flow sheet.
Please continue see Fig. 5, this layer is responsible for analyzing Information Monitoring, and finally export optimiged index, early warning suggestion, detailed process is as follows:
1) information collected in information acquisition storehouse is inputted: user name/table name, predicate, the frequency, selectivity etc.
2) laws of use, the predicate combined level of predicate in all SQL relevant to table is analyzed.
3) composite index assessment is carried out, by in same SQL and the forward predicate of frequency collating attempt combining, such as frequency collating to account for before total amount 60%, check whether that other SQL have the identical or identical predicate of part, can index reuse, and assess cost, if lower than single-row index, then decide order of the field by traversal Ranking evaluation index utilization factor.
4) if single-row index cost is lower, then consider to carry out single-row index analysis.Add up each predicate field frequency of usage in all SQL and with descending sort, calculate the frequency of each field.
5) analyze predicate from high to low by the predicate frequency, if its selectivity is low, as selectivity is less than 0.2, then call the frequency in conjunction with predicate and consider bitmap index (this option is configurable, such as calls the frequency and accounts for total amount more than 40%).If selectivity is higher, as selectivity is greater than 0.2, then suggestion creates the b-tree indexed b-tree indexed of function (or based on).
6) iterative analysis predicate, until terminate.Contrast difference done in last and existing index, exports optimiged index (increase, deletion, merging etc.), early warning suggestion.
From above flow process, can see, judge that the two large key indexs whether a predicate creating index are: the selectivity of predicate and the frequency of predicate.(if composite index also needs the combined level considering predicate).The formula calculation of core as follows:
The selectivity of singular predicate: the ratio recording number in referring to the data of different value in predicate and showing.If the data in a table have 100000 row, predicate has 85000 different values, and so selectivity is 85000/100000=0.85.The selectivity value of index is more than or equal to 0, is less than or equal to 1, and more close to 1, then selectivity is better, and the efficiency of index is higher.
Selectivity=Dist/Rows
The selectivity of many predicates: when considering to create composite index, need the selectivity considering that multiple predicate combines.
selectivity=1-(1-Dist
1/Rows)×(1-Dist
2/Rows)×(1-Dist
3/Rows)×...(1-Dist
n/Rows)
The predicate frequency: use average execution frequency sum in all SQL unit interval of this predicate.
Exec_Freq=SQL_Freq
1+SQL_Freq
2+SQL_Freq
3+...SQL_Freq
n
Predicate creates index assessment: the selectivity and the predicate frequency that consider predicate.
Compared with corresponding threshold values by assessed value, determine whether to need to create index (or composite index) on this predicate (or multiple predicate).Because the business that each Service Database supports is variant, front end applications is different, so the affairs amount that database supports, the total amount of SQL, the execution frequencys of SQL etc. are all different, so threshold values corresponding to assessed value is not a constant of applicable all databases, but each database all has a threshold values of oneself correspondence, the formulation of this threshold values needs by full storehouse index assessed value, the original index goodness of fit of database is comprehensively analyzed, (later stage also can adjust according to actual conditions) generally determined in first full storehouse optimiged index analysis, after this threshold values is determined, namely can be used for follow-up database indexing schemes global optimization and automatic early-warning.
For certain telecom operators' core business system, after adopting technical scheme provided by the invention, after using apparatus of the present invention to carry out database index global optimization by the end of April, database performance has obvious lifting.Do to optimize front and back as follows to compare from database server CPU and IO two aspect: the peak traffic period choosing April 5 and two days 14 o'clock to 18 o'clock on the 5th May carries out across comparison: before optimization, CPU usage average 85%, after optimization, CPU usage average 61%, CPU usage declines 24%, as shown in Figure 6.Before optimization, average 141MB/S read by disk, and after optimization, average 99MB/S read by disk, and the IO that disk is read measures decline 30%, as shown in Figure 7.
In sum, database overall automation provided by the invention optimizes prior-warning device and disposal route thereof, adopt task scheduling, information acquisition and interpretation of result three-tier architecture device, and by the Data acquisition and issuance continued, index early warning (newly-increased, merge, delete) can be sent according to the visitation frequency of traffic table and field, data variation, thus make database indexing schemes more simplify rationally, database keeps Effec-tive Function, effectively can also avoid the service feature problem caused because index is unreasonable.Concrete advantage is as follows: 1) from global optimization database index.Compare with existing Oracle instrument SQLtuningadvisor, the present invention is not only for the optimiged index of wall scroll SQL (appointment predicate), but by real data collection and careful Algorithm Analysis, from the database overall situation, propose overall index optimal case, effectively prevent after original single index changes and cause new performance issue risk.2) realize index automatically to give warning in advance.By the timed task that customization circulation performs, objective change is caught by Data acquisition and issuance, comply with the development and change of business, automatically send index early warning signal in advance, after can effectively avoiding original performance issue to occur, the situation of Analytical Index scheme occurs again.3) continue to ensure that index scheme is simplified.Early stage, according to real data collection and analysis, is optimized initial index scheme before reaching the standard grade, and the later stage, by customization index automatic early-warning task, is constantly revised.Database only retains suitable index, and irrational index is deleted or merging then, ensure that simplifying of the overall index scheme of database.4) dependency degree of people is significantly reduced.In the present invention, the optimization of index and early warning, all rely on real data and advise from technical standpoint, do not need maintainer to the understanding of professional knowledge.Give warning in advance because achieve, maintainer has time enough to go to change index scheme, relies on also very large decline to the technical merit of maintainer.5) apparatus of the present invention are simple and easy to use.Creatively propose three-tier architecture, by shirtsleeve operation, can complete the customization of scheduler task, the operation such as follow-up information acquisition, analysis, calculating all has device automatically to complete, simple to operate, quite low to the requirement of user of service.
Although the present invention discloses as above with preferred embodiment; so itself and be not used to limit the present invention, any those skilled in the art, without departing from the spirit and scope of the present invention; when doing a little amendment and perfect, therefore protection scope of the present invention is when being as the criterion of defining with claims.