CN104616205B - A power system operating state monitoring method based on distributed log analysis - Google Patents
- Publication number
- CN104616205B (application CN201410681737.4A)
- Authority
- CN
- China
- Prior art keywords
- log
- log information
- point
- information
- electric power
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Expired - Fee Related
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q50/00—Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
- G06Q50/06—Energy or water supply
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/10—File systems; File servers
- G06F16/18—File system types
- G06F16/1805—Append-only file systems, e.g. using logs or journals to store data
- G06F16/1815—Journaling file systems
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/10—File systems; File servers
- G06F16/18—File system types
- G06F16/182—Distributed file systems
-
- H—ELECTRICITY
- H02—GENERATION; CONVERSION OR DISTRIBUTION OF ELECTRIC POWER
- H02J—CIRCUIT ARRANGEMENTS OR SYSTEMS FOR SUPPLYING OR DISTRIBUTING ELECTRIC POWER; SYSTEMS FOR STORING ELECTRIC ENERGY
- H02J3/00—Circuit arrangements for ac mains or ac distribution networks
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02D—CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
- Y02D10/00—Energy efficient computing, e.g. low power processors, power management or thermal management
Abstract
The invention discloses a power system operating state monitoring method based on distributed log analysis, comprising the following steps: S1, obtaining log information of the power system and merging it into log files; S2, splitting the log files, processing them to obtain log information in a unified format, and serializing the log entries in the log files one by one for output to a distributed storage system; S3, extracting the log information from the distributed storage system and, in combination with the Map-Reduce mechanism, classifying the log information using a log analysis algorithm based on state-noise-removal clustering, then monitoring the system operating state by analyzing the classified log information. When the system becomes abnormal, the invention can detect the abnormal operating state of the power system promptly and handle it immediately, effectively meeting the power system's requirements for timely and efficient operation.
Description
Technical field
The present invention relates to a power system operating state monitoring method, and more particularly to a power system operating state monitoring method based on distributed log analysis; it belongs to the technical field of power system dispatching.
Background art
As power grids grow in scale and complexity, ultra-high-voltage interconnected grids place new demands on integrated grid operation and unified coordinated control, and national requirements for safe, stable, economical, and environmentally friendly grid operation keep rising. Electric power big data has emerged in response: it is the application of big data concepts, technologies, and methods in the power industry. It covers every link of generation, transmission, transformation, distribution, consumption, and dispatching, and combines data analysis, data mining, and data visualization across organizations, disciplines, and business lines.
In the power system dispatching link, as smart grid technical support systems have been put into operation, the scope and types of grid data acquisition have kept expanding, and these systems play an important role in meeting the comprehensive needs of the interconnected power grid. At present, dispatching and control centers at all levels have built a series of dispatching, production, and management systems with the smart grid technical support system at their core, mainly including SCADA/EMS, WAMS, hydropower and new energy, online monitoring and analysis of secondary equipment, dispatch planning, security checking, and dispatch operation management. These systems have been put into operation, basically meet the needs of dispatching production, and play an important role in dispatching production management.
The safe and stable operation of a power system requires the protection of underlying devices such as relay protection and automatic devices. These devices alone, however, cannot fully guarantee safe operation, because they usually handle power system faults based on local information and cannot use global information to predict and analyze the operating condition of the system or to handle the various problems that arise in it. Log analysis technology for monitoring system operating state therefore urgently needs to be developed.
At present, the system log analysis technology of domestic power enterprises is still immature. Most system faults are still discovered through fault alarms and manual checks, and in many cases the system has already been running abnormally for some time by the time an alarm or manual check finds the fault. Abnormal operation cannot be discovered promptly and handled immediately, which greatly prolongs system operation and maintenance time and fails to meet the grid's requirements for timely and efficient operation. In addition, a power enterprise may have many different data analysis needs every day, and the log data provided are also diverse; how to analyze and process such diverse log data in a unified way is another urgent problem.
Summary of the invention
The technical problem to be solved by the present invention is to provide a power system operating state monitoring method based on distributed log analysis.
To achieve the above object, the present invention adopts the following technical solution:
A power system operating state monitoring method based on distributed log analysis, comprising the following steps:
S1, obtaining log information of the power system and merging it into log files;
S2, splitting the log files, processing them to obtain log information in a unified format, and serializing the log entries in the log files one by one for output to a distributed storage system;
S3, extracting the log information from the distributed storage system and, in combination with the Map-Reduce mechanism, classifying the log information using a log analysis algorithm based on state-noise-removal clustering, then monitoring the power system operating state by analyzing the classified log information.
Preferably, in step S1, the log information is obtained using a log scanning and capture method based on syslog.
Preferably, the log scanning and capture method comprises the following steps:
S11, selecting and merging the log information captured by the seed modules located on each node of the power system, to obtain the various categories of log information for that node;
S12, capturing and merging the various categories of log information from each node in each zone of the power system to obtain the integrated data of each zone, sending it to that zone's data processing node for processing, and storing it in log files;
S13, obtaining the selected and merged log information of each category, obtaining capture record data from the nodes that capture log information, deriving the merge-and-capture strategy for log information through analysis, and adjusting that strategy as needed.
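The two merging levels in S11 and S12 can be illustrated with a minimal sketch. The record layout, the function names `merge_node_logs` and `merge_zone_logs`, and the sample log lines are all hypothetical; the patent does not specify a data structure.

```python
from collections import defaultdict


def merge_node_logs(module_logs):
    """S11 (sketch): group the lines captured by the seed modules of one node by category."""
    merged = defaultdict(list)
    for category, line in module_logs:
        merged[category].append(line)
    return dict(merged)


def merge_zone_logs(node_logs):
    """S12 (sketch): flatten the per-node results of one zone into a single record list."""
    records = []
    for node, by_category in sorted(node_logs.items()):
        for category, lines in sorted(by_category.items()):
            for line in lines:
                records.append({"node": node, "category": category, "message": line})
    return records


# Hypothetical sample data: (category, raw line) pairs captured on two nodes.
node_a = merge_node_logs([("system", "cpu 93%"), ("access", "GET /status"), ("system", "disk ok")])
node_b = merge_node_logs([("system", "link down")])
zone_records = merge_zone_logs({"node-a": node_a, "node-b": node_b})
```

Keeping the node name and category on every record is one way to let a later step (such as S13) restrict capture to the log types related to a specific problem.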
Preferably, monitoring the system operating state with the log analysis algorithm based on state-noise-removal clustering, in combination with the Map-Reduce mechanism, specifically comprises the following steps:
S31, extracting the log information from the distributed storage system, coarsely classifying it by log information category according to the location of the node that captured it, building a similarity matrix within each category, and selecting one point in the category set as the center point;
S32, sparsifying the similarity matrix of each category using the k-nearest-neighbor algorithm, and building from the sparsified similarity matrices a shared nearest neighbor graph covering all log categories;
S33, for each point in the shared nearest neighbor graph, using the Map mechanism to collect the distance lengths from that point to the other points;
S34, using the Reduce mechanism to sum the distance lengths collected by the Map mechanism and generate new key-value pairs;
S35, selecting the point with the largest distance-length sum as the center point of the similarity matrix, replacing the original center point; points whose distance-length sum is below the length threshold are marked as noise and are no longer used as cluster center points;
S36, among the links between all points, removing links whose weight is below the threshold, and taking the points that remain linked to one another as one cluster, so that each cluster represents one category of log information;
S37, analyzing the log information of each category further to obtain information that reflects the power system operating state, and monitoring the grid operating state by observing changes in that information.
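Steps S33 to S36 can be sketched in a few lines of plain Python that imitate the map and reduce phases in memory. This is an illustrative sketch, not the patented implementation: it assumes the link weight of the shared nearest neighbor graph serves as the "distance length", and the graph, both thresholds, and all function names are hypothetical.

```python
from collections import defaultdict


def map_lengths(graph):
    """S33 (map): emit a (point, length) pair for both ends of every link."""
    for (a, b), length in graph.items():
        yield a, length
        yield b, length


def reduce_sum(pairs):
    """S34 (reduce): sum the emitted lengths per point."""
    totals = defaultdict(float)
    for point, length in pairs:
        totals[point] += length
    return dict(totals)


def center_and_noise(totals, length_threshold):
    """S35: the largest sum becomes the center; sums below the threshold are noise."""
    center = max(totals, key=totals.get)
    noise = {p for p, total in totals.items() if total < length_threshold}
    return center, noise


def clusters(graph, weight_threshold):
    """S36: ignore links below the threshold, then group points that remain linked."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for (a, b), weight in graph.items():
        root_a, root_b = find(a), find(b)  # register both points even if the link is dropped
        if weight >= weight_threshold:
            parent[root_a] = root_b
    groups = defaultdict(set)
    for point in parent:
        groups[find(point)].add(point)
    return sorted(sorted(group) for group in groups.values())


# Hypothetical weighted links between five log entries.
graph = {("a", "b"): 9, ("b", "c"): 6, ("c", "d"): 1, ("d", "e"): 4}
totals = reduce_sum(map_lengths(graph))
center, noise = center_and_noise(totals, length_threshold=5)
groups = clusters(graph, weight_threshold=2)
```

On a real Hadoop cluster the map and reduce functions would run as separate tasks over partitions of the graph; the in-memory generator and dictionary here only stand in for that data flow.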
Preferably, in step S31, the log information categories comprise three categories: system logs, access logs, and user behavior logs.
Preferably, in step S32, building the shared nearest neighbor graph of all log categories comprises the following steps:
First, the k-nearest-neighbor method is used to determine the neighbor lists of log entries A and B; when A and B are each in the other's neighbor list, a link is established between the two points. Then, in the similarity matrix, the similarity corresponding to any point that has no link with a given point is set to zero, which sparsifies the similarity matrix. Finally, the linked point pairs and their weighted edges are drawn, completing the shared nearest neighbor graph of all log categories.
The weight of the link between two points is the similarity str(i, j) of the two points, calculated as: $\mathrm{str}(i, j) = \sum (k + 1 - m)(k + 1 - n)$;
where k is the size of the neighbor lists of A and B, and m and n are the serial numbers, in the respective neighbor lists of A and B, of a nearest neighbor that A and B share.
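The link weight can be made concrete with a small sketch. It assumes that the sum runs over the neighbors A and B have in common, that serial numbers start at 1 for the nearest neighbor, and that unlinked pairs get weight 0; the similarity matrix `sim` and the function names are hypothetical.

```python
def knn_lists(sim, k):
    """Return, for each point, the indices of its k most similar other points, nearest first."""
    n = len(sim)
    return [
        sorted((j for j in range(n) if j != i), key=lambda j: (-sim[i][j], j))[:k]
        for i in range(n)
    ]


def link_weight(neigh, i, j, k):
    """str(i, j) = sum over shared neighbors of (k + 1 - m) * (k + 1 - n).

    m and n are the 1-based positions of the shared neighbor in the lists of i and j.
    Returns 0 when i and j are not in each other's neighbor list (no link).
    """
    if i not in neigh[j] or j not in neigh[i]:
        return 0
    total = 0
    for m, shared in enumerate(neigh[i], start=1):
        if shared in neigh[j]:
            n = neigh[j].index(shared) + 1
            total += (k + 1 - m) * (k + 1 - n)
    return total


# Hypothetical symmetric similarity matrix for four log entries.
sim = [
    [1.0, 0.9, 0.8, 0.1],
    [0.9, 1.0, 0.7, 0.2],
    [0.8, 0.7, 1.0, 0.3],
    [0.1, 0.2, 0.3, 1.0],
]
k = 2
neigh = knn_lists(sim, k)
w01 = link_weight(neigh, 0, 1, k)
w02 = link_weight(neigh, 0, 2, k)
w03 = link_weight(neigh, 0, 3, k)
```

A shared neighbor that ranks near the top of both lists contributes the most, so the weight rewards pairs whose closest neighbors coincide.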
Preferably, the power system operating state monitoring method based on distributed log analysis further comprises the following step:
S4, determining, according to the operating condition of the power system, the indices that need special attention and the log information categories they belong to, and monitoring the power system operating state by separately monitoring those indices within the corresponding log information categories.
Preferably, step S4 further comprises the following steps:
S41, parsing the log information and determining the log information category to which each index needing special attention belongs;
S42, extracting the keywords that need special attention from the parsed log results, splicing them into a field name, and setting the value to 1;
S43, using the Reduce mechanism to count, within that log information category, the number of occurrences of the field name, and generating and outputting new key-value pairs;
S44, extracting the information in the key-value pairs and analyzing it to monitor the power system operating state.
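Steps S41 to S43 follow the familiar word-count pattern and can be sketched as below. The watched keywords, the `category.keyword` field-name convention, and the sample records are hypothetical choices made for illustration.

```python
from collections import Counter

# Hypothetical keywords that operators want to watch.
WATCHED = ("timeout", "overload")


def map_keywords(records, category):
    """S41-S42: for records of the given category, emit (field_name, 1) per watched keyword."""
    for record in records:
        if record["category"] != category:
            continue
        message = record["message"].lower()
        for keyword in WATCHED:
            if keyword in message:
                yield f"{category}.{keyword}", 1


def reduce_counts(pairs):
    """S43: sum the 1s per field name and return the new key-value pairs."""
    counts = Counter()
    for field_name, value in pairs:
        counts[field_name] += value
    return dict(counts)


records = [
    {"category": "system", "message": "Link TIMEOUT on feeder 3"},
    {"category": "system", "message": "cpu overload, timeout waiting"},
    {"category": "access", "message": "timeout from client"},
    {"category": "system", "message": "disk ok"},
]
counts = reduce_counts(map_keywords(records, "system"))
```

Step S44 would then read these counts, for example by comparing them with the counts seen during normal operation.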
The power system operating state monitoring method provided by the present invention obtains log information from the power system, classifies it using the log analysis algorithm based on state-noise-removal clustering in combination with the Map-Reduce mechanism, and monitors the power system operating state by analyzing the classified log information. When the system becomes abnormal, the abnormality can therefore be discovered promptly and handled immediately, effectively meeting the power system's requirements for timely and efficient operation.
Brief description of the drawings
Fig. 1 is a flow chart of the power system operating state monitoring method provided by the invention;
Fig. 2 is a structural diagram of the web crawler system that implements log information acquisition in the invention;
Fig. 3 is a flow chart of log information collection in the invention;
Fig. 4 is a flow chart of monitoring the system operating state with the log analysis algorithm based on state-noise-removal clustering in the invention;
Fig. 5 is a flow chart of the statistical analysis of log information for specific fields.
Detailed description of the embodiments
The technical content of the invention is described in further detail below with reference to the drawings and specific embodiments.
As shown in Fig. 1, the power system operating state monitoring method based on distributed log analysis provided by the invention specifically comprises the following steps. First, the log information of the power system is obtained through log scanning and capture technology based on syslog (system log) and merged into log files. Then the log files are split and, by combining message prefixes and suffixes, the log information is given a unified log data format; the log entries are serialized one by one and output to a distributed storage system (HDFS/HBase). Finally, in combination with the Map-Reduce mechanism in Hadoop, the log information is classified using the log analysis algorithm based on state-noise-removal clustering, and the power system operating state is monitored by analyzing the classified log information. This process is described in detail below.
S1, obtaining the log information of the power system through the syslog-based log scanning and capture method, and merging it into log files.
Data acquisition, also called data collection, is the process of using a tool to collect data from outside a system and feed it into the system. With the rapid development of the Internet industry, the field of data acquisition has changed significantly and is now widely applied in the Internet and distributed computing fields. In the power industry, data acquisition means collecting, through some concrete means (files, syslog, HTTP, and so on), the log information from the security devices and application systems of interest that is needed for power system monitoring and fault analysis.
Log collection is one of the key technologies of log analysis. It must collect log information from various security devices, application systems, and so on, and provide the data source for upper-layer event analysis. The log collection process is therefore the basis on which the system performs detection and decision-making, and its accuracy, reliability, and efficiency directly affect the performance of the whole system.
In one embodiment of the invention, the log information analyzed mainly comprises three categories: system logs, access logs, and user behavior logs. The log information of the power system is obtained through the syslog-based log scanning and capture method. The system log (syslog) protocol was developed in the TCP/IP implementation of the Berkeley Software Distribution (BSD) at the University of California, Berkeley, and has since become an industry-standard protocol that can be used to record the logs of systems and devices. On UNIX/Linux systems and on network devices such as routers and switches, syslog can record any event in the system, and administrators can keep track of system status at any time by reviewing these records. The UNIX/Linux system log records system events through the syslogd process and can also record application operation events; with appropriate configuration, machines running the syslog protocol can also communicate with one another. By analyzing these network behavior logs, conditions related to systems, devices, and the network can be tracked and understood.
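As a small illustration of what a captured syslog message carries, the sketch below decodes the priority value at the start of a message. In the syslog message format the leading `<PRI>` value encodes facility times 8 plus severity; the function name `parse_pri` and the sample message are illustrative and not part of the patent.

```python
import re

# A syslog message starts with "<PRI>", where PRI = facility * 8 + severity.
PRI_RE = re.compile(r"^<(\d{1,3})>(.*)$", re.S)


def parse_pri(line):
    """Split a raw syslog line into (facility, severity, remainder)."""
    match = PRI_RE.match(line)
    if match is None:
        raise ValueError("line does not start with a <PRI> field")
    pri = int(match.group(1))
    return pri // 8, pri % 8, match.group(2)


facility, severity, rest = parse_pri("<34>Oct 11 22:14:15 mymachine su: 'su root' failed")
```

Here facility 4 is the security/authorization facility and severity 2 is "critical", which is the kind of field a capture module could use when selecting which entries to merge.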
In one embodiment of the invention, the syslog-based log scanning and capture method uses a web crawler system adapted to system log scanning and capture to scan and capture system logs in real time, in preparation for subsequent operating state monitoring. A web crawler (spider) is a software program that follows the HTTP protocol and traverses an information space according to the hyperlinks in Web page documents and the index relationships between them.
The web crawler system comprises a seed management module, a capture module, and a crawler log data information extraction and statistics module. The structure of the web crawler system that implements log information acquisition is shown in Fig. 2. The crawler log data information extraction and statistics module obtains log information from the capture nodes of the seed management module and the capture module and first backs it up on a local server; it then compresses the data in Hadoop Lzop format and uploads the compressed data to HDFS over the network. Hive generates Map-Reduce tasks according to the log parsing plan and submits them to the Hadoop cluster as Jobs, and the results are stored in the crawler data system. The cluster Job scheduling system is responsible for scheduling Job tasks so that resources are used effectively; cluster operation monitoring records the running state of Job tasks; and network monitoring can monitor the running state of the system.
Acquiring log information through the web crawler system specifically comprises the following steps:
S11, the seed management modules are distributed on the nodes of the power system; each selects and merges the log data captured by the seed modules located on its node, to obtain the various categories of log information for that node.
In the power system, multiple seed modules are distributed on each node to capture log information generated while the node is running, such as system information, access information, and information from the advanced applications. The seed management modules are likewise distributed on the nodes of the power system; each selects and merges the log information captured by the seed modules to obtain the various categories of log information for its node.
S12, the capture modules are distributed in Zone I, Zone II, and Zone III of the power system; each captures and merges the log information collected by the seed management modules of the nodes to obtain the integrated data of its zone, which is sent to that zone's data processing node, processed, and stored in log files.
The seed management modules are spread across the nodes contained in Zone I, Zone II, and Zone III of the power system, while the capture modules are distributed in Zone I, Zone II, and Zone III respectively. Each capture module captures and merges the log information collected by the seed management modules in its zone to obtain the zone's integrated data, sends it to the zone's data processing node for processing, and stores the processed log information in log files.
S13, the crawler log data information extraction and statistics module obtains the selected and merged log information of each category from the seed management module and the capture module, obtains capture record data from the nodes that capture log information, and derives the merge-and-capture strategy for log information through analysis, so that the strategy can be adjusted promptly as needed.
The crawler log data information extraction and statistics module is responsible for adjusting the capture strategy. On the one hand it obtains the log information of each category selected and merged by the seed management module and the capture module; on the other hand it obtains capture record data from the nodes that capture log information. By analyzing this information it derives the merge-and-capture strategy of the whole crawler system. When a system problem occurs, the strategy can be adjusted promptly according to the log types involved in the problem, so that the seed management module and the capture module capture only the log information related to the problem. This reduces the amount of log information to process and the processing time, and improves operation and maintenance efficiency.
S2, splitting the log files, processing them to obtain log information in a unified format, and serializing the log entries in the log files one by one for output to the distributed storage system (HDFS/HBase).
The log files are split with the Flume tool. By combining message prefixes and suffixes, a custom log data format is defined so that log information of different categories shares a unified log data format. The log entries are serialized one by one and output to the distributed storage system (HDFS/HBase), which facilitates the next step of log analysis.
According to the actual needs of the power system, the log information analyzed mainly comprises three categories: system logs, access logs, and user behavior logs. System logs are used to monitor the system operating state, including system resource utilization and network device usage. Access logs are used to compile statistics on host interactions, such as the number of accesses, the accessing node, and the access time. User behavior logs are used for mining and analyzing dispatching behavior patterns, mainly by modeling and analyzing the operation data of operating personnel. The three categories of log files are captured by the crawler technology and sent by the Flume tool to the distributed storage system in scheduled batches. Flume is a distributed log collection and transport tool. Its basic unit is the Agent, which comprises a data receiving end, a sending end, and a channel (in Flume's terms, a source, a sink, and a channel). It is a distributed tool with high scalability and a high degree of freedom, and can collect not only unstructured text files but also unstructured files such as video and audio. The log information collection flow is shown in Fig. 3. The flow first detects whether a new log file has been generated; if so, the log file is split and the log information is converted to the unified format; the processed log entries are then serialized one by one and stored in the distributed system for later centralized analysis.
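The "unify the format, then serialize one entry at a time" step can be sketched without Flume or HDFS. In this hypothetical example the unified format is a JSON object per line, the added `category` and `node` fields play the role of the message prefix, and an in-memory `io.StringIO` buffer stands in for the distributed storage system.

```python
import io
import json


def unify(category, node, raw_line):
    """Wrap a raw log line in a common record layout (hypothetical unified format)."""
    return {"category": category, "node": node, "message": raw_line.strip()}


def serialize(records, sink):
    """Write the records one by one as JSON lines; return the number written."""
    written = 0
    for record in records:
        sink.write(json.dumps(record, sort_keys=True) + "\n")
        written += 1
    return written


# An in-memory buffer stands in for the distributed storage system here.
sink = io.StringIO()
written = serialize(
    [unify("system", "node-a", " cpu 93%\n"), unify("access", "node-b", "GET /status")],
    sink,
)
lines = sink.getvalue().splitlines()
```

One record per line keeps the stored files splittable, which suits a later Map-Reduce pass that processes entries independently.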
S3, extracting the log information from the distributed storage system (HDFS/HBase) and, in combination with the Map-Reduce mechanism in Hadoop, classifying the log information using the log analysis algorithm based on state-noise-removal clustering, then monitoring the power system operating state by analyzing the classified log information.
In the grid's distributed data framework, multiple data acquisition units are deployed across the network environment (in one embodiment of the invention, the seed modules of the web crawler serve as data acquisition units that collect log information in the power system). The dispatching center therefore needs to centrally monitor the operation of the data acquisition units and hosts, and to monitor the system state through log information.
Clustering is the process of dividing a data set into multiple groups or clusters such that objects in the same group are as similar as possible and objects in different groups are as dissimilar as possible. Cluster analysis is an important technique in data analysis and is very widely used. From a statistical point of view, cluster analysis is one of the main branches of multivariate statistical analysis: it is a way of simplifying data through data modeling, and mainly comprises distance-based and similarity-based clustering methods. From a machine learning point of view, clustering is an unsupervised learning method whose training examples have no predefined classes or class labels.
In one embodiment of the invention, the log information is extracted from the distributed storage system (HDFS/HBase) and, in combination with the Map-Reduce mechanism in Hadoop, classified using the log analysis algorithm based on state-noise-removal clustering. After processing, each cluster contains only one category of log information. Within a single category, the indices that represent the power system operating state, such as service information, can be found, and the current power system operating state is obtained by analyzing and comparing the running information of those indices. For example, when an index corresponding to the power system operating state shows running data that is inconsistent with normal operation, the corresponding index of the power system is abnormal, and the equipment associated with that index can be serviced quickly, which greatly reduces system operation and maintenance time.
As shown in Fig. 4, monitoring the system operating state with the log analysis algorithm based on state-noise-removal clustering, in combination with the Map-Reduce mechanism in Hadoop, specifically comprises the following steps:
S31, extracting the log information from the distributed storage system (HDFS/HBase), coarsely classifying it into system logs, application logs, access logs, and so on according to the location of the node that captured it, building a similarity matrix within each category, and randomly selecting one point in the category set as the center point.
In a power system, some nodes are deployed on the base platform, and the log information captured from them consists mainly of system logs. Some nodes are application nodes, and the log information captured from them consists mainly of application logs. Nodes that do not interact with other nodes have no access logs, while some nodes carry all types of log. In one embodiment of the invention, the aggregated log information is roughly classified according to node position: first, nodes that contain only a single class of log information are each assigned to one class; then the nodes that contain all three classes of log information are merged in by taking the union. The rough classification yields log files of different categories. A similarity matrix is built within each category, and one point in each category set is randomly selected as the centre point. In one embodiment of the invention, each piece of log information in a log file constitutes one point. The similarity matrix is a square matrix whose elements are the similarities between each point and the other points.
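The rough classification and the per-category similarity matrix described above can be sketched as follows. This is only an illustrative sketch: the patent does not state how the similarity of two log entries is measured, so the token-based Jaccard measure, the function names and the sample node names are all assumptions.

```python
from collections import defaultdict


def rough_classify(node_logs):
    """Roughly classify log entries by category, single-category nodes first.

    node_logs maps a node name to a list of (category, message) pairs.
    """
    classes = defaultdict(list)
    mixed_nodes = []
    # Pass 1: nodes whose logs all belong to one category form the initial classes.
    for node, logs in node_logs.items():
        categories = {category for category, _ in logs}
        if len(categories) == 1:
            classes[categories.pop()].extend(message for _, message in logs)
        else:
            mixed_nodes.append(node)
    # Pass 2: nodes holding several categories are merged into those classes.
    for node in mixed_nodes:
        for category, message in node_logs[node]:
            classes[category].append(message)
    return dict(classes)


def jaccard(a, b):
    """Token-set similarity in [0, 1]; an assumed stand-in for the patent's measure."""
    tokens_a, tokens_b = set(a.split()), set(b.split())
    union = tokens_a | tokens_b
    return len(tokens_a & tokens_b) / len(union) if union else 1.0


def similarity_matrix(messages):
    """Square matrix whose element [i][j] is the similarity of points i and j."""
    return [[jaccard(a, b) for b in messages] for a in messages]


# Hypothetical sample data: two single-category nodes and one mixed node.
node_logs = {
    "platform-1": [("system", "cpu usage high"), ("system", "disk usage normal")],
    "app-1": [("application", "schedule job started")],
    "mixed-1": [("system", "cpu usage normal"), ("access", "login from node 2")],
}
classes = rough_classify(node_logs)
matrix = similarity_matrix(classes["system"])
```

Each message in a category is one point, so `matrix` has one row and one column per system log entry.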
Similarity is defined in terms of the shared nearest neighbours of two pieces of log information; that is, the similarity of two pieces of log information is determined by the nearest neighbours they have in common. In one embodiment of the invention, the neighbour lists of log A and log X are determined with the k-nearest-neighbour method, and a link is established between the two points if and only if A and X each appear in the other's neighbour list. The points X linked with A form one set and the points linked with B form another; the intersection of the two sets is their shared nearest neighbours. If log A is close to log B and both are close to a set C, then the closeness of A and B has higher confidence, because the similarity of A and B is determined through C, and C is their shared nearest-neighbour set.
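The mutual-neighbour rule for links and the shared nearest-neighbour set can be illustrated with a small sketch. The similarity values below are invented for illustration, and the function names are hypothetical; ties between equally similar points are broken by index, which is an assumption.

```python
def knn_lists(sim, k):
    """For each point, list the indices of its k most similar other points, best first."""
    n = len(sim)
    lists = []
    for i in range(n):
        others = sorted((j for j in range(n) if j != i),
                        key=lambda j: (-sim[i][j], j))
        lists.append(others[:k])
    return lists


def link_sets(nn):
    """Link i and j only when each appears in the other's neighbour list."""
    links = [set() for _ in nn]
    for i, neighbours in enumerate(nn):
        for j in neighbours:
            if i in nn[j]:
                links[i].add(j)
                links[j].add(i)
    return links


# Hypothetical similarity matrix for four log entries (points 0..3).
sim = [
    [1.0, 0.9, 0.8, 0.1],
    [0.9, 1.0, 0.7, 0.2],
    [0.8, 0.7, 1.0, 0.3],
    [0.1, 0.2, 0.3, 1.0],
]
nn = knn_lists(sim, k=2)
links = link_sets(nn)
# Shared nearest neighbours of points 0 and 1: intersection of their link sets.
shared = links[0] & links[1]
```

Point 3 lists points 2 and 1 as neighbours, but neither lists point 3 in return, so it ends up with no links.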
S32: sparsify the similarity matrix of each category with the k-nearest-neighbour classification algorithm by setting to zero the similarity between any two points that have no link, and use the sparsified similarity matrices to build a shared nearest-neighbour graph covering all log categories.
In one embodiment of the invention, the similarity matrix of each category is sparsified with the k-nearest-neighbour classification algorithm, and the points of the sparsified matrix, together with their weighted edges, are drawn to build the shared nearest-neighbour graph of all log categories. This specifically comprises the following steps:
The neighbour lists of A and B are determined with the k-nearest-neighbour method. A link is established between the two points if and only if A and B each appear in the other's neighbour list. The similarity between any point and the points it has no link with is set to zero in the similarity matrix, which sparsifies the matrix. The linked point pairs and their weighted edges are then drawn, giving the shared nearest-neighbour graph of all log categories. The weight of a link is the similarity of its two end points, calculated as:
str(i, j) = ∑ (k + 1 − m) × (k + 1 − n)
where k is the size of the neighbour lists of A and B, and m and n are the positions of a shared nearest neighbour in the neighbour lists of A and B respectively; the sum runs over all shared nearest neighbours.
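As a worked illustration of the similarity formula, the sketch below computes str(i, j) from two neighbour lists. It assumes that positions m and n are counted from 1 and that the sum runs over the neighbours the two lists share; the function name and the sample lists are hypothetical.

```python
def snn_strength(list_i, list_j):
    """Compute str(i, j) = sum of (k + 1 - m) * (k + 1 - n) over shared neighbours.

    list_i and list_j are the neighbour lists of the two points, nearest first.
    m and n are the 1-based positions of a shared neighbour in each list.
    """
    if len(list_i) != len(list_j):
        raise ValueError("both neighbour lists must have the same size k")
    k = len(list_i)
    position_in_j = {point: n for n, point in enumerate(list_j, start=1)}
    total = 0
    for m, point in enumerate(list_i, start=1):
        n = position_in_j.get(point)
        if n is not None:
            total += (k + 1 - m) * (k + 1 - n)
    return total


# k = 3; "c" and "d" are shared: (3 * 2) + (2 * 3) = 12.
partial = snn_strength(["c", "d", "e"], ["d", "c", "f"])
# Identical lists give the maximum for k = 3: 9 + 4 + 1 = 14.
identical = snn_strength(["c", "d", "e"], ["c", "d", "e"])
# No shared neighbours gives zero.
disjoint = snn_strength(["c", "d", "e"], ["x", "y", "z"])
```

Neighbours that rank high in both lists contribute most, so two entries that agree on their closest neighbours receive a heavier link than two entries that only share distant ones.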
S33: for each point in the shared nearest-neighbour graph, use the Map mechanism to collect the distances from that point to the other points.
S34: use the Reduce mechanism to sum the distances collected by the Map mechanism, giving the distance sum of each point, and generate new key-value pairs in which the key is the log information and the value is the distance sum of the point.
S35: select the point with the largest distance sum as the centre point of the similarity matrix, replacing the original centre point; mark any point whose distance sum is below the length threshold as noise, so that it is no longer used as a cluster centre point.
S36: among the links between all points, remove the links whose weight is below the threshold, and take the points that are linked to one another as one cluster, ensuring that every point in a cluster is either the centre point or directly connected to the centre point; each cluster represents log information of one category.
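Steps S33 to S35 can be sketched in plain Python that imitates the Map and Reduce phases in a single process. This is not Hadoop code; the edge lengths, the threshold and the function names are assumptions made for illustration.

```python
from collections import defaultdict


def map_phase(edges):
    """Map: emit a (point, length) pair for both end points of every edge."""
    for (a, b), length in edges.items():
        yield a, length
        yield b, length


def reduce_phase(pairs):
    """Reduce: sum the emitted lengths per point, giving key-value pairs."""
    totals = defaultdict(float)
    for point, length in pairs:
        totals[point] += length
    return dict(totals)


def pick_centre_and_noise(totals, length_threshold):
    """Centre = point with the largest sum; noise = points below the threshold."""
    centre = max(totals, key=totals.get)
    noise = {point for point, total in totals.items() if total < length_threshold}
    return centre, noise


# Hypothetical graph: keys are linked log entries, values are edge lengths.
edges = {("log1", "log2"): 3.0, ("log1", "log3"): 2.0, ("log3", "log4"): 0.5}
totals = reduce_phase(map_phase(edges))
centre, noise = pick_centre_and_noise(totals, length_threshold=1.0)
```

In a real Hadoop job the pairs emitted by the mapper would be shuffled by key before the reducer sums them; the dictionary here plays that role.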
As stated in step S32, the weight of a link is the similarity of its two end points, calculated as:
str(i, j) = ∑ (k + 1 − m) × (k + 1 − n)
where k is the size of the neighbour lists of A and B, and m and n are the positions of a shared nearest neighbour in the neighbour lists of A and B respectively. The greater the distance between two points, the smaller the weight of the link and the lower the similarity. Removing the links whose weight is below the threshold ensures that the points joined by the remaining links are log information of the same category. The points that are linked to one another are taken as one cluster, ensuring that every point in a cluster is either the centre point or directly connected to the centre point; each cluster represents one category of log information.
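The pruning and grouping in step S36 can be sketched as below. The sketch reads "points linked to one another" as connected components of the pruned graph, which is an interpretation rather than something the patent states, and it does not enforce the direct-connection-to-centre condition. The weights, threshold and function name are hypothetical.

```python
def clusters_from_links(weights, weight_threshold):
    """Drop links lighter than the threshold, then group points that stay linked."""
    adjacency = {}
    for (a, b), weight in weights.items():
        adjacency.setdefault(a, set())
        adjacency.setdefault(b, set())
        if weight >= weight_threshold:
            adjacency[a].add(b)
            adjacency[b].add(a)
    seen, clusters = set(), []
    for start in adjacency:
        if start in seen:
            continue
        # Depth-first walk over the surviving links.
        stack, group = [start], set()
        while stack:
            point = stack.pop()
            if point in group:
                continue
            group.add(point)
            stack.extend(adjacency[point] - group)
        seen |= group
        clusters.append(group)
    return clusters


# Hypothetical link weights, e.g. values of str(i, j).
weights = {("a", "b"): 12, ("b", "c"): 9, ("c", "d"): 1, ("d", "e"): 8}
clusters = clusters_from_links(weights, weight_threshold=5)
```

The weak link between `c` and `d` is removed, so the five points fall into two clusters.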
S37: analyse the log information of each category further to obtain the indicators that reflect the power system state, the system access volume and accessed nodes, and dispatcher operation data; monitor the operating state of the power grid by observing how this information changes.
The log information of each category is analysed further with Hive. For log files in the system log category, statistics give the indicators that reflect the system state, such as CPU usage, memory space, hard disk space, network interface card traffic, processes and service information. For log files in the access log category, analysis gives the information of interest, such as access volume and accessed nodes. For log files in the user operation log category, dispatcher operation data is counted. When the power system becomes abnormal, this information changes to some extent, and observing the changes realises monitoring of the power system operating state.
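Observing changes in these indicators could be done in many ways; the patent does not specify one. The sketch below assumes a simple relative-deviation check against values recorded during normal operation. The indicator names, the baseline values and the 20 % tolerance are all hypothetical.

```python
def changed_indicators(baseline, current, tolerance=0.2):
    """Return indicators whose current value deviates from the baseline
    by more than `tolerance` (as a fraction of the baseline value)."""
    flagged = {}
    for name, normal in baseline.items():
        value = current.get(name)
        if value is None:
            continue  # indicator not reported in this period
        if abs(value - normal) > tolerance * abs(normal):
            flagged[name] = (normal, value)
    return flagged


# Hypothetical values taken from the system log category.
baseline = {"cpu_usage": 40.0, "memory_free_gb": 16.0, "nic_mbps": 100.0}
current = {"cpu_usage": 95.0, "memory_free_gb": 15.0, "nic_mbps": 104.0}
flagged = changed_indicators(baseline, current)
```

Only the CPU usage is flagged here, since the other two indicators stay within the assumed tolerance.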
S4: according to the operating conditions of the power system, determine the indicators that need special attention and the log information category each belongs to, and monitor the power system operating state by separately monitoring those indicators within the corresponding log information category.
During system operation, special operating requirements may arise. For example, certain indicators of the power system may be prone to abnormality in a certain period and cause power system faults, so the user needs to pay special attention to that period or to the operating state of a particular indicator. The indicators that need special attention can then be monitored separately, so that abnormalities in the power system operating state are found in time. As shown in Fig. 5, this specifically comprises the following steps:
S41: parse the log information and determine the log information category to which the indicator needing special attention belongs, that is, whether the problem of interest belongs to the system log, the access log or the user operation log.
S42: from the parsed log results, extract the keywords that need special attention, such as "schedule job" and "ERROR", splice them into a field name, and set the value to 1.
S43: using the Reduce mechanism, sum the values within that log information category, that is, count the number of times the field name occurs in the category, and generate and output new key-value pairs.
S44: extract the information in the key-value pairs and analyse it to monitor the power system operating state.
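Steps S42 and S43 amount to a keyword count in the Map-Reduce style, which can be sketched in plain Python as follows. The "category:keyword" field-name format, the sample log lines and the function names are assumptions; the patent only says that the keywords are spliced into a field name with a value of 1.

```python
from collections import Counter

KEYWORDS = ("schedule job", "ERROR")


def map_keywords(category, lines, keywords=KEYWORDS):
    """Map: emit (field_name, 1) for every watched keyword found in a line."""
    for line in lines:
        for keyword in keywords:
            if keyword in line:
                yield f"{category}:{keyword}", 1


def reduce_counts(pairs):
    """Reduce: sum the values per field name within the category."""
    counts = Counter()
    for field_name, value in pairs:
        counts[field_name] += value
    return dict(counts)


# Hypothetical parsed lines from the system log category.
lines = [
    "2014-11-24 ERROR disk full",
    "schedule job 42 started",
    "ERROR schedule job 42 failed",
]
counts = reduce_counts(map_keywords("system", lines))
```

The resulting key-value pairs are what step S44 would read and analyse, for example to notice a sudden rise in the number of "ERROR" entries.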
In summary, the power system operating state monitoring method provided by the invention obtains the log information of the power system with a log scanning and capture technique based on the syslog mode, then attaches content before and after each message so that every piece of log information carries a customised prefix and suffix, and serialises the log information piece by piece into the distributed storage system (HDFS/HBase). Combined with the Map-Reduce mechanism in Hadoop, the log analysis algorithm based on state-noise-removal clustering monitors the system operating state, so that abnormalities in the power system operating state can be found in time and handled at the first opportunity, meeting the power system's requirement for timely and efficient operation. In addition, the web crawler technique applied to system log scanning and capture can capture diverse log data from the power system, and the Flume tool processes and analyses that data in a unified way, improving the efficiency of processing diverse log data.
The power grid operating state monitoring method based on distributed log analysis provided by the invention has been described in detail above. Any obvious modification made to it by a person of ordinary skill in the art without departing from the true spirit of the invention will constitute an infringement of the patent right of the invention, and the corresponding legal liability will be borne.
Claims (7)
1. A method for monitoring the operating state of a power system based on distributed log analysis, characterised by comprising the following steps:
S1: obtain the log information of the power system and merge it into log files;
S2: split the log files and process them to obtain log information in a unified format, so that the log information in the log files is serialised piece by piece into a distributed storage system;
S3: extract log information from the distributed storage system and, combined with the Map-Reduce mechanism, classify the log information with a log analysis algorithm based on state-noise-removal clustering; monitor the power system operating state by analysing the classified log information; this comprises the following steps:
S31: extract log information from the distributed storage system and, according to the position of the node from which it was captured, roughly classify it into system logs, application logs and access logs; nodes that contain only a single class of log information are each assigned to one class, and the nodes that contain all three classes of log information are then merged in by taking the union; the rough classification yields log files of different categories; a similarity matrix is built within each category, and one point in each category set is selected as the centre point;
S32: determine the neighbour lists of log A and log B with the k-nearest-neighbour method; establish a link between the two points if and only if log A and log B each appear in the other's neighbour list; set to zero the similarity in the similarity matrix between any point and the points it has no link with, thereby sparsifying the similarity matrix; for the sparsified similarity matrix, draw its points and their weighted edges to build a shared nearest-neighbour graph covering all log categories;
S33: for each point in the shared nearest-neighbour graph, use the Map mechanism to collect the distances from that point to the other points;
S34: use the Reduce mechanism to sum the distances collected by the Map mechanism and generate new key-value pairs;
S35: select the point with the largest distance sum as the centre point of the similarity matrix, replacing the original centre point; mark any point whose distance sum is below the length threshold as noise, so that it is no longer used as a cluster centre point;
S36: among the links between all points, remove the links whose weight is below the threshold, and take the points that are linked to one another as one cluster, so that each cluster represents log information of one category;
S37: analyse the log information of each category further to obtain the information that reflects the power system operating state, and monitor the operating state of the power grid by observing how this information changes.
2. The power system operating state monitoring method according to claim 1, characterised in that:
in step S1, the log information is obtained with a log scanning and capture method based on the syslog mode.
3. The power system operating state monitoring method according to claim 2, characterised in that the log scanning and capture method comprises the following steps:
S11: select and merge the log information captured by each seed module on each node of the power system to obtain the various kinds of log information of that node;
S12: in each region of the power system, capture and merge the various kinds of log information of each node to obtain the integrated data of each region, send it to the data processing node of that region for processing, and store it in log files;
S13: obtain the selected and merged log information, obtain the capture record data from the nodes that capture log information, derive the merge-and-capture strategy for log information by analysis, and adjust the merge-and-capture strategy as needed.
4. The power system operating state monitoring method according to claim 1, characterised in that:
in step S31, the log information categories comprise three classes: system log, access log and user operation log.
5. The power system operating state monitoring method according to claim 1, characterised in that in step S32, building the shared nearest-neighbour graph of all log categories comprises the following steps:
first, determine the neighbour lists of log information A and B with the k-nearest-neighbour method, and establish a link between the two points when A and B each appear in the other's neighbour list; then set to zero the similarity in the similarity matrix between any point and the points it has no link with, thereby sparsifying the similarity matrix; finally, draw the linked point pairs and their weighted edges to complete the shared nearest-neighbour graph of all log categories;
the weight of the link between two points is their similarity str(i, j), calculated as: str(i, j) = ∑ (k + 1 − m) × (k + 1 − n);
where k is the size of the neighbour lists of A and B, and m and n are the positions of a shared nearest neighbour in the neighbour lists of A and B respectively.
6. The power system operating state monitoring method according to claim 1, characterised by further comprising the following step:
S4: according to the operating conditions of the power system, determine the indicators that need special attention and the log information category each belongs to, and monitor the power system operating state by separately monitoring those indicators within the corresponding log information category.
7. The power system operating state monitoring method according to claim 6, characterised in that step S4 further comprises the following steps:
S41: parse the log information and determine the log information category to which the indicator needing special attention belongs;
S42: from the parsed log results, extract the keywords that need special attention, splice them into a field name, and set the value to 1;
S43: using the Reduce mechanism, count within that log information category the number of times the field name occurs in the category, and generate and output new key-value pairs;
S44: extract the information in the key-value pairs and analyse it to monitor the power system operating state.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201410681737.4A CN104616205B (en) | 2014-11-24 | 2014-11-24 | Method for monitoring the operating state of a power system based on distributed log analysis |
Publications (2)
Publication Number | Publication Date |
---|---|
CN104616205A (en) | 2015-05-13 |
CN104616205B (en) | 2019-10-25 |
Family
ID=53150638
Family Applications (1)
Application Number | Status | Priority Date | Filing Date | Title |
---|---|---|---|---|
CN201410681737.4A CN104616205B (en) | Expired - Fee Related | 2014-11-24 | 2014-11-24 | Method for monitoring the operating state of a power system based on distributed log analysis |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN104616205B (en) |
Families Citing this family (29)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN105138661B (en) * | 2015-09-02 | 2018-10-30 | 西北大学 | A kind of network security daily record k-means cluster analysis systems and method based on Hadoop |
CN105608203B (en) * | 2015-12-24 | 2019-09-17 | Tcl集团股份有限公司 | A kind of Internet of Things log processing method and device based on Hadoop platform |
CN105516355B (en) * | 2016-01-13 | 2018-07-17 | 国家电网公司 | Intelligent electric energy meter error big data safe storage device based on fountain codes and method |
CN105701621A (en) * | 2016-02-19 | 2016-06-22 | 云南电网有限责任公司电力科学研究院 | Intelligent power grid real time load analyzing method and system |
CN106022664A (en) * | 2016-07-08 | 2016-10-12 | 大连大学 | Big data analysis based network intelligent power saving monitoring method |
CN106209826A (en) * | 2016-07-08 | 2016-12-07 | 瑞达信息安全产业股份有限公司 | A kind of safety case investigation method of Network Security Device monitoring |
CN107291614B (en) * | 2017-05-04 | 2020-10-30 | 平安科技(深圳)有限公司 | File abnormity detection method and electronic equipment |
CN107483238A (en) * | 2017-08-04 | 2017-12-15 | 郑州云海信息技术有限公司 | A kind of blog management method, cluster management node and system |
CN107704594B (en) * | 2017-10-13 | 2021-02-09 | 东南大学 | Real-time processing method for log data of power system based on spark streaming |
CN108133043B (en) * | 2018-01-12 | 2022-07-29 | 福建星瑞格软件有限公司 | Structured storage method for server running logs based on big data |
CN110389874B (en) * | 2018-04-20 | 2021-01-19 | 比亚迪股份有限公司 | Method and device for detecting log file abnormity |
CN108804606B (en) * | 2018-05-29 | 2021-08-31 | 上海欣能信息科技发展有限公司 | Method and system for migrating power measurement data to HBase |
CN108845560B (en) * | 2018-05-30 | 2021-07-13 | 国网浙江省电力有限公司宁波供电公司 | Power dispatching log fault classification method |
CN108833156B (en) * | 2018-06-08 | 2022-08-30 | 中国电力科学研究院有限公司 | Evaluation method and system for simulation performance index of power communication network |
CN108984610A (en) * | 2018-06-11 | 2018-12-11 | 华南理工大学 | A kind of method and system based on the offline real-time processing data of big data frame |
CN108959445A (en) * | 2018-06-13 | 2018-12-07 | 云南电网有限责任公司信息中心 | Distributed information log processing method and processing device |
CN109213091A (en) * | 2018-06-27 | 2019-01-15 | 中国电子科技集团公司第五十五研究所 | A kind of semiconductor chip process equipment method for monitoring state based on document analysis |
CN109685399B (en) * | 2019-02-19 | 2022-09-09 | 贵州电网有限责任公司 | Method and system for integrating and analyzing logs of power system |
CN110069572B (en) * | 2019-03-19 | 2022-08-02 | 深圳壹账通智能科技有限公司 | HIVE task scheduling method, device, equipment and storage medium based on big data platform |
CN110231998B (en) * | 2019-06-13 | 2021-07-20 | 泰康保险集团股份有限公司 | Detection method and device for distributed timing task and storage medium |
CN110555010B (en) * | 2019-09-11 | 2022-04-05 | 中国南方电网有限责任公司 | Power grid real-time operation data storage system |
CN110825873B (en) * | 2019-10-11 | 2022-04-12 | 支付宝(杭州)信息技术有限公司 | Method and device for expanding log exception classification rule |
CN111049684B (en) * | 2019-12-12 | 2023-04-07 | 闻泰通讯股份有限公司 | Data analysis method, device, equipment and storage medium |
CN111158997B (en) * | 2019-12-24 | 2023-05-23 | 广西电网有限责任公司 | Safety monitoring method and device for multi-log system |
CN112232982A (en) * | 2020-02-13 | 2021-01-15 | 吴龙圣 | Data analysis method and device based on big data and Internet of things |
CN112948211A (en) * | 2021-02-26 | 2021-06-11 | 杭州安恒信息技术股份有限公司 | Alarm method, device, equipment and medium based on log processing |
CN114048870A (en) * | 2021-11-04 | 2022-02-15 | 佳源科技股份有限公司 | Power system abnormity monitoring method based on log characteristic intelligent mining |
CN114172921A (en) * | 2021-12-02 | 2022-03-11 | 国网山东省电力公司信息通信公司 | Log auditing method and device for scheduling recording system |
CN114169651B (en) * | 2022-02-14 | 2022-04-19 | 中国空气动力研究与发展中心计算空气动力研究所 | Active prediction method for supercomputer operation failure based on application similarity |
Citations (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN103607291A (en) * | 2013-10-25 | 2014-02-26 | 北京科东电力控制系统有限责任公司 | Alarm analysis merging method for power secondary system intranet security monitoring platform |
Non-Patent Citations (4)
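Title |
---|
"Design of Web-based adaptive security event management for power systems" (基于Web的电力系统自适应安全事件管理设计); 马茜; China Master's Theses Full-text Database, Engineering Science and Technology II; 2008-04-15; pp. C042-144 (main text pp. 25-46, 51-58) * |
"Research on log analysis technology based on hierarchical clustering" (基于层次聚类的日志分析技术研究); 薛文娟; China Master's Theses Full-text Database, Information Science and Technology; 2013-08-15; pp. I139-69 (main text pp. 5-9, 14-33) * |
"Design and implementation of a crawler log data extraction and statistics system" (爬虫日志数据信息抽取与统计系统设计与实现); 王高垒; China Master's Theses Full-text Database, Information Science and Technology; 2013-02-15; pp. I138-2005 (main text pp. 21-23) * |
"Design and development of a power grid dispatching log system" (电网调度日志系统的设计与开发); 庞传军; Hubei Electric Power (湖北电力); 2013-02-28; vol. 37, no. 1; pp. 59-61 * |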
Also Published As
Publication number | Publication date |
---|---|
CN104616205A (en) | 2015-05-13 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||
CF01 | Termination of patent right due to non-payment of annual fee | Granted publication date: 20191025; termination date: 20211124 |