CN109308306B - User power consumption abnormal behavior detection method based on isolated forest - Google Patents
- Publication number
- CN109308306B (application CN201811151326.9A)
- Authority
- CN
- China
- Prior art keywords
- data
- power consumption
- user
- trend
- electricity
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q50/00—Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
- G06Q50/06—Energy or water supply
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02D—CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
- Y02D10/00—Energy efficient computing, e.g. low power processors, power management or thermal management
Abstract
The invention provides a method for detecting abnormal user electricity consumption behavior based on an isolated forest, comprising the following steps: S1, acquiring electricity consumption time-series data by data acquisition; S2, cleaning the data to remove incomplete, erroneous and duplicate data; S3, extracting features based on statistics; S4, preprocessing the data; S5, normalizing the matrix $Y_{M\times K}$ to obtain a new matrix $Y'_{M\times K}$; S6, using an isolated forest model to judge whether the electricity consumption is abnormal or normal: S61, taking the new matrix $Y'_{M\times K}$, extracting ψ statistical features from each user, and setting the number t of iTrees, where $y_{ij}$ is the element in row i and column j of the new matrix; S62, calculating the anomaly score $s(y_{ij}, \psi)$ of $y_{ij}$; S63, determining whether $s(y_{ij}, \psi)$ is less than $1 - \Delta e$, where Δe is a constant in the range of 0.07 to 0.22; if so, the electricity consumption is abnormal; if not, it is normal. The method solves the prior-art problem that, because the data are not processed beforehand, the subsequent computation is large and the analysis takes a long time to run.
Description
Technical Field
The invention relates to the field of power utilization monitoring, in particular to a user power utilization abnormal behavior detection method based on an isolated forest.
Background
Earlier methods for monitoring abnormal electricity consumption define a set of abnormality indicators, set a threshold for each indicator, assign each indicator a weight, and accumulate the weighted results to obtain an electricity-theft suspicion coefficient for each user. The abnormality indicators generally fall into two classes: line-loss abnormalities and instantaneous-quantity abnormalities. An electricity-theft identification model is designed around these abnormalities, and users suspected of stealing electricity are identified from their suspicion coefficients.
However, equipment faults and abnormal user consumption indicators of this kind were at first usually detected on site, that is, technicians went to the place of consumption to troubleshoot. This approach consumes manpower and material resources, is inefficient, and gives poor results. Even where centralized meter reading has been introduced in some areas, only daily electricity consumption can be monitored; instantaneous data such as the voltage, current and power of the metering device cannot be acquired. The approach also depends heavily on human factors, which hinders management in the power industry.
Chinese patent application CN201810104000.4 discloses a method for identifying abnormal electricity consumption behavior based on a fuzzy neural network. The method extracts the raw data of some users from an electricity consumption database as sample data; preprocesses the data; designs an evaluation index system for abnormal electricity consumption behavior based on an analysis of historical cases; constructs expert samples from the preprocessed data; builds a fuzzy neural network model that takes the abnormal electricity consumption behavior indicators as input and the abnormal electricity consumption suspicion coefficient as output; feeds test data into the model to diagnose abnormal electricity consumption behavior; and evaluates the diagnosis results, sets an evaluation target and optimizes the model. That invention automates the identification and diagnosis of abnormal electricity consumption behavior, uses the fuzzy neural network to train, learn and model automatically, locates suspected users quickly and accurately, and makes it easier to capture the various violations behind abnormal consumption. However, because the data are not processed beforehand, the subsequent computation is large and the running time is long, so the system is prone to crashing.
Disclosure of Invention
The invention provides a method for detecting abnormal user electricity consumption behavior based on an isolated forest, which solves the prior-art problem that, because the data are not processed beforehand, the subsequent computation is large and the analysis takes a long time to run.
In order to achieve the purpose, the invention adopts the following technical scheme:
A method for detecting abnormal user electricity consumption behavior based on an isolated forest comprises the following steps:
S1, acquiring electricity consumption time-series data by data acquisition;
S2, cleaning the data to remove incomplete, erroneous and duplicate data;
S3, extracting features based on statistics:
S31, data definition: S311, let the data set be $X = \{x_n\}$, $n = 1, \dots, N$; the data set contains the daily electricity consumption of N users, and each user's data are divided into D days, M months and Q quarters; S312, the daily electricity consumption sequence of each user is $x_n = \{x_{nd}\}$, $d = 1, \dots, D$; S313, the monthly electricity consumption sequence of each user is $y_n = \{y_{nm}\}$, $m = 1, \dots, M$; S314, the quarterly electricity consumption sequence of each user is $z_n = \{z_{nq}\}$, $q = 1, \dots, Q$;
S32, dividing each user's electricity consumption behavior features in time by year, quarter and month, and calculating the mean, standard deviation and discrete coefficient of each user for each time unit, namely: the standard deviation D1 of each user's annual electricity consumption; the discrete coefficient D2 of each user's annual electricity consumption; the standard deviations D3-D6 of quarterly electricity consumption; the discrete coefficients D7-D10 of quarterly electricity consumption; the standard deviations D11-D21 of monthly electricity consumption; the discrete coefficients D22-D32 of monthly electricity consumption; the rising and falling trends D33-D41 of the average electricity consumption of each month; the maximum values D42-D43 of the difference and ratio of the electricity consumption means of two adjacent months; the minimum values D44-D45 of the difference and ratio of the electricity consumption means of two adjacent months; the maximum values D46-D47 of the difference and ratio of the electricity consumption means of adjacent quarters; and the minimum values D48-D49 of the difference and ratio of the electricity consumption means of adjacent quarters; D1-D49 are the statistical features;
S4, preprocessing the data: after statistical feature extraction, the raw data yield M samples, each an N-dimensional vector, where M is the number of users and N is the number of statistical features extracted for each user; these samples form an M×N matrix X, whose element $x_{mn}$ is the value of the n-th statistical feature of the m-th user; the matrix X is reduced by a preprocessing model to an M×K matrix $Y_{M\times K}$, K<N;
S5, using an isolated forest model to judge whether the electricity consumption is abnormal or normal:
S51, taking the new matrix $Y_{M\times K}$, extracting ψ statistical features from each user, and setting the number t of iTrees, where $y_{ij}$ is the element in row i and column j of the new matrix $Y_{M\times K}$;
S52, in the detection process, the statistical feature value $y_{ij}$ of each user traverses every iTree; during the traversal the path length $h(y_{ij})$ of $y_{ij}$ through each iTree is calculated; finally the anomaly score $s(y_{ij}, \psi)$ of $y_{ij}$ is calculated from all the path lengths by the formula:
$s(y_{ij}, \psi) = 2^{-E(h(y_{ij}))/c(\psi)}$, with $c(\psi) = 2H(\psi - 1) - 2(\psi - 1)/\psi$,
where $c(\psi)$ is the average path length of a binary search tree and serves to normalize the result; $H(\psi)$ is calculated as $H(\psi) = \ln(\psi) + \gamma$, where γ is the Euler constant; and $E(h(y_{ij}))$ is the average path length of $y_{ij}$ over all iTrees in the isolated forest;
S53, determining whether $s(y_{ij}, \psi)$ is less than $1 - \Delta e$, where Δe is a constant in the range of 0.07 to 0.22; if so, the electricity consumption is abnormal; if not, it is normal.
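The data definition in S31 can be illustrated with a minimal Python sketch. The formulas that define the monthly and quarterly sequences are not reproduced in the text, so the sketch assumes that each monthly or quarterly value is simply the sum of the daily values falling in that period; the function name `aggregate` and the `period_lengths` argument are hypothetical.

```python
from itertools import accumulate


def aggregate(daily, period_lengths):
    """Sum one user's daily consumption over consecutive periods.

    daily          -- daily consumption values x_nd of one user
    period_lengths -- number of days in each period (hypothetical input,
                      e.g. days per month or per quarter)
    Assumption: a period value is the sum of its daily values.
    """
    if sum(period_lengths) != len(daily):
        raise ValueError("period lengths must cover the daily sequence exactly")
    ends = list(accumulate(period_lengths))   # exclusive end index of each period
    starts = [0] + ends[:-1]                  # start index of each period
    return [sum(daily[s:e]) for s, e in zip(starts, ends)]


# Toy data: 59 days covering January and February of a non-leap year.
daily = [1.0] * 31 + [2.0] * 28
monthly = aggregate(daily, [31, 28])
```

The same helper would give the quarterly sequence when called with the number of days in each quarter.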
Compared with the prior art, the invention has the following beneficial effects:
Extracting statistical features yields effective data. Dimension reduction shrinks the data to be processed, raises the computation speed and avoids crashes; at the same time, the selection condition keeps the reduced data representative, so that the missed detections which would result from calculating on only some of the statistical features are avoided and the accuracy of the judgment is guaranteed.
Additional advantages, objects, and features of the invention will be set forth in part in the description which follows and in part will become apparent to those having ordinary skill in the art upon examination of the following or may be learned from practice of the invention.
Drawings
FIG. 1 is a diagram of an algorithm implementation process of an isolated forest model;
FIG. 2 is a diagram of an autoencoder network architecture;
FIG. 3 is a ReLU activation function image map of an auto-encoder;
FIG. 4 is a diagram of an implementation of a training optimization function algorithm of an autoencoder;
FIG. 5 is a diagram of a deep level auto-encoder network architecture;
FIG. 6 is the network structure of an autoencoder built with Keras;
FIG. 7 is the network structure of a deep autoencoder built with Keras.
Detailed Description
To make the technical means, inventive features, objectives and effects of the invention clear and easy to understand, the invention is further explained below with reference to the drawings and specific embodiments:
Example 1:
A method for detecting abnormal user electricity consumption behavior based on an isolated forest comprises the following steps:
S1, acquiring electricity consumption time-series data by data acquisition;
S2, cleaning the data to remove incomplete, erroneous and duplicate data;
S3, extracting features based on statistics:
S31, data definition: S311, let the data set be $X = \{x_n\}$, $n = 1, \dots, N$; the data set contains the daily electricity consumption of N users, and each user's data are divided into D days, M months and Q quarters; S312, the daily electricity consumption sequence of each user is $x_n = \{x_{nd}\}$, $d = 1, \dots, D$; S313, the monthly electricity consumption sequence of each user is $y_n = \{y_{nm}\}$, $m = 1, \dots, M$; S314, the quarterly electricity consumption sequence of each user is $z_n = \{z_{nq}\}$, $q = 1, \dots, Q$;
S32, dividing each user's electricity consumption behavior features in time by year, quarter and month, and calculating the mean, standard deviation and discrete coefficient of each user for each time unit, namely: the standard deviation D1 of each user's annual electricity consumption; the discrete coefficient D2 of each user's annual electricity consumption; the standard deviations D3-D6 of quarterly electricity consumption; the discrete coefficients D7-D10 of quarterly electricity consumption; the standard deviations D11-D21 of monthly electricity consumption; the discrete coefficients D22-D32 of monthly electricity consumption; the rising and falling trends D33-D41 of the average electricity consumption of each month; the maximum values D42-D43 of the difference and ratio of the electricity consumption means of two adjacent months; the minimum values D44-D45 of the difference and ratio of the electricity consumption means of two adjacent months; the maximum values D46-D47 of the difference and ratio of the electricity consumption means of adjacent quarters; and the minimum values D48-D49 of the difference and ratio of the electricity consumption means of adjacent quarters; D1-D49 are the statistical features;
S4, preprocessing the data: after statistical feature extraction, the raw data yield M samples, each an N-dimensional vector, where M is the number of users and N is the number of statistical features extracted for each user; these samples form an M×N matrix X, whose element $x_{mn}$ is the value of the n-th statistical feature of the m-th user; the matrix X is reduced by a preprocessing model to an M×K matrix $Y_{M\times K}$, K<N;
S5, using an isolated forest model to judge whether the electricity consumption is abnormal or normal:
S51, taking the new matrix $Y_{M\times K}$, extracting ψ statistical features from each user, and setting the number t of iTrees, where $y_{ij}$ is the element in row i and column j of the new matrix $Y_{M\times K}$;
S52, in the detection process, the statistical feature value $y_{ij}$ of each user traverses every iTree; during the traversal the path length $h(y_{ij})$ of $y_{ij}$ through each iTree is calculated (the traversal is the same as in the isolated forest model, each step taken counting as 1); finally the anomaly score $s(y_{ij}, \psi)$ of $y_{ij}$ is calculated from all the path lengths by the formula:
$s(y_{ij}, \psi) = 2^{-E(h(y_{ij}))/c(\psi)}$, with $c(\psi) = 2H(\psi - 1) - 2(\psi - 1)/\psi$,
where $c(\psi)$ is the average path length of a binary search tree and serves to normalize the result; $H(\psi)$ is calculated as $H(\psi) = \ln(\psi) + \gamma$, where γ is the Euler constant; and $E(h(y_{ij}))$ is the average path length of $y_{ij}$ over all iTrees in the isolated forest;
S53, determining whether $s(y_{ij}, \psi)$ is less than $1 - \Delta e$, where Δe is a constant in the range of 0.07 to 0.22; if so, the electricity consumption is abnormal; if not, it is normal.
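The score calculation of S52 and the decision of S53 can be sketched in Python as follows. The sketch assumes the commonly published isolation forest formulas, s = 2^(-E(h)/c(ψ)) with c(ψ) = 2H(ψ-1) - 2(ψ-1)/ψ and H(i) ≈ ln(i) + γ; the function names are hypothetical, and `is_abnormal` transcribes the comparison of S53 as it is written in the text. In the commonly published formulation, scores close to 1 indicate anomalies, so the direction of the comparison is worth checking against the original claims before relying on it.

```python
import math

EULER_GAMMA = 0.5772156649015329  # Euler constant, gamma


def harmonic(i):
    """H(i), approximated as ln(i) + gamma."""
    return math.log(i) + EULER_GAMMA


def c(psi):
    """Average path length of a binary search tree over psi points."""
    if psi > 2:
        return 2.0 * harmonic(psi - 1) - 2.0 * (psi - 1) / psi
    if psi == 2:
        return 1.0
    return 0.0


def anomaly_score(path_lengths, psi):
    """s = 2 ** (-E(h) / c(psi)); E(h) is the mean path length over all iTrees.

    Requires psi >= 2 so that c(psi) is non-zero.
    """
    mean_h = sum(path_lengths) / len(path_lengths)
    return 2.0 ** (-mean_h / c(psi))


def is_abnormal(score, delta_e):
    """Decision rule of S53 as written: abnormal when score < 1 - delta_e."""
    return score < 1.0 - delta_e
```

A point whose average path length equals c(ψ) receives a score of 0.5, and shorter average paths give higher scores.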
To obtain the isolated forest model, as shown in FIG. 1, the model is built by the following steps:
S711, let the original data set be F; randomly select F' sample points from the data set as a sub-sample and place them in the root node of the tree;
S712, randomly select a dimension q and randomly generate a split point p within the data of the current node, p lying between the minimum and maximum values of dimension q in the current node's data;
S713, the split point p defines a hyperplane that divides the data space of the current node into 2 subspaces: data whose value in dimension q is less than p are placed in the left subtree Fl of the current node, and data whose value is greater than or equal to p are placed in the right subtree Fr of the current node;
S714, repeat steps S712 and S713 recursively in the child nodes, building new child nodes until a child node contains only one data point or reaches the height limit, at which point splitting stops; in this way t iTrees are obtained.
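Steps S711 to S714 can be sketched in Python as follows. This is a minimal illustration under stated assumptions rather than the patented implementation: the dictionary-based node layout, the function names, the height limit of ceil(log2(sample size)) and the choice to pick q only among dimensions that still vary in the current node are all assumptions made here. The path length counts one per step taken down the tree.

```python
import math
import random


def build_itree(points, height_limit, rng, depth=0):
    """Recursively split `points` (lists of floats) as in S712-S714."""
    if len(points) <= 1 or depth >= height_limit:
        return {"size": len(points)}                 # leaf: splitting stops
    # Assumption: choose q only among dimensions whose values still differ.
    dims = [d for d in range(len(points[0]))
            if min(p[d] for p in points) < max(p[d] for p in points)]
    if not dims:                                     # all points identical
        return {"size": len(points)}
    q = rng.choice(dims)                             # S712: random dimension q
    lo = min(p[q] for p in points)
    hi = max(p[q] for p in points)
    split = rng.uniform(lo, hi)                      # S712: split point p
    left = [p for p in points if p[q] < split]       # S713: q < p goes left
    right = [p for p in points if p[q] >= split]     # S713: q >= p goes right
    return {"q": q, "p": split,
            "left": build_itree(left, height_limit, rng, depth + 1),
            "right": build_itree(right, height_limit, rng, depth + 1)}


def path_length(tree, x):
    """Number of steps taken by x from the root to a leaf."""
    steps = 0
    while "q" in tree:
        tree = tree["left"] if x[tree["q"]] < tree["p"] else tree["right"]
        steps += 1
    return steps


def build_forest(data, t, sample_size, seed=0):
    """S711 + S714: build t iTrees, each from a random sub-sample."""
    rng = random.Random(seed)
    limit = math.ceil(math.log2(sample_size))        # assumed height limit
    return [build_itree(rng.sample(data, sample_size), limit, rng)
            for _ in range(t)]


# Toy data: a tight 6 x 6 grid of points plus one distant point.
data = [[i * 0.01, j * 0.01] for i in range(6) for j in range(6)] + [[10.0, 10.0]]
forest = build_forest(data, t=50, sample_size=len(data), seed=1)
```

With such a forest, averaging `path_length` over all trees for a point gives the E(h) term used in the anomaly score; a distant point is typically isolated after far fewer splits than a point inside the cluster.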
In this embodiment, the preprocessing model is PCA dimension reduction.
To obtain more effective statistical features, the following steps are also performed after step S12:
S13, dividing the electricity consumption trend into three trend types: variation trend, fluctuation trend, and rising and falling trend;
S14, calculating the variation trend, fluctuation trend, and rising and falling trend:
S141, fluctuation trend: in statistics the standard deviation measures how much a sequence varies or fluctuates; the larger the standard deviation, the wider the range over which the values fluctuate. The standard deviation std of electricity consumption is therefore calculated to characterize the fluctuation trend of the consumption data. The discrete coefficient cv of electricity consumption is also calculated to measure how dispersed a user's consumption is. Let the mean electricity consumption in a given time period be μ; then:
standard deviation of electricity consumption:
power consumption dispersion coefficient:
cv=std/μ (2.2)
S142, variation trend: the variation trend feature measures the change in a user's electricity consumption from one period to the next, that is, the average consumption of a time period is compared with that of the immediately preceding period; the difference and the ratio reflect how fast consumption is changing, and are calculated as follows:
difference of electricity utilization mean values of adjacent k months or k quarters:
ratio of electricity average values of adjacent k months or k quarters:
S143, rising and falling trend: the rising and falling trend feature is obtained by predicting the next electricity consumption value from the user's consumption on several consecutive days and comparing the prediction with the actual next value, which indicates whether consumption is likely to rise or fall. A simple moving average is used to obtain the trend feature vector: moving through the time series item by item, the method computes the average of a fixed number of terms and uses that average as the next predicted value. Let k be the number of terms in the moving window and $x_{nt}$ the actual value at time t; the trend feature is then calculated as follows:
Predicted value at time t:
$F_t = (x_{n(t-1)} + x_{n(t-2)} + \dots + x_{n(t-k)})/k$ (2.5)
Rising and falling trend at time t:
$tr = x_{nt} - F_t$ (2.6)
If tr is less than 0, electricity consumption is trending downward; if tr is greater than 0, it is trending upward;
wherein the standard deviation std of electricity consumption, the discrete coefficient cv, the difference $avg_a$ of the electricity consumption means of adjacent k months or k quarters, the ratio $avg_b$ of the electricity consumption means of adjacent k months or k quarters, and the rising and falling trend tr at time t are all statistical feature values.
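The three kinds of trend feature can be sketched in Python as follows. Formulas (2.2), (2.5) and (2.6) are taken from the text; the rest is assumed for illustration: the standard deviation is computed as a population standard deviation, the difference and ratio are taken as the current period's mean relative to the previous period's mean, and all function names are hypothetical.

```python
import statistics


def fluctuation_features(values):
    """Return (std, cv) for one period; cv = std / mu as in (2.2)."""
    mu = statistics.mean(values)
    std = statistics.pstdev(values)    # assumption: population standard deviation
    return std, std / mu


def variation_features(prev_period, curr_period):
    """Return (difference, ratio) of the means of two adjacent periods.

    Assumption: both are taken as current period relative to previous period.
    """
    prev_mean = statistics.mean(prev_period)
    curr_mean = statistics.mean(curr_period)
    return curr_mean - prev_mean, curr_mean / prev_mean


def trend(series, t, k):
    """Rising and falling trend tr at index t with a k-term moving average."""
    if t < k:
        raise ValueError("need at least k values before index t")
    forecast = sum(series[t - k:t]) / k    # F_t, formula (2.5)
    return series[t] - forecast            # tr, formula (2.6)


std, cv = fluctuation_features([2, 4, 4, 4, 5, 5, 7, 9])
diff, ratio = variation_features([1, 3], [4, 6])
tr = trend([1, 2, 3, 6], t=3, k=3)
```

In the last call the forecast is the mean of the three preceding values, so a positive `tr` indicates that consumption rose above the moving average.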
Preferably, the PCA dimension reduction in step S2 is performed as follows:
S21, subtracting from each column of X its mean, that is, zero-centering every feature of the data X, to obtain X';
S22, calculating the covariance matrix C of X';
S23, obtaining the N eigenvalues λ of the covariance matrix C and the eigenvector V corresponding to each eigenvalue λ:
CV=λV (3.2)
S24, arranging all the eigenvalues λ in descending order as $\{\lambda_1, \dots, \lambda_i, \dots, \lambda_N\}$, and arranging the eigenvectors V in the same order into an N×N matrix W, whose i-th column is the eigenvector V corresponding to the i-th eigenvalue $\lambda_i$ in the ordering; the eigenvectors corresponding to the first K eigenvalues are taken from W to obtain an N×K matrix $A_{N\times K}$;
S25, calculating K according to formula 3.3, taking the first value of K that satisfies formula 3.3;
S26, calculating formula 3.4, where $Y_{M\times K}$ is the new feature data after reduction to K dimensions;
$Y_{M\times K} = X_{M\times N} A_{N\times K}$ (3.4)
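The PCA steps above can be sketched with NumPy as follows. Several details are assumptions: the covariance matrix is normalized by 1/M, K is passed in directly because the selection rule of formula 3.3 is not reproduced in the text, and the projection uses the centered matrix X', which differs from projecting X only by a constant offset in each output column. The function name `pca_reduce` is hypothetical.

```python
import numpy as np


def pca_reduce(X, K):
    """Reduce an M x N feature matrix to M x K; returns (Y, sorted eigenvalues)."""
    X = np.asarray(X, dtype=float)
    Xc = X - X.mean(axis=0)                  # S21: zero-center every feature
    C = Xc.T @ Xc / X.shape[0]               # S22: covariance (1/M is an assumption)
    eigvals, eigvecs = np.linalg.eigh(C)     # S23: C is symmetric, so eigh applies
    order = np.argsort(eigvals)[::-1]        # S24: eigenvalues in descending order
    A = eigvecs[:, order[:K]]                # S24: N x K matrix of leading eigenvectors
    return Xc @ A, eigvals[order]            # S26: projection onto K dimensions


X = np.array([[2.5, 2.4, 1.0],
              [0.5, 0.7, 0.2],
              [2.2, 2.9, 0.9],
              [1.9, 2.2, 1.1],
              [3.1, 3.0, 1.4],
              [2.3, 2.7, 0.8]])
Y, lam = pca_reduce(X, 2)
```

The columns of `Y` are uncorrelated, and the variance of each column equals the corresponding eigenvalue in `lam`.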
1. Introduction to the example: the experimental data come from a table of daily electricity consumption collected by the national power grid for nearly 10000 users over the whole of 2015. The table records, for every user, the daily consumption in kilowatt-hours and the total consumption meter readings for the current day and the previous day; each user has a time series of dimension 334. The user list gives each user's identification information and a label stating whether the correspondingly numbered user is an abnormal electricity user.
2. Data cleaning: cleaning the raw user electricity consumption data set yields 334 valid data dimensions; the cleaned set comprises 1394 users with abnormal electricity consumption behavior and 8562 users with unknown electricity consumption behavior, so abnormal users account for 14.00% of the total.
3. Data preprocessing:
1) Data preprocessing based on an autoencoder: the cleaned data set is preprocessed with an autoencoder and with a deep autoencoder. The data are first normalized so that every feature dimension lies in [0,1]. Then, following the designed autoencoder network models, the network layer structures of the two autoencoders are built with Keras, a neural network tool based on TensorFlow, as shown in FIG. 4. The activation function of the intermediate layers is set to ReLU, the training optimizer to adadelta, the training loss function to binary_crossentropy, and the number of training rounds to 100.
The data are preprocessed with the autoencoder model and the deep autoencoder model so built; after 100 rounds of training the models stabilize, with loss values of 0.0313 and 0.0311 respectively.
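The normalization that precedes training can be sketched in Python as follows. The text states only that every feature dimension is expressed in [0,1]; min-max scaling per feature column is assumed here as one common way to achieve that, and the function name and the handling of constant columns are hypothetical.

```python
def min_max_normalize(rows):
    """Scale every feature column of `rows` (a list of lists) into [0, 1].

    Assumption: min-max scaling; a constant column is mapped to 0.0.
    """
    columns = list(zip(*rows))               # transpose: one tuple per feature
    lows = [min(col) for col in columns]
    highs = [max(col) for col in columns]
    return [[(v - lo) / (hi - lo) if hi > lo else 0.0
             for v, lo, hi in zip(row, lows, highs)]
            for row in rows]


scaled = min_max_normalize([[1, 10], [2, 20], [3, 30]])
```

Keeping the inputs in [0,1] matches the range that a binary cross-entropy reconstruction loss expects.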
After preprocessing, the dimensionality of the raw data is compressed to 32. To examine the effectiveness and performance of the autoencoder-based preprocessing intuitively, the new preprocessed data set is mapped onto a two-dimensional visualization plane, as shown in FIG. 6.
In the figure, white points represent users with no suspicion of abnormal electricity consumption and red points represent users with abnormal electricity consumption behavior. On the one hand, most white points gather near the (0,0) region and spread outward only slightly, whereas most red points spread outward markedly and tend to deviate from the region where the data are concentrated, showing the characteristics of outliers. On the other hand, compared with the autoencoder model, the abnormal data points preprocessed by the deep autoencoder model are more dispersed. The two classes of data points are analyzed with the similarity metric function defined by equation (7), with α = 0.1; the results are shown in Table 1.
Here dist is a distance function: when two data samples are similar, dist approaches 0 and Lp approaches 1; otherwise Lp approaches 0.
Table 1 Comparison of similarity measures for the autoencoder results (α = 0.1)
In the experiment, dist is computed as the Euclidean distance, averaged over data of the same class. The table shows that the Lp values of the normal data points are far greater than those of the abnormal data points: normal users are highly similar to one another, that is, their distribution is concentrated, whereas users with abnormal electricity consumption behavior lie far apart, that is, their data are widely dispersed. Comparing the two preprocessing models, the deep autoencoder yields a larger Lp for normal user data (more aggregated) and a smaller Lp for abnormal user data (more dispersed). In these experiments, therefore, the deep-autoencoder-based preprocessing performs better than the conventional autoencoder when applied to detecting abnormal electricity consumption data.
2) Data preprocessing based on principal component analysis: the cleaned data set is preprocessed with linear PCA. The principal components obtained are sorted in descending order, and the feature space corresponding to the first 32 principal components is selected to compute the new feature dimensions, which facilitates comparative analysis.
That is, a linear-PCA dimension reduction is built to preprocess the raw data; the eigenvectors corresponding to the first 32 principal components are selected, and the raw data are mapped into the new 32-dimensional feature space. The first 32 principal components are chosen so that the results of all preprocessing methods have the same dimension.
In the figure, white points represent users with no suspicion of abnormal electricity consumption and red points represent users with abnormal electricity consumption behavior. First, the PCA-preprocessed data all tend to spread outward from a point of aggregation; the white points are comparatively clustered and the red points comparatively dispersed. Second, the figure shows that after PCA-based preprocessing the white and red points still overlap to a large extent, so this preprocessing method does not separate the two classes of data clearly.
The two classes of data points are analyzed with the similarity metric function defined by equation (7), with α = 0.03; the results are shown in the following table.
Table 2 Comparison of similarity measures for the principal component analysis results (α = 0.03)
In the experiment, dist is again computed as the Euclidean distance, averaged over data of the same class. As can be seen from the table, the PCA-based approach works well.
Establishing an isolated forest model: the new data sets obtained by the four data preprocessing methods are displayed in two dimensions, and the effects of the different preprocessing methods are compared.
Next, for the isolated forest model under each of the four data preprocessing methods, the resulting confusion matrices and the Precision-Recall indexes with their P-R curves are shown in Table 3 and Table 4, respectively.
TABLE 3 confusion matrix results for isolated forest models under different preprocessing methods
TABLE 4 Precision-Recall index and Overall Precision for abnormal data
First, the confusion matrix and Precision-Recall results of the above experiments show that the isolated-forest-based anomaly detection model achieves high overall accuracy under all of the preprocessing models. At the same time, the choice of data preprocessing method affects the detection performance of the model. For the users with abnormal electricity consumption behavior, the Precision and Recall of the model with deep-autoencoder preprocessing are 0.07 and 0.14 higher, respectively, than with the autoencoder method. Linear PCA preprocessing also outperforms the autoencoder method, with Precision and Recall higher by 0.05 and 0.04, but its improvement is smaller than that of the deep autoencoder.
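As a minimal sketch of how the Precision, Recall and overall accuracy figures relate to a confusion matrix, the following Python function computes them for the abnormal class. The function name and the counts are hypothetical and are not taken from Table 3 or Table 4.

```python
def precision_recall_accuracy(tp, fp, fn, tn):
    """Metrics for the abnormal (positive) class from confusion-matrix counts.

    tp: abnormal users flagged as abnormal
    fp: normal users flagged as abnormal
    fn: abnormal users missed by the model
    tn: normal users correctly left unflagged
    """
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    return precision, recall, accuracy


# Illustrative counts only (assumed, not experimental results).
precision, recall, accuracy = precision_recall_accuracy(tp=30, fp=10, fn=20, tn=940)
print(precision, recall, accuracy)
```

Because abnormal users are usually a small minority, overall accuracy can stay high even when Recall on the abnormal class is modest, which is why the two are reported separately.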
Example 2:
This example differs from Example 1 only in the preprocessing model: an autoencoder is adopted in this embodiment.
First, a conventional single-hidden-layer autoencoder model is built, which is a fully connected neural network, as shown in FIG. 2.
In FIG. 2, the first half of the model serves as the encoding part and the second half as the decoding part. The model takes the 334 feature dimensions obtained by cleaning the raw data as both input and output, that is, the number of neurons in the input layer equals the number in the output layer. The number of nodes in the middle layer is set to 32, which is smaller than the number of nodes in the input and output layers, so the middle layer performs data compression.
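The 334-32-334 structure described above can be sketched as a forward pass in plain Python. This is only an illustration of the layer shapes, with random untrained weights; the helper names, the weight initialisation and the sigmoid output activation are assumptions that the text does not specify.

```python
import math
import random


def dense(inputs, weights, biases, activation):
    """Fully connected layer: one output per weight row, activation(w . x + b)."""
    return [activation(sum(w * x for w, x in zip(row, inputs)) + b)
            for row, b in zip(weights, biases)]


def relu(z):
    return max(0.0, z)


def sigmoid(z):
    # Assumed output activation; the source does not state which one is used.
    return 1.0 / (1.0 + math.exp(-z))


def init_layer(n_in, n_out, rng):
    """Small random weights and zero biases (hypothetical initialisation)."""
    weights = [[rng.uniform(-0.05, 0.05) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out


rng = random.Random(0)
N_FEATURES, N_HIDDEN = 334, 32           # dimensions stated in the text
enc_w, enc_b = init_layer(N_FEATURES, N_HIDDEN, rng)
dec_w, dec_b = init_layer(N_HIDDEN, N_FEATURES, rng)

sample = [rng.random() for _ in range(N_FEATURES)]       # one made-up user record
code = dense(sample, enc_w, enc_b, relu)                 # 32-dimensional compressed features
reconstruction = dense(code, dec_w, dec_b, sigmoid)      # 334-dimensional reconstruction
print(len(code), len(reconstruction))
```

After training, only the encoder half would be needed to produce the 32-dimensional features passed to the isolated forest model.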
Next, relevant parameters are configured for the autoencoder model. The middle layer of the network uses the ReLU activation function, whose graph is shown in FIG. 3 and whose basic mathematical form is:
f(x) = max(0, w^T x + b) (5.1)
Compared with the traditional sigmoid activation function, this nonlinear function has two advantages. First, because its gradient is constant on the non-negative interval, ReLU helps a deep network avoid the problems of gradient vanishing and gradient explosion, so the convergence speed of the model remains stable. Second, ReLU needs only a single threshold comparison to obtain the activation value and requires no complex operations, which simplifies the calculation.
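The gradient argument can be illustrated with a short Python comparison. This is a sketch for intuition only, where z stands for the pre-activation value w^T x + b of equation (5.1); the function names are illustrative.

```python
import math


def relu(z):
    """ReLU needs only a comparison against the threshold 0."""
    return max(0.0, z)


def relu_grad(z):
    """Gradient is the constant 1 for positive inputs and 0 otherwise."""
    return 1.0 if z > 0 else 0.0


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def sigmoid_grad(z):
    """Gradient s(z) * (1 - s(z)) shrinks toward 0 as |z| grows."""
    s = sigmoid(z)
    return s * (1.0 - s)


for z in (0.5, 5.0, 20.0):
    print(z, relu_grad(z), sigmoid_grad(z))
```

For large positive z the sigmoid gradient becomes very small while the ReLU gradient stays at 1, which is the behavior the paragraph above relies on.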
Adadelta is adopted as the training optimization function of the model. Adadelta is a gradient descent method with an adaptive learning rate and can achieve faster convergence when training deep, complex networks. The specific calculation process of the algorithm is shown in FIG. 4.
The loss function selected for the model is binary_crossentropy, i.e., a logarithmic loss function, which is mainly used for maximum likelihood estimation; its calculation formula is given in (5.2). Finally, the number of training iterations is set to 100.
L(Y,P(Y|X))=-logP(Y|X) (5.2)
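As a hedged illustration of the logarithmic loss in (5.2), the sketch below averages -log P(Y|X) over binary targets, which is the usual form of binary cross-entropy. The function name and the clipping constant `eps` are assumptions added for numerical safety and do not come from the source.

```python
import math


def binary_cross_entropy(y_true, y_pred, eps=1e-12):
    """Mean of -log P(Y|X) for binary targets y in {0, 1} and predictions p in (0, 1)."""
    total = 0.0
    for y, p in zip(y_true, y_pred):
        p = min(max(p, eps), 1.0 - eps)   # avoid log(0)
        total += -(y * math.log(p) + (1.0 - y) * math.log(1.0 - p))
    return total / len(y_true)


# A confident correct prediction gives a small loss, a wrong one a large loss.
print(binary_cross_entropy([1.0, 0.0], [0.9, 0.1]))
print(binary_cross_entropy([1.0, 0.0], [0.1, 0.9]))
```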
The software algorithm implementation is shown in fig. 6.
Example 3:
This example differs from Example 2 only in that a hidden layer is added to the autoencoder of Example 2.
The previous autoencoder data processing model has only a single hidden layer. Here a deeper autoencoder model is established for the data to be processed; its network structure is shown in FIG. 5.
The basic configuration parameters are the same as those of the previous model: the training optimization function is Adadelta, the loss function is binary_crossentropy, the number of training iterations is 100, and the intermediate encoding and decoding layers use the ReLU activation function. The software algorithm is shown in FIG. 7.
Finally, the above embodiments are intended only to illustrate the technical solutions of the present invention and not to limit them. Although the present invention has been described in detail with reference to the preferred embodiments, those skilled in the art should understand that modifications or equivalent substitutions may be made to the technical solutions of the present invention without departing from their spirit and scope, and all such modifications and substitutions should be covered by the claims of the present invention.
Claims (5)
1. A user electricity consumption abnormal behavior detection method based on an isolated forest is characterized by comprising the following steps:
S1, acquiring power utilization time sequence data in a data acquisition mode;
S2, cleaning the data to remove incomplete data, erroneous data and duplicate data;
S3, feature extraction based on statistics:
S31, data definition: S311, let the dataset be X = {x_n}, n = 1 to N, where the dataset contains the daily electricity consumption of N users and each user's electricity data is divided into D days, M months and Q quarters; S312, the daily electricity consumption sequence of each user: x_n = {x_nd}, d = 1 to D; S313, the monthly electricity consumption sequence of each user: y_n = {y_nm}, m = 1 to M; S314, the quarterly electricity consumption sequence of each user: z_n = {z_nq}, q = 1 to Q;
S32, dividing the electricity consumption behavior characteristics of the users in time by year, quarter and month, and calculating the mean, standard deviation and discrete coefficient sequences of each user per unit time, namely: the standard deviation D1 of the annual power consumption of each user, the discrete coefficient D2 of the annual power consumption of each user, the standard deviations D3-D6 of quarterly power consumption, the discrete coefficients D7-D10 of quarterly power consumption, the standard deviations D11-D21 of monthly power consumption, the discrete coefficients D22-D32 of monthly power consumption, the ascending and descending trends D33-D41 of monthly average power consumption, the maximum values D42-D43 of the differences and ratios of adjacent monthly power consumption averages, the minimum values D44-D45 of the differences and ratios of adjacent monthly power consumption averages, the maximum values D46-D47 of the differences and ratios of adjacent quarterly power consumption averages, and the minimum values D48-D49 of the differences and ratios of adjacent quarterly power consumption averages, wherein D1-D49 are the statistical features;
S4, data preprocessing: assuming that, after processing based on the statistical features, the original data forms M sample values that are each an N-dimensional vector, where M is the number of users and N is the number of statistical features extracted for each user, the statistical features form an M×N matrix X, in which x_mn is the value of the n-th statistical feature of the m-th user; the matrix X is reduced to an M×K matrix Y_{M×K}, K < N, by a preprocessing model;
S5, judging whether the power consumption is abnormal or normal by adopting an isolated forest model:
S51, selecting the new matrix Y_{M×K}, extracting ψ statistical features from each user, and setting the number t of iTrees, where y_ij is the element in row i and column j of the new matrix Y_{M×K};
S52, in the detection process, the statistical feature value y_ij of each user traverses each iTree, the path length h(y_ij) of y_ij through each iTree is calculated during the traversal, and finally the anomaly score s(y_ij, ψ) of y_ij is calculated from all the path lengths, the calculation formula being:
where c(ψ) is the average path length of a binary search tree and serves to normalize the result; the calculation of H(ψ) involves γ, the Euler constant; and E(h(y_ij)) is the average path length of y_ij over all iTrees in the isolated forest;
S53, determining whether s(y_ij, ψ) is less than 1 − Δe, where Δe is a constant in the range 0.07 to 0.22; if so, the electricity consumption is abnormal; if not, the electricity consumption is normal.
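The score formula itself is not reproduced in the text of step S52. As a point of reference only, the sketch below uses the standard formulation from the Isolation Forest paper by Liu et al., s = 2^(−E(h)/c(ψ)) with c(ψ) = 2H(ψ−1) − 2(ψ−1)/ψ and H(i) ≈ ln(i) + γ. Whether the claimed formula is identical is an assumption. Under this standard convention, scores close to 1 indicate anomalies, so the direction of the comparison in step S53 should be checked against the original formula.

```python
import math

EULER_GAMMA = 0.5772156649015329


def harmonic(i):
    """Approximation H(i) ~ ln(i) + gamma used in the standard formulation."""
    return math.log(i) + EULER_GAMMA


def c(psi):
    """Average path length of an unsuccessful binary-search-tree search over psi points."""
    if psi > 2:
        return 2.0 * harmonic(psi - 1) - 2.0 * (psi - 1) / psi
    return 1.0 if psi == 2 else 0.0


def anomaly_score(mean_path_length, psi):
    """Standard isolation-forest score: 2 ** (-E(h) / c(psi))."""
    return 2.0 ** (-mean_path_length / c(psi))


psi = 256                                 # hypothetical subsample size
print(anomaly_score(3.0, psi))            # short average path: score well above 0.5
print(anomaly_score(c(psi), psi))         # average path equal to c(psi): score of 0.5
```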
2. The isolated forest-based user electricity consumption abnormal behavior detection method as claimed in claim 1, wherein the isolated forest model obtaining step comprises:
S711, assuming that the original data set is denoted F, randomly selecting F' sample points from the data set as a subsample and placing them in the root node of the tree;
S712, randomly selecting a dimension q and randomly generating a split point p in the data of the current node, where p lies between the maximum and minimum values of dimension q in the data of the current node;
S713, generating a hyperplane from the split point p and dividing the data space of the current node into two subspaces: data whose value in dimension q is less than p is placed in the left subtree Fl of the current node, and data whose value in dimension q is greater than or equal to p is placed in the right subtree Fr of the current node;
S714, recursively repeating steps S712 and S713 in the child nodes to construct new child nodes until a child node contains only one data point or reaches the height limit, at which point splitting stops; t iTrees are obtained in this way.
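Steps S711 to S714 can be sketched in plain Python as follows. This is an illustrative implementation rather than the patented one: the dictionary node layout, the height limit of ceil(log2(ψ)), and the leaf adjustment c(size) follow the Isolation Forest paper by Liu et al. and are assumptions here, as are the sample data and parameter values.

```python
import math
import random


def c(n):
    """Average unsuccessful-search path length over n points (leaf adjustment)."""
    if n > 2:
        return 2.0 * (math.log(n - 1) + 0.5772156649) - 2.0 * (n - 1) / n
    return 1.0 if n == 2 else 0.0


def build_itree(points, height, height_limit, rng):
    """S712-S714: split recursively on a random dimension q at a random point p."""
    if height >= height_limit or len(points) <= 1:
        return {"size": len(points)}
    q = rng.randrange(len(points[0]))
    lo = min(x[q] for x in points)
    hi = max(x[q] for x in points)
    if lo == hi:                      # nothing left to split on this dimension
        return {"size": len(points)}
    p = rng.uniform(lo, hi)
    left = [x for x in points if x[q] < p]
    right = [x for x in points if x[q] >= p]
    return {"q": q, "p": p,
            "left": build_itree(left, height + 1, height_limit, rng),
            "right": build_itree(right, height + 1, height_limit, rng)}


def path_length(x, node, depth=0):
    """Number of edges x traverses, plus c(size) for an unsplit leaf."""
    if "size" in node:
        return depth + c(node["size"])
    child = node["left"] if x[node["q"]] < node["p"] else node["right"]
    return path_length(x, child, depth + 1)


def build_forest(data, t, psi, rng):
    """S711: each of the t trees is grown from a random subsample of psi points."""
    height_limit = math.ceil(math.log2(psi))
    return [build_itree(rng.sample(data, psi), 0, height_limit, rng) for _ in range(t)]


def mean_path_length(forest, x):
    return sum(path_length(x, tree) for tree in forest) / len(forest)


rng = random.Random(42)
# Made-up data: a dense cluster plus one far-away point.
data = [[rng.gauss(0, 1), rng.gauss(0, 1)] for _ in range(200)] + [[8.0, 8.0]]
forest = build_forest(data, t=100, psi=64, rng=rng)
print(mean_path_length(forest, [8.0, 8.0]), mean_path_length(forest, [0.0, 0.0]))
```

The far-away point is separated after fewer splits on average than a point in the middle of the cluster, which is the property the anomaly score relies on.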
3. The isolated forest-based user electricity consumption abnormal behavior detection method as claimed in claim 1, wherein in step S4, the preprocessing model is an autoencoder, a deep autoencoder or PCA dimension reduction.
4. The method for detecting abnormal user electricity consumption behavior based on the isolated forest as claimed in claim 3, wherein the following steps are further performed after step S12:
S13, dividing the power utilization trend into three trend types: a variation trend, a fluctuation trend, and an ascending and descending trend;
S14, calculating the variation trend, the fluctuation trend, and the ascending and descending trend:
S141, fluctuation trend: in statistics, the standard deviation is used to evaluate the possible variation or fluctuation of a sequence, and the larger the standard deviation, the larger the range of numerical fluctuation; therefore, the standard deviation std of the electricity consumption is calculated to represent the fluctuation trend feature of the electricity consumption data; meanwhile, the power consumption discrete coefficient cv is calculated to measure the degree of dispersion of the user's power consumption; letting the average power consumption in a given time period be μ, then:
standard deviation of electricity consumption:
power consumption dispersion coefficient:
cv=std/μ (2.2)
S142, variation trend: the variation trend feature measures the difference in a user's power consumption before and after, that is, the average power consumption of a time period is compared with that of the previous adjacent time period; the difference and the ratio reflect how quickly the power consumption changes, and are defined as follows:
difference of electricity utilization mean values of adjacent k months or k quarters:
ratio of electricity average values of adjacent k months or k quarters:
S143, ascending and descending trend: the ascending and descending trend feature indicates the possibility of a rise or fall, obtained by predicting the next electricity consumption from the user's electricity consumption over several consecutive days and comparing the prediction with the actual next electricity consumption; here, a simple moving average method is used to determine the feature vector of the ascending and descending trend; the simple moving average method moves through the time sequence item by item, calculates the average of a fixed number of terms at each step, and uses that average as the next predicted value; let k be the number of moving terms and x_nt the actual value at time t; the trend feature is then calculated as follows:
predicted value at time t:
F_t = (x_{n(t-1)} + x_{n(t-2)} + … + x_{n(t-k)})/k (2.5)
rising and falling trend at time t:
tr = x_nt − F_t (2.6)
if tr is less than 0, the electricity consumption trend is falling; if tr is greater than 0, the electricity consumption trend is rising;
wherein the electricity consumption standard deviation std, the electricity consumption discrete coefficient cv, the difference avg_a of the electricity consumption averages of adjacent k months or k quarters, the ratio avg_b of the electricity consumption averages of adjacent k months or k quarters, and the ascending and descending trend tr at time t are the statistical feature values.
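The trend features of steps S141 to S143 can be sketched in Python as below. Formulas (2.1), (2.3) and (2.4) are not reproduced in the text, so the population standard deviation (dividing by the number of values) and the "current minus previous" and "current divided by previous" conventions are assumptions; formulas (2.2), (2.5) and (2.6) follow the text. All function names are illustrative.

```python
import math


def std_and_cv(values):
    """Fluctuation trend: standard deviation and discrete coefficient cv = std / mu."""
    mu = sum(values) / len(values)
    # Assumed population form of the standard deviation (divide by n).
    std = math.sqrt(sum((v - mu) ** 2 for v in values) / len(values))
    return std, std / mu                      # cv as in formula (2.2)


def adjacent_changes(period_means):
    """Variation trend: difference and ratio of adjacent period averages (assumed direction)."""
    pairs = list(zip(period_means, period_means[1:]))
    diffs = [cur - prev for prev, cur in pairs]
    ratios = [cur / prev for prev, cur in pairs]
    return diffs, ratios


def moving_average_trend(series, k):
    """Ascending/descending trend: tr = x_t - F_t, with F_t the mean of the previous k values."""
    trends = []
    for t in range(k, len(series)):
        forecast = sum(series[t - k:t]) / k   # formula (2.5)
        trends.append(series[t] - forecast)   # formula (2.6)
    return trends


# Made-up consumption values for illustration.
print(std_and_cv([2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]))
print(adjacent_changes([100.0, 120.0, 90.0]))
print(moving_average_trend([10.0, 10.0, 10.0, 13.0], k=3))
```

A positive trend value means the actual consumption exceeded the moving-average forecast, and a negative value means it fell below it.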
5. The isolated forest-based user electricity consumption abnormal behavior detection method as claimed in claim 4, wherein in step S2, the PCA dimension reduction step is as follows:
S21, subtracting from each column of X its mean, i.e. zero-centering each feature of the data X, to obtain X':
S23, obtaining the N eigenvalues λ of the covariance matrix C and the eigenvector V corresponding to each eigenvalue λ:
CV = λV (3.2)
S24, arranging all eigenvalues λ in a queue from largest to smallest {λ_1, …, λ_i, …, λ_N}, and arranging the eigenvectors V into an N×N matrix W in the same descending order of eigenvalue, where the i-th column of W consists of the elements of the eigenvector V corresponding to the i-th eigenvalue λ_i in the queue; taking from the matrix W the eigenvectors corresponding to the first K eigenvalues to obtain an N×K matrix A_{N×K};
S25, calculating K according to formula 3.3, taking the first value of K that satisfies formula 3.3:
S26, calculating formula 3.4, where Y_{M×K} is the new feature data after dimension reduction to K dimensions;
Y_{M×K} = X_{M×N} A_{N×K} (3.4).
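For orientation, the PCA steps can be written out in textbook form as below. The centering and covariance formulas and formula 3.3 are not reproduced in the text, so the 1/M normalization of the covariance matrix and the cumulative-contribution criterion with a threshold η (a hypothetical symbol) are assumptions based on standard PCA, not the claimed formulas.

```latex
\begin{align}
  X' &= X - \mathbf{1}\,\bar{x}^{\top},
  \qquad \bar{x}_n = \frac{1}{M}\sum_{m=1}^{M} x_{mn}
  && \text{(zero-centering each feature)} \\
  C &= \frac{1}{M}\, X'^{\top} X'
  && \text{(covariance matrix, assumed normalization)} \\
  C V &= \lambda V
  && \text{(3.2)} \\
  \frac{\sum_{i=1}^{K} \lambda_i}{\sum_{i=1}^{N} \lambda_i} &\ge \eta
  && \text{(assumed form of 3.3: smallest such } K\text{)} \\
  Y_{M \times K} &= X_{M \times N}\, A_{N \times K}
  && \text{(3.4)}
\end{align}
```

In the experiments described earlier, K is fixed at 32 so that all preprocessing methods produce features of the same dimension.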
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201811151326.9A CN109308306B (en) | 2018-09-29 | 2018-09-29 | User power consumption abnormal behavior detection method based on isolated forest |
Publications (2)
Publication Number | Publication Date |
---|---|
CN109308306A CN109308306A (en) | 2019-02-05 |
CN109308306B true CN109308306B (en) | 2021-07-06 |
Family
ID=65224976
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201811151326.9A Active CN109308306B (en) | 2018-09-29 | 2018-09-29 | User power consumption abnormal behavior detection method based on isolated forest |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN109308306B (en) |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN107657288A (en) * | 2017-10-26 | 2018-02-02 | 国网冀北电力有限公司 | A kind of power scheduling flow data method for detecting abnormality based on isolated forest algorithm |
US10045218B1 (en) * | 2016-07-27 | 2018-08-07 | Argyle Data, Inc. | Anomaly detection in streaming telephone network data |
CN108494747A (en) * | 2018-03-08 | 2018-09-04 | 上海观安信息技术股份有限公司 | Traffic anomaly detection method, electronic equipment and computer program product |
Non-Patent Citations (3)
Title |
---|
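An Improved Data Anomaly Detection Method Based on Isolation Forest; Dong Xu et al.; IEEE; 2018-02-08; full text * |
Isolation Forest; Fei Tony Liu et al.; IEEE; 2009-02-10; full text * |
Zhang Rongchang (张荣昌). Analysis and Research on Abnormal Electricity Consumption Data Based on Data Mining. China Master's Theses Full-text Database. 2018 * |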
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||