CN111861781A - Feature optimization method and system in residential electricity consumption behavior clustering - Google Patents

Info

Publication number
CN111861781A
Authority
CN
China
Prior art keywords
feature
cluster
electricity consumption
evaluation
features
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202010132423.4A
Other languages
Chinese (zh)
Inventor
夏飞
张洁
张传林
龚春阳
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shanghai University of Electric Power
Original Assignee
Shanghai University of Electric Power
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shanghai University of Electric Power
Priority to CN202010132423.4A
Publication of CN111861781A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00 Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
    • G06Q50/06 Energy or water supply
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/23 Clustering techniques

Landscapes

  • Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Business, Economics & Management (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Economics (AREA)
  • Health & Medical Sciences (AREA)
  • General Physics & Mathematics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Water Supply & Treatment (AREA)
  • General Engineering & Computer Science (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Public Health (AREA)
  • Evolutionary Computation (AREA)
  • General Health & Medical Sciences (AREA)
  • Human Resources & Organizations (AREA)
  • Marketing (AREA)
  • Primary Health Care (AREA)
  • Strategic Management (AREA)
  • Tourism & Hospitality (AREA)
  • General Business, Economics & Management (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention discloses a feature optimization method and system for residential electricity consumption behavior clustering. The method comprises: collecting data and constructing an original feature set; constructing an evaluation function; screening the original feature set based on the evaluation function; improving the density peak algorithm; and performing cluster analysis based on the improved density peak algorithm. The beneficial effect of the invention is that, by optimizing the original feature set composed of electricity consumption features and meteorological features, an optimal feature subset is formed that requires the least computation while achieving a better clustering effect; cluster analysis is then performed on this subset, completing the classification of users' electricity consumption patterns.

Description

Feature optimization method and system in residential electricity consumption behavior clustering
Technical Field
The invention relates to a feature optimization method and system in residential electricity consumption behavior clustering.
Background
In recent years, with the rapid development of electricity information acquisition systems in China, smart meters have been widely deployed in electric power systems, and the volume of user electricity consumption data available to power companies has become massive. Cluster analysis of user electricity consumption behavior based on this mass of data has therefore become increasingly important.
To process and analyze electricity consumption data, corresponding features must be extracted from a large amount of data, and clustering is then performed on these features. The more data there is, the longer the processing time and the higher the computational complexity, while the clustering effect becomes harder to guarantee.
Traditional research on user electricity consumption behavior does not address how to select features: the feature set used for clustering is not optimized, and the effectiveness of the features for the user load under analysis is not determined and remains to be verified. Residential load is affected not only by typical electricity consumption features in routine use, such as the load rate and the daily peak-to-valley rate, but also by typical meteorological factors such as temperature, rainfall and air pressure. Cluster analysis that relies only on common electricity consumption features therefore involves a large amount of computation and insufficient accuracy, and needs improvement.
Disclosure of Invention
This section is for the purpose of summarizing some aspects of embodiments of the invention and to briefly introduce some preferred embodiments. In this section, as well as in the abstract and the title of the invention of this application, simplifications or omissions may be made to avoid obscuring the purpose of the section, the abstract and the title, and such simplifications or omissions are not intended to limit the scope of the invention.
The present invention has been made in view of the above-mentioned conventional problems.
Therefore, one technical problem solved by the present invention is to provide a feature optimization method in residential electricity consumption behavior clustering that screens the original features, thereby reducing the amount of computation in the analysis and improving its accuracy.
To solve this technical problem, the invention provides the following technical solution: a feature optimization method in residential electricity consumption behavior clustering, comprising: collecting data and constructing an original feature set; constructing an evaluation function; screening the original feature set based on the evaluation function; improving the density peak algorithm; and performing cluster analysis based on the improved density peak algorithm.
As a preferred embodiment of the feature optimization method in residential electricity consumption behavior clustering according to the present invention: the original feature set comprises electricity consumption features and meteorological features. The electricity consumption features comprise peak-valley characteristic change indexes, electricity consumption characteristic change indexes and daily electricity consumption characteristic indexes; the meteorological features comprise average temperature, maximum temperature, minimum temperature, rainfall, wind direction, wind speed, air pressure and humidity.
As a preferred embodiment of the feature optimization method in residential electricity consumption behavior clustering according to the present invention: constructing the evaluation function involves a silhouette coefficient index, calculated by the following formula,
$$S(x_i) = \frac{b(x_i) - a(x_i)}{\max\{a(x_i),\, b(x_i)\}}$$
where $x_i$ is sample i in the original data set X, $a(x_i)$ denotes the average distance from $x_i$ to the other objects in the same cluster, and $b(x_i)$ denotes the minimum average distance from $x_i$ to the remaining clusters.
As a preferred embodiment of the feature optimization method in residential electricity consumption behavior clustering according to the present invention: the evaluation function further involves a Bayesian information criterion (BIC) function, calculated by the following formula,
$$\mathrm{BIC} = k\ln(n) - 2\ln(\hat{L})$$
where k is the number of clusters in the clustering model, n is the number of samples, and $\hat{L}$ is the likelihood function, given by
Figure BDA0002396158760000024
where SC and SC* are, respectively, the optimal value of the cluster evaluation index and the actually output value of the evaluation index.
As a preferred embodiment of the feature optimization method in residential electricity consumption behavior clustering according to the present invention: the evaluation function further involves a correlation coefficient $\rho_{xy}$, calculated as follows,
$$\rho_{xy} = \frac{\operatorname{cov}(x, y)}{\sigma_x \sigma_y}$$
where cov(x, y) is the covariance of features x and y, $\sigma_x$ and $\sigma_y$ are the standard deviations of features x and y, respectively, and $\rho_{xy}$ takes values in [-1, 1].
As a preferred embodiment of the feature optimization method in residential electricity consumption behavior clustering according to the present invention: the evaluation function is given by the formula,
Figure BDA0002396158760000026
where Z(x) is the evaluation value of feature x, B'(x) is the normalized Bayesian information criterion value of feature x, and $\rho_{xy}$ is the correlation coefficient.
As a preferred embodiment of the feature optimization method in residential electricity consumption behavior clustering according to the present invention: the optimal feature subset is constructed through feature optimization, which comprises: calculating the evaluation values of all features in the original feature library X; screening the features to form an optimal feature subset Y; calculating the evaluation value R of the optimal feature subset Y; and judging whether the evaluation value R is smaller than a set threshold and, if so, outputting the final optimal feature subset Y.
As a preferred embodiment of the feature optimization method in residential electricity consumption behavior clustering according to the present invention: the evaluation value R is calculated by the formula,
Figure BDA0002396158760000031
The evaluation value R is the ratio of the evaluation value of the optimal feature in the original feature library X to the evaluation value of the optimal feature subset Y; selection stops when R is smaller than the set threshold.
As a preferred embodiment of the feature optimization method in residential electricity consumption behavior clustering according to the present invention: the improved density peak algorithm comprises: optimizing the truncation distance with a cuckoo search algorithm according to the cluster evaluation index SC; and selecting the cluster centers automatically by applying the idea of outlier detection with a Gaussian distribution.
Another technical problem solved by the invention is to provide a feature optimization system in residential electricity consumption behavior clustering, by means of which the above method can be implemented.
To solve this technical problem, the invention provides the following technical solution: a feature optimization system in residential electricity consumption behavior clustering, comprising an acquisition module for collecting data and constructing an original feature set; a screening module that constructs an evaluation function and screens the original feature set data; and a cluster analysis module for clustering the screened data.
The beneficial effect of the invention is that, by optimizing the original feature set composed of electricity consumption features and meteorological features, an optimal feature subset is formed that requires the least computation while achieving a better clustering effect; cluster analysis is then performed on this subset, completing the classification of users' electricity consumption patterns.
Drawings
To illustrate the technical solutions of the embodiments of the present invention more clearly, the drawings used in the description of the embodiments are briefly introduced below. The drawings described below show only some embodiments of the present invention; those skilled in the art can obtain other drawings from them without inventive effort. In the drawings:
Fig. 1 is a schematic overall flow chart of the feature optimization method in residential electricity consumption behavior clustering according to the first embodiment of the present invention;
Fig. 2 is a schematic flow chart of constructing the optimal feature subset according to the first embodiment of the present invention;
Fig. 3 is a graph of the variation in accuracy during the feature selection process according to the first embodiment of the present invention;
Fig. 4 is a schematic diagram of the overall structure of the feature optimization system in residential electricity consumption behavior clustering according to the second embodiment of the present invention.
Detailed Description
In order to make the aforementioned objects, features and advantages of the present invention comprehensible, specific embodiments accompanied with figures are described in detail below, and it is apparent that the described embodiments are a part of the embodiments of the present invention, not all of the embodiments. All other embodiments, which can be obtained by a person skilled in the art without making creative efforts based on the embodiments of the present invention, shall fall within the protection scope of the present invention.
In the following description, numerous specific details are set forth in order to provide a thorough understanding of the present invention, but the present invention may be practiced in other ways than those specifically described and will be readily apparent to those of ordinary skill in the art without departing from the spirit of the present invention, and therefore the present invention is not limited to the specific embodiments disclosed below.
Furthermore, reference herein to "one embodiment" or "an embodiment" means that a particular feature, structure, or characteristic described in connection with the embodiment is included in at least one implementation of the invention. The appearances of the phrase "in one embodiment" in various places in the specification are not necessarily all referring to the same embodiment, nor are separate or alternative embodiments mutually exclusive of other embodiments.
The present invention will be described in detail with reference to the drawings, wherein the cross-sectional views illustrating the structure of the device are not enlarged partially in general scale for convenience of illustration, and the drawings are only exemplary and should not be construed as limiting the scope of the present invention. In addition, the three-dimensional dimensions of length, width and depth should be included in the actual fabrication.
Meanwhile, in the description of the present invention, it should be noted that the terms "upper, lower, inner and outer" and the like indicate orientations or positional relationships based on the orientations or positional relationships shown in the drawings, and are only for convenience of describing the present invention and simplifying the description, but do not indicate or imply that the referred device or element must have a specific orientation, be constructed in a specific orientation and operate, and thus, cannot be construed as limiting the present invention. Furthermore, the terms first, second, or third are used for descriptive purposes only and are not to be construed as indicating or implying relative importance.
The terms "mounted, connected," and "connected" are to be construed broadly and include, for example: can be fixedly connected, detachably connected or integrally connected; they may be mechanically, electrically, or directly connected, or indirectly connected through intervening media, or may be interconnected between two elements. The specific meanings of the above terms in the present invention can be understood in specific cases to those skilled in the art.
Example 1
Referring to Fig. 1, a flow chart of the feature optimization method in residential electricity consumption behavior clustering is shown. The method comprises the following steps:
S1: Collect data and construct the original feature set. The original feature set comprises electricity consumption features and meteorological features. The electricity consumption features comprise peak-valley characteristic change indexes, electricity consumption characteristic change indexes and daily electricity consumption characteristic indexes; the meteorological features comprise average temperature, maximum temperature, minimum temperature, rainfall, wind direction, wind speed, air pressure and humidity. The data can be collected from residential electricity meters, meteorological websites and the like.
Specifically, referring to Table 1 below, the electricity consumption features are characteristic indexes of the residential electrical load. The peak-valley characteristic change indexes comprise the peak load rate, the plateau load rate and the valley load rate; the electricity consumption characteristic change indexes comprise the load rate, the peak-valley difference and the peak-valley difference rate; the daily electricity consumption characteristic indexes express electricity consumption characteristics on a daily basis and comprise the daily electricity load, daily average load, daily maximum load and daily minimum load.
Table 1: characteristic index of electricity consumption
Figure BDA0002396158760000051
Figure BDA0002396158760000061
In Table 1, P denotes the electrical load; peak, fl and val denote the peak, plateau and valley periods, respectively; and sum, av, max and min denote the total, average, maximum and minimum load, respectively.
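The daily indexes named above can be illustrated with a short Python sketch. The exact definitions in Table 1 are not reproduced in this text, so the formulas below (load rate as average over maximum load, peak-valley difference as maximum minus minimum, and its rate relative to the maximum) are common conventions assumed for illustration; the function name `daily_load_features` and the dictionary keys are hypothetical. The period-based indexes (peak, plateau and valley load rates) are omitted because the period boundaries are not given.

```python
import numpy as np

def daily_load_features(load):
    """Basic daily indexes from one day's load curve (e.g. 24 or 96 readings)."""
    load = np.asarray(load, dtype=float)
    p_sum, p_av = load.sum(), load.mean()
    p_max, p_min = load.max(), load.min()
    return {
        "P_sum": p_sum,                                    # daily electricity load
        "P_av": p_av,                                      # daily average load
        "P_max": p_max,                                    # daily maximum load
        "P_min": p_min,                                    # daily minimum load
        "load_rate": p_av / p_max,                         # assumed: P_av / P_max
        "peak_valley_diff": p_max - p_min,                 # assumed: P_max - P_min
        "peak_valley_diff_rate": (p_max - p_min) / p_max,  # assumed: relative to P_max
    }

features = daily_load_features([1, 2, 3, 4])
print(features["load_rate"], features["peak_valley_diff_rate"])
```

A curve whose maximum load is zero would need separate handling, since two of the ratios divide by the maximum.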
Meteorological factors are also latent features that affect users' electricity consumption behavior, which is generally influenced by air temperature, precipitation, humidity, wind and the like. Typical meteorological features are shown in Table 2 below.
Table 2: Meteorological feature indexes
Figure BDA0002396158760000062
In this embodiment, the combined influence of electricity consumption features and meteorological features is considered when studying user electricity consumption behavior, and the original feature set X is constructed from 18 feature indexes.
S2: Construct an evaluation function for screening the original feature set. Because the feature indexes are of many types and the amount of collected data is large, the features in the original feature set X need to be optimized; in this embodiment, optimal features are selected by constructing an evaluation function.
In this embodiment, a new evaluation function is constructed from the silhouette coefficient index, the Bayesian information criterion (BIC) and the correlation coefficient.
The silhouette coefficient index is used to evaluate the clustering effect. With the original feature set X divided into J clusters, $C = \{C_1, C_2, \ldots, C_J\}$, the silhouette coefficient of a sample i in X is calculated by the formula,
$$S(x_i) = \frac{b(x_i) - a(x_i)}{\max\{a(x_i),\, b(x_i)\}}$$
where $x_i$ is sample i in the original data set X, $a(x_i)$ denotes the average distance from $x_i$ to the other objects in the same cluster, and $b(x_i)$ denotes the minimum average distance from $x_i$ to the remaining clusters.
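As an illustration of the silhouette coefficient described above, the following Python sketch computes, for every sample, a (mean distance to the other members of its own cluster), b (smallest mean distance to any other cluster) and the usual ratio (b - a) / max(a, b). Euclidean distance, the function name `silhouette_values` and the convention of scoring singleton clusters as 0 are assumptions made for this example; at least two clusters are required.

```python
import numpy as np

def silhouette_values(X, labels):
    """Per-sample silhouette coefficient s = (b - a) / max(a, b)."""
    X = np.asarray(X, dtype=float)
    labels = np.asarray(labels)
    # Pairwise Euclidean distance matrix (assumed distance measure).
    dist = np.sqrt(((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=2))
    clusters = np.unique(labels)
    s = np.zeros(len(X))
    for i in range(len(X)):
        same = labels == labels[i]
        n_same = same.sum()
        if n_same == 1:
            continue  # singleton cluster: coefficient left at 0 by convention
        # dist[i, i] is 0, so dividing by n_same - 1 averages over the other members.
        a = dist[i, same].sum() / (n_same - 1)
        b = min(dist[i, labels == c].mean() for c in clusters if c != labels[i])
        s[i] = (b - a) / max(a, b)
    return s

values = silhouette_values([[0.0], [1.0], [10.0], [11.0]], [0, 0, 1, 1])
print(values.mean())  # overall index: mean of the per-sample coefficients
```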
The Bayesian information criterion (BIC), which is based on information quantity, is used to evaluate the effectiveness of features and is calculated as follows,
$$\mathrm{BIC} = k\ln(n) - 2\ln(\hat{L})$$
where k is the number of clusters in the clustering model, n is the number of samples, and $\hat{L}$ is the likelihood function, given by
Figure BDA0002396158760000072
where SC and SC* are, respectively, the optimal value of the cluster evaluation index and the actually output value of the evaluation index.
The correlation coefficient $\rho_{xy}$ characterizes the degree of association between two features and is calculated as follows,
$$\rho_{xy} = \frac{\operatorname{cov}(x, y)}{\sigma_x \sigma_y}$$
where cov(x, y) is the covariance of features x and y, $\sigma_x$ and $\sigma_y$ are the standard deviations of features x and y, respectively, and $\rho_{xy}$ takes values in [-1, 1]. The closer the absolute value of $\rho_{xy}$ is to 1, the stronger the correlation between the two features.
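The correlation coefficient between two feature columns can be computed directly from its definition, covariance divided by the product of the standard deviations. The sketch below uses population statistics throughout (NumPy's default `ddof=0`); the function name `correlation` is hypothetical, and `np.corrcoef` gives the same value.

```python
import numpy as np

def correlation(x, y):
    """Pearson correlation: cov(x, y) / (sigma_x * sigma_y)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    cov = ((x - x.mean()) * (y - y.mean())).mean()  # population covariance
    return cov / (x.std() * y.std())                # population standard deviations

x = [1.0, 2.0, 3.0, 4.0]
print(correlation(x, [2.0, 4.0, 6.0, 8.0]))   # perfectly correlated features
print(correlation(x, [4.0, 3.0, 2.0, 1.0]))   # perfectly anti-correlated features
```

A constant feature has zero standard deviation, so its correlation with any other feature is undefined and would need to be handled before screening.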
The evaluation function is given by the formula,
Figure BDA0002396158760000074
where Z(x) is the evaluation value of feature x, B'(x) is the normalized Bayesian information criterion value of feature x, and $\rho_{xy}$ is the correlation coefficient.
S3: Select qualifying features to form the optimal feature subset.
Each feature in the original feature set X influences the cluster analysis differently, and the information provided by some features may be duplicated or redundant. Optimizing the original feature set, that is, selecting effective feature indexes to represent the electricity consumption data and clustering on the resulting optimal feature subset, removes redundant feature indexes, simplifies computation and can improve analysis performance.
To obtain the optimal feature subset, both the effectiveness of individual features and the complementarity between features must be considered. This embodiment therefore constructs an evaluation function for feature optimization that jointly accounts for feature effectiveness and inter-feature correlation.
The feature optimization comprises the following steps:
calculating the evaluation values of all features in the original feature library X;
screening the features to form an optimal feature subset Y;
calculating the evaluation value R of the optimal feature subset Y;
judging whether the evaluation value R is smaller than a set threshold and, if so, outputting the final optimal feature subset Y.
Specifically, when each feature in the original feature library is scored with the evaluation function, a smaller evaluation value indicates that the feature has a greater influence on the analysis of electricity consumption behavior and gives a better result.
During feature selection, features with small evaluation values are selected from the original feature set X to form the optimal feature subset. The process is as follows: first, the evaluation values of all features in the original feature library are calculated; then, starting from an empty set, features are selected one by one using heuristic sequential forward search, each time adding the feature with the smallest evaluation value to the optimal feature subset, until the subset satisfies the stopping condition. The flow chart for constructing the optimal feature subset Y is shown in Fig. 2. The selected feature can be expressed as:
y = argmin{Z(x)}
The evaluation value of the optimal feature subset Y can be expressed as:
$$Z(Y) = \sum_{y \in Y} Z(y)$$
where Z(Y) is the evaluation value of the optimal feature subset Y, namely the sum of the evaluation values of all features in that subset. Feature selection terminates when the effectiveness of the remaining features in the original feature library X is far smaller than the redundancy they would introduce. To apply this condition, an evaluation value R is calculated by the formula,
Figure BDA0002396158760000082
The evaluation value R is the ratio of the evaluation value of the optimal feature in the original feature library X to the evaluation value of the optimal feature subset Y; selection stops when R is smaller than the set threshold.
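The selection loop described above can be sketched in Python as a sequential forward search. Because the formulas for Z(x) and R are not reproduced in this text, the sketch takes the evaluation function as a caller-supplied callable `evaluate(feature, selected)` and reads R as the evaluation value of the feature just chosen divided by the accumulated value Z(Y); both the interface and this reading of R are assumptions for illustration, not the patent's exact definitions. The guard on a zero Z(Y) is also an added assumption.

```python
def forward_select(features, evaluate, threshold):
    """Sequential forward search: repeatedly add the feature with the smallest Z."""
    selected, z_subset = [], 0.0
    remaining = list(features)
    while remaining:
        # Re-score every remaining feature given what is already selected.
        scores = {f: evaluate(f, selected) for f in remaining}
        best = min(scores, key=scores.get)   # y = argmin Z(x)
        selected.append(best)
        remaining.remove(best)
        z_subset += scores[best]             # Z(Y): sum of the selected features' values
        # Assumed stopping rule: R = Z(best) / Z(Y) compared with the threshold.
        if z_subset > 0 and scores[best] / z_subset < threshold:
            break
    return selected

# Toy evaluation function (hypothetical): scores shrink as the subset grows.
base = {"a": 1.0, "b": 3.0, "c": 3.1, "d": 10.0}
toy_evaluate = lambda f, sel: base[f] / (1 + len(sel))
print(forward_select(base, toy_evaluate, threshold=0.5))
```

Passing the already selected features to `evaluate` reflects that the evaluation values change from one selection round to the next, as the two evaluation tables in the experiment show.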
S4: improving a density peak algorithm;
s5: and carrying out clustering analysis based on an improved density peak algorithm.
Because the redundancy of the traditional density peak algorithm is high, and the artificial selection of the clustering center also comprises subjectivity, based on the defects, the improved density peak method is provided, and the main work is to optimize the truncation distance by using a cuckoo search algorithm according to a clustering evaluation index SC; and automatically selecting the clustering center by using the idea of abnormal value detection and adopting Gaussian distribution.
Specifically, the density peak clustering algorithm with cuckoo optimization comprises the following steps:
initializing the population;
running the CFSFDP clustering algorithm to obtain the SC index;
keeping the current SC index as the optimum;
calculating the SC index of the next generation and, if it is better than that of the previous generation, carrying the current $d_c$ value over to the next generation; otherwise leaving the original $d_c$ value unchanged;
generating a random number P and comparing it with the discovery probability $P_a$: if P is greater than $P_a$, the solution is updated; otherwise it remains unchanged;
if the current optimal solution no longer changes or the maximum number of iterations is reached, outputting the corresponding SC index and truncation distance $d_c$ and ending the algorithm; otherwise returning to the second step.
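The steps above can be sketched as a basic one-dimensional cuckoo search in Python. The sketch is generic: `score(dc)` stands for "run CFSFDP with truncation distance dc and return the SC index" and must be supplied by the caller. The Lévy-flight step (Mantegna's method with exponent 1.5), the step scale 0.01, the population size and the fixed iteration budget are common defaults assumed here, and the early stop on an unchanged optimum is left out for brevity.

```python
import math
import numpy as np

def cuckoo_search_dc(score, low, high, n_nests=15, pa=0.25, max_iter=50, seed=0):
    """Maximize score(dc) over [low, high] with a basic cuckoo search."""
    rng = np.random.default_rng(seed)
    beta = 1.5  # Levy exponent (assumed default)
    sigma = (math.gamma(1 + beta) * math.sin(math.pi * beta / 2)
             / (math.gamma((1 + beta) / 2) * beta * 2 ** ((beta - 1) / 2))) ** (1 / beta)

    def keep_better(nests, fit, cand):
        # Greedy replacement: a candidate survives only if it scores higher.
        cand = np.clip(cand, low, high)
        cand_fit = np.array([score(d) for d in cand])
        better = cand_fit > fit
        return np.where(better, cand, nests), np.where(better, cand_fit, fit)

    nests = rng.uniform(low, high, n_nests)      # initialize the population
    fit = np.array([score(d) for d in nests])    # SC index of each candidate dc
    for _ in range(max_iter):
        best = nests[fit.argmax()]
        # Levy-flight move relative to the current best nest.
        step = rng.normal(0, sigma, n_nests) / np.abs(rng.normal(0, 1, n_nests)) ** (1 / beta)
        nests, fit = keep_better(nests, fit, nests + 0.01 * step * (nests - best))
        # Discovery: nests whose random draw exceeds pa are perturbed.
        discovered = rng.random(n_nests) > pa
        walk = rng.random(n_nests) * (rng.permutation(nests) - rng.permutation(nests))
        nests, fit = keep_better(nests, fit, nests + discovered * walk)
    i = fit.argmax()
    return nests[i], fit[i]

# Toy objective standing in for the SC index (hypothetical).
dc, sc = cuckoo_search_dc(lambda d: -(d - 0.3) ** 2, 0.0, 1.0)
print(dc, sc)
```

Because a nest is replaced only by a better-scoring candidate, the best score never decreases from one generation to the next, which matches the rule of retaining the previous $d_c$ when the new generation is not better.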
In this embodiment, cluster centers are determined automatically using a normal distribution as the outlier detection model, which comprises the following steps:
calculating the local density ρ and the distance δ of each data point, and normalizing both;
calculating the cluster center weight γ of each data point by the formula,
$$\gamma = \rho' \delta'$$
where ρ′ and δ′ are the normalized local density ρ and the normalized distance δ, respectively;
calculating the mean and variance over the data points according to the following formulas,
Figure BDA0002396158760000092
Figure BDA0002396158760000093
determining the outliers according to the 3σ principle; these outliers are automatically selected as the cluster centers.
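A minimal Python sketch of this center selection follows. It assumes the standard density peak definitions (local density as the number of neighbors within the truncation distance, and distance as the gap to the nearest point of higher density), min-max normalization, and a one-sided 3σ test applied to the weights γ with the population mean and standard deviation. The formulas in the figures are not reproduced here, so these choices and the function names are assumptions.

```python
import numpy as np

def center_weights(X, dc):
    """Cluster-center weight gamma = rho' * delta' for every data point."""
    X = np.asarray(X, dtype=float)
    dist = np.sqrt(((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=2))
    rho = (dist < dc).sum(axis=1) - 1.0          # neighbors within dc, excluding the point itself
    delta = np.empty(len(X))
    for i in range(len(X)):
        higher = rho > rho[i]
        # Distance to the nearest denser point; the densest points get their largest distance.
        delta[i] = dist[i, higher].min() if higher.any() else dist[i].max()

    def normalize(v):
        return (v - v.min()) / (v.max() - v.min())  # min-max scaling (assumed)

    return normalize(rho) * normalize(delta)

def three_sigma_centers(gamma):
    """Indices whose weight exceeds mean + 3 * standard deviation."""
    gamma = np.asarray(gamma, dtype=float)
    return np.flatnonzero(gamma > gamma.mean() + 3 * gamma.std())

gamma = center_weights([[0.0], [0.1], [0.2], [5.0], [5.1], [5.2]], dc=0.15)
print(gamma)                                      # the two group middles get the largest weight
print(three_sigma_centers([0.0] * 20 + [1.0]))    # one clear outlier among 21 weights
```

The 3σ test can only flag a point when the sample is large enough for a single value to lie three standard deviations above the mean, which is why the second call uses 21 weights rather than the six-point toy data.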
Scenario 1:
To verify the practical effect of the feature optimization method in this embodiment, 515 typical daily load curves in 5 classes (103 curves per class) were selected for a cluster analysis experiment, with classification accuracy used to measure clustering quality. The experiment was run on a single personal computer with a 2.6 GHz CPU, 16 GB of memory and a 64-bit operating system, and the algorithms were tested in Matlab R2018a.
First, features are selected from the original feature set according to the proposed feature optimization strategy. Starting from an empty set, the optimal feature subset is built by successively selecting the feature with the smallest evaluation value. For the first feature selection, the evaluation value of each feature index is calculated as shown in Table 1 below.
Table 1: Evaluation value of each index at the first feature selection
Feature number    1         2         3         4         5         6
Evaluation value  3.99E-11  1.21E-09  1.84E-16  7.91E-11  2.39E-10  3.07E-10
Feature number    7         8         9         10        11        12
Evaluation value  0         3.07E-09  7.51E-10  3.30E-11  5.24E-09  7.05E-09
Feature number    13        14        15        16        17        18
Evaluation value  6.90E-09  8.98E-23  1.85E-20  3.57E-24  6.12E-09  2.15E-17
During selection, the feature with the smallest evaluation value is placed into the optimal feature subset. According to Table 1, the first feature selected is number 7, the daily electricity load, which has the smallest evaluation value. The second feature selection proceeds in the same way, except that the already selected daily electricity load feature is not evaluated again; the evaluation values of the remaining feature indexes are shown in Table 2 below.
Table 2: Evaluation value of each index at the second feature selection
Feature number    1         2         3         4         5         6
Evaluation value  6.32E-09  1.42E-08  1.39E-08  2.76E-21  7.04E-16  9.31E-10
Feature number    7         8         9         10        11        12
Evaluation value  /         1.36E-08  7.27E-23  1.67E-15  8.98E-20  2.47E-10
Feature number    13        14        15        16        17        18
Evaluation value  9.59E-11  1.27E-10  1.04E-08  3.07E-09  3.37E-11  7.55E-10
Similar to the first selection, the feature with the smallest evaluation value is selected, and according to table 2 above, the daily maximum load, i.e., number 9, is selected, in which case the preferred subset of features consists of two features, daily power load and daily maximum load.
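The two selections can be checked directly against the tabulated evaluation values. The short Python snippet below copies the values from the two evaluation tables of this experiment and picks the feature number with the smallest value in each round; the variable names are illustrative only.

```python
# Evaluation values at the first selection, keyed by feature number.
first_round = {
    1: 3.99e-11, 2: 1.21e-09, 3: 1.84e-16, 4: 7.91e-11, 5: 2.39e-10, 6: 3.07e-10,
    7: 0.0, 8: 3.07e-09, 9: 7.51e-10, 10: 3.30e-11, 11: 5.24e-09, 12: 7.05e-09,
    13: 6.90e-09, 14: 8.98e-23, 15: 1.85e-20, 16: 3.57e-24, 17: 6.12e-09, 18: 2.15e-17,
}
# Evaluation values at the second selection; feature 7 is already selected.
second_round = {
    1: 6.32e-09, 2: 1.42e-08, 3: 1.39e-08, 4: 2.76e-21, 5: 7.04e-16, 6: 9.31e-10,
    8: 1.36e-08, 9: 7.27e-23, 10: 1.67e-15, 11: 8.98e-20, 12: 2.47e-10,
    13: 9.59e-11, 14: 1.27e-10, 15: 1.04e-08, 16: 3.07e-09, 17: 3.37e-11, 18: 7.55e-10,
}

first_choice = min(first_round, key=first_round.get)
second_choice = min(second_round, key=second_round.get)
print(first_choice, second_choice)  # prints: 7 9
```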
The third, fourth and subsequent feature selections follow the same process and are not described again here. With the feature optimization method proposed in this embodiment, the features are selected in the order 7,9,4,11,5,10,17,13,14,12,18,6,2,16,15,8,3,2. If features are added to the optimal feature subset in this order, the accuracy of the cluster analysis varies as shown in Fig. 3.
With the feature optimization method proposed in this embodiment, the termination condition is satisfied after 7 features have been selected, and selection stops; the final optimal feature subset consists of features 7, 9, 4, 11, 5, 10 and 17. Fig. 3 shows that the accuracy of the cluster analysis gradually increases as features are added to the optimal feature subset, but once the number of features reaches 7, adding further features reduces the clustering accuracy. The features selected by the feature optimization strategy of this embodiment therefore yield effective and reliable results in cluster analysis.
In addition, to verify the clustering effect of the improved density peak algorithm used in the feature optimization method of this embodiment, it was compared with traditional density peak clustering based on Euclidean distance; the results are shown in Table 3.
Table 3: comparison of different Process Performance
Figure BDA0002396158760000101
As can be seen from Table 3, when the optimal feature subset is clustered with the improved density peak method of this embodiment, the clustering accuracy improves while the time performance is maintained.
Example 2
Fig. 4 is a schematic structural diagram of a feature optimization system in residential electricity consumption behavior clustering according to this embodiment. The feature optimization method in residential electricity consumption behavior clustering of the foregoing embodiment can be implemented with this system.
Specifically, the system includes an acquisition module 100, a screening module 200, and a cluster analysis module 300, wherein:
the acquisition module 100 is used for acquiring data and constructing an original feature set;
the screening module 200 constructs an evaluation function and screens the original feature set data;
the cluster analysis module 300 clusters the screened data.
It should be recognized that embodiments of the present invention can be realized and implemented by computer hardware, a combination of hardware and software, or by computer instructions stored in a non-transitory computer readable memory. The methods may be implemented in a computer program using standard programming techniques, including a non-transitory computer readable storage medium configured with the computer program, where the storage medium so configured causes a computer to operate in a specific and predefined manner, according to the methods and figures described in the detailed description. Each program may be implemented in a high level procedural or object oriented programming language to communicate with a computer system. However, the program(s) can be implemented in assembly or machine language, if desired. In any case, the language may be a compiled or interpreted language. Furthermore, the program can be run on a programmed application specific integrated circuit for this purpose.
Further, the operations of processes described herein can be performed in any suitable order unless otherwise indicated herein or otherwise clearly contradicted by context. The processes described herein (or variations and/or combinations thereof) may be performed under the control of one or more computer systems configured with executable instructions, and may be implemented as code (e.g., executable instructions, one or more computer programs, or one or more applications) that is executed collectively on one or more processors, by hardware, or combinations thereof. The computer program includes a plurality of instructions executable by one or more processors.
Further, the methods may be implemented in any type of computing platform operatively connected to a suitable interface, including but not limited to a personal computer, mini computer, mainframe, workstation, networked or distributed computing environment, separate or integrated computer platform, or in communication with a charged particle tool or other imaging device, and the like. Aspects of the invention may be embodied in machine-readable code stored on a non-transitory storage medium or device, whether removable or integrated into a computing platform, such as a hard disk, optically read and/or write storage medium, RAM, ROM, or the like, such that it may be read by a programmable computer, which when read by the storage medium or device, is operative to configure and operate the computer to perform the procedures described herein. Additionally, the machine-readable code, or portions thereof, may be transmitted over a wired or wireless network. The invention described herein includes these and other different types of non-transitory computer-readable storage media when such media include instructions or programs that implement the steps described above in conjunction with a microprocessor or other data processor. The invention also includes the computer itself when programmed according to the methods and techniques described herein. The computer program can be applied to input data to perform the functions described herein to transform the input data to generate output data that is stored to non-volatile memory. The output information may also be applied to one or more output devices, such as a display. In a preferred embodiment of the invention, the transformed data represents physical and tangible objects, including particular visual depictions of physical and tangible objects produced on the display.
As used in this application, the terms "component," "module," "system," and the like are intended to refer to a computer-related entity, either hardware, firmware, a combination of hardware and software, or software in execution. For example, a component may be, but is not limited to being: a process running on a processor, an object, an executable, a thread of execution, a program, and/or a computer. By way of example, both an application running on a computing device and the computing device can be a component. One or more components can reside within a process and/or thread of execution and a component can be localized on one computer and/or distributed between two or more computers. In addition, these components can execute from various computer readable media having various data structures thereon. The components may communicate by way of local and/or remote processes such as in accordance with a signal having one or more data packets (e.g., data from one component interacting with another component in a local system, distributed system, and/or across a network such as the internet with other systems by way of the signal).
It should be noted that the above-mentioned embodiments are only for illustrating the technical solutions of the present invention and not for limiting, and although the present invention is described in detail with reference to the preferred embodiments, it should be understood by those skilled in the art that modifications or equivalent substitutions can be made on the technical solutions of the present invention without departing from the spirit and scope of the technical solutions of the present invention, which should be covered by the claims of the present invention.

Claims (10)

1. A feature optimization method in residential electricity consumption behavior clustering, characterized by comprising the following steps:
collecting data and constructing an original feature set;
constructing an evaluation function;
screening the original feature set based on an evaluation function;
improving a density peak algorithm;
and carrying out clustering analysis based on an improved density peak algorithm.
2. The feature optimization method in residential electricity consumption behavior clustering as claimed in claim 1, wherein: the original feature set includes electricity consumption features and meteorological features;
the electricity consumption features comprise a peak-valley characteristic change index, an electricity consumption characteristic change index, and a daily electricity consumption characteristic index; the meteorological features comprise average temperature, maximum temperature, minimum temperature, rainfall, wind direction, wind speed, pressure, and humidity.
3. The feature optimization method in residential electricity consumption behavior clustering as claimed in claim 1 or 2, wherein: the construction of the evaluation function comprises a silhouette coefficient index, which is calculated by the following formula,
Figure FDA0002396158750000011
where $x_i$ is a sample in the original data set $X$, $a(x_i)$ denotes the average distance from $x_i$ to the other objects in the same cluster, and $b(x_i)$ denotes the minimum average distance from $x_i$ to the objects in each of the other clusters.
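For illustration only, the silhouette coefficient can be computed as in the sketch below. The formula image is not reproduced in this text, so the sketch assumes the standard per-sample definition s(x_i) = (b(x_i) - a(x_i)) / max(a(x_i), b(x_i)), which matches the definitions of a and b given here. Averaging over all samples and using Euclidean distance are further assumptions, and the function name `silhouette` is hypothetical.

```python
import math

def silhouette(points, labels):
    """Mean silhouette coefficient, standard definition (needs >= 2 clusters)."""
    clusters = set(labels)
    scores = []
    for i, p in enumerate(points):
        # a: mean distance to the other members of the sample's own cluster
        own = [math.dist(p, q) for j, q in enumerate(points)
               if labels[j] == labels[i] and j != i]
        if not own:                      # singleton cluster: score taken as 0
            scores.append(0.0)
            continue
        a = sum(own) / len(own)
        # b: smallest mean distance to the members of any other cluster
        b = min(
            sum(math.dist(p, q) for j, q in enumerate(points) if labels[j] == c)
            / labels.count(c)
            for c in clusters if c != labels[i]
        )
        denom = max(a, b)
        scores.append((b - a) / denom if denom > 0 else 0.0)
    return sum(scores) / len(scores)

# Two well-separated pairs of points give a value close to 1.
points = [(0.0, 0.0), (0.0, 1.0), (10.0, 0.0), (10.0, 1.0)]
labels = [0, 0, 1, 1]
sc = silhouette(points, labels)
print(round(sc, 3))
```

Values near 1 indicate compact, well-separated clusters, and values near -1 indicate samples that sit closer to another cluster than to their own.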
4. The feature optimization method in residential electricity consumption behavior clustering as claimed in claim 3, wherein: the evaluation function further comprises a Bayesian information criterion function, which is calculated by the following formula,
Figure FDA0002396158750000012
where $k$ is the number of clusters in the clustering model, $n$ is the number of samples, and
Figure FDA0002396158750000013
is the likelihood function, given by
Figure FDA0002396158750000014
where $SC$ and $SC^*$ are, respectively, the optimal value of the cluster evaluation index and the actually output evaluation index value.
5. The feature optimization method in residential electricity consumption behavior clustering as claimed in claim 4, wherein: the evaluation function further comprises a correlation coefficient $\rho_{xy}$, which is calculated by the following formula,
Figure FDA0002396158750000015
where $\mathrm{cov}(x, y)$ is the covariance of features $x$ and $y$, $\sigma_x$ and $\sigma_y$ are the standard deviations of features $x$ and $y$, respectively, and $\rho_{xy}$ takes values in $[-1, 1]$.
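As a hedged illustration, the description of the correlation coefficient (covariance divided by the product of the two standard deviations, with range [-1, 1]) matches the usual Pearson coefficient. The sketch below assumes that reading and uses population (1/n) normalisation; the function name `pearson` is hypothetical, and the formula image itself is not reproduced here.

```python
import math

def pearson(x, y):
    """Pearson correlation: cov(x, y) / (sigma_x * sigma_y), population form."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n
    sigma_x = math.sqrt(sum((a - mean_x) ** 2 for a in x) / n)
    sigma_y = math.sqrt(sum((b - mean_y) ** 2 for b in y) / n)
    # A constant feature has zero standard deviation and raises ZeroDivisionError.
    return cov / (sigma_x * sigma_y)

load = [1.0, 2.0, 3.0, 4.0]
rho_same = pearson(load, [2.0, 4.0, 6.0, 8.0])      # perfectly correlated
rho_opposite = pearson(load, [8.0, 6.0, 4.0, 2.0])  # perfectly anti-correlated
print(rho_same, rho_opposite)
```

A coefficient with magnitude close to 1 means the two features are nearly linearly dependent and therefore largely redundant with each other.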
6. The feature optimization method in residential electricity consumption behavior clustering as claimed in claim 4 or 5, wherein: the evaluation function is given by the following formula,
Figure FDA0002396158750000021
where $Z(x)$ is the evaluation value of feature $x$, $B'(x)$ is the Bayesian information criterion value of feature $x$ after normalization, and $\rho_{xy}$ is the correlation coefficient.
7. The feature optimization method in residential electricity consumption behavior clustering as claimed in claim 6, wherein: an optimal feature subset is constructed by feature optimization, which comprises:
calculating the evaluation values of all features in the original feature library X;
screening the features to form an optimal feature subset Y;
calculating an evaluation value R of the optimal feature subset Y;
determining whether the evaluation value R is smaller than a set threshold and, if so, outputting the final optimal feature subset Y.
8. The feature optimization method in residential electricity consumption behavior clustering as claimed in claim 7, wherein: the evaluation value R is calculated by the following formula,
Figure FDA0002396158750000022
where the evaluation value R is the ratio of the evaluation value of the optimal feature in the original feature library X to the evaluation value of the optimal feature subset Y, and selection stops when R is smaller than the set threshold.
9. The feature optimization method in residential electricity consumption behavior clustering as claimed in claim 7 or 8, wherein: the improved density peak algorithm comprises the following steps:
optimizing the truncation distance with a cuckoo search algorithm according to the cluster evaluation index SC;
automatically selecting the cluster centers by applying the idea of outlier detection with a Gaussian distribution.
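The automatic choice of cluster centers can be illustrated with the rough sketch below. It is not the claimed procedure: the truncation distance `dc` is passed in as a fixed value rather than tuned by cuckoo search, and the Gaussian rule used here (flag points whose score gamma = rho * delta exceeds mean + k * standard deviation) is an assumed reading of "outlier detection with a Gaussian distribution". The name `density_peak_centers` and the parameter `k` are hypothetical.

```python
import math
from statistics import mean, pstdev

def density_peak_centers(points, dc, k=1.5):
    """Return indices of points whose gamma = rho * delta is a high outlier."""
    n = len(points)
    d = [[math.dist(p, q) for q in points] for p in points]
    # rho: number of other points closer than the truncation distance dc
    rho = [sum(1 for j in range(n) if j != i and d[i][j] < dc) for i in range(n)]
    # delta: distance to the nearest point ranked denser (ties broken by index);
    # the densest point gets its largest distance to any other point.
    order = sorted(range(n), key=lambda i: (-rho[i], i))
    delta = [0.0] * n
    for rank, i in enumerate(order):
        if rank == 0:
            delta[i] = max(d[i])
        else:
            delta[i] = min(d[i][j] for j in order[:rank])
    gamma = [r * dl for r, dl in zip(rho, delta)]
    # Gaussian-style outlier rule on gamma (assumed, see lead-in)
    threshold = mean(gamma) + k * pstdev(gamma)
    return [i for i, g in enumerate(gamma) if g > threshold]

# Two small groups, each with one central point surrounded by four neighbours.
points = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (-0.1, 0.0), (0.0, -0.1),
          (5.0, 5.0), (5.1, 5.0), (5.0, 5.1), (4.9, 5.0), (5.0, 4.9)]
centers = density_peak_centers(points, dc=0.12, k=1.5)
print(centers)
```

Points that are both locally dense and far from any denser point receive a large gamma, so the central point of each group stands out from the rest. The multiplier `k` controls how extreme a score must be to count as a center and would need tuning on real load data.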
10. A feature optimization system in residential electricity consumption behavior clustering, characterized by comprising:
an acquisition module (100) configured to acquire data and construct an original feature set;
a screening module (200) configured to construct an evaluation function and screen the original feature set data;
a cluster analysis module (300) configured to cluster the screened data.
CN202010132423.4A 2020-02-29 2020-02-29 Feature optimization method and system in residential electricity consumption behavior clustering Pending CN111861781A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010132423.4A CN111861781A (en) 2020-02-29 2020-02-29 Feature optimization method and system in residential electricity consumption behavior clustering


Publications (1)

Publication Number Publication Date
CN111861781A true CN111861781A (en) 2020-10-30

Family

ID=72985939

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010132423.4A Pending CN111861781A (en) 2020-02-29 2020-02-29 Feature optimization method and system in residential electricity consumption behavior clustering

Country Status (1)

Country Link
CN (1) CN111861781A (en)


Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108765194A (en) * 2018-05-29 2018-11-06 深圳源广安智能科技有限公司 A kind of effective residential electricity consumption behavior analysis system
CN108960657A (en) * 2018-07-13 2018-12-07 国网上海市电力公司 One kind being based on the preferred building Load Characteristic Analysis method of feature
CN109883691A (en) * 2019-01-21 2019-06-14 太原科技大学 The gear method for predicting residual useful life that kernel estimates and stochastic filtering integrate
CN110825723A (en) * 2019-10-09 2020-02-21 上海电力大学 Residential user classification method based on power load analysis


Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
曾兴东,林荣恒,邹华,张勇: "面向配电网故障数据的 BIC 评估后向选择方法", 《北京邮电大学学报》, vol. 40, no. 3, pages 104 - 108 *
郑虹,周丽媛,韩旭明: "布谷鸟优化的密度峰值快速搜索聚类算法", 《长春工业大学学报》, vol. 39, no. 3, pages 253 - 259 *
陆俊,朱炎平,彭文昊,孙毅: "智能用电用户行为分析特征优选策略", 《电力系统自动化》, vol. 41, no. 5, pages 58 - 62 *

Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112365164A (en) * 2020-11-13 2021-02-12 国网江苏省电力有限公司扬州供电分公司 Medium-large energy user energy characteristic portrait method based on improved density peak value fast search clustering algorithm
CN112365164B (en) * 2020-11-13 2023-09-12 国网江苏省电力有限公司扬州供电分公司 Energy characteristic portrait method for medium and large energy users based on improved density peak value rapid search clustering algorithm
CN112906790A (en) * 2021-02-20 2021-06-04 国网江苏省电力有限公司营销服务中心 Method and system for identifying solitary old people based on electricity consumption data
CN112906790B (en) * 2021-02-20 2023-08-18 国网江苏省电力有限公司营销服务中心 Solitary old man identification method and system based on electricity consumption data
CN112926645A (en) * 2021-02-22 2021-06-08 国网四川省电力公司营销服务中心 Electricity stealing detection method based on edge calculation
CN113191453A (en) * 2021-05-24 2021-07-30 国网四川省电力公司经济技术研究院 Power consumption behavior portrait generation method and system based on DAE network characteristics
CN113191453B (en) * 2021-05-24 2022-04-22 国网四川省电力公司经济技术研究院 Power consumption behavior portrait generation method and system based on DAE network characteristics


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination