CN112906790A - Method and system for identifying solitary old people based on electricity consumption data - Google Patents

Info

Publication number
CN112906790A
CN112906790A CN202110192708.1A CN202110192708A CN112906790A CN 112906790 A CN112906790 A CN 112906790A CN 202110192708 A CN202110192708 A CN 202110192708A CN 112906790 A CN112906790 A CN 112906790A
Authority
CN
China
Prior art keywords
data
electricity consumption
living alone
elderly
elderly living
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202110192708.1A
Other languages
Chinese (zh)
Other versions
CN112906790B (en
Inventor
王贺
刘颖
赵双双
何维民
王舒
陈奕彤
周家亿
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
State Grid Jiangsu Electric Power Co ltd Marketing Service Center
State Grid Corp of China SGCC
State Grid Jiangsu Electric Power Co Ltd
Original Assignee
State Grid Jiangsu Electric Power Co ltd Marketing Service Center
State Grid Corp of China SGCC
State Grid Jiangsu Electric Power Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by State Grid Jiangsu Electric Power Co ltd Marketing Service Center, State Grid Corp of China SGCC, State Grid Jiangsu Electric Power Co Ltd
Priority to CN202110192708.1A
Publication of CN112906790A
Application granted
Publication of CN112906790B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/243 Classification techniques relating to the number of classes
    • G06F18/24323 Tree-organised classifiers
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/22 Matching criteria, e.g. proximity measures
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/23 Clustering techniques
    • G06F18/232 Non-hierarchical techniques
    • G06F18/2321 Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions
    • G06F18/23213 Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions with fixed number of clusters, e.g. K-means clustering
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Probability & Statistics with Applications (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)
  • Medical Treatment And Welfare Office Work (AREA)
  • Alarm Systems (AREA)

Abstract

The present application discloses a method and system for identifying elderly people living alone based on electricity consumption data. The method includes: randomly selecting a batch of low-voltage user electricity consumption data and cleaning the selected data; performing feature selection and vectorization on the data; performing cluster analysis on the vectorized data; combining the cluster analysis results with power marketing data to screen sample data of a group suspected to be elderly people living alone; randomly verifying that sample data to determine the positive and negative samples for model training; constructing an identification model for elderly people living alone and training it with the positive and negative samples; and obtaining the electricity consumption data of all users and identifying elderly people living alone with the model. The invention obtains its data from existing power acquisition equipment such as electricity meters, so no large investment in monitoring equipment is needed; and even when training samples are lacking, it can construct an identification model from raw electricity consumption data and quickly locate users who are elderly people living alone.

Figure 202110192708

Description

Method and system for identifying solitary old people based on electricity consumption data
Technical Field
The invention belongs to the technical field of electricity consumption data analysis and application, and relates to a method and a system for identifying elderly people living alone based on electricity consumption data.
Background
At present there are many ways to monitor elderly people living alone. Some rely on smart devices, such as smart walking sticks and smart wristbands, to detect when an elderly person is in danger so that help can arrive in time; others use video surveillance and artificial intelligence to recognize and monitor the person's behavior and expressions.
However, many elderly people living alone are on the margins of society, and this is especially true of those who most urgently need help. There is currently no good method for finding them so that the necessary help can be provided. Lists of elderly people living alone are usually compiled only through door-to-door surveys by grassroots community workers. This approach consumes a great deal of manpower and material resources and depends entirely on the management level and working efficiency of each community. In very large communities with big and complex populations, the list is difficult to keep up to date. Installing monitoring equipment can reduce the community workload, but it requires a large capital investment up front and dedicated maintenance afterwards.
Disclosure of Invention
To overcome the shortcomings of the prior art, the application provides a method and system for identifying elderly people living alone based on electricity consumption data.
To achieve the above purpose, the invention adopts the following technical scheme:
a method for identifying elderly people living alone based on electricity consumption data, characterized in that
the method comprises the following steps:
step 1: randomly selecting a batch of low-voltage user electricity consumption data, cleaning the selected data, and removing abnormal and null electricity consumption values;
step 2: performing feature selection and vectorization on the data from step 1;
step 3: performing cluster analysis on the vectorized data from step 2 using a clustering algorithm;
step 4: combining the cluster analysis result of step 3 with power marketing data to screen sample data of a group suspected to be elderly people living alone;
step 5: randomly verifying the sample data of the suspected group from step 4 and determining the positive and negative samples for model training;
step 6: constructing an identification model for elderly people living alone, labeling the positive and negative samples determined in step 5, merging them as the training data, and training the model;
step 7: acquiring the electricity consumption data of all users and identifying elderly people living alone with the identification model.
The invention further comprises the following preferred embodiments:
Preferably, the low-voltage users in step 1 are users with an access voltage lower than 380V;
the electricity consumption data used is the daily electricity consumption of 500,000 randomly selected users over the last two years;
the abnormal values include daily electricity consumption that is negative or that exceeds a set daily consumption threshold.
Preferably, in step 2, statistical electricity consumption features are selected and calculated for each user, and each user's features are stored as a vector.
Preferably, the statistical features include the ratio of average electricity consumption in summer to that in winter, the ratio of average consumption on working days to that on non-working days, and the mean and variance of consumption during holidays of three days or longer and outside such holidays.
Preferably, the specific steps of step 3 are as follows:
step 301: treating each user feature vector as its own class and calculating the Euclidean distance between every pair of classes to find the minimum;
step 302: merging the two classes with the minimum Euclidean distance into a new class;
step 303: recalculating the distances between the new class and all other classes;
step 304: repeating step 302 and step 303 until all classes are merged into one class.
Preferably, the power marketing data in step 4 comprises the users' basic records and payment channels.
Preferably, in step 5, a random sample is drawn from the sample data of the suspected group and verified in the field. If the accuracy reaches a set threshold, the sample data of the suspected group from step 4 is used as the negative samples, and the data group in the step 3 clustering result that is farthest in Euclidean distance from the negative samples is used as the positive samples. If the accuracy does not reach the set threshold, the method returns to step 2 and feature selection and vectorization are performed again.
Preferably, in step 6, a random forest model is used to construct the identification model for elderly people living alone. Assuming the number of samples is N and each sample has M features, the specific training steps are as follows:
step 601: sampling N times with replacement from the samples, one sample each time, to form N samples, and using these N randomly selected samples as the samples at the root node to train one decision tree;
step 602: each time a node of the decision tree from step 601 is split, randomly selecting m of the M features, with m << M, and then using an information gain strategy to choose one of the m features as the splitting feature of the node;
step 603: repeating step 602 until the decision tree nodes can no longer be split;
step 604: building a batch of decision trees by following step 601, step 602 and step 603 in order;
step 605: combining the decision trees built in step 604 into a random forest, which serves as the identification model for elderly people living alone.
Preferably, in step 7, feature selection and vectorization are performed on the electricity consumption data of all users in the manner of step 2, and the result is input into the identification model trained in step 6 to obtain a list of elderly people living alone;
the electricity consumption data of all users refers to the daily electricity consumption of all low-voltage users in the whole province over the last two years.
The application also discloses a further invention, namely a system for identifying elderly people living alone based on electricity consumption data, the system comprising:
an initial data acquisition module, used for randomly selecting a batch of low-voltage user electricity consumption data, cleaning the selected data, and removing abnormal and null electricity consumption values;
a feature selection and vectorization module, used for performing feature selection and vectorization on the data from the initial data acquisition module;
a cluster analysis module, used for performing cluster analysis on the vectorized data using a bottom-up hierarchical clustering algorithm;
a sample data screening module, used for combining the cluster analysis result with power marketing data to screen sample data of a group suspected to be elderly people living alone;
a training sample verification module, used for randomly verifying the sample data of the suspected group and determining the positive and negative samples for model training;
a model building module, used for constructing the identification model for elderly people living alone and training it with the positive and negative samples;
and an identification module, used for acquiring the electricity consumption data of all users and identifying elderly people living alone with the identification model.
Beneficial effects achieved by the application:
1. by analyzing electricity consumption data, with auxiliary verification from other power marketing data, the method quickly produces a list of users who are highly likely to be elderly people living alone, which reduces the survey workload of communities and similar bodies and lays the groundwork for subsequent care work for elderly people living alone;
2. the invention obtains its data from existing power acquisition equipment such as electricity meters, so no large investment in monitoring equipment is needed; even when training samples are lacking, it can construct an identification model from raw electricity consumption data and quickly locate elderly people living alone.
Drawings
Fig. 1 is a flow chart of the method for identifying elderly people living alone based on electricity consumption data according to the present invention.
Detailed Description
The present application is further described below with reference to the accompanying drawings. The following examples are only for illustrating the technical solutions of the present invention more clearly, and the protection scope of the present application is not limited thereby.
The method uses users' electricity consumption data to quickly locate users who are elderly people living alone. When training samples are lacking, it obtains the labels and feature information of the positive and negative samples for model training by means such as a clustering algorithm and auxiliary verification with power marketing data.
Specifically, as shown in Fig. 1, the method for identifying elderly people living alone based on electricity consumption data includes the following steps:
step 1: randomly selecting a batch of low-voltage user electricity consumption data, cleaning the selected data, and removing abnormal values, null values and the like, where low-voltage users are users with an access voltage lower than 380V;
the electricity consumption data used is the daily electricity consumption of 500,000 randomly selected users over the last two years;
the abnormal values include daily electricity consumption that is negative or excessively large;
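The cleaning rule in step 1 can be sketched as follows. This is a minimal illustration, not the applicant's implementation: it assumes daily readings arrive as a list of floats with `None` marking a missing value, and both the function name `clean_daily_usage` and the 200 kWh upper limit are hypothetical, since the text does not give a threshold value.

```python
from typing import List, Optional


def clean_daily_usage(readings: List[Optional[float]],
                      max_kwh: float = 200.0) -> List[float]:
    """Drop null, negative, and implausibly large daily readings.

    max_kwh is an assumed threshold; the source does not state a value.
    """
    cleaned = []
    for value in readings:
        if value is None:      # null value
            continue
        if value < 0:          # negative consumption is abnormal
            continue
        if value > max_kwh:    # exceeds the assumed daily threshold
            continue
        cleaned.append(value)
    return cleaned


raw = [3.2, None, -1.0, 4.8, 950.0, 0.0]
print(clean_daily_usage(raw))
```

A zero reading is kept on purpose, because a day with no consumption is a valid observation rather than a missing one.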
step 2: performing feature selection and vectorization on the data from step 1.
Statistical electricity consumption features are selected and calculated for each user, and each user's features are stored as a vector;
the statistical features include the ratio of average consumption in summer to that in winter, the ratio of average consumption on working days to that on non-working days, the mean and variance of consumption during holidays of three days or longer and outside such holidays, and the like.
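One way to compute such a feature vector is sketched below. Several details are assumptions made only for illustration: summer is taken as June to August and winter as December to February, working days are Monday to Friday (ignoring adjusted working days), and the set of long-holiday dates is supplied by the caller. The names `usage_features` and `long_holidays` are hypothetical.

```python
from datetime import date
from statistics import mean, pvariance
from typing import Dict, List, Set


def _ratio(num: List[float], den: List[float]) -> float:
    """Ratio of means; 0.0 if a side is empty or the denominator mean is 0."""
    if not num or not den or mean(den) == 0:
        return 0.0
    return mean(num) / mean(den)


def usage_features(daily: Dict[date, float], long_holidays: Set[date]) -> List[float]:
    """Build one user's feature vector from cleaned daily consumption (kWh)."""
    summer = [v for d, v in daily.items() if d.month in (6, 7, 8)]
    winter = [v for d, v in daily.items() if d.month in (12, 1, 2)]
    workday = [v for d, v in daily.items() if d.weekday() < 5]
    weekend = [v for d, v in daily.items() if d.weekday() >= 5]
    holiday = [v for d, v in daily.items() if d in long_holidays]
    other = [v for d, v in daily.items() if d not in long_holidays]

    def mean_var(values):
        return (mean(values), pvariance(values)) if values else (0.0, 0.0)

    return [
        _ratio(summer, winter),    # summer / winter average
        _ratio(workday, weekend),  # working day / non-working day average
        *mean_var(holiday),        # mean, variance during long holidays
        *mean_var(other),          # mean, variance outside long holidays
    ]


daily = {
    date(2021, 1, 4): 2.0,   # Monday, winter
    date(2021, 1, 9): 4.0,   # Saturday, winter
    date(2021, 7, 5): 6.0,   # Monday, summer
    date(2021, 10, 1): 8.0,  # Friday, inside a long holiday
}
print(usage_features(daily, {date(2021, 10, 1)}))
```

Each user then becomes a six-element vector, which is the input the clustering in step 3 operates on.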
step 3: performing cluster analysis on the vectorized data from step 2 using a clustering algorithm, the specific steps of which are as follows:
step 301: treating each user feature vector as its own class and calculating the Euclidean distance between every pair to find the minimum. For example, if the feature vector of user a is $A = (a_1, a_2, a_3, \ldots)$, that of user b is $B = (b_1, b_2, b_3, \ldots)$, and that of user c is $C = (c_1, c_2, c_3, \ldots)$, the Euclidean distance $L_{ab}$ between the feature vectors of user a and user b is
$$L_{ab} = \sqrt{(a_1 - b_1)^2 + (a_2 - b_2)^2 + (a_3 - b_3)^2 + \cdots}$$
The Euclidean distance $L_{ac}$ between user a and user c and the Euclidean distance $L_{bc}$ between user b and user c are obtained in the same way;
step 302: merging the two classes with the minimum Euclidean distance into a new class; for example, if $L_{ab}$ is the smallest of the three distances $L_{ab}$, $L_{bc}$ and $L_{ac}$, user a and user b are merged into a new class;
step 303: recalculating the distances between the new class and all other classes;
step 304: repeating step 302 and step 303 until all classes are merged into one class.
Through this clustering, users with similar electricity consumption characteristics are grouped together.
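Steps 301 to 304 describe bottom-up hierarchical clustering, which can be sketched in plain Python as below. Two points are assumptions rather than statements from the text: the distance between two classes is taken as the minimum distance between their members (single linkage), and an optional `n_clusters` argument stops the merging early so that usable groups remain. The function names are hypothetical, and this naive version is far too slow for 500,000 users; it only illustrates the procedure.

```python
import math
from typing import List, Sequence


def euclidean(a: Sequence[float], b: Sequence[float]) -> float:
    """Euclidean distance between two feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def agglomerate(vectors: List[Sequence[float]], n_clusters: int = 1) -> List[List[int]]:
    """Bottom-up clustering with single linkage (assumed linkage rule).

    Returns clusters as lists of indices into `vectors`.
    """
    clusters = [[i] for i in range(len(vectors))]  # step 301: one class per user

    def linkage(c1, c2):
        return min(euclidean(vectors[i], vectors[j]) for i in c1 for j in c2)

    while len(clusters) > max(n_clusters, 1):
        # step 302: find the pair of classes with the minimum distance
        best = None
        for p in range(len(clusters)):
            for q in range(p + 1, len(clusters)):
                d = linkage(clusters[p], clusters[q])
                if best is None or d < best[0]:
                    best = (d, p, q)
        _, p, q = best
        merged = clusters[p] + clusters[q]
        # steps 303/304: distances to the new class are recomputed on the next pass
        clusters = [c for k, c in enumerate(clusters) if k not in (p, q)] + [merged]
    return clusters


points = [(0.0, 0.0), (0.0, 1.0), (10.0, 10.0), (10.0, 11.0)]
print(agglomerate(points, n_clusters=2))
```

Running the merge all the way to a single class, as the text states, corresponds to `n_clusters=1`; in practice the merge history would be cut at some level to obtain the groups used in step 4.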
step 4: combining the cluster analysis result of step 3 with power marketing data and the like to screen sample data of a group suspected to be elderly people living alone;
the power marketing data includes the users' basic records, payment channels and the like;
for example, information such as the user's age and whether the user pays offline can be used as conditions, and the clusters in the clustering result that meet the conditions are screened out.
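The screening in step 4 could look like the sketch below. The record fields, the age limit of 60, and the rule that at least half of a cluster's members must match are all hypothetical choices for illustration; the text only names age and offline payment as example conditions.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class MarketingRecord:
    """Assumed subset of a user's power marketing data."""
    age: int
    pays_offline: bool


def screen_clusters(clusters: List[List[str]],
                    records: Dict[str, MarketingRecord],
                    min_age: int = 60,
                    min_share: float = 0.5) -> List[int]:
    """Return indices of clusters where enough members are older offline payers."""
    selected = []
    for idx, members in enumerate(clusters):
        known = [records[u] for u in members if u in records]
        if not known:
            continue  # no marketing data for this cluster
        hits = sum(1 for r in known if r.age >= min_age and r.pays_offline)
        if hits / len(known) >= min_share:
            selected.append(idx)
    return selected


clusters = [["u1", "u2"], ["u3", "u4"]]
records = {
    "u1": MarketingRecord(age=72, pays_offline=True),
    "u2": MarketingRecord(age=68, pays_offline=True),
    "u3": MarketingRecord(age=35, pays_offline=False),
    "u4": MarketingRecord(age=70, pays_offline=False),
}
print(screen_clusters(clusters, records))
```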
step 5: randomly verifying the sample data of the suspected group from step 4 and determining the positive and negative samples for model training;
a random sample is drawn from the sample data of the suspected group and verified in the field. If the accuracy reaches a set threshold, for example 80%, the sample data of the suspected group from step 4 is used as the negative samples, and the data group in the step 3 clustering result that is farthest in Euclidean distance from the negative samples is used as the positive samples. If the accuracy does not reach 80%, the method returns to step 2 and feature selection and vectorization are performed again;
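The verification loop of step 5 can be sketched as follows. The field check is represented by a caller-supplied function `field_check`, which is hypothetical, as are the sample size and the use of centroid-to-centroid distance to decide which cluster is "farthest" (the text does not say how the distance between groups is measured). The function returns the two groups and leaves the labeling to the caller, following the assignment stated in the text.

```python
import math
import random
from typing import Callable, List, Optional, Sequence, Tuple

Vector = Sequence[float]


def centroid(group: List[Vector]) -> List[float]:
    """Component-wise mean of a group of feature vectors."""
    return [sum(col) / len(group) for col in zip(*group)]


def verify_and_label(suspected: List[Vector],
                     other_clusters: List[List[Vector]],
                     field_check: Callable[[Vector], bool],
                     sample_size: int = 10,
                     threshold: float = 0.8,
                     seed: int = 0) -> Optional[Tuple[List[Vector], List[Vector]]]:
    """Return (suspected_group, farthest_group), or None if accuracy is too low."""
    if not suspected or not other_clusters:
        return None
    rng = random.Random(seed)
    sample = rng.sample(suspected, min(sample_size, len(suspected)))
    accuracy = sum(field_check(v) for v in sample) / len(sample)
    if accuracy < threshold:
        return None  # caller goes back to step 2 and reselects features
    base = centroid(suspected)
    farthest = max(other_clusters, key=lambda c: math.dist(base, centroid(c)))
    return suspected, farthest


suspected = [(1.0, 1.0), (1.2, 0.8)]
others = [[(2.0, 2.0)], [(9.0, 9.0), (10.0, 10.0)]]
print(verify_and_label(suspected, others, field_check=lambda v: True))
print(verify_and_label(suspected, others, field_check=lambda v: False))
```

A `None` result corresponds to the branch where the accuracy falls short and the method returns to step 2.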
step 6: constructing an identification model for elderly people living alone, labeling the positive and negative samples determined in step 5, merging them as the training data, and training the model. The model is a binary classifier that must judge whether the input data belongs to an elderly person living alone, so two types of training samples are needed, namely the positive and negative samples determined in step 5;
a random forest model is used to construct the identification model. Assuming the number of samples is N and each sample has M features, the specific training steps are as follows:
step 601: sampling N times with replacement from the samples, one sample each time, to form N samples, and using these N randomly selected samples as the samples at the root node to train one decision tree;
step 602: each time a node of the decision tree from step 601 is split, randomly selecting m of the M features, with m << M, and then using an information gain strategy to choose one of the m features as the splitting feature of the node;
step 603: repeating step 602 until the decision tree nodes can no longer be split;
step 604: building a batch of decision trees by following step 601, step 602 and step 603 in order;
step 605: combining the decision trees built in step 604 into a random forest, which serves as the identification model for elderly people living alone.
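Steps 601 to 605 can be illustrated with a compact random forest in plain Python. This is a teaching sketch under stated assumptions, not the applicant's model: features are numeric and split on a threshold, information gain is computed from Shannon entropy, labels are the integers 0 and 1 for the two classes from step 5, and the tree count and seed are arbitrary. All function names are hypothetical.

```python
import math
import random
from collections import Counter


def entropy(labels):
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())


def best_split(rows, labels, features):
    """Pick the (gain, feature, threshold) with the highest information gain."""
    base, best = entropy(labels), None
    for f in features:
        for t in sorted({r[f] for r in rows})[:-1]:  # thresholds leaving both sides non-empty
            left = [y for r, y in zip(rows, labels) if r[f] <= t]
            right = [y for r, y in zip(rows, labels) if r[f] > t]
            gain = base - (len(left) * entropy(left) + len(right) * entropy(right)) / len(labels)
            if best is None or gain > best[0]:
                best = (gain, f, t)
    return best


def build_tree(rows, labels, m, rng):
    if len(set(labels)) == 1:
        return labels[0]                              # pure leaf
    features = rng.sample(range(len(rows[0])), m)     # step 602: m of the M features
    split = best_split(rows, labels, features)
    if split is None or split[0] <= 0:
        return Counter(labels).most_common(1)[0][0]   # step 603: node cannot be split
    _, f, t = split
    left = [(r, y) for r, y in zip(rows, labels) if r[f] <= t]
    right = [(r, y) for r, y in zip(rows, labels) if r[f] > t]
    return (f, t,
            build_tree([r for r, _ in left], [y for _, y in left], m, rng),
            build_tree([r for r, _ in right], [y for _, y in right], m, rng))


def train_forest(rows, labels, n_trees=25, m=1, seed=0):
    rng, forest = random.Random(seed), []
    for _ in range(n_trees):                            # step 604: a batch of trees
        idx = [rng.randrange(len(rows)) for _ in rows]  # step 601: N draws with replacement
        forest.append(build_tree([rows[i] for i in idx], [labels[i] for i in idx], m, rng))
    return forest                                       # step 605: the forest is the model


def predict(forest, row):
    """Majority vote over all trees."""
    votes = []
    for node in forest:
        while isinstance(node, tuple):
            f, t, left, right = node
            node = left if row[f] <= t else right
        votes.append(node)
    return Counter(votes).most_common(1)[0][0]


rows = [(0.1, 5.0), (0.2, 5.1), (0.15, 4.9), (0.9, 1.0), (0.8, 1.2), (0.85, 0.9)]
labels = [0, 0, 0, 1, 1, 1]
forest = train_forest(rows, labels)
print(predict(forest, (0.05, 5.0)), predict(forest, (0.95, 0.5)))
```

Step 7 then amounts to building the step 2 feature vector for every user and calling `predict` on each one. For data at provincial scale, an established library implementation would be the practical choice.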
step 7: acquiring the electricity consumption data of all users and identifying elderly people living alone with the identification model.
Feature selection and vectorization are performed on the electricity consumption data of all users in the manner of step 2, and the result is input into the identification model trained in step 6 to obtain a list of elderly people living alone;
the electricity consumption data of all users refers to the daily electricity consumption of all low-voltage users in the whole province over the last two years.
A system for identifying elderly people living alone based on electricity consumption data, the system comprising:
an initial data acquisition module, used for randomly selecting a batch of low-voltage user electricity consumption data, cleaning the selected data, and removing abnormal and null electricity consumption values;
a feature selection and vectorization module, used for performing feature selection and vectorization on the data from the initial data acquisition module;
a cluster analysis module, used for performing cluster analysis on the vectorized data using a bottom-up hierarchical clustering algorithm;
a sample data screening module, used for combining the cluster analysis result with power marketing data to screen sample data of a group suspected to be elderly people living alone;
a training sample verification module, used for randomly verifying the sample data of the suspected group and determining the positive and negative samples for model training;
a model building module, used for constructing the identification model for elderly people living alone and training it with the positive and negative samples;
and an identification module, used for acquiring the electricity consumption data of all users and identifying elderly people living alone with the identification model.
The applicant has described and illustrated embodiments of the present invention in detail with reference to the accompanying drawings. Those skilled in the art should understand, however, that the above embodiments are merely preferred embodiments, and that the detailed description is intended only to help the reader better understand the spirit of the invention, not to limit its scope of protection. On the contrary, any improvement or modification made based on the spirit of the invention falls within its scope of protection.

Claims (10)

1.一种基于用电数据的独居老人识别方法,其特征在于:1. a method for identifying the elderly living alone based on electricity consumption data, is characterized in that: 所述方法包括以下步骤:The method includes the following steps: 步骤1:随机选取批量低压用户用电数据,对选取数据进行清洗,去除用电异常值和空值;Step 1: Randomly select the power consumption data of low-voltage users in batches, clean the selected data, and remove abnormal values and null values of power consumption; 步骤2:对步骤1数据进行特征选择并向量化;Step 2: Feature selection and quantification of the data in Step 1; 步骤3:利用聚类算法,对步骤2向量化的数据进行聚类分析;Step 3: Use the clustering algorithm to perform cluster analysis on the vectorized data in Step 2; 步骤4:步骤3的聚类分析结果结合电力营销数据,筛选疑似独居老人群体样本数据;Step 4: The cluster analysis results of Step 3 are combined with the power marketing data to screen the sample data of the group of elderly people who are suspected of living alone; 步骤5:对步骤4的疑似独居老人群体样本数据进行随机验证,确定模型训练用正样本和负样本;Step 5: Randomly verify the sample data of the suspected single-living elderly group in Step 4, and determine the positive and negative samples for model training; 步骤6:构建独居老人识别模型,将步骤5确定的正样本与负样本分别打上标签合并在一起作为模型的训练数据训练独居老人识别模型;Step 6: construct a recognition model for the elderly living alone, and label the positive samples and negative samples determined in step 5, respectively, and merge them together as the training data of the model to train the recognition model for the elderly living alone; 步骤7:获取全量用户用电数据,利用独居老人识别模型识别独居老人。Step 7: Obtain the electricity consumption data of the full amount of users, and use the identification model for the elderly living alone to identify the elderly living alone. 2.根据权利要求1所述的一种基于用电数据的独居老人识别方法,其特征在于:2. 
a kind of identification method for the elderly living alone based on electricity consumption data according to claim 1, is characterized in that: 步骤1所述低压用户为接入电压低于380V的用户;The low-voltage users described in step 1 are users whose access voltage is lower than 380V; 所使用的用电数据为随机选取的50万用户最近两年每日用电量;The electricity consumption data used is the daily electricity consumption of 500,000 randomly selected users in the last two years; 所述异常值包括日用电量为负值的或日用电量大于日用电量设定阈值。The abnormal value includes that the daily electricity consumption is a negative value or that the daily electricity consumption is greater than the set threshold value of the daily electricity consumption. 3.根据权利要求1所述的一种基于用电数据的独居老人识别方法,其特征在于:3. a kind of identification method for the elderly living alone based on electricity consumption data according to claim 1, is characterized in that: 步骤2中,选定用电统计特征并对各用户用电统计特征进行计算,将每一户的用电统计特征以向量形式存储。In step 2, the statistical characteristics of electricity consumption are selected and calculated for each user, and the statistical characteristics of electricity consumption of each user are stored in the form of a vector. 4.根据权利要求3所述的一种基于用电数据的独居老人识别方法,其特征在于:4. a kind of identification method for the elderly living alone based on electricity consumption data according to claim 3, is characterized in that: 所述用电统计特征包括夏季与冬季平均用电量比值、工作日与非工作日用电平均值比值、三日以上假期和非三日以上假期用电平均值和方差。The statistical characteristics of electricity consumption include the ratio of average electricity consumption in summer and winter, the ratio of average electricity consumption between working days and non-working days, and the average and variance of electricity consumption during holidays of more than three days and holidays of more than three days. 5.根据权利要求1所述的一种基于用电数据的独居老人识别方法,其特征在于:5. 
The method for identifying elderly people living alone based on electricity consumption data according to claim 1, characterized in that:
Step 3 comprises the following steps:
Step 301: treat each user feature vector as its own class, and calculate the minimum Euclidean distance between every pair of classes;
Step 302: merge the two classes with the smallest Euclidean distance into a new class;
Step 303: recalculate the distances between the new class and all other classes;
Step 304: repeat Step 302 and Step 303 until all classes have been merged into one class.
6. The method for identifying elderly people living alone based on electricity consumption data according to claim 1, characterized in that:
the electricity marketing data in Step 4 comprise basic user records and payment channels.
7. The method for identifying elderly people living alone based on electricity consumption data according to claim 1, characterized in that:
in Step 5, samples are drawn at random from the sample data of the group suspected to be elderly people living alone and are verified on site; if the accuracy reaches the set threshold, the sample data of the suspected group from Step 4 are taken as negative samples, and the data group in the Step 3 clustering result with the greatest Euclidean distance from the negative samples is taken as positive samples; if the accuracy does not reach the set threshold, the method returns to Step 2 and feature selection and vectorization are performed again.
8. The method for identifying elderly people living alone based on electricity consumption data according to claim 1, characterized in that:
in Step 6, a random forest model is used to construct the identification model for elderly people living alone; assuming there are N samples, each with M features, the training steps are as follows:
Step 601: draw N samples with replacement from the sample set, one at a time, and use these N randomly selected samples to train a decision tree, as the samples at the root node of the decision tree;
Step 602: whenever a node of the decision tree of Step 601 is split, randomly select m features from the M features, ensuring m << M, and then use an information-gain strategy to choose one of the m features as the splitting feature of that node;
Step 603: repeat Step 602 until the nodes of the decision tree can no longer be split;
Step 604: build a batch of decision trees by following Step 601, Step 602, and Step 603 in order;
Step 605: combine the decision trees formed in Step 604 into a random forest, which serves as the identification model for elderly people living alone.
9. The method for identifying elderly people living alone based on electricity consumption data according to claim 1, characterized in that:
in Step 7, feature selection and vectorization are performed on the full set of user electricity consumption data in the manner of Step 2, and the result is input into the identification model trained in Step 6 to obtain a list of elderly people living alone;
the full set of user electricity consumption data refers to the daily electricity consumption of all low-voltage users in the province over the most recent two years.
10. A system for identifying elderly people living alone based on electricity consumption data, implementing the method for identifying elderly people living alone based on electricity consumption data according to any one of claims 1-9, characterized in that:
the system comprises:
an initial data acquisition module, configured to randomly select a batch of low-voltage user electricity consumption data, clean the selected data, and remove abnormal consumption values and null values;
a feature selection and vectorization module, configured to perform feature selection on, and vectorize, the data from the initial data acquisition module;
a cluster analysis module, configured to perform cluster analysis on the vectorized data using a bottom-up hierarchical clustering algorithm;
a sample data screening module, configured to screen sample data of the group suspected to be elderly people living alone, using the cluster analysis results combined with the electricity marketing data;
a training sample verification module, configured to randomly verify the sample data of the suspected group and determine the positive and negative samples for model training;
a model construction module, configured to construct the identification model for elderly people living alone and train it with the positive and negative samples;
an identification module, configured to obtain the full set of user electricity consumption data and identify elderly people living alone using the identification model.
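For illustration only, the bottom-up clustering of Steps 301 to 304 in claim 5 can be sketched in Python as follows. This is a minimal sketch and not the patented implementation. It assumes that the distance between two classes is the smallest Euclidean distance between any pair of their members (single linkage), which is one reading of "minimum Euclidean distance" in Step 301. The function names, the `n_clusters` stopping parameter, and the toy feature vectors are hypothetical and do not appear in the claims.

```python
import math
from itertools import combinations


def euclidean(a, b):
    """Euclidean distance between two feature vectors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def class_distance(vectors, c1, c2):
    """Assumed single linkage: smallest member-to-member distance."""
    return min(euclidean(vectors[i], vectors[j]) for i in c1 for j in c2)


def agglomerate(vectors, n_clusters=1):
    """Merge classes bottom-up until only n_clusters remain.

    n_clusters=1 follows Step 304 (merge until one class is left); a larger
    value is a hypothetical convenience for inspecting intermediate groups.
    Returns (clusters, merges): clusters holds lists of vector indices, and
    merges records (class_a, class_b, distance) for every merge performed.
    """
    # Step 301: every user feature vector starts as its own class.
    clusters = [[i] for i in range(len(vectors))]
    merges = []
    while len(clusters) > n_clusters:
        # Step 302: find the two classes with the smallest distance.
        a, b = min(
            combinations(range(len(clusters)), 2),
            key=lambda p: class_distance(vectors, clusters[p[0]], clusters[p[1]]),
        )
        dist = class_distance(vectors, clusters[a], clusters[b])
        merges.append((clusters[a], clusters[b], dist))
        merged = sorted(clusters[a] + clusters[b])
        # Step 303: distances to the new class are recomputed on the next
        # pass. This naive version recomputes every pair for simplicity.
        clusters = [c for k, c in enumerate(clusters) if k not in (a, b)]
        clusters.append(merged)
    return clusters, merges


if __name__ == "__main__":
    # Hypothetical two-dimensional feature vectors, for demonstration only.
    demo = [(0, 0), (0, 1), (10, 10), (10, 11), (5, 50)]
    groups, history = agglomerate(demo, n_clusters=2)
    print(groups)
    print(len(history), "merges performed")
```

Recording the merge history lets the grouping at any intermediate level be inspected afterwards, which is useful when one data group has to be singled out from the clustering result, as in Step 4 and Step 5. The naive pairwise search is only practical for small batches; a production system would more likely rely on an optimized library routine.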
CN202110192708.1A 2021-02-20 2021-02-20 A method and system for identifying elderly people living alone based on electricity consumption data Active CN112906790B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110192708.1A CN112906790B (en) 2021-02-20 2021-02-20 A method and system for identifying elderly people living alone based on electricity consumption data

Publications (2)

Publication Number Publication Date
CN112906790A CN112906790A (en) 2021-06-04
CN112906790B CN112906790B (en) 2023-08-18

Family

ID=76124115

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110192708.1A Active CN112906790B (en) 2021-02-20 2021-02-20 A method and system for identifying elderly people living alone based on electricity consumption data

Country Status (1)

Country Link
CN (1) CN112906790B (en)

Citations (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20120134532A1 (en) * 2010-06-08 2012-05-31 Gorilla Technology Inc. Abnormal behavior detection system and method using automatic classification of multiple features
US20130346346A1 (en) * 2012-06-21 2013-12-26 Microsoft Corporation Semi-supervised random decision forests for machine learning
WO2015133635A1 (en) * 2014-03-07 2015-09-11 株式会社日立製作所 Data analysis system and method
CN105184084A (en) * 2015-09-14 2015-12-23 深圳供电局有限公司 Method and system for predicting fault type of electric power metering automation terminal
CN109753989A (en) * 2018-11-18 2019-05-14 韩霞 Analysis method of electricity stealing behavior of power users based on big data and machine learning
CN110634080A (en) * 2018-06-25 2019-12-31 中兴通讯股份有限公司 Abnormal electricity utilization detection method, device, equipment and computer readable storage medium
KR102060301B1 (en) * 2019-04-19 2020-02-11 한국전력공사 Apparatus and method for analysing living pattern based on power data
CN110826641A (en) * 2019-11-13 2020-02-21 上海积成能源科技有限公司 System and method for classifying electricity consumption condition of residents based on cluster analysis
CN111126820A (en) * 2019-12-17 2020-05-08 国网山东省电力公司电力科学研究院 Anti-stealing method and system
CN111861781A (en) * 2020-02-29 2020-10-30 上海电力大学 A method and system for feature selection in clustering of residential electricity consumption behavior
WO2020238631A1 (en) * 2019-05-31 2020-12-03 南京瑞栖智能交通技术产业研究院有限公司 Population type recognition method based on mobile phone signaling data
CN112200209A (en) * 2020-06-28 2021-01-08 国网浙江省电力有限公司金华供电公司 Poor user identification method based on day-to-day power consumption

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
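卢子萌; 陈佳怡; 李?; 谢岳; 蒋欣利; 韩蕾; 郭倩: "基于加权随机森林算法的空巢电力用户识别方法" [Method for identifying empty-nest electricity users based on a weighted random forest algorithm], 电信科学 [Telecommunications Science], no. 08 *
庄绪强: "基于云计算技术的用户用电智能分析技术研究" [Research on intelligent analysis of user electricity consumption based on cloud computing technology], 自动化与仪器仪表 [Automation & Instrumentation], no. 02 *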

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113298175A (en) * 2021-06-10 2021-08-24 国网江苏省电力有限公司营销服务中心 Method and system for monitoring power consumption of old people living alone based on multiple scenes and multivariate data
CN113298175B (en) * 2021-06-10 2023-09-12 国网江苏省电力有限公司营销服务中心 Method and system for monitoring electricity consumption of elderly people living alone based on multiple scenarios and multiple data
CN113447711A (en) * 2021-08-02 2021-09-28 宁夏隆基宁光仪表股份有限公司 Intelligent ammeter with old people electricity utilization abnormity warning function and warning method thereof
CN113780452A (en) * 2021-09-16 2021-12-10 国网北京市电力公司 Monitoring method and monitoring device for solitary group and electronic equipment
CN114970672A (en) * 2022-04-13 2022-08-30 广东以诺通讯有限公司 Safe household system and use method thereof

Also Published As

Publication number Publication date
CN112906790B (en) 2023-08-18

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant