CN111680820B - Distributed photovoltaic power station fault diagnosis method and device - Google Patents
- Publication number
- CN111680820B (application CN202010381231.7A)
- Authority
- CN
- China
- Prior art keywords
- photovoltaic power
- power station
- theoretical
- power generation
- fault diagnosis
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q10/00—Administration; Management
- G06Q10/04—Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem"
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
- G06N3/084—Backpropagation, e.g. using gradient descent
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q10/00—Administration; Management
- G06Q10/06—Resources, workflows, human or project management; Enterprise or organisation planning; Enterprise or organisation modelling
- G06Q10/067—Enterprise or organisation modelling
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q50/00—Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
- G06Q50/06—Energy or water supply
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y04—INFORMATION OR COMMUNICATION TECHNOLOGIES HAVING AN IMPACT ON OTHER TECHNOLOGY AREAS
- Y04S—SYSTEMS INTEGRATING TECHNOLOGIES RELATED TO POWER NETWORK OPERATION, COMMUNICATION OR INFORMATION TECHNOLOGIES FOR IMPROVING THE ELECTRICAL POWER GENERATION, TRANSMISSION, DISTRIBUTION, MANAGEMENT OR USAGE, i.e. SMART GRIDS
- Y04S10/00—Systems supporting electrical power generation, transmission or distribution
- Y04S10/50—Systems or methods supporting the power network operation or management, involving a certain degree of interaction with the load-side end user applications
Abstract
The application relates to a distributed photovoltaic power station fault diagnosis method. Historical operating data of each photovoltaic power station are first collected, a time series of the output data of each station is established, and the historical fault diagnosis records of each station are collected. A BP neural network is trained on the collected historical operating data to obtain a prediction model of theoretical power generation, and a distributed photovoltaic fault diagnosis model is established using a decision tree method. The output data of each photovoltaic power station are then monitored and input into the prediction model to obtain the theoretical power generation; correlation coefficients are calculated from the output data and the theoretical power generation and are input into the distributed photovoltaic fault diagnosis model to judge the state of the power station.
Description
Technical Field
The application belongs to the technical field of photovoltaic power generation fault diagnosis, and particularly relates to a distributed photovoltaic power station fault diagnosis method and device based on a BP neural network and a decision tree.
Background
Photovoltaic power is inexpensive, is not restricted by geographical location, can meet the energy demand of off-grid systems, and has a broad market. According to the international market research institute Technavio, the renewable energy generation (RDEG) technology market will grow by 295.15 GW during 2019-. In recent years, photovoltaic power generation in China has developed remarkably quickly: 44.06 GW of new photovoltaic capacity was installed nationwide in 2018, bringing the total national installed photovoltaic capacity to 174.63 GW. With the rapid increase of the installed photovoltaic capacity, the intelligent operation function of the photovoltaic power generation system). The implementation of the above functions depends on the quality and reliability of the data.
Unlike large grid-connected photovoltaic power stations, current distributed photovoltaic power stations often lack on-site meteorological data in the data collected during operation. Without meteorological information, many earlier photovoltaic power station analysis methods fail, which restricts their use in practical engineering applications. The photovoltaic power station therefore needs to be analyzed and evaluated from a new perspective.
In addition, the actual output of photovoltaic systems is complex and differs considerably from theoretical data. To analyze power station data, indexes that can effectively describe the operating state of a photovoltaic power station are needed, and a method for diagnosing weak points on the direct current side of a photovoltaic power station based on temporal and spatial functions is provided.
Disclosure of Invention
The technical problem to be solved by the invention is as follows: to overcome the defects of the prior art, a distributed photovoltaic power station fault diagnosis method and device are provided.
The technical scheme adopted by the invention for solving the technical problem is as follows:
A distributed photovoltaic power station fault diagnosis method comprises the following steps:
S1: collecting historical operating data of each photovoltaic power station, establishing a time series of the output data of each photovoltaic power station, and collecting historical fault diagnosis records of each photovoltaic power station;
S2: training a BP neural network on the collected historical operating data of each photovoltaic power station to obtain a prediction model of theoretical power generation, wherein the prediction model can calculate the theoretical power generation of a specific photovoltaic power station from the operating data of the other photovoltaic power stations, and establishing a time series of the theoretical power generation of each photovoltaic power station;
S3: intercepting the time series of the output data of each photovoltaic power station and the time series of the theoretical power generation over a certain time period at the same occurrence time, labeling the fault type of each intercepted segment at the corresponding occurrence time according to the historical fault diagnosis records, and calculating the correlation coefficients between the intercepted segment of the output data time series and the intercepted segment of the theoretical power generation time series;
S4: taking the correlation coefficients as the input vector and the labeled fault type as the output vector, using the input and output vectors as training data, and establishing a distributed photovoltaic fault diagnosis model using a decision tree method;
S5: monitoring the output data of each photovoltaic power station, inputting the output data into the theoretical power generation prediction model to obtain the theoretical power generation, calculating the correlation coefficients between the output data and the theoretical power generation, and inputting the correlation coefficients into the distributed photovoltaic fault diagnosis model to judge the state of the power station.
Preferably, in the distributed photovoltaic power station fault diagnosis method of the present invention, in step S1, a similarity analysis is further performed on the time series of the output data of the photovoltaic power stations using the Pearson correlation coefficient, so as to obtain, for each photovoltaic power station, several photovoltaic power stations with high similarity to it;
in step S2, the theoretical power generation of a specific photovoltaic power station is calculated from the operating data of the selected photovoltaic power stations with high similarity.
Preferably, according to the fault diagnosis method for the distributed photovoltaic power station, the correlation coefficients are relative Euclidean distance and Pearson correlation coefficient.
Preferably, in the distributed photovoltaic power station fault diagnosis method, the relative Euclidean distance is denoted $\delta(X_{Tar}, X_{Ref})$, where $X_{Tar}$ is an intercepted segment of the time series of the output data and $X_{Ref}$ is an intercepted segment of the time series of theoretical power generation;
the Pearson correlation coefficient is calculated by the formula
$$r = \frac{\sum_{i}\left(X_{Tar,i}-\bar{X}_{Tar}\right)\left(X_{Ref,i}-\bar{X}_{Ref}\right)}{\sqrt{\sum_{i}\left(X_{Tar,i}-\bar{X}_{Tar}\right)^{2}}\,\sqrt{\sum_{i}\left(X_{Ref,i}-\bar{X}_{Ref}\right)^{2}}}$$
where $X_{Tar}$ is an intercepted segment of the time series of the output data, $X_{Ref}$ is an intercepted segment of the time series of theoretical power generation, the sums run over the samples $i$ of the segment, $r$ is the Pearson correlation coefficient, and $\bar{X}_{Tar}$ and $\bar{X}_{Ref}$ are the mean values of the intercepted segment of the output data time series and of the intercepted segment of the theoretical power generation time series, respectively.
Preferably, in the distributed photovoltaic power plant fault diagnosis method, the time period intercepted in the step S3 is 2-5 h.
A distributed photovoltaic power station fault diagnosis device comprises:
a data acquisition module: collecting historical operating data of each photovoltaic power station, establishing a time sequence of output data of each photovoltaic power station, and collecting historical fault diagnosis records of each photovoltaic power station;
a prediction model of theoretical power generation: training a BP neural network according to the collected historical operating data of each photovoltaic power station to obtain a prediction model of theoretical power generation, wherein the prediction model of the theoretical power generation can calculate the theoretical power generation of a specific photovoltaic power station through the operating data of other photovoltaic power stations;
a theoretical power generation calculation module: used for collecting the operating data of each photovoltaic power station, calculating the theoretical power generation of each photovoltaic power station from the collected operating data, and forming a time series of theoretical power generation from the calculated values;
a correlation calculation module: used for intercepting the time series of the output data of each photovoltaic power station and the time series of the theoretical power generation over a certain time period at the same occurrence time, labeling the fault type of each intercepted segment at the corresponding occurrence time according to the historical fault diagnosis records, and calculating the correlation coefficients between the intercepted segment of the output data time series and the intercepted segment of the theoretical power generation time series;
a fault diagnosis model: obtained by taking the correlation coefficients from the correlation calculation module as the input vector and the labeled fault type as the output vector, using the input and output vectors as training data, and training with a decision tree method;
a fault diagnosis module: calculating, through the correlation calculation module, the correlation coefficients from the monitored output data of each photovoltaic power station, and inputting the correlation coefficients as an input vector into the distributed photovoltaic fault diagnosis model to judge the state of the power station.
Preferably, in the distributed photovoltaic power station fault diagnosis device of the invention, the data acquisition module further performs a similarity analysis on the time series of the output data of the photovoltaic power stations using the Pearson correlation coefficient, so as to obtain, for each photovoltaic power station, several photovoltaic power stations with high similarity to it;
in the theoretical power generation prediction model, the theoretical power generation of a specific photovoltaic power station is calculated from the operating data of the selected photovoltaic power stations with high similarity.
Preferably, in the distributed photovoltaic power station fault diagnosis device, the correlation coefficients are the relative Euclidean distance and the Pearson correlation coefficient.
Preferably, in the distributed photovoltaic power station fault diagnosis device, the relative Euclidean distance is denoted $\delta(X_{Tar}, X_{Ref})$, where $X_{Tar}$ is an intercepted segment of the time series of the output data and $X_{Ref}$ is an intercepted segment of the time series of theoretical power generation;
the Pearson correlation coefficient is calculated by the formula
$$r = \frac{\sum_{i}\left(X_{Tar,i}-\bar{X}_{Tar}\right)\left(X_{Ref,i}-\bar{X}_{Ref}\right)}{\sqrt{\sum_{i}\left(X_{Tar,i}-\bar{X}_{Tar}\right)^{2}}\,\sqrt{\sum_{i}\left(X_{Ref,i}-\bar{X}_{Ref}\right)^{2}}}$$
where $X_{Tar}$ is an intercepted segment of the time series of the output data, $X_{Ref}$ is an intercepted segment of the time series of theoretical power generation, the sums run over the samples $i$ of the segment, $r$ is the Pearson correlation coefficient, and $\bar{X}_{Tar}$ and $\bar{X}_{Ref}$ are the mean values of the intercepted segment of the output data time series and of the intercepted segment of the theoretical power generation time series, respectively.
Preferably, in the distributed photovoltaic power station fault diagnosis device, the time period intercepted by the correlation calculation module is 2-5 h.
The beneficial effects of the invention are:
Drawings
The technical solution of the present application is further explained below with reference to the drawings and the embodiments.
FIG. 1 is a flow chart of a method of fault diagnosis for a photovoltaic power plant;
FIG. 2 is a schematic diagram of a BP neural network;
FIG. 3 is a graph of similarity of time series under different faults;
FIG. 4 is a distance graph of time series under different faults;
FIG. 5 is a schematic diagram of the operation of a decision tree.
Detailed Description
It should be noted that, in the present application, the embodiments and features of the embodiments may be combined with each other without conflict.
The technical solutions of the present application will be described in detail below with reference to the accompanying drawings in combination with embodiments.
Example 1
This embodiment provides a photovoltaic fault diagnosis method based on a BP neural network. As shown in FIG. 1, the specific steps are as follows:
S1: collecting historical operating data of each photovoltaic power station, establishing a time series of the output data of each photovoltaic power station, and collecting historical fault diagnosis records of each photovoltaic power station;
The historical operating data of the photovoltaic power stations may also be preprocessed to remove large blank portions or obviously erroneous portions.
Take as an example the full-year 2018 data of 10 photovoltaic power stations in Xin county, China. Each of the ten photovoltaic power stations consists of a photovoltaic power generation system with a capacity of 40 MW and a photovoltaic power station monitoring system; the data sampling interval is 10 minutes, and the output data of the 10 power stations are arranged in time order to form the time series of output data.
S2: training a BP neural network on the collected historical operating data of each photovoltaic power station to obtain a prediction model of theoretical power generation, wherein the prediction model can calculate the theoretical power generation of a specific photovoltaic power station from the operating data of the other photovoltaic power stations, and establishing a time series of the theoretical power generation of each photovoltaic power station;
The function of the theoretical power generation prediction model is to calculate the theoretical power generation of each photovoltaic power station. When the theoretical power generation of a specific power station (such as power station 1) is calculated, the calculation uses the output data of the other photovoltaic power stations (power stations 2-9), excluding the specific power station itself.
The BP neural network is a multilayer feedforward neural network whose main characteristics are forward propagation of the input signal and backward propagation of the error. In forward propagation, the input signal is processed layer by layer from the input layer through the hidden layer to the output layer. The state of the neurons in each layer affects only the state of the neurons in the next layer. If the output layer does not produce the desired output, the network switches to back-propagation and adjusts the network weights and thresholds according to the prediction error, so that the predicted output of the BP neural network continually approaches the desired output.
$X_1, X_2, \dots, X_n$ are the input variables of the BP neural network (here, the output data of the other photovoltaic power stations, excluding the specific power station) and form the input series $X$; $Y_1, Y_2, \dots, Y_m$ are the output variables of the BP neural network (i.e., the theoretical power generation of the specific photovoltaic power station) and form the output series $Y$; $\omega_{ij}$ and $\omega_{jk}$ are the weights of the BP neural network. The BP neural network can be regarded as a nonlinear function: with n input variables and m output variables, it expresses a nonlinear mapping from n independent variables to m dependent variables.
Before the BP neural network can be used for prediction, the network must be trained. The trained neural network has the capability of prediction and judgment. Generally, the training process of the BP neural network is as follows.
Step 2: hidden layer output calculation. According to the input sequence $X$, the connection weights $\omega_{ij}$ between the input layer and the hidden layer, and the hidden layer threshold $a$, calculate the hidden layer output $G$.
In the formula, $l$ is the number of hidden layer nodes; $f$ is the hidden layer excitation function, which has various expression forms; the function selected here is:
Step 3: output layer output calculation. According to the hidden layer output $G$, the connection weights $\omega_{jk}$, and the threshold $b$, calculate the prediction output $P$ of the BP neural network.
Step 4: error calculation. Calculate the network prediction error $e$ from the network prediction output $P$ and the expected output sequence $Y$.
$e_k = Y_k - P_k$ (4)
Step 5: weight update. Update the network connection weights $\omega_{ij}$ and $\omega_{jk}$ according to the network prediction error $e$ calculated in the previous step.
$\omega_{jk} = \omega_{jk} + \eta G_j e_k$ (6)
In the formula, $\eta$ is the learning rate.
Step 6: threshold update. Update the network node thresholds $a$ and $b$ according to the calculated network prediction error $e$.
$b_k = b_k + e_k$ (8)
Step 7: judge whether the algorithm iteration has finished; if not, return to Step 2.
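The training loop above can be illustrated with a small sketch. This is not the patent's implementation: it assumes a single hidden layer with a sigmoid excitation function, a linear output node, thresholds added as bias terms, a learning rate applied to every update, and inputs scaled to [0, 1]. The names `train_bp`, `forward`, and `mse`, and the synthetic data, are hypothetical.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def forward(model, x):
    """Return the hidden outputs G and the scalar prediction P for one input."""
    w_ij, w_jk, a, b = model
    g = [sigmoid(sum(x[i] * w_ij[i][j] for i in range(len(x))) + a[j])
         for j in range(len(a))]
    p = sum(g[j] * w_jk[j] for j in range(len(a))) + b
    return g, p


def train_bp(samples, n_hidden=4, eta=0.1, epochs=300, seed=0):
    """Train on (inputs, target) pairs by per-sample gradient descent."""
    rng = random.Random(seed)
    n_in = len(samples[0][0])
    w_ij = [[rng.uniform(-0.5, 0.5) for _ in range(n_hidden)] for _ in range(n_in)]
    w_jk = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)]
    a = [0.0] * n_hidden  # hidden-layer thresholds (used as additive biases here)
    b = 0.0               # output threshold (used as an additive bias here)
    for _ in range(epochs):
        for x, y in samples:
            g, p = forward((w_ij, w_jk, a, b), x)   # Steps 2 and 3
            e = y - p                                # Step 4: prediction error
            for j in range(n_hidden):                # Steps 5 and 6
                delta = g[j] * (1.0 - g[j]) * w_jk[j] * e  # uses the old w_jk
                w_jk[j] += eta * g[j] * e
                a[j] += eta * delta
                for i in range(n_in):
                    w_ij[i][j] += eta * delta * x[i]
            b += eta * e
    return w_ij, w_jk, a, b


def mse(model, samples):
    return sum((y - forward(model, x)[1]) ** 2 for x, y in samples) / len(samples)


# Synthetic stand-in data: three reference stations predict one target station.
rng = random.Random(1)
data = []
for _ in range(40):
    x = [rng.random() for _ in range(3)]
    data.append((x, 0.5 * x[0] + 0.3 * x[1] + 0.2 * x[2]))

untrained = train_bp(data, epochs=0)
trained = train_bp(data, epochs=300)
print(mse(untrained, data), mse(trained, data))
```

Comparing the mean squared error before and after training gives a quick check that the updates move the prediction toward the expected output; on real station data, a held-out period would be a better test.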
Preferably, only the output data of power stations with high correlation are selected for training, which reduces the amount of calculation and speeds up the training process.
Power station 1 is selected as the target power station, n (n = 2, 3, …, 9) power stations are randomly selected from the remaining power stations as reference power stations for fitting, and the fitting error is calculated. Power stations 2-9 are then selected in turn as the target and the process is repeated. Finally, the errors are averaged for each number n of fitting power stations. Generally, 3 power stations are selected, or the number of input sources is determined according to the calculation result.
For example, from the 2018 historical operating data of 10 photovoltaic power stations in seine county, China, the Pearson correlation coefficients shown in Table 1 below are obtained. Accordingly, power stations 1, 4, 3, and 7 can be grouped for calculation: when the theoretical power generation of power station 1 is calculated, the data of power stations 4, 3, and 7 can be used.
TABLE 1 plant similarity analysis
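As an illustration of the similarity-based selection described above, the sketch below ranks candidate stations by their Pearson correlation with a target station and keeps the top k. The station names and output values are made up for the example, and `most_similar` is a hypothetical helper, not part of the patent.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length series (0.0 if either is constant)."""
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        return 0.0
    return cov / (sx * sy)


def most_similar(outputs, target, k=3):
    """Return the k station names whose output correlates best with the target."""
    scores = {name: pearson(outputs[target], series)
              for name, series in outputs.items() if name != target}
    return sorted(scores, key=scores.get, reverse=True)[:k]


# Made-up daily output profiles (arbitrary units), one value per time step.
outputs = {
    "station_1": [0.0, 2.0, 5.0, 8.0, 5.0, 2.0, 0.0],
    "station_3": [0.0, 2.1, 4.8, 7.9, 5.2, 1.9, 0.0],
    "station_4": [0.0, 1.9, 5.1, 8.2, 4.9, 2.1, 0.0],
    "station_7": [0.0, 2.2, 5.0, 7.5, 5.0, 2.0, 0.0],
    "station_9": [0.0, 5.0, 1.0, 3.0, 8.0, 0.0, 2.0],
}
references = most_similar(outputs, "station_1", k=3)
print(references)
```

The selected reference stations would then supply the inputs of the prediction model for the target station.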
S3: intercepting the time sequence of the output data of each photovoltaic power station and the time sequence of the theoretical generated energy according to the same occurrence time in a certain time period (the intercepted time period is 2-5h, for example, 3h), calibrating the fault type of each intercepted section corresponding to the occurrence time according to historical fault diagnosis records, and respectively calculating the correlation coefficient of the intercepted section of the time sequence of the output data and the intercepted section of the time sequence of the theoretical generated energy;
The correlation coefficients are the relative Euclidean distance and the Pearson correlation coefficient.
The relative Euclidean distance is denoted $\delta(X_{Tar}, X_{Ref})$, where $X_{Tar}$ is an intercepted segment of the time series of the output data and $X_{Ref}$ is an intercepted segment of the time series of theoretical power generation.
The Pearson correlation coefficient is calculated by the formula
$$r = \frac{\sum_{i}\left(X_{Tar,i}-\bar{X}_{Tar}\right)\left(X_{Ref,i}-\bar{X}_{Ref}\right)}{\sqrt{\sum_{i}\left(X_{Tar,i}-\bar{X}_{Tar}\right)^{2}}\,\sqrt{\sum_{i}\left(X_{Ref,i}-\bar{X}_{Ref}\right)^{2}}}$$
where $X_{Tar}$ is an intercepted segment of the time series of the output data, $X_{Ref}$ is an intercepted segment of the time series of theoretical power generation, the sums run over the samples $i$ of the segment, $r$ is the Pearson correlation coefficient, and $\bar{X}_{Tar}$ and $\bar{X}_{Ref}$ are the mean values of the intercepted segment of the output data time series and of the intercepted segment of the theoretical power generation time series, respectively.
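A minimal sketch of the two features for one intercepted segment follows. The text does not preserve the formula for the relative Euclidean distance, so the code assumes one common definition, the Euclidean norm of the difference divided by the norm of the reference segment; that normalisation is an assumption, as are the function names. A 3 h segment at the 10-minute sampling interval corresponds to 18 samples.

```python
import math


def relative_euclidean_distance(x_tar, x_ref):
    """ASSUMED definition: ||x_tar - x_ref|| / ||x_ref|| (not taken from the source)."""
    diff = math.sqrt(sum((t - r) ** 2 for t, r in zip(x_tar, x_ref)))
    norm = math.sqrt(sum(r ** 2 for r in x_ref))
    return diff / norm if norm > 0 else 0.0


def pearson_r(x_tar, x_ref):
    """Pearson correlation coefficient of the two segments."""
    mt = sum(x_tar) / len(x_tar)
    mr = sum(x_ref) / len(x_ref)
    cov = sum((t - mt) * (r - mr) for t, r in zip(x_tar, x_ref))
    st = math.sqrt(sum((t - mt) ** 2 for t in x_tar))
    sr = math.sqrt(sum((r - mr) ** 2 for r in x_ref))
    return cov / (st * sr) if st > 0 and sr > 0 else 0.0


def segment_features(actual, theoretical, start, length=18):
    """Cut the same window from both series and return (distance, r)."""
    seg_tar = actual[start:start + length]
    seg_ref = theoretical[start:start + length]
    return relative_euclidean_distance(seg_tar, seg_ref), pearson_r(seg_tar, seg_ref)


# Illustrative data: the station delivers exactly half of its theoretical output.
theoretical = [float(v) for v in range(1, 19)]
actual = [0.5 * v for v in theoretical]
distance, r = segment_features(actual, theoretical, start=0)
print(distance, r)
```

In this example the shape of the curve is preserved while its magnitude is halved, so the correlation stays near 1 while the distance is 0.5. The two features therefore carry complementary information.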
S4: taking the correlation coefficient as an input vector, marking the fault type as an output vector, taking the input vector and the output vector as training data, and establishing a distributed photovoltaic fault diagnosis model by using a decision tree method;
Table 2 Output vectors of fault types
| Fault type | Output vector |
| Normal | $\eta_1$ = [1 0 0 0] |
| Abnormal aging | $\eta_2$ = [0 1 0 0] |
| Shading | $\eta_3$ = [0 0 1 0] |
| Open circuit fault | $\eta_4$ = [0 0 0 1] |
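The output vectors in Table 2 are a one-hot encoding of the fault type. A short sketch of encoding and decoding is given below; the label strings and helper names are illustrative.

```python
# Order follows Table 2; the label strings are illustrative.
FAULT_TYPES = ["normal", "abnormal aging", "shading", "open circuit"]


def one_hot(fault_type):
    """Return the output vector for a fault type, e.g. [0, 0, 1, 0]."""
    vector = [0] * len(FAULT_TYPES)
    vector[FAULT_TYPES.index(fault_type)] = 1
    return vector


def decode(vector):
    """Map an output vector back to its fault type (position of the largest entry)."""
    return FAULT_TYPES[vector.index(max(vector))]


print(one_hot("shading"), decode([0, 0, 0, 1]))
```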
A decision tree is a tree structure with sample attributes as nodes and the values of the attributes as branches. The basic principle of decision tree construction is to recursively split the training data set into subsets such that each subset contains states in which the target variable is similar; the target is a predictable attribute. During splitting, the splitting attribute is selected using principles from information theory. Let the set of fault features of a piece of equipment be d = {d1, d2, …, dn} and the set of fault points be e = {e1, e2, …, em}; d is the set of test attributes (the relative Euclidean distance and the Pearson correlation coefficient in this embodiment), and e is the set of class labels (the output vectors representing fault types in this embodiment). The root node of the resulting fault feature decision tree is the training set; each internal node represents a test on a fault feature, each edge represents a test result, and each leaf represents a fault point or a fault handling mode. The attribute values at any internal node are discrete: for each fault, a fault feature takes the value 1 (the feature is present) or 0 (the feature is absent).
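Attribute selection by information gain, as used in ID3-style tree building, can be sketched as follows. The binary features `large_distance` and `low_pearson` are hypothetical discretisations of the two correlation features, and the four records are invented for the example.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((c / total) * math.log2(c / total) for c in Counter(labels).values())


def information_gain(rows, labels, attr):
    """Entropy reduction obtained by splitting the records on one attribute."""
    total = len(labels)
    remainder = 0.0
    for value in set(row[attr] for row in rows):
        subset = [lab for row, lab in zip(rows, labels) if row[attr] == value]
        remainder += len(subset) / total * entropy(subset)
    return entropy(labels) - remainder


def best_attribute(rows, labels, attrs):
    """Pick the attribute with the largest information gain."""
    return max(attrs, key=lambda a: information_gain(rows, labels, a))


# Invented records: 1 = feature present, 0 = feature absent.
rows = [
    {"large_distance": 0, "low_pearson": 0},
    {"large_distance": 0, "low_pearson": 1},
    {"large_distance": 1, "low_pearson": 0},
    {"large_distance": 1, "low_pearson": 1},
]
labels = ["normal", "normal", "open circuit", "open circuit"]
chosen = best_attribute(rows, labels, ["large_distance", "low_pearson"])
print(chosen, information_gain(rows, labels, chosen))
```

Here `large_distance` separates the two classes completely, so it yields the full one bit of gain and is chosen as the splitting attribute, while `low_pearson` yields none.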
Historical fault diagnosis records are taken as training samples. Using the established model, 10 different types of faults are generated at different fault times and different fault positions, and the corresponding fault features are extracted to form training samples; each of the 10 fault types has 100 valid training samples, for a total of 1000. The ID3 algorithm is adopted, with the observable fault features as the test (splitting) attributes and the fault points as class labels; the records are divided accordingly, and pre-pruning is used to control the growth of the tree and form the decision tree.
1) Tree growth may be stopped when the number of instances reaching a node is less than a certain threshold.
2) A substitution error rate is introduced. When a set continues to be divided on a branch of the subtree during calculation, and the samples do not all belong to the same class but the numbers of records in the different classes differ greatly, an error substitution rate formula is introduced:
In the formula: n represents the number of records in the branch; n' represents the number of records of the majority class in the branch; m represents the total number of records in the training set. If the value calculated by the formula is less than a certain threshold, the subtree is converted into a leaf node; otherwise, the procedure returns to step 1 for further decomposition. The algorithm recurses over the above operations until no fault feature attribute remains available to partition the current sample subset or the condition for stopping tree growth is satisfied.
The data used in this example are derived from equipment fault records and are used to build a decision tree that finds the association between fault features and fault points. Step 1: 320 records are extracted; 70% are taken as training tuples and 30% as test data. The test attribute set (fault features) and the fault point set are compiled. Step 2: the data are preprocessed as above, and the fault points in these 224 records are counted. Step 3: the information gain of each fault feature is calculated one by one, and the attribute that best partitions the training set is selected. As can be seen from Table 3, the accuracy of the method in diagnosing the various faults exceeds 97%, so the fault diagnosis method has high accuracy in actual fault diagnosis of photovoltaic power stations and has practical application value.
TABLE 3 Fault diagnosis accuracy statistics
S5: The output data of each monitored photovoltaic power station are input into the theoretical power generation prediction model to obtain the theoretical power generation; the correlation coefficients between the output data and the theoretical power generation are calculated and input into the distributed photovoltaic fault diagnosis model to judge the state of the power station.
The theoretical power generation is calculated by inputting the monitored output data of each photovoltaic power station into the theoretical power generation prediction model. If the prediction model was trained using only a subset of photovoltaic power stations with high similarity, the same photovoltaic power stations are used for the calculation at this point.
During judgment, the time series of the output data of the photovoltaic power station and the time series of the calculated theoretical power generation must likewise be intercepted using the same time period length as in step S3, namely 2-5 h.
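The interception of the two time series into segments of equal length can be sketched as below. This is a minimal sketch under stated assumptions: the series are taken to be equally sampled and already aligned in time, the windows are non-overlapping, and the 15-minute sampling interval used in the example (12 samples per 3 h window) is hypothetical, since the source does not give a sampling rate. The name `aligned_windows` is introduced only for illustration.

```python
def aligned_windows(actual, theoretical, samples_per_window):
    """Cut two aligned series into matching, non-overlapping segments."""
    if len(actual) != len(theoretical):
        raise ValueError("series must cover the same time stamps")
    windows = []
    last_start = len(actual) - samples_per_window
    for start in range(0, last_start + 1, samples_per_window):
        stop = start + samples_per_window
        windows.append((actual[start:stop], theoretical[start:stop]))
    return windows  # a trailing partial window is discarded


# Assumed 15-minute sampling: a 3 h window holds 12 samples.
actual = [float(i) for i in range(24)]
theoretical = [float(i) + 0.5 for i in range(24)]
segments = aligned_windows(actual, theoretical, samples_per_window=12)
print(len(segments))
```

Each pair of segments would then be passed to the correlation calculation, and the resulting coefficients to the fault diagnosis model.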
Example 2
This embodiment provides a distributed photovoltaic power station fault diagnosis device corresponding to the method of embodiment 1, which includes:
a data acquisition module: collecting historical operating data of each photovoltaic power station, establishing a time sequence of output data of each photovoltaic power station, and collecting historical fault diagnosis records of each photovoltaic power station;
a prediction model of theoretical power generation: training a BP neural network according to the collected historical operating data of each photovoltaic power station to obtain a prediction model of theoretical power generation, wherein the prediction model of the theoretical power generation can calculate the theoretical power generation of a specific photovoltaic power station through the operating data of other photovoltaic power stations;
a theoretical power generation calculation module: used for collecting the operating data of each photovoltaic power station, calculating the theoretical power generation of each photovoltaic power station from the collected operating data, and forming a theoretical power generation time series from the calculated theoretical power generation;
a correlation calculation module: used for intercepting the time series of the output data of each photovoltaic power station and the time series of the theoretical power generation at the same occurrence time over a certain time period, labelling the fault type of each intercepted segment at the corresponding occurrence time according to the historical fault diagnosis records, and calculating the correlation coefficients between the intercepted segment of the output-data time series and the intercepted segment of the theoretical power generation time series;
a fault diagnosis model: taking the correlation coefficient obtained by the correlation calculation module as an input vector, marking the fault type as an output vector, taking the input vector and the output vector as training data, and training and establishing by using a decision tree method;
a fault diagnosis module: calculating, by means of the correlation calculation module, the correlation coefficients from the output data of each monitored photovoltaic power station, and inputting the correlation coefficients as an input vector into the distributed photovoltaic fault diagnosis model to judge the state of the power station.
The data acquisition module is also used for performing similarity analysis on the time series of the output data of the photovoltaic power stations by means of Pearson correlation coefficients, to obtain, for a given photovoltaic power station, a plurality of photovoltaic power stations with high similarity to it;
in the theoretical power generation prediction model, the theoretical power generation of the specific photovoltaic power station is calculated from the operating data of the selected photovoltaic power stations with high similarity.
The correlation coefficients are the relative Euclidean distance and the Pearson correlation coefficient.
The formula for calculating the relative Euclidean distance is
In the formula: $\delta(X_{Tar}, X_{Ref})$ is the relative Euclidean distance, where $X_{Tar}$ is an intercepted segment of the time series of the output data and $X_{Ref}$ is an intercepted segment of the time series of the theoretical power generation;
the Pearson correlation coefficient is calculated by the formula
where $X_{Tar}$ is an intercepted segment of the time series of the output data, $X_{Ref}$ is an intercepted segment of the time series of the theoretical power generation, and $r$ is the Pearson correlation coefficient; the means appearing in the formula are the average values of the intercepted segment of the output-data time series and of the intercepted segment of the theoretical power generation time series, respectively.
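For illustration, the two coefficients can be computed for one pair of intercepted segments as in the following sketch. The Pearson function follows the standard definition of the coefficient. The normalisation used in `relative_euclidean_distance` (Euclidean distance between the segments divided by the Euclidean norm of the reference segment) is an assumption, chosen as one common definition; it is not taken from this application and may differ from the formula intended here. Both function names are hypothetical.

```python
import math


def pearson(x_tar, x_ref):
    """Standard Pearson correlation coefficient of two equal-length segments."""
    n = len(x_tar)
    mean_tar = sum(x_tar) / n
    mean_ref = sum(x_ref) / n
    cov = sum((a - mean_tar) * (b - mean_ref) for a, b in zip(x_tar, x_ref))
    var_tar = sum((a - mean_tar) ** 2 for a in x_tar)
    var_ref = sum((b - mean_ref) ** 2 for b in x_ref)
    denom = math.sqrt(var_tar * var_ref)
    # A constant segment has zero variance; return 0.0 instead of dividing by 0.
    return cov / denom if denom else 0.0


def relative_euclidean_distance(x_tar, x_ref):
    """ASSUMED form: ||x_tar - x_ref|| / ||x_ref|| (not taken from the source)."""
    diff = math.sqrt(sum((a - b) ** 2 for a, b in zip(x_tar, x_ref)))
    ref = math.sqrt(sum(b * b for b in x_ref))
    if ref == 0.0:
        return 0.0 if diff == 0.0 else float("inf")
    return diff / ref


# The two values together form the input vector for one intercepted segment.
x_tar = [1.0, 2.0, 3.0]
x_ref = [2.0, 4.0, 6.0]
features = [relative_euclidean_distance(x_tar, x_ref), pearson(x_tar, x_ref)]
print(features)
```

A segment whose actual output tracks the theoretical power generation closely gives a Pearson coefficient near 1 and a small distance, while a deviation in shape or magnitude shifts one or both values.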
The time period intercepted in the correlation calculation module is 2-5h, and preferably 3 h.
In light of the foregoing description of the preferred embodiments according to the present application, it is to be understood that various changes and modifications may be made without departing from the spirit and scope of the invention. The technical scope of the present application is not limited to the content of the specification, and must be determined according to the scope of the claims.
As will be appreciated by one skilled in the art, embodiments of the present application may be provided as a method, system, or computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The present application is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the application. It will be understood that each flow and/or block of the flowchart illustrations and/or block diagrams, and combinations of flows and/or blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
Claims (10)
1. A distributed photovoltaic power station fault diagnosis method is characterized by comprising the following steps:
S1: collecting historical operating data of each photovoltaic power station, establishing a time series of the output data of each photovoltaic power station, and collecting historical fault diagnosis records of each photovoltaic power station;
S2: training a BP neural network on the collected historical operating data of each photovoltaic power station to obtain a theoretical power generation prediction model, wherein the prediction model can calculate the theoretical power generation of a specific photovoltaic power station from the operating data of other photovoltaic power stations, and establishing a time series of the theoretical power generation of each photovoltaic power station;
S3: intercepting the time series of the output data of each photovoltaic power station and the time series of the theoretical power generation at the same occurrence time over a certain time period, labelling the fault type of each intercepted segment at the corresponding occurrence time according to the historical fault diagnosis records, and calculating the correlation coefficients between the intercepted segment of the output-data time series and the intercepted segment of the theoretical power generation time series;
S4: taking the correlation coefficients as an input vector and the labelled fault types as an output vector, using the input vector and the output vector as training data, and establishing a distributed photovoltaic fault diagnosis model by a decision tree method;
S5: inputting the output data of each monitored photovoltaic power station into the theoretical power generation prediction model to obtain the theoretical power generation, calculating the correlation coefficients between the output data and the theoretical power generation, and inputting the correlation coefficients into the distributed photovoltaic fault diagnosis model to judge the state of the power station.
2. The distributed photovoltaic power station fault diagnosis method according to claim 1, wherein in step S1, similarity analysis is further performed on the time series of the output data of the photovoltaic power stations by means of Pearson correlation coefficients, to obtain, for a given photovoltaic power station, a plurality of photovoltaic power stations with high similarity to it;
and in step S2, the theoretical power generation of the specific photovoltaic power station is calculated from the operating data of the selected photovoltaic power stations with high similarity.
3. The distributed photovoltaic power station fault diagnosis method of claim 1, wherein the correlation coefficients are the relative Euclidean distance and the Pearson correlation coefficient.
4. The distributed photovoltaic power station fault diagnosis method of claim 3, wherein the calculation formula of the relative Euclidean distance is
In the formula: $\delta(X_{Tar}, X_{Ref})$ is the relative Euclidean distance, where $X_{Tar}$ is an intercepted segment of the time series of the output data and $X_{Ref}$ is an intercepted segment of the time series of the theoretical power generation;
the calculation formula of the Pearson correlation coefficient is
where $X_{Tar}$ is an intercepted segment of the time series of the output data, $X_{Ref}$ is an intercepted segment of the time series of the theoretical power generation, and $r$ is the Pearson correlation coefficient; the means appearing in the formula are the average values of the intercepted segment of the output-data time series and of the intercepted segment of the theoretical power generation time series, respectively.
5. The distributed photovoltaic power plant fault diagnosis method as claimed in any one of claims 1 to 4, wherein the time period intercepted in step S3 is 2 to 5 hours.
6. A distributed photovoltaic power station fault diagnosis device is characterized by comprising:
a data acquisition module: collecting historical operating data of each photovoltaic power station, establishing a time sequence of output data of each photovoltaic power station, and collecting historical fault diagnosis records of each photovoltaic power station;
a prediction model of theoretical power generation: training a BP neural network according to the collected historical operating data of each photovoltaic power station to obtain a prediction model of theoretical power generation, wherein the prediction model of the theoretical power generation can calculate the theoretical power generation of a specific photovoltaic power station through the operating data of other photovoltaic power stations;
a theoretical power generation calculation module: used for collecting the operating data of each photovoltaic power station, calculating the theoretical power generation of each photovoltaic power station from the collected operating data, and forming a theoretical power generation time series from the calculated theoretical power generation;
a correlation calculation module: used for intercepting the time series of the output data of each photovoltaic power station and the time series of the theoretical power generation at the same occurrence time over a certain time period, labelling the fault type of each intercepted segment at the corresponding occurrence time according to the historical fault diagnosis records, and calculating the correlation coefficients between the intercepted segment of the output-data time series and the intercepted segment of the theoretical power generation time series;
a fault diagnosis model: taking the correlation coefficient obtained by the correlation calculation module as an input vector, marking the fault type as an output vector, taking the input vector and the output vector as training data, and training and establishing by using a decision tree method;
a fault diagnosis module: calculating, by means of the correlation calculation module, the correlation coefficients from the output data of each monitored photovoltaic power station, and inputting the correlation coefficients as an input vector into the distributed photovoltaic fault diagnosis model to judge the state of the power station.
7. The distributed photovoltaic power station fault diagnosis device according to claim 6, wherein the data acquisition module further performs similarity analysis on the time series of the output data of the photovoltaic power stations by means of Pearson correlation coefficients, to obtain, for a given photovoltaic power station, a plurality of photovoltaic power stations with high similarity to it;
and in the theoretical power generation prediction model, the theoretical power generation of the specific photovoltaic power station is calculated from the operating data of the selected photovoltaic power stations with high similarity.
8. The distributed photovoltaic power station fault diagnosis device of claim 6 or 7, wherein the correlation coefficients are the relative Euclidean distance and the Pearson correlation coefficient.
9. The distributed photovoltaic power station fault diagnosis device of claim 8, wherein the calculation formula of the relative Euclidean distance is
In the formula: $\delta(X_{Tar}, X_{Ref})$ is the relative Euclidean distance, where $X_{Tar}$ is an intercepted segment of the time series of the output data and $X_{Ref}$ is an intercepted segment of the time series of the theoretical power generation;
the calculation formula of the Pearson correlation coefficient is
where $X_{Tar}$ is an intercepted segment of the time series of the output data, $X_{Ref}$ is an intercepted segment of the time series of the theoretical power generation, and $r$ is the Pearson correlation coefficient; the means appearing in the formula are the average values of the intercepted segment of the output-data time series and of the intercepted segment of the theoretical power generation time series, respectively.
10. The distributed photovoltaic power station fault diagnosis device as claimed in claim 6 or 7, wherein the time period intercepted in the correlation calculation module is 2-5 h.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010381231.7A CN111680820B (en) | 2020-05-08 | 2020-05-08 | Distributed photovoltaic power station fault diagnosis method and device |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010381231.7A CN111680820B (en) | 2020-05-08 | 2020-05-08 | Distributed photovoltaic power station fault diagnosis method and device |
Publications (2)
Publication Number | Publication Date |
---|---|
CN111680820A CN111680820A (en) | 2020-09-18 |
CN111680820B true CN111680820B (en) | 2022-08-19 |
Family
ID=72433396
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010381231.7A Active CN111680820B (en) | 2020-05-08 | 2020-05-08 | Distributed photovoltaic power station fault diagnosis method and device |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN111680820B (en) |
Families Citing this family (10)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112200464B (en) * | 2020-10-14 | 2023-04-28 | 国网山东省电力公司聊城供电公司 | Correction method and system for photovoltaic power station output data considering spatial correlation |
CN112731022B (en) * | 2020-12-18 | 2023-06-23 | 阳光智维科技股份有限公司 | Photovoltaic inverter fault detection method, equipment and medium |
CN113992153B (en) * | 2021-11-19 | 2023-03-14 | 珠海康晋电气股份有限公司 | Visual real-time monitoring distributed management system of photovoltaic power station |
CN114070198B (en) * | 2021-12-06 | 2023-11-07 | 北京中电普华信息技术有限公司 | Fault diagnosis method and device for distributed photovoltaic power generation system and electronic equipment |
CN114757097B (en) * | 2022-04-07 | 2023-09-26 | 国网河北省电力有限公司邯郸供电分公司 | Line fault diagnosis method and device |
CN114899949B (en) * | 2022-06-01 | 2022-12-23 | 深圳博浩远科技有限公司 | Data acquisition method and device suitable for commercial photovoltaic inverter |
CN114841081A (en) * | 2022-06-21 | 2022-08-02 | 国网河南省电力公司郑州供电公司 | Method and system for controlling abnormal accidents of power equipment |
CN115587642B (en) * | 2022-06-22 | 2023-07-11 | 大唐海南能源开发有限公司 | BP neural network-based photovoltaic system fault alarm method |
CN116628608B (en) * | 2023-04-23 | 2024-06-21 | 华能国际电力江苏能源开发有限公司 | Photovoltaic power generation fault diagnosis method and system |
CN117630579A (en) * | 2023-12-06 | 2024-03-01 | 海南电力产业发展有限责任公司 | Distribution network fault accurate positioning method based on distributed traveling wave detection |
Family Cites Families (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US8682585B1 (en) * | 2011-07-25 | 2014-03-25 | Clean Power Research, L.L.C. | Computer-implemented system and method for inferring operational specifications of a photovoltaic power generation system |
CN105846780B (en) * | 2016-03-19 | 2018-04-03 | 上海大学 | A kind of photovoltaic module method for diagnosing faults based on decision-tree model |
CN106961249B (en) * | 2017-03-17 | 2019-02-19 | 广西大学 | A kind of diagnosing failure of photovoltaic array and method for early warning |
CN107516145A (en) * | 2017-07-27 | 2017-12-26 | 浙江工业大学 | A kind of multichannel photovoltaic power generation output forecasting method based on weighted euclidean distance pattern classification |
CN109842373B (en) * | 2019-04-15 | 2020-04-28 | 国网河南省电力公司电力科学研究院 | Photovoltaic array fault diagnosis method and device based on space-time distribution characteristics |
CN110336534B (en) * | 2019-07-15 | 2022-05-03 | 龙源(北京)太阳能技术有限公司 | Fault diagnosis method based on photovoltaic array electrical parameter time series feature extraction |
Also Published As
Publication number | Publication date |
---|---|
CN111680820A (en) | 2020-09-18 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||