CN111027775A

CN111027775A - Cascade hydropower station power generation prediction method based on long short-term memory network

Info

Publication number: CN111027775A
Application number: CN201911276332.1A
Authority: CN
Inventors: 舒生茂; 张地继; 邢喜旺; 陈忠贤
Original assignee: China Three Gorges Corp
Current assignee: China Three Gorges Corp
Priority date: 2019-12-12
Filing date: 2019-12-12
Publication date: 2020-04-17

Abstract

The invention discloses a method for predicting the power generation of cascade hydropower stations based on a long short-term memory network, comprising: performing a stationarity test on the hydropower station power generation time series; performing a correlation test on the power generation time series; converting the power generation time series data into supervised learning data; establishing a power generation prediction model based on a long short-term memory network; performing ensemble empirical mode decomposition on the power generation data to obtain a training set and a test set; training the prediction model with the training set and optimizing the model hyperparameters with an improved discrete differential evolution algorithm to obtain the optimal model parameters; and using the prediction model to predict the power generation of the cascade hydropower stations. The method is suitable for predicting the power generation of large and medium-sized cascade power stations; the prediction model based on the LSTM neural network is better suited to predicting the generation of multi-year regulation power stations, and hyperparameter optimization improves the fitting accuracy of the model.

Description

Cascade hydropower station power generation prediction method based on long short-term memory network

Technical Field

The invention belongs to the field of hydropower station energy optimization, and particularly relates to a cascade hydropower station power generation prediction method based on a long short-term memory network.

Background

With the rapid development of power grid construction and the continuous expansion of the grid in China, the requirements for safe, stable and economic grid operation keep rising, and the growing tension between electricity demand and peak-shaving capacity poses great challenges to hydropower operation. Hydropower is the main peak-shaving power source in China's grid, and the peak-shaving needs of each receiving grid must be fully considered when scheduling output, which places higher demands on hydropower generation prediction under multiple constraints. Hydropower generation prediction uses data from a past period that are correlated with the station's generation, such as power generation, reservoir inflow, reservoir outflow, upstream water level, downstream water level, water head and energy storage, to predict the station's generation over a future period, which makes it easier to prepare scheduling plans.

Existing research on power generation prediction is mainly oriented to small hydropower stations without regulating capacity, and mainly uses the following methods. The most common is the grey prediction model, which transforms a small amount of input information, establishes a grey differential system and describes how the quantity develops. Dynamic neural network models are also commonly used to predict the generation of run-of-river hydropower stations, including the BP neural network, the wavelet neural network and the artificial neural network (ANN). However, the generation of a run-of-river station depends almost entirely on inflow, leaving little room for manual operation. Existing methods rarely address generation prediction for large and medium-sized cascade power stations, which makes scheduling plans difficult to formulate, and current research does not consider the long-term dependence in the power generation time series.

Disclosure of Invention

The technical problem addressed by the invention is that existing power generation prediction methods rarely address the prediction of the power generation of large and medium-sized cascade power stations, and do not consider the long-term dependence in the power generation time series.

The invention aims to solve these problems and provides a cascade hydropower station power generation prediction method based on a long short-term memory network. A power generation prediction model is built with the Long Short-Term Memory network (LSTM); Ensemble Empirical Mode Decomposition (EEMD) is applied to the hydropower station power generation data to obtain the training set and test set of the prediction model; and, because the hyperparameters of the LSTM neural network are of many types, discontinuous and difficult to tune by hand, an improved discrete differential evolution (MDDE) algorithm is used to optimize the hyperparameters of the prediction model, so that the model predicts more accurately.

The technical scheme of the invention is a cascade hydropower station power generation prediction method based on a long short-term memory network, comprising the following steps:

step 1: preprocessing the data (feature engineering) and performing a stationarity test on the hydropower station power generation time series;

step 2: performing a correlation test on the power generation time series and selecting the influence factors that are highly correlated with power generation;

step 3: converting the power generation time series data into supervised learning data;

step 4: establishing a power generation prediction model based on a long short-term memory network;

step 5: performing ensemble empirical mode decomposition on the power generation data of step 3 to obtain a training set and a test set;

step 6: training the power generation prediction model with the training set, and optimizing the model hyperparameters with an improved discrete differential evolution algorithm to obtain the optimal model parameters;

step 7: testing the power generation prediction model based on the long short-term memory network with the test set;

step 8: using the power generation prediction model based on the long short-term memory network to predict the power generation of the cascade hydropower stations.

Preferably, in step 1, the stationarity test is performed on the hydropower station generated energy time sequence, a unit root test method is adopted, when the unit root does not exist in the sequence, the sequence is considered to have stationarity, otherwise, the sequence is considered to be a non-stationary time sequence.
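The decision rule of the unit root test can be illustrated with a minimal sketch. It is not the full augmented Dickey-Fuller procedure: it assumes the simplest Dickey-Fuller regression, diff(y)_t = alpha + rho * y_(t-1) + e_t, with no lagged difference terms, and it compares the t-statistic of rho with an approximate large-sample 5% critical value of -2.86 for the constant-only case. The function names are illustrative only; in practice a statistics library would normally be used.

```python
import math
import random


def dickey_fuller_stat(series):
    """t-statistic of rho in diff(y)_t = alpha + rho * y_(t-1) + e_t.

    Simplified sketch: no lagged difference terms are included, so this
    is the plain Dickey-Fuller regression, not the augmented one.
    """
    x = series[:-1]                                          # y_(t-1)
    dy = [b - a for a, b in zip(series[:-1], series[1:])]    # y_t - y_(t-1)
    n = len(x)
    mean_x = sum(x) / n
    mean_dy = sum(dy) / n
    sxx = sum((v - mean_x) ** 2 for v in x)
    sxy = sum((v - mean_x) * (d - mean_dy) for v, d in zip(x, dy))
    rho = sxy / sxx
    alpha = mean_dy - rho * mean_x
    rss = sum((d - alpha - rho * v) ** 2 for v, d in zip(x, dy))
    se_rho = math.sqrt(rss / (n - 2) / sxx)   # standard error of rho
    return rho / se_rho


# Assumption: approximate asymptotic 5% critical value, constant-only case.
CRITICAL_5PCT = -2.86


def is_stationary(series):
    """Reject the unit-root hypothesis when the statistic is below the critical value."""
    return dickey_fuller_stat(series) < CRITICAL_5PCT


rng = random.Random(42)
white_noise = [rng.gauss(0.0, 1.0) for _ in range(300)]
trending = [t + 0.1 * (-1) ** t for t in range(200)]
print(is_stationary(white_noise), is_stationary(trending))
```

The null hypothesis in this family of tests is that a unit root exists, so a sufficiently negative statistic is evidence of stationarity.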

Preferably, in step 2, the correlation test is performed on the power generation amount time series, and comprises calculation of a Pearson correlation coefficient and a Spearman correlation coefficient.
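As an illustration of the two coefficients named in step 2, the following sketch computes the Pearson coefficient directly and the Spearman coefficient as the Pearson coefficient of the ranks (ties receive the average rank). The function names and the sample data are made up for the example and are not taken from the patent.

```python
import math


def pearson(x, y):
    """Pearson linear correlation coefficient of two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def average_ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


outflow = [1.0, 2.0, 3.0, 4.0, 5.0]          # illustrative values only
generation = [1.0, 4.0, 9.0, 16.0, 25.0]     # monotone but non-linear in outflow
print(round(pearson(outflow, generation), 4), round(spearman(outflow, generation), 4))
```

Comparing the two is informative because Spearman only measures monotonic association, while Pearson measures linear association.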

Further, in step 3, the power generation time series data are converted into supervised learning data, where the dimension of the time series data is N_features and the length of the time series is l; X_timesteps is the length of the time window and corresponds to the sample data; Y_timesteps is the length of the forecast period and corresponds to the label data. The power generation time series is segmented by a sliding window that takes out a subsequence of length X_timesteps + Y_timesteps at each position, giving a number of subsequences N_samples = l - X_timesteps - Y_timesteps + 1.

Further, in step 5, performing ensemble empirical mode decomposition on the power generation data of step 3 specifically comprises:

1) importing a data set X, whose size is N and whose standard deviation is std;

2) initializing the ratio Nstd of the added noise to the standard deviation, and the ensemble number NE;

3) taking the logarithm of N, rounding it, and subtracting 1 to obtain the number M of IMF components;

4) adding noise to the sequence;

5) performing empirical mode decomposition to obtain the IMF components and the residual, and returning to step 4) while the number of decompositions is less than M;

6) outputting the IMF components and the residual.

Further, in step 6, the improved discrete differential evolution algorithm is used for model hyperparameter optimization: the deep learning hyperparameters, namely the learning rate, the number of iterations, the optimizer, the activation function, the time window and the number of neurons, are one-hot encoded and then subjected to differential evolution.

Preferably, the process of performing model hyper-parameter optimization by using the improved discrete differential evolution algorithm utilizes a distributed parallel computing method, and the distributed parallel computing method includes the following steps:

(1) the main node loads a cluster computing node list and initializes the population of an optimization algorithm;

(2) the main node broadcasts confirmation information to all the computing nodes according to the node list, and if the node does not respond, the node is removed from the node list; when the node list is empty, namely all the nodes cannot work, prompting a user to check node information and terminating the calculation;

(3) if the available nodes exist, the main node starts multithreading, balances the calculation tasks of the population to all the calculation nodes, then starts thread waiting, sets timeout time, and waits for all the calculation nodes to return calculation results; when the calculation of a node is overtime, abandoning the calculation result of the node and removing the node;

(4) when all the nodes are calculated, judging whether a termination condition is reached, if the termination condition is not reached, updating the population and the optimal value, and executing the step (3); and if the termination condition is met, performing model training by using the optimal parameters, and outputting the trained model.
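Steps (2) and (3) above can be sketched on a single machine with threads standing in for cluster nodes. Everything below is an assumption made for illustration: the `LocalNode` class, its `ping` and `evaluate` methods, and the round-robin balancing are hypothetical stand-ins, and a real deployment would replace them with network calls to the computing nodes.

```python
from concurrent.futures import ThreadPoolExecutor, TimeoutError as FutureTimeout


class LocalNode:
    """Hypothetical stand-in for a remote computing node."""

    def __init__(self, responsive=True, fitness=lambda v: v * v):
        self.responsive = responsive
        self.fitness = fitness

    def ping(self):
        return self.responsive

    def evaluate(self, individuals):
        return [self.fitness(ind) for ind in individuals]


def evaluate_population(population, nodes, timeout_s=5.0):
    """Return ({individual index: fitness}, names of nodes still in the list)."""
    # Step (2): keep only the nodes that answer the confirmation message.
    alive = {}
    for name, node in nodes.items():
        try:
            if node.ping():
                alive[name] = node
        except Exception:
            pass  # a failing ping is treated as "no response"
    if not alive:
        raise RuntimeError("no computing node responded; check the node list")

    # Step (3): balance the individuals over the available nodes (round-robin).
    names = list(alive)
    shares = {name: [] for name in names}
    for idx, individual in enumerate(population):
        shares[names[idx % len(names)]].append((idx, individual))

    results = {}
    pool = ThreadPoolExecutor(max_workers=len(names))
    futures = {
        name: pool.submit(alive[name].evaluate, [ind for _, ind in shares[name]])
        for name in names
    }
    for name, future in futures.items():
        try:
            scores = future.result(timeout=timeout_s)
        except FutureTimeout:
            alive.pop(name)  # discard this node's result and remove the node
            continue
        for (idx, _), score in zip(shares[name], scores):
            results[idx] = score
    # Threads cannot be killed; do not block on workers that timed out.
    pool.shutdown(wait=False)
    return results, list(alive)


nodes = {"node-a": LocalNode(), "node-b": LocalNode(responsive=False)}
print(evaluate_population([1, 2, 3], nodes))
```

In this sketch the timeout is applied to each node in turn, so the total wait can exceed `timeout_s`; a shared deadline would be one way to tighten that. Individuals assigned to a node that timed out are simply missing from the returned dictionary, and the caller decides whether to re-evaluate them.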

Compared with the prior art, the invention has the beneficial effects that:

1) the cascade short-term power generation prediction method based on the long short-term memory network is suitable for predicting the power generation of large and medium-sized cascade power stations;

2) factors with a significant influence on power generation are extracted through correlation analysis and principal component analysis, and features are extracted from the time series with ensemble empirical mode decomposition, which helps the model learn scheduling rules from the data at different scales and improves the generalization ability and stability of the model;

3) the power generation prediction model based on the LSTM neural network can handle long-term dependence and is better suited to predicting the generation of multi-year regulation power stations;

4) optimizing the hyperparameters of the power generation prediction model with the improved discrete differential evolution algorithm improves the fitting accuracy of the prediction model based on the long short-term memory network.

Drawings

The invention is further illustrated by the following figures and examples.

Fig. 1 is a flow chart of a cascade hydropower station power generation amount prediction method based on a long-short term memory network.

FIG. 2 is a flow diagram of the ensemble empirical mode decomposition.

Fig. 3 is a diagram illustrating a time series conversion into supervised learning data.

FIG. 4 is a diagram of comparing one-hot coding with natural coding.

Fig. 5 is a flowchart illustrating a distributed parallel computing method.

Fig. 6 is a time series plot of the daily power generation of the Shuibuya power station.

Fig. 7 is a time series plot of the daily power generation of the Geheyan power station.

Fig. 8-1 is an autocorrelation plot of the power generation time series of the Shuibuya power station.

Fig. 8-2 is a partial autocorrelation plot of the power generation time series of the Shuibuya power station.

Fig. 8-3 is an autocorrelation plot of the power generation time series of the Geheyan power station.

Fig. 8-4 is a partial autocorrelation plot of the power generation time series of the Geheyan power station.

Fig. 9-1 is a Pearson correlation coefficient plot for the correlation analysis of the power generation influence factors of the Shuibuya power station.

Fig. 9-2 is a Spearman correlation coefficient plot for the correlation analysis of the power generation influence factors of the Shuibuya power station.

Fig. 10-1 is a Pearson correlation coefficient plot for the correlation analysis of the power generation influence factors of the Geheyan power station.

Fig. 10-2 is a Spearman correlation coefficient plot for the correlation analysis of the power generation influence factors of the Geheyan power station.

Fig. 11 is the EEMD decomposition result of the Shuibuya daily power generation series.

FIG. 12 is a graph comparing the predicted effect of the MDDE-EEMD-LSTM model and the LSTM model of the present invention.

FIG. 13 is a graph comparing the predicted effect of the MDDE-EEMD-LSTM model and the LR model of the present invention.

FIG. 14 is a comparison graph of the predicted effect of the MDDE-EEMD-LSTM model and the SVR model of the present invention.

FIG. 15 is a graph comparing the predicted effect of the MDDE-EEMD-LSTM model and the RF model of the present invention.

FIG. 16 is a comparison graph of the predicted effect of the MDDE-EEMD-LSTM model and the MA model of the present invention.

Fig. 17 is a schematic diagram of the structure of an LSTM network.

Detailed Description

As shown in fig. 1, the method for predicting the power generation capacity of the cascade hydropower station based on the long-short term memory network comprises the following steps,

step 1: preprocessing the data (feature engineering) and performing a stationarity test on the hydropower station power generation time series using a unit root test; when the series has no unit root it is considered stationary, otherwise it is considered a non-stationary time series;

step 2: performing correlation test on the generated energy time sequence, calculating a Pearson correlation coefficient and a Spearman correlation coefficient, and selecting a generated energy influence factor with high correlation;

Step 1 uses the unit root test method disclosed in the paper "The power of the ADF test", published in 1997 in the journal Economics Letters, volume 57.

The invention uses ensemble empirical mode decomposition (EEMD) and the improved discrete differential evolution algorithm (MDDE), and denotes the power generation prediction model built on the long short-term memory network (LSTM) as MDDE-EEMD-LSTM.

In step 3, the multivariate time series is converted into supervised learning data, where the dimension of the series is N_features and its length is l; X_timesteps is the length of the time window and corresponds to the sample data; Y_timesteps is the length of the forecast period and corresponds to the label data. The series is segmented by a sliding window that takes out a subsequence of length (X_timesteps + Y_timesteps) at each position, and N_samples is the number of subsequences obtained. As shown in fig. 3, take a time series with N_features = 2 and l = 10 as an example, with X_timesteps = 6 and Y_timesteps = 2; that is, the data of the past 6 time steps are used to predict the data of the next 2 time steps. For each feature, the whole time series then yields N_samples = 3 new subsequences. Time series of arbitrary dimension and length can be handled in this way.
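The sliding-window conversion can be sketched in a few lines. The function name `to_supervised` and the toy series are assumptions made for the example; the window arithmetic follows N_samples = l - X_timesteps - Y_timesteps + 1 from the text, and the example reproduces the N_features = 2, l = 10 case.

```python
def to_supervised(series, x_timesteps, y_timesteps):
    """Split a multivariate series into (samples, labels) by a sliding window.

    series: list of l rows, each row a list of N_features values.
    Each sample holds x_timesteps consecutive rows; its label holds the
    y_timesteps rows that immediately follow.
    """
    n_samples = len(series) - x_timesteps - y_timesteps + 1
    samples, labels = [], []
    for start in range(max(n_samples, 0)):
        split = start + x_timesteps
        samples.append(series[start:split])
        labels.append(series[split:split + y_timesteps])
    return samples, labels


# Toy series with l = 10 time steps and N_features = 2.
series = [[t, 10 * t] for t in range(10)]
X, Y = to_supervised(series, x_timesteps=6, y_timesteps=2)
print(len(X))        # 3 subsequences, matching the example in the text
print(X[0][0], Y[0])
```

A series shorter than X_timesteps + Y_timesteps simply yields no samples.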

In step 5, ensemble empirical mode decomposition is performed on the power generation data to obtain the training set; the process is shown in fig. 2 and comprises:

1) importing a data set X with size N and standard deviation std;

2) initializing the ratio Nstd of the added noise to the standard deviation, and the ensemble number NE;

3) taking the logarithm of N, rounding it, and subtracting 1 to obtain the number M of IMF components;

4) adding noise to the sequence, giving X + rand · Nstd, where rand is a random number with 0 ≤ rand ≤ 1;

5) performing empirical mode decomposition to obtain the IMF components and the residual, and returning to step 4) while the number of decompositions is less than M;

6) outputting the IMF components and the residual.
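The bookkeeping around the decomposition (the IMF count and the noise step) can be sketched as follows; the empirical mode decomposition itself is omitted. Two points are assumptions rather than statements from the text: the logarithm is taken to base 2 with the result rounded down, and the noise term is read as the product of rand and Nstd. The base-2 reading is at least consistent with the embodiment, where 4047 daily values are decomposed into 10 IMFs.

```python
import math
import random


def imf_count(n):
    """Number of IMF components M: floor(log2(N)) - 1 (base 2 is an assumption)."""
    return int(math.log2(n)) - 1


def noisy_copies(x, nstd, ne, seed=0):
    """Return ne noisy copies of x, each value shifted by rand * nstd.

    rand is drawn uniformly from [0, 1), following the range given in the
    text; x is assumed to be already scaled by its standard deviation.
    """
    rng = random.Random(seed)
    return [[v + rng.random() * nstd for v in x] for _ in range(ne)]


print(imf_count(4047))                              # 10
copies = noisy_copies([1.0, 2.0, 3.0], nstd=0.2, ne=70)
print(len(copies), len(copies[0]))                  # 70 3
```

Each noisy copy would then be passed to an EMD routine and the resulting IMFs averaged over the ensemble.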

The improved discrete differential evolution algorithm in step 6 improves on the existing discrete differential evolution algorithm disclosed in the paper "An ensemble of discrete differential evolution algorithms for solving the generalized traveling salesman problem" published by Tasgetiren M F in 2010. The improved algorithm one-hot encodes the model parameters and then performs differential evolution; the individual crossover and mutation strategies are the same as in the existing discrete differential evolution algorithm, and the difference is that each component of a population individual is normalized after the operation.

The improved discrete differential evolution algorithm is used for model hyperparameter optimization to obtain the optimal model parameters: the deep learning hyperparameters, namely the learning rate, the number of iterations, the optimizer, the activation function, the time window and the number of neurons, are one-hot encoded as shown in fig. 4 and then subjected to differential evolution. The individual crossover and mutation strategies are the same as in conventional differential evolution; the difference is that each component of a population individual is normalized after the operation. For example, consider the j-th component of the i-th individual in the population: after the crossover and mutation operations its values change, so the maximum value in the component is changed to 1 and the remaining values are set to 0, which gives the normalized result. When several values in the component share the maximum, one of them is chosen at random and set to 1 and the rest are set to 0; when all values are 0, one of them is set to 1 at random.

This process trains the model with a distributed parallel training scheme, whose flow is shown in fig. 5 and comprises the following steps:

(1) the main node loads the list of cluster computing nodes and initializes the population of the optimization algorithm;

(2) the main node broadcasts a confirmation message to all computing nodes in the node list, and any node that does not respond is removed from the list; when the node list is empty, that is, no node can work, the user is prompted to check the node information and the computation is terminated;

(3) if available nodes exist, the main node starts multiple threads, balances the population's computing tasks across all computing nodes, then starts waiting on the threads with a timeout and waits for all computing nodes to return their results; when a node's computation times out, its result is discarded and the node is removed;

(4) when all nodes have finished, the termination condition is checked; if it has not been reached, the population and the optimal value are updated and step (3) is executed again; if it has been reached, the model is trained with the optimal parameters and the trained model is output.
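The normalization rule described above (largest value becomes 1, ties broken at random, all-zero components get a random 1) can be sketched as follows. The function name and the example vector are illustrative assumptions; only the rule itself comes from the text.

```python
import random


def normalize_one_hot(component, rng=random):
    """Map a real-valued component back to a valid one-hot vector.

    The position of the largest value becomes 1 and all others 0. If several
    positions share the maximum, one of them is chosen at random; if every
    value is 0, a random position is chosen.
    """
    if all(v == 0 for v in component):
        hot = rng.randrange(len(component))
    else:
        top = max(component)
        hot = rng.choice([i for i, v in enumerate(component) if v == top])
    return [1 if i == hot else 0 for i in range(len(component))]


# Illustrative component after crossover and mutation (values are made up).
print(normalize_one_hot([0.2, 1.3, -0.4]))   # [0, 1, 0]
```

Because every component is forced back to exactly one active position, each individual always decodes to one concrete choice per hyperparameter.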

In this embodiment, the two cascade power stations Shuibuya and Geheyan on the Qingjiang River are taken as the objects of power generation prediction; the selected time series are listed in Table 1.

Table 1. Selected time series of the cascade power stations

The characteristics of the power generation time series are analyzed first. To ensure the integrity of the time series, missing values in the data set are handled first. The historical power generation series of the Shuibuya and Geheyan power stations of the Qingjiang cascade are then plotted, as shown in fig. 6 and fig. 7. The daily full-load generation of the Shuibuya and Geheyan power stations is known to be 4416 × 10^4 kW·h and 2909 × 10^4 kW·h respectively; inspecting the extreme values in the plots shows that the series values lie within a reasonable range and no outliers appear. Zero values do occur, which would affect the calculation of the MAPE index; to preserve the continuity of the series these points are not removed but are set to 1 instead. From the overall shape of the series, the multi-year generation of the stations shows no obvious trend or periodicity, so the power generation series are preliminarily judged to be stationary.

The stationarity of the power generation series is then analyzed: a unit root test is performed on the power generation time series of Shuibuya and Geheyan, and the results are shown in Table 2. The null hypothesis of the unit root test is that the series has a unit root. The p-values of both power stations are below the 5% level, so the null hypothesis is rejected and no unit root is present; that is, the power generation time series of both stations are stationary, which provides a basis for the subsequent time series analysis.

Table 2. Unit root test results for the power generation of the Shuibuya and Geheyan hydropower stations

The results of the autocorrelation and partial autocorrelation analysis of the power generation time series are shown in fig. 8-1, 8-2, 8-3 and 8-4. Autocorrelation is examined first: fig. 8-1 and fig. 8-3 are the autocorrelation plots of the power generation time series of the Shuibuya and Geheyan power stations respectively, in which the bell-shaped region is the 95% confidence interval, and the autocorrelation coefficients of the two stations cut off at lag 60 and lag 50 respectively. The partial autocorrelation of the series is examined next: as can be seen from fig. 8-2 and fig. 8-4, the partial autocorrelation plots of both stations tail off. In summary, both stations can be modeled with a moving average (MA) model, with MA(60) for the Shuibuya station and MA(50) for the Geheyan station. The autocorrelation analysis thus serves to determine the order of the MA model.

A comprehensive correlation analysis is then performed on the candidate influence factors of power generation: the Spearman and Pearson correlation coefficients are calculated separately and the results are compared. A correlation coefficient takes values in [-1, 1] and expresses the strength of the correlation; a value greater than 0 indicates positive correlation, a value less than 0 indicates negative correlation, and a value of 0 indicates no correlation. As shown in fig. 9-1 and fig. 9-2, under both methods the power generation of the Shuibuya power station is most strongly correlated with the reservoir outflow, reaching 0.92, with the reservoir inflow ranked second; a preliminary explanation is that the Shuibuya reservoir has strong regulating capacity. Judging from both coefficients, the factors with high correlation are the reservoir outflow and the reservoir inflow. Further analysis shows, however, that outflow and inflow are themselves strongly correlated, so to avoid introducing redundant factors that would harm the generalization performance of the model, only the reservoir outflow is selected as the influence factor for the Shuibuya power station. For the Geheyan power station, as shown in fig. 10-1 and fig. 10-2, both the Spearman and the Pearson results show that the reservoir inflow has the highest correlation with power generation, so the reservoir inflow is selected as the key factor influencing the power generation of the Geheyan power station.

In this embodiment, the main structure of the power generation prediction model is built by stacking LSTM layers and fully connected layers; the LSTM structure is shown in fig. 17. The MDDE algorithm is used to optimize the hyperparameters of each layer, and the optimization results are shown in Table 3.

Table 3. Hyperparameter optimization results

To verify the performance of the power generation prediction model, single-step prediction is carried out first. The original power generation signal is decomposed with the EEMD algorithm. Taking the Shuibuya power station as an example, with 4047 days of power generation data, the ratio of the added noise to the standard deviation of the power generation is 0.2 and the EEMD ensemble number is 70. The decomposition splits the original power generation series (Original) into one residual (Residual) and 10 intrinsic mode functions (IMF). As shown in fig. 11, the EEMD algorithm progressively separates the original signal into signals on different time scales. The first 3600 days are taken as the training set and the last 417 days as the test set, and the MDDE-EEMD-LSTM, SVR, LR and RF models are trained and tested; the index results are shown in Table 4, where MAE denotes the mean absolute error, MAPE the mean absolute percentage error, RMSE the root mean square error, and R² the coefficient of determination. Under single-station single-step prediction the model of the invention is the best on every index. The existing LSTM neural network also fits fairly well but still falls short of the best values, and the improvements proposed by the invention raise the fitting accuracy of the LSTM model.
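The four evaluation indexes named above follow their standard definitions and can be sketched directly. The sample values are made up for illustration and are not results from the patent; note that MAPE divides by the actual value, which is why zero values in the series have to be handled beforehand.

```python
import math


def mae(actual, predicted):
    """Mean absolute error."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def mape(actual, predicted):
    """Mean absolute percentage error, in percent (actual values must be non-zero)."""
    return 100.0 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    """Root mean square error."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))


def r2(actual, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_a = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_a) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


actual = [100.0, 200.0, 300.0, 400.0]      # illustrative values only
predicted = [110.0, 190.0, 310.0, 390.0]
print(mae(actual, predicted), rmse(actual, predicted))
print(round(mape(actual, predicted), 3), round(r2(actual, predicted), 3))
```

Lower MAE, MAPE and RMSE and a higher R² indicate a better fit.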

To show the prediction effect visually, comparison plots of 100 predicted points are drawn for each model, as shown in fig. 12 to fig. 16. Compared with the existing LSTM model, the power generation prediction model of the invention tracks the power generation trend well and fits more accurately. The plain LSTM model does not learn enough of the data characteristics and simply takes the previous day's generation as the current value, so its prediction curve is equivalent to the measured curve shifted by one time step. Models such as LR, RF and SVR learn only the overall trend of the data; their predictions oscillate widely around the actual values and offer little reference value. For the MA(60) model, the predicted values become progressively smoother over time, since newly added data have little influence on the average, and the model cannot deliver a usable prediction.

Table 4. Single-step prediction performance of each model

To further evaluate the effectiveness of the MDDE-EEMD-LSTM model, 3-step and 7-step predictions were made and compared with the basic LSTM model. As shown in Tables 5 and 6, every evaluation index of every model declines as the forecast period lengthens, and the size of the change differs from model to model. Prediction accuracy drops as the number of prediction steps increases, but the model of the invention remains more accurate than the other conventional methods and can serve as a reference for engineering practice.

Table 5. Model performance for 3-step prediction

Table 6. Model performance for 7-step prediction

The joint prediction mode for the whole cascade is studied further: the power generation of Shuibuya and Geheyan and their reservoir inflows and outflows are taken as input, and the MDDE-EEMD-LSTM model is used to predict power generation. The prediction is run continuously for 200 days, and the indexes are calculated and compared with single-station prediction. As shown in Table 7, joint cascade prediction is slightly less accurate than single-station prediction. Therefore, when predicting the power generation of cascade hydropower stations, predicting each station separately and then summing the results to obtain the cascade total improves the prediction accuracy.

Table 7. Comparison of single-station prediction and cascade prediction

Claims

1. A method for predicting the power generation of cascade hydropower stations based on a long short-term memory network, characterized by comprising the following steps,

step 1: preprocessing the data (feature engineering) and performing a stationarity test on the hydropower station power generation time series;

2. The method for predicting the power generation capacity of the cascade hydropower station based on the long-short term memory network according to claim 1, wherein in the step 1, the stationarity test is carried out on the time series of the power generation capacity of the hydropower station by adopting a unit root test method.

3. The long-short term memory network-based cascade hydropower station electric energy generation amount prediction method as claimed in claim 1, wherein in the step 2, correlation test is carried out on the electric energy generation amount time series, and comprises calculation of Pearson correlation coefficient and Spearman correlation coefficient.

4. The method for predicting the power generation of the cascade hydropower station based on the long short-term memory network as claimed in claim 1, wherein in the step 3, the power generation time series data are converted into supervised learning data, where the dimension of the time series data is N_features and the length of the time series is l; X_timesteps is the length of the time window and corresponds to the sample data; Y_timesteps is the length of the forecast period and corresponds to the label data; the power generation time series is segmented by a sliding window that takes out a subsequence of length X_timesteps + Y_timesteps at each position to obtain a plurality of subsequences, the number of subsequences being N_samples = l - X_timesteps - Y_timesteps + 1.

5. The method for predicting the power generation of the cascade hydropower station based on the long short-term memory network according to claim 1, wherein in the step 5, performing ensemble empirical mode decomposition on the power generation data of the step 3 specifically comprises the following steps:

4) adding noise to the sequence;

6) outputting the IMF components and the residual.

6. The method for predicting the power generation amount of the cascade hydropower station based on the long and short term memory network according to claim 1, wherein in the step 6, the improved discrete differential evolution algorithm is adopted to carry out model hyper-parameter optimization, and the hyper-parameter learning rate, the iteration times, the optimizer, the activation function, the time window and the number of neurons in deep learning are subjected to one-hot coding and then differential evolution is carried out.

7. The method for predicting the power generation capacity of the cascade hydropower station based on the long-short term memory network according to any one of claims 1-6, wherein the process of optimizing the model hyper-parameters by adopting the improved discrete differential evolution algorithm utilizes a distributed parallel computing method, and the distributed parallel computing method comprises the following steps: