Background
Air quality has long been a major problem bearing on the future of humanity. With social progress and the rapid growth in the number of automobiles in use, the content of inhalable particles in the air has increased greatly, and environmental pollution is becoming more and more serious. As air quality continues to deteriorate, haze occurs more and more often and does more and more harm. Haze is a disastrous weather phenomenon. Inhalable PM2.5 particles are the main cause of haze weather; they seriously affect air quality and, more importantly, pose a great threat to human health.
There are many ideas and methods in air quality prediction research. Among them, a main development trend is to achieve quantitative study and effective prediction of environmental quality, especially haze, on the basis of systems engineering ideas effectively combined with new theories and new methods.
Owing to a large number of uncertain and complex factors such as climate, temperature, and human activities, the time series of various weather data are highly nonlinear and uncertain, and conventional analysis and prediction methods have difficulty capturing their patterns and characteristics of change.
A shallow neural network is clearly effective on simple or tightly constrained problems, but because its modeling and representation capability is limited, it handles poorly the more complex real-world problems that involve natural signals.
A deep neural network has multiple hidden layers, has greater structural advantages than a traditional neural network, and has strong feature abstraction capability. It takes a fundamentally different approach to programming: no algorithm needs to be designed and coded directly for the problem at hand; only the training procedure is programmed, the network learns the correct way to solve the problem during training, and, provided the data volume is sufficient, a simple algorithm applied to complex data can achieve remarkable results.
Meanwhile, owing to the huge improvement in chip processing performance, the explosive growth of data available for training, and the recent rapid development of machine learning and of signal and information processing research, deep learning methods can effectively use complex nonlinear functions and compositions of nonlinear functions to learn distributed, hierarchical feature representations, and can make full and effective use of both labeled and unlabeled data.
A recurrent neural network (RNN) is a class of deep network that can be used for unsupervised (as well as supervised) learning, and its depth can be as large as the length of the input sequence. In unsupervised learning mode, an RNN predicts future data sequences from previous data samples, and no class information is used in the learning process, so RNNs are well suited to modeling sequence data.
Disclosure of Invention
The invention aims to overcome the defects of the prior art by providing a PM2.5 prediction method based on a deep-structure recurrent neural network, in which a PM2.5 prediction model is built by combining the basic theory, network structure, and processing flow of the recurrent neural network, thereby realizing PM2.5 prediction.
In order to achieve the above object, the present invention provides a PM2.5 prediction method based on a deep structure recurrent neural network, comprising the steps of:
(1) obtaining historical weather data, including hourly data for the following indexes: temperature, illumination, wind speed, rainfall, SO2, O3, NO, PM10, and PM2.5, wherein temperature is in °C, illumination in lm/m², wind speed in m/s, and rainfall in mm, and SO2, O3, NO, PM10, and PM2.5 are concentration data;
(2) data preprocessing
(2.1) complementing missing historical weather data
The missing historical weather data are completed by the mean-value method:
$$X_t = \frac{X_{t-1} + X_{t+1}}{2}$$
wherein $X_t$ denotes the missing historical weather datum at the current time, $X_{t-1}$ denotes the historical weather datum at the previous time, and $X_{t+1}$ denotes the historical weather datum at the subsequent time;
(2.2) normalizing all historical weather data
The historical weather data are normalized to the range -1 to 1 according to the following formula:
$$X' = \frac{X - \bar{X}}{X_{\max} - X_{\min}}$$
wherein $X'$ denotes the historical weather data after normalization, $X$ denotes the historical weather data before normalization, $\bar{X}$ denotes the mean of the historical weather data, $X_{\max}$ denotes the maximum of the historical weather data, and $X_{\min}$ denotes the minimum of the historical weather data;
(3) dividing the preprocessed historical weather data into training data and testing data according to a proportion;
(4) constructing a deep-structure PM2.5 prediction model based on deep learning theory and the recurrent neural network
(4.1) constructing a deep recurrent neural network prediction model: the model depth is greater than N layers, the input is the training data, and the output is the predicted PM2.5 concentration;
(4.2) setting the dimension of the input layer to K × (H-1) and the dimension of the output layer to 1 × T, and adopting the Tanh function as the activation function between the input layer and the hidden layer, between hidden layers, and between the hidden layer and the output layer;
wherein K denotes the depth to which the recurrent neural network is unrolled in time, namely K time frames, each of which takes one group of historical weather data as input; H denotes the number of data indexes; and T denotes the number of values output by the prediction model of the recurrent neural network, meaning that K pieces of historical data are used to predict the PM2.5 concentration at T future times, that is, the weather data of the preceding K times are input and the PM2.5 concentrations of the following T times are predicted;
(4.3) selecting a loss function for use in the PM2.5 prediction model
The mean square error is used as the loss function in the PM2.5 prediction model:
$$L(\theta) = \frac{1}{q\,t}\sum_{i=1}^{q}\sum_{j=1}^{t}\left(y_{i,j} - y'_{i,j}\right)^2$$
wherein $q$ denotes the number of iterations, $t$ is the output vector dimension, $y_{i,j}$ denotes the actual value of the training data, and $y'_{i,j}$ denotes the predicted value of the training data;
(4.4) updating the parameters of the PM2.5 prediction model with a mini-batch stochastic gradient descent algorithm
(4.4.1) initializing the parameter $\theta_0$;
(4.4.2) dividing the training data, in time-series order, into groups of m training data each; for the first group, calculating the gradient of each training datum with the mini-batch stochastic gradient descent algorithm, and taking the weighted average of these gradients as the descent gradient of the group:
$$g_i = \frac{1}{q}\sum_{\tau=1}^{q}\nabla_\theta L\left(\theta_{i-1};\, x_\tau^{(i)}, y_\tau^{(i)}\right)$$
wherein $i$ denotes the i-th group of training data, $q$ denotes the number of iterations within the group, and $(x_\tau^{(i)}, y_\tau^{(i)})$ denotes the input and output data corresponding to the τ-th training datum in the i-th group;
(4.4.3) updating the parameters of the PM2.5 prediction model with the descent gradient of the group, the parameter update formula being:
$$\theta_i = \theta_{i-1} - \eta\, g_i$$
wherein $\theta_{i-1}$ denotes the target parameters after training on the previous group is finished, $\theta_i$ denotes the target parameters after training on the current group is finished, and $\eta$ denotes the learning rate;
(4.4.4) after the target parameters have been updated for the current group, returning to step (4.4.2) to train on the next group and update again, until the error value falls below the set expected error value or training on the last group is finished, and then updating and storing the final parameters to obtain the trained PM2.5 prediction model;
(5) judging whether the PM2.5 prediction model meets the training stop condition
K groups of data are input in time-series order into the trained PM2.5 prediction model, T predicted values are output, and the error between each predicted value and the true value is judged; if the error is within the allowable range, the prediction model is considered to have completed training; otherwise, the method returns to step (4) and retrains until the stop condition is met;
(6) predicting PM2.5 with the PM2.5 prediction model
The current K groups of weather data are input into the PM2.5 prediction model, and T predicted PM2.5 values are output.
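The pairing of K input time frames with T future PM2.5 values described in steps (4.2) to (6) can be illustrated with a short sliding-window sketch. This is only an illustration under stated assumptions: the function name `make_windows`, the step size of one hour between windows, and the two-column toy rows are hypothetical and do not come from the method itself.

```python
def make_windows(rows, k, t, target_col):
    """Pair each run of k consecutive rows with the next t target values."""
    samples = []
    # Assumption: consecutive windows are shifted by one time step.
    for start in range(len(rows) - k - t + 1):
        x = rows[start:start + k]  # K time frames of weather data
        y = [r[target_col] for r in rows[start + k:start + k + t]]  # next T values
        samples.append((x, y))
    return samples


# Toy data: 6 hourly rows with 2 hypothetical indexes each, [wind_speed, pm25].
rows = [[1.0, 10.0], [2.0, 20.0], [3.0, 30.0],
        [4.0, 40.0], [5.0, 50.0], [6.0, 60.0]]
windows = make_windows(rows, k=3, t=2, target_col=1)
print(len(windows))   # number of (input, target) pairs
print(windows[0][1])  # PM2.5 targets that follow the first window
```

With K = 3 and T = 2, six rows give two usable pairs, because each pair consumes K + T consecutive rows.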
The object of the invention is achieved as follows:
According to the PM2.5 prediction method based on a deep-structure recurrent neural network, the large volume of collected data is used to construct a deep-structure PM2.5 prediction model according to deep learning and recurrent neural network theory, and haze weather is predicted through the extraction of data features and training, which improves the efficiency and precision of haze prediction and provides a convincing decision basis for haze prevention and control. The prediction model places few requirements on data structure and can learn by itself as long as the data volume is large enough, so deep learning is well suited to current Internet big-data applications.
Meanwhile, the PM2.5 prediction method based on the deep structure recurrent neural network further has the following beneficial effects:
(1) The prediction model transforms the original data into higher-level abstract representations through simple nonlinear models, and by composing multiple layers of such transformations it can learn and extract the features of very complex functions. This overcomes the limited ability of traditional shallow-structure prediction models to represent complex functions when samples and computing units are limited.
(2) The core distinction of the prediction model is that it has multiple hidden layers, and the features extracted at each layer are not designed by hand but learned from the data by a general-purpose learning procedure; no particular data structure is therefore required, which simplifies data processing and improves efficiency.
(3) By using the data of different regions, the PM2.5 concentration of the corresponding region can be predicted: the parameters of the PM2.5 prediction model are re-determined according to actual conditions and requirements, with no need to rebuild the network, so the method is flexible and portable.
Detailed Description
Specific embodiments of the present invention are described below with reference to the accompanying drawings so that those skilled in the art can better understand the invention. It should be expressly noted that, in the following description, detailed descriptions of known functions and designs are omitted where they might obscure the subject matter of the present invention.
Examples
FIG. 1 is a flowchart of a method for predicting PM2.5 based on a deep-structure recurrent neural network according to the present invention.
In this embodiment, as shown in FIG. 1, the PM2.5 prediction method based on a deep-structure recurrent neural network according to the present invention includes the following steps:
S1, obtaining historical weather data, including hourly data for the following indexes: temperature, illumination, wind speed, rainfall, SO2, O3, NO, PM10, and PM2.5, wherein temperature is in °C, illumination in lm/m², wind speed in m/s, and rainfall in mm, and SO2, O3, NO, PM10, and PM2.5 are concentration data;
In this embodiment, historical weather data from May 2014 to May 2017, obtained from the China Meteorological Administration, are used. The data comprise hourly values of nine indexes: temperature, illumination, wind speed, rainfall, SO2, O3, NO, PM10, and PM2.5 (temperature in °C, illumination in lm/m², wind speed in m/s, rainfall in mm; SO2, O3, NO, PM10, and PM2.5 are concentration data). In total there are 26280 × 9 data values, which ensures a sufficient data volume for the deep-structure recurrent neural network PM2.5 prediction model.
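The data volume quoted above can be checked with a few lines of arithmetic. This is a minimal sketch that assumes three 365-day years of hourly records (leap days ignored); the constant names are illustrative only.

```python
HOURS_PER_DAY = 24
DAYS_PER_YEAR = 365  # assumption: leap days are ignored
YEARS = 3            # May 2014 to May 2017
N_INDEXES = 9        # temperature, illumination, ..., PM2.5

hourly_records = YEARS * DAYS_PER_YEAR * HOURS_PER_DAY
total_values = hourly_records * N_INDEXES
print(hourly_records, total_values)  # 26280 236520
```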
S2, preprocessing data
S2.1, complementing the missing historical weather data
The collected weather data are time-series historical data containing a small number of missing values; these are filled in by the mean-value method to ensure the integrity of the data.
The formula for completing the missing historical weather data by the mean-value method is:
$$X_t = \frac{X_{t-1} + X_{t+1}}{2}$$
wherein $X_t$ denotes the missing historical weather datum at the current time, $X_{t-1}$ denotes the historical weather datum at the previous time, and $X_{t+1}$ denotes the historical weather datum at the subsequent time;
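A minimal sketch of the mean-value completion in S2.1, assuming a missing hourly value is marked with `None` and is filled only when both of its neighbours are present. The function name `fill_missing` and the handling of gaps at the ends of the series or of consecutive gaps (left unfilled here) are assumptions, since the text does not cover those cases.

```python
def fill_missing(series):
    """Replace each isolated None with the mean of its two neighbours."""
    filled = list(series)
    for t in range(1, len(filled) - 1):
        neighbours_known = (filled[t - 1] is not None
                            and filled[t + 1] is not None)
        if filled[t] is None and neighbours_known:
            filled[t] = (filled[t - 1] + filled[t + 1]) / 2.0
    return filled


pm25 = [35.0, None, 41.0, 44.0, None, 50.0]
completed = fill_missing(pm25)
print(completed)  # [35.0, 38.0, 41.0, 44.0, 47.0, 50.0]
```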
S2.2, normalizing all historical weather data
The historical weather data are normalized to the range -1 to 1 according to the following formula:
$$X' = \frac{X - \bar{X}}{X_{\max} - X_{\min}}$$
wherein $X'$ denotes the historical weather data after normalization, $X$ denotes the historical weather data before normalization, $\bar{X}$ denotes the mean of the historical weather data, $X_{\max}$ denotes the maximum of the historical weather data, and $X_{\min}$ denotes the minimum of the historical weather data;
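A minimal sketch of the normalization in S2.2, assuming the mean-normalization form X' = (X - mean) / (max - min) built from the three quantities defined above, applied to one index at a time. The function name `mean_normalize` and the treatment of a constant series (mapped to zeros) are assumptions.

```python
def mean_normalize(values):
    """Scale one index so that every value lies between -1 and 1."""
    mean = sum(values) / len(values)
    span = max(values) - min(values)
    if span == 0:
        # Assumption: a constant series carries no information, so map it to 0.
        return [0.0 for _ in values]
    return [(v - mean) / span for v in values]


temps = [10.0, 20.0, 30.0, 40.0]
scaled = mean_normalize(temps)
print(scaled)
```

Because the mean always lies between the minimum and the maximum, every normalized value falls inside the interval from -1 to 1.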
S3, dividing the preprocessed historical weather data into training data and test data in a ratio of 70% to 30%;
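The 70%/30% division can be sketched as follows. The text does not say how the split is drawn; this sketch assumes a chronological split without shuffling, which keeps the time order needed later for building input sequences. The function name `chronological_split` is hypothetical.

```python
def chronological_split(samples, train_percent=70):
    """Split time-ordered samples into an earlier and a later part."""
    # Integer arithmetic avoids floating-point rounding at the cut point.
    cut = len(samples) * train_percent // 100
    return samples[:cut], samples[cut:]


hours = list(range(10))  # stand-in for 10 time-ordered records
train, test = chronological_split(hours)
print(train, test)  # [0, 1, 2, 3, 4, 5, 6] [7, 8, 9]
```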
S4, constructing a deep-structure PM2.5 prediction model based on deep learning theory and the recurrent neural network
S4.1, in this embodiment, constructing a deep recurrent neural network prediction model: the model comprises an input layer, eight hidden layers, and an output layer, for a model depth of 10 layers; the number of input-layer nodes is 9, the number of hidden-layer nodes is 50, and the number of output-layer nodes is 5; the input is the training data, and the output is the predicted PM2.5 concentration;
S4.2, setting the dimension of the input layer to K × 9 and the dimension of the output layer to 1 × T, and using the Tanh function as the activation function between the input layer and the hidden layer, between hidden layers, and between the hidden layer and the output layer;
wherein K denotes the depth to which the recurrent neural network is unrolled in time, namely K time frames, each of which takes one group of historical weather data as input; and T denotes the number of values output by the prediction model of the recurrent neural network, meaning that K pieces of historical data are used to predict the PM2.5 concentration at T future times, that is, the weather data of the preceding K times are input and the PM2.5 concentrations of the following T times are predicted;
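To make the layer sizes in S4.1 and S4.2 concrete, the following sketch runs a forward pass through a stacked recurrent network with 9 inputs, eight hidden layers of 50 nodes each, 5 outputs, and Tanh between all layers. It is an illustration rather than the embodiment's implementation: the Elman-style wiring (each hidden layer sees the layer below and its own previous state), the omission of bias terms, the random initialization, and all function and key names are assumptions.

```python
import math
import random


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * v for w, v in zip(row, vector)) for row in matrix]


def rand_matrix(rows, cols, rng, scale=0.1):
    return [[rng.uniform(-scale, scale) for _ in range(cols)]
            for _ in range(rows)]


def init_deep_rnn(n_in, n_hidden, n_layers, n_out, seed=0):
    rng = random.Random(seed)
    layers = []
    for layer in range(n_layers):
        fan_in = n_in if layer == 0 else n_hidden
        layers.append({
            "W_in": rand_matrix(n_hidden, fan_in, rng),     # from the layer below
            "W_rec": rand_matrix(n_hidden, n_hidden, rng),  # from the previous frame
        })
    return {"layers": layers, "W_out": rand_matrix(n_out, n_hidden, rng)}


def forward(params, window):
    """Map K time frames (each with n_in values) to n_out predictions."""
    n_hidden = len(params["W_out"][0])
    states = [[0.0] * n_hidden for _ in params["layers"]]
    for frame in window:  # unroll over the K time frames
        below = frame
        for idx, layer in enumerate(params["layers"]):
            pre = [a + b for a, b in zip(matvec(layer["W_in"], below),
                                         matvec(layer["W_rec"], states[idx]))]
            states[idx] = [math.tanh(p) for p in pre]
            below = states[idx]
    # The output layer reads the top hidden state after the last frame.
    return [math.tanh(o) for o in matvec(params["W_out"], states[-1])]


params = init_deep_rnn(n_in=9, n_hidden=50, n_layers=8, n_out=5)
window = [[0.01 * (i + j) for j in range(9)] for i in range(4)]  # K = 4 frames
prediction = forward(params, window)
print(len(prediction))  # T = 5 normalized PM2.5 predictions
```

Because the output layer also uses Tanh, the predictions lie strictly between -1 and 1, which matches the range of the normalized training targets.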
S4.3, selecting the loss function used in the PM2.5 prediction model
In this embodiment, subsets $M_i$ are selected segment by segment, in time-series order, from the training data of the PM2.5 prediction model to serve as mini-batch data sets. Each subset $M_i$ contains m sample points, comprising input data and labeled output data, and is denoted $M_i(X_i, Y_i)$, where the input data are $X_i$, the true labeled output data are $Y_i$, and the output data of the neural network during training are $Y_i'$. The correspondence is expressed as follows:
$$X_{i,\tau} \mapsto Y'_{i,\tau}, \quad \tau = 1, 2, \ldots, m-K+1$$
that is, inputting the τ-th matrix $X_{i,\tau}$ in time-series order yields the row vector $Y'_{i,\tau}$.
For each subset $M_i$, the training loss function is chosen to be the mean square error, which measures the error of the prediction model's parameter vector on the given subset of training data. It is denoted $L(\theta; M_i)$ and expressed by the following formula:
$$L(\theta; M_i) = \frac{1}{q\,t}\sum_{i=1}^{q}\sum_{j=1}^{t}\left(y_{i,j} - y'_{i,j}\right)^2$$
wherein $q$ denotes the number of iterations, $t$ is the output vector dimension, $y_{i,j}$ denotes the actual value of the training data, and $y'_{i,j}$ denotes the predicted value of the training data;
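A minimal sketch of the mean square error in S4.3, taking q rows of actual values and q rows of predictions, each of output dimension t. Dividing by q·t is an assumption about the normalizing constant; the function name `mse_loss` is hypothetical.

```python
def mse_loss(actual, predicted):
    """Mean square error over q output vectors of dimension t."""
    q = len(actual)
    t = len(actual[0])
    total = sum((y - y_hat) ** 2
                for row_y, row_hat in zip(actual, predicted)
                for y, y_hat in zip(row_y, row_hat))
    return total / (q * t)  # assumption: average over all q * t entries


actual = [[1.0, 2.0], [3.0, 4.0]]
predicted = [[1.0, 1.0], [2.0, 4.0]]
loss = mse_loss(actual, predicted)
print(loss)  # 0.5
```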
S4.4, updating the parameters of the PM2.5 prediction model with a mini-batch stochastic gradient descent algorithm
S4.4.1, initializing the parameter $\theta_0$;
S4.4.2, dividing the training data, in time-series order, into groups of m training data each; for the first group, calculating the gradient of each training datum with the mini-batch stochastic gradient descent algorithm, and then taking the weighted average of the q gradient values as the descent gradient of the group:
$$g_i = \frac{1}{q}\sum_{\tau=1}^{q}\nabla_\theta L\left(\theta_{i-1};\, x_\tau^{(i)}, y_\tau^{(i)}\right)$$
wherein $i$ denotes the i-th group of training data, and $(x_\tau^{(i)}, y_\tau^{(i)})$ denotes the input and output data corresponding to the τ-th training datum in the i-th group;
S4.4.3, updating the parameters of the PM2.5 prediction model with the descent gradient of the group, the parameter update formula being:
$$\theta_i = \theta_{i-1} - \eta\, g_i$$
wherein $\theta_{i-1}$ denotes the target parameters after training on the previous group is finished, $\theta_i$ denotes the target parameters after training on the current group is finished, and $\eta$ denotes the learning rate;
In this embodiment, in combination with the embodiment in step S3, when the PM2.5 prediction model trains its parameters with the mini-batch stochastic gradient descent algorithm, each subset $M_i$ undergoes $m-K+1$ iterations, each using K pieces of data; the loss function is differentiated with respect to $\theta$ to obtain the gradient of each parameter, the weighted average of the $m-K+1$ gradients is taken as the descent gradient of one mini-batch training pass, and the parameters are then updated, specifically:
$$\theta_i = \theta_{i-1} - \frac{\eta}{m-K+1}\sum_{\tau=1}^{m-K+1}\nabla_\theta L\left(\theta_{i-1};\, x_\tau^{(i)}, y_\tau^{(i)}\right)$$
S4.4.4, after the target parameters have been updated for the current group, returning to step S4.4.2 to train on the next group and update again, until the descent gradient falls below the set expected error value or training on the last group is finished, and then updating and storing the final parameters to obtain the trained PM2.5 prediction model;
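The loop in S4.4.1 to S4.4.4 can be sketched on a toy problem. To keep the example short and verifiable, the recurrent network is replaced by a hypothetical one-input linear model y = w·x + b whose gradient is written out by hand; the equal-weight average of per-sample gradients, the stop test on the group's loss, and all names and hyperparameter values are assumptions made for illustration.

```python
def batch_gradient(theta, batch):
    """Average squared-error gradient over one group, for y_hat = w*x + b."""
    w, b = theta
    grad_w = grad_b = 0.0
    for x, y in batch:
        err = (w * x + b) - y
        grad_w += 2.0 * err * x
        grad_b += 2.0 * err
    return grad_w / len(batch), grad_b / len(batch)


def batch_loss(theta, batch):
    w, b = theta
    return sum(((w * x + b) - y) ** 2 for x, y in batch) / len(batch)


def train(samples, m, eta, target_error, theta=(0.0, 0.0)):
    """Mini-batch SGD over time-ordered groups of m samples."""
    for start in range(0, len(samples), m):  # next group, in time order
        batch = samples[start:start + m]
        g_w, g_b = batch_gradient(theta, batch)
        # theta_i = theta_{i-1} - eta * g_i
        theta = (theta[0] - eta * g_w, theta[1] - eta * g_b)
        if batch_loss(theta, batch) < target_error:  # stop early once small
            break
    return theta  # final parameters, also reached when the groups run out


# Toy data following y = 2x + 1, repeated so that there are many groups.
samples = [(x / 10.0, 2.0 * x / 10.0 + 1.0) for x in range(10)] * 300
theta = train(samples, m=10, eta=0.5, target_error=1e-8)
print(theta)  # close to (2.0, 1.0)
```

The same loop structure applies to the recurrent model; only `batch_gradient` would change, with the gradients obtained by backpropagation through the K unrolled time frames.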
S5, judging whether the PM2.5 prediction model meets the training stop condition
The test data are input, K groups at a time in time-series order, into the trained PM2.5 prediction model, T predicted values are output, and the error between each predicted value and the true value is judged; if the error is within the allowable range, the prediction model is considered to have completed training; otherwise, the method returns to step S4 and retrains until the stop condition is met;
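The acceptance check in S5 can be sketched as below. The text does not define the error measure or the allowable range, so this sketch assumes an absolute error per predicted value and a hypothetical tolerance `max_abs_error`; the sample numbers are invented purely to exercise the function.

```python
def within_tolerance(predicted, actual, max_abs_error):
    """True if every predicted value is close enough to its true value."""
    return all(abs(p - a) <= max_abs_error
               for p, a in zip(predicted, actual))


# Invented example values for T = 3 predictions (not measured data).
predicted = [52.0, 55.5, 60.0]
actual = [50.0, 57.0, 61.0]
ok = within_tolerance(predicted, actual, max_abs_error=3.0)
print(ok)  # True: the largest absolute error is 2.0
```

If the check fails, the procedure described above returns to S4 and retrains.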
S6, predicting PM2.5 with the PM2.5 prediction model
The current K groups of weather data are input into the PM2.5 prediction model, and T predicted PM2.5 values are output.
Although illustrative embodiments of the present invention have been described above to help those skilled in the art understand the invention, it should be understood that the invention is not limited to the scope of these embodiments. To those of ordinary skill in the art, various changes are apparent as long as they fall within the spirit and scope of the invention as defined by the appended claims, and all inventions and creations that make use of the inventive concept are protected.