CN107909206B

CN107909206B - PM2.5 prediction method based on deep structure recurrent neural network

Info

Publication number: CN107909206B
Application number: CN201711130537.XA
Authority: CN
Inventors: 刘珊; 杨波; 郑文锋; 宋利红
Original assignee: University of Electronic Science and Technology of China
Current assignee: University of Electronic Science and Technology of China
Priority date: 2017-11-15
Filing date: 2017-11-15
Publication date: 2021-06-04
Anticipated expiration: 2037-11-15
Also published as: CN107909206A

Abstract

The invention discloses a PM2.5 prediction method based on a deep-structure recurrent neural network. A large amount of collected data is used to construct and train a deep-structure PM2.5 prediction model according to deep learning and recurrent neural network theory, so that haze weather can be predicted. The aim is to improve the efficiency and accuracy of haze prediction and to provide a persuasive decision basis for haze prevention and control. The prediction model places almost no requirements on the data structure: as long as the data set is large enough, the model learns by itself, which makes deep learning well suited to current Internet big-data applications.

Description

PM2.5 prediction method based on deep structure recurrent neural network

Technical Field

The invention belongs to the technical field of environmental engineering detection, and particularly relates to a PM2.5 prediction method based on a deep-structure recurrent neural network.

Background

Air quality has always been a major issue bearing on the future of mankind. With social progress and the rapid growth of car ownership, the content of inhalable particles in the air has increased greatly, and environmental pollution is becoming more and more serious. As air quality continues to deteriorate, haze weather occurs more and more often and does more and more harm. Haze is a disastrous weather phenomenon. The inhalable particles PM2.5 are the main cause of haze weather; they seriously affect air quality and, more importantly, pose a great threat to human health.

There are many ideas and methods in air quality prediction research. Among them, a main development trend is to carry out quantitative research on, and effective prediction of, environmental quality, especially haze, based on the ideas of systems engineering combined with new theories and new methods.

Owing to a large number of uncertain and complex factors such as climate, temperature and human activities, the time series of various weather data are highly nonlinear and uncertain, and conventional analysis and prediction methods have difficulty capturing their patterns and characteristics of change.

Shallow neural networks are effective for simple or well-constrained problems, but because their modeling and representation capability is limited, they perform poorly on more complex real-world problems involving natural signals.

Deep neural networks have a plurality of hidden layers, which gives them structural advantages over traditional neural networks and a strong capability for feature abstraction. A deep neural network adopts a new way of encoding a problem: no algorithm needs to be designed and programmed directly to solve the problem; only the training process needs to be programmed, and the network learns the correct way to solve the problem during training. Provided that the amount of data is sufficient, a simple algorithm applied to complex data can achieve remarkable results.

Meanwhile, thanks to the great improvement in chip processing performance and the explosive growth of data available for training, research in machine learning and in signal and information processing has recently advanced considerably. Deep learning methods can effectively use complex nonlinear functions and compositions of nonlinear functions to learn distributed, hierarchical feature representations, and can make full and effective use of both labeled and unlabeled data.

A Recurrent Neural Network (RNN) is a class of deep networks that can be used for unsupervised (and supervised) learning, whose depth can be as large as the length of the input sequence. In unsupervised learning mode, an RNN predicts future data sequences from previous data samples without using class information in the learning process, so RNNs are well suited to modeling sequence data.

Disclosure of Invention

The invention aims to overcome the defects of the prior art and provides a PM2.5 prediction method based on a deep-structure recurrent neural network, wherein a PM2.5 prediction model is built by combining the basic theory, the network structure and the flow principle of the recurrent neural network, so that the PM2.5 prediction is realized.

In order to achieve the above object, the present invention provides a PM2.5 prediction method based on a deep structure recurrent neural network, comprising the steps of:

(1) obtaining historical weather data, including hourly temperature, illumination, wind speed, rainfall, SO2, O3, NO, PM10 and PM2.5 data indexes, wherein the temperature unit is °C, the illumination unit is lm/m², the wind speed unit is m/s, the rainfall unit is mm, and SO2, O3, NO, PM10 and PM2.5 are concentration data;

(2) data preprocessing

(2.1) complementing missing historical weather data

The missing historical weather data are complemented by using an averaging method:

$$X_t = \frac{X_{t-1} + X_{t+1}}{2}$$

wherein $X_t$ represents the missing historical weather data at the current time, $X_{t-1}$ represents the historical weather data at the previous time, and $X_{t+1}$ represents the historical weather data at the next time;

(2.2) normalizing all historical weather data

All historical weather data are normalized to between -1 and 1 according to the following formula:

$$X' = \frac{X - \bar{X}}{X_{\max} - X_{\min}}$$

wherein $X'$ represents the historical weather data after normalization, $X$ represents the historical weather data before normalization, $\bar{X}$ represents the mean value of the historical weather data, $X_{\max}$ represents the maximum value of the historical weather data, and $X_{\min}$ represents the minimum value of the historical weather data;
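The normalization step can be illustrated with a short sketch. It assumes the formula is mean-centred and scaled by the range (maximum minus minimum), as the symbol definitions suggest; the function name `normalize` and the handling of a constant series are illustrative choices, not part of the patent.

```python
from typing import List


def normalize(values: List[float]) -> List[float]:
    """Mean-centre one indicator and scale it by its range (max - min).

    Assumed form: X' = (X - mean) / (X_max - X_min), which keeps every
    value between -1 and 1.
    """
    mean = sum(values) / len(values)
    span = max(values) - min(values)
    if span == 0:
        # Constant series: illustrative choice, map everything to 0.
        return [0.0 for _ in values]
    return [(v - mean) / span for v in values]


temperatures = [10.0, 20.0, 30.0]
print(normalize(temperatures))  # [-0.5, 0.0, 0.5]
```

Each of the indicators would presumably be normalized separately, since their units and ranges differ.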

(3) dividing the preprocessed historical weather data into training data and testing data according to a proportion;

(4) constructing a deep-structure PM2.5 prediction model based on deep learning theory and the recurrent neural network

(4.1) constructing a deep recurrent neural network prediction model: the model depth is greater than N layers, the input is the training data, and the output is the predicted value of the PM2.5 concentration;

(4.2) setting the dimension of the input layer to K×(H-1) and the dimension of the output layer to 1×T, and adopting the Tanh function as the activation function between the input layer and the hidden layer, between hidden layers, and between the hidden layer and the output layer;

wherein K represents the depth of the recurrent neural network unrolled along the time sequence, namely K time frames, with one group of historical weather data input at each time frame; H represents the number of data indexes; T represents the number of values output by the recurrent neural network prediction model; that is, K pieces of historical data are used to predict the PM2.5 concentration at T future moments: the weather data of the preceding K moments are input, and the PM2.5 concentration data of the following T moments are predicted;

(4.3) selecting a loss function for use in the PM2.5 prediction model

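The mean square error is used as the loss function in the PM2.5 prediction model, that is, the squared differences $(y_{i,j} - y'_{i,j})^2$ are averaged over the q iterations and the t output components,

where q represents the number of iterations, t is the dimension of the output vector, $y_{i,j}$ represents the actual value of the training data, and $y'_{i,j}$ represents the predicted value of the training data;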
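One form of the mean square error consistent with these symbol definitions is sketched below. The normalizing constant $1/(q\,t)$ and the name $L(\theta)$ are assumptions made for illustration; other conventions (for example dividing by $q$ only) differ merely by a constant factor and lead to the same minimizer.

```latex
L(\theta) = \frac{1}{q\,t} \sum_{i=1}^{q} \sum_{j=1}^{t} \left( y_{i,j} - y'_{i,j} \right)^{2}
```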

(4.4) updating the parameters in the PM2.5 prediction model by adopting a mini-batch stochastic gradient descent algorithm

(4.4.1) initializing the parameter θ₀;

(4.4.2) dividing the training data, in time order, into groups of m training samples each, calculating the gradient value of each training sample in the first group by using the mini-batch stochastic gradient descent algorithm, and performing a weighted average summation of the gradient values to obtain the descending gradient of that group of training data, wherein i denotes the i-th group of training data and τ indexes the training samples within the group, each comprising its corresponding input and output data;

(4.4.3) updating the parameters in the PM2.5 prediction model with the descending gradient of the group of training data, wherein the parameter updating formula is:

$$\theta_i = \theta_{i-1} - \eta\, g_i$$

wherein $\theta_{i-1}$ represents the target parameter after training on the previous group of data is finished, $\theta_i$ represents the target parameter after training on the current group of data is finished, $g_i$ represents the descending gradient of the current group obtained in step (4.4.2), and $\eta$ represents the learning rate;

(4.4.4) after the target parameters have been updated for the current group of training data, returning to step (4.4.2) to train on and update with the next group of training data, until the error value is lower than the set expected error value or training on the last group of training data is finished; the final parameters are then updated and stored to obtain the trained PM2.5 prediction model;
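The loop of steps (4.4.1) to (4.4.4) can be sketched as follows. To stay short, the sketch swaps the deep recurrent network for a one-variable linear model `w * x + b`, uses equal weights for the gradient average, and measures the error as the mean square error of the current group; all three are assumptions made for illustration, as are the function and parameter names.

```python
from typing import List, Tuple


def minibatch_sgd(xs: List[float], ys: List[float], m: int, eta: float,
                  target_error: float,
                  theta0: Tuple[float, float] = (0.0, 0.0)) -> Tuple[float, float]:
    """One pass of mini-batch SGD over time-ordered groups of m samples."""
    w, b = theta0                                   # (4.4.1) initialise parameters
    for start in range(0, len(xs), m):              # (4.4.2) next group of m samples
        bx, by = xs[start:start + m], ys[start:start + m]
        n = len(bx)
        # Per-sample gradients of (w*x + b - y)^2, averaged with equal weights.
        grad_w = sum(2.0 * (w * x + b - y) * x for x, y in zip(bx, by)) / n
        grad_b = sum(2.0 * (w * x + b - y) for x, y in zip(bx, by)) / n
        # (4.4.3) theta_i = theta_{i-1} - eta * gradient
        w, b = w - eta * grad_w, b - eta * grad_b
        # (4.4.4) stop early once the group error falls below the target.
        error = sum((w * x + b - y) ** 2 for x, y in zip(bx, by)) / n
        if error < target_error:
            break
    return w, b


w, b = minibatch_sgd([1.0, 2.0, 3.0, 4.0], [2.0, 4.0, 6.0, 8.0],
                     m=2, eta=0.05, target_error=1e-6)
print(w, b)
```

In the method itself the gradients would come from backpropagation through the recurrent network rather than from this closed-form expression.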

(5) judging whether the PM2.5 prediction model reaches the training stop condition or not

Inputting K groups of data into a trained PM2.5 prediction model according to a time sequence, outputting T predicted values, judging the error between each predicted value and the true value, if the error is within an allowable range, considering that the prediction model completes training, and if not, returning to the step (4) to retrain until a stop condition is reached;

(6) PM2.5 prediction by using PM2.5 prediction model

The current K groups of weather data are input into the PM2.5 prediction model, which outputs T predicted PM2.5 values.

The object of the invention is achieved as follows:

According to the PM2.5 prediction method based on the deep-structure recurrent neural network, a deep-structure PM2.5 prediction model is constructed from a large amount of collected data according to deep learning and recurrent neural network theory, and haze weather is predicted through the extraction of data features and training, so that the efficiency and accuracy of haze prediction are improved and a convincing decision basis is provided for haze prevention and control. The prediction model places almost no requirements on the data structure and learns by itself as long as the data set is large enough, which makes deep learning well suited to current Internet big-data applications.

Meanwhile, the PM2.5 prediction method based on the deep structure recurrent neural network further has the following beneficial effects:

(1) The prediction model transforms the original data into higher-level abstract representations through simple nonlinear models and, by combining multiple layers of such transformations, can learn and extract very complex functional features. This overcomes the limited ability of traditional shallow-structure prediction models to represent complex functions when samples and computing units are limited.

(2) The core difference of the prediction model is that it has a plurality of hidden layers, and the features extracted by each layer are not designed manually but are learned from the data by a general learning process. The model therefore places no requirements on the data structure, which simplifies data processing and improves efficiency.

(3) By using data from a different region, PM2.5 concentration prediction for that region can be realized; the parameters of the PM2.5 prediction model are re-determined according to actual conditions and requirements without rebuilding the network, so the method is flexible and portable.

Drawings

FIG. 1 is a flowchart of a method for predicting PM2.5 based on a deep-structure recurrent neural network according to the present invention;

Detailed Description

Specific embodiments of the present invention are described below with reference to the accompanying drawings so that those skilled in the art can better understand the invention. Note in particular that, in the following description, detailed descriptions of known functions and designs are omitted where they would obscure the subject matter of the present invention.

Examples

FIG. 1 is a flowchart of a method for predicting PM2.5 based on a deep-structure recurrent neural network according to the present invention.

In this embodiment, as shown in fig. 1, a method for predicting PM2.5 based on a deep-structure recurrent neural network according to the present invention includes the following steps:

S1, obtaining historical weather data, including hourly temperature, illumination, wind speed, rainfall, SO2, O3, NO, PM10 and PM2.5 data indexes, wherein the temperature unit is °C, the illumination unit is lm/m², the wind speed unit is m/s, the rainfall unit is mm, and SO2, O3, NO, PM10 and PM2.5 are concentration data;

In this embodiment, historical weather data from May 2014 to May 2017 obtained from the China weather service bureau are used. The data comprise the hourly temperature, illumination, wind speed, rainfall, SO2, O3, NO, PM10 and PM2.5 data indexes (the temperature unit is °C, the illumination unit is lm/m², the wind speed unit is m/s, the rainfall unit is mm, and SO2, O3, NO, PM10 and PM2.5 are concentration data), 9 indexes in total and 26280 × 9 data values in all, which ensures a sufficient amount of data for the deep-structure recurrent neural network PM2.5 prediction model.
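The stated data volume can be checked with simple arithmetic. The sketch below assumes that the figure 26280 corresponds to three 365-day years of hourly readings; that reading is an inference from the numbers, not something the text states.

```python
HOURS_PER_DAY = 24
DAYS_PER_YEAR = 365   # assumption: leap day not counted
YEARS = 3             # May 2014 to May 2017
INDICATORS = 9        # temperature, illumination, wind speed, rainfall, SO2, O3, NO, PM10, PM2.5

hourly_records = YEARS * DAYS_PER_YEAR * HOURS_PER_DAY
total_values = hourly_records * INDICATORS

print(hourly_records)  # 26280
print(total_values)    # 236520
```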

S2, preprocessing data

S2.1, complementing the missing historical weather data

The collected weather data are time-series historical data with a small number of missing values. The missing values are complemented by the mean method to ensure the integrity of the data.

The formula for complementing the missing historical weather data by using the averaging method is as follows:

$$X_t = \frac{X_{t-1} + X_{t+1}}{2}$$

wherein $X_t$ represents the missing historical weather data at the current time, $X_{t-1}$ represents the historical weather data at the previous time, and $X_{t+1}$ represents the historical weather data at the next time;
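The mean-method fill can be sketched as follows. Representing a missing reading as `None`, and leaving gaps at the ends of the series or next to another gap untouched, are illustrative assumptions; the text does not say how such cases are handled.

```python
from typing import List, Optional


def fill_missing(series: List[Optional[float]]) -> List[Optional[float]]:
    """Replace an isolated missing value with the mean of its two neighbours."""
    filled = list(series)
    for t in range(1, len(series) - 1):
        prev_val, next_val = series[t - 1], series[t + 1]
        if series[t] is None and prev_val is not None and next_val is not None:
            filled[t] = (prev_val + next_val) / 2.0
    return filled


pm25 = [35.0, None, 41.0, 44.0]
print(fill_missing(pm25))  # [35.0, 38.0, 41.0, 44.0]
```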

S2.2, normalizing all historical weather data

S3, dividing the preprocessed historical weather data into training data and test data in a proportion of 70% to 30%;
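A 70% to 30% split can be sketched as below. Keeping the samples in time order (earlier data for training, later data for testing) is an assumption that suits time-series data; the text only gives the proportion. Integer arithmetic is used to avoid floating-point rounding at the cut point.

```python
from typing import List, Sequence, Tuple


def chronological_split(samples: Sequence, train_percent: int = 70) -> Tuple[List, List]:
    """Split time-ordered samples into a leading training part and a trailing test part."""
    cut = len(samples) * train_percent // 100
    return list(samples[:cut]), list(samples[cut:])


hours = list(range(26280))  # one entry per hourly record
train, test = chronological_split(hours)
print(len(train), len(test))  # 18396 7884
```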

S4, constructing a deep-structure PM2.5 prediction model based on deep learning theory and the recurrent neural network

S4.1, in this embodiment, constructing a deep recurrent neural network prediction model: the model comprises one input layer, eight hidden layers and one output layer, so the model depth is 10 layers; the input layer has 9 nodes, each hidden layer has 50 nodes and the output layer has 5 nodes; the input is the training data, and the output is the predicted value of the PM2.5 concentration;
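To make the layer sizes concrete, the sketch below runs a forward pass through a stacked recurrent network with 9 inputs, eight hidden layers of 50 nodes and 5 outputs, using tanh throughout. The Elman-style recurrence, the omission of bias terms, the random weights, and reading the output from the last time frame are all assumptions made for illustration; the text does not specify these details.

```python
import math
import random
from typing import List

Vector = List[float]
Matrix = List[List[float]]


def rand_matrix(rng: random.Random, rows: int, cols: int, scale: float = 0.1) -> Matrix:
    return [[rng.uniform(-scale, scale) for _ in range(cols)] for _ in range(rows)]


def matvec(m: Matrix, v: Vector) -> Vector:
    return [sum(w * x for w, x in zip(row, v)) for row in m]


class DeepRNN:
    """Untrained stacked tanh RNN: 1 input layer, n_layers hidden layers, 1 output layer."""

    def __init__(self, n_in: int = 9, n_hidden: int = 50, n_layers: int = 8,
                 n_out: int = 5, seed: int = 0) -> None:
        rng = random.Random(seed)
        sizes = [n_in] + [n_hidden] * n_layers
        # Weights from the layer below, and recurrent weights within each layer.
        self.w_in = [rand_matrix(rng, sizes[l + 1], sizes[l]) for l in range(n_layers)]
        self.w_rec = [rand_matrix(rng, n_hidden, n_hidden) for _ in range(n_layers)]
        self.w_out = rand_matrix(rng, n_out, n_hidden)
        self.n_hidden, self.n_layers = n_hidden, n_layers

    def forward(self, frames: List[Vector]) -> Vector:
        """Consume K time frames of 9 values each and return the output vector."""
        state = [[0.0] * self.n_hidden for _ in range(self.n_layers)]
        for frame in frames:
            layer_input = frame
            for l in range(self.n_layers):
                below = matvec(self.w_in[l], layer_input)
                prev = matvec(self.w_rec[l], state[l])
                state[l] = [math.tanh(a + b) for a, b in zip(below, prev)]
                layer_input = state[l]
        return [math.tanh(z) for z in matvec(self.w_out, state[-1])]


model = DeepRNN()
frames = [[0.1] * 9 for _ in range(4)]  # K = 4 frames, chosen only for the demo
print(len(model.forward(frames)))  # 5
```

Because the output activation is tanh, the outputs lie between -1 and 1, which matches the range of the normalized data.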

S4.2, setting the dimension of the input layer to K×9 and the dimension of the output layer to 1×T, and adopting the Tanh function as the activation function between the input layer and the hidden layer, between hidden layers, and between the hidden layer and the output layer;

wherein K represents the depth of the recurrent neural network unrolled along the time sequence, namely K time frames, with one group of historical weather data input at each time frame; T represents the number of values output by the recurrent neural network prediction model; that is, K pieces of historical data are used to predict the PM2.5 concentration at T future moments: the weather data of the preceding K moments are input, and the PM2.5 concentration data of the following T moments are predicted;
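The pairing of K input frames with the following T PM2.5 values can be sketched as a sliding window over the hourly records. The function name `make_windows` and the stride of one hour between consecutive windows are assumptions for illustration.

```python
from typing import List, Tuple

Frame = List[float]


def make_windows(rows: List[Frame], pm25: List[float], k: int,
                 t: int) -> List[Tuple[List[Frame], List[float]]]:
    """Pair each run of k consecutive frames with the next t PM2.5 values."""
    samples = []
    for start in range(len(rows) - k - t + 1):
        x = rows[start:start + k]              # K frames of weather data (K x 9 in the embodiment)
        y = pm25[start + k:start + k + t]      # the T concentrations that follow
        samples.append((x, y))
    return samples


rows = [[float(i)] for i in range(6)]          # toy data: one indicator per frame
pm25 = [float(i) for i in range(6)]
windows = make_windows(rows, pm25, k=3, t=2)
print(len(windows))     # 2
print(windows[0][1])    # [3.0, 4.0]
```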

S4.3, selecting a loss function used in the PM2.5 prediction model

In this embodiment, subsets $M_i$ are selected segment by segment, in time order, from the training data set of the PM2.5 prediction model and used as mini-batch data sets. Each subset $M_i$ contains m sample points comprising input data and labeled output data, and is denoted $M_i(X_i, Y_i)$, where the input data are $X_i$, the true labeled output data are $Y_i$, and the output data of the neural network during training are denoted $Y_i'$. Within a subset, the τ-th input in time order is a matrix (K×9, matching the input layer of step S4.2) and the corresponding output is a row vector (1×T, matching the output layer), with τ = 1, 2, ..., m-K+1.

For each subset $M_i$, the mean square error is chosen as the training loss function; it measures the error of the parameter vector of the prediction model over the given subset of training data.

S4.4, updating the parameters in the PM2.5 prediction model by adopting a mini-batch stochastic gradient descent algorithm

S4.4.1, initializing the parameter θ₀;

S4.4.2, dividing the training data, in time order, into groups of m training samples each, calculating the gradient value of each training sample in the first group by using the mini-batch stochastic gradient descent algorithm, and then performing a weighted average summation of the q gradient values to obtain the descending gradient of that group of training data, wherein i denotes the i-th group of training data;

S4.4.3, updating the parameters in the PM2.5 prediction model with the descending gradient of the group of training data, wherein the parameter updating formula is:

$$\theta_i = \theta_{i-1} - \eta\, g_i$$

wherein $\theta_{i-1}$ and $\theta_i$ represent the target parameters before and after training on the i-th group, $g_i$ represents the descending gradient of the i-th group, and $\eta$ represents the learning rate;

In this embodiment, in combination with step S3, when the PM2.5 prediction model trains its parameters with the mini-batch stochastic gradient descent algorithm, each subset $M_i$ is processed in (m-K+1) iterations, each using K pieces of data. In each iteration the loss function is differentiated to obtain the gradient with respect to each parameter; the m-K+1 gradients are then combined by weighted average summation to give the descending gradient of one mini-batch training step, after which the parameters are updated.

S4.4.4, after the target parameters have been updated for the current group of training data, returning to step S4.4.2 to train on and update with the next group of training data, until the error value is lower than the set expected error value or training on the last group of training data is finished; the final parameters are then updated and stored to obtain the trained PM2.5 prediction model;
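The bookkeeping inside one mini-batch can be sketched as follows: a group of m time steps yields m-K+1 windows of K frames, their gradients are averaged, and one update is applied. Equal weights in the average and the values m = 32 and K = 8 are hypothetical choices for the example; the gradient vectors are toy numbers standing in for backpropagated gradients.

```python
from typing import List


def window_count(m: int, k: int) -> int:
    """Number of K-frame windows that fit in a mini-batch of m time steps."""
    return m - k + 1


def average_gradients(gradients: List[List[float]]) -> List[float]:
    """Equal-weight average of per-window gradient vectors."""
    n = len(gradients)
    return [sum(g[p] for g in gradients) / n for p in range(len(gradients[0]))]


def sgd_step(theta: List[float], gradient: List[float], eta: float) -> List[float]:
    """theta_i = theta_{i-1} - eta * gradient."""
    return [p - eta * g for p, g in zip(theta, gradient)]


print(window_count(32, 8))  # 25
grads = [[1.0, 2.0], [3.0, 6.0]]          # toy per-window gradients
theta = sgd_step([0.5, 0.5], average_gradients(grads), eta=0.1)
print(theta)
```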

S5, judging whether the PM2.5 prediction model reaches the training stop condition

The test set data are input, K groups at a time in time order, into the trained PM2.5 prediction model, which outputs T predicted values. The error between each predicted value and the true value is then judged: if the error is within the allowable range, the prediction model is considered to have completed training; otherwise, the method returns to step S4 to retrain until the stop condition is reached;
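The stop check can be sketched as a comparison of each of the T predictions with its true value. Using the absolute error and a single threshold `max_abs_error` is an assumption; the text only requires the error to be within an allowable range.

```python
from typing import List


def within_tolerance(predicted: List[float], actual: List[float],
                     max_abs_error: float) -> bool:
    """True when every predicted value is within max_abs_error of its true value."""
    return all(abs(p - a) <= max_abs_error for p, a in zip(predicted, actual))


predicted = [50.0, 52.0]   # toy PM2.5 predictions
actual = [48.0, 55.0]      # toy true values
print(within_tolerance(predicted, actual, max_abs_error=3.0))  # True
print(within_tolerance(predicted, actual, max_abs_error=2.5))  # False
```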

S6, performing PM2.5 prediction by using the PM2.5 prediction model

Although illustrative embodiments of the present invention have been described above to help those skilled in the art understand the invention, it should be understood that the invention is not limited to the scope of these embodiments. Various changes will be apparent to those skilled in the art, and as long as they fall within the spirit and scope of the invention as defined and determined by the appended claims, all inventions and creations that make use of the inventive concept are protected.

Claims

1. A PM2.5 prediction method based on a deep-structure recurrent neural network, characterized by comprising the following steps:

(1) Obtain historical weather data, including hourly temperature, light, wind speed, rainfall, SO2, O3, NO, PM10 and PM2.5 data indicators, wherein the temperature unit is °C, the light unit is lm/㎡, the wind speed unit is m/s, the rainfall unit is mm, and SO2, O3, NO, PM10 and PM2.5 are concentration data;

(2) Data preprocessing

(2.1) Complete the missing historical weather data

Use the mean method to fill in missing historical weather data:

$$X_t = \frac{X_{t-1} + X_{t+1}}{2}$$

wherein $X_t$ represents the missing historical weather data at the current moment, $X_{t-1}$ represents the historical weather data at the previous moment, and $X_{t+1}$ represents the historical weather data at the next moment;

(2.2) Normalize all historical weather data

Normalize all historical weather data to between -1 and 1 according to the following formula:

$$X' = \frac{X - \bar{X}}{X_{\max} - X_{\min}}$$

wherein $X'$ represents the normalized historical weather data, $X$ represents the historical weather data before normalization, $\bar{X}$ represents the mean value of the historical weather data, $X_{\max}$ represents the maximum value of the historical weather data, and $X_{\min}$ represents the minimum value of the historical weather data;

(3) Divide the preprocessed historical weather data into training data and test data in proportion;

(4) Build a deep-structure PM2.5 prediction model based on deep learning theory and the recurrent neural network

(4.1) Build a deep recurrent neural network prediction model: one input layer, multiple hidden layers and one output layer; the model depth is greater than N layers, the input is the training data, and the output is the predicted value of the PM2.5 concentration;

(4.2) Set the dimension of the input layer to K×(H-1) and the dimension of the output layer to 1×T, and use the Tanh function as the activation function between the input layer and the hidden layer, between hidden layers, and between the hidden layer and the output layer;

wherein K represents the depth of the recurrent neural network unrolled along the time series, that is, K time frames, with one set of historical weather data input at each time frame; H represents the number of data indicators; T represents the number of values output by the recurrent neural network prediction model; that is, K pieces of historical data are used to predict the PM2.5 concentration at T future moments: the weather data of the preceding K moments are input, and the PM2.5 concentration data of the following T moments are predicted;

(4.3) Select the loss function used in the PM2.5 prediction model

The mean squared error is used as the loss function in the PM2.5 prediction model, that is, the squared differences $(y_{i,j} - y'_{i,j})^2$ are averaged over the q iterations and the t output components,

wherein q represents the number of iterations, t is the dimension of the output vector, $y_{i,j}$ represents the actual value of the training data, and $y'_{i,j}$ represents the predicted value of the training data;

(4.4) Use the mini-batch stochastic gradient descent algorithm to update the parameters in the PM2.5 prediction model

(4.4.1) Initialize the parameter θ₀;

(4.4.2) Divide the training data, in time order, into groups of m training samples each, use the mini-batch stochastic gradient descent algorithm to calculate the gradient value of each training sample in the first group, and then perform a weighted average summation of the gradient values to obtain the descending gradient of this group of training data, wherein i denotes the i-th group of training data and τ indexes the training samples within the group, each comprising its corresponding input and output data;

(4.4.3) Update the parameters in the PM2.5 prediction model with the descending gradient of this group of training data, the parameter update formula being:

$$\theta_i = \theta_{i-1} - \eta\, g_i$$

wherein $\theta_{i-1}$ represents the target parameter after training on the previous group of data is completed, $\theta_i$ represents the target parameter after training on this group of data is completed, $g_i$ represents the descending gradient of this group obtained in step (4.4.2), and $\eta$ represents the learning rate;

(4.4.4) When the target parameters have been updated for this group of training data, return to step (4.4.2) to train on and update with the next group of training data, until the error value is lower than the set expected error value or training on the last group of training data is completed; then update and save the final parameters to obtain the trained PM2.5 prediction model;

(5) Determine whether the PM2.5 prediction model reaches the training stop condition

Input the test set data, K groups at a time in time order, into the trained PM2.5 prediction model, output T predicted values, and then judge the error between each predicted value and the actual value; if the error is within the allowable range, the prediction model is considered to have completed training; otherwise return to step (4) for retraining until the stopping condition is reached;

(6) Use the PM2.5 prediction model to predict PM2.5

Input the current K groups of weather data into the PM2.5 prediction model, and output T PM2.5 prediction values.