
Artificial neuron models for hydrological modeling

Seema Narain and Ashu Jain


Abstract— Artificial Neural Networks (ANNs) have been successfully employed for hydrological modeling over the last two decades or so. Most ANN hydrologic models use the McCulloch and Pitts Artificial Neuron (MPAN) as the building block. The MPAN structure consists of an aggregation function (usually summation) and its transformation through a non-linear filter, squashing, or thresholding function. Networks built from such neurons have a number of disadvantages, such as the large number of neurons and hidden layers, and the large amount of training data, required for complex function approximation. This paper presents the results of a study employing a new artificial neuron called the Generalized Neuron (GN), which overcomes some of the limitations of the MPAN. Two neural network models are presented: the first is a traditional feed-forward neural network trained using the back-propagation algorithm, and the second is a GN model. The GN model employs linear discriminant functions with sigmoid and Gaussian activation functions, and contains no hidden layers. Both ANN models are developed using rainfall and flow data derived from the Kentucky River Basin in the USA. The results obtained in this study indicate that a compact ANN model consisting of a single artificial generalized neuron is capable of modeling the complex, dynamic, and non-linear rainfall-runoff process in a large watershed. Based on the results, the GN model may be preferred over the traditional feed-forward neural network for hydrological modeling.
I. INTRODUCTION

The modeling of hydrological processes is needed in many watershed management activities. Over the past three decades, there have been significant developments in hydrological modeling. Most hydrological processes involve a high degree of temporal and spatial variability. The involvement of many physiographic and climatic factors makes a hydrologic process more complex to understand and to model, especially the rainfall-runoff process. Historically, hydrologists have employed either conceptual methods that incorporate the physics of the system, or empirical approaches that do not consider the underlying physics. There has been tremendous growth in the use of ANNs for the modeling of hydrological systems in the last fifteen years or so. The ANN applications to hydrological modeling range from simple applications [1]-[4] to complex ANN models
Manuscript received January 31, 2007. Seema Narain is a Ph.D. student at the Indian Institute of Technology, Kanpur, India (e-mail: seemanv@iitk.ac.in). Ashu Jain is an Associate Professor at the Indian Institute of Technology, Kanpur, India (phone: +91-5122597411; fax: +91-2597395; e-mail: ashujain@iitk.ac.in).

involving specialized efforts such as the use of genetic algorithms for training neural networks [5]; hybrid neural networks [6]; and data decomposition and integration techniques [7], [8]. Some recent studies in hydrology focusing on the integration of conceptual and ANN methods, or using alternative training methods (e.g., genetic algorithms), emphasize the need to develop more robust and efficient hydrological models capable of producing more accurate flow forecasts. Most ANN applications in hydrology employ the McCulloch and Pitts Artificial Neuron (MPAN), proposed in the early 1940s. Although the MPAN has been found to function very well in most engineering applications, there are increasing demands for more and more accurate estimation of future variables, and researchers are looking beyond the MPAN in search of more efficient hydrological neural network models. Conventional neural networks using MPANs as building blocks suffer from several shortcomings: (a) the training time is too long, which results in a slower response of the system [2]; (b) the number of hidden layers and hidden neurons makes the model complex, apart from having to be determined on a trial-and-error basis [9]; (c) the existing ANN models perform only the summation of weighted inputs, leading to linear discriminant functions; and (d) the training methods employed, usually the back-propagation (BP) algorithm, are slow, vulnerable to getting stuck in local minima, and can be biased towards a particular flow magnitude (low, medium, or high) [10]. Many studies have attempted to modify the traditional MPAN for more robust and effective results. A delay differential equation model of a single neuron with an inertial term subject to time delay was considered in [11]. The dendritic structure, as the primary autonomous computational unit, was found capable of realizing logical operations.
Networks of spiking neurons have also been employed and found effective in various applications [12], [13]. This paper presents the results of a preliminary investigation of the use of a new artificial neuron called the Generalized Neuron (GN). The GN differs from the traditional MPAN in many ways, including its capability to realize a non-linear discriminant function [14], [15]. An ANN model based on the GN is developed using rainfall and flow data derived from the Kentucky River Basin in the USA. A feed-forward neural network model trained using the BP method is also developed. The performance of the GN model is compared with that of the traditional ANN model built from MPANs in terms of a variety of error statistics.

II. STUDY AREA AND DATA

The data derived from the Kentucky River Basin were employed to train and test all the models developed in this study. The Kentucky River Basin (see Figure 1) encompasses over 4.4 million acres (17,820 km2) of the state of Kentucky. Forty separate counties lie either completely or partially within the boundaries of the watershed. The Kentucky River is the sole source for several water supply companies of the state. The drainage area of the Kentucky River at Lock and Dam 10 (LD10) near Winchester, Kentucky, is approximately 10,240 km2, and the time of concentration of the watershed is approximately two days. The data used in this study include the average daily streamflow (m3/s) of the Kentucky River at LD10 and LD11 (near Heidelberg), and the daily average rainfall (mm) from five rain gauges (Manchester, Hyden, Jackson, Heidelberg, and Lexington Airport) scattered throughout the watershed. A total of 26 years of data (1960-1989, with data in some years missing) was available. The data were divided into two sets: a training set consisting of the daily rainfall and flow data for thirteen years (1960-1972), and a testing set consisting of the daily rainfall and flow data for thirteen years (1977-1989).
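The chronological split described above can be sketched in a few lines. The data frame here is a synthetic stand-in for the Kentucky River data (which is not distributed with the paper); the column names and distributions are illustrative assumptions, while the date ranges are the ones used in the study.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for the daily rainfall/flow record (1960-1989).
rng = np.random.default_rng(0)
dates = pd.date_range("1960-01-01", "1989-12-31", freq="D")
df = pd.DataFrame({
    "rainfall_mm": rng.gamma(0.5, 6.0, len(dates)),  # daily average rainfall
    "q10_m3s": rng.gamma(2.0, 75.0, len(dates)),     # flow at LD10 (target)
    "q11_m3s": rng.gamma(2.0, 50.0, len(dates)),     # flow at upstream LD11
}, index=dates)

# Chronological split as in the paper: thirteen years each for
# training (1960-1972) and testing (1977-1989).
train = df.loc["1960":"1972"]
test = df.loc["1977":"1989"]
print(len(train), len(test))
```

A chronological (rather than random) split is the natural choice here, since the goal is forecasting future flows from past records.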

R = \frac{\sum (XO - \overline{XO})(XE - \overline{XE})}{\sqrt{\sum (XO - \overline{XO})^2 \, \sum (XE - \overline{XE})^2}}    (3)

TS_x = \frac{n_x}{N} \times 100\%    (4)

AARE = \frac{1}{N} \sum_{t=1}^{N} \left| \frac{XO(t) - XE(t)}{XO(t)} \right| \times 100\%    (5)
where XO is the observed value of the variable, XE is the estimated value of the variable from a model, \overline{XO} is the average observed value, \overline{XE} is the average estimated value, n_x is the number of estimated data points for which the absolute relative error (ARE) is less than x%, N is the total number of estimated data points, and all summations run from 1 to N. Values of x of 1%, 10%, 50%, and 100% were considered in this study to compute the threshold statistics. Values of SSE and AARE close to 0.0 represent good model performance. The distribution of errors is well represented by the threshold statistics: TS_x may be defined as the percentage of data points forecasted for which the absolute relative error is less than x%, so a higher value of the threshold statistic indicates better model performance. TS can range between 0% and 100%, with higher values representing good performance. The coefficient of correlation can range between -1.0 and +1.0, with magnitudes close to 1.0 indicating good linear dependence between observed and modeled outputs. The Nash-Sutcliffe efficiency can range between -∞ and +1.0, with values close to 1.0 being very good; a value of E equal to 0.0 means the model is only as good as the mean of the observed values.
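The five statistics of Eqs. (1)-(5) translate directly into code. The sketch below uses strictly less-than for the TS_x threshold, as in the definition above; the sample observed/estimated arrays are illustrative.

```python
import numpy as np

def sse(obs, est):
    """Sum of squared errors, Eq. (1)."""
    return float(np.sum((est - obs) ** 2))

def nash_sutcliffe(obs, est):
    """Nash-Sutcliffe efficiency E, Eq. (2)."""
    return float(1.0 - np.sum((est - obs) ** 2) / np.sum((obs - obs.mean()) ** 2))

def pearson_r(obs, est):
    """Pearson coefficient of correlation R, Eq. (3)."""
    do, de = obs - obs.mean(), est - est.mean()
    return float(np.sum(do * de) / np.sqrt(np.sum(do ** 2) * np.sum(de ** 2)))

def threshold_stat(obs, est, x):
    """TS_x: percentage of points with absolute relative error below x%, Eq. (4)."""
    are = np.abs((obs - est) / obs) * 100.0
    return float(np.mean(are < x) * 100.0)

def aare(obs, est):
    """Average absolute relative error in percent, Eq. (5)."""
    return float(np.mean(np.abs((obs - est) / obs)) * 100.0)

# Small illustrative example.
obs = np.array([100.0, 200.0, 300.0, 400.0])
est = np.array([110.0, 190.0, 330.0, 380.0])
print(sse(obs, est), nash_sutcliffe(obs, est), aare(obs, est))
```

Note that AARE and TS_x divide by the observed flow, so they are undefined for zero-flow days; rainfall inputs can be zero, but the target flows in Table 1 never are.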
IV. MODEL DEVELOPMENT

Fig. 1. Kentucky River Basin

III. MODEL PERFORMANCE STATISTICS

The performance of all the models developed in this study was evaluated using five different standard statistical measures: the sum of squared errors (SSE), Nash-Sutcliffe efficiency (E), Pearson coefficient of correlation (R), average absolute relative error (AARE), and threshold statistic (TS). The equations to compute these statistics are provided below.

SSE = \sum (XE - XO)^2    (1)

E = 1 - \frac{\sum (XE - XO)^2}{\sum (XO - \overline{XO})^2}    (2)
Two types of neural network models are developed. The first is a feed-forward neural network trained using the BP algorithm; this ANN model uses the MPAN as its building block. It consists of three layers: an input layer, a hidden layer, and an output layer. The inputs to the ANN are the average rainfall at various time steps (Pt, Pt-1, and Pt-2); the flow in the Kentucky River at LD10 in the past (Q10t-1 and Q10t-2); and the flow in the Kentucky River at the upstream gauging station LD11 at various time steps (Q11t, Q11t-1, and Q11t-2). The inputs were selected based on cross-correlation and partial correlation analysis: correlation analysis was carried out for the inputs at various time lags, and the significant input variables were retained. The statistical properties of the training and testing input data sets are presented in Table 1. The ANN model thus developed would require forecasts of two key inputs, Pt and Q11t. The output from the ANN is Q10t, the flow being modeled. The number of neurons in the hidden layer was determined using a trial-and-error procedure: the BP method with a momentum term was used to train various ANN architectures (with hidden neurons varying from 1 to 20), and the best architecture in terms of various error statistics during training was selected. The trial-and-error method was also used to optimize the model parameters, namely the learning rate and the momentum correction factor; the finally selected values were 0.0007 and 0.99, respectively. Based on this method, the 8-5-1 ANN architecture was found suitable for modeling the daily flow of the Kentucky River at LD10. The results in terms of various error statistics during training and testing are presented in Table 2.
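The training procedure described above can be sketched as a plain back-propagation loop with a momentum term. This is not the authors' code: the 8-5-1 layout, learning rate 0.0007, and momentum 0.99 come from the text, while the synthetic data, initialization, and epoch count are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Synthetic stand-in for the 8 normalized inputs (rainfall and flow lags)
# and the normalized target flow Q10_t.
X = rng.random((200, 8))
y = sigmoid(X @ rng.normal(size=(8, 1)))

# 8-5-1 architecture: 8 inputs, 5 hidden sigmoid neurons, 1 output.
W1, b1 = rng.normal(0, 0.1, (8, 5)), np.zeros(5)
W2, b2 = rng.normal(0, 0.1, (5, 1)), np.zeros(1)
vW1, vb1, vW2, vb2 = [np.zeros_like(a) for a in (W1, b1, W2, b2)]
lr, mom = 0.0007, 0.99  # values reported in the paper

for epoch in range(500):
    h = sigmoid(X @ W1 + b1)              # hidden-layer activations
    out = sigmoid(h @ W2 + b2)            # network output
    err = out - y
    # Back-propagate the error through both sigmoid layers.
    d_out = err * out * (1 - out)
    d_h = (d_out @ W2.T) * h * (1 - h)
    # Momentum update: velocity = mom * velocity - lr * gradient.
    vW2 = mom * vW2 - lr * (h.T @ d_out); W2 += vW2
    vb2 = mom * vb2 - lr * d_out.sum(0);  b2 += vb2
    vW1 = mom * vW1 - lr * (X.T @ d_h);   W1 += vW1
    vb1 = mom * vb1 - lr * d_h.sum(0);    b1 += vb1

final_sse = float(np.sum((sigmoid(sigmoid(X @ W1 + b1) @ W2 + b2) - y) ** 2))
print("final SSE:", final_sse)
```

The momentum term lets the small learning rate make steady progress; the trial-and-error search over 1 to 20 hidden neurons would simply wrap this loop in an outer loop over architectures.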
TABLE 1 STATISTICAL PROPERTIES OF THE TRAINING AND TESTING DATA SET

For the non-linear (Π) part of the GN model, NetNL_j is the net input, the WNL_{ij} are the weights, and pi_bias is the bias weight. The output from the discriminant portions (Σ and Π) is

Model Inputs                P (mm)   Q11 (m3/s)   Q10 (m3/s)
Average        Training       3.24       105.65       149.94
               Testing        3.16       100.43       144.78
Maximum        Training      77.40      2449.40      2528.69
               Testing       81.99      2432.41      2806.19
Minimum        Training       0.00         1.98         3.60
               Testing        0.00         1.27         3.28
Std. Deviation Training       6.20       176.44       243.30
               Testing        6.18       168.45       232.81
Skewness       Training       3.49         4.34         3.67
               Testing        3.94         4.82         4.00
Fig. 2. Generalized neuron model

calculated using their respective activation functions, which can be a sigmoid, a Gaussian, a spline, or any other mathematical function satisfying the conditions of an activation function in traditional ANNs employing MPANs. In this study, sigmoid and Gaussian activation functions were used to calculate the outputs of the two components of the GN model.

y_1 = \frac{1}{1 + e^{-NetL_j}}    (8)

y_2 = e^{-NetNL_j^2}    (9)
A. Generalized Neuron Model

The existing neuron structure has an aggregation function (usually summation) and its transformation through a non-linear filter or thresholding function. Such a structure has a number of disadvantages, such as the large number of neurons and hidden layers, and the large amount of training data, required for complex function approximation [11], [12], [13]. In this paper, a generalized neuron model is employed, which overcomes some of the drawbacks of conventional neural networks built from MPANs. The structure of the GN model consists of five distinct components: two aggregations of weighted inputs, a summation (Σ) part similar to the MPAN (a linear discriminant function) and a product (Π) part; the corresponding outputs, computed using two independent activation functions; and an assimilation function that combines them into the output delivered to the external source. Bias terms are added to both discriminant functions, as in MPANs. The weighted inputs to the GN are represented as follows:

The assimilation function is a linear combination of the two outputs. The overall output of the GN model can be represented as follows:

O_{pk} = w \, y_1 + (1 - w) \, y_2    (10)

NetL_j = \sum_i WL_{ij} X_i + s\_bias    (6)

NetNL_j = \prod_i \left( WNL_{ij} X_i \right) + pi\_bias    (7)

Here, the X_i are the inputs; for the summation (Σ) part of the GN model, NetL_j is the net input, the WL_{ij} are the weights, and s_bias is the bias weight.

In Eq. (10), O_{pk} is the overall output of the GN model, w is the weight applied to the linear output y_1, and (1 - w) is the weight applied to the non-linear output y_2. The training of the GN model is carried out in a similar fashion to a traditional ANN, using the gradient descent method. The optimized values of the learning rate and momentum correction factor were found, by trial and error, to be 0.0005 and 0.9, respectively. The weight parameter w is also optimized during training, so that the GN model consists of a total of 2n+3 weight parameters for n inputs. The structure described above provides a very compact ANN model compared with a traditional ANN, which has many times more weights owing to its hidden neurons. The details of the training of a GN model are not included here and can be found in [14], [16]. The generalized neuron has characteristics of both simple and higher-order neurons; the non-linearity present in the system is captured through suitable discriminant and activation functions. The proposed
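A single GN forward pass following Eqs. (6)-(10) can be sketched as below. The product (Π) aggregation for the non-linear part is an assumption suggested by the pi_bias naming; the exact formulation is given in [14], [16]. All data and parameter values are illustrative.

```python
import numpy as np

def gn_forward(x, WL, s_bias, WNL, pi_bias, w):
    """One forward pass through a single Generalized Neuron."""
    net_l = WL @ x + s_bias               # Eq. (6): summation (Sigma) part
    net_nl = np.prod(WNL * x) + pi_bias   # Eq. (7): product (Pi) part (assumed)
    y1 = 1.0 / (1.0 + np.exp(-net_l))     # Eq. (8): sigmoid activation
    y2 = np.exp(-net_nl ** 2)             # Eq. (9): Gaussian activation
    return w * y1 + (1.0 - w) * y2        # Eq. (10): weighted assimilation

n = 8  # eight inputs, as in the paper's rainfall-runoff model
rng = np.random.default_rng(2)
x = rng.random(n)
params = dict(WL=rng.normal(size=n), s_bias=0.1,
              WNL=rng.normal(size=n), pi_bias=0.1, w=0.7)

out = gn_forward(x, **params)  # scalar output; 2n + 3 = 19 trainable values
print(out)
```

Since y_1 lies in (0, 1) and y_2 in (0, 1], the assimilated output is also bounded in (0, 1), which is why the flows must be normalized before training and rescaled afterwards.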


model has both sigmoidal and Gaussian functions with weight sharing. The number of weights in a GN model is equal to twice the number of inputs plus three, which is very low in comparison with a multi-layer feed-forward ANN. The weights are determined through training. Hence, by reducing the number of unknown weights, the training time as well as the minimum number of patterns required for training can be reduced. Also, there is no need to select the number of hidden layers and hidden neurons, which reduces the complexity and dimensionality of the overall ANN model. A schematic of the GN model is presented in Figure 2. The GN model receives inputs from an external source and delivers its output to the external source, like an ANN model built using MPANs; it differs in its internal structure, which is responsible for capturing the complex input-output relationships.
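The parsimony argument can be made concrete by counting trainable values for the two models used in this study. Biases are counted as weights for the MLP; the GN count of 2n + 3 is assumed to include both bias weights and the sharing weight w.

```python
# Weight counts for the two architectures compared in this study.
n_inputs, n_hidden, n_out = 8, 5, 1

# 8-5-1 MLP: input->hidden weights + hidden biases,
# then hidden->output weights + output bias.
mlp_weights = (n_inputs * n_hidden + n_hidden) + (n_hidden * n_out + n_out)

# GN: n linear weights + n non-linear weights + two biases + sharing weight w.
gn_weights = 2 * n_inputs + 3

print(mlp_weights, gn_weights)  # 51 vs 19
```

The roughly 2.7-fold reduction in free parameters is what drives the shorter training time and the smaller number of training patterns needed.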
V. RESULTS AND DISCUSSIONS

Figure 3 indicates the 8-5-1 ANN model's inability to estimate flows greater than 2000 m3/s. This problem is overcome by the GN model (see Figure 4). The GN model also appears to estimate the low, medium, and high magnitude flows much better than the 8-5-1 ANN model during training. During testing, too, the flow estimation of the GN model is better than that of the 8-5-1 ANN model.


The results in terms of the various statistical parameters for the two ANN models are presented in Table 2. The SSE values of the GN model are better than those of the 8-5-1 model during both training and testing. Values of E and R in excess of 0.95 for both models during training and testing indicate excellent performance. During training, the AARE of the GN model is lower than that of the 8-5-1 model, and the difference widens significantly during testing. The low AARE values of the GN model, 30.60 during training and 36.57 during testing, demonstrate its better prediction capability over the 8-5-1 ANN model. The performance of the two models was comparable in terms of the threshold statistics TS1 and TS10, while the GN model shows higher values for TS50 and TS100. These results indicate that the GN model has tremendous potential in hydrology; moreover, it provides a very compact ANN structure consisting of a single artificial neuron. Looking at the overall results in Table 2, the GN model may be preferred over the 8-5-1 ANN model on account of its better performance and the principle of parsimony.
TABLE 2 ERROR STATISTICS OF ANN MODELS

Fig. 3. Scatter plot from 8-5-1 ANN model (a) Training; (b) Testing

VI. SUMMARY AND CONCLUSIONS

_____________________________________________________________
Model   SSE    E      R      AARE   TS1   TS10   TS50   TS100
_____________________________________________________________
During Training
8-5-1   0.33   0.971  0.986  55.2   3.8   57.8   69.2   81.2
GN      0.26   0.975  0.987  30.6   3.3   61.0   81.7   95.0
During Testing
8-5-1   0.44   0.953  0.978  71.5   3.6   57.0   68.0   77.5
GN      0.34   0.966  0.983  36.6   2.1   54.1   80.4   92.5
_____________________________________________________________
The results in terms of scatter plots from the two models are provided in Figures 3 and 4, respectively.

In this paper, the rainfall-runoff process has been modeled using two different ANN models: a three-layered feed-forward neural network using the MPAN, and a generalized neural network using a single generalized neuron. The results of the GN model are compared with those of the traditional feed-forward ANN, trained by back-propagation with a momentum term. The daily rainfall and flow data for a 26-year period from the Kentucky River, USA, were employed to develop and test all the model structures investigated in this study. The performance of the two models was evaluated using five different error statistics capable of assessing ANN model performance comprehensively. The above analyses revealed the adaptive nature and suitability of the GN for hydrological forecasting. The results obtained in this study indicate that a compact ANN model consisting of a single artificial generalized neuron is capable of modeling the complex, dynamic, and non-linear rainfall-runoff process in a large watershed. The GN model proved superior to the 8-5-1 ANN model in all respects: fewer neurons, fewer weights, no hidden layers, and less training time. It achieved better performance than a fully connected feed-forward ANN developed on the same data set, and it overcomes some of the problems associated with traditional ANNs built from the MPAN. It offers a flexible structure wherein various alternative discriminant, activation, and assimilation functions can be used to model the specific nature inherent in different types of problems. The GN model has tremendous potential for solving a variety of problems in hydrology, and it is hoped that future efforts will focus on exploiting its strengths in hydrological modeling.

Fig. 4. Scatter plot from GN model (a) Training; (b) Testing

REFERENCES

[2] Shamseldin, A.Y. (1997). Application of a neural network technique to rainfall-runoff modeling, J. Hydrol., 199, 272-294.
[3] Campolo, M., Andreussi, P., and Soldati, A. (1999). River flood forecasting with a neural network model, Water Resour. Res., 35(4), 1191-1197.
[4] Jain, A. and Indurthy, S.K.V.P. (2003). Comparative analysis of event-based rainfall-runoff modeling techniques - deterministic, statistical, and artificial neural networks, J. Hydrol. Engg., ASCE, 8(2), 93-98.
[5] Jain, A. and Srinivasulu, S. (2004). Development of effective and efficient rainfall-runoff models using integration of deterministic, real-coded genetic algorithms, and artificial neural network techniques, Water Resour. Res., 40(4), W04302, doi:10.1029/2003WR002355.
[6] Chen, J. and Adams, B.J. (2006). Integration of artificial neural networks with conceptual models in rainfall-runoff modeling, J. Hydrol., 318, 232-249.
[7] Abrahart, R.J. and See, L. (2000). Comparing neural network and autoregressive moving average techniques for the provision of continuous river flow forecasts in two contrasting watersheds, Hydrol. Processes, 14, 2157-2172.
[8] Jain, A. and Srinivasulu, S. (2006). Integrated approach to modelling decomposed flow hydrograph using artificial neural network and conceptual techniques, J. Hydrol., 317(3-4), 291-306.
[9] Hsu, K., Gupta, H.V., and Sorooshian, S. (1995). Artificial neural network modeling of the rainfall-runoff process, Water Resour. Res., 31(10), 2517-2530.
[10] Rumelhart, D.E., Hinton, G.E., and Williams, R.J. (1986). Learning representations by back-propagating errors, Nature, 323, 533-536.
[11] Li, C., Chen, G., Liao, X., and Yu, J. (2004). Hopf bifurcation and chaos in a single inertial neuron model with time delay, The European Physical Journal B, 41, 337-343.
[12] Cios, K.J., Swiercz, W., and Jackson, W. (2004). Networks of spiking neurons in modeling and problem solving, Neurocomputing, 61, 99-119.
[13] Prete, V.D. and Coolen, A.C.C. (2004). Nonequilibrium statistical mechanics of recurrent networks with realistic neurons, Neurocomputing, 58-60, 239-244.
[14] Chaturvedi, D.K., Satsangi, P.S., and Kalra, P.K. (1999). New neuron models for simulating rotating electrical machines and load forecasting problems, Elec. Power Sys. Res., 52, 123-131.
[15] Yadav, R.N., Kalra, P.K., and John, J. (2006). Neural network learning with generalized-mean based neuron model, Soft Comput., 10, 257-263.
[16] Chaturvedi, D.K., Malik, O.P., and Kalra, P.K. (2004). Generalised neuron-based adaptive power system stabilizer, IEE Proceedings - Generation, Transmission and Distribution, 151(2), 213-218.

[1] Minns, A.W. and Hall, M.J. (1996). Artificial neural networks as rainfall runoff models, Hydrol. Sci. J., 41(3), 399-417.
