Abstract
Nonparametric regression problems are addressed via polynomial approximators and one-hidden-layer feedforward neural approximators. The two families of approximating functions are compared in terms of both complexity and experimental performance in finding a nonparametric mapping that fits a finite set of samples according to the empirical risk minimization approach. The theoretical background needed to interpret the numerical results is presented. Two simulation case studies are analyzed to illustrate the practical issues that may arise in solving such problems. These issues depend both on the approximation capabilities of the two families and on the effectiveness of the available methods for selecting the tuning parameters, i.e., the coefficients of the polynomials and the weights of the neural networks. The simulation results show that the neural approximators outperform the polynomial ones with the same number of parameters. However, this advantage can be jeopardized by local minima, which affect the training of neural networks but do not arise in the polynomial case, where empirical risk minimization reduces to a linear least-squares problem.
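The contrast between the two approaches can be made concrete with a minimal sketch, not taken from the paper: both approximators are fitted to the same noisy samples by empirical risk minimization, with the parameter counts matched. The target function sin(3x), the noise level, the sample size, the polynomial degree, the network width, and the learning rate are all illustrative assumptions.

```python
# A minimal sketch (not the authors' code): empirical risk minimization with
# the two approximator families compared in the paper, on synthetic 1-D data.
# Target, sample size, degree, width, and learning rate are assumptions,
# chosen so that both models have exactly 9 tuning parameters.
import numpy as np

rng = np.random.default_rng(0)
n = 100
x = rng.uniform(-1.0, 1.0, size=n)
y = np.sin(3.0 * x) + 0.1 * rng.standard_normal(n)   # noisy samples

# Polynomial approximator: linear in its coefficients, so empirical risk
# minimization is a linear least-squares problem with no local minima.
degree = 8                                   # 9 coefficients
V = np.vander(x, degree + 1)                 # Vandermonde design matrix
coef, *_ = np.linalg.lstsq(V, y, rcond=None)
poly_risk = np.mean((V @ coef - y) ** 2)

# One-hidden-layer feedforward network: nonlinear in its weights, so the
# empirical risk is nonconvex and gradient descent may stop at a local minimum.
h = 3                                        # 3 tanh units -> 9 weights (w, b, v)
w, b, v = (rng.standard_normal(h) for _ in range(3))
lr = 0.05
for _ in range(20000):
    z = np.tanh(np.outer(x, w) + b)          # hidden activations, shape (n, h)
    r = z @ v - y                            # residuals
    dz = (1.0 - z**2) * np.outer(r, v)       # backpropagate through tanh
    v -= lr * (z.T @ r) / n                  # gradient steps on the mean
    w -= lr * (dz.T @ x) / n                 #   squared error (up to the
    b -= lr * dz.sum(axis=0) / n             #   constant factor 2)
z = np.tanh(np.outer(x, w) + b)
nn_risk = np.mean((z @ v - y) ** 2)

print(f"empirical risk, polynomial (9 parameters): {poly_risk:.4f}")
print(f"empirical risk, neural net  (9 parameters): {nn_risk:.4f}")
```

Since the polynomial fit is obtained in closed form, rerunning the sketch with a different random seed changes only the neural result, illustrating the sensitivity to local minima discussed in the abstract.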
Cite this article
Alessandri, A., Cassettari, L. & Mosca, R. Nonparametric nonlinear regression using polynomial and neural approximators: a numerical comparison. Comput Manag Sci 6, 5–24 (2009). https://doi.org/10.1007/s10287-008-0074-3