Abstract
Limitations of the capabilities of one-hidden-layer networks are investigated. It is shown that both for networks with Heaviside perceptrons and for networks with the kernel units used in SVMs, there exist large sets of d-variable functions that cannot be tractably represented by these networks, i.e., their representations require numbers of units or sizes of weights growing exponentially with d. Our results are derived using the concept of variational norm from nonlinear approximation theory and the concentration of measure property of high-dimensional Euclidean spaces.
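The concentration of measure property invoked above can be illustrated numerically: for a point drawn uniformly from the unit sphere in R^d, any fixed coordinate (equivalently, the inner product with any fixed unit vector) concentrates near 0 at rate roughly 1/sqrt(d). The sketch below is not from the paper; it is a minimal Monte Carlo illustration of this phenomenon, with the function names being illustrative choices.

```python
import math
import random


def random_unit_vector(d, rng):
    # Normalizing a Gaussian vector yields a uniform point on the sphere S^{d-1}.
    v = [rng.gauss(0.0, 1.0) for _ in range(d)]
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def mean_abs_inner_product(d, trials=2000, seed=0):
    # Average |<u, e_1>| over uniform u on S^{d-1}; by symmetry e_1 stands in
    # for any fixed unit vector. The expectation is about sqrt(2 / (pi * d)),
    # so the value shrinks as the dimension d grows.
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        u = random_unit_vector(d, rng)
        total += abs(u[0])
    return total / trials


for d in (3, 30, 300):
    print(d, mean_abs_inner_product(d))
```

As d increases, the printed averages shrink roughly like 1/sqrt(d): almost all of the sphere's measure lies near the "equator" orthogonal to any fixed direction, which is the geometric fact behind the lower bounds on network representations.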
© 2014 Springer International Publishing Switzerland
Kůrková, V. (2014). Representations of Highly-Varying Functions by One-Hidden-Layer Networks. In: Rutkowski, L., Korytkowski, M., Scherer, R., Tadeusiewicz, R., Zadeh, L.A., Zurada, J.M. (eds) Artificial Intelligence and Soft Computing. ICAISC 2014. Lecture Notes in Computer Science(), vol 8467. Springer, Cham. https://doi.org/10.1007/978-3-319-07173-2_7
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-07172-5
Online ISBN: 978-3-319-07173-2