Theory of deep convolutional neural networks: Downsampling

Published: 01 April 2020

Abstract

Establishing a solid theoretical foundation for structured deep neural networks is greatly desired due to the successful applications of deep learning in various practical domains. This paper aims at an approximation theory of deep convolutional neural networks whose structures are induced by convolutions. To overcome the difficulty in the theoretical analysis of networks whose widths increase linearly due to convolutions, we introduce a downsampling operator to reduce the widths. We prove that the downsampled deep convolutional neural networks approximate ridge functions well, which hints at advantages of these structured networks in terms of approximation and modeling. We also prove that the output of any multi-layer fully-connected neural network can be realized by that of a downsampled deep convolutional neural network with free parameters of the same order, which shows that, in general, the approximation ability of deep convolutional neural networks is at least as good as that of fully-connected networks. Finally, we present a theorem on approximating functions on Riemannian manifolds, demonstrating that deep convolutional neural networks can be used to learn manifold features of data.
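
As a concrete illustration of the architecture described above, the following is a minimal NumPy sketch of a downsampled 1-D convolutional network, written from the abstract's description rather than from the paper's exact construction: a full (untruncated) convolution with a filter of s + 1 taps enlarges the width of each layer by s, and a downsampling operator D_m keeps every m-th entry to shrink the width back. A ridge function takes the form f(x) = g(xi . x) for a direction vector xi and a univariate function g; the quadratic g, the depth, and all network parameters below are illustrative placeholders, and the network is random and untrained.

    import numpy as np

    rng = np.random.default_rng(0)

    def relu(u):
        return np.maximum(u, 0.0)

    def conv_layer(v, w, b):
        # Full 1-D convolution: the width grows from len(v) to
        # len(v) + len(w) - 1; a bias shift and ReLU follow.
        return relu(np.convolve(v, w) - b)

    def downsample(v, m):
        # Downsampling operator D_m: keep every m-th entry.
        return v[::m]

    def dcnn_forward(x, filters, biases, m):
        # Toy downsampled deep CNN: alternate convolution and
        # downsampling, then average the last layer to a scalar.
        v = x
        for w, b in zip(filters, biases):
            v = downsample(conv_layer(v, w, b), m)
        return v.mean()

    # Illustrative ridge-function target f(x) = g(xi . x), g(t) = t**2.
    d = 16
    xi = rng.normal(size=d)
    xi /= np.linalg.norm(xi)
    f = lambda x: (xi @ x) ** 2

    # Random depth-3 network, filter length s + 1 = 5, stride m = 2.
    s, m, depth = 4, 2, 3
    filters, biases = [], []
    width = d
    for _ in range(depth):
        filters.append(rng.normal(size=s + 1))
        width += s                       # width after the full convolution
        biases.append(rng.normal(size=width))
        width = (width + m - 1) // m     # width after D_m (ceiling division)

    x = rng.normal(size=d)
    print("f(x) =", f(x))
    print("network output =", dcnn_forward(x, filters, biases, m))

The sketch only fixes the shapes involved; how well networks of this structural type can approximate such ridge targets is what the paper's theorems quantify.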



Published In

Neural Networks, Volume 124, Issue C
April 2020, 395 pages

Publisher

Elsevier Science Ltd., United Kingdom

Author Tags

  1. Deep learning
  2. Convolutional neural networks
  3. Approximation theory
  4. Downsampling
  5. Filter masks

Qualifiers

  • Research-article

Cited By

  • (2024) Detection System of Landscape's Unnatural Changes by Satellite Images Based on Local Areas. Pattern Recognition and Image Analysis 34(2), 365-378. DOI: 10.1134/S1054661824700159. Online publication date: 1-Jun-2024.
  • (2024) Designing Universally-Approximating Deep Neural Networks: A First-Order Optimization Approach. IEEE Transactions on Pattern Analysis and Machine Intelligence 46(9), 6231-6246. DOI: 10.1109/TPAMI.2024.3380007. Online publication date: 1-Sep-2024.
  • (2024) Generalization analysis of deep CNNs under maximum correntropy criterion. Neural Networks 174(C). DOI: 10.1016/j.neunet.2024.106226. Online publication date: 1-Jun-2024.
  • (2024) A novel lightweight CNN for chest X-ray-based lung disease identification on heterogeneous embedded system. Applied Intelligence 54(6), 4756-4780. DOI: 10.1007/s10489-024-05420-2. Online publication date: 1-Mar-2024.
  • (2023) Theoretical analysis of inductive biases in deep convolutional networks. Proceedings of the 37th International Conference on Neural Information Processing Systems, 74289-74338. DOI: 10.5555/3666122.3669371. Online publication date: 10-Dec-2023.
  • (2023) FouriDown. Proceedings of the 37th International Conference on Neural Information Processing Systems, 14094-14112. DOI: 10.5555/3666122.3666743. Online publication date: 10-Dec-2023.
  • (2023) Investigation of the Training Data Set Influence on the Accuracy of the Optical Laguerre-Gaussian Modes Recognition. Optical Memory and Neural Networks 32(Suppl 1), S54-S62. DOI: 10.3103/S1060992X2305003X. Online publication date: 1-Nov-2023.
  • (2023) Spatial oblivion channel attention targeting intra-class diversity feature learning. Neural Networks 167(C), 10-21. DOI: 10.1016/j.neunet.2023.07.032. Online publication date: 1-Oct-2023.
  • (2023) Rates of approximation by ReLU shallow neural networks. Journal of Complexity 79(C). DOI: 10.1016/j.jco.2023.101784. Online publication date: 17-Oct-2023.
  • (2023) Approximating smooth and sparse functions by deep neural networks. Journal of Complexity 79(C). DOI: 10.1016/j.jco.2023.101783. Online publication date: 17-Oct-2023.
