Theory of deep convolutional neural networks: Downsampling

Published: 01 April 2020

Abstract

Establishing a solid theoretical foundation for structured deep neural networks is greatly desired due to the successful applications of deep learning in various practical domains. This paper aims at an approximation theory of deep convolutional neural networks whose structures are induced by convolutions. To overcome the difficulty in the theoretical analysis of networks whose widths increase linearly due to convolutions, we introduce a downsampling operator to reduce the widths. We prove that the downsampled deep convolutional neural networks approximate ridge functions well, which hints at advantages of these structured networks in terms of approximation and modeling. We also prove that the output of any multi-layer fully-connected neural network can be realized by that of a downsampled deep convolutional neural network with free parameters of the same order, which shows that, in general, the approximation ability of deep convolutional neural networks is at least as good as that of fully-connected networks. Finally, we present a theorem on approximating functions on Riemannian manifolds, demonstrating that deep convolutional neural networks can be used to learn manifold features of data.
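
As a concrete illustration of the architecture described above, the following is a minimal NumPy sketch of a downsampled 1-D convolutional network, written from the abstract's description rather than from the paper's exact construction: a full (untruncated) convolution with a filter of s + 1 taps enlarges the width of each layer by s, and a downsampling operator D_m keeps every m-th entry to shrink the width back. A ridge function takes the form f(x) = g(xi . x) for a direction vector xi and a univariate function g; the quadratic g, the depth, and all network parameters below are illustrative placeholders, and the network is random and untrained.

    import numpy as np

    rng = np.random.default_rng(0)

    def relu(u):
        return np.maximum(u, 0.0)

    def conv_layer(v, w, b):
        # Full 1-D convolution: the width grows from len(v) to
        # len(v) + len(w) - 1; a bias shift and ReLU follow.
        return relu(np.convolve(v, w) - b)

    def downsample(v, m):
        # Downsampling operator D_m: keep every m-th entry.
        return v[::m]

    def dcnn_forward(x, filters, biases, m):
        # Toy downsampled deep CNN: alternate convolution and
        # downsampling, then average the last layer to a scalar.
        v = x
        for w, b in zip(filters, biases):
            v = downsample(conv_layer(v, w, b), m)
        return v.mean()

    # Illustrative ridge-function target f(x) = g(xi . x), g(t) = t**2.
    d = 16
    xi = rng.normal(size=d)
    xi /= np.linalg.norm(xi)
    f = lambda x: (xi @ x) ** 2

    # Random depth-3 network, filter length s + 1 = 5, stride m = 2.
    s, m, depth = 4, 2, 3
    filters, biases = [], []
    width = d
    for _ in range(depth):
        filters.append(rng.normal(size=s + 1))
        width += s                       # width after the full convolution
        biases.append(rng.normal(size=width))
        width = (width + m - 1) // m     # width after D_m (ceiling division)

    x = rng.normal(size=d)
    print("f(x) =", f(x))
    print("network output =", dcnn_forward(x, filters, biases, m))

The sketch only fixes the shapes involved; how well networks of this structural type can approximate such ridge targets is what the paper's theorems quantify.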



Published In

Neural Networks, Volume 124, Issue C
April 2020, 395 pages

Publisher

Elsevier Science Ltd., United Kingdom

Author Tags

  1. Deep learning
  2. Convolutional neural networks
  3. Approximation theory
  4. Downsampling
  5. Filter masks

Qualifiers

  • Research-article

Cited By

  • (2024) Detection System of Landscape's Unnatural Changes by Satellite Images Based on Local Areas. Pattern Recognition and Image Analysis 34(2), 365-378. DOI: 10.1134/S1054661824700159. Online publication date: 1-Jun-2024.
  • (2024) Designing Universally-Approximating Deep Neural Networks: A First-Order Optimization Approach. IEEE Transactions on Pattern Analysis and Machine Intelligence 46(9), 6231-6246. DOI: 10.1109/TPAMI.2024.3380007. Online publication date: 1-Sep-2024.
  • (2024) Generalization analysis of deep CNNs under maximum correntropy criterion. Neural Networks 174(C). DOI: 10.1016/j.neunet.2024.106226. Online publication date: 1-Jun-2024.
  • (2024) A novel lightweight CNN for chest X-ray-based lung disease identification on heterogeneous embedded system. Applied Intelligence 54(6), 4756-4780. DOI: 10.1007/s10489-024-05420-2. Online publication date: 1-Mar-2024.
  • (2023) Theoretical analysis of inductive biases in deep convolutional networks. Proceedings of the 37th International Conference on Neural Information Processing Systems, 74289-74338. DOI: 10.5555/3666122.3669371. Online publication date: 10-Dec-2023.
  • (2023) FouriDown. Proceedings of the 37th International Conference on Neural Information Processing Systems, 14094-14112. DOI: 10.5555/3666122.3666743. Online publication date: 10-Dec-2023.
  • (2023) Investigation of the Training Data Set Influence on the Accuracy of the Optical Laguerre-Gaussian Modes Recognition. Optical Memory and Neural Networks 32(Suppl 1), S54-S62. DOI: 10.3103/S1060992X2305003X. Online publication date: 1-Nov-2023.
  • (2023) Spatial oblivion channel attention targeting intra-class diversity feature learning. Neural Networks 167(C), 10-21. DOI: 10.1016/j.neunet.2023.07.032. Online publication date: 1-Oct-2023.
  • (2023) Rates of approximation by ReLU shallow neural networks. Journal of Complexity 79(C). DOI: 10.1016/j.jco.2023.101784. Online publication date: 17-Oct-2023.
  • (2023) Approximating smooth and sparse functions by deep neural networks. Journal of Complexity 79(C). DOI: 10.1016/j.jco.2023.101783. Online publication date: 17-Oct-2023.
