A Formal Proof of the Expressiveness of Deep Learning

Published: 01 August 2019

Abstract

Deep learning has had a profound impact on computer science in recent years, with applications to image recognition, language processing, bioinformatics, and more. Recently, Cohen et al. provided theoretical evidence for the superiority of deep learning over shallow learning. We formalized their mathematical proof using Isabelle/HOL. The Isabelle development simplifies and generalizes the original proof, while working around the limitations of the HOL type system. To support the formalization, we developed reusable libraries of formalized mathematics, including results about the matrix rank, the Borel measure, and multivariate polynomials, as well as a library for tensor analysis.
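The separation result of Cohen et al. [16] that the development formalizes rests, roughly speaking, on a tensor-analytic observation: a shallow network corresponds to a coefficient tensor that is a sum of comparatively few rank-1 terms, and such a tensor has a matricization of low matrix rank, whereas deep networks generically yield tensors of very high matricization rank. The NumPy sketch below is purely illustrative and is not part of the Isabelle development; all function names are ad hoc. It checks the easy half of this observation numerically: a sum of r rank-1 terms matricizes to a matrix of rank at most r.

```python
# Illustrative sketch only (not from the Isabelle development): a tensor built
# from r rank-1 outer-product terms has a matricization of matrix rank <= r.
import numpy as np

def cp_tensor(factors):
    """Sum of rank-1 tensors; `factors` is a list of r tuples of d vectors."""
    t = np.zeros(tuple(len(v) for v in factors[0]))
    for vecs in factors:
        term = vecs[0]
        for v in vecs[1:]:
            term = np.multiply.outer(term, v)  # outer product, one mode at a time
        t += term
    return t

def matricize(t):
    """Group the first half of the modes into rows, the remaining modes into columns."""
    rows = int(np.prod(t.shape[: t.ndim // 2]))
    return t.reshape(rows, -1)

rng = np.random.default_rng(0)
r, d, n = 3, 4, 5  # 3 rank-1 terms, an order-4 tensor, mode size 5
factors = [tuple(rng.standard_normal(n) for _ in range(d)) for _ in range(r)]
print(np.linalg.matrix_rank(matricize(cp_tensor(factors))))  # 3, far below the maximum of 25
```

The hard direction, that deep networks almost everywhere attain exponentially high matricization rank, is the part that requires the measure-theoretic and polynomial machinery mentioned in the abstract.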

References

[1] Bader, B.W., Kolda, T.G.: Algorithm 862: MATLAB tensor classes for fast algorithm prototyping. ACM Trans. Math. Softw. 32(4), 635–653 (2006)
[2] Bentkamp, A.: Expressiveness of deep learning. Archive of Formal Proofs (2016). Formal proof development. http://isa-afp.org/entries/Deep_Learning.shtml
[3] Bentkamp, A.: An Isabelle formalization of the expressiveness of deep learning. M.Sc. thesis, Universität des Saarlandes (2016). http://matryoshka.gforge.inria.fr/pubs/bentkamp_msc_thesis.pdf
[4] Bentkamp, A., Blanchette, J.C., Klakow, D.: A formal proof of the expressiveness of deep learning. In: Ayala-Rincón, M., Muñoz, C.A. (eds.) Interactive Theorem Proving (ITP 2017), LNCS, vol. 10499, pp. 46–64. Springer (2017)
[5] Bernard, S., Bertot, Y., Rideau, L., Strub, P.: Formal proofs of transcendence for $e$ and $\pi$ as an application of multivariate and symmetric polynomials. In: Avigad, J., Chlipala, A. (eds.) Certified Programs and Proofs (CPP 2016), pp. 76–87. ACM (2016)
[6] Bertot, Y., Gonthier, G., Biha, S.O., Pasca, I.: Canonical big operators. In: Mohamed, O.A., Muñoz, C.A., Tahar, S. (eds.) Theorem Proving in Higher Order Logics (TPHOLs 2008), LNCS, vol. 5170, pp. 86–101. Springer (2008)
[7] Bhat, S.: Syntactic foundations for machine learning. Ph.D. thesis, Georgia Institute of Technology (2013). https://smartech.gatech.edu/bitstream/handle/1853/47700/bhat_sooraj_b_201305_phd.pdf
[8] Blanchette, J.C., Böhme, S., Fleury, M., Smolka, S.J., Steckermeier, A.: Semi-intelligible Isar proofs from machine-generated proofs. J. Autom. Reason. 56(2), 155–200 (2016)
[9] Blanchette, J.C., Greenaway, D., Kaliszyk, C., Kühlwein, D., Urban, J.: A learning-based fact selector for Isabelle/HOL. J. Autom. Reason. 57(3), 219–244 (2016)
[10] Boender, J., Kammüller, F., Nagarajan, R.: Formalization of quantum protocols using Coq. In: Heunen, C., Selinger, P., Vicary, J. (eds.) Workshop on Quantum Physics and Logic (QPL 2015), EPTCS, vol. 195, pp. 71–83 (2015)
[11] Böhme, S., Weber, T.: Fast LCF-style proof reconstruction for Z3. In: Kaufmann, M., Paulson, L.C. (eds.) Interactive Theorem Proving (ITP 2010), LNCS, vol. 6172, pp. 179–194. Springer (2010)
[12] Bürgisser, P., Cucker, F., Lotz, M.: The probability that a slightly perturbed numerical analysis problem is difficult. Math. Comput. 77(263), 1559–1583 (2008)
[13] Caron, R., Traynor, T.: The zero set of a polynomial. Technical report, University of Windsor (2005). http://www1.uwindsor.ca/math/sites/uwindsor.ca.math/files/05-03.pdf
[14] Church, A.: A formulation of the simple theory of types. J. Symb. Log. 5(2), 56–68 (1940)
[15] Cohen, N., Sharir, O., Shashua, A.: Deep SimNets. In: Computer Vision and Pattern Recognition (CVPR 2016), pp. 4782–4791. IEEE Computer Society (2016)
[16] Cohen, N., Sharir, O., Shashua, A.: On the expressive power of deep learning: a tensor analysis. In: Feldman, V., Rakhlin, A., Shamir, O. (eds.) Conference on Learning Theory (COLT 2016), JMLR Workshop and Conference Proceedings, vol. 49, pp. 698–728. JMLR.org (2016)
[17] Cohen, N., Shashua, A.: Convolutional rectifier networks as generalized tensor decompositions. In: Balcan, M., Weinberger, K.Q. (eds.) International Conference on Machine Learning (ICML 2016), JMLR Workshop and Conference Proceedings, vol. 48, pp. 955–963. JMLR.org (2016)
[18] Cohen, N., Shashua, A.: Inductive bias of deep convolutional networks through pooling geometry. CoRR arXiv:1605.06743 (2016)
[19] Cohen, N., Tamari, R., Shashua, A.: Boosting dilated convolutional networks with mixed tensor decompositions. CoRR arXiv:1703.06846 (2017)
[20] de Moura, L., Bjørner, N.: Z3: An efficient SMT solver. In: Ramakrishnan, C.R., Rehof, J. (eds.) Tools and Algorithms for the Construction and Analysis of Systems (TACAS 2008), LNCS, vol. 4963, pp. 337–340. Springer (2008)
[21] Gordon, M.J.C., Milner, R., Wadsworth, C.P.: Edinburgh LCF: A Mechanised Logic of Computation, LNCS, vol. 78. Springer, Berlin (1979)
[22] Haftmann, F., Lochbihler, A., Schreiner, W.: Towards abstract and executable multivariate polynomials in Isabelle. In: Nipkow, T., Paulson, L., Wenzel, M. (eds.) Isabelle Workshop 2014 (2014)
[23] Hardt, M., Recht, B., Singer, Y.: Train faster, generalize better: stability of stochastic gradient descent. In: Balcan, M., Weinberger, K.Q. (eds.) International Conference on Machine Learning (ICML 2016), JMLR Workshop and Conference Proceedings, vol. 48, pp. 1225–1234. JMLR (2016)
[24] Harrison, J.: A HOL theory of Euclidean space. In: Hurd, J., Melham, T. (eds.) Theorem Proving in Higher Order Logics (TPHOLs 2005), LNCS, vol. 3603, pp. 114–129. Springer (2005)
[25] Hölzl, J., Heller, A.: Three chapters of measure theory in Isabelle/HOL. In: van Eekelen, M.C.J.D., Geuvers, H., Schmaltz, J., Wiedijk, F. (eds.) Interactive Theorem Proving (ITP 2011), LNCS, vol. 6898, pp. 135–151. Springer (2011)
[26] Immler, F., Maletzky, A.: Gröbner bases theory. Archive of Formal Proofs (2016). Formal proof development. http://isa-afp.org/entries/Groebner_Bases.shtml
[27] Kam, R.: Case studies in proof checking. Master's thesis, San Jose State University (2007). http://scholarworks.sjsu.edu/cgi/viewcontent.cgi?context=etd_projects&article=1149
[28] Kawaguchi, K.: Deep learning without poor local minima. In: Lee, D.D., Sugiyama, M., von Luxburg, U., Guyon, I., Garnett, R. (eds.) Advances in Neural Information Processing Systems (NIPS 2016), NIPS, vol. 29, pp. 586–594 (2016)
[29] Kobayashi, H., Chen, L., Murao, H.: Groups, rings and modules. Archive of Formal Proofs (2004). Formal proof development. http://isa-afp.org/entries/Group-Ring-Module.shtml
[30] Liu, L., Aravantinos, V., Hasan, O., Tahar, S.: On the formal analysis of HMM using theorem proving. In: Merz, S., Pang, J. (eds.) International Conference on Formal Engineering Methods (ICFEM 2014), LNCS, vol. 8829, pp. 316–331. Springer (2014)
[31] Lotz, M.: On the volume of tubular neighborhoods of real algebraic varieties. Proc. Am. Math. Soc. 143(5), 1875–1889 (2015)
[32] Murphy, C., Gray, P., Stewart, G.: Verified perceptron convergence theorem. In: Shpeisman, T., Gottschlich, J. (eds.) Machine Learning and Programming Languages (MAPL 2017), pp. 43–50. ACM (2017)
[33] Nipkow, T., Klein, G.: Concrete Semantics: With Isabelle/HOL. Springer, Berlin (2014)
[34] Nipkow, T., Paulson, L.C., Wenzel, M.: Isabelle/HOL: A Proof Assistant for Higher-Order Logic, LNCS, vol. 2283. Springer, Berlin (2002)
[35] Paulson, L.C., Blanchette, J.C.: Three years of experience with Sledgehammer, a practical link between automatic and interactive theorem provers. In: Sutcliffe, G., Schulz, S., Ternovska, E. (eds.) International Workshop on the Implementation of Logics (IWIL-2010), EPiC, vol. 2, pp. 1–11. EasyChair (2012)
[36] Paulson, L.C., Susanto, K.W.: Source-level proof reconstruction for interactive theorem proving. In: Schneider, K., Brandt, J. (eds.) Theorem Proving in Higher Order Logics (TPHOLs 2007), LNCS, vol. 4732, pp. 232–245. Springer (2007)
[37] Poon, H., Domingos, P.M.: Sum–product networks: a new deep architecture. In: Cozman, F.G., Pfeffer, A. (eds.) Uncertainty in Artificial Intelligence (UAI 2011), pp. 337–346. AUAI Press (2011)
[38] Prathamesh, T.V.H.: Tensor product of matrices. Archive of Formal Proofs (2016). Formal proof development. http://isa-afp.org/entries/Matrix_Tensor.shtml
[39] Selsam, D., Liang, P., Dill, D.L.: Developing bug-free machine learning systems with formal mathematics. In: Precup, D., Teh, Y.W. (eds.) International Conference on Machine Learning (ICML 2017), Proceedings of Machine Learning Research, vol. 70, pp. 3047–3056. PMLR (2017)
[40] Sternagel, C., Thiemann, R.: Executable multivariate polynomials. Archive of Formal Proofs (2010). Formal proof development. http://isa-afp.org/entries/Polynomials.shtml
[41] Thiemann, R., Yamada, A.: Matrices, Jordan normal forms, and spectral radius theory. Archive of Formal Proofs (2015). Formal proof development. http://isa-afp.org/entries/Jordan_Normal_Form.shtml
[42] Tishby, N., Zaslavsky, N.: Deep learning and the information bottleneck principle. In: Information Theory Workshop (ITW 2015), pp. 1–5. IEEE (2015)
[43] Wenzel, M.: Isar – a generic interpretative approach to readable formal proof documents. In: Bertot, Y., Dowek, G., Hirschowitz, A., Paulin-Mohring, C., Théry, L. (eds.) Theorem Proving in Higher Order Logics (TPHOLs '99), LNCS, vol. 1690, pp. 167–184. Springer (1999)

Cited By

  • (2023) CoqQ: Foundational Verification of Quantum Programs. Proceedings of the ACM on Programming Languages 7(POPL), 833–865. https://doi.org/10.1145/3571222
  • (2021) CertRL: formalizing convergence proofs for value and policy iteration in Coq. Proceedings of the 10th ACM SIGPLAN International Conference on Certified Programs and Proofs, pp. 18–31. https://doi.org/10.1145/3437992.3439927
  • (2021) A formal proof of PAC learnability for decision stumps. Proceedings of the 10th ACM SIGPLAN International Conference on Certified Programs and Proofs, pp. 5–17. https://doi.org/10.1145/3437992.3439917

      Published In

      Journal of Automated Reasoning, Volume 63, Issue 2
      August 2019
      376 pages

      Publisher

      Springer-Verlag

      Berlin, Heidelberg

      Publication History

      Published: 01 August 2019

      Author Tags

      1. Convolutional arithmetic circuits
      2. Deep learning
      3. Formalization
      4. Isabelle/HOL
      5. Machine learning
      6. Tensors
