A Formal Proof of the Expressiveness of Deep Learning

Published: 01 August 2019

Abstract

Deep learning has had a profound impact on computer science in recent years, with applications to image recognition, language processing, bioinformatics, and more. Recently, Cohen et al. provided theoretical evidence for the superiority of deep learning over shallow learning. We formalized their mathematical proof using Isabelle/HOL. The Isabelle development simplifies and generalizes the original proof, while working around the limitations of the HOL type system. To support the formalization, we developed reusable libraries of formalized mathematics, including results about the matrix rank, the Borel measure, and multivariate polynomials, as well as a library for tensor analysis.
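The separation result of Cohen et al. [16] that the development formalizes rests, roughly speaking, on a tensor-analytic observation: a shallow network corresponds to a coefficient tensor that is a sum of comparatively few rank-1 terms, and such a tensor has a matricization of low matrix rank, whereas deep networks generically yield tensors of very high matricization rank. The NumPy sketch below is purely illustrative and is not part of the Isabelle development; all function names are ad hoc. It checks the easy half of this observation numerically: a sum of r rank-1 terms matricizes to a matrix of rank at most r.

```python
# Illustrative sketch only (not from the Isabelle development): a tensor built
# from r rank-1 outer-product terms has a matricization of matrix rank <= r.
import numpy as np

def cp_tensor(factors):
    """Sum of rank-1 tensors; `factors` is a list of r tuples of d vectors."""
    t = np.zeros(tuple(len(v) for v in factors[0]))
    for vecs in factors:
        term = vecs[0]
        for v in vecs[1:]:
            term = np.multiply.outer(term, v)  # outer product, one mode at a time
        t += term
    return t

def matricize(t):
    """Group the first half of the modes into rows, the remaining modes into columns."""
    rows = int(np.prod(t.shape[: t.ndim // 2]))
    return t.reshape(rows, -1)

rng = np.random.default_rng(0)
r, d, n = 3, 4, 5  # 3 rank-1 terms, an order-4 tensor, mode size 5
factors = [tuple(rng.standard_normal(n) for _ in range(d)) for _ in range(r)]
print(np.linalg.matrix_rank(matricize(cp_tensor(factors))))  # 3, far below the maximum of 25
```

The hard direction, that deep networks almost everywhere attain exponentially high matricization rank, is the part that requires the measure-theoretic and polynomial machinery mentioned in the abstract.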

References

[1] Bader, B.W., Kolda, T.G.: Algorithm 862: MATLAB tensor classes for fast algorithm prototyping. ACM Trans. Math. Softw. 32(4), 635–653 (2006)
[2] Bentkamp, A.: Expressiveness of deep learning. Archive of Formal Proofs (2016). Formal proof development. http://isa-afp.org/entries/Deep_Learning.shtml
[3] Bentkamp, A.: An Isabelle formalization of the expressiveness of deep learning. M.Sc. thesis, Universität des Saarlandes (2016). http://matryoshka.gforge.inria.fr/pubs/bentkamp_msc_thesis.pdf
[4] Bentkamp, A., Blanchette, J.C., Klakow, D.: A formal proof of the expressiveness of deep learning. In: Ayala-Rincón, M., Muñoz, C.A. (eds.) Interactive Theorem Proving (ITP 2017), LNCS, vol. 10499, pp. 46–64. Springer (2017)
[5] Bernard, S., Bertot, Y., Rideau, L., Strub, P.: Formal proofs of transcendence for $e$ and $\pi$ as an application of multivariate and symmetric polynomials. In: Avigad, J., Chlipala, A. (eds.) Certified Programs and Proofs (CPP 2016), pp. 76–87. ACM (2016)
[6] Bertot, Y., Gonthier, G., Biha, S.O., Pasca, I.: Canonical big operators. In: Mohamed, O.A., Muñoz, C.A., Tahar, S. (eds.) Theorem Proving in Higher Order Logics (TPHOLs 2008), LNCS, vol. 5170, pp. 86–101. Springer (2008)
[7] Bhat, S.: Syntactic foundations for machine learning. Ph.D. thesis, Georgia Institute of Technology (2013). https://smartech.gatech.edu/bitstream/handle/1853/47700/bhat_sooraj_b_201305_phd.pdf
[8] Blanchette, J.C., Böhme, S., Fleury, M., Smolka, S.J., Steckermeier, A.: Semi-intelligible Isar proofs from machine-generated proofs. J. Autom. Reason. 56(2), 155–200 (2016)
[9] Blanchette, J.C., Greenaway, D., Kaliszyk, C., Kühlwein, D., Urban, J.: A learning-based fact selector for Isabelle/HOL. J. Autom. Reason. 57(3), 219–244 (2016)
[10] Boender, J., Kammüller, F., Nagarajan, R.: Formalization of quantum protocols using Coq. In: Heunen, C., Selinger, P., Vicary, J. (eds.) Workshop on Quantum Physics and Logic (QPL 2015), EPTCS, vol. 195, pp. 71–83 (2015)
[11] Böhme, S., Weber, T.: Fast LCF-style proof reconstruction for Z3. In: Kaufmann, M., Paulson, L.C. (eds.) Interactive Theorem Proving (ITP 2010), LNCS, vol. 6172, pp. 179–194. Springer (2010)
[12] Bürgisser, P., Cucker, F., Lotz, M.: The probability that a slightly perturbed numerical analysis problem is difficult. Math. Comput. 77(263), 1559–1583 (2008)
[13] Caron, R., Traynor, T.: The zero set of a polynomial. Technical report, University of Windsor (2005). http://www1.uwindsor.ca/math/sites/uwindsor.ca.math/files/05-03.pdf
[14] Church, A.: A formulation of the simple theory of types. J. Symb. Log. 5(2), 56–68 (1940)
[15] Cohen, N., Sharir, O., Shashua, A.: Deep SimNets. In: Computer Vision and Pattern Recognition (CVPR 2016), pp. 4782–4791. IEEE Computer Society (2016)
[16] Cohen, N., Sharir, O., Shashua, A.: On the expressive power of deep learning: a tensor analysis. In: Feldman, V., Rakhlin, A., Shamir, O. (eds.) Conference on Learning Theory (COLT 2016), JMLR Workshop and Conference Proceedings, vol. 49, pp. 698–728. JMLR.org (2016)
[17] Cohen, N., Shashua, A.: Convolutional rectifier networks as generalized tensor decompositions. In: Balcan, M., Weinberger, K.Q. (eds.) International Conference on Machine Learning (ICML 2016), JMLR Workshop and Conference Proceedings, vol. 48, pp. 955–963. JMLR.org (2016)
[18] Cohen, N., Shashua, A.: Inductive bias of deep convolutional networks through pooling geometry. CoRR arXiv:1605.06743 (2016)
[19] Cohen, N., Tamari, R., Shashua, A.: Boosting dilated convolutional networks with mixed tensor decompositions. CoRR arXiv:1703.06846 (2017)
[20] de Moura, L., Bjørner, N.: Z3: An efficient SMT solver. In: Ramakrishnan, C.R., Rehof, J. (eds.) Tools and Algorithms for the Construction and Analysis of Systems (TACAS 2008), LNCS, vol. 4963, pp. 337–340. Springer (2008)
[21] Gordon, M.J.C., Milner, R., Wadsworth, C.P.: Edinburgh LCF: A Mechanised Logic of Computation, LNCS, vol. 78. Springer, Berlin (1979)
[22] Haftmann, F., Lochbihler, A., Schreiner, W.: Towards abstract and executable multivariate polynomials in Isabelle. In: Nipkow, T., Paulson, L., Wenzel, M. (eds.) Isabelle Workshop 2014 (2014)
[23] Hardt, M., Recht, B., Singer, Y.: Train faster, generalize better: stability of stochastic gradient descent. In: Balcan, M., Weinberger, K.Q. (eds.) International Conference on Machine Learning (ICML 2016), JMLR Workshop and Conference Proceedings, vol. 48, pp. 1225–1234. JMLR (2016)
[24] Harrison, J.: A HOL theory of Euclidean space. In: Hurd, J., Melham, T. (eds.) Theorem Proving in Higher Order Logics (TPHOLs 2005), LNCS, vol. 3603, pp. 114–129. Springer (2005)
[25] Hölzl, J., Heller, A.: Three chapters of measure theory in Isabelle/HOL. In: van Eekelen, M.C.J.D., Geuvers, H., Schmaltz, J., Wiedijk, F. (eds.) Interactive Theorem Proving (ITP 2011), LNCS, vol. 6898, pp. 135–151. Springer (2011)
[26] Immler, F., Maletzky, A.: Gröbner bases theory. Archive of Formal Proofs (2016). Formal proof development. http://isa-afp.org/entries/Groebner_Bases.shtml
[27] Kam, R.: Case studies in proof checking. Master's thesis, San Jose State University (2007). http://scholarworks.sjsu.edu/cgi/viewcontent.cgi?context=etd_projects&article=1149
[28] Kawaguchi, K.: Deep learning without poor local minima. In: Lee, D.D., Sugiyama, M., von Luxburg, U., Guyon, I., Garnett, R. (eds.) Advances in Neural Information Processing Systems (NIPS 2016), NIPS, vol. 29, pp. 586–594 (2016)
[29] Kobayashi, H., Chen, L., Murao, H.: Groups, rings and modules. Archive of Formal Proofs (2004). Formal proof development. http://isa-afp.org/entries/Group-Ring-Module.shtml
[30] Liu, L., Aravantinos, V., Hasan, O., Tahar, S.: On the formal analysis of HMM using theorem proving. In: Merz, S., Pang, J. (eds.) International Conference on Formal Engineering Methods (ICFEM 2014), LNCS, vol. 8829, pp. 316–331. Springer (2014)
[31] Lotz, M.: On the volume of tubular neighborhoods of real algebraic varieties. Proc. Am. Math. Soc. 143(5), 1875–1889 (2015)
[32] Murphy, C., Gray, P., Stewart, G.: Verified perceptron convergence theorem. In: Shpeisman, T., Gottschlich, J. (eds.) Machine Learning and Programming Languages (MAPL 2017), pp. 43–50. ACM (2017)
[33] Nipkow, T., Klein, G.: Concrete Semantics: With Isabelle/HOL. Springer, Berlin (2014)
[34] Nipkow, T., Paulson, L.C., Wenzel, M.: Isabelle/HOL: A Proof Assistant for Higher-Order Logic, LNCS, vol. 2283. Springer, Berlin (2002)
[35] Paulson, L.C., Blanchette, J.C.: Three years of experience with Sledgehammer, a practical link between automatic and interactive theorem provers. In: Sutcliffe, G., Schulz, S., Ternovska, E. (eds.) International Workshop on the Implementation of Logics (IWIL-2010), EPiC, vol. 2, pp. 1–11. EasyChair (2012)
[36] Paulson, L.C., Susanto, K.W.: Source-level proof reconstruction for interactive theorem proving. In: Schneider, K., Brandt, J. (eds.) Theorem Proving in Higher Order Logics (TPHOLs 2007), LNCS, vol. 4732, pp. 232–245. Springer (2007)
[37] Poon, H., Domingos, P.M.: Sum–product networks: a new deep architecture. In: Cozman, F.G., Pfeffer, A. (eds.) Uncertainty in Artificial Intelligence (UAI 2011), pp. 337–346. AUAI Press (2011)
[38] Prathamesh, T.V.H.: Tensor product of matrices. Archive of Formal Proofs (2016). Formal proof development. http://isa-afp.org/entries/Matrix_Tensor.shtml
[39] Selsam, D., Liang, P., Dill, D.L.: Developing bug-free machine learning systems with formal mathematics. In: Precup, D., Teh, Y.W. (eds.) International Conference on Machine Learning (ICML 2017), Proceedings of Machine Learning Research, vol. 70, pp. 3047–3056. PMLR (2017)
[40] Sternagel, C., Thiemann, R.: Executable multivariate polynomials. Archive of Formal Proofs (2010). Formal proof development. http://isa-afp.org/entries/Polynomials.shtml
[41] Thiemann, R., Yamada, A.: Matrices, Jordan normal forms, and spectral radius theory. Archive of Formal Proofs (2015). Formal proof development. http://isa-afp.org/entries/Jordan_Normal_Form.shtml
[42] Tishby, N., Zaslavsky, N.: Deep learning and the information bottleneck principle. In: Information Theory Workshop (ITW 2015), pp. 1–5. IEEE (2015)
[43] Wenzel, M.: Isar – a generic interpretative approach to readable formal proof documents. In: Bertot, Y., Dowek, G., Hirschowitz, A., Paulin-Mohring, C., Théry, L. (eds.) Theorem Proving in Higher Order Logics (TPHOLs '99), LNCS, vol. 1690, pp. 167–184. Springer (1999)

Cited By

  • (2023) CoqQ: Foundational Verification of Quantum Programs. Proceedings of the ACM on Programming Languages 7(POPL), 833–865. https://doi.org/10.1145/3571222
  • (2021) CertRL: formalizing convergence proofs for value and policy iteration in Coq. Proceedings of the 10th ACM SIGPLAN International Conference on Certified Programs and Proofs, pp. 18–31. https://doi.org/10.1145/3437992.3439927
  • (2021) A formal proof of PAC learnability for decision stumps. Proceedings of the 10th ACM SIGPLAN International Conference on Certified Programs and Proofs, pp. 5–17. https://doi.org/10.1145/3437992.3439917

      Published In

      Journal of Automated Reasoning, Volume 63, Issue 2
      August 2019
      376 pages

      Publisher

      Springer-Verlag

      Berlin, Heidelberg

      Publication History

      Published: 01 August 2019

      Author Tags

      1. Convolutional arithmetic circuits
      2. Deep learning
      3. Formalization
      4. Isabelle/HOL
      5. Machine learning
      6. Tensors
