
Learning nonlinear operators via DeepONet based on the universal approximation theorem of operators

Abstract

It is widely known that neural networks (NNs) are universal approximators of continuous functions. However, a less known but powerful result is that an NN with a single hidden layer can accurately approximate any nonlinear continuous operator. This universal approximation theorem of operators is suggestive of the structure and potential of deep neural networks (DNNs) in learning continuous operators or complex systems from streams of scattered data. Here, we thus extend this theorem to DNNs. We design a new network with small generalization error, the deep operator network (DeepONet), which consists of a DNN for encoding the discrete input function space (branch net) and another DNN for encoding the domain of the output functions (trunk net). We demonstrate that DeepONet can learn various explicit operators, such as integrals and fractional Laplacians, as well as implicit operators that represent deterministic and stochastic differential equations. We study different formulations of the input function space and their effect on the generalization error for 16 diverse applications.
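
The branch–trunk structure described above can be written compactly, following the operator approximation form of Chen & Chen (ref. 30), as G(u)(y) ≈ Σ_{k=1}^{p} b_k(u(x_1), …, u(x_m)) · t_k(y): the branch net maps the input function u, sampled at m fixed sensor locations, to p coefficients, and the trunk net maps a query coordinate y to p basis values. The sketch below is a minimal illustrative implementation of this idea in PyTorch; it is not the authors' released code (linked under Code availability below), and the layer widths, the sensor count m, the number of basis functions p and the toy data are assumptions made only for the example.

```python
# Minimal DeepONet sketch (unstacked variant), assuming PyTorch.
# The widths, m = 100 sensors and p = 40 basis functions are illustrative choices,
# not the values used in the paper.
import torch
import torch.nn as nn

class DeepONet(nn.Module):
    def __init__(self, m=100, p=40, width=40):
        super().__init__()
        # Branch net: encodes the input function u sampled at m fixed sensor locations.
        self.branch = nn.Sequential(
            nn.Linear(m, width), nn.ReLU(),
            nn.Linear(width, p),
        )
        # Trunk net: encodes the coordinate y at which the output function is evaluated.
        self.trunk = nn.Sequential(
            nn.Linear(1, width), nn.ReLU(),
            nn.Linear(width, p), nn.ReLU(),
        )
        self.bias = nn.Parameter(torch.zeros(1))

    def forward(self, u_sensors, y):
        # G(u)(y) is approximated by the inner product of the two p-dimensional
        # embeddings, plus a trainable bias.
        b = self.branch(u_sensors)  # shape (batch, p)
        t = self.trunk(y)           # shape (batch, p)
        return (b * t).sum(dim=-1, keepdim=True) + self.bias

# Toy usage: a batch of 8 input functions sampled at 100 sensors, one query point each.
model = DeepONet()
u = torch.randn(8, 100)   # u(x_1), ..., u(x_m) for each function in the batch
y = torch.rand(8, 1)      # query locations y
print(model(u, y).shape)  # torch.Size([8, 1])
```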

Fig. 1: Illustrations of the problem set-up and new architectures of DeepONets that lead to good generalization.
Fig. 2: Learning explicit operators using different V spaces and different network architectures.
Fig. 3: Fast learning of implicit operators in a nonlinear pendulum (k = 1 and T = 3).
Fig. 4: Fast learning of implicit operators in a diffusion-reaction system.
Fig. 5: DeepONet prediction for a stochastic ODE.
Fig. 6: DeepONet prediction for a stochastic elliptic equation.

Data availability

All the datasets in the study were generated directly from the code.

Code availability

The code used in the study is publicly available from the GitHub repository https://github.com/lululxvi/deeponet (ref. 55).

References

  1. Rico-Martinez, R., Krischer, K., Kevrekidis, I. G., Kube, M. C. & Hudson, J. L. Discrete- vs. continuous-time nonlinear signal processing of Cu electrodissolution data. Chem. Eng. Commun. 118, 25–48 (1992).

  2. Rico-Martinez, R., Anderson, J. S. & Kevrekidis, I. G. Continuous-time nonlinear signal processing: a neural network based approach for gray box identification. In Proc. IEEE Workshop on Neural Networks for Signal Processing 596–605 (IEEE, 1994).

  3. González-García, R., Rico-Martínez, R. & Kevrekidis, I. G. Identification of distributed parameter systems: a neural net based approach. Comput. Chem. Eng. 22, S965–S968 (1998).

  4. Psichogios, D. C. & Ungar, L. H. A hybrid neural network-first principles approach to process modeling. AIChE J. 38, 1499–1511 (1992).

  5. Kevrekidis, I. G. et al. Equation-free, coarse-grained multiscale computation: enabling microscopic simulators to perform system-level analysis. Commun. Math. Sci. 1, 715–762 (2003).

  6. Weinan, E. Principles of Multiscale Modeling (Cambridge Univ. Press, 2011).

  7. Ferrandis, J., Triantafyllou, M., Chryssostomidis, C. & Karniadakis, G. Learning functionals via LSTM neural networks for predicting vessel dynamics in extreme sea states. Preprint at https://arxiv.org/pdf/1912.13382.pdf (2019).

  8. Qin, T., Chen, Z., Jakeman, J. & Xiu, D. Deep learning of parameterized equations with applications to uncertainty quantification. Preprint at https://arxiv.org/pdf/1910.07096.pdf (2020).

  9. Chen, T. Q., Rubanova, Y., Bettencourt, J. & Duvenaud, D. K. Neural ordinary differential equations. In Advances in Neural Information Processing Systems 6571–6583 (NIPS, 2018).

  10. Jia, J. & Benson, A. R. Neural jump stochastic differential equations. Preprint at https://arxiv.org/pdf/1905.10403.pdf (2019).

  11. Greydanus, S., Dzamba, M. & Yosinski, J. Hamiltonian neural networks. In Advances in Neural Information Processing Systems 15379–15389 (NIPS, 2019).

  12. Toth, P. et al. Hamiltonian generative networks. Preprint at https://arxiv.org/pdf/1909.13789.pdf (2019).

  13. Zhong, Y. D., Dey, B. & Chakraborty, A. Symplectic ODE-Net: learning Hamiltonian dynamics with control. Preprint at https://arxiv.org/pdf/1909.12077.pdf (2019).

  14. Chen, Z., Zhang, J., Arjovsky, M. & Bottou, L. Symplectic recurrent neural networks. Preprint at https://arxiv.org/pdf/1909.13334.pdf (2019).

  15. Winovich, N., Ramani, K. & Lin, G. ConvPDE-UQ: convolutional neural networks with quantified uncertainty for heterogeneous elliptic partial differential equations on varied domains. J. Comput. Phys. 394, 263–279 (2019).

  16. Zhu, Y., Zabaras, N., Koutsourelakis, P.-S. & Perdikaris, P. Physics-constrained deep learning for high-dimensional surrogate modeling and uncertainty quantification without labeled data. J. Comput. Phys. 394, 56–81 (2019).

  17. Trask, N., Patel, R. G., Gross, B. J. & Atzberger, P. J. GMLS-Nets: a framework for learning from unstructured data. Preprint at https://arxiv.org/pdf/1909.05371.pdf (2019).

  18. Li, Z. et al. Neural operator: graph kernel network for partial differential equations. Preprint at https://arxiv.org/pdf/2003.03485.pdf (2020).

  19. Rudy, S. H., Brunton, S. L., Proctor, J. L. & Kutz, J. N. Data-driven discovery of partial differential equations. Sci. Adv. 3, e1602614 (2017).

  20. Zhang, D., Lu, L., Guo, L. & Karniadakis, G. E. Quantifying total uncertainty in physics-informed neural networks for solving forward and inverse stochastic problems. J. Comput. Phys. 397, 108850 (2019).

  21. Pang, G., Lu, L. & Karniadakis, G. E. fPINNs: fractional physics-informed neural networks. SIAM J. Sci. Comput. 41, A2603–A2626 (2019).

  22. Lu, L., Meng, X., Mao, Z. & Karniadakis, G. E. DeepXDE: a deep learning library for solving differential equations. SIAM Rev. 63, 208–228 (2021).

  23. Yazdani, A., Lu, L., Raissi, M. & Karniadakis, G. E. Systems biology informed deep learning for inferring parameters and hidden dynamics. PLoS Comput. Biol. 16, e1007575 (2020).

  24. Chen, Y., Lu, L., Karniadakis, G. E. & Negro, L. D. Physics-informed neural networks for inverse problems in nano-optics and metamaterials. Opt. Express 28, 11618–11633 (2020).

  25. Holl, P., Koltun, V. & Thuerey, N. Learning to control PDEs with differentiable physics. Preprint at https://arxiv.org/pdf/2001.07457.pdf (2020).

  26. Lample, G. & Charton, F. Deep learning for symbolic mathematics. Preprint at https://arxiv.org/pdf/1912.01412.pdf (2019).

  27. Charton, F., Hayat, A. & Lample, G. Deep differential system stability—learning advanced computations from examples. Preprint at https://arxiv.org/pdf/2006.06462.pdf (2020).

  28. Cybenko, G. Approximation by superpositions of a sigmoidal function. Math. Control Signals Syst. 2, 303–314 (1989).

  29. Hornik, K., Stinchcombe, M. & White, H. Multilayer feedforward networks are universal approximators. Neural Networks 2, 359–366 (1989).

  30. Chen, T. & Chen, H. Universal approximation to nonlinear operators by neural networks with arbitrary activation functions and its application to dynamical systems. IEEE Trans. Neural Networks 6, 911–917 (1995).

  31. Chen, T. & Chen, H. Approximations of continuous functionals by neural networks with application to dynamic systems. IEEE Trans. Neural Networks 4, 910–918 (1993).

  32. Mhaskar, H. N. & Hahm, N. Neural networks for functional approximation and system identification. Neural Comput. 9, 143–159 (1997).

  33. Rossi, F. & Conan-Guez, B. Functional multi-layer perceptron: a non-linear tool for functional data analysis. Neural Networks 18, 45–60 (2005).

  34. Chen, T. & Chen, H. Approximation capability to functions of several variables, nonlinear functionals, and operators by radial basis function neural networks. IEEE Trans. Neural Networks 6, 904–910 (1995).

  35. Brown, T. B. et al. Language models are few-shot learners. Preprint at https://arxiv.org/pdf/2005.14165.pdf (2020).

  36. Lu, L., Su, Y. & Karniadakis, G. E. Collapse of deep and narrow neural nets. Preprint at https://arxiv.org/pdf/1808.04947.pdf (2018).

  37. Jin, P., Lu, L., Tang, Y. & Karniadakis, G. E. Quantifying the generalization error in deep learning in terms of data distribution and neural network smoothness. Neural Networks 130, 85–99 (2020).

  38. Lu, L., Shin, Y., Su, Y. & Karniadakis, G. E. Dying ReLU and initialization: theory and numerical examples. Commun. Comput. Phys. 28, 1671–1706 (2020).

  39. He, K., Zhang, X., Ren, S. & Sun, J. Deep residual learning for image recognition. In Proc. 2016 IEEE Conference on Computer Vision and Pattern Recognition 770–778 (IEEE, 2016).

  40. Vaswani, A. et al. Attention is all you need. In Advances in Neural Information Processing Systems 5998–6008 (NIPS, 2017).

  41. Dumoulin, V. et al. Feature-wise transformations. Distill https://distill.pub/2018/feature-wise-transformations (2018).

  42. Sutskever, I., Vinyals, O. & Le, Q. V. Sequence to sequence learning with neural networks. In Advances in Neural Information Processing Systems 3104–3112 (NIPS, 2014).

  43. Bahdanau, D., Cho, K. & Bengio, Y. Neural machine translation by jointly learning to align and translate. Preprint at https://arxiv.org/pdf/1409.0473.pdf (2014).

  44. Britz, D., Goldie, A., Luong, M. & Le, Q. Massive exploration of neural machine translation architectures. Preprint at https://arxiv.org/pdf/1703.03906.pdf (2017).

  45. Gelbrich, M. On a formula for the L2 Wasserstein metric between measures on Euclidean and Hilbert spaces. Math. Nachr. 147, 185–203 (1990).

  46. Podlubny, I. Fractional Differential Equations: An Introduction to Fractional Derivatives, Fractional Differential Equations, to Methods of their Solution and Some of their Applications (Elsevier, 1998).

  47. Zayernouri, M. & Karniadakis, G. E. Fractional Sturm–Liouville Eigen-problems: theory and numerical approximation. J. Comput. Phys. 252, 495–517 (2013).

  48. Lischke, A. et al. What is the fractional Laplacian? A comparative review with new results. J. Comput. Phys. 404, 109009 (2020).

  49. Born, M. & Wolf, E. Principles of Optics: Electromagnetic Theory of Propagation, Interference and Diffraction of Light (Elsevier, 2013).

  50. Mitzenmacher, M. & Upfal, E. Probability and Computing: Randomization and Probabilistic Techniques in Algorithms and Data Analysis (Cambridge Univ. Press, 2017).

  51. Shwartz-Ziv, R. & Tishby, N. Opening the black box of deep neural networks via information. Preprint at https://arxiv.org/pdf/1703.00810.pdf (2017).

  52. Cai, S., Wang, Z., Lu, L., Zaki, T. A. & Karniadakis, G. E. DeepM&Mnet: inferring the electroconvection multiphysics fields based on operator approximation by neural networks. Preprint at https://arxiv.org/pdf/2009.12935.pdf (2020).

  53. Tai, K. S., Bailis, P. & Valiant, G. Equivariant transformer networks. Preprint at https://arxiv.org/pdf/1901.11399.pdf (2019).

  54. Hanin, B. Universal function approximation by deep neural nets with bounded width and ReLU activations. Preprint at https://arxiv.org/pdf/1708.02691.pdf (2017).

  55. Lu, L. DeepONet https://doi.org/10.5281/zenodo.4319385 (13 December 2020).

Acknowledgements

This work was supported by the DOE PhILMs project (no. DE-SC0019453) and DARPA-CompMods grant no. HR00112090062.

Author information

Contributions

L.L. and G.E.K. designed the study based on G.E.K.’s original idea. L.L. developed DeepONet architectures. L.L., P.J. and Z.Z. developed the theory. L.L. performed the experiments for the integral, nonlinear ODE, gravity pendulum and stochastic ODE/PDE operators. L.L. and P.J. performed the experiments for the Legendre transform, diffusion-reaction, advection and advection-diffusion PDEs. G.P. performed the experiments for fractional operators. L.L., P.J., G.P., Z.Z. and G.E.K. wrote the manuscript. G.E.K. supervised the project.

Corresponding author

Correspondence to George Em Karniadakis.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Peer review information Nature Machine Intelligence thanks Irina Higgins, Jian-Xun Wang and the other, anonymous, reviewer(s) for their contribution to the peer review of this work.

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary information

About this article

Cite this article

Lu, L., Jin, P., Pang, G. et al. Learning nonlinear operators via DeepONet based on the universal approximation theorem of operators. Nat Mach Intell 3, 218–229 (2021). https://doi.org/10.1038/s42256-021-00302-5

  • Received:

  • Accepted:

  • Published:

  • Issue Date:

  • DOI: https://doi.org/10.1038/s42256-021-00302-5
