Abstract
It is widely known that neural networks (NNs) are universal approximators of continuous functions. However, a lesser-known but powerful result is that an NN with a single hidden layer can accurately approximate any nonlinear continuous operator. This universal approximation theorem of operators is suggestive of the structure and potential of deep neural networks (DNNs) in learning continuous operators or complex systems from streams of scattered data. Here we extend this theorem to DNNs. We design a new network with small generalization error, the deep operator network (DeepONet), which consists of a DNN for encoding the discrete input function space (branch net) and another DNN for encoding the domain of the output functions (trunk net). We demonstrate that DeepONet can learn various explicit operators, such as integrals and fractional Laplacians, as well as implicit operators that represent deterministic and stochastic differential equations. We study different formulations of the input function space and their effect on the generalization error for 16 diverse applications.
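The branch-trunk architecture summarized above can be sketched in a few lines of code. The following is a minimal, hedged sketch in PyTorch (the authors' released implementation uses the DeepXDE library instead); the network widths, depths and the antiderivative example are illustrative assumptions, not the exact configurations used in the paper. The branch net encodes the input function u sampled at m fixed sensor locations, the trunk net encodes a query point y in the output domain, and the operator value G(u)(y) is their dot product plus a bias.

```python
# Minimal DeepONet sketch (assumed PyTorch implementation, not the authors' code).
import torch
import torch.nn as nn


class DeepONet(nn.Module):
    def __init__(self, m: int, dim_y: int, width: int = 40, p: int = 40):
        super().__init__()
        # Branch net: [u(x_1), ..., u(x_m)] -> p-dimensional feature vector.
        self.branch = nn.Sequential(
            nn.Linear(m, width), nn.ReLU(),
            nn.Linear(width, width), nn.ReLU(),
            nn.Linear(width, p),
        )
        # Trunk net: query location y -> p-dimensional feature vector.
        self.trunk = nn.Sequential(
            nn.Linear(dim_y, width), nn.ReLU(),
            nn.Linear(width, width), nn.ReLU(),
            nn.Linear(width, p), nn.ReLU(),
        )
        self.bias = nn.Parameter(torch.zeros(1))

    def forward(self, u_sensors: torch.Tensor, y: torch.Tensor) -> torch.Tensor:
        # u_sensors: (batch, m), y: (batch, dim_y) -> G(u)(y): (batch, 1)
        b = self.branch(u_sensors)
        t = self.trunk(y)
        return (b * t).sum(dim=-1, keepdim=True) + self.bias


# Illustrative usage: learning the antiderivative operator G(u)(y) = integral of u
# from 0 to y, with u sampled at m = 100 equispaced sensors on [0, 1].
model = DeepONet(m=100, dim_y=1)
u = torch.randn(8, 100)   # a batch of 8 input functions evaluated at the sensors
y = torch.rand(8, 1)      # one query location per input function
print(model(u, y).shape)  # torch.Size([8, 1])
```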
Data availability
All the datasets in the study were generated directly from the code.
Code availability
The code used in the study is publicly available from the GitHub repository https://github.com/lululxvi/deeponet (ref. 55).
References
Rico-Martinez, R., Krischer, K., Kevrekidis, I. G., Kube, M. C. & Hudson, J. L. Discrete- vs. continuous-time nonlinear signal processing of Cu electrodissolution data. Chem. Eng. Commun. 118, 25–48 (1992).
Rico-Martinez, R., Anderson, J. S. & Kevrekidis, I. G. Continuous-time nonlinear signal processing: a neural network based approach for gray box identification. In Proc. IEEE Workshop on Neural Networks for Signal Processing 596–605 (IEEE, 1994).
González-García, R., Rico-Martínez, R. & Kevrekidis, I. G. Identification of distributed parameter systems: a neural net based approach. Comput. Chem. Eng. 22, S965–S968 (1998).
Psichogios, D. C. & Ungar, L. H. A hybrid neural network-first principles approach to process modeling. AIChE J. 38, 1499–1511 (1992).
Kevrekidis, I. G. et al. Equation-free, coarse-grained multiscale computation: enabling microscopic simulators to perform system-level analysis. Commun. Math. Sci. 1, 715–762 (2003).
Weinan, E. Principles of Multiscale Modeling (Cambridge Univ. Press, 2011).
Ferrandis, J., Triantafyllou, M., Chryssostomidis, C. & Karniadakis, G. Learning functionals via LSTM neural networks for predicting vessel dynamics in extreme sea states. Preprint at https://arxiv.org/pdf/1912.13382.pdf (2019).
Qin, T., Chen, Z., Jakeman, J. & Xiu, D. Deep learning of parameterized equations with applications to uncertainty quantification. Preprint at https://arxiv.org/pdf/1910.07096.pdf (2020).
Chen, T. Q., Rubanova, Y., Bettencourt, J. & Duvenaud, D. K. Neural ordinary differential equations. In Advances in Neural Information Processing Systems 6571–6583 (NIPS, 2018).
Jia, J. & Benson, A. R. Neural jump stochastic differential equations. Preprint at https://arxiv.org/pdf/1905.10403.pdf (2019).
Greydanus, S., Dzamba, M. & Yosinski, J. Hamiltonian neural networks. In Advances in Neural Information Processing Systems 15379–15389 (NIPS, 2019).
Toth, P. et al. Hamiltonian generative networks. Preprint at https://arxiv.org/pdf/1909.13789.pdf (2019).
Zhong, Y. D., Dey, B. & Chakraborty, A. Symplectic ODE-Net: learning Hamiltonian dynamics with control. Preprint at https://arxiv.org/pdf/1909.12077.pdf (2019).
Chen, Z., Zhang, J., Arjovsky, M. & Bottou, L. Symplectic recurrent neural networks. Preprint at https://arxiv.org/pdf/1909.13334.pdf (2019).
Winovich, N., Ramani, K. & Lin, G. ConvPDE-UQ: convolutional neural networks with quantified uncertainty for heterogeneous elliptic partial differential equations on varied domains. J. Comput. Phys. 394, 263–279 (2019).
Zhu, Y., Zabaras, N., Koutsourelakis, P.-S. & Perdikaris, P. Physics-constrained deep learning for high-dimensional surrogate modeling and uncertainty quantification without labeled data. J. Comput. Phys. 394, 56–81 (2019).
Trask, N., Patel, R. G., Gross, B. J. & Atzberger, P. J. GMLS-Nets: a framework for learning from unstructured data. Preprint at https://arxiv.org/pdf/1909.05371.pdf (2019).
Li, Z. et al. Neural operator: graph kernel network for partial differential equations. Preprint at https://arxiv.org/pdf/2003.03485.pdf (2020).
Rudy, S. H., Brunton, S. L., Proctor, J. L. & Kutz, J. N. Data-driven discovery of partial differential equations. Sci. Adv. 3, e1602614 (2017).
Zhang, D., Lu, L., Guo, L. & Karniadakis, G. E. Quantifying total uncertainty in physics-informed neural networks for solving forward and inverse stochastic problems. J. Comput. Phys. 397, 108850 (2019).
Pang, G., Lu, L. & Karniadakis, G. E. fPINNs: fractional physics-informed neural networks. SIAM J. Sci. Comput. 41, A2603–A2626 (2019).
Lu, L., Meng, X., Mao, Z. & Karniadakis, G. E. DeepXDE: a deep learning library for solving differential equations. SIAM Rev. 63, 208–228 (2021).
Yazdani, A., Lu, L., Raissi, M. & Karniadakis, G. E. Systems biology informed deep learning for inferring parameters and hidden dynamics. PLoS Comput. Biol. 16, e1007575 (2020).
Chen, Y., Lu, L., Karniadakis, G. E. & Negro, L. D. Physics-informed neural networks for inverse problems in nano-optics and metamaterials. Opt. Express 28, 11618–11633 (2020).
Holl, P., Koltun, V. & Thuerey, N. Learning to control PDEs with differentiable physics. Preprint at https://arxiv.org/pdf/2001.07457.pdf (2020).
Lample, G. & Charton, F. Deep learning for symbolic mathematics. Preprint at https://arxiv.org/pdf/1912.01412.pdf (2019).
Charton, F., Hayat, A. & Lample, G. Deep differential system stability—learning advanced computations from examples. Preprint at https://arxiv.org/pdf/2006.06462.pdf (2020).
Cybenko, G. Approximation by superpositions of a sigmoidal function. Math. Control Signals Syst. 2, 303–314 (1989).
Hornik, K., Stinchcombe, M. & White, H. Multilayer feedforward networks are universal approximators. Neural Networks 2, 359–366 (1989).
Chen, T. & Chen, H. Universal approximation to nonlinear operators by neural networks with arbitrary activation functions and its application to dynamical systems. IEEE Trans. Neural Networks 6, 911–917 (1995).
Chen, T. & Chen, H. Approximations of continuous functionals by neural networks with application to dynamic systems. IEEE Trans. Neural Networks 4, 910–918 (1993).
Mhaskar, H. N. & Hahm, N. Neural networks for functional approximation and system identification. Neural Comput. 9, 143–159 (1997).
Rossi, F. & Conan-Guez, B. Functional multi-layer perceptron: a non-linear tool for functional data analysis. Neural Networks 18, 45–60 (2005).
Chen, T. & Chen, H. Approximation capability to functions of several variables, nonlinear functionals, and operators by radial basis function neural networks. IEEE Trans. Neural Networks 6, 904–910 (1995).
Brown, T. B. et al. Language models are few-shot learners. Preprint at https://arxiv.org/pdf/2005.14165.pdf (2020).
Lu, L., Su, Y. & Karniadakis, G. E. Collapse of deep and narrow neural nets. Preprint at https://arxiv.org/pdf/1808.04947.pdf (2018).
Jin, P., Lu, L., Tang, Y. & Karniadakis, G. E. Quantifying the generalization error in deep learning in terms of data distribution and neural network smoothness. Neural Networks 130, 85–99 (2020).
Lu, L., Shin, Y., Su, Y. & Karniadakis, G. E. Dying ReLU and initialization: theory and numerical examples. Commun. Comput. Phys. 28, 1671–1706 (2020).
He, K., Zhang, X., Ren, S. & Sun, J. Deep residual learning for image recognition. In Proc. 2016 IEEE Conference on Computer Vision and Pattern Recognition 770–778 (IEEE, 2016).
Vaswani, A. et al. Attention is all you need. In Advances in Neural Information Processing Systems 5998–6008 (NIPS, 2017).
Dumoulin, V. et al. Feature-wise transformations. Distill https://distill.pub/2018/feature-wise-transformations (2018).
Sutskever, I., Vinyals, O. & Le, Q. V. Sequence to sequence learning with neural networks. In Advances in Neural Information Processing Systems 3104–3112 (NIPS, 2014).
Bahdanau, D., Cho, K. & Bengio, Y. Neural machine translation by jointly learning to align and translate. Preprint at https://arxiv.org/pdf/1409.0473.pdf (2014).
Britz, D., Goldie, A., Luong, M. & Le, Q. Massive exploration of neural machine translation architectures. Preprint at https://arxiv.org/pdf/1703.03906.pdf (2017).
Gelbrich, M. On a formula for the L2 Wasserstein metric between measures on Euclidean and Hilbert spaces. Math. Nachrichten 147, 185–203 (1990).
Podlubny, I. Fractional Differential Equations: An Introduction to Fractional Derivatives, Fractional Differential Equations, to Methods of their Solution and Some of their Applications (Elsevier, 1998).
Zayernouri, M. & Karniadakis, G. E. Fractional Sturm–Liouville Eigen-problems: theory and numerical approximation. J. Comput. Phys. 252, 495–517 (2013).
Lischke, A. et al. What is the fractional Laplacian? A comparative review with new results. J. Comput. Phys. 404, 109009 (2020).
Born, M. & Wolf, E. Principles of Optics: Electromagnetic Theory of Propagation, Interference and Diffraction of Light (Elsevier, 2013).
Mitzenmacher, M. & Upfal, E. Probability and Computing: Randomization and Probabilistic Techniques in Algorithms and Data Analysis (Cambridge Univ. Press, 2017).
Shwartz-Ziv, R. & Tishby, N. Opening the black box of deep neural networks via information. Preprint at https://arxiv.org/pdf/1703.00810.pdf (2017).
Cai, S., Wang, Z., Lu, L., Zaki, T. A. & Karniadakis, G. E. DeepM&Mnet: inferring the electroconvection multiphysics fields based on operator approximation by neural networks. Preprint at https://arxiv.org/pdf/2009.12935.pdf (2020).
Tai, K. S., Bailis, P. & Valiant, G. Equivariant transformer networks. Preprint at https://arxiv.org/pdf/1901.11399.pdf (2019).
Hanin, B. Universal function approximation by deep neural nets with bounded width and ReLU activations. Preprint at https://arxiv.org/pdf/1708.02691.pdf (2017).
Lu, L. DeepONet https://doi.org/10.5281/zenodo.4319385 (13 December 2020).
Acknowledgements
This work was supported by the DOE PhILMs project (no. DE-SC0019453) and DARPA-CompMods grant no. HR00112090062.
Author information
Authors and Affiliations
Contributions
L.L. and G.E.K. designed the study based on G.E.K.’s original idea. L.L. developed DeepONet architectures. L.L., P.J. and Z.Z. developed the theory. L.L. performed the experiments for the integral, nonlinear ODE, gravity pendulum and stochastic ODE/PDE operators. L.L. and P.J. performed the experiments for the Legendre transform, diffusion-reaction, advection and advection-diffusion PDEs. G.P. performed the experiments for fractional operators. L.L., P.J., G.P., Z.Z. and G.E.K. wrote the manuscript. G.E.K. supervised the project.
Corresponding author
Ethics declarations
Competing interests
The authors declare no competing interests.
Additional information
Peer review information Nature Machine Intelligence thanks Irina Higgins, Jian-Xun Wang and the other, anonymous, reviewer(s) for their contribution to the peer review of this work.
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Supplementary information
Supplementary Information.
Rights and permissions
About this article
Cite this article
Lu, L., Jin, P., Pang, G. et al. Learning nonlinear operators via DeepONet based on the universal approximation theorem of operators. Nat Mach Intell 3, 218–229 (2021). https://doi.org/10.1038/s42256-021-00302-5
DOI: https://doi.org/10.1038/s42256-021-00302-5
This article is cited by
- Fault geometry invariance and dislocation potential in antiplane crustal deformation: physics-informed simultaneous solutions. Progress in Earth and Planetary Science (2024)
- Application of graph neural networks to predict explosion-induced transient flow. Advanced Modeling and Simulation in Engineering Sciences (2024)
- Progressive transfer learning for advancing machine learning-based reduced-order modeling. Scientific Reports (2024)
- Generative learning for forecasting the dynamics of high-dimensional complex systems. Nature Communications (2024)
- A Euclidean transformer for fast and stable machine learned force fields. Nature Communications (2024)