Abstract
POD-DL-ROMs have been recently proposed as an extremely versatile strategy to build accurate and reliable reduced order models (ROMs) for nonlinear parametrized partial differential equations, combining (i) a preliminary dimensionality reduction obtained through proper orthogonal decomposition (POD) for the sake of efficiency, (ii) an autoencoder architecture that further reduces the dimensionality of the POD space to a handful of latent coordinates, and (iii) a dense neural network to learn the map that describes the dynamics of the latent coordinates as a function of the input parameters and the time variable. Within this work, we aim at justifying the outstanding approximation capabilities of POD-DL-ROMs by means of a thorough error analysis, showing how the sampling required to generate training data, the dimension of the POD space, and the complexity of the underlying neural networks impact on the solution accuracy. This analysis allows us to formulate practical criteria to control the relative error in the approximation of the solution field of interest, and to derive general error estimates. Furthermore, we show that, from a theoretical point of view, POD-DL-ROMs outperform several deep learning-based techniques in terms of model complexity. Finally, we validate our findings by means of suitable numerical experiments, ranging from analytically defined parameter-dependent operators to several parametrized PDEs.
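To fix ideas, a minimal sketch of the three-step pipeline described above is given below in PyTorch. It is not the authors' implementation (a reference implementation is linked under Data Availability); the dimensions, network architectures, and loss weighting are illustrative assumptions, with the loss written in the spirit of the POD-DL-ROM training objective of [24].

```python
# Minimal POD-DL-ROM sketch (PyTorch). Illustrative only: all dimensions,
# architectures, and the loss below are assumptions, not the authors' code.
import torch
import torch.nn as nn

N_h, N, n, p = 4096, 64, 4, 2      # FOM dofs, POD dim, latent dim, no. of parameters

# (i) POD: rank-N basis from a snapshot matrix S of shape (N_h, N_data)
S = torch.randn(N_h, 1000)                       # placeholder snapshots
U, _, _ = torch.linalg.svd(S, full_matrices=False)
V = U[:, :N]                                     # POD basis: V^T V = I_N

def mlp(sizes):
    """Dense network with Tanh activations (none on the output layer)."""
    layers = []
    for a, b in zip(sizes[:-1], sizes[1:]):
        layers += [nn.Linear(a, b), nn.Tanh()]
    return nn.Sequential(*layers[:-1])

# (ii) autoencoder acting on the POD coefficients; (iii) map (mu, t) -> latent
encoder, decoder = mlp([N, 128, n]), mlp([n, 128, N])
phi = mlp([p + 1, 128, 128, n])

def loss_fn(u_N, mu_t, omega=0.5):
    """u_N = V^T u(mu, t): POD coefficients; mu_t: stacked (mu, t) inputs."""
    latent = phi(mu_t)
    data_term = ((decoder(latent) - u_N) ** 2).mean()      # reconstruction error
    latent_term = ((encoder(u_N) - latent) ** 2).mean()    # latent matching
    return omega * data_term + (1 - omega) * latent_term

# At inference time: u(mu, t) ~ V @ decoder(phi([mu, t])); the encoder is
# only needed during training.
```

Note that, once trained, only \(\phi \) and the decoder are evaluated online, which is what makes the approach non-intrusive and fast at query time.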
Data Availability
The source code implementation of the method described in the paper is available in the GitHub repository https://github.com/DLROM-hub/poddlrom-error-estimates, together with a sample numerical experiment showcasing a possible use.
References
Barrault, M., Maday, Y., Nguyen, N.C., Patera, A.T.: An ‘empirical interpolation’ method: application to efficient reduced-basis discretization of partial differential equations. C.R. Math. 339(9), 667–672 (2004). https://doi.org/10.1016/j.crma.2004.08.006
Bhattacharya, K., Hosseini, B., Kovachki, N.B., Stuart, A.M.: Model reduction and neural networks for parametric PDEs. The SMAI J. Comput. Math. 7, 121–157 (2021). https://doi.org/10.5802/smai-jcm.74
Caflisch, R.E.: Monte Carlo and quasi-Monte Carlo methods. Acta Numer. 7, 1–49 (1998). https://doi.org/10.1017/S0962492900002804
Chaturantabut, S., Sorensen, D.C.: Nonlinear model reduction via discrete empirical interpolation. SIAM J. Sci. Comput. 32(5), 2737–2764 (2010). https://doi.org/10.1137/090766498
Chen, R.T.Q., Rubanova, Y., Bettencourt, J., Duvenaud, D.K.: Neural ordinary differential equations. In: S. Bengio, H. Wallach, H. Larochelle, K. Grauman, N. Cesa-Bianchi, R. Garnett (eds.) Advances in Neural Information Processing Systems, vol. 31. Curran Associates, Inc. (2018)
Chen, W., Wang, Q., Hesthaven, J.S., Zhang, C.: Physics-informed machine learning for reduced-order modeling of nonlinear problems. J. Comput. Phys. 446, 110666 (2021). https://doi.org/10.1016/j.jcp.2021.110666
Deng, B., Shin, Y., Lu, L., Zhang, Z., Karniadakis, G.E.: Approximation rates of DeepONets for learning operators arising from advection-diffusion equations. Neural Netw. 153, 411–426 (2022). https://doi.org/10.1016/j.neunet.2022.06.019
DeVore, R.A., Howard, R., Micchelli, C.: Optimal nonlinear approximation. Manuscripta Math. 63, 469–478 (1989). https://doi.org/10.1007/BF01171759
Farhat, C., Grimberg, S., Manzoni, A., Quarteroni, A.: Computational bottlenecks for PROMs: precomputation and hyperreduction. In: P. Benner, S. Grivet-Talocia, A. Quarteroni, G. Rozza, W. Schilders, L.M. Silveira (eds.) Volume 2: Snapshot-Based Methods and Algorithms, pp. 181–244. De Gruyter, Berlin, Boston (2020). https://doi.org/10.1515/9783110671490-005
Franco, N., Manzoni, A., Zunino, P.: A deep learning approach to reduced order modelling of parameter dependent partial differential equations. Math. Comput. 92(340), 483–524 (2023). https://doi.org/10.1090/mcom/3781
Franco, N.R., Fresca, S., Manzoni, A., Zunino, P.: Approximation bounds for convolutional neural networks in operator learning. Neural Netw. 161, 129–141 (2023). https://doi.org/10.1016/j.neunet.2023.01.029
Fresca, S., Dedè, L., Manzoni, A.: A comprehensive deep learning-based approach to reduced order modeling of nonlinear time-dependent parametrized PDEs. J. Sci. Comput. 87, 61 (2021). https://doi.org/10.1007/s10915-021-01462-7
Fresca, S., Fatone, F., Manzoni, A.: Long-time prediction of nonlinear parametrized dynamical systems by deep learning-based ROMs. In: NeurIPS Workshop “The Symbiosis of Deep Learning and Differential Equations” (2021)
Fresca, S., Gobat, G., Fedeli, P., Frangi, A., Manzoni, A.: Deep learning-based reduced order models for the real-time simulation of the nonlinear dynamics of microstructures. Int. J. Numer. Meth. Eng. 123(20), 4749–4777 (2022). https://doi.org/10.1002/nme.7054
Fresca, S., Manzoni, A.: Real-time simulation of parameter-dependent fluid flows through deep learning-based reduced order models. Fluids 6(7) (2021). https://doi.org/10.3390/fluids6070259
Fresca, S., Manzoni, A.: POD-DL-ROM: enhancing deep learning-based reduced order models for nonlinear parametrized PDEs by proper orthogonal decomposition. Comput. Methods Appl. Mech. Eng. 388, 114181 (2022). https://doi.org/10.1016/j.cma.2021.114181
Fresca, S., Manzoni, A., Dedè, L., Quarteroni, A.: Deep learning-based reduced order models in cardiac electrophysiology. PLOS ONE 15(10), e0239416 (2020). https://doi.org/10.1371/journal.pone.0239416
Fresca, S., Manzoni, A., Dedè, L., Quarteroni, A.: POD-enhanced deep learning-based reduced order models for the real-time simulation of cardiac electrophysiology in the left atrium. Front. Physiol. 12, 679076 (2021)
Geist, M., Petersen, P., Raslan, M., Schneider, R., Kutyniok, G.: Numerical solution of the parametric diffusion equation by deep neural networks. J. Sci. Comput. 88(1), 22 (2021)
Gühring, I., Kutyniok, G., Petersen, P.: Error bounds for approximations with deep ReLU neural networks in \(W^{s,p}\) norms. Anal. Appl. 18(05), 803–859 (2020). https://doi.org/10.1142/S0219530519410021
Gühring, I., Raslan, M.: Approximation rates for neural networks with encodable weights in smoothness spaces. Neural Netw. 134, 107–130 (2021). https://doi.org/10.1016/j.neunet.2020.11.010
Hesthaven, J., Ubbiali, S.: Non-intrusive reduced order modeling of nonlinear problems using neural networks. J. Comput. Phys. 363, 55–78 (2018). https://doi.org/10.1016/j.jcp.2018.02.037
Jacod, J., Protter, P.: Probability essentials. Springer Berlin, Heidelberg (2004). https://doi.org/10.1007/978-3-642-55682-1
Jin, P., Meng, S., Lu, L.: MIONet: learning multiple-input operators via tensor product. SIAM J. Sci. Comput. 44, A3490–A3514 (2022). https://doi.org/10.1137/22M1477751
Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. In: Y. Bengio, Y. LeCun (eds.) 3rd International Conference on Learning Representations, ICLR 2015, San Diego, CA, USA, May 7-9, 2015, Conference Track Proceedings (2015). arXiv:1412.6980
Lanthaler, S., Mishra, S., Karniadakis, G.E.: Error estimates for DeepONets: a deep learning framework in infinite dimensions. Transactions of Mathematics and Its Applications 6(1), tnac001 (2022). https://doi.org/10.1093/imatrm/tnac001
Lee, K., Carlberg, K.T.: Model reduction of dynamical systems on nonlinear manifolds using deep convolutional autoencoders. J. Comput. Phys. 404, 108973 (2020). https://doi.org/10.1016/j.jcp.2019.108973
Lu, L., Jin, P., Pang, G., Zhang, Z., Karniadakis, G.E.: Learning nonlinear operators via DeepONet based on the universal approximation theorem of operators. Nature Machine Intelligence 3(3), 218–229 (2021). https://doi.org/10.1038/s42256-021-00302-5
Lu, L., Meng, X., Cai, S., Mao, Z., Goswami, S., Zhang, Z., Karniadakis, G.E.: A comprehensive and fair comparison of two neural operators (with practical extensions) based on FAIR data. Comput. Methods Appl. Mech. Eng. 393, 114778 (2022). https://doi.org/10.1016/j.cma.2022.114778
Mücke, N.T., Bohté, S.M., Oosterlee, C.W.: Reduced order modeling for parameterized time-dependent PDEs using spatially and memory aware deep learning. J. Comput. Sci. 53, 101408 (2021). https://doi.org/10.1016/j.jocs.2021.101408
Mishra, S., Molinaro, R.: Estimates on the generalization error of physics-informed neural networks for approximating PDEs. IMA J. Numer. Anal. 43(1), 1–43 (2022). https://doi.org/10.1093/imanum/drab093
Niederreiter, H.: Random number generation and quasi-Monte Carlo methods. SIAM (1992)
O’Leary-Roseberry, T., Du, X., Chaudhuri, A., Martins, J.R., Willcox, K., Ghattas, O.: Learning high-dimensional parametric maps via reduced basis adaptive residual networks. Comput. Methods Appl. Mech. Eng. 402, 115730 (2022). https://doi.org/10.1016/j.cma.2022.115730
Pant, P., Doshi, R., Bahl, P., Farimani, A.B.: Deep learning for reduced order modelling and efficient temporal evolution of fluid simulations. Phys. Fluids 33(10), 107101 (2021). https://doi.org/10.1063/5.0062546
Quarteroni, A.: Numerical models for differential problems. Springer Cham (2017). https://doi.org/10.1007/978-3-319-49316-9
Quarteroni, A., Manzoni, A., Negri, F.: Reduced basis methods for partial differential equations. Springer Cham (2016). https://doi.org/10.1007/978-3-319-15431-2
Quarteroni, A., Sacco, R., Saleri, F., Gervasio, P.: Matematica numerica. Springer Milano (2014). https://doi.org/10.1007/978-88-470-5644-2
Salvador, M., Dedè, L., Manzoni, A.: Non-intrusive reduced order modeling of parametrized PDEs by kernel POD and neural networks. Comput. Math. Appl. 104, 1–13 (2021). https://doi.org/10.1016/j.camwa.2021.11.001
Schwab, C., Todor, R.A.: Karhunen–Loève approximation of random fields by generalized fast multipole methods. J. Comput. Phys. 217(1), 100–122 (2006). https://doi.org/10.1016/j.jcp.2006.01.048
Szlam, A., Kluger, Y., Tygert, M.: An implementation of a randomized algorithm for principal component analysis (2014). arXiv:1412.3510v1
Wang, Q., Hesthaven, J.S., Ray, D.: Non-intrusive reduced order modeling of unsteady flows using artificial neural networks with application to a combustion problem. J. Comput. Phys. 384, 289–307 (2019). https://doi.org/10.1016/j.jcp.2019.01.031
Yarotsky, D.: Error bounds for approximations with deep ReLU networks. Neural Netw. 94, 103–114 (2017). https://doi.org/10.1016/j.neunet.2017.07.002
Yarotsky, D.: Optimal approximation of continuous functions by very deep ReLU networks. In: S. Bubeck, V. Perchet, P. Rigollet (eds.) Proceedings of the 31st Conference On Learning Theory, Proceedings of Machine Learning Research, vol. 75, pp. 639–649. PMLR (2018). https://proceedings.mlr.press/v75/yarotsky18a.html
Zahm, O., Constantine, P.G., Prieur, C., Marzouk, Y.M.: Gradient-based dimension reduction of multivariate vector-valued functions. SIAM J. Sci. Comput. 42(1), A534–A558 (2020). https://doi.org/10.1137/18M1221837
Zhu, Y., Zabaras, N.: Bayesian deep convolutional encoder-decoder networks for surrogate modeling and uncertainty quantification. J. Comput. Phys. 366, 415–447 (2018). https://doi.org/10.1016/j.jcp.2018.04.018
Acknowledgements
The authors are members of the Gruppo Nazionale Calcolo Scientifico-Istituto Nazionale di Alta Matematica (GNCS-INdAM) and acknowledge the project “Dipartimento di Eccellenza” 2023-2027, funded by MUR, as well as the support of Fondazione Cariplo, Italy, Grant n. 2019-4608. AM acknowledges the PRIN 2022 Project “Numerical approximation of uncertainty quantification problems for PDEs by multi-fidelity methods (UQ-FLY)” (No. 202222PACR), funded by the European Union - NextGenerationEU. AM and SF acknowledge the project FAIR (Future Artificial Intelligence Research), funded by the NextGenerationEU program within the PNRR-PE-AI scheme (M4C2, Investment 1.3, Line on Artificial Intelligence). SF also acknowledges the Isaac Newton Institute for Mathematical Sciences, Cambridge, UK, for support and hospitality during the program “The mathematical and statistical foundation of future data-driven engineering,” EPSRC grant no. EP/R014604, where part of this work was undertaken.
Funding
Open access funding provided by Politecnico di Milano within the CRUI-CARE Agreement.
Ethics declarations
Conflict of interest
The authors declare no competing interests.
Additional information
Communicated by: Tobias Breiten
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Appendix A. Additional proofs
A.1 Proof of Proposition 1
Thanks to Assumption 1, we trivially obtain \({\varDelta } t = T N_t^{-1} = O(N_t^{-1})\), and we set \(t_i = i{\varDelta } t\). Letting \(f=f(\varvec{\mu },t)\) be the (sufficiently regular) integrand of the integral to be approximated, we obtain
where
and
Notice that
because
since \(f \in L^2(\mathcal {P} \times \mathcal {T})\). Thus, the error committed in approximating the integral vanishes as \(N_s,N_t \rightarrow \infty \). Finally, notice that
which allows us to write
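As a complementary, purely numerical illustration of the quadrature analyzed in this proof, the sketch below approximates an integral over \(\mathcal {P} \times \mathcal {T}\) with \(N_s\) i.i.d. parameter samples (Monte Carlo) and \(N_t\) uniform time nodes \(t_i = i{\varDelta } t\); the test integrand and the choice \(\mathcal {P} = (0,1)\) are assumptions made for illustration only.

```python
# Hedged illustration of the sampling scheme in Proposition 1: Monte Carlo in
# the parameters, uniform grid t_i = i * Dt in time. The integrand f and the
# domain P = (0, 1) are illustrative assumptions, not taken from the paper.
import numpy as np

T = 1.0
f = lambda mu, t: np.sin(np.pi * mu) * np.exp(-t)   # smooth test integrand
exact = (2.0 / np.pi) * (1.0 - np.exp(-T))          # closed-form reference value

rng = np.random.default_rng(0)
for N_s, N_t in [(10, 10), (100, 100), (1000, 1000)]:
    mu = rng.uniform(0.0, 1.0, N_s)                 # i.i.d. samples from P
    t = np.arange(1, N_t + 1) * (T / N_t)           # t_i = i * Dt, Dt = T / N_t
    approx = (T / (N_s * N_t)) * f(mu[:, None], t[None, :]).sum()
    print(f"N_s={N_s:5d}  N_t={N_t:5d}  error={abs(approx - exact):.2e}")
```

The printed errors shrink as \(N_s, N_t \rightarrow \infty \), consistently with the conclusion of the proof.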
A.2 Proof of Proposition 2
We notice immediately that the integral is well defined for every \(\textbf{v} \in L^2(\mathcal {P} \times \mathcal {T}; \mathbb {R}^{N_h})\), thanks to the boundedness assumptions on the solution \(\textbf{u} \in L^2(\mathcal {P} \times \mathcal {T}; \mathbb {R}^{N_h})\). We also remark that the boundedness hypotheses could be relaxed: our choice is aimed at consistency with the other theoretical results of this work. In order to prove that \(\Vert \cdot \Vert _{L^2_w}\) is a norm, we have to show that:
(i) It satisfies the triangle inequality. Given \(\textbf{v}, \textbf{z} \in L^2(\mathcal {P}\times \mathcal {T}; \mathbb {R}^{N_h})\), by the triangle inequality for the Euclidean norm it is straightforward to show that

$$\begin{aligned} \Vert \textbf{v} + \textbf{z}\Vert ^2_{L^2_w}&= \int _{\mathcal {P} \times \mathcal {T}}\Vert \textbf{v}(\varvec{\mu },t) + \textbf{z}(\varvec{\mu },t)\Vert ^2 w(\varvec{\mu },t)\, d(\varvec{\mu },t) \\&\le \int _{\mathcal {P} \times \mathcal {T}} (\Vert \textbf{v}(\varvec{\mu },t) \Vert + \Vert \textbf{z}(\varvec{\mu },t)\Vert )^2 w(\varvec{\mu },t)\, d(\varvec{\mu },t) \\&= \int _{\mathcal {P} \times \mathcal {T}} \big (\Vert \textbf{v}(\varvec{\mu },t) \Vert ^2 + \Vert \textbf{z}(\varvec{\mu },t)\Vert ^2 + 2\Vert \textbf{v}(\varvec{\mu },t) \Vert \, \Vert \textbf{z}(\varvec{\mu },t) \Vert \big )\, w(\varvec{\mu },t)\, d(\varvec{\mu },t). \end{aligned}$$

Moreover, by the Cauchy-Schwarz inequality,

$$\begin{aligned} \int _{\mathcal {P} \times \mathcal {T}} \Vert \textbf{v}(\varvec{\mu },t) \Vert \, \Vert \textbf{z}(\varvec{\mu },t) \Vert \, w(\varvec{\mu },t)\, d(\varvec{\mu },t) \le \sqrt{\int _{\mathcal {P} \times \mathcal {T}} \Vert \textbf{v}(\varvec{\mu },t) \Vert ^2 w(\varvec{\mu },t)\, d(\varvec{\mu },t) \int _{\mathcal {P} \times \mathcal {T}} \Vert \textbf{z}(\varvec{\mu },t) \Vert ^2 w(\varvec{\mu },t)\, d(\varvec{\mu },t)} = \Vert \textbf{v}\Vert _{L^2_w} \Vert \textbf{z}\Vert _{L^2_w}. \end{aligned}$$

Thus, we can infer

$$\begin{aligned} \Vert \textbf{v} + \textbf{z}\Vert ^2_{L^2_w} \le \Vert \textbf{v}\Vert ^2_{L^2_w} + \Vert \textbf{z}\Vert ^2_{L^2_w} + 2\Vert \textbf{v}\Vert _{L^2_w} \Vert \textbf{z}\Vert _{L^2_w}=(\Vert \textbf{v}\Vert _{L^2_w} + \Vert \textbf{z}\Vert _{L^2_w})^2, \end{aligned}$$

and the thesis follows.

(ii) \(\Vert \cdot \Vert _{L^2_w}\) is absolutely homogeneous, thanks to the linearity of the integral.

(iii) If \(\textbf{v} \in L^2(\mathcal {P} \times \mathcal {T}; \mathbb {R}^{N_h})\) and \(\Vert \textbf{v}\Vert _{L^2_w} = 0\), then \(\textbf{v} = \varvec{0}\) a.e. by trivial arguments, since the weight \(w\) is positive a.e.
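For concreteness, the following sketch evaluates a discrete counterpart of \(\Vert \cdot \Vert _{L^2_w}\) by tensor-grid quadrature and verifies the triangle inequality of point (i) numerically; the weight \(w\) and the vector fields are random placeholders, not quantities from the paper.

```python
# Discrete sanity check of the weighted L^2_w norm of Proposition 2, on a
# tensor grid over (0,1)^2. All fields and the weight are random placeholders.
import numpy as np

rng = np.random.default_rng(1)
N_h, n_mu, n_t = 50, 40, 40
dA = (1.0 / n_mu) * (1.0 / n_t)                  # quadrature cell area
w = rng.uniform(0.5, 1.5, (n_mu, n_t))           # positive weight w(mu, t)
v = rng.standard_normal((n_mu, n_t, N_h))
z = rng.standard_normal((n_mu, n_t, N_h))

def norm_w(field):
    # ||field||_{L^2_w} ~ sqrt( sum ||field(mu,t)||^2 w(mu,t) dA )
    return np.sqrt(((field ** 2).sum(axis=-1) * w).sum() * dA)

assert norm_w(v + z) <= norm_w(v) + norm_w(z)    # triangle inequality holds
print(norm_w(v), norm_w(z), norm_w(v + z))
```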
A.3 Quasi-optimality of the discrete formulation of the POD decomposition
We base the following analysis on the results for the \((\mathcal {P} \times \mathcal {T})\)-continuous problem proposed in [36]. We first recall that, by definition, \(\textbf{V}_\infty \in \mathbb {R}^{N_h \times N}\) (where N is the POD dimension) is optimal for the \((\mathcal {P} \times \mathcal {T})\)-continuous formulation, that is, with respect to the \(L^2(\mathcal {P} \times \mathcal {T}; \mathbb {R}^{N_h})\) norm. Formally, we set \(\delta ,\varepsilon > 0\) and, assuming \(\textbf{u}(\varvec{\mu },t) \in L^2(\mathcal {P} \times \mathcal {T}; \mathbb {R}^{N_h})\), we define \(T:L^2(\mathcal {P} \times \mathcal {T}) \rightarrow \mathbb {R}^{N_h}\) as
The adjoint operator of T, namely \(T^*\), enjoys the property
Moreover, recall the definition of the (continuous) correlation matrix and denote by \((\sigma _{k,\infty }^2,\varvec{\zeta }_k)\) its eigenpairs (where \(\{\varvec{\zeta }_k\}_{k}\) denotes an orthonormal basis). We thus define the Hilbert-Schmidt (HS) norm of T as
Setting
we denote by \(T_{N,\infty }\) the rank-N Schmidt approximation, with
and by \(T_{N} = \textbf{V} \textbf{V}^T T\) its approximation by means of the discrete POD formulation.
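By classical Schmidt (Eckart–Young) theory, the rank-N truncation \(T_{N,\infty }\) is optimal in the HS norm among finite-rank operators; in the notation above, this is the minimization property referenced as (13), which presumably reads

$$\begin{aligned} \Vert T - T_{N,\infty }\Vert _{HS} = \min _{B \in \mathcal {B}_N} \Vert T - B\Vert _{HS} = \bigg (\sum _{k > N} \sigma _{k,\infty }^2\bigg )^{1/2}, \end{aligned}$$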
where \(\mathcal {B}_N = \{ B \in \mathcal {L}(L^2(\mathcal {P} \times \mathcal {T}); \mathbb {R}^{N_h}): \text {rank}(B) \le N \wedge \Vert B\Vert _{HS} < +\infty \}\), with \(\mathcal {L}(U; V)\) denoting the space of linear continuous operators from U to V, for U, V Banach spaces. Now, suppose that \(B_N \in \mathcal {B}_N\) does not attain the minimum in (13), that is,
By means of the results of Theorem 1, under the same hypotheses, we have that
Thus, since a.s. convergence implies convergence in probability, we derive that
Finally, thanks to (14), we have
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Brivio, S., Fresca, S., Franco, N.R. et al. Error estimates for POD-DL-ROMs: a deep learning framework for reduced order modeling of nonlinear parametrized PDEs enhanced by proper orthogonal decomposition. Adv Comput Math 50, 33 (2024). https://doi.org/10.1007/s10444-024-10110-1
Keywords
- Operator learning
- Neural networks
- Approximation bounds
- Reduced order modeling
- Parametrized PDEs
- Deep learning-based reduced order modeling