Abstract
Probabilistic integration of a continuous dynamical system is a way of systematically introducing discretisation error, at scales no larger than errors introduced by standard numerical discretisation, in order to enable thorough exploration of possible responses of the system to inputs. It is thus a potentially useful approach in a number of applications such as forward uncertainty quantification, inverse problems, and data assimilation. We extend the convergence analysis of probabilistic integrators for deterministic ordinary differential equations, as proposed by Conrad et al. (Stat Comput 27(4):1065–1082, 2017. https://doi.org/10.1007/s11222-016-9671-0), to establish mean-square convergence in the uniform norm on discrete- or continuous-time solutions under relaxed regularity assumptions on the driving vector fields and their induced flows. Specifically, we show that randomised high-order integrators for globally Lipschitz flows and randomised Euler integrators for dissipative vector fields with polynomially bounded local Lipschitz constants all have the same mean-square convergence rate as their deterministic counterparts, provided that the variance of the integration noise is not of higher order than the corresponding deterministic integrator. These and similar results are proven for probabilistic integrators where the random perturbations may be state-dependent, non-Gaussian, or non-centred random variables.
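To make the construction in the abstract concrete, the following is a minimal sketch of a randomised explicit Euler integrator of the type analysed in the paper. The function name, the Gaussian state-independent noise model, and the variance scaling \(\tau^{2p+1}\) (so that the perturbation is not of higher order than the deterministic local error of an order-\(p\) method, as in Conrad et al. 2017) are illustrative assumptions, not the paper's code; the results below also cover state-dependent, non-Gaussian, and non-centred perturbations.

```python
import numpy as np

def probabilistic_euler(f, u0, T, K, p=1.0, sigma=1.0, rng=None):
    """Randomised explicit Euler for du/dt = f(u) on [0, T] with K steps.

    Each step adds a centred Gaussian perturbation with standard deviation
    sigma * tau**(p + 1/2), i.e. variance O(tau**(2p+1)), so the injected
    noise does not dominate the O(tau) strong error of Euler's method.
    """
    rng = np.random.default_rng() if rng is None else rng
    tau = T / K
    U = np.empty((K + 1, np.size(u0)))
    U[0] = u0
    for k in range(K):
        xi = sigma * tau ** (p + 0.5) * rng.standard_normal(U[0].shape)
        # One perturbed Euler step: U_{k+1} = Psi^tau(U_k) + xi_k(tau).
        U[k + 1] = U[k] + tau * np.asarray(f(U[k])) + xi
    return U
```

With `sigma=0.0` the scheme reduces to deterministic Euler, which is a convenient sanity check: for \(f(u)=-u\), \(u_0=1\), the output at \(T=1\) should approximate \(e^{-1}\).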
References
Briol, F.-X., Oates, C., Girolami, M., Osborne, M.A.: Frank–Wolfe Bayesian quadrature: probabilistic integration with theoretical guarantees. In: Cortes, C., Lawrence, N.D., Lee, D.D., Sugiyama, M., Garnett, R. (eds.) Advances in Neural Information Processing Systems, vol. 28, pp. 1162–1170. Curran Associates Inc., Red Hook (2015)
Capistrán, M.A., Christen, J.A., Donnet, S.: Bayesian analysis of ODEs: solver optimal accuracy and Bayes factors. SIAM/ASA J. Uncertain. Quantif. 4(1), 829–849 (2016). https://doi.org/10.1137/140976777
Chkrebtii, O.A., Campbell, D.A., Calderhead, B., Girolami, M.A.: Bayesian solution uncertainty quantification for differential equations. Bayesian Anal. 11(4), 1239–1267 (2016). https://doi.org/10.1214/16-BA1017
Christen, J.A.: Posterior distribution existence and error control in Banach spaces (2017). arXiv:1712.03299
Cockayne, J., Oates, C., Sullivan, T.J., Girolami, M.: Probabilistic numerical methods for PDE-constrained Bayesian inverse problems. In: Verdoolaege, G. (ed.) Proceedings of the 36th International Workshop on Bayesian Inference and Maximum Entropy Methods in Science and Engineering, volume 1853 of AIP Conference Proceedings, pp. 060001–1–060001–8 (2017a). https://doi.org/10.1063/1.4985359
Cockayne, J., Oates, C., Sullivan, T.J., Girolami, M.: Bayesian probabilistic numerical methods. SIAM Rev. (2017b). arXiv:1702.03673v2
Conrad, P.R., Girolami, M., Särkkä, S., Stuart, A.M., Zygalakis, K.C.: Statistical analysis of differential equations: introducing probability measures on numerical solutions. Stat. Comput. 27(4), 1065–1082 (2017). https://doi.org/10.1007/s11222-016-9671-0. ISSN 0960-3174
Diaconis, P.: Bayesian numerical analysis. In: Gupta, S.S., Berger, J.O. (eds.) Statistical Decision Theory and Related Topics, IV (West Lafayette, Ind., 1986), vol. 1, pp. 163–175. Springer, New York (1988)
Fang, W., Giles, M.B.: Adaptive Euler–Maruyama method for SDEs with non-globally Lipschitz drift: part I, finite time interval (2016). arXiv:1609.08101
Giles, M.B.: Multilevel Monte Carlo methods. Acta Numer. 24, 259–328 (2015). https://doi.org/10.1017/S096249291500001X
Gonzalez, J., Osborne, M., Lawrence, N.: GLASSES: Relieving the myopia of Bayesian optimisation. In: Proceedings of the 19th International Conference on Artificial Intelligence and Statistics, pp. 790–799 (2016). http://jmlr.org/proceedings/papers/v51/gonzalez16b.html
Hairer, E., Nørsett, S., Wanner, G.: Solving Ordinary Differential Equations I: Nonstiff Problems, volume 8 of Springer Series in Computational Mathematics. Springer, New York (2009). https://doi.org/10.1007/978-3-540-78862-1
Hennig, P.: Probabilistic interpretation of linear solvers. SIAM J. Optim. 25(1), 234–260 (2015). https://doi.org/10.1137/140955501
Hennig, P., Osborne, M.A., Girolami, M.: Probabilistic numerics and uncertainty in computations. Proc. R. Soc. Lond. A Math. 471(2179), 20150142 (2015). https://doi.org/10.1098/rspa.2015.0142
Higham, D.J., Mao, X., Stuart, A.M.: Strong convergence of Euler-type methods for nonlinear stochastic differential equations. SIAM J. Numer. Anal. 40(3), 1041–1063 (2002). https://doi.org/10.1137/S0036142901389530
Holte, J.M.: Discrete Gronwall lemma and applications (2009). http://homepages.gac.edu/~holte/publications/GronwallLemma.pdf. Accessed 9 Oct 2019
Humphries, A.R., Stuart, A.M.: Runge–Kutta methods for dissipative and gradient dynamical systems. SIAM J. Numer. Anal. 31(5), 1452–1485 (1994). https://doi.org/10.1137/0731075
Jentzen, A., Neuenkirch, A.: A random Euler scheme for Carathéodory differential equations. J. Comput. Appl. Math. 224(1), 346–359 (2009). https://doi.org/10.1016/j.cam.2008.05.060
Kaipio, J., Somersalo, E.: Statistical and Computational Inverse Problems, volume 160 of Applied Mathematical Sciences. Springer, New York (2005). https://doi.org/10.1007/b138659
Knapik, B.T., van der Vaart, A.W., van Zanten, J.H.: Bayesian inverse problems with Gaussian priors. Ann. Stat. 39(5), 2626–2657 (2011). https://doi.org/10.1214/11-AOS920
Kruse, R., Wu, Y.: Error analysis of randomized Runge–Kutta methods for differential equations with time-irregular coefficients. Comput. Methods Appl. Math. 17(3), 479–498 (2017). https://doi.org/10.1515/cmam-2016-0048
Law, K., Stuart, A., Zygalakis, K.: Data Assimilation: A Mathematical Introduction, volume 62 of Texts in Applied Mathematics. Springer, Berlin (2015). https://doi.org/10.1007/978-3-319-20325-6
Lie, H.C., Sullivan, T.J., Teckentrup, A.L.: Random forward models and log-likelihoods in Bayesian inverse problems. SIAM/ASA J. Uncertain. Quantif. 6(4), 1600–1629 (2018). https://doi.org/10.1137/18M1166523
Mao, X., Szpruch, L.: Strong convergence and stability of implicit numerical methods for stochastic differential equations with non-globally Lipschitz continuous coefficients. J. Comput. Appl. Math. 238, 14–28 (2013). https://doi.org/10.1016/j.cam.2012.08.015
Novak, E.: Deterministic and Stochastic Error Bounds in Numerical Analysis, volume 1349 of Lecture Notes in Mathematics. Springer, Berlin (1988). https://doi.org/10.1007/BFb0079792
O’Hagan, A.: Some Bayesian numerical analysis. In: Bernardo, J.M., Berger, J.O., Dawid, A.P., Smith, A.F.M. (eds.) Bayesian Statistics, 4: Proceedings of the Fourth Valencia International Meeting: Dedicated to the Memory of Morris H. DeGroot, 1931–1989: April 15–20, 1991, pp. 345–363. Clarendon Press, Oxford (1992)
Owhadi, H.: Bayesian numerical homogenization. Multiscale Model. Simul. 13(3), 812–828 (2015). https://doi.org/10.1137/140974596
Owhadi, H.: Multigrid with rough coefficients and multiresolution operator decomposition from hierarchical information games. SIAM Rev. 59(1), 99–149 (2017). https://doi.org/10.1137/15M1013894
Peškir, G.: On the exponential Orlicz norms of stopped Brownian motion. Stud. Math. 117(3), 253–273 (1996). https://doi.org/10.4064/sm-117-3-253-273
Reich, S., Cotter, C.: Probabilistic Forecasting and Bayesian Data Assimilation. Cambridge University Press, New York (2015). https://doi.org/10.1017/CBO9781107706804
Ritter, K.: Average-Case Analysis of Numerical Problems, volume 1733 of Lecture Notes in Mathematics. Springer, Berlin (2000). https://doi.org/10.1007/BFb0103934
Schober, M., Duvenaud, D.K., Hennig, P.: Probabilistic ODE solvers with Runge–Kutta means. In: Ghahramani, Z., Welling, M., Cortes, C., Lawrence, N.D., Weinberger, K.Q. (eds.) Advances in Neural Information Processing Systems, vol. 27, pp. 739–747. Curran Associates Inc., Red Hook (2014)
Skilling, J.: Bayesian solution of ordinary differential equations. In: Smith, C.R., Erickson, G.J., Neudorfer, P.O. (eds.) Maximum Entropy and Bayesian Methods, volume 50 of Fundamental Theories of Physics, pp. 23–37. Springer, Berlin (1992). https://doi.org/10.1007/978-94-017-2219-3
Smith, R.C.: Uncertainty Quantification: Theory, Implementation, and Applications, volume 12 of Computational Science & Engineering. Society for Industrial and Applied Mathematics (SIAM), Philadelphia (2014)
Stengle, G.: Numerical methods for systems with measurable coefficients. Appl. Math. Lett. 3(4), 25–29 (1990). https://doi.org/10.1016/0893-9659(90)90040-I
Stuart, A.M.: Inverse problems: a Bayesian perspective. Acta Numer. 19, 451–559 (2010). https://doi.org/10.1017/S0962492910000061
Sullivan, T.J.: Introduction to Uncertainty Quantification, volume 63 of Texts in Applied Mathematics. Springer, Berlin (2015). https://doi.org/10.1007/978-3-319-23395-6
Teymur, O., Lie, H.C., Sullivan, T.J., Calderhead, B.: Implicit probabilistic integrators for ODEs. In: Bengio, S., Wallach, H., Larochelle, H., Grauman, K., Cesa-Bianchi, N., Garnett, R. (eds.) Advances in Neural Information Processing Systems 31 (NIPS 2018) (2018). http://papers.nips.cc/paper/7955-implicit-probabilistic-integrators-for-odes
Traub, J.F., Woźniakowski, H.: A General Theory of Optimal Algorithms. ACM Monograph Series. Academic Press Inc., New York (1980)
Traub, J.F., Wasilkowski, G.W., Woźniakowski, H.: Information, Uncertainty, Complexity. Addison-Wesley, Reading (1983)
Wang, J., Cockayne, J., Oates, C.: On the Bayesian solution of differential equations (2018). arXiv:1805.07109
HCL and TJS are supported by the Freie Universität Berlin within the Excellence Initiative of the German Research Foundation. HCL is supported by the Universität Potsdam. AMS is grateful to DARPA, EPSRC and ONR for funding. This material was based upon work partially supported by the National Science Foundation under Grant DMS-1127914 to the Statistical and Applied Mathematical Sciences Institute. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of these funding agencies and institutions.
A Proofs
Proof of Lemma 2.1
Assertion (2.3) holds immediately for \(n=1\), so let \(n\in {\mathbb {N}}\setminus \{1\}\), and recall the binomial formula: for \(x,y\in {\mathbb {R}}\) and \(n\in {\mathbb {N}}\setminus \{1\}\),
\[(x + y)^{n}=\sum _{k=0}^{n}\binom{n}{k}x^{k}y^{n-k}.\]
Fix \(\delta >0\). By (2.1), for any \(1\le k\le n-1\),
where the second inequality follows from \(-\tfrac{k}{n-k}\ge -(n-1)\). Therefore,
and the proof is complete upon observing that
and bounding the other binomial sum in a similar way. \(\square \)
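For reference, inequality (2.1) invoked above and throughout is not reproduced in this excerpt; a standard weighted Young inequality consistent with the parameters \(a\), \(b\), \(\delta \), and the conjugate exponent pair \((r,r^*)\) used in the proofs (an assumed form) is:

```latex
\[
  ab \le \frac{\delta\, a^{r}}{r} + \frac{b^{r^{*}}}{r^{*}\,\delta^{r^{*}/r}},
  \qquad a, b \ge 0,\quad \delta > 0,\quad \frac{1}{r} + \frac{1}{r^{*}} = 1 .
\]
```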
Proof of Theorem 3.4
By (3.3),
By (2.2) with \(\delta = \tau \), by Assumptions 3.1 and 3.2, and using that \(\tau <\tau ^*\le 1\),
Observe that \([(1 + \tau )(1 + C_\varPhi \tau )^2-1]\tau ^{-1}\) equals a quadratic polynomial in \(\tau \) with coefficients \(a_0\), \(a_1\), and \(a_2\). Calculating these coefficients and defining
then yields that \([(1 + \tau )(1 + C_\varPhi \tau )^2-1]\tau ^{-1}\le C_1\) for all \(0<\tau <\tau ^*\).
Combining the preceding estimates yields
Using (A.4) in the telescoping sum
the fact that \(e_0=u_0-U_0=0\) and \(K=T/\tau \), we obtain
It follows from the last inequality that
Now replace \(\Vert e_j \Vert ^2\) on the right-hand side with \(\max _{\ell \le j}\Vert e_\ell \Vert ^2\) and take expectations of both sides of the inequality. Since Assumption 3.3 holds with \(R=2\),
Next, define for every \(j\in [K]\) the \(\sigma \)-algebra \({\mathcal {F}}_{j}\) generated by \(\xi _{0}(\tau ),\ldots ,\xi _{j}(\tau )\). Then, the sequence \(({\mathcal {F}}_{j})_{j\in [K]}\) forms a filtration. Define \((M_k)_{k\in [K]}\) by
We want to show that this process is a martingale with respect to \(({\mathcal {F}}_{j})_{j\in [K]}\). By (1.6), \(U_j\) is measurable with respect to \({\mathcal {F}}_{j-1}\), so \(M_k\) is measurable with respect to \({\mathcal {F}}_k\). Hence \((M_k)_{k\in [K]}\) is adapted with respect to \(({\mathcal {F}}_k)_{k\in [K]}\). Observe that
Using the assumption that \(X\in L^2_{{\mathbb {P}}} \implies \varPsi ^\tau (X)\in L^2_{{\mathbb {P}}}\), (1.6), Assumption 3.3, and the fact that \(U_0=u_0\) is fixed, it follows that \(U_j\) and \(\varPsi ^\tau (U_j)\) belong to \(L^2_{{\mathbb {P}}}\); thus \(M_k\) belongs to \(L^1_{{\mathbb {P}}}\) for every \(k\in [K]\). We now use the assumption that \({\mathbb {E}}[\xi _j(\tau )]=0\) for every \(j\in [K]\), and that the \((\xi _k(\tau ))_{k\in [K]}\) are mutually independent, in order to establish the martingale property:
and the right-hand side vanishes since \(U_k\) is measurable with respect to \({\mathcal {F}}_{k-1}\) as noted earlier. Since \((M_k)_{k\in [K]}\) is a martingale, we may apply the Burkholder–Davis–Gundy inequality (Peškir 1996, Equation 2.2). Letting \([Y]_{\ell }\) denote the quadratic variation up to time \(\ell \) of a process \(Y_{k}\), we have
where we define \(b:=\sqrt{ \sum _{j = 1}^{\ell - 1} \Vert \xi _{j}(\tau ) - \xi _{j - 1}(\tau ) \Vert ^{2} }\) and \(a:=\sqrt{\max _{j \le \ell } \Vert \varPhi ^{\tau } (u_{j}) - \varPsi ^{\tau } (U_{j}) \Vert ^2}\). Using (2.1) with the same a and b, \(r=r^*=2\), and \(\delta =[6(1 + \tau )(1 + C_\varPhi \tau )^2]^{-1}\), and using (A.2), it follows that
where we applied (2.2) with \(\delta =1\), \(r=r^*=2\), \(a=\xi _{j}(\tau )\) and \(b=\xi _{j-1}(\tau )\) to obtain the last inequality. Thus by Assumption 3.3 and by using \(\ell -1\le K=T/\tau \),
Combining the preceding estimates, we obtain
and by rearranging terms and using that \(\tau <\tau ^*\le 1\), we obtain
By the discrete Grönwall inequality (Theorem 2.1) with \(x_k:={\mathbb {E}}[ \max _{\ell \le k} \Vert e_{\ell } \Vert ^{2} ]\) and constant \(\alpha _k\) and \(\beta _j=2\tau C_1\), and by using that \(K=T/\tau \), we obtain
This establishes (3.4). \(\square \)
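For reference, Theorem 2.1 itself is not reproduced in this excerpt; a standard discrete Grönwall inequality of the kind invoked here and in the later proofs, e.g. as in Holte (2009), reads: if non-negative sequences satisfy \(x_{k}\le \alpha _{k}+\sum _{j=0}^{k-1}\beta _{j}x_{j}\) with \((\alpha _{k})_{k}\) non-decreasing and \(\beta _{j}\ge 0\), then

```latex
\[
  x_{k} \le \alpha_{k} \exp\Bigl( \sum_{j=0}^{k-1} \beta_{j} \Bigr) .
\]
```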
Proof of Theorem 3.5
Let \(0\le k\le K-1\) and \(n \in {\mathbb {N}}\). By applying the triangle inequality, (2.3), Assumptions 3.1 and 3.2, and by using that \(1 + \tau 2^{n-1}\le 1 + 2^{n-1}\) (since \(\tau \le 1\)),
Observe that, since \(2^{n-1}\) and \(C_\varPhi \) are non-negative, and since \(0<\tau <\tau ^*\),
Note that \(C_\varPhi (n,\tau )\le C_\varPhi (n,\tau ^*)\).
Since \(n\ge 1\) implies that \(1 + (2/\tau )^{n-1}\le 2^n\tau ^{1-n}\), we have
Decomposing \(\Vert e_{k + 1} \Vert ^n-\Vert e_0 \Vert ^n\) as a telescoping sum, using that \(e_0=u_0-U_0=0\), using the non-negativity of the summands on the right-hand side of the last inequality, and using the relation \(\Vert e_\ell \Vert ^n\le \max _{j\le \ell }\Vert e_j \Vert ^n\), we obtain
Using that \(K=T/\tau \) and Grönwall’s inequality (Theorem 2.1),
Taking expectations, using (3.2) with \(w=n\) and \(v=1\), and using that \(K=T/\tau \) yields
Rearranging the above produces the desired inequality. \(\square \)
Proof of Corollary 3.1
Let \(m\in {\mathbb {N}}\) be arbitrary. Using (A.6), and applying (2.4) twice, we obtain
Taking expectations and using (3.2) with \(w=n\) and \(v=m\), we obtain
The conclusion follows by the series expansion of the exponential and the dominated convergence theorem. \(\square \)
Proof of Theorem 4.2
Recall that the solution map \(\varPhi ^\tau \) of the initial value problem (1.1) satisfies
\[\varPhi ^{\tau }(a)=a + \int _{0}^{\tau }f(\varPhi ^{s}(a))\,\mathrm {d}s.\]
For any \(\tau >0\) and \(a,b\in {\mathbb {R}}^{d}\), Assumption 4.1 and the integral Grönwall–Bellman inequality yield
Given the boundedness hypothesis on the \((\xi _k(\tau ))_{k\in [K]}\), we may define a finite constant \(C>0\) that does not depend on \(\tau \) or k, such that
The rest of the proof follows in a similar manner to that of Theorem 3.5. \(\square \)
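For reference, the integral Grönwall–Bellman inequality used above, in a standard form: if \(\varphi \) is continuous and non-negative on \([0,T]\) and satisfies \(\varphi (t)\le A + B\int _{0}^{t}\varphi (s)\,\mathrm {d}s\) for constants \(A,B\ge 0\), then

```latex
\[
  \varphi(t) \le A \, \mathrm{e}^{B t}, \qquad 0 \le t \le T .
\]
```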
Proof of Lemma 4.1
In what follows, we shall omit the dependence of all random variables on \(\omega \), with the understanding that \(\omega \) is arbitrary. Let \(n\in [K]\), where \(K = T/\tau \in {\mathbb {N}}\). From (1.6) we have, by (2.1),
Taking the inner product of (4.3) with \(\varPsi ^{\tau }(U_n)\), we obtain by (4.4)
Thus,
where we used the inequality \(1-2\vert \beta \vert \tau \le 1 + 2\beta \tau \) for the second inequality. Then, (A.7) and (A.8) yield
Let \(c_1(\tau ):=\tfrac{1+2\vert \beta \vert }{1-2\vert \beta \vert \tau }\) and \(c_2(\tau ):=\tfrac{2\alpha }{1-2\vert \beta \vert \tau }\). By (A.9), it follows that
Using the telescoping sum
it follows that
Since \(n \le K :=T/\tau \), and since the right-hand side of the inequality above is non-negative,
Applying the Grönwall inequality (Theorem 2.1) yields, for all \(n\in [K]\),
where we define, for \(\tau '\) as in Assumption 4.4, the scalar
This yields (4.5) for \(n=1\). By applying (2.4), we obtain (4.5) for arbitrary \(n\in {\mathbb {N}}\). \(\square \)
Proof of Proposition 4.2
Recall that in Assumption 4.1, we assume \(f \in C^1({\mathbb {R}}^{d}; {\mathbb {R}}^{d})\). Therefore, Taylor’s theorem applied to the function \(t\mapsto \varPhi ^t(a)\) yields
where \(R^\tau (a)\rightarrow 0\) as \(\tau \rightarrow 0\). Then, by (4.1a), (4.3), and (2.4),
By (4.1a), (4.3), (4.2), and (2.4) with the fact that \(C_\varPhi \ge 1\) in Assumption 4.1, we obtain
From (A.8) and (2.4), it holds that for any n and r such that \(nr\ge 1\),
for \(\tau '\) in Assumption 4.4. Applying the second inequality for the appropriate values of r and computing exponents yields that, for the polynomials \(\pi _1\), \(\pi _2\) and \(\pi \) defined on \({\mathbb {R}}\) by
and \(\pi (x):=\pi _1(x)\pi _2(x)\), it follows from Lemma 4.1 that
Taking expectations, applying Proposition 4.1, and using that \(\tau <\tau '\) to bound the right-hand side of inequality (4.6) in Proposition 4.1, we may define some \(C_3=C_3(\alpha ,\beta ,C_\varPhi ,\tau ',n)\) that does not depend on k or \(\tau \), such that
By Proposition 4.1, the finiteness of \(C_3\) follows from the hypothesis \(R\ge 2n(2s + 1)\) and the observation that \(\pi _1(x^2)\) and \(\pi _2(x^2)\) have degree ns and \(n(s + 1)\) in \(x^2\), respectively.
Now it remains to show that \(\Vert R^\tau (U_k) \Vert ^{2n}\in L^1_{{\mathbb {P}}}\). From (4.1a), (4.1b), and (A.11), we obtain
By the triangle inequality and (A.14),
Then by applying (2.4) and Proposition 4.1 with the hypothesis that \(R\ge 2n(2s + 1)\ge 2n(s + 1)\), and using the bound \(\tau <\tau '\), it follows that we may define a positive scalar \(C_4\) that does not depend on k or \(\tau \), such that
Therefore, with \(C_3\) and \(C_4\) as in (A.13) and (A.15) above, (A.12) yields
as desired. \(\square \)
The proof below makes no effort to find optimal constants.
Proof of Theorem 4.5
Let \(n\in {\mathbb {N}}\). By (2.3)
Since \(\tau \le 1\) and \(n\ge 1\), it holds that \(1 + \tau ^{1-2n} 2^{2n-1}\le \tau ^{1-2n}(1 + 2^{2n-1})\) and \(1 + \tau 2^{2n-1}\le 1 + 2^{2n-1}\). Using these inequalities, (2.3), and (4.1b) in the preceding inequality, we obtain
Using (2.3) again, we obtain
so that by defining
we have
and, therefore,
By non-negativity of \(C_5\), it follows that \([(1 + \tau C_5)^{2n + 1}-1]\tau ^{-1}\) is a polynomial of degree 2n in \(\tau \) with positive coefficients. In particular, if we recall the definition of \(C_5\) and define \(C_6\) by
then \(C_6\) does not depend on \(\tau \), \([(1 + \tau C_5)^{2n + 1}-1]\tau ^{-1}\le C_6\) for all \(0<\tau <\tau '\), and
By the telescoping sum associated to \(\Vert e_{k + 1} \Vert ^{2n}-\Vert e_k \Vert ^{2n}\), the fact that \(e_0=0\), the bound \(1 + 2^{2n-1}\le 2^{2n}\), the non-negativity of the terms on the right-hand side of the inequality above, and the bound \(\Vert e_j \Vert \le \max _{\ell \le j}\Vert e_\ell \Vert \), we obtain
By Lemma 4.1,
which implies that
Define
Since \(C_\varPhi ,C_1\ge 1\), it follows that \(2^{4n}\le C_7\), and by Grönwall’s inequality (Theorem 2.1) we obtain
Taking expectations completes the proof, provided that we can ensure each sum is of the right order in \(\tau \). By Proposition 4.2 with the hypothesis that \(R\ge 2n(2s + 1)\), and by Assumption 3.3,
Thus, we need \(p-1/2\ge 1\) to hold. Next, using the bound \(\Vert e_{\ell } \Vert \le \max _{j\in [K]}\Vert e_j \Vert \), Young’s inequality (2.1) with \(a=(\sum ^{K}_{i=1}\Vert \xi _{i}(\tau ) \Vert ^2)^{ns}\), \(b=\Vert e_{\ell } \Vert ^{2n}\), and some \(\delta >0\) and conjugate exponent pair \((r,r^*)\in (1,\infty )^2\) to be determined later, we obtain with (3.2) that
Since \(R\ge 2n(2s + 1)\), the maximal value of r compatible with integrability of \((\sum ^{K}_{i=1}\Vert \xi _{i}(\tau ) \Vert ^2)^{nrs}\) is \(r=2 + s^{-1}\). Since we are not interested in optimal estimates, we shall set \(r=r^*=2\) and \(\delta =\tau ^{-n(2 + s)}\). We thus obtain
For the exponent of \(\tau \) of the first term in the parentheses, we want to ensure that \(-n(2 + s) + 2p(2ns)-ns\ge 2n\), or equivalently that \(p\ge \tfrac{1}{s} + \tfrac{1}{2}\). Comparing this condition with the condition \(p-\tfrac{1}{2}\ge 1\) that arose from (A.19), and recalling that \(s\ge 1\), we observe that if \(p\ge \tfrac{3}{2}\), then the preceding estimates yield
It remains to bound \({\mathbb {E}}[\max _{\ell \in [K]}\Vert e_\ell \Vert ^{4n}]\) by a constant that does not depend on \(\tau \). By (2.4), Proposition 4.1, and the assumption that \(\tau <\tau '\) for \(\tau '\) in Assumption 4.4, we obtain
where \(C_8=C_8(C_2,C_{\xi ,R},n,p,\tau ',T,(u_t)_{0\le t\le T})>0\) does not depend on \(\tau \). Note that in applying Proposition 4.1, we have used that \(s\ge 1\) for the exponent s in Assumption 4.1, since this implies that \(2n(2s + 1)\ge 4n\). \(\square \)
Proof of Theorem 5.2
Let \(k\in [K]\) and \(t_{k}<t\le t_{k + 1}\). Then
and given that Assumption 3.1 implies that \(\varPhi ^{t'}\) is Lipschitz on \({\mathbb {R}}^{d}\) for every \(t'\ge 0\),
by applying (2.4). Since \(t-t_{k}\le \tau \), it follows from the inequality above that
By Assumption 5.1,
Note that Assumption 5.1 is stronger than Assumption 3.3. Therefore, we may apply Theorem 3.5 to obtain (5.1). \(\square \)
Proof of Lemma 5.1
If \(r=0\), then the desired statement follows immediately. Therefore, let \(p,r \ge 1\). Let \(\xi _0\) be the integrated \({\mathbb {P}}\)-Brownian motion process scaled by \(\tau ^{p-1}\), so that
where we applied Jensen’s inequality to the uniform probability measure on [0, t]. It follows that
Above, we used the Fubini–Tonelli theorem to interchange expectation and integration with respect to s, and the fact that \({\mathbb {E}}\bigl [ \sup _{t \le \tau } \Vert B_t \Vert ^r \bigr ]\) is constant with respect to the variable of integration s. For \(r=1\), the Burkholder–Davis–Gundy martingale inequality (Peškir 1996, Equation 2.2) yields
with \((4-r)/(2-r)=3\) for \(r=1\). For \(r>1\), Doob’s inequality (Peškir 1996, Equation 2.1) yields
Since \(r\mapsto [r/(r-1)]^r\) is continuously differentiable and monotonically decreasing on \(2<r<\infty \), the desired conclusion follows. \(\square \)
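For reference, Doob's \(L^{r}\) maximal inequality applied to \((B_{t})\), which is the source of the constant \([r/(r-1)]^{r}\) mentioned above (stated here in a standard form; the paper cites Peškir 1996, Equation 2.1):

```latex
\[
  \mathbb{E}\Bigl[ \sup_{t \le \tau} \Vert B_{t} \Vert^{r} \Bigr]
  \le \Bigl( \frac{r}{r-1} \Bigr)^{r} \, \mathbb{E}\bigl[ \Vert B_{\tau} \Vert^{r} \bigr],
  \qquad r > 1 .
\]
```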
Cite this article
Lie, H.C., Stuart, A.M. & Sullivan, T.J. Strong convergence rates of probabilistic integrators for ordinary differential equations. Stat Comput 29, 1265–1283 (2019). https://doi.org/10.1007/s11222-019-09898-6
Keywords
- Probabilistic numerical methods
- Ordinary differential equations
- Convergence rates
- Uncertainty quantification