Abstract
The generation of multivariate normal distributions is widely used in various fields, including engineering, statistics, finance, and machine learning. In this paper, the simulation of large multivariate normal distributions truncated on the intersection of a set of hyperplanes is investigated. Specifically, the proposed methodology focuses on cases where the prior multivariate normal distribution is derived from a stationary Gaussian process (GP). It is based on combining Karhunen-Loève expansions (KLE) and Matheron's update rule (MUR). The KLE requires the eigendecomposition of the covariance matrix of the random variables, which can become prohibitively expensive when the random vector is very large. To address this issue, the input domain is split into smaller subdomains on which the eigendecomposition can be computed. Owing to stationarity, only the eigendecomposition associated with the first subdomain is required. Through this strategy, the computational complexity is drastically reduced. The mean-square truncation and block errors are quantified. The efficiency of the proposed approach is demonstrated through both synthetic and real data studies.
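To make the splitting strategy concrete, the following minimal Python sketch (illustrative only, not the authors' implementation) samples a stationary GP on \([0,1]\) by dividing the domain into M equal subdomains, computing a discretized Karhunen-Loève eigendecomposition on the first subdomain only, and propagating the KLE coefficients from one subdomain to the next through a coupling matrix \({\varvec{K}}\) and the recursion \(\varvec{\xi }^{(m)}={\varvec{K}}^\top \varvec{\xi }^{(m-1)}+{\varvec{L}}\varvec{\zeta }^{(m)}\) (see Appendix B). The Matérn 5/2 covariance, the grid resolution, and the truncation level p are arbitrary choices made for the example.

```python
# Illustrative sketch (not the authors' code): block-KLE sampling of a stationary GP
# on [0, 1] using the eigendecomposition of the FIRST subdomain only.
import numpy as np

def matern52(h, theta=0.2):
    a = np.sqrt(5.0) * np.abs(h) / theta
    return (1.0 + a + a**2 / 3.0) * np.exp(-a)

M, n, p = 5, 200, 15               # number of subdomains, grid points per subdomain, truncation level
S = 1.0 / M                        # subdomain length
w = S / n                          # quadrature weight (midpoint rule)
x1 = (np.arange(n) + 0.5) * w      # grid on the first subdomain D^(1) = [0, S]

# Discretized eigendecomposition of C on D^(1)
C11 = matern52(x1[:, None] - x1[None, :])
lam, vec = np.linalg.eigh(C11 * w)
idx = np.argsort(lam)[::-1][:p]
lam, phi = lam[idx], vec[:, idx] / np.sqrt(w)   # lam_i ~ eigenvalues, phi[:, i] ~ eigenfunctions on the grid

# Coupling matrix K_ij ~ Cov(xi_i^(m), xi_j^(m+1)) via a discretized double integral
# (cf. the proof of Proposition 2.2 in Appendix B)
C12 = matern52(x1[:, None] - (x1[None, :] + S))
K = (phi.T @ C12 @ phi) * w**2 / np.sqrt(np.outer(lam, lam))

# L with L L^T = I_p - K^T K (eigenvalue clipping guards against small discretization errors)
ev, U = np.linalg.eigh(np.eye(p) - K.T @ K)
L = U @ np.diag(np.sqrt(np.clip(ev, 0.0, None)))

# Recursive generation of the KLE coefficient sets and of the sample path
rng = np.random.default_rng(0)
xi = rng.standard_normal(p)
blocks = []
for m in range(M):
    if m > 0:
        xi = K.T @ xi + L @ rng.standard_normal(p)   # xi^(m) = K^T xi^(m-1) + L zeta^(m)
    blocks.append(phi @ (np.sqrt(lam) * xi))          # Y_m on the grid of D^(m)
y = np.concatenate(blocks)                            # sample path on [0, 1]
```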
References
Aune E, Eidsvik J, Pokern Y (2013) Iterative numerical methods for sampling from high dimensional Gaussian distributions. Stat Comput 23(4):501–521
Chevalier C, Emery X, Ginsbourger D (2015) Fast update of conditional simulation ensembles. Math Geosci 47(7):771–789
Cho H, Venturi D, Karniadakis G (2013) Karhunen-Loève expansion for multi-correlated stochastic processes. Probab Eng Mech 34:157–167
Cong Y, Chen B, Zhou M (2017) Fast simulation of hyperplane-truncated multivariate normal distributions. Bayesian Anal 12(4):1017–1037. https://doi.org/10.1214/17-BA1052
Cousin A, Maatouk H, Rullière D (2016) Kriging of financial term-structures. Eur J Oper Res 255(2):631–648
Davis M (1987) Generating large stochastic simulations - The matrix polynomial approximation method. Math Geol 19(2):99–107
Golub G, Van Loan C (1996) Matrix computations. The Johns Hopkins University Press, Baltimore
Hoffman Y, Ribak E (1991) Constrained realizations of Gaussian fields: A simple algorithm. Astrophys J 380:L5. https://doi.org/10.1086/186160
Journel A, Huijbregts C (1976) Mining geostatistics. Academic Press, Cambridge
Kirby M, Sirovich L (1990) Application of the Karhunen-Loève procedure for the characterization of human faces. IEEE Trans Pattern Anal Mach Intell 12(1):103–108
Lenk PJ, Choi T (2017) Bayesian analysis of shape-restricted functions using Gaussian process priors. Stat Sin 23:43–69
Loève M (1977) Elementary probability theory. Springer, Berlin
Maatouk H, Bay X (2017) Gaussian process emulators for computer experiments with inequality constraints. Math Geosci 49(5):557–582
Maatouk H, Bay X, Rullière D (2022) A note on simulating hyperplane-truncated multivariate normal distributions. Stat Probab Lett 191:109650. https://doi.org/10.1016/j.spl.2022.109650
Matérn B (1986) Spatial variation, vol 36. Springer Science & Business Media, Berlin
Panunzio A, Cottereau R, Puel G (2018) Large scale random fields generation using localized Karhunen-Loève expansion. Adv Model Simul Eng Sci 5(1):1–29
Ray P, Pati D, Bhattacharya A (2020) Efficient Bayesian shape-restricted function estimation with constrained Gaussian process priors. Stat Comput 30(4):839–853
Rue H (2001) Fast sampling of Gaussian Markov random fields. J R Stat Soc Ser B Stat Methodol 63(2):325–338
Wang L (2008) Karhunen-Loève expansions and their applications. The London School of Economics and Political Science, London
Williams C, Rasmussen C (2006) Gaussian processes for machine learning, vol 2. MIT Press, Cambridge
Wilson J, Borovitskiy V, Terenin A et al (2021) Pathwise conditioning of Gaussian processes. J Mach Learn Res 22(105):1–47
Wood AT, Chan G (1994) Simulation of stationary Gaussian processes in \([0, 1]^d\). J Comput Graph Stat 3(4):409–432
Zhang J, Ellingwood B (1994) Orthogonal series expansions of random fields in reliability analysis. J Eng Mech 120(12):2660–2677
Zhou S, Giulani P, Piekarewicz J et al (2019) Reexamining the proton-radius problem using constrained Gaussian processes. Phys Rev C 99:055202. https://doi.org/10.1103/PhysRevC.99.055202
Acknowledgements
The authors would like to express their gratitude to the Editor, the Associate Editor, and the two reviewers for their valuable and constructive comments, which have significantly enhanced the presentation and accuracy of the manuscript. Part of this research was conducted with the support of the consortium in Applied Mathematics CIROQUO, which gathers academic and technological partners to develop advanced methods for computer experiments. https://doi.org/10.5281/zenodo.6581217
Appendices
Appendix A Subdomains with different lengths
The methodology presented in Sect. 2.2 can also be extended to cases where the domain is split into subdomains of different lengths. In such cases, the eigendecomposition and the coupling matrix must be computed for each subdomain. For example, suppose that the domain \({\mathcal {D}}=[0,1]\) is divided into two subdomains \([0,S]\cup [S,1]\), as illustrated in Fig. 12 below, where \(S=0.7\). The only modification to the methodology presented above lies in computing both the coupling matrix \({\varvec{K}}\) and the lower triangular matrix \({\varvec{L}}\). Let us denote these two matrices by \({\varvec{K}}^{1,2}\) and \({\varvec{L}}^{1,2}\), respectively, since they depend on the two connected subdomains. The elements of \({\varvec{K}}^{1,2}\in {\mathbb {R}}^{p\times p}\) are defined as
where \(\{\lambda _i^{(1)},\phi _i^{(1)}\}_{i=1}^{p}\) and \(\{\lambda _j^{(2)},\phi _j^{(2)}\}_{j=1}^{p}\) are the eigendecompositions of C on [0, S] and (S, 1], respectively. The lower triangular matrix \({\varvec{L}}^{1,2}\) is computed as follows
As before, the second set of conditional KLE coefficients is generated as follows:
where \(\varvec{\xi }^{(1)}=\varvec{\zeta }^{(1)}\in {\mathbb {R}}^{p}\) and \(\varvec{\zeta }^{(2)}\in {\mathbb {R}}^{p}\) are two independent replicates from the standard MVN distribution. Finally, the random process defined on the entire domain \({\mathcal {D}}=[0,1]\) is given by
The correlation between any two coefficient sets \(\varvec{\xi }^{(m)}\) and \(\varvec{\xi }^{(m')}\), for any \(m'>m\), is equal to
where \({\varvec{K}}^{\ell ,\ell +1}=\textrm{Cov}\left( \varvec{\xi }^{(\ell )},\varvec{\xi }^{(\ell +1)}\right)\), for any \(\ell \in \{m,\ldots ,m'-1\}\).
In Fig. 12, the domain \({\mathcal {D}}\) is split into two (resp. three) subdomains of different lengths in the left (resp. right) panel. The solid curves (resp. dashed curves) represent the paths before (resp. after) conditioning. The Matérn covariance function with smoothness parameter \(\nu =5/2\) is used, with the length-scale parameter \(\theta\) fixed at 0.2.
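The modified quantities of this appendix can be approximated numerically. The sketch below is illustrative only: the grid-based eigendecompositions, the Matérn 5/2 covariance, and the precise discretized form of the cross-covariance integral are assumptions consistent with the proof of Proposition 2.2, not equations reproduced from the text. It computes the eigendecompositions on \([0,S]\) and (S, 1], the coupling matrix \({\varvec{K}}^{1,2}\), a factor \({\varvec{L}}^{1,2}\) with \({\varvec{L}}^{1,2}({\varvec{L}}^{1,2})^\top ={{\textbf {I}}}_{p}-({\varvec{K}}^{1,2})^\top {\varvec{K}}^{1,2}\), and the conditional generation of \(\varvec{\xi }^{(2)}\).

```python
# Sketch of the unequal-length construction of Appendix A (assumptions: Matern 5/2
# covariance, grid-based eigendecompositions, and a cross-covariance integral adapted
# from Proposition 2.2(ii) to the subdomains [0, S] and (S, 1] with S = 0.7).
import numpy as np

def matern52(h, theta=0.2):
    a = np.sqrt(5.0) * np.abs(h) / theta
    return (1.0 + a + a**2 / 3.0) * np.exp(-a)

def kle(lo, hi, n, p):
    """Discretized KLE of the covariance on [lo, hi]: grid, weight, eigenvalues, eigenfunctions."""
    w = (hi - lo) / n
    x = lo + (np.arange(n) + 0.5) * w
    lam, vec = np.linalg.eigh(matern52(x[:, None] - x[None, :]) * w)
    idx = np.argsort(lam)[::-1][:p]
    return x, w, lam[idx], vec[:, idx] / np.sqrt(w)

S, n, p = 0.7, 300, 15
x1, w1, lam1, phi1 = kle(0.0, S, n, p)     # eigendecomposition on [0, S]
x2, w2, lam2, phi2 = kle(S, 1.0, n, p)     # eigendecomposition on (S, 1]

# Coupling matrix K^{1,2}_{ij} ~ Cov(xi_i^(1), xi_j^(2)) (discretized double integral)
C12 = matern52(x1[:, None] - x2[None, :])
K12 = (phi1.T @ C12 @ phi2) * w1 * w2 / np.sqrt(np.outer(lam1, lam2))

# L^{1,2} with L^{1,2} (L^{1,2})^T = I_p - (K^{1,2})^T K^{1,2} (eigenvalue clipping for robustness)
ev, U = np.linalg.eigh(np.eye(p) - K12.T @ K12)
L12 = U @ np.diag(np.sqrt(np.clip(ev, 0.0, None)))

# Conditional generation of the second coefficient set and of the two path blocks
rng = np.random.default_rng(1)
xi1 = rng.standard_normal(p)                        # xi^(1) = zeta^(1)
xi2 = K12.T @ xi1 + L12 @ rng.standard_normal(p)    # xi^(2)
y = np.concatenate([phi1 @ (np.sqrt(lam1) * xi1),   # path on [0, S]
                    phi2 @ (np.sqrt(lam2) * xi2)])  # path on (S, 1]
```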
Appendix B Proofs of Propositions
This appendix contains the proofs of the propositions presented in the main body of the manuscript.
Proof of Proposition 2.1
We have for any \(x,x'\in {\mathcal {D}}\)
which concludes the proof of the proposition. \(\square\)
Proof of Proposition 2.2
(i)
From right to left: suppose that \(\{\xi _i^{(m)}\}\) are independent and identically distributed \({\mathcal {N}}(0,1)\). The two processes \(Y\) and \(Y_m\) are zero-mean Gaussian. First, from Proposition 2.1 and the stationarity of the covariance function, we have for any \(t,t'\in {\mathcal {D}}^{(m)}\)
$$\begin{aligned} C(t,t')=\, & {} \textrm{Cov}(Y(t),Y(t'))=\textrm{Cov}\left( Y(t-(m-1)S),Y(t'-(m-1)S)\right) \\= & {} \textrm{Cov}(Y(x),Y(x'))=\sum _{i=1}^{+\infty }\lambda _i\phi _i(x)\phi _i(x'), \end{aligned}$$where \(x=t-(m-1)S\) and \(x'=t'-(m-1)S\) are in \({\mathcal {D}}^{(1)}\). Secondly, we have for any \(t,t'\in {\mathcal {D}}^{(m)}\)
$$\begin{aligned} \textrm{Cov}(Y_m(t),Y_m(t'))= & {} \sum _{i,j=1}^{+\infty }\sqrt{\lambda _i\lambda _j}\phi _i(t-(m-1)S)\phi _j(t'-(m-1)S)\delta _{ij} \\= & {} \sum _{i=1}^{+\infty }\lambda _i\phi _i(t-(m-1)S)\phi _i(t'-(m-1)S) \\= & {} \sum _{i=1}^{+\infty }\lambda _i\phi _i(x)\phi _i(x'), \end{aligned}$$where \(x=t-(m-1)S\) and \(x'=t'-(m-1)S\) are in \({\mathcal {D}}^{(1)}\). Now, from left to right: suppose that \(Y_m\) and \(Y\) have the same distribution on \({\mathcal {D}}^{(m)}\). This means that for any \(x,x'\in {\mathcal {D}}^{(m)}\)
$$\begin{aligned} \textrm{Cov}\left( Y_m(x),Y_{m}(x')\right) =\textrm{Cov}(Y(x),Y(x'))=C(|x-x'|). \end{aligned}$$Therefore, for any i, j
$$\begin{aligned} \textrm{Cov}(\xi _i^{(m)},\xi _j^{(m)})= & {} \frac{1}{\sqrt{\lambda _i\lambda _j}}\iint _{{\mathcal {D}}^{(1)}}\phi _i(t)\phi _j(t')C(|t-t'|)\textrm{d}t\textrm{d}t' \\= & {} \frac{\lambda _j}{\sqrt{\lambda _i\lambda _j}}\int _{{\mathcal {D}}^{(1)}}\phi _i(t)\phi _j(t)\textrm{d}t=\delta _{ij}, \end{aligned}$$which concludes the proof of the first item of the proposition.
(ii)
Let us consider the simple case when \(m'=m+1\). For example, let \(m=1\) then \(m'=2\). From right to left: on the one hand, we have for any \((s,t)\in {\mathcal {D}}^{(1)}\times {\mathcal {D}}^{(2)}\),
$$\begin{aligned} \textrm{Cov}(Y(s),Y(t))=\textrm{E}[Y(s)Y(t)]=C(|s-t|). \end{aligned}$$On the other hand, we have
$$\begin{aligned} \textrm{E}[Y_1(s)Y_2(t)]= & {} \sum _{i,j=1}^{+\infty }\sqrt{\lambda _i\lambda _j}\phi _i(s)\phi _j(t-S)\textrm{E}[\xi _i^{(1)}\xi _j^{(2)}]\\= & {} \sum _{i,j=1}^{+\infty }\phi _i(s)\phi _j(t-S)\int _{x=0}^S\int _{x'=S}^{2S}\phi _i(x)\phi _j(x'-S)C(|x-x'|)\textrm{d}x\textrm{d}x'\\= & {} \int _{0}^S\int _S^{2S}\sum _{i=1}^{+\infty }\phi _i(s)\phi _i(x)\sum _{j=1}^{+\infty }\phi _j(t-S)\phi _j(x'-S)C(|x-x'|)\textrm{d}x\textrm{d}x' \\= & {} \int _0^S\int _S^{2S}\delta (s-x)\delta (t-x')C(|x-x'|)\textrm{d}x\textrm{d}x' = C(|s-t|). \end{aligned}$$From left to right: we have
$$\begin{aligned} \textrm{Cov}\left( \xi _{i}^{(1)}, \xi _{j}^{(2)}\right)=\, & {} \textrm{Cov}\left( \frac{1}{\sqrt{\lambda _{i}}} \int _{0}^{S} \phi _{i}(x) Y_{1}(x) \textrm{d}x, \frac{1}{\sqrt{\lambda _{j}}} \int _{t=S}^{2 S} \phi _{j}(t-S) Y_{2}(t) \textrm{d}t\right) \\= & {} \frac{1}{\sqrt{\lambda _{i} \lambda _{j}}} \int _{x=0}^{S} \int _{t=S}^{2 S} C(|x-t|) \phi _{i}(x) \phi _{j}(t-S) \textrm{d}x \textrm{d}t \\= & {} \frac{1}{\sqrt{\lambda _{i} \lambda _{j}}} \int _{x=0}^{S} \int _{x'=0}^{S} C\left( |x-x'-S|\right) \phi _{i}(x) \phi _{j}\left( x'\right) \textrm{d}x \textrm{d}x'. \end{aligned}$$The general case can be proved in a similar way.
\(\square\)
Proof of Proposition 2.3
-
From Equation (9), the correlation between \(\varvec{\xi }^{(m-1)}\) and \(\varvec{\xi }^{(m)}\) is imposed in order to satisfy the correlation structure between the connected subdomains \({\mathcal {D}}^{(m-1)}\) and \({\mathcal {D}}^{(m)}\)
$$\begin{aligned} {\varvec{K}}^{m-1,m}=\textrm{Cov}\left( \varvec{\xi }^{(m-1)},\varvec{\xi }^{(m)}\right) ={\varvec{K}}, \quad \forall m\in \{2,\ldots ,M\}. \end{aligned}$$ (B1)
For example, when \(m=3\), we get
$$\begin{aligned} {\varvec{K}}^{2,3}= \,& {} \textrm{Cov}\left( \varvec{\xi }^{(2)},\varvec{\xi }^{(3)}\right) =\textrm{Cov}\left( {\varvec{K}}^\top \varvec{\zeta }^{(1)}+{\varvec{L}}\varvec{\zeta }^{(2)},{\varvec{K}}^\top \varvec{\xi }^{(2)}+{\varvec{L}}\varvec{\zeta }^{(3)}\right) \\=\, & {} {\varvec{K}}^\top \textrm{Cov}\left( \varvec{\zeta }^{(1)},\varvec{\xi }^{(2)}\right) {\varvec{K}}+{\varvec{L}}\textrm{Cov}\left( \varvec{\zeta }^{(2)},\varvec{\xi }^{(2)}\right) {\varvec{K}}\\=\, & {} {\varvec{K}}^\top {\varvec{K}}{\varvec{K}}+{\varvec{L}}{\varvec{L}}^\top {\varvec{K}}=({\varvec{K}}^\top {\varvec{K}}+{\varvec{L}}{\varvec{L}}^\top ){\varvec{K}}={\varvec{K}}. \end{aligned}$$The general case can be proved in a similar way. The second item of Proposition 2.2 concludes the proof.
-
The second item is a simple consequence of Equations (9) and (B1). For example, when \(m=1\) and \(m'=3\), we get
$$\begin{aligned} {\varvec{K}}^{1,3}=\, & {} \textrm{Cov}\left( \varvec{\xi }^{(1)},\varvec{\xi }^{(3)}\right) =\,\textrm{Cov}\left( \varvec{\xi }^{(1)},{\varvec{K}}^\top \varvec{\xi }^{(2)}+{\varvec{L}}\varvec{\zeta }^{(3)}\right) \\=\, & {} \textrm{Cov}\left( \varvec{\xi }^{(1)},\varvec{\xi }^{(2)}\right) {\varvec{K}}={\varvec{K}}^2. \end{aligned}$$In particular, for any \(m\in \{2,\ldots ,M\}\)
$$\begin{aligned} {\varvec{K}}^{m,m}=\,\textrm{Cov}\left( \varvec{\xi }^{(m)}\right)=\, & {} \textrm{Cov}\left( {\varvec{K}}^\top \varvec{\xi }^{(m-1)}+{\varvec{L}}\varvec{\zeta }^{(m)}\right) \\=\, & {} {\varvec{K}}^\top \textrm{Cov}(\varvec{\xi }^{(m-1)}){\varvec{K}}+{\varvec{L}}\textrm{Cov}(\varvec{\zeta }^{(m)}){\varvec{L}}^\top \\=\, & {} {\varvec{K}}^\top {\varvec{K}}+{\varvec{L}}{\varvec{L}}^\top ={{\textbf {I}}}_{p}, \end{aligned}$$and it is evident for \(m=1\). This concludes the proof of the second item and thus of the proposition.
\(\square\)
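The two identities established above can be checked numerically. The short Monte-Carlo sketch below is illustrative: the coupling matrix \({\varvec{K}}\) is an arbitrary admissible choice (spectral norm smaller than one), and the check verifies that the recursion \(\varvec{\xi }^{(m)}={\varvec{K}}^\top \varvec{\xi }^{(m-1)}+{\varvec{L}}\varvec{\zeta }^{(m)}\) with \({\varvec{L}}{\varvec{L}}^\top ={{\textbf {I}}}_{p}-{\varvec{K}}^\top {\varvec{K}}\) yields \(\textrm{Cov}(\varvec{\xi }^{(3)})\approx {{\textbf {I}}}_{p}\) and \(\textrm{Cov}(\varvec{\xi }^{(1)},\varvec{\xi }^{(3)})\approx {\varvec{K}}^2\).

```python
# Monte-Carlo check (illustrative, with an arbitrary admissible K) of the identities
# Cov(xi^(m)) = I_p and Cov(xi^(1), xi^(3)) = K^2 proved in Proposition 2.3.
import numpy as np

rng = np.random.default_rng(0)
p, nsim = 4, 200_000
A = rng.standard_normal((p, p))
K = 0.9 * A / np.linalg.norm(A, 2)            # any K with spectral norm < 1 is admissible here
L = np.linalg.cholesky(np.eye(p) - K.T @ K)   # L L^T = I_p - K^T K

xi1 = rng.standard_normal((nsim, p))                          # xi^(1) = zeta^(1)
xi2 = xi1 @ K + rng.standard_normal((nsim, p)) @ L.T          # xi^(2) = K^T xi^(1) + L zeta^(2)
xi3 = xi2 @ K + rng.standard_normal((nsim, p)) @ L.T          # xi^(3) = K^T xi^(2) + L zeta^(3)

print(np.allclose(xi3.T @ xi3 / nsim, np.eye(p), atol=2e-2))  # Cov(xi^(3)) ~ I_p
print(np.allclose(xi1.T @ xi3 / nsim, K @ K, atol=2e-2))      # Cov(xi^(1), xi^(3)) ~ K^2
```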
Proof of Proposition 2.4
For the mean-square truncation error (12), on the one hand,
On the other hand,
Thus,
For the mean-square global block error, the result is a simple consequence of the fact that any Gaussian vector can be written as follows (Maatouk et al. 2022)
where \(\epsilon\) is an \(N\)-dimensional standard Gaussian vector, chosen to be the same for \(Y\) and \(Y_{1:M}\) in order to induce a specific dependence structure. The two matrices \(S_{Y}\) and \(S_{Y_{1:M}}\) are the Cholesky factors of the covariance matrices of \(Y\) and \(Y_{1:M}\) on the grid \({\mathcal {G}}=\{x_1,\ldots ,x_{N}\}\), respectively. \(\square\)
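The sketch below illustrates the coupling device used in this proof: two zero-mean Gaussian vectors are driven by the same standard normal vector \(\epsilon\) through their respective Cholesky factors, and under this coupling the mean-square difference equals the squared Frobenius norm of the difference of the two factors. The covariance matrices used here are arbitrary stand-ins for the covariances of \(Y\) and \(Y_{1:M}\) on the grid \({\mathcal {G}}\); the printed identity is a generic property of the coupling, shown only for illustration.

```python
# Illustration of the coupling used in the proof: Y = S_Y eps and Y_{1:M} = S_Y1M eps
# with the SAME standard normal eps, so E||Y - Y_{1:M}||^2 = ||S_Y - S_Y1M||_F^2.
# The two covariance matrices below are arbitrary stand-ins chosen for the example.
import numpy as np

rng = np.random.default_rng(0)
N, nsim = 20, 200_000
x = np.linspace(0.0, 1.0, N)

def cov(theta):
    # squared-exponential stand-in covariance (plus jitter for a stable Cholesky)
    return np.exp(-0.5 * (x[:, None] - x[None, :])**2 / theta**2) + 1e-8 * np.eye(N)

S_Y  = np.linalg.cholesky(cov(0.20))   # factor of the "exact" covariance
S_Yb = np.linalg.cholesky(cov(0.21))   # factor of the "approximate" covariance

eps = rng.standard_normal((nsim, N))   # shared driving noise for both vectors
err = eps @ (S_Y - S_Yb).T             # Y - Y_{1:M} under the coupling
print(np.mean(np.sum(err**2, axis=1)))            # Monte-Carlo estimate of the mean-square difference
print(np.linalg.norm(S_Y - S_Yb, 'fro')**2)       # closed-form value ||S_Y - S_Y1M||_F^2
```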
The following Lemma is presented before proving Proposition 3.2.
Lemma 4.1
Let \(W_1\), \(W_2\), and \(W_3\) be three Gaussian random vectors, with \(W_2\) independent of \(W_3\), satisfying:
where f is a measurable function of \(W_2\). Then
Proof of Lemma 4.1
The proof is given in Wilson et al. (2021). \(\square\)
Proof of Proposition 3.2
We have \(\textrm{E}[\varvec{\eta }\mid {\varvec{A}}\varvec{\eta }]=\Sigma _{\varvec{\eta },{\varvec{A}}\varvec{\eta }}\Sigma _{{\varvec{A}}\varvec{\eta },{\varvec{A}}\varvec{\eta }}^{-1}{\varvec{A}}\varvec{\eta }\). Let \(W_3=\varvec{\eta }-\Sigma _{\varvec{\eta },{\varvec{A}}\varvec{\eta }}\Sigma _{{\varvec{A}}\varvec{\eta },{\varvec{A}}\varvec{\eta }}^{-1}{\varvec{A}}\varvec{\eta }\). The two vectors \({\varvec{A}}\varvec{\eta }\) and \(W_3\) are uncorrelated:
Thus, \({\varvec{A}}\varvec{\eta }\) and \(W_3\) are independent (since Gaussian). Applying Lemma 4.1, we get
where \(\Sigma _{\varvec{\eta },{\varvec{A}}\varvec{\eta }}=\textrm{Cov}(\varvec{\eta },{\varvec{A}}\varvec{\eta })=\textrm{Cov}(\varvec{\eta },\varvec{\eta }){\varvec{A}}^\top =\Gamma {\varvec{A}}^\top =({\varvec{A}}\varvec{\Gamma })^\top\) and \(\Sigma _{{\varvec{A}}\varvec{\eta },{\varvec{A}}\varvec{\eta }}=\textrm{Cov}({\varvec{A}}\varvec{\eta },{\varvec{A}}\varvec{\eta })={\varvec{A}}\textrm{Cov}(\varvec{\eta },\varvec{\eta }){\varvec{A}}^\top ={\varvec{A}}\varvec{\Gamma }{\varvec{A}}^\top\). \(\square\)
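Although Proposition 3.2 is not restated here, the conditional expectation derived in its proof corresponds to the sampling scheme known as Matheron's update rule (Hoffman and Ribak 1991; Cong et al. 2017): draw an unconditional sample \(\varvec{\eta }\sim {\mathcal {N}}(\varvec{\mu },\varvec{\Gamma })\) and correct it by \(\varvec{\Gamma }{\varvec{A}}^\top ({\varvec{A}}\varvec{\Gamma }{\varvec{A}}^\top )^{-1}({\varvec{b}}-{\varvec{A}}\varvec{\eta })\), so that the constraint \({\varvec{A}}\varvec{\eta }={\varvec{b}}\) holds exactly. The sketch below is a minimal illustration; the dimensions, mean, covariance, and constraints are arbitrary choices made for the example.

```python
# Minimal illustration of Matheron's update rule for hyperplane-truncated MVN sampling.
# All quantities (mu, Gamma, A, b, dimensions) are arbitrary choices for the example.
import numpy as np

rng = np.random.default_rng(0)
n, q, nsim = 6, 2, 100_000
mu = np.linspace(-1.0, 1.0, n)                     # prior mean (illustrative)
B = rng.standard_normal((n, n))
Gamma = B @ B.T + n * np.eye(n)                    # an SPD prior covariance (illustrative)
A = rng.standard_normal((q, n))                    # hyperplane constraints A eta = b
b = rng.standard_normal(q)

# Matheron's update: eta + Gamma A^T (A Gamma A^T)^{-1} (b - A eta), applied row-wise
eta = mu + rng.standard_normal((nsim, n)) @ np.linalg.cholesky(Gamma).T
resid = b - eta @ A.T
samples = eta + resid @ np.linalg.solve(A @ Gamma @ A.T, A @ Gamma)

print(np.allclose(samples @ A.T, b))               # the constraints hold exactly
cond_mean = mu + Gamma @ A.T @ np.linalg.solve(A @ Gamma @ A.T, b - A @ mu)
print(np.allclose(samples.mean(axis=0), cond_mean, atol=5e-2))   # matches the conditional mean
```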
Appendix C Computational complexity of Algorithms 1 and 2