
Sampling large hyperplane-truncated multivariate normal distributions

  • Original Paper
  • Computational Statistics

Abstract

Sampling from multivariate normal distributions is widely used in various fields, including engineering, statistics, finance and machine learning. In this paper, the simulation of large multivariate normal distributions truncated on the intersection of a set of hyperplanes is investigated. Specifically, the proposed methodology focuses on cases where the prior multivariate normal distribution is derived from a stationary Gaussian process (GP). It is based on combining both Karhunen-Loève expansions (KLE) and Matheron’s update rules (MUR). The KLE requires the eigendecomposition of the covariance matrix of the random variables, which becomes expensive when the random vector is large. To address this issue, the input domain is split into smaller subdomains on which the eigendecomposition can be computed. Due to the stationarity of the process, only the eigendecomposition of the first subdomain is required. Through this strategy, the computational complexity is drastically reduced. The mean-square truncation and block errors have been calculated. The efficiency of the proposed approach is demonstrated through both synthetic and real data studies.


References

  • Aune E, Eidsvik J, Pokern Y (2013) Iterative numerical methods for sampling from high dimensional Gaussian distributions. Stat Comput 23(4):501–521
  • Chevalier C, Emery X, Ginsbourger D (2015) Fast update of conditional simulation ensembles. Math Geosci 47(7):771–789
  • Cho H, Venturi D, Karniadakis G (2013) Karhunen-Loève expansion for multi-correlated stochastic processes. Probab Eng Mech 34:157–167
  • Cong Y, Chen B, Zhou M (2017) Fast simulation of hyperplane-truncated multivariate normal distributions. Bayesian Anal 12(4):1017–1037. https://doi.org/10.1214/17-BA1052
  • Cousin A, Maatouk H, Rullière D (2016) Kriging of financial term-structures. Eur J Oper Res 255(2):631–648
  • Davis M (1987) Generating large stochastic simulations - The matrix polynomial approximation method. Math Geol 19(2):99–107
  • Golub G, Van Loan C (1996) Matrix computations. The Johns Hopkins University Press, Baltimore
  • Hoffman Y, Ribak E (1991) Constrained realizations of Gaussian fields: a simple algorithm. Astrophys J 380:L5. https://doi.org/10.1086/186160
  • Journel A, Huijbregts C (1976) Mining geostatistics. Academic Press, Cambridge
  • Kirby M, Sirovich L (1990) Application of the Karhunen-Loève procedure for the characterization of human faces. IEEE Trans Pattern Anal Mach Intell 12(1):103–108
  • Lenk PJ, Choi T (2017) Bayesian analysis of shape-restricted functions using Gaussian process priors. Stat Sin 23:43–69
  • Loève M (1977) Elementary probability theory. Springer, Berlin
  • Maatouk H, Bay X (2017) Gaussian process emulators for computer experiments with inequality constraints. Math Geosci 49(5):557–582
  • Maatouk H, Bay X, Rullière D (2022) A note on simulating hyperplane-truncated multivariate normal distributions. Stat Probab Lett 191:109650. https://doi.org/10.1016/j.spl.2022.109650
  • Matérn B (1986) Spatial variation, vol 36. Springer Science & Business Media, Berlin
  • Panunzio A, Cottereau R, Puel G (2018) Large scale random fields generation using localized Karhunen-Loève expansion. Adv Model Simul Eng Sci 5(1):1–29
  • Ray P, Pati D, Bhattacharya A (2020) Efficient Bayesian shape-restricted function estimation with constrained Gaussian process priors. Stat Comput 30(4):839–853
  • Rue H (2001) Fast sampling of Gaussian Markov random fields. J R Stat Soc Ser B 63(2):325–338
  • Wang L (2008) Karhunen-Loève expansions and their applications. London School of Economics and Political Science, London
  • Williams C, Rasmussen C (2006) Gaussian processes for machine learning, vol 2. MIT Press, Cambridge
  • Wilson J, Borovitskiy V, Terenin A et al (2021) Pathwise conditioning of Gaussian processes. J Mach Learn Res 22(105):1–47
  • Wood AT, Chan G (1994) Simulation of stationary Gaussian processes in \([0, 1]^d\). J Comput Graph Stat 3(4):409–432
  • Zhang J, Ellingwood B (1994) Orthogonal series expansions of random fields in reliability analysis. J Eng Mech 120(12):2660–2677
  • Zhou S, Giulani P, Piekarewicz J et al (2019) Reexamining the proton-radius problem using constrained Gaussian processes. Phys Rev C 99:055202. https://doi.org/10.1103/PhysRevC.99.055202


Acknowledgements

The authors would like to express their gratitude to the Editor, the Associate Editor, and the two reviewers for their valuable and constructive comments, which have significantly enhanced the presentation and accuracy of the manuscript. Part of this research was conducted with the support of the consortium in Applied Mathematics CIROQUO, gathering technological and academic partners in the development of advanced methods for computer experiments. https://doi.org/10.5281/zenodo.6581217

Author information


Corresponding author

Correspondence to Hassan Maatouk.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendices

Appendix A Subdomains with different lengths

The methodology presented in Sect. 2.2 can also be extended to cases where the domain is split into subdomains of different lengths. In such cases, the eigendecomposition and the coupling matrix must be computed for each subdomain. For example, suppose that the domain \({\mathcal {D}}=[0,1]\) is divided into two subdomains \([0,S]\cup [S,1]\), as illustrated in Fig. 12 below, where \(S=0.7\). The only modification to the methodology presented above lies in computing both the coupling matrix \({\varvec{K}}\) and the lower triangular matrix \({\varvec{L}}\). Let us denote these two matrices by \({\varvec{K}}^{1,2}\) and \({\varvec{L}}^{1,2}\), respectively, since they depend on the two connected subdomains. The elements of \({\varvec{K}}^{1,2}\in {\mathbb {R}}^{p\times p}\) are defined as

$$\begin{aligned} {\varvec{K}}^{1,2}_{i,j}=\frac{1}{\sqrt{\lambda ^{(1)}_i\lambda _j^{(2)}}} \int _{x=0}^S\int _{t=S}^1C(|x-t|)\phi _i^{(1)}(x)\phi _j^{(2)}(t)\textrm{d}x\textrm{d}t, \end{aligned}$$

where \(\{\lambda _i^{(1)},\phi _i^{(1)}\}_{i=1}^{p}\) and \(\{\lambda _j^{(2)},\phi _j^{(2)}\}_{j=1}^{p}\) are the eigendecompositions of C on [0, S] and (S, 1], respectively. The lower triangular matrix \({\varvec{L}}^{1,2}\) is computed as follows

$$\begin{aligned} {{\textbf {I}}}_{p}-\left( {\varvec{K}}^{1,2}\right) ^\top {\varvec{K}}^{1,2}={\varvec{L}}^{1,2} \left( {\varvec{L}}^{1,2}\right) ^\top . \end{aligned}$$

As before, the second set of conditional KLE coefficients is generated as follows:

$$\begin{aligned} \varvec{\xi }^{(2)}=\left( {\varvec{K}}^{1,2}\right) ^\top \varvec{\xi }^{(1)}+{\varvec{L}}^{1,2}\varvec{\zeta }^{(2)}, \end{aligned}$$

where \(\varvec{\xi }^{(1)}=\varvec{\zeta }^{(1)}\in {\mathbb {R}}^{p}\) and \(\varvec{\zeta }^{(2)}\in {\mathbb {R}}^{p}\) are two replicates following a standard MVN distribution. Finally, the random process defined on the entire domain \({\mathcal {D}}=[0,1]\) is given by

$$\begin{aligned} \left\{ \begin{array}{ll} Y^{p}_{1}(x)=\sum _{i=1}^{p}\sqrt{\lambda _i^{(1)}}\zeta _i^{(1)} \phi _i^{(1)}(x), &{} \quad \text {if }x\in [0,S] \\ Y^p_{2}(x)=\sum _{j=1}^{p}\sqrt{\lambda _j^{(2)}}\xi ^{(2)}_j \phi _j^{(2)}(x), &{} \quad \text {if }x\in (S,1]. \end{array}\right. \end{aligned}$$

The correlation between any two coefficient sets \(\varvec{\xi }^{(m)}\) and \(\varvec{\xi }^{(m')}\) for any \(m'>m\) is equal to

$$\begin{aligned} {\varvec{K}}^{m,m'}= & {} \textrm{Cov}\left( \varvec{\xi }^{(m)},\varvec{\xi }^{(m')}\right) ={\varvec{K}}^{m,m+1}\times {\varvec{K}}^{m+1,m+2}\times \ldots \times {\varvec{K}}^{m'-1,m'} \\= & {} \prod _{\ell =m}^{m'-1}{\varvec{K}}^{\ell ,\ell +1}, \end{aligned}$$

where \({\varvec{K}}^{\ell ,\ell +1}=\textrm{Cov}\left( \varvec{\xi }^{(\ell )},\varvec{\xi }^{(\ell +1)}\right)\), for any \(\ell \in \{m,\ldots ,m'-1\}\).

Fig. 12

GP sample paths when the domain is split into two (left panel) and three (right panel) subdomains with different lengths

In Fig. 12, the domain \({\mathcal {D}}\) is split into two (left panel) and three (right panel) subdomains of different lengths. The solid curves (resp. dashed curves) represent the paths before (resp. after) conditioning. The Matérn covariance function with smoothness parameter \(\nu =5/2\) is used, with the length-scale parameter \(\theta\) fixed at 0.2.
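
The construction above can be illustrated numerically. The sketch below (Python, not the authors' code) reproduces the two-subdomain case with \(S=0.7\) and a Matérn 5/2 covariance with \(\theta =0.2\); it replaces the analytical eigendecomposition by a grid-based (Nyström-type) one, so the grid sizes, the truncation order \(p=12\) and the small jitter are illustrative assumptions rather than values from the paper.

```python
import numpy as np

def matern52(h, theta=0.2):
    """Matérn 5/2 covariance with unit variance and length-scale theta."""
    r = np.sqrt(5.0) * np.abs(h) / theta
    return (1.0 + r + r ** 2 / 3.0) * np.exp(-r)

S, p = 0.7, 12                        # split point and truncation order (illustrative values)
x1 = np.linspace(0.0, S, 140)         # grid on D^(1) = [0, S]
x2 = np.linspace(S, 1.0, 60)          # grid on D^(2) = (S, 1]

# Grid-based (Nystrom-type) eigendecomposition on each subdomain; keep the p largest eigenpairs
w1, V1 = np.linalg.eigh(matern52(x1[:, None] - x1[None, :]))
w2, V2 = np.linalg.eigh(matern52(x2[:, None] - x2[None, :]))
w1, V1 = w1[::-1][:p], V1[:, ::-1][:, :p]
w2, V2 = w2[::-1][:p], V2[:, ::-1][:, :p]

# Coupling matrix K^{1,2}: discrete analogue of the double integral defining K^{1,2}_{i,j}
C12 = matern52(x1[:, None] - x2[None, :])
K12 = (V1 / np.sqrt(w1)).T @ C12 @ (V2 / np.sqrt(w2))

# Lower-triangular factor L^{1,2} of I_p - (K^{1,2})^T K^{1,2} (tiny jitter absorbs rounding)
L12 = np.linalg.cholesky(np.eye(p) - K12.T @ K12 + 1e-9 * np.eye(p))

# Conditional KLE coefficients and the resulting sample path on D = [0, 1]
rng = np.random.default_rng(0)
xi1 = rng.standard_normal(p)                        # xi^(1) = zeta^(1)
xi2 = K12.T @ xi1 + L12 @ rng.standard_normal(p)    # xi^(2) = (K^{1,2})^T xi^(1) + L^{1,2} zeta^(2)
path = np.concatenate([V1 @ (np.sqrt(w1) * xi1),    # Y_1^p on [0, S]
                       V2 @ (np.sqrt(w2) * xi2)])   # Y_2^p on (S, 1]
```

Continuity of the path across \(S\) is only approximate here, since the coupling is computed from the truncated, discretized eigendecomposition.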

Appendix B Proofs of Propositions

This appendix contains the proofs of the propositions presented in the main body of the manuscript.

Proof of Proposition 2.1

We have for any \(x,x'\in {\mathcal {D}}\)

$$\begin{aligned} C(x,x')=\, & {} \textrm{E}[Y(x)Y(x')]=\sum _{i=1}^{+\infty }\sum _{j=1}^{+\infty } \sqrt{\gamma _i\gamma _j}\varphi _i(x)\varphi _j(x')\textrm{E}[\xi _i\xi _j] \\= & {} \sum _{i=1}^{+\infty }\sum _{j=1}^{+\infty }\sqrt{\gamma _i\gamma _j} \varphi _i(x)\varphi _j(x')\delta _{ij} = \sum _{i=1}^{+\infty }\gamma _i\varphi _i(x)\varphi _i(x'), \end{aligned}$$

which concludes the proof of the proposition. \(\square\)
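
As a quick sanity check of this identity in a discretized setting, the eigendecomposition of the covariance matrix evaluated on a grid reproduces it exactly when all terms are kept, and up to a small truncation error otherwise. The kernel, grid and truncation level in the sketch below are illustrative assumptions, not values from the paper.

```python
import numpy as np

# Discrete check of the identity C(x, x') = sum_i gamma_i * phi_i(x) * phi_i(x')
x = np.linspace(0.0, 1.0, 200)
C = np.exp(-np.abs(x[:, None] - x[None, :]) / 0.2)     # exponential covariance, theta = 0.2
gam, phi = np.linalg.eigh(C)
gam, phi = gam[::-1], phi[:, ::-1]                     # sort eigenpairs by decreasing gamma_i

C_full = (phi * gam) @ phi.T                           # all terms: exact reconstruction
C_trunc = (phi[:, :20] * gam[:20]) @ phi[:, :20].T     # first 20 terms only
print(np.max(np.abs(C - C_full)))                      # ~1e-13 (rounding only)
print(np.max(np.abs(C - C_trunc)))                     # small truncation error
```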

Proof of Proposition 2.2

  (i)

    From right to left: suppose that \(\{\xi _i^{(m)}\}\) are independent and identically distributed \({\mathcal {N}}(0,1)\). The two processes \(Y\) and \(Y_m\) are zero-mean Gaussian. Firstly, from Proposition 2.1 and the stationary property of the covariance function, we have for any \(t,t'\in {\mathcal {D}}^{(m)}\)

    $$\begin{aligned} C(t,t')=\, & {} \textrm{Cov}(Y(t),Y(t'))=\textrm{Cov}\left( Y(t-(m-1)S),Y(t'-(m-1)S)\right) \\= & {} \textrm{Cov}(Y(x),Y(x'))=\sum _{i=1}^{+\infty }\lambda _i\phi _i(x)\phi _i(x'), \end{aligned}$$

    where \(x=t-(m-1)S\) and \(x'=t'-(m-1)S\) are in \({\mathcal {D}}^{(1)}\). Secondly, we have for any \(t,t'\in {\mathcal {D}}^{(m)}\)

    $$\begin{aligned} \textrm{Cov}(Y_m(t),Y_m(t'))= & {} \sum _{i,j=1}^{+\infty }\sqrt{\lambda _i\lambda _j}\phi _i(t-(m-1)S)\phi _j(t'-(m-1)S)\delta _{ij} \\= & {} \sum _{i=1}^{+\infty }\lambda _i\phi _i(t-(m-1)S)\phi _i(t'-(m-1)S) \\= & {} \sum _{i=1}^{+\infty }\lambda _i\phi _i(x)\phi _i(x'), \end{aligned}$$

    where \(x=t-(m-1)S\) and \(x'=t'-(m-1)S\) are in \({\mathcal {D}}^{(1)}\). Now, from left to right: suppose that \(Y_m\) and \(Y\) have the same distribution on \({\mathcal {D}}^{(m)}\). This means that for any \(x,x'\in {\mathcal {D}}^{(m)}\)

    $$\begin{aligned} \textrm{Cov}\left( Y_m(x),Y_{m}(x')\right) =\textrm{Cov}(Y(x),Y(x'))=C(|x-x'|). \end{aligned}$$

    Therefore, for any \(i,j\)

    $$\begin{aligned} \textrm{Cov}(\xi _i^{(m)},\xi _j^{(m)})= & {} \frac{1}{\sqrt{\lambda _i\lambda _j}}\iint _{{\mathcal {D}}^{(1)}}\phi _i(t)\phi _j(t')C(|t-t'|)\textrm{d}t\textrm{d}t' \\= & {} \frac{\lambda _j}{\sqrt{\lambda _i\lambda _j}}\int _{{\mathcal {D}}^{(1)}}\phi _i(t)\phi _j(t)\textrm{d}t=\delta _{ij}, \end{aligned}$$

    which concludes the proof of the first item of the proposition.

  (ii)

    Let us consider the simple case when \(m'=m+1\). For example, let \(m=1\) then \(m'=2\). From right to left: on the one hand, we have for any \((s,t)\in {\mathcal {D}}^{(1)}\times {\mathcal {D}}^{(2)}\),

    $$\begin{aligned} \textrm{Cov}(Y(s),Y(t))=\textrm{E}[Y(s)Y(t)]=C(|s-t|). \end{aligned}$$

    On the other hand, we have

    $$\begin{aligned} \textrm{E}[Y_1(s)Y_2(t)]= & {} \sum _{i,j=1}^{+\infty }\sqrt{\lambda _i\lambda _j}\phi _i(s)\phi _j(t-S)\textrm{E}[\xi _i^{(1)}\xi _j^{(2)}]\\= & {} \sum _{i,j=1}^{+\infty }\phi _i(s)\phi _j(t-S)\int _{x=0}^S\int _{x'=S}^{2S}\phi _i(x)\phi _j(x'-S)C(|x-x'|)\textrm{d}x\textrm{d}x'\\= & {} \int _{0}^S\int _S^{2S}\sum _{i=1}^{+\infty }\phi _i(s)\phi _i(x)\sum _{j=1}^{+\infty }\phi _j(t-S)\phi _j(x'-S)C(|x-x'|)\textrm{d}x\textrm{d}x' \\= & {} \int _0^S\int _S^{2S}\delta (s-x)\delta (t-x')C(|x-x'|)\textrm{d}x\textrm{d}x' = C(|s-t|). \end{aligned}$$

    From left to right: we have

    $$\begin{aligned} \textrm{Cov}\left( \xi _{i}^{(1)}, \xi _{j}^{(2)}\right)=\, & {} \textrm{Cov}\left( \frac{1}{\sqrt{\lambda _{i}}} \int _{0}^{S} \phi _{i}(x) Y_{1}(x) \textrm{d}x, \frac{1}{\sqrt{\lambda _{j}}} \int _{t=S}^{2 S} \phi _{j}(t-S) Y_{2}(t) \textrm{d}t\right) \\= & {} \frac{1}{\sqrt{\lambda _{i} \lambda _{j}}} \int _{x=0}^{S} \int _{t=S}^{2 S} C(|x-t|) \phi _{i}(x) \phi _{j}(t-S) \textrm{d}x \textrm{d}t \\= & {} \frac{1}{\sqrt{\lambda _{i} \lambda _{j}}} \int _{x=0}^{S} \int _{x'=0}^{S} C\left( |x-x'-S|\right) \phi _{i}(x) \phi _{j}\left( x'\right) \textrm{d}x \textrm{d}x'. \end{aligned}$$

    The general case can be proved in a similar way.

\(\square\)

Proof of Proposition 2.3

  • From Equation (9), the correlation between \(\varvec{\xi }^{(m-1)}\) and \(\varvec{\xi }^{(m)}\) is imposed in order to satisfy the correlation structure between the connected subdomains \({\mathcal {D}}^{(m-1)}\) and \({\mathcal {D}}^{(m)}\)

    $$\begin{aligned} {\varvec{K}}^{m-1,m}=\textrm{Cov}\left( \varvec{\xi }^{(m-1)},\varvec{\xi }^{(m)}\right) ={\varvec{K}}, \quad \forall m\in \{2,\ldots ,M\}. \end{aligned}$$
    (B1)

    For example, when \(m=3\), we get

    $$\begin{aligned} {\varvec{K}}^{2,3}= \,& {} \textrm{Cov}\left( \varvec{\xi }^{(2)},\varvec{\xi }^{(3)}\right) =\textrm{Cov}\left( {\varvec{K}}^\top \varvec{\zeta }^{(1)}+{\varvec{L}}\varvec{\zeta }^{(2)},{\varvec{K}}^\top \varvec{\xi }^{(2)}+{\varvec{L}}\varvec{\zeta }^{(3)}\right) \\=\, & {} {\varvec{K}}^\top \textrm{Cov}\left( \varvec{\zeta }^{(1)},\varvec{\xi }^{(2)}\right) {\varvec{K}}+{\varvec{L}}\textrm{Cov}\left( \varvec{\zeta }^{(2)},\varvec{\xi }^{(2)}\right) {\varvec{K}}\\=\, & {} {\varvec{K}}^\top {\varvec{K}}{\varvec{K}}+{\varvec{L}}{\varvec{L}}^\top {\varvec{K}}=({\varvec{K}}^\top {\varvec{K}}+{\varvec{L}}{\varvec{L}}^\top ){\varvec{K}}={\varvec{K}}. \end{aligned}$$

    The general case can be proved in a similar way. The second item of Proposition 2.2 concludes the proof.

  • The second item is a simple consequence of Equations (9) and (B1). For example, when \(m=1\) and \(m'=3\), we get

    $$\begin{aligned} {\varvec{K}}^{1,3}=\, & {} \textrm{Cov}\left( \varvec{\xi }^{(1)},\varvec{\xi }^{(3)}\right) =\,\textrm{Cov}\left( \varvec{\xi }^{(1)},{\varvec{K}}^\top \varvec{\xi }^{(2)}+{\varvec{L}}\varvec{\zeta }^{(3)}\right) \\=\, & {} \textrm{Cov}\left( \varvec{\xi }^{(1)},\varvec{\xi }^{(2)}\right) {\varvec{K}}={\varvec{K}}^2. \end{aligned}$$

    In particular, for any \(m\in \{2,\ldots ,M\}\)

    $$\begin{aligned} {\varvec{K}}^{m,m}=\,\textrm{Cov}\left( \varvec{\xi }^{(m)}\right)=\, & {} \textrm{Cov}\left( {\varvec{K}}^\top \varvec{\xi }^{(m-1)}+{\varvec{L}}\varvec{\zeta }^{(m)}\right) \\=\, & {} {\varvec{K}}^\top \textrm{Cov}(\varvec{\xi }^{(m-1)}){\varvec{K}}+{\varvec{L}}\textrm{Cov}(\varvec{\zeta }^{(m)}){\varvec{L}}^\top \\=\, & {} {\varvec{K}}^\top {\varvec{K}}+{\varvec{L}}{\varvec{L}}^\top ={{\textbf {I}}}_{p}, \end{aligned}$$

    and it is evident for \(m=1\). This concludes the proof of the second item and thus of the proposition.

\(\square\)
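
Both identities can be checked by Monte Carlo simulation of the recursion \(\varvec{\xi }^{(m)}={\varvec{K}}^\top \varvec{\xi }^{(m-1)}+{\varvec{L}}\varvec{\zeta }^{(m)}\). In the sketch below, \({\varvec{K}}\) is an arbitrary contraction (spectral norm smaller than one), not a coupling matrix computed from a covariance function, so the values are purely illustrative.

```python
import numpy as np

# Monte Carlo check of Proposition 2.3 with a generic coupling matrix K
rng = np.random.default_rng(1)
p, M, n_rep = 5, 4, 200_000
K = rng.standard_normal((p, p))
K /= 1.5 * np.linalg.norm(K, 2)                        # spectral norm < 1, so I_p - K^T K is PSD
L = np.linalg.cholesky(np.eye(p) - K.T @ K)

xi = [rng.standard_normal((n_rep, p))]                 # rows of xi[0] are samples of xi^(1) = zeta^(1)
for m in range(1, M):
    zeta = rng.standard_normal((n_rep, p))
    xi.append(xi[-1] @ K + zeta @ L.T)                 # xi^(m+1) = K^T xi^(m) + L zeta^(m+1), row-wise

print(np.max(np.abs(xi[0].T @ xi[2] / n_rep - K @ K)))        # Cov(xi^(1), xi^(3)) ~ K^2
print(np.max(np.abs(xi[-1].T @ xi[-1] / n_rep - np.eye(p))))  # Cov(xi^(M)) ~ I_p
```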

Proof of Proposition 2.4

For the mean-square truncation error (12), on the one hand,

$$\begin{aligned} \textrm{E}\left[ \int _{{\mathcal {D}}^{(m)}} Y_m(x)^2\,\textrm{d}x\right]=\, & {} \textrm{E}\left[ \int _{{\mathcal {D}}^{(m)}}\sum _{i,j=1}^{+\infty }\sqrt{\lambda _i\lambda _j}\,\phi _i(x-(m-1)S)\phi _j(x-(m-1)S)\,\xi ^{(m)}_i\xi ^{(m)}_j\,\textrm{d}x\right] \\=\, & {} \sum _{i,j=1}^{+\infty }\sqrt{\lambda _i\lambda _j}\,\textrm{E}[\xi ^{(m)}_i\xi ^{(m)}_j]\int _{{\mathcal {D}}^{(m)}}\phi _i(x-(m-1)S)\phi _j(x-(m-1)S)\,\textrm{d}x \\=\, & {} \sum _{i,j=1}^{+\infty }\sqrt{\lambda _i\lambda _j}\,\textrm{E}[\xi ^{(m)}_i\xi ^{(m)}_j]\,\delta _{i,j}=\sum _{i=1}^{+\infty }\lambda _i. \end{aligned}$$

On the other hand,

$$\begin{aligned} \textrm{E}\left[ \displaystyle \int _{{\mathcal {D}}^{(m)}}(Y_m(x)-Y^p_m(x))^2\,\textrm{d}x\right]= & {} \textrm{E}\left[ \displaystyle \int _{{\mathcal {D}}^{(m)}}\left( \sum _{i=p+1}^{+\infty } \sqrt{\lambda _i}\phi _i(x-(m-1)S)\xi ^{(m)}_i\right) ^2\textrm{d}x\right] \\= & {} \sum _{i,j=p+1}^{+\infty }\sqrt{\lambda _i\lambda _j}\textrm{E}[\xi _i^{(m)} \xi _j^{(m)}]\delta _{i,j}=\sum _{i=p+1}^{+\infty }\lambda _i. \end{aligned}$$

Thus,

$$\begin{aligned} \epsilon _T^2=\frac{\sum _{i=p+1}^{+\infty }\lambda _i}{\sum _{i=1}^{ +\infty }\lambda _i}=\frac{\sum _{i=1}^{+\infty }\lambda _i -\sum _{i=1}^{p}\lambda _i}{\sum _{i=1}^{+\infty }\lambda _i}= 1-\frac{\sum _{i=1}^p\lambda _i}{\sum _{i=1}^{+\infty }\lambda _i}. \end{aligned}$$

For the mean-square global block error, the result is a simple consequence of the fact that any Gaussian vector can be written as follows (Maatouk et al. 2022)

$$\begin{aligned} \begin{pmatrix} Y(x_1) \\ \vdots \\ Y(x_{N}) \end{pmatrix}=S_{Y}\times \epsilon \quad \text {and} \quad \begin{pmatrix} Y_{1:M}(x_1) \\ \vdots \\ Y_{1:M}(x_{N}) \end{pmatrix}=S_{Y_{1:M}}\times \epsilon , \end{aligned}$$

where \(\epsilon\) is an N-dimensional standard Gaussian vector, chosen to be the same for Y and \(Y_{1:M}\) in order to induce a specific dependence structure. The two matrices \(S_{Y}\) and \(S_{Y_{1:M}}\) are the Cholesky factors of the covariance matrices of \(Y\) and \(Y_{1:M}\) on the grid \({\mathcal {G}}=\{x_1,\ldots ,x_{N}\}\), respectively. \(\square\)
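
The truncation error \(\epsilon _T^2\) derived above is straightforward to evaluate once the (discretized) eigenvalues are available. The sketch below uses a grid-discretized Matérn 5/2 kernel and picks the smallest \(p\) achieving a prescribed error; the grid size, length-scale and target error are assumptions, not values from the paper.

```python
import numpy as np

# Evaluating eps_T^2 = 1 - (sum_{i<=p} lambda_i) / (sum_i lambda_i) on a discretized kernel
def matern52(h, theta=0.2):
    r = np.sqrt(5.0) * np.abs(h) / theta
    return (1.0 + r + r ** 2 / 3.0) * np.exp(-r)

x = np.linspace(0.0, 1.0, 500)
lam = np.linalg.eigvalsh(matern52(x[:, None] - x[None, :]))[::-1]   # eigenvalues, decreasing
eps_T2 = 1.0 - np.cumsum(lam) / np.sum(lam)                         # truncation error as a function of p
p_star = int(np.argmax(eps_T2 < 1e-4)) + 1                          # smallest p reaching the target error
print(p_star, eps_T2[p_star - 1])
```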

The following Lemma is presented before proving Proposition 3.2.

Lemma 4.1

Let \(W_1\), \(W_2\) and \(W_3\) be three Gaussian random vectors such that \(W_2\) is independent of \(W_3\) and satisfying:

$$\begin{aligned} W_1\overset{d}{=}f(W_2)+W_3, \end{aligned}$$

where f is a measurable function of \(W_2\). Then

$$\begin{aligned} \{W_1\ \text {s.t.} \ W_2=\beta \}\overset{d}{=}f(\beta )+W_3. \end{aligned}$$

Proof of Lemma 4.1

The proof is given in Wilson et al. (2021). \(\square\)

Proof of Proposition 3.2

We have \(\textrm{E}[\varvec{\eta }\ \text {s.t.} \ {\varvec{A}}\varvec{\eta }]=\Sigma _{\varvec{\eta },{\varvec{A}}\varvec{\eta }}\Sigma _{{\varvec{A}}\varvec{\eta },{\varvec{A}}\varvec{\eta }}^{-1}{\varvec{A}}\varvec{\eta }\). Let \(W_3=\varvec{\eta }-\Sigma _{\varvec{\eta },{\varvec{A}}\varvec{\eta }}\Sigma _{{\varvec{A}}\varvec{\eta },{\varvec{A}}\varvec{\eta }}^{-1}{\varvec{A}}\varvec{\eta }\). The two vectors \({\varvec{A}}\varvec{\eta }\) and \(W_3\) are uncorrelated:

$$\begin{aligned} \textrm{Cov}({\varvec{A}}\varvec{\eta },W_3)=\, & {} \textrm{Cov}({\varvec{A}}\varvec{\eta },\varvec{\eta }-\Sigma _{\varvec{\eta },{\varvec{A}}\varvec{\eta }}\Sigma _{{\varvec{A}}\varvec{\eta },{\varvec{A}}\varvec{\eta }}^{-1}{\varvec{A}}\varvec{\eta }) \\=\, & {} \textrm{Cov}({\varvec{A}}\varvec{\eta },\varvec{\eta })-\textrm{Cov}({\varvec{A}}\varvec{\eta },\varvec{\Gamma }{\varvec{A}}^\top ({\varvec{A}}\varvec{\Gamma }{\varvec{A}}^\top )^{-1}{\varvec{A}}\varvec{\eta }) \\= \,& {} {\varvec{A}}\varvec{\Gamma }-{\varvec{A}}\varvec{\Gamma }{\varvec{A}}^\top ({\varvec{A}}\varvec{\Gamma }{\varvec{A}}^\top )^{-1}{\varvec{A}}\varvec{\Gamma }=\, {\varvec{A}}\varvec{\Gamma }-{\varvec{A}}\varvec{\Gamma }={\varvec{0}}_{n,N}. \end{aligned}$$

Thus, \({\varvec{A}}\varvec{\eta }\) and \(W_3\) are independent (since Gaussian). Applying Lemma 4.1, we get

$$\begin{aligned} \{\varvec{\eta }\ \text {s.t.} \ {\varvec{A}}\varvec{\eta }=\,{\varvec{y}}\}&\overset{d}{=} f({\varvec{y}})+W_3=\Sigma _{\varvec{\eta },{\varvec{A}}\varvec{\eta }}\Sigma _{{\varvec{A}}\varvec{\eta },{\varvec{A}}\varvec{\eta }}^{-1}{\varvec{y}}+\varvec{\eta }-\Sigma _{\varvec{\eta },{\varvec{A}}\varvec{\eta }}\Sigma _{{\varvec{A}}\varvec{\eta },{\varvec{A}}\varvec{\eta }}^{-1}{\varvec{A}}\varvec{\eta }\\&\overset{d}{=} \varvec{\eta }+\Sigma _{\varvec{\eta },{\varvec{A}}\varvec{\eta }}\Sigma _{{\varvec{A}}\varvec{\eta },{\varvec{A}}\varvec{\eta }}^{-1}({\varvec{y}}-{\varvec{A}}\varvec{\eta }) \overset{d}{=}\varvec{\eta }+\varvec{\Gamma }{\varvec{A}}^\top ({\varvec{A}}\varvec{\Gamma }{\varvec{A}}^\top )^{-1}({\varvec{y}}-{\varvec{A}}\varvec{\eta })\\&\overset{d}{=}\varvec{\eta }+({\varvec{A}}\varvec{\Gamma })^\top ({\varvec{A}}\varvec{\Gamma }{\varvec{A}}^\top )^{-1}({\varvec{y}}-{\varvec{A}}\varvec{\eta }) \end{aligned}$$

where \(\Sigma _{\varvec{\eta },{\varvec{A}}\varvec{\eta }}=\textrm{Cov}(\varvec{\eta },{\varvec{A}}\varvec{\eta })=\textrm{Cov}(\varvec{\eta },\varvec{\eta }){\varvec{A}}^\top =\varvec{\Gamma }{\varvec{A}}^\top =({\varvec{A}}\varvec{\Gamma })^\top\) and \(\Sigma _{{\varvec{A}}\varvec{\eta },{\varvec{A}}\varvec{\eta }}=\textrm{Cov}({\varvec{A}}\varvec{\eta },{\varvec{A}}\varvec{\eta })={\varvec{A}}\textrm{Cov}(\varvec{\eta },\varvec{\eta }){\varvec{A}}^\top ={\varvec{A}}\varvec{\Gamma }{\varvec{A}}^\top\). \(\square\)
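
Proposition 3.2 translates directly into a sampling routine: draw an unconstrained \(\varvec{\eta }\sim {\mathcal {N}}({\varvec{0}},\varvec{\Gamma })\) and correct it so that \({\varvec{A}}\varvec{\eta }={\varvec{y}}\) holds exactly. The sketch below implements this update with dense linear algebra for the zero-mean case; the toy covariance, constraint matrix and sizes are illustrative assumptions.

```python
import numpy as np

def sample_hyperplane_truncated(Gamma, A, y, n_samples, rng):
    """Draw samples of eta ~ N(0, Gamma) conditioned on A eta = y (zero-mean case)."""
    N = Gamma.shape[0]
    eta = rng.multivariate_normal(np.zeros(N), Gamma, size=n_samples)   # unconstrained prior draws
    update = np.linalg.solve(A @ Gamma @ A.T, A @ Gamma)                # (A Gamma A^T)^{-1} A Gamma
    return eta + (y[None, :] - eta @ A.T) @ update                      # Matheron-type correction

# Toy usage: N = 50 grid points, n = 3 linear constraints (all values illustrative)
rng = np.random.default_rng(0)
x = np.linspace(0.0, 1.0, 50)
Gamma = np.exp(-0.5 * (x[:, None] - x[None, :]) ** 2 / 0.2 ** 2) + 1e-8 * np.eye(50)
A = rng.standard_normal((3, 50))
y = np.array([0.5, -1.0, 2.0])
samples = sample_hyperplane_truncated(Gamma, A, y, n_samples=4, rng=rng)
print(np.max(np.abs(samples @ A.T - y)))    # ~1e-12: the constraints hold exactly
```

Note that only an \(n\times n\) linear system involving \({\varvec{A}}\varvec{\Gamma }{\varvec{A}}^\top\) has to be solved, whatever the dimension N of the prior vector.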

Appendix C Computational complexity of Algorithms 1 and 2

Table 5 The computational complexity of Algorithm 1. The parameters n and N denote the dimensions of the set of constraints and of the prior MVN \(\varvec{\eta }\), respectively
Table 6 The computational complexity of Algorithm 2 when the domain \({\mathcal {D}}\) is split into two subdomains (\(M=2\)). The parameters n and N denote the dimensions of the set of constraints and of the prior MVN \(\varvec{\eta }\), respectively, while \(p\) denotes the number of terms retained in the expansion

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article


Cite this article

Maatouk, H., Rullière, D. & Bay, X. Sampling large hyperplane-truncated multivariate normal distributions. Comput Stat 39, 1779–1806 (2024). https://doi.org/10.1007/s00180-023-01416-7


  • DOI: https://doi.org/10.1007/s00180-023-01416-7
