APPROXIMATION OF BAYESIAN
INVERSE PROBLEMS FOR PDES∗
S. L. COTTER† , M. DASHTI† , AND A. M. STUART†
Abstract. Inverse problems are often ill posed, with solutions that depend sensitively on data.
In any numerical approach to the solution of such problems, regularization of some form is needed to
counteract the resulting instability. This paper is based on an approach to regularization, employing
a Bayesian formulation of the problem, which leads to a notion of well posedness for inverse problems,
at the level of probability measures. The stability which results from this well posedness may be
used as the basis for quantifying the approximation, in finite dimensional spaces, of inverse problems
for functions. This paper contains a theory which utilizes this stability property to estimate the
distance between the true and approximate posterior distributions, in the Hellinger metric, in terms
of error estimates for approximation of the underlying forward problem. This is potentially useful as
it allows for the transfer of estimates from the numerical analysis of forward problems into estimates
for the solution of the related inverse problem. It is noteworthy that, when the prior is a Gaussian
random field model, controlling differences in the Hellinger metric leads to control on the differences
between expected values of polynomially bounded functions and operators, including the mean and
covariance operator. The ideas are applied to some non-Gaussian inverse problems where the goal is
determination of the initial condition for the Stokes or Navier–Stokes equation from Lagrangian and
Eulerian observations, respectively.
Key words. inverse problem, Bayesian, Stokes flow, data assimilation, Markov chain–Monte
Carlo
DOI. 10.1137/090770734
Consider the inverse problem of finding u from data y related to u by

(1.1) y = G(u).
Here G denotes some form of observation operator, such as pointwise evaluation at a finite set of points. The observation operator is often denoted with the letter H in the atmospheric sciences community [11]; because we need H for Hilbert space later on, we use the symbol G.
In practice, observations are subject to noise. A more appropriate model equation is then often of the form
(1.2) y = G(u) + η,
where η is a mean-zero random variable, whose statistical properties we might know,
or make a reasonable mathematical model for, but whose actual value is unknown to
us; we refer to η as the observational noise. We assume that it is possible to describe
our prior knowledge about u, before acquiring data, in terms of a prior probability
measure µ0 . It is then possible to use Bayes’s formula to calculate the posterior
probability measure µ for u given y.
In the infinite dimensional setting the most natural version of Bayes theorem is a
statement that the posterior measure is absolutely continuous with respect to the prior
[27] and that the Radon–Nikodým derivative (density) between them is determined
by the data likelihood. This gives rise to the formula
(1.3) dµ/dµ0(u) = (1/Z(y)) exp(−Φ(u; y)),
where the normalization constant Z(y) is chosen so that µ is a probability measure:
(1.4) Z(y) = ∫_X exp(−Φ(u; y)) dµ0(u).
In the case where y is finite dimensional and η has Lebesgue density ρ this is simply
(1.5) dµ/dµ0(u) ∝ ρ(y − G(u)).
More generally Φ is determined by the distribution of y given u. We call Φ(u; y) the
potential and sometimes, for brevity, refer to the evaluation of Φ(u; y) for a particular
u ∈ X as solving the forward problem, since Φ is defined through G(·). Note that the
solution to the inverse problem is a probability measure µ which is defined through a
combination of solution of the forward problem Φ, the data y, and a prior probability
measure µ0 . Bayesian and classical regularization are linked via the fact that the
minimizer of the Tikhonov-regularized nonlinear least squares problem coincides with
MAP estimators, which maximize the posterior probability—see section 5.3 of [27].
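To make the preceding formulas concrete, the following is a minimal sketch that evaluates the un-normalized Radon–Nikodým derivative (1.3) for a toy finite-dimensional problem of the form (1.2); the forward map G, the noise covariance Γ, and all numerical values are illustrative assumptions, not objects from the paper.

```python
import numpy as np

# Toy illustration of (1.2)-(1.5): finite-dimensional data y = G(u) + eta with
# eta ~ N(0, Gamma), so rho(y - G(u)) is a Gaussian density and the un-normalized
# Radon-Nikodym derivative dmu/dmu0 is exp(-Phi(u; y)).

def G(u):
    # hypothetical nonlinear forward map R^2 -> R^3 (assumption for illustration)
    return np.array([u[0] + u[1], u[0] * u[1], np.sin(u[0])])

Gamma = 0.1 * np.eye(3)            # observational noise covariance (assumed)
Gamma_inv = np.linalg.inv(Gamma)

def Phi(u, y):
    """Potential Phi(u; y) = 0.5 * |y - G(u)|_Gamma^2, cf. (2.1)."""
    r = y - G(u)
    return 0.5 * r @ Gamma_inv @ r

def unnormalized_density(u, y):
    """dmu/dmu0 (u) up to the constant Z(y), cf. (1.3)."""
    return np.exp(-Phi(u, y))

rng = np.random.default_rng(0)
u_true = np.array([0.3, -0.7])
y = G(u_true) + rng.multivariate_normal(np.zeros(3), Gamma)
print(unnormalized_density(u_true, y), unnormalized_density(u_true + 1.0, y))
```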
In general it is hard to obtain information from a formula such as (1.3) for a prob-
ability measure. One useful approach to extracting information is to use sampling:
generate a set of points {u^(k)}_{k=1}^K distributed (perhaps only approximately) according to µ. In this context it is noteworthy that the integral Z(y) appearing in formula
(1.3) is not needed to enable implementation of MCMC (Markov chain–Monte Carlo)
methods to sample from the desired measure. These methods incur an error which
is well understood and which decays as 1/√K [17]. However, for inverse problems on
function space there is a second source of error, arising from the need to approximate
function space there is a second source of error, arising from the need to approximate
the inverse problem in a finite dimensional subspace of dimension N . The purpose of
this paper is to quantify such approximation errors. The key analytical idea is that
we transfer approximation properties of the forward problem Φ into approximation
properties of the inverse problem defined by (1.3). Since the solution to the Bayesian
inverse problem is a probability measure, we will need to use metrics on probability
measures to quantify the effect of approximation. We will employ the Hellinger met-
ric because this leads directly to bounds on the approximation error incurred when
calculating the expectation of functions. The main general results concerning approx-
imation properties are Theorems 2.2 and 2.4, together with Corollary 2.3. Further
experience will need to be gathered in order to fully evaluate the relative merits of
discretizing before or after formulation of the Bayesian inverse problem.
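As an illustration of sampling from (1.3) without knowledge of Z(y), the following is a minimal sketch of a prior-reversible (preconditioned Crank–Nicolson) Metropolis sampler; the potential Φ, the prior variances, and the step size β are hypothetical choices, and the Monte Carlo error of the resulting averages decays like 1/√K as discussed above.

```python
import numpy as np

# Sketch of a preconditioned Crank-Nicolson (pCN) Metropolis sampler for the
# measure mu in (1.3).  Only differences Phi(u) - Phi(v) enter the acceptance
# probability, so the normalization constant Z(y) in (1.4) is never needed.
# The target Phi and the prior variances below are illustrative assumptions.

rng = np.random.default_rng(1)

prior_var = 1.0 / np.arange(1, 11) ** 2      # stand-in for eigenvalues of C
prior_std = np.sqrt(prior_var)

def Phi(u):
    # hypothetical potential; in the paper Phi(u; y) = 0.5 |y - G(u)|_Gamma^2
    return 0.5 * np.sum((u[:3] - 1.0) ** 2)

def pcn_sample(n_steps, beta=0.2):
    u = prior_std * rng.standard_normal(prior_var.size)   # draw from mu0
    samples = []
    for _ in range(n_steps):
        xi = prior_std * rng.standard_normal(prior_var.size)
        v = np.sqrt(1.0 - beta ** 2) * u + beta * xi       # prior-reversible proposal
        if np.log(rng.uniform()) < Phi(u) - Phi(v):        # accept/reject step
            u = v
        samples.append(u.copy())
    return np.array(samples)

S = pcn_sample(5000)
print("posterior mean estimate:", S.mean(axis=0)[:3])      # MC error ~ 1/sqrt(K)
```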
Overviews of inverse problems arising in fluid mechanics, such as those studied
in sections 3 and 4, may be found in [19]. The connection between the Tikhonov-
regularized least squares and Bayesian approaches to inverse problems in fluid me-
chanics is overviewed in [1].
2. General framework. In this section we establish three useful results which
concern the effect of approximation on the posterior probability measure µ given by
(1.3). These three results are Theorem 2.2, Corollary 2.3, and Theorem 2.4. The key
point to notice about these results is that they simply require the proof of various
bounds and approximation properties for the forward problem, and yet they yield
approximation results concerning the Bayesian inverse problem. The connection to
probability comes only through the choice of the space X, in which the bounds and
approximation properties must be proved, which must have full measure under the
prior µ0 .
The probability measure of interest (1.3) is defined through a density with respect
to a prior reference measure µ0 which, by shift of origin, we take to have mean
zero. Furthermore, we assume that this reference measure is Gaussian with covariance
operator C. We write µ0 = N (0, C). In fact, the only consequence of the Gaussian
prior assumption that we use is the Fernique Theorem A.3. Hence the results may be
trivially extended to all measures which satisfy the conclusion of this theorem. The
Fernique theorem holds for all Gaussian measures on a separable Banach space [3]
and also for other measures with tails which decay at least as fast as a Gaussian.
It is demonstrated in [27] that in many applications, including those considered
here, the potential Φ(·; y) satisfies certain natural bounds on a Banach space (X, ‖·‖_X).
It is then natural to choose the prior µ0 so that µ0 (X) = 1. Such bounds on the
forward problem Φ are summarized in the following assumptions. We assume that the
data y lie in a Banach space (Y, ‖·‖_Y). The key point about the form of Assumption
1(i) is that it allows the use of the Fernique theorem to control integrals against µ.
Assumption 1(ii) may be used to obtain lower bounds on the normalization constant
Z(y).
Assumption 1. For some Banach space X with µ0 (X) = 1, the function Φ :
X × Y → R satisfies the following:
(i) for every ε > 0 and r > 0 there is an M = M(ε, r) ∈ R such that, for all u ∈ X and y ∈ Y with ‖y‖_Y < r,

Φ(u; y) ≥ M − ε‖u‖²_X;

(ii) for every r > 0 there is an L = L(r) > 0 such that, for all u ∈ X and y ∈ Y with max{‖u‖_X, ‖y‖_Y} < r,

Φ(u; y) ≤ L(r).
For Bayesian inverse problems in which a finite number of observations are made
and the observation error η is mean zero Gaussian with covariance matrix Γ, the
potential Φ has the form
(2.1) Φ(u; y) = ½ |y − G(u)|²_Γ,

where y ∈ R^m is the data, G : X → R^m is the forward model, and |·|_Γ is a covariance-weighted norm on R^m given by |·|_Γ = |Γ^{−1/2}·|, with |·| denoting the standard Euclidean norm. In this case it is natural to express conditions on the measure µ in terms of G.
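A small sketch of how the covariance-weighted norm |·|_Γ = |Γ^{−1/2}·| and the potential (2.1) might be evaluated in practice, using a Cholesky factorization of Γ rather than an explicit inverse; Γ and the toy forward map are assumed for illustration only.

```python
import numpy as np

# The covariance-weighted norm |v|_Gamma = |Gamma^{-1/2} v| from (2.1),
# evaluated via a Cholesky factorization instead of an explicit inverse.
# Gamma below is a hypothetical observation covariance.

Gamma = np.array([[0.2, 0.05],
                  [0.05, 0.1]])
L = np.linalg.cholesky(Gamma)            # Gamma = L L^T

def gamma_norm(v):
    w = np.linalg.solve(L, v)            # |L^{-1} v| = |Gamma^{-1/2} v|: both square to v^T Gamma^{-1} v
    return np.linalg.norm(w)

def potential(u, y, G):
    """Phi(u; y) = 0.5 |y - G(u)|_Gamma^2 as in (2.1)."""
    return 0.5 * gamma_norm(y - G(u)) ** 2

G = lambda u: np.array([u[0], u[0] ** 3])   # toy forward model R -> R^2 (assumed)
print(potential(np.array([0.5]), np.array([0.4, 0.2]), G))
```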
Now let Φ^N denote an approximation of Φ, arising, for example, from discretization of the forward problem, and define the corresponding approximate posterior measure µ^N by

(2.4) dµ^N/dµ0(u) = (1/Z^N) exp(−Φ^N(u)),

where

(2.5) Z^N = ∫_X exp(−Φ^N(u)) dµ0(u).
Theorem 2.2. Assume that Φ and Φ^N satisfy Assumptions 1(i) and 1(ii) with constants uniform in N. Assume also that, for any ε > 0, there is a K = K(ε) > 0 such that

(2.6) |Φ(u) − Φ^N(u)| ≤ K exp(ε‖u‖²_X) ψ(N),

where ψ(N) → 0 as N → ∞. Then the measures µ and µ^N are close with respect to
the Hellinger distance: there is a constant C, independent of N, such that d_Hell(µ, µ^N) ≤ C ψ(N).
Proof. Using Assumption 1(ii), the normalization constant Z may be bounded from below by a constant multiple of the µ0-measure of a ball in X. This lower bound is positive because µ0 has full measure on X and is Gaussian, so that all balls in X have positive probability. We have an analogous lower bound for |Z^N|.
From Assumptions 1(i) and (2.6), using the fact that µ0 is a Gaussian probability
measure so that the Fernique Theorem A.3 applies, we obtain
|Z − Z^N| ≤ K ψ(N) ∫_X exp(ε‖u‖²_X − M) exp(ε‖u‖²_X) dµ0(u) ≤ C ψ(N).
From the definition of the Hellinger distance we have 2 d_Hell(µ, µ^N)² ≤ I₁ + I₂, where

I₁ = (2/Z) ∫_X ( exp(−½ Φ(u)) − exp(−½ Φ^N(u)) )² dµ0(u),

I₂ = 2 |Z^{−1/2} − (Z^N)^{−1/2}|² ∫_X exp(−Φ^N(u)) dµ0(u).
Now, again using Assumption 1(i) and (2.6), together with the Fernique Theorem
A.3,
(Z/2) I₁ ≤ (K²/4) ψ(N)² ∫_X exp(3ε‖u‖²_X − M) dµ0(u) ≤ C ψ(N)².
Finally all moments of u in X are finite under the Gaussian measure µ0 by the
Fernique Theorem A.3. It follows that all moments are finite under µ and µN because,
for f : X → Z polynomially bounded,

E^µ ‖f‖ ≤ ( E^{µ0} ‖f‖² )^{1/2} ( E^{µ0} exp(−2Φ(u; y)) )^{1/2},
and the first term on the right-hand side is finite since all moments are finite under
µ0 , while the second term may be seen to be finite by the use of Assumption 1(i) and
the Fernique Theorem A.3.
For Bayesian inverse problems with finite data the potential Φ has the form given
in (2.1) where y ∈ Rm is the data, G : X → Rm is the forward model, and | · |Γ is a
covariance weighted norm on Rm . In this context the following corollary is useful.
Corollary 2.3. Assume that Φ is given by (2.1) and that G is approximated by a function G^N with the property that, for any ε > 0, there is a K′ = K′(ε) > 0 such that

(2.8) |G(u) − G^N(u)| ≤ K′ exp(ε‖u‖²_X) ψ(N),

where ψ(N) → 0 as N → ∞. Then the conclusions of Theorem 2.2 hold for µ and µ^N. Indeed,

|Φ(u) − Φ^N(u)| ≤ ½ |2y − G(u) − G^N(u)|_Γ |G(u) − G^N(u)|_Γ
 ≤ ( |y| + exp(ε‖u‖²_X) + M ) K′(ε) exp(ε‖u‖²_X) ψ(N)
 ≤ K(2ε) exp(2ε‖u‖²_X) ψ(N),

so that a bound of the form (2.6) holds, as required.
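The Hellinger bounds above can also be probed numerically. The following sketch estimates d_Hell(µ, µ^N) by Monte Carlo over prior samples, for an artificial potential Φ and a perturbation Φ^N whose error is O(1/N); the potentials, prior, and sample sizes are assumptions made purely for illustration.

```python
import numpy as np

# Monte Carlo sketch of the Hellinger distance between mu and mu^N, both given by
# densities exp(-Phi)/Z and exp(-Phi^N)/Z^N with respect to the prior mu0, estimated
# from prior samples.  Phi and its perturbation below are illustrative assumptions.

rng = np.random.default_rng(2)

def Phi(u):
    return 0.5 * np.sum(u ** 2, axis=-1)                        # toy exact potential

def Phi_N(u, N):
    return Phi(u) + np.exp(-0.05 * np.sum(u ** 2, axis=-1)) / N  # forward error O(1/N)

U = rng.standard_normal((200_000, 2))                            # samples from the prior mu0

def hellinger(N):
    a, b = np.exp(-Phi(U)), np.exp(-Phi_N(U, N))
    Z, ZN = a.mean(), b.mean()                                   # Monte Carlo estimates of (1.4), (2.5)
    integrand = (np.sqrt(a / Z) - np.sqrt(b / ZN)) ** 2
    return np.sqrt(0.5 * integrand.mean())

for N in (4, 16, 64):
    print(N, hellinger(N))        # decays roughly like 1/N, consistent with Theorem 2.2
```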
A notable fact concerning Theorem 2.2 is that the rate of convergence attained
in the solution of the forward problem, encapsulated in the approximation of the
function Φ by ΦN , is transferred into the rate of convergence of the related inverse
problem for the measure µ given by (2.2) and its approximation by µ^N. Key to achieving
this transfer of rates of convergence is the dependence on u of the constant in the forward
error bound (2.6). In particular, it is necessary that this constant be integrable,
which is ensured by the Fernique Theorem A.3. In some applications it is not possible to obtain
such dependence. Then convergence results can sometimes still be obtained, but at
weaker rates. We now describe a theory for this situation.
Theorem 2.4. Assume that Φ and Φ^N satisfy Assumptions 1(i) and 1(ii) with constants uniform in N. Assume also that for any R > 0 there is a K = K(R) > 0 such that, for all u with ‖u‖_X ≤ R,

|Φ(u) − Φ^N(u)| ≤ K(R) ψ(N),

where ψ(N) → 0 as N → ∞. Then the measures µ and µ^N are close with respect to the Hellinger distance: d_Hell(µ, µ^N) → 0 as N → ∞.
Now, again by the Fernique Theorem A.3, J_R → 0 as R → ∞, so for any δ > 0 we may choose R > 0 such that J_R < δ. Now choose N > 0 so that K₁(R)ψ(N) < δ to deduce that |Z − Z^N| < 2δ. Since δ > 0 is arbitrary, this proves that Z^N → Z as N → ∞.
From the definition of Hellinger distance we have
2 d_Hell(µ, µ^N)² = ∫_X ( Z^{−1/2} exp(−½ Φ(u)) − (Z^N)^{−1/2} exp(−½ Φ^N(u)) )² dµ0(u) ≤ I₁ + I₂,
where
I₁ = (2/Z) ∫_X ( exp(−½ Φ(u)) − exp(−½ Φ^N(u)) )² dµ0(u),

I₂ = 2 |Z^{−1/2} − (Z^N)^{−1/2}|² ∫_X exp(−Φ^N(u)) dµ0(u).
with the usual L2 (D) norm and inner product on this subspace of L2per (D). Through-
out this article A denotes the (self-adjoint, positive) Stokes operator on T2 and
P : L2per → H the Leray projector [29, 30]. The operator A is densely defined on
H and is the generator of an analytic semigroup. We denote by {(φk , λk )}k∈K a
complete orthonormal set of eigenfunctions/eigenvalues for A in H. We then define
fractional powers of A by
(3.3) A^α u = Σ_{k∈K} λ_k^α ⟨u, φ_k⟩ φ_k.

For s ≥ 0 we then define the spaces H^s := D(A^{s/2}), with norm ‖u‖_s := ‖A^{s/2}u‖. Of course, H⁰ = H.
If we let ψ = P f , then in the Stokes case ι = 0 we may write (3.1) as an ODE in
Hilbert space H:
(3.5) dv/dt + νAv = ψ,  v(0) = u.
Our aim is to determine the initial velocity field u from Lagrangian data. To be
precise we assume that we are given noisy observations of J tracers with positions zj
solving the integral equations
(3.6) z_j(t) = z_{j,0} + ∫₀ᵗ v(z_j(s), s) ds.
For simplicity assume that we observe all the tracers z at the same set of positive
times {t_k}_{k=1}^K and that the initial particle tracer positions z_{j,0} are known to us:

(3.7) y_{j,k} = z_j(t_k) + η_{j,k},
where the ηj,k ’s are zero mean Gaussian random variables. Concatenating data we
may write
(3.8) y = G(u) + η
with y = (y_{1,1}^∗, . . . , y_{J,K}^∗)^∗ and η ∼ N(0, Γ) for some covariance matrix Γ capturing the
correlations present in the noise. Note that G is a complicated function of the initial
condition for the Stokes equations, describing the mapping from this initial condition
into the positions of Lagrangian trajectories at positive times. We will show that the
function G maps H into R2JK and is continuous on a dense subspace of H.
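A minimal sketch of the Lagrangian observation operator just described, under the simplifying assumptions that the forcing vanishes and that only a handful of Fourier modes are active, so that the Stokes solution decays mode-by-mode like exp(−ν|k|²t); the mode set, time stepping, and noise level are illustrative, not those used in the paper.

```python
import numpy as np

# Sketch of the Lagrangian observation operator G in (3.7)-(3.8): for the Stokes
# flow (3.5) with zero forcing and a few Fourier modes, the velocity decays like
# exp(-nu |k|^2 t); tracers are advected by (3.6) and observed at the times t_k.

nu = 0.05
modes = [((1, 0), 0.8), ((0, 1), -0.5), ((1, 1), 0.3)]   # wavevector k, coefficient u_k (assumed)

def velocity(x, t):
    """Divergence-free field sum_k u_k exp(-nu|k|^2 t) sin(k.x) k_perp/|k|."""
    v = np.zeros(2)
    for (k1, k2), uk in modes:
        k = np.array([k1, k2], dtype=float)
        k_perp = np.array([-k2, k1]) / np.linalg.norm(k)
        v += uk * np.exp(-nu * (k1 ** 2 + k2 ** 2) * t) * np.sin(k @ x) * k_perp
    return v

def observe(z0, obs_times, dt=1e-3, noise_std=0.01, rng=np.random.default_rng(3)):
    """Integrate (3.6) with forward Euler and record noisy positions, cf. (3.7)."""
    z, t, data = np.array(z0, dtype=float), 0.0, []
    for t_obs in obs_times:
        while t < t_obs:
            z = z + dt * velocity(z, t)
            t += dt
        data.append(z + noise_std * rng.standard_normal(2))
    return np.concatenate(data)          # concatenated data vector y, cf. (3.8)

print(observe([0.1, 0.2], obs_times=[0.5, 1.0]))
```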
The objective of the inverse problem is thus to find the initial velocity field u,
given y. We adopt a Bayesian approach, place a prior µ0 (du) on u, and identify
the posterior µ(du) = P(u|y)du. We now spend some time developing the Bayesian
framework, culminating in Theorem 3.3 which shows that µ is well defined. The
reader interested purely in the approximation of µ can skip straight to Theorem 3.4.
The following result shows that the tracer equations (3.6) have a solution, under
mild regularity assumptions on the initial data. An analogous result is proved in [7]
for the case where the velocity field is governed by the Navier–Stokes equation, and
the proof may be easily extended to the case of the Stokes equations.
Theorem 3.1. Let ψ ∈ L2 (0, T ; H) and let v ∈ C([0, T ]; H) denote the solution of
(3.5) with initial data u ∈ H. Then the integral equation (3.6) has a unique solution
z ∈ C([0, T ], R2 ).
We assume throughout that ψ is sufficiently regular that this theorem applies. To
determine a formula for the probability of u given y, we apply the Bayesian approach
described in [5] for the Navier–Stokes equations and easily generalized to the Stokes
equations. For the prior measure we take µ0 = N (0, βA−α ) for some β > 0, α > 1,
with the condition on α chosen to ensure that draws from the prior are in H, by
Lemma A.4. We condition the prior on the observations, to find the posterior measure
on u. The likelihood of y given u is
P(y | u) ∝ exp( −½ |y − G(u)|²_Γ ).
This suggests the formula
(3.9) dµ/dµ0(u) ∝ exp(−Φ(u; y)),
where
(3.10) Φ(u; y) := ½ |y − G(u)|²_Γ
and µ0 is the prior Gaussian measure. We now make this assertion rigorous. The first
step is to study the properties of the forward model G. Proof of the following lemma
is given after statement and proof of the main approximation result, Theorem 3.4.
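A sketch of how a draw from the prior µ0 = N(0, βA^{−α}) might be generated by a truncated Karhunen–Loève expansion, assuming eigenvalues λ_k proportional to |k|² and ignoring constants; the truncation level and parameters are illustrative, and the final comment records the condition s < α − 1 of Lemma A.4.

```python
import numpy as np

# Sketch of a prior draw from mu0 = N(0, beta A^{-alpha}) by truncated Karhunen-Loeve
# expansion: the Stokes operator on the torus has Fourier eigenfunctions with
# eigenvalues lambda_k ~ |k|^2, so a draw has independent coefficients with standard
# deviation sqrt(beta) lambda_k^{-alpha/2}.  Truncation and constants are assumptions.

rng = np.random.default_rng(4)
beta, alpha, N = 1.0, 2.5, 32

def prior_draw_coefficients():
    """Return a dict {k: u_k} of (real) Fourier coefficients of one prior draw."""
    coeffs = {}
    for k1 in range(-N, N + 1):
        for k2 in range(-N, N + 1):
            if (k1, k2) == (0, 0):
                continue                        # k = 0 excluded: K = Z^2 \ {0}
            lam = float(k1 ** 2 + k2 ** 2)      # eigenvalue lambda_k ~ |k|^2
            std = np.sqrt(beta) * lam ** (-alpha / 2.0)
            coeffs[(k1, k2)] = std * rng.standard_normal()
    return coeffs

u = prior_draw_coefficients()
# E ||u||_s^2 = beta * sum_k lambda_k^{s-alpha} is finite iff s < alpha - 1 (Lemma A.4)
energy = sum((k1 ** 2 + k2 ** 2) * v ** 2 for (k1, k2), v in u.items())   # ~ ||u||_1^2
print(len(u), energy)
```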
Lemma 3.2. Assume that ψ ∈ C([0, T]; H^γ) for some γ ≥ 0. Consider the forward model G : H → R^{2JK} defined by (3.7) and (3.8).
• If γ ≥ 0, then for any ℓ ≥ 0 there is a C > 0 such that, for all u ∈ H^ℓ,

|G(u)| ≤ C( 1 + ‖u‖_ℓ ).

• If γ > 0, then for any ℓ > 0 and R > 0 and for all u₁, u₂ with ‖u₁‖_ℓ ∨ ‖u₂‖_ℓ < R, there is an L = L(R) > 0 such that

|G(u₁) − G(u₂)| ≤ L ‖u₁ − u₂‖_ℓ.

Furthermore, for any ε > 0, there is an M > 0 such that L(R) ≤ M exp(εR²).
Thus G satisfies Assumption 2 with X = H^s and any s ≥ 0.
Since G is continuous on H^ℓ for ℓ > 0 and since, by Lemma A.4, draws from µ0
are almost surely in Hs for any s < α − 1, use of the techniques in [5], employing the
Stokes equation in place of the Navier–Stokes equation, shows the following.
Theorem 3.3. Assume that ψ ∈ C([0, T ]; Hγ ), for some γ > 0, and that the
prior measure µ0 = N (0, βA−α ) is chosen with β > 0 and α > 1. Then the measure
µ(du) = P(du|y) is absolutely continuous with respect to the prior µ0 (du), with the
Radon–Nikodým derivative given by (3.9).
In fact, the theory in [5] may be used to show that the measure µ is Lipschitz
in the data y, in the Hellinger metric. This well posedness underlies the following
study of the approximation of µ in a finite dimensional space. We define P^N to
be the orthogonal projection in H onto the subspace spanned by {φ_k}_{|k|≤N}; recall
that k ∈ K := Z²\{0}. Since P^N is an orthogonal projection in any H^a, we have
‖P^N u‖_X ≤ ‖u‖_X. Define G^N(u) := G(P^N u), and let µ^N denote the approximation of µ obtained by replacing G with G^N in (3.10).
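A small sketch of the spectral projection P^N and of the induced approximation G^N(u) = G(P^N u), with Fourier coefficients held in a dictionary keyed by wavenumber; the toy coefficient decay used in the check is an assumption chosen so the projection error is visible.

```python
import numpy as np

# Sketch of the spectral projection P^N (keep Fourier modes with |k| <= N) and of
# the approximate observation operator G^N(u) = G(P^N u) used in Theorem 3.4.

def project(coeffs, N):
    """P^N: discard all modes with |k| > N."""
    return {k: v for k, v in coeffs.items() if k[0] ** 2 + k[1] ** 2 <= N ** 2}

def G_N(G, coeffs, N):
    """Approximate forward map G^N(u) = G(P^N u)."""
    return G(project(coeffs, N))

# toy check that the projection error decays with N for a smooth field (assumed decay)
u = {(k1, k2): 1.0 / (k1 ** 2 + k2 ** 2) ** 2
     for k1 in range(-64, 65) for k2 in range(-64, 65) if (k1, k2) != (0, 0)}
for N in (4, 8, 16):
    kept = project(u, N)
    tail = sum(v ** 2 for k, v in u.items() if k not in kept)
    print(N, np.sqrt(tail))       # ||u - P^N u|| shrinks as N increases
```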
Theorem 3.4. Let the assumptions of Theorem 3.3 hold. Then, for any q < α − 1, there is a constant c > 0, independent of N, such that d_Hell(µ, µ^N) ≤ c N^{−q}. Consequently the mean and covariance operator of µ and µ^N are O(N^{−q}) close in the H and H-operator norms, respectively.
Proof. We set X = Hs for any s ∈ (0, α − 1). We employ Corollary 2.3. Clearly,
since G satisfies Assumption 2 by Lemma 3.2, so too does G N , with constants uniform
in N. It remains to establish (2.8). Write u ∈ H^s as

u = Σ_{k∈K} u_k φ_k,

so that, for any ℓ ∈ (0, s), ‖u − P^N u‖_ℓ ≤ C^{1/2} N^{−(s−ℓ)} ‖u‖_s. By the Lipschitz properties of G from Lemma 3.2 we deduce that, for any ℓ ∈ (0, s),

|G(u) − G(P^N u)| ≤ M exp(ε‖u‖²_ℓ) ‖u − P^N u‖_ℓ ≤ C^{1/2} M exp(ε‖u‖²_s) ‖u‖_s N^{−(s−ℓ)}.
This establishes the desired error bound (2.8). It follows from Corollary 2.3 that µ^N is O(N^{−(s−ℓ)}) close to µ in the Hellinger distance. Choosing s arbitrarily close to its upper bound, and ℓ arbitrarily close to zero, yields the optimal exponent q as appears in the theorem statement.
Proof of Lemma 3.2. Throughout the proof, the constant C may change from instance to instance, but is always independent of the u_i. It suffices to consider a single observation so that J = K = 1. Let z^(i)(t) solve

z^(i)(t) = z_0^(i) + ∫₀ᵗ v^(i)(z^(i)(τ), τ) dτ,

where v^(i) denotes the solution of (3.5) with initial condition u_i.
To prove the first part of the lemma note that, by the Sobolev embedding theorem,
for any s > 1,
|z^(i)(t)| ≤ |z_0^(i)| + ∫₀ᵗ ‖v^(i)(·, τ)‖_{L∞} dτ
 ≤ C ( 1 + ∫₀ᵗ ‖v^(i)(·, τ)‖_s dτ )
 ≤ C ( 1 + ∫₀ᵗ τ^{−(s−ℓ)/2} ‖u_i‖_ℓ dτ ).
For any γ ≥ 0 and ℓ ∈ [0, 2 + γ) we may choose s such that s ∈ [ℓ, 2 + γ) ∩ (1, ℓ + 2).
Thus the singularity is integrable and we have, for any t ≥ 0,

|z^(i)(t)| ≤ C( 1 + ‖u_i‖_ℓ ),
as required.
To prove the second part of the lemma, choose ℓ ∈ (0, 2 + γ) and then choose
s ∈ [ℓ − 1, 1 + γ) ∩ (1, ℓ + 1); this requires γ > 0 to ensure a nonempty intersection.
Then

(3.13) ‖v^(i)(t)‖_{1+s} ≤ C ( t^{−(1+s−ℓ)/2} ‖u_i‖_ℓ + ‖ψ‖_{C([0,T];H^γ)} ).
Now we have
|z^(1)(t) − z^(2)(t)| ≤ |z^(1)(0) − z^(2)(0)| + ∫₀ᵗ |v^(1)(z^(1)(τ), τ) − v^(2)(z^(2)(τ), τ)| dτ
 ≤ ∫₀ᵗ ‖Dv^(1)(·, τ)‖_{L∞} |z^(1)(τ) − z^(2)(τ)| dτ + ∫₀ᵗ ‖v^(1)(·, τ) − v^(2)(·, τ)‖_{L∞} dτ
 ≤ ∫₀ᵗ ‖v^(1)(·, τ)‖_{1+s} |z^(1)(τ) − z^(2)(τ)| dτ + ∫₀ᵗ ‖v^(1)(·, τ) − v^(2)(·, τ)‖_s dτ
 ≤ C ∫₀ᵗ ( τ^{−(1+s−ℓ)/2} ‖u₁‖_ℓ + ‖ψ‖_{C([0,T];H^γ)} ) |z^(1)(τ) − z^(2)(τ)| dτ + C ∫₀ᵗ τ^{−(s−ℓ)/2} ‖u₁ − u₂‖_ℓ dτ.
Both time singularities are integrable and application of the Gronwall inequality from
Lemma A.1 gives, for some C depending on ‖u₁‖_ℓ and ‖ψ‖_{C([0,T];H^γ)},

|z^(1)(t) − z^(2)(t)| ≤ C ‖u₁ − u₂‖_ℓ,

which is the desired Lipschitz bound.
Here, by the number of Fourier modes, we mean the dimension of the Fourier space approximation.
Fig. 3.1. Marginal distributions on Re(u_{0,1}(0)) with differing numbers of Fourier modes.

Fig. 3.2. Marginal distributions on Re(u_{0,1}(0)) with differing numbers of Fourier modes, bicubic interpolation used.
Comparison of Figures 3.1 and 3.2 shows that the approximation (iii) by increased order of interpolation leads
to improved approximation of the posterior distribution, and Figure 3.2 alone again
illustrates Theorem 3.4. Figure 3.3 shows the effect (iv) of reducing the time-step used
in the integration of the Lagrangian trajectories. Note that many more (400) particles
were used to generate the observations leading to this figure than were used in the
preceding two figures. This explains the quantitatively different posterior distribution;
in particular, the variance in the posterior distribution is considerably smaller because
more data are present. The result shows clearly that reducing the time-step leads to
convergence in the posterior distribution.

Fig. 3.3. Marginal distributions on Re(u_{0,1}(0)) with differing time-step, Lagrangian data.
4. Eulerian data assimilation. In this section we consider a data assimilation
problem that is related to weather forecasting applications. In this problem, direct
observations are made of the velocity field of an incompressible viscous flow at a fixed positive time t and at a finite set of spatial points {x_k}_{k=1}^K:

y_k = v(x_k, t) + η_k,  k = 1, . . . , K.
We assume that the noise is Gaussian and the ηk form an i.i.d. sequence with η1 ∼
N (0, γ 2 ). It is known (see Chapter 3 of [29], for example) that for u ∈ H and
f ∈ L²(0, T; H^s) with s > 0 a unique solution to (4.1) exists which satisfies v ∈
L∞(0, T; H^{1+s}) ⊂ L∞(0, T; L∞(D)). Therefore for such an initial condition and forcing
function the value of v at any x ∈ D and any t > 0 can be written as a function of u.
Hence, we can write
y = G(u) + η,
Now consider a Gaussian prior measure µ0 = N(u_b, βA^{−α}) with β > 0 and α > 1; the
second condition ensures that functions drawn from the prior are in H, by Lemma
A.4. In Theorem 3.4 of [5] it is shown that with such prior measure, the posterior
measure of the above inverse problem is well defined.
Theorem 4.1. Assume that f ∈ L2 (0, T, Hs ) with s > 0. Consider the Eulerian
data assimilation problem described above. Define a Gaussian measure µ0 on H, with
mean ub and covariance operator β A−α for any β > 0 and α > 1. If ub ∈ Hα , then
the probability measure µ(du) = P(du|y) is absolutely continuous with respect to µ0
with Radon–Nikodým derivative
(4.3) dµ/dµ0(u) ∝ exp( −(1/(2γ²)) |y − G(u)|²_Σ ).
We approximate G by G^N, computed from the spectral Galerkin approximation (4.4) of the forward model, and then consider the approximate posterior measure µ^N defined via its Radon–Nikodým derivative with respect to µ0:
(4.5) dµ^N/dµ0(u) ∝ exp( −(1/(2γ²)) |y − G^N(u)|²_Σ ).
Our aim is to show that µN converges to µ in the Hellinger metric. Unlike the examples
in the previous section, we are unable to obtain sufficient control on the dependence of
the error constant on u in the forward error bound to enable application of Theorem
2.2; hence we employ Theorem 2.4. In the following lemma we obtain a bound on
%v(t) − v N (t)%L∞ (D) and therefore on |G(u) − G N (u)|. Following the statement of the
lemma, we state and prove the basic approximation theorem for this section. The
proof of the lemma itself is given after the statement and proof of the approximation
theorem for the posterior probability measure.
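A minimal sketch of how the Galerkin approximation G^N might be realized for the Eulerian problem, under the simplifying assumption that the nonlinear term is dropped (so that the truncated dynamics can be written mode-by-mode in closed form); the mode set, observation points, and parameters are illustrative, not those of (4.4).

```python
import numpy as np

# Sketch of a spectral (Galerkin) approximation for the Eulerian problem: keep Fourier
# modes with |k| <= N, evolve them under the Stokes semigroup (the nonlinear term of
# (4.1) is omitted here to keep the sketch exact), and evaluate the velocity at the
# observation points at time t_obs, giving a stand-in for G^N(u).

nu, t_obs = 0.05, 0.5

def G_N(coeffs, N, x_points):
    """Pointwise velocity observations of the truncated (linearized) solution."""
    obs = []
    for x in x_points:
        v = np.zeros(2)
        for (k1, k2), uk in coeffs.items():
            if k1 ** 2 + k2 ** 2 > N ** 2:
                continue                                   # Galerkin truncation
            k = np.array([k1, k2], dtype=float)
            k_perp = np.array([-k2, k1]) / np.linalg.norm(k)
            decay = np.exp(-nu * (k1 ** 2 + k2 ** 2) * t_obs)
            v += uk * decay * np.sin(k @ np.asarray(x)) * k_perp
        obs.append(v)
    return np.concatenate(obs)                             # stacked data vector

u0 = {(1, 0): 0.8, (0, 1): -0.5, (3, 2): 0.2}              # assumed initial coefficients
x_points = [(0.3, 0.4), (1.1, 2.0)]
for N in (1, 2, 4):
    print(N, G_N(u0, N, x_points))    # converges once N exceeds the active wavenumbers
```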
Lemma 4.2. Let v^N be the solution of the Galerkin system (4.4). For any t > t₀ > 0 there is a constant C, depending on ‖u‖ and t₀ but not on N, such that

‖v(t) − v^N(t)‖_{L∞(D)} ≤ C ψ(N),

where ψ(N) → 0 as N → ∞.
The above lemma leads us to the following convergence result for µN .
Theorem 4.3. Let µN be defined according to (4.5) and let the assumptions of
Theorem 4.1 hold. Then
dHell (µ, µN ) → 0
as N → ∞.
Proof. We apply Theorem 2.4 with X = H. Assumption 2 (and hence Assumption
1) is established in Lemma 3.1 of [5]. By Lemma 4.2, ‖v(t) − v^N(t)‖_{L∞(D)} ≤ Cψ(N) with C uniform over bounded sets of initial conditions u, so that |Φ(u) − Φ^N(u)| ≤ K(R)ψ(N) whenever ‖u‖ ≤ R, and the result follows from Theorem 2.4.
Proof of Lemma 4.2. Write v − v^N = e₁ + e₂, where e₁ = v − P^N v and e₂ = P^N v − v^N. Applying the projection P^N to the equation for v gives

dP^N v/dt + νA P^N v + P^N B(v, v) = P^N ψ.
Therefore e₂ = P^N v − v^N satisfies

(4.6) de₂/dt + νAe₂ = P^N B(e₁ + e₂, v) + P^N B(v^N, e₁ + e₂),  e₂(0) = 0.
Since, for any l ≥ 0 and for m > l,

(4.7) ‖e₁‖²_l ≤ N^{−2(m−l)} ‖v‖²_m,
we will obtain an upper bound for ‖e₂‖_{1+l}, l > 0, in terms of the Sobolev norms of e₁ and then use the embedding H^{1+l} ⊂ L∞ to conclude the result of the lemma.
Taking the inner product of (4.6) with e₂, and noting that P^N is self-adjoint, P^N e₂ = e₂, and (B(v, w), w) = 0, we obtain

½ d/dt ‖e₂‖² + ν‖De₂‖² = (B(e₁ + e₂, v), e₂) + (B(v^N, e₁), e₂)
 ≤ c‖e₁‖^{1/2}‖e₁‖_1^{1/2}‖v‖_1‖e₂‖^{1/2}‖e₂‖_1^{1/2} + c‖e₂‖‖v‖_1‖e₂‖_1 + c‖v^N‖^{1/2}‖v^N‖_1^{1/2}‖e₁‖_1‖e₂‖^{1/2}‖e₂‖_1^{1/2}
 ≤ c‖e₁‖²‖e₁‖²_1 + c‖v‖²_1‖e₂‖ + c‖e₂‖²‖v‖²_1 + c‖v^N‖‖v^N‖_1‖e₁‖_1 + c‖e₁‖_1‖e₂‖ + (ν/2)‖e₂‖²_1.
Therefore

d/dt (1 + ‖e₂‖²) + ν‖De₂‖² ≤ c(1 + ‖v‖²_1)(1 + ‖e₂‖²) + c(1 + ‖e₁‖²)‖e₁‖²_1 + c‖v^N‖‖v^N‖_1‖e₁‖_1,
which gives

‖e₂(t)‖² + ν ∫₀ᵗ ‖De₂‖² dτ ≤ c β(t) ( 1 + ∫₀ᵗ ‖v^N‖²‖v^N‖²_1 dτ ) ∫₀ᵗ ‖e₁‖²_1 dτ + c β(t) ∫₀ᵗ (1 + ‖e₁‖²)‖e₁‖²_1 dτ

with

β(t) = exp( c ∫₀ᵗ (1 + ‖v‖²_1) dτ ).
Hence

(4.8) ‖e₂(t)‖² + ν ∫₀ᵗ ‖De₂‖² dτ ≤ c(1 + ‖u‖⁴) e^{c + c‖u‖²} ∫₀ᵗ (1 + ‖e₁‖²)‖e₁‖²_1 dτ.
To estimate ‖e₂(t)‖_s for s < 1, we take the inner product of (4.6) with A^s e₂, 0 < s < 1, and write

½ d/dt ‖e₂‖²_s + ν‖e₂‖²_{1+s} ≤ |( ((e₁ + e₂)·∇)v, A^s e₂ )| + |( (v^N·∇)(e₁ + e₂), A^s e₂ )|.
Using

|( (u·∇)v, A^s w )| ≤ c‖u‖_s‖v‖_1‖w‖_{1+s}
and Young’s inequality we obtain
d/dt ‖e₂‖²_s + ν‖e₂‖²_{1+s} ≤ c(‖e₁‖²_s + ‖e₂‖²_s)‖v‖²_1 + c‖v^N‖²_s (‖e₁‖²_1 + ‖e₂‖²_1).
Now integrating with respect to t over (t₀, t) with 0 < t₀ < t we can write

‖e₂(t)‖²_s + ν ∫_{t₀}^t ‖e₂‖²_{1+s} dτ ≤ ‖e₂(t₀)‖²_s + c sup_{τ≥t₀} ‖v(τ)‖²_1 ∫₀ᵗ (‖e₁‖²_s + ‖e₂‖²_s) dτ + c sup_{τ≥t₀} ‖v^N(τ)‖²_s ∫₀ᵗ (‖e₁‖²_1 + ‖e₂‖²_1) dτ

for t > t₀.
Now we estimate ‖e₂(t)‖_s for s > 1. Taking the inner product of (4.6) with A^{1+l} e₂, 0 < l < 1, we obtain

½ d/dt ‖e₂‖²_{1+l} + ν‖e₂‖²_{2+l} ≤ |( ((e₁ + e₂)·∇)v, A^{1+l} e₂ )| + |( (v^N·∇)(e₁ + e₂), A^{1+l} e₂ )|.
Since (see [5])

( (u·∇)v, A^{1+l} w ) ≤ c‖u‖_{1+l}‖v‖_1‖w‖_{2+l} + c‖u‖_l‖v‖_2‖w‖_{2+l}
and using Young’s inequality, we can write
d/dt ‖e₂‖²_{1+l} + ν‖e₂‖²_{2+l} ≤ c‖e₁‖²_{1+l}‖v‖²_1 + c‖e₁‖²_l‖v‖²_2
 + c‖e₂‖²_{1+l}‖v‖²_1 + c‖e₂‖²_l‖v‖²_2
 + c‖v^N‖²_{1+l}‖e₁‖²_1 + c‖v^N‖²_l‖e₁‖²_2
 + c‖v^N‖²_{1+l}‖e₂‖²_1 + c‖v^N‖_l^{2/l}‖e₂‖²_{1+l}.
Now we integrate the above inequality with respect to t over (t₀/2 + σ, t), with 0 < t₀ < t and 0 < σ < t − t₀/2, and obtain (noting that ‖v^N‖_s ≤ ‖v‖_s for any s > 0)

‖e₂(t)‖²_{1+l} ≤ ‖e₂(t₀/2 + σ)‖²_{1+l} + sup_{τ≥t₀/2} ‖v(τ)‖²_1 ∫_{t₀/2+σ}^t (‖e₁‖²_{1+l} + ‖e₂‖²_{1+l}) dτ
 + sup_{τ≥t₀/2} (‖e₁(τ)‖²_l + ‖e₂(τ)‖²_l) ∫_{t₀/2+σ}^t ‖v‖²_2 dτ
 + sup_{τ≥t₀/2} (‖e₁(τ)‖²_1 + ‖e₂(τ)‖²_1) ∫_{t₀/2+σ}^t ‖v^N‖²_{1+l} dτ
 + sup_{τ≥t₀/2} (1 + ‖v^N(τ)‖_l^{2/l}) ∫_{t₀/2+σ}^t (‖e₁‖²_2 + ‖e₂‖²_{1+l}) dτ.
Since f ∈ L2 (0, T ; H), the above integral tends to zero as N → ∞ and the result
follows.
then
u(t) ≤ α(t) + ∫_c^t α(s)β(s) exp( ∫_s^t β(r) dr ) ds,  t ∈ I.
Furthermore, if (Z, ⟨·, ·⟩) is a Hilbert space and f : X → Z has fourth moments, then

‖E^µ f ⊗ f − E^{µ′} f ⊗ f‖ ≤ 2 ( E^µ ‖f‖⁴ + E^{µ′} ‖f‖⁴ )^{1/2} d_Hell(µ, µ′).
REFERENCES
[25] P. D. Spanos and R. Ghanem, Stochastic finite element expansion for random media, J. Eng.
Mech., 115 (1989), pp. 1035–1053.
[26] P. D. Spanos and R. Ghanem, Stochastic Finite Elements: A Spectral Approach, Dover, New
York, 2003.
[27] A. M. Stuart, Inverse problems: A Bayesian approach, Acta Numer., 19 (2010), to appear.
[28] A. Tarantola, Inverse Problem Theory, SIAM, Philadelphia, PA, 2005.
[29] R. Temam, Navier–Stokes Equations and Nonlinear Functional Analysis, CBMS-NSF Regional
Conference Series in Applied Mathematics 66, SIAM, Philadelphia, PA, 1995.
[30] R. Temam, Navier–Stokes Equations, AMS Chelsea, Providence, RI, 2001.
[31] R. A. Todor and C. Schwab, Convergence rates for sparse chaos approximations of elliptic
problems with stochastic coefficients, IMA J. Numer. Anal., 27 (2007), pp. 232–261.
[32] C. R. Vogel, Computational Methods for Inverse Problems, SIAM, Philadelphia, PA, 2002.
[33] L. W. White, A study of uniqueness for the initialization problem for Burgers' equation, J.
Math. Anal. Appl., 172 (1993), pp. 412–431.