Nothing Special   »   [go: up one dir, main page]

Adjoint Sensitivity Analysis For Differential-Algebraic Equations: The Adjoint Dae System and Its Numerical Solution

Download as pdf or txt
Download as pdf or txt
You are on page 1of 14

c 2003 Society for Industrial and Applied Mathematics

SIAM J. SCI. COMPUT.


Vol. 24, No. 3, pp. 10761089

ADJOINT SENSITIVITY ANALYSIS FOR


DIFFERENTIAL-ALGEBRAIC EQUATIONS: THE ADJOINT
DAE SYSTEM AND ITS NUMERICAL SOLUTION
YANG CAO , SHENGTAI LI , LINDA PETZOLD , AND RADU SERBAN
Abstract. An adjoint sensitivity method is presented for parameter-dependent dierentialalgebraic equation systems (DAEs). The adjoint system is derived, along with conditions for its
consistent initialization, for DAEs of index up to two (Hessenberg). For stable linear DAEs, stability
of the adjoint system (for semi-explicit DAEs) or of an augmented adjoint system (for fully implicit
DAEs) is shown. In addition, it is shown for these systems that numerical stability is maintained for
the adjoint system or for the augmented adjoint system.
Key words. sensitivity analysis, DAE, adjoint method
AMS subject classications. 65L10, 65L99
PII. S1064827501380630

1. Introduction. With the rapid development of faster computers, increasingly


realistic mathematical models are being used to investigate physical phenomena. New
model features often call for parameters whose values may not be accurately known.
Thus there is a need for parametric sensitivity analysis of dierential-algebraic models. Areas of application include optimization, parameter estimation, model simplication, data assimilation, optimal control, process sensitivity, uncertainty analysis,
and experimental design for a wide range of scientic and engineering problems.
Recent work on methods and software for sensitivity analysis of DAE systems
[15, 22, 20, 21, 23] has demonstrated that forward sensitivities can be computed
reliably and eciently via automatic dierentiation [8] in combination with DAE
solution techniques designed to exploit the structure of the sensitivity system. The
DASPK3.0 [20, 21] software package was developed for forward sensitivity analysis
of DAE systems of index up to two and has been used in sensitivity analysis and
design optimization of several large-scale engineering problems [19, 25]. DASPK3.0
is an extension of the DASPK software [9, 10] developed by Brown, Hindmarsh, and
Petzold for the solution of large-scale DAE systems. For a parameter-dependent DAE
system

F (x, x,
t, p) = 0,
(1)
x(0) = x0 (p),
where x Rnx and p Rnp , these problems take the following form: nd dx/dpj
Received by the editors September 3, 2001; accepted for publication (in revised form) June 17,
2002; published electronically January 23, 2003. This work was supported by grants EPRI WO8333-06, NSF CCR 98-96198, NSF/ARPA PC 239415, NSF/KDI ATM-9873133, and LLNL ISCR
00-15.
http://www.siam.org/journals/sisc/24-3/38063.html
Department of Computer Science, University of California, Santa Barbara, CA 93106 (ycao@
engineering.ucsb.edu, petzold@engineering.ucsb.edu).
Theoretical Division, Los Alamos National Laboratory, Los Alamos, NM 87545 (shenli@t7.
lanl.gov).
Center for Applied Scientic Computing, Lawrence Livermore National Laboratory, Livermore,
CA 94551 (radu@llnl.gov). The work of the last author was performed under the auspices of the
U.S. Department of Energy by the University of California, Lawrence Livermore National Laboratory,
under contract W-7405-Eng-48.

1076

ADJOINT SENSITIVITY ANALYSIS FOR DAEs

1077

at time T , for j = 1, . . . , np . Their solution requires the simultaneous solution of


the original DAE system with the np sensitivity systems obtained by dierentiating
the original DAE with respect to each parameter in turn. For large systems this
may look like a lot of work but it can be done eciently, if np is relatively small,
by exploiting the fact that the sensitivity systems are linear and all share the same
Jacobian matrices with the original system.
Some problems require the sensitivities with respect to a large number of parameters. For these problems, particularly if the number of state variables is also large,
the forward sensitivity approach is intractable. These problems can often be handled
more eciently by the adjoint method [14]. In this approach, we are interested in
calculating the sensitivity dG
dp of an objective function

(2)

G(x, p) =

g(x, t, p)dt,

or alternatively the sensitivity dg


dp of a function g(x, T, p) dened only at time T .
The function g must be smooth enough that gp and gx exist and are bounded. The
dg
primary cost in computing dG
dp or dp via the adjoint method is the calculation of the
intermediate quantity , called the adjoint variable, as the solution of the adjoint
system. The adjoint system is a linear DAE system associated with the governing
DAEs (1). While forward sensitivity analysis is best suited to the situation of nding
the sensitivities of a potentially large number of solution variables with respect to a
small number of parameters, reverse (adjoint) sensitivity analysis is best suited to the
complementary situation of nding the sensitivity of a scalar (or small-dimensional)
function of the solution with respect to a large number of parameters.
The specication of the adjoint system for DAEs in the absence of sensitivity
considerations is straightforward. For a linear DAE of the form
(3)

Ax + Bx = 0,

where A and B are suciently smooth matrix functions, the adjoint or dual DAE is
given by
(4)

(A ) B = 0,

where denotes the conjugate transpose and  denotes the time derivative. In related works [3, 4, 5, 7, 6], Balla and M
arz have derived the adjoint system for index-1
DAEs and showed its solvability and index-1 structure under minimal conditions of
smoothness. When the adjoint method is used for computation of the sensitivities, the
adjoint system (4) may become inhomogeneous and have dierent initial conditions,
depending on the form of the objective functions. Our contributions in the present
paper are rst to give a simple derivation using variational methods of the adjoint
system for general nonlinear DAEs and to show how to use it in sensitivity analysis. Then we present conditions for the consistent initialization of the adjoint DAE
system for DAEs of index up to two (Hessenberg). Finally, we show for linear DAEs
that if the original DAE is stable, then the adjoint DAE (for semi-explicit DAEs) or
an augmented adjoint DAE (for fully implicit DAEs) is stable, and that numerical
stability is maintained for the adjoint DAE or the augmented adjoint DAE.
We note that M
arz [24] has shown that for well-formulated DAE systems (these
DAE systems must be written in a specic form), the operations of adjoint and discretization commute. For such systems, stability of the adjoint and its discretization

1078

Y. CAO, S. LI, L. PETZOLD, AND R. SERBAN

is maintained, hence there is no need for an augmented adjoint DAE. There are many
computational advantages for such DAE systems, hence we advocate rewriting the
system into this form whenever possible. However, for large nonlinear problems from
scientic computing this may not always be possible or convenient. Thus we believe
it is important also to consider numerical methods for DAEs of the more general form
(1).
The outline of this paper is as follows. In section 2 we derive the adjoint system by
a variational method, along with conditions for its consistent initialization for index-0
and index-1 DAEs. In section 3 we give some examples and derive the conditions for
the consistent initialization of the adjoint system for Hessenberg index-2 DAEs. In
section 4 we investigate the stability of the adjoint system and show how to formulate
an augmented adjoint system for which stability is maintained even for fully implicit
DAE systems. In section 5 we address the numerical stability of the adjoint and
augmented adjoint DAE systems.
2. Derivation of the adjoint system for sensitivity.
2.1. Sensitivity of G(x, p). We focus rst on solving the sensitivity problem
for G(x, p) dened by (2). Introducing a Lagrange multiplier , we form the augmented objective function
 T
I(x, p) = G(x, p)
F (x, x,
t, p)dt.
0

Since F (x, x,
t, p) = 0, the sensitivity of G with respect to p is
 T
 T
dG
dI
=
=
(5)
(gp + gx xp )dt
(Fp + Fx xp + Fx x p )dt,
dp
dp
0
0
where subscripts on functions such as F or g are used to denote partial derivatives.
By integration by parts, we have
 T
 T

T
Fx x p dt = ( Fx xp )|0
( Fx ) xp dt.
0

Thus (5) becomes


 T
 T
dG
=
(6)
(gp Fp ) dt
[gx + Fx ( Fx ) ] xp dt ( Fx xp )|T0 .
dp
0
0
Now letting
( Fx ) Fx = gx ,

(7)
we obtain
(8)

dG
=
dp


0

(gp Fp ) dt ( Fx xp )|T0 .

Note that xp at t = 0 is the sensitivity of the initial conditions with respect to p,


which is easily obtained. To nd the initial conditions (at t = T ) for the adjoint
system, we must take into consideration the structure of the DAE system.
For index-0 and index-1 DAE systems, we can simply take
(9)

Fx |t=T = 0,

ADJOINT SENSITIVITY ANALYSIS FOR DAEs

yielding the sensitivity equation for


dG
=
dp

(10)

1079

dG
dp

(gp Fp ) dt + ( Fx xp )|t=0 .

This choice will not suce for a Hessenberg index-2 DAE system. Consider the system
 d
x = f 1 (xd , xa , p),
(11)
0 = f 2 (xd , p),
where xd and xa denote the dierential and algebraic solution components, respecf 1
f 1
f 2
tively, A = x
d , B = xa , C = xd , and CB invertible. Here the adjoint system is
given by

+ d A + a C = g d ,
d
x
(12)
d B = gxa ,
where d and a denote the dierential and algebraic adjoint variables, respectively.
If the initial conditions are set as in (9), then d (T ) = 0, which may be in conict
with the constraint equations of (12) if g(x, p) depends explicitly on xa . To resolve
this potential conict, we require
d (T ) = C|t=T ,

(13)

where is yet to be determined. Because f 2 (xd , p) = 0, we have Cxdp = fp2 . Thus


if d satises (13), we have
d xdp |t=T = fp2 |t=T .

(14)

Inserting (13) into the constraint equation (12), we have


CB|t=T = gxa |t=T .
Since CB is invertible, = gxa (CB)1 |t=T , and the boundary condition for d is
d |t=T = gxa (CB)1 C|t=T .

(15)

The sensitivity equation (8) then becomes


dG
=
dp

(16)


gp + d fp1 + a fp2 dt + (d xdp )|t=0 + gxa (CB)1 fp2 .

Thus we have derived the sensitivity equation for dG


dp along with the adjoint DAE
system for and its boundary condition at t = T for index-0, index-1, and index-2
Hessenberg DAE systems.
2.2. Sensitivity of g(x, T, p). Now let us consider the computation of
From
(17)

dg
dp

d dG
dT dp

and (8), we have

dg
= (gp Fp )(T )
dp


0

T Fp dt + (T Fx xp )|t=0

d( Fx xp )
,
dT

dg
dp .

1080

Y. CAO, S. LI, L. PETZOLD, AND R. SERBAN

where T denotes T
. For index-0 and index-1 DAEs,
Hessenberg index-2 DAE system of the form (11),

d( Fx xp )|t=T
dT

= 0. For a


d(gxa (CB)1 fp2 ) 
d( Fx xp )|t=T

=
.

dT
dt
t=T
The corresponding adjoint equations are
(18)

(T Fx ) T Fx = 0.

For index-0 and index-1 DAEs (as shown above, the index-2 case is dierent), to nd
the boundary condition for this equation we write as (t, T ) because it depends on
both t and T . Then
(T, T )Fx |t=T = 0.
Taking the total derivative, we obtain
(t + T ) (T, T )Fx |t=T + (T, T )

dFx
= 0.
dt

we have the boundary condition


Since t is just ,



dFx

+ Fx 
(T Fx )|t=T = (T, T )
(19)
.
dt
t=T
For the index-one DAE case, (7) and (19) yield
(20)

(T Fx )|t=T = [gx Fx ] |t=T .

For the regular implicit ODE case, Fx is invertible; thus we have (T, T ) = 0, which
).
leads to T (T ) = (T
We will discuss the appropriate boundary condition for index-2 Hessenberg DAEs
in the next section.
3. Examples. For simplicity of presentation, throughout this section we assume
that the only dependency of the dierential equations on the parameters p is through
the initial conditions and that the objective function g does not depend explicitly on
p. This is often the case in applications. We also assume that the initial conditions
are consistent with the algebraic constraints (including any hidden constraints) in a
neighborhood of the nominal values of the parameters.
3.1. Standard form ODE. Given the ODE initial value problem

x = f (x, t),
x(0) = x0 (p)
and the function g(x) at t = T , the corresponding adjoint system is

T = fx T ,
T (T ) = gx

and the sensitivity is given by dg


dp = T (0)x0p , where x0p =
with the adjoint system as commonly dened for ODEs.

dx0
dp .

This equation agrees

ADJOINT SENSITIVITY ANALYSIS FOR DAEs

1081

3.2. Implicit ODE. Given




F (x,
x, t) = 0,
x(0) = x0 (p),

F
with A = F
x nonsingular, B = x , and the function g(x) at t = T , the corresponding
adjoint system is


(A T ) B T = 0,
A T (T ) = gx

and the sensitivity is given by

dg
dp

= T (0)A(0)x0p .

3.3. Semi-explicit index-1 DAE. Given

x = f 1 (xd , xa , t),

0 = f 2 (xd , xa ),
d
x (0) = xd0 (p),
1

f
f
f
f
d
with A = x
d , B = xa , C = xd , D = xa nonsingular, and the function g(x ) at
t = T , the corresponding adjoint system is

dT = A dT C aT ,

0 = B dT + D aT ,
d
T (T ) = gxd
dxd

d
0
and the sensitivity is given by dg
dp = T (0) dp .
d
If the function g depends on both x and xa , g = g(xd (T ), xa (T )), the adjoint
equations are the same as above, but the boundary condition is now given by

dT (T ) = gxd C (D )1 gxa .
3.4. Hessenberg index-2 DAE. Given

x d = f 1 (xd , xa , t),
0 = f 2 (xd ),
d
x (0) = xd0 (p),
1

f
f
f
with A = x
d , B = xa , C = xd , and CB invertible, and given that the function
g(x, T, p) depends only on the dierential components xd of x, the corresponding
adjoint system is

dT = A dT C aT ,

0 = B dT ,
d
T (T ) = (I C (B C )1 B )gxd
dxd

d
0
and the sensitivity is given by dg
dp = T (0) dp .
d
a
If g(x, T, p) depends on both x and x , we must solve for the boundary condition
rst, from

d + A d + C a = gxd ,
(21)
0 = B d + gxa ,

d (T ) = C (B C )1 gxa .

1082

Y. CAO, S. LI, L. PETZOLD, AND R. SERBAN

The adjoint system is

dT = A dT C aT ,
0 = B dT .

As in (15), we have d |t=T = gxa |t=T (CB)1 C. If B and C are constant, we


can take the total derivative to obtain

dgxa 
d
d
(t + T ) (T, T ) =
(CB)1 C,
dt 
t=T

so that
Td (T, T )


dgxa 
d

= |t=T
(CB)1 C.
dt t=T

Substituting for d from (21), we obtain



dT (T, T ) = (I C (B C )1 B ) gxd |t=T + A d (T, T ) ,

where d (T, T ) = C (B C )1 gxa |t=T . Denoting P = I B(CB)1 C, dT (T, T )


can be expressed as
dT (T, T ) = P (gxd + A d (T, T )).
If B and C are not constant, we take the derivative of
d |t=T = gxa (CB)1 C|t=T
to obtain
(dt
or

dT ) (T, T )


d((CB)1 C)
dgxa 
1
a |t=T
=
(CB)
C

g
x
dt t=T
dt

dT |t=T = d C (B C )1



dgxa 
d(C (B C )1 ) 
g

.
xa 
dt t=T
dt
t=T

From




d(C (B C )1 )
dC 1

1 dB

dC
=
(B C ) C (B C )
C +B
(B C )1 ,
dt
dt
dt
dt

we obtain
dT |t=T



dB 1
dC 1
d

1 dgxa

C (B C ) gxa P
(B C ) gxa .
= C (B C )
dt
dt
dt

But we know that at t = T , d = C (B C )1 gxa . Thus




dB d
dgxa
dC 1
+
P
(B C ) gxa .
dT |t=T = d C (B C )1
dt
dt
dt
Taking the derivative of B d + gxa = 0, we have


dB d
dgxa
+
.
B d =
dt
dt
Substituting for d , nally we obtain the boundary condition for dT :


dC 1
d

d
T (T, T ) = P gxd + A (T, T )
(B C ) gxa .
dt

ADJOINT SENSITIVITY ANALYSIS FOR DAEs

1083

4. Stability. In this section we investigate the stability of the adjoint system.


That is, if the original system is stable, will the adjoint system also be stable? If we
just consider the adjoint equation (7), it may not be stable. Consider the following
example:
(22)

1
et x + et x = 0.
2

This system is equivalent to


1
x + x = 0,
2
so it is stable. But the adjoint system (7) for (22) is
(23)

1
et et + et = 0,
2

which is equivalent to
1
+ = 0.
2

(24)

Note that we solve the adjoint system in the backward direction. Thus the adjoint
system (24) is unstable.
= F , we can form the augmented adjoint system for (7):
Denoting
x

(25)

F = g ,

x
x
F = 0.

satises
If we solve the augmented adjoint system (25) instead of (24),
(26)

1
= 0,

which is stable in the backward direction. We will show that in general the aug is stable if the original system is stable. Thus in the
mented adjoint system (25) for
implementation [12] we solve (25) instead of (7).
We show rst that the adjoint system (7) for explicit ODE, semi-explicit index-1
DAE, and Hessenberg index-2 DAE systems preserves the stability. In these cases,
Then by bounded transformation, we show
there is no dierence between and .
is also preserved for
that the stability of the augmented adjoint system (25) for
implicit ODE and general index-1 DAEs.
4.1. Explicit ODE. For a linear ODE x = A(t)x, the corresponding homogeneous adjoint system is
= A .
Since the adjoint system is solved backward, we can do a change of variable = T t.
The adjoint system, now to be solved forward, is transformed to
= A .
Then we have the following well-known result.

1084

Y. CAO, S. LI, L. PETZOLD, AND R. SERBAN

Theorem 4.1. If the ODE system x = A(t)x is stable, the adjoint ODE system
is also stable.
Proof. If A is constant, the result follows from the fact that A and A have the
same eigenvalues. If A is not constant, we look at the Greens function. Let X and
be the fundamental solutions of the original ODE and the adjoint system, respectively.
The original system is stable if and only if X(s)X 1 (t) is bounded for s > t. The
adjoint system is stable if and only if (s)1 (t) is bounded for s > t.
Let Z(t) = X (T t)(t). Then

= X A + X A = 0.
Z = X (T t)(t) + X (T t)(t)
Thus Z(t) is constant, so that
X (T t)(t) = X (T s)(s).
Then


(s)1 (t) = X(T t)X 1 (T s) .
Finally, note that s > t leads to T t > T s.
4.2. Semi-explicit index-1 DAE. Consider the linear semi-explicit index-1
DAE system
 d
x = A(t)xd + B(t)xa ,
0 = C(t)xd + D(t)xa ,
where D(t) is invertible. The corresponding adjoint system is
 d
= A d C a ,
0 = B d + D a .
The original system is stable if and only if the essential underlying ODE (EUODE)
x d = Axd B(D)1 Cxd is stable [1]. The corresponding EUODE for the adjoint
system is given by d = A d + C (D )1 B d , or d = (A B(D)1 C) d .
Thus the EUODE of the adjoint system is the adjoint of the EUODE of the original
system. We know from Theorem 4.1 that the EUODE of the adjoint system is stable.
Thus the adjoint system is also stable.
4.3. Hessenberg index-2 DAE. Consider the linear Hessenberg index-2 DAE
system
 d
x = A(t)xd + B(t)xa + q(t),
0 = C(t)xd + r(t),
where C(t)B(t) is nonsingular. The corresponding EUODE is derived in [1] as follows. If B is suciently smooth, there exists a smooth bounded matrix function
R(t) R(nd na )nd whose linearly independent, normalized rows form a basis for the
nullspace of B . Here nd is the dimension of the dierential variables xd and na is
the dimension of the algebraic variables xa . Thus R(t)B(t) = 0 and


R
is invertible.
C

1085

ADJOINT SENSITIVITY ANALYSIS FOR DAEs

Dening new variables


u = Rxd ,

(27)

0 t T,

xd is given by
(28)

x =

R
C

1

u
r


= Su F r,

where S(t) Rnd (nd na ) satises


(29)

RS = I,

CS = 0.

The corresponding EUODE is


(30)

)r(t) + Rq(t).
u = (RAS + RS)u
(RAF RF

The original system is stable for the dierential variables if the EUODE (30) is stable.
Now consider the adjoint system
 d
= A (t)d C (t)a + q(t),
0 = B (t)d + r(t).
Since CS = 0, we have S C = 0. Multiplying the adjoint system by S (t), we obtain
(31)

S = S A (t) + S q(t).

Letting v = S , we have



v
S
= R
(32)
d =

r
B

C (B C )1


.

Thus the EUODE of the adjoint DAE is


(33)

v = S R v S C (B C )1 r S A R v + S A C (B C )1 r.

RS = I leads to S R = S R , so the homogeneous part of the EUODE is given by


(34)

+ RAS) v.
v = (S R + S A R )v = (RS

We can see that the EUODE of the adjoint system is the same as the adjoint of
the EUODE of the original system. Thus from Theorem 4.1 we know that the adjoint system for a stable linear Hessenberg index-2 DAE is stable for the dierential
variables.
4.4. Fully implicit DAE. For implicit ODE and index-1 DAEs, we have already seen from the example (24) that the adjoint DAE (7) for may be unstable.
preserves the stability.
But we will show that the augmented adjoint system (25) for
To prove that, we rst refer to a lemma from Campbell, Bichols, and Terrel [11] that
adjoint commutes with any nonsingular linear transformation.
Lemma 4.2. Given the time-dependent linear DAE system
(35)

A(t)x + B(t)x = f (t)

1086

Y. CAO, S. LI, L. PETZOLD, AND R. SERBAN

and nonsingular time-dependent dierentiable matrices P (t) multiplying the equations


of the DAE and Q(t) transforming the variables, the adjoint system of the transformed
DAE is the transformed system of the adjoint DAE.
Lemma 4.2 is valid for any P and Q. When we consider stability, we require
that the transformation matrix be bounded for all t. First, the matrix Q should
be bounded to ensure that the variable change x = Q(t)
x will not alter the stability.
will not
Second, the matrix P should be bounded so that the variable change = P
change the stability. But from the example (22), we can see that in the case that A(t)
is unbounded but the solution is still bounded, P and Q cannot both be bounded.
Thus we require that Q be bounded. Instead of requiring that P be bounded, we
= A P
will not alter the
require that P A be bounded. Then the variable change
stability. Thus, the augmented adjoint system (25) preserves the stability.
To illustrate this point, consider again the example (22). The transformation
as
matrix P is et , which is unbounded in the backward direction. If we set = P
t
in Theorem 4.3, will not be bounded. But A = e , so P A is the identity. If we let
= A = A P ,

will have the same stability as .
In general, the transformation

in the adjoint (7) may not be bounded but the transformation in the augmented
will be. Thus we have the following theorem.
adjoint system (25) for
Theorem 4.3. For general index-0 and index-1 DAE systems, if the original
is stable.
DAE system is stable, the augmented adjoint DAE system (25) for
Proof. Using the smooth SVD [13] of A(t), we can construct orthogonal matrices
Q and P so that


0
(36)
P (t)A(t)Q(t) =
,
0 0

 1
0

P (t), we have
where = diag{1 , . . . , k }, i = 0. Dening P (t) =
0
I


I 0
(37)
P (t)A(t)Q(t) =
.
0 0
Let x = Q
x. Since Q is orthogonal, if the original system is stable, the transformed
DAE system is also stable. From the conclusion for the explicit cases, we know that
the adjoint of the transformed DAE system is also stable. Now, instead of letting
we let
= A = (P A) .
Since
= P ,


I 0
PA =
Q ,
0 0
is
it follows that P A is bounded. Thus the augmented adjoint system (25) for
stable.
Remark. From the above proof, we can see that if P is bounded, is bounded.
P will be bounded if the pseudoinverse of A is bounded.
5. Numerical stability. In this section we consider numerical stability for the
adjoint system. For systems of index-0 or index-1, we consider the general linear DAE
(38)

A(t)x(t)
+ B(t)x(t) = 0.

The corresponding adjoint system, which is solved in the reverse direction, is


(39)

(A ) B = 0.

ADJOINT SENSITIVITY ANALYSIS FOR DAEs

1087

Zero-stability, the stability as the stepsize h 0 and the number of steps n


on a xed time interval, has been thoroughly investigated for ODEs, index-1, and
Hessenberg index-2 DAEs. See [18, 1, 16, 17], where it is shown that numerical ODE
methods such as the backward Euler method when applied to well-posed DAEs of
those classes are zero-stable and convergent. Zero-stability for the adjoint system
follows immediately for ODEs. For index-1 DAEs, zero-stability is a consequence
of the index-1 structure of the adjoint system [7]. It is simple to verify that the
adjoint system for an index-2 Hessenberg DAE is index-2 Hessenberg, from which
zero-stability follows immediately.
Asymptotic numerical stability, the stability for a xed stepsize h and n ,
has been thoroughly studied for ODE and DAE systems [18, 2, 1, 16, 17]. Here we
are concerned with the question, If a numerical method with a given stepsize is stable
for the original system (38), will it also be stable for the adjoint system (39)? We
consider asymptotic numerical stability for the backward Euler method.
Discretizing the original system (38), we obtain


xn+1 xn
An+1
+ Bn+1 xn+1 = 0,
(40)
h
which leads to
(41)

xn+1 = [An+1 + hBn+1 ]

An+1 xn .

For the adjoint system (39), we have (solving backward)




An+1 n+1 An n
Bn n = 0,
(42)
h
which leads to
(43)

n = [An + hBn ]

An+1 n+1 .

Thus we have





1 
1
1

0 = (A0 + hB0 ) A1 (An + hBn ) An+1 AN 1 + hBN


AN N
1






1
1
1

= (A0 + hB0 ) An (An + hBn )


AN 1 AN 1 + hBN
AN N .
1
Taking the transpose, we obtain
(44) 0 = N AN (AN 1 + hBN 1 )1 AN 1 (A1 + hB1 )1 A1 (A0 + hB0 )1 .
Let
CN = (AN 1 + hBN 1 )1 AN 1 (A1 + hB1 )1 A1 .
The numerical stability of the original system implies that CN is bounded. From (44)
we have for the adjoint system
(45)

0 = N AN CN (A0 + hB0 )1 .

Thus it follows for all linear ODEs and DAEs that asymptotic numerical stability
of the backward Euler method for the original system implies asymptotic numerical
stability for the adjoint system.

1088

Y. CAO, S. LI, L. PETZOLD, AND R. SERBAN

Note that the backward Euler discretization (42) which has been assumed in the
stability analysis is not the same as the discretization which would normally be used
in software like DASPK3.0 for solving the adjoint system (39). The dierence occurs
for systems that are fully implicit. In order to solve (39) using DASPK3.0, we would

need to rst rewrite (39) to isolate the time derivative :


(46)

A + A  B = 0.

Discretization by the backward Euler method would then yield (solving backward)


n+1 n
+ An  n Bn n = 0,
An
(47)
h
which diers from the discretization (42) and in general does not preserve the asymptotic numerical stability of the forward system.
Consider example (22), whose backward Euler discretization leads to

(48)

xn+1 =

1
1+ h
2

1
xn ,

which is stable for any h > 0. The adjoint system is


(49)

1
et et + et = 0.
2

The corresponding backward Euler discretization is given by




n+1 n
1
tn
e
+ etn n etn n = 0,
(50)
h
2
which is equivalent to

(51)

n =

1
1 h
2

1
n+1 ,

which is unstable for any 0 < h < 4. Thus this discretization is not stable using
stepsizes for which the discretization of the forward problem is stable.
The discretization (42), which as we have seen preserves the numerical stability,
can be achieved in DAE software by solving the augmented adjoint system (25).
Acknowledgments. The authors would like to thank Steve Lee, Peter Brown,
Keith Grant, Luc Machiels, and Alan Hindmarsh of Lawrence Livermore National
Laboratory for bringing this problem to our attention and for several interesting
discussions.
REFERENCES
[1] U. M. Ascher and L. R. Petzold, Stability of computational methods for constrained dynamics systems, SIAM J. Sci. Comput., 14 (1993), pp. 95120.
[2] U. M. Ascher and L. R. Petzold, Computer Methods for Ordinary Dierential Equations
and Dierential-Algebraic Equations, SIAM, Philadelphia, 1998.
[3] K. Balla, Linear subspaces for linear DAEs of index 1, Comput. Math. Appl., 32 (1996),
pp. 8186.
[4] K. Balla, Boundary conditions and their transfer for dierential algebraic equations of index
1, Comput. Math. Appl., 31 (1996), pp. 15.

ADJOINT SENSITIVITY ANALYSIS FOR DAEs

1089

rz, Transfer of boundary conditions for DAEs of index 1, SIAM J. Numer.


[5] K. Balla and R. Ma
Anal., 33 (1996), pp. 23182332.
rz, An Unied Approach to Linear Dierential Algebraic Equations and
[6] K. Balla and R. Ma
Their Adjoint Equations, Institute of Mathematics Technical report, Humboldt University,
Berlin, 2000.
rz, Linear dierential algebraic equations of index 1 and their adjoint
[7] K. Balla and R. Ma
equations, Results in Mathematics, 37 (2000), pp. 1335.
[8] C. Bischof, A. Carle, G. Corliss, A. Griewank, and P. Hovland, ADIFOR - Generating
derivative codes from Fortran programs, Scientic Programming, 1 (1992), pp. 1129.
[9] K. E. Brenan, S. L. Campbell, and L. R. Petzold, Numerical Solution of Initial-Value
Problems in Dierential-Algebraic Equations, SIAM, Philadelphia, 1995.
[10] P. N. Brown, A. C. Hindmarsh, and L. R. Petzold, Using Krylov methods in the solution of
large-scale dierential-algebraic systems, SIAM J. Sci. Comput., 15 (1994), pp. 14671488.
[11] S. L. Campbell, N. K. Bichols, and W. J. Terrel, Duality, observability and controllability
for linear time-varying systems, Circuits, Systems, Signal Process, 10 (1991), pp. 455470.
[12] Y. Cao, S. Li, and L. R. Petzold, Adjoint sensitivity analysis for dierential-algebraic equations: Algorithms and software, J. Comput. Appl. Math., to appear.
[13] L. Dieci and T. Eirola, On smooth decompositions of matrices, SIAM J. Matrix Anal. Appl.,
20 (1999), pp. 800819.
[14] R. M. Errico, What is an adjoint model?, Bulletin of the American Meteorological Society,
78 (1997), pp. 25772591.
[15] W. F. Feehery, J. E. Tolsma, and P. I. Barton, Ecient sensitivity analysis of large-scale
dierential-algebraic systems, Appl. Numer. Math., 25 (1997), pp. 4154.
[16] E. Hairer, S. P. Norsett, and G. Wanner, Solving Ordinary Dierential Equations I,
SpringerVerlag, Berlin, 1987.
[17] E. Hairer and G. Wanner, Solving Ordinary Dierential Equations II, Sti and DierentialAlgebraic Problems, SpringerVerlag, Berlin, 1991.
rz, On asymptotics in case of linear index-2 dierential[18] M. Hanke, E. I. Macana, and R. Ma
algebraic equations, SIAM J. Numer. Anal., 35 (1998), pp. 13261346.
[19] D. Knapp, V. Barocas, K. Yoo, L. R. Petzold, and R. Tranquillo, Rheology of reconstituted type I collagen gel in conned compression, J. Rheology, 41 (1997), pp. 971993.
[20] S. Li and L. R. Petzold, Software and algorithms for sensitivity analysis of large-scale
dierential-algebraic systems, J. Comput. Appl. Math., 125 (2000), pp. 131145.
[21] S. Li and L. R. Petzold, Design of New DASPK for Sensitivity Analysis, Technical report,
Dept. of Computer Science, UCSB, Santa Barbara, CA, 1999.
[22] S. Li, L. R. Petzold, and W. Zhu, Sensitivity analysis of dierential-algebraic equations: A
comparison of methods on a special problem, Appl. Numer. Math., 32 (2000), pp. 161174.
[23] T. Maly and L. R. Petzold, Numerical methods and software for sensitivity analysis of
dierential-algebraic systems, Appl. Numer. Math., 20 (1997), pp. 5779.
rz, Dierential Algebraic Systems Anew, Institute of Mathematics Technical Report,
[24] R. Ma
Humboldt University, Berlin, 2000.
[25] L. L. Raja, R. J. Kee, R. Serban, and L. R. Petzold, Computational algorithm for dynamic optimization of chemical vapor deposition processes in stagnation ow reactors, J.
Electrochemical Soc., 147 (2000), pp. 27182726.

You might also like