Nothing Special   »   [go: up one dir, main page]

Spectral Theorems in Euclidean and Hermitian Spaces: 12.1 Normal Linear Maps

Download as pdf or txt
Download as pdf or txt
You are on page 1of 44

Chapter 12

Spectral Theorems in Euclidean and


Hermitian Spaces

12.1 Normal Linear Maps

Let E be a real Euclidean space (or a complex Hermitian


space) with inner product u, v 7! hu, vi.

In the real Euclidean case, recall that h , i is bilinear,


symmetric and positive definite (i.e., hu, ui > 0 for all
u 6= 0).

In the complex Hermitian case, recall that h , i is


sesquilinear, which means that it linear in the first argu-
ment, semilinear in the second argument (i.e.,
hu, µvi = µhu, vi), hv, ui = hu, vi, and positive definite
(as above).

631
632 CHAPTER 12. SPECTRAL THEOREMS
p
In both cases we let kuk = hu, ui and the map
u 7! kuk is a norm.

Recall that every linear map, f : E ! E, has an adjoint


f ⇤ which is a linear map, f ⇤ : E ! E, such that
hf (u), vi = hu, f ⇤(v)i,
for all u, v 2 E.

Since h , i is symmetric, it is obvious that f ⇤⇤ = f .

Definition 12.1. Given a Euclidean (or Hermitian) space,


E, a linear map f : E ! E is normal i↵
f f ⇤ = f ⇤ f.
A linear map f : E ! E is self-adjoint if f = f ⇤, skew-
self-adjoint if f = f ⇤, and orthogonal if
f f ⇤ = f ⇤ f = id.
12.1. NORMAL LINEAR MAPS 633

Our first goal is to show that for every normal linear map
f : E ! E (where E is a Euclidean space), there is an
orthonormal basis (w.r.t. h , i) such that the matrix
of f over this basis has an especially nice form:

It is a block diagonal matrix in which the blocks are ei-


ther one-dimensional matrices (i.e., single entries) or two-
dimensional matrices of the form
✓ ◆
µ
µ

This normal form can be further refined if f is self-adjoint,


skew-self-adjoint, or orthogonal.

As a first step, we show that f and f ⇤ have the same


kernel when f is normal.

Proposition 12.1. Given a Euclidean space E, if


f : E ! E is a normal linear map, then
Ker f = Ker f ⇤.
634 CHAPTER 12. SPECTRAL THEOREMS

The next step is to show that for every linear map


f : E ! E, there is some subspace W of dimension 1 or
2 such that f (W ) ✓ W .

When dim(W ) = 1, W is actually an eigenspace for some


real eigenvalue of f .

Furthermore, when f is normal, there is a subspace W of


dimension 1 or 2 such that f (W ) ✓ W and f ⇤(W ) ✓ W .

The difficulty is that the eigenvalues of f are not nec-


essarily real. One way to get around this problem is to
complexify both the vector space E and the inner prod-
uct h , i.

First, we need to embed a real vector space E into a


complex vector space EC.
12.1. NORMAL LINEAR MAPS 635

Definition 12.2. Given a real vector space E, let EC


be the structure E ⇥ E under the addition operation
(u1, u2) + (v1, v2) = (u1 + v1, u2 + v2),
and multiplication by a complex scalar z = x+iy defined
such that
(x + iy) · (u, v) = (xu yv, yu + xv).
The space EC is called the complexification of E.

It is easily shown that the structure EC is a complex


vector space.

It is also immediate that


(0, v) = i(v, 0),
and thus, identifying E with the subspace of EC consist-
ing of all vectors of the form (u, 0), we can write
(u, v) = u + iv.

Given a vector w = u + iv, its conjugate w is the vector


w = u iv.
636 CHAPTER 12. SPECTRAL THEOREMS

Observe that if (e1, . . . , en) is a basis of E (a real vector


space), then (e1, . . . , en) is also a basis of EC (recall that
ei is an abreviation for (ei, 0)).

Given a linear map f : E ! E, the map f can be ex-


tended to a linear map fC : EC ! EC defined such that

fC(u + iv) = f (u) + if (v).

For any basis (e1, . . . , en) of E, the matrix M (f ) rep-


resenting f over (e1, . . . , en) is identical to the matrix
M (fC) representing fC over (e1, . . . , en), where we view
(e1, . . . , en) as a basis of EC.

As a consequence, det(zI M (f )) = det(zI M (fC)),


which means that f and fC have the same character-
istic polynomial (which has real coefficients).

We know that every polynomial of degree n with real (or


complex) coefficients always has n complex roots (counted
with their multiplicity), and the roots of det(zI M (fC))
that are real (if any) are the eigenvalues of f .
12.1. NORMAL LINEAR MAPS 637

Next, we need to extend the inner product on E to an


inner product on EC.

The inner product h , i on a Euclidean space E is ex-


tended to the Hermitian positive definite form h , iC
on EC as follows:

hu1 + iv1, u2 + iv2iC


= hu1, u2i + hv1, v2i + i(hu2, v1i hu1, v2i).

Then, given any linear map f : E ! E, it is easily verified


that the map fC⇤ defined such that
fC⇤ (u + iv) = f ⇤(u) + if ⇤(v)
for all u, v 2 E, is the adjoint of fC w.r.t. h , iC.
638 CHAPTER 12. SPECTRAL THEOREMS

Assuming again that E is a Hermitian space, observe


that Proposition 12.1 also holds. We deduce the following
corollary.

Proposition 12.2. Given a Hermitian space E, for


any normal linear map f : E ! E, we have Ker (f ) \
Im(f ) = (0).

Proposition 12.3. Given a Hermitian space E, for


any normal linear map f : E ! E, a vector u is an
eigenvector of f for the eigenvalue (in C) i↵ u is
an eigenvector of f ⇤ for the eigenvalue .

The next proposition shows a very important property of


normal linear maps: eigenvectors corresponding to dis-
tinct eigenvalues are orthogonal.

Proposition 12.4. Given a Hermitian space E, for


any normal linear map f : E ! E, if u and v are
eigenvectors of f associated with the eigenvalues
and µ (in C) where 6= µ, then hu, vi = 0.
12.1. NORMAL LINEAR MAPS 639

We can also show easily that the eigenvalues of a self-


adjoint linear map are real.

Proposition 12.5. Given a Hermitian space E, the


eigenvalues of any self-adjoint linear map f : E ! E
are real.

There is also a version of Proposition 12.5 for a (real)


Euclidean space E and a self-adjoint map f : E ! E.

Proposition 12.6. Given a Euclidean space E,


if f : E ! E is any self-adjoint linear map, then every
eigenvalue of fC is real and is actually an eigenvalue
of f (which means that there is some real eigenvector
u 2 E such that f (u) = u). Therefore, all the eigen-
values of f are real.

Given any subspace W of a Hermitian space E, recall


that the orthogonal W ? of W is the subspace defined
such that
W ? = {u 2 E | hu, wi = 0, for all w 2 W }.
640 CHAPTER 12. SPECTRAL THEOREMS

Recall that E = W W ? (construct an orthonormal ba-


sis of E using the Gram–Schmidt orthonormalization pro-
cedure). The same result also holds for Euclidean spaces.

As a warm up for the proof of Theorem 12.10, let us


prove that every self-adjoint map on a Euclidean space
can be diagonalized with respect to an orthonormal basis
of eigenvectors.

Theorem 12.7. Given a Euclidean space E of dimen-


sion n, for every self-adjoint linear map f : E ! E,
there is an orthonormal basis (e1, . . . , en) of eigenvec-
tors of f such that the matrix of f w.r.t. this basis is
a diagonal matrix
0 1
1 ...
B C
B . .2 . . . . C ,
@ . . ... . A
... n
with i 2 R.
12.1. NORMAL LINEAR MAPS 641

One of the key points in the proof of Theorem 12.7 is that


we found a subspace W with the property that
f (W ) ✓ W implies that f (W ?) ✓ W ?.

In general, this does not happen, but normal maps satisfy


a stronger property which ensures that such a subspace
exists.

The following proposition provides a condition that will


allow us to show that a normal linear map can be diago-
nalized. It actually holds for any linear map.

Proposition 12.8. Given a Hermitian space E, for


any linear map f : E ! E and any subspace W of E,
if f (W ) ✓ W , then f ⇤ W ? ✓ W ?.

Consequently, if f (W ) ✓ W and f ⇤(W ) ✓ W , then


f W ? ✓ W ? and f ⇤ W ? ✓ W ?.

The above Proposition also holds for Euclidean spaces.


Although we are ready to prove that for every normal
linear map f (over a Hermitian space) there is an or-
thonormal basis of eigenvectors, we now return to real
Euclidean spaces.
642 CHAPTER 12. SPECTRAL THEOREMS

If f : E ! E is a linear map and w = u + iv is an


eigenvector of fC : EC ! EC for the eigenvalue
z = + iµ, where u, v 2 E and , µ 2 R, since
fC(u + iv) = f (u) + if (v)
and
fC(u + iv) = ( + iµ)(u + iv)
= u µv + i(µu + v),

we have
f (u) = u µv and f (v) = µu + v,

from which we immediately obtain

fC(u iv) = ( iµ)(u iv),

which shows that w = u iv is an eigenvector of fC for


z= iµ. Using this fact, we can prove the following
proposition:
12.1. NORMAL LINEAR MAPS 643

Proposition 12.9. Given a Euclidean space E, for


any normal linear map f : E ! E, if w = u + iv is
an eigenvector of fC associated with the eigenvalue
z = + iµ (where u, v 2 E and , µ 2 R), if µ 6= 0
(i.e., z is not real) then hu, vi = 0 and hu, ui = hv, vi,
which implies that u and v are linearly independent,
and if W is the subspace spanned by u and v, then
f (W ) = W and f ⇤(W ) = W . Furthermore, with re-
spect to the (orthogonal) basis (u, v), the restriction of
f to W has the matrix
✓ ◆
µ
.
µ

If µ = 0, then is a real eigenvalue of f and either u


or v is an eigenvector of f for . If W is the subspace
spanned by u if u 6= 0, or spanned by v 6= 0 if u = 0,
then f (W ) ✓ W and f ⇤(W ) ✓ W .
644 CHAPTER 12. SPECTRAL THEOREMS

Theorem 12.10. (Main Spectral Theorem) Given a


Euclidean space E of dimension n, for every normal
linear map f : E ! E, there is an orthonormal basis
(e1, . . . , en) such that the matrix of f w.r.t. this basis
is a block diagonal matrix of the form
0 1
A1 ...
B C
B . A. 2 . . . . C
@ . . ... . A
. . . Ap

such that each block Aj is either a one-dimensional


matrix (i.e., a real scalar) or a two-dimensional ma-
trix of the form
✓ ◆
j µj
Aj =
µj j

where j , µj 2 R, with µj > 0.


12.1. NORMAL LINEAR MAPS 645

After this relatively hard work, we can easily obtain some


nice normal forms for the matrices of self-adjoint, skew-
self-adjoint, and orthogonal, linear maps.

However, for the sake of completeness, we state the fol-


lowing theorem.

Theorem 12.11. Given a Hermitian space E of di-


mension n, for every normal linear map f : E ! E,
there is an orthonormal basis (e1, . . . , en) of eigenvec-
tors of f such that the matrix of f w.r.t. this basis is
a diagonal matrix
0 1
1 ...
B C
B . .2 . . . . C
@ . . ... . A
... n

where j 2 C.

Remark : There is a converse to Theorem 12.11, namely,


if there is an orthonormal basis (e1, . . . , en) of eigenvec-
tors of f , then f is normal.
646 CHAPTER 12. SPECTRAL THEOREMS

12.2 Self-Adjoint, Skew-Self-Adjoint, and Orthogonal


Linear Maps

Theorem 12.12. Given a Euclidean space E of di-


mension n, for every self-adjoint linear map
f : E ! E, there is an orthonormal basis (e1, . . . , en)
of eigenvectors of f such that the matrix of f w.r.t.
this basis is a diagonal matrix
0 1
1 ...
B C
B . .2 . . . . C
@ . . ... . A
... n

where i 2 R.

Theorem 12.12 implies that if 1, . . . , p are the distinct


real eigenvalues of f and Ei is the eigenspace associated
with i, then
E = E1 · · · Ep ,
where Ei and Ej are othogonal for all i 6= j.
12.2. SELF-ADJOINT AND OTHER SPECIAL LINEAR MAPS 647

Theorem 12.13. Given a Euclidean space E of di-


mension n, for every skew-self-adjoint linear map
f : E ! E, there is an orthonormal basis (e1, . . . , en)
such that the matrix of f w.r.t. this basis is a block
diagonal matrix of the form
0 1
A1 ...
B A . . . C
B. . 2 C
@ . . . . . .. A
. . . Ap

such that each block Aj is either 0 or a two-dimensional


matrix of the form
✓ ◆
0 µj
Aj =
µj 0

where µj 2 R, with µj > 0. In particular, the eigen-


values of fC are pure imaginary of the form ±iµj , or
0.
648 CHAPTER 12. SPECTRAL THEOREMS

Theorem 12.14. Given a Euclidean space E of di-


mension n, for every orthogonal linear map
f : E ! E, there is an orthonormal basis (e1, . . . , en)
such that the matrix of f w.r.t. this basis is a block
diagonal matrix of the form
0 1
A1 ...
B A . . . C
B. . 2 C
@ . . . . . .. A
. . . Ap

such that each block Aj is either 1, 1, or a two-


dimensional matrix of the form
✓ ◆
cos ✓j sin ✓j
Aj =
sin ✓j cos ✓j
where 0 < ✓j < ⇡.

In particular, the eigenvalues of fC are of the form


cos ✓j ± i sin ✓j , or 1, or 1.
12.2. SELF-ADJOINT AND OTHER SPECIAL LINEAR MAPS 649

It is obvious that we can reorder the orthonormal basis of


eigenvectors given by Theorem 12.14, so that the matrix
of f w.r.t. this basis is a block diagonal matrix of the
form
0 1
A1 ...
B .. . . . .. .. C
B C
B . . . Ar C
B C
@ Iq A
... Ip
where each block Aj is a two-dimensional rotation matrix
Aj 6= ±I2 of the form
✓ ◆
cos ✓j sin ✓j
Aj =
sin ✓j cos ✓j
with 0 < ✓j < ⇡.

The linear map f has an eigenspace E(1, f ) = Ker (f id)


of dimension p for the eigenvalue 1, and an eigenspace
E( 1, f ) = Ker (f + id) of dimension q for the eigen-
value 1.
650 CHAPTER 12. SPECTRAL THEOREMS

If det(f ) = +1 (f is a rotation), the dimension q of


E( 1, f ) must be even, and the entries in Iq can be
paired to form two-dimensional blocks, if we wish.

Remark : Theorem 12.14 can be used to prove a sharper


version of the Cartan-Dieudonné Theorem.

Theorem 12.15. Let E be a Euclidean space of di-


mension n 2. For every isometry f 2 O(E), if
p = dim(E(1, f )) = dim(Ker (f id)), then f is the
composition of n p reflections and n p is minimal.

The theorems of this section and of the previous section


can be immediately applied to matrices.
12.3. NORMAL AND OTHER SPECIAL MATRICES 651

12.3 Normal, Symmetric, Skew-Symmetric, Orthogo-


nal, Hermitian, Skew-Hermitian, and Unitary Ma-
trices

First, we consider real matrices.


Definition 12.3. Given a real m ⇥ n matrix A, the
transpose A> of A is the n ⇥ m matrix A> = (a> i j)
defined such that
a>
i j = aj i
for all i, j, 1  i  m, 1  j  n. A real n ⇥ n matrix
A is
1. normal i↵
A A> = A>A,
2. symmetric i↵
A> = A,
3. skew-symmetric i↵
A> = A,

4. orthogonal i↵
A A> = A>A = In.
652 CHAPTER 12. SPECTRAL THEOREMS

Theorem 12.16. For every normal matrix A, there


is an orthogonal matrix P and a block diagonal matrix
D such that A = P D P >, where D is of the form
0 1
D1 ...
B D2 . . . C
D = @ . . .. . C
B
. . . . A
. . . Dp

such that each block Dj is either a one-dimensional


matrix (i.e., a real scalar) or a two-dimensional ma-
trix of the form
✓ ◆
j µj
Dj =
µj j

where j , µj 2 R, with µj > 0.


12.3. NORMAL AND OTHER SPECIAL MATRICES 653

Theorem 12.17. For every symmetric matrix A, there


is an orthogonal matrix P and a diagonal matrix D
such that A = P D P >, where D is of the form
0 1
1 ...
B 2 ...
C
D = @ . . .. . C
B
. . . .A
... n

where i 2 R.
654 CHAPTER 12. SPECTRAL THEOREMS

Theorem 12.18. For every skew-symmetric matrix


A, there is an orthogonal matrix P and a block diag-
onal matrix D such that A = P D P >, where D is of
the form
0 1
D1 ...
B D2 . . . C
D = @ . . .. . C
B
. . . . A
. . . Dp

such that each block Dj is either 0 or a two-dimensional


matrix of the form
✓ ◆
0 µj
Dj =
µj 0

where µj 2 R, with µj > 0. In particular, the eigen-


values of A are pure imaginary of the form ±iµj , or
0.
12.3. NORMAL AND OTHER SPECIAL MATRICES 655

Theorem 12.19. For every orthogonal matrix A, there


is an orthogonal matrix P and a block diagonal matrix
D such that A = P D P >, where D is of the form
0 1
D1 ...
B D2 . . . C
D = @ . . .. . C
B
. . . . A
. . . Dp

such that each block Dj is either 1, 1, or a two-


dimensional matrix of the form
✓ ◆
cos ✓j sin ✓j
Dj =
sin ✓j cos ✓j
where 0 < ✓j < ⇡.

In particular, the eigenvalues of A are of the form


cos ✓j ± i sin ✓j , or 1, or 1.

We now consider complex matrices.


656 CHAPTER 12. SPECTRAL THEOREMS

Definition 12.4. Given a complex m ⇥ n matrix A,


the transpose A> of A is the n ⇥ m matrix A> = (a>
i j)
defined such that
a>i j = aj i
for all i, j, 1  i  m, 1  j  n. The conjugate A of
A is the m ⇥ n matrix A = (bi j ) defined such that
b i j = ai j
for all i, j, 1  i  m, 1  j  n. Given an n ⇥ n
complex matrix A, the adjoint A⇤ of A is the matrix
defined such that
A⇤ = (A>) = (A)>.
A complex n ⇥ n matrix A is
1. normal i↵
AA⇤ = A⇤A,
2. Hermitian i↵
A⇤ = A,
3. skew-Hermitian i↵
A⇤ = A,
4. unitary i↵
AA⇤ = A⇤A = In.
12.3. NORMAL AND OTHER SPECIAL MATRICES 657

Theorem 12.11 can be restated in terms of matrices as


follows. We can also say a little more about eigenvalues
(easy exercise left to the reader).

Theorem 12.20. For every complex normal matrix


A, there is a unitary matrix U and a diagonal matrix
D such that A = U DU ⇤. Furthermore, if A is Hermi-
tian, D is a real matrix, if A is skew-Hermitian, then
the entries in D are pure imaginary or null, and if A
is unitary, then the entries in D have absolute value
1.
658 CHAPTER 12. SPECTRAL THEOREMS

12.4 Conditioning of Eigenvalue Problems

The following n ⇥ n matrix


0 1
0
B1 0 C
B C
B 1 0 C
A=B B ... ...
C
C
B C
@ 1 0 A
1 0

has the eigenvalue 0 with multiplicity n.

However, if we perturb the top rightmost entry of A by


✏, it is easy to see that the characteristic polynomial of
the matrix
0 1
0 ✏
B1 0 C
B C
B 1 0 C
B
A(✏) = B C
... ... C
B C
@ 1 0 A
1 0
is X n ✏.
12.4. CONDITIONING OF EIGENVALUE PROBLEMS 659

It follows that if n = 40 and ✏ = 10 40, A(10 40


) has the
eigenvalues ek2⇡i/4010 1 with k = 1, . . . , 40.

Thus, we see that a very small change (✏ = 10 40) to the


matrix A causes a significant change to the eigenvalues of
A (from 0 to ek2⇡i/4010 1 ).
39
Indeed, the relative error is 10 .

Worse, due to machine precision, since very small num-


bers are treated as 0, the error on the computation of
eigenvalues (for example, of the matrix A(10 40)) can be
very large.

This phenomenon is similar to the phenomenon discussed


in Section 6.3 where we studied the e↵ect of a small per-
tubation of the coefficients of a linear system Ax = b on
its solution.
660 CHAPTER 12. SPECTRAL THEOREMS

In Section 6.3, we saw that the behavior of a linear system


under small perturbations is governed by the condition
number cond(A) of the matrix A.

In the case of the eigenvalue problem (finding the eigen-


values of a matrix), we will see that the conditioning of the
problem depends on the condition number of the change
of basis matrix P used in reducing the matrix A to its
diagonal form D = P 1AP , rather than on the condition
number of A itself.

The following proposition in which we assume that A is


diagonalizable and that the matrix norm k k satisfies a
special condition (satisfied by the operator norms k kp
for p = 1, 2, 1), is due to Bauer and Fike (1960).
12.4. CONDITIONING OF EIGENVALUE PROBLEMS 661

Proposition 12.21. Let A 2 Mn(C) be a diagonal-


izable matrix, P be an invertible matrix and, D be a
diagonal matrix D = diag( 1, . . . , n) such that
1
A = P DP ,
and let k k be a matrix norm such that

kdiag(↵1, . . . , ↵n)k = max |↵i|,


1in

for every diagonal matrix. Then, for every perturba-


tion matrix A, if we write

Bi = {z 2 C | |z i|  cond(P ) k Ak},

for every eigenvalue of A + A, we have


n
[
2 Bk .
k=1
662 CHAPTER 12. SPECTRAL THEOREMS

Proposition 12.21 implies that for any diagonalizable ma-


trix A, if we define (A) by
1
(A) = inf{cond(P ) | P AP = D},

then for every eigenvalue of A + A, we have


n
[
2 {z 2 Cn | |z k|  (A) k Ak}.
k=1

The number (A) is called the conditioning of A relative


to the eigenvalue problem.

If A is a normal matrix, since by Theorem 12.20, A can be


diagonalized with respect to a unitary matrix U , and since
for the spectral norm kU k2 = 1, we see that (A) = 1.
12.4. CONDITIONING OF EIGENVALUE PROBLEMS 663

Therefore, normal matrices are very well conditionned


w.r.t. the eigenvalue problem. In fact, for every eigen-
value of A + A (with A normal), we have
n
[
2 {z 2 Cn | |z k|  k Ak2}.
k=1

If A and A+ A are both symmetric (or Hermitian), there


are sharper results; see Proposition 12.27.

Note that the matrix A(✏) from the beginning of the sec-
tion is not normal.
664 CHAPTER 12. SPECTRAL THEOREMS

12.5 Rayleigh Ratios and the Courant-Fischer Theo-


rem

A fact that is used frequently in optimization problems


is that the eigenvalues of a symmetric matrix are charac-
terized in terms of what is known as the Rayleigh ratio,
defined by

x>Ax
R(A)(x) = > , x 2 Rn, x 6= 0.
x x

The following proposition is often used to prove the cor-


rectness of various optimization or approximation prob-
lems (for example PCA).
12.5. RAYLEIGH RATIOS AND THE COURANT-FISCHER THEOREM 665

Proposition 12.22. (Rayleigh–Ritz) If A is a sym-


metric n ⇥ n matrix with eigenvalues 1  2 
· · ·  n and if (u1, . . . , un) is any orthonormal basis
of eigenvectors of A, where ui is a unit eigenvector
associated with i, then
x>Ax
max > = n
x6=0 x x

(with the maximum attained for x = un), and


x>Ax
max >
= n k
x6=0,x2{un k+1 ,...,un }? x x
(with the maximum attained for x = un k ), where
1  k  n 1. Equivalently, if Vk is the subspace
spanned by (u1, . . . , uk ), then
x>Ax
k = max , k = 1, . . . , n.
x6=0,x2Vk x> x
666 CHAPTER 12. SPECTRAL THEOREMS

For our purposes, we need the version of Proposition 12.22


applying to min instead of max, whose proof is obtained
by a trivial modification of the proof of Proposition 12.22.

Proposition 12.23. (Rayleigh–Ritz) If A is a sym-


metric n ⇥ n matrix with eigenvalues 1  2 
· · ·  n and if (u1, . . . , un) is any orthonormal basis
of eigenvectors of A, where ui is a unit eigenvector
associated with i, then
x>Ax
min > = 1
x6=0 x x

(with the minimum attained for x = u1), and


x>Ax
min >
= i
x6=0,x2{u1 ,...,ui 1}
? x x
(with the minimum attained for x = ui), where 2 
i  n. Equivalently, if Wk = Vk? 1 denotes the sub-
space spanned by (uk , . . . , un) (with V0 = (0)), then
x>Ax x>Ax
k = min = min , k = 1, . . . , n.
x6=0,x2Wk x> x ?
x6=0,x2Vk 1 x >x
12.5. RAYLEIGH RATIOS AND THE COURANT-FISCHER THEOREM 667

Propositions 12.22 and 12.23 together are known the


Rayleigh–Ritz theorem.

As an application of Propositions 12.22 and 12.23, we


prove a proposition which allows us to compare the eigen-
values of two symmetric matrices A and B = R>AR,
where R is a rectangular matrix satisfying the equation
R>R = I.

First, we need a definition. Given an n ⇥ n symmetric


matrix A and an m ⇥ m symmetric B, with m  n,
if 1  2  · · ·  n are the eigenvalues of A and
µ1  µ2  · · ·  µm are the eigenvalues of B, then we
say that the eigenvalues of B interlace the eigenvalues of
A if

i  µi  n m+i , i = 1, . . . , m.
668 CHAPTER 12. SPECTRAL THEOREMS

Proposition 12.24. Let A be an n ⇥ n symmetric


matrix, R be an n ⇥ m matrix such that R>R = I
(with m  n), and let B = R>AR (an m⇥m matrix).
The following properties hold:
(a) The eigenvalues of B interlace the eigenvalues of
A.
(b) If 1  2  · · ·  n are the eigenvalues of A and
µ1  µ2  · · ·  µm are the eigenvalues of B, and
if i = µi, then there is an eigenvector v of B with
eigenvalue µi such that Rv is an eigenvector of A
with eigenvalue i.

Proposition 12.24 immediately implies the Poincaré sep-


aration theorem. It can be used in situations, such as
in quantum mechanics, where one has information about
the inner products u>
i Auj .
12.5. RAYLEIGH RATIOS AND THE COURANT-FISCHER THEOREM 669

Proposition 12.25. (Poincaré separation theorem)


Let A be a n⇥n symmetric (or Hermitian) matrix, let
r be some integer with 1  r  n, and let (u1, . . . , ur )
be r orthonormal vectors. Let B = (u>
i Auj ) (an r ⇥ r
matrix), let 1(A)  . . .  n(A) be the eigenvalues
of A and 1(B)  . . .  r (B) be the eigenvalues of
B; then we have

k (A)  k (B)  k+n r (A), k = 1, . . . , r.

Observe that Proposition 12.24 implies that

1 + ··· + m  tr(R>AR)  n m+1 + ··· + n.


670 CHAPTER 12. SPECTRAL THEOREMS

If P1 is the the n ⇥ (n 1) matrix obtained from the


identity matrix by dropping its last column, we have
P1>P1 = I, and the matrix B = P1>AP1 is the ma-
trix obtained from A by deleting its last row and its last
column. In this case, the interlacing result is

1  µ1  2  µ2  · · ·  µn 2  n 1  µn 1  n,

a genuine interlacing.

We obtain similar results with the matrix Pn r obtained


by dropping the last n r columns of the identity matrix
and setting B = Pn> r APn r (B is the r ⇥ r matrix ob-
tained from A by deleting its last n r rows and columns).

In this case, we have the following interlacing inequalities


known as Cauchy interlacing theorem:

k  µk  k+n r , k = 1, . . . , r. (⇤)
12.5. RAYLEIGH RATIOS AND THE COURANT-FISCHER THEOREM 671

Another useful tool to prove eigenvalue equalities is the


Courant–Fischer characterization of the eigenvalues of a
symmetric matrix, also known as the Min-max (and Max-
min) theorem.

Theorem 12.26. (Courant–Fischer ) Let A be a sym-


metric n ⇥ n matrix with eigenvalues 1  2  · · · 
n . If Vk denotes the set of subspaces of R of dimen-
n

sion k, then
x>Ax
k = max min
W 2Vn k+1 x2W,x6=0 x> x
x>Ax
k = min max .
W 2Vk x2W,x6=0 x> x

The Courant–Fischer theorem yields the following useful


result about perturbing the eigenvalues of a symmetric
matrix due to Hermann Weyl.
672 CHAPTER 12. SPECTRAL THEOREMS

Proposition 12.27. Given two n ⇥ n symmetric ma-


trices A and B = A + A, if ↵1  ↵2  · · ·  ↵n are
the eigenvalues of A and 1  2  · · ·  n are the
eigenvalues of B, then
|↵k k|  ⇢( A)  k Ak2 , k = 1, . . . , n.

Proposition 12.27 also holds for Hermitian matrices.

A pretty result of Wielandt and Ho↵man asserts that


n
X
(↵k k)
2
 k Ak2F ,
k=1

where k kF is the Frobenius norm. However, the proof is


significantly harder than the above proof; see Lax [25].

The Courant–Fischer theorem can also be used to prove


some famous inequalities due to Hermann Weyl.
12.5. RAYLEIGH RATIOS AND THE COURANT-FISCHER THEOREM 673

Given two symmetric (or Hermitian) matrices A and B,


let i(A), i(B), and i(A+B) denote the ith eigenvalue
of A, B, and A+B, respectively, arranged in nondecreas-
ing order.

Proposition 12.28. (Weyl) Given two symmetric (or


Hermitian) n ⇥ n matrices A and B, the following in-
equalities hold: For all i, j, k with 1  i, j, k  n:
1. If i + j = k + 1, then
i (A) + j (B)  k (A + B).

2. If i + j = k + n, then
k (A + B)  i (A) + j (B).

In the special case i = j = k, we obtain


1 (A)+ 1 (B)  1 (A+B), n (A+B)  n (A)+ n (B).

It follows that 1 is concave, while n is convex.


674 CHAPTER 12. SPECTRAL THEOREMS

If i = 1 and j = k, we obtain
1 (A) + k (B)  k (A + B),
and if i = k and j = n, we obtain
k (A + B)  k (A) + n (B),

and combining them, we get

1 (A) + k (B)  k (A + B)  k (A) + n (B).

In particular, if B is positive semidefinite, since its eigen-


values are nonnegative, we obtain the following inequality
known as the monotonicity theorem for symmetric (or
Hermitian) matrices:

if A and B are symmetric (or Hermitian) and B is positive


semidefinite, then

k (A)  k (A + B) k = 1, . . . , n.

You might also like