
MA322

Notes on Eigenvalues

Fall 2013

1 Introduction
In these notes, we start with the definition of eigenvectors in abstract vector spaces and follow with
the more common definition of eigenvectors of a square matrix.
Then we discuss the diagonalization problem for a linear transformation.
Finally, we discuss all the cases of eigenvectors of 2 × 2 matrices.

2 Eigenvectors and Eigenvalues in abstract spaces.


Let V be a vector space over a field K. Let L be a linear transformation from V to itself.
A scalar λ ∈ K is said to be an eigenvalue for L if there is a non zero vector v such that
L(v) = λv.
A vector v ∈ V is said to be an eigenvector for L if it satisfies two conditions:

1. v ≠ 0.

2. L(v) = λv for some λ ∈ K.

When the above conditions are satisfied, we get that λ is an eigenvalue for L, and we will describe this by saying v belongs to the eigenvalue λ.

1. Examples.
 
(a) Let
$$A = \begin{pmatrix} 5 & 2 & 1 \\ 0 & 3 & -1 \\ 0 & 0 & 1 \end{pmatrix}.$$
Define a transformation L from ℝ³ to ℝ³ by L(v) = Av. Note that
$$\begin{pmatrix} 5 & 2 & 1 \\ 0 & 3 & -1 \\ 0 & 0 & 1 \end{pmatrix} \begin{pmatrix} 1 \\ 0 \\ 0 \end{pmatrix} = \begin{pmatrix} 5 \\ 0 \\ 0 \end{pmatrix} = 5 \begin{pmatrix} 1 \\ 0 \\ 0 \end{pmatrix}.$$
This shows that the vector v = (1, 0, 0)ᵀ is an eigenvector of L belonging to the eigenvalue 5.
It is also clear that any non zero multiple of v also has the same property.
It is possible to show that λ = 3 and λ = 1 are also eigenvalues for L. What are eigenvectors
belonging to them?
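For a quick numerical check of this example (a sketch using NumPy; the notes themselves do the computation by hand), we can ask for all eigenvalues and eigenvectors of A at once:

```python
import numpy as np

# The matrix from example (a).
A = np.array([[5.0, 2.0,  1.0],
              [0.0, 3.0, -1.0],
              [0.0, 0.0,  1.0]])

# np.linalg.eig returns the eigenvalues and a matrix whose
# columns are corresponding (unit-length) eigenvectors.
eigenvalues, eigenvectors = np.linalg.eig(A)
print(eigenvalues)  # e.g. [5. 3. 1.]

for lam, v in zip(eigenvalues, eigenvectors.T):
    # Each pair should satisfy A v = lam v up to rounding error.
    assert np.allclose(A @ v, lam * v)
```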
(b) Consider a linear transformation T : P₃ → P₃ defined by T(p(x)) = xp′(x) − 3p′(x).
Verify that the polynomials 1, (x − 3), (x − 3)², (x − 3)³ are all eigenvectors belonging to different eigenvalues.
For example,
$$T((x-3)^2) = x \cdot 2(x-3) - 3 \cdot 2(x-3) = 2(x-3)(x-3) = 2(x-3)^2.$$
Thus, (x − 3)² is an eigenvector belonging to the eigenvalue 2.
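All four polynomials can be checked the same way. Here is a sketch using SymPy; the helper T below is our own encoding of the transformation, not something from the notes:

```python
import sympy as sp

x = sp.symbols('x')

def T(p):
    # The transformation from example (b): T(p) = x*p' - 3*p'.
    return sp.expand(x * sp.diff(p, x) - 3 * sp.diff(p, x))

for k in range(4):
    p = sp.expand((x - 3)**k)
    # T((x-3)^k) should equal k*(x-3)^k, i.e. eigenvalue k.
    assert sp.simplify(T(p) - k * p) == 0
```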

(c) Consider a linear transformation S on the space of twice differentiable real functions given
by S(y) = y 00 .
Verify that the eigenvectors for S are the exponential functions exp(rx) for various values
of r. What eigenvalue does exp(rx) belong to?
For example exp(3x) belongs to the eigenvalue 9.
2. Eigenspaces. Let V be a vector space with a linear transformation L from V to itself. Let λ be
any scalar in K.
We define the subspace Vλ to be the space of all vectors v such that L(v) = λv.
Note that we are not requiring the vector v to be non zero here!
• Clearly, every non zero vector in Vλ is an eigenvector for L belonging to the eigenvalue λ.
• Note that λ is an eigenvalue for L iff Vλ is not just the zero vector space.
• We define Vλ to be the eigenspace belonging to λ if it is not just the zero vector space (i.e. λ is an eigenvalue).
• If V = ℝⁿ and L is defined by an n × n matrix A as L(X) = AX, then the space Vλ = Nul(A − λI).
Thus, for such transformations, the eigenvalues λ can be identified as the scalars λ for which A − λI is singular, i.e. det(A − λI) = 0 (see the sketch below).
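For instance, here is a sketch of this computation with SymPy (not part of the original notes) for the matrix of example (a); nullspace() returns a basis for Nul(A − λI):

```python
import sympy as sp

A = sp.Matrix([[5, 2,  1],
               [0, 3, -1],
               [0, 0,  1]])

for lam in (5, 3, 1):
    # A basis of the eigenspace V_lam = Nul(A - lam*I).
    basis = (A - lam * sp.eye(3)).nullspace()
    print(lam, [list(v) for v in basis])
```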
3. Examples of eigenspaces.
• For the transformation L in example (a) above, there are three non zero eigenspaces. For the eigenvalue 5, the space V₅ is the one dimensional space spanned by w₁ = (1, 0, 0)ᵀ. It is also described as Nul(A − 5I).
Similarly, for the eigenvalue 3 we calculate V₃ = Nul(A − 3I) and find its basis to be w₂ = (−1, 1, 0)ᵀ. Finally, for the eigenvalue 1 we get V₁ with basis w₃ = (−1, 1, 2)ᵀ.
Later on we shall see that the resulting set of three vectors {w₁, w₂, w₃} is linearly independent and hence a basis for the space ℝ³.
• In the example (b) above, we find the eigenspaces by solving the equation T(p) = xp′ − 3p′ = λp for the various eigenvalues. Unlike example (a), we don't yet have the luxury of finding Null spaces of a suitable matrix.
The solution to xp′ − 3p′ = λp can be seen to be p = (x − 3)^λ, and thus we get the eigenvalues 0, 1, 2, 3. The resulting four polynomials 1, (x − 3), (x − 3)², (x − 3)³ are independent and form a basis of their respective eigenspaces. The four vectors together form a basis of P₃.
• For the example (c) above, we have to work as in example (b) and solve S(y) = y″ = λy.
You should verify that for y = exp(rx) we get y″ − λy = (r² − λ) exp(rx), so for λ = 0, 1, 4, 9 we get the respective eigenvectors exp(0) = 1, exp(x), exp(2x), exp(3x).
Even though these are independent and form a basis of their respective eigenspaces, they do not give a basis of our vector space, since the vector space itself is infinite dimensional.
Note: Before long, we will learn how to compute the eigenspaces by evaluating the Null space of a suitable square matrix.

• Notations. Let L : V → V be a linear transformation. Let B = (v₁ v₂ · · · vᵣ) be a sequence of vectors in V which we treat as if it is a generalized row vector.
If we have a column vector a = (a₁, a₂, · · · , aᵣ)ᵀ, then we shall give the natural meaning to
$$Ba = a_1 v_1 + a_2 v_2 + \cdots + a_r v_r.$$
Also, if M is any matrix with r rows, then BM similarly makes sense.
Then, we will define L(B) = (L(v₁) L(v₂) · · · L(vᵣ)) as the image of the generalized vector.
Important Observation. In this notation, suppose that B = (v₁ v₂ · · · vᵣ) is a sequence of eigenvectors of L belonging to the eigenvalues λ₁, λ₂, · · · , λᵣ. Then we get a natural equation:
$$L(B) = (\lambda_1 v_1\ \ \lambda_2 v_2\ \ \cdots\ \ \lambda_r v_r) = (v_1\ \ v_2\ \ \cdots\ \ v_r)\begin{pmatrix} \lambda_1 & 0 & \cdots & 0 \\ 0 & \lambda_2 & \cdots & 0 \\ 0 & 0 & \cdots & \lambda_r \end{pmatrix}.$$
If our vector space is the usual ℝᵐ, and if our linear transformation is multiplication by a
matrix A, then this takes a more familiar form as follows. Let P be the matrix formed by
the vectors of B as columns and D the diagonal matrix of type r × r with λ1 , λ2 , · · · , λr
along the diagonal. Then the above equation takes on the form:

$$AP = PD.$$
We will be interested in the special case of this when P itself is a square and invertible matrix. Then the equation is more conveniently rewritten as P⁻¹AP = D, and this is the so-called diagonalization process described below.

3 Eigenvectors of square matrices.


As we saw above, for more abstract vector spaces we have to adopt a different strategy to find eigenvalues
and eigenvectors. Now we show how we can reduce the problem of finding eigenvectors to that of finding
a Null space of a matrix.

3.1 The matrix of a linear transformation.


We describe how to calculate the matrix of a linear transformation of a finite dimensional vector space
with respect to a given basis.
Here are the steps.

• Let L be a linear transformation from a vector space V to itself and assume that V has a basis B = (w₁ w₂ · · · wₙ).

• Next we calculate the vectors L(w₁), L(w₂), · · · , L(wₙ).

• Let v₁ = [L(w₁)]_B, v₂ = [L(w₂)]_B, · · · , vₙ = [L(wₙ)]_B be their respective coordinate vectors in the basis B.

• Make a matrix A whose columns are these vectors v₁, v₂, · · · , vₙ.

This matrix is the so-called matrix of the linear transformation in the basis B.
Consider F(λ) = det(A − λI), which is easily seen to be a polynomial in λ of degree n.
Definition. Given a square n × n matrix A, the polynomial F(λ) = det(A − λI) is called its characteristic polynomial and the equation F(λ) = 0 is called its characteristic equation.
The roots of the characteristic polynomial of A are called the eigenvalues of A and, for any such eigenvalue λ, the space Nul(A − λI) is its eigenspace.
Another way to explain this is as follows: if we define a linear transformation TA : ℝⁿ → ℝⁿ by TA(X) = AX, then these eigenvalues and eigenspaces correspond to the eigenvalues and eigenspaces of TA.
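As a small illustration of the definition (a sketch with SymPy; not part of the original notes), we can form det(A − λI) symbolically and solve the characteristic equation:

```python
import sympy as sp

lam = sp.symbols('lambda')
A = sp.Matrix([[5, 2,  1],
               [0, 3, -1],
               [0, 0,  1]])

# The characteristic polynomial F(lambda) = det(A - lambda*I).
F = (A - lam * sp.eye(3)).det()
print(sp.factor(F))  # e.g. -(lambda - 1)*(lambda - 3)*(lambda - 5)

# Its roots are exactly the eigenvalues of A.
print(sp.solve(sp.Eq(F, 0), lam))  # [1, 3, 5]
```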

3.2 How to use the characteristic polynomial.


We now show that the eigenvalues of L are simply the roots of the polynomial F(λ), i.e. the solutions of the polynomial equation F(λ) = 0. Moreover, if λ is such a root, an eigenvector belonging to it is a vector whose coordinate vector is a non zero member of Nul(A − λI).
Idea of the proof. Let v be an eigenvector for L belonging to an eigenvalue λ.
We then have v ≠ 0 and L(v) = λv.
But we note that v = B[v]_B and hence λv = B(λ[v]_B). Also, since A is the matrix of L in the basis B, we have L(v) = L(B[v]_B) = BA[v]_B. So:
$$BA[v]_B = L(v) = \lambda v = B(\lambda [v]_B),$$
and since coordinates with respect to a basis are unique, this forces A[v]_B = λ[v]_B.
Thus v is an eigenvector belonging to λ iff its coordinate vector [v]_B is a non zero vector in Nul(A − λI).
We illustrate this on our example (b) above.
Choose the basis B = (1 x x² x³). Then calculation of T(v) for each of the basis vectors gives
$$T(B) = \bigl(0\ \ (x-3)\ \ (x-3)(2x)\ \ (x-3)(3x^2)\bigr).$$

This gives the matrix of the transformation:
$$A = \begin{pmatrix} 0 & -3 & 0 & 0 \\ 0 & 1 & -6 & 0 \\ 0 & 0 & 2 & -9 \\ 0 & 0 & 0 & 3 \end{pmatrix}.$$

It is now clear that its characteristic polynomial is λ(λ − 1)(λ − 2)(λ − 3).
The eigenvalues are 0, 1, 2, 3, and the respective eigenvectors can be seen to be the columns of the matrix
$$M = \begin{pmatrix} 1 & -3 & 9 & -27 \\ 0 & 1 & -6 & 27 \\ 0 & 0 & 1 & -9 \\ 0 & 0 & 0 & 1 \end{pmatrix}.$$
When you look for the polynomials with these coordinate vectors, you get the given answer.
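As a check (a sketch in NumPy; the matrices are copied from the text above), the single relation AM = MD packages all four eigenvector equations at once:

```python
import numpy as np

# Matrix of T in the basis (1, x, x^2, x^3), from the text.
A = np.array([[0, -3,  0,   0],
              [0,  1, -6,   0],
              [0,  0,  2,  -9],
              [0,  0,  0,   3]], dtype=float)

# Columns: coordinate vectors of 1, x-3, (x-3)^2, (x-3)^3.
M = np.array([[1, -3,  9, -27],
              [0,  1, -6,  27],
              [0,  0,  1,  -9],
              [0,  0,  0,   1]], dtype=float)

D = np.diag([0.0, 1.0, 2.0, 3.0])

# Each column of M is an eigenvector: A M = M D.
assert np.allclose(A @ M, M @ D)
```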

4 Diagonalization.
Let V be a vector space over a field K and let L be a linear transformation from V to itself.
We say that L is diagonalizable if V has a basis consisting of eigenvectors of L. Then the
matrix of L in such a basis is a diagonal matrix. This is the reason for the term.
Given a square n × n matrix A, we say it is diagonalizable if the corresponding transformation TA
is diagonalizable.
This can be made more explicit as follows. Suppose B is a basis which diagonalizes TA. Form a matrix P with the coordinate vectors of the vectors in B as columns.
Note: In this case, the vectors of B are their own coordinate vectors in the standard basis.
Then we see that AP = PD where D is a diagonal matrix, and we get A = PDP⁻¹.
Thus, for a square matrix A, we can make a simpler definition which says: A is diagonalizable iff A = PDP⁻¹ for some invertible n × n matrix P and a diagonal matrix D.
Example. For the example (a) above, we take the three eigenvectors w₁, w₂, w₃ and form the matrix
$$P = \begin{pmatrix} 1 & -1 & -1 \\ 0 & 1 & 1 \\ 0 & 0 & 2 \end{pmatrix} \quad\text{and note that}\quad AP = P \begin{pmatrix} 5 & 0 & 0 \\ 0 & 3 & 0 \\ 0 & 0 & 1 \end{pmatrix}.$$
It follows that A = PDP⁻¹ and we say that P diagonalizes A.
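Both relations are easy to confirm numerically; here is a sketch with NumPy using the matrices from this example:

```python
import numpy as np

A = np.array([[5.0, 2.0,  1.0],
              [0.0, 3.0, -1.0],
              [0.0, 0.0,  1.0]])
P = np.array([[1.0, -1.0, -1.0],
              [0.0,  1.0,  1.0],
              [0.0,  0.0,  2.0]])
D = np.diag([5.0, 3.0, 1.0])

# The basic relation A P = P D ...
assert np.allclose(A @ P, P @ D)
# ... and, since P is invertible, A = P D P^{-1}.
assert np.allclose(A, P @ D @ np.linalg.inv(P))
```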
The most important theorem about diagonalization is this:
Theorem of Independence of eigenvectors. Suppose that v1 , · · · , vr are eigenvectors for a
linear transformation L belonging to different eigenvalues λ1 , · · · , λr .
Then v1 , · · · , vr are linearly independent.
Proof.
We use induction on r.
Induction step 1. If r = 1, then the result is trivially true since v1 being an eigenvector is non
zero and hence is an independent vector.
Induction step 2. Suppose the induction result is true for r − 1. Then we prove it for r.
Suppose if possible the result is false and v1 , · · · , vr are linearly dependent. Since v1 , · · · , vr−1 are
linearly independent by induction hypothesis, we must have

$$\text{(EQ1)}\qquad v_r = a_1 v_1 + a_2 v_2 + \cdots + a_{r-1} v_{r-1}$$

for some scalars a₁, · · · , aᵣ₋₁.


Applying L to both sides, we see:

$$\text{(EQ2)}\qquad \lambda_r v_r = \lambda_1 a_1 v_1 + \cdots + \lambda_{r-1} a_{r-1} v_{r-1}.$$

Then we calculate EQ2 − λᵣ·EQ1 to get:

$$\text{(EQ3)}\qquad 0 = a_1(\lambda_1 - \lambda_r) v_1 + \cdots + a_{r-1}(\lambda_{r-1} - \lambda_r) v_{r-1}.$$

By the induction hypothesis v₁, · · · , vᵣ₋₁ are linearly independent, and we see that

$$a_1(\lambda_1 - \lambda_r) = \cdots = a_{r-1}(\lambda_{r-1} - \lambda_r) = 0.$$

Since all the λᵢ are distinct, we must have a₁ = · · · = aᵣ₋₁ = 0. It follows from EQ1 that vᵣ = 0, which is a contradiction!

Hence the theorem is true.
Corollary 1. We can deduce the main criterion for diagonalization based on this theorem. It is:
Suppose that V has dimension n and Vλ1 , · · · , Vλr are all the eigenspaces for L with distinct eigen-
values λ1 , · · · , λr with corresponding dimensions d1 , · · · , dr .
Then the transformation L is diagonalizable iff d1 + · · · + dr = n.
Idea of Proof.
If the condition is satisfied, then the n vectors obtained by taking the union of the bases of all these eigenspaces give n independent eigenvectors in V and hence form a basis of V consisting of eigenvectors. This gives the diagonalization.
Conversely, if V has a basis consisting of n eigenvectors, then it is easy to show that these n vectors are simply obtained by taking a union of bases for eigenspaces.
Corollary 2. If a linear transformation L of a vector space V of dimension n has n distinct eigenvalues, then a set of n eigenvectors, one for each eigenvalue, forms a basis of V and hence L is diagonalizable.

4.1 Examples.
1. The matrix
$$A = \begin{pmatrix} 1 & 2 \\ 0 & 1 \end{pmatrix}$$
as well as the corresponding transformation TA are not diagonalizable.
The reason is that the characteristic polynomial (λ − 1)² has only one root, 1, and the corresponding eigenspace has a basis consisting of the single vector (1, 0)ᵀ. Thus, we cannot have two independent eigenvectors and the diagonalization fails.
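The defect is easy to see computationally as well (a sketch with SymPy; not in the original notes):

```python
import sympy as sp

A = sp.Matrix([[1, 2],
               [0, 1]])

# One eigenvalue, 1, with algebraic multiplicity 2 ...
print(A.eigenvals())                 # {1: 2}
# ... but its eigenspace Nul(A - I) is only one dimensional.
print((A - sp.eye(2)).nullspace())   # [Matrix([[1], [0]])]
```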

2. If A is a square matrix which is upper triangular with distinct entries on its diagonal, then it is diagonalizable. The reason is that the diagonal entries are seen to be its eigenvalues and, since these are distinct, Corollary 2 applies.
The result holds for a lower triangular matrix for the same reason.

3. Diagonalize the matrix
$$A = \begin{pmatrix} 3 & 1 & 2 \\ 0 & 1 & 0 \\ 1 & 0 & 2 \end{pmatrix}$$
if possible. Explain the reason if this is not possible.
Answer.
First we compute the characteristic polynomial
$$\begin{vmatrix} 3-\lambda & 1 & 2 \\ 0 & 1-\lambda & 0 \\ 1 & 0 & 2-\lambda \end{vmatrix} = (1-\lambda)\bigl((3-\lambda)(2-\lambda) - (2)(1)\bigr),$$
where we have expanded the determinant along the second row. The determinant further factors as
$$(1-\lambda)(\lambda^2 - 5\lambda + 4) = (1-\lambda)(\lambda-1)(\lambda-4).$$
Thus the eigenvalues are 1 and 4, where 1 is a double root.

To calculate V₁ we find
$$\mathrm{Nul}\begin{pmatrix} 2 & 1 & 2 \\ 0 & 0 & 0 \\ 1 & 0 & 1 \end{pmatrix}.$$
It is not hard to see that the matrix has rank 2 and hence there is only one free variable in the Null space calculation. This means the eigenspace V₁ will have dimension 1.
The eigenspace V₄ requires the solution of
$$\mathrm{Nul}\begin{pmatrix} -1 & 1 & 2 \\ 0 & -3 & 0 \\ 1 & 0 & -2 \end{pmatrix}.$$
This matrix has rank 2 as well, and thus the dimension of the Null space is again 1.
Thus, there are at most two independent eigenvectors and A is not diagonalizable.
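The two rank computations can be reproduced directly; here is a sketch with SymPy:

```python
import sympy as sp

A = sp.Matrix([[3, 1, 2],
               [0, 1, 0],
               [1, 0, 2]])

for lam in (1, 4):
    E = A - lam * sp.eye(3)
    # dim V_lam = 3 - rank(A - lam*I); both ranks are 2 here,
    # so the eigenspace dimensions add up to only 2 < 3.
    print(lam, E.rank(), 3 - E.rank())
```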
4. Diagonalize the matrix
$$A = \begin{pmatrix} 3 & 1 & 2 \\ 0 & 2 & 0 \\ 1 & 0 & 2 \end{pmatrix}$$
if possible. Explain the reason if this is not possible.
Answer. This is similar to the above but has three different eigenvalues 2, 1, 4. Thus, by Corollary 2 it is diagonalizable.
It is not hard to see that the respective eigenvectors are the columns of the matrix:
$$P = \begin{pmatrix} 0 & -1 & 2 \\ 2 & 0 & 0 \\ -1 & 1 & 1 \end{pmatrix}.$$
Thus we have:
$$A = P \begin{pmatrix} 2 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 4 \end{pmatrix} P^{-1}.$$
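A one-line numerical confirmation of this factorization (a NumPy sketch):

```python
import numpy as np

A = np.array([[3.0, 1.0, 2.0],
              [0.0, 2.0, 0.0],
              [1.0, 0.0, 2.0]])
P = np.array([[ 0.0, -1.0, 2.0],
              [ 2.0,  0.0, 0.0],
              [-1.0,  1.0, 1.0]])
D = np.diag([2.0, 1.0, 4.0])

# P diagonalizes A: A = P D P^{-1}.
assert np.allclose(A, P @ D @ np.linalg.inv(P))
```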

5 The 2 × 2 real matrix.


Let
$$A = \begin{pmatrix} a & b \\ c & d \end{pmatrix}$$
be a 2 × 2 matrix. We analyze the eigenspaces and diagonalization for A completely.
First we note that the characteristic polynomial is easily seen to be F(λ) = λ² − (a + d)λ + (ad − bc).
This is easy to remember by noting that the coefficient of λ is the negative of the sum of the diagonal entries, or trace, of the matrix, and the constant term is its determinant.
Assumption. Here we assume that our matrix has real coefficients.
There are three cases for the roots which we analyze next.
1. Distinct real roots. We assume that F (λ) has two distinct real roots, p, q. By corollary 2, we
know it is diagonalizable and we simply need to find the matrix of the two eigenvectors.
For λ = p, we solve:
$$\left(\begin{array}{cc|c} a-p & b & 0 \\ c & d-p & 0 \end{array}\right).$$

If the first row is not zero, then we solve its equation by inspection: (b, p − a)ᵀ, and we know that this is an eigenvector belonging to p. We note that the second row must give a dependent equation and hence we can ignore it!
Could the first row be a zero vector? Yes, it is possible, when a = p and b = 0. In that case, we solve the second equation to get (p − d, c)ᵀ.
What if this one is zero too?
We see that we must have b = c = 0, a = d, so our matrix is aI and already diagonalized!
Similarly, for λ = q, we deduce an eigenvector (b, q − a)ᵀ if the first row is non zero, or (q − d, c)ᵀ if the first row is zero but the second is not, and note that the matrix is aI and is already diagonalized in the remaining case.
Thus the matrix P equals
$$\begin{pmatrix} b & b \\ p-a & q-a \end{pmatrix} \quad\text{or}\quad \begin{pmatrix} p-d & q-d \\ c & c \end{pmatrix} \quad\text{or}\quad I.$$
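This case analysis translates directly into code. The sketch below is our own helper (assuming the two roots p, q are real and distinct; the example matrix is ours, not from the notes), built from the inspection formulas above:

```python
import numpy as np

def eigvec_2x2(A, lam):
    """Eigenvector of the 2x2 matrix A for the eigenvalue lam,
    following the rule above: use the first row of A - lam*I if
    it is nonzero, else the second row, else A = lam*I."""
    (a, b), (c, d) = A
    if (a - lam, b) != (0.0, 0.0):
        return np.array([b, lam - a])
    if (c, d - lam) != (0.0, 0.0):
        return np.array([lam - d, c])
    return np.array([1.0, 0.0])  # A = lam*I: every vector works.

A = np.array([[1.0, 2.0],
              [3.0, 0.0]])      # trace 1, det -6, so roots 3 and -2
P = np.column_stack([eigvec_2x2(A, 3.0), eigvec_2x2(A, -2.0)])
D = np.diag([3.0, -2.0])
assert np.allclose(A @ P, P @ D)
```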

2. Double root case. We assume that the characteristic polynomial factors as (λ − p)².
The eigenspace Vₚ is Nul(A − pI), and A is diagonalizable iff this space has dimension 2. But that means the matrix A − pI must have rank zero, i.e. it must be the zero matrix. This means A = pI, or A is already diagonal!
Thus A is diagonalizable iff it is already diagonal!

3. Two complex roots. In this case, there are no real eigenvalues so the matrix cannot be
diagonalizable over the reals. However, we can put the matrix in a certain form which helps us
understand the nature of the transformation.
Worked Example: distinct complex roots.
Let
$$A = \begin{pmatrix} 1 & 4 \\ -2 & 5 \end{pmatrix}.$$
The characteristic polynomial is λ² − 6λ + 13. So the eigenvalues are 3 ± 2i.
Suppose we were to find the eigenvector for the complex eigenvalue 3 − 2i as before. We would have to solve the equations in complex numbers represented by this augmented matrix:
$$\left(\begin{array}{cc|c} 1-(3-2i) & 4 & 0 \\ -2 & 5-(3-2i) & 0 \end{array}\right) = \left(\begin{array}{cc|c} -2+2i & 4 & 0 \\ -2 & 2+2i & 0 \end{array}\right).$$
As before, the second equation is a multiple of the first (by 1/(1 − i)) and we solve the first equation by inspection as (4, 2 − 2i)ᵀ. Suppose we write this vector into its real and imaginary components as:
$$\begin{pmatrix} 4 \\ 2-2i \end{pmatrix} = \begin{pmatrix} 4 \\ 2 \end{pmatrix} + i\begin{pmatrix} 0 \\ -2 \end{pmatrix} = v_1 + iv_2.$$
Then we know Av = (3 − 2i)v, i.e.
$$A(v_1 + iv_2) = (3 - 2i)(v_1 + iv_2) = (3v_1 + 2v_2) + i(-2v_1 + 3v_2).$$

Splitting this into real and imaginary parts, we get:
$$Av_1 = 3v_1 + 2v_2 \quad\text{and}\quad Av_2 = -2v_1 + 3v_2.$$

If we form a matrix P with columns v₁, v₂, then these two relations combine into
$$AP = P\begin{pmatrix} 3 & -2 \\ 2 & 3 \end{pmatrix}.$$
Thus if we change our basis to the columns of P, the new matrix is
$$\begin{pmatrix} 3 & -2 \\ 2 & 3 \end{pmatrix}.$$
Conclusion. Thus we conclude that in general, if we have a complex eigenvalue a − bi with b ≠ 0 and v = v₁ + iv₂ is an eigenvector belonging to it, then we can set P to be the matrix with columns v₁, v₂ and get the equation:
$$A = P\begin{pmatrix} a & -b \\ b & a \end{pmatrix}P^{-1}.$$

Comment: If one interprets the points of the plane as complex numbers, then this block corresponds to multiplication by the complex number a + bi, the conjugate of the eigenvalue. Moreover, if we write a + bi = r exp(iθ) using the usual polar representation, then this can be described as a rotation by the angle θ followed by an expansion of scale by a factor r.
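This whole section can be reproduced numerically. Here is a sketch with NumPy, using the eigenpair returned by np.linalg.eig (which may be a complex scalar multiple of the eigenvector chosen above; the resulting block comes out the same either way):

```python
import numpy as np

A = np.array([[ 1.0, 4.0],
              [-2.0, 5.0]])

# Pick the eigenvalue with negative imaginary part (3 - 2i) and
# split its eigenvector into real and imaginary parts.
w, V = np.linalg.eig(A)
k = np.argmin(w.imag)
v = V[:, k]
P = np.column_stack([v.real, v.imag])

# P^{-1} A P is the rotation-scaling block [[a, -b], [b, a]].
C = np.linalg.inv(P) @ A @ P
print(np.round(C, 6))   # [[ 3. -2.]  [ 2.  3.]]
```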
