
Hessian and Critical Points - Assignment 1


Jim Lambers

MAT 280
Spring Semester 2009-10
Lecture 8 Notes

These notes correspond to Section 11.7 in Stewart and Section 3.3 in Marsden and Tromba.

Maximum and Minimum Values


In single-variable calculus, one learns how to compute maximum and minimum values of a function.
We first recall these methods, and then we will learn how to generalize them to functions of several
variables.
Let 𝑓 : 𝐷 ⊆ ℝⁿ → ℝ. A local maximum of a function 𝑓 is a point a ∈ 𝐷 such that 𝑓 (x) ≤ 𝑓 (a)
for x near a. The value 𝑓 (a) is called a local maximum value. Similarly, 𝑓 has a local minimum at
a if 𝑓 (x) ≥ 𝑓 (a) for x near a, and the value 𝑓 (a) is called a local minimum value.
When a function of a single variable, 𝑓 (𝑥), has a local maximum or minimum at 𝑥 = 𝑎, then 𝑎
must be a critical point of 𝑓 , which means that 𝑓 ′ (𝑎) = 0, or 𝑓 ′ does not exist at 𝑎 (which is the
case if, for example, the graph of 𝑓 has a sharp corner at 𝑎). In general, if 𝑓 is differentiable at a
point a, then in order for a to be a local maximum or minimum of 𝑓 , the rate of change of 𝑓 , as
its independent variables change in any direction, must be zero. The only way to ensure this is to
require that ∇𝑓 (a) = 0. Therefore, we say that a is a critical point if ∇𝑓 (a) = 0 or if any partial
derivative of 𝑓 does not exist at a.
Once we have found the critical points of a function, we must determine whether they correspond
to local maxima or minima. In the single-variable case, we can use the Second Derivative Test,
which states that if 𝑎 is a critical point of 𝑓 , and 𝑓 ′′ (𝑎) > 0, then 𝑎 is a local minimum, while if
𝑓 ′′ (𝑎) < 0, 𝑎 is a local maximum, and if 𝑓 ′′ (𝑎) = 0, the test is inconclusive.
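As a brief illustration of the single-variable test (this example is not part of the original notes), take 𝑓 (𝑥) = 𝑥³ − 3𝑥. The worked steps below show one critical point of each kind:

```latex
\begin{align*}
f'(x) &= 3x^2 - 3 = 0 \quad\Longrightarrow\quad x = \pm 1, \\
f''(x) &= 6x, \\
f''(1) &= 6 > 0 \quad\Longrightarrow\quad x = 1 \text{ is a local minimum}, \\
f''(-1) &= -6 < 0 \quad\Longrightarrow\quad x = -1 \text{ is a local maximum}.
\end{align*}
```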
This test is generalized to the multivariable case as follows: first, we form the Hessian, which
is the matrix of second partial derivatives at a. If 𝑓 is a function of 𝑛 variables, then the Hessian
is an 𝑛 × 𝑛 matrix 𝐻, and the entry in row 𝑖, column 𝑗 of 𝐻 is defined by

\[ H_{ij} = \frac{\partial^2 f}{\partial x_i \, \partial x_j}(\mathbf{a}). \]
Because mixed second partial derivatives are equal if they are continuous, it follows that 𝐻 is a
symmetric matrix, meaning that 𝐻𝑖𝑗 = 𝐻𝑗𝑖 .
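When second partial derivatives are tedious to compute by hand, the Hessian can be approximated numerically. The following is a minimal sketch using central differences; the function name `hessian`, the step size `h`, and the choice of difference formula are assumptions made for illustration, and the quadratic being differentiated is the one from the worked example in these notes.

```python
def hessian(f, point, h=1e-4):
    """Approximate the matrix of second partial derivatives of f at `point`.

    f takes a list of n numbers; central differences with step h are used.
    This is an illustrative sketch, not a robust numerical routine.
    """
    n = len(point)
    H = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            def shifted(di, dj):
                # Evaluate f with coordinate i moved by di*h and j by dj*h.
                x = list(point)
                x[i] += di * h
                x[j] += dj * h
                return f(x)
            H[i][j] = (shifted(1, 1) - shifted(1, -1)
                       - shifted(-1, 1) + shifted(-1, -1)) / (4 * h * h)
    return H


def f(v):
    # f(x, y) = 6x^2 + 4xy + 8y^2 - x - 3y
    x, y = v
    return 6 * x**2 + 4 * x * y + 8 * y**2 - x - 3 * y


H = hessian(f, [1 / 44, 2 / 11])
print(H)  # entries close to [[12, 4], [4, 16]], up to rounding error
```

The off-diagonal entries agree up to rounding error, which reflects the symmetry 𝐻𝑖𝑗 = 𝐻𝑗𝑖 .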
We can now state the Second Derivatives Test. If a is a critical point of 𝑓 , and the Hessian, 𝐻,
is positive definite, then a is a local minimum of 𝑓 . The notion of a matrix being positive definite is
the generalization to matrices of the notion of a positive number. When a matrix 𝐻 is symmetric,
the following statements are all equivalent:
∙ 𝐻 is positive definite.

∙ xᵀ𝐻x > 0 for every nonzero column vector x of real numbers, where xᵀ is the transpose of
x, which is a row vector.

∙ The eigenvalues of 𝐻 are all positive.

If 𝐻 is positive definite, then the following also hold, although they are only necessary conditions
and do not by themselves imply that 𝐻 is positive definite:
∙ The determinant of 𝐻 is positive.

∙ The diagonal entries of 𝐻, 𝐻𝑖𝑖 for 𝑖 = 1, 2, . . . , 𝑛, are positive.


On the other hand, if 𝐻 is negative definite, then 𝑓 has a local maximum at a. This means that
xᵀ𝐻x < 0 for any nonzero real vector x, and that the eigenvalues and diagonal entries of 𝐻 are
negative. However, the determinant is not necessarily negative. Because it is equal to the product
of the eigenvalues, the determinant is positive if 𝑛 is even, and negative if 𝑛 is odd.
If 𝐻 is indefinite, meaning that it has both positive and negative eigenvalues, then we say that
𝑓 has a saddle point at a.
This means that the graph of 𝑓 crosses its tangent plane at a, and the term “saddle point” arises
from the fact that 𝑓 is increasing from a along some directions, but decreasing along others.
Finally, if 𝐻 is a singular matrix, meaning that at least one of its eigenvalues, and therefore its
determinant, is equal to zero, the test is inconclusive. Therefore, a could be a local minimum, local
maximum, saddle point, or none of the above. One must instead use other information about 𝑓 ,
such as its directional derivatives, to determine if 𝑓 has a maximum, minimum or saddle point at
a.
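The four cases above can be sketched in code for the 2 × 2 symmetric case. This is only an illustration: the function names `eigenvalues_2x2` and `classify_hessian`, the tolerance `tol`, and the closed-form eigenvalue formula for a symmetric matrix with rows (a, b) and (b, c) are assumptions introduced here, not part of the notes.

```python
import math


def eigenvalues_2x2(a, b, c):
    """Eigenvalues (smaller, larger) of the symmetric matrix [[a, b], [b, c]]."""
    mean = (a + c) / 2
    radius = math.hypot((a - c) / 2, b)
    return mean - radius, mean + radius


def classify_hessian(a, b, c, tol=1e-12):
    """Classify a critical point from its 2x2 Hessian [[a, b], [b, c]]."""
    lo, hi = eigenvalues_2x2(a, b, c)
    if abs(lo) <= tol or abs(hi) <= tol:
        return "singular: test inconclusive"
    if lo > 0:
        return "positive definite: local minimum"
    if hi < 0:
        return "negative definite: local maximum"
    # Remaining case: lo < 0 < hi.
    return "indefinite: saddle point"


print(classify_hessian(12, 4, 16))
print(classify_hessian(1, 2, 1))
```

A tolerance is used when comparing against zero because eigenvalues computed in floating point are rarely exactly zero.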
Example Let 𝑓 : ℝ² → ℝ be defined by

\[ f(x, y) = 6x^2 + 4xy + 8y^2 - x - 3y. \]

We wish to find any local minima or maxima of this function. First, we compute its gradient,
\[ \nabla f = \begin{bmatrix} 12x + 4y - 1 & 4x + 16y - 3 \end{bmatrix}. \]

To determine where ∇𝑓 = 0, we must solve the system of linear equations

12𝑥 + 4𝑦 = 1,
4𝑥 + 16𝑦 = 3.

Using the second equation to obtain 𝑥 = (3 − 16𝑦)/4 and substituting this into the first equation,
we obtain 𝑦 = 2/11 and 𝑥 = 1/44. Since the solution of this system is unique, it follows that this
is the only critical point of 𝑓 .
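The solution of this system can be double-checked with exact rational arithmetic. The sketch below uses Cramer's rule rather than the substitution described above, purely because it is short to code; the helper name `solve_2x2` is hypothetical.

```python
from fractions import Fraction


def solve_2x2(a11, a12, a21, a22, b1, b2):
    """Solve a11*x + a12*y = b1, a21*x + a22*y = b2 exactly by Cramer's rule."""
    det = Fraction(a11 * a22 - a12 * a21)
    if det == 0:
        raise ValueError("the system does not have a unique solution")
    x = Fraction(b1 * a22 - a12 * b2) / det
    y = Fraction(a11 * b2 - b1 * a21) / det
    return x, y


# 12x + 4y = 1 and 4x + 16y = 3, from setting the gradient to zero.
x, y = solve_2x2(12, 4, 4, 16, 1, 3)
print(x, y)  # 1/44 2/11
```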
To determine whether this critical point corresponds to a maximum or minimum, we must
compute the Hessian 𝐻, whose entries are the second partial derivatives of 𝑓 at (1/44, 2/11). We
have
\[ H = \begin{bmatrix} f_{xx} & f_{xy} \\ f_{yx} & f_{yy} \end{bmatrix} = \begin{bmatrix} 12 & 4 \\ 4 & 16 \end{bmatrix}. \]

To determine whether this matrix is positive definite, we first compute its determinant,
\[ \det(H) = f_{xx} f_{yy} - f_{xy}^2 = 12(16) - 4(4) = 176. \]

Since the determinant, which is the product of 𝐻’s two eigenvalues, is positive, it follows that they
must both be the same sign. To determine that sign, we check the trace of 𝐻, denoted by tr(𝐻).
The trace of a matrix is the sum of its diagonal entries, which is also the sum of the eigenvalues.
We have
\[ \operatorname{tr}(H) = f_{xx} + f_{yy} = 12 + 16 = 28. \]
Since both eigenvalues are the same sign, and their sum is positive, they must both be positive.
Therefore, 𝐻 is positive definite, and we conclude that (1/44, 2/11) is a local minimum of 𝑓 . □
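As a check on the sign argument, the eigenvalues can also be computed directly. For a 2 × 2 matrix the characteristic polynomial is λ² − tr(𝐻)λ + det(𝐻), so with tr(𝐻) = 28 and det(𝐻) = 176 from the example:

```latex
\begin{align*}
\lambda^2 - 28\lambda + 176 &= 0, \\
\lambda &= \frac{28 \pm \sqrt{28^2 - 4(176)}}{2} = 14 \pm 2\sqrt{5}, \\
\lambda_1 &\approx 9.53, \qquad \lambda_2 \approx 18.47.
\end{align*}
```

Both eigenvalues are positive, in agreement with the determinant and trace argument.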
The preceding example describes how the Second Derivatives Test can be performed for a function
of two variables:
∙ If det(𝐻) = 𝑓𝑥𝑥 𝑓𝑦𝑦 − (𝑓𝑥𝑦)² > 0 and 𝑓𝑥𝑥 > 0, then the critical point is a minimum.

∙ If det(𝐻) > 0 and 𝑓𝑥𝑥 < 0, then the critical point is a maximum.

∙ If det(𝐻) < 0, then the critical point is a saddle point.

∙ If det(𝐻) = 0, then the test is inconclusive.
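The four rules above translate directly into a small function. This is a sketch only: the name `second_derivatives_test` and the returned strings are chosen here for illustration, and the inputs are assumed to be the second partial derivatives already evaluated at a critical point.

```python
def second_derivatives_test(fxx, fyy, fxy):
    """Classify a critical point of a two-variable function.

    fxx, fyy, fxy are the second partial derivatives at the critical point.
    """
    det = fxx * fyy - fxy ** 2
    if det > 0:
        # det > 0 forces fxx to be nonzero, so only two cases remain.
        return "local minimum" if fxx > 0 else "local maximum"
    if det < 0:
        return "saddle point"
    return "inconclusive"


print(second_derivatives_test(12, 16, 4))  # local minimum (the example above)
print(second_derivatives_test(2, -2, 0))   # saddle point, e.g. x^2 - y^2 at the origin
```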

In many applications, it is desirable to know where a function assumes its largest or smallest
values, not just among nearby points, but within its entire domain. We say that a function 𝑓 :
𝐷 ⊆ ℝⁿ → ℝ has an absolute maximum at a if 𝑓 (a) ≥ 𝑓 (x) for all x ∈ 𝐷, and that 𝑓 has an absolute
minimum at a if 𝑓 (a) ≤ 𝑓 (x) for all x ∈ 𝐷.
In the single-variable case, it is known, by the Extreme Value Theorem, that if 𝑓 is continuous
on a closed interval [𝑎, 𝑏], then it has an absolute maximum and an absolute minimum on [𝑎, 𝑏].
To find them, it is necessary to check all critical points in [𝑎, 𝑏], and the endpoints 𝑎 and 𝑏, as the
absolute maximum and absolute minimum must each occur at one of these points.
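The closed-interval procedure can be sketched as follows. The function name `absolute_extrema` and the sample function 𝑥³ − 3𝑥 on [0, 2] are assumptions chosen for illustration; the critical points are assumed to have been found beforehand, for instance by solving 𝑓 ′ (𝑥) = 0 by hand.

```python
def absolute_extrema(f, critical_points, a, b):
    """Return ((x_min, f(x_min)), (x_max, f(x_max))) for f on [a, b].

    Candidates are the endpoints plus any critical points inside (a, b).
    """
    candidates = [a, b] + [c for c in critical_points if a < c < b]
    x_min = min(candidates, key=f)
    x_max = max(candidates, key=f)
    return (x_min, f(x_min)), (x_max, f(x_max))


# f(x) = x^3 - 3x has critical points at x = -1 and x = 1; only x = 1 lies in (0, 2).
result = absolute_extrema(lambda x: x**3 - 3 * x, [-1, 1], 0, 2)
print(result)  # ((1, -2), (2, 2))
```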
The generalization of a closed interval to the multivariable case is the notion of a compact set.
Previously, we defined an open set, and a boundary point. A closed set is a set that contains all of
its boundary points. A bounded set is a set that is contained entirely within a ball 𝐷ᵣ(x₀) for some
choice of 𝑟 and x₀. Finally, a set is compact if it is closed and bounded.
We can now state the generalization of the Extreme Value Theorem to the multivariable case.
It states that a continuous function on a compact set has an absolute minimum and an absolute
maximum. Therefore, given such a compact set 𝐷, to find the absolute maximum and minimum, it
is sufficient to check the critical points of 𝑓 in 𝐷, and to find the extreme (maximum and minimum)
values of 𝑓 on the boundary. The largest of all of these values is the absolute maximum value, and
the smallest is the absolute minimum value.

It should be noted that in cases where 𝐷 has a simple shape, such as a rectangle, triangle or
cube, it is possible to check boundary points by characterizing them using one or more equations,
using these equations to eliminate a variable, and then substituting for the eliminated variable in 𝑓
to obtain a function of one less variable. Then, it is possible to find extreme values on the boundary
by solving a maximization or minimization problem in one less dimension.
Example Consider the function 𝑓 (𝑥, 𝑦) = 𝑥² + 3𝑦² − 4𝑥 − 6𝑦. We will find the absolute maximum
and minimum values of this function on the triangle with vertices (0, 0), (4, 0) and (0, 3).
First, we look for critical points. We have
\[ \nabla f = \begin{bmatrix} 2x - 4 & 6y - 6 \end{bmatrix}. \]

We see that there is only one critical point, at (𝑥₀, 𝑦₀) = (2, 1). Because the triangle consists of the
points that satisfy the inequalities 𝑥 ≥ 0, 𝑦 ≥ 0 and 𝑦 ≤ 3 − 3𝑥/4, and the point (2, 1) satisfies all of these
inequalities, we conclude that this point lies within the triangle. It is therefore a candidate for an
absolute maximum or minimum.
We now check the boundary, by examining each edge of the triangle individually. On the
edge between (0, 0) and (0, 3), we have 𝑥 = 0, which yields 𝑓 (0, 𝑦) = 3𝑦² − 6𝑦. We then have
𝑓𝑦 (0, 𝑦) = 6𝑦 − 6, which vanishes at 𝑦 = 1. Therefore, (0, 1) is also a candidate for an
absolute extremum. Similarly, along the edge between (0, 0) and (4, 0), we have 𝑦 = 0, which yields
𝑓 (𝑥, 0) = 𝑥² − 4𝑥. We then have 𝑓𝑥 (𝑥, 0) = 2𝑥 − 4, which vanishes at 𝑥 = 2. Therefore,
(2, 0) is a candidate for an absolute extremum.
We then check the edge between (0, 3) and (4, 0), along which 𝑦 = 3 − 3𝑥/4. Substituting this
into 𝑓 (𝑥, 𝑦) yields the function
\[ g(x) = f\left(x, 3 - \frac{3x}{4}\right) = \frac{43}{16}x^2 + 9 - 13x. \]

To determine the critical points of this function, we solve 𝑔 ′ (𝑥) = 0, which yields 𝑥 = 104/43. Since
𝑦 = 3 − 3𝑥/4 along this edge, the point (104/43, 51/43) is a candidate for an absolute extremum.
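For readers who want the intermediate algebra, the substitution along this edge can be expanded step by step. The derivation below only fills in the arithmetic behind the formula for 𝑔 and its critical point:

```latex
\begin{align*}
g(x) &= x^2 + 3\left(3 - \tfrac{3x}{4}\right)^2 - 4x - 6\left(3 - \tfrac{3x}{4}\right) \\
     &= x^2 + 27 - \tfrac{27}{2}x + \tfrac{27}{16}x^2 - 4x - 18 + \tfrac{9}{2}x \\
     &= \tfrac{43}{16}x^2 - 13x + 9, \\
g'(x) &= \tfrac{43}{8}x - 13 = 0 \quad\Longrightarrow\quad x = \tfrac{104}{43}, \\
y &= 3 - \tfrac{3}{4}\cdot\tfrac{104}{43} = \tfrac{129 - 78}{43} = \tfrac{51}{43}.
\end{align*}
```

Since 0 ≤ 104/43 ≤ 4, this point does lie on the edge.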
Finally, we must include the vertices of the triangle, because they too are boundary points of
the triangle, as well as boundary points of the edges along which we attempted to find extrema
of single-variable functions. In all, we have seven candidates: the critical point of 𝑓 , (2, 1), the
three critical points found along the edges, (0, 1), (2, 0) and (104/43, 51/43), and the three vertices,
(0, 0), (4, 0) and (0, 3). Evaluating 𝑓 (𝑥, 𝑦) at all of these points, we obtain

x         y         f(x, y)
2         1         −7
0         1         −3
2         0         −4
104/43    51/43     −289/43
0         0         0
4         0         0
0         3         9

We conclude that the absolute minimum is at (2, 1), and the absolute maximum is at (0, 3). The
function is shown in Figure 1. □
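The comparison of the seven candidates can be reproduced with exact arithmetic. This is a sketch for checking the example by machine; the variable names are illustrative, and the candidate list is copied from the example rather than computed.

```python
from fractions import Fraction as F


def f(x, y):
    # f(x, y) = x^2 + 3y^2 - 4x - 6y
    return x**2 + 3 * y**2 - 4 * x - 6 * y


candidates = [
    (F(2), F(1)),                # interior critical point
    (F(0), F(1)),                # critical point on the edge x = 0
    (F(2), F(0)),                # critical point on the edge y = 0
    (F(104, 43), F(51, 43)),     # critical point on the edge y = 3 - 3x/4
    (F(0), F(0)), (F(4), F(0)), (F(0), F(3)),  # vertices
]

values = {p: f(*p) for p in candidates}
argmin = min(values, key=values.get)
argmax = max(values, key=values.get)

print("absolute minimum", values[argmin], "at", tuple(map(str, argmin)))
print("absolute maximum", values[argmax], "at", tuple(map(str, argmax)))
```

Using `Fraction` avoids rounding when comparing −7 with −289/43, which differ by less than 0.3.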

Figure 1: The function 𝑓 (𝑥, 𝑦) = 𝑥² + 3𝑦² − 4𝑥 − 6𝑦 on the triangle with vertices (0, 0), (4, 0) and
(0, 3).

Practice Problems
1. Find the local maximum and minimum values, as well as saddle points, of the following
functions.

(a) 𝑓 (𝑥, 𝑦) = 𝑥³ + 𝑦³ − 3𝑥𝑦

(b) 𝑔(𝑥, 𝑦) = 𝑥² + 𝑦 + 1/(𝑥²𝑦)

2. Find the absolute maximum and minimum values of the following functions on the indicated
domains.

(a) 𝑓 (𝑥, 𝑦) = 𝑥³ + 𝑦³ + 3𝑥² − 6𝑦², ∣𝑥∣ ≤ 1, ∣𝑦∣ ≤ 1

(b) 𝑓 (𝑥, 𝑦) = 𝑥² − 2𝑦² − 4𝑥 + 5𝑦, on the triangle with vertices (−2, 0), (2, 0) and (0, 4)

Additional Practice Problems


Additional practice problems from the recommended textbooks are:

∙ Stewart: Section 11.7, Exercises 1-13 odd, 23-27 odd, 31-35 odd

∙ Marsden/Tromba: Section 3.3, Exercises 1-17 odd, 23, 25, 33
