Chapter 2, Lecture 3: Building Convex Functions
Chapter 2, Lecture 3: Building Convex Functions
Chapter 2, Lecture 3: Building Convex Functions
1 The plan
Today’s goal is to figure out how we can tell if a function is convex.
So far, we know three ways, each with their own drawbacks:
1. The definition: a function f : C → R is convex if and only if, for all x, y ∈ C and 0 ≤ t ≤ 1,
This is a useful property, and often useful in proofs, but for many actual functions, it’s not
clear how to prove this.
2. The tangent line test: for differential functions, f is convex if and only if, for all x, y ∈ C,
This is more useful as a consequence of convexity, rather than as a way to prove that a
function is convex.
3. The second-derivative test: assuming that Hf (x) exists for all x, f is convex if, for all x ∈ C,
Hf (x) 0.
This is often the way to go, but if f is a complicated function, lots of computation is involved.
Today, we take a different approach. The definition and the second-derivative test are easy to use
for functions with simple definitions. To deal with more complicated functions, we look at how
they are built out of simpler “building blocks”. We will prove several results about how we can
manipulate convex functions to get more complicated convex functions.
These have a slightly sharper version of most properties of convex functions. For example, if a
function is strictly convex, then any local minimizer (and any critical point) is not just a global
minimizer, but a strict global minimizer. So having strict convexity is often nice.
1
This document comes from the Math 484 course webpage: https://faculty.math.illinois.edu/~mlavrov/
courses/484-spring-2019.html
1
We’ll need to watch out for it today so that we know what operations preserve strict convexity, not
just convexity.
Warning: the second-derivative tests can show that a function is strictly convex. If Hf (x) 0
for all x ∈ C, then f is strictly convex. But the implication doesn’t go both ways. For example,
f (x) = x4 has f 00 (0) = 0, but is still strictly convex.
Proof. It’s enough to prove two simple cases of this theorem rather than deal with the arbitrary
sum.
First, if f is (strictly) convex and α > 0, then αf is (strictly) convex. This holds because we can
just multiply both sides of the definition by α:
f (tx + (1 − t)y) ≥ tf (x) + (1 − t)f (y) ⇐⇒ αf (tx + (1 − t)y) ≥ tαf (x) + (1 − t)αf (y).
Second, if f and g are convex, then their sum h defined by h(x) = f (x) + g(x) is convex. This is
also just a matter of adding together the two inequalities:
There are two inequalities here. If either f or g is strictly convex, then one inequality is strict; so
the whole inequality becomes strict, and the sum h = f + g is strictly convex.
To get the full theorem, we just build up the combination of k functions by induction.
Here is another relatively simple result. It rarely comes up, but when it does, it’s often the only
tool we have.
2
The textbook says “nonnegative”, but if αi = 0 it’s as though we didn’t include fi at all.
2
Theorem 2.2. Suppose that f1 , f2 , . . . , fk are (strictly) convex functions C → R. Then
is (strictly) convex.
Proof. As usual, take x, y ∈ C and t ∈ [0, 1]. Then f (tx+(1−t)y) must be equal to fi (tx+(1−t)y)
for some i, and we have
where the last inequality holds because for any i = 1, 2, . . . , k and any point x ∈ C, f (x) ≥ fi (x)
because f (x) is a maximum of several values including fi (x).
If all the fi are strictly convex and 0 < t < 1, we get to use a strict inequality in this proof and so
f is strictly convex.
It’s also important to mention that multiplying two convex functions does not guarantee convexity:
for example, f (x) = x2 − 1 is convex, but f (x) · f (x) = (x2 − 1)2 is not. Also, the minimum of two
convex functions isn’t convex, even though min looks a lot like max.
Proof. The proof is short; the hard part is watching out for this rule in examples. We have (in the
usual setup for a convexity proof):
If f is strictly convex, then the first inequality is strict (it’s <). If g is strictly increasing, then that
strict inequality is preserved, so h is strictly convex as well.
3
Proof. Such a function f has the useful property that it’s convex, and the definition of convex is
always an equality: for all x, y ∈ Rm and t ∈ [0, 1] (actually, any t), we have
So we have
Noteworthy special case: if g : R → R is convex, so is h(x) = g(ax + b). Also, by choosing the
matrix A appropriately, we know that h(x) = g(xi ) is convex as a function Rn → R.
3 Examples
3.1 Example 2.3.11.c in the textbook
To check that f (x1 , x2 ) = x21 − 4x1 x2 + 5x22 − ln x1 x2 is convex for x1 , x2 > 0, write f as a sum
1 −2 x1
f (x1 , x2 ) = x1 x2 + (− ln x1 ) + (− ln x2 ).
−2 5 x2
1 −2
The first term is convex because 0.
−2 5
The second term and theird term are convex because g(x) = − ln x is convex on (0, ∞) (and by our
last result, plugging in just x1 or just x2 doesn’t change this).
Finally, the sum of three convex functions is convex.