Advanced Microeconomic Theory

Chu Thanh Duc – MDE 10
Chapter 1. Sets and Mappings

1.1. Elements of logic
- A theorem is simply a statement deduced from other statements; the concept should be familiar from courses in mathematics.
- Theorems provide a compact and precise format for presenting the assumptions and important conclusions of sometimes lengthy arguments, and so help to identify immediately the scope and limitations of the result presented.
1.1.1 Necessity and Sufficiency
- Necessity: "A is necessary for B" means that A must hold (be true) in order for B to hold (be true). If B is true, then A must also be true: "A if B", or "A is implied by B" ($A \Leftarrow B$).
If A is not true, then B cannot be true. This does not mean that if B is not true, A must be not true.
[Figure: diagram with regions labeled "A true", "B true", "B not true" and "A not true".]
Equivalently, if A is not true then B must be not true, so "B is not true" is necessary for "A is not true": $\sim A \Rightarrow \sim B$.
- Sufficiency: "A is sufficient for B" means that whenever A holds, B must hold. We can say "A is true only if B is true", or "A implies B" ($A \Rightarrow B$).
[Figure: diagram with regions labeled "B true", "A true", "A not true" and "B not true".]
If B is not true, then A must be not true, so "B is not true" is sufficient for "A is not true": $\sim B \Rightarrow \sim A$.
- Both necessity and sufficiency: "A if and only if B" ($A \Leftrightarrow B$).

Geoffrey A. Jehle – Advanced Microeconomic Theory 1
1.1.2 Theorems and Proofs
"$A \Rightarrow B$": if A is true, then B must be true.
- Here, A is called the "premise" and B is called the "conclusion".
- Constructive proof (direct proof): assume that A is true, deduce various consequences of that, and use them to show that B must hold.
- Contrapositive proof: assume that B does not hold, then show that A cannot hold.
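The reason a contrapositive proof is valid can be checked mechanically. The sketch below enumerates every truth assignment of A and B; the helper `implies` is a name introduced here for illustration, not something from the notes.

```python
from itertools import product

def implies(p: bool, q: bool) -> bool:
    """Material implication: 'p implies q' fails only when p is true and q is false."""
    return (not p) or q

# Every truth assignment of (A, B).
rows = list(product([True, False], repeat=2))

# A => B and its contrapositive ~B => ~A agree on every row.
contrapositive_ok = all(implies(a, b) == implies(not b, not a) for a, b in rows)

# A => B and its converse B => A do not agree on every row.
converse_ok = all(implies(a, b) == implies(b, a) for a, b in rows)

print(contrapositive_ok, converse_ok)
```

The second check illustrates the earlier warning: necessity and sufficiency are different statements, so proving "B implies A" does not prove "A implies B".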
1.2. Elements of Set Theory
1.2.1 Notation and Basic Concepts
- A set is any collection of elements. Elements may be numbers or vectors.
- Subset: S is a subset of T if every element of S is also an element of T. Notation: $S \subset T$.
- Empty set: S is an empty set if it contains no elements at all. Notation: $S = \emptyset$.
- Complement: the complement of a set S in a universal set U is the set of all elements of U that are not in S. Notation: $cS$.
- Union: $S \cup T \equiv \{x \mid x \in S \text{ or } x \in T\}$. In general: $\bigcup_{i \in I} S_i$, where I is an index set.
- Intersection: $S \cap T \equiv \{x \mid x \in S \text{ and } x \in T\}$. In general: $\bigcap_{i \in I} S_i$, where I is an index set.
- Index set: the set of integers starting with 1, for example $\{1, 2, 3, \ldots, n\}$. Notation: $I \equiv \{1, 2, 3, \ldots\}$.
- The product of two sets: $S \times T \equiv \{(s, t) \mid s \in S, t \in T\}$.
- n-space is the set product $\mathbb{R}^n \equiv \mathbb{R} \times \mathbb{R} \times \cdots \times \mathbb{R} \equiv \{(x_1, x_2, \ldots, x_n) \mid x_i \in \mathbb{R}, i = 1, 2, \ldots, n\}$.
- Non-negative orthant: $\mathbb{R}^n_+ \equiv \{(x_1, x_2, \ldots, x_n) \mid x_i \geq 0, i = 1, 2, \ldots, n\} \subset \mathbb{R}^n$.
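The operations above map directly onto Python's built-in sets. This is a small sketch with made-up finite sets `U`, `S` and `T` (all hypothetical), including a check of De Morgan's law, which the notes use later for closed sets.

```python
from itertools import product

U = set(range(1, 11))   # hypothetical universal set {1, ..., 10}
S = {1, 2, 3, 4}
T = {3, 4, 5, 6}

union = S | T                   # {x | x in S or x in T}
intersection = S & T            # {x | x in S and x in T}
complement_S = U - S            # elements of U that are not in S
cartesian = set(product(S, T))  # ordered pairs (s, t) with s in S, t in T

# De Morgan's law: the complement of a union is the intersection of the complements.
de_morgan = (U - (S | T)) == ((U - S) & (U - T))

print(union, intersection, complement_S, len(cartesian), de_morgan)
```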
1.2.2 Convex Set:

Convexity is most often assumed because it keeps the analysis mathematically tractable and the results clear-cut and well-behaved.
Convex set in $\mathbb{R}^n$: $S \subset \mathbb{R}^n$ is a convex set iff for all $x^1 \in S$ and $x^2 \in S$, we have
$t x^1 + (1-t) x^2 \in S$ for all t in the interval $0 \leq t \leq 1$.
Theorem: The intersection of convex sets is convex.
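The notes state this theorem without proof. A short proof sketch for two convex sets S and T follows directly from the definition; the same argument works for any family of convex sets.

```latex
\begin{align*}
&\text{Let } x^1, x^2 \in S \cap T \text{ and } t \in [0,1]. \\
&x^1, x^2 \in S \text{ and } S \text{ convex} \;\Rightarrow\; t x^1 + (1-t) x^2 \in S, \\
&x^1, x^2 \in T \text{ and } T \text{ convex} \;\Rightarrow\; t x^1 + (1-t) x^2 \in T, \\
&\text{hence } t x^1 + (1-t) x^2 \in S \cap T, \text{ so } S \cap T \text{ is convex.}
\end{align*}
```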
1.2.3 Relations and Functions
Consider two sets, $S = \{s_1, s_2, \ldots\}$ and $T = \{t_1, t_2, \ldots\}$.
The product of the two sets is the set of ordered pairs $S \times T = \{(s, t) \mid s \in S \text{ and } t \in T\}$.
Any collection (set) of ordered pairs from $S \times T$ is said to constitute a binary relation between the sets S and T.
A relation R singles out the ordered pairs that are meaningful: we write sRt when s in S stands in the specified relationship to t in T.
$R = \{(s, t) \mid s \in S, t \in T \text{ and } sRt\} \subset S \times T$
Example: $S \times S = S^2 = \{(x, y) \mid x \in S, y \in S\}$, and the relation "$\geq$" is $\{(x, y) \mid x \in S, y \in S, x \geq y\}$.
Completeness: A relation R on S is complete if, for all distinct elements x and y in S, xRy or yRx (at least one of the ordered pairs (x, y) and (y, x) is in R).
Reflexivity: A relation R on S is reflexive if, for all elements x in S, xRx (every ordered pair (x, x) is in R).
Transitivity: A relation R on S is transitive if, for any three elements x, y and z in S, xRy and yRz imply xRz.
Ordering: A binary relation that satisfies completeness, reflexivity and transitivity is called an ordering.
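On a finite set, a relation is just a set of ordered pairs, so the three properties can be checked by enumeration. The sketch below uses hypothetical helper names (`is_complete`, `is_reflexive`, `is_transitive`, `is_ordering`) and compares "greater than or equal" with "strictly greater than" on S = {1, 2, 3}.

```python
from itertools import product

def is_complete(S, R):
    """For all distinct x, y in S: xRy or yRx."""
    return all((x, y) in R or (y, x) in R for x, y in product(S, S) if x != y)

def is_reflexive(S, R):
    """For all x in S: xRx."""
    return all((x, x) in R for x in S)

def is_transitive(S, R):
    """Whenever xRy and yRz, also xRz."""
    return all((x, z) in R for x, y in R for y2, z in R if y == y2)

def is_ordering(S, R):
    return is_complete(S, R) and is_reflexive(S, R) and is_transitive(S, R)

S = {1, 2, 3}
geq = {(x, y) for x, y in product(S, S) if x >= y}  # the relation ">="
gt = {(x, y) for x, y in product(S, S) if x > y}    # the relation ">"

print(is_ordering(S, geq), is_ordering(S, gt))
```

The strict relation is complete and transitive but fails reflexivity, so under the definitions above it is not an ordering.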
Function: a relation that associates each element of one set with a single, unique element of another set. In $f: D \to R$, D is called the domain and R is called the range.
1.3. A little Topology
Topology is the study of fundamental properties of sets and mappings.
The distance between the points $x^1 = (x^1_1, x^1_2)$ and $x^2 = (x^2_1, x^2_2)$ is
$d(x^1, x^2) = \|x^1 - x^2\| = \sqrt{(x^1_1 - x^2_1)^2 + (x^1_2 - x^2_2)^2}$
Open ball: the open ball with center $x^0 \in \mathbb{R}^n$ and radius $\varepsilon > 0$ (a real number) is the subset of points in $\mathbb{R}^n$: $B_\varepsilon(x^0) \equiv \{x \in \mathbb{R}^n \mid d(x^0, x) < \varepsilon\}$.
Closed ball: the closed ball with center $x^0 \in \mathbb{R}^n$ and radius $\varepsilon > 0$ (a real number) is the subset of points in $\mathbb{R}^n$: $B^*_\varepsilon(x^0) \equiv \{x \in \mathbb{R}^n \mid d(x^0, x) \leq \varepsilon\}$.
Open sets: $S \subset \mathbb{R}^n$ is an open set if for all $x \in S$, there exists some $\varepsilon > 0$ such that the open ball $B_\varepsilon(x) \subset S$.
Theorem on open sets in $\mathbb{R}^n$:
1. The empty set is an open set.
2. The entire space $\mathbb{R}^n$ is an open set.
3. The union of open sets is an open set.
4. The intersection of any finite number of open sets is an open set.
Theorem: Every open set is a collection (union) of open balls.
Let S be an open set. For every $x \in S$, choose some $\varepsilon_x > 0$ such that $B_{\varepsilon_x}(x) \subset S$. Then
$S = \bigcup_{x \in S} B_{\varepsilon_x}(x)$
Closed sets: $S \subset \mathbb{R}^n$ is a closed set if and only if its complement $cS$ is an open set.
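Theorem on closed sets:
1. The empty set is a closed set.
2. The entire space is a closed set.
3. The union of any finite number of closed sets is a closed set.
4. The intersection of closed sets is a closed set.
Theorem: Closed sets in $\mathbb{R}$ and intersections of unions of closed intervals.
Let S be any closed set in $\mathbb{R}$. Then, for some real numbers $a_i < b_i$ and some index set I:
$S = \bigcap_{i \in I} \left( (-\infty, a_i] \cup [b_i, +\infty) \right)$
Proof:
If $S \subset \mathbb{R}$ is closed, then $cS$ is open. By the definition of an open set, for each x in $cS$ there is some $\varepsilon_x > 0$ such that $B_{\varepsilon_x}(x) \subset cS$, so
$cS = \bigcup_{x \in cS} B_{\varepsilon_x}(x)$.
In $\mathbb{R}$ an open ball is an open interval, so we can rewrite this as: $cS = \bigcup_{x \in cS} (x - \varepsilon_x, x + \varepsilon_x)$.
Let $i = x$, $I = cS$, $a_i = x - \varepsilon_x$ and $b_i = x + \varepsilon_x$, so that $a_i < b_i$ and
$cS = \bigcup_{i \in I} (a_i, b_i)$
$\Rightarrow S = c\left( \bigcup_{i \in I} (a_i, b_i) \right)$
Applying De Morgan's law, we have: $S = \bigcap_{i \in I} c(a_i, b_i)$
$c(a_i, b_i) = (-\infty, a_i] \cup [b_i, +\infty)$
$\Rightarrow S = \bigcap_{i \in I} \left( (-\infty, a_i] \cup [b_i, +\infty) \right)$.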
Bounded sets in $\mathbb{R}^n$: A subset S of $\mathbb{R}^n$ is called bounded if and only if it is entirely contained within some ball (open or closed). That is, S is bounded if there exists $\varepsilon > 0$ such that $S \subset B_\varepsilon(x)$ for some $x \in \mathbb{R}^n$.
Consider a subset S of $\mathbb{R}$:
A lower bound: a real number l is called a lower bound for S if $l \leq x$ for all $x \in S$.
An upper bound: a real number u is called an upper bound for S if $x \leq u$ for all $x \in S$.
A bounded subset S of $\mathbb{R}$ has many lower bounds and upper bounds.
The greatest lower bound (g.l.b.): the largest number among the lower bounds for S.
The least upper bound (l.u.b.): the smallest number among the upper bounds for S.

A bounded closed set S in $\mathbb{R}$ contains its g.l.b. and its l.u.b. A bounded open set S in $\mathbb{R}$ contains neither.
Theorem: Upper and Lower Bounds in Subsets of Real Numbers
1. Let $S \subset \mathbb{R}$ be a bounded open set, and let a be the g.l.b. of S and b be the l.u.b. of S. Then $a \notin S$ and $b \notin S$.
2. Let $S \subset \mathbb{R}$ be a bounded closed set, and let a be the g.l.b. of S and b be the l.u.b. of S. Then $a \in S$ and $b \in S$.
Compact sets in $\mathbb{R}^n$ (Heine-Borel): A set $S \subset \mathbb{R}^n$ is compact if and only if S is closed and bounded.
$\mathbb{R}^n$ is closed but not bounded, so $\mathbb{R}^n$ is not compact.
1.3.1. Continuity
- A continuous mapping, or a continuous function.
- In most economic applications, we will either want to assume that the functions we are dealing with are continuous, or want to discover whether they are continuous when we are unwilling to assume it.
- A continuous function: $f: \mathbb{R} \to \mathbb{R}$ is continuous at a point $x^0$ if for all $\varepsilon > 0$, there exists $\delta > 0$ such that $d(x, x^0) < \delta$ implies $d(f(x), f(x^0)) < \varepsilon$. Equivalently:
$f(B_\delta(x^0)) \subset B_\varepsilon(f(x^0))$
[Figure: graph of a continuous function, marking $x^0$ and $x^0 + \delta$ on the horizontal axis and $f(x^0)$, $f(x^0 + \delta)$ and $f(x^0) + \varepsilon$ on the vertical axis, with $f(x^0 + \delta) \in (f(x^0), f(x^0) + \varepsilon)$.]
Continuity: Let D be a set, let R be another set, and let $f: D \to R$. The function f is continuous at the point $x^0 \in D$ if, and only if, for all $\varepsilon > 0$ there exists a $\delta > 0$ such that $f(B_\delta(x^0)) \subset B_\varepsilon(f(x^0))$. If f is continuous at all $x^0 \in D$, it is called a continuous function.
In words: a function is continuous at a point $x^0$ if for all $\varepsilon > 0$ there exists $\delta > 0$ such that any point less than a distance $\delta$ away from $x^0$ is mapped by f into some point in the range which is less than a distance $\varepsilon$ away from $f(x^0)$.
Every point in $B_\delta(x^0)$ is mapped by f into some point no farther than $\varepsilon$ from $f(x^0)$.
$f(x^0)$ is the image of $x^0$.
Basically, a function is continuous if a "small movement" in the domain does not cause a "big jump" in the range.
It is not true that a continuous function always maps an open set in the domain into an open set in the range, or that closed sets are always mapped into closed sets. Example: the constant function y = a maps an open domain into the single point {a}, which is not an open set.
Theorem: Continuity and the Inverse Image of Open Sets
Let $f: D \to R$ be a mapping, and for any $T \subset R$ let $f^{-1}(T) = \{x \in D \mid f(x) \in T\}$ denote the inverse image of T. Then f is continuous if and only if, for every open set $T \subset R$ in the range of f, the inverse image $f^{-1}(T) \subset D$ is an open set in the domain.
Proof:
Necessity: Let f be a continuous function and let T be an open set in the range. We have to prove that $f^{-1}(T) \subset D$ is an open set.
Take any $x \in f^{-1}(T)$, so that $f(x) \in T$. Because T is an open set, there is some $\varepsilon > 0$ such that $B_\varepsilon(f(x)) \subset T$. Because f is continuous, by the definition of a continuous function there exists $\delta > 0$ such that $f(B_\delta(x)) \subset B_\varepsilon(f(x)) \subset T$, which implies $B_\delta(x) \subset f^{-1}(B_\varepsilon(f(x))) \subset f^{-1}(T)$. Since x was arbitrary, every point of $f^{-1}(T)$ is the center of an open ball contained in $f^{-1}(T)$, so $f^{-1}(T)$ is an open set.

Sufficiency: Suppose the inverse image of every open set is open. We need to prove that f is a continuous function.
Take any $x \in D$ and any $\varepsilon > 0$. The ball $B_\varepsilon(f(x))$ is an open set, so $f^{-1}(B_\varepsilon(f(x))) \subset D$ is an open set, and it contains x. Hence there is some $\delta > 0$ such that $B_\delta(x) \subset f^{-1}(B_\varepsilon(f(x)))$, which implies $f(B_\delta(x)) \subset B_\varepsilon(f(x))$.
This is the definition of continuity at x, and it holds for all $x \in D$, so f is a continuous function.

Theorem: Continuity and the Inverse Image of Closed Sets
Let $f: D \to R$ be a mapping, and let $f^{-1}(T)$ denote the inverse image of $T \subset R$. Then f is a continuous function if and only if, for every closed set $T \subset R$ in the range of f, $f^{-1}(T) \subset D$ is a closed set in the domain of f.
Proof:
Let T be a closed set in the range of f. We need to prove that $f^{-1}(T) \subset D$ being a closed set is equivalent to f being a continuous function.
If T is a closed set, then $cT$ is an open set. According to the preceding theorem, f is a continuous function if and only if $f^{-1}(cT)$ is an open set for every such T. Because $f^{-1}(cT) = c f^{-1}(T)$, this holds if and only if $c f^{-1}(T)$ is an open set, that is, if and only if $f^{-1}(T)$ is a closed set.
Theorem: The continuous image of a compact set is a compact set.
1.3.2. Some Existence Theorems
- An existence theorem specifies conditions that, if met, guarantee that something exists.
- The conditions in existence theorems are sufficient conditions.
[Figure: diagram with regions labeled "something", "E.T.Cs", "E.T.Cs do not hold" and "Something doesn't hold".]

- While these theorems assure us that something exists, they generally give no clue as to what it may look like, or where we may find it.
- In optimization theory, the Weierstrass Theorem specifies sufficient conditions under which the existence of a maximum and a minimum of a continuous function is assured.
Theorem (Weierstrass): Existence of Extreme Values
Let $f: \mathbb{R}^n \to \mathbb{R}$ be a continuous real valued mapping. Let S be a nonempty compact (closed and bounded) subset of $\mathbb{R}^n$. Then there exist a vector $x^* \in S$ and a vector $\tilde{x} \in S$ such that:
$f(x^*) \geq f(x) \geq f(\tilde{x})$ for all $x \in S$.
Proof:
By the theorem that the continuous image of a compact set is a compact set, and because S is compact and f is continuous, the image f(S) is a compact subset of $\mathbb{R}$, that is, closed and bounded.
Because f(S) is bounded, it has a greatest lower bound a (g.l.b.) and a least upper bound b (l.u.b.). Because f(S) is closed, $a \in f(S)$ and $b \in f(S)$. Hence there are points $x^1$ and $x^2$ in S such that $f(x^1) = a$ and $f(x^2) = b$, so $f(x^1) \leq f(x) \leq f(x^2)$ for all $x \in S$.
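[Figure: a function attaining its maximum (Max) and minimum (Min), with the points x1 and x2 marked on the horizontal axis.]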
Hyperplane H in $\mathbb{R}^n$: A hyperplane H in $\mathbb{R}^n$ is a set of vectors x satisfying $a \cdot x = \alpha$ for some $a \in \mathbb{R}^n$, $a \neq 0$, and some real number $\alpha$.
A hyperplane H separates two sets S and T if $a \cdot x \geq \alpha$ for all $x \in S$ and $a \cdot x \leq \alpha$ for all $x \in T$.
[Figure: a hyperplane H in $\mathbb{R}^n$ separating the sets S and T.]
Theorem: Separating a Point and a Convex Set
Let S in $\mathbb{R}^n$ be convex, closed and nonempty. Let $y^0 \in \mathbb{R}^n$ and $y^0 \notin S$. Then there exist some $a \in \mathbb{R}^n$, $a \neq 0$, and $\alpha \in \mathbb{R}$ such that all three of the following hold:
1. $a \cdot y^0 < \alpha$
2. $a \cdot x' = \alpha$ for some x' on the boundary of S ($\partial S$).
3. $a \cdot x \geq \alpha$ for all $x \in S$.
(1) and (3) say that there is some hyperplane that separates the point $y^0$ and the set S. (2) means that the hyperplane passes through a point on the boundary of S.
x' will be the closest point in S to $y^0$, with $a = x' - y^0$ and $a \cdot x' = \alpha$.
Lemma: Under the conditions of the preceding theorem, there exists at least one point $x' \in \partial S$ (the boundary of S) such that $d(y^0, x') \leq d(y^0, x)$ for all $x \in S$, and $d(y^0, x') > 0$.
[Figure: the set S, the points x0 and y0, the closed ball B* and the intersection A, drawn in the (x1, x2) plane.]
Prove the Lemma:

Let $x^0$ be any point in S and let $\varepsilon = \|y^0 - x^0\|$. Let $B^*$ be the closed ball centered at $y^0$ with radius $\varepsilon$, and let A be the intersection of $B^*$ and S. Because S and $B^*$ are closed and both contain $x^0$, the intersection A is closed and nonempty; because $B^*$ is bounded, A is also bounded, so A is compact.
Consider the function $d(y^0, x)$ of x over the domain A. This is a continuous function. Because its domain A is a compact set, by the Weierstrass theorem there exists x' in A such that $d(y^0, x') \leq d(y^0, x)$ for all $x \in A$. Because x' is in S and $y^0$ is not in S, $d(y^0, x') > 0$.
So x' is the closest point in A to $y^0$. Every point in S that is not in the closed ball $B^*$ is at a distance from $y^0$ strictly greater than $\varepsilon$, while $d(y^0, x') \leq d(y^0, x^0) = \varepsilon$. So x' is the closest point in S to $y^0$. Finally, x' must be on the boundary of S: otherwise there would exist a ball centered at x' and contained in S, which would contain a point closer to $y^0$ than x'.
Prove the theorem: Separating a Point and a Convex Set
Let $a = x' - y^0$, where x' is the closest point in S to $y^0$. Then $a \neq 0$, because $x' \neq y^0$.
Let $\alpha = a \cdot x'$.
Prove (1) $a \cdot y^0 < \alpha$:
$a \cdot y^0 = (x' - y^0) \cdot y^0 = (x' - y^0) \cdot y^0 + (x' - y^0) \cdot x' - (x' - y^0) \cdot x' = -(x' - y^0) \cdot (x' - y^0) + (x' - y^0) \cdot x'$
$= -\|x' - y^0\|^2 + (x' - y^0) \cdot x' = -\|x' - y^0\|^2 + \alpha$
$\Rightarrow a \cdot y^0 - \alpha = -\|x' - y^0\|^2 < 0 \Rightarrow a \cdot y^0 < \alpha$.
Prove (2) $a \cdot x' = \alpha$:
By the Lemma, x' is on the boundary of S, and $a \cdot x' = \alpha$ by the definition of $\alpha$.
Prove (3) $a \cdot x \geq \alpha$ for all x in S:
S is a convex set, so $x^t = t x + (1-t) x'$ is in S for all $x \in S$ and $0 \leq t \leq 1$.
Because x' is the closest point in S to $y^0$: $\|x' - y^0\| \leq \|x^t - y^0\| \Rightarrow \|x' - y^0\|^2 \leq \|x^t - y^0\|^2$.
Consider $\|x^t - y^0\|^2 = \|t x + (1-t) x' - y^0\|^2 = \|(1-t)(x' - y^0) + t(x - y^0)\|^2$
$= ((1-t)(x' - y^0) + t(x - y^0)) \cdot ((1-t)(x' - y^0) + t(x - y^0))$
$= (1 - 2t + t^2)\|x' - y^0\|^2 + 2t(1-t)(x' - y^0) \cdot (x - y^0) + t^2\|x - y^0\|^2$
$\Rightarrow \|x' - y^0\|^2 \leq (1 - 2t + t^2)\|x' - y^0\|^2 + 2t(1-t)(x' - y^0) \cdot (x - y^0) + t^2\|x - y^0\|^2$
$\Rightarrow 0 \leq t(t-2)\|x' - y^0\|^2 + 2t(1-t)(x' - y^0) \cdot (x - y^0) + t^2\|x - y^0\|^2$

For t > 0, we can divide both sides by t to get:
$0 \leq (t-2)\|x' - y^0\|^2 + 2(1-t)(x' - y^0) \cdot (x - y^0) + t\|x - y^0\|^2$
Since this holds for all $t \in (0, 1]$, it must hold in the limit as $t \to 0$:
$0 \leq -2\|x' - y^0\|^2 + 2(x' - y^0) \cdot (x - y^0)$
$\Rightarrow 0 \leq -2\left[ (x' - y^0) \cdot (x' - y^0) - (x' - y^0) \cdot (x - y^0) \right] = -2 (x' - y^0) \cdot (x' - x)$
$\Rightarrow 0 \geq (x' - y^0) \cdot x' - (x' - y^0) \cdot x$
$\Rightarrow (x' - y^0) \cdot x' \leq (x' - y^0) \cdot x \Rightarrow a \cdot x \geq \alpha$.
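The construction in the proof can be traced numerically. The sketch below assumes a specific convex set that is not in the notes, the unit box S = [0, 1]^2, because the closest point in a box is found by clipping each coordinate; the point `y0`, the grid, and the helper names are all illustrative choices.

```python
def dot(u, v):
    """Inner product of two vectors given as lists."""
    return sum(ui * vi for ui, vi in zip(u, v))

def closest_point_in_box(y, lo=0.0, hi=1.0):
    """Closest point to y in the box [lo, hi]^n: clip each coordinate."""
    return [min(max(yi, lo), hi) for yi in y]

y0 = [2.0, 0.5]                               # hypothetical point outside S = [0, 1]^2
x_prime = closest_point_in_box(y0)            # closest point x' in S, on the boundary
a = [xp - yi for xp, yi in zip(x_prime, y0)]  # a = x' - y0
alpha = dot(a, x_prime)                       # alpha = a . x'

# Sample S on a grid and check (1) a.y0 < alpha and (3) a.x >= alpha.
grid = [[i / 10, j / 10] for i in range(11) for j in range(11)]
separates = dot(a, y0) < alpha and all(dot(a, x) >= alpha - 1e-12 for x in grid)

print(x_prime, a, alpha, separates)
```

A grid check is only a sanity test on sampled points, not a proof; the proof above is what establishes property (3) for every x in S.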
Theorem: The Minkowski Separation Theorem
Let S and T be two nonempty, disjoint, and convex sets in $\mathbb{R}^n$. Then there is an $a \in \mathbb{R}^n$, $a \neq 0$, and an $\alpha \in \mathbb{R}$ such that $a \cdot x \geq \alpha$ for all $x \in S$ and $a \cdot x \leq \alpha$ for all $x \in T$.
Theorem: The Brouwer Fixed Point Theorem
Let $S \subset \mathbb{R}^n$ be a nonempty, compact and convex set. Let $f: S \to S$ be a continuous mapping. Then there exists at least one fixed point of f in S. That is, there exists at least one $x^* \in S$ such that $x^* = f(x^*)$.
Because $f: S \to S$, for all $x \in S$ we have $f(x) \in S$ as well: f maps vectors in S back into vectors in the same set S.
Linear or nonlinear systems of equations define functions of this sort. The system
$y_1 = g^1(x_1, \ldots, x_n)$
$\vdots$
$y_n = g^n(x_1, \ldots, x_n)$
maps points $(x_1, \ldots, x_n) \in \mathbb{R}^n$ into points $(y_1, \ldots, y_n) \in \mathbb{R}^n$.
A fixed point (special case):
$x_1^* = f^1(x_1^*, \ldots, x_n^*)$
$\vdots$
$x_n^* = f^n(x_1^*, \ldots, x_n^*)$
Here $(x_1^*, \ldots, x_n^*)$ is called a fixed point.
Prove the Brouwer Fixed Point Theorem

We need to prove that if f is continuous and $f(x) \in S$ for all $x \in S$, then there exists at least one $x^* \in S$ such that $x^* = f(x^*)$.
This proof is restricted to mappings $f: \mathbb{R} \to \mathbb{R}$.
Let $S = [a, b] \subset \mathbb{R}$, which is compact and convex. Because $f(x) \in S$, we have $f(x) \in [a, b]$, so
$a \leq f(x) \leq b$, that is, $a \leq f(x)$ and $f(x) \leq b$, for all $x \in [a, b]$. (P.1)
Construct a new function: $g(x) \equiv x - f(x)$ (P.2)
and let the domain of g be the same interval [a, b].
y = x is a continuous function and f(x) is assumed to be continuous, so g(x) is a continuous function (the difference of continuous functions is continuous).
Consider g(x) at the endpoints a and b:
By (P.1) and (P.2), $g(a) = a - f(a) \leq 0$ and $g(b) = b - f(b) \geq 0$.
If either of these inequalities holds with equality, a or b is a fixed point: a = f(a) or b = f(b).
Now consider the remaining case, g(a) < 0 and g(b) > 0.
Since g(a) < 0, we can always find c > a such that g(x) < 0 for all $x \in [a, c)$. This is easy to prove. Because g is continuous at a, for all $\varepsilon > 0$ there exists $\delta > 0$ such that $g(B_\delta(a)) \subset B_\varepsilon(g(a))$. Because g(a) < 0, we can choose $\varepsilon > 0$ small enough that $B_\varepsilon(g(a))$ contains only negative real numbers. Then we can take $c = a + \delta$, so that g(x) < 0 for all $x \in [a, c)$.
Similarly, we can find c' < b such that g(x) > 0 for all $x \in (c', b]$.
Let $c^*$ be the largest number such that g(x) < 0 for all $x \in [a, c^*)$, and let $x^*$ be the least upper bound of $[a, c^*)$, so that $x^* = c^*$.
Since $x^* = c^*$, we know that $a < x^* < b$ and g(x) < 0 for all $x \in [a, x^*)$.
$g(x^*)$ cannot be negative. If $g(x^*) < 0$, then $-g(x^*) > 0$. Because g is continuous at $x^*$, taking $\varepsilon = -g(x^*) > 0$ there exists $\delta > 0$ such that $g(B_\delta(x^*)) \subset B_\varepsilon(g(x^*)) = (2g(x^*), 0)$. Then g(x) < 0 for all $x \in [a, x^* + \delta)$, which contradicts the choice of $c^* = x^*$ as the largest such number.

$g(x^*)$ cannot be positive either. If $g(x^*) > 0$, take $\varepsilon = g(x^*)$; then there exists $\delta > 0$ such that $g(B_\delta(x^*)) \subset B_\varepsilon(g(x^*)) = (0, 2g(x^*))$. Then g(x) > 0 for all $x \in (x^* - \delta, x^*)$, which violates the condition g(x) < 0 for all $x \in [a, x^*)$.
So $g(x^*)$ can be neither positive nor negative: $g(x^*) = 0$, that is, $x^* = f(x^*)$.
In the graph, the curve f crosses the 45° line at least once. Brouwer's theorem guarantees that a fixed point exists, not that it is unique.
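In one dimension the same sign change in g(x) = x - f(x) can be used to locate a fixed point numerically. The sketch below uses bisection, which is a different technique from the least-upper-bound argument in the proof; the function name `fixed_point_1d` and the test function cos on [0, 1] are assumptions made for illustration (cos is continuous and maps [0, 1] into itself).

```python
import math

def fixed_point_1d(f, a, b, tol=1e-10):
    """Bisection on g(x) = x - f(x), assuming f is continuous and maps [a, b] into [a, b]."""
    g = lambda x: x - f(x)
    lo, hi = a, b
    if g(lo) == 0:
        return lo          # a is itself a fixed point
    if g(hi) == 0:
        return hi          # b is itself a fixed point
    # Invariant: g(lo) < 0 <= g(hi), so a root of g lies in (lo, hi].
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if g(mid) < 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

x_star = fixed_point_1d(math.cos, 0.0, 1.0)
print(x_star, math.cos(x_star))
```

Bisection returns one fixed point; if f crosses the 45° line several times, which one is found depends on the starting interval.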
1.4. Real Valued Function
Definition of a real valued function: $f: D \to R$ is a real valued function if D is any set and $R \subset \mathbb{R}$.
In this section we restrict our attention to real valued functions whose domains are convex sets.
Assumption: Real Valued Functions over Convex Sets.
Let $f: D \to R$ be a real valued function, where $D \subset \mathbb{R}^n$ is a convex set and $R \subset \mathbb{R}$.
Increasing functions: $f: D \to R$ is an increasing function if $f(x^0) \geq f(x)$ whenever $x^0 \geq x$ and $x^0 \neq x$. We say f is a strictly increasing function if $f(x^0) > f(x)$ whenever $x^0 \geq x$ and $x^0 \neq x$.
Note: $x^0 \geq x$ together with $x^0 \neq x$ means that every component of the vector $x^0$ is at least as large as the corresponding component of x, and at least one component is strictly greater.
Decreasing functions: $f: D \to R$ is a decreasing function if $f(x^0) \leq f(x)$ whenever $x^0 \geq x$ and $x^0 \neq x$. We say f is a strictly decreasing function if $f(x^0) < f(x)$ whenever $x^0 \geq x$ and $x^0 \neq x$.
1.4.1. Related Sets
The graph of a function is a related set which provides an easy and intuitive way of thinking about the function.
There are other sets related to a function.
Level sets: $L(y^0)$ is a level set of the real valued function $f: D \to R$ iff
$L(y^0) = \{x \mid x \in D, f(x) = y^0\}$, where $y^0 \in R \subset \mathbb{R}$.
Examples: isoquants, indifference curves, isoprofit curves.

Uses of level sets:
- They reduce by one the number of dimensions needed to represent the function.
- Two different level sets of a function can never cross.
Level sets relative to a point $x^0$:
$L(x^0)$ is a level set relative to the point $x^0$ iff $L(x^0) = \{x \mid x \in D, f(x) = f(x^0)\}$.
Superior and inferior sets:
1. $S(y^0) = \{x \mid x \in D, f(x) \geq y^0\}$ is called the superior set for level $y^0$.
2. $I(y^0) = \{x \mid x \in D, f(x) \leq y^0\}$ is called the inferior set for level $y^0$.
3. $S'(y^0) = \{x \mid x \in D, f(x) > y^0\}$ is called the strictly superior set for level $y^0$.
4. $I'(y^0) = \{x \mid x \in D, f(x) < y^0\}$ is called the strictly inferior set for level $y^0$.
Theorem: Superior, Inferior, and Level Sets
For any $f: D \to R$ and any $y^0 \in R$:
1. $L(y^0) \subset S(y^0)$
2. $L(y^0) \subset I(y^0)$
3. $L(y^0) = S(y^0) \cap I(y^0)$
4. $S'(y^0) \subset S(y^0)$
5. $I'(y^0) \subset I(y^0)$
6. $S'(y^0) \cap L(y^0) = \emptyset$
7. $I'(y^0) \cap L(y^0) = \emptyset$
8. $S'(y^0) \cap I'(y^0) = \emptyset$.
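The eight relations can be checked on a small example. The sketch below assumes a hypothetical finite domain and the function f(x) = x^2, neither of which comes from the notes, and builds each related set with a set comprehension.

```python
D = range(-3, 4)      # hypothetical finite domain {-3, ..., 3}
f = lambda x: x * x   # hypothetical real valued function
y0 = 4

L = {x for x in D if f(x) == y0}        # level set
S = {x for x in D if f(x) >= y0}        # superior set
I = {x for x in D if f(x) <= y0}        # inferior set
S_strict = {x for x in D if f(x) > y0}  # strictly superior set
I_strict = {x for x in D if f(x) < y0}  # strictly inferior set

checks = [
    L <= S, L <= I, L == S & I,          # items 1-3
    S_strict <= S, I_strict <= I,        # items 4-5
    not (S_strict & L), not (I_strict & L), not (S_strict & I_strict),  # items 6-8
]
print(L, all(checks))
```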
1.4.2. Concave Functions
Concave functions: $f: D \to R$ is a concave function if and only if, for all $x^1$ and $x^2$ in D,
$f(x^t) \geq t f(x^1) + (1-t) f(x^2)$ for all $t \in [0, 1]$, where $x^t \equiv t x^1 + (1-t) x^2$
(f(x) is a real valued function).
- The set of points beneath a concave region of the graph is a convex set. The set of points beneath a non-concave region is not a convex set.
Theorem: Points On and Below the Graph of a Concave Function Always Form a Convex Set
Let $D \subset \mathbb{R}^n$ be a convex set and let $R \subset \mathbb{R}$. Let $A \equiv \{(x, y) \mid x \in D, f(x) \geq y\}$ be the set of points "on or below" the graph of $f: D \to R$. Then f is a concave function $\Leftrightarrow$ A is a convex set.
Proof: We have to show that f concave implies A convex, and that A convex implies f concave.
First part: f is a concave function $\Rightarrow$ A is a convex set.
Take any two points $(x^1, y^1)$ and $(x^2, y^2)$ in the set A, and any $t \in [0, 1]$.
Form the convex combination of the two points, $(x^t, y^t)$, where
$x^t = t x^1 + (1-t) x^2$ and $y^t = t y^1 + (1-t) y^2$.
We need to prove that $(x^t, y^t)$ is also in the set A.
f is a concave function, so by the definition of a concave function:
$f(x^t) \geq t f(x^1) + (1-t) f(x^2)$
By the definition of the set A, we have $f(x^1) \geq y^1$ and $f(x^2) \geq y^2$
$\Rightarrow t f(x^1) + (1-t) f(x^2) \geq t y^1 + (1-t) y^2 = y^t$
$\Rightarrow f(x^t) \geq y^t \Rightarrow (x^t, y^t) \in A \Rightarrow$ A is a convex set.
Second part: A is convex $\Rightarrow$ f is a concave function.
Consider any two points on the graph of f: $(x^1, f(x^1))$ and $(x^2, f(x^2))$. Both are in A.
A is convex $\Rightarrow (x^t, y^t) \in A$, where
$x^t = t x^1 + (1-t) x^2$ and $y^t = t f(x^1) + (1-t) f(x^2)$.
We need to prove that $f(x^t) \geq t f(x^1) + (1-t) f(x^2)$.
By the definition of the set A, $(x^t, y^t) \in A$ means $f(x^t) \geq y^t = t f(x^1) + (1-t) f(x^2)$

$\Rightarrow$ f is a concave function.
Strictly Concave Functions
$f: D \to R$ is a strictly concave function iff for all $x^1 \neq x^2$ in D,

$f(x^t) > t f(x^1) + (1-t) f(x^2)$ for all $t \in (0, 1)$.
Geometrically, this modification simply requires the graph of the function to lie everywhere strictly above the chord connecting any two points on the graph.
It rules out flat portions on the graph of the function.
1.4.3. Quasiconcave Functions
Quasiconcave functions:
$f: D \to R$ is a quasiconcave function iff, for all $x^1$ and $x^2$ in D, $f(x^t) \geq \min[f(x^1), f(x^2)]$ for all $t \in [0, 1]$.
Geometrically:
- When f(x) is an increasing function, it is quasiconcave whenever the level set relative to any convex combination of two points, $L(x^t)$, always lies on or above the lowest of the level sets $L(x^1)$ and $L(x^2)$.
[Figure: level sets through x1, xt and x2 in the (x1, x2) plane for an increasing function.]

- When f(x) is a decreasing function, it is quasiconcave whenever the level set relative to any convex combination of two points, $L(x^t)$, always lies on or below the highest of the level sets $L(x^1)$ and $L(x^2)$.
[Figure: level sets through x1, xt and x2 in the (x1, x2) plane for a decreasing function.]
Theorem: Quasiconcavity and the Superior Sets
$f: D \to R$ is a quasiconcave function if and only if S(x), the superior set relative to the point x, $S(x) = \{x' \mid x' \in D, f(x') \geq f(x)\}$, is a convex set for all $x \in D$.
Proof: We have to prove both directions.
Sufficiency: Prove that f is a quasiconcave function $\Rightarrow$ S(x) is a convex set.
Consider any two points $x^1$ and $x^2$ in the set S(x). We need to prove that $x^t = t x^1 + (1-t) x^2$ is also in S(x), that is, $f(x^t) \geq f(x)$.
By the definition of S(x), we have $f(x^1) \geq f(x)$ and $f(x^2) \geq f(x)$.
By the definition of a quasiconcave function, $f(x^t) \geq \min[f(x^1), f(x^2)] \geq f(x)$
$\Rightarrow f(x^t) \geq f(x)$, so $x^t \in S(x)$.
Necessity: Prove that S(x) is a convex set for all $x \in D$ $\Rightarrow$ f is a quasiconcave function.
We need to prove that $f(x^t) \geq \min[f(x^1), f(x^2)]$ for any $x^1$ and $x^2$ in D.
Consider any two points $x^1$ and $x^2$ in D. Without loss of generality, assume $f(x^1) \geq f(x^2)$.
Then $S(x^1) \subset S(x^2)$, and $x^1$ and $x^2$ are both in $S(x^2)$.
$S(x^2)$ is convex, so $x^t = t x^1 + (1-t) x^2$ (for all $t \in [0, 1]$) is also in $S(x^2)$, that is, $f(x^t) \geq f(x^2)$.
Because $f(x^2) = \min[f(x^1), f(x^2)]$, we have $f(x^t) \geq \min[f(x^1), f(x^2)]$, so f is a quasiconcave function.

[Figure: superior sets S(x) and level sets L(x) in the (x1, x2) plane, one panel for an increasing function and one for a decreasing function.]
Strictly Quasiconcave Functions:
A function $f: D \to R$ is a strictly quasiconcave function iff, for all $x^1 \neq x^2$ in D,
$f(x^t) > \min[f(x^1), f(x^2)]$ for all $t \in (0, 1)$.
- Strict quasiconcavity forbids the convex combination of two distinct points in the same level set from also lying in that level set.
[Figure: two panels for an increasing function in the (x1, x2) plane: "Not Strictly Quasiconcave", where L(x1) = L(x2) = L(xt), and "Strictly Quasiconcave Function".]
Theorem: Concavity Implies Quasiconcavity (Concavity $\Rightarrow$ Quasiconcavity)
A concave function is always quasiconcave. A strictly concave function is always strictly quasiconcave.
Proof: If f is a concave function, f is always a quasiconcave function.
Consider $x^1$ and $x^2$ in D. Without loss of generality, we assume that $f(x^1) \geq f(x^2)$.
f is a concave function $\Rightarrow f(x^t) \geq t f(x^1) + (1-t) f(x^2) = f(x^2) + t[f(x^1) - f(x^2)]$.
Because $f(x^1) - f(x^2) \geq 0$ and $t \geq 0$, $f(x^t) \geq f(x^2)$.
$f(x^1) \geq f(x^2) \Rightarrow f(x^2) = \min[f(x^1), f(x^2)] \Rightarrow f(x^t) \geq \min[f(x^1), f(x^2)] \Rightarrow$ f is a quasiconcave function.
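The converse of this theorem fails: a function can be quasiconcave without being concave. The sketch below tests both defining inequalities on a finite grid for two illustrative functions chosen here, x^3 (increasing, hence quasiconcave, but not concave) and -x^2 (concave). The helper `check`, the grid, and the tolerance `EPS` are all assumptions of this example.

```python
def check(f, points, ts, rule):
    """True if rule(f(xt), f(x1), f(x2), t) holds for every pair of points and every t."""
    return all(
        rule(f(t * x1 + (1 - t) * x2), f(x1), f(x2), t)
        for x1 in points for x2 in points for t in ts
    )

EPS = 1e-9  # tolerance for floating-point rounding

# f(xt) >= t f(x1) + (1 - t) f(x2)
concave_rule = lambda fxt, f1, f2, t: fxt >= t * f1 + (1 - t) * f2 - EPS
# f(xt) >= min[f(x1), f(x2)]
quasiconcave_rule = lambda fxt, f1, f2, t: fxt >= min(f1, f2) - EPS

points = [i / 4 for i in range(-8, 9)]  # grid on [-2, 2]
ts = [j / 10 for j in range(11)]        # t in [0, 1]

cube = lambda x: x ** 3                 # quasiconcave but not concave
neg_square = lambda x: -x * x           # concave, hence quasiconcave

print(check(cube, points, ts, quasiconcave_rule), check(cube, points, ts, concave_rule))
print(check(neg_square, points, ts, concave_rule), check(neg_square, points, ts, quasiconcave_rule))
```

Passing on a grid is only evidence, not a proof, but a single failing triple (for example x1 = 0, x2 = 2, t = 0.5 for x^3) is enough to show that a function is not concave.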

1.4.4 Convex and Quasiconvex Functions
Convex functions:
$f: D \to R$ is a convex function iff for all $x^1$ and $x^2$ in D, $f(x^t) \leq t f(x^1) + (1-t) f(x^2)$ for all $t \in [0, 1]$.
Convexity means that the region on and above the graph, a set of points (x, y), is a convex set.
Strictly convex functions:
$f: D \to R$ is a strictly convex function iff for all $x^1 \neq x^2$ in D, $f(x^t) < t f(x^1) + (1-t) f(x^2)$ for all $t \in (0, 1)$.
Theorem: Points On and Above the Graph of a Convex Function Always Form a Convex Set
Let $D \subset \mathbb{R}^n$ be a convex set and let $R \subset \mathbb{R}$. Let $A^* = \{(x, y) \mid x \in D, f(x) \leq y\}$ be the set of points "on and above" the graph of $f: D \to R$.
f is a convex function $\Leftrightarrow$ $A^*$ is a convex set.
Proof:
Sufficiency: We need to prove that f is a convex function $\Rightarrow$ $A^*$ is a convex set.
Consider two points $(x^1, y^1)$ and $(x^2, y^2)$ in the set $A^*$, so that $f(x^1) \leq y^1$ and $f(x^2) \leq y^2$.
Take the convex combination of the two points, $(x^t, y^t)$, where
$x^t = t x^1 + (1-t) x^2$ and $y^t = t y^1 + (1-t) y^2$.
We need to prove that $(x^t, y^t)$ is also in the set $A^*$, that is, $f(x^t) \leq y^t$.
By the definition of a convex function: $f(x^t) \leq t f(x^1) + (1-t) f(x^2) \leq t y^1 + (1-t) y^2 = y^t$
$\Rightarrow f(x^t) \leq y^t \Rightarrow (x^t, y^t)$ is in the set $A^*$ as well.
Necessity: We need to prove that $A^*$ is a convex set $\Rightarrow$ f is a convex function.

Consider two points on the graph of f: $(x^1, f(x^1))$ and $(x^2, f(x^2))$. Both are in $A^*$.
$A^*$ is a convex set $\Rightarrow (x^t, t f(x^1) + (1-t) f(x^2)) \in A^* \Rightarrow f(x^t) \leq t f(x^1) + (1-t) f(x^2) \Rightarrow$ f is a convex function.
Quasiconvex and Strictly Quasiconvex Functions
1. A function $f: D \to R$ is quasiconvex iff for all $x^1$ and $x^2$ in D, $f(x^t) \leq \max[f(x^1), f(x^2)]$ for all $t \in [0, 1]$.
2. A function $f: D \to R$ is strictly quasiconvex iff for all $x^1 \neq x^2$ in D, $f(x^t) < \max[f(x^1), f(x^2)]$ for all $t \in (0, 1)$.
Theorem: Quasiconvexity and the Inferior Sets
$f: D \to R$ is a quasiconvex function iff the inferior set I(x) is a convex set for all $x \in D$.
[Figure: inferior sets in the (x1, x2) plane, one panel for an increasing function and one for a decreasing function.]
Proof:
Sufficiency: We need to prove that f is quasiconvex $\Rightarrow$ the inferior set I(x) is a convex set.
Consider two points $x^1$ and $x^2$ in I(x), so that $f(x^1) \leq f(x)$ and $f(x^2) \leq f(x)$. We need to prove that $x^t$ is also in I(x).
Without loss of generality, assume that $f(x^1) \leq f(x^2)$, so that $f(x^2) = \max[f(x^1), f(x^2)]$.
f is quasiconvex $\Rightarrow f(x^t) \leq \max[f(x^1), f(x^2)] = f(x^2)$
$\Rightarrow f(x^t) \leq f(x^2) \leq f(x) \Rightarrow x^t \in I(x)$, so I(x) is a convex set.
Necessity: We need to prove that I(x) is a convex set for all $x \in D$ $\Rightarrow$ f is a quasiconvex function.
We need to prove that $f(x^t) \leq \max[f(x^1), f(x^2)]$ for any $x^1$ and $x^2$ in D.
Consider any two points $x^1$ and $x^2$ in D. Without loss of generality, assume that $f(x^1) \leq f(x^2)$, so that $\max[f(x^1), f(x^2)] = f(x^2)$.
Then $x^1$ and $x^2$ are both in $I(x^2)$. $I(x^2)$ is a convex set, so the convex combination $x^t$ of the two points is also in $I(x^2)$, that is, $f(x^t) \leq f(x^2)$
$\Rightarrow f(x^t) \leq \max[f(x^1), f(x^2)] \Rightarrow$ f is a quasiconvex function.
Theorem: Concave/Convex and Quasiconcave/Quasiconvex Functions
1. f(x) is a (strictly) concave function iff -f(x) is a (strictly) convex function.
2. f(x) is a (strictly) quasiconcave function iff -f(x) is a (strictly) quasiconvex function.
Summary:
1. f concave $\Leftrightarrow$ the set of points beneath the graph is convex.
2. f convex $\Leftrightarrow$ the set of points above the graph is convex.
3. f quasiconcave $\Leftrightarrow$ superior sets are convex sets.
4. f quasiconvex $\Leftrightarrow$ inferior sets are convex sets.
5. f concave $\Rightarrow$ f quasiconcave.
6. f convex $\Rightarrow$ f quasiconvex.
7. f (strictly) concave $\Leftrightarrow$ -f (strictly) convex.
8. f (strictly) quasiconcave $\Leftrightarrow$ -f (strictly) quasiconvex.
[Figure: a strictly quasiconvex function of one variable, with x1, xt, x2 on the horizontal axis; $f(x^t) < \max[f(x^1), f(x^2)]$ and the inferior set is a convex set.]
[Figure: a strictly quasiconcave function of one variable, with x1, xt, x2 on the horizontal axis; $f(x^t) > \min[f(x^1), f(x^2)]$ and the superior set is a convex set.]
[Figure: two panels. Left: f(x) is concave, hence quasiconcave, with $f(x^t) \geq \min(f(x^1), f(x^2))$. Right: f(x) is quasiconcave, with $f(x^t) \geq \min(f(x^1), f(x^2))$, but f(x) is not a concave function.]
Chapter 2. Calculus and Optimization

2.1. Calculus
2.1.1 Functions of a Single Variable
Differentiable function: a function is differentiable if it is continuous and smooth, with no breaks or kinks.
Derivative f'(x): the slope, or instantaneous rate of change, of f(x): $\frac{dy}{dx} = f'(x)$, the instantaneous amount by which y changes per unit change in x.
- The first derivative tells us whether the value of f(x) is rising or falling as we increase x.
- The second derivative tells us what the curvature of the function is.
Differential (dy): measures the instantaneous amount by which f(x) changes at the point x following a "small" change in x: $dy = f'(x)\,dx$.
Theorem: Slope, Curvature and Differentials
For any twice continuously differentiable function y = f(x), in the neighborhood of the point x, and for all dx > 0:
First differentials:
$dy \geq 0 \Leftrightarrow f'(x) \geq 0 \Leftrightarrow$ f is locally increasing
$dy \leq 0 \Leftrightarrow f'(x) \leq 0 \Leftrightarrow$ f is locally decreasing
$dy = 0 \Leftrightarrow f'(x) = 0 \Leftrightarrow$ f is locally constant
Second differentials:
$d^2y \leq 0 \Leftrightarrow f''(x) \leq 0 \Leftrightarrow$ f is locally concave
$d^2y \geq 0 \Leftrightarrow f''(x) \geq 0 \Leftrightarrow$ f is locally convex
$d^2y = 0 \Leftrightarrow f''(x) = 0 \Leftrightarrow$ f is locally linear.
2.1.2 Functions of Several Variables
We are concerned with real valued functions of several variables.
Partial derivatives: Let $y = f(x_1, x_2, \ldots, x_n)$. The partial derivative of f with respect to $x_i$ is defined as:
$\frac{\partial f(x)}{\partial x_i} \equiv f_i(x) \equiv \lim_{h \to 0} \frac{f(x_1, \ldots, x_i + h, \ldots, x_n) - f(x_1, \ldots, x_i, \ldots, x_n)}{h}$

- The partial derivative tells us whether the function is rising or falling as we change one variable alone, holding all others constant.
Total differential: tells us whether the value of the function is rising or falling as we change all or some variables simultaneously.
$dy = \sum_{i=1}^{n} f_i(x)\,dx_i$
Using vector notation:
The gradient vector of all first-order partial derivatives: $\nabla f(x) \equiv (f_1(x), f_2(x), \ldots, f_n(x))_{1 \times n}$
The vector of changes in the variables: $dx \equiv (dx_1, \ldots, dx_n)^T_{n \times 1}$
$dy = \nabla f(x) \cdot dx$
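As a numerical illustration of the total differential, the sketch below compares $dy = \nabla f(x) \cdot dx$ with the actual change in f for a small step. The function f(x1, x2) = x1^2 x2, the point and the step are hypothetical choices, and the gradient is written out by hand.

```python
def f(x1, x2):
    return x1 ** 2 * x2             # hypothetical function of two variables

def gradient(x1, x2):
    return (2 * x1 * x2, x1 ** 2)   # analytic partials (f_1, f_2)

x = (1.0, 2.0)
dx = (1e-4, -2e-4)                  # small simultaneous change in both variables

# Total differential: dy = f_1 dx_1 + f_2 dx_2
dy = sum(fi * dxi for fi, dxi in zip(gradient(*x), dx))

# Actual change in the function over the same step
actual = f(x[0] + dx[0], x[1] + dx[1]) - f(*x)

print(dy, actual)
```

The two numbers differ only by terms of second order in dx, which is why the differential is described as measuring the effect of a "small" change.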
Second-order partial derivative: $f_{ij}(x) \equiv \frac{\partial^2 f(x)}{\partial x_i \partial x_j}$
The gradient vector of the partial derivative $f_j$: $\nabla f_j(x) = (f_{1j}(x), f_{2j}(x), \ldots, f_{nj}(x))$
The Hessian matrix: $H(x) = \left[ \frac{\partial^2 f(x)}{\partial x_i \partial x_j} \right]_{n \times n}$
Theorem: Young's Theorem
For any twice continuously differentiable function f(x), we have:
$\frac{\partial^2 f(x)}{\partial x_i \partial x_j} = \frac{\partial^2 f(x)}{\partial x_j \partial x_i}$
Young's Theorem tells us that the Hessian matrix is symmetric.
Quadratic form: $d^2y = dx^T H(x)\,dx$
Theorem: Curvature in Several Variables
Let $f: D \to R$ be twice continuously differentiable and let $x \in D$. Then,
$d^2y \le 0 \iff$ f is concave at x $\iff d^2y = dx^T H(x)\,dx \le 0$ for all dx.
$d^2y \ge 0 \iff$ f is convex at x $\iff d^2y = dx^T H(x)\,dx \ge 0$ for all dx.
$d^2y < 0 \iff$ f is strictly concave at x $\iff d^2y = dx^T H(x)\,dx < 0$ for all $dx \ne 0$.
$d^2y > 0 \iff$ f is strictly convex at x $\iff d^2y = dx^T H(x)\,dx > 0$ for all $dx \ne 0$.
The relations hold globally if they hold for all $x \in D$.
H(x) is positive semi-definite: $d^2y = dx^T H(x)\,dx \ge 0$ for all dx
H(x) is negative semi-definite: $d^2y = dx^T H(x)\,dx \le 0$ for all dx
H(x) is positive definite: $d^2y = dx^T H(x)\,dx > 0$ for all $dx \ne 0$
H(x) is negative definite: $d^2y = dx^T H(x)\,dx < 0$ for all $dx \ne 0$
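The quadratic form can be evaluated numerically. The sketch below is only an illustration under stated assumptions: the function f, the evaluation point and the helper names hessian and quadratic_form are hypothetical and not part of the notes. It approximates H(x) by central finite differences, then evaluates dx'H(x)dx in a few directions. Checking finitely many directions can suggest, but cannot prove, that H(x) is negative definite.

```python
def hessian(f, x, h=1e-4):
    """Central finite-difference approximation of the Hessian of f at x."""
    n = len(x)
    H = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            def shifted(si, sj):
                # Move coordinate i by si*h and coordinate j by sj*h.
                y = list(x)
                y[i] += si * h
                y[j] += sj * h
                return f(y)
            H[i][j] = (shifted(1, 1) - shifted(1, -1)
                       - shifted(-1, 1) + shifted(-1, -1)) / (4.0 * h * h)
    return H


def quadratic_form(H, dx):
    """Evaluate d2y = dx' H dx."""
    n = len(dx)
    return sum(dx[i] * H[i][j] * dx[j] for i in range(n) for j in range(n))


def f(x):
    # Illustrative concave function (an assumption, not taken from the notes).
    return -(x[0] ** 2 + x[0] * x[1] + x[1] ** 2)


H = hessian(f, [1.0, 2.0])
directions = [(1.0, 0.0), (0.0, 1.0), (1.0, 1.0), (1.0, -1.0), (-2.0, 0.5)]
signs = [quadratic_form(H, dx) < 0 for dx in directions]
print(H)      # approximately symmetric, as Young's Theorem predicts
print(signs)  # True where d2y < 0 in that direction
```

For this f the exact Hessian has -2 on the diagonal and -1 off the diagonal, so the off-diagonal entries of the approximation should agree with each other up to rounding error.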
Theorem: Concavity, Convexity and Second-Order Own Partial Derivatives
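Let f(x) be a twice continuously differentiable function. In items 3 and 4, strict concavity and strict convexity are understood in the sense of the curvature theorem above ($d^2y < 0$ or $d^2y > 0$ for all $dx \ne 0$).
1. If f(x) is concave, then $f_{ii}(x) \le 0$, i = 1, 2, …, n
2. If f(x) is convex, then $f_{ii}(x) \ge 0$, i = 1, 2, …, n
3. If f(x) is strictly concave, then $f_{ii}(x) < 0$, i = 1, 2, …, n
4. If f(x) is strictly convex, then $f_{ii}(x) > 0$, i = 1, 2, …, n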
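Proof (part 1): If f(x) is concave, then $f_{ii}(x) \le 0$, i = 1, 2, …, n.
f(x) is concave $\Rightarrow d^2y = dx^T H(x)\,dx \le 0$ for all dx.
Without loss of generality, consider a function of two independent variables:
\[ d^2y = (dx_1, dx_2) \begin{pmatrix} f_{11}(x) & f_{12}(x) \\ f_{21}(x) & f_{22}(x) \end{pmatrix} \begin{pmatrix} dx_1 \\ dx_2 \end{pmatrix} \]
Let $dx = (dx_1, 0)^T$ with $dx_1 \ne 0$. Then $d^2y = f_{11}(x)(dx_1)^2$, and $d^2y \le 0$ implies $f_{11}(x) \le 0$.
The proof is the same for every $f_{ii}(x)$, taking dx with $dx_i \ne 0$ and all other components equal to zero.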
2.1.3. Homogeneous Functions
Homogeneous Functions: A real-valued function f(x) is called
Homogeneous of degree k iff: $f(tx) = t^k f(x)$ for all t > 0.
Homogeneous of degree 1 iff: f(tx) = t f(x) for all t > 0.
Homogeneous of degree 0 iff: f(tx) = f(x) for all t > 0.
Theorem: Partial Derivatives of Homogeneous Functions
If f(x) is homogeneous of degree k, then its partial derivatives are homogeneous of degree k-1.
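Corollary: Linear Homogeneous Functions.
If f(x) is homogeneous of degree 1, then $f_i(tx) = f_i(x)$ for all t > 0 and i = 1, …, n, where $f_i(tx)$ denotes the partial derivative $\partial f/\partial x_i$ evaluated at the point tx.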
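Example: Consider the Cobb-Douglas function $f(x) = A x_1^{\alpha} x_2^{\beta}$, which is homogeneous of degree $\alpha + \beta$ because $f(tx) = A (tx_1)^{\alpha} (tx_2)^{\beta} = t^{\alpha+\beta} f(x)$.
Its partial derivative with respect to x1, evaluated at the point tx, is
\[ f_1(tx) = \alpha A (tx_1)^{\alpha-1} (tx_2)^{\beta} = t^{\alpha+\beta-1}\,\alpha A x_1^{\alpha-1} x_2^{\beta} = t^{\alpha+\beta-1} f_1(x), \]
so $f_1$ is homogeneous of degree $\alpha + \beta - 1$.
$f_1(tx) = f_1(x)$ for all t > 0 $\iff \alpha + \beta = 1 \iff$ the Cobb-Douglas function is homogeneous of degree 1.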
Theorem: Euler’s Theorem
1. If f(x) is homogeneous of degree k, then:
\[ k\,f(x) = \sum_{i=1}^{n} \frac{\partial f(x)}{\partial x_i}\,x_i \]
2. If f(x) is homogeneous of degree 1, then:
\[ f(x) = \sum_{i=1}^{n} \frac{\partial f(x)}{\partial x_i}\,x_i \]
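Proof: Assume that f(x) is homogeneous of degree k. By definition, $f(tx) = t^k f(x)$ for all t > 0.
Differentiating the left-hand side with respect to t, we have:
\[ \frac{\partial f(tx)}{\partial t} = \sum_{i=1}^{n} \frac{\partial f(tx)}{\partial (t x_i)}\,x_i \]
Differentiating the right-hand side with respect to t, we have:
\[ \frac{\partial\,[t^k f(x)]}{\partial t} = k\,t^{k-1} f(x) \]
Equating the two derivatives:
\[ k\,t^{k-1} f(x) = \sum_{i=1}^{n} \frac{\partial f(tx)}{\partial (t x_i)}\,x_i \]
Letting t = 1, we have:
\[ k\,f(x) = \sum_{i=1}^{n} \frac{\partial f(x)}{\partial x_i}\,x_i \]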
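Euler’s Theorem and the homogeneity of the partial derivatives can be checked numerically for the Cobb-Douglas example. This is a minimal sketch, assuming illustrative parameter values (A = 2, α = 0.3, β = 0.5) and an arbitrary evaluation point; none of these numbers come from the notes.

```python
import math

A, ALPHA, BETA = 2.0, 0.3, 0.5   # illustrative parameter values (assumed)


def f(x1, x2):
    """Cobb-Douglas function f(x) = A * x1^alpha * x2^beta."""
    return A * x1 ** ALPHA * x2 ** BETA


def f1(x1, x2):
    """Partial derivative of f with respect to x1."""
    return ALPHA * A * x1 ** (ALPHA - 1.0) * x2 ** BETA


def f2(x1, x2):
    """Partial derivative of f with respect to x2."""
    return BETA * A * x1 ** ALPHA * x2 ** (BETA - 1.0)


x1, x2, t = 1.5, 4.0, 3.0
k = ALPHA + BETA   # degree of homogeneity of f

# f(tx) = t^k f(x)
homogeneous = math.isclose(f(t * x1, t * x2), t ** k * f(x1, x2))
# f_1(tx) = t^(k-1) f_1(x): the partial derivative is homogeneous of degree k-1
partial_homogeneous = math.isclose(f1(t * x1, t * x2),
                                   t ** (k - 1.0) * f1(x1, x2))
# Euler: k f(x) = f_1(x) x1 + f_2(x) x2
euler = math.isclose(k * f(x1, x2), f1(x1, x2) * x1 + f2(x1, x2) * x2)
print(homogeneous, partial_homogeneous, euler)
```

A numerical check at one point is of course not a proof; it only confirms that the formulas above are consistent with each other for these particular values.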
2.2. Optimization
Consider a differentiable function of a single variable, y = f(x).
Local maximum at a point x*: $f(x^*) \ge f(x)$ for all x in some neighborhood of x*.
A unique local maximum at a point x*: $f(x^*) > f(x)$ for all $x \ne x^*$ in some neighborhood of x*.
Global maximum at a point x*: $f(x^*) \ge f(x)$ for all x in the domain D.
A unique global maximum at a point x*: $f(x^*) > f(x)$ for all $x \ne x^*$ in the domain D.
Theorem: Necessary Conditions for Local Interior Optima in the Single – Variable Case
Let f(x) be a twice differentiable function of one variable. If f(x) reaches a local interior
1. Maximum at x* (f(x) is locally concave), then
$f'(x^*) = 0$ (FONC), so that $dy = f'(x^*)\,dx = 0$, and
$f''(x^*) \le 0$ (SONC), so that $d^2y = f''(x^*)\,dx^2 \le 0$.
2. Minimum at x* (f(x) is locally convex), then
$f'(x^*) = 0$ (FONC), so that $dy = f'(x^*)\,dx = 0$, and
$f''(x^*) \ge 0$ (SONC), so that $d^2y = f''(x^*)\,dx^2 \ge 0$.
2.2.1 Real Valued Functions of Several Variables
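Local Maximum: f(x) reaches a local maximum at the point x* if there exists $\varepsilon > 0$ such that $f(x^*) \ge f(x)$ for all $x \in B_\varepsilon(x^*)$.
Global Maximum: f(x) reaches a global maximum at the point $x^* \in D \subset R^n$ if $f(x^*) \ge f(x)$ for all $x \in D$.
Unique Local Maximum: f(x) reaches a unique local maximum at the point x* if there exists $\varepsilon > 0$ such that $f(x^*) > f(x)$ for all $x \in B_\varepsilon(x^*)$, $x \ne x^*$.
Unique Global Maximum: f(x) reaches a unique global maximum at the point $x^* \in D \subset R^n$ if $f(x^*) > f(x)$ for all $x \in D$, $x \ne x^*$.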
Theorem: Local – Global Theorem
1. Let f(x) be a concave function. Then f(x) reaches a local interior maximum at x* $\iff$ f(x) reaches a global interior maximum at x*.
2. Let f(x) be a convex function. Then f(x) reaches a local interior minimum at x* $\iff$ f(x) reaches a global interior minimum at x*.
Proof (part 1): Let f(x) be a concave function. We show that f(x) reaches a local interior maximum at x* $\iff$ f(x) reaches a global interior maximum at x*.
Sufficiency: we need to prove that if f(x) reaches a local interior maximum at x*, then f(x) reaches a global interior maximum at x*.
f(x) reaches a local interior maximum at x* $\Rightarrow$ there exists $\varepsilon > 0$ such that $f(x^*) \ge f(x)$ for all x in $B_\varepsilon(x^*)$.
Suppose f(x) does not reach a global interior maximum at x*. Then there exists x’ in D such that f(x’) > f(x*). We show that this leads to a contradiction.
Let $x^t = t x' + (1-t) x^*$, $t \in (0,1)$, be a convex combination of x’ and x*. By the definition of a concave function, we have: $f(x^t) \ge t f(x') + (1-t) f(x^*) = f(x^*) + t[f(x') - f(x^*)]$.
Because f(x’) > f(x*), it follows that $f(x^t) > f(x^*)$ for all $t \in (0,1)$.
Taking t small enough that $x^t$ lies in $B_\varepsilon(x^*)$, we obtain a point of $B_\varepsilon(x^*)$ at which $f(x^t) > f(x^*)$.
This contradicts the assumption that f(x) reaches a local interior maximum at x*, that is, that $f(x^*) \ge f(x)$ for all x in $B_\varepsilon(x^*)$.

[Figure: graph of f(x) with x*, xt and x’ on the horizontal axis and f(x*), f(xt) and f(x’) on the vertical axis. Caption: f(x*) > f(xt) contradicts the concavity of f(x).]
Necessity: Assume f(x) reaches a global interior maximum at x*. Then $f(x^*) \ge f(x)$ for all x in D and, in particular, for all x in any neighborhood of x*, so f(x) also reaches a local interior maximum at x*.
Theorem: Strict Concavity/Convexity and the Uniqueness of Global Optima
1. Let f(x) be a strictly concave function. If x* maximizes f(x), then x* is the unique global maximizer and $f(x^*) > f(x)$ for all $x \in D$, $x \ne x^*$.
2. Let f(x) be a strictly convex function. If x* minimizes f(x), then x* is the unique global minimizer and $f(x^*) < f(x)$ for all $x \in D$, $x \ne x^*$.
Proof (part 1): Let x* maximize f(x), and suppose there were another point $x' \ne x^*$ that also maximizes f(x), so that f(x*) = f(x’).
Let $x^t = t x^* + (1-t) x'$ for some $t \in (0,1)$. Because f(x) is strictly concave, $f(x^t) > t f(x^*) + (1-t) f(x') = f(x') = f(x^*)$. This violates the assumption that x* and x’ are global maximizers of f(x).
So the assumption that there is a second maximizer x’ is impossible.
Theorem: First – Order Necessary Condition for Local Interior Optima of Real Valued Functions
Let f(x) be a differentiable function. If f(x) reaches a local interior maximum or minimum at x*, then x* solves the system of simultaneous equations,
\[ \frac{\partial f(x^*)}{\partial x_1} = 0, \quad \frac{\partial f(x^*)}{\partial x_2} = 0, \quad \dots, \quad \frac{\partial f(x^*)}{\partial x_n} = 0. \]
Proof:
We suppose that f(x) reaches a local interior extremum at x* and seek to show that $\nabla f(x^*) = 0$.
Fix an arbitrary vector dx and consider the function of the single variable t: g(t) = f(x* + t·dx).
For $t \ne 0$, x* + t·dx is a point different from x*, so g(t) is the value of f at a point on the line through x* in the direction dx, and g(0) = f(x*).
Because f(x) reaches a local interior extremum at x*, g(t) reaches a local interior extremum at t = 0, so $g'(0) = 0$. By the chain rule,
\[ g'(0) = \sum_{i=1}^{n} \frac{\partial f(x^*)}{\partial x_i}\,dx_i = \nabla f(x^*)\,dx = 0. \]
Since this holds for every dx, $\nabla f(x^*) = 0$ (the zero vector).
2.2.2 Second – Order Conditions
We have a maximum if the function is locally concave at the point.
We have a minimum if the function is locally convex at the point.
Theorem: Second – Order Necessary Condition for Local Interior Optima of Real Valued Functions
Let y = f(x) be twice differentiable.
1. If f(x) reaches a local interior maximum at x*, then
\[ d^2y = dx^T H(x^*)\,dx = \sum_{i=1}^{n} \sum_{j=1}^{n} f_{ij}(x^*)\,dx_i\,dx_j \le 0 \]
2. If f(x) reaches a local interior minimum at x*, then
\[ d^2y = dx^T H(x^*)\,dx = \sum_{i=1}^{n} \sum_{j=1}^{n} f_{ij}(x^*)\,dx_i\,dx_j \ge 0 \]
Proof:
Fix an arbitrary vector dx and consider: g(t) = f(x* + t·dx).
If f(x) reaches a local interior maximum at x*, then g(t) reaches a local maximum at t = 0, so $g''(0) \le 0$ by the single-variable second-order necessary condition.
For any t, $g'(t) = \sum_{i=1}^{n} f_i(x^* + t\,dx)\,dx_i$
$\Rightarrow g''(t) = \sum_{i=1}^{n} \sum_{j=1}^{n} f_{ij}(x^* + t\,dx)\,dx_i\,dx_j$
At t = 0: $g''(0) = \sum_{i=1}^{n} \sum_{j=1}^{n} f_{ij}(x^*)\,dx_i\,dx_j \le 0$.
The argument for a minimum is symmetric, with $g''(0) \ge 0$.
Note:
- Both theorems above are only necessary conditions. They state that if f(x) reaches an optimal value at x*, then f’(x*) = 0 and $f''(x^*) \le 0$ for a maximum ($\ge 0$ for a minimum).
- Our main goal, however, is a statement of the form: “If such and such obtains at x, then x optimizes the function”, that is, sufficient conditions.
Sufficient Conditions:
1. Local Maximum at x*: if $f_i(x^*) = 0$ for i = 1, …, n and f(x) is locally strictly concave at x*.
2. Local Minimum at x*: if $f_i(x^*) = 0$ for i = 1, …, n and f(x) is locally strictly convex at x*.
Theorem. Sufficient Conditions for Strict Concavity or Strict Convexity of a Real Valued Function
Let f(x) be a twice differentiable function, and let $D_i(x)$ be the ith-order leading principal minor of the Hessian matrix H(x), i = 1, …, n.
1. If $(-1)^i D_i(x) > 0$ for i = 1, …, n, then f(x) is strictly concave at x.
2. If $D_i(x) > 0$ for i = 1, …, n, then f(x) is strictly convex at x.
If the respective conditions hold for all x in the domain, the function is globally strictly concave or globally strictly convex, respectively.
Theorem. Sufficient Conditions for Local Interior Optima of Real Valued Functions
1. If $f_i(x^*) = 0$ and $(-1)^i D_i(x^*) > 0$, i = 1, …, n, then f(x) reaches a local maximum at x*.
2. If $f_i(x^*) = 0$ and $D_i(x^*) > 0$, i = 1, …, n, then f(x) reaches a local minimum at x*.
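The principal-minor test is mechanical enough to sketch in code. The helper names det, leading_principal_minors and classify and the example Hessians below are assumptions made for illustration. The determinant uses Laplace expansion, which is adequate only for the small matrices typical of textbook exercises.

```python
def det(M):
    """Determinant by Laplace expansion along the first row."""
    if len(M) == 1:
        return M[0][0]
    total = 0.0
    for j, a in enumerate(M[0]):
        minor = [row[:j] + row[j + 1:] for row in M[1:]]
        total += (-1) ** j * a * det(minor)
    return total


def leading_principal_minors(H):
    """Return [D_1, ..., D_n], the determinants of the top-left k x k blocks."""
    return [det([row[:k] for row in H[:k]]) for k in range(1, len(H) + 1)]


def classify(H):
    """Apply the sufficient conditions to a Hessian evaluated at a point."""
    D = leading_principal_minors(H)
    if all(d > 0 for d in D):
        return "strictly convex"
    if all((-1) ** i * d > 0 for i, d in enumerate(D, start=1)):
        return "strictly concave"
    return "inconclusive"


# Hessian of f = -x1^2 + x1*x2 - x2^2 (illustrative example)
print(classify([[-2.0, 1.0], [1.0, -2.0]]))
# Hessian of f = x1^2 + x2^2
print(classify([[2.0, 0.0], [0.0, 2.0]]))
# Hessian of f = x1^2 - x2^2 (the conditions do not apply)
print(classify([[2.0, 0.0], [0.0, -2.0]]))
```

Because the conditions are only sufficient, "inconclusive" means the test is silent, not that the function lacks the property.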

Theorem. Sufficient Condition for Unique Global Optima
Let f(x) be differentiable.
1. If f(x) is globally strictly concave and $f_i(x^*) = 0$, i = 1, …, n, then x* is the unique global maximizer of f(x).
2. If f(x) is globally strictly convex and $f_i(x^*) = 0$, i = 1, …, n, then x* is the unique global minimizer of f(x).
Proof (part 1): Suppose f(x) is globally strictly concave and $f_i(x^*) = 0$ for all i.
Consider the two points x* and any $x' \ne x^*$ in the domain. Because f(x) is strictly concave,
\[ f(t x' + (1-t) x^*) > t f(x') + (1-t) f(x^*) \quad \text{for all } t \in (0,1). \]
Rearranging,
\[ \frac{f(x^* + t(x' - x^*)) - f(x^*)}{t} > f(x') - f(x^*) \quad \text{for all } t \in (0,1), \]
and, since only the weak inequality survives in the limit,
\[ \lim_{t \to 0} \frac{f(x^* + t(x' - x^*)) - f(x^*)}{t} \ge f(x') - f(x^*). \quad (1) \]
Consider the function g(h) = f(x* + h(x’ – x*)). By the chain rule,
\[ g'(h) = \sum_{i=1}^{n} f_i(x^* + h(x' - x^*))\,(x'_i - x^*_i). \]
At h = 0: $g'(0) = \sum_{i=1}^{n} f_i(x^*)(x'_i - x^*_i) = \nabla f(x^*)(x' - x^*)$ (the product of two vectors), and by the definition of the derivative
\[ g'(0) = \lim_{t \to 0} \frac{f(x^* + t(x' - x^*)) - f(x^*)}{t}. \]
Because $\nabla f(x^*) = 0$, this limit equals 0. (2)
Combining (1) and (2), we have $f(x') - f(x^*) \le 0$, or $f(x^*) \ge f(x')$.
Because x’ was chosen arbitrarily, x* is a global maximizer.
Since f is globally strictly concave, the theorem on strict concavity and the uniqueness of global optima implies that x* is the unique maximizer, so $f(x^*) > f(x')$ for all $x' \ne x^*$ in the domain.

2.3. Constrained Optimization
2.3.1 Equality Constraints
Max f(x1, x2) subject to g(x1, x2) = 0
Objective function / maximand: f(x1, x2)
Constraint: g(x1, x2) = 0
Constraint set / Feasible set: the set of points (x1, x2) such that g(x1, x2) = 0 is satisfied.
2.3.2 Lagrange’s Method
Max f(x1, x2) subject to g(x1, x2) = 0
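Lagrange Function: $L(x_1, x_2, \lambda) = f(x_1, x_2) + \lambda\,g(x_1, x_2)$
We now treat L as an ordinary function of the three variables $(x_1, x_2, \lambda)$ and look for its critical points:
\[ \frac{\partial L}{\partial x_1} = \frac{\partial f(x_1, x_2)}{\partial x_1} + \lambda \frac{\partial g(x_1, x_2)}{\partial x_1} = 0 \]
\[ \frac{\partial L}{\partial x_2} = \frac{\partial f(x_1, x_2)}{\partial x_2} + \lambda \frac{\partial g(x_1, x_2)}{\partial x_2} = 0 \]
\[ \frac{\partial L}{\partial \lambda} = g(x_1, x_2) = 0 \]
Solving these simultaneous equations gives a critical point $(x_1^*, x_2^*, \lambda^*)$ such that $(x_1^*, x_2^*)$ is a critical point of f(x1, x2) along the constraint g(x1, x2) = 0.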
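Proof:
We need to prove that $x^* = (x_1^*, x_2^*)$ is also a critical point of f(x1, x2) along the constraint, i.e. that df = 0 at x* for all dx1 and dx2 that keep g(x1, x2) = 0.
At a critical point of L the total differential of L vanishes:
\[ dL = \frac{\partial f(x^*)}{\partial x_1} dx_1 + \frac{\partial f(x^*)}{\partial x_2} dx_2 + g(x^*)\,d\lambda + \lambda^* \left[ \frac{\partial g(x^*)}{\partial x_1} dx_1 + \frac{\partial g(x^*)}{\partial x_2} dx_2 \right] = 0 \]
for all dx1, dx2 and $d\lambda$.
Because g(x1, x2) = 0 must continue to hold, the admissible dx1 and dx2 satisfy
\[ dg = \frac{\partial g(x^*)}{\partial x_1} dx_1 + \frac{\partial g(x^*)}{\partial x_2} dx_2 = 0. \]
Moreover $g(x^*) = 0$, so the $d\lambda$ term vanishes as well (equivalently, take $d\lambda = 0$). Hence
\[ dL = \frac{\partial f(x^*)}{\partial x_1} dx_1 + \frac{\partial f(x^*)}{\partial x_2} dx_2 = df = 0 \quad \text{at } (x_1^*, x_2^*). \]
$\Rightarrow$ df(x1*, x2*) = 0 for all dx1 and dx2 satisfying the constraint.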
- The first-order conditions alone cannot tell us whether a critical point is a maximum or a minimum. Distinguishing between the two requires knowledge of the “curvature” of the objective and constraint relations at the critical point in question.
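As a worked illustration of Lagrange’s method, consider the hypothetical problem max x1·x2 subject to 2 – x1 – x2 = 0. This example is not taken from the notes. The first-order conditions x2 – λ = 0, x1 – λ = 0 and 2 – x1 – x2 = 0 give x1* = x2* = 1 and λ* = 1. The sketch below checks numerically that the gradient of L vanishes there and that f falls as we move along the constraint.

```python
def f(x1, x2):
    """Objective function (illustrative assumption)."""
    return x1 * x2


def g(x1, x2):
    """Constraint, written so that feasibility means g(x1, x2) = 0."""
    return 2.0 - x1 - x2


def lagrangian(x1, x2, lam):
    return f(x1, x2) + lam * g(x1, x2)


def gradient(func, point, h=1e-6):
    """Central finite-difference gradient of func at point."""
    grad = []
    for i in range(len(point)):
        up, down = list(point), list(point)
        up[i] += h
        down[i] -= h
        grad.append((func(*up) - func(*down)) / (2.0 * h))
    return grad


critical = (1.0, 1.0, 1.0)            # (x1*, x2*, lambda*) from the FOCs
grad_L = gradient(lagrangian, critical)
print(grad_L)                          # each entry should be close to zero

# Move along the constraint: (1 + d, 1 - d) always satisfies g = 0.
along_constraint = [f(1.0 + d, 1.0 - d) for d in (-0.5, -0.1, 0.1, 0.5)]
print(all(v < f(1.0, 1.0) for v in along_constraint))
```

Here f(1 + d, 1 – d) = 1 – d², so the critical point is a constrained maximum. In general that conclusion needs the second-order conditions, not just the vanishing gradient.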
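General problem:
Max f(x1, x2, …, xn) subject to
\[ g^1(x_1, \dots, x_n) = 0, \quad g^2(x_1, \dots, x_n) = 0, \quad \dots, \quad g^m(x_1, \dots, x_n) = 0 \]
Lagrange function with (n+m) variables: $L(x, \lambda) = f(x) + \sum_{j=1}^{m} \lambda_j g^j(x)$
First – Order Conditions:
\[ \frac{\partial L}{\partial x_i} = \frac{\partial f(x)}{\partial x_i} + \sum_{j=1}^{m} \lambda_j \frac{\partial g^j(x)}{\partial x_i} = 0, \quad i = 1, \dots, n \]
\[ \frac{\partial L}{\partial \lambda_j} = g^j(x) = 0, \quad j = 1, \dots, m \]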
Theorem. Lagrange’s Theorem
Let f(x) and $g^j(x)$, j = 1, 2, …, m, be twice continuously differentiable real-valued functions over some domain $D \subset R^n$. Let x* be an interior point of D and suppose that x* is an optimum of f subject to the constraints $g^j(x) = 0$. If m < n and if the gradient vectors $\nabla g^j(x^*)$, j = 1, …, m, are linearly independent, then there exist m unique numbers $\lambda_j^*$ such that the Lagrange function has an optimum in x at x* and
\[ \frac{\partial L(x^*, \lambda^*)}{\partial x_i} = \frac{\partial f(x^*)}{\partial x_i} + \sum_{j=1}^{m} \lambda_j^* \frac{\partial g^j(x^*)}{\partial x_i} = 0, \quad i = 1, \dots, n. \]
2.3.3 Geometrical Interpretation
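- Represent the objective function by its level sets:
$L(y^0) = \{(x_1, x_2) \mid f(x_1, x_2) = y^0\}$
Along the level curve $L(y^0)$ the value of f does not change: $\dfrac{\partial f(x_1, x_2)}{\partial x_1} dx_1 + \dfrac{\partial f(x_1, x_2)}{\partial x_2} dx_2 = 0$
\[ \left. \frac{dx_2}{dx_1} \right|_{\text{along } L(y^0)} = -\frac{f_1(x_1, x_2)}{f_2(x_1, x_2)} \]
This is the slope of the level curve $L(y^0)$ through any point (x1, x2).
- The slope of the constraint:
$\dfrac{\partial g(x_1, x_2)}{\partial x_1} dx_1 + \dfrac{\partial g(x_1, x_2)}{\partial x_2} dx_2 = 0$
\[ \left. \frac{dx_2}{dx_1} \right|_{\text{along } g(\cdot) = 0} = -\frac{g_1(x_1, x_2)}{g_2(x_1, x_2)} \]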
- Recall the first-order conditions of the Lagrange function:
\[ \frac{\partial f(x_1^*, x_2^*)}{\partial x_1} = -\lambda^* \frac{\partial g(x_1^*, x_2^*)}{\partial x_1}, \qquad \frac{\partial f(x_1^*, x_2^*)}{\partial x_2} = -\lambda^* \frac{\partial g(x_1^*, x_2^*)}{\partial x_2}, \qquad g(x_1^*, x_2^*) = 0 \]
Dividing the first of these by the second to eliminate $\lambda^*$, we have:
\[ \frac{f_1(x_1^*, x_2^*)}{f_2(x_1^*, x_2^*)} = \frac{g_1(x_1^*, x_2^*)}{g_2(x_1^*, x_2^*)}, \qquad g(x_1^*, x_2^*) = 0 \]
Thus, at (x1*, x2*), the slope of the level curve of f equals the slope of the constraint curve. In addition, this point must lie on the constraint curve in order to satisfy the condition g(x1*, x2*) = 0.

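[Figure: two panels, “Maximization Problem” and “Minimization Problem”. Each shows level curves L(y) and L(y*) in the (x1, x2) plane, with L(y*) touching the constraint curve g(x) = 0 at the point (x1*, x2*).]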
2.3.4 Second – Order Conditions
- If $(x^*, \lambda^*)$ satisfies the second-order conditions for a maximum of the Lagrange function, we know we have a local maximum of f subject to the constraints.
- All we really need in order to know that we have a maximum is that the second differential of the objective function, at the point which solves the first-order conditions, is decreasing along the constraint.
Theorem: Sufficient Conditions for Optima with Equality Constraints
Let the objective function be f(x) and let the m constraints be given by $g^j(x) = 0$, j = 1, …, m. Let L be the Lagrange function. Let $(x^*, \lambda^*)$ solve the first-order conditions. Then:
1. x* maximizes f(x) subject to the constraints if the principal minors of the bordered Hessian alternate in sign beginning with positive, D3 > 0, D4 < 0, …, when evaluated at $(x^*, \lambda^*)$.
2. x* minimizes f(x) subject to the constraints if the principal minors of the bordered Hessian are all negative, D3 < 0, D4 < 0, …, when evaluated at $(x^*, \lambda^*)$.
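Hessian Matrix (the Hessian of L bordered by the first-order partial derivatives of the constraints, where $L_{ik} = \partial^2 L / \partial x_i \partial x_k$ and $g^j_i = \partial g^j / \partial x_i$):
\[ \bar{H} = \begin{pmatrix} L_{11} & \cdots & L_{1n} & g^1_1 & \cdots & g^m_1 \\ \vdots & & \vdots & \vdots & & \vdots \\ L_{n1} & \cdots & L_{nn} & g^1_n & \cdots & g^m_n \\ g^1_1 & \cdots & g^1_n & 0 & \cdots & 0 \\ \vdots & & \vdots & \vdots & & \vdots \\ g^m_1 & \cdots & g^m_n & 0 & \cdots & 0 \end{pmatrix}_{(n+m) \times (n+m)} \]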
2.3.5. Inequality Constraints:
Consider the simplest problem: max f(x) subject to $x \ge 0$ ($D \subset R$).
Let x* be the maximum of the problem. There are three possible cases:
Case 1: x* = 0 and f’(x*) < 0.
Case 2: x* = 0 and f’(x*) = 0.
Case 3: x* > 0 and f’(x*) = 0.
A convenient set of conditions summarizes all three possibilities. x* must satisfy all of the following (Maximization Problem):
(1) $f'(x^*) \le 0$
(2) $x^* \cdot f'(x^*) = 0$
(3) $x^* \ge 0$
To solve this system, concentrate on condition (2), x*·f’(x*) = 0, to obtain a set of candidate solutions x, then use conditions (1) and (3) to determine which of the candidates can be x*.
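The solution procedure just described can be sketched in a few lines. The function names and the three example objectives below are hypothetical. The roots of f’(x) = 0 are supplied by hand rather than computed, and the conditions checked are necessary only, so a surviving candidate still has to be confirmed as a maximum.

```python
def satisfies_max_conditions(fprime, x, tol=1e-9):
    """Check (1) f'(x) <= 0, (2) x * f'(x) = 0 and (3) x >= 0."""
    slope = fprime(x)
    return slope <= tol and abs(x * slope) <= tol and x >= -tol


def candidates_for_max(fprime, roots_of_fprime, tol=1e-9):
    """Condition (2) holds at x = 0 and at every root of f'(x) = 0.
    Conditions (1) and (3) then screen those candidates."""
    candidates = sorted(set([0.0] + list(roots_of_fprime)))
    return [x for x in candidates if satisfies_max_conditions(fprime, x, tol)]


# Case 3: f(x) = -(x - 2)^2, so f'(x) = -2(x - 2) with root x = 2.
case3 = candidates_for_max(lambda x: -2.0 * (x - 2.0), [2.0])
# Case 1: f(x) = -(x + 1)^2, so f'(x) = -2(x + 1) with root x = -1.
case1 = candidates_for_max(lambda x: -2.0 * (x + 1.0), [-1.0])
# Case 2: f(x) = -x^2, so f'(x) = -2x with root x = 0.
case2 = candidates_for_max(lambda x: -2.0 * x, [0.0])
print(case3, case1, case2)
```

In the first example the corner x = 0 is rejected because f’(0) > 0 there. In the second the interior root is rejected because it violates x ≥ 0.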
Similarly, the conditions for the Minimization Problem are:
(1) $f'(x^*) \ge 0$
(2) $x^* \cdot f'(x^*) = 0$
(3) $x^* \ge 0$
Theorem: Necessary Conditions for Optima of Real Valued Functions Subject to Non-negativity Constraints ($x_i \ge 0$)
Let f(x) be continuously differentiable.
1. If x* maximizes f(x) subject to $x \ge 0$, then the vector x* = (x1*, x2*, …, xn*) satisfies, for i = 1, …, n:
$f_i(x^*) \le 0$
$x_i^* \cdot f_i(x^*) = 0$
$x_i^* \ge 0$
2. If x* minimizes f(x) subject to $x \ge 0$, then the vector x* = (x1*, x2*, …, xn*) satisfies, for i = 1, …, n:
$f_i(x^*) \ge 0$
$x_i^* \cdot f_i(x^*) = 0$
$x_i^* \ge 0$
2.3.6 Kuhn – Tucker Conditions
Non – Linear Programming problem: there are no limitations on the forms of the objective function and the constraint relations.
Max f(x1, x2) subject to $g(x_1, x_2) \ge 0$ and $x_1 \ge 0$, $x_2 \ge 0$.
Lagrange’s Theorem above tells us that the maximum of a function subject to equality constraints coincides with the maximum of its corresponding Lagrangian with no constraints. The theorem on non-negativity constraints above shows us how to characterize the maximum of a function subject to non-negativity constraints only.
To solve the non-linear programming problem, we convert it to one with equality constraints and non-negativity constraints and apply what we know.
The trick:
Because $g \ge 0$, there is a z such that g – z = 0 and $z \ge 0$ (z must be non-negative because $g \ge 0$).
Now, our problem is converted to:
Max f(x1, x2) subject to g(x1, x2) – z = 0 and $x_1 \ge 0$, $x_2 \ge 0$, $z \ge 0$.
The first result tells us that the maximum over x of f subject to the equality constraint coincides with the unconstrained maximum over x of the associated Lagrangian. The second tells us how to find optima subject to non-negativity constraints.
The problem is therefore converted to:
Max $L(x_1, x_2, z, \lambda) = f(x_1, x_2) + \lambda\,[g(x_1, x_2) - z]$ subject to $x_1 \ge 0$, $x_2 \ge 0$ and $z \ge 0$.
The first – order conditions on x1, x2, z and $\lambda$ (maximization problem):
(1) $L_{x_1} \le 0$: $f_1 + \lambda g_1 \le 0$
(2) $x_1 L_{x_1} = 0$: $x_1 (f_1 + \lambda g_1) = 0$
(3) $L_{x_2} \le 0$: $f_2 + \lambda g_2 \le 0$
(4) $x_2 L_{x_2} = 0$: $x_2 (f_2 + \lambda g_2) = 0$
(5) $L_z \le 0$: $-\lambda \le 0$
(6) $z L_z = 0$: $-\lambda z = 0$
(7) $x_1 \ge 0$, $x_2 \ge 0$, $z \ge 0$
(8) $L_\lambda = 0$: $g(x_1, x_2) - z = 0$
We do not impose any sign restriction on $\lambda$ directly; condition (5) already implies $\lambda \ge 0$.
Using condition (8), we can eliminate z by substituting z = g(x1, x2) (maximization problem):
(1) $L_{x_1} \le 0$: $f_1 + \lambda g_1 \le 0$
(2) $x_1 L_{x_1} = 0$: $x_1 (f_1 + \lambda g_1) = 0$
(3) $L_{x_2} \le 0$: $f_2 + \lambda g_2 \le 0$
(4) $x_2 L_{x_2} = 0$: $x_2 (f_2 + \lambda g_2) = 0$
(5’) $g(x_1, x_2) \ge 0$
(6’) $\lambda\,g(x_1, x_2) = 0$
(7’) $x_1 \ge 0$, $x_2 \ge 0$, $\lambda \ge 0$
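Because $L_\lambda = g(x_1, x_2)$ once z has been eliminated, conditions (5’), (6’) and (7’) are exactly the necessary conditions for minimizing the Lagrangian in $\lambda$ subject to $\lambda \ge 0$.
$\Rightarrow$ Maximization Problem: the necessary condition is a maximum of the Lagrangian in the variables xi and a minimum of the Lagrangian in the multiplier $\lambda$, that is, a saddle point $(x_1^*, x_2^*, \lambda^*)$.
For the Minimization Problem (min f(x1, x2) subject to $g(x_1, x_2) \le 0$, $x_1 \ge 0$, $x_2 \ge 0$), the inequalities are reversed:
(1) $L_{x_1} \ge 0$: $f_1 + \lambda g_1 \ge 0$
(2) $x_1 L_{x_1} = 0$: $x_1 (f_1 + \lambda g_1) = 0$
(3) $L_{x_2} \ge 0$: $f_2 + \lambda g_2 \ge 0$
(4) $x_2 L_{x_2} = 0$: $x_2 (f_2 + \lambda g_2) = 0$
(5’) $g(x_1, x_2) \le 0$
(6’) $\lambda\,g(x_1, x_2) = 0$
(7’) $x_1 \ge 0$, $x_2 \ge 0$, $\lambda \ge 0$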
Theorem Kuhn – Tucker: Necessary Conditions for Optima of Real Valued Functions Subject to Inequality and Non – negativity Constraints
Let f(x) and $g^j(x)$, j = 1, …, m, be continuously differentiable. Write $L_i = \partial L / \partial x_i$ and $L_j = \partial L / \partial \lambda_j = g^j(x)$.
1. Consider the maximization problem,
Max f(x) subject to $g^j(x) \ge 0$, j = 1, 2, …, m, and $x \ge 0$ (1)
with associated Lagrangian: $L = f(x) + \sum_{j=1}^{m} \lambda_j g^j(x)$ (2)
If x* solves (1) and if the gradient vectors of all binding constraints at x* are linearly independent, then there exist m numbers $\lambda_j^* \ge 0$ such that $(x^*, \lambda^*)$ is a saddle point of the Lagrangian satisfying the Kuhn – Tucker conditions:
$L_i(x^*, \lambda^*) \le 0$ and $x_i^* \cdot L_i(x^*, \lambda^*) = 0$, i = 1, …, n
$L_j(x^*, \lambda^*) \ge 0$ and $\lambda_j^* \cdot L_j(x^*, \lambda^*) = 0$, j = 1, …, m
2. Consider the minimization problem:
Min f(x) subject to $g^j(x) \le 0$, j = 1, 2, …, m, and $x \ge 0$ (3)
with associated Lagrangian: $L = f(x) + \sum_{j=1}^{m} \lambda_j g^j(x)$ (4)
If x* solves (3) and if the gradient vectors of all binding constraints at x* are linearly independent, then there exist m numbers $\lambda_j^* \ge 0$ such that $(x^*, \lambda^*)$ is a saddle point of the Lagrangian satisfying the Kuhn – Tucker conditions:
$L_i(x^*, \lambda^*) \ge 0$ and $x_i^* \cdot L_i(x^*, \lambda^*) = 0$, i = 1, …, n
$L_j(x^*, \lambda^*) \le 0$ and $\lambda_j^* \cdot L_j(x^*, \lambda^*) = 0$, j = 1, …, m
Proof: See Luenberger (1973).
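The Kuhn – Tucker conditions for a maximization problem can be verified at a candidate point. The sketch below assumes a hypothetical problem that is not in the notes: max f = –(x1 – 2)² – (x2 – 2)² subject to g = 2 – x1 – x2 ≥ 0 and x ≥ 0, with L = f + λg. The derivatives are written out by hand, and the function name kuhn_tucker_max is illustrative.

```python
def kuhn_tucker_max(x1, x2, lam, tol=1e-9):
    """Check the Kuhn-Tucker conditions of the illustrative problem
    max -(x1-2)^2 - (x2-2)^2  s.t.  2 - x1 - x2 >= 0, x1 >= 0, x2 >= 0."""
    L1 = -2.0 * (x1 - 2.0) - lam      # dL/dx1 = f_1 + lam * g_1
    L2 = -2.0 * (x2 - 2.0) - lam      # dL/dx2 = f_2 + lam * g_2
    L_lam = 2.0 - x1 - x2             # dL/dlam = g(x1, x2)
    checks = [
        L1 <= tol, abs(x1 * L1) <= tol,          # L_i <= 0, x_i L_i = 0
        L2 <= tol, abs(x2 * L2) <= tol,
        L_lam >= -tol, abs(lam * L_lam) <= tol,  # L_j >= 0, lam_j L_j = 0
        x1 >= -tol, x2 >= -tol, lam >= -tol,     # non-negativity
    ]
    return all(checks)


print(kuhn_tucker_max(1.0, 1.0, 2.0))   # constraint binds, lam > 0
print(kuhn_tucker_max(2.0, 2.0, 0.0))   # unconstrained peak, infeasible
print(kuhn_tucker_max(0.0, 0.0, 0.0))   # corner where L_i > 0
```

At (1, 1) the constraint binds and λ = 2 makes both L1 and L2 zero. The unconstrained peak (2, 2) fails because it violates g ≥ 0.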
2.4. Value Functions
Maximization problem:
Max f(x, a) subject to g(x, a) = 0 and $x \ge 0$,
in which x is a vector of choice variables and a is a vector of parameters.
The solution of this problem depends on the vector of parameters a.
Denote it by x = x(a).
The maximum-value function M(a) is the value of the objective function at the solution:
M(a) = f(x(a), a), where g(x(a), a) = 0.
Theorem of the Maximum: If f(x, a) and g(x, a) are continuous in the parameters, and if the domain is a compact set, then M(a) and x(a) are continuous functions of the parameters a.
Envelope Theorem:
Consider the problem Max f(x, a) subject to g(x, a) = 0 and $x \ge 0$, and suppose that f(x, a) and g(x, a) are continuously differentiable in a. Let x(a) > 0 solve the problem and assume that it is continuously differentiable in a. Let $L(x, a, \lambda)$ be the problem’s associated Lagrangian function and let $(x(a), \lambda(a))$ solve the Kuhn – Tucker conditions. Finally, let M(a) be the problem’s associated maximum-value function. Then the Envelope Theorem states that:
\[ \frac{\partial M(a)}{\partial a_j} = \left. \frac{\partial L}{\partial a_j} \right|_{x(a),\,\lambda(a)}, \quad j = 1, \dots, m, \]
where the right-hand side denotes the partial derivative of the Lagrangian function with respect to the parameter aj evaluated at the point $(x(a), \lambda(a))$.
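A small numerical check illustrates the Envelope Theorem. The problem is a hypothetical one chosen because it has a closed-form solution: max x1·x2 subject to a – x1 – x2 = 0, with L = x1·x2 + λ(a – x1 – x2). The first-order conditions give x1(a) = x2(a) = λ(a) = a/2, so M(a) = a²/4 and ∂L/∂a = λ. The sketch compares a finite-difference derivative of M with λ(a).

```python
def solution(a):
    """Closed-form solution (x1, x2, lam) of the illustrative problem."""
    return a / 2.0, a / 2.0, a / 2.0


def max_value(a):
    """Maximum-value function M(a) = f(x(a), a)."""
    x1, x2, _ = solution(a)
    return x1 * x2


def dL_da(a):
    """Partial derivative of L = x1*x2 + lam*(a - x1 - x2) with respect
    to a, holding x and lam fixed, evaluated at (x(a), lam(a))."""
    _, _, lam = solution(a)
    return lam


def numeric_derivative(func, a, h=1e-6):
    """Central finite-difference derivative of func at a."""
    return (func(a + h) - func(a - h)) / (2.0 * h)


a = 3.0
dM_da = numeric_derivative(max_value, a)
print(dM_da, dL_da(a))   # the two numbers should agree up to rounding
```

The point of the theorem is visible in dL_da: the indirect effect of a through x(a) never has to be computed.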

Chapter 3. Consumer Theory I.
3.1 Primitive Notions

- For each commodity, we assume that the consumer can conceive of possessing “whole units”, as well as any fraction of a unit, no matter how infinitesimal. This assumption is made purely for mathematical expedience.
- Each commodity is measured in some infinitely divisible unit.
There are four fundamental building blocks in any model of consumer choice:
- A consumption set (choice set)
- A feasible set
- A preference relation
- A behavioral assumption
1. A consumption set
A consumption set represents the set of all alternatives, or complete consumption plans, which the consumer is able to conceive of, whether or not some of them are achievable in practice.
A consumption bundle (consumption plan): x = (x1, x2, …, xn)
Assumption 3.1.1 Properties of the Consumption Set, X
The minimal requirements on the consumption set are:
1. $X \ne \emptyset$ (X is non-empty)
2. X is closed
3. X is convex
4. X is bounded from below
5. $0 \in X$
2. Feasible set (B):
Feasible set: the set of alternatives which are achievable given the economic realities faced by
the consumer.
The feasible set B is a subset of the consumption set X which remains after we have
accounted for any constraints imposed on the consumer’s access to commodities by the
practical, institutional, or economic realities of the world.

3. A Preference Relation
- A Preference Relation is a formal description of the consumer’s capabilities and inclinations when faced with a choice.
It typically specifies the limits, if any, on the consumer’s ability to perceive in situations
involving choice, the form of consistency or inconsistency in the consumer’s choice, and
information about the consumer’s tastes for the different objects of choice.
4. Behavioral Assumption
This expresses the guiding principle the consumer uses to make final choices and so identifies the ultimate objectives in choice.
It is generally supposed that the consumer seeks to identify and select the available alternative which is most preferred in the light of his personal tastes.
Assuming that agents are motivated by self-interest is neither innocuous nor vacuous. Nonetheless, the vast majority of economists are willing to do so despite the many criticisms and exceptions.
We assume self-interest as the principle guiding choice, though this is the view of mankind which many feel earns economics the title “the Dismal Science”.
3.2 Preferences and Utility
The Preference Relation: specifies the capabilities and inclinations of the consuming agent
when faced with situations involving choice.
The “Law of Demand” was built upon some extremely strong assumptions, such as the “Principle of Diminishing Marginal Utility”.
3.2.1 Preference Relations
The axioms of choice are intended to give formal mathematical expression to three fundamental aspects of consumer behavior and attitudes toward the objects of choice.
$\succsim$: “is liked at least as well as”
The Binary relation: “$\succsim$” $\equiv \{(x^1, x^2) \mid x^i \in X,\ i = 1, 2, \text{ and } x^1 \succsim x^2\} \subset X \times X$
Axiom 1: Completeness: For any $x^1 \ne x^2$ in X, either $x^1 \succsim x^2$ or $x^2 \succsim x^1$.
The consumer can make choices. He has the ability to discriminate and the necessary knowledge to evaluate alternatives.
Axiom 2: Reflexivity: For all $x \in X$, $x \succsim x$.
Axiom 3: Transitivity: For any three elements $x^1$, $x^2$ and $x^3$ in X, if $x^1 \succsim x^2$ and $x^2 \succsim x^3$, then $x^1 \succsim x^3$.
The consumer’s choices are consistent.
Axioms 1 through 3 constitute a formal definition of rationality as the term is used in economic theory.
Rational economic agents are ones who have the ability to make choices, whose internal workings are at least minimally logical, and whose choices display a logical consistency.
$\Rightarrow$ An agent can completely order all elements in the consumption set X.
Such a consumer can examine each alternative in the set and place it somewhere in a hierarchy, or ranking.
We summarize the fact that the consumer’s preferences enable him to construct such a complete and consistent ranking of all alternatives in the consumption set by saying that the consumer’s preferences can be represented by a preference relation.
Definition 3.2.1 Preference Relation
The binary relation on the consumption set, X,
“$\succsim$” $\equiv \{(x^1, x^2) \mid x^i \in X,\ i = 1, 2, \text{ and } x^1 \succsim x^2\} \subset X \times X$
is called a preference relation if the symbol $\succsim$ stands for the statement “is liked at least as well as” and the relation “$\succsim$” satisfies Axioms 1, 2, and 3.
(Recall that “$\succsim$” is itself a set of ordered pairs.)
Definition 3.2.2 Strict Preference Relation
The binary relation on the consumption set X,
“$\succ$” $\equiv \{(x^1, x^2) \mid x^i \in X,\ i = 1, 2, \text{ and } x^1 \succsim x^2 \text{ and } x^2 \not\succsim x^1\} \subset X \times X$
is called a strict preference relation if $\succsim$ is a preference relation, where $\not\succsim$ is read “is not at least as good as”.
“$\succ$” is transitive, but it is neither complete nor reflexive.
Definition 3.2.3 Indifference Relation
The binary relation on the consumption set X
“$\sim$” $\equiv \{(x^1, x^2) \mid x^i \in X,\ i = 1, 2, \text{ and } x^1 \succsim x^2 \text{ and } x^2 \succsim x^1\} \subset X \times X$
is called an indifference relation if $\succsim$ is a preference relation. We can denote inclusion in the indifference relation either by $(x^1, x^2) \in$ “$\sim$” or, equivalently, by $x^1 \sim x^2$, which is read “$x^1$ is indifferent to $x^2$”.
“$\sim$” is transitive and reflexive but not complete.
Using the two supplementary relations, for any pair $x^1$ and $x^2$, exactly one of three mutually exclusive possibilities holds: either $x^1 \succ x^2$, or $x^2 \succ x^1$, or $x^1 \sim x^2$.
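On a finite consumption set these definitions can be checked directly. In the sketch below the set X, the ranking stored in weak and the helper names are all hypothetical. A preference relation is represented, as in Definition 3.2.1, by a set of ordered pairs. The strict and indifference relations are then derived from it, and the three-way exclusivity claim is verified pair by pair.

```python
from itertools import product

X = ["a", "b", "c"]
# Assumed weak preference relation: a ~ b, and both are preferred to c.
weak = {("a", "a"), ("b", "b"), ("c", "c"),
        ("a", "b"), ("b", "a"), ("a", "c"), ("b", "c")}


def is_complete(R, X):
    return all((x, y) in R or (y, x) in R
               for x, y in product(X, X) if x != y)


def is_reflexive(R, X):
    return all((x, x) in R for x in X)


def is_transitive(R, X):
    return all((x, z) in R
               for x, y, z in product(X, repeat=3)
               if (x, y) in R and (y, z) in R)


# Definitions 3.2.2 and 3.2.3: derive the supplementary relations.
strict = {(x, y) for (x, y) in weak if (y, x) not in weak}
indifferent = {(x, y) for (x, y) in weak if (y, x) in weak}

# Exactly one of x > y, y > x, x ~ y holds for every pair.
exactly_one = all(
    [(x, y) in strict, (y, x) in strict, (x, y) in indifferent].count(True) == 1
    for x, y in product(X, X)
)
print(sorted(strict), exactly_one)
```

The derived strict relation fails completeness and reflexivity, while the indifference relation is reflexive but not complete, matching the remarks after each definition.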
Definition 3.2.4 Sets in X Derived from the Preference Relation
Let $x^0$ be any point in the consumption set, X. Relative to any such point, we can define the following subsets of X:
1. $\succsim(x^0) \equiv \{x \mid x \in X,\ x \succsim x^0\}$, called the “at least as good as” set
2. $\precsim(x^0) \equiv \{x \mid x \in X,\ x^0 \succsim x\}$, called the “no better than” set
3. $\succ(x^0) \equiv \{x \mid x \in X,\ x \succ x^0\}$, called the “preferred to” set
4. $\prec(x^0) \equiv \{x \mid x \in X,\ x^0 \succ x\}$, called the “worse than” set
5. $\sim(x^0) \equiv \{x \mid x \in X,\ x^0 \sim x\}$, called the “indifference” set
Axiom 4: Continuity: For all $x \in X$, the “at least as good as” sets, $\succsim(x)$, and the “no better than” sets, $\precsim(x)$, are closed and connected sets.
Continuity requires that there be no “gaps” or discrete “jumps” in the indifference sets or, equivalently, in the level curves of the utility function.
- An open set S is called disconnected if there are two open, non-empty sets U and V such that:
1. $U \cap V = \emptyset$
2. $U \cup V = S$
- A set S (not necessarily open) is called disconnected if there are two open sets U and V such that
1. $(U \cap S) \ne \emptyset$ and $(V \cap S) \ne \emptyset$
2. $(U \cap S) \cap (V \cap S) = \emptyset$
3. $(U \cap S) \cup (V \cap S) = S$
- If S is not disconnected it is called connected.
Note that the definition of a disconnected set is simpler for an open set S. In principle, however, the idea is the same: if a set S can be separated into two open, disjoint sets in such a way that neither set is empty and both sets combined give the original set S, then S is called disconnected.
Showing that a set is disconnected is generally easier than showing connectedness: if you can find a point that is not in the set S, then that point can often be used to “disconnect” your set into two new open sets with the above properties.
- Axiom 4 guarantees that the indifference set $\sim(x)$ is closed, because $\sim(x)$ is the intersection of the sets $\succsim(x)$ and $\precsim(x)$, and the intersection of closed sets is a closed set.
- By requiring that the sets $\succsim(x)$ and $\precsim(x)$ be connected, we ensure that the indifference sets, too, are connected sets, with no gaps or holes in them.
Axiom 5’: Local Non-Satiation: For all $x^0 \in X$, and for all $\varepsilon > 0$, there exists some $x \in B_\varepsilon(x^0)$ such that $x \succ x^0$.
- Within any vicinity of a given point $x^0$, no matter how big or small that vicinity is, there will always exist at least one other point x which the consumer prefers to $x^0$.
- Axiom 5’ rules out the following case:
[Figure: an indifference set ~(x) drawn in the (X1, X2) plane.]
Axiom 5: Monotonicity: If for some $x^1$ and $x^0$ in X we have $x^1 \ge x^0$ and $x^1 \ne x^0$ ($\ge$ holding component by component), then $x^1 \succ x^0$.
If $x^1$ involves more of at least one commodity and no less of any other commodity than $x^0$ does, then $x^1$ is strictly preferred to $x^0$.
Axiom 6’: Convexity: If $x^1 \succsim x^0$, then $t x^1 + (1-t) x^0 \succsim x^0$ for all $t \in [0,1]$.
Axiom 6: Strict Convexity: If $x^1 \ne x^0$ and $x^1 \succsim x^0$, then $t x^1 + (1-t) x^0 \succ x^0$ for all $t \in (0,1)$.
- Either Axiom 6’ or Axiom 6, in conjunction with Axioms 1, 2, 3, 4 and 5, will rule out concave-to-the-origin segments in the indifference sets.
- The slope (in absolute value) of an indifference curve is sometimes called the Marginal Rate of Substitution (MRS).
- Axiom 6 goes a bit further and requires that the MRS be strictly diminishing.
Summary:
- The Axioms of Completeness, Reflexivity and Transitivity formalize the notion that
the consumer is rational. Consumer can make comparisons among alternatives and his
choices are consistent.
- The purpose of the Continuity axiom is primarily a mathematical one.
- All other Axioms serve to characterize consumer’s tastes over the objects of choice.
3.2.2 The Utility Function
Definition 3.2.5 A Utility Function Representing the Preference Relation “$\succsim$”
A real-valued function $U: X \to R$ is called a utility function representing the preference relation “$\succsim$” whenever $U(x^0) > U(x^1) \iff x^0 \succ x^1$ and $U(x^0) = U(x^1) \iff x^0 \sim x^1$ for all $x^0$ and $x^1$ in X.
- Though our basic assumptions on consumer preferences are stated in terms of axioms on the preference relation, under those axioms the preference relation can be represented by a nice, continuous real-valued function.
- Any preference relation which is complete, reflexive, transitive and continuous can be represented by a continuous real-valued utility function.
Theorem 3.2.1 Existence of a Real Valued Function Representing the Preference Relation
If the preference relation “$\succsim$” is complete, reflexive, transitive, continuous and monotonic, there exists a continuous real-valued function, $U: X \to R$, which represents “$\succsim$”.
- This is only an existence theorem. Under the conditions stated, at least one continuous real-valued function representing the preference relation is guaranteed to exist.
- However, the theorem itself makes no statement on how many more such functions there may be, nor does it indicate in any way what form any of them must take.
- Completeness, reflexivity, transitivity, continuity and monotonicity of the preference relation are together a sufficient condition for the existence of at least one continuous real-valued utility function.
Proof:
Consider the vector e = (1, 1, …, 1) $\in X$. For any $t \ge 0$ the point t·e is in X. When preferences are monotonic, for t1 > t2 we have t1·e >> t2·e, so $t_1 e \succ t_2 e$ by monotonicity.
Consider the mapping $U: X \to R$, whose image is defined as follows:
$u(x) \equiv \{t \mid t \in R_+,\ x \in X, \text{ and } t\,e \sim x\}$
We need to prove that u(x) = t is unique for a given x, that U represents the preferences, and that u(x) is a continuous function.
1. For U to be a function, the mapping must satisfy two criteria: it must assign some number in the range to every point in the domain, and that number must be unique.
- Completeness, reflexivity, continuity, and monotonicity guarantee that the indifference set through x forms an unbroken, one-dimensional boundary between $\succ(x)$ and $\prec(x)$.

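[Figure: Constructing the mapping $U : X \to \mathbb{R}$. Axes $x_1$ and $x_2$; the ray through e, with the points $t_1 e$ ($t_1 > 1$), $t_2 e$ ($t_2 < 1$) and $t e = u(x) e$ on the indifference set $\sim(x)$.]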
If we begin at the origin and move out along the ray through e, we must eventually encounter one unique point $t e$, for some number $t \ge 0$, such that $t e \sim x$. That particular number is the image of x under U, which we denote by u(x).
Since for all $x \in X$ there exists a unique $u(x) \in \mathbb{R}$, the mapping satisfies the requirements of a function.
2. Now we must show that U represents the preferences.
Consider two points in X and suppose that $x^1 \succ x^2$. Then $u(x^1)$ is the number such that $u(x^1) e \sim x^1$, and $u(x^2)$ is the number such that $u(x^2) e \sim x^2$. Thus, by transitivity, $u(x^1) e \succ u(x^2) e$.
Recall that $u(x^1) e$ and $u(x^2) e$ are simply two points along the ray through e. Monotonicity tells us that $u(x^1) e \succ u(x^2) e$ only if $u(x^1) e$ lies farther out than $u(x^2) e$, that is, only if $u(x^1) > u(x^2)$.
Hence $x^1 \succ x^2$ implies $u(x^1) > u(x^2)$, and, reversing the argument, $u(x^1) > u(x^2)$ implies $x^1 \succ x^2$.
In the same way, $x^1 \sim x^2$ implies $u(x^1) = u(x^2)$, and $u(x^1) = u(x^2)$ implies $x^1 \sim x^2$.
3. Now we show that U is continuous.
We showed that a function is continuous if and only if the inverse image of every closed set in the range is a closed set in its domain.
The range of U is the set of non-negative real numbers, $\mathbb{R}_+ = \{ t \mid t \ge 0 \}$.
Any $t \ge 0$ is the image under U of the point $x = t e$ (since $t e \sim x$), or more compactly $u(t e) = t$ for all $t \ge 0$.
Let T be any closed set in the range $\mathbb{R}_+$ of U. We need to prove that $U^{-1}(T)$ is a closed set in the domain X.
By the theorem, any closed set in $\mathbb{R}_+$ is the intersection of some collection of unions of closed intervals. That is, there are numbers $0 \le \underline{t}_i \le \bar{t}_i$ and some index set I such that:
$T = \bigcap_{i \in I} \left( [0, \underline{t}_i] \cup [\bar{t}_i, \infty) \right)$
The inverse image of T is
$U^{-1}(T) = U^{-1}\left( \bigcap_{i \in I} \left( [0, \underline{t}_i] \cup [\bar{t}_i, \infty) \right) \right)$
We can prove that:
$U^{-1}(T) = \bigcap_{i \in I} U^{-1}\left( [0, \underline{t}_i] \cup [\bar{t}_i, \infty) \right) = \bigcap_{i \in I} \left( U^{-1}[0, \underline{t}_i] \cup U^{-1}[\bar{t}_i, \infty) \right)$
We have $U^{-1}[0, \underline{t}_i] = \precsim(\underline{t}_i e)$ and $U^{-1}[\bar{t}_i, \infty) = \succsim(\bar{t}_i e)$.
So $U^{-1}(T) = \bigcap_{i \in I} \left( \precsim(\underline{t}_i e) \cup \succsim(\bar{t}_i e) \right)$.
By Axiom 4, $\precsim(\underline{t}_i e)$ and $\succsim(\bar{t}_i e)$ are closed sets. By the theorem, the union of two closed sets is a closed set, and the intersection of closed sets is also a closed set.
Therefore $U^{-1}(T) = \bigcap_{i \in I} \left( \precsim(\underline{t}_i e) \cup \succsim(\bar{t}_i e) \right)$ is a closed set.
We can conclude that $U : X \to \mathbb{R}$ is a continuous function.
Theorem 3.2.1 is very important. It frees us to choose the form in which we would like to represent preferences.
If all we require of the preference ordering is that it order the bundles in the consumption set, and if all we require of a utility function representing that preference relation is that it reflect that ordering of bundles by the ordering of the numbers it assigns to them, then any other function which assigns numbers to bundles in the same order as U does will also represent that preference relation and be just as good a utility function as U.
Positive monotonic transforms
If we have some function U which we believe represents some set of preferences, we are free to transform U into other, perhaps more convenient or easily manipulated forms, so long as the transformation we choose is order-preserving.

Theorem 3.2.2 Invariance of the Utility Function to Positive Monotonic Transforms
Let "$\succsim$" be a continuous preference ordering and suppose that u(x) is a utility function which represents it. If Z(x) is any strictly increasing function of a single variable, then the composite function Z(u(x)), called a positive monotonic transform of u(x), is also a utility function representing "$\succsim$".
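As a small illustration of Theorem 3.2.2, the sketch below ranks a handful of bundles under a utility function and under a strictly increasing transform of it, and checks that the two rankings coincide. The utility function u(x1, x2) = x1·x2, the transform Z(t) = ln t + 3 and the list of bundles are all assumptions chosen for the example; none of them comes from the notes.

```python
import math

def u(x1, x2):
    """Hypothetical utility function, chosen only for illustration."""
    return x1 * x2

def z(t):
    """A strictly increasing transform on t > 0 (assumed example)."""
    return math.log(t) + 3.0

# Hypothetical bundles (x1, x2); their utilities under u are all distinct.
bundles = [(1.0, 4.0), (2.0, 3.0), (5.0, 1.0), (0.5, 6.0), (3.0, 3.0)]

def ranking(utility):
    """Return the bundles sorted from least to most preferred under `utility`."""
    return sorted(bundles, key=lambda b: utility(*b))

rank_u = ranking(u)
rank_zu = ranking(lambda x1, x2: z(u(x1, x2)))

# The transform changes the utility numbers but not the order of the bundles.
print(rank_u == rank_zu)
```

Any other strictly increasing Z could be substituted for `z`; a transform that is not strictly increasing (for example a constant) would generally fail this check.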
Theorem 3.2.3 Properties of Preferences and Utility Functions
Let "$\succsim$" be complete, reflexive, transitive, and continuous, and let u(x) represent "$\succsim$". Then:
1. u(x) is strictly increasing if and only if "$\succsim$" is monotonic
2. u(x) is quasiconcave if and only if "$\succsim$" is convex
3. u(x) is strictly quasiconcave if and only if "$\succsim$" is strictly convex.
Theorem 3.2.4 The Differentiability of Utility Functions:
For any preference ordering satisfying the conditions of Theorem 3.2.1 (complete, reflexive, transitive, continuous and monotonic), there will exist a strictly increasing and twice continuously differentiable utility function if and only if the indifference sets are twice continuously differentiable.
3.3 The Consumer’s Problem
Assumption 3.3.1 The Consumer
The consumer’s preference ordering "$\succsim$" is complete, reflexive, transitive, continuous, monotonic, and strictly convex. Then by Theorem 3.2.1, it can be represented by some real valued and continuous utility function, u(x). By Theorem 3.2.3, u(x) will be strictly increasing and strictly quasiconcave.
The assumption on the economic environment of the consumer:
We assume that the individual consumer is atomistic, with no significant weight in the markets in which he transacts.
The consumer is endowed with a fixed money income: y > 0.
The requirement is: $y \ge \sum_{i=1}^{n} p_i x_i$, or more compactly $y \ge p \cdot x$.
The budget set: $B = \{ x \mid x \in X,\ y \ge p \cdot x \}$
For $p \gg 0$, B is a convex, closed and bounded set, and hence a compact set.
The problem: $\max_x u(x)$ s.t. $y \ge p \cdot x$, $x \ge 0$
Under the assumptions on consumer preferences, the utility function u(x) is real valued and continuous. The budget set B is a non-empty, closed, bounded and thus compact subset of $\mathbb{R}^n$. By the Weierstrass Theorem, we are assured that a maximum of u(x) over B exists. Moreover, since B is convex and the objective function is strictly quasiconcave, the maximizer of u(x) over B is unique.
Theorem Weierstrass: Existence of Extreme Values
The Weierstrass Theorem specifies sufficient conditions under which the existence of a maximum and a minimum of a continuous function is assured.
The theorem: Let $f : \mathbb{R}^n \to \mathbb{R}$ be a continuous real valued mapping. Let S be a compact (closed and bounded) subset of $\mathbb{R}^n$. Then there exist a vector $x^* \in S$ and a vector $\tilde{x} \in S$ such that:
$f(x^*) \ge f(x) \ge f(\tilde{x})$ for all $x \in S$.
Since preferences are monotonic, the solution x* will satisfy the budget constraint with equality, lying on, rather than inside, the boundary of the budget set.
[Figure: axes $x_1$ and $x_2$.]
The solution vector x* depends on the parameters of the problem. Since it is unique for any particular values of p and y, we can view the solution as a function of prices and income.
Marshallian Demand Functions: $x_i^* = x_i(p, y)$, $i = 1, \ldots, n$

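The problem: $\max_x u(x)$ s.t. $y \ge p \cdot x$, $x \ge 0$
Forming the Lagrangian, we obtain $L(x, \lambda) = u(x) + \lambda (y - p \cdot x)$. Now apply the Kuhn-Tucker conditions:
$\dfrac{\partial L}{\partial x_i} = \dfrac{\partial u(x^*)}{\partial x_i} - \lambda p_i \le 0$
$x_i^* \cdot \dfrac{\partial L}{\partial x_i} = x_i^* \left[ \dfrac{\partial u(x^*)}{\partial x_i} - \lambda p_i \right] = 0$
$\dfrac{\partial L}{\partial \lambda} = y - p \cdot x^* \ge 0$
$\lambda \cdot \dfrac{\partial L}{\partial \lambda} = \lambda [ y - p \cdot x^* ] = 0$
$\lambda \ge 0,\ x_i^* \ge 0$
For any two goods j and k consumed in positive amounts, we can derive that $\dfrac{u_j(x^*)}{u_k(x^*)} = \dfrac{p_j}{p_k}$: the ratio of marginal utilities must equal the ratio of prices.
For simplicity, we will sometimes just assume that the consumer’s problem admits an interior solution. The solution then results from the following system of equations:
$\dfrac{\partial L}{\partial x_i} = \dfrac{\partial u(x^*)}{\partial x_i} - \lambda p_i = 0,\quad i = 1, \ldots, n$
$\dfrac{\partial L}{\partial \lambda} = y - p \cdot x^* = 0$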

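The interior first-order conditions can be checked on a concrete case. The sketch below assumes a Cobb-Douglas utility u(x1, x2) = x1^a · x2^(1-a) with a = 0.3, which is not taken from the notes; for that function the interior solution has the well-known closed form x1 = a·y/p1, x2 = (1-a)·y/p2. The code verifies that the budget is exhausted, that the ratio of marginal utilities equals the price ratio, and that a brute-force search along the budget line finds the same bundle.

```python
# Illustrative sketch: the Cobb-Douglas form and all numbers are assumptions.

def utility(x1, x2, a=0.3):
    return x1 ** a * x2 ** (1.0 - a)

def marshallian_demand(p1, p2, y, a=0.3):
    """Closed-form interior solution of the first-order conditions."""
    return a * y / p1, (1.0 - a) * y / p2

def marginal_utilities(x1, x2, a=0.3):
    """Partial derivatives of the Cobb-Douglas utility."""
    level = utility(x1, x2, a)
    return a * level / x1, (1.0 - a) * level / x2

def grid_search(p1, p2, y, a=0.3, steps=10000):
    """Brute-force maximizer of utility along the budget line p1*x1 + p2*x2 = y."""
    best_x1, best_u = None, float("-inf")
    for k in range(1, steps):
        x1 = (y / p1) * k / steps
        x2 = (y - p1 * x1) / p2
        level = utility(x1, x2, a)
        if level > best_u:
            best_x1, best_u = x1, level
    return best_x1, (y - p1 * best_x1) / p2

p1, p2, y = 2.0, 5.0, 100.0
x1, x2 = marshallian_demand(p1, p2, y)
u1, u2 = marginal_utilities(x1, x2)
lam = u1 / p1                      # multiplier implied by the condition for good 1
gx1, gx2 = grid_search(p1, p2, y)

print("demand:", x1, x2)
print("spending:", p1 * x1 + p2 * x2)
print("MRS vs price ratio:", u1 / u2, p1 / p2)
print("grid search:", gx1, gx2)
```

The grid search is only a sanity check on the closed form; it relies on monotonicity to restrict attention to the budget line.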
Chapter 4. Consumer Theory II

Two essential tools of analysis in the modern treatment of consumer theory:
- The Indirect Utility Function
- The Expenditure Function
4.1 Indirect Utility and Expenditure
4.1.1 The Indirect Utility Function
The direct Utility Function: The ordinary utility function, u(x), is defined over the
consumption set X and represents the consumer’s preferences directly.
The indirect Utility Function:
- The problem: $\max_x u(x)$ s.t. $y \ge p \cdot x$, $x \ge 0$
- The solution $x^* = x^*(p, y)$ gives the Indirect Utility Function: $u(x^*) = u^*(p, y) = v(p, y)$
- The indirect utility function represents the relation between prices, income and the highest level of utility achieved.
- $v : \mathbb{R}^{n+1}_+ \to \mathbb{R}$ is defined as: $v(p, y) = \max_x u(x)$ s.t. $y - p \cdot x \ge 0$, $x \ge 0$.
- v(p,y) is called the indirect utility function.
- This function is well-defined since, when preferences are monotonic and strictly convex, a unique solution x(p,y) to the consumer’s problem exists.
- In the maximization problem $\max_x u(x)$ s.t. $y \ge p \cdot x$, $x \ge 0$, continuity of the constraint function in the parameters is sufficient to guarantee that v(p,y) will be continuous in p and y.
Theorem 4.1.1 Properties of the Indirect Utility Function
Let preferences be monotonic and differentiable, and let p>>0 and y>0. Then v(p,y) has these properties:
1. Homogeneous of degree zero in p and y
2. Increasing in y
3. Non-increasing in p
4. Quasiconvex in p
Proof:
1. Homogeneous of degree zero in p and y:
Equiproportionate changes in all p and y leave the consumer’s budget set unchanged:
$y \ge p_1 x_1 + p_2 x_2 \iff t y \ge t p_1 x_1 + t p_2 x_2 \quad (t > 0)$
The set of feasible choices, and so the maximal level of utility the consumer can achieve, must therefore also remain the same.
Changing all p and y by the proportion t>0 must leave the maximal utility unchanged.
2. Increasing in y and non-increasing in p (properties 2 and 3):
Consider the Lagrangian for the utility maximization problem: $L(x, \lambda) = u(x) + \lambda [ y - p \cdot x ]$
By the Envelope Theorem, we have:
$\dfrac{\partial v(p, y)}{\partial p_i} = \left. \dfrac{\partial L}{\partial p_i} \right|_{(x^*, \lambda)} = -\lambda x_i^* \le 0$
$\dfrac{\partial v(p, y)}{\partial y} = \left. \dfrac{\partial L}{\partial y} \right|_{(x^*, \lambda)} = \lambda > 0$
The Lagrangian multiplier ($\lambda$) measures the sensitivity of the objective function u(x) to changes in the constraint constant (y). (See Exercise 2.29.)
Thus the value of the Lagrangian multiplier at the solution measures the marginal utility of income.
3. Quasiconvex in p (property 4):
Let $B^1$, $B^2$ and $B^t$ be the budget sets available when the consumer has income y and faces prices $p^1$, $p^2$ and $p^t = t p^1 + (1 - t) p^2$ respectively:
$B^1 = \{ x \mid p^1 \cdot x \le y \}$
$B^2 = \{ x \mid p^2 \cdot x \le y \}$
$B^t = \{ x \mid p^t \cdot x \le y \}$
We need to show that: $v(p^t, y) \le \max[ v(p^1, y), v(p^2, y) ]$ for all $t \in [0, 1]$.
To do so, we show that every choice the consumer can possibly make when she faces budget $B^t$ is a choice which could have been made when she faced either budget $B^1$ or budget $B^2$. Then every level of utility she can achieve facing $B^t$ is a level she could have achieved either when facing $B^1$ or when facing $B^2$, so the maximum level of utility that she can achieve over $B^t$ can be no larger than at least one of the maximum levels she can achieve over $B^1$ and over $B^2$.
We want to show that if $x \in B^t$, then $x \in B^1$ or $x \in B^2$, for all $t \in [0, 1]$.
If t=1 or t=0, then $B^t = B^1$ or $B^t = B^2$, and the claim is immediate.
For $t \in (0, 1)$, suppose to the contrary that $x \in B^t$ but $x \notin B^1$ and $x \notin B^2$. Then $p^1 \cdot x > y$ and $p^2 \cdot x > y$. Because $t \in (0, 1)$, we have t>0 and (1-t)>0, so
$t\, p^1 \cdot x > t y$ and $(1 - t)\, p^2 \cdot x > (1 - t) y \implies t\, p^1 \cdot x + (1 - t)\, p^2 \cdot x > y \implies p^t \cdot x > y$
$\implies x \notin B^t$, contradicting our original assumption.
$\implies$ if $x \in B^t$, then $x \in B^1$ or $x \in B^2$, for all $t \in [0, 1]$.
$\implies v(p^t, y) \le \max[ v(p^1, y), v(p^2, y) ] \implies$ v(p,y) is a quasiconvex function of p.
 The indirect utility function tells us the maximal level of utility the consumer can
achieve facing different prices and incomes.
 The demand functions give us the utility maximizing choices of each commodity he
will make facing different prices and incomes.
 To get the indirect utility function, we simply substitute the demand functions into the
direct utility function.
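The quasiconvexity inequality v(p^t, y) ≤ max[v(p^1, y), v(p^2, y)] can be checked numerically. The sketch below uses the CES indirect utility function that these notes give later as an example, v(p, y) = y·(p1^r + p2^r)^(-1/r); the parameter value r = -1 and the two price vectors are assumptions made for the illustration.

```python
def v(p1, p2, y, r=-1.0):
    """CES indirect utility: v(p, y) = y * (p1^r + p2^r)^(-1/r); r = -1 is assumed."""
    return y * (p1 ** r + p2 ** r) ** (-1.0 / r)

y = 12.0
pa = (2.0, 4.0)   # hypothetical price vector p^1
pb = (5.0, 1.0)   # hypothetical price vector p^2
bound = max(v(*pa, y), v(*pb, y))

# Count the convex combinations p^t at which v exceeds the bound.
violations = 0
for k in range(101):
    t = k / 100.0
    pt = (t * pa[0] + (1 - t) * pb[0], t * pa[1] + (1 - t) * pb[1])
    if v(*pt, y) > bound + 1e-12:
        violations += 1

print(violations)
```

A grid check like this cannot prove quasiconvexity; it only fails to find a counterexample, which is what the theorem predicts.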
Theorem 4.1.2 Roy’s Identity
- To get the indirect utility function, we simply substitute the demand functions into the direct utility function.
- The converse question is how to recover the demand functions from the indirect utility function. This theorem answers that question.
Theorem: Let v(p,y) be any indirect utility function satisfying the conditions of Theorem 4.1.1. Then,
$x_i(p, y) = - \dfrac{\partial v(p, y) / \partial p_i}{\partial v(p, y) / \partial y}$

Fifth Property of an indirect utility function:
- An indirect utility function is demand-generating: every demand function can be generated from the indirect utility function.
Proof:
Let x* and $\lambda$ solve the Kuhn-Tucker conditions. The solution $\lambda = \lambda(p, y)$ gives us the marginal utility of income at the consumer equilibrium, and $x_i^* = x_i(p, y)$ gives the consumer’s demand function for good i.
By the Envelope Theorem, we have:
$\dfrac{\partial v(p, y)}{\partial p_i} = \left. \dfrac{\partial L}{\partial p_i} \right|_{(x^*, \lambda)} = -\lambda(p, y)\, x_i(p, y)$
$\dfrac{\partial v(p, y)}{\partial y} = \left. \dfrac{\partial L}{\partial y} \right|_{(x^*, \lambda)} = \lambda(p, y)$
$\implies x_i(p, y) = - \dfrac{\partial v(p, y) / \partial p_i}{\partial v(p, y) / \partial y}$
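Roy's Identity can be verified numerically on the CES example that appears later in these notes, where v(p, y) = y·(p1^r + p2^r)^(-1/r) and the Marshallian demands are x_i = p_i^(r-1)·y / (p1^r + p2^r). In the sketch below the partial derivatives of v are approximated by central finite differences; the value r = -1, the prices, the income and the step size h are assumptions made for the illustration.

```python
def v(p1, p2, y, r=-1.0):
    """CES indirect utility function (r = -1 is an assumed parameter value)."""
    return y * (p1 ** r + p2 ** r) ** (-1.0 / r)

def demand(p1, p2, y, r=-1.0):
    """CES Marshallian demands in closed form."""
    s = p1 ** r + p2 ** r
    return p1 ** (r - 1.0) * y / s, p2 ** (r - 1.0) * y / s

def roy_demand(p1, p2, y, r=-1.0, h=1e-6):
    """Demands recovered from v alone via Roy's Identity (central differences)."""
    dv_dp1 = (v(p1 + h, p2, y, r) - v(p1 - h, p2, y, r)) / (2 * h)
    dv_dp2 = (v(p1, p2 + h, y, r) - v(p1, p2 - h, y, r)) / (2 * h)
    dv_dy = (v(p1, p2, y + h, r) - v(p1, p2, y - h, r)) / (2 * h)
    return -dv_dp1 / dv_dy, -dv_dp2 / dv_dy

p1, p2, y = 2.0, 4.0, 12.0
closed_form = demand(p1, p2, y)
from_roy = roy_demand(p1, p2, y)

print("closed form:", closed_form)
print("Roy's Identity:", from_roy)
```

The two results agree up to finite-difference error, so a tolerance rather than exact equality is the appropriate comparison.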
4.1.2 The Expenditure Function
- What is the minimum level of money expenditure, or outlay, which the consumer must make facing a given set of prices in order to achieve a given level of utility?
- In this construction, we ignore any limitations imposed by the consumer’s income and simply ask what the consumer would have to spend in order to achieve some particular level of utility.
[Figure: axes $x_1$ and $x_2$.]
- Iso-expenditure curve: $e = p_1 x_1 + p_2 x_2$
- The problem: $e(p, u) = \min_x p \cdot x$ s.t. $u - u(x) \le 0$, $x \ge 0$
- The solution is $x^h(p, u)$, which depends on prices and the utility level.
- If preferences are monotonic and strictly convex, the solution will be unique.
- The lowest expenditure necessary to achieve utility u at prices p is equal to the cost of the bundle $x^h(p, u)$: $e(p, u) = p \cdot x^h(p, u)$.
- We can use the consumer’s expenditure minimization problem to explore a very different kind of “demand behavior”, one that is entirely unobservable or hypothetical, unlike the Marshallian demand, which is observable.
- If we fix the level of utility the consumer is permitted to achieve at some arbitrary level u, how will his purchases of each good behave as we change the prices he faces?
- The answer is given by the utility-constant demand functions.
Hicksian Demand Functions:
- Fix the utility level and solve the problem $e(p, u) = \min_x p \cdot x$ s.t. $u - u(x) \le 0$, $x \ge 0$, to obtain the consumption bundle
$x^h(p^0, u) = \left( x_1^h(p_1^0, p_2^0, u),\ x_2^h(p_1^0, p_2^0, u) \right)$.
- Change the price $p_1$ from $p_1^0$ to $p_1^1$, and the optimal choice changes to:
$x^h(p^1, u) = \left( x_1^h(p_1^1, p_2^0, u),\ x_2^h(p_1^1, p_2^0, u) \right)$.
Theorem 4.1.3 Properties of the Expenditure Function
Let preferences be monotonic and let p>>0. Let u>u(0) and let e(p,u) be defined as in
4.1.3. Then e(p,u) is:
1. Increasing in u
2. Non-decreasing in p
3. Homogeneous of degree 1 in p
4. Concave in p
5. Also, the price partial derivatives of e(p,u) are the Hicksian demand functions:
$\dfrac{\partial e(p, u)}{\partial p_i} = x_i^h(p, u)$

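Proof:
The problem: $e(p, u) = \min_x p \cdot x$ s.t. $u - u(x) \le 0$, $x \ge 0$
Lagrangian for this problem: $L(x, \lambda) = p \cdot x + \lambda [ u - u(x) ]$
If $x^h = x^h(p, u) \ge 0$ solves the minimization problem, then $x^h$ and $\lambda$ satisfy the Kuhn-Tucker conditions:
$\dfrac{\partial L}{\partial x_i} = p_i - \lambda \dfrac{\partial u(x^h)}{\partial x_i} \ge 0$
$x_i^h \cdot \dfrac{\partial L}{\partial x_i} = x_i^h \left[ p_i - \lambda \dfrac{\partial u(x^h)}{\partial x_i} \right] = 0$
$\dfrac{\partial L}{\partial \lambda} = u - u(x^h) \le 0$
$\lambda \cdot \dfrac{\partial L}{\partial \lambda} = \lambda [ u - u(x^h) ] = 0$
Since u > u(0), we have $x^h \ne 0$, so $x_j^h > 0$ for at least one j. From the Kuhn-Tucker conditions we must therefore have $p_j - \lambda \dfrac{\partial u(x^h)}{\partial x_j} = 0$ for at least one j.
By the monotonicity of preferences, we have $\dfrac{\partial u(x^h)}{\partial x_j} > 0 \implies \lambda = \dfrac{p_j}{\partial u(x^h) / \partial x_j} > 0$.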
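Proof of property 1:
By the Envelope Theorem: $\dfrac{\partial e(p, u)}{\partial u} = \left. \dfrac{\partial L}{\partial u} \right|_{(x^h, \lambda)} = \lambda > 0$
$\implies$ e(p,u) is increasing in u.
Proof of properties 2 and 5:
By the Envelope Theorem we have: $\dfrac{\partial e(p, u)}{\partial p_i} = \left. \dfrac{\partial L}{\partial p_i} \right|_{(x^h, \lambda)} = x_i^h(p, u) \ge 0$
$\implies$ e(p,u) is non-decreasing in p.
Proof of property 3: e(p,u) is homogeneous of degree 1 in p.
$e(p, u) = p_1 x_1^h + p_2 x_2^h + \cdots + p_n x_n^h$. Multiplying all prices by t>0 multiplies the cost of every bundle by t and leaves the constraint unchanged, so the same bundle $x^h$ minimizes expenditure at prices tp. Hence $e(t p, u) = t p \cdot x^h = t\, e(p, u)$
$\implies$ e(p,u) is homogeneous of degree 1 in p.
Proof of property 4: e(p,u) is concave in p.
We need to prove that: $e(p^t, u) \ge t\, e(p^1, u) + (1 - t)\, e(p^2, u)$, where $p^t = t p^1 + (1 - t) p^2$ and $t \in [0, 1]$.
Let $x^1$, $x^2$, $x^t$ minimize expenditure to achieve u when prices are $p^1$, $p^2$, $p^t$ respectively.
By the definition of e(p,u), we have:
$p^1 \cdot x^1 \le p^1 \cdot x^t \implies t\, p^1 \cdot x^1 \le t\, p^1 \cdot x^t$
$p^2 \cdot x^2 \le p^2 \cdot x^t \implies (1 - t)\, p^2 \cdot x^2 \le (1 - t)\, p^2 \cdot x^t$
Adding the two inequalities: $t\, e(p^1, u) + (1 - t)\, e(p^2, u) \le p^t \cdot x^t = e(p^t, u)$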
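Properties 3, 4 and 5 can be illustrated on the CES expenditure function that these notes derive later, e(p, u) = u·(p1^r + p2^r)^(1/r), together with the CES Hicksian demands x_i^h = u·(p1^r + p2^r)^((1/r)-1)·p_i^(r-1). The sketch below is only a numerical check under assumed values (r = -1, the prices, the utility level and the step size h are all hypothetical choices).

```python
def e(p1, p2, u, r=-1.0):
    """CES expenditure function (r = -1 is an assumed parameter value)."""
    return u * (p1 ** r + p2 ** r) ** (1.0 / r)

def hicksian(p1, p2, u, r=-1.0):
    """CES Hicksian demands in closed form."""
    s = p1 ** r + p2 ** r
    scale = u * s ** (1.0 / r - 1.0)
    return scale * p1 ** (r - 1.0), scale * p2 ** (r - 1.0)

p1, p2, u, h = 2.0, 4.0, 9.0, 1e-6

# Property 5: price derivatives of e equal the Hicksian demands.
de_dp1 = (e(p1 + h, p2, u) - e(p1 - h, p2, u)) / (2 * h)
de_dp2 = (e(p1, p2 + h, u) - e(p1, p2 - h, u)) / (2 * h)
x1h, x2h = hicksian(p1, p2, u)

# Property 3: homogeneity of degree 1 in prices.
scaled = e(3.0 * p1, 3.0 * p2, u)

# Property 4: concavity in prices, checked at the midpoint of two price vectors.
qa, qb, t = (2.0, 4.0), (5.0, 1.0), 0.5
qt = (t * qa[0] + (1 - t) * qb[0], t * qa[1] + (1 - t) * qb[1])
lhs = e(qt[0], qt[1], u)
rhs = t * e(qa[0], qa[1], u) + (1 - t) * e(qb[0], qb[1], u)

print(de_dp1, x1h, de_dp2, x2h)
print(scaled, 3.0 * e(p1, p2, u))
print(lhs >= rhs)
```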
4.1.3 Relations between the Two
From the definition of the expenditure function:
$e(p, v(p, y)) = \min_x p \cdot x$ s.t. $u(x) \ge v(p, y)$
Substituting from the definition of the indirect utility function:
$e(p, v(p, y)) = \min_x p \cdot x$ s.t. $u(x) \ge \max_{x'} u(x')$ s.t. $p \cdot x' \le y$
We must of course have: $e(p, v(p, y)) = y$.
From the definition of the indirect utility function:
$v(p, e(p, u)) = \max_x u(x)$ s.t. $p \cdot x \le e(p, u)$
Substituting from the definition of the expenditure function, we have:
$v(p, e(p, u)) = \max_x u(x)$ s.t. $p \cdot x \le \min_{x'} p \cdot x'$ s.t. $u(x') \ge u$
We must have: $v(p, e(p, u)) = u$
Theorem 4.1.4 Identities Relating Indirect Utility and Expenditure Functions
Let v(p,y) and e(p,u) be the indirect utility function and expenditure function for some consumer. Then the following relations between the two hold for all prices p, incomes y, and utility levels u:
$e(p, v(p, y)) = y$
$v(p, e(p, u)) = u$
- This theorem points us to an easy way to derive either one directly from knowledge of the other, thus requiring us to solve only one optimization problem and giving us the choice of which one we care to solve.
- v(p,y) is strictly increasing in y. Since $v(p, e(p, u)) = u$, inverting the indirect utility function in its income variable gives:
$e(p, u) = v^{-1}(p : u)$
- e(p,u) is strictly increasing in u. Since $e(p, v(p, y)) = y$, inverting the expenditure function in its utility variable gives:
$v(p, y) = e^{-1}(p : y)$
- Example: The CES direct utility function gives the indirect utility function:
$v(p, y) = (p_1^r + p_2^r)^{-1/r}\, y$
For an income level equal to e(p,u), we must have: $v(p, e(p, u)) = (p_1^r + p_2^r)^{-1/r}\, e(p, u)$
By the theorem, we have $(p_1^r + p_2^r)^{-1/r}\, e(p, u) = u \implies e(p, u) = (p_1^r + p_2^r)^{1/r}\, u$
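The two identities, and the idea of obtaining e(p, u) by inverting v(p, y) in its income argument, can be sketched numerically for the CES example above. The closed forms are the ones stated in the notes; the value r = -1, the numbers used, and the bisection routine `e_by_inversion` are assumptions introduced for the illustration.

```python
def v(p1, p2, y, r=-1.0):
    """CES indirect utility function (r = -1 is an assumed parameter value)."""
    return y * (p1 ** r + p2 ** r) ** (-1.0 / r)

def e(p1, p2, u, r=-1.0):
    """CES expenditure function obtained from v via Theorem 4.1.4."""
    return u * (p1 ** r + p2 ** r) ** (1.0 / r)

def e_by_inversion(p1, p2, u, r=-1.0, lo=0.0, hi=1e6, tol=1e-10):
    """Hypothetical helper: solve v(p, y) = u for y by bisection.

    This works because v is strictly increasing in y; the bracket [lo, hi]
    is assumed to contain the solution.
    """
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if v(p1, p2, mid, r) < u:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

p1, p2, y, u = 2.0, 4.0, 12.0, 9.0
round_trip_income = e(p1, p2, v(p1, p2, y))    # should return y
round_trip_utility = v(p1, p2, e(p1, p2, u))   # should return u
inverted = e_by_inversion(p1, p2, u)           # should match e(p, u)

print(round_trip_income, round_trip_utility, inverted, e(p1, p2, u))
```

The numerical inversion is useful mainly when v has no convenient closed-form inverse; for the CES case it simply confirms the algebra.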
Theorem 4.1.5 Identical Relations Between Marshallian and Hicksian Demand Functions
For any prices p, income y, and utility level u, the following identical relations hold between the consumer’s Hicksian and Marshallian demand functions:
$x_i(p, y) = x_i^h(p, v(p, y))$ for all p, y and $i = 1, 2, \ldots, n$
$x_i^h(p, u) = x_i(p, e(p, u))$ for all p, u and $i = 1, 2, \ldots, n$
Example:
The Hicksian demands are: $x_i^h(p, u) = (p_1^r + p_2^r)^{(1/r) - 1}\, p_i^{r-1}\, u$
The indirect utility function is: $v(p, y) = (p_1^r + p_2^r)^{-1/r}\, y$
We have: $x_i(p, y) = x_i^h(p, v(p, y)) = (p_1^r + p_2^r)^{(1/r) - 1}\, p_i^{r-1}\, (p_1^r + p_2^r)^{-1/r}\, y = \dfrac{p_i^{r-1}\, y}{p_1^r + p_2^r}$
4.2 Properties of Consumer Demand
In statistically estimating consumer demand systems, characteristics of demand behavior predicted by the theory are used to provide restrictions on the values which estimated parameters are allowed to take, thereby ensuring that the empirical estimates are at least logically consistent with the underlying theory from which they are constructed.

Relative Price and Real Income
- Economists generally prefer to measure important variables in real terms.
- Relative prices and real income are two such real measures.
- Relative price: By the relative price of some good, we mean the number of units of some other good which must be forgone in order to acquire 1 unit of the good in question.
$\dfrac{p_i}{p_j} = \dfrac{\$ / \text{unit } i}{\$ / \text{unit } j} = \dfrac{\text{units of } j}{\text{unit of } i}$ measures the units of good j forgone per unit of good i acquired.
- Consumer’s real income: We mean the total number of units of some commodity which could be acquired if the consumer spent his entire money income on that commodity.
Real income in terms of good j: $\dfrac{y}{p_j} = \dfrac{\$}{\$ / \text{unit of } j} = \text{units of } j$
Theorem 4.2.1 Homogeneity
The consumer’s demand functions, $x_i(p, y)$, are homogeneous of degree zero in all prices and income.
Proof:
Equiproportionate changes in all p and y leave the consumer’s budget set unchanged:
$y \ge p_1 x_1 + p_2 x_2 \iff t y \ge t p_1 x_1 + t p_2 x_2 \quad (t > 0)$
So the optimal point is unchanged.
$\implies x_i(p, y) = x_i(t p, t y) \implies$ the consumer’s demand function $x_i(p, y)$ is homogeneous of degree zero.
Application:
With $t = \dfrac{1}{p_n} > 0$, we have: $x_i(p, y) = x_i(t p, t y) = x_i\left( \dfrac{p_1}{p_n}, \ldots, \dfrac{p_{n-1}}{p_n}, 1, \dfrac{y}{p_n} \right)$
Demand for each of the n goods depends only on (n-1) relative prices and the consumer’s real income.
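Homogeneity of degree zero, and the resulting dependence of demand on relative prices and real income only, can be seen on the CES Marshallian demands from the earlier example, x_i = p_i^(r-1)·y / (p1^r + p2^r). The value r = -1 and the numbers below are assumptions made for the illustration.

```python
def demand(p1, p2, y, r=-1.0):
    """CES Marshallian demands (r = -1 is an assumed parameter value)."""
    s = p1 ** r + p2 ** r
    return p1 ** (r - 1.0) * y / s, p2 ** (r - 1.0) * y / s

p1, p2, y = 2.0, 4.0, 12.0

base = demand(p1, p2, y)
# Tripling all prices and income leaves the chosen bundle unchanged.
scaled = demand(3.0 * p1, 3.0 * p2, 3.0 * y)
# Taking t = 1/p_2: demand as a function of the relative price p1/p2
# and of real income y/p2 measured in units of good 2.
normalized = demand(p1 / p2, 1.0, y / p2)

print(base, scaled, normalized)
```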
4.2.2 Income and Substitution Effects
Ordinarily, a consumer will buy more of a good when its price declines, and less when its price increases. However, this need not always be the case.
Substitution effect:
- Since all goods are taken to be desirable by the consumer, even if the consumer’s total command over goods were unchanged, we would expect him to substitute more of the good which has become relatively cheaper for less of the goods which are now relatively more expensive.
- The substitution effect is that (hypothetical) change in consumption which would occur if relative prices were to change to their new level but the maximum level of utility the consumer can achieve were kept the same as before the price change.
- When the price of good 1 changes from $p_1^0$ to $p_1^1$, the problem is:
$\min_x\ p_1^1 x_1 + p_2^0 x_2$ s.t. $u(x) \ge u^0$
Income effect:
- When the price of any one good declines, the consumer’s total command over all resources is effectively increased, allowing him to change his purchases of all goods in any way he sees fit. The effect on quantity demanded of this generalized increase in purchasing power is called the income effect.
- The income effect is defined as the residual of the total effect which is left after the substitution effect.
Total effect:
- When the price of good 1 changes from $p_1^0$ to $p_1^1$, the problem is:
$\max_x\ u(x)$ s.t. $p_1^1 x_1 + p_2^0 x_2 \le y$
Slutsky Equation (Fundamental Equation of Demand Theory): the general analytical relationships between total effect, substitution effect, and income effect are summarized by the Slutsky Equation.

Theorem 4.2.2 The Slutsky Equation
Let x(p,y) be the consumer’s Marshallian demand system. Let u* be the level of utility the consumer achieves at prices p and income y. Then,
$\underbrace{\dfrac{\partial x_j(p, y)}{\partial p_i}}_{TE} = \underbrace{\dfrac{\partial x_j^h(p, u^*)}{\partial p_i}}_{SE}\ \underbrace{-\ x_i(p, y)\, \dfrac{\partial x_j(p, y)}{\partial y}}_{IE}$
Proof:
The identity linking the Marshallian and Hicksian demand functions:
$x_j^h(p, u^*) = x_j(p, e(p, u^*))$
Differentiating both sides with respect to $p_i$:
$\dfrac{\partial x_j^h(p, u^*)}{\partial p_i} = \dfrac{\partial x_j(p, e(p, u^*))}{\partial p_i} + \dfrac{\partial x_j(p, e(p, u^*))}{\partial y} \cdot \dfrac{\partial e(p, u^*)}{\partial p_i}$ (1)
By assumption, u* is the level of utility the consumer achieves facing prices p and having income y (see the theorem), so u* = v(p,y).
The minimum expenditure at prices p and utility u* will therefore be the same as the minimum expenditure at prices p and utility v(p,y):
$e(p, u^*) = e(p, v(p, y)) = y$ (2)
We have: $\dfrac{\partial e(p, u^*)}{\partial p_i} = x_i^h(p, u^*) = x_i^h(p, v(p, y)) = x_i(p, y)$ (3)
Substituting (2) and (3) into (1) and rearranging, we have:
$\underbrace{\dfrac{\partial x_j(p, y)}{\partial p_i}}_{TE} = \underbrace{\dfrac{\partial x_j^h(p, u^*)}{\partial p_i}}_{SE}\ \underbrace{-\ x_i(p, y)\, \dfrac{\partial x_j(p, y)}{\partial y}}_{IE}$
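The Slutsky decomposition can be checked numerically for the CES example used earlier in these notes. The sketch below approximates each partial derivative by a central finite difference and compares the total effect with the sum of the substitution and income effects, both for an own-price change (i = j = 1) and for a cross-price change (j = 1, i = 2). The value r = -1, the prices, the income and the helper `diff` are assumptions made for the illustration.

```python
def v(p1, p2, y, r=-1.0):
    """CES indirect utility function (r = -1 is an assumed parameter value)."""
    return y * (p1 ** r + p2 ** r) ** (-1.0 / r)

def demand(p1, p2, y, r=-1.0):
    """CES Marshallian demands."""
    s = p1 ** r + p2 ** r
    return p1 ** (r - 1.0) * y / s, p2 ** (r - 1.0) * y / s

def hicksian(p1, p2, u, r=-1.0):
    """CES Hicksian demands."""
    s = p1 ** r + p2 ** r
    scale = u * s ** (1.0 / r - 1.0)
    return scale * p1 ** (r - 1.0), scale * p2 ** (r - 1.0)

def diff(f, x, h=1e-6):
    """Hypothetical helper: central finite-difference derivative of f at x."""
    return (f(x + h) - f(x - h)) / (2 * h)

p1, p2, y = 2.0, 4.0, 12.0
u_star = v(p1, p2, y)
x1, x2 = demand(p1, p2, y)
dx1_dy = diff(lambda m: demand(p1, p2, m)[0], y)

# Own-price change: effect of p1 on good 1.
te_own = diff(lambda q: demand(q, p2, y)[0], p1)
se_own = diff(lambda q: hicksian(q, p2, u_star)[0], p1)
ie_own = -x1 * dx1_dy

# Cross-price change: effect of p2 on good 1.
te_cross = diff(lambda q: demand(p1, q, y)[0], p2)
se_cross = diff(lambda q: hicksian(p1, q, u_star)[0], p2)
ie_cross = -x2 * dx1_dy

print(te_own, se_own + ie_own)
print(te_cross, se_cross + ie_cross)
```

In this example the own-substitution term is negative, consistent with the negativity result stated for Hicksian demands.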
Theorem 4.2.3 Negativity of Own-Substitution Terms:
Let $x_i^h(p, u)$ be the Hicksian demand for good i. Then,
$\dfrac{\partial x_i^h(p, u)}{\partial p_i} \le 0$
Proof:
$\dfrac{\partial e(p, u)}{\partial p_i} = x_i^h(p, u) \implies \dfrac{\partial^2 e(p, u)}{\partial p_i^2} = \dfrac{\partial x_i^h(p, u)}{\partial p_i}$
By Theorem 4.1.3, the expenditure function is a concave function of p, so its own second-order partial derivatives are non-positive.
Normal or superior goods: consumption increases as real income increases, holding relative prices constant.
Inferior goods: consumption decreases as real income increases, holding relative prices constant.
Theorem 4.2.4 The Law of Demand
Let preferences be complete, transitive, reflexive, monotonic, and strictly convex. If a good
is a normal good, then a decrease in price will cause an increase in quantity demanded. If a
decrease in price causes a decrease in quantity demanded, then the good must be an
inferior good.
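Theorem 4.2.5 Symmetry of Substitution Terms
Let $x^h(p, u)$ be the consumer’s system of Hicksian demands. Then,
$\dfrac{\partial x_i^h(p, u)}{\partial p_j} = \dfrac{\partial x_j^h(p, u)}{\partial p_i}$
Proof:
$\dfrac{\partial x_i^h(p, u)}{\partial p_j} = \dfrac{\partial^2 e(p, u)}{\partial p_j\, \partial p_i}$
By Young’s Theorem: $\dfrac{\partial^2 e(p, u)}{\partial p_i\, \partial p_j} = \dfrac{\partial^2 e(p, u)}{\partial p_j\, \partial p_i}$
$\implies \dfrac{\partial x_i^h(p, u)}{\partial p_j} = \dfrac{\partial x_j^h(p, u)}{\partial p_i}$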

Theorem 4.2.6 Negative Semi-Definiteness of the Substitution Matrix
Let $x^h(p, u)$ be the consumer’s system of Hicksian demands, and let $\left[ \dfrac{\partial x_i^h(p, u)}{\partial p_j} \right]_{i, j = 1, \ldots, n}$ represent the entire matrix of Hicksian substitution terms. Then this matrix is negative semi-definite.
Proof:
$\dfrac{\partial x_i^h(p, u)}{\partial p_j} = \dfrac{\partial^2 e(p, u)}{\partial p_j\, \partial p_i}$
The expenditure function is concave in prices. From Theorem 2.1.3, the matrix of second-order partials (the Hessian) of a concave function is negative semi-definite.
Theorem 4.2.7 Negative Semi-Definiteness of the Slutsky Matrix
Let x(p,y) be the consumer’s Marshallian demand system. Define the Slutsky matrix as the $n \times n$ matrix of price and income responses given by:
$\left[ \dfrac{\partial x_i(p, y)}{\partial p_j} + x_j(p, y)\, \dfrac{\partial x_i(p, y)}{\partial y} \right]_{i, j = 1, \ldots, n}$
Then the theory of the preference-maximizing, atomistic consumer requires that the Slutsky matrix be negative semi-definite.
4.2.3 Some Elasticity Relations
If preferences are monotonic, at least one good will be bought in a positive amount.
Definition 4.2.1 Demand Elasticities and Income Shares
Let $x_i(p, y)$ be the consumer’s Marshallian demand for good i. Then let
Income elasticity: $\eta_i = \dfrac{\partial x_i(p, y)}{\partial y} \cdot \dfrac{y}{x_i(p, y)}$
Price elasticity: $\epsilon_{ij} = \dfrac{\partial x_i(p, y)}{\partial p_j} \cdot \dfrac{p_j}{x_i(p, y)}$
And Income Share: $s_i = \dfrac{p_i\, x_i(p, y)}{y}$, so that $s_i \ge 0$ and $\sum_{i=1}^{n} s_i = 1$.
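Theorem 4.2.8 Aggregation in Consumer Demand
Let x(p,y) be the consumer’s Marshallian demand system. Let $\eta_i$, $\epsilon_{ij}$, and $s_i$ be the income elasticity, cross-price elasticity and income share. Then the following relations must hold between income shares, price elasticities, and income elasticities of demand:
Engel Aggregation: $\sum_{i=1}^{n} s_i \eta_i = 1$ (1)
Cournot Aggregation: $\sum_{i=1}^{n} s_i \epsilon_{ij} = -s_j$ (2)
Proof:
Prove (1)
$y = p \cdot x(p, y)$. Differentiating both sides with respect to y:
$1 = \sum_{i=1}^{n} p_i \dfrac{\partial x_i(p, y)}{\partial y} = \sum_{i=1}^{n} \dfrac{p_i x_i}{y} \cdot \dfrac{\partial x_i(p, y)}{\partial y} \cdot \dfrac{y}{x_i} = \sum_{i=1}^{n} s_i \eta_i$
Prove (2)
$y = p \cdot x(p, y)$. Differentiating both sides with respect to $p_j$:
$0 = \sum_{i=1}^{n} p_i \dfrac{\partial x_i(p, y)}{\partial p_j} + x_j \implies -x_j = \sum_{i=1}^{n} p_i \dfrac{\partial x_i(p, y)}{\partial p_j}$
Multiplying both sides of the equation by $p_j / y$, we get:
$-\dfrac{x_j p_j}{y} = \sum_{i=1}^{n} \dfrac{p_i}{y} \cdot \dfrac{\partial x_i(p, y)}{\partial p_j}\, p_j = \sum_{i=1}^{n} \dfrac{p_i x_i}{y} \cdot \dfrac{\partial x_i(p, y)}{\partial p_j} \cdot \dfrac{p_j}{x_i} = \sum_{i=1}^{n} s_i \epsilon_{ij}$
$\implies \sum_{i=1}^{n} s_i \epsilon_{ij} = -s_j$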
4.3 Duality In Consumer Theory
There is a question: Starting with an expenditure or an indirect utility function, can we “work
backwards” to discover the underlying direct utility function that would have generated it?
This question is answered by following the mathematical “duality” between various

optimization problems used to describe consumer behavior.

4.3.1 Expenditure and Consumer Preferences
4.3.2 Indirect Utility and Consumer Preferences
- The duality between direct and indirect utility functions.
- Suppose we are given a continuous function v(p,y), homogeneous of degree zero in p and y, increasing in y, and non-increasing and quasiconvex in p. It can be shown that there exists a non-decreasing, quasiconcave direct utility function u*(x) which rationalizes v(p,y).
- v(p,y) is homogeneous of degree zero in p and y, so $v(t p, t y) = v(p, y)$. Let $t = 1/y$; then $v(p/y, 1) = v(p, y)$, which we write as $v(\tilde{p}) = v(p, y)$ with $\tilde{p} = p / y$.
- $v(\tilde{p})$ is called the normalized indirect utility function, and depends on normalized prices alone.
- With direct utility function u(x), the normalized indirect utility function is defined as:
$v(\tilde{p}) = \max_x u(x)$ s.t. $\tilde{p} \cdot x = 1$
Theorem 4.3.4 Duality Between Direct and Indirect Utility
Let v(p,y) be any indirect utility function, and form the normalized indirect utility function $v(\tilde{p})$. Then the implied direct utility function is given by the following minimum value function: $u(x) = \min_{\tilde{p}} v(\tilde{p})$ s.t. $\tilde{p} \cdot x = 1$.
Proof:
Theorem 4.3.5 (Hotelling, Wold) Duality and the System of Inverse Demands
Let u(x) be the consumer’s direct utility function. Then the inverse demand function for good i is given by:
$\tilde{p}_i(x) = \dfrac{\partial u(x) / \partial x_i}{\sum_{j=1}^{n} x_j \left( \partial u(x) / \partial x_j \right)}$
Proof:
From the normalized indirect utility function $v(\tilde{p})$:
$u(x) = \min_{\tilde{p}} v(\tilde{p})$ s.t. $\tilde{p} \cdot x = 1$
The associated Lagrangian is:
$L = v(\tilde{p}) - \lambda [ 1 - \tilde{p} \cdot x ]$
Applying the Envelope Theorem: $\dfrac{\partial u(x)}{\partial x_j} = \left. \dfrac{\partial L}{\partial x_j} \right|_{\tilde{p}^*(x),\, \lambda^*(x)} = \lambda^*(x)\, \tilde{p}_j^*(x)$ (1)
Multiplying both sides by $x_j$ and summing over $j = 1, \ldots, n$ gives:
$\sum_{j=1}^{n} x_j \dfrac{\partial u(x)}{\partial x_j} = \lambda^*(x) \sum_{j=1}^{n} \tilde{p}_j^*(x)\, x_j = \lambda^*(x)$ (2)
From (1) and (2), we have:
$\tilde{p}_j^*(x) = \dfrac{\partial u(x) / \partial x_j}{\sum_{k=1}^{n} x_k \left( \partial u(x) / \partial x_k \right)}$
4.4 Uncertainty
Certainty: the consumer knows the prices of all commodities and knows that any feasible consumption bundle can be obtained with certainty.
Many economic decisions contain some element of uncertainty: future income, future prices…
4.4.1 Preferences
- Before, the consumer was assumed to have a preference ordering over all consumption bundles x in the consumption set X. Implicit in our statement that “bundle $x^i$ is preferred to bundle $x^j$” was the assumption that the individual chooses between $x^i$ with certainty and $x^j$ with certainty.
- Under uncertainty, instead of ordering consumption bundles, the individual is assumed to have a preference ordering over gambles.
- Let’s first define an outcome as a result of some uncertain situation. For example, the outcomes of a bet are win and loss.
- $A = \{ a_1, a_2, \ldots, a_n \}$: the set of all mutually exclusive ultimate outcomes that an individual could end up with.
$a_i$ could be an m-dimensional commodity vector or, alternatively, a scalar. The way we characterize A depends upon the nature of the particular problem we wish to address.
- Gambles: $G = [ p_1 \circ a_1 \wedge p_2 \circ a_2 \wedge \cdots \wedge p_n \circ a_n ]$ denotes the entire gamble involving $a_1$ with probability $p_1$, $a_2$ with probability $p_2$, and so forth.
Where:
$p_1 \circ a_1$ denotes the outcome $a_1$ with its probability of occurrence $p_1$.
$\wedge$ denotes “and”, the logical symbol.
Definition 4.4.1 The Space of Gambles, g(A)
The space g(A) is the set of all possible gambles which can be constructed from the outcome set A by varying the probabilities $0 \le p_i \le 1$ of each $a_i \in A$ while ensuring that $\sum_{i=1}^{n} p_i = 1$.
Since each $a_i$ is the special gamble in g(A) where $p_i = 1$ and $p_j = 0$ for all $j \ne i$, the set of all ultimate outcomes A is itself a subset of g(A).
The problem of choice under uncertainty can be viewed as a choice between alternative gambles in g(A).
We can then define a binary relation $\succsim$ on g(A), where the symbol $\succsim$ stands for the statement “is at least as good as”.
We again suppose that these preferences obey certain rules, which we lay down in the form of axioms called the “axioms of choice under uncertainty”.
Axiom G1: (Completeness) For any distinct gambles $G_1$ and $G_2$ in g(A), either $G_1 \succsim G_2$ or $G_2 \succsim G_1$.
Axiom G2: (Reflexivity) For any gamble $G \in g(A)$, $G \succsim G$.
Axiom G3: (Transitivity) For any three gambles $G_1$, $G_2$, and $G_3$ in g(A), if $G_1 \succsim G_2$ and $G_2 \succsim G_3$, then $G_1 \succsim G_3$.
With the addition of Axiom G3, $\succsim$ gives a complete ordering of gambles.
One important consequence of this axiom is that there must exist a best and a worst outcome in A. Note that best and worst outcomes need not be unique.
In $G = [ p_1 \circ a_1 \wedge p_2 \circ a_2 \wedge \cdots \wedge p_n \circ a_n ]$, if $p_i = 1$ and $p_j = 0$ for all $j \ne i$, then $G = a_i$, so the outcomes in A are themselves ordered by $\succsim$. By Axiom G3, there must therefore exist a best and a worst outcome in A. A best outcome $a_B \in A$ satisfies $a_B \succsim a_j$ for all $a_j \in A$. A worst outcome $a_W \in A$ satisfies $a_j \succsim a_W$ for all $a_j \in A$.
Axiom G4: (Continuity) For any gamble $G \in g(A)$, there exists some probability z, $0 \le z \le 1$, such that $G \sim [ z \circ a_B \wedge (1 - z) \circ a_W ]$.
Here z is called the indifference probability, and $[ z \circ a_B \wedge (1 - z) \circ a_W ]$ is called a best-worst gamble.
For any gamble G there is some other gamble, involving only the best and the worst outcomes in A, which the agent ranks indifferent to G.
Axiom G5. (Monotonicity) For any two best-worst gambles, $G_1 = [ p \circ a_B \wedge (1 - p) \circ a_W ]$ and $G_2 = [ q \circ a_B \wedge (1 - q) \circ a_W ]$, we have $G_1 \succsim G_2$ if and only if $p \ge q$.
Axiom G5 states that given the choice between any two best-worst gambles with different probabilities attached to the same best outcome, an individual will never prefer the gamble with the lower probability of the best outcome.
Together, Axioms G4 and G5 rule out some kinds of behavior which, at first glance, might appear quite reasonable. For example, let A = {“death”, \$10, \$1000}, with $a_W$ = death, $a_B$ = \$1000 and $a_B \succ \$10 \succ a_W$. Consider the intermediate gamble $G_3$ = \$10. According to Axiom G4, there must be some best-worst gamble such that $G_3 \sim [ z \circ a_B \wedge (1 - z) \circ a_W ]$, and since \$1000 is strictly preferred to \$10, Axiom G5 requires (1-z)>0. If there is no strictly positive probability of death at which you would accept the gamble $[ z \circ \$1000 \wedge (1 - z) \circ \text{death} ]$ in place of \$10 with certainty, then your preferences violate the combined implications of Axioms G4 and G5.
Axiom G6. (Substitutability) For any outcome \(a_i \in A\) and any gamble \(G_j \in g(A)\), if
\(a_i \sim G_j\), then
\[ [p_1 \circ a_1, \ldots, p_i \circ a_i, \ldots, p_n \circ a_n] \sim [p_1 \circ a_1, \ldots, p_i \circ G_j, \ldots, p_n \circ a_n]. \]
Axiom G6 states that if the individual is indifferent between an outcome promised with
certainty and some gamble, then he must also be indifferent between two otherwise identical
gambles offering each of these with the same probability.
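Axiom G7. (Net Probability Rule) Let the gamble
\[ G_i = [p_1 \circ a_1, \ldots, p_i \circ G_j, \ldots, p_n \circ a_n], \]
where the gamble
\[ G_j = [q_1 \circ a_1, \ldots, q_i \circ a_i, \ldots, q_n \circ a_n]. \]
Then
\[ G_i \sim [(p_1 + q_1 p_i) \circ a_1, \ldots, p_i q_i \circ a_i, \ldots, (p_n + q_n p_i) \circ a_n]. \]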
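The net probability rule can be illustrated with a small numerical sketch. The function name `reduce_compound` and the sample probabilities are illustrative assumptions, not part of the notes; the code simply applies the formula in Axiom G7 to a compound gamble whose slot i pays a sub-gamble over the same outcomes.

```python
def reduce_compound(p, i, q):
    """Collapse a compound gamble into a simple one (Axiom G7).

    p: probabilities of the outer gamble over outcomes a_1..a_n,
       where slot i pays the sub-gamble G_j instead of a_i.
    q: probabilities of the sub-gamble G_j over the same outcomes.
    Returns the net probability attached to each outcome.
    """
    net = [p[k] + p[i] * q[k] for k in range(len(p))]
    net[i] = p[i] * q[i]  # a_i can only be reached through G_j
    return net


# Hypothetical numbers: three outcomes, slot i = 1 (0-based) holds G_j.
p = [0.5, 0.3, 0.2]
q = [0.2, 0.5, 0.3]
net = reduce_compound(p, 1, q)
print(net)
print(sum(net))
```

The net probabilities still sum to one, which is what allows the compound gamble to be treated as a simple gamble over the original outcomes.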
4.4.2 Von Neumann – Morgenstern Utility
Can we represent these preferences with a continuous real-valued function? The answer is yes.
Axioms G1, G2, and G3, together with some kind of continuity assumption, should be sufficient to ensure the
existence of a simple numerical function representing \(\succsim\).
Here, however, we ask for more: not merely whether some function represents \(\succsim\), but whether there is
a function of a certain kind, with a specific mathematical property, representing \(\succsim\).
Let \(G = [p_1 \circ a_1, \ldots, p_i \circ a_i, \ldots, p_n \circ a_n]\) be any gamble in g(A), and suppose that the function
\(U : g(A) \to \mathbb{R}\) represents the preference \(\succsim\) by assigning larger numbers to preferred gambles
and equal numbers to indifferent gambles. Then, of course, U is a utility function in the usual
sense. But if, in addition, the numbers assigned by U to gambles G satisfy
\[ U(G) = \sum_{i=1}^{n} p_i U(a_i), \]
we say that the utility function U possesses the extra, expected utility property.
A Utility Function possesses the Expected Utility Property if and only if the Utility number it
assigns to any gamble can be expressed as the Expected Value of the Utility numbers it
assigns to the Ultimate outcomes in that gamble.
Theorem 4.4.1 Existence of a Von Neumann – Morgenstern (VNM) Utility Function over
Gambles
Let preferences over gambles, \(\succsim\), satisfy Axioms G1 through G7. Then there exists a
function \(U : g(A) \to \mathbb{R}\) such that, for all \(G_1\) and \(G_2\) in g(A), \(G_1 \succsim G_2\) if and only if
\(U(G_1) \geq U(G_2)\), and where, moreover, for any gamble \(G = [p_1 \circ a_1, \ldots, p_i \circ a_i, \ldots, p_n \circ a_n]\),
\[ U(G) = \sum_{i=1}^{n} p_i U(a_i). \]
Proof:
Let G be any gamble in g(A), where
\[ G = [p_1 \circ a_1, \ldots, p_i \circ a_i, \ldots, p_n \circ a_n]. \quad \text{(P.1)} \]
By Axioms G1, G2, and G3, there is a complete ordering over g(A), and we can therefore
identify a best outcome \(a_B\) and a worst outcome \(a_W\) in A.
By Axiom G4, there exists an “indifference probability”, \(0 \leq z_i \leq 1\), for each outcome \(a_i\),
which satisfies
\[ a_i \sim [z_i \circ a_B, (1 - z_i) \circ a_W]. \quad \text{(P.2)} \]
It is easy to prove that these indifference probabilities are unique (using Axiom G5).
According to Axiom G6, we can substitute the right-hand side of (P.2) for each \(a_i\) in
(P.1):
\[ G \sim [p_1 \circ [z_1 \circ a_B, (1 - z_1) \circ a_W], \ldots, p_n \circ [z_n \circ a_B, (1 - z_n) \circ a_W]]. \quad \text{(P.3)} \]
Using transitivity and Axiom G7, we obtain
\[ G \sim \left[ \left( \sum_{i=1}^{n} p_i z_i \right) \circ a_B, \left( 1 - \sum_{i=1}^{n} p_i z_i \right) \circ a_W \right] \quad \text{(P.4)} \]
(recall that \(\sum_{i=1}^{n} p_i = 1\)).
We now propose a mapping from gambles to the real line. Let
\[ U(G) = \sum_{i=1}^{n} p_i U(a_i), \quad \text{(P.5)} \]
where
\[ U(a_i) = z_i, \quad i = 1, \ldots, n. \quad \text{(P.6)} \]
Note that this mapping is indeed a function, since the indifference probabilities from
which it is constructed always exist and are unique.
We need to show that this function represents \(\succsim\).
Consider any two gambles in g(A), where
\[ G_1 = [q_1 \circ a_1, \ldots, q_n \circ a_n], \quad \text{(P.7)} \]
\[ G_2 = [r_1 \circ a_1, \ldots, r_n \circ a_n]. \quad \text{(P.8)} \]
Applying the mapping in (P.5) and (P.6) to the gamble \(G_1\), we obtain
\[ U(G_1) = \sum_{i=1}^{n} q_i U(a_i) = \sum_{i=1}^{n} q_i z_i. \]
From (P.4), we obtain \(G_1 \sim [U(G_1) \circ a_B, (1 - U(G_1)) \circ a_W]\).
By completely analogous steps, we obtain
\(G_2 \sim [U(G_2) \circ a_B, (1 - U(G_2)) \circ a_W]\).
Note that \(0 \leq U(G_1) \leq 1\) and \(0 \leq U(G_2) \leq 1\).
By Axiom G5 and transitivity, we have \(G_1 \succsim G_2 \iff U(G_1) \geq U(G_2)\).

The theorem is proved./.
The procedure to construct U(G):
1. Identify \(p_i\) such that \(G = [p_1 \circ a_1, \ldots, p_i \circ a_i, \ldots, p_n \circ a_n]\).
2. Identify \(z_i\) such that \(a_i \sim [z_i \circ a_B, (1 - z_i) \circ a_W]\).
3. Let \(U(a_i) = z_i\).
4. \(U(G) = \sum_{i=1}^{n} p_i U(a_i) = \sum_{i=1}^{n} p_i z_i\).
Example 4.4.1
Suppose that A = {$10, $4, -$2}. We can reasonably suppose that \(a_B\) = $10 and \(a_W\) = -$2.
Suppose we find that:
\[ \$10 \sim [1 \circ \$10, 0 \circ (-\$2)] \Rightarrow U(\$10) = 1 \quad \text{(E.1)} \]
\[ \$4 \sim [.6 \circ \$10, .4 \circ (-\$2)] \Rightarrow U(\$4) = .6 \quad \text{(E.2)} \]
\[ -\$2 \sim [0 \circ \$10, 1 \circ (-\$2)] \Rightarrow U(-\$2) = 0 \quad \text{(E.3)} \]
Under this mapping, the utility of the best outcome must be (identically) 1, and that of
the worst outcome must be (identically) 0. The utility assigned to intermediate outcomes
will depend on the individual’s attitude toward taking risks.
Consider two gambles:
\[ G_1 = [.2 \circ \$4, .8 \circ \$10] \quad \text{(E.4)} \]
\[ G_2 = [.07 \circ (-\$2), .03 \circ \$4, .9 \circ \$10] \quad \text{(E.5)} \]
\[ U(G_1) = .2 U(\$4) + .8 U(\$10) = (.2)(.6) + (.8)(1) = .92 \]
\[ U(G_2) = .07 U(-\$2) + .03 U(\$4) + .9 U(\$10) = .918 \]
\[ \Rightarrow G_1 \succ G_2 \]

We can rank any of the infinite number of gambles that could be constructed from the three
outcomes in A./.
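The ranking in Example 4.4.1 can be reproduced with a short sketch. The helper name `expected_utility` and the dictionary layout are assumptions made for illustration; the utility numbers are the indifference probabilities from (E.1) to (E.3).

```python
def expected_utility(gamble, utility):
    """Expected utility of a simple gamble.

    gamble: dict mapping outcome -> probability (probabilities sum to 1).
    utility: dict mapping outcome -> VNM utility number U(a_i) = z_i.
    """
    return sum(prob * utility[outcome] for outcome, prob in gamble.items())


# Utility numbers from (E.1)-(E.3): the indifference probabilities z_i.
utility = {10: 1.0, 4: 0.6, -2: 0.0}

g1 = {4: 0.2, 10: 0.8}             # gamble (E.4)
g2 = {-2: 0.07, 4: 0.03, 10: 0.9}  # gamble (E.5)

u1 = expected_utility(g1, utility)
u2 = expected_utility(g2, utility)
print(round(u1, 3), round(u2, 3))
print("G1 preferred" if u1 > u2 else "G2 weakly preferred")
```

Any other gamble over the three outcomes can be ranked the same way by passing its probabilities to `expected_utility`.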
 The VNM mapping does its assignment of numbers to gambles in two distinct stages:
1. First, all gambles in g(A) that offer one outcome with certainty are assigned
utility numbers that reflect the agent’s ordering of those alternatives with
certainty.
2. Then, all other gambles in g(A) are assigned utility numbers via the expected
utility calculation.
The VNM utility numbers assigned to ultimate outcomes must not only properly reflect the
agent’s ranking of those particular outcomes relative to each other, they must also be capable
of properly reflecting the agent’s ranking of gambles comprised of them through the (special)
expected utility calculation.
It should not, therefore, be terribly surprising that we are less free to transform VNM utility
functions if the ranking of every gamble is to be preserved.
Theorem 4.4.2 VNM Utility Functions are Unique Up to Positive Affine Transformations
Let \(\succsim\) satisfy Axioms G1 through G7, and suppose that the VNM utility function U(G)
represents \(\succsim\). Then the VNM utility function V(G) represents those same preferences if,
and only if, \(V(G) = \alpha + \beta U(G)\) for some arbitrary scalar \(\alpha\) and some scalar \(\beta > 0\).
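Proof:
Sufficiency is immediate: if \(V(G) = \alpha + \beta U(G)\) with \(\beta > 0\), then V ranks gambles exactly as U does, and V inherits the expected utility property because \(\sum_{i=1}^{n} p_i = 1\).
Necessity: we need to prove that if V(G) is another VNM utility function representing \(\succsim\), then
\(V(G) = \alpha + \beta U(G)\) for some \(\alpha\) and some \(\beta > 0\).
Let \(A = \{a_1, \ldots, a_n\}\) and \(G = [p_1 \circ a_1, \ldots, p_n \circ a_n]\).
By the proof of Theorem 4.4.1, the VNM utility function U(G) representing \(\succsim\) that is constructed there
possesses the expected utility property and assigns \(U(a_i) = z_i\), so for any \(G \in g(A)\) we can write
\[ U(G) = \sum_{i=1}^{n} p_i U(a_i) = \sum_{i=1}^{n} p_i z_i, \quad \text{where } z_i \text{ satisfies } a_i \sim [z_i \circ a_B, (1 - z_i) \circ a_W]. \]
Suppose that V(G) is another VNM utility function which represents \(\succsim\). Then we must have the
following:
\[ V(G) = \sum_{i=1}^{n} p_i V(a_i) = \sum_{i=1}^{n} p_i V([z_i \circ a_B, (1 - z_i) \circ a_W]) = \left( \sum_{i=1}^{n} p_i z_i \right) V(a_B) + \left( 1 - \sum_{i=1}^{n} p_i z_i \right) V(a_W) \]
\[ = U(G) V(a_B) + (1 - U(G)) V(a_W) = V(a_W) + [V(a_B) - V(a_W)] U(G). \]
For any outcome set A and VNM utility function V, the numbers \(V(a_B)\) and \(V(a_W)\) are
constants, with \(V(a_B) > V(a_W)\). Setting \(\alpha \equiv V(a_W)\) and \(\beta \equiv V(a_B) - V(a_W) > 0\) gives \(V(G) = \alpha + \beta U(G)\), so the
theorem is proved./.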
 Theorem 4.4.2 tells us that VNM utility functions are not completely unique, nor are
they entirely ordinal. We can still find an infinite number of them that will rank
gambles in precisely the same order and also possess the expected utility property.
- However, unlike ordinary utility functions, here we must limit ourselves to positive
affine transformations of the VNM utility function, of the form \(V(G) = \alpha + \beta U(G)\) with
\(\beta > 0\).
 Yet the less than complete ordinality of the VNM utility function must not tempt us
into attaching undue significance to the absolute level of a gamble’s utility, or to the
difference in utility between one gamble and another. With what little we’ve required of
the agent’s binary comparisons between gambles in the underlying preference
ordering, we still cannot use VNM utility functions for interpersonal comparisons of
well-being, nor can we measure the “intensity” with which one gamble is preferred
to another.
4.4.3 Risk Aversion
The VNM utility function we created in Example 4.4.1 reflected some desire to avoid risk.
We shall assume that the VNM utility function is both increasing and differentiable over the
appropriate domain of wealth.
We let the possible wealth outcomes be denoted \(A = \{w_1, \ldots, w_n\}\), and let \(G = [p_1 \circ w_1, \ldots, p_n \circ w_n]\).
The expected value of G: \(E[G] = \sum_{i=1}^{n} p_i w_i\).
Now suppose that the agent is given a choice between accepting the gamble G on the one
hand, or receiving with certainty the expected value of G on the other. We can evaluate these
two alternatives as follows:
The utility of the gamble: \(U(G) = \sum_{i=1}^{n} p_i U(w_i)\), and
the utility of the gamble’s expected value: \(U(E[G]) = U\left( \sum_{i=1}^{n} p_i w_i \right)\).
When someone would rather receive the expected value of a gamble with certainty than face
the risk inherent in the gamble itself, we say they are risk averse.
Definition 4.4.2 Risk Aversion, Risk Neutrality, and Risk Loving
Let \(A = \{w_1, \ldots, w_n\}\). Then at \(G \in g(A)\), an individual is said to be locally:
1. Risk averse whenever \(U(E[G]) > U(G)\)
2. Risk neutral whenever \(U(E[G]) = U(G)\)
3. Risk loving whenever \(U(E[G]) < U(G)\)
If these relationships hold for all gambles \(G \in g(A)\), these definitions apply globally.
- Each of these attitudes toward risk is equivalent to a particular property of the
VNM utility function.
- As an exercise, one can show that the agent is risk averse, risk neutral, or risk
loving over some subset of gambles if, and only if, their VNM utility function
is strictly concave, linear, or strictly convex, respectively, over the appropriate
domain of wealth.
- Consider a simple gamble involving two outcomes: \(G = [p \circ w_1, (1 - p) \circ w_2]\),
with \(E[G] = p w_1 + (1 - p) w_2\).
- The individual is offered a choice between receiving an amount of wealth
equal to \(E[G] = p w_1 + (1 - p) w_2\) with certainty, or receiving the gamble G
itself. We can compare the alternatives as follows:
\[ U(G) = p U(w_1) + (1 - p) U(w_2) \quad \text{and} \quad U(E[G]) = U(p w_1 + (1 - p) w_2). \]
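[Figure: Risk aversion and strict concavity of the VNM utility function. Horizontal axis: wealth, with w1, CE, E[G], and w2 marked; vertical axis: utility, with U(w1), U(G), U(E[G]), and U(w2) marked; P marks the risk premium.]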

- We can see that, with a strictly concave VNM utility function, U(E[G]) > U(G),
so the individual is risk averse.
- Certainty Equivalent (CE): the amount of wealth we could offer with certainty
that would make the individual indifferent between accepting that wealth with certainty
and facing the gamble G.
- When a person is risk averse, CE < E[G].
- A risk-averse person will “pay” some positive amount of wealth in order to
avoid the gamble’s inherent risk. This willingness to pay to avoid risk is
measured by the Risk Premium.
Definition 4.4.3 Certainty Equivalent and Risk Premium
The Certainty Equivalent of any gamble G is an amount of wealth, CE, offered with
certainty, such that \(U(G) = U(CE)\). The Risk Premium is an amount of wealth, P, such
that \(U(G) = U(E[G] - P)\). Clearly, the two are related, and \(P = E[G] - CE\).
Example 4.4.2
Suppose that \(U(w) = \log(w)\). U is strictly concave in wealth, so the individual is risk averse.
Let G offer 50-50 odds of winning or losing some amount of wealth, h, so that
\(G = [.5 \circ (w + h), .5 \circ (w - h)]\), where w is current wealth, and E[G] = w.
The certainty equivalent satisfies \(\log(CE) = \tfrac{1}{2}\log(w + h) + \tfrac{1}{2}\log(w - h) = \log (w^2 - h^2)^{1/2}\).
Thus \(CE = (w^2 - h^2)^{1/2} < E[G]\) and \(P = w - (w^2 - h^2)^{1/2} > 0\).
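A quick numerical check of Example 4.4.2 is sketched below. The wealth level w = 100 and stake h = 60 are hypothetical values chosen so that the answer is a round number; the function name is likewise an assumption.

```python
import math


def certainty_equivalent(w, h):
    """CE of the 50-50 gamble [.5 o (w+h), .5 o (w-h)] under U(w) = log(w)."""
    expected_utility = 0.5 * math.log(w + h) + 0.5 * math.log(w - h)
    return math.exp(expected_utility)  # invert U to recover a wealth level


w, h = 100.0, 60.0  # hypothetical current wealth and stake
ce = certainty_equivalent(w, h)
risk_premium = w - ce  # P = E[G] - CE, with E[G] = w

print(ce, math.sqrt(w**2 - h**2))
print(risk_premium)
```

With these numbers the closed form gives CE = (100^2 - 60^2)^{1/2} = 80 and P = 20, which the numerical values match up to rounding.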
- Risk aversion and concavity of the VNM utility function in wealth are equivalent.
- Although the sign of the second derivative U’’(w) tells us whether the individual is risk
averse, risk loving, or risk neutral, its size is entirely arbitrary.
- The size of U’’(w) depends on the positive affine transformation of U(w) chosen and on the
units in which the outcome is measured.
- Arrow (1965) and Pratt (1964) have proposed a measure of risk aversion which is
based on the second derivative, but which avoids these non-uniqueness problems.

Definition 4.4.4 The Arrow – Pratt Measure of Absolute Risk Aversion
The Arrow – Pratt measure of absolute risk aversion is given by
\[ R_a(w) \equiv \frac{-U''(w)}{U'(w)}. \]
\(R_a(w)\) is positive, negative, or zero as the agent is risk averse, risk loving, or risk neutral,
respectively.
Any positive affine transformation of utility will leave the measure unchanged.
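Changing the units in which utility is measured leaves \(R_a(w)\) unaffected; because \(R_a(w)\) is expressed per unit of wealth, rescaling the units of wealth rescales it inversely.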
\(R_a(w)\) is only a local measure of risk aversion, so it need not be the same at every level
of wealth.
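The invariance to positive affine transformations can be checked numerically. The sketch below approximates U' and U'' by central finite differences (an assumption made for illustration, not a method from the notes) and compares \(U(w) = \log(w)\), for which \(R_a(w) = 1/w\), with the hypothetical transformation \(V(w) = 3 + 5\log(w)\).

```python
import math


def arrow_pratt(u, w, eps=1e-3):
    """Approximate R_a(w) = -U''(w) / U'(w) with central finite differences."""
    first = (u(w + eps) - u(w - eps)) / (2 * eps)
    second = (u(w + eps) - 2 * u(w) + u(w - eps)) / eps**2
    return -second / first


def v(w):
    """Hypothetical positive affine transformation of log utility."""
    return 3.0 + 5.0 * math.log(w)


ra_u = arrow_pratt(math.log, 10.0)
ra_v = arrow_pratt(v, 10.0)
print(ra_u, ra_v)  # both should be close to 1/10
```

Evaluating `arrow_pratt(math.log, w)` at several wealth levels also illustrates that the measure is local: for log utility it falls as wealth rises.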

Chapter 5. Theory of the Firm
5.1 Primitive Notions
- The firm buys its inputs on factor markets at prices determined on those markets, and
these expenditures are the firm’s costs. The firm sells its output on product markets
and earns revenue from these sales.
 The way inputs are combined to produce outputs is partly a matter of choice and partly
a matter of what is technologically possible. There is therefore a close relationship
between the prices at which the firm can acquire inputs, the amounts of these inputs it
decides to acquire, the technological possibilities in combining inputs to produce
outputs, and the cost at which the firm is able to produce outputs.
 Similarly, the prices at which the firm is able to sell its output, the quantity of output it
decides to sell, and any other strategies it may use in marketing its products affect its
revenues.
 Circumstances and decisions affecting its costs and/or its revenues obviously affect the
difference between the two, the firm’s profits.
 Profit maximization is not the only conceivable motive of firm behavior. Sales, market
share, or even prestige maximization are possibilities. The majority of economists
continue to embrace the hypothesis of profit maximization. Why?
 From an empirical point of view, the assumption that firms profit maximize leads to
predictions of firm behavior which are time and again borne out by the evidence. From
a theoretical point of view, there is first the virtue of simplicity and consistency with
the hypothesis of self – interested, utility maximization on the part of consumers.
 There are identifiable market forces which coerce the firm toward profit maximization
even if its owners or managers are not themselves innately inclined in that direction.
For suppose that some firm did not conduct its activities to maximize profits. Then if
the fault lies with the managers, and if at least a working majority of the firm’s owners
are non-satiated consumers, those owners have a clear interest in ridding themselves of
that management and replacing it with a profit – maximizing one. If the fault lies with
the owners, then there is an obvious incentive for any non – satiated entrepreneur
outside the firm to acquire it and change its ways.

5.2 Production
Production is the process of transforming inputs into outputs.
The fundamental reality the firm must contend with in this process is technological feasibility.
The state of technology determines and restricts what is possible in combining inputs to
produce output, and there are several ways that we can represent this constraint. The most
general way is to conceive of the firm as possessing a production possibility set Y:
\[ Y \equiv \{x \in \mathbb{R}^m \mid x \text{ is a feasible production plan}\}. \]

A production plan is a vector of both inputs and outputs, called a netput vector.
Assumption 5.2.1 Axioms on the Production Possibility Set, Y
1. \(0 \in Y\)
2. \(Y \cap \mathbb{R}^m_+ = \{0\}\)
3. \(\mathbb{R}^m_- \subset Y\)
4. Y is a closed and bounded set.
 The first axiom is called the possibility of inaction. It’s possible for the firm to
acquire no inputs and produce no output. One immediate implication of this axiom is
that firm profits in the long run need never be negative.
 The axiom 2 is called no free production.
 The axiom 3 is called free disposal. It has little practical significance, it is included
for mathematical completeness. It says, in effect, that the firm can always use
unlimited amounts of inputs to produce no output.
 Axiom 4 is also for mathematical purposes. It ensures that the production possibility
set contains its boundary so that there will always be an efficient frontier, giving a
well-defined maximum amount of output that can be obtained from a given level of
inputs.
Input requirement set, V(y)
- Consider a firm that produces only a single product from many inputs.
- \(V(y) \equiv \{x \mid x \in \mathbb{R}^n_+, \; y \in \mathbb{R}_+, \; (y, -x) \in Y\}\)

- The input requirement set is defined as all combinations of inputs which produce an
output level of at least y units.
Assumption 5.2.2 Axioms on the Input Requirement Set, V(y)
1. Input Regularity: V(y) is non-empty, closed, and if y > 0, then \(0 \notin V(y)\).
2. Monotonicity: If \(x \in V(y)\) and \(x' \geq x\), then \(x' \in V(y)\).
3. Convexity: If \(x^1\) and \(x^2\) are in V(y), then for all \(t \in [0, 1]\),
\(t x^1 + (1 - t) x^2 \in V(y)\).
- The first axiom is both a continuity requirement and an implication of the “no free
production” axiom.
 Monotonicity says that adding more of any input can never reduce the amount of
output produced and it is implied by the axiom of free disposal.
 Convexity: If the production process is time divisible, any convex combination of two
processes in V(y) can be viewed as a hybrid production run where one process is run
some fraction of the relevant time period and the other process run the remaining
fraction of the period.
[Figure: the input requirement set V(y) in the (x1, x2) plane, with a point x in V(y) marked.]
 Input Regularity ensures that the lower boundary of V(y) is solid, unbroken, and
contained in V(y).
 Under monotonicity, all points northeast of that boundary must produce an output of y
or more and the boundary must not be positively sloped.
 Monotonicity and convexity together require that boundary be at least weakly

“convex-upward”.
y-level isoquant: \(Q(y) \equiv \{x \in \mathbb{R}^n_+ \mid x \in V(y) \text{ and } \lambda x \notin V(y) \text{ for } 0 \leq \lambda < 1\}\)
The isoquant is the efficient frontier of the input requirement set and is where we would
always expect a firm producing y units of output to choose to operate whenever inputs are
costly.
Definition 5.2.1 The Production Function
A real valued function f(x) is called a production function if
\[ f(x) \equiv \max\{y \geq 0 \mid x \in V(y)\}. \]
- The superior and level sets of the production function correspond precisely to the input
requirement sets and isoquants, respectively, since by definition,
\[ V(y) = \{x \mid f(x) \geq y\}, \qquad Q(y) = \{x \mid f(x) = y\}. \]
- When V(y) is input regular, the production function is continuous and f(0) = 0. If
y = f(x) and y > 0, then \(x_i > 0\) for at least one input i.
- If V(y) is monotonic, the production function is non-decreasing.
- When V(y) is convex, f(x) is quasiconcave.
- We assume that the production function is differentiable.
- \(\partial f(x) / \partial x_i\) is called the marginal product of factor i.
- If V(y) is monotonic, \(\partial f(x) / \partial x_i \geq 0\).
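The Marginal Rate of Technical Substitution (MRTS) is the slope of the isoquant. With two factors, totally differentiating f(x) = y along the isoquant gives
\[ 0 = \frac{\partial f(x)}{\partial x_1} dx_1 + \frac{\partial f(x)}{\partial x_2} dx_2 \quad \Rightarrow \quad \frac{dx_2}{dx_1} = (-1) \frac{\partial f(x) / \partial x_1}{\partial f(x) / \partial x_2}. \]
In general, for \(x \in Q(y)\), the MRTS of factor i for factor j is defined as:
\[ MRTS_{ij} \equiv \left. \frac{dx_j}{dx_i} \right|_{\text{along } Q(y)} = (-1) \frac{\partial f(x) / \partial x_i}{\partial f(x) / \partial x_j}. \]

- Monotonicity requires that \(MRTS \leq 0\).
- Convexity requires that the MRTS is everywhere non-increasing in absolute value.
- In general, the MRTS between any two factors depends on the amounts of all factors
employed.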
Definition 5.2.2 Separable Production Functions
Let N = {1,…,n} index the set of all factors, and suppose that these factors can be
partitioned into S mutually exclusive and exhaustive subsets, \(N_1, \ldots, N_S\). Let \(f_i(x)\) denote the marginal product of factor i.
The production function is called weakly separable if the MRTS between two factors within
the same group is independent of factor usage in other groups:
\[ \frac{\partial (f_i(x) / f_j(x))}{\partial x_k} = 0 \quad \text{for all } i, j \in N_s \text{ and } k \notin N_s. \]
The production function is called strongly separable if the MRTS between two factors from
different groups is independent of all factors outside those two groups:
\[ \frac{\partial (f_i(x) / f_j(x))}{\partial x_k} = 0 \quad \text{for all } i \in N_s, \; j \in N_t, \text{ and } k \notin N_s \cup N_t \; (s \neq t). \]
Definition 5.2.3 The Elasticity of Substitution
For a production function f(x), the elasticity of substitution between factors i and j at the
point x is defined as
\[ \sigma_{ij} \equiv \frac{d \log(x_j / x_i)}{d \log(f_i(x) / f_j(x))} = \frac{d(x_j / x_i)}{x_j / x_i} \cdot \frac{f_i(x) / f_j(x)}{d(f_i(x) / f_j(x))}, \]
where \(f_i\) and \(f_j\) are the marginal products of factors i and j.
- Between two factors \(x_i\) and \(x_j\), holding all other factors and the level of output
constant, this is defined as the percentage change in the factor proportions, \(x_j / x_i\),
associated with a 1 percent change in the MRTS between them.
- The closer \(\sigma\) is to zero, the more strictly convex the isoquants and the more
“difficult” substitution between factors.
- The larger \(\sigma\) is, the flatter the isoquants and the “easier” substitution between factors.
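[Figure: three isoquants Q(y) in the (x1, x2) plane. Left panel: perfect substitution (\(\sigma \to \infty\)). Middle panel: less than perfect substitution (\(0 < \sigma < \infty\)). Right panel: no substitution (\(\sigma = 0\)).]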
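Example 5.2.1:
The CES production function: \(y = (x_1^\rho + x_2^\rho)^{1/\rho}\).
For the factor proportion,
\[ \log(x_2 / x_1) = \log(x_2) - \log(x_1) \quad \Rightarrow \quad d \log(x_2 / x_1) = -\left( \frac{1}{x_1} dx_1 - \frac{1}{x_2} dx_2 \right). \]
The marginal products are
\[ f_1 = \frac{1}{\rho} (x_1^\rho + x_2^\rho)^{1/\rho - 1} \rho x_1^{\rho - 1} = (x_1^\rho + x_2^\rho)^{1/\rho - 1} x_1^{\rho - 1}, \]
\[ f_2 = (x_1^\rho + x_2^\rho)^{1/\rho - 1} x_2^{\rho - 1}, \]
so that
\[ \log(f_1 / f_2) = \log \frac{x_1^{\rho - 1}}{x_2^{\rho - 1}} = (\rho - 1)[\log(x_1) - \log(x_2)], \]
\[ d \log(f_1 / f_2) = (\rho - 1) \left( \frac{1}{x_1} dx_1 - \frac{1}{x_2} dx_2 \right). \]
Therefore
\[ \sigma_{12} = \frac{-1}{\rho - 1} = \frac{1}{1 - \rho}. \]
- We can see that the degree of substitution between factors is always the same, whatever the level of output or the factor proportions. This
makes the CES form a somewhat restrictive characterization of the
technology.
- The closer \(\rho\) is to unity, the larger is \(\sigma\); when \(\rho = 1\), \(\sigma\) is infinite and the
production function is linear.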
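The result \(\sigma_{12} = 1/(1 - \rho)\) can be checked numerically straight from the definition of the elasticity of substitution. The sketch below is an illustration under stated assumptions: marginal products are approximated by central finite differences, and the evaluation point and step sizes are arbitrary choices. Because the CES form is homothetic, the MRTS depends only on the ratio x2/x1, so the change in output caused by the perturbation does not affect the calculation.

```python
import math


def ces(x1, x2, rho):
    """Two-input CES production function y = (x1^rho + x2^rho)^(1/rho)."""
    return (x1**rho + x2**rho) ** (1.0 / rho)


def mp_ratio(x1, x2, rho, eps=1e-6):
    """Ratio of marginal products f1/f2, by central finite differences."""
    f1 = (ces(x1 + eps, x2, rho) - ces(x1 - eps, x2, rho)) / (2 * eps)
    f2 = (ces(x1, x2 + eps, rho) - ces(x1, x2 - eps, rho)) / (2 * eps)
    return f1 / f2


def elasticity_of_substitution(x1, x2, rho, step=1e-3):
    """d log(x2/x1) / d log(f1/f2) for a small change in the factor proportion."""
    x2_new = x2 * (1.0 + step)
    d_log_proportion = math.log(x2_new / x1) - math.log(x2 / x1)
    d_log_mrts = math.log(mp_ratio(x1, x2_new, rho)) - math.log(mp_ratio(x1, x2, rho))
    return d_log_proportion / d_log_mrts


sigma_half = elasticity_of_substitution(1.0, 2.0, 0.5)       # theory: 1/(1-0.5) = 2
sigma_neg_one = elasticity_of_substitution(1.0, 2.0, -1.0)   # theory: 1/(1+1) = 0.5
print(sigma_half, sigma_neg_one)
```

Repeating the calculation at other input bundles returns the same value for a given \(\rho\), which is the constancy property noted above.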
The general CES production function:
\[ y = \left( \sum_{i=1}^{n} \alpha_i x_i^\rho \right)^{1/\rho}, \quad \text{where } \sum_{i=1}^{n} \alpha_i = 1. \]
- In the CES form, \(\sigma_{ij} = \frac{1}{1 - \rho}\) for all \(i \neq j\).
- When \(\rho \to 0\), \(\sigma_{ij} \to 1\), and this CES form reduces to the linear homogeneous Cobb-Douglas
form \(y = \prod_{i=1}^{n} x_i^{\alpha_i}\).
- As \(\rho \to -\infty\), \(\sigma_{ij} \to 0\), giving the Leontief form \(y = \min\{x_1, \ldots, x_n\}\).
Theorem 5.2.1 (Shephard) Linear Homogeneous Production Functions Are Concave
Let f(x) be a production function and suppose that it is homogeneous of degree 1. Then f(x)
is a concave function of x.
Proof:
Take \(x^1 > 0\) and \(x^2 > 0\) and suppose that \(f(x^1) = y^1 > 0\) and \(f(x^2) = y^2 > 0\).
Because the production function is homogeneous of degree 1, for \(0 < t < 1\):
\[ f(x^1) = y^1 > 0 \Rightarrow f(t x^1) = t y^1 > 0, \]
\[ f(x^2) = y^2 > 0 \Rightarrow f((1 - t) x^2) = (1 - t) y^2 > 0. \]
Dividing each input vector by its output level and using homogeneity again,
\[ f\left( \frac{t x^1}{t y^1} \right) = \frac{1}{t y^1} f(t x^1) = 1 \quad \text{and} \quad f\left( \frac{(1 - t) x^2}{(1 - t) y^2} \right) = \frac{1}{(1 - t) y^2} f((1 - t) x^2) = 1, \]
so \(\frac{t x^1}{t y^1}\) and \(\frac{(1 - t) x^2}{(1 - t) y^2}\) are in V(1).
Because V(1) is a convex set (Assumption 5.2.2),
\[ f\left( z \frac{t x^1}{t y^1} + (1 - z) \frac{(1 - t) x^2}{(1 - t) y^2} \right) \geq 1, \quad \text{where } 0 \leq z \leq 1. \]
Let
\[ z = \frac{t y^1}{t y^1 + (1 - t) y^2} \quad \text{and} \quad (1 - z) = \frac{(1 - t) y^2}{t y^1 + (1 - t) y^2}. \]
Then
\[ f\left( \frac{t x^1 + (1 - t) x^2}{t y^1 + (1 - t) y^2} \right) \geq 1. \]
By homogeneity of degree 1, we have
\[ f\left( \frac{t x^1 + (1 - t) x^2}{t y^1 + (1 - t) y^2} \right) = \frac{1}{t y^1 + (1 - t) y^2} f(t x^1 + (1 - t) x^2) \geq 1 \]
\[ \Rightarrow f(t x^1 + (1 - t) x^2) \geq t y^1 + (1 - t) y^2 = t f(x^1) + (1 - t) f(x^2), \]
so the production function is concave.
5.2.1 Returns to Scale and Varying Proportions
We are interested in how output responds as the amounts of different factors are varied.
Returns to variable proportions: In the short run, at least one factor is fixed to the firm, so
output can be varied only by changing the amounts of some factors. As the amounts of the
variable factors are changed, the proportion in which the fixed and variable factors are used is
changed.
Elementary measures of returns to variable proportions:
The marginal product: \(MP_i(x) \equiv f_i(x)\)
The average product: \(AP_i(x) \equiv f(x) / x_i\)
The output elasticity of factor i: \(\mu_i(x) \equiv f_i(x) x_i / f(x) = MP_i(x) / AP_i(x)\)
Returns to scale: How output responds when all factors are varied in the same proportion.
Definition 5.2.4 (Global) Returns to Scale
A production function f(x) has the property of (global)
1. Constant returns to scale if, and only if, f(tx) = tf(x) for all t > 0 and all x.
2. Increasing returns to scale if, and only if, f(tx) > tf(x) for all t > 1 and all x.
3. Decreasing returns to scale if, and only if, f(tx) < tf(x) for all t > 1 and all x.
A production function has constant returns to scale if and only if it is (positive) linear
homogeneous, that is, homogeneous of degree 1.
If a production function is (positive) homogeneous of degree greater (less) than 1, it has
increasing (decreasing) returns to scale; the converse need not hold.
Not every technology falls into one of these categories, since each requires that
output always respond to proportional changes in factor usage in the same qualitative way,
regardless of the current level of output or scale of the inputs.

Definition 5.2.5 (Local) Returns to Scale
The elasticity of scale at the point x is defined as
\[ \mu(x) \equiv \lim_{t \to 1} \frac{d \log f(tx)}{d \log t} = \frac{\sum_{i=1}^{n} f_i(x) x_i}{f(x)}. \]
Returns to scale are locally constant, increasing, or decreasing as \(\mu(x)\) is equal to, greater
than, or less than 1.
The elasticity of scale and the output elasticities of factors are related as follows:
\[ \mu(x) = \sum_{i=1}^{n} \mu_i(x). \]
Many technologies exhibit increasing, constant, and decreasing returns over only certain
ranges of output. It is therefore often useful to have a local measure of returns to scale.
The elasticity of scale, or the (overall) elasticity of output, tells us the instantaneous
percentage change in output that occurs with a 1 percent increase in all inputs.
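As an illustration of the local measure, the sketch below approximates the elasticity of scale by scaling all inputs by a factor slightly above and below one. The Cobb-Douglas technology \(f(x) = x_1^{0.3} x_2^{0.5}\), the evaluation point, and the function names are assumptions chosen for the example; for this technology the output elasticities are 0.3 and 0.5, so the elasticity of scale should be 0.8.

```python
import math


def cobb_douglas(x):
    """Hypothetical technology f(x) = x1^0.3 * x2^0.5 (degree 0.8 homogeneous)."""
    return x[0] ** 0.3 * x[1] ** 0.5


def elasticity_of_scale(f, x, dt=1e-6):
    """Approximate d log f(tx) / d log t at t = 1 by a central difference."""
    up = f([(1.0 + dt) * xi for xi in x])
    down = f([(1.0 - dt) * xi for xi in x])
    return (math.log(up) - math.log(down)) / (math.log(1.0 + dt) - math.log(1.0 - dt))


mu = elasticity_of_scale(cobb_douglas, [4.0, 9.0])
print(mu)  # close to 0.3 + 0.5 = 0.8, i.e. locally decreasing returns
```

Passing a different production function to `elasticity_of_scale` gives a measure that may vary with the input bundle, which is the point of a local definition.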
5.3 Cost
The firm may face an upward-sloping supply curve for its factors, so that the more it
hires, the higher the per-unit price it must pay. Alternatively, the firm may be atomistic, or
perfectly competitive, on its input markets.
Assumption: the firm is atomistic, or perfectly competitive, on its input markets.
Definition 5.3.1 The Cost Function
The cost function for the firm facing fixed factor prices w >> 0 is defined as the minimum-value
function,
\[ c(w, y) \equiv \min_{x} w \cdot x \quad \text{s.t. } x \in V(y). \]
If x(w,y) solves the cost-minimization problem, then \(c(w, y) = w \cdot x(w, y)\).
We suppose that the technology can be represented by a continuously differentiable production
function, f(x). Since \(x \in V(y)\) if \(y \leq f(x)\), we can rewrite this problem as:
\[ \min_{x} w \cdot x \quad \text{s.t. } y - f(x) \leq 0 \text{ and } x \geq 0. \]

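Suppose that x* solves this problem. Forming the Kuhn – Tucker necessary conditions, x* must
satisfy, for some multiplier \(\lambda \geq 0\):
\[ w_i - \lambda \frac{\partial f(x^*)}{\partial x_i} \geq 0 \quad (1) \]
\[ x_i^* \left[ w_i - \lambda \frac{\partial f(x^*)}{\partial x_i} \right] = 0 \quad (2) \]
\[ y - f(x^*) \leq 0 \quad (3) \]
\[ \lambda [y - f(x^*)] = 0 \quad (4) \]
For all y > 0, input regularity requires that \(x_i^* > 0\) for at least one factor i, so for that factor (2) gives
\[ w_i = \lambda \frac{\partial f(x^*)}{\partial x_i} > 0. \]
This ensures that \(\lambda > 0\), so (4) implies y = f(x*).
For every pair of factors which the firm chooses to employ in positive amounts, we have:
\[ \frac{\partial f(x^*) / \partial x_i}{\partial f(x^*) / \partial x_j} = \frac{w_i}{w_j}, \]
that is, the rate of technical substitution between the two factors equals the ratio of their prices.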
The firm’s factor demand functions – Conditional Factor demand: x(w,y) is the amount
of each factor that the cost – minimizing firm will buy to produce output y when it faces
factor prices w.
Example 5.3.1
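The problem:
\[ \min_{x_1, x_2} w_1 x_1 + w_2 x_2 \quad \text{s.t. } y - (x_1^\rho + x_2^\rho)^{1/\rho} \leq 0 \text{ and } x_1 \geq 0, \; x_2 \geq 0. \]
Assuming that y > 0 and an interior solution, the first-order Lagrangian conditions reduce to
the two conditions:
\[ \frac{w_1}{w_2} = \left( \frac{x_1}{x_2} \right)^{\rho - 1}, \qquad y = (x_1^\rho + x_2^\rho)^{1/\rho}. \]
Solving for x1 and x2, we obtain the conditional factor demands:
\[ x_1 = y w_1^{1/(\rho - 1)} \left( w_1^{\rho/(\rho - 1)} + w_2^{\rho/(\rho - 1)} \right)^{-1/\rho}, \]
\[ x_2 = y w_2^{1/(\rho - 1)} \left( w_1^{\rho/(\rho - 1)} + w_2^{\rho/(\rho - 1)} \right)^{-1/\rho}, \]
and the cost function:
\[ c(w, y) = w_1 x_1(w, y) + w_2 x_2(w, y) = y \left( w_1^{\rho/(\rho - 1)} + w_2^{\rho/(\rho - 1)} \right)^{(\rho - 1)/\rho}. \]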
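The closed-form solution of Example 5.3.1 can be compared with a brute-force minimization along the isoquant. This is only a sketch: the parameter values (w1 = 1, w2 = 2, y = 10, \(\rho = 0.5\)), the grid size, and the function names are assumptions, and the grid search relies on \(0 < \rho < 1\) so that x1 ranges over (0, y) on the isoquant.

```python
def ces_conditional_demands(w1, w2, y, rho):
    """Closed-form conditional factor demands for y = (x1^rho + x2^rho)^(1/rho)."""
    r = rho / (rho - 1.0)
    scale = y * (w1**r + w2**r) ** (-1.0 / rho)
    return scale * w1 ** (1.0 / (rho - 1.0)), scale * w2 ** (1.0 / (rho - 1.0))


def ces_cost(w1, w2, y, rho):
    """Closed-form cost function c(w, y)."""
    r = rho / (rho - 1.0)
    return y * (w1**r + w2**r) ** ((rho - 1.0) / rho)


def brute_force_cost(w1, w2, y, rho, n=20000):
    """Grid search over x1 in (0, y); x2 is set so that output equals y exactly."""
    best = float("inf")
    for i in range(1, n):
        x1 = y * i / n
        x2 = (y**rho - x1**rho) ** (1.0 / rho)
        best = min(best, w1 * x1 + w2 * x2)
    return best


w1, w2, y, rho = 1.0, 2.0, 10.0, 0.5  # hypothetical values
x1, x2 = ces_conditional_demands(w1, w2, y, rho)
closed_form = ces_cost(w1, w2, y, rho)
grid = brute_force_cost(w1, w2, y, rho)
print(x1, x2, closed_form, grid)
```

The demands satisfy the output constraint, their cost equals the closed-form cost function, and the grid search cannot find a cheaper bundle on the isoquant.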

Theorem 5.3.1 Properties of the Cost Function
Let V(y) be input regular and monotonic and suppose that w>>0 and y>0. Let c(w,y) be the
cost function as defined in Definition 5.3.1. Then c(w,y) is continuous and:
1. Increasing in y
2. Non – decreasing in w
3. Homogeneous of degree 1 in w
4. Concave in w
Theorem 5.3.2 Shephard’s Lemma
If c(w,y) is a differentiable cost function, then
\[ x_i(w, y) = \frac{\partial c(w, y)}{\partial w_i}. \]
Proof:
This can be proved by application of the Envelope Theorem.
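Shephard's lemma can be checked numerically for the CES technology of Example 5.3.1. The sketch below differentiates the closed-form cost function with a central finite difference (an illustrative choice, not the envelope-theorem proof) and compares the result with the closed-form conditional demand for factor 1. The parameter values and function names are assumptions.

```python
def ces_cost(w1, w2, y, rho):
    """CES cost function from Example 5.3.1."""
    r = rho / (rho - 1.0)
    return y * (w1**r + w2**r) ** ((rho - 1.0) / rho)


def ces_demand_1(w1, w2, y, rho):
    """Conditional demand for factor 1 from Example 5.3.1."""
    r = rho / (rho - 1.0)
    return y * w1 ** (1.0 / (rho - 1.0)) * (w1**r + w2**r) ** (-1.0 / rho)


def dcost_dw1(w1, w2, y, rho, eps=1e-6):
    """Central finite-difference approximation of dc/dw1."""
    return (ces_cost(w1 + eps, w2, y, rho) - ces_cost(w1 - eps, w2, y, rho)) / (2 * eps)


w1, w2, y, rho = 1.0, 2.0, 10.0, 0.5  # hypothetical values
slope = dcost_dw1(w1, w2, y, rho)
demand = ces_demand_1(w1, w2, y, rho)
print(slope, demand)  # the two numbers should agree closely
```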
Theorem 5.3.3 Properties of Conditional Factor Demands
Let the cost function be twice differentiable and let x(w,y) be the vector of conditional
factor demands. Then:
1. x(w,y) is non – decreasing in y
2. x(w,y) is homogeneous of degree zero in w
3. The own-substitution effect is non-positive: \(\partial x_i(w, y) / \partial w_i \leq 0\) for all i.
4. The cross-substitution effects are symmetric, so that \(\partial x_i(w, y) / \partial w_j = \partial x_j(w, y) / \partial w_i\) for all
i and j.
5. The substitution matrix is symmetric and negative semi-definite, so that
\[ dw^T \left[ \partial x_i(w, y) / \partial w_j \right]_{i,j = 1, \ldots, n} dw \leq 0 \]
for all w and y.
Proof:
Each of these may be proved just as the corresponding properties of the consumer’s Hicksian
demands.

Theorem 5.3.4 Cost and Conditional Factor Demands When Production Is Homothetic
1. If the production function is homothetic, then
a) The cost function is multiplicatively separable in factor prices and output and can be
written c(w,y) = h(y)c(w,1), where h’(y) > 0 and c(w,1) is the unit cost function, or the
cost of 1 unit of output.
b) The conditional factor demands are multiplicatively separable in factor prices and
output and can be written \(x(w, y) = h(y) x(w, 1)\), where h’(y) > 0 and x(w,1) is the
conditional factor demand for 1 unit of output.
2. If the production function is linear homogeneous, then
a) \(c(w, y) = y \cdot c(w, 1)\)
b) \(x(w, y) = y \cdot x(w, 1)\)
Proof:
Let the production function be F(x) and suppose it is homothetic. Then F(x) = f(g(x)), where
f’ > 0 and g(x) is homogeneous of degree one.
\[ c(w, 1) = \min_{x} w \cdot x \quad \text{s.t. } f(g(x)) \geq 1. \]
Since f is monotonic increasing, it has an inverse \(f^{-1}\), and
\[
\begin{aligned}
c(w, 1) &= \min_{x} w \cdot x \quad \text{s.t. } g(x) \geq f^{-1}(1) \\
&= \min_{x} w \cdot x \quad \text{s.t. } g\left( \frac{x}{f^{-1}(1)} \right) \geq 1 \\
&= \min_{x} w \cdot x \quad \text{s.t. } f^{-1}(y) \, g\left( \frac{x}{f^{-1}(1)} \right) \geq f^{-1}(y) \\
&= \min_{x} w \cdot x \quad \text{s.t. } g\left( \frac{f^{-1}(y)}{f^{-1}(1)} x \right) \geq f^{-1}(y) \\
&= \min_{x} w \cdot x \quad \text{s.t. } f\left( g\left( \frac{f^{-1}(y)}{f^{-1}(1)} x \right) \right) \geq y \\
&= \frac{f^{-1}(1)}{f^{-1}(y)} \min_{x} w \cdot \frac{f^{-1}(y)}{f^{-1}(1)} x \quad \text{s.t. } f\left( g\left( \frac{f^{-1}(y)}{f^{-1}(1)} x \right) \right) \geq y \\
&= \frac{f^{-1}(1)}{f^{-1}(y)} \min_{z} w \cdot z \quad \text{s.t. } f(g(z)) \geq y \\
&= \frac{f^{-1}(1)}{f^{-1}(y)} c(w, y).
\end{aligned}
\]
So
\[ c(w, y) = \frac{f^{-1}(y)}{f^{-1}(1)} c(w, 1) \equiv h(y) c(w, 1). \]
To prove part (b), we use Shephard’s lemma.
Definition 5.3.2 The Short – Run or Restricted Cost Function
Let the production function be f(z), where \(z \equiv (x, \bar{x})\). Suppose that x is a subvector of
variable factors and \(\bar{x}\) is a subvector of fixed factors. Let w and \(\bar{w}\) be the associated factor
prices for the variable and fixed factors, respectively. The short-run or restricted total cost
function is defined as
\[ sc(w, \bar{w}, y; \bar{x}) \equiv \min_{x} w \cdot x + \bar{w} \cdot \bar{x} \quad \text{s.t. } f(x, \bar{x}) \geq y. \]
If \(x(w, \bar{w}, y; \bar{x})\) solves this minimization problem, then
\[ sc(w, \bar{w}, y; \bar{x}) = w \cdot x(w, \bar{w}, y; \bar{x}) + \bar{w} \cdot \bar{x}. \]
The optimal cost of the variable factors, \(w \cdot x(w, \bar{w}, y; \bar{x})\), is called total variable cost. The
cost of the fixed factors, \(\bar{w} \cdot \bar{x}\), is called total fixed cost.
5.4 Duality in Production
There is a duality between production and cost just as there is between utility and expenditure.
The principles are identical: If we begin with a technology and derive its cost function, we can
take that cost function and use it to generate a technology.
Any function with all the properties of a cost function implies some technology for which it is
the cost function.
This is one of the most significant developments in modern theory and has had important implications
for applied work. Applied researchers need no longer begin their study of the firm with detailed
knowledge of the technology and with access to relatively obscure data. Instead, they can
concentrate on devising and estimating flexible functions of observable market prices and
output and be assured that they are carrying along all economically relevant aspects of the
underlying technology.
Theorem 5.4.1 Sufficiency of the Cost Function
Let V(y) be input regular and monotonic. Let $c(w, y) \equiv \min_{x} w \cdot x$ s.t. $x \in V(y)$.
Let $V^*(y) \equiv \{x \in \mathbb{R}^n_+ \mid w \cdot x \ge c(w, y) \text{ for all } w \ge 0\}$
and let $c^*(w, y) \equiv \min_{x} w \cdot x$ s.t. $x \in V^*(y)$. Then for all y > 0:
1. V*(y) is input regular, monotonic, convex, and $V(y) \subseteq V^*(y)$.
2. $c(w, y) = c^*(w, y)$
3. If V(y) is convex, then $V^*(y) = V(y)$
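To see why V*(y) can be strictly larger than a nonconvex V(y), the following sketch uses a hypothetical two-input technology in which a bundle is feasible if it dominates (1, 2) or (2, 1). That set is monotonic but not convex. The set, the price grid, and the helper names are assumptions made for illustration, and the grid only approximates the condition "for all w".

```python
# Hypothetical input requirement set: x is feasible if it dominates (1, 2) or (2, 1).
CORNERS = [(1.0, 2.0), (2.0, 1.0)]


def in_V(x):
    return any(x[0] >= a and x[1] >= b for a, b in CORNERS)


def cost(w):
    # With w >= 0 the cheapest feasible bundle is one of the two corner points.
    return min(w[0] * a + w[1] * b for a, b in CORNERS)


def in_V_star(x, steps=100, tol=1e-9):
    # Check w.x >= c(w, y) on a grid of normalized prices w = (s, 1 - s).
    # Normalizing is harmless because both sides are homogeneous of degree 1 in w.
    for k in range(steps + 1):
        s = k / steps
        w = (s, 1.0 - s)
        if w[0] * x[0] + w[1] * x[1] < cost(w) - tol:
            return False
    return x[0] >= 0 and x[1] >= 0


mid = (1.5, 1.5)  # midpoint of the two corner points
print(in_V(mid), in_V_star(mid))
```

The midpoint is not feasible in V but passes every cost test, so it belongs to V*. This is the sense in which the cost function only recovers the convexified technology.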

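Exercises of Chapter 1: Sets and Mappings
E 1.3
Prove De Morgan's Laws.
1. $c(S \cap T) = cS \cup cT$
$x \in c(S \cap T) \iff x \notin S \cap T \iff (x \notin S \text{ or } x \notin T)$
$x \in cS \cup cT \iff (x \in cS \text{ or } x \in cT) \iff (x \notin S \text{ or } x \notin T)$
The two sets have exactly the same elements ⇒ $c(S \cap T) = cS \cup cT$
2. $c(S \cup T) = cS \cap cT$
$x \in c(S \cup T) \iff x \notin S \cup T \iff (x \notin S \text{ and } x \notin T)$
$x \in cS \cap cT \iff (x \in cS \text{ and } x \in cT) \iff (x \notin S \text{ and } x \notin T)$
The two sets have exactly the same elements ⇒ $c(S \cup T) = cS \cap cT$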
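A quick finite check of both laws can be sketched as follows. The universal set U and the sets S and T are arbitrary choices made for illustration, and a check on one example is of course not a proof.

```python
U = set(range(10))          # assumed universal set
S = {0, 1, 2, 3, 4}
T = {3, 4, 5, 6}


def complement(A, universe=U):
    return universe - A


law_1 = complement(S & T) == complement(S) | complement(T)
law_2 = complement(S | T) == complement(S) & complement(T)
print(law_1, law_2)
```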
E 1.5
Let A and B be convex sets. Show by counter-example that $A \cup B$ need not be a convex set.
Consider subsets of R.
The intervals [1,3] and [6,8] are convex sets. However, with t = 0.5, the convex combination of 1
and 6, 0.5*1 + 0.5*6 = 3.5, is not in $[1,3] \cup [6,8]$ ⇒ $[1,3] \cup [6,8]$ is not a convex set.
E 1.6
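The theorem: the intersection of convex sets is convex.
Extend this theorem to the case of arbitrarily many convex sets.
Let $S_i$, $i \in I$, be convex sets, where I is any index set (finite, such as I = {1, 2, ..., n}, or infinite).
$\bigcap_{i \in I} S_i = \{x \mid x \in S_i \text{ for every } i \in I\}$
Let $x_1$ and $x_2$ be in $\bigcap_{i \in I} S_i$ ⇒ $x_1$ and $x_2 \in S_i$ for every i. Because each $S_i$ is a convex set ⇒ $t x_1 + (1 - t) x_2 \in S_i$ for every i and every $t \in [0, 1]$
⇒ $t x_1 + (1 - t) x_2 \in \bigcap_{i \in I} S_i$
⇒ $\bigcap_{i \in I} S_i$ is a convex set.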
E 1.7
Graph each of the sets given below. If the set is convex, give a proof. If it is not convex, give
a counter-example
a. $\{(x, y) \mid y = e^x\}$
This set is not convex.
$x_1 = 0$ ⇒ $y_1 = 1$ ⇒ (0, 1) ∈ the set.
$x_2 = 1$ ⇒ $y_2 = e$ ⇒ (1, e) ∈ the set.
The convex combination of the two points with t = 0.5:
(0.5*0 + 0.5*1, 0.5*1 + 0.5*e) = (0.5, 0.5 + 0.5*2.718) = (0.5, 1.86)
However, $e^{0.5} \approx 1.65 \neq 1.86$, so (0.5, 1.86) is not in the set $\{(x, y) \mid y = e^x\}$ ⇒ that set is not convex.
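b. $\{(x, y) \mid y \ge e^x\}$
This set is convex. It contains the non-convex set $\{(x, y) \mid y = e^x\}$ of part a, but a set with a non-convex subset can still be convex, so part a settles nothing here.
Proof: let $(x_1, y_1)$ and $(x_2, y_2)$ be in the set and let $t \in [0, 1]$. Because $e^x$ is a convex function,
$t y_1 + (1 - t) y_2 \ge t e^{x_1} + (1 - t) e^{x_2} \ge e^{t x_1 + (1 - t) x_2}$
⇒ the convex combination $(t x_1 + (1 - t) x_2, \; t y_1 + (1 - t) y_2)$ is in the set ⇒ the set is convex.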
c. $\{(x, y) \mid y \ge 2x - x^2, x > 0, y > 0\}$
This set is not convex.
$x_1 = 1$ ⇒ $y_1 = 2x_1 - x_1^2 = 1$
$x_2 = 1.5$ ⇒ $y_2 = 2x_2 - x_2^2 = 0.75$
Both (1, 1) and (1.5, 0.75) satisfy $y \ge 2x - x^2$ with equality, so both are in the set.
The convex combination of (1, 1) and (1.5, 0.75) with t = 0.5 is (1.25, 0.875).
However, $2(1.25) - (1.25)^2 = 0.9375 > 0.875$ ⇒ (1.25, 0.875) is not in the set
⇒ the set $\{(x, y) \mid y \ge 2x - x^2, x > 0, y > 0\}$ is not convex.
d. $\{(x, y) \mid xy \ge 1, x > 0, y > 0\} = \{(x, y) \mid y \ge 1/x, x > 0, y > 0\}$
This set is convex.
The boundary curve $\{(x, y) \mid y = 1/x, x > 0, y > 0\}$ taken alone is not convex: (1, 1) and (0.5, 2) lie on it, but their convex combination with t = 0.5, (0.75, 1.5), does not, because 1/0.75 ≈ 1.33 ≠ 1.5.
That point does satisfy $y \ge 1/x$, however, so it is not a counter-example for the set in question, and a set with a non-convex subset can still be convex.
Proof: let $(x_1, y_1)$ and $(x_2, y_2)$ be in the set and let $t \in [0, 1]$. Then $t x_1 + (1 - t) x_2 > 0$ and, because 1/x is a convex function on x > 0,
$t y_1 + (1 - t) y_2 \ge \frac{t}{x_1} + \frac{1 - t}{x_2} \ge \frac{1}{t x_1 + (1 - t) x_2} > 0$
⇒ the convex combination is in the set ⇒ the set is convex.
E 1.9
Let A and B be two sets in the domain D, and suppose that $B \subset A$. Prove that $f(B) \subset f(A)$
for any mapping $f: D \to R$.
Consider the set $A_1 = A \cap B$. Because $B \subset A$ ⇒ $A_1 = B$ ⇒ $f(A_1) = f(B)$.
Consider the set $A_2 = \{x \mid x \in A \text{ and } x \notin B\}$ ⇒ $A = A_1 \cup A_2$.
Every y in f(A) is the image f(x) of some x in $A_1$ or in $A_2$ ⇒ $f(A) = f(A_1) \cup f(A_2)$.
We have $f(B) = f(A_1) \subset f(A_1) \cup f(A_2) = f(A)$.
Equivalently: if $y \in f(B)$, then y = f(x) for some $x \in B \subset A$, so $y \in f(A)$.
E 1.10
Let A and B be two sets in the range R, and suppose that $B \subset A$. Prove that
$f^{-1}(B) \subset f^{-1}(A)$ for any mapping $f: D \to R$.
Consider the set $A_1 = A \cap B$. Because $B \subset A$ ⇒ $A_1 = B$ ⇒ $f^{-1}(B) = f^{-1}(A_1)$.
Consider the set $A_2 \subset A$ with $A_2 \cap A_1 = \emptyset$ and $A = A_1 \cup A_2$ ⇒ $f^{-1}(A) = f^{-1}(A_1) \cup f^{-1}(A_2)$.
⇒ $f^{-1}(B) = f^{-1}(A_1) \subset f^{-1}(A)$.
Equivalently: if $x \in f^{-1}(B)$, then $f(x) \in B \subset A$, so $x \in f^{-1}(A)$.
E 1.13
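Let $f: D \to R$ be any mapping and let B be any set in the range R. Prove that
$f^{-1}(cB) = c(f^{-1}(B))$.
If some x were in $f^{-1}(cB) \cap f^{-1}(B)$, then f(x) would be both in cB and in B ⇒ this violates the uniqueness of the value of a function ⇒ $f^{-1}(cB) \cap f^{-1}(B) = \emptyset$.
For every x in D, f(x) is either in B or in cB ⇒ $f^{-1}(cB) \cup f^{-1}(B) = D$.
⇒ $f^{-1}(cB) = c(f^{-1}(B))$, where the complement on the left is taken in R and the one on the right in D.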
E 1.14
For any mapping $f: D \to R$, and any two sets A and B in the range of f, show that:
1. $f^{-1}(A \cup B) = f^{-1}(A) \cup f^{-1}(B)$
Because f assigns exactly one value f(x) to each x in D:
$x \in f^{-1}(A \cup B) \iff f(x) \in A \cup B \iff (f(x) \in A \text{ or } f(x) \in B)$
$\iff (x \in f^{-1}(A) \text{ or } x \in f^{-1}(B)) \iff x \in f^{-1}(A) \cup f^{-1}(B)$
⇒ $f^{-1}(A \cup B) = f^{-1}(A) \cup f^{-1}(B)$
2. $f^{-1}(A \cap B) = f^{-1}(A) \cap f^{-1}(B)$
$x \in f^{-1}(A \cap B) \iff f(x) \in A \cap B \iff (f(x) \in A \text{ and } f(x) \in B)$
$\iff (x \in f^{-1}(A) \text{ and } x \in f^{-1}(B)) \iff x \in f^{-1}(A) \cap f^{-1}(B)$
⇒ $f^{-1}(A \cap B) = f^{-1}(A) \cap f^{-1}(B)$
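The image and inverse-image properties of E 1.9 to E 1.14 can be checked on a small finite mapping. The sketch below assumes an arbitrary mapping stored as a dictionary; the sets and the helper names `image` and `preimage` are illustrative, and the mapping is deliberately not one-to-one.

```python
D = {1, 2, 3, 4, 5, 6}
R = {"a", "b", "c", "d"}
f = {1: "a", 2: "a", 3: "b", 4: "c", 5: "c", 6: "d"}  # assumed mapping D -> R


def image(X):
    return {f[x] for x in X}


def preimage(Y):
    return {x for x in D if f[x] in Y}


A, B = {"a", "b", "c"}, {"a", "b"}   # B is a subset of A, both in the range
C = {"b", "d"}

checks = {
    "E 1.9": image({1, 3}) <= image({1, 3, 4}),
    "E 1.10": preimage(B) <= preimage(A),
    "E 1.13": preimage(R - B) == D - preimage(B),
    "E 1.14 union": preimage(A | C) == preimage(A) | preimage(C),
    "E 1.14 intersection": preimage(A & C) == preimage(A) & preimage(C),
}
print(checks)
```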
Exercises in Chapter 2. Calculus and Optimization
E 2.11
A real valued function is called homothetic if it can be written in the form $y = g(f(x))$, where
$g: \mathbb{R} \to \mathbb{R}$ is strictly increasing and $f: \mathbb{R}^n \to \mathbb{R}$ is homogeneous of degree 1. Show that for
any such function, $1 = f(t(y) \cdot x)$, where $t(y) = 1 / g^{-1}(y)$.
We have $g^{-1}(y) = f(x)$, so $t(y) = 1 / f(x)$ (assuming $f(x) > 0$) and
$f(t(y)\, x) = f\left(\frac{1}{f(x)}\, x\right) = \frac{1}{f(x)}\, f(x) = 1$,
because f(x) is homogeneous of degree 1.
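A numerical instance of this result, as a sketch: it assumes the example functions $f(x) = (x_1 x_2)^{1/2}$ and $g(z) = \ln z$, which are illustrative choices and require positive inputs.

```python
import math


def f(x):
    # Assumed example: homogeneous of degree 1, positive for positive inputs.
    return math.sqrt(x[0] * x[1])


def g(z):
    return math.log(z)      # strictly increasing on z > 0


def g_inv(y):
    return math.exp(y)


x = (2.0, 8.0)
y = g(f(x))                 # the homothetic function evaluated at x
t = 1.0 / g_inv(y)          # t(y) = 1 / g^{-1}(y)
scaled = (t * x[0], t * x[1])
print(f(scaled))
```

Scaling x by t(y) moves the bundle along its ray onto the unit isoquant of f, which is why the printed value is 1 up to rounding.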
E 2.12
Let F(z) be a monotonic increasing function of the single variable z. Form the composite
function (or “transform”), F(f(x)). Show that x* is a local maximum (minimum) of f(x), if and
only if, x* is a local maximum (minimum) of F(f(x)).
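If x* is an interior optimum of a twice differentiable f(x) ⇒ $f_i(x^*) = 0$ and $f_{ii}(x^*) \le 0$ (or $\ge 0$).
Suppose x* is a maximum of f(x) with $f_i(x^*) = 0$ and $f_{ii}(x^*) < 0$. By the chain rule,
$\frac{\partial F(f(x^*))}{\partial x_i} = \frac{\partial F(f(x^*))}{\partial f(x)} \cdot \frac{\partial f(x^*)}{\partial x_i}$
Because F(z) is a monotonic increasing function of the single variable z ⇒ $\frac{\partial F(f(x))}{\partial f(x)} > 0$ for all f(x).
We have $f_i(x^*) = 0$ ⇒ $\frac{\partial F(f(x^*))}{\partial x_i} = \frac{\partial F(f(x^*))}{\partial f(x)} \cdot \frac{\partial f(x^*)}{\partial x_i} = 0$
Consider:
$\frac{\partial^2 F(f(x^*))}{\partial x_i \partial x_i} = \frac{\partial^2 F(f(x^*))}{\partial f(x) \partial f(x)} \cdot \frac{\partial f(x^*)}{\partial x_i} \cdot \frac{\partial f(x^*)}{\partial x_i} + \frac{\partial F(f(x^*))}{\partial f(x)} \cdot \frac{\partial^2 f(x^*)}{\partial x_i \partial x_i}$
We have: $\frac{\partial f(x^*)}{\partial x_i} = 0$; $\frac{\partial F(f(x^*))}{\partial f(x)} > 0$; and $\frac{\partial^2 f(x^*)}{\partial x_i \partial x_i} < 0$
⇒ $\frac{\partial^2 F(f(x^*))}{\partial x_i \partial x_i} = \frac{\partial F(f(x^*))}{\partial f(x)} \cdot \frac{\partial^2 f(x^*)}{\partial x_i \partial x_i} < 0$
⇒ $\frac{\partial F(f(x^*))}{\partial x_i} = 0$ and F(f(x)) is locally concave at x* ⇒ F(f(x)) is maximized at x*.
Both directions, and the minimum case, also follow directly without differentiability: when F is strictly increasing, $f(x^*) \ge f(x)$ holds for all x near x* if and only if $F(f(x^*)) \ge F(f(x))$ holds for all x near x*, and likewise with both inequalities reversed.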
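A small grid search illustrates the claim. The functions $f(x) = -(x - 1)^2$ and $F(z) = e^z$ are assumed examples, and the grid is an arbitrary choice that happens to contain the maximizer x = 1.

```python
import math


def f(x):
    return -(x - 1.0) ** 2          # assumed example, maximized at x = 1


def F(z):
    return math.exp(z)              # strictly increasing transform


grid = [i / 100 for i in range(-200, 401)]   # x from -2.00 to 4.00
argmax_f = max(grid, key=f)
argmax_Ff = max(grid, key=lambda x: F(f(x)))
print(argmax_f, argmax_Ff)
```

The transform changes the value at the optimum but not its location, which is all the exercise asserts.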
E 2.13
Suppose that f(x) is a concave function and M is the set of all points in $\mathbb{R}^n$ which give global
maxima of f. Prove that M is a convex set.
Let $x_1$ and $x_2$ be in M ⇒ $f(x_1) = f(x_2) = f^*$, the global maximum value of f (by the definition of a global maximum).
Let $x_t = t x_1 + (1 - t) x_2$ with $t \in [0, 1]$. Because f(x) is a concave function ⇒ $f(x_t) \ge t f(x_1) + (1 - t) f(x_2) = f^*$.
No value of f exceeds $f^*$ ⇒ $f(x_t)$ must be equal to $f^*$ ⇒ $x_t$ is in M ⇒ M is a convex set.
E 2.21
Find the local extreme values and classify the stationary points as maxima, minima, or
neither:
a. $f(x_1, x_2) = 2x_1 - x_1^2 - x_2^2$
The first-order conditions:
$f_1 = 2 - 2x_1 = 0 \Rightarrow x_1 = 1$
$f_2 = -2x_2 = 0 \Rightarrow x_2 = 0$
The second-order condition:
Hessian matrix: $H = \begin{pmatrix} -2 & 0 \\ 0 & -2 \end{pmatrix}$
$f_{11} < 0$, $f_{22} < 0$ and $D_2 = 4 > 0$ ⇒ f(x1,x2) is strictly concave ⇒ (1,0) is a maximum.
The maximum value: f = 2 - 1 = 1
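c. $f(x_1, x_2) = x_1^3 - x_2^2 + 2x_2$
$f_1 = 3x_1^2 = 0 \Rightarrow x_1 = 0$
$f_2 = -2x_2 + 2 = 0 \Rightarrow x_2 = 1$
The Hessian matrix: $H = \begin{pmatrix} 6x_1 & 0 \\ 0 & -2 \end{pmatrix}$
At the critical point (0,1), the Hessian matrix is: $H = \begin{pmatrix} 0 & 0 \\ 0 & -2 \end{pmatrix}$
D1 = 0, D2 = 0, so the second-order test is inconclusive.
Along the line $x_2 = 1$ we have $f(x_1, 1) = x_1^3 + 1$, which is above f(0,1) = 1 for $x_1 > 0$ and below it for $x_1 < 0$
⇒ the stationary point is neither a maximum nor a minimum.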
e. $f(x_1, x_2) = x_1^3 - 6x_1 x_2 + x_2^3$
$f_1 = 3x_1^2 - 6x_2 = 0$
$f_2 = -6x_1 + 3x_2^2 = 0$
The solutions of the simultaneous equations are: (0,0) and (2,2).
The Hessian matrix: $H = \begin{pmatrix} 6x_1 & -6 \\ -6 & 6x_2 \end{pmatrix}$
At the point (0,0), $H = \begin{pmatrix} 0 & -6 \\ -6 & 0 \end{pmatrix}$ ⇒ D1 = 0; D2 = -36 < 0 ⇒ the point (0,0) is neither a
maximum nor a minimum (it is a saddle point).
At the point (2,2), $H = \begin{pmatrix} 12 & -6 \\ -6 & 12 \end{pmatrix}$ ⇒ D1 = 12 > 0; D2 = 144 - 36 = 108 > 0 ⇒ the point (2,2) is a local minimum.
The minimum value is: f = 8 - 24 + 8 = -8.
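The second-order tests in parts a, c and e can be replayed with a small helper. This is a sketch of the usual leading-principal-minor test for a symmetric 2x2 Hessian; the function name `classify` and the labels are illustrative, and the Hessian entries are typed in from the calculations above.

```python
def classify(h11, h12, h22):
    """Second-order test for a stationary point from a symmetric 2x2 Hessian."""
    d1 = h11
    d2 = h11 * h22 - h12 ** 2
    if d2 > 0 and d1 > 0:
        return "local minimum"
    if d2 > 0 and d1 < 0:
        return "local maximum"
    if d2 < 0:
        return "saddle point"
    return "inconclusive"


# Hessians evaluated at the stationary points found in parts a, c and e.
results = {
    "a (1, 0)": classify(-2, 0, -2),
    "c (0, 1)": classify(0, 0, -2),
    "e (0, 0)": classify(0, -6, 0),
    "e (2, 2)": classify(12, -6, 12),
}
print(results)
```

When the test is inconclusive, as in part c, the point has to be examined directly, for example along a line through it.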
E 2.22
Solve the following problems:
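a. $\min x_1^2 + x_2^2$ s.t. $x_1 x_2 = 1$
The Lagrange function: $L = x_1^2 + x_2^2 + \lambda (x_1 x_2 - 1)$
The first-order conditions:
$L_1 = 2x_1 + \lambda x_2 = 0$
$L_2 = 2x_2 + \lambda x_1 = 0$
$L_\lambda = x_1 x_2 - 1 = 0$
⇒ the critical points $(x_1, x_2, \lambda)$: (1,1,-2) and (-1,-1,-2)
The bordered Hessian matrix: $H = \begin{pmatrix} 2 & \lambda & x_2 \\ \lambda & 2 & x_1 \\ x_2 & x_1 & 0 \end{pmatrix}$
At the point (1,1,-2), $H = \begin{pmatrix} 2 & -2 & 1 \\ -2 & 2 & 1 \\ 1 & 1 & 0 \end{pmatrix}$, D3 = -2 - 2 - 4 = -8 < 0
⇒ with two variables and one constraint, D3 < 0 means the point (1,1,-2) is a local minimum.
At the point (-1,-1,-2), $H = \begin{pmatrix} 2 & -2 & -1 \\ -2 & 2 & -1 \\ -1 & -1 & 0 \end{pmatrix}$, D3 = -2 - 2 - 4 = -8 < 0
⇒ the point (-1,-1,-2) is also a local minimum.
The minimum value at both points is: f = 1 + 1 = 2.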
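The determinant D3 at the two critical points of part a can be recomputed with the sketch below. It assumes the Lagrangian is written as $L = x_1^2 + x_2^2 + \lambda (x_1 x_2 - 1)$, the sign convention consistent with λ = -2 at both critical points; the helper names are illustrative. With two variables and one constraint, a negative determinant of the bordered Hessian indicates a constrained local minimum.

```python
def det3(m):
    """Determinant of a 3x3 matrix given as a list of rows."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)


def bordered_hessian(x1, x2, lam):
    # Second derivatives of L = x1^2 + x2^2 + lam * (x1 * x2 - 1),
    # bordered by the constraint gradient (x2, x1).
    return [[2, lam, x2],
            [lam, 2, x1],
            [x2, x1, 0]]


for point in [(1, 1, -2), (-1, -1, -2)]:
    print(point, det3(bordered_hessian(*point)))
```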
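b. $\min x_1 x_2$ s.t. $x_1^2 + x_2^2 = 1$
The Lagrange function: $L = x_1 x_2 + \lambda (x_1^2 + x_2^2 - 1)$
The first-order conditions:
$L_1 = x_2 + 2\lambda x_1 = 0$
$L_2 = x_1 + 2\lambda x_2 = 0$
$L_\lambda = x_1^2 + x_2^2 - 1 = 0$
The critical points $(x_1, x_2, \lambda)$: $(1/\sqrt{2}, 1/\sqrt{2}, -1/2)$; $(1/\sqrt{2}, -1/\sqrt{2}, 1/2)$; $(-1/\sqrt{2}, 1/\sqrt{2}, 1/2)$;
$(-1/\sqrt{2}, -1/\sqrt{2}, -1/2)$.
The second-order condition:
The bordered Hessian matrix: $H = \begin{pmatrix} 2\lambda & 1 & 2x_1 \\ 1 & 2\lambda & 2x_2 \\ 2x_1 & 2x_2 & 0 \end{pmatrix}$
$D_3 = 8x_1 x_2 - (8\lambda x_1^2 + 8\lambda x_2^2)$
With the point $(1/\sqrt{2}, 1/\sqrt{2}, -1/2)$, D3 = 4 + 4 = 8
With the point $(1/\sqrt{2}, -1/\sqrt{2}, 1/2)$, D3 = -4 - 4 = -8
With the point $(-1/\sqrt{2}, 1/\sqrt{2}, 1/2)$, D3 = -4 - 4 = -8
With the point $(-1/\sqrt{2}, -1/\sqrt{2}, -1/2)$, D3 = 4 + 4 = 8
With two variables and one constraint, D3 < 0 indicates a local minimum and D3 > 0 a local maximum.
The points $(1/\sqrt{2}, -1/\sqrt{2}, 1/2)$ and $(-1/\sqrt{2}, 1/\sqrt{2}, 1/2)$ are local minima of f(x1,x2), with f = -1/2.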
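As an independent check of part b that does not rely on the Lagrangian, the constraint can be parametrized by an angle: every point on the unit circle is $(\cos\theta, \sin\theta)$, so the objective becomes $\cos\theta \sin\theta$. The grid size and tolerance in this sketch are arbitrary assumptions.

```python
import math

# Every point on x1^2 + x2^2 = 1 can be written as (cos(theta), sin(theta)).
N = 3600
best_value, best_thetas = math.inf, []
for k in range(N):
    theta = 2 * math.pi * k / N
    value = math.cos(theta) * math.sin(theta)
    if value < best_value - 1e-12:
        best_value, best_thetas = value, [theta]        # strictly better point
    elif abs(value - best_value) <= 1e-12:
        best_thetas.append(theta)                       # tie with the current best

minimizers = [(round(math.cos(t), 4), round(math.sin(t), 4)) for t in best_thetas]
print(round(best_value, 6), minimizers)
```

The search finds two minimizers with opposite-signed coordinates and objective value -1/2, matching the points where $x_1 x_2 < 0$.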
