Multivariable Calculus: Inverse-Implicit Function Theorems
1. Introduction
Given a map f : Rn → Rn, we wish to solve the equation
(1.1) f (x) = y
and represent x = g(y), and if possible establish good properties of g, namely smoothness.
More generally, if f : Rn+m → Rn , x ∈ Rn , y ∈ Rm , solve the implicit system of equations
(1.2) f (x, y) = 0
Linear System: If f is linear, then (1.1) reduces to
(1.3) Ax = y
where A is an n × n matrix.
1. The lectures were delivered at the Science Academy workshop held at the Department of Mathematics, Jain University, Bangalore, during February 16-18, 2012.
2. Department of Mathematics, Indian Institute of Science, Bangalore. Email: nands@math.iisc.ernet.in
The system (1.3) is uniquely solvable for x in terms of y if and only if det A ≠ 0.
In this case
x = A−1 y
and A−1 is also an n × n matrix.
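As an illustrative sketch (not part of the notes), the unique solvability when det A ≠ 0 can be checked numerically for a 2 × 2 system, with A and y chosen arbitrarily:

```python
# Solve Ax = y for a 2x2 matrix A with det A != 0, using the explicit inverse.
A = [[2.0, 1.0],
     [1.0, 3.0]]
y = [5.0, 10.0]

det = A[0][0] * A[1][1] - A[0][1] * A[1][0]   # det A = 5, nonzero
assert det != 0, "Ax = y is uniquely solvable only when det A != 0"

# A^{-1} = (1/det) [[a22, -a12], [-a21, a11]] for a 2x2 matrix
x = [( A[1][1] * y[0] - A[0][1] * y[1]) / det,
     (-A[1][0] * y[0] + A[0][0] * y[1]) / det]

# Check that A x reproduces y.
assert all(abs(sum(A[i][j] * x[j] for j in range(2)) - y[i]) < 1e-12
           for i in range(2))
```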
We would like to address the solvability of (1.1) and (1.2) under appropriate conditions, analogous to det A ≠ 0 in the linear system.
Remark 1.3. Thus we see the impact of the non-vanishing of the derivative on solvability, similar to det A ≠ 0 in linear systems. In the higher-dimensional case, we have many derivatives and we need a systematic procedure to deal with such complicated cases. In other words, we would like to understand the solvability of a system of non-linear equations in many unknowns. This is given via the inverse and implicit function theorems. We also remark that we will only get a local theorem, not a global theorem as in linear systems.
Now for v ∈ Rn,
f (x0 + hv) = ∑_{i=1}^n (x0i + h vi)^2
            = ∑_{i=1}^n x0i^2 + 2h ∑_{i=1}^n x0i vi + h^2 ∑_{i=1}^n vi^2
            = f (x0) + 2h(x0, v) + h^2 |v|^2.
It follows that
Dv f (x0 ) = 2(x0 , v).
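The formula Dv f (x0) = 2(x0, v) can be checked numerically with a difference quotient (an illustrative sketch, not part of the notes; the points x0 and v below are arbitrary choices):

```python
# Finite-difference check of D_v f(x0) = 2 (x0, v) for f(x) = |x|^2.
def f(x):
    return sum(t * t for t in x)

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

x0 = [1.0, -2.0, 3.0]
v = [0.5, 1.0, -1.0]
h = 1e-6

x0_hv = [a + h * b for a, b in zip(x0, v)]
numeric = (f(x0_hv) - f(x0)) / h        # difference quotient in direction v
exact = 2 * dot(x0, v)                  # formula from the computation above

# The two agree up to the O(h) term h |v|^2 from the expansion.
assert abs(numeric - exact) < 1e-4
```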
Remark 2.3. As seen earlier, the existence of all directional derivatives implies the existence of partial derivatives. But the converse is not true. Consider, for instance, the function f on R2 with f (0, 0) = 0, f (x, 0) = x, f (0, y) = y, and f (x, y) = 1 when xy ≠ 0. Then D(1,0) f (0, 0) = D(0,1) f (0, 0) = 1, but D(a,b) f (0, 0), a ≠ 0, b ≠ 0, does not exist.
Then show that ∂²f/∂x∂y (0, 0) = 1 and ∂²f/∂y∂x (0, 0) = −1, which shows that, in general, the order of partial derivatives cannot be interchanged.
That is
f (x0 + h) = value of f at x0 + linearized term + remainder o(h).
This can be easily extended to vector-valued functions of one variable, namely f : R → Rm. Here f (x) = (f1(x), · · · , fm(x))^T with x ∈ R and f′(x0) = (f1′(x0), · · · , fm′(x0))^T. If we define αi = fi′(x0), then α = (α1, · · · , αm) ∈ Rm. Correspondingly, we can associate a linear operator Tα : R → Rm defined by
Tα(x) = αx = (α1 x, · · · , αm x)^T = (f1′(x0)x, · · · , fm′(x0)x)^T, x ∈ R.
If f : Rn → Rm, then x0 and h are vectors, and one has to interpret the meaning of the product f′(x0)h and the quotient r(h)/h.
Equation (2.7) can equivalently be rewritten as
lim_{h→0} |f (x0 + h) − f (x0) − f′(x0)h| / |h| = 0.
Example 2.10. Let f : Rn → R be given by f (x) = |x|^2 = (x, x). Then f′(x0)h = 2(x0, h), or f′(x0) = 2x0.
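For this example the remainder in the definition of the total derivative can be followed explicitly; a minimal numerical sketch (not part of the notes; x0 and the scales are arbitrary choices):

```python
# For f(x) = |x|^2, the remainder r(h) = f(x0 + h) - f(x0) - 2(x0, h)
# equals |h|^2 exactly, so |r(h)|/|h| = |h| -> 0 as h -> 0.
import math

def f(x):
    return sum(t * t for t in x)

def dot(u, v):
    return sum(p * q for p, q in zip(u, v))

x0 = [1.0, 2.0]
ratios = []
for scale in [1e-1, 1e-2, 1e-3]:
    h = [0.6 * scale, 0.8 * scale]                    # |h| = scale
    r = f([a + b for a, b in zip(x0, h)]) - f(x0) - 2 * dot(x0, h)
    ratios.append(abs(r) / math.sqrt(dot(h, h)))      # should equal |h|

assert ratios[0] > ratios[1] > ratios[2]              # ratio shrinks with |h|
```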
Remark 2.11. The definition can be extended to infinite-dimensional normed linear spaces without much difficulty, and hence it has enormous applications.
Exercise 2.14. Show that in Exercises 2.4, 2.6 and 2.7, f′(0, 0) does not exist.
Remark 2.15. This indicates that the existence of all directional derivatives is not enough to guarantee the existence of the total derivative. But if the total derivative exists, then all the directional derivatives exist, and in fact one can compute the total derivative using the partial derivatives.
Let {e1, · · · , en} and {ẽ1, · · · , ẽm} be the standard bases of Rn and Rm, respectively. If f′(x0) exists, then for j fixed, by the definition and linearity of f′(x0), we get
f′(x0)ej = ∑_{i=1}^m (∂fi/∂xj)(x0) ẽi,
so that the matrix of f′(x0) with respect to these bases is the Jacobian matrix [(∂fi/∂xj)(x0)].
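The columns f′(x0)ej of the Jacobian can be approximated by difference quotients; a sketch (the map f below is a hypothetical example, not from the notes):

```python
# Finite-difference Jacobian of f(x, y) = (x*y, x + y): column j approximates f'(x0) e_j.
def f(p):
    x, y = p
    return [x * y, x + y]

x0 = [2.0, 3.0]
h = 1e-6
m, n = 2, 2

f0 = f(x0)
jac = [[0.0] * n for _ in range(m)]
for j in range(n):
    shifted = [x0[k] + (h if k == j else 0.0) for k in range(n)]
    fp = f(shifted)
    for i in range(m):
        jac[i][j] = (fp[i] - f0[i]) / h   # approximates (df_i/dx_j)(x0)

# Exact Jacobian at (2, 3) is [[y, x], [1, 1]] = [[3, 2], [1, 1]].
exact = [[3.0, 2.0], [1.0, 1.0]]
assert all(abs(jac[i][j] - exact[i][j]) < 1e-4 for i in range(2) for j in range(2))
```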
3. Inverse Function Theorem

In this section, we address the solvability of the non-linear equation in explicit form:
(3.1) f (x) = y
where f : E ⊂ Rn → Rm, E is open, and y ∈ Rm is given. This is a system of m non-linear equations in n unknowns:

(3.2) f1(x1, · · · , xn) = y1
      ...
      fm(x1, · · · , xn) = ym
Given a ∈ E, let b = f (a), then (a, b) is a solution to (3.1). We want to give conditions
under which one can solve for x for all y in a neighborhood of b.
Theorem 3.1 (Inverse Function Theorem). Let f be as above and satisfy:
(1) f is a C1 map, that is, f′(x) exists for all x ∈ E and the mapping x ↦ f′(x), E → L(Rn, Rm), is continuous;
(2) the matrix f′(a) is invertible (so necessarily m = n), that is, det f′(a) ≠ 0.
Then ∃ open sets U and V in Rn and Rm, containing a and b, respectively, such that
(i) f : U → V is 1-1 and onto;
(ii) g = f⁻¹ : V → U, given by g(f (x)) = x, ∀ x ∈ U, is a C1 map.
The above theorem tells us that y = f (x) can be uniquely solved for x, for y in a neighborhood of b. Further, the inverse map is also smooth. We will not present a detailed proof of the above theorem, but it is based on the contraction mapping theorem from functional analysis.
Theorem 3.2 (Contraction Mapping Theorem). Let φ : Rn → Rn be a contraction map, that is, ∃ 0 ≤ α < 1 such that |φ(x) − φ(y)| ≤ α|x − y| for all x, y ∈ Rn. Then ∃ a unique solution to the problem
φ(x) = x.
Corollary 3.3. The above theorem is true for any complete metric space X in place of
Rn .
Remark 3.4. The proof is beautiful and constructive. Take an arbitrary point x0 ∈ Rn and construct inductively xn+1 = φ(xn), n = 0, 1, 2, · · · . Then one can prove that xn → x and φ(x) = x.
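The iteration of Remark 3.4 can be watched in action; a minimal sketch, taking φ(x) = cos x on R as the contraction (an illustrative choice, not from the notes; near its fixed point |φ′(x)| = |sin x| < 1):

```python
# Fixed-point iteration x_{n+1} = phi(x_n) for phi(x) = cos x.
import math

x = 0.0                      # arbitrary starting point x0
for _ in range(100):
    x = math.cos(x)

# The limit solves phi(x) = x (approximately 0.739).
assert abs(math.cos(x) - x) < 1e-12
```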
Proof (Inverse Function Theorem; sketch). We are given that A = f′(a) is invertible. Since f′ is continuous, given ε > 0, ∃ an open set U ⊂ E containing a such that ‖f′(x) − A‖ ≤ ε ∀ x ∈ U. Now define, for y ∈ Rn,
φ(x) = x + A⁻¹(y − f (x)).
Then f (x) = y is solvable if and only if φ(x) = x has a solution.
Step 1: Show that φ is a contraction (for ε small) to get the solution.
Step 2: Prove that the inverse map so obtained is C1.
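The contraction φ(x) = x + A⁻¹(y − f (x)) can be run numerically; a sketch under stated assumptions (the map f below is a hypothetical example with a = (0, 0), b = f (a) = (0, 0) and A = f′(a) = I, not from the notes):

```python
# Solving f(x) = y near a = (0, 0) by iterating phi(x) = x + A^{-1}(y - f(x)).
def f(x):
    return [x[0] + x[1] ** 2, x[1] + x[0] ** 2]   # f'(0, 0) = identity

y = [0.1, 0.05]              # target close to b = f(a) = (0, 0)
x = [0.0, 0.0]               # start the iteration at a
for _ in range(50):
    fx = f(x)
    x = [x[i] + (y[i] - fx[i]) for i in range(2)]   # A^{-1} = I here

# The fixed point of phi solves f(x) = y.
assert max(abs(f(x)[i] - y[i]) for i in range(2)) < 1e-12
```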
4. Implicit Function Theorem

Quite often, we do not expect to get equations in explicit form y = f (x); as in x^2 + y^2 − 1 = 0, we may only get a relation connecting the variables x and y. Let f : E ⊂ Rn+m → Rn and consider
(4.1) f (x, y) = 0,
that is, componentwise,
(4.2) fi(x1, · · · , xn, y1, · · · , ym) = 0, i = 1, · · · , n.
Linear System: If the fi's are linear, then ∃ an n × n matrix A and an n × m matrix B so that (4.2) reduces to
(4.3) Ax + By = 0.
If A is invertible, then x can be solved as
x = −A⁻¹By
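A small numerical sketch of this linear case (the matrices A, B and the vector y below are arbitrary illustrative choices):

```python
# Implicit linear system Ax + By = 0 with A invertible: x = -A^{-1} B y.
A = [[1.0, 2.0],
     [3.0, 4.0]]             # n x n with det A = -2, nonzero
B = [[1.0],
     [0.0]]                  # n x m with n = 2, m = 1
y = [4.0]

det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
Ainv = [[ A[1][1] / det, -A[0][1] / det],
        [-A[1][0] / det,  A[0][0] / det]]

By = [sum(B[i][j] * y[j] for j in range(1)) for i in range(2)]
x = [-sum(Ainv[i][k] * By[k] for k in range(2)) for i in range(2)]

# Verify Ax + By = 0.
assert all(abs(sum(A[i][j] * x[j] for j in range(2)) + By[i]) < 1e-12
           for i in range(2))
```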
Given T ∈ L(Rn+m, Rn), for h ∈ Rn and k ∈ Rm write
T (h, k) = Tx h + Ty k.
That is, we have
T = Tx + Ty
with Tx ∈ L(Rn, Rn) and Ty ∈ L(Rm, Rn).
Theorem 4.2 (Implicit Function Theorem (Non-Linear version)). Let f : E ⊂ Rn+m → Rn be a C1 map such that f (a, b) = 0 for some (a, b) ∈ E. Put T = f′(a, b) ∈ L(Rn+m, Rn) and T = Tx + Ty as above, and assume Tx is invertible. Then ∃ open sets U ⊂ E ⊂ Rn+m and W ⊂ Rm, with b ∈ W and (a, b) ∈ U, satisfying
(i) for every y ∈ W, ∃! x such that (x, y) ∈ U and f (x, y) = 0;
(ii) if we define g : W → Rn by g(y) = x, then g is a C1 map such that g(b) = a and f (g(y), y) = 0 for all y ∈ W. Further,
g′(b) = −Tx⁻¹Ty.
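As an illustrative check of the formula g′(b) = −Tx⁻¹Ty (not part of the notes), take the relation x^2 + y^2 − 1 = 0 with n = m = 1, where the implicit function is explicitly g(y) = sqrt(1 − y^2) near a point with a > 0:

```python
# Implicit function theorem for f(x, y) = x^2 + y^2 - 1 at (a, b) = (0.6, 0.8):
# Tx = df/dx = 2a, Ty = df/dy = 2b, so g'(b) = -Tx^{-1} Ty = -b/a.
import math

a, b = 0.6, 0.8
assert abs(a * a + b * b - 1.0) < 1e-12     # f(a, b) = 0

Tx, Ty = 2 * a, 2 * b
g_prime = -Ty / Tx                          # = -4/3 by the theorem

# Compare with the explicit branch g(y) = sqrt(1 - y^2), which has g(b) = a.
h = 1e-7
numeric = (math.sqrt(1 - (b + h) ** 2) - math.sqrt(1 - (b - h) ** 2)) / (2 * h)
assert abs(numeric - g_prime) < 1e-5
```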
Proof (Idea). The proof is an application of the Inverse Function Theorem applied to F : E ⊂ Rn+m → Rn+m defined by
F (x, y) = (f (x, y), y).
Take a = (1, 1), b = (3, 2, 7); then f (a, b) = 0. Compute T = (Tx, Ty), where

Tx = [  2  3 ]      Ty = [ 1  −4  0 ]
     [ −6  1 ]           [ 2   0  1 ]
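The derivative g′(b) = −Tx⁻¹Ty for this example can be evaluated numerically; a sketch using only the matrices above:

```python
# g'(b) = -Tx^{-1} Ty for the matrices in the example above.
Tx = [[ 2.0, 3.0],
      [-6.0, 1.0]]
Ty = [[1.0, -4.0, 0.0],
      [2.0,  0.0, 1.0]]

det = Tx[0][0] * Tx[1][1] - Tx[0][1] * Tx[1][0]     # det Tx = 20, so Tx is invertible
Tx_inv = [[ Tx[1][1] / det, -Tx[0][1] / det],
          [-Tx[1][0] / det,  Tx[0][0] / det]]

# g'(b) is the 2x3 matrix -Tx^{-1} Ty.
g_prime = [[-sum(Tx_inv[i][k] * Ty[k][j] for k in range(2)) for j in range(3)]
           for i in range(2)]
# g_prime = [[0.25, 0.2, 0.15], [-0.5, 1.2, -0.1]]
```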