18.950 Handout 4. Inverse and Implicit Function Theorems.
Theorem 1 (Inverse Function Theorem). Suppose $U \subset \mathbb{R}^n$ is open, $f : U \to \mathbb{R}^n$ is $C^1$, $x_0 \in U$ and $df_{x_0}$ is invertible. Then there exists a neighborhood $V$ of $x_0$ in $U$ and a neighborhood $W$ of $f(x_0)$ in $\mathbb{R}^n$ such that $f$ has a $C^1$ inverse $g = f^{-1} : W \to V$. (Thus $f(g(y)) = y$ for all $y \in W$ and $g(f(x)) = x$ for all $x \in V$.) Moreover,
$$dg_y = (df_{g(y)})^{-1} \quad \text{for all } y \in W,$$
and $g$ is smooth whenever $f$ is smooth.
Remark. The theorem says that a continuously differentiable function $f$ between regions in $\mathbb{R}^n$ is locally invertible near points where its differential is invertible.
Proof. Without loss of generality, we may assume that $x_0 = 0$, $f(x_0) = 0$ and $df_{x_0} = I$. (Otherwise, replace $f$ with $\tilde{f}(x) = df_{x_0}^{-1}(f(x + x_0) - f(x_0))$. Note that if the theorem holds with $\tilde{f}$, $0$, $0$, $I$ and a function $\tilde{g}$ in place of $f$, $x_0$, $f(x_0)$, $df_{x_0}$ and $g$ respectively, then it is easily verified that the theorem as stated holds with $g(y) = x_0 + \tilde{g}(df_{x_0}^{-1}(y - f(x_0)))$.)
Since $df_x$ is continuous in $x$ at $x_0 = 0$ (see Exercise 1), there exists a number $r > 0$ such that $\overline{B}_r(0) \subset U$ and
$$x \in \overline{B}_r(0) \implies \|df_x - I\| \le \tfrac{1}{2}.$$
(Recall that for a linear transformation $A : \mathbb{R}^n \to \mathbb{R}^m$ we define the norm of $A$ by $\|A\| = \sup_{|v| \le 1} |A(v)|$.) Fix $y \in B_{r/2}(0)$. Define a function $\varphi$ by
$$\varphi(x) = x - f(x) + y.$$
Note that $d\varphi_x = I - df_x$ and hence
$$\|d\varphi_x\| \le 1/2 \quad \text{if } x \in \overline{B}_r(0).$$
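Thus
$$\begin{aligned}
|\varphi(x)| &\le |\varphi(x) - y| + |y| = \left| \int_0^1 \frac{d}{dt}\varphi(tx)\,dt \right| + |y| \\
&= \left| \int_0^1 d\varphi_{tx} \cdot x\,dt \right| + |y| \le \int_0^1 \|d\varphi_{tx}\|\,|x|\,dt + |y| \\
&\le r/2 + r/2 = r
\end{aligned} \tag{1}$$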
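whenever $x \in \overline{B}_r(0)$; i.e. $\varphi$ is a map from the closed ball $\overline{B}_r(0)$ into itself. For any $x, z \in \overline{B}_r(0)$,
$$\begin{aligned}
|\varphi(z) - \varphi(x)| &= \left| \int_0^1 \frac{d}{dt}\varphi(x + t(z - x))\,dt \right| \\
&\le \int_0^1 \left| d\varphi_{x + t(z - x)} \cdot (z - x) \right| dt \\
&\le \int_0^1 \|d\varphi_{x + t(z - x)}\|\,|z - x|\,dt \\
&\le \tfrac{1}{2}|z - x|.
\end{aligned}$$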
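Thus $\varphi : \overline{B}_r(0) \to \overline{B}_r(0)$ is a contraction of a complete metric space, and hence $\varphi$ has a unique fixed point $x_y \in \overline{B}_r(0)$; i.e. there is a unique point $x_y \in \overline{B}_r(0)$ with $f(x_y) = y$. In fact $x_y \in B_r(0)$ since
$$\frac{r}{2} > |y| = |f(x_y)| \ge |x_y| - |x_y - f(x_y)| \ge |x_y| - \tfrac{1}{2}|x_y| = \tfrac{1}{2}|x_y|.$$
Set $W = B_{r/2}(0)$ and $V = f^{-1}(W) \cap B_r(0)$. Note then that $V$ is open.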
Define $g : W \to V$ by $g(y) = x_y$. Then $f(g(y)) = y$ for all $y \in W$ and $g(f(x)) = x$ for all $x \in V$.
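As a numerical illustration of the fixed-point construction above, the following sketch iterates $\varphi(x) = x - f(x) + y$ starting from $x = 0$. The map $f(x_1, x_2) = (x_1 + x_2^2,\ x_2 + x_1^2)$, the radius $r = 1/4$ and the point $y$ are assumptions chosen for illustration and do not come from the handout. For this $f$ one has $f(0) = 0$, $df_0 = I$ and $\|df_x - I\| \le 2|x| \le 1/2$ when $|x| \le 1/4$, so the hypotheses used in the proof hold.

```python
import math

def f(x):
    """Illustrative map (an assumption): f(0) = 0 and df_0 = I."""
    x1, x2 = x
    return (x1 + x2 ** 2, x2 + x1 ** 2)

def local_inverse(y, tol=1e-14, max_iter=200):
    """Approximate g(y) by iterating phi(x) = x - f(x) + y from x = 0."""
    x = (0.0, 0.0)
    for _ in range(max_iter):
        fx = f(x)
        x_next = (x[0] - fx[0] + y[0], x[1] - fx[1] + y[1])
        # Stop once successive iterates agree to within tol.
        if math.hypot(x_next[0] - x[0], x_next[1] - x[1]) < tol:
            return x_next
        x = x_next
    return x

r = 0.25                 # radius on which ||df_x - I|| <= 1/2 for this f
y = (0.05, -0.08)        # chosen so that |y| < r/2
x_y = local_inverse(y)

# The fixed point of phi should satisfy f(x_y) = y and lie in B_r(0).
residual = math.hypot(f(x_y)[0] - y[0], f(x_y)[1] - y[1])
print("g(y) ~", x_y, " |f(g(y)) - y| =", residual)
```

Because the contraction constant is at most 1/2, the error at least halves with every step, which is why a fixed iteration cap is enough here.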
Next we show that $g$ is differentiable, with $dg_y = (df_{g(y)})^{-1}$. First note that with $\psi : B_r(0) \to \mathbb{R}^n$ defined by $\psi(x) = x - f(x)$, we have that for $x_1, x_2 \in B_r(0)$,
$$|x_1 - x_2| - |f(x_1) - f(x_2)| \le |(x_1 - x_2) - (f(x_1) - f(x_2))| \le |\psi(x_1) - \psi(x_2)| \le \tfrac{1}{2}|x_1 - x_2|,$$
where the last inequality follows by estimating as in (1), using $d\psi_x = I - df_x$. Hence
$$\tfrac{1}{2}|x_1 - x_2| \le |f(x_1) - f(x_2)|$$
for any $x_1, x_2 \in B_r(0)$, which implies
$$|g(y_1) - g(y_2)| \le 2|y_1 - y_2| \tag{2}$$
for any $y_1, y_2 \in W = B_{r/2}(0)$. In particular, $g$ is continuous.
Now fix $y \in W$, and let $A = df_{g(y)}$, which is invertible since $\|A - I\| \le 1/2$. Since $W$ is open, there exists $\delta > 0$ such that $y + k \in W$ if $k \in B_\delta(0)$. Let $h = g(y + k) - g(y)$. Then $k = (y + k) - y = f(g(y + k)) - f(g(y)) = f(g(y) + h) - f(g(y))$ and hence, for $k \in B_\delta(0) \setminus \{0\}$,
$$\begin{aligned}
\frac{|g(y + k) - g(y) - A^{-1}k|}{|k|} &= \frac{|A^{-1}(Ah - k)|}{|h|}\,\frac{|h|}{|k|} \\
&\le \|A^{-1}\|\,\frac{|k - Ah|}{|h|}\,\frac{|h|}{|k|} \\
&\le 2\|A^{-1}\|\,\frac{|f(g(y) + h) - f(g(y)) - Ah|}{|h|},
\end{aligned} \tag{3}$$
where the last estimate follows from (2). Note that since $g(y + k) = g(y) \implies f(g(y + k)) = f(g(y)) \implies y + k = y \implies k = 0$, we have that $h \ne 0$ if $k \ne 0$, so the division by $|h|$ in (3) is legitimate. Since $A = df_{g(y)}$, it follows from the definition of differentiability of $f$ that the right hand side of (3) tends to $0$ as $h \to 0$, and hence, since $|h| \le 2|k|$ by (2), it follows that
$$\lim_{k \to 0} \frac{|g(y + k) - g(y) - A^{-1}k|}{|k|} = 0;$$
i.e. $g$ is differentiable at $y$ and
$$dg_y = (df_{g(y)})^{-1}. \tag{4}$$
Finally, note that the function $y \mapsto dg_y$ is the composition of the function $y \mapsto df_{g(y)}$ and matrix inversion $A \mapsto A^{-1}$. Matrix inversion is a smooth map of the entries, and the function $y \mapsto df_{g(y)}$ is continuous since $g$ is continuous and $f$ is $C^1$. Hence we conclude that $y \mapsto dg_y$ is continuous; i.e. that $g$ is $C^1$. Repeatedly differentiating (4) shows that $g$ is smooth if $f$ is smooth.
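The identity $dg_y = (df_{g(y)})^{-1}$ can be checked numerically. The sketch below is only an illustration under stated assumptions: the map $f(x_1, x_2) = (x_1 + x_2^2,\ x_2 + x_1^2)$ and the point $y$ are chosen here and are not from the handout, $g$ is computed by iterating $x \mapsto x - f(x) + y$ as in the proof, and $dg_y$ is approximated by central finite differences with a step `eps` picked by hand.

```python
def f(x):
    """Illustrative map (an assumption): f(0) = 0 and df_0 = I."""
    x1, x2 = x
    return (x1 + x2 ** 2, x2 + x1 ** 2)

def jacobian_f(x):
    """Exact Jacobian matrix of f at x, as a tuple of rows."""
    x1, x2 = x
    return ((1.0, 2.0 * x2), (2.0 * x1, 1.0))

def g(y, iterations=200):
    """Local inverse of f near 0, via the fixed-point iteration of the proof."""
    x = (0.0, 0.0)
    for _ in range(iterations):
        fx = f(x)
        x = (x[0] - fx[0] + y[0], x[1] - fx[1] + y[1])
    return x

def inverse_2x2(m):
    """Inverse of a 2x2 matrix given as a tuple of rows."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return ((d / det, -b / det), (-c / det, a / det))

def jacobian_g_fd(y, eps=1e-6):
    """Central finite-difference approximation to dg_y."""
    cols = []
    for j in range(2):
        y_plus, y_minus = list(y), list(y)
        y_plus[j] += eps
        y_minus[j] -= eps
        g_plus, g_minus = g(tuple(y_plus)), g(tuple(y_minus))
        cols.append(((g_plus[0] - g_minus[0]) / (2 * eps),
                     (g_plus[1] - g_minus[1]) / (2 * eps)))
    # cols[j] holds the j-th column; rearrange into rows.
    return ((cols[0][0], cols[1][0]), (cols[0][1], cols[1][1]))

y = (0.05, -0.08)
predicted = inverse_2x2(jacobian_f(g(y)))   # (df_{g(y)})^{-1}
measured = jacobian_g_fd(y)                 # finite-difference dg_y
max_error = max(abs(predicted[i][j] - measured[i][j])
                for i in range(2) for j in range(2))
print("max entrywise difference:", max_error)
```

The two matrices should agree up to the finite-difference error, which is small compared with the entries themselves.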
Exercise 1. Let $L(\mathbb{R}^n; \mathbb{R}^n)$ be the set of linear transformations from $\mathbb{R}^n$ into itself with the metric $d(A, B) = \|A - B\|$. (cf. Exercise 10 of handout 1.) Let $U \subset \mathbb{R}^n$ be open and $f : U \to \mathbb{R}^n$ be a $C^1$ function. Show that the map $x \mapsto df_x$ is continuous as a map from $U$ into $L(\mathbb{R}^n; \mathbb{R}^n)$.
Exercise 2. Suppose $g : [a, b] \to \mathbb{R}^n$ is continuous. Show that
$$\left| \int_a^b g(t)\,dt \right| \le \int_a^b |g(t)|\,dt,$$
where $|\cdot|$ denotes the Euclidean norm. You may use without proof that
$$\left| \int_a^b h(t)\,dt \right| \le \int_a^b |h(t)|\,dt$$
for a scalar valued function $h$.
Exercise 3. Define $f : \mathbb{R} \to \mathbb{R}$ by $f(x) = \frac{x}{2} + x^2 \sin\frac{1}{x}$ if $x \ne 0$ and $f(0) = 0$. Compute $f'(x)$ for all $x \in \mathbb{R}$. Show that $f'(0) > 0$, yet $f$ is not one-to-one in any neighborhood of $0$. This example shows that in the Inverse Function Theorem, the hypothesis that $f$ is $C^1$ cannot be weakened to the hypothesis that $f$ is differentiable.
Exercise 4. Define $f : \mathbb{R}^2 \to \mathbb{R}^2$ by $f(x, y) = (e^x \cos y, e^x \sin y)$. Show that $f$ is $C^1$ and that $df_{(x,y)}$ is invertible for all $(x, y) \in \mathbb{R}^2$ and yet $f$ is not a one-to-one function globally. Why doesn't this contradict the Inverse Function Theorem?
Next we prove the Implicit Function Theorem. This theorem gives conditions under which one can solve, locally, a system of equations
$$f_i(x, y) = 0, \quad i = 1, 2, \ldots, n,$$
where $x \in \mathbb{R}^m$ and $y \in \mathbb{R}^n$, for $y$ in terms of $x$. (Thus, $y = (y_1, \ldots, y_n)$ where $y_1, \ldots, y_n$ are regarded as $n$ unknowns, satisfying the $n$ equations $f_i(x, y) = 0$, $i = 1, \ldots, n$.) Geometrically, the set of solutions $(x, y)$ to the system of equations is the graph of a function $y = g(x)$. Note that we have from linear algebra that if for each $i$, the function $f_i$ is linear with constant coefficients in the variables $y_j$, then whenever the (constant) $n \times n$ matrix $\left( \frac{\partial f_i}{\partial y_j} \right)_{1 \le i, j \le n}$ is invertible, the system of equations is solvable for $y$ in terms of $x$. The Implicit Function Theorem says that whenever the $f_i$ are $C^1$ and this matrix is invertible at a point $(a, b)$, then the system is solvable for $y$ in terms of $x$ locally in a neighborhood of $(a, b)$.
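The linear case mentioned above can be made concrete with a small sketch. It takes $m = 1$, $n = 2$ and $f_i(x, y) = B_i x + C_{i1} y_1 + C_{i2} y_2 + c_i$, so that $\left( \frac{\partial f_i}{\partial y_j} \right) = C$ and the solution is $y = -C^{-1}(Bx + c)$. The coefficients `B`, `C`, `c` and the helper name `solve_for_y` are hypothetical, chosen only for illustration.

```python
def solve_for_y(B, C, c, x):
    """Solve B*x + C*y + c = 0 for y in R^2, where C is a 2x2 matrix (tuple of rows)."""
    (p, q), (r, s) = C
    det = p * s - q * r
    if det == 0:
        raise ValueError("the matrix (df_i/dy_j) is singular; cannot solve for y")
    # Right-hand side of C*y = -(B*x + c).
    rhs = (-(B[0] * x + c[0]), -(B[1] * x + c[1]))
    # Apply the explicit inverse of the 2x2 matrix C.
    y1 = (s * rhs[0] - q * rhs[1]) / det
    y2 = (-r * rhs[0] + p * rhs[1]) / det
    return (y1, y2)

# Hypothetical coefficients for illustration.
B = (1.0, -2.0)
C = ((2.0, 1.0), (1.0, 3.0))
c = (0.5, 1.0)
x = 0.7

y = solve_for_y(B, C, c, x)
# Both equations f_i(x, y) = 0 should hold up to rounding.
residuals = tuple(B[i] * x + C[i][0] * y[0] + C[i][1] * y[1] + c[i] for i in range(2))
print("y =", y, " residuals =", residuals)
```

In this linear setting the solution is global in $x$; the theorem below replaces this with a local statement when the $f_i$ are merely $C^1$.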
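We shall use the following notation: For an $\mathbb{R}^n$ valued function $f(x, y) = (f_1(x, y), f_2(x, y), \ldots, f_n(x, y))$ in a domain $U \subset \mathbb{R}^{m+n} \equiv \mathbb{R}^m \times \mathbb{R}^n$, where $x \in \mathbb{R}^m$, $y \in \mathbb{R}^n$, we shall denote by $d_x f$ the partial differential represented by the $n \times m$ matrix $\left( \frac{\partial f_i}{\partial x_j} \right)_{1 \le i \le n,\ 1 \le j \le m}$ and by $d_y f$ the partial differential represented by the $n \times n$ matrix $\left( \frac{\partial f_i}{\partial y_j} \right)_{1 \le i, j \le n}$.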
Theorem 2 (Implicit Function Theorem). Let $U \subset \mathbb{R}^{m+n} \equiv \mathbb{R}^m \times \mathbb{R}^n$ be an open set, $f : U \to \mathbb{R}^n$ a $C^1$ function, $(a, b) \in U$ a point such that $f(a, b) = 0$ and $d_y f|_{(a,b)}$ invertible. Then there exists a neighborhood $V$ of $(a, b)$ in $U$, a neighborhood $W$ of $a$ in $\mathbb{R}^m$ and a $C^1$ function $g : W \to \mathbb{R}^n$ such that
$$\{(x, y) \in V : f(x, y) = 0\} = \{(x, g(x)) : x \in W\}.$$
Moreover,
$$dg_x = -\left( d_y f \right)^{-1}\big|_{(x, g(x))}\; d_x f\big|_{(x, g(x))}$$
and $g$ is smooth if $f$ is smooth.
Proof. Define $F : U \to \mathbb{R}^{m+n}$ by $F(x, y) = (x, f(x, y))$. Then $F$ is $C^1$ in $U$, $F(a, b) = (a, 0)$ and $\det dF_{(a,b)} = \det d_y f|_{(a,b)} \ne 0$. Hence by the Inverse Function Theorem, $F$ has a $C^1$ inverse $F^{-1} : \widetilde{W} \to V$ for neighborhoods $V$ of $(a, b)$ and $\widetilde{W}$ of $(a, 0)$ in $\mathbb{R}^m \times \mathbb{R}^n$. Set $W = \{x \in \mathbb{R}^m : (x, 0) \in \widetilde{W}\}$. Then $W$ is open in $\mathbb{R}^m$. Note then that if $x \in W$, then $(x, 0) \in \widetilde{W}$ so that $(x, 0) = F(x_1, y_1)$ where $(x_1, y_1) \in V$ is uniquely determined by $x$. (In fact, by the definition of $F$, $x_1 = x$.) Define $g : W \to \mathbb{R}^n$ by setting $y_1 = g(x)$.
Thus $g(x)$ is defined by $F^{-1}(x, 0) = (x, g(x))$; i.e. by $g(x) = \pi \circ F^{-1}(x, 0)$ where $\pi : \mathbb{R}^m \times \mathbb{R}^n \to \mathbb{R}^n$ is the projection map $\pi(x, y) = y$. Then $\{(x, y) \in V : f(x, y) = 0\} = \{(x, y) \in V : F(x, y) = (x, 0)\} = \{(x, g(x)) : x \in W\}$.
Since $\pi$ is a smooth map and $F^{-1}$ is $C^1$, it follows that $g$ is $C^1$. The formula for $dg_x$ follows by differentiating the identity
$$f(x, g(x)) \equiv 0 \quad \text{on } W$$
using the chain rule, which gives $d_x f|_{(x, g(x))} + d_y f|_{(x, g(x))}\, dg_x = 0$. By repeatedly differentiating this identity, it follows that $g$ is smooth if $f$ is smooth.
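The derivative formula of Theorem 2 can be tested on a simple case. The sketch below assumes $m = n = 1$, $f(x, y) = x^2 + y^2 - 1$ and the base point $(a, b) = (0.6, 0.8)$, all chosen for illustration. It computes $g(x)$ by Newton's method in $y$, which is a numerical stand-in and not the construction used in the proof, and compares $-(d_y f)^{-1} d_x f = -x/y$ with a central finite difference of $g$.

```python
def f(x, y):
    """Illustrative constraint (an assumption): the unit circle."""
    return x ** 2 + y ** 2 - 1.0

def dx_f(x, y):
    """Partial derivative of f with respect to x."""
    return 2.0 * x

def dy_f(x, y):
    """Partial derivative of f with respect to y."""
    return 2.0 * y

def g(x, y_start=0.8, iterations=50):
    """Solve f(x, y) = 0 for y near y_start by Newton's method in y."""
    y = y_start
    for _ in range(iterations):
        y -= f(x, y) / dy_f(x, y)
    return y

a = 0.6
eps = 1e-6   # finite-difference step, picked by hand

# Formula from the theorem, evaluated at (a, g(a)).
predicted = -dx_f(a, g(a)) / dy_f(a, g(a))
# Central finite-difference approximation to g'(a).
measured = (g(a + eps) - g(a - eps)) / (2 * eps)
print("formula:", predicted, " finite difference:", measured)
```

Here $g(x) = \sqrt{1 - x^2}$ near $a$, so both numbers should be close to $-a/b = -0.75$.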