

18.950 Handout 4. Inverse and Implicit Function Theorems.

Theorem 1 (Inverse Function Theorem). Suppose U ⊂ Rⁿ is open, f : U → Rⁿ is C¹, x₀ ∈ U and df_{x₀} is invertible. Then there exist a neighborhood V of x₀ in U and a neighborhood W of f(x₀) in Rⁿ such that f has a C¹ inverse g = f⁻¹ : W → V. (Thus f(g(y)) = y for all y ∈ W and g(f(x)) = x for all x ∈ V.) Moreover,

dg_y = (df_{g(y)})⁻¹ for all y ∈ W,

and g is smooth whenever f is smooth.

Remark. The theorem says that a continuously differentiable function f between regions in Rⁿ is locally invertible near points where its differential is invertible.

Proof. Without loss of generality, we may assume that x₀ = 0, f(x₀) = 0 and df_{x₀} = I. (Otherwise, replace f with f̃(x) = (df_{x₀})⁻¹(f(x + x₀) − f(x₀)). Note that if the theorem holds with f̃, 0, 0, I and a function g̃ in place of f, x₀, f(x₀), df_{x₀} and g respectively, then it is easily verified that the theorem as stated holds with g(y) = x₀ + g̃((df_{x₀})⁻¹(y − f(x₀))).)

Since df_x is continuous in x at x₀ (see Exercise 1), there exists a number r > 0 such that

x ∈ B_r(0) ⇒ ‖df_x − I‖ ≤ 1/2.

(Recall that for a linear transformation A : Rⁿ → R^m we define the norm of A by ‖A‖ = sup_{|v|≤1} |A(v)|.) Fix y ∈ B_{r/2}(0). Define a function φ by

φ(x) = x − f(x) + y.

Note that dφ_x = I − df_x and hence

‖dφ_x‖ ≤ 1/2 if x ∈ B_r(0).

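Thus, since φ(0) = y (because f(0) = 0),

\[
|\varphi(x)| \le |\varphi(x) - y| + |y|
= \left| \int_0^1 \frac{d}{dt}\,\varphi(tx)\,dt \right| + |y|
\le \int_0^1 |d\varphi_{tx}\cdot x|\,dt + |y|
\le \int_0^1 \|d\varphi_{tx}\|\,|x|\,dt + |y|
\le \frac{r}{2} + \frac{r}{2} = r \tag{1}
\]

whenever x ∈ B_r(0); i.e. φ is a map from B_r(0) into itself. For any x, z ∈ B_r(0),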

\[
|\varphi(z) - \varphi(x)|
= \left| \int_0^1 \frac{d}{dt}\,\varphi(x + t(z - x))\,dt \right|
\le \int_0^1 |d\varphi_{x+t(z-x)}\cdot(z - x)|\,dt
\le \int_0^1 \|d\varphi_{x+t(z-x)}\|\,|z - x|\,dt
\le \frac{1}{2}|z - x|.
\]

Thus φ is a contraction of B_r(0) into itself. The same estimates hold on the closed ball B̄_r(0) (shrinking r slightly if necessary so that B̄_r(0) ⊂ U and ‖df_x − I‖ ≤ 1/2 there), which is a complete metric space, and hence φ has a unique fixed point x_y ∈ B̄_r(0); i.e. there is a unique point x_y ∈ B̄_r(0) with f(x_y) = y. In fact x_y lies in the open ball B_r(0), since

r/2 > |y| = |f(x_y)| ≥ |x_y| − |x_y − f(x_y)| ≥ |x_y| − ½|x_y| = ½|x_y|.

Set W = B_{r/2}(0) and V = f⁻¹(W) ∩ B_r(0). Note then that V is open.

Define g : W → V by g(y) = x_y. Then f(g(y)) = y for all y ∈ W and g(f(x)) = x for all x ∈ V.
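The construction of g(y) as the fixed point of φ(x) = x − f(x) + y can be tried numerically. The following is a minimal sketch, not part of the handout: the map f below is a hypothetical example chosen so that f(0) = 0, df_0 = I and ‖df_x − I‖ ≤ 1/2 on the unit ball (so r = 1), and the iteration count is an arbitrary choice.

```python
import math

def f(x):
    """Hypothetical example map with f(0) = 0 and df_0 = I."""
    x1, x2 = x
    return (x1 + x2 * x2 / 4.0, x2 + x1 * x1 / 4.0)

def local_inverse(y, iterations=80):
    """Approximate g(y) by iterating phi(x) = x - f(x) + y, starting at x = 0."""
    x = (0.0, 0.0)
    for _ in range(iterations):
        fx = f(x)
        x = (x[0] - fx[0] + y[0], x[1] - fx[1] + y[1])
    return x

y = (0.3, -0.2)                      # |y| < r/2 with r = 1
x_y = local_inverse(y)
fx_y = f(x_y)
residual = math.hypot(fx_y[0] - y[0], fx_y[1] - y[1])
print(x_y, residual)                 # residual should be at rounding level
```

Because φ contracts distances by a factor of at most 1/2, each iteration roughly halves the error, which is why a few dozen iterations suffice in this example.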

Next we show that g is differentiable, with dg_y = (df_{g(y)})⁻¹. First note that with ψ : B_r(0) → Rⁿ defined by ψ(x) = x − f(x), we have that for x₁, x₂ ∈ B_r(0),

\[
|x_1 - x_2| - |f(x_1) - f(x_2)|
\le |(x_1 - x_2) - (f(x_1) - f(x_2))|
= |\psi(x_1) - \psi(x_2)|
\le \frac{1}{2}|x_1 - x_2|,
\]

where the last inequality follows by estimating as in (1), using dψ_x = I − df_x. Hence

\[
\frac{1}{2}|x_1 - x_2| \le |f(x_1) - f(x_2)|
\]

for any x₁, x₂ ∈ B_r(0), which implies

\[
|g(y_1) - g(y_2)| \le 2|y_1 - y_2| \tag{2}
\]

for any y₁, y₂ ∈ W = B_{r/2}(0). In particular, g is continuous.


Now fix y ∈ W, and let A = df_{g(y)}. Since W is open, there exists δ > 0 such that y + k ∈ W if k ∈ B_δ(0). Let h = g(y + k) − g(y). Then k = (y + k) − y = f(g(y + k)) − f(g(y)) = f(g(y) + h) − f(g(y)) and hence, for k ∈ B_δ(0) \ {0},

\[
\frac{|g(y + k) - g(y) - A^{-1}k|}{|k|}
= \frac{|A^{-1}(Ah - k)|}{|h|}\,\frac{|h|}{|k|}
\le \|A^{-1}\|\,\frac{|k - Ah|}{|h|}\,\frac{|h|}{|k|}
\le 2\|A^{-1}\|\,\frac{|f(g(y) + h) - f(g(y)) - Ah|}{|h|}, \tag{3}
\]

where the last estimate follows from (2). Note that since g(y + k) = g(y) ⇒ f(g(y + k)) = f(g(y)) ⇒ y + k = y ⇒ k = 0, we have that h ≠ 0 if k ≠ 0. Since A = df_{g(y)}, it follows from the definition of differentiability of f that the right hand side of (3) tends to 0 as h → 0, and hence, since |h| ≤ 2|k| by (2), it follows that

\[
\lim_{k \to 0} \frac{|g(y + k) - g(y) - A^{-1}k|}{|k|} = 0;
\]

i.e. g is differentiable at y and

\[
dg_y = (df_{g(y)})^{-1}. \tag{4}
\]

Finally, note that the function y ↦ dg_y is the composition of the function y ↦ df_{g(y)} and matrix inversion A ↦ A⁻¹. Matrix inversion is a smooth map of the entries, and the function y ↦ df_{g(y)} is continuous since g is continuous and f is C¹. Hence we conclude that y ↦ dg_y is continuous; i.e. that g is C¹. Repeatedly differentiating (4) shows that g is smooth if f is smooth.
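The derivative formula dg_y = (df_{g(y)})⁻¹ can be sanity-checked in one dimension. The sketch below is an illustration only: the function f(x) = x + x²/4 is a hypothetical example (not from the handout) whose local inverse near 0 happens to have the closed form g(y) = −2 + 2√(1 + y), and the step size is an arbitrary choice.

```python
import math

def f(x):
    # Hypothetical example with f(0) = 0 and f'(0) = 1.
    return x + x * x / 4.0

def f_prime(x):
    return 1.0 + x / 2.0

def g(y):
    # Local inverse of f near 0: the root of x + x^2/4 = y that lies near 0.
    return -2.0 + 2.0 * math.sqrt(1.0 + y)

y = 0.3
step = 1e-6
# Central difference approximation of g'(y).
finite_difference = (g(y + step) - g(y - step)) / (2.0 * step)
# Value predicted by the theorem: the inverse of f' evaluated at g(y).
formula = 1.0 / f_prime(g(y))
print(finite_difference, formula)
```

The two printed numbers should agree to many digits; the remaining gap comes from the finite-difference truncation and rounding, not from the formula.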

Exercise 1. Let L(Rⁿ; Rⁿ) be the set of linear transformations from Rⁿ into itself with the metric d(A, B) = ‖A − B‖. (cf. Exercise 10 of handout 1.) Let U ⊂ Rⁿ be open and f : U → Rⁿ be a C¹ function. Show that the map x ↦ df_x is continuous as a map from U into L(Rⁿ; Rⁿ).

Exercise 2. Suppose g : [a, b] → Rⁿ is continuous. Show that

\[
\left| \int_a^b g(t)\,dt \right| \le \int_a^b |g(t)|\,dt,
\]

where |·| denotes the Euclidean norm. You may use without proof that |∫_a^b h(t) dt| ≤ ∫_a^b |h(t)| dt for a scalar valued function h.

Exercise 3. Define f : R → R by f(x) = x/2 + x² sin(1/x) if x ≠ 0 and f(0) = 0. Compute f′(x) for all x ∈ R. Show that f′(0) > 0, yet f is not one-to-one in any neighborhood of 0. This example shows that in the Inverse Function Theorem, the hypothesis that f is C¹ cannot be weakened to the hypothesis that f is differentiable.

Exercise 4. Define f : R² → R² by f(x, y) = (eˣ cos y, eˣ sin y). Show that f is C¹ and that df_{(x,y)} is invertible for all (x, y) ∈ R² and yet f is not a one-to-one function globally. Why doesn't this contradict the Inverse Function Theorem?

Next we prove the Implicit Function Theorem. This theorem gives conditions under which one can solve, locally, a system of equations

f_i(x, y) = 0, i = 1, 2, …, n,

where x ∈ R^m and y ∈ Rⁿ, for y in terms of x. (Thus, y = (y₁, …, y_n) where y₁, …, y_n are regarded as n unknowns, satisfying the n equations f_i(x, y) = 0, i = 1, …, n.) Geometrically, the set of solutions (x, y) to the system of equations is then locally the graph of a function y = g(x). Note that we have from linear algebra that if for each i, the function f_i is linear with constant coefficients in the variables y_j, then whenever the (constant) n × n matrix (∂f_i/∂y_j)_{1≤i,j≤n} is invertible, the system of equations is solvable for y in terms of x. The Implicit Function Theorem says that whenever the f_i are C¹ and this matrix is invertible at a point (a, b), then the system is solvable for y in terms of x locally in a neighborhood of (a, b).
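The linear case mentioned above can be made concrete. The following is a minimal sketch under stated assumptions: the system is taken to be f(x, y) = Bx + Cy + c0 with m = 1 and n = 2, and the names B, C, c0 and their numerical values are hypothetical, chosen only for illustration. Here C plays the role of the matrix (∂f_i/∂y_j), and solving f(x, y) = 0 for y amounts to solving Cy = −(Bx + c0).

```python
def solve_2x2(matrix, rhs):
    """Solve matrix * y = rhs for a 2 x 2 matrix by Cramer's rule."""
    (a, b), (c, d) = matrix
    det = a * d - b * c
    if det == 0:
        raise ValueError("matrix is singular; cannot solve for y")
    return ((d * rhs[0] - b * rhs[1]) / det,
            (a * rhs[1] - c * rhs[0]) / det)

# Hypothetical data for f(x, y) = B*x + C*y + c0, with x in R and y in R^2.
B = (2.0, -1.0)
C = ((3.0, 1.0), (1.0, 2.0))    # the matrix of partials in y; its determinant is 5
c0 = (-1.0, 4.0)

def f(x, y):
    return tuple(B[i] * x + C[i][0] * y[0] + C[i][1] * y[1] + c0[i]
                 for i in range(2))

def g(x):
    """Solve f(x, y) = 0 for y, i.e. C*y = -(B*x + c0)."""
    rhs = tuple(-(B[i] * x + c0[i]) for i in range(2))
    return solve_2x2(C, rhs)

x = 1.5
y = g(x)
residual = max(abs(v) for v in f(x, y))
print(y, residual)
```

In this linear setting g is defined for every x, whereas in the nonlinear setting the theorem only promises a solution near the point (a, b).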

We shall use the following notation: For an Rⁿ-valued function f(x, y) = (f₁(x, y), f₂(x, y), …, f_n(x, y)) in a domain U ⊂ R^{m+n} ≡ R^m × Rⁿ, where x ∈ R^m, y ∈ Rⁿ, we shall denote by d_x f the partial differential represented by the n × m matrix (∂f_i/∂x_j)_{1≤i≤n, 1≤j≤m} and by d_y f the partial differential represented by the n × n matrix (∂f_i/∂y_j)_{1≤i,j≤n}.

Theorem 2 (Implicit Function Theorem). Let U ⊂ R^{m+n} ≡ R^m × Rⁿ be an open set, f : U → Rⁿ a C¹ function, (a, b) ∈ U a point such that f(a, b) = 0 and d_y f|_{(a,b)} is invertible. Then there exist a neighborhood V of (a, b) in U, a neighborhood W of a in R^m and a C¹ function g : W → Rⁿ such that

{(x, y) ∈ V : f(x, y) = 0} = {(x, g(x)) : x ∈ W}.

Moreover,

\[
dg_x = -\,(d_y f)^{-1}\big|_{(x, g(x))}\; d_x f\big|_{(x, g(x))},
\]

and g is smooth if f is smooth.

Proof. Define F : U → R^{m+n} by F(x, y) = (x, f(x, y)). Then F is C¹ in U, F(a, b) = (a, 0) and det dF_{(a,b)} = det d_y f|_{(a,b)} ≠ 0. Hence by the Inverse Function Theorem, F has a C¹ inverse F⁻¹ : W̃ → V for neighborhoods V of (a, b) and W̃ of (a, 0) in R^m × Rⁿ. Set W = {x ∈ R^m : (x, 0) ∈ W̃}. Then W is open in R^m. Note then that if x ∈ W, then (x, 0) ∈ W̃ so that (x, 0) = F(x₁, y₁) where (x₁, y₁) ∈ V is uniquely determined by x. (In fact, by the definition of F, x₁ = x.) Define g : W → Rⁿ by setting y₁ = g(x).

Thus g(x) is defined by F⁻¹(x, 0) = (x, g(x)); i.e. by g(x) = π ∘ F⁻¹(x, 0) where π : R^m × Rⁿ → Rⁿ is the projection map π(x, y) = y. Then {(x, y) ∈ V : f(x, y) = 0} = {(x, y) ∈ V : F(x, y) = (x, 0)} = {(x, g(x)) : x ∈ W}.

Since π is a smooth map and F⁻¹ is C¹, it follows that g is C¹. The formula for dg_x follows by differentiating the identity

f(x, g(x)) ≡ 0 on W

using the chain rule, which gives d_x f|_{(x,g(x))} + d_y f|_{(x,g(x))} dg_x = 0. By repeatedly differentiating this identity, it follows that g is smooth if f is smooth.
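The formula for dg_x can be checked on a familiar example. The sketch below is an illustration only and makes several assumptions not in the handout: it takes f(x, y) = x² + y² − 1 (the unit circle) with (a, b) = (0.6, 0.8), and it computes g(x) by Newton's method in the y variable, which is a substitute technique and not the construction used in the proof. For this f the theorem predicts dg_x = −x/g(x).

```python
def f(x, y):
    # Hypothetical example: the unit circle, with f(0.6, 0.8) = 0.
    return x * x + y * y - 1.0

def df_dx(x, y):
    return 2.0 * x

def df_dy(x, y):
    return 2.0 * y

def g(x, y_start=0.8, iterations=50):
    """Solve f(x, y) = 0 for y near y_start by Newton's method in y."""
    y = y_start
    for _ in range(iterations):
        y -= f(x, y) / df_dy(x, y)
    return y

x = 0.6
y = g(x)
# Derivative predicted by the theorem: -(d_y f)^{-1} d_x f at (x, g(x)).
formula = -df_dx(x, y) / df_dy(x, y)
# Central difference approximation of g'(x) for comparison.
step = 1e-6
finite_difference = (g(x + step) - g(x - step)) / (2.0 * step)
print(y, formula, finite_difference)
```

Starting Newton's method near y = 0.8 selects the upper branch of the circle; starting near y = −0.8 would select the lower branch, which reflects the local nature of the theorem.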