The Inverse Function Theorem


MATH 23b, SPRING 2005 THEORETICAL LINEAR ALGEBRA AND MULTIVARIABLE CALCULUS


The Inverse Function Theorem. Let f : Rⁿ → Rⁿ be continuously differentiable on some open set containing a, and suppose det J f (a) ≠ 0. Then there is some open set V containing a and an open set W containing f (a) such that f : V → W has a continuous inverse f⁻¹ : W → V which is differentiable for all y ∈ W.

Note: As matrices, J (f⁻¹)(y) = [(J f )(f⁻¹(y))]⁻¹.
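As a sanity check on the Note, here is a minimal numerical sketch. The map f(x, y) = (eˣ cos y, eˣ sin y), its hand-computed local inverse, and the sample point (0.3, 0.7) are illustrative assumptions of ours, not part of the original notes.

```python
import math

def f(x, y):
    """Illustrative map f(x, y) = (e^x cos y, e^x sin y) (our choice of example)."""
    return (math.exp(x) * math.cos(y), math.exp(x) * math.sin(y))

def jacobian_f(x, y):
    """Jacobian matrix of f, computed by hand."""
    ex = math.exp(x)
    return [[ex * math.cos(y), -ex * math.sin(y)],
            [ex * math.sin(y),  ex * math.cos(y)]]

def f_inv(u, v):
    """Local inverse of f, valid for preimages with |y| < pi."""
    return (0.5 * math.log(u * u + v * v), math.atan2(v, u))

def jacobian_f_inv(u, v):
    """Jacobian matrix of f_inv, computed by hand."""
    r2 = u * u + v * v
    return [[u / r2, v / r2],
            [-v / r2, u / r2]]

def inverse_2x2(m):
    """Inverse of a 2x2 matrix via the adjugate formula."""
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return [[m[1][1] / det, -m[0][1] / det],
            [-m[1][0] / det, m[0][0] / det]]

def max_abs_diff(p, q):
    """Largest entrywise difference between two 2x2 matrices."""
    return max(abs(p[i][j] - q[i][j]) for i in range(2) for j in range(2))

point = (0.3, 0.7)                       # sample point x
image = f(*point)                        # y = f(x)
lhs = jacobian_f_inv(*image)             # J(f^-1)(y)
rhs = inverse_2x2(jacobian_f(*point))    # [Jf(f^-1(y))]^-1
err = max_abs_diff(lhs, rhs)
roundtrip = f_inv(*image)                # should recover the sample point
```

Under these assumptions the two matrices agree up to floating-point rounding, which is what the formula J (f⁻¹)(y) = [(J f )(f⁻¹(y))]⁻¹ predicts.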

Lemma: Let A ⊂ Rⁿ be an open rectangle, and suppose f : A → Rⁿ is continuously differentiable. If there is some M > 0 such that

$$\left|\frac{\partial f_i}{\partial x_j}(x)\right| \le M, \quad \forall x \in A, \ \forall i, j,$$

then ||f (y) − f (z)|| ≤ n² · M · ||y − z||, ∀y, z ∈ A.

Proof: We write

$$\begin{aligned}
f_i(y) - f_i(z) &= f_i(y_1, \dots, y_n) - f_i(z_1, \dots, z_n) \\
&= \sum_{j=1}^{n} \left[ f_i(y_1, \dots, y_j, z_{j+1}, \dots, z_n) - f_i(y_1, \dots, y_{j-1}, z_j, z_{j+1}, \dots, z_n) \right] \\
&= \sum_{j=1}^{n} \frac{\partial f_i}{\partial x_j}(x_{ij}) \, (y_j - z_j)
\end{aligned}$$

for some x_ij = (y₁, . . . , y_{j−1}, c_j, z_{j+1}, . . . , z_n) where, for each j = 1, . . . , n, the number c_j (which may also depend on i) lies between y_j and z_j, by the single-variable Mean Value Theorem. Each x_ij lies in A because A is a rectangle.

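Then

$$\begin{aligned}
\|f(y) - f(z)\| &\le \sum_{i=1}^{n} |f_i(y) - f_i(z)| \\
&\le \sum_{i=1}^{n} \sum_{j=1}^{n} \left|\frac{\partial f_i}{\partial x_j}(x_{ij})\right| \cdot |y_j - z_j| \\
&\le \sum_{i=1}^{n} \sum_{j=1}^{n} M \cdot \|y - z\| \\
&= n^2 \cdot M \cdot \|y - z\|.
\end{aligned}$$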
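The Lemma can be spot-checked numerically. The sketch below is only an illustration under our own assumptions: the map f(x, y) = (sin x + y/2, xy) on the rectangle A = (0, 1)² is a hypothetical example, for which every partial derivative is bounded by M = 1 on A.

```python
import math
import random

def f(x, y):
    """Illustrative map on A = (0, 1)^2 (our choice, not from the notes)."""
    return (math.sin(x) + 0.5 * y, x * y)

N = 2      # dimension n
M = 1.0    # on A: |cos x| <= 1, |1/2| <= 1, |y| < 1, |x| < 1

def norm(v):
    """Euclidean norm of a 2-vector."""
    return math.hypot(v[0], v[1])

def lemma_holds(p, q):
    """Check ||f(p) - f(q)|| <= n^2 * M * ||p - q|| for one pair of points."""
    fp, fq = f(*p), f(*q)
    lhs = norm((fp[0] - fq[0], fp[1] - fq[1]))
    rhs = N ** 2 * M * norm((p[0] - q[0], p[1] - q[1]))
    return lhs <= rhs + 1e-12   # small slack for rounding

rng = random.Random(0)          # fixed seed so the run is reproducible
pairs = [((rng.random(), rng.random()), (rng.random(), rng.random()))
         for _ in range(1000)]
all_ok = all(lemma_holds(p, q) for p, q in pairs)
```

A passing check is of course not a proof. It only confirms that the constant n² · M is large enough on the sampled pairs (the bound is far from sharp).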


Proof of the Inverse Function Theorem:

(borrowed principally from Spivak’s Calculus on Manifolds)

Let L = J f (a). Then det(L) ≠ 0, and so L⁻¹ exists. Consider the composite function L⁻¹ ◦ f : Rⁿ → Rⁿ. Since L⁻¹ is linear, it is its own derivative, and the Chain Rule gives:

$$J(L^{-1} \circ f)(a) = J(L^{-1})(f(a)) \circ Jf(a) = L^{-1} \circ Jf(a) = L^{-1} \circ L,$$

which is the identity. Since L is invertible, the theorem is true for L⁻¹ ◦ f if and only if it is true for f, and hence it suffices to prove it in the case when L = I.
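To spell out one direction of this reduction, write g = L⁻¹ ◦ f (the name g is ours, introduced only for this remark). If g has a differentiable local inverse g⁻¹ near g(a), then a sketch of how f inherits one is:

```latex
f = L \circ g
\quad\Longrightarrow\quad
f^{-1} = g^{-1} \circ L^{-1},
\qquad
J(f^{-1})(y) = J(g^{-1})(L^{-1}y) \circ L^{-1}.
```

Here the open set on which f⁻¹ is defined is the image under L of the open set on which g⁻¹ is defined, which is again open because L is an invertible linear map.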

Suppose f (a + h) = f (a) for some h ≠ 0. Then, since L = I,

$$\frac{\|f(a + h) - f(a) - L(h)\|}{\|h\|} = \frac{\|h\|}{\|h\|} = 1.$$

On the other hand, we have

$$\lim_{\|h\| \to 0} \frac{\|f(a + h) - f(a) - L(h)\|}{\|h\|} = 0,$$

so this quotient is less than 1 for all sufficiently small h ≠ 0. This is a contradiction for such h, and hence there must be some open neighborhood/rectangle U around a in which f (a + h) ≠ f (a), ∀a + h ∈ U, h ≠ 0.

Furthermore, we may choose this neighborhood U small enough so that:

• det(J f (x)) ≠ 0, ∀x ∈ U

• $\left|\frac{\partial f_i}{\partial x_j}(x) - \frac{\partial f_i}{\partial x_j}(a)\right| < \frac{1}{2n^2}$, ∀i, j, ∀x ∈ U

since these are conditions on n² + 1 continuous functions!

Claim 1: ||x₁− x₂|| ≤ 2 · ||f (x₁) − f (x₂)||, ∀x₁, x₂ ∈ U

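Proof of Claim 1: First, we let g(x) = f (x) − x. Since J f (a) = I, the partial derivatives of the identity map x ↦ x are exactly the numbers ∂f_i/∂x_j(a), so by construction and the second fact above, we have

$$\left|\frac{\partial g_i}{\partial x_j}(x)\right| = \left|\frac{\partial f_i}{\partial x_j}(x) - \frac{\partial f_i}{\partial x_j}(a)\right| \le \frac{1}{2n^2},$$

and so we apply the Lemma with M = 1/(2n²):

$$\begin{aligned}
\|x_1 - x_2\| - \|f(x_1) - f(x_2)\| &\le \|(f(x_1) - x_1) - (f(x_2) - x_2)\| \\
&= \|g(x_1) - g(x_2)\| \\
&\le n^2 \cdot \tfrac{1}{2n^2} \cdot \|x_1 - x_2\| = \tfrac{1}{2} \cdot \|x_1 - x_2\|,
\end{aligned}$$

and so, combining these inequalities, we have

$$\tfrac{1}{2} \cdot \|x_1 - x_2\| \le \|f(x_1) - f(x_2)\|.$$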
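Claim 1 can be illustrated numerically. The sketch below assumes a hypothetical example of ours, f(x, y) = (x + 0.1y², y + 0.1x²) with a = (0, 0), so that J f (a) = I. On U = (−0.5, 0.5)² the off-diagonal partials 0.2y and 0.2x differ from their values at a by less than 1/(2n²) = 1/8, so the hypotheses used in the proof hold.

```python
import math
import random

def f(x, y):
    """Illustrative map with Jf(0, 0) = I (our choice, not from the notes)."""
    return (x + 0.1 * y * y, y + 0.1 * x * x)

def norm(v):
    """Euclidean norm of a 2-vector."""
    return math.hypot(v[0], v[1])

def claim1_holds(p, q):
    """Check ||p - q|| <= 2 * ||f(p) - f(q)|| for one pair of points in U."""
    fp, fq = f(*p), f(*q)
    lhs = norm((p[0] - q[0], p[1] - q[1]))
    rhs = 2.0 * norm((fp[0] - fq[0], fp[1] - fq[1]))
    return lhs <= rhs + 1e-12   # small slack for rounding

rng = random.Random(0)          # fixed seed so the run is reproducible

def sample_point():
    """Random point of U = (-0.5, 0.5)^2."""
    return (rng.uniform(-0.5, 0.5), rng.uniform(-0.5, 0.5))

pairs = [(sample_point(), sample_point()) for _ in range(1000)]
claim1_ok = all(claim1_holds(p, q) for p, q in pairs)
```

As with any sampled check, this only illustrates the inequality on finitely many pairs. The proof above is what guarantees it on all of U.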


Now consider the set ∂U, which is compact since it is closed and U is bounded. We know by the reasoning in the second paragraph of the proof (shrinking U if necessary so that its closure lies in the neighborhood found there) that if x ∈ ∂U, then f (x) ≠ f (a). Hence ∃d > 0 such that ||f (x) − f (a)|| ≥ d, ∀x ∈ ∂U. (Since both f and the taking of norms are continuous functions, the expression ||f (x) − f (a)|| attains its non-zero minimum on the compact set ∂U.)

We construct the set W ⊂ Rⁿ, thinking of it as a subset of the range of f, as follows:

$$W = \left\{ y \in \mathbb{R}^n : \|y - f(a)\| < \tfrac{d}{2} \right\} = B_{d/2}(f(a)).$$

By its construction and the use of the positive real number d, we see that if y ∈ W and x ∈ ∂U , then

||y − f (a)|| < ||y − f (x)||. (1)
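For completeness, inequality (1) follows from the reverse triangle inequality together with the definitions of d and W. For y ∈ W and x ∈ ∂U:

```latex
\|y - f(x)\|
\;\ge\; \|f(x) - f(a)\| - \|y - f(a)\|
\;>\; d - \tfrac{d}{2}
\;=\; \tfrac{d}{2}
\;>\; \|y - f(a)\|.
```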

Claim 2: Given y ∈ W , there is a unique x ∈ U such that f (x) = y.

Proof of Claim 2:

Existence:

Consider h : Ū → R, defined on the closure Ū of U by h(x) = ||y − f (x)||². Written in coordinates, this is

$$h(x) = \sum_{i=1}^{n} (y_i - f_i(x))^2.$$

Note that h is continuous and hence attains its minimum on the compact set Ū. This minimum does not occur on the boundary ∂U, because inequality (1) gives h(a) < h(x) for every x ∈ ∂U, and hence it must occur in the interior U. Since h is also differentiable, we must have ∇h(x) = 0 at the minimum, and hence:

$$0 = \frac{\partial h}{\partial x_j}(x) = -\sum_{i=1}^{n} 2 \cdot (y_i - f_i(x)) \cdot \frac{\partial f_i}{\partial x_j}(x), \quad \forall j.$$

In other words, collecting this information over the various i and j, we have

$$0 = Jf(x)^{T} \cdot (y - f(x)),$$

but since we have assumed that det J f (x) ≠ 0 for any x ∈ U, it follows that J f (x), and hence its transpose, is invertible, and therefore y − f (x) = 0.
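The existence argument is not constructive, but the function h it uses suggests a numerical experiment: minimize h by gradient descent and see whether the minimizer solves f (x) = y. The sketch below is our own illustration, not the method of the proof. The map f(x, y) = (x + 0.1y², y + 0.1x²), the target y = (0.1, −0.05), the step size, and the iteration count are all assumed values chosen for the example.

```python
import math

def f(x):
    """Illustrative map with Jf(0, 0) = I (our choice, not from the notes)."""
    return (x[0] + 0.1 * x[1] ** 2, x[1] + 0.1 * x[0] ** 2)

def jacobian(x):
    """Jacobian matrix of f at x, computed by hand."""
    return [[1.0, 0.2 * x[1]],
            [0.2 * x[0], 1.0]]

def grad_h(x, y):
    """Gradient of h(x) = ||y - f(x)||^2, namely -2 * Jf(x)^T (y - f(x))."""
    fx = f(x)
    r = (y[0] - fx[0], y[1] - fx[1])
    J = jacobian(x)
    return (-2.0 * (J[0][0] * r[0] + J[1][0] * r[1]),
            -2.0 * (J[0][1] * r[0] + J[1][1] * r[1]))

def minimize_h(y, start=(0.0, 0.0), step=0.25, iterations=200):
    """Plain gradient descent on h, starting from a = (0, 0)."""
    x = start
    for _ in range(iterations):
        g = grad_h(x, y)
        x = (x[0] - step * g[0], x[1] - step * g[1])
    return x

y = (0.1, -0.05)                 # target value near f(a) = (0, 0)
x_star = minimize_h(y)           # approximate minimizer of h
fx = f(x_star)
residual = math.hypot(y[0] - fx[0], y[1] - fx[1])   # ||y - f(x_star)||
```

In this example the residual shrinks to rounding level, consistent with the conclusion that the interior minimum of h is a point where y − f (x) = 0.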

Uniqueness:

We use Claim 1. Suppose y = f (x₁) = f (x₂).

Then ||x₁− x₂|| ≤ 2 · ||f (x₁) − f (x₂)|| = 0, and hence x₁ = x₂.


By Claim 2, if we define V = U ∩ f⁻¹(W ), then f : V → W has an inverse!

It remains to show that f⁻¹ is continuous and differentiable. Even though continuity would follow from differentiability, we do this in two steps because we will use the continuity to help prove the differentiability.

Claim 3: f⁻¹ is continuous.

Proof of Claim 3:

For y₁, y₂ ∈ W, find x₁, x₂ ∈ U such that f (x₁) = y₁ and f (x₂) = y₂. Claim 1 implies that ||x₁ − x₂|| ≤ 2 · ||f (x₁) − f (x₂)||, or in other words, that ||f⁻¹(y₁) − f⁻¹(y₂)|| ≤ 2 · ||y₁ − y₂||.

It is now easy to see that given ε > 0, we need only choose δ = ε/2 to guarantee that if ||y₁− y₂|| < δ, then ||f⁻¹(y₁) − f⁻¹(y₂)|| < ε.

Claim 4: f⁻¹ is differentiable.

Proof of Claim 4:

Let x ∈ V, let A = J f (x), and let y = f (x) ∈ W. We claim that J f⁻¹(y) = A⁻¹.

Define ϕ(h) = f (x + h) − f (x) − A(h). Then

$$\lim_{\|h\| \to 0} \frac{\|\phi(h)\|}{\|h\|} = 0,$$

by the differentiability of f.

Since det(A) = det J f (x) ≠ 0 by hypothesis, we know that A⁻¹ exists, and it is linear since A is. Then:

$$A^{-1}(f(x + h) - f(x)) = h + A^{-1}(\phi(h)) = [(x + h) - x] + A^{-1}(\phi(h)).$$

Letting y = f (x) and y₁ = f (x + h) on both sides yields:

$$A^{-1}(y_1 - y) = [f^{-1}(y_1) - f^{-1}(y)] + A^{-1}(\phi(f^{-1}(y_1) - f^{-1}(y))).$$

Rearranging:

$$-A^{-1}(\phi(f^{-1}(y_1) - f^{-1}(y))) = [f^{-1}(y_1) - f^{-1}(y)] - A^{-1}(y_1 - y). \tag{2}$$


To show differentiability, we need:

$$\lim_{\|y_1 - y\| \to 0} \frac{\|f^{-1}(y_1) - f^{-1}(y) - A^{-1}(y_1 - y)\|}{\|y_1 - y\|} = 0,$$

but by equation (2) above, this is the same as showing:

$$\lim_{\|y_1 - y\| \to 0} \frac{\|A^{-1}(\phi(f^{-1}(y_1) - f^{-1}(y)))\|}{\|y_1 - y\|} = 0.$$

Since A⁻¹ is linear, there is a constant C with ||A⁻¹(v)|| ≤ C · ||v|| for all v, so it suffices to show that:

$$\lim_{\|y_1 - y\| \to 0} \frac{\|\phi(f^{-1}(y_1) - f^{-1}(y))\|}{\|y_1 - y\|} = 0, \tag{3}$$

so we factor the expression inside the limit as follows:

$$\frac{\|\phi(f^{-1}(y_1) - f^{-1}(y))\|}{\|y_1 - y\|} = \frac{\|\phi(f^{-1}(y_1) - f^{-1}(y))\|}{\|f^{-1}(y_1) - f^{-1}(y)\|} \cdot \frac{\|f^{-1}(y_1) - f^{-1}(y)\|}{\|y_1 - y\|}.$$

The first term on the right tends to 0 because of how we defined ϕ and the fact that the continuity of f⁻¹ means that f⁻¹(y₁) → f⁻¹(y).

Observing that the second term on the right is less than or equal to 2 (by Claim 1) enables us to use the Squeeze Theorem and conclude that the product on the right tends to 0, which establishes equation (3).

End Proof of Inverse Function Theorem.
