
Definitions

(a) Let $p = (p_1, \dots, p_n) \in \mathbb{R}^n$ and let $r > 0$. Then

$$B_r(p) = \Big\{ x = (x_1, \dots, x_n) \in \mathbb{R}^n \;\Big|\; \|x - p\| = \Big( \sum_{i=1}^{n} (x_i - p_i)^2 \Big)^{1/2} < r \Big\}$$

is the (open) ball in $\mathbb{R}^n$ with center $p$ and radius $r$.

(b) $E$ is called an open subset of $\mathbb{R}^n$ if for each $p \in E$ there is an $r > 0$ such that $B_r(p) \subset E$.

(c) $K$ is called a closed subset of $\mathbb{R}^n$ if $K^c = \{x \in \mathbb{R}^n \mid x \notin K\}$ is an open subset of $\mathbb{R}^n$.

(d) Let $E$ be a subset of $\mathbb{R}^n$. A point $p \in E$ is called an interior point of $E$ if there is an $r > 0$ such that $B_r(p) \subset E$.

(e) Let $E$ be a subset of $\mathbb{R}^n$. A point $p$ (not necessarily in $E$) is called a boundary point of $E$ if for each $r > 0$ we have $B_r(p) \cap E \neq \emptyset$ and $B_r(p) \cap E^c \neq \emptyset$.

(f) Given a subset E ⊂ Rⁿ, let E^o denote the set of interior points of E.

(g) Given a subset E ⊂ Rⁿ, let ∂E denote the set of boundary points of E.

(h) Given a subset $E \subset \mathbb{R}^n$, let $\overline{E} = E \cup \partial E$ denote the closure of $E$.

Remarks

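(a) For each $p \in E$, either

(i) there is an $r > 0$ such that $B_r(p) \subset E$, in which case $p$ is an interior point of $E$,

or

(ii) there is no $r > 0$ such that $B_r(p) \subset E$, i.e. for each $r > 0$,

$$B_r(p) \not\subset E \iff B_r(p) \cap E^c \neq \emptyset.$$

Since $p \in E$, we also have $p \in B_r(p) \cap E$, so $B_r(p) \cap E \neq \emptyset$ for each $r > 0$, and $p$ is a boundary point of $E$.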

Hence, if p ∈ E then p is either an interior point or a boundary point of E.

(b) $E^o$ is an open subset of $\mathbb{R}^n$, i.e. the set of interior points of $E$ is an open subset of $\mathbb{R}^n$.

(c) $\overline{E} = E \cup \partial E$ is a closed subset of $\mathbb{R}^n$, i.e. the closure of $E$ is a closed subset of $\mathbb{R}^n$.

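Definition Let $f$ have domain $E$ in $\mathbb{R}^n$ and range in $\mathbb{R}$, and let $p$ be an interior point of $E$. We say that $f$ is continuous at $p$ if

$$\lim_{x \to p} f(x) = f(p).$$

Equivalently, we say that $f$ is continuous at $p$ if

$$\forall\, \varepsilon > 0 \ \exists\, \delta(\varepsilon) > 0 \text{ such that if } x \in E \text{ and } \|x - p\| \le \delta(\varepsilon), \text{ then } |f(x) - f(p)| < \varepsilon,$$

or

$$\forall\, \varepsilon > 0 \ \exists\, \delta(\varepsilon) > 0 \text{ such that } f\big(B_{\delta(\varepsilon)}(p)\big) \subset B_{\varepsilon}\big(f(p)\big).$$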


Definition Let $f$ have domain $E$ in $\mathbb{R}^n$ and range in $\mathbb{R}$, and let $p$ be an interior point of $E$. For each $1 \le i \le n$, the partial derivative of $f$ with respect to $x_i$, denoted $f_{x_i} = \dfrac{\partial f}{\partial x_i}$, is said to exist at $p$ if

$$\lim_{h \to 0} \frac{f(p_1, \dots, p_{i-1}, p_i + h, p_{i+1}, \dots, p_n) - f(p_1, \dots, p_{i-1}, p_i, p_{i+1}, \dots, p_n)}{h}$$

exists.

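Remark Note that if $f_{x_i}(p)$ exists for all $i = 1, \dots, n$, then the restriction of $f$ to each coordinate line through $p$ is continuous at $p$. This alone does not imply that $f$ is continuous at $p$. For example, the function $f(x, y) = \dfrac{xy}{x^2 + y^2}$ for $(x, y) \neq (0, 0)$, $f(0, 0) = 0$, satisfies $f_x(0, 0) = f_y(0, 0) = 0$ but is not continuous at $(0, 0)$, since $f(t, t) = \tfrac{1}{2}$ for all $t \neq 0$.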

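Definition Let $f$ have domain $E$ in $\mathbb{R}^n$ and range in $\mathbb{R}$, and let $p$ be an interior point of $E$. We say that $f$ is differentiable at $p$ if there exists a linear function $L : \mathbb{R}^n \to \mathbb{R}$ such that for every $\varepsilon > 0$ there exists a $\delta(\varepsilon) > 0$ such that if $x \in \mathbb{R}^n$ is any vector satisfying $\|x - p\| \le \delta(\varepsilon)$, then $x \in E$ and

$$|f(x) - f(p) - L(x - p)| \le \varepsilon \|x - p\|.$$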

Equivalently, we say that $f$ is differentiable at $p$ if there exists a $1 \times n$ matrix $L : \mathbb{R}^n \to \mathbb{R}$ such that

$$\lim_{x \to p} \frac{|f(x) - f(p) - L(x - p)|}{\|x - p\|} = \lim_{\|x - p\| \to 0} \frac{|f(x) - f(p) - L(x - p)|}{\|x - p\|} = 0.$$

A function $g : E \to \mathbb{R}$ is said to be of lower order of magnitude than $\|x - p\|$, denoted $g(x) = o(\|x - p\|)$ and read "$g(x)$ is little-o of $\|x - p\|$", if

$$\lim_{x \to p} \frac{|g(x)|}{\|x - p\|} = 0.$$

In terms of the vanishing order, $f$ is differentiable at $p$ if there is a linear function $L$ such that

$$f(x) - f(p) - L(x - p) = o(\|x - p\|).$$
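The little-o condition can be observed numerically. The following is a minimal sketch, assuming a hypothetical sample function $f(x, y) = x^2 + 3y$ at $p = (1, 2)$ with candidate matrix $L = (2, 3)$; neither the function nor the point comes from the notes. It evaluates the quotient $|f(x) - f(p) - L(x - p)| / \|x - p\|$ along one direction and shows it shrinking with the step size.

```python
import math

def f(x, y):
    # Hypothetical sample function chosen only for illustration.
    return x * x + 3.0 * y

def remainder_quotient(p, L, x):
    """Return |f(x) - f(p) - L(x - p)| / ||x - p|| for a 1 x 2 matrix L."""
    dx, dy = x[0] - p[0], x[1] - p[1]
    linear_part = L[0] * dx + L[1] * dy
    return abs(f(*x) - f(*p) - linear_part) / math.hypot(dx, dy)

p = (1.0, 2.0)
L = (2.0, 3.0)  # candidate derivative (f_x(p), f_y(p)) for the sample f

# Approach p along the direction (1, 1) with shrinking step sizes t.
steps = (1e-1, 1e-2, 1e-3)
quotients = [remainder_quotient(p, L, (p[0] + t, p[1] + t)) for t in steps]
print(quotients)
```

For this sample the remainder is exactly $t^2$ and $\|x - p\| = \sqrt{2}\,t$, so the quotient equals $t/\sqrt{2}$. A single direction only illustrates the definition; the limit in the definition ranges over all ways of approaching $p$.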

Remarks

(a) Note that if $f$ is differentiable at $p$, then the partial derivatives $f_{x_i}(p)$ exist for all $i = 1, \dots, n$.

Proof Since $f$ is differentiable at $p$, there exists a $1 \times n$ matrix $L : \mathbb{R}^n \to \mathbb{R}$ such that

$$\lim_{x \to p} \frac{|f(x) - f(p) - L(x - p)|}{\|x - p\|} = 0.$$

For each $1 \le i \le n$, by setting $x = p + h e_i$, where $e_i$ is the unit vector in the positive $x_i$ direction, we have

$$\|x - p\| = \|h e_i\| = |h| \implies \big( x \to p \iff h \to 0 \big),$$

and

$$\begin{aligned}
0 &= \lim_{x \to p} \frac{|f(x) - f(p) - L(x - p)|}{\|x - p\|} \\
&= \lim_{h \to 0} \frac{|f(p + h e_i) - f(p) - L(h e_i)|}{|h|} \\
&= \lim_{h \to 0} \left| \frac{f(p + h e_i) - f(p) - h L(e_i)}{h} \right| \quad \text{since } L \text{ is linear} \\
&= \lim_{h \to 0} \left| \frac{f(p + h e_i) - f(p)}{h} - L(e_i) \right| \\
&= \left| \lim_{h \to 0} \frac{f(p + h e_i) - f(p)}{h} - L(e_i) \right|.
\end{aligned}$$

Hence

$$f_{x_i}(p) = \frac{\partial f}{\partial x_i}(p) = \lim_{h \to 0} \frac{f(p + h e_i) - f(p)}{h} = L(e_i)$$

exists for each $1 \le i \le n$.

(b) If $L_1$, $L_2$ are linear functions such that $f(x) - f(p) - L_1(x - p) = o(\|x - p\|)$ and $f(x) - f(p) - L_2(x - p) = o(\|x - p\|)$, then $L_1 = L_2$.

Proof Observe that for $x = p + t e_i$ we have

$$x \to p \iff t \to 0,$$

and that if $L : \mathbb{R}^n \to \mathbb{R}$ is a $1 \times n$ matrix and $e_i$ denotes the unit vector in the positive $i$-th coordinate direction of $\mathbb{R}^n$, then

$$L(e_i) = \langle L, e_i \rangle = \text{the } i\text{-th entry of } L.$$

Since

$$\begin{aligned}
0 &\le \lim_{x \to p} \frac{|(L_1 - L_2)(x - p)|}{\|x - p\|} \\
&= \lim_{x \to p} \frac{\big| [f(x) - f(p) - L_2(x - p)] - [f(x) - f(p) - L_1(x - p)] \big|}{\|x - p\|} \\
&\le \lim_{x \to p} \frac{|f(x) - f(p) - L_2(x - p)|}{\|x - p\|} + \lim_{x \to p} \frac{|f(x) - f(p) - L_1(x - p)|}{\|x - p\|} \\
&= 0,
\end{aligned}$$

we get $L_1 = L_2$ by taking $x - p = t e_i$ and using the fact that

$$0 = \lim_{t \to 0} \frac{|(L_1 - L_2)(t e_i)|}{\|t e_i\|} = |(L_1 - L_2)(e_i)|,$$

which implies that the $i$-th entry of $L_1$ is equal to the $i$-th entry of $L_2$ for each $1 \le i \le n$.

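(c) If such an $L$ exists at $p \in E$, i.e. if $f$ is differentiable at $p$, then $L$ is unique by part (b); it is called the derivative of $f$ at $p$ and is denoted $Df(p)$. By part (a), the linear function $Df(p) : \mathbb{R}^n \to \mathbb{R}$ is given by

$$Df(p)(x - p) = \Big\langle \Big( \frac{\partial f}{\partial x_1}(p), \frac{\partial f}{\partial x_2}(p), \cdots, \frac{\partial f}{\partial x_n}(p) \Big),\ (x_1 - p_1, x_2 - p_2, \cdots, x_n - p_n) \Big\rangle.$$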


(d) When $n = 1$, $f$ is differentiable at $p \in (a, b) \subset \mathbb{R}$ if there exists $L \in \mathbb{R}$ such that

$$\begin{aligned}
& \lim_{x \to p} \frac{|f(x) - f(p) - L(x - p)|}{|x - p|} = \lim_{x \to p} \left| \frac{f(x) - f(p) - L(x - p)}{x - p} \right| = 0 \\
\iff\ & \lim_{x \to p} \frac{f(x) - f(p) - L(x - p)}{x - p} = 0 \\
\iff\ & \lim_{x \to p} \frac{f(x) - f(p)}{x - p} = L,
\end{aligned}$$

and the derivative of $f$ at $p$ is defined to be $f'(p) = L$.

Mean Value Theorem Let $f$ be defined on an open subset $E$ of $\mathbb{R}^n$ and have values in $\mathbb{R}$. Suppose that $E$ contains the points $a$, $b$ and the line segment $S$ joining them, and that $f$ is differentiable at every point of $S$. Then there exists a point $c \in S$ such that

$$f(b) - f(a) = Df(c)(b - a).$$

Proof Let $\varphi : \mathbb{R} \to \mathbb{R}^n$ be defined by

$$\varphi(t) = (1 - t)a + tb = a + t(b - a) \quad \text{for } t \in \mathbb{R},$$

so that $\varphi(0) = a$, $\varphi(1) = b$, and $\varphi(t) \in S \subset E$ for $t \in [0, 1]$. Since $E$ is open and $\varphi$ is continuous (indeed differentiable) on $\mathbb{R}$, there is a $\gamma > 0$ such that $\varphi(t) \in E$ for all $t$ in the open interval $(-\gamma, 1 + \gamma) \supset [0, 1]$. Let $F : (-\gamma, 1 + \gamma) \to \mathbb{R}$ be defined by

$$F(t) = (f \circ \varphi)(t) = f\big((1 - t)a + tb\big).$$

Since $F$ is continuous on $[0, 1]$ and differentiable on $(0, 1)$, by the one-variable Mean Value Theorem and the Chain Rule, there exists $0 < s < 1$ such that

$$f(b) - f(a) = F(1) - F(0) = F'(s)(1 - 0) = Df\big((1 - s)a + sb\big)\,\varphi'(s) = Df(c)(b - a),$$

where $c = \varphi(s) = (1 - s)a + sb \in S$.
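To make the point $c$ concrete, here is a minimal sketch that locates it numerically for a hypothetical sample function $f(x, y) = x^3 + y^2$ with $a = (0, 0)$ and $b = (1, 1)$. These choices are assumptions made for illustration, not part of the notes. Bisection is used only because, for this sample, $s \mapsto Df(\varphi(s))(b - a)$ is increasing on $[0, 1]$; the theorem itself only asserts that some $s$ exists.

```python
def f(x, y):
    # Hypothetical sample function chosen only for illustration.
    return x**3 + y**2

def grad_f(x, y):
    # Entries of the 1 x 2 matrix Df(x, y) for the sample f.
    return (3.0 * x**2, 2.0 * y)

a, b = (0.0, 0.0), (1.0, 1.0)
d = (b[0] - a[0], b[1] - a[1])  # the vector b - a

def phi(t):
    # Parametrization of the segment S from a to b.
    return (a[0] + t * d[0], a[1] + t * d[1])

def gap(s):
    # Df(phi(s))(b - a) - (f(b) - f(a)); a zero of this gives a valid s.
    g = grad_f(*phi(s))
    return g[0] * d[0] + g[1] * d[1] - (f(*b) - f(*a))

# Bisection on [0, 1]; for this sample gap(0) < 0 < gap(1).
lo, hi = 0.0, 1.0
for _ in range(60):
    mid = 0.5 * (lo + hi)
    if gap(mid) < 0:
        lo = mid
    else:
        hi = mid

s = 0.5 * (lo + hi)
c = phi(s)
print(s, c)
```

For this sample, $F(t) = t^3 + t^2$, so $s$ solves $3s^2 + 2s = 2$, and the computed value can be compared with the positive root $(-2 + \sqrt{28})/6$.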

Theorem Let $f$ be defined on an open subset $E$ of $\mathbb{R}^n$ and have values in $\mathbb{R}$. If the partial derivatives $f_{x_i}$, $i = 1, \dots, n$, exist in a neighborhood of $p \in E$ and are continuous at $p$, then $f$ is differentiable at $p$.

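Example Let $f : \mathbb{R}^2 \to \mathbb{R}$ be defined by

$$f(x, y) = \begin{cases} \dfrac{x^3 - y^3}{x^2 + y^2} & \text{if } (x, y) \neq (0, 0), \\[2mm] 0 & \text{if } (x, y) = (0, 0). \end{cases}$$

Then

$$f_x(x, y) = \begin{cases} \dfrac{x^4 + 3x^2 y^2 + 2x y^3}{(x^2 + y^2)^2} & \text{if } (x, y) \neq (0, 0), \\[2mm] 1 & \text{if } (x, y) = (0, 0), \end{cases}$$

and

$$f_y(x, y) = \begin{cases} -\dfrac{y^4 + 3y^2 x^2 + 2y x^3}{(x^2 + y^2)^2} & \text{if } (x, y) \neq (0, 0), \\[2mm] -1 & \text{if } (x, y) = (0, 0). \end{cases}$$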

Note that f_x, f_y are not continuous at (0, 0).

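Moreover, since a derivative $L$ of $f$ at $(0, 0)$ would have entries $L(e_1) = f_x(0, 0) = 1$ and $L(e_2) = f_y(0, 0) = -1$, the only candidate is $L = (1, -1)$. For $(x, y) \neq (0, 0)$,

$$\frac{|f(x, y) - f(0, 0) - \langle (1, -1), (x, y) \rangle|}{\|(x, y)\|} = \frac{|x^3 - y^3 - (x - y)(x^2 + y^2)|}{(x^2 + y^2)^{3/2}} = \frac{|x^2 y - x y^2|}{(x^2 + y^2)^{3/2}}.$$

By setting $y = -x \to 0$ we get

$$\lim_{(x, -x) \to (0, 0)} \frac{|x^2 y - x y^2|}{(x^2 + y^2)^{3/2}} = \lim_{x \to 0} \frac{|-x^2 x - x x^2|}{(x^2 + x^2)^{3/2}} = \frac{1}{\sqrt{2}} \neq 0,$$

so the quotient does not tend to $0$ as $(x, y) \to (0, 0)$.

Hence $f$ is not differentiable at $(0, 0)$.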
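The failure of differentiability in this example can be checked numerically. The sketch below evaluates the quotient $|f(x, y) - f(0, 0) - \langle (1, -1), (x, y) \rangle| / \|(x, y)\|$ along the line $y = -x$ and, for contrast, along the $x$-axis. The helper name `quotient` and the sample step sizes are illustrative choices.

```python
import math

def f(x, y):
    # The function of the example, with the value 0 assigned at the origin.
    if x == 0.0 and y == 0.0:
        return 0.0
    return (x**3 - y**3) / (x**2 + y**2)

def quotient(x, y):
    # |f(x, y) - f(0, 0) - <(1, -1), (x, y)>| / ||(x, y)||
    return abs(f(x, y) - f(0.0, 0.0) - (x - y)) / math.hypot(x, y)

steps = (1e-1, 1e-3, 1e-6)
along_antidiagonal = [quotient(t, -t) for t in steps]  # path y = -x
along_x_axis = [quotient(t, 0.0) for t in steps]       # path y = 0

print(along_antidiagonal)  # stays near 1/sqrt(2)
print(along_x_axis)        # stays near 0
```

The quotient stays near $1/\sqrt{2}$ along $y = -x$ while it is essentially $0$ along the $x$-axis, so it has no limit at the origin, in agreement with the computation above.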

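Lemma Suppose that $f$ is defined on an open ball $B_r(o)$ of $o = (0, 0) \in \mathbb{R}^2$ with values in $\mathbb{R}$, that the partial derivatives $D_x f = \dfrac{\partial f}{\partial x} = f_x$ and $D_y f = \dfrac{\partial f}{\partial y} = f_y$ exist in $B_r(o)$, and that $D_{yx} f = \dfrac{\partial^2 f}{\partial y \partial x} = f_{xy}$ exists in $B_r(o)$ and is continuous at $(0, 0)$. Let $A : B_r(o) \to \mathbb{R}$ be defined by

$$A(h, k) = f(h, k) - f(h, 0) - f(0, k) + f(0, 0) \quad \text{for } (h, k) \in B_r(o).$$

Then

$$f_{xy}(0, 0) = D_{yx} f(0, 0) = \lim_{(h, k) \to (0, 0)} \frac{A(h, k)}{hk}.$$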

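Proof Given $\varepsilon > 0$, since $B_r(o)$ is open and $D_{yx} f$ is continuous at $(0, 0)$, there exists a $\delta > 0$ such that

if $|h| < \delta$ and $|k| < \delta$ then $(h, k) \in B_r(o)$ and $|D_{yx} f(h, k) - D_{yx} f(0, 0)| < \varepsilon$.

Fix $h$ and $k$ with $0 < |h| < \delta$ and $0 < |k| < \delta$, and define the map $G : (-\delta, \delta) \to \mathbb{R}$ by

$$G(t) = f(t, k) - f(t, 0) \quad \text{for each } |t| < \delta \implies A(h, k) = G(h) - G(0).$$

By hypothesis, $D_x f$ exists in $B_r(o)$ and hence $G$ is differentiable on $|t| < \delta$, with $G'(t) = D_x f(t, k) - D_x f(t, 0)$.

By the Mean Value Theorem, there exists $h_0$ between $0$ and $h$, with $0 < |h_0| < |h|$, such that

$$A(h, k) = G(h) - G(0) = h\, G'(h_0) = h \big[ D_x f(h_0, k) - D_x f(h_0, 0) \big].$$

Using the Mean Value Theorem again, this time for the function $s \mapsto D_x f(h_0, s)$, there exists $k_0$ between $0$ and $k$, with $0 < |k_0| < |k|$, such that

$$D_x f(h_0, k) - D_x f(h_0, 0) = k\, D_{yx} f(h_0, k_0) \implies A(h, k) = h k\, D_{yx} f(h_0, k_0)$$

for all $0 < |h| < \delta$ and $0 < |k| < \delta$. Since $|h_0| < \delta$ and $|k_0| < \delta$, it follows that

$$\left| \frac{A(h, k)}{hk} - D_{yx} f(0, 0) \right| = |D_{yx} f(h_0, k_0) - D_{yx} f(0, 0)| < \varepsilon.$$

This completes the proof of

$$f_{xy}(0, 0) = D_{yx} f(0, 0) = \lim_{(h, k) \to (0, 0)} \frac{A(h, k)}{hk}.$$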
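The conclusion of the Lemma can be illustrated numerically. This is a minimal sketch, assuming a hypothetical smooth sample function $f(x, y) = e^x \sin y$ (not from the notes), for which $f_{xy}(0, 0) = e^0 \cos 0 = 1$. It evaluates the second difference $A(h, k)$ and the ratio $A(h, k)/(hk)$ for shrinking $h = k$.

```python
import math

def f(x, y):
    # Hypothetical smooth sample function; its mixed partial at (0, 0) is 1.
    return math.exp(x) * math.sin(y)

def A(h, k):
    # Second difference from the Lemma.
    return f(h, k) - f(h, 0.0) - f(0.0, k) + f(0.0, 0.0)

steps = (1e-1, 1e-2, 1e-3)
ratios = [A(t, t) / (t * t) for t in steps]
print(ratios)  # values approach f_xy(0, 0) = 1 as t shrinks
```

Only the diagonal $h = k$ is sampled here; the Lemma asserts the limit as $(h, k) \to (0, 0)$ with $h, k \neq 0$ in any manner.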

Theorem Suppose that $f$ is defined on an open ball $B_r(p)$ of $p = (a, b) \in \mathbb{R}^2$ with values in $\mathbb{R}$, that the partial derivatives $D_x f = \dfrac{\partial f}{\partial x} = f_x$, $D_y f = \dfrac{\partial f}{\partial y} = f_y$ and $D_{yx} f = \dfrac{\partial^2 f}{\partial y \partial x} = f_{xy}$ exist in $B_r(p)$, and that $D_{yx} f = f_{xy}$ is continuous at $p = (a, b)$. Then the partial derivative $D_{xy} f = \dfrac{\partial^2 f}{\partial x \partial y} = f_{yx}$ exists at $p = (a, b)$ and

$$D_{xy} f(a, b) = D_{yx} f(a, b).$$

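Proof Let

$$A(h, k) = f(a + h, b + k) - f(a + h, b) - f(a, b + k) + f(a, b) \quad \text{for } (h, k) \in B_r(o).$$

Using the Lemma, applied to the function $(h, k) \mapsto f(a + h, b + k)$ on $B_r(o)$, we have

$$D_{yx} f(a, b) = \lim_{(h, k) \to (0, 0)} \frac{A(h, k)}{hk}.$$

By hypothesis, $D_y f$ exists in $B_r(p)$, so that for all $0 < |h| < r$,

$$\begin{aligned}
\lim_{k \to 0} \frac{A(h, k)}{hk} &= \frac{1}{h} \left[ \lim_{k \to 0} \frac{f(a + h, b + k) - f(a + h, b)}{k} - \lim_{k \to 0} \frac{f(a, b + k) - f(a, b)}{k} \right] \\
&= \frac{1}{h} \big[ D_y f(a + h, b) - D_y f(a, b) \big].
\end{aligned}$$

Given $\varepsilon > 0$, since $D_{yx} f(a, b) = \lim_{(h, k) \to (0, 0)} \dfrac{A(h, k)}{hk}$, there exists a $\delta > 0$ such that if $0 < |h| < \delta$ and $0 < |k| < \delta$ then

$$\left| \frac{A(h, k)}{hk} - D_{yx} f(a, b) \right| < \varepsilon.$$

By taking the limit with respect to $k \to 0$, we obtain

$$\left| \frac{1}{h} \big[ D_y f(a + h, b) - D_y f(a, b) \big] - D_{yx} f(a, b) \right| = \left| \lim_{k \to 0} \frac{A(h, k)}{hk} - D_{yx} f(a, b) \right| \le \varepsilon$$

for all $h$ satisfying $0 < |h| < \delta$.

Since $\varepsilon > 0$ is arbitrary, letting $h \to 0$ shows that the limit

$$D_{xy} f(a, b) = \lim_{h \to 0} \frac{D_y f(a + h, b) - D_y f(a, b)}{h}$$

exists and satisfies $|D_{xy} f(a, b) - D_{yx} f(a, b)| \le \varepsilon$ for all $\varepsilon > 0$.

Therefore, $D_{xy} f(a, b)$ exists and equals $D_{yx} f(a, b)$.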

Remark Let E be an open subset of Rⁿ and let C^k(E) denote the space of functions on E with continuous partial derivatives of orders up to k.

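Example Let $f : \mathbb{R}^2 \to \mathbb{R}$ be defined by

$$f(x, y) = \begin{cases} \dfrac{xy(x^2 - y^2)}{x^2 + y^2} & \text{if } (x, y) \neq (0, 0), \\[2mm] 0 & \text{if } (x, y) = (0, 0). \end{cases}$$

Then

$$f_x(x, y) = \begin{cases} \dfrac{3x^2 y - y^3}{x^2 + y^2} - \dfrac{2x^2 y (x^2 - y^2)}{(x^2 + y^2)^2} & \text{if } (x, y) \neq (0, 0), \\[2mm] 0 & \text{if } (x, y) = (0, 0), \end{cases}$$

and

$$f_y(x, y) = \begin{cases} \dfrac{x^3 - 3x y^2}{x^2 + y^2} - \dfrac{2x y^2 (x^2 - y^2)}{(x^2 + y^2)^2} & \text{if } (x, y) \neq (0, 0), \\[2mm] 0 & \text{if } (x, y) = (0, 0). \end{cases}$$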

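Direct computation shows that $f_x$, $f_y$ are continuous on $\mathbb{R}^2$ and that the second partial derivatives $f_{xy}$, $f_{yx}$ exist on $\mathbb{R}^2$, yet

$$D_{xy} f(0, 0) = f_{yx}(0, 0) = 1 \neq -1 = f_{xy}(0, 0) = D_{yx} f(0, 0).$$

This does not contradict the Theorem, because neither $f_{xy}$ nor $f_{yx}$ is continuous at $(0, 0)$: the limits

$$\lim_{(x, y) \to (0, 0)} f_{yx}(x, y) \quad \text{and} \quad \lim_{(x, y) \to (0, 0)} f_{xy}(x, y)$$

do not exist, since away from the origin both $f_{xy}$ and $f_{yx}$ take the value $1$ on the $x$-axis and the value $-1$ on the $y$-axis.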
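The two mixed partial derivatives at the origin can be approximated directly from their definitions as iterated difference quotients. The sketch below is an illustration under stated assumptions: the step sizes `h` and `k` are arbitrary choices, and symmetric differences stand in for the limits defining $f_x$ and $f_y$.

```python
def f(x, y):
    # The function of the example, with the value 0 assigned at the origin.
    if x == 0.0 and y == 0.0:
        return 0.0
    return x * y * (x**2 - y**2) / (x**2 + y**2)

def f_x(x, y, h=1e-7):
    # Symmetric difference approximation of the partial derivative in x.
    return (f(x + h, y) - f(x - h, y)) / (2.0 * h)

def f_y(x, y, h=1e-7):
    # Symmetric difference approximation of the partial derivative in y.
    return (f(x, y + h) - f(x, y - h)) / (2.0 * h)

k = 1e-3
# D_yx f(0, 0): differentiate f_x with respect to y at the origin.
d_yx = (f_x(0.0, k) - f_x(0.0, 0.0)) / k
# D_xy f(0, 0): differentiate f_y with respect to x at the origin.
d_xy = (f_y(k, 0.0) - f_y(0.0, 0.0)) / k

print(d_yx, d_xy)  # close to -1 and 1 respectively
```

The inner step is taken much smaller than the outer step so that the inner quotient resolves $f_x(0, k) \approx -k$ and $f_y(k, 0) \approx k$ before the outer quotient is formed.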

Definition Let $f$ have domain $E$ in $\mathbb{R}^n$ and range in $\mathbb{R}^m$, and let $p$ be an interior point of $E$. We say that $f$ is differentiable at $p$ if there exists a linear function $L : \mathbb{R}^n \to \mathbb{R}^m$ such that for every $\varepsilon > 0$ there exists $\delta(\varepsilon) > 0$ such that if $x \in \mathbb{R}^n$ is any vector satisfying $\|x - p\| \le \delta(\varepsilon)$, then $x \in E$ and

$$\|f(x) - f(p) - L(x - p)\| \le \varepsilon \|x - p\|.$$

Equivalently, we say that $f$ is differentiable at $p$ if there exists an $m \times n$ matrix $L : \mathbb{R}^n \to \mathbb{R}^m$ such that

$$\lim_{x \to p} \frac{\|f(x) - f(p) - L(x - p)\|}{\|x - p\|} = \lim_{\|x - p\| \to 0} \frac{\|f(x) - f(p) - L(x - p)\|}{\|x - p\|} = 0,$$

i.e., $f(x) - f(p) - L(x - p) = o(\|x - p\|)$, or in words, $f(x) - f(p) - L(x - p)$ is little-o of $\|x - p\|$.


Chain Rule Let f have domain A ⊆ Rⁿ and range in R^m, let g have domain B ⊆ R^m and range in R^k. Suppose that f is differentiable at p ∈ A and g is differentiable at q = f (p). Then the composition h = g ◦ f is differentiable at p and

Dh(p) = Dg(q) ◦ Df (p),

where the linear transformations Df (p) : Rⁿ → R^m, Dg(q) : R^m → R^k and Dh(p) : Rⁿ → R^k are differential maps of f, g and h = g ◦ f, respectively. Alternately, we write

D(g ◦ f )(p) = Dg(f (p)) ◦ Df (p).

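In practice, one can represent $Df(p)$, $Dg(q)$ and $Dh(p)$ as $m \times n$, $k \times m$ and $k \times n$ matrices, respectively, and the composition operation $\circ$ between derivatives as matrix multiplication, so that $Dh(p)$ is the product of the $k \times m$ matrix $Dg(q)$ and the $m \times n$ matrix $Df(p)$.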
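The matrix form of the Chain Rule can be checked on a small example. The sketch below assumes hypothetical maps $f(x, y) = (xy,\ x + y,\ x^2)$ from $\mathbb{R}^2$ to $\mathbb{R}^3$ and $g(u, v, w) = uv + w$ from $\mathbb{R}^3$ to $\mathbb{R}$ (so $n = 2$, $m = 3$, $k = 1$); these maps and the helper `matmul` are illustrative and not taken from the notes. It multiplies the two derivative matrices and compares the result with a difference-quotient approximation of $Dh(p)$.

```python
def f(x, y):
    # Hypothetical sample map from R^2 to R^3.
    return (x * y, x + y, x * x)

def Df(x, y):
    # 3 x 2 matrix of partial derivatives of f.
    return [[y, x], [1.0, 1.0], [2.0 * x, 0.0]]

def g(u, v, w):
    # Hypothetical sample map from R^3 to R.
    return u * v + w

def Dg(u, v, w):
    # 1 x 3 matrix of partial derivatives of g.
    return [[v, u, 1.0]]

def matmul(A, B):
    # Plain matrix product of nested lists.
    return [[sum(A[i][j] * B[j][c] for j in range(len(B)))
             for c in range(len(B[0]))]
            for i in range(len(A))]

p = (1.0, 2.0)
q = f(*p)
Dh = matmul(Dg(*q), Df(*p))  # (1 x 3)(3 x 2) gives a 1 x 2 matrix

def h(x, y):
    return g(*f(x, y))

# Symmetric difference quotients as an independent check of Dh(p).
eps = 1e-6
numeric = [(h(p[0] + eps, p[1]) - h(p[0] - eps, p[1])) / (2.0 * eps),
           (h(p[0], p[1] + eps) - h(p[0], p[1] - eps)) / (2.0 * eps)]

print(Dh)       # [[10.0, 5.0]]
print(numeric)
```

Here $h(x, y) = x^2 y + x y^2 + x^2$, so $Dh(1, 2) = (10, 5)$, matching the matrix product. Note that the shapes only compose in the order $Dg(q)\,Df(p)$.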

Definition Let $E \subseteq \mathbb{R}^n$, let $f : E \to \mathbb{R}$ be differentiable on $E$ and let $p$ be an interior point of $E$. If $u$ is a unit vector in $\mathbb{R}^n$, the directional derivative of $f$ at $p$ in the direction of $u$, denoted by $D_u f(p)$, is defined to be

$$D_u f(p) = \lim_{t \to 0} \frac{f(p + tu) - f(p)}{t}.$$

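Remark Since $p$ is an interior point of $E$, there exists $\varepsilon > 0$ such that the map $r : (-\varepsilon, \varepsilon) \to \mathbb{R}^n$ defined by

$$r(t) = p + tu \quad \text{for } t \in (-\varepsilon, \varepsilon)$$

satisfies

$$r(t) \in E \quad \text{for all } t \in (-\varepsilon, \varepsilon).$$

Since $f$ is differentiable at $p$, the gradient vector $\nabla f(p) = (f_{x_1}(p), \dots, f_{x_n}(p))$ exists and satisfies

$$\begin{aligned}
0 &= \lim_{t \to 0} \frac{|f(p + tu) - f(p) - \langle \nabla f(p), (p + tu - p) \rangle|}{\|p + tu - p\|} \\
&= \lim_{t \to 0} \frac{|f(p + tu) - f(p) - t \langle \nabla f(p), u \rangle|}{|t|} \\
&= \left| \lim_{t \to 0} \frac{f(p + tu) - f(p)}{t} - \langle \nabla f(p), u \rangle \right| \\
&= |D_u f(p) - \langle \nabla f(p), u \rangle|.
\end{aligned}$$

Thus we have

$$D_u f(p) = \langle \nabla f(p), u \rangle = \|\nabla f(p)\| \cos \theta,$$

where $\theta$ is the angle between $\nabla f(p)$ and the direction $u$. This implies that if $\nabla f(p) \neq 0$, then $f$ increases most rapidly when $u = \dfrac{\nabla f(p)}{\|\nabla f(p)\|}$, i.e. $\theta = 0$, and decreases most rapidly when $u = -\dfrac{\nabla f(p)}{\|\nabla f(p)\|}$, i.e. $\theta = \pi$.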
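The identity $D_u f(p) = \langle \nabla f(p), u \rangle$ and the direction of steepest increase can be illustrated numerically. This is a minimal sketch, assuming a hypothetical sample function $f(x, y) = x^2 y$ at $p = (1, 2)$ and the unit vector $u = (0.6, 0.8)$, none of which come from the notes. A symmetric difference quotient stands in for the limit, and a scan over 3600 unit vectors is a crude substitute for maximizing over all directions.

```python
import math

def f(x, y):
    # Hypothetical sample function chosen only for illustration.
    return x * x * y

def grad_f(x, y):
    # Gradient (f_x, f_y) of the sample f.
    return (2.0 * x * y, x * x)

def directional_derivative(p, u, t=1e-6):
    # Symmetric difference quotient approximating the limit defining D_u f(p).
    forward = f(p[0] + t * u[0], p[1] + t * u[1])
    backward = f(p[0] - t * u[0], p[1] - t * u[1])
    return (forward - backward) / (2.0 * t)

p = (1.0, 2.0)
u = (0.6, 0.8)  # unit vector
g = grad_f(*p)

via_gradient = g[0] * u[0] + g[1] * u[1]   # <grad f(p), u>
via_limit = directional_derivative(p, u)   # difference quotient

# Scan unit vectors (cos a, sin a) and keep the angle with the largest <grad f(p), u>.
angles = [2.0 * math.pi * i / 3600 for i in range(3600)]
best = max(angles, key=lambda a: g[0] * math.cos(a) + g[1] * math.sin(a))
gradient_angle = math.atan2(g[1], g[0])

print(via_gradient, via_limit)
print(best, gradient_angle)
```

The best angle found by the scan agrees with the angle of $\nabla f(p)$ up to the grid spacing, which is the case $\theta = 0$ of the remark.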


Second Derivatives Test Let $B_r(p)$ be an open ball of radius $r$ with center $p = (a, b) \in \mathbb{R}^2$. Suppose that $f : B_r(p) \to \mathbb{R}$ has continuous partial derivatives up to the 2nd order, i.e. $f$, $f_x$, $f_y$, $f_{xx}$, $f_{xy} = f_{yx}$ and $f_{yy}$ are continuous on $B_r(p)$, and suppose that $f_x(p) = 0$ and $f_y(p) = 0$, i.e. $p$ is a critical point of $f$. Let

$$D = \begin{vmatrix} f_{xx}(p) & f_{xy}(p) \\ f_{yx}(p) & f_{yy}(p) \end{vmatrix} = f_{xx}(p) f_{yy}(p) - f_{xy}^2(p).$$

(a) If $D > 0$ and $f_{xx}(p) > 0$, then $f(p)$ is a local minimum.

(b) If $D > 0$ and $f_{xx}(p) < 0$, then $f(p)$ is a local maximum.

(c) If $D < 0$, then $f(p)$ is neither a local maximum nor a local minimum.

Remarks

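(a) Note that the characteristic equation

$$0 = \begin{vmatrix} f_{xx}(p) - \lambda & f_{xy}(p) \\ f_{yx}(p) & f_{yy}(p) - \lambda \end{vmatrix} = \lambda^2 - \big( f_{xx}(p) + f_{yy}(p) \big) \lambda + f_{xx}(p) f_{yy}(p) - f_{xy}^2(p)$$

implies that

$$f_{xx}(p) + f_{yy}(p) = \text{the sum of the eigenvalues},$$

$$f_{xx}(p) f_{yy}(p) - f_{xy}^2(p) = \text{the product of the eigenvalues}.$$

(b) If $D = f_{xx}(p) f_{yy}(p) - f_{xy}^2(p) > 0$, then $f_{xx}(p) f_{yy}(p) > f_{xy}^2(p) \ge 0$, so $f_{xx}(p)$ and $f_{yy}(p)$ have the same sign, and by (a) both eigenvalues have that same sign.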
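The test and the eigenvalue remarks can be sketched in a few lines. The function names `classify` and `eigenvalues` are hypothetical, and two details are assumptions beyond the statements above: the case $D < 0$ is reported as a saddle whatever the sign of $f_{xx}(p)$, and the case $D = 0$ (or $D > 0$ with $f_{xx}(p) = 0$, which cannot occur) is reported as inconclusive. The inputs are the second partial derivatives at a critical point.

```python
import math

def classify(fxx, fxy, fyy):
    # Second derivatives test at a critical point, using D = fxx*fyy - fxy^2.
    D = fxx * fyy - fxy * fxy
    if D > 0 and fxx > 0:
        return "local minimum"
    if D > 0 and fxx < 0:
        return "local maximum"
    if D < 0:
        return "saddle"          # assumption: reported regardless of the sign of fxx
    return "inconclusive"        # assumption: D == 0 is not covered by the test

def eigenvalues(fxx, fxy, fyy):
    # Roots of lambda^2 - (fxx + fyy)*lambda + (fxx*fyy - fxy^2) = 0.
    trace = fxx + fyy
    # The discriminant trace^2 - 4*D equals (fxx - fyy)^2 + 4*fxy^2, which is >= 0.
    root = math.sqrt((fxx - fyy) ** 2 + 4.0 * fxy * fxy)
    return ((trace - root) / 2.0, (trace + root) / 2.0)

# Hessian entries (fxx, fxy, fyy) at the origin for three sample functions.
samples = {
    "x^2 + y^2": (2.0, 0.0, 2.0),
    "x^2 - y^2": (2.0, 0.0, -2.0),
    "-x^2 + xy - y^2": (-2.0, 1.0, -2.0),
}
for name, hess in samples.items():
    print(name, classify(*hess), eigenvalues(*hess))
```

In each sample the sum and product of the returned eigenvalues equal $f_{xx} + f_{yy}$ and $D$, and when $D > 0$ both eigenvalues share the sign of $f_{xx}$, as in remarks (a) and (b).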