Finite Difference Methods

(1)

FINITE DIFFERENCE METHODS

FOR

SOLVING DIFFERENTIAL EQUATIONS

I-Liang Chern

Department of Mathematics National Taiwan University

(2)

(3)

Introduction

The goal of this course is to introduce theoretical analysis of finite difference methods for solving partial differential equations. The focuses are the stability and convergence theory. The partial differential equations to be discussed include

• parabolic equations, • elliptic equations,

• hyperbolic conservation laws.

1.1 Finite Difference Approximation

A finite difference approximation is to approximate differential operators by finite difference oper-ators, which is a linear combination of u on discrete points. For example,

• Forward difference: D+u(x) := u(x+h)−u(x)_h ,

• Backward difference: D−u(x) := u(x)−u(x−h)_h ,

• Centered difference: D0u(x) := u(x+h)−u(x−h)_2h .

Here, h is called the mesh size. By Taylor expansion, we can get • u0(x) = D+u(x) + O(h),

• u0(x) = D−u(x) + O(h),

• u0(x) = D0u(x) + O(h2).

These formulae can be derived by performing Taylor expansion of u at x. For instance, we expand u(x + h) = u(x) + u0(x)h +h 2 2 u 00_{(x) +} h3 3!u 000_{(x) + · · ·}

(8)

u(x − h) = u(x) − u0(x)h +h 2 2 u 00 (x) − h 3 3!u 000 (x) + · · · . Subtracting these two equations yields

u(x + h) − u(x − h) = 2u0(x)h +2h 3 3! u 000 (x) + · · · . This gives u0(x) = D0u(x) − h2 3!u 000 (x) + · · · = D0u(x) + O(h2).

Thus, u0(x) can be approximated by several difference operators. Indeed, we can approximate u0(x) by finite difference operators which involve u on more discrete points with higher order errors. For example,

u0(x) = D3u(x) + O(h3),

where

D3u(x) =

1

6h(2u(x + h) + 3u(x) − 6u(x − h) + u(x − 2h)) .

This formula can be derived by taking Taylor expansion of u(x + h), u(x − h), u(x − 2h) about x, then making proper combination to cancel 0th, and 2nd derivatives term. That is

u(x + h) = u(x) + u0(x)h +h 2 2 u 00_{(x) +} h3 3!u 000_{(x) + · · ·} u(x − h) = u(x) − u0(x)h +h 2 2 u 00_{(x) −} h3 3!u 000_{(x) + · · ·} u(x − 2h) = u(x) − 2u0(x)h + 4h 2 2 u 00_{(x) −} 8h3 3! u 000_{(x) + · · ·}

Taking the combination 2u(x + h) + 3u(x) − 6u(x − h) + u(x − 2h), we can cancel the zeroth, second derivatives and obtain u0(x) = D3u(x) + O(h3).

In general, suppose we are given the values of u at discrete points {xj}. These discrete points

are called grid points. We want to approximation u(k) at a specific point ¯x by the values of u at some of these grid points, say xj, j = 0, ..., n with n ≥ k. That is,

u(k)(¯x) =

n

X

j=0

cju(xj) + O(hp−k+1)

Here, the mesh size h denotes max{|xi− xj|}. The parameter p ≥ k is some positive integer. We

want to design cj so that p is as larger as possible. As we shall see later that we can choose p = n.

To find the coefficients cj, j = 0, ..., n, we make Taylor expansion of u(xj) about the point x:

u(xj) = p X i=0 1 i!(xj− ¯x) i_u(i)_(¯_{x) + O(h}p+1_).

(9)

We plug this expansion formula of u(xj) into the finite difference approximation formula for u(k)(x): u(k)(¯x) = n X j=0 cj p X i=0 1 i!(xj− ¯x) i_u(i)_(¯_{x) + O(h}p−k+1_).

Comparing both sides, we obtain

n X j=0 (xj− ¯x)i i! cj = 1 if i = k 0 otherwise , for i = 0, ..., p.

There are p + 1 equations here, it is natural to choose p = n to match the n + 1 unknowns. This is a n×n Vandermonde system. It is nonsingular if xiare different. The matlab code fdcoeffV(k,xbar,x)

can be used to compute these coefficients. Reference: Randy LeVeque’s book and his Matlab code. In the case of uniform grid, using central finite differencing, we can get high order approxima-tion by using less grid points. For instance, let xj = jh, where j ∈ Z. Let uj = u(xj). Then

u0(0) = u1− u−1 h + O(h 2₎ u00(0) = u1− 2u0+ u−1 h2 + O(h 2₎ u(3) = 1 2h3(u2− 2u1+ 2u0− 2u−1+ u−2) + O(h 2_). Homeworks.

1. Consider xi = ih, i = 0, ..., n. Let ¯x = xm. Find the coefficients ci for u(k)(¯x) and the

coefficient of the leading truncation error for the following cases: • k = 1, n = 2, 3, m = 0, 1, 2, 3.

• k = 2, n = 2, m = 0, 1, 2.

1.2 Basic Numerical Methods for Ordinary Differential Equations

The basic assumption to design numerical algorithm for ordinary differential equations is the smooth-ness of the solutions, which is in general valid provided the coefficients are also smooth. Basic designning techniques include numerical interpolation, numerical integration, and finite difference approximation.

Euler method

Euler method is the simplest numerical integrator for ODEs. The ODE

(10)

is discretized by

yn+1= yn+ kf (tn, yn). (1.2) Here, t0, ..., tnare the grid points of time t. k = tn+1− tn_{is time step size of the discretization.}

This method is called the forward Euler method. It simply replace dy/dt(tn) by the forward finite difference (yn+1− yn_{)/k. Given a smooth solution y(·), by Taylor expansion, we define the local}

truncation error to be

τn:= y0(tn) −y(t

n+1_{) − y(t}n₎

k = O(k).

We are interested in the true error, which is defined to be en:= yn− y(tn_{). We have the following}

convergence theorem.

Theorem 1.1. Assume f ∈ C1 and suppose the solutiony0 = f (t, y) with y(0) = y0 exists on

[0, T ]. Then the Euler method converges at any t ∈ [0, T ]. In fact, the true error en has the following estimate:

|en_{| ≤} eλt

λ O(k) → 0, as n → ∞. (1.3) Here,λ = max |∂f /∂y| and nk = t.

Proof. From the regularity of the solution, we have y ∈ C2[0, T ] and

y(tn+1) = y(tn) + kf (tn, y(tn)) + kτn. (1.4) Taking difference of (1.2) and (1.4), we obtain

|en+1| ≤ |en| + k|f (tn, yn) − f (tn, y(tn))| + k|τn| ≤ (1 + kλ)|en| + k|τn|.

where

|f (t, x) − f (t, y)| ≤ λ|x − y|.

The finite difference inequality has a fundamental solution Gn = (1 + λk)n, which is positive provided k is small. Multiplying above equation by (1 + λk)−n−1, we obtain

|em+1|G−m−1≤ |em|G−m+ kG−m−1|τm|. Summing in m from m = 0 to n − 1, we get

|en| ≤ n−1 X m=0 Gn−m−1k|τm| ≤ n−1 X m=0 GmO(k2) = G n_{− 1} G − 1 O(k 2_{) ≤} Gn λ O(k) ≤ eλt λ O(k), where t = nk and we have used (1 + λk)n≤ eλt_.

(11)

1. The theorem states that the numerical method converges in [0, T ] as long as the solutions of the ODE exists.

2. The error is O(k) if the solution is in C2[0, T ].

3. The proof above relies on the existence and smoothness of the solution. However, one can also use this approach to prove the local existence theorem by showning the approximate solutions generated by the Euler method form a Cauchy sequence.

Backward Euler method

In many applications, the system is relaxed to a stable solution in a very short period of time. For instance, consider

y0 = y − y¯ τ .

The corresponding solution y(t) → ¯y as t ∼ O(τ ). In the above forward Euler method, practically, we should require

1 + kλ ≤ 1

in order to have Gnremain bounded. Here, λ is the Lipschitz constant. In the present case, λ = 1/τ . If τ is very small, the the above forward Euler method will require very small k and lead to inefficient computation. In general, forward Euler method is inefficient (require small k) if

max ∂f (t, y) ∂y >> 1. There are two possibilities:

• ∂f /∂y >> 1: In this case, we need to choose a very small k in order to resolve details. • ∂f /∂y << −1.

For the second case, the backward Euler method is recommended: yn+1= yn+ kf (tn+1, yn+1). Comparing the Taylor expansion of the exact solution at tn+1:

y(tn+1) = y(tn) + kf (tn+1, y(tn+1)) + O(k), we get that the true error en:= yn− y(tn_{) satisfies}

en+1= en+ k∆f + O(k2) = en+ k ∂f ∂y(t, ¯y) en+1+ O(k2). 1 − k ∂f ∂y(t, ¯y) en+1= en+ O(k2).

(12)

Since ∂f_∂y(t, ¯y) << −1, we take λ = min ∂f ∂y(t, ¯y) , we get 1 − k ∂f ∂y(t, ¯y) ≥ (1 + λk). Thus, (1 + λk)|en+1| ≤ |en| + O(k2)

The corresponding fundamental solution is Gn:= (1 + λk)−n. Notice that the error satisfies |en| ≤ n−1 X m=0 (1 + λk)−mO(k2) ≤ (1 + λk) −n+1 λk O(k 2₎ ≤ e −λT λ O(k).

There is no restriction on the size of k. However, the price to pay is that we need to solve a nonlinear equation

yn+1= yn+ kf (tn+1, yn+1) for yn+1at each time step.

Leap frog method

We integrate y0 = f (t, y) from tn−1to tn+1: y(tn+1) − y(tn−1) =

ˆ tn+1

tn−1

f (τ, y(τ )) dτ. We apply the midpoint rule for numerical integration, we then get

y(tn+1) − y(tn−1) = 2kf (tn, y(tn)) + O(k3). The midpoint method (or called leapfrog method) is

yn+1− yn−1= 2kf (tn, yn). (1.5) Homeworks.

1. Prove the convergence theorem for the leap-frog method for ODE. Hint: consider the system y₁n= yn−1and y₂n= yn.

1.3 Runge-Kutta methods

The Runge-Kutta method (RK) is a strategy to integrate´_ttnn+1f dτ by some quadrature method.

(13)

RK2: A second order RK, denoted by RK2, is based on the trapezoidal rule of numerical integra-tion. First, we integrate the ODE y0 = f (t, y) to get

y(tn+1) − yn= ˆ tn+1

tn

f (τ, y(τ )) dτ. Next, this integration is approximated by

ˆ tn+1

tn

f (τ, y(τ )) dτ = k 2 f (t

n_{, y}n_{) + f (t}n+1_{, y}n+1_{) + O(k}3_).

The latter term involves yn+1. An explicit Runge-Kutta method approximate yn+1by yn+kf (tn, yn). Thus, RK2 reads ξ1 = f (tn, yn) yn+1 = yn+k 2(f (t n_{, y}n_{) + f (t}n+1_{, y}n_{+ kξ} 1)).

Another kind of RK2 is based on the midpoint rule of integration. It reads ξ1 = f (tn, yn)

yn+1 = yn+ kf (tn+1/2, yn+k 2ξ1) The truncation error of RK2 is

y(tn+1) − y(tn) = yn+1− y(tn_{) + O(k}3_).

RK4 The 4th order Runge-Kutta method uses Simpson’s rule to approximate integration: ˆ tn+1

tn

f (t, y(t)) dt = k 6

f (tn, y(tn)) + 4f (tn+1/2, y(tn+1/2)) + f (tn+1, y(tn+1))+ O(k5). The quantity y(tn+1/2) is approximated by forward Euler method. It has the form

yn+1 = yn+k 6(ξ1+ 2ξ2+ 2ξ3+ ξ4) ξ1 = f (tn, yn) ξ2 = f (tn+ 1 2k, y n₊k 2ξ1) ξ3 = f (tn+ 1 2k, y n₊k 2ξ2) ξ4 = f (tn+ k, yn+ kξ3)

The truncation error of RK4 is

(14)

General explicit Runge-Kutta methods The method takes the following general form yn+1= yn+ k s X i=1 biξi, where ξ1 = f (tn, yn), ξ2 = f (tn+ c2k, yn+ ka21ξ1), ξ3 = f (tn+ c3k, yn+ ka31ξ1+ ka32ξ2), .. . ξs = f (tn+ csk, yn+ k(as1ξ1+ · · · + as,s−1ξs−1)).

We need to specify s (the number of stages), the coefficients aij(1 ≤ j < i ≤ s), bi(i = 1, ..., s)

and ci(i = 2, ..., s). We list them in the following Butcher table.

There are s(s − 1)/2 + s + (s − 1) unknowns to be determined for a specific scheme. We require the 0 c2 a21 c3 a31 a32 .. . ... . .. cs as1 as2 · · · as,s−1 b1 b2 · · · bs−1 bs

truncation error to be O(kp+1). To find these coefficients, we need to expand the truncation error formula

y(tn+1) − yn= yn+1− yn+ O(kp+1)

about (tn, yn_{) in terms of derivatives of y(·) at t}n_{. Then we can get linear equations for the}

coeffi-cients.

Adaptive Runge-Kutta (Runge-Kutta-Fehlberg method, ODE45) The adaptive Runge-Kutta method is designed to be able to estimate local truncation error in each time step. From which, we can adjust time step size to have roughly uniform truncation error in each step. This is done by using two RK methods with the same sets of aij and ci but different bi, b∗i. The set bi produces

RK method of order p. The auxiliary set b∗_i produces a RK method with order p − 1. It is used to estimate the local truncation by

yn+1− yn+1,∗= h

s

X

i=1

(bi− b∗i)ki= O(hp)

The step size h is then estimated so that the truncation error is roughly the same in each time step. Below is the Butcher table for RK5 and RK4 (b∗).

(15)

0 1/4 1/4 3/8 3/32 9/32 12/13 1932/2197 −7200/2197 7296/2197 1 439/216 −8 3860/513 −845/4104 1/2 −8/27 2 −3544/2565 1859/4104 −11/40 b 16/135 0 6656/12825 28561/56430 −9/50 2/55 b∗ 25/216 0 1408/2565 2197/4104 −1/5 0

Convergence proof, an example Let us see the proof of the convergence of the two stage Runge-Kutta method. The scheme can be expressed as

yn+1= yn+ kΨ(yn, tn, k) (1.6) where

Ψ(yn, tn, k) := f (y + 1

2kf (y)). (1.7)

Suppose y(·) is a true solution, the corresponding truncation error τn:= y(t

n+1_{) − y(t}n₎

k − Ψ(y(t

n_{), t}n_{, k) = O(k}2₎

Thus, the true solution satisfies

y(tn+1) − y(tn) = kΨ(y(tn), tn, k) + kτn The true error en:= yn− y(tn_{) satisfies}

en+1= en+ k(Ψ(yn, tn, k) − Ψ(y(tn), tn, k)) − kτn. This implies

|en+1_{| ≤ |e}n_{| + kλ}0_|en_{| + k|τ}n_|,

where λ0 is the Lipschitz constant of Ψ(y, t, k) with respect to y. Hence, we get |en| ≤ (1 + kλ0)n|e0| + k n−1 X m=0 (1 + kλ0)n−1−m|τm| ≤ eλ0t|e0| +e λ0_t λ0 maxm |τ m_| Reference:

• Lloyd N. Trefethen, Finite Difference and Sprectral Methods for Ordinary and Partial Differ-ential Equations,

• Randy LeVeque,

(16)

1.4 Multistep methods

The idea of multi-step method is to derive a relation between, for instance, yn+1, yn, yn−1, y0nand y0n−1so that the corresponding truncation is small. The simplest multistep method is the midpoint method. Suppose ynand yn−1is given. The new state yn+1is defined by

yn+1− yn−1= 2ky0n= 2kf (tn, yn) The truncation error is

τn:= 1 k y(t

n+1_{) − y(t}n−1_{) − 2ky}0_(tn_{) = O(k}2_).

Thus, the method is second order.

We can also design a method which involves yn+1, yn, yn−1and y0n, y0n−1. For instance, yn+1= yn+k

2(3f (y

n_{) − f (y}n−1₎₎

The truncation error τn:= 1 k yn+1− yn₊ k 2(3f (y n_{) − f (y}n−1₎₎ = O(k2).

A general r-step multistep method involves (yn+1, yn, ..., yn+1−r) and (y0n+1, y0n, ..., y0n+1−r). It can be written as r X m=0 amyn+1−r+m= k r X m=0 bmy0n+1−r+m= k r X m=0 bmfn+1−r+m. (1.8)

We will always assume ar 6= 0. Because it is the coefficient corresponding to yn+1, which is what

we want to find. When br= 0 the method is explicit; otherwise it is implicit. For a smooth solution

of (1.1), we define the truncation error τnto be τn:= 1 k r X m=0 amy(tn+1−r+m) − k r X m=0 bmy0(tn+1−r+m) !

Definition 1.1. A multi-step method is called of order p if τn= O(kp) uniformly in n. It is called consistent if τn(k) → 0 uniformly in n as k → 0.

Remark. When f is smooth, the solution of ODE y0 = f (t, y) is also smooth. Then the truncation is a smooth function of k. In this case, τ (k) → 0 is equivalent to τ (k) = O(k) as k → 0.

Initial setup An r-step multi-step method needs (y0, y1, ..., yr−1)T to start. There is only y0 given initially. We need to construct y1, ..., yr−1 _{by other methods. For instance RK methods. In}

order to maintain the order of accuracy, we should use a method of p − 1 order. This will give initial error yi− y(ti_{) = O(k}p_{) for i = 0, ..., r − 1.}

(17)

Derivation of multistep method of order p For notational convenience, let us extend a’s and b’s by setting am= 0, bm= 0 for m > r. Taking Taylor expansion about tn+1−r, we get

kτn = r X m=0 am ∞ X j=0 1 j!y (j)_(mk)j_{− k} r X m=0 bm ∞ X j=1 1 (j − 1)!y (j)_(mk)j−1 = r X m=0 am ! y(0)+ ∞ X j=1 1 j! r X m=0 mjam− jmj−1bm kjy(j) = r X m=0 am ! y(0)+ ∞ X j=1 1 j! r X m=0 mj−1(mam− jbm) kjy(j) = ∞ X j=0 1 j! X m=0 mj−1(mam− jbm) kjy(j) = ∞ X j=0 Cjy(j).

Here, the derivatives of y(·) are evaluated at tn+1−r. We list few equations for the coefficients a and b: C0 = a0+ · · · + ar C1 = (a1+ 2a2+ · · · rar) − (b0+ · · · + br) C2 = 1 2 (a1+ 2 2_a 2+ · · · r2ar) − 2(b1+ · · · + rbr) .. . Cp = r X m=0 mp p! am− r X m=1 mp−1 (p − 1)!bm To obtain a scheme of order p, we need to require

Cj = 0, for j = 0, ..., p.

There are 2(r + 1) unknowns for the coefficients {am}rm=0, {bm}rm=0. In principle, we can choose

p = 2r + 1 to have the same number of equations. Unfortunately, there is some limitation from stability criterion which we shall be explained in the next section. The order of accuracy p should satisfy p ≤    r + 2 if r is even, r + 1 if r is odd, r if it is an explicit scheme.

This is the first Dahlquist stability barrier. We shall not discuss here. See Trefethen’s book, or Stiff Equation in Wiki, Let us see some concrete examples below.

(18)

Explicit Adams-Bashforth schemes When br= 0, the method is explicit. Here are some

exam-ples of the explicit schemes called Adams-Bashforth schemes, where ar = 1:

• 1-step: yn+1= yn+ kf (yn)

• 2-step: yn+1= yn+k₂(3f (yn) − f (yn−1)) • 3 step: yn+1 = yn₊ k

12(23f (yn) − 16f (yn−1) + 5f (yn−2))

The step size is r and the order is p = r.

Implicit Adams-Moulton schemes Another examples are the Adams-Moulton schemes, where br6= 0 and and the step size r = p

• 1-step: yn+1= yn+k₂(f (yn+1) + f (yn))

• 2-step: yn+1= yn+₁₂k(5f (yn+1) + 8f (yn) − f (yn−1))

• 3 step: yn+1 = yn+₂₄k(9f (yn+1) + 19f (yn) − 5f (yn−1) + f (yn−2))

Sometimes, we can use an explicit scheme to guess yn+1 as a predictor in an implicit scheme. Such a method is called a predictor-corrector method. A standard one is the following Adams-Bashforth-Moulton scheme: Its predictor part is the Adams-Bashforth scheme:

ˆ

yn+1= yn+ k

12(23f (y

n_{) − 16f (y}n−1_{) + 5f (y}n−2₎₎

The corrector is the Adams-Moulton scheme: yn+1= yn+ k

24(9f (ˆy

n+1_{) + 19f (y}n_{) − 5f (y}n−1_{) + f (y}n−2₎₎

The predictor-corrector is still an explicit scheme. However, for stiff problem, we should use im-plicit scheme instead.

Matlab codes are available on Wikiversity with key words “Bashforth and Adams-Moulton methods.”

Formal algebra Let us introduce the shift operator Zyn= yn+1, or in continuous sense, Zy(t) = y(t + k). Let D be the differential operator. The Taylor expansion

y(t + k) = y(t) + ky0(t) + 1 2!k

2_D2_{y(t) + · · ·}

can be expressed formally as Zy = 1 + (kD) + 1 2!(kD) 2_{+ · · ·} y = ekDy.

(19)

The multistep method can be expressed as

Ly := (a(Z) − kb(Z)D) y =a(ekD) − kDb(ekD)y = (C0+ C1(kD) + · · ·) y.

Here, a(Z) = r X m=0 amZm, b(Z) = r X m=0 bmZm

are the generating functions of {am} and {bm}. A multistep method is of order p means that

a(ekD) − kDb(kD)

y = O((kD)p+1)y. We may abbreviate kD by a symbol κ. The above formula is equivalent to

a(eκ) − κb(eκ) = O(κp+1). Or equivalently,

a(eκ)

b(eκ₎ = κ + O(κ

p+1_{) as κ → 0.} _(1.9)

We have the following theorem

Theorem 1.2. A multistep method with b(1) 6= 0 is of order p if and only if a(z)

b(z) = log z + O((z − 1)

p+1_{) as z → 1.}

It is consistent if and only if

a(1) = 0 and a0(1) = b(1).

Proof. The first formula can be obtain from (1.9) by writing eκ = z. For the second formula, we expand log z about 1. We can get

a(z) = b(z) (z − 1) −(z − 1) 2 2 + (z − 1)3 3 + · · · + O((z − 1)p+1). We also expand a(z) and b(z) about z = 1, we can get

a(1) + (z − 1)a0(1) = b(1)(z − 1) + O((z − 1)2). Thus, the scheme is consistent if and only if a(1) = 0 and a0(1) = b(1).

Homeworks.

1. Consider the linear ODE y0 = λy, derive the finite difference equation using multistep method involving yn+1, yn, yn−1and y0nand y0n−1for this linear ODE.

(20)

1.5 Linear difference equation

Second-order linear difference equation. In the linear case y0 = λy, the above difference scheme results in a linear difference equation. Let us consider general second order linear difference equation with constant coefficients:

ayn+1+ byn+ cyn−1= 0, (1.10) where a 6= 0. To find its general solutions, we try the ansatz yn= ρnfor some number ρ. Here, the n in ynis an index, whereas the n in ρnis a power. Plug this ansatz into the equation, we get

aρn+1+ bρn+ cρn−1= 0. This leads to

aρ2+ bρ + c = 0.

There are two solutions ρ1and ρ2. In case ρ1 6= ρ2, these two solutions are independent. Since the

equation is linear, any linear combination of these two solutions is again a solution. Moreover, the general solution can only depend on two free parameters, namely, once y0and y−1are known, then {yn_}

n∈Zis uniquely determined. Thus, the general solution is

yn= C1ρn1 + C2ρn2,

where C1, C2are constants. In case of ρ1 = ρ2, then we can use the two solutions ρn2 and ρn1 with

ρ2 6= ρ1, but very closed, to produce another nontrivial solution:

lim

ρ2→ρ1

ρn₂ − ρn 1

ρ2− ρ1

This yields the second solution is nρn−1₁ . Thus, the general solution is C1ρn1 + C2nρn−11 .

Linear finite difference equation of order r . We consider general linear finite difference equa-tion of order r:

aryn+r+ · · · + a0yn= 0, (1.11)

where ar 6= 0. Since yn+r can be solved in terms of yn+r−1, ..., ynfor all n, this equation together

with initial data y0, ..., y−r+1has a unique solution. The solution space is r dimensions.

To find fundamental solutions, we try the ansatz yn= ρn for some number ρ. Plug this ansatz into equation, we get

arρn+r+ · · · + a0ρn= 0,

for all n. This implies

a(ρ) := arρr+ · · · + a0 = 0. (1.12)

The polynomial a(ρ) is called the characteristic polynomial of (1.11) and its roots ρ1, ..., ρr are

(21)

• Simple roots (i.e. ρi 6= ρj, for all i 6= j): The fundamental solutions are ρni, i = 1, ..., r.

• Multiple roots: if ρi is a multiple root with multiplicity mi, then the corresponding

indepen-dent solutions

ρn_i, nρn−1_i , C₂nρn−2_i ..., C_mn_i₋₁ρn−mi+1

i

Here, C_kn:= n!/(k!(n − k)!). The solution C₂nρ_in−2can be derived from differentiation_dρdC₁nρn−1 at ρi.

In the case of simple roots, we can express general solution as yn= C1ρn1 + · · · + Crρnr,

where the constants C1, ..., Crare determined by

yk= C1ρk1+ · · · + Crρkr, k = 0, ..., r − 1.

System of linear difference equation. The above rth order linear difference equation is equiva-lent to a first order linear difference system:

A0yn+1= Ayn (1.13) where yn=    y₁n .. . y_rn   =    yn−r+1 .. . yn    A0 = I_{(r−1)×(r−1)} 0 0 ar , A =        0 1 0 · · · 0 0 0 1 · · · 0 .. . ... ... . .. ... 0 0 0 · · · 1 −a₀ −a₁ −a₂ · · · −a_r−1

       .

We may divide (1.13) by A0 and get

yn+1= Gyn.

We call G the fundamental matrix of (1.13). For this homogeneous equation, the solution is yn= Gny0

Next, we compute Gnin terms of eigenvalues of G.

In the case that all eigenvalues ρi, i = 1, ..., r of G are distinct, then G can be expressed as

G = TDT−1, D = diag (ρ1, · · · , ρr),

(22)

When the eigenvalues of G have multiple roots, we can normalize it into Jordan blocks: G = TJT−1, J = diag (J1, · · · , Js),

where the Jordan block Jicorresponds to eigenvalue ρiwith multiplicity mi:

Ji =        ρi 1 0 · · · 0 0 ρi 1 · · · 0 .. . ... ... . .. ... 0 0 0 · · · 1 0 0 0 · · · ρi        mi×mi . andPs

i=1mi = r. Indeed, this form also covers the case of distinct eigenvalues.

In the stability analysis below, we are concerned with whether Gnis bounnded. It is easy to see that Gn= TJnT−1, Jn= diag (Jn₁, · · · , Jn_s) Jn_i =        ρn_i nρn−1_i C₂nρn−2 · · · C_mn_i₋₁ρn−mi+1 i 0 ρn_i nρn−1_i · · · C_mn i−2ρ n−mi+2 i .. . ... ... . .. ... 0 0 0 · · · nρn−1_i 0 0 0 · · · ρn_i        mi×mi . where C_kn:= _k!(n−k)!n! .

Definition 1.2. The fundamental matrix G is called stable if Gn remains bounded under certain norm k · k for all n.

Theorem 1.3 (von Neumann). The fundamental matrix G is stable if and only if its eigenvalues satisfy the following condition:

either|ρ| = 1 and ρ is a simple root,

or|ρ| < 1 (1.14)

Proof. It is easy to see that the nth power of a Jordan form J_inis bounded if its eigenvalue |ρi| < 1

or if ρi| = 1 but simple. On the other hand, if |ρi| > 1 then Jinis unbounded; or if ρi| = 1 but not

simple, then the term nρn−1_i in J_inwill be unbounded.

Corollary 1.1. There exists a norm in Rnsuch that the above root condition forG is equivalent to kGk ≤ 1 with this norm.

Proof. _{1. First, in R}n_{(or C}n), we define kxk∞ = maxi|xi|. For a linear mapping G : Rn →

Rn, we define its operator norm under the k · k∞by

kGk∞:= sup x6=0

kGxk∞

kxk∞

(23)

It is an easy exercise that for G = (aij)n×n, the operator norm kGk∞= max i X j |aij|.

2. Second, a matrix G can be expressed as

G = TDT−1, D = diag (J1, · · · , Js)

where Ji are Jordan blocks. For any i 6= 0, we can further transform Jiinto

Ji = SiKiS−1i where Ki=        ρi 0 · · · 0 0 ρi · · · 0 .. . ... ... . .. ... 0 0 0 · · · 0 0 0 · · · ρi        mi×mi , Si= diag (1, i, ..., mi i−1).

Let S = diag(S1, ..., Ss), K = diag(K1, ..., Ks). We can express G as

G = TS K(TS)−1 We now define the new norm of G as

kGk := kKk∞

This means that we define the new norm k · k in Rnby kxk := k(TS)−1xk∞.

Since TS is invertible, this does define a norm in Rn. With this norm, the corresponding operator norm is kKk∞.

3. For those Ji with mi > 1, the stability condition requires that |ρi| < 1. We choose i such

that |ρi| + i ≤ 1. Then the corresponding kKik∞ ≤ 1. Thus, kGk ≤ 1 with the above

(24)

Nonhomogeneous linear finite difference system In general, we consider the nonhomogeneous linear difference system:

yn+1= Gyn+ fn (1.15)

with initial data y0. Its solution can be expressed as yn = Gyn−1+ fn−1 = G(Gyn−2+ fn−2) + fn−1 .. . = Gny0+ n−1 X m=0 Gn−1−mfm Homeworks.

1. Consider the linear ODE

y0 = λy

where λ considered here can be complex. Study the linear difference equation derived for this ODE by forward Euler method, backward Euler, midpoint. Find its general solutions. 2. Consider linear finite difference equation with source term

ayn+1+ byn+ cyn−1= fn Given initial data ¯y0and ¯y1, find its solution.

3. Find the characteristic roots for the Adams-Bashforth and Adams-Moulton schemes with steps 1-3 for the linear equation y0 = λy.

1.6 Stability analysis

There are two kinds of stability concepts.

• Zero stability: Fix t = nk, the computed solution yn remains bounded as n → ∞ (or equivalently, k → 0).

• Absolute stability: Fix k > 0, the computed solution ynremains bounded as n → ∞.

1.6.1 Zero Stability

Our goal is to develop general convergence theory for multistep finite difference method for ODE: y0 = f (t, y) with initial condition y(0) = y0. An r-step multistep finite difference scheme can be

expressed as Lyn₌ r X m=0 amyn+1−r+m− k r X m=0 bmf (tn+1−r+m, yn+1−r+m) = 0. (1.16)

(25)

Definition 1.3. The truncation error τn(k) for the above finite difference scheme is defined by τn(k) := 1 k r X m=0 amy(tn+1−r+m) − k r X m=0 bmf (tn+1−r+m, y(tn+1−r+m)) ! , where y(·) is a true solution of the ODE.

Definition 1.4. A difference scheme is called consistent if the corresponding truncation error τn(k) → 0 uniformly in n as the mesh size k → 0. The scheme is of order p if τn(k) = O(kp) uniform in n.

In the multistep method, the consistency is equivalent to τ (k) = O(k) because we assume y(·) is smooth and the truncation error is a smooth function in k. The consistency is τ (k) → 0 as k → 0. Thus the smoothness of τ implies τ (k) = O(k).

Definition 1.5. A difference scheme is called zero stable if its solutions at time step n remain bounded as the mesh size k → 0 (nk = T is fixed, according n → ∞).

The main theorem is the follows. We will postpone its proof at the end of this section.

Theorem 1.4 (Dahlquist equivalence theorem). For finite difference schemes for ODE y0 = f (t, y), consistency + zero-stability ⇔ convergence

Stability criterion Let us investigate the condition on the coefficients a and b of an explicit mul-tistep method for the stability

Lyn= 0

to be bounded. We may assume ar = 1 and br= 0. Let us write it in matrix form:

yn+1= Ayn+ kBfn where A =        0 1 0 1 . .. . .. 0 1

−a₀ · · · −a_r−2 −a_r−1        , yn=   yn−r · · · yn  , B =        0 0 0 0 . .. ... 0 0 b0 · · · br−2 br−1        , fn=   fn−r · · · fn  ,

In order to have solution to be bounded for a multistep scheme Ly = 0 for arbitrary f , it has at least to be valid when f ≡ 0. In this case, we need to invetigate the boundedness for the homogeneous equation:

(26)

We have seen in the last section that

Theorem 1.5. The necessary and sufficient condition for kAnk to be bounded is that the charac-teristic rootsρiof the characteristic equationa(z) = 0 satisfies:

either|ρ_i| < 1 or|ρi| = 1 but simple.

Convergence ⇒ Stability

Proof. We only need to find an f such that the corresponding multistep is not stable implies that it does not converge. We choose f ≡ 0. * _{Since A}n _{is unbounded, which means there is an}

eigenvalue ρi with eigenvector yi such that |ρi| > 1 or |ρi| = 1 but not simple. We discuss the

formal case. The latter case can also be prove easily. In the former case, let yi be the eigenvector

of A corresponding to the eigenvalue ρi which satisfies |ρi| > 1. Let us choose y0 and generate

y0 = (y₀r−1, · · · , y0)T by some explicit scheme starting from y0. We can choose y0 such that its

component on yi is nonzero. Then the corresponding yn := Any0 will be unbounded. Hence it

cannot converge to a constant, as k → 0. On the other hand, y0 depends on the mesh size k and y0(k) → (y0, · · · , y0)T as k → 0. Thus, the method does not converge for f ≡ 0.

Convergence ⇒ Consistency

Proof. From Theorem 1.2, we need to show that a(1) = 0 and a0(1) = b(1). To show the first, we consider the ODE: y0 = 0 with y(0) = 1. For the second, we consider the ODE: y0 = 1 and y(0) = 0.

• Show a(1) = 0: We choose y0 = (1, · · · , 1)T. From y1= Ay0, we get yr= −a0y0− · · · − ar−1yr−1 = −a0− · · · − ar−1.

Since yris independent of k, and we should have yr → 1 as k → 0 (by convergence), we conclude that yr= 1. Thus, we get a(1) = a0+ · · · + ar−1+ 1 = 0.

• Show a0(1) = b(1). We choose f ≡ 1, y(0) = 0. The corresponding ODE solution is y(t) = t. The multistep method gives

a(Z)yn− kb(Z)1 = 0. (1.17) We write

a(Z) = a0(1)(Z − 1) + O((Z − 1)2), b(Z)1 = b(1).

*_{Suppose a multistep method is convergence for every smooth f , then in particular, for f ≡ 0. In this case, if this}

(27)

Then the principal part of the above finite difference is (Z − 1)y − kb(1)

a0₍₁₎ = 0.

This is an arithmetic series. Its solution is yn= nk_ab(1)0₍₁₎. Indeed, this sequence also satisfies

(1.17) provided its initial data yn also has the form yn = nk_ab(1)0₍₁₎ for 0 ≤ n < r. Thus,

arithmetic series yn= nk_ab(1)0₍₁₎ is a solution of the difference equation (1.17). Since nk = t,

the convergence yn→ t as n → ∞ enforces _ab(1)0₍₁₎ = 1.

Stability + Consistency ⇒ Convergence

Proof. We recall that we can express the scheme as

yn+1= Ayn+ kBfn.

Let Y be an exact solution, then plug it into the above scheme, we get Yn+1= AYn+ kBFn+ kτn,

where Yn:= (Y (tn−r), ...Y (tn))T. We subtract these two and call en:= Yn− yn_{. We get}

en+1= Aen+ kB (Fn− fn) + kτn. The term Fn− fn_{can be repressed as}

Fn− fn = (f (Yn−r) − f (yn−r), · · · , f (Yn) − f (yn))T = (L−ren−r, · · · , L0en)T := Lnen where L−m := ˆ 1 0 f0(yn−m+ ten−m) dt. Thus, we get en+1 = (A + kBLn) en+ kτn = Gn(k)en+ kτn

with C independent of n and k. The reason is the follows. First, we assume that f is Lipschitz. Thus, the functions L−m above are uniformly bounded (independent of n). Hence the term kBLk

(28)

Lemma 1.1. If kAnk is bounded and kBnk are uniformly bounded, then the product

k(A + 1

nB1) · · · (A + 1 nBn)k is also uniformly bounded.

We have en ≤ Gn−1en−1+ kτn−1 ≤ G_n−1Gn−2en−2+ k Gn−2τn−2+ τn−1 ≤ G_n−1Gn−2· · · G0e0 +k Gn−2· · · G0τ0+ · · · + Gn−2τn−2+ τn−1

From the lemma, we get

kenk ≤ Cke0k + nkC max

n kτ

n_{k ≤ Cke}0_{k + O(k}p_).

Proof of Lemma 1.1

Proof. 1. We have seen that kAnk is uniformly bounded under some norm is equivalent to kAk ≤ 1 for some other operator norm. Thus, we may just assume kAk ≤ 1.

2. Since all norms in finite dimension are equivalent, we may assume kBik ≤ b for all i =

1, ..., n. 3. We have k(A + 1 nB1) · · · (A + 1 nBn)k ≤ (kAk + b n) n_{≤ (1 +} b n) n_{≤ exp(b).}

Theorem 1.6 (First Dahlquist barrier). A zero-stable and linear r-step multistep method with p order of convergence should satisfies

p ≤    r + 2 if r is even, r + 1 if r is odd, r if it is an explicit scheme.

For proof, see pp. Hairer, Norsett, Wanner, Solving Ordinary Differential Equations, 384-387.

1.6.2 Absolute Stability

See Randall LeVeque, Finite Difference Methods for Ordinary and Partial Differential Equations, Chapter 7.

(29)

Chapter 2

Finite Difference Methods for Linear

Parabolic Equations

2.1 Finite Difference Methods for the Heat Equation

2.1.1 Some discretization methods

Let us start from the simplest parabolic equation, the heat equation: ut= uxx

Let h = ∆x, k = ∆t be the spatial and temporal mesh sizes. Define xj = jh, j ∈ Z and tn= nk,

n ≥ 0. Let us abbreviate u(xj, tn) by unj. We shall approximate unj by Ujn, where Ujnsatisfies some

finite difference equations.

Spatial discretization : The simplest one is that we use centered finite difference approximation for uxx:

uxx=

uj+1− 2uj+ uj−1

h2 + O(h 2₎

This results in the following systems of ODEs ˙ Uj(t) = Uj+1(t) − 2Uj(t) + Uj−1(t) h2 or in vector form ˙ U = 1 h2AU where U = (U0, U1, ...)t, A = diag (1, −2, 1).

(30)

Homeworks.

1. Derive the 4th order centered finite difference approximation for uxx:

uxx =

1

h2(−uj−2+ 16uj−1− 30uj+ 16uj+1− uj+2) + O(h 4_).

2. Derive a 2nd order centered finite difference approximation for (κ(x)ux)x.

Temporal discretization We can apply numerical ODE solvers • Forward Euler method:

Un+1= Un+ k h2AU

n _(2.1)

• Backward Euler method:

Un+1= Un+ k h2AU n+1 _(2.2) • 2nd order Runge-Kutta (RK2): Un+1− Un= k h2AU n+1/2_{, U}n+1/2_{= U}n₊ k 2h2AU n _(2.3) • Crank-Nicolson: Un+1− Un= k 2h2(AU n+1_{+ AU}n_). _(2.4)

These linear finite difference equations can be solved formally as Un+1= GUn where • Forward Euler: G = 1 +_hk2A, • Backward Euler: G = (1 −_hk2A)−1, • RK2: G = 1 +_hk2A +1₂ _hk2 2 A2 • Crank-Nicolson: G = 1+ k 2h2A 1− k 2h2A

For the Forward Euler, We may abbreviate it as

U_jn+1= G(U_j−1n , U_jn, U_j+1n ), (2.5) where

G(Uj−1, Uj, Uj+1) = Uj +

k

(31)

2.1.2 Stability and Convergence for the Forward Euler method

Our goal is to show under what condition can U_jnconverges to u(xj, tn) as the mesh sizes h, k → 0.

To see this, we first see the local error a true solution can produce. Plug a true solution u(x, t) into (2.1). We get un+1_j − un_j = k h2 u n j+1− 2unj + unj−1 + kτjn (2.6) where τ_jn= Dt,+unj − (ut)nj − (D+D−unj − (uxx)nj) = O(k) + O(h2).

Let en_j denote for un_j − Un

j. Then subtract (2.1) from (2.6), we get

en+1_j − en_j = k h2 e

n

j+1− 2enj + enj−1 + kτjn. (2.7)

This can be expressed in operator form:

en+1 = Gen+ kτn. (2.8) kenk ≤ kGen−1k + kkτn−1k

≤ kG2en−2k + k(kGτn−2k + kτn−1k)

≤ kGne0k + k(kGn−1τ0k + · · · + kGτn−2k + kτn−1k) Suppose G satisfies the stability condition

kGn_{U k ≤ CkU k}

for some C independent of n. Then

kenk ≤ Cke0k + C max

m |τ m_|.

If the local truncation error has the estimate max

m kτ

m_{k = O(h}2_{) + O(k)}

and the initial error e0 satisfies

ke0_{k = O(h}2_),

then so does the global true error satisfies

kenk = O(h2) + O(k) for all n.

The above analysis leads to the following definitions and the equivalence theorem.

Definition 2.1. A finite difference method is called consistent if its local truncation error τ satisfies kτ_h,kk → 0 as h, k → 0.

(32)

Definition 2.2. A finite difference scheme Un+1= Gh,k(Un) is called stable in a region (h, k) ∈ R

if there exists a norm k · k such that

kGn_h,kU k ≤ CkU k for all n > 0. Here, C is a constant independent of n.

Definition 2.3. A finite difference method is called convergence if the true error ke_h,kk → 0 as h, k → 0.

In the above analysis, we have seen that for forward Euler method for the heat equation, Theorem 2.1. The forward Euler method for the heat equation has the property:

stability + consistency ⇔ convergence.

We have already proven stability + consistency ⇒ convergence. The proof of the other way is the same as the Dahlquist equivalent theorem 1.4.

2.2 L

2

Stability – von Neumann Analysis

We have seen from the above discussion that the convergence issue is reduced to the stability issue. In the stability analysis, we need to choose a norm to measure stability of the amplification operator G, we will choose operator norm in L2. For constant coefficient case, the von Neumann analysis (via Fourier method) provides a necessary and sufficient condition for stability. For more general cases such as variable coefficient cases, Kreiss’ matrix theorem provides a good characterization of stability.

Below, we describe L2stability analysis. We introduce two equivalent methods: Fourier method and energy method, the Fourier method is also known as the von Neumann stability analysis. Given {Uj}j∈Z, let us define

kU k2=X

j∈Z

|Uj|2

and its Fourier transform ˆ U (ξ) = 1 2π X j∈Z Uje−ijξ, ξ ∈ [0, 2π).

There are two advantages to analyze stability of a finite difference scheme using Fourier method. • The shift operator is transformed to a multiplier:

d

T U (ξ) = eiξU (ξ),ˆ

(33)

• The Parseval equality kU k2 ₌ _{k ˆ}_{U k}2 ≡ ˆ π −π | ˆU (ξ)|2dξ

holds, thus one can control the L2-norm of U and GU in the Fourier space. Now, let us consider a finite difference scheme of the form:

U_jn+1= (GUn)j = m

X

k=−l

akUj+kn .

Taking Fourier transform, we obtain

[ Un+1_{= b}_{G(ξ) c}_Un_(ξ), where b G(ξ) := m X k=−l akeikξ.

From the Parseval equality,

kUn+1k2 = k [Un+1_k2 = ˆ π −π | bG(ξ)|2| cUn_(ξ)|2_dξ ≤ max ξ | bG(ξ)| 2 ˆ π −π | cUn_(ξ)|2_dξ = | bG|2_∞kU k2

Thus a sufficient condition for stability is

| bG|∞≤ 1. (2.9)

Theorem 2.2. A finite difference scheme U_jn+1=

m

X

k=−l

akUj+kn

with constant coefficients is stable if

b G(ξ) := m X k=−l akeikξ satisfies max −π≤ξ≤π| bG(ξ)| ≤ 1. (2.10)

(34)

For the forward Euler method for the heat equation,

U_jn+1 = G(Uj−1, Uj, Uj+1) = αUj−1+ (1 − 2α)Uj+ αUj+1, α =

k h2.

the corresponding b

G(ξ) = α(eiξ+ e−iξ) + (1 − 2α) = 1 − 4α sin2 ξ 2

. The condition (2.10) is equivalent to

α ≤ 1 2. That is, k h2 ≤ 1 2.

Or equivalently, U_jn+1is the convex combination of Uj−1, Uj and Uj+1.

Homeworks.

1. Compute the bG for the schemes: Backward Euler, RK2 and Crank-Nicolson.

2.3 Energy method

Let us write the finite difference scheme as

U_jn+1= αU_j−1n + βU_jn+ γU_j+1n , (2.11) where

α, β, γ ≥ 0 and α + β + γ = 1.

We multiply (2.11) by U_jn+1on both sides, apply Cauchy-Schwarz inequality, we get

(U_jn+1)2 = αU_j−1n U_jn+1+ βU_jnU_jn+1+ γU_j+1n U_jn+1 ≤ α 2((U n j−1)2+ (Ujn+1) 2_{) +}β 2((U n j)2+ (Ujn+1) 2_{) +} γ 2((U n j+1)2+ (Ujn+1) 2₎

Here, we have used α, β, γ ≥ 0. We multiply this inequality by h and sum it over j ∈ Z. Denote

kU k2:=   X j |Uj|2h   1/2 . We get kUn+1k2 ≤ α 2(kU n_k2_{+ kU}n+1_k2_{) +} β 2(kU n_k2_{+ kU}n+1_k2_{) +}γ 2(kU n_k2_{+ kU}n+1_k2₎

(35)

= 1 2(kU

n_k2_{+ kU}n+1_k2_).

Here, α + β + γ = 1 is applied. Thus, we get the energy estimate

kUn+1_k2_{≤ kU}n_k2_. _(2.12)

Homeworks.

1. Can the RK-2 method possess an energy estimate?

2.4 Stability Analysis via Entropy Estimates

Stability in the maximum norm

We notice that the action of G is a convex combination of Uj−1, Uj, Uj+1, provided

0 < k h2 ≤ 1 2. (2.13) Thus, we get min {Uj−1n , Ujn, Uj+1n } ≤ Ujn+1≤ max {Uj−1n , Ujn, Uj+1n }. This leads to minjUjn+1≥ minjUjn, maxjU_jn+1≤ maxjUjn, and maxj|U_jn+1| ≤ maxj|Ujn|.

That is, G is stable in k · k∞.

Entropy estimates

The property that Un+1 is a convex combination (average) of Un is very important. Given any convex function η(u), by Jenson’s inequality, we have*

η(U_jn+1) ≤ αη(U_j−1n ) + βη(U_jn) + γη(U_j+1n ). (2.14)

*_{η is convex implies}

η(αUj−1+ (1 − α)V ) ≤ αη(Uj−1) + (1 − α)η(V ).

Take V = (βUj+ γUj+1)/(1 − α). Apply the definition of convex function again, we get

η(V ) ≤ β

1 − αη(Uj) + γ

1 − αη(Uj+1). Combine these two inequalities, we get

(36)

Summing over all j and using α + β + γ = 1, we get X j η(U_jn+1) ≤X j η(U_jn). (2.15)

The convex function is called entropy in this setting. The above inequality means that the “entropy” decreases in time. In particular, we choose

• η(u) = |u|2, we recover the L2stability, • η(u) = |u|p, 1 ≤ p < ∞, we get

X j |U_jn+1|p ≤X j |U_jn|p This leads to   X j |U_jn+1|ph   1/p ≤   X j |U_jn|ph   1/p , the general Lpstability. Taking p → ∞, we recover L∞stability. • η(u) = |u − c| for any constant c, we obtain

X j |Un+1 j − c| ≤ X j |Un j − c|

This is called Kruzkov’s entropy estimate. We will see this inequality in hyperbolic theory again.

Homeworks.

1. Show that the solution of the difference equation derived from the RK2 satisfies the entropy estimate. What is the condition required on h and k for such entropy estimate?

2.5 Entropy estimate for backward Euler method

The backward Euler method for the heat equation is

Un+1 = Un+ λAUn+1, A = diag(1, −2, 1), λ = k h2,

the amplification matrix is given by

Un+1= GUn, G = (I − λA)−1. (2.16) The matrix M := I − λA has the following property:

(37)

Definition 2.4. A matrix M = (mij) is called an M -matrix if it satisfies

mii> 0, mij ≤ 0,

X

j6=i

|mij| ≤ mii (2.17)

For M = I − λA arisen from the backward Euler method, the corresponding mii = 1 + 2λ,

mij = −λ for j 6= i. Thus, it is an M -matrix.

Theorem 2.3. The inverse of an M-matrix is a nonnegative matrix, i.e. all its entries are non-negative.

I shall not prove this general theorem. You can read Golub-von Loan’s book, or consult wiki. Instead, I will find the inverse of M for the above specific M-matrix. Let us express

M = I − λ diag(1, −2, 1) = 1 + 2λ

2 diag (−a, 2, −a). Here,

a = 2λ

1 + 2λ, and 0 < a < 1 if h, k > 0. The general solution of the difference equation

− auj−1+ 2uj− auj+1= 0 (2.18)

has the form:

uj = C1ρj₁+ C2ρj₂

where ρ1, ρ2are the characteristic roots, i.e. the roots of the polynomial equation

−aρ2+ 2ρ − a = 0. Thus,

ρi=

1 ±√1 − a2

a .

From the assumption of the M-matrix, 0 < a < 1, we have ρ1< 1 and ρ2 > 1.

Now, we define a fundamental solution: gj =

(

ρj₁ for j ≥ 0 ρj₂ for j < 0 .

We can check that gj → 0 as |j| → ∞. Moreover, gj satisfies the difference equation (2.18) for

|j| ≥ 1. For j = 0, we have

−ag−1+ 2g0− ag1 = −aρ2−1+ 2 − aρ1 = 2 − a(ρ1+ ρ−12 ) = d

We reset gj ← gj 2λ (1 + 2λ)d .

(38)

Then we have 1 + 2λ 2 (−ag−1+ 2g0− ag1) = 1, 1 + 2λ 2 (−agj−1+ 2gj − agj+1) = 0, ∀j 6= 0. This means X j gi−jmj,k= δi,k, or

G(I − λA) = Id.

Thus, M−1 = (gi−j) is a positive matrix (i.e. all its entries are positive). Furthermore, from

gi−j− λ (−gi−j−1+ 2gi−j− gi−j+1) = δij,

summing over j ∈ Z, we obtain

X

j

gi−j = 1 for all i.

This means that

U_in+1= (GUn)_i =X

j∈Z

gi−jUjn

is indeed average of Unwith weights gi−j. With this property, we can apply Jensen’s inequality to

get the entropy estimates:

Theorem 2.4. Let η(u) be a convex function. Let U_jnbe a solution of the difference equation derived from the backward Euler method for the heat equation. Then we have

X

j

η(U_jn) ≤X

j

η(U_j0). (2.19)

Remark 1. • From entropy estimate, we get stability estimates for G in all Lp norms with 1 ≤ p ≤ ∞.

• It is important to note that there is no restriction on the mesh sizes h and k for stability for the Backward Euler method.

Homeworks.

1. Can the Crank-Nicolson method for the heat equation satisfy the entropy estimate? What is the condition on h and k?

(39)

2.6 Existence Theory

We can prove existence theorem of PDEs through finite difference approximation. In order to do so, let us define continuous and discrete Sobolev spaces and make a connection between them.

The continuous Sobolev space is defined as

Hm := {u : R → R|u, u0, ..., u(m)∈ L2_(R)}.

The discrete Sobolev space for functions defined on grid Gh := {xj := jh|j ∈ Z}.

H_hm:= {U : Gh→ R|U, Dx,+U, ..., Dmx,+U ∈ `2}.

Here, (Dx,+U )nj := (Uj+1n − Ujn)/h

For any discrete function Uj ∈ Hhm we can construct a function uhin Hmdefined by

uh(x) :=

X

j

Ujφh(x − xj) (2.20)

where φh(x) = sinc(x/h). We have

uh(xj) = Uj, for all xj ∈ Gh

It can be shown that

kD_xmuhk ≈ kDmx,+U k. (2.21)

Here, the norm is the L2 norm. Similarly, the space L∞_k (H_hm) can be embeded into L∞(Hm) by defining uh,k(x, t) = X n≥0 X j U_jnφk(t)φh(x)

The discrete norm and the continuous norm are equivalent.

2.6.1 Existence via forward Euler method

The forward Euler method for the heat equation ut= uxxreads

U_jn+1= U_jn+ k h2 U

n

j−1− 2Ujn+ Uj+1n .

Here, We have seen that we can get the energy estimate: kUnk ≤ kU0k.

We perform finite difference operation on the above equation, say the forward Euler equation, for instance, let V_jn = (Dx,+U )nj := (Uj+1n − Ujn)/h. Then Vjn satisfies the same finite difference

equation

V_jn+1 = V_jn+ k h2 V

n

(40)

Thus, it also possesses the same energy estimate. Similar estimate for D_x,+2 U . In general, we have kD_x,+m Unk ≤ kDm_x,+U0k. (2.22) If we assume the initial data u0 ∈ H2, then we get Un∈ H_h2 for all n ≥ 0.

Theorem 2.5. If the initial data u0 ∈ Hm,m ≥ 2 and k/h2 ≤ 1/2, then the solution of forward

Euler equation has the estimate

kD_x,+m Unk ≤ kDm_x,+U0k, kDt,+Unk ≤ kDx,+2 U0k (2.23)

Further, the corresponding smoothing functionuh,k has the same estimate and has a subsequence

converges to a solutionu(x, t) of the original equation.

Proof. The functions uh,k are unformly bounded in W1,∞(H2). Hence they have a subsequence

converges to a function u ∈ W1,∞(H2) weakly in W1,∞(H2) and strongly in L∞(H1). The functions uh,k satisfy uh,k(xj, tn+1) − uh,k(xj, tn) = k h2(uh,k(xj−1, t n_{) − 2u} h,k(xj, tn) + uh,k(xj+1, tn))

Multiply a test smooth function φ, sum over j and n, take summation by part, we can get the subsequence converges to a solution of ut= uxxweakly.

2.6.2 A Sharper Energy Estimate for backward Euler method

In this subsection, we will get a sharper energy estimate for solutions obtained from the backward Euler method. Recall the backward Euler method for solving the heat equation is

U_jn+1− U_jn= λ(U_j−1n+1− 2U_jn+1+ U_j+1n+1) (2.24) where λ = k/h2. An important technique is the summation by part:

X j (Uj− Uj−1)Vj = − X j Uj(Vj+1− Vj) (2.25)

There is no boundary term because we consider periodic condition in the present case. We multiply both sides by U_jn+1, then sum over j. We get

X j (U_jn+1)2− U_jn+1U_jn=X j λ(U_j−1n+1− 2U_jn+1+ U_j+1n+1)U_jn+1 = λ   X j (U_j−1n+1− U_jn+1)U_jn+1+X j (U_j+1n+1− U_jn)U_jn+1   = λ   X j (U_jn+1− U_j+1n+1)U_j+1n+1+X j (U_j+1n+1− U_jn)U_jn+1  

(41)

= −λX j |Un+1 j+1 − Ujn+1|2. The term U_jn+1U_jn≤ 1 2((U n+1 j )2+ (Ujn)2)

by Cauchy-Schwartz. Hence, we get 1 2 X j (U_jn+1)2− (U_jn)2) ≤ −λX j |U_j+1n+1− U_jn+1|2 Or 1 2Dt,−kU n+1_k2 _{≤ −}h2 k k h2kDx,+U n+1_{k = −kD} x,+Un+1k2. (2.26) where, Dt,−Vjn+1:= V_jn+1− Vn j k , Dx,+U n+1 j := U_j+1n+1− U_jn+1 h ,

We sum in n from n = 1 to N , we get the following theorem. Theorem 2.6. For the backward Euler method, we have the estimate

kUNk2+ C

N

X

n=1

kD_x,+Unk2 ≤ kU0k2 (2.27)

This gives controls not only on kUnk2_{but also on kD}

x,+Unk.

Homeworks.

1. Show that the Crank-Nicolson method also has similar energy estimate. 2. Can forward Euler method have similar energy estimate?

2.7 Relaxation of errors

In this section, we want to study the evolution of an error on a periodic domain [0, 2π). We consider ut= uxx, x ∈ [0, 2π), (2.28)

with initial data u0. The grid points xj = 2πj/N and h = 2π/N . The error enj := u(xj, tn) − Ujn

satisfies

en+1_j = en_j + λ(en_j−1− 2en_j + en_j+1) + kτ_jn. (2.29) We want to know how error is relaxed to zero from an initial error e0. We study the homogeneous finite difference quation first. That is

(42)

or en+1 = G(un). The matrix is a tridiagonal matrix. It can be diagonalized by Fourier method. The eigenfunctions and eigenvalues are

vk,j = e2πijk/N, ρk = 1 − 2λ + 2λ cos(2πk/N ) = 1 − 4λ sin2(πk/N ), k = 0, ..., N − 1.

When λ ≤ 1/2, all eigenvalues are negative except ρ0:

1 = ρ0 > |ρ1| > |ρ2| > · · · .

The eigenfunction

v0≡ 1.

Hence, the projection of any discrete function U onto this eigenfunction is the average:P

jUj.

Now, we decompose the error into

en= N −1 X k=0 en_kvk, n ≥ 0 Then en+1_k = ρkenk. Thus, en_k = ρn_ke0_k.

Since ρ0= 1, we see that en0 = e00, which is the average of en, does not decay, unless e00 = 0 initially.

To guarantee the average of e0 is zero, we may choose U_jnto be the cell average of u(x, tn) in the jth cell: U_jn= 1 h ˆ xj+1/2 xj−1/2 u(x, tn) dx.

instead of the grid data. This implies the initial error has zero local averages, and thus so does the global average.

For the decay behaviours of errors en_k for k = 1, ..., N − 1, we notice that for 1 ≤ k ≤ N − 1, ρk= 1 − 4λ sin2 πk N ≈ 1 − 4λ πk N 2 , for N >> 1. The largest values of ρs are ρ1and ρN −1:

ρ1= ρN −1≈ 1 − 4λ π N 2 = 1 − 4∆t h2 π2 N2 = 1 − ∆t.

They correspond to low frequency eigenmodes: v1 = (e2πij/N)N −1j=0 and vN −1 = (e−2πij/N)N −1j=0 .

The decay rate

(43)

Here, t = n∆t. This is the decay rate of en₁ and en_{N −1}with n∆t = t. They are the slowest decay modes. For k = N/2, the corresponding eigenmode v_N/2 = ((−1)j)N −1_j=0 is the highest frequency mode. The corresponding eigenvalue

ρN/2 = 1 − 4λ = 1 − 4

∆t h2.

The decay rate is

ρn_N/2= 1 − 4∆t h2 n ≈ e−h24t.

which decays very fast.

The contribution of the truncation error to the true error is: en+1= ρkenk+ ∆tτkn Its solution is en_k = ρn_ke0_k+ ∆t n−1 X m=0 ρn−1−m_k τ_km

We see that the term en₀ does not tend to zero unless τ₀m = 0. This can be achieved if we choose Uj

as well as fj to be the cell averages instead the grid data. We have seen that the truncation error is

second order. That is

τk,max := max 0≤m≤n−1|τ m k | = O(h2). Then for k ≥ 1, ∆t n−1 X m=0 |ρ_k|n−1−m≤ ∆t n−1 X m=0 |ρ₁|n−1−m= ∆t1 − ρ n 1 1 − ρ1 ≈ ∆t 1 − e −t 1 − (1 − ∆t) = 1 − e −t_. Thus, we obtain |en| ≤ e−te0+ (1 − e−t)O(h2) with n∆t = t. Homeworks. 1. Define Uj := _h1 ´xj+1/2

xj−1/2 u(x) dx. Show that if u(x) is a smooth periodic function on [0, 1],

then

u00(xj) =

1

h2(Uj−1− 2Uj+ Uj+1) + τ

(44)

2.8 Boundary Conditions

2.8.1 Dirichlet boundary condition

Now, we consider the initial-boundary problem:

ut= uxx, x ∈ [0, 1]

The Dirichlet boundary condition is

u(0) = a, u(1) = b. (2.31) The initial condition is

u(x, 0) = u0(x).

We introduce uniform grids: xj = j/N , j = 0, ..., N . The forward Euler method can be realized

on x1, ..., xN −1as

U_jn+1− U_jn= ∆t h2 U

n

j−1− 2Ujn+ Uj+1n , j = 1, ..., N − 1.

Near the boundary point x1, the finite difference approximation of uxx at x1 involves u at x0 = 0.

We plug the boundary condition: uxx(x1) = U0− 2U1+ U2 h2 + O(h 2_{) =} a − 2U1+ U2 h2 + O(h 2₎ _(2.32) Similarly, uxx(xN −1) = UN −2− 2UN −1+ UN h2 + O(h 2_{) =} UN −2− 2UN −1+ b h2 + O(h 2₎

The unknowns are U₁n, ..., U_{N −1}n with N − 1 finite difference equations at x1, ..., xN −1. Including

boundary terms, we write the equation as

Un+1= (I + λA)Un+ λB, λ = ∆t h2, A =      −2 1 0 · · · 0 0 1 −2 1 · · · 0 0 .. . ... ... . .. ... ... 0 0 0 · · · 1 −2      (N −1)×(N −1) , B =      a 0 .. . b      (N −1)×1 . (2.33)

The matrix A is the discrete Laplacian with zero Dirichlet boundary condition. The term B comes from the Dirichlet boundary conditions.

(45)

Next, we examine how error is relaxed for the Euler method with zero Dirichlet boundary con-dition. We have seen that the error en_j := u(xj, tn) − U_jn satisfies the difference equation with

truncation error as the source term:

en+1_j = ((I + λA)en)_j+ λδB + ∆tτ_jn, j = 1, ..., N − 1 where τ_jnis the truncation error,

δB =      δa 0 .. . δb      (N −1)×1 .

If there is any error from u(0) = a, say U0= a + δa, it will creates a truncation error δa/h2at x1.

The solution for this difference equation is en= Gne0+

n−1

X

m=0

Gn−1−m(λδB + ∆tτm) , G = I + λA. From Fourier method, we can compute the eigenvector and eigenvalue of A:

vk= (sin(πjk/N )N −1j=1 , λk = −4 sin2(πk/(2N )), k = 1, ..., N − 1.

In fact, we extend vkto an N + 1-vector as vk,0 = vk,N = 0. Using this extended vector, we can

check

vk,j−1−2vk,j+vk,j+1= (2 cos(πk/N )−2) sin(πjk/N ) = −4 sin2

πk 2N

vk,j, j = 1, ..., N −1, k = 1, ..., N −1.

The eigenvalues of I + 4λA are

ρk= 1 − 4λ sin2

πk 2N

, k = 1, ..., N − 1. In the present case, all eigenvalues

ρk< 1, k = 1, ..., N − 1.

provided the stability condition

λ ≤ 1/2. In this case,

1 > ρ1 > |ρ2| > ... > |ρN −1|.

The lowest mode is ρ1, which is

ρ1 = 1 − 4λ sin2(π/2N ) ≈ 1 − λ π N 2 = 1 −∆t h2 π2 N2 = 1 − π 2_∆t.

(46)

and

ρn₁ ≈ 1 − π2∆tn≈ e−π2t Thus,

kGnk ≤ e−π2t, n∆t = t. The accumulation effect is

n−1 X m=0 kGn−1−mk ≤ 1 − e −π2_t ∆t . Thus, the error from the initial data is

kGne0k ≤ e−π2tke0k The error coming from truncation is

n−1

X

m=0

Gn−1−m(∆tτm) = (1 − e−π2t)O(h2). The error due to boundary is

n−1

X

m=0

Gn−1−m(λδB) = (1 − e−π2t) 1 h2kδBk.

2.8.2 Neumann boundary condition

The Neumann boundary condition is

u0(0) = σ0, u0(1) = σ1. (2.34)

We may use the following discretization methods: • First order: U1− U0 h = σ0. • Second order-I: U1− U0 h = ux(x1/2) = ux(0) + h 2uxx(x0) = σ0+ h 2f (x0) (2.35) • Second order-II: we use extrapolation

−3U0+ 4U1− U2

2h2 = σ0.

The knowns are U_jn with j = 0, ..., N . In the mean time, we add two more equations at the boundaries.

(47)

Homeworks.

1. Find the eigenfunctions and eigenvalues for the discrete Laplacian with the Neumann bound-ary condition (consider both first order and second order approximation at boundbound-ary). Notice that there is a zero eigenvalue.

Hint: You may use Matlab to find the eigenvalues and eigenvectors.

Here, I will provide another method. Suppose A is the discrete Laplacian with Neumann boundary condition. A is an (N + 1) × (N + 1) matrix. Suppose Av = λv. Then for j = 1, ..., N − 1, v satisfies vj−1− 2vj+ vj+1= λvj, j = 1, ..., N − 1. For v0, we have −2v0+ 2v1= λv0. For vN, we have −2v_N + 2vN −1= λvN.

Then this matrix has the following eigenvectors:

v_jk= cos(πjk/N ), k = 0, ..., N with eigenvalue λk= −2 + 2 cos(πk/N ) = −4 sin2 πk 2N , k = 0, ..., N. Notice that λ0= 0. The error corresponding this eigenmode does not decay.

Homeworks.

1. Complete the calculation. 2. Consider

ut= uxx+ f (x)

on [0, 1] with Neumann boundary condition u0(0) = u0(1) = 0. If´ f (x) dx 6= 0. What wil happen to u as t → ∞?

2.9 The discrete Laplacian and its inversion

We consider the elliptic equation

uxx− αu = f (x), x ∈ (0, 1),

with the Dirichlet boundary condition

(48)

The finite difference approximation of uxx at x1 involves u at x0 = 0. We plug the boundary contion: uxx(x1) = U0− 2U1+ U2 h2 + O(h 2_{) =} a − 2U1+ U2 h2 + O(h 2₎ Similarly, uxx(xN −1) = UN −2− 2UN −1+ UN h2 + O(h 2_{) =} UN −2− 2UN −1+ b h2 + O(h 2₎

The unknowns are U₁n, ..., U_{N −1}n with N −1 finite difference equations at x1, ..., xN −1. The discrete

Laplacian becomes A =      −2 1 0 · · · 0 0 1 −2 1 · · · 0 0 .. . ... ... . .. ... ... 0 0 0 · · · 1 −2      (N −1)×(N −1) . (2.37)

This is a discrete Laplacian with Dirichlet boundary condition. We have seen in the last section that A can be diagonalized by the discrete Fourier sin functions vk = (sin(πjk/N )_j=1N −1. In this

section, we will find explicit expression of A−1. Indeed, we will find (A−2β)−1where β = αh2/2. The difference equation

Uj−1− (2 + 2β)Uj+ Uj+1 = 0

has two independent solutions ρj₁and ρj₂, where ρiare roots of

ρ2− (2 + 2β)ρ + 1 = 0. That is

ρ = 1 + β ±p(1 + β)2_{− 1.}

Our goal below is to construct fundamental solution Gij, which is G = (A − 2β)−1.

Case 1: β = 0 When β = 0, the two independent solutions are Uj = 1 and Uj = j. Let us

construct the fundamental solution centered at i, call it Gij. It has the form:

Gi,j =

jCi j ≤ i,

(N − j)C_i0 j ≥ i, 1 ≤ i, j ≤ (N − 1). (2.38) for some constants Ciand Ci0. From Gi,i−1− 2Gi,i+ Gi,i+1= 1 and iCi = (N − i)Ci0, we obtain

that

Ci = −(N − i)/N Ci0= −i/N.

(49)

Case 2: β > 0 When β > 0, the two roots are ρ1 < 1 and ρ2 > 1. The fundamental solution Gij

has the following form Gij =

(

C1ρj1+ C2ρj2 for j ≤ i

D1ρj1+ D2ρj2 for j ≥ i

1 ≤ i ≤ (N − 1), 0 ≤ j ≤ N.

Here, we extend Gi,j with 1 ≤ j ≤ (N − 1) to 0 ≤ j ≤ N . The constants C1, C2, D1, D2 are

determined by

Gi0= 0, Gi,N = 0,

Gi,i−1− (2 + 2β)Gi,i+ Gi,i+1 = 1

C1ρi1+ C2ρi2= D1ρi1+ D2ρi2.

Homeworks.

1. Find the coefficients C1, C2, D1, D2above.

Let us go back to the original equation:

uxx− αu = f (x)

The above study of the Green’s function of the discrete Laplacian helps us to quantify the the error produced from the source term. If Au = f and A−1= G, then an error in f , say τ , will produce an error

e = Gτ.

The error from the boundary also has the same behaviour. If the off-diagonal part of G decays exponentially (i.e. β > 0), then the error is “localized,” otherwise, it pollutes everywhere.

Project 2. Solve the following equation

uxx− αu + f (x) = 0, x ∈ [0, 1]

numerically with periodic, Dirichlet and Neumann boundary condition. The equilibrium 1. A layer structure f (x) = −1 1/4 < x < 3/4 1 otherwise 2. An impluse f (x) = γ 1/2 − δ < x < 1/2 + δ 0 otherwise 3. A dipole f (x) =    γ 1/2 − δ < x < 1/2 −γ 1/2 < x < 1/2 + δ 0 otherwise

(50)

You may choose α = 0.1, 1, 10, observe how solutions change as you vary α. Project 3. Solve the following equation

−uxx+ f (u) = g(x), x ∈ [0, 1]

numerically with Neumann boundary condition. Here, f (u) = F0(u) and the potential is F (u) = u4− γu2.

Study the solution as a function of γ. Choose simple g, say piecewise constant, a delta function δ(x − x0), or a dipole δ(x − x0+ ) − δ(x − x0− ).

(51)

Chapter 3

Finite Difference Methods for Linear

Elliptic Equations

3.1 Discrete Laplacian in two dimensions

We will solve the Poisson equation

4u = f in a domain Ω ⊂ R2with Dirichlet boundary condition

u = g on ∂Ω

Such a problem is a core problem in many applications. We may assume g = 0 by substracting a suitable function from u. Thus, we limit our discussion to the case of zero boundary condition. Let h be the spatial mesh size. For simplicity, let us assume Ω = [0, 1] × [0, 1]. But many discussion below can be extended to general smooth bounded domain.

3.1.1 Discretization methods

Centered finite difference The Laplacian is approximated by A = 1

h2 (Ui−1,j+ Ui+1,j+ Ui,j−1+ Ui,j+1− 4Ui,j) .

For the square domain, the indeces run from 1 ≤ i, j ≤ N − 1 and U0,j = UN,j = Ui,0= Ui,N = 0

Finite Difference Methods

FINITE DIFFERENCE METHODS

FOR

SOLVING DIFFERENTIAL EQUATIONS

I-Liang Chern

Contents

Chapter 1

Introduction

1.1

Finite Difference Approximation

1.2

Basic Numerical Methods for Ordinary Differential Equations

1.3

Runge-Kutta methods

1.4

Multistep methods

1.5

Linear difference equation

1.6

Stability analysis

Chapter 2

Finite Difference Methods for Linear

Parabolic Equations

2.1

Finite Difference Methods for the Heat Equation

2.2

L

Stability – von Neumann Analysis

2.3

Energy method

2.4

Stability Analysis via Entropy Estimates

2.5

Entropy estimate for backward Euler method

2.6

Existence Theory

2.7

Relaxation of errors

2.8

Boundary Conditions

2.9

The discrete Laplacian and its inversion

Chapter 3

Finite Difference Methods for Linear

Elliptic Equations

3.1

Discrete Laplacian in two dimensions