5.2 Laplace transform for differential equations
5.2.2 Laplace transform applied to differential equations
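For the linear differential equation with constant coefficients $P(D)y = f$, we take the Laplace transform of both sides:
\[ \mathcal{L}(P(D)y) = \mathcal{L}f. \]
In other words, the Laplace transform $Y(s)$ of $y$ satisfies an algebraic equation. To show this, we compute
\[ \mathcal{L}(D^k y) = s^k Y(s) - \sum_{j=0}^{k-1} s^{k-1-j} y^{(j)}(0). \]
The resulting algebraic equation $P(s)Y(s) - I(s) = F(s)$, where $I(s)$ collects the terms coming from the initial data, can be solved explicitly with
\[ Y(s) = \frac{F(s) + I(s)}{P(s)}. \]
Let us call
\[ G(t) = \mathcal{L}^{-1}\left(\frac{1}{P(s)}\right) \tag{5.7} \]
the Green's function. Then in the case of $I(s) \equiv 0$, we have
\[ y(t) = \mathcal{L}^{-1}\left(\frac{1}{P(s)} \cdot F(s)\right) = (G * f)(t). \]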
Thus, the solution is the convolution of the Green’s function and the source term.
Example
1. Solve $y'' + 4y' + 4y = te^{-2t}$, $y(0) = 1$, $y'(0) = 1$.
Taking the Laplace transform, we get
\[ \mathcal{L}(Dy) = -y(0) + sY(s), \]
\[ \mathcal{L}(D^2y) = -y'(0) + s\mathcal{L}(Dy) = -y'(0) + s(-y(0) + sY(s)). \]
Hence,
\[ \mathcal{L}[(D^2 + 4D + 4)y] = (s^2 + 4s + 4)Y(s) - [y'(0) + sy(0) + 4y(0)]. \]
The Laplace transform of the source term is
\[ \mathcal{L}(te^{-2t}) = \frac{1}{(s+2)^2}. \]
Thus, we get
\[ (s^2 + 4s + 4)Y(s) - [y'(0) + sy(0) + 4y(0)] = \frac{1}{(s+2)^2}, \]
\[ Y(s) = \frac{1}{(s+2)^2}\left( [y'(0) + sy(0) + 4y(0)] + \frac{1}{(s+2)^2} \right) = \frac{y(0)}{s+2} + \frac{y'(0) + 2y(0)}{(s+2)^2} + \frac{1}{(s+2)^4}. \]
Thus, its inverse Laplace transform is
\[ y(t) = y(0)e^{-2t} + (y'(0) + 2y(0))te^{-2t} + \frac{1}{3!}t^3e^{-2t}. \]
2. Solve $y'' - y = f(t)$, $y(0) = y'(0) = 0$, where
\[ f(t) = \begin{cases} t, & 0 \le t < 1, \\ 0, & 1 \le t < \infty. \end{cases} \]
The Laplace transform of $f$ is
\[ F(s) = \mathcal{L}(f) = \int_0^1 te^{-st}\,dt = \frac{1}{s^2}\left(1 - (s+1)e^{-s}\right). \]
The Laplace transform of the equation gives
\[ (s^2 - 1)Y(s) = F(s). \]
The inverse Laplace transform of each term of $Y$ is
L−1 1
Here h(t) is the Heaviside function.
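The two worked examples can be sanity-checked numerically. The following is a minimal sketch, not part of the original notes: it substitutes the data $y(0) = 1$, $y'(0) = 1$ into the solution of the first example, measures the residual of the differential equation with central differences, and compares the closed form of $F(s)$ from the second example with a midpoint-rule quadrature. The step sizes and sample points are arbitrary choices.

```python
import math

def y(t):
    # Solution of Example 1 with y(0) = 1, y'(0) = 1:
    # y = e^{-2t} + 3 t e^{-2t} + t^3 e^{-2t} / 3!
    return math.exp(-2 * t) * (1 + 3 * t + t**3 / 6)

def residual(t, h=1e-4):
    # Central differences approximate y' and y''; the residual of
    # y'' + 4y' + 4y - t e^{-2t} should be close to zero.
    d1 = (y(t + h) - y(t - h)) / (2 * h)
    d2 = (y(t + h) - 2 * y(t) + y(t - h)) / h**2
    return d2 + 4 * d1 + 4 * y(t) - t * math.exp(-2 * t)

def F_closed(s):
    # Closed form of F(s) from Example 2.
    return (1 - (s + 1) * math.exp(-s)) / s**2

def F_quadrature(s, n=2000):
    # Midpoint rule for the integral of t * exp(-s t) over [0, 1].
    dt = 1.0 / n
    return dt * sum((k + 0.5) * dt * math.exp(-s * (k + 0.5) * dt) for k in range(n))

max_residual = max(abs(residual(0.1 * k)) for k in range(1, 30))
initial_slope = (y(1e-6) - y(-1e-6)) / 2e-6
print(max_residual, initial_slope, F_closed(2.0), F_quadrature(2.0))
```

A small residual and matching values of $F(s)$ are only consistency checks; they do not replace the derivation.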
Homeworks.
1. B-D,pp.322: 24,27,36,38.
2. B-D,pp. 338: 21,22
5.2.3 Generalized functions and Delta function
The delta function δ(t) is used to represent an impulse which is defined to be δ(t) =
The δ-function can be viewed as the limit of the finite impulses δ(t) = lim
Namely, for any smooth function φ with finite support (i.e. the nonzero domain of φ is bounded), the meaning of the integral:
Z
Since the latter is $\varphi(0)$, we therefore define $\delta$ to be the generalized function such that
\[ \int \delta(t)\varphi(t)\,dt = \varphi(0) \]
for any smooth function $\varphi$ with finite support. The function $\varphi$ here is called a test function. Likewise, a generalized function is defined by how it is used, namely, by how it acts on smooth test functions. For instance, the Heaviside function $h$ is a generalized function in the sense that
\[ \int h(t)\varphi(t)\,dt = \int_0^\infty \varphi(t)\,dt. \]
All ordinary functions are generalized functions. In particular, all piecewise smooth functions are generalized functions. For such a function $f$, it is unimportant how $f$ is defined at the jump points. All that matters is the integral
\[ \int f(t)\varphi(t)\,dt \]
with test function $\varphi$. For a piecewise smooth function $f$, the jump points make no contribution to the integral.
One can differentiate a generalized function. The generalized derivative of a generalized function is again a generalized function in the following sense:
\[ \int D_t f(t)\varphi(t)\,dt := -\int f(t)\varphi'(t)\,dt. \]
The right-hand side is well-defined because $f$ is a generalized function. You can check that $D_t h(t) = \delta(t)$. More generally, let $f$ be a piecewise smooth function having a jump at $t = a$ with jump height $[f]_a := \lim_{t\to a+} f(t) - \lim_{t\to a-} f(t)$. Let $f'(t)$ be the ordinary derivative of $f$ in the classical sense. Thus, $f'(t)$ is defined everywhere except at the jump $t = a$. This $f'(t)$ is a piecewise smooth function and hence it is a generalized function. From the definition of the generalized derivative, we claim that
\[ D_t f(t) = f'(t) + [f]_a\,\delta(t - a). \]
You can check that $D_t\delta$ is a generalized function. It is defined by
\[ \int (D_t\delta)(t)\varphi(t)\,dt := -\varphi'(0). \]
Let us abbreviate $D_t\delta$ by $\delta'(t)$ in later usage.
Similarly, one can take the indefinite integral of a generalized function.
Z Z t
for any test function $\varphi$ such that $\int \varphi = 0$. The Heaviside function $h(t)$ can be viewed as the integral of the delta function, namely,
\[ h(t) = \int_0^t \delta(\tau)\,d\tau. \]
Laplace transform of the delta-functions It is easy to check that
1. $\mathcal{L}\delta = \int \delta(t)e^{-st}\,dt = 1$.
2. $\mathcal{L}\delta' = s$.
3. $\mathcal{L}h = 1/s$.
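The first identity can be illustrated with finite impulses. The sketch below assumes one particular finite impulse, the box equal to $1/\epsilon$ on $[0, \epsilon)$ and zero elsewhere (an illustrative choice, not necessarily the one used in these notes). Its Laplace transform is $(1 - e^{-s\epsilon})/(s\epsilon)$, which tends to $1$ as $\epsilon \to 0$.

```python
import math

def laplace_box_impulse(s, eps):
    # Laplace transform of the box impulse equal to 1/eps on [0, eps) and 0 elsewhere:
    # (1/eps) * int_0^eps e^{-s t} dt = (1 - e^{-s eps}) / (s eps).
    # expm1 avoids cancellation when s * eps is small.
    return -math.expm1(-s * eps) / (s * eps)

s = 2.0
widths = (1.0, 0.1, 0.01, 0.001)
values = [laplace_box_impulse(s, eps) for eps in widths]
for eps, val in zip(widths, values):
    print(eps, val)
```

The printed values increase toward $1$ as the impulse narrows, in agreement with $\mathcal{L}\delta = 1$.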
5.2.4 Green’s function
Let us go back to the differential equation
\[ P(D)y = f \]
with initial data $y(0), \dots, y^{(n-1)}(0)$ prescribed. We recall that the Laplace transform of this equation gives
\[ \mathcal{L}(P(D)y) = P(s)\cdot Y(s) - I(s) = F(s), \tag{5.8} \]
where $Y(s) = (\mathcal{L}y)(s)$, $F(s) = (\mathcal{L}f)(s)$ and
\[ I(s) = \sum_{i=1}^{n}\sum_{k=i}^{n} a_k y^{(k-i)}(0)s^{i-1}. \]
The Green's function is defined to be
\[ G = \mathcal{L}^{-1}\left(\frac{1}{P(s)}\right). \tag{5.9} \]
There are two situations that produce the Green's function as the solution.
• Impulse source: $I(s) \equiv 0$ and $F(s) \equiv 1$. That is,
\[ P(D)G(t) = \delta(t), \quad G(0) = G'(0) = \cdots = G^{(n-1)}(0) = 0. \]
Taking the Laplace transform on both sides and using $\mathcal{L}\delta = 1$, we have $P(s)\mathcal{L}G = 1$, or $\mathcal{L}G = 1/P(s)$, or
\[ G = \mathcal{L}^{-1}\left(\frac{1}{P(s)}\right). \]
The Green's function corresponds to the solution with impulse source and zero initial data.
• Initial impulse: $I(s) \equiv 1$ and $F(s) \equiv 0$. That is,
\[ P(D)G(t) = 0 \text{ for } t > 0, \quad G(0) = G'(0) = \cdots = G^{(n-2)}(0) = 0, \quad G^{(n-1)}(0) = \frac{1}{a_n}. \]
Remark. Notice that the Green's functions obtained by the above two methods are identical. Indeed, let us see the following simplest example. The function $e^{at}$ is the solution (Green's function) of both problems:
(i) $y' - ay = \delta$, $y(0) = 0$,
(ii) $y' - ay = 0$, $y(0) = 1$.
In the first problem, the equation should be understood for $t \in \mathbb{R}$, and the corresponding initial data is $y(0-) = 0$. In the second problem, the equation is understood to hold for $t > 0$ in the classical sense, and the initial data is understood to be $y(0+) = 1$. With this solution $e^{at}$, if we define
\[ y(t) = \begin{cases} e^{at}, & t \ge 0, \\ 0, & t < 0, \end{cases} \]
then $D_t y - ay = \delta$. This means that this extended function is a solution of (i), where the derivative in (i) should be interpreted as the weak derivative.
Examples
Indeed, $G'$ has a jump at $t = 0$ and the generalized derivative of $G'$ produces the delta function.
Explicit form of the Green’s function
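Case 1. Suppose $P(s)$ has $n$ distinct roots $\lambda_1, \dots, \lambda_n$. Then $1/P(s)$ has a partial fraction expansion
\[ \frac{1}{P(s)} = \sum_{k=1}^{n} \frac{A_k}{s - \lambda_k}, \]
and the corresponding Green's function is
\[ G(t) = \sum_{k=1}^{n} A_k e^{\lambda_k t}. \]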
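As an illustration of Case 1, the sketch below builds the Green's function for the hypothetical operator $P(s) = s^2 + 3s + 2 = (s+1)(s+2)$, which is not an example from the notes. It uses the standard formula $A_k = 1/P'(\lambda_k)$ for the partial fraction coefficients at simple roots, and then checks that $G$ satisfies $P(D)G = 0$ for $t > 0$ with $G(0) = 0$ and $G'(0) = 1/a_n$.

```python
import math

# Hypothetical example: P(s) = s^2 + 3 s + 2 = (s + 1)(s + 2).
coeffs = [2.0, 3.0, 1.0]     # a_0, a_1, a_2
roots = [-1.0, -2.0]         # distinct roots lambda_1, lambda_2

def P_prime(s):
    # Derivative of P at s.
    return sum(k * a * s ** (k - 1) for k, a in enumerate(coeffs) if k > 0)

# Partial fraction coefficients for simple roots: A_k = 1 / P'(lambda_k).
A = [1.0 / P_prime(lam) for lam in roots]

def G(t, order=0):
    # order-th derivative of G(t) = sum_k A_k exp(lambda_k t), for t >= 0.
    return sum(a * lam**order * math.exp(lam * t) for a, lam in zip(A, roots))

def residual(t):
    # P(D)G evaluated at t > 0; it should vanish.
    return sum(c * G(t, k) for k, c in enumerate(coeffs))

print(A, G(0.0), G(0.0, 1), residual(0.7))
```

Here $G(t) = e^{-t} - e^{-2t}$, so the zero value and unit slope at $t = 0$ match the initial impulse description with $a_n = 1$.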
It can be shown that (see (5.2))
L−1
Representation of solutions in terms of Green’s function
1. Contribution from the source term. With the Green's function, using convolution, one can express the solution of the equation $P(D)y = f$ with zero initial condition by
\[ y(t) = (G * f)(t) = \int_0^t G(t-\tau)f(\tau)\,d\tau. \]
A physical interpretation of this is that the source term $f(t)$ can be viewed as
\[ f(t) = \int_0^t f(\tau)\delta(t-\tau)\,d\tau, \]
the superposition of delta sources $\delta(t-\tau)$ with weights $f(\tau)$. Each delta source produces a solution $G(t-\tau)f(\tau)$. By the linearity of the equation, the solution is the superposition of these solutions:
\[ y(t) = \int_0^t G(t-\tau)f(\tau)\,d\tau. \]
2. Contribution from the initial data. Next, let us see the case when $f \equiv 0$ and the initial data are not zero. We have seen that the contribution of the initial state is
\[ Y(s) = \frac{I(s)}{P(s)}, \quad\text{where}\quad I(s) = \sum_{i=1}^{n}\sum_{k=i}^{n} a_k y^{(k-i)}(0)s^{i-1}. \]
We have seen that $\mathcal{L}^{-1}(s^{i-1}/P(s)) = D^{i-1}\mathcal{L}^{-1}(1/P(s)) = D^{i-1}G(t)$ (see (5.2)). With this, we can write the general solution as follows.
Theorem 5.1. The solution to the initial value problem $P(D)y = f$ with prescribed $y(0), \dots, y^{(n-1)}(0)$ has the following explicit expression:
\[ y(t) = \mathcal{L}^{-1}\left(\frac{I(s)}{P(s)} + \frac{F(s)}{P(s)}\right) = \sum_{i=1}^{n}\sum_{k=i}^{n} a_k y^{(k-i)}(0)G^{(i-1)}(t) + (G * f)(t). \]
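The representation formula can be tried on the first worked example of Section 5.2.2, where $P(D) = D^2 + 4D + 4$, $f(t) = te^{-2t}$, $y(0) = y'(0) = 1$. The sketch below is a numerical check under these assumptions: it takes $G(t) = te^{-2t}$ (the inverse transform of $1/(s+2)^2$), evaluates the double sum and a midpoint-rule approximation of $G * f$, and compares the result with the closed-form solution of that example. The quadrature resolution is an arbitrary choice.

```python
import math

a = [4.0, 4.0, 1.0]          # a_0, a_1, a_2 for P(D) = D^2 + 4D + 4
y0 = [1.0, 1.0]              # y(0), y'(0)

def G(t, order=0):
    # Green's function G(t) = t e^{-2t} for P(s) = (s + 2)^2, and its first derivative.
    if order == 0:
        return t * math.exp(-2 * t)
    return (1 - 2 * t) * math.exp(-2 * t)

def f(t):
    return t * math.exp(-2 * t)

def convolution(t, n=2000):
    # Composite midpoint rule for (G * f)(t) = int_0^t G(t - tau) f(tau) d tau.
    h = t / n
    return h * sum(G(t - (j + 0.5) * h) * f((j + 0.5) * h) for j in range(n))

def initial_part(t):
    # sum_{i=1}^{n} sum_{k=i}^{n} a_k y^{(k-i)}(0) G^{(i-1)}(t) with n = 2.
    n = len(a) - 1
    return sum(a[k] * y0[k - i] * G(t, i - 1)
               for i in range(1, n + 1) for k in range(i, n + 1))

def y_theorem(t):
    return initial_part(t) + convolution(t)

def y_example(t):
    # Closed form obtained in the worked example by inverting Y(s) term by term.
    return math.exp(-2 * t) * (1 + 3 * t + t**3 / 6)

print(y_theorem(1.5), y_example(1.5))
```

For this example the convolution is exactly $t^3e^{-2t}/3!$, and the initial-data part is $5G(t) + G'(t) = (1 + 3t)e^{-2t}$.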
Homeworks.
1. B-D, pp. 344: 1, 10, 14, 15, 16.
2. Prove $\mathcal{L}(\delta^{(i)}) = s^i$.
3. Find the Green's function for the differential operator $P(D) = (D^2 + \omega^2)^m$.
4. Find the Green's function for the differential operator $P(D) = (D^2 - k^2)^m$.
5. Suppose $G = \mathcal{L}^{-1}(1/P(s))$ is the Green's function. Show that
\[ \mathcal{L}^{-1}\left(\frac{s^i}{P(s)}\right) = D_t^i G(t). \]
6. B-D, pp. 352: 13, 18, 19, 21, 22, 23.
Chapter 6
Calculus of Variations
6.1 A short story about Calculus of Variations
The development of the calculus of variations has a long history. It may go back to the brachistochrone problem proposed by Johann Bernoulli (1696). The problem is to find a path (or a curve) connecting two points A and B, with B lower than A, such that a ball takes minimal time to roll from A to B under gravity. Johann Bernoulli used Fermat's principle (light travels along the path of least time) to prove that the curve solving the brachistochrone problem is the cycloid.
Euler (1707-1783) and Lagrange (1736-1813) are two central figures in the development of the theory of calculus of variations. I quote two paragraphs below from Wiki to give some of the story of Euler and Lagrange.
“Lagrange was an Italian-French Mathematician and Astronomer. By the age of 18 he was teaching geometry at the Royal Artillery School of Turin, where he organized a discussion group that became the Turin Academy of Sciences. In 1755, Lagrange sent Euler a letter in which he discussed the Calculus of Variations. Euler was deeply impressed by Lagrange’s work, and he held back his own work on the subject to let Lagrange publish first.”
“Although Euler and Lagrange never met, when Euler left Berlin for St. Petersburg in 1766, he recommended that Lagrange succeed him as the director of the Berlin Academy. Over the course of a long and celebrated career (he would be lionized by Marie Antoinette, and made a count by Napoleon before his death), Lagrange published a systemization of mechanics using his calculus of variations, and did significant work on the three-body problem and astronomical perturbations.”
6.2 Problems from Geometry
Geodesic curves Find the shortest path connecting two points A and B on the plane. Let $y(x)$ be a curve with $(a, y(a)) = A$ and $(b, y(b)) = B$. The geodesic curve problem is to minimize
\[ \int_a^b \sqrt{1 + y'(x)^2}\,dx \]
among all paths $y(\cdot)$ connecting A to B.
Isoperimetric problem This was an ancient Greek problem. It is to find a closed curve with a given length enclosing the greatest area. Suppose the curve is described by $(x(t), y(t))$, $0 \le t \le T$. We may assume the total length is $L$. The isoperimetric problem is to
\[ \max\ \frac{1}{2}\int_0^T \left(x(t)\dot y(t) - y(t)\dot x(t)\right)dt, \quad\text{subject to}\quad \int_0^T \sqrt{\dot x(t)^2 + \dot y(t)^2}\,dt = L. \]
Its solution is the circle with radius $R = L/(2\pi)$. Since the circle has the maximal enclosed area among all closed curves with arc length $L$, we get the so-called isoperimetric inequality
\[ 4\pi A \le L^2. \]
Equality holds for circles. A geometric proof was given by Steiner (1838). An analytic proof was given by Weierstrass and by Edler.1 The proof by Hurwitz (1902) using the Fourier method can also be found in John Hunter and Bruno Nachtergaele's book, Applied Analysis. In a later section, we shall give an ODE proof.
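The inequality $4\pi A \le L^2$ can be checked on closed polygons, for which the area integral $\frac12\int (x\dot y - y\dot x)\,dt$ reduces to the shoelace formula. The following is a minimal sketch; the test shapes (a unit square, and polygonal approximations of a circle and of an ellipse) are illustrative choices.

```python
import math

def area_and_length(points):
    # Shoelace formula (discrete form of (1/2) * integral of x dy - y dx) and perimeter
    # of a closed polygon given as a list of (x, y) vertices in counterclockwise order.
    area, length = 0.0, 0.0
    n = len(points)
    for i in range(n):
        x0, y0 = points[i]
        x1, y1 = points[(i + 1) % n]
        area += 0.5 * (x0 * y1 - y0 * x1)
        length += math.hypot(x1 - x0, y1 - y0)
    return area, length

def isoperimetric_deficit(points):
    # L^2 - 4 pi A, which the isoperimetric inequality says is nonnegative.
    area, length = area_and_length(points)
    return length**2 - 4 * math.pi * area

def ellipse(a, b, n=2000):
    # Vertices of an n-gon inscribed in the ellipse with semi-axes a and b.
    return [(a * math.cos(2 * math.pi * k / n), b * math.sin(2 * math.pi * k / n))
            for k in range(n)]

square = [(0, 0), (1, 0), (1, 1), (0, 1)]
for name, shape in (("square", square), ("circle", ellipse(1, 1)), ("ellipse", ellipse(2, 1))):
    print(name, isoperimetric_deficit(shape))
```

The deficit is nearly zero for the polygon approximating the circle and clearly positive for the square and the ellipse.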
6.3 Euler-Lagrange Equation
Let us consider the following variational problem:
\[ \min J[y] := \int_a^b F(x, y(x), y'(x))\,dx, \]
subject to the boundary conditions
\[ y(a) = y_a, \quad y(b) = y_b. \]
The function $F : \mathbb{R}\times\mathbb{R}\times\mathbb{R} \to \mathbb{R}$ is a smooth function. We call the set
\[ \mathcal{A} = \{ y \in C^1[a,b] \mid y(a) = y_a,\ y(b) = y_b \} \]
an admissible class. Here, $C^1[a,b]$ denotes the set of functions from $[a,b]$ to $\mathbb{R}$ which are continuously differentiable. An element $y \in \mathcal{A}$ is a path connecting $(a, y_a)$ to $(b, y_b)$. The mapping $J : \mathcal{A} \to \mathbb{R}$ is called a functional. It measures the cost of a path. Given a path $y \in \mathcal{A}$, we consider a variation of this path in the direction of $v$ by
\[ y(x, \epsilon) := y(x) + \epsilon v(x). \]
1 You can read a review article by Alan Siegel, A historical review of the isoperimetric theorem in 2-D, and its place in elementary plane geometry. For applications, you may find a book chapter from Fan in .
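Here, $v$ is a $C^1$ function with $v(a) = v(b) = 0$ in order to have $y(\cdot,\epsilon) \in \mathcal{A}$ for small $\epsilon$. Such $v$ is called a variation. Sometimes, it is denoted by $\delta y$. We can plug $y(\cdot,\epsilon)$ into $J$. Suppose $y$ is a local minimum of $J$ in $\mathcal{A}$. Then for any such variation $v$, $J[y + \epsilon v]$ takes its minimum at $\epsilon = 0$. This leads to the necessary condition
\[ \frac{d}{d\epsilon}\Big|_{\epsilon=0} J[y + \epsilon v] = 0. \]
Let us compute this derivative:
\[ \frac{d}{d\epsilon}\Big|_{\epsilon=0} J[y + \epsilon v] = \int_a^b \left( F_y(x,y(x),y'(x))v(x) + F_{y'}(x,y(x),y'(x))v'(x) \right) dx. \]
It is understood that $F_{y'}$ here means the partial derivative w.r.t. the third variable $y'$. For instance, suppose $F(y,y') = \frac{y^2}{2} + \frac{y'^2}{2}$, then $F_{y'} = y'$.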
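Theorem 6.1 (Necessary Condition). A necessary condition for $y \in \mathcal{A}$ to be a local minimum of $J$ is
\[ \int_a^b \left( F_y(x,y(x),y'(x))v(x) + F_{y'}(x,y(x),y'(x))v'(x) \right) dx = 0 \tag{6.1} \]
for all $v \in C^1[a,b]$ with $v(a) = v(b) = 0$.
If the solution $y \in C^2[a,b]$, then we can integrate the second term by parts to get
\[ \int_a^b F_{y'}(x,y(x),y'(x))v'(x)\,dx = -\int_a^b \frac{d}{dx}F_{y'}(x,y(x),y'(x))\,v(x)\,dx. \]
Here, I have used $v(a) = v(b) = 0$. Thus, the necessary condition can be rewritten as
\[ \int_a^b \left( F_y(x,y(x),y'(x)) - \frac{d}{dx}F_{y'}(x,y(x),y'(x)) \right) v(x)\,dx = 0. \]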
Theorem 6.2. If $f \in C[a,b]$ satisfies
\[ \int_a^b f(x)v(x)\,dx = 0 \]
for all $v \in C^\infty[a,b]$ with $v(a) = v(b) = 0$, then $f \equiv 0$.
Proof. If $f(x_0) \ne 0$ for some $x_0 \in (a,b)$ (say $f(x_0) = C > 0$), then there is a small neighborhood $(x_0 - \epsilon, x_0 + \epsilon)$ on which $f(x) > C/2$. We can choose $v$ to be a smooth hump such that $v(x) = 1$ for $|x - x_0| \le \epsilon/2$, $v(x) \ge 0$, and $v(x) = 0$ for $|x - x_0| \ge \epsilon$. This test function still satisfies the boundary constraint if $\epsilon$ is small enough. Using this $v$, we get
\[ \int_a^b f(x)v(x)\,dx \ge \frac{C\epsilon}{2} > 0. \]
This contradicts our assumption. We conclude $f(x_0) = 0$ for all $x_0 \in (a,b)$. Since $f$ is continuous on $[a,b]$, we also have $f(a) = f(b) = 0$.
Thus, we obtain the following stronger necessary condition.
Theorem 6.3. A necessary condition for a local minimum $y$ of $J$ in $\mathcal{A} \cap C^2$ is
\[ \frac{\delta J}{\delta y} := F_y(x,y(x),y'(x)) - \frac{d}{dx}F_{y'}(x,y(x),y'(x)) = 0. \tag{6.2} \]
Equation (6.2) is called the Euler-Lagrange equation for the minimization problem $\min J[y]$.
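The necessary condition can be observed numerically. The sketch below is an illustration under stated assumptions: it takes $F(y, y') = \frac{y^2}{2} + \frac{y'^2}{2}$ on $[0, 1]$ with the hypothetical boundary data $y(0) = 0$, $y(1) = 1$, for which the Euler-Lagrange equation reads $y - y'' = 0$ with solution $y = \sinh x/\sinh 1$. It discretizes $J$ on a piecewise linear path and estimates $\frac{d}{d\epsilon}J[y + \epsilon v]$ at $\epsilon = 0$ for one variation $v$.

```python
import math

N = 400                      # number of subintervals on [a, b] = [0, 1]
xs = [i / N for i in range(N + 1)]

def J(y):
    # Discrete version of J[y] = int_0^1 (y^2/2 + y'^2/2) dx on a piecewise linear path:
    # midpoint value for y, difference quotient for y'.
    h = 1.0 / N
    total = 0.0
    for i in range(N):
        mid = 0.5 * (y[i] + y[i + 1])
        slope = (y[i + 1] - y[i]) / h
        total += h * 0.5 * (mid**2 + slope**2)
    return total

# Solution of the Euler-Lagrange equation y - y'' = 0 with y(0) = 0, y(1) = 1.
y_el = [math.sinh(x) / math.sinh(1.0) for x in xs]
# A variation vanishing at both end points.
v = [math.sin(math.pi * x) for x in xs]

def J_perturbed(eps):
    return J([yi + eps * vi for yi, vi in zip(y_el, v)])

# Central difference approximation of d/d eps J[y + eps v] at eps = 0.
first_variation = (J_perturbed(1e-3) - J_perturbed(-1e-3)) / 2e-3
print(first_variation, J_perturbed(0.0), J_perturbed(0.1), J_perturbed(-0.1))
```

The first variation is zero up to discretization error, and perturbing in either direction increases $J$, as expected for a minimizer of this convex functional.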
Example For the problem of minimizing arc length, the functional is
\[ J[y] = \int_a^b \sqrt{1 + y'^2}\,dx, \]
where $y(a) = y_a$, $y(b) = y_b$. Since $F_y = 0$, the corresponding Euler-Lagrange equation is
\[ \frac{d}{dx}F_{y'} = \frac{d}{dx}\left(\frac{y'}{\sqrt{1 + y'^2}}\right) = 0. \]
This yields
\[ \frac{y'}{\sqrt{1 + y'^2}} = \text{Const}. \]
Solving for $y'$, we further get
\[ y' = C \quad\text{(a constant)}. \]
Hence $y = Cx + D$. Applying the boundary conditions, we get
\[ C = \frac{y_b - y_a}{b - a}, \quad D = \frac{by_a - ay_b}{b - a}. \]
Thus, the curves with minimal arc length on the plane are straight lines.
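The constants $C$ and $D$ can be checked directly. The sketch below uses illustrative end points (not from the notes), builds the line $y = Cx + D$, and compares its arc length with that of a path through the same end points perturbed by a variation vanishing at $x = a$ and $x = b$.

```python
import math

def line_coefficients(a, ya, b, yb):
    # C and D of the straight line y = C x + D through (a, ya) and (b, yb).
    C = (yb - ya) / (b - a)
    D = (b * ya - a * yb) / (b - a)
    return C, D

def arc_length(y, a, b, n=2000):
    # Length of the polyline through n + 1 equally spaced samples of y on [a, b].
    xs = [a + (b - a) * i / n for i in range(n + 1)]
    return sum(math.hypot(xs[i + 1] - xs[i], y(xs[i + 1]) - y(xs[i])) for i in range(n))

a, ya, b, yb = 0.0, 1.0, 2.0, 4.0        # illustrative end points
C, D = line_coefficients(a, ya, b, yb)

def line(x):
    return C * x + D

def bumped(x, eps=0.3):
    # Same end points, perturbed by a variation that vanishes at x = a and x = b.
    return line(x) + eps * math.sin(math.pi * (x - a) / (b - a))

print(C, D, arc_length(line, a, b), arc_length(bumped, a, b))
```

The perturbed path is strictly longer than the straight line, whose length equals the Euclidean distance between the end points.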
Homework
1. Compute $\delta J/\delta y$ of the following functionals. We will neglect boundary effects if there are any.
(a) $J[y] = \int_a^b V(x)y(x)\,dx$.
(b) $J[y] = \int_a^b \alpha(x)y'(x)\,dx$.
(c) $J[y] = \int_a^b (\alpha(x)y'(x))^2\,dx$.
(d) $J[y] = \int_a^b \left(-\frac{y(x)^2}{2} + \frac{y(x)^4}{4}\right)dx$.
(e) $J[y] = \frac{1}{p}\int_a^b (y'(x))^p\,dx$, $1 < p < \infty$.
(f) $J[y] = \int_a^b (y''(x))^2\,dx$.
6.4 Problems from Mechanics
Least action principle In classical mechanics, the motion of a particle in $\mathbb{R}^3$ is described by
\[ m\ddot x = -\nabla V(x) = F(x), \]
where $V(x)$ is called a potential and $F$ is called a (conservative) force. This is called Newton's mechanics. Typical examples of potentials are the potential $V(x) = gx$ with uniform force field, the harmonic potential $V(x) = \frac{k}{2}|x|^2$ for a mass-spring system, and the Newtonian potential $V(x) = -\frac{G}{|x|}$ for a solar-planet system. Here, $k$ is the spring constant and $G$ is the gravitational constant.
Newton's mechanics was reformulated by Lagrange (1788) in variational form; the reformulation was originally motivated by describing particle motion under constraints. Let us explain this variational formulation without constraint. First, let us introduce the concept of virtual velocity or variation of position. Given a path $x(t)$, $t_0 \le t \le t_1$, consider a family of paths
\[ x_\epsilon(t) := x(t,\epsilon) := x(t) + \epsilon v(t), \quad t_0 \le t \le t_1,\ -\epsilon_0 < \epsilon < \epsilon_0. \]
Here, $v(t)$ is called a virtual velocity and $x_\epsilon(\cdot)$ is called a small variation of the path $x(\cdot)$. Sometimes, we denote $v(\cdot)$, the variation of $x(\cdot)$, by $\delta x$. That is, $\delta x := \partial_\epsilon|_{\epsilon=0}x_\epsilon$.
Now, Newton's law of motion can be viewed as
\[ \delta W = (F - m\ddot x)\cdot v = 0 \quad\text{for any virtual velocity } v. \]
The term $\delta W$ is called the total virtual work in the direction $v$. The term $F\cdot v$ is the virtual work done by the external force $F$, while $m\ddot x\cdot v$ is the work done by the inertia force. The d'Alembert principle of virtual work states that the virtual work is always zero along a physical particle path under a small perturbation $v$.
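If we integrate it in time from $t_0$ to $t_1$ with fixed $v(t_0) = v(t_1) = 0$, then we get
\[ 0 = \int_{t_0}^{t_1} (F - m\ddot x)\cdot v\,dt = \int_{t_0}^{t_1} \left( m\dot x\cdot\dot v - \nabla V(x)\cdot v \right) dt = \delta\int_{t_0}^{t_1} \left( \frac{m}{2}|\dot x|^2 - V(x) \right) dt. \]
The function
\[ L(x,\dot x) := \frac{m}{2}|\dot x|^2 - V(x) \]
is called the Lagrangian, and the integral
\[ S[x] := \int_{t_0}^{t_1} L(x(\tau),\dot x(\tau))\,d\tau \]
is called the action. Thus, $\delta S = 0$ along a physical path. This is called the Hamilton principle or the least action principle. You can show that the corresponding Euler-Lagrange equation is exactly Newton's law of motion.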
Theorem 6.4. The following formulations are equivalent:
• Newton's equation of motion $m\ddot x = -V'(x)$;
• d'Alembert principle of virtual work: $\int_{t_0}^{t_1} (m\dot x\cdot\dot v - V'(x)v)\,dt = 0$ for all virtual velocities $v$;
• Hamilton's least action principle: $\delta\int_{t_0}^{t_1} \left(\frac{m}{2}|\dot x|^2 - V(x)\right) dt = 0$.
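The equivalence can be illustrated on a one-dimensional harmonic oscillator. The sketch below assumes $m = 1$, $V(x) = x^2/2$ and the time interval $[0, 2]$ (all illustrative choices). It approximates the action with the midpoint rule and estimates $\delta S$ in the direction $v(t) = \sin(\pi t/T)$, once for the physical path $x = \cos t$ and once for the path $x = \cos 2t$, which does not solve Newton's equation.

```python
import math

T = 2.0   # time interval [0, T]; m = 1 and V(x) = x^2 / 2 are illustrative choices

def action(x, xdot, n=4000):
    # Midpoint rule for S[x] = int_0^T (xdot^2 / 2 - x^2 / 2) dt.
    h = T / n
    return h * sum(0.5 * xdot((j + 0.5) * h) ** 2 - 0.5 * x((j + 0.5) * h) ** 2
                   for j in range(n))

def v(t):
    return math.sin(math.pi * t / T)           # virtual velocity, v(0) = v(T) = 0

def vdot(t):
    return math.pi / T * math.cos(math.pi * t / T)

def first_variation(x, xdot, eps=1e-3):
    # Central difference of eps -> S[x + eps v] at eps = 0.
    plus = action(lambda t: x(t) + eps * v(t), lambda t: xdot(t) + eps * vdot(t))
    minus = action(lambda t: x(t) - eps * v(t), lambda t: xdot(t) - eps * vdot(t))
    return (plus - minus) / (2 * eps)

# x = cos t solves Newton's equation x'' = -x; x = cos 2t does not.
dS_physical = first_variation(math.cos, lambda t: -math.sin(t))
dS_other = first_variation(lambda t: math.cos(2 * t), lambda t: -2 * math.sin(2 * t))
print(dS_physical, dS_other)
```

Only the physical path makes the action stationary in this direction. A single direction $v$ is tested here, so this is an illustration and not a proof.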
Remarks
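1. The meaning of the notation $\delta$. In the path space, we vary $x(\cdot)$ by $x_\epsilon(\cdot)$. This means that they are a family of paths. We can express them as $x(t,\epsilon)$. A typical example is $x(t,\epsilon) = x(t) + \epsilon v(t)$. The variation of the path $x_\epsilon$ simply means
\[ \delta x(t) = \frac{\partial}{\partial\epsilon}\Big|_{\epsilon=0} x(t,\epsilon). \]
For the case $x_\epsilon = x + \epsilon v$, $\delta x = v$. Sometimes, we use a prime to denote $\frac{\partial}{\partial\epsilon}$, while a dot denotes $\frac{\partial}{\partial t}$. The two differentiations commute. That is,
\[ \delta\dot x = \dot x' = \frac{d}{dt}\delta x. \]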
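2. When we consider a variation of path $x_\epsilon$, the functional $S[x_\epsilon]$ becomes a function of $\epsilon$ as well:
\[ S(\epsilon) := S[x_\epsilon] = \int_{t_0}^{t_1} L(x(t,\epsilon),\dot x(t,\epsilon))\,dt. \]
Differentiating at $\epsilon = 0$ and integrating by parts with $\delta x(t_0) = \delta x(t_1) = 0$ gives
\[ \frac{dS}{d\epsilon}\Big|_{\epsilon=0} = \int_{t_0}^{t_1} \left( L_x(x(t),\dot x(t)) - \frac{d}{dt}L_{\dot x}(x(t),\dot x(t)) \right)\cdot\delta x(t)\,dt. \]
Thus, the notation $\frac{\delta S}{\delta x}$, defined by
\[ \frac{\delta S}{\delta x}(t) = L_x(x(t),\dot x(t)) - \frac{d}{dt}L_{\dot x}(x(t),\dot x(t)), \]
is the variation of $S$ w.r.t. the path $x$. Sometimes, we write
\[ \delta S = \frac{\delta S}{\delta x}\cdot\delta x. \]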
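One advantage of variational formulation – existence of first integral One advantage of this variational formulation is that it is easy to find some invariants (or so-called integrals) of the system. One example is the existence of the first integral.
Theorem 6.5. When the Lagrangian $L(x,\dot x)$ is independent of $t$, the quantity (called the first integral)
\[ I(x,\dot x) := \dot x\cdot\frac{\partial L}{\partial\dot x} - L(x,\dot x) \]
is independent of $t$ along physical trajectories.
Proof. We differentiate $I(x(\cdot),\dot x(\cdot))$ along a physical trajectory $x(\cdot)$:
\[ \frac{d}{dt}I = \ddot x\cdot L_{\dot x} + \dot x\cdot\frac{d}{dt}L_{\dot x} - L_x\cdot\dot x - L_{\dot x}\cdot\ddot x = \dot x\cdot\left(\frac{d}{dt}L_{\dot x} - L_x\right) = 0, \]
where the last equality is the Euler-Lagrange equation.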
Remarks.
1. For Newton's mechanics, where $L(x,\dot x) = \frac12 m|\dot x|^2 - V(x)$, this first integral is the total energy. Indeed, we obtain
\[ I(x,\dot x) = \frac12 m|\dot x|^2 + V(x). \]
2. In Newton's equation
\[ m\ddot x = -\nabla V(x), \]
we take the inner product of both sides with $\dot x$ and obtain
\[ m\ddot x\cdot\dot x + \nabla V(x)\cdot\dot x = 0. \]
This can be written as
\[ \frac{d}{dt}\left(\frac12 m|\dot x|^2 + V(x)\right) = 0. \]
Thus,
\[ \frac12 m|\dot x|^2 + V(x) = E \]
for some constant $E$. This is another equivalent derivation, called the energy method, for Newton's mechanics with a conservative force field.
3. If the particle motion is in one dimension, that is, $x(\cdot) \in \mathbb{R}$, then the first integral
\[ \frac{m}{2}\dot x^2 + V(x) = E \]
determines trajectories on the phase plane. Let us see the following examples.
(a) Harmonic oscillator: $V(x) = \frac{k}{2}x^2$. The conservation of energy gives
\[ \frac{m}{2}\dot x^2 + \frac{k}{2}x^2 = E. \]
Each fixed $E$ determines an ellipse on the phase plane $(x,\dot x)$. An initial state $(x(0),\dot x(0))$ determines a unique $E_0 = \frac{m}{2}\dot x(0)^2 + \frac{k}{2}x(0)^2$. This $E_0$ determines a trajectory from $\frac{m}{2}\dot x^2 + \frac{k}{2}x^2 = E_0$, which is exactly the trajectory with the initial state $(x(0),\dot x(0))$.
(b) Simple pendulum: A simple pendulum has a mass $m$ hanging on a massless rod with length $\ell$. The rod is fixed at one end and the mass $m$ swings at the other end under the gravitational force, which is $mg$. Let $\theta$ be the angle between the rod and the downward vertical direction $(0,-1)$. The mass travels on the circle centered at the fixed end of the rod. Thus, we have
• mass position: $\ell(\sin\theta, -\cos\theta)$,
• tangential direction of the motion: $(\cos\theta, \sin\theta)$,
• tangential velocity: $v = \ell\dot\theta$,
• tangential acceleration: $a = \ell\ddot\theta$,
• the gravitational force: $F = mg(0,-1)$,
• the force in the tangential direction: $-mg\sin\theta$.
Newton's law of motion gives
\[ m\ell\ddot\theta = -mg\sin\theta. \]
We eliminate $m$ and get
\[ \ddot\theta = -\frac{g}{\ell}\sin\theta. \]
The conservation of energy reads
\[ \frac12\dot\theta^2 - \frac{g}{\ell}\cos\theta = E. \]
Each $E$ determines a trajectory on the phase plane $(\theta,\dot\theta)$. Here are some special trajectories.
• The stable equilibria: $\theta = 2n\pi$, $\dot\theta = 0$. The corresponding energy is $E_0 = -\frac{g}{\ell}$.
• The unstable equilibria: $\theta = (2n+1)\pi$, $\dot\theta = 0$. The corresponding energy is $E_1 = \frac{g}{\ell}$.
• The heteroclinic orbit: it connects two neighboring unstable equilibria. It satisfies $\frac12\dot\theta^2 - \frac{g}{\ell}\cos\theta = E_1$, but it is not an equilibrium state.
• For $E_0 < E < E_1$, the corresponding orbit is a closed curve. For $E > E_1$, the corresponding orbit is unbounded.
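The classification of pendulum orbits by energy, and the conservation of $E$ itself, can be explored numerically. The sketch below assumes the illustrative value $g/\ell = 9.8$ and uses a classical fourth-order Runge-Kutta step, which is a generic integrator chosen here for convenience and not a method from the notes.

```python
import math

g_over_l = 9.8   # illustrative value of g / l

def energy(theta, omega):
    # First integral E = omega^2 / 2 - (g/l) cos(theta), with omega = d theta / dt.
    return 0.5 * omega**2 - g_over_l * math.cos(theta)

E0, E1 = -g_over_l, g_over_l   # energies of the stable and unstable equilibria

def classify(theta, omega):
    E = energy(theta, omega)
    if E < E1:
        return "closed orbit (oscillation)" if E > E0 else "stable equilibrium"
    return "unbounded orbit (rotation)" if E > E1 else "heteroclinic orbit or unstable equilibrium"

def rk4_step(theta, omega, dt):
    # One classical Runge-Kutta step for theta' = omega, omega' = -(g/l) sin(theta).
    def rhs(th, om):
        return om, -g_over_l * math.sin(th)
    k1 = rhs(theta, omega)
    k2 = rhs(theta + 0.5 * dt * k1[0], omega + 0.5 * dt * k1[1])
    k3 = rhs(theta + 0.5 * dt * k2[0], omega + 0.5 * dt * k2[1])
    k4 = rhs(theta + dt * k3[0], omega + dt * k3[1])
    theta += dt * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]) / 6
    omega += dt * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]) / 6
    return theta, omega

def energy_drift(theta, omega, steps=5000, dt=1e-3):
    # Change of E along a numerically integrated trajectory; should stay tiny.
    E_start = energy(theta, omega)
    for _ in range(steps):
        theta, omega = rk4_step(theta, omega, dt)
    return abs(energy(theta, omega) - E_start)

print(classify(1.0, 0.0), classify(0.0, 7.0), energy_drift(1.0, 0.0))
```

The trajectory started at rest from $\theta = 1$ has $E_0 < E < E_1$ and oscillates, while the one started at the bottom with $\dot\theta = 7$ has $E > E_1$ and keeps rotating.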
6.5 Method of Lagrange Multiplier
Variational problems are usually accompanied by constraints, as we have seen in the isoperimetric problem. Lagrange introduced an auxiliary variable, called the Lagrange multiplier, to solve these kinds of problems. Below, we use the hanging rope problem to explain the method of Lagrange multiplier.
Hanging rope problem A rope given by $y(x)$, $a \le x \le b$, hangs between two end points $(a,y_a)$ and $(b,y_b)$. Suppose the rope has length $\ell$ and density $\rho(x)$. If the rope is in equilibrium, then it minimizes its potential energy, which is
\[ J[y] = \int_0^\ell \rho gy\,ds = \int_a^b \rho gy\sqrt{1 + y'^2}\,dx. \]
The rope is subject to the length constraint
\[ W[y] = \int_a^b \sqrt{1 + y'^2}\,dx = \ell. \]
Method of Lagrange multiplier Dealing with such problems is very much like dealing with constrained optimization problems in finite dimensions. Let us start with a two dimensional example. Suppose we want to minimize $f(x,y)$ with constraint $g(x,y) = 0$. The method of Lagrange multiplier states that a necessary condition for $(x_0,y_0)$ being such a solution is that, if $\nabla g(x_0,y_0) \ne 0$, then $\nabla f(x_0,y_0) \parallel \nabla g(x_0,y_0)$. This means that there exists a constant $\lambda_0$ such that $\nabla f(x_0,y_0) + \lambda_0\nabla g(x_0,y_0) = 0$. In other words, $(x_0,y_0,\lambda_0)$ is an extremum of the unconstrained function $F(x,y,\lambda) := f(x,y) + \lambda g(x,y)$. That is, $(x_0,y_0,\lambda_0)$ solves
\[ \frac{\partial F}{\partial x} = 0, \quad \frac{\partial F}{\partial y} = 0, \quad \frac{\partial F}{\partial\lambda} = 0. \]
The first two equations are equivalent to $\nabla f(x_0,y_0) \parallel \nabla g(x_0,y_0)$. The last one is equivalent to the constraint $g(x_0,y_0) = 0$. The advantage is that the new formulation is an unconstrained problem.
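For constrained minimization problems in $n$ dimensions, we have the same result. Let $y = (y_1,\dots,y_n)$, $f : \mathbb{R}^n \to \mathbb{R}$ and $g : \mathbb{R}^n \to \mathbb{R}$. Consider
\[ \min f(y) \quad\text{subject to}\quad g(y) = 0. \]
A necessary condition for $y_0$ being such a solution is that, if $\nabla g(y_0) \ne 0$, then there exists $\lambda_0$ such that $(y_0,\lambda_0)$ is an extremum of the unconstrained function $F(y,\lambda) := f(y) + \lambda g(y)$. That is, $(y_0,\lambda_0)$ solves
\[ \frac{\partial F}{\partial y}(y_0,\lambda_0) = 0, \quad \frac{\partial F}{\partial\lambda}(y_0,\lambda_0) = 0. \]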
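A small finite dimensional instance may help. The sketch below uses the hypothetical problem of minimizing $f(x,y) = x^2 + y^2$ subject to $g(x,y) = x + y - 1 = 0$ (not an example from the notes). Solving $2x + \lambda = 0$, $2y + \lambda = 0$, $x + y = 1$ by hand gives $(x_0, y_0, \lambda_0) = (1/2, 1/2, -1)$; the code checks that all three partial derivatives of $F$ vanish there and that $f$ restricted to the constraint is smallest at that point among a few samples.

```python
def f(x, y):
    return x**2 + y**2

def g(x, y):
    return x + y - 1.0

def F(x, y, lam):
    # Unconstrained function F = f + lambda * g.
    return f(x, y) + lam * g(x, y)

def grad_F(x, y, lam, h=1e-6):
    # Central differences in each of the three variables (x, y, lambda).
    return (
        (F(x + h, y, lam) - F(x - h, y, lam)) / (2 * h),
        (F(x, y + h, lam) - F(x, y - h, lam)) / (2 * h),
        (F(x, y, lam + h) - F(x, y, lam - h)) / (2 * h),
    )

x0, y0, lam0 = 0.5, 0.5, -1.0
# Values of f along the constraint line y = 1 - x, sampled around x = 0.5.
constrained_values = [f(t, 1.0 - t) for t in (0.3, 0.4, 0.5, 0.6, 0.7)]
print(grad_F(x0, y0, lam0), constrained_values)
```

Note that $(x_0, y_0, \lambda_0)$ is a critical point of $F$ and not a minimum of $F$ in all three variables; only the restriction of $f$ to the constraint is minimized.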
For variational problems, we have much the same. Let us consider a variational problem in an abstract form:
\[ \min J[y] \quad\text{subject to}\quad W[y] = 0 \]
in some admissible class $\mathcal{A} = \{y : [a,b] \to \mathbb{R} \mid y(a) = y_a,\ y(b) = y_b\}$ in some function space. We approximate this variational problem by a finite dimensional problem. For any large $n$, we partition $[a,b]$ into $n$ equal subintervals:
\[ x_i = a + i\,\frac{b-a}{n}, \quad i = 0,\dots,n. \]
We approximate $y(\cdot) \in \mathcal{A}$ by a piecewise linear continuous function $\tilde y$ with
\[ \tilde y(x_i) = y(x_i), \quad i = 0,\dots,n. \]
The function $\tilde y \in \mathcal{A}$ is in one-to-one correspondence with $\mathbf{y} := (y_1,\dots,y_{n-1}) \in \mathbb{R}^{n-1}$. We approximate $J[y]$ by $J(\mathbf{y}) := J[\tilde y]$, and $W[y]$ by $W(\mathbf{y}) := W[\tilde y]$. Then the original constrained variational problem is approximated by a constrained optimization problem in finite dimensions. Suppose $\mathbf{y}_0$ is such a solution. According to the method of Lagrange multiplier, if $\nabla W(\mathbf{y}_0) \ne 0$, then there exists a $\lambda_0$ such that $(\mathbf{y}_0,\lambda_0)$ is an extremum of $J(\mathbf{y}) + \lambda W(\mathbf{y})$.
Notice that the infinite dimensional gradient $\delta W/\delta y$ can be approximated by the finite dimensional gradient $\nabla W(\mathbf{y})$. That is,
\[ \frac{\delta W}{\delta y}[y] \approx \frac{\delta W}{\delta y}[\tilde y] = \frac{\partial W}{\partial\mathbf{y}} = \nabla W(\mathbf{y}). \]
We summarize the above intuitive argument as the following theorem.
Theorem 6.6. If $y_0$ is an extremum of $J[\cdot]$ subject to the constraint $W[y] = 0$, and if $\delta W/\delta y \ne 0$, then there exists a constant $\lambda_0$ such that $(y_0,\lambda_0)$ is an extremum of the functional $J[y] + \lambda W[y]$ with respect to $(y,\lambda)$.
*Remark. A more rigorous proof is as follows.
1. We consider two-parameter variations
\[ z(x) = y(x) + \epsilon_1h_1(x) + \epsilon_2h_2(x). \]
The variations $h_i$ should satisfy the boundary conditions $h_i(a) = h_i(b) = 0$ in order to have $z$ satisfy the boundary conditions $z(a) = y_a$ and $z(b) = y_b$. For arbitrarily chosen such variations $h_i$, we should also require $\epsilon_i$ to satisfy
\[ W(\epsilon_1,\epsilon_2) = W[y + \epsilon_1h_1 + \epsilon_2h_2] = 0. \]
On the variational subspace spanned by $h_i$, $i = 1,2$, the functional $J$ becomes
\[ J(\epsilon_1,\epsilon_2) := J[y + \epsilon_1h_1 + \epsilon_2h_2]. \]
Thus the original problem is reduced to
\[ \min J(\epsilon_1,\epsilon_2) \quad\text{subject to}\quad W(\epsilon_1,\epsilon_2) = 0 \]
on this variational subspace. By the method of Lagrange multiplier, there exists a $\lambda$ such that an extremum of the original problem solves the unconstrained optimization problem $\min J + \lambda W$. This leads to three equations
\[ 0 = \frac{\partial}{\partial\epsilon_1}(J + \lambda W) = \left(\frac{\delta J}{\delta y} + \lambda\frac{\delta W}{\delta y}\right)\cdot h_1, \]
\[ 0 = \frac{\partial}{\partial\epsilon_2}(J + \lambda W) = \left(\frac{\delta J}{\delta y} + \lambda\frac{\delta W}{\delta y}\right)\cdot h_2, \]
\[ 0 = \frac{\partial}{\partial\lambda}(J + \lambda W) = W[y]. \]
2. Notice that the Lagrange multiplier $\lambda$ so chosen depends on $h_1$ and $h_2$. We want to show that it is indeed a constant. This is proved below.
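3. Since $\delta W/\delta y(x) \ne 0$, we choose $x_1$ where $\delta W/\delta y(x_1) \ne 0$. For any $x_2 \in (a,b)$, we consider $h_i = \delta(x - x_i)$, $i = 1,2$. Here, $\delta$ is the Dirac delta function. It has the property: for any continuous function $f$,
\[ \int f(x)\delta(x - x_0)\,dx = f(x_0). \]
By choosing such $h_i$, we obtain that there exists a $\lambda_{12}$ such that
\[ \frac{\delta J}{\delta y}(x_i) + \lambda_{12}\frac{\delta W}{\delta y}(x_i) = 0, \quad i = 1,2. \]
In other words, the constant
\[ \lambda_{12} = -\frac{\frac{\delta J}{\delta y}(x_1)}{\frac{\delta W}{\delta y}(x_1)}. \]
For any arbitrarily chosen $x_2$, we get the same constant. Thus, $\lambda_{12}$ is independent of $x_2$. In fact, the above formula shows
\[ \frac{\delta J}{\delta y}(x_2) + \lambda_{12}\frac{\delta W}{\delta y}(x_2) = 0 \quad\text{for all } x_2 \in (a,b). \]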
6.6.1 The hanging rope problem
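Let us go back to the hanging rope problem. By the method of Lagrange multiplier, we consider the extremum problem of the new Lagrangian
\[ L(y,y',\lambda) = \rho gy\sqrt{1 + y'^2} + \lambda\sqrt{1 + y'^2}. \]
For constant density $\rho$, the Lagrangian is independent of $x$, thus it admits the first integral $L - y'L_{y'} = C$, or
\[ \frac{\rho gy + \lambda}{\sqrt{1 + y'^2}} = C. \]
Solving for $y'$ gives
\[ y' = \pm\frac{1}{C}\sqrt{(\rho gy + \lambda)^2 - C^2}. \]
Using the method of separation of variables, we get
\[ \frac{dy}{\sqrt{(\rho gy + \lambda)^2 - C^2}} = \pm\frac{dx}{C}, \]
which integrates to
\[ \rho gy + \lambda = C\cosh\left(\frac{\rho g(x + C_2)}{C}\right). \]
The constants $C$, $C_2$ and the Lagrange multiplier $\lambda$ are then determined by the two boundary conditions and the constraint. The shape of this hanging rope is called a catenary.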
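The first integral can be checked on a candidate catenary. The sketch below assumes the curve has the form $\rho gy + \lambda = C\cosh(\rho g(x + C_2)/C)$, where the way the integration constant $C_2$ enters is an assumption, and uses arbitrary constants that are not fitted to any boundary data or length constraint. It verifies numerically that $(\rho gy + \lambda)/\sqrt{1 + y'^2}$ equals $C$ along the curve.

```python
import math

rho_g, lam, C, C2 = 2.0, -1.5, 0.8, 0.3   # illustrative constants, not fitted to boundary data

def y(x):
    # Candidate catenary: rho*g*y + lam = C cosh(rho*g*(x + C2)/C).
    return (C * math.cosh(rho_g * (x + C2) / C) - lam) / rho_g

def yprime(x, h=1e-5):
    # Central difference approximation of y'.
    return (y(x + h) - y(x - h)) / (2 * h)

def first_integral(x):
    # L - y' L_{y'} = (rho*g*y + lam) / sqrt(1 + y'^2), expected to equal C.
    return (rho_g * y(x) + lam) / math.sqrt(1 + yprime(x) ** 2)

for x in (-1.0, 0.0, 0.5, 1.0):
    print(x, first_integral(x))
```

In an actual hanging rope problem the three constants would be solved from $y(a) = y_a$, $y(b) = y_b$ and the length constraint, which this sketch does not attempt.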
6.6.2 Isoperimetric inequality
We recall that the isoperimetric problem is to find a closed curve with a given length enclosing the greatest area. Suppose the curve is described by $(x(t),y(t))$, where $t$ is a parameter on the curve, $0 \le t \le T$. The isoperimetric problem is to maximize the area
\[ A[x,y] := \frac12\int_0^T \left(x(t)\dot y(t) - y(t)\dot x(t)\right)dt \]
subject to the length constraint
\[ L[x,y] := \int_0^T \sqrt{\dot x(t)^2 + \dot y(t)^2}\,dt = L. \]
This is a constrained maximization problem. By the method of Lagrange multiplier, there exists a constant $\lambda$ such that the solution satisfies
δ(A − λL) = 1
We claim that this means that the curve (x(·), y(·)) has constant curvature, and such curves must be circles.
To see this, let us review some plane curve theory. We may parametrize the curve $(x(t),y(t))$ by the arc length
\[ s = \int_0^t \sqrt{\dot x(\tau)^2 + \dot y(\tau)^2}\,d\tau. \]
Since we assume the total arc length is $L$, we have $0 \le s \le L$. We have $ds = \sqrt{\dot x(t)^2 + \dot y(t)^2}\,dt$.
Let us denote the differentiation in s by prime. The tangent and normal of the curve are t := (x0, y0) = x˙
This equation, as expressed in terms of the parameter t, reads 1
Comparing this equation with the Euler-Lagrange equation corresponding to the isoperimetric problem, we conclude that $K = 1/\lambda$ is a constant. The quantity $\lambda = 1/K$ is called the radius of curvature.
Since the total length of this curve is $L$, we get
\[ L = \frac{2\pi}{K}. \]
The area enclosed by the circle is $A^* = \frac{\pi}{K^2}$, which is the maximal area among all closed curves with length $L$.