Ito’s Lemma

(1)

Ito’s Lemma

^a

A smooth function of an Ito process is itself an Ito process.

Theorem 18 Suppose f : R → R is twice continuously differentiable and dX = a_t dt + b_t dW. Then f(X) is the Ito process

\[
f(X_t) = f(X_0) + \int_0^t f'(X_s)\, a_s\, ds + \int_0^t f'(X_s)\, b_s\, dW + \frac{1}{2} \int_0^t f''(X_s)\, b_s^2\, ds
\]

for t ≥ 0.

aIto (1944).

(2)

Ito’s Lemma (continued)

• In differential form, Ito’s lemma becomes

\[
df(X) = f'(X)\, a\, dt + f'(X)\, b\, dW + \frac{1}{2} f''(X)\, b^2\, dt. \tag{78}
\]

• Compared with calculus, the interesting part is the third term on the right-hand side.

• A convenient formulation of Ito’s lemma is

\[
df(X) = f'(X)\, dX + \frac{1}{2} f''(X)\, (dX)^2. \tag{79}
\]

(3)

Ito’s Lemma (continued)

• We are supposed to multiply out (dX)² = (a dt + b dW)² symbolically according to

  ×  | dW | dt
  dW | dt | 0
  dt | 0  | 0

– The (dW)² = dt entry is justified by a known result.

• Hence (dX)² = (a dt + b dW )² = b² dt in Eq. (79).

• This form is easy to remember because of its similarity to the Taylor expansion.
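The multiplication table can be checked numerically. The following is a minimal Python sketch, assuming a uniform partition of [0, t] and pseudo-random normal increments; the function name `ito_table_sums` and its parameters are illustrative and not part of the original slides. It sums the three kinds of products over a fine partition: the (dW)² terms add up to about t, while the dW dt and (dt)² terms vanish as the partition is refined.

```python
import math
import random


def ito_table_sums(t=1.0, n=100_000, seed=0):
    """Return (sum of dW^2, sum of dW*dt, sum of dt^2) over n steps on [0, t]."""
    rng = random.Random(seed)
    dt = t / n
    sd = math.sqrt(dt)          # each Wiener increment is N(0, dt)
    sum_ww = 0.0
    sum_wt = 0.0
    for _ in range(n):
        dw = rng.gauss(0.0, sd)
        sum_ww += dw * dw       # accumulates toward t
        sum_wt += dw * dt       # accumulates toward 0
    sum_tt = n * dt * dt        # equals t^2 / n, which tends to 0
    return sum_ww, sum_wt, sum_tt


if __name__ == "__main__":
    print(ito_table_sums())
```

Refining the partition (larger n) tightens the first sum around t, which is the sense in which (dW)² = dt is used in Eq. (79).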

(4)

Ito’s Lemma (continued)

Theorem 19 (Higher-Dimensional Ito’s Lemma) Let W₁, W₂, . . . , W_n be independent Wiener processes and X ≜ (X₁, X₂, . . . , X_m) be a vector process. Suppose f : R^m → R is twice continuously differentiable and X_i is an Ito process with

\[
dX_i = a_i\, dt + \sum_{j=1}^{n} b_{ij}\, dW_j.
\]

Then f(X) is an Ito process with the differential

\[
df(X) = \sum_{i=1}^{m} f_i(X)\, dX_i + \frac{1}{2} \sum_{i=1}^{m} \sum_{k=1}^{m} f_{ik}(X)\, dX_i\, dX_k,
\]

where f_i ≜ ∂f/∂X_i and f_ik ≜ ∂²f/∂X_i∂X_k.

(5)

Ito’s Lemma (continued)

• The multiplication table for Theorem 19 is

  ×    | dW_i    | dt
  dW_k | δ_ik dt | 0
  dt   | 0       | 0

in which δ_ik = 1 if i = k and δ_ik = 0 otherwise.

(6)

Ito’s Lemma (continued)

• In applying the higher-dimensional Ito’s lemma, usually one of the variables, say X₁, is time t and dX₁ = dt.

• In this case, b_1j = 0 for all j and a₁ = 1.

• As an example, let

dX_t = a_t dt + b_t dW_t.

• Consider the process f(X_t, t).

(7)

Ito’s Lemma (continued)

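• Then

\[
\begin{aligned}
df &= \frac{\partial f}{\partial X_t}\, dX_t + \frac{\partial f}{\partial t}\, dt + \frac{1}{2} \frac{\partial^2 f}{\partial X_t^2}\, (dX_t)^2 \\
&= \frac{\partial f}{\partial X_t}\, (a_t\, dt + b_t\, dW_t) + \frac{\partial f}{\partial t}\, dt + \frac{1}{2} \frac{\partial^2 f}{\partial X_t^2}\, (a_t\, dt + b_t\, dW_t)^2 \\
&= \left( \frac{\partial f}{\partial X_t}\, a_t + \frac{\partial f}{\partial t} + \frac{1}{2} \frac{\partial^2 f}{\partial X_t^2}\, b_t^2 \right) dt + \frac{\partial f}{\partial X_t}\, b_t\, dW_t.
\end{aligned} \tag{80}
\]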

(8)

Ito’s Lemma (continued)

Theorem 20 (Alternative Ito’s Lemma) Let W₁, W₂, . . . , W_m be Wiener processes and X ≜ (X₁, X₂, . . . , X_m) be a vector process. Suppose f : R^m → R is twice continuously differentiable and X_i is an Ito process with dX_i = a_i dt + b_i dW_i. Then f(X) is an Ito process with the differential

\[
df(X) = \sum_{i=1}^{m} f_i(X)\, dX_i + \frac{1}{2} \sum_{i=1}^{m} \sum_{k=1}^{m} f_{ik}(X)\, dX_i\, dX_k.
\]

(9)

Ito’s Lemma (concluded)

• The multiplication table for Theorem 20 is

  ×    | dW_i    | dt
  dW_k | ρ_ik dt | 0
  dt   | 0       | 0

• Above, ρ_ik denotes the correlation between dW_i and dW_k.

(10)

Geometric Brownian Motion

• Consider geometric Brownian motion Y(t) ≜ e^{X(t)}.

– X(t) is a (μ, σ) Brownian motion.

– By Eq. (73) on p. 565,

dX = μ dt + σ dW.

• Note that

\[
\frac{\partial Y}{\partial X} = Y, \qquad \frac{\partial^2 Y}{\partial X^2} = Y.
\]

(11)

Geometric Brownian Motion (continued)

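• Ito’s formula (78) on p. 597 implies

\[
\begin{aligned}
dY &= Y\, dX + (1/2)\, Y\, (dX)^2 \\
&= Y (\mu\, dt + \sigma\, dW) + (1/2)\, Y (\mu\, dt + \sigma\, dW)^2 \\
&= Y (\mu\, dt + \sigma\, dW) + (1/2)\, Y \sigma^2\, dt.
\end{aligned}
\]

• Hence

\[
\frac{dY}{Y} = \left( \mu + \sigma^2/2 \right) dt + \sigma\, dW. \tag{81}
\]

• The annualized instantaneous rate of return is μ + σ²/2 (not μ).^a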

aConsistent with Lemma 9 (p. 294).

(12)

Geometric Brownian Motion (continued)

• Alternatively, from Eq. (73) on p. 565, X_t = X₀ + μt + σW_t, an explicit (strong) solution.

• Hence

\[
Y_t = Y_0\, e^{\mu t + \sigma W_t},
\]

a strong solution to the SDE (81), where Y₀ = e^{X₀}.
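The drift μ + σ²/2 in Eq. (81) can be sanity-checked by simulation. The sketch below is a rough Monte Carlo check under the assumption of constant μ and σ; the function name `gbm_mean_mc` and the sample size are illustrative choices, not from the slides. It samples Y_t = Y₀ e^{μt + σW_t} directly and averages; if Eq. (81) holds, the average should be near Y₀ e^{(μ + σ²/2)t} rather than Y₀ e^{μt}.

```python
import math
import random


def gbm_mean_mc(mu, sigma, t, y0=1.0, n_paths=200_000, seed=0):
    """Monte Carlo estimate of E[Y_t] for Y_t = y0 * exp(mu*t + sigma*W_t)."""
    rng = random.Random(seed)
    sd = math.sqrt(t)                      # W_t ~ N(0, t)
    total = 0.0
    for _ in range(n_paths):
        w = rng.gauss(0.0, sd)
        total += y0 * math.exp(mu * t + sigma * w)
    return total / n_paths


if __name__ == "__main__":
    mu, sigma, t = 0.05, 0.2, 1.0
    print("simulated mean      :", gbm_mean_mc(mu, sigma, t))
    print("exp((mu+sigma^2/2)t):", math.exp((mu + 0.5 * sigma**2) * t))
    print("exp(mu*t)           :", math.exp(mu * t))
```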

(13)

Geometric Brownian Motion (concluded)

• On the other hand, suppose

\[
\frac{dY}{Y} = \mu\, dt + \sigma\, dW.
\]

• Then X(t) ≜ ln Y(t) follows

\[
dX = \left( \mu - \sigma^2/2 \right) dt + \sigma\, dW.
\]

(14)

Exponential Martingales

• The Ito process

dX_t = b_tX_t dW_t

is a martingale by Theorem 17 (p. 582).

• It is called an exponential martingale.

• By Ito’s formula (78) on p. 597,

\[
X(t) = X(0) \exp\left[ -\frac{1}{2} \int_0^t b_s^2\, ds + \int_0^t b_s\, dW_s \right].
\]
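One way to verify the closed form is to differentiate it with Ito’s lemma. The following is a sketch of that check, assuming b_t is regular enough for both integrals to exist; Z_t is an auxiliary symbol introduced here only for the verification.

```latex
\begin{aligned}
Z_t &\triangleq -\frac{1}{2} \int_0^t b_s^2\, ds + \int_0^t b_s\, dW_s,
\qquad dZ_t = -\frac{1}{2} b_t^2\, dt + b_t\, dW_t,
\qquad (dZ_t)^2 = b_t^2\, dt, \\
X_t &= X_0\, e^{Z_t}
\;\Longrightarrow\;
dX_t = X_t\, dZ_t + \frac{1}{2} X_t\, (dZ_t)^2
     = X_t \left( -\frac{1}{2} b_t^2\, dt + b_t\, dW_t \right) + \frac{1}{2} X_t\, b_t^2\, dt
     = b_t X_t\, dW_t.
\end{aligned}
```

The dt terms cancel exactly, which is why the process has no drift.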

(15)

Product of Geometric Brownian Motion Processes

• Let

\[
\frac{dY}{Y} = a\, dt + b\, dW_Y, \qquad \frac{dZ}{Z} = f\, dt + g\, dW_Z.
\]

• Assume dW_Y and dW_Z have correlation ρ.

• Consider the Ito process U ≜ YZ.

(16)

Product of Geometric Brownian Motion Processes (continued)

• Apply Ito’s lemma (Theorem 20 on p. 603):

\[
\begin{aligned}
dU &= Z\, dY + Y\, dZ + dY\, dZ \\
&= ZY (a\, dt + b\, dW_Y) + YZ (f\, dt + g\, dW_Z) + YZ (a\, dt + b\, dW_Y)(f\, dt + g\, dW_Z) \\
&= U (a + f + bg\rho)\, dt + Ub\, dW_Y + Ug\, dW_Z.
\end{aligned}
\]

• The product of correlated geometric Brownian motion processes thus remains geometric Brownian motion.

(17)

Product of Geometric Brownian Motion Processes (continued)

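• Note that

\[
\begin{aligned}
Y &= \exp\left[ \left( a - b^2/2 \right) dt + b\, dW_Y \right], \\
Z &= \exp\left[ \left( f - g^2/2 \right) dt + g\, dW_Z \right], \\
U &= \exp\left[ \left( a + f - \left( b^2 + g^2 \right)/2 \right) dt + b\, dW_Y + g\, dW_Z \right].
\end{aligned}
\]

– There is no bgρ term in U!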

(18)

Product of Geometric Brownian Motion Processes (concluded)

• ln U is Brownian motion with a mean equal to the sum of the means of ln Y and ln Z.

• This holds even if Y and Z are correlated.

• Finally, ln Y and ln Z have correlation ρ.

(19)

Quotients of Geometric Brownian Motion Processes

• Suppose Y and Z are drawn from p. 610.

• Let U ≜ Y/Z.

• We now show that^a

\[
\frac{dU}{U} = (a - f + g^2 - bg\rho)\, dt + b\, dW_Y - g\, dW_Z. \tag{82}
\]

• Keep in mind that dW_Y and dW_Z have correlation ρ.

aExercise 14.3.6 of the textbook is erroneous.

(20)

Quotients of Geometric Brownian Motion Processes (concluded)

• The multidimensional Ito’s lemma (Theorem 20 on p. 603) can be employed to show that

\[
\begin{aligned}
dU &= (1/Z)\, dY - (Y/Z^2)\, dZ - (1/Z^2)\, dY\, dZ + (Y/Z^3)\, (dZ)^2 \\
&= (1/Z)(aY\, dt + bY\, dW_Y) - (Y/Z^2)(fZ\, dt + gZ\, dW_Z) - (1/Z^2)(bgYZ\rho\, dt) + (Y/Z^3)(g^2 Z^2\, dt) \\
&= U (a\, dt + b\, dW_Y) - U (f\, dt + g\, dW_Z) - U (bg\rho\, dt) + U (g^2\, dt) \\
&= U (a - f + g^2 - bg\rho)\, dt + Ub\, dW_Y - Ug\, dW_Z.
\end{aligned}
\]
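The drifts of the product and the quotient can be cross-checked without simulation. The sketch below assumes constant coefficients a, b, f, g, and ρ, so that ln Y and ln Z are jointly normal; it relies on the standard lognormal identity E[e^N] = exp(mean + variance/2). The function name `expected_growth` is illustrative. The exponent it returns per unit time should equal a + f + bgρ for U = YZ and a − f + g² − bgρ for U = Y/Z, even though neither log-drift contains the bgρ term.

```python
import math


def expected_growth(a, b, f, g, rho, t, quotient=False):
    """E[U_t / U_0] for U = Y*Z (default) or U = Y/Z with constant coefficients."""
    s = -1.0 if quotient else 1.0
    # Mean of ln(U_t / U_0): log-drifts add (product) or subtract (quotient).
    mean = (a - 0.5 * b * b) * t + s * (f - 0.5 * g * g) * t
    # Variance of b*W_Y + s*g*W_Z when the two Wiener processes have correlation rho.
    var = (b * b + g * g + 2.0 * s * rho * b * g) * t
    return math.exp(mean + 0.5 * var)


if __name__ == "__main__":
    a, b, f, g, rho, t = 0.08, 0.3, 0.03, 0.2, 0.5, 2.0
    print(math.log(expected_growth(a, b, f, g, rho, t)) / t)        # product drift
    print(math.log(expected_growth(a, b, f, g, rho, t, True)) / t)  # quotient drift
```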

(21)

Forward Price

• Suppose S follows

\[
\frac{dS}{S} = \mu\, dt + \sigma\, dW.
\]

• Consider F(S, t) ≜ S e^{y(T−t)} for some constants y and T.

• As F is a function of two variables, we need the various partial derivatives of F (S, t) with respect to S and t.

• Note that in partial diﬀerentiation with respect to one variable, other variables are held constant.^a

aContributed by Mr. Sun, Ao (R05922147) on April 26, 2017.

(22)

Forward Prices (continued)

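• Now,

\[
\frac{\partial F}{\partial S} = e^{y(T-t)}, \qquad
\frac{\partial^2 F}{\partial S^2} = 0, \qquad
\frac{\partial F}{\partial t} = -y S e^{y(T-t)}.
\]

• Then

\[
\begin{aligned}
dF &= e^{y(T-t)}\, dS - y S e^{y(T-t)}\, dt \\
&= S e^{y(T-t)} (\mu\, dt + \sigma\, dW) - y S e^{y(T-t)}\, dt \\
&= F (\mu - y)\, dt + F \sigma\, dW.
\end{aligned}
\]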

(23)

Forward Prices (concluded)

• One can also prove it by Eq. (80) on p. 602.

• Thus F follows

\[
\frac{dF}{F} = (\mu - y)\, dt + \sigma\, dW.
\]

• This result has applications in forward and futures contracts.

• In Eq. (57) on p. 479, μ = r = y.

• So

\[
\frac{dF}{F} = \sigma\, dW,
\]

a martingale.^a

aIt is also consistent with p. 555.

(24)

Ornstein-Uhlenbeck (OU) Process

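• The OU process:

dX = −κX dt + σ dW, where κ, σ ≥ 0.

• For t₀ ≤ s ≤ t and X(t₀) = x₀, it is known that

\[
\begin{aligned}
E[\, X(t) \,] &= e^{-\kappa (t - t_0)}\, E[\, x_0 \,], \\
\mathrm{Var}[\, X(t) \,] &= \frac{\sigma^2}{2\kappa} \left( 1 - e^{-2\kappa (t - t_0)} \right) + e^{-2\kappa (t - t_0)}\, \mathrm{Var}[\, x_0 \,], \\
\mathrm{Cov}[\, X(s), X(t) \,] &= \frac{\sigma^2}{2\kappa}\, e^{-\kappa (t - s)} \left[ 1 - e^{-2\kappa (s - t_0)} \right] + e^{-\kappa (t + s - 2 t_0)}\, \mathrm{Var}[\, x_0 \,].
\end{aligned}
\]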

(25)

Ornstein-Uhlenbeck Process (continued)

• X(t) is normally distributed if x₀ is a constant or normally distributed.

– E[ x₀ ] = x₀ and Var[ x₀ ] = 0 if x₀ is a constant.

• X is said to be a normal process.

• The OU process has the following mean reversion property.

– When X > 0, X is pulled toward zero.

– When X < 0, it is pulled toward zero again.

(26)

Ornstein-Uhlenbeck Process (continued)

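• A generalized version:

dX = κ(μ − X) dt + σ dW, where κ, σ ≥ 0.

• Given X(t₀) = x₀, a constant, it is known that

\[
\begin{aligned}
E[\, X(t) \,] &= \mu + (x_0 - \mu)\, e^{-\kappa (t - t_0)}, \qquad (83) \\
\mathrm{Var}[\, X(t) \,] &= \frac{\sigma^2}{2\kappa} \left[ 1 - e^{-2\kappa (t - t_0)} \right],
\end{aligned}
\]

for t₀ ≤ t.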
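The two moment formulas translate directly into code. The following is a minimal sketch, assuming κ > 0 and t₀ = 0; the function names `ou_mean`, `ou_var`, and `ou_sample` are illustrative and not from the slides. Because the process is normal with the mean and variance above, `ou_sample` draws each step from the exact transition distribution instead of discretizing the SDE, which is one possible design choice among several.

```python
import math
import random


def ou_mean(x0, mu, kappa, t):
    """E[X(t)] from Eq. (83) with t0 = 0."""
    return mu + (x0 - mu) * math.exp(-kappa * t)


def ou_var(sigma, kappa, t):
    """Var[X(t)] for a constant starting value, with t0 = 0 (requires kappa > 0)."""
    return sigma**2 / (2.0 * kappa) * (1.0 - math.exp(-2.0 * kappa * t))


def ou_sample(x0, mu, kappa, sigma, t, n_steps, rng):
    """Sample X(t) by chaining n_steps exact normal transitions of length t/n_steps."""
    h = t / n_steps
    sd = math.sqrt(ou_var(sigma, kappa, h))
    x = x0
    for _ in range(n_steps):
        x = ou_mean(x, mu, kappa, h) + sd * rng.gauss(0.0, 1.0)
    return x


if __name__ == "__main__":
    rng = random.Random(1)
    draws = [ou_sample(0.10, 0.05, 2.0, 0.02, 1.0, 10, rng) for _ in range(20_000)]
    print("sample mean:", sum(draws) / len(draws))
    print("Eq. (83)   :", ou_mean(0.10, 0.05, 2.0, 1.0))
```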

(27)

Ornstein-Uhlenbeck Process (concluded)

• The mean and standard deviation are roughly μ and σ/√(2κ), respectively.

• For large t, X < 0 is extremely unlikely in any finite time interval when μ > 0 is large relative to σ/√(2κ).

• The process is mean-reverting.

– X tends to move toward μ.

– Useful for modeling term structure, stock price volatility, and stock price return.^a

aSee Knutson, Wimmer, Kuhnen, & Winkielman (2008) for the biological basis for mean reversion in financial decision making.

(28)

Square-Root Process

• Suppose X is an OU process.

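• Consider V ≜ X².

• Ito’s lemma says V has the differential

\[
\begin{aligned}
dV &= 2X\, dX + (dX)^2 \\
&= 2\sqrt{V} \left( -\kappa \sqrt{V}\, dt + \sigma\, dW \right) + \sigma^2\, dt \\
&= \left( -2\kappa V + \sigma^2 \right) dt + 2\sigma \sqrt{V}\, dW,
\end{aligned}
\]

a square-root process.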

(29)

Square-Root Process (continued)

• In general, the square-root process has the SDE

\[
dX = \kappa (\mu - X)\, dt + \sigma \sqrt{X}\, dW,
\]

where κ, σ > 0, μ ≥ 0, and X(0) ≥ 0 is a constant.

• Like the OU process, it possesses mean reversion: X tends to move toward μ, but the volatility is proportional to √X instead of a constant.

(30)

Square-Root Process (continued)

• When X hits zero and μ ≥ 0, the probability is one that it will not move below zero.

– Zero is a reﬂecting boundary.

• Hence, the square-root process is a good candidate for modeling interest rates.^a

• The OU process, in contrast, allows negative interest rates.^b

• The two processes are related.^c

aCox, Ingersoll, & Ross (1985).

bSome rates did go negative in Europe in 2015.

cRecall p. 623.

(31)

Square-Root Process (concluded)

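• The random variable 2cX(t) follows the noncentral chi-square distribution,^a

\[
\chi\left( \frac{4\kappa\mu}{\sigma^2},\; 2cX(0)\, e^{-\kappa t} \right),
\]

where c ≜ (2κ/σ²)(1 − e^{−κt})⁻¹ and μ > 0.

• Given X(0) = x₀, a constant,

\[
\begin{aligned}
E[\, X(t) \,] &= x_0\, e^{-\kappa t} + \mu \left( 1 - e^{-\kappa t} \right), \\
\mathrm{Var}[\, X(t) \,] &= x_0\, \frac{\sigma^2}{\kappa} \left( e^{-\kappa t} - e^{-2\kappa t} \right) + \mu\, \frac{\sigma^2}{2\kappa} \left( 1 - e^{-\kappa t} \right)^2,
\end{aligned}
\]

for t ≥ 0.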

aWilliam Feller (1906–1970) in 1951.
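The chi-square representation and the moment formulas can be checked against each other. The sketch below assumes κ, σ > 0 and t > 0, and uses the standard facts that a noncentral chi-square variable with k degrees of freedom and noncentrality λ has mean k + λ and variance 2(k + 2λ); all function names are illustrative. Dividing by 2c should reproduce E[X(t)] and Var[X(t)] exactly.

```python
import math


def sqrt_process_mean(x0, mu, kappa, t):
    """E[X(t)] of the square-root process given X(0) = x0."""
    e = math.exp(-kappa * t)
    return x0 * e + mu * (1.0 - e)


def sqrt_process_var(x0, mu, kappa, sigma, t):
    """Var[X(t)] of the square-root process given X(0) = x0."""
    e = math.exp(-kappa * t)
    return (x0 * sigma**2 / kappa * (e - e * e)
            + mu * sigma**2 / (2.0 * kappa) * (1.0 - e) ** 2)


def chi_square_params(x0, mu, kappa, sigma, t):
    """Return (c, degrees of freedom, noncentrality) for 2*c*X(t); needs t > 0."""
    e = math.exp(-kappa * t)
    c = (2.0 * kappa / sigma**2) / (1.0 - e)
    return c, 4.0 * kappa * mu / sigma**2, 2.0 * c * x0 * e


def moments_from_chi_square(x0, mu, kappa, sigma, t):
    """Mean and variance of X(t) recovered from the distribution of 2*c*X(t)."""
    c, dof, nc = chi_square_params(x0, mu, kappa, sigma, t)
    mean = (dof + nc) / (2.0 * c)
    var = 2.0 * (dof + 2.0 * nc) / (2.0 * c) ** 2
    return mean, var


if __name__ == "__main__":
    args = (0.03, 0.05, 1.5, 0.2, 0.75)   # x0, mu, kappa, sigma, t
    print(moments_from_chi_square(*args))
    print(sqrt_process_mean(0.03, 0.05, 1.5, 0.75), sqrt_process_var(*args))
```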

(32)

Modeling Stock Prices

• The most popular stochastic model for stock prices has been the geometric Brownian motion,

\[
\frac{dS}{S} = \mu\, dt + \sigma\, dW.
\]

• The logarithmic price X ≜ ln S follows

\[
dX = \left( \mu - \frac{\sigma^2}{2} \right) dt + \sigma\, dW
\]

by Ito’s lemma.^a

aRecall Eq. (81) on p. 606. Consistent with Lemma 9 (p. 294).

(33)

Local-Volatility Models

• The more general deterministic-volatility model posits

\[
\frac{dS}{S} = (r_t - q_t)\, dt + \sigma(S, t)\, dW,
\]

where instantaneous volatility σ(S, t) is called the local-volatility function.^a

– “A local volatility model is the only complete consistent volatility model[.]”^b

• A (weak) solution exists if Sσ(S, t) is continuous and grows at most linearly in S and t.^c

aDerman & Kani (1994); Dupire (1994).

bBennett (2014).

cSkorokhod (1961); Achdou & Pironneau (2005).

(34)

Local-Volatility Models (continued)

• One needs to recover the local volatility surface σ(S, t) from the implied volatility surface.

• Theoretically,^a

\[
\sigma(X, T)^2 = 2\, \frac{\dfrac{\partial C}{\partial T} + (r_T - q_T)\, X\, \dfrac{\partial C}{\partial X} + q_T C}{X^2\, \dfrac{\partial^2 C}{\partial X^2}}. \tag{84}
\]

– C is the call price at time t = 0 (today) with strike price X and time to maturity T.

– σ(X, T) is the local volatility that will prevail at future time T and stock price S_T = X.

aDupire (1994); Andersen & Brotherton-Ratcliﬀe (1998).
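Dupire’s formula (84) can be tried on synthetic data. The sketch below is a hedged illustration, not a production method: it assumes constant r and q, generates call prices from the Black-Scholes formula with a known constant volatility, and approximates the partial derivatives by central finite differences. The function names and the step sizes `hX` and `hT` are assumptions made for this example. With exact, smooth prices the formula should return the input volatility; with noisy market prices the second difference in the denominator is where trouble typically starts.

```python
import math


def norm_cdf(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def bs_call(S, X, T, r, q, sigma):
    """Black-Scholes call price with continuous dividend yield q."""
    d1 = (math.log(S / X) + (r - q + 0.5 * sigma**2) * T) / (sigma * math.sqrt(T))
    d2 = d1 - sigma * math.sqrt(T)
    return S * math.exp(-q * T) * norm_cdf(d1) - X * math.exp(-r * T) * norm_cdf(d2)


def dupire_local_variance(call, X, T, r, q, hX=0.5, hT=1e-4):
    """Local variance sigma(X, T)^2 from Eq. (84); call(X, T) returns today's price.

    The result can be negative or blow up when the input prices are noisy.
    """
    c = call(X, T)
    c_T = (call(X, T + hT) - call(X, T - hT)) / (2.0 * hT)
    c_X = (call(X + hX, T) - call(X - hX, T)) / (2.0 * hX)
    c_XX = (call(X + hX, T) - 2.0 * c + call(X - hX, T)) / hX**2
    return 2.0 * (c_T + (r - q) * X * c_X + q * c) / (X**2 * c_XX)


if __name__ == "__main__":
    r, q, vol = 0.03, 0.01, 0.2
    surface = lambda X, T: bs_call(100.0, X, T, r, q, vol)
    print(math.sqrt(dupire_local_variance(surface, 100.0, 1.0, r, q)))
```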

(35)

Local-Volatility Models (continued)

• For more general models, this equation gives the expectation as seen from today, under the risk-neutral probability, of the instantaneous variance at time T given that S_T = X.^a

• In practice, the σ(S, t)² derived by Dupire’s formula (84) may have spikes, vary wildly, or even be negative.

• The term ∂²C/∂X² in the denominator often results in numerical instability.

aDerman & Kani (1997); R. W. Lee (2001); Derman & M. B. Miller (2016).

(36)

Local-Volatility Models (continued)

• Denote the implied volatility surface by Σ(X, T ) and the local volatility surface by σ(S, t).

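• The relation between Σ(X, T) and σ(X, T) is^a

\[
\sigma(X, T)^2 = \frac{\Sigma^2 + 2\Sigma\tau \left[ \dfrac{\partial \Sigma}{\partial T} + (r_T - q_T)\, X\, \dfrac{\partial \Sigma}{\partial X} \right]}{\left( 1 - \dfrac{Xy}{\Sigma} \dfrac{\partial \Sigma}{\partial X} \right)^2 + X\Sigma\tau \left[ \dfrac{\partial \Sigma}{\partial X} - \dfrac{X\Sigma\tau}{4} \left( \dfrac{\partial \Sigma}{\partial X} \right)^2 + X\, \dfrac{\partial^2 \Sigma}{\partial X^2} \right]},
\]

where

\[
\tau \triangleq T - t, \qquad y \triangleq \ln(X/S_t) + \int_t^T (q_s - r_s)\, ds.
\]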

aAndreasen (1996); Andersen & Brotherton-Ratcliﬀe (1998);

Gatheral (2003); Wilmott (2006); Kamp (2009).

(37)

Local-Volatility Models (continued)

• Although this version may be more stable than Eq. (84) on p. 629, it is expected to suﬀer from similar problems.

• Under fairly loose conditions, Σ is symmetric if and only if σ is, in terms of y ≜ ln(S_t/X) instead of X.^a

• Small changes to the implied volatility surface may produce big changes to the local volatility surface.

aR. W. Lee (2001).

(38)

Implied and Local Volatility Surfaces

[Figure: two surface plots. Implied Vol Surface: implied vol (%) against strike ($) and time to maturity (yr). Local Vol Surface: local vol (%) against stock price ($) and time (yr).]

(39)

Local-Volatility Models (continued)

• In reality, option prices only exist for a ﬁnite set of maturities and strike prices.

• Hence interpolation and extrapolation may be needed to construct the volatility surface.^a

• But then some implied volatility surfaces generate option prices that allow arbitrage opportunities.^b

aDoing it to the option prices produces worse results (Li, 2000/2001).

bSee Rebonato (2004) for an example.

(40)

Local-Volatility Models (concluded)

• There exist conditions for a set of option prices to be arbitrage-free.^a

• Some adopt parameterized implied volatility surfaces that guarantee freedom from certain arbitrages.^b

• For some vanilla equity options, the Black-Scholes model seems better than the local-volatility model in predictive power.^c

• The exact opposite is concluded for hedging in equity index markets!^d

aDavis & Hobson (2007).

bGatheral & Jacquier (2014).

cDumas, Fleming, & Whaley (1998).

dCrépey (2004); Derman & M. B. Miller (2016).

(41)

Local-Volatility Models: Popularity

• Hirsa and Neftci (2014), “most traders and ﬁrms actively utilize this [local-volatility] model.”

• Bennett (2014), “Of all the four volatility regimes,

[sticky local volatility] is arguably the most realistic and fairly prices skew.”

• Derman & M. B. Miller (2016), “Right or wrong, local volatility models have become popular and ubiquitous in modeling the smile.”

(42)

Implied Trees

• The trees for the local volatility model are called implied trees.^a

• Their construction requires option prices at all strike prices and maturities.

– That is, an implied volatility surface.

• The local volatility model does not require that the implied tree combine.

• Exponential-sized implied trees exist.^b

aDerman & Kani (1994); Dupire (1994); Rubinstein (1994).

bCharalambous, Christofides, & Martzoukos (2007); Gong & Xu (2019).

(43)

Implied Trees (continued)

• How to construct a valid implied tree with eﬃciency has been open for a long time.^a

– Reasons may include: noise and nonsynchrony in data, arbitrage opportunities in the smoothed and interpolated/extrapolated implied volatility surface, wrong model, wrong algorithms, nonlinearity,

instability, etc.

• Inversion is an ill-posed numerical problem.^b

aRubinstein (1994); Derman & Kani (1994); Derman, Kani, & Chriss (1996); Jackwerth & Rubinstein (1996); Jackwerth (1997); Coleman, Kim, Li, & Verma (2000); Li (2000/2001); Rebonato (2004); Moriggia, Muzzioli, & Torricelli (2009).

bAyache, Henrotte, Nassar, & X. Wang (2004).

(44)

Implied Trees (continued)

• It is ﬁnally solved for separable local volatilities.^a

– The local-volatility function σ(S, t) is separable^b if σ(S, t) = σ₁(S) σ₂(t).

• A solution is also available for any upper- and lower-bounded σ.^c

aLok (D99922028) & Lyuu (2015, 2016, 2017).

bBrace, Gątarek, & Musiela (1997); Rebonato (2004).

cLok (D99922028) & Lyuu (2016, 2017, 2020).

(45)

Implied Trees (concluded)^a

[Figure: plot accompanying the implied-tree discussion; axis labels are not recoverable.]

aPlot supplied by Prof. Lok, U Hou (D99922028) on May 4, 2019.

(46)

The Hull-White Model

• Hull and White (1987) postulate the following stochastic-volatility model,

\[
\frac{dS}{S} = r\, dt + \sqrt{V}\, dW_1, \qquad dV = \mu_v V\, dt + bV\, dW_2.
\]

• Above, V is the instantaneous variance.

• They assume μ_v depends on V and t (but not S).

(47)

The Barone-Adesi–Rasmussen–Ravanelli Model

• Barone-Adesi, Rasmussen, and Ravanelli (2005) postulate the following model,

\[
\frac{dS}{S} = \mu\, dt + \sqrt{V}\, dW_1, \qquad dV = \kappa (\theta - V)\, dt + bV\, dW_2.
\]

• Above, W₁ and W₂ are correlated.

(48)

The Stein-Stein Model

• E. Stein and J. Stein (1991) postulate the following model,

\[
\frac{dS}{S} = r\, dt + V\, dW_1, \qquad dV = \kappa (\mu - V)\, dt + \sigma\, dW_2.
\]

• Closed-form formulas exist for European calls and puts.^a

aSchöbel & Zhu (1999).

(49)

The SABR Model

• Hagan, Kumar, Lesniewski, and Woodward (2002) postulate the following model,

\[
\frac{dS}{S} = r\, dt + S^{\theta} V\, dW_1, \qquad dV = bV\, dW_2,
\]

for 0 ≤ θ ≤ 1.

• A nice feature of this model is that the implied volatility surface has a compact approximate closed form.

(50)

The Blacher Model

• Blacher (2001) postulates the following model,

\[
\frac{dS}{S} = r\, dt + \sigma \left[ 1 + \alpha (S - S_0) + \beta (S - S_0)^2 \right] dW_1, \qquad d\sigma = \kappa (\theta - \sigma)\, dt + \sigma\, dW_2.
\]

• The volatility σ follows a mean-reverting process to level θ.

(51)

The Hilliard-Schwartz Model

• Hilliard and Schwartz (1996) postulate the following general model,

\[
\frac{dS}{S} = r\, dt + f(S)\, V^{a}\, dW_1, \qquad dV = \mu(V)\, dt + bV\, dW_2,
\]

for some well-behaved function f(S) and constant a.

• It includes all previously mentioned stochastic-volatility models as special cases.^a

aH. Chiu (R98723059) (2012).

(52)

Heston’s Stochastic-Volatility Model

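• Heston (1993) assumes the stock price follows

\[
\begin{aligned}
\frac{dS}{S} &= (\mu - q)\, dt + \sqrt{V}\, dW_1, \qquad (85) \\
dV &= \kappa (\theta - V)\, dt + \sigma \sqrt{V}\, dW_2. \qquad (86)
\end{aligned}
\]

– V is the instantaneous variance, which follows a square-root process.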

– dW₁ and dW₂ have correlation ρ.

– The riskless rate r is constant.

• It may be the most popular continuous-time stochastic-volatility model.^a

aChristoﬀersen, Heston, & Jacobs (2009).

(53)

Heston’s Stochastic-Volatility Model (continued)

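• Heston assumes the market price of risk is b₂√V.

• So μ = r + b₂V.

• Define

\[
\begin{aligned}
dW_1^* &= dW_1 + b_2 \sqrt{V}\, dt, \\
dW_2^* &= dW_2 + \rho b_2 \sqrt{V}\, dt, \\
\kappa^* &= \kappa + \rho b_2 \sigma, \\
\theta^* &= \frac{\theta \kappa}{\kappa + \rho b_2 \sigma}.
\end{aligned}
\]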

• dW₁^∗ and dW₂^∗ have correlation ρ.

(54)

Heston’s Stochastic-Volatility Model (continued)

• Under the risk-neutral probability measure Q, both W₁^∗ and W₂^∗ are Wiener processes.

• Heston’s model becomes, under probability measure Q,

\[
\begin{aligned}
\frac{dS}{S} &= (r - q)\, dt + \sqrt{V}\, dW_1^*, \\
dV &= \kappa^* (\theta^* - V)\, dt + \sigma \sqrt{V}\, dW_2^*.
\end{aligned}
\]
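The risk-neutral dynamics can be simulated step by step. The following is a rough sketch, not Heston’s own method: it uses a log-Euler step for S and an Euler step for V with the variance floored at zero inside the coefficients (often called full truncation), which is a discretization choice made here for illustration. The function name `heston_euler` and all parameter values are assumptions; `kappa` and `theta` stand for κ* and θ*.

```python
import math
import random


def heston_euler(S0, V0, r, q, kappa, theta, sigma, rho, T, n_steps, rng):
    """Simulate one path of (S, V) under Q and return the terminal pair."""
    dt = T / n_steps
    S, V = S0, V0
    for _ in range(n_steps):
        z1 = rng.gauss(0.0, 1.0)
        # Build a second normal with correlation rho to z1.
        z2 = rho * z1 + math.sqrt(1.0 - rho * rho) * rng.gauss(0.0, 1.0)
        v_pos = max(V, 0.0)                       # floor the variance at zero
        S *= math.exp((r - q - 0.5 * v_pos) * dt + math.sqrt(v_pos * dt) * z1)
        V += kappa * (theta - v_pos) * dt + sigma * math.sqrt(v_pos * dt) * z2
    return S, max(V, 0.0)


def terminal_means(n_paths=4000, seed=0):
    """Average terminal stock price and variance over n_paths simulated paths."""
    rng = random.Random(seed)
    sum_s = sum_v = 0.0
    for _ in range(n_paths):
        s, v = heston_euler(100.0, 0.04, 0.03, 0.01, 2.0, 0.04, 0.3, -0.7, 1.0, 50, rng)
        sum_s += s
        sum_v += v
    return sum_s / n_paths, sum_v / n_paths


if __name__ == "__main__":
    print(terminal_means())
```

Under these assumed parameters the average terminal stock price should sit near S₀e^{(r−q)T} and the average terminal variance near θ*, up to sampling and discretization error.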

(55)

Heston’s Stochastic-Volatility Model (continued)

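• Define

\[
\begin{aligned}
\phi(u, \tau) &= \exp\Bigg\{ \imath u \left( \ln S + (r - q)\, \tau \right)
 + \theta^* \kappa^* \sigma^{-2} \left[ (\kappa^* - \rho\sigma u \imath - d)\, \tau - 2 \ln \frac{1 - g e^{-d\tau}}{1 - g} \right] \\
&\qquad\quad + v\, \sigma^{-2} (\kappa^* - \rho\sigma u \imath - d)\, \frac{1 - e^{-d\tau}}{1 - g e^{-d\tau}} \Bigg\}, \\
d &= \sqrt{ (\rho\sigma u \imath - \kappa^*)^2 - \sigma^2 (-\imath u - u^2) }, \\
g &= (\kappa^* - \rho\sigma u \imath - d) / (\kappa^* - \rho\sigma u \imath + d).
\end{aligned}
\]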

(56)

Heston’s Stochastic-Volatility Model (continued)

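The formulas for European calls and puts are^a

\[
\begin{aligned}
C &= S \left[ \frac{1}{2} + \frac{1}{\pi} \int_0^\infty \mathrm{Re}\left( \frac{X^{-\imath u}\, \phi(u - \imath, \tau)}{\imath u\, S e^{r\tau}} \right) du \right]
 - X e^{-r\tau} \left[ \frac{1}{2} + \frac{1}{\pi} \int_0^\infty \mathrm{Re}\left( \frac{X^{-\imath u}\, \phi(u, \tau)}{\imath u} \right) du \right], \\
P &= X e^{-r\tau} \left[ \frac{1}{2} - \frac{1}{\pi} \int_0^\infty \mathrm{Re}\left( \frac{X^{-\imath u}\, \phi(u, \tau)}{\imath u} \right) du \right]
 - S \left[ \frac{1}{2} - \frac{1}{\pi} \int_0^\infty \mathrm{Re}\left( \frac{X^{-\imath u}\, \phi(u - \imath, \tau)}{\imath u\, S e^{r\tau}} \right) du \right],
\end{aligned}
\]

where ı = √−1 and Re(x) denotes the real part of the complex number x.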

aContributed by Mr. Chen, Chun-Ying (D95723006) on August 17, 2008 and Mr. Liou, Yan-Fu (R92723060) on August 26, 2008. See Lord &

Kahl (2009) and Cui, Rollin, & Germano (2017) for alternative formulas.

(57)

Heston’s Stochastic-Volatility Model (concluded)

• For American options, trees are needed.

• They are all O(n³)-sized and do not match all moments.^a

• An O(n^2.5)-sized 9-jump tree that matches all means and variances with valid probabilities is available.^b

• The size reduces to O(n²) for knock-out double-barrier options.^c

aNelson & Ramaswamy (1990); Nawalkha & Beliaeva (2007); Leisen (2010); Beliaeva & Nawalkha (2010); M. Chou (R02723073) (2015); M.

Chou (R02723073) & Lyuu (2016).

bZ. Lu (D00922011) & Lyuu (2018).

cZ. Lu (D00922011) & Lyuu (2018).

(58)

Stochastic-Volatility Models and Further Extensions

^a

• How to explain the October 1987 crash?

– The Dow Jones Industrial Average fell 22.61% on October 19, 1987 (called the Black Monday).

– The CBOE S&P 100 Volatility Index (VXO) shot up to 150%, the highest VXO ever recorded.^b

• Stochastic-volatility models require an implausibly high-volatility level prior to and after the crash.

– Because the processes are continuous.

• Discontinuous jump models in the asset price can alleviate the problem somewhat.^c

aEraker (2004).

bCaprio (2012).

(59)

Stochastic-Volatility Models and Further Extensions (continued)

• But if the jump intensity is a constant, it cannot explain the tendency of large movements to cluster over time.

• This assumption also has no impact on option prices.

• Jump-diﬀusion models combine both.

– E.g., add a jump process to Eq. (85) on p. 647.

– Closed-form formulas exist for GARCH-jump option pricing models.^a

aLiou (R92723060) (2005).

(60)

Stochastic-Volatility Models and Further Extensions (concluded)

• But they still do not adequately describe the systematic variations in option prices.^a

• Jumps in volatility are alternatives.^b

– E.g., add correlated jump processes to Eqs. (85) and (86) on p. 647.

• Such models allow a high level of volatility caused by a jump in volatility.^c

aBates (2000); Pan (2002).

bDuﬃe, Pan, & Singleton (2000).

cEraker, Johannes, & Polson (2000); Y. Lin (2007); Zhu & Lian (2012).

(61)

Why Are Trees for Stochastic-Volatility Models Diﬃcult?

• The CRR tree is 2-dimensional.^a

• The constant volatility makes the span from any node ﬁxed.

• But a tree for a stochastic-volatility model must be 3-dimensional.

– Every node is associated with a combination of stock price and volatility.

aRecall p. 291.

(62)

Why Are Trees for Stochastic-Volatility Models Difficult (Binomial Case)?

(63)

Why Are Trees for Stochastic-Volatility Models Difficult (Trinomial Case)?

(64)

Why Are Trees for Stochastic-Volatility Models Diﬃcult? (concluded)

• Locally, the tree looks ﬁne for one time step.

• But the volatility regulates the spans of the nodes on the stock-price plane.

• Unfortunately, those spans diﬀer from node to node because the volatility varies.

• So two time steps from now, the branches will not combine!

• Smart ideas are thus needed.

(65)

Complexities of Stochastic-Volatility Models

• A few stochastic-volatility models suﬀer from subexponential (c^√ⁿ) tree size.

• Examples include the Hull-White (1987),

Hilliard-Schwartz (1996), and SABR (2002) models.^a

• Future research may extend this negative result to more stochastic-volatility models.

– We suspect many GARCH option pricing models entertain similar problems.^b

aH. Chiu (R98723059) (2012).

bY. C. Chen (R95723051) (2008); Y. C. Chen (R95723051), Lyuu, &

Wen (D94922003) (2011).

(66)

Complexities of Stochastic-Volatility Models (concluded)

• Flexible placement of nodes and removal of low-probability nodes may make the models O(n^2.5)-sized!^a

• Calibration can be computationally hard.

– Few have tried it on exotic options.^b

• There are usually several local minima.^c

– They will give diﬀerent prices to options not used in the calibration.

– But which set captures the smile dynamics?

aZ. Lu (D00922011) & Lyuu (2018).

bAyache, Henrotte, Nassar, & X. Wang (2004).

(67)

Continuous-Time Derivatives Pricing

(68)

I have hardly met a mathematician who was capable of reasoning.

— Plato (428 B.C.–347 B.C.)

Fischer [Black] is the only real genius I’ve ever met in finance. Other people, like Robert Merton or Stephen Ross, are just very smart and quick, but they think like me. Fischer came from someplace else entirely.

— John C. Cox, quoted in Mehrling (2005)

(69)

Toward the Black-Scholes Diﬀerential Equation

• The price of any derivative on a non-dividend-paying stock must satisfy a partial diﬀerential equation (PDE).

• The key step is recognizing that the same random process drives both securities.

– Their prices are perfectly correlated.

• We then ﬁgure out the amount of stock such that the gain from it oﬀsets exactly the loss from the derivative.

• The removal of uncertainty forces the portfolio’s return to be the riskless rate.

• PDEs allow many numerical methods to be applicable.

(70)

Assumptions

^a

• The stock price follows dS = μS dt + σS dW .

• There are no dividends.

• Trading is continuous, and short selling is allowed.

• There are no transactions costs or taxes.

• All securities are inﬁnitely divisible.

• The term structure of riskless rates is ﬂat at r.

• There is unlimited riskless borrowing and lending.

• t is the current time, T is the expiration time, and τ ≜ T − t.

aDerman & Taleb (2005) summarizes criticisms on these assumptions and the replication argument.

(71)

Black-Scholes Diﬀerential Equation

• Let C be the price of a derivative on S.

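• From Ito’s lemma (p. 599),

\[
dC = \left( \mu S\, \frac{\partial C}{\partial S} + \frac{\partial C}{\partial t} + \frac{1}{2} \sigma^2 S^2\, \frac{\partial^2 C}{\partial S^2} \right) dt + \sigma S\, \frac{\partial C}{\partial S}\, dW.
\]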

– The same W drives both C and S.

• Short one derivative and long ∂C/∂S shares of stock (call it Π).

• By construction,

Π = −C + S(∂C/∂S).

(72)

Black-Scholes Diﬀerential Equation (continued)

• The change in the value of the portfolio over time dt is^a

\[
d\Pi = -dC + \frac{\partial C}{\partial S}\, dS.
\]

• Substitute the formulas for dC and dS into the equation above to yield

\[
d\Pi = \left( -\frac{\partial C}{\partial t} - \frac{1}{2} \sigma^2 S^2\, \frac{\partial^2 C}{\partial S^2} \right) dt.
\]

• As this equation does not involve dW, the portfolio is riskless during dt time: dΠ = rΠ dt.

aBergman (1982) and Bartels (1995) argue this is not quite right. But see Macdonald (1997).

(73)

Black-Scholes Diﬀerential Equation (continued)

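• So

\[
\left( \frac{\partial C}{\partial t} + \frac{1}{2} \sigma^2 S^2\, \frac{\partial^2 C}{\partial S^2} \right) dt = r \left( C - S\, \frac{\partial C}{\partial S} \right) dt.
\]

• Equate the terms to finally obtain

\[
\frac{\partial C}{\partial t} + rS\, \frac{\partial C}{\partial S} + \frac{1}{2} \sigma^2 S^2\, \frac{\partial^2 C}{\partial S^2} = rC.
\]

• This is a backward equation, which describes the dynamics of a derivative’s price forward in physical time.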

(74)

Black-Scholes Diﬀerential Equation (concluded)

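• When there is a dividend yield q,

\[
\frac{\partial C}{\partial t} + (r - q)\, S\, \frac{\partial C}{\partial S} + \frac{1}{2} \sigma^2 S^2\, \frac{\partial^2 C}{\partial S^2} = rC. \tag{87}
\]

• Dupire’s formula (84) for the local-volatility model^a is simply the dual of this equation:^b

\[
\frac{\partial C}{\partial T} + (r_T - q_T)\, X\, \frac{\partial C}{\partial X} - \frac{1}{2} \sigma(X, T)^2 X^2\, \frac{\partial^2 C}{\partial X^2} = -q_T C.
\]

• This is a forward equation, which describes the dynamics of a derivative’s price backward in maturity time.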

aSee p. 629.

bDerman & Kani (1997).

(75)

Rephrase

• The Black-Scholes differential equation can be expressed in terms of sensitivity numbers,

\[
\Theta + rS\Delta + \frac{1}{2} \sigma^2 S^2 \Gamma = rC. \tag{88}
\]

• Identity (88) leads to an alternative way of computing Θ numerically from Δ and Γ.

• When a portfolio is delta-neutral,

\[
\Theta + \frac{1}{2} \sigma^2 S^2 \Gamma = rC.
\]

– A deﬁnite relation thus exists between Γ and Θ.
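Identity (88) can be exercised on a European call. The sketch below assumes the Black-Scholes setting with no dividends and uses the well-known closed forms Δ = N(d₁) and Γ = N′(d₁)/(Sσ√τ) for a call; the function names are illustrative. It computes Θ from Δ and Γ via (88) and compares it with a finite-difference estimate of ∂C/∂t = −∂C/∂τ.

```python
import math


def norm_cdf(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def norm_pdf(x):
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def d1(S, X, r, sigma, tau):
    return (math.log(S / X) + (r + 0.5 * sigma**2) * tau) / (sigma * math.sqrt(tau))


def bs_call(S, X, r, sigma, tau):
    """Black-Scholes call on a non-dividend-paying stock; tau is time to expiration."""
    a = d1(S, X, r, sigma, tau)
    return S * norm_cdf(a) - X * math.exp(-r * tau) * norm_cdf(a - sigma * math.sqrt(tau))


def call_delta_gamma(S, X, r, sigma, tau):
    a = d1(S, X, r, sigma, tau)
    return norm_cdf(a), norm_pdf(a) / (S * sigma * math.sqrt(tau))


def theta_from_delta_gamma(C, S, r, sigma, delta, gamma):
    """Solve identity (88) for theta."""
    return r * C - r * S * delta - 0.5 * sigma**2 * S**2 * gamma


if __name__ == "__main__":
    S, X, r, sigma, tau = 100.0, 95.0, 0.05, 0.25, 0.75
    delta, gamma = call_delta_gamma(S, X, r, sigma, tau)
    print(theta_from_delta_gamma(bs_call(S, X, r, sigma, tau), S, r, sigma, delta, gamma))
```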

(76)

Black-Scholes Diﬀerential Equation: An Alternative

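• Perform the change of variable V ≜ ln S.

• The option value becomes U(V, t) ≜ C(e^V, t).

• Furthermore,

\[
\begin{aligned}
\frac{\partial C}{\partial t} &= \frac{\partial U}{\partial t}, \\
\frac{\partial C}{\partial S} &= \frac{1}{S} \frac{\partial U}{\partial V}, \qquad (89) \\
\frac{\partial^2 C}{\partial S^2} &= \frac{1}{S^2} \frac{\partial^2 U}{\partial V^2} - \frac{1}{S^2} \frac{\partial U}{\partial V}. \qquad (90)
\end{aligned}
\]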
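Equations (89) and (90) suggest computing delta and gamma from differences taken in the logarithmic price. The following is a minimal sketch under the assumption that U is smooth and can be evaluated at V ± h; the function name `delta_gamma_from_log_price` and the step size are illustrative. The demonstration uses a Black-Scholes call as the pricing function only because its delta and gamma are known in closed form.

```python
import math


def norm_cdf(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def norm_pdf(x):
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def bs_call(S, X, r, sigma, tau):
    d1 = (math.log(S / X) + (r + 0.5 * sigma**2) * tau) / (sigma * math.sqrt(tau))
    return S * norm_cdf(d1) - X * math.exp(-r * tau) * norm_cdf(d1 - sigma * math.sqrt(tau))


def bs_delta_gamma(S, X, r, sigma, tau):
    d1 = (math.log(S / X) + (r + 0.5 * sigma**2) * tau) / (sigma * math.sqrt(tau))
    return norm_cdf(d1), norm_pdf(d1) / (S * sigma * math.sqrt(tau))


def delta_gamma_from_log_price(U, V, h=1e-3):
    """Delta and gamma at S = e^V from central differences of U in V, per (89)-(90)."""
    S = math.exp(V)
    U_V = (U(V + h) - U(V - h)) / (2.0 * h)
    U_VV = (U(V + h) - 2.0 * U(V) + U(V - h)) / h**2
    return U_V / S, (U_VV - U_V) / S**2


if __name__ == "__main__":
    X, r, sigma, tau = 95.0, 0.05, 0.25, 0.75
    U = lambda V: bs_call(math.exp(V), X, r, sigma, tau)
    print(delta_gamma_from_log_price(U, math.log(100.0)))
    print(bs_delta_gamma(100.0, X, r, sigma, tau))
```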

(77)

Black-Scholes Diﬀerential Equation: An Alternative (concluded)

• Equations (89) and (90) are alternative ways to calculate delta and gamma.^a

• They are particularly useful for a tree of logarithmic prices.

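• The Black-Scholes differential equation (87) on p. 669 becomes

\[
\frac{1}{2} \sigma^2\, \frac{\partial^2 U}{\partial V^2} + \left( r - q - \frac{\sigma^2}{2} \right) \frac{\partial U}{\partial V} - rU + \frac{\partial U}{\partial t} = 0
\]

subject to U(V, T) being the payoff such as max(X − e^V, 0).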

aSee Eqs. (49) on p. 358 and (50) on p. 360.

(78)

[Black] got the equation [in 1969] but then was unable to solve it. Had he been a better physicist he would have recognized it as a form of the familiar heat exchange equation, and applied the known solution. Had he been a better mathematician, he could have solved the equation from ﬁrst principles.

Certainly Merton would have known exactly what to do with the equation had he ever seen it.

— Perry Mehrling (2005)

(79)

PDEs for Asian Options

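• Add the new variable

\[
A(t) \triangleq \int_0^t S(u)\, du.
\]

• Then the value V of the Asian option satisfies this two-dimensional PDE:^a

\[
\frac{\partial V}{\partial t} + rS\, \frac{\partial V}{\partial S} + \frac{1}{2} \sigma^2 S^2\, \frac{\partial^2 V}{\partial S^2} + S\, \frac{\partial V}{\partial A} = rV.
\]

• The terminal conditions are

\[
V(T, S, A) = \max\left( \frac{A}{T} - X,\, 0 \right) \text{ for call}, \qquad
V(T, S, A) = \max\left( X - \frac{A}{T},\, 0 \right) \text{ for put}.
\]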

aKemna & Vorst (1990).

(80)

PDEs for Asian Options (continued)

• The two-dimensional PDE produces algorithms similar to that on pp. 436ﬀ.^a

• But one-dimensional PDEs are available for Asian options.^b

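• For example, Večeř (2001) derives the following PDE for Asian calls:

\[
\frac{\partial u}{\partial t} + r \left( 1 - \frac{t}{T} - z \right) \frac{\partial u}{\partial z} + \frac{\left( 1 - \frac{t}{T} - z \right)^2 \sigma^2}{2}\, \frac{\partial^2 u}{\partial z^2} = 0
\]

with the terminal condition u(T, z) = max(z, 0).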

aBarraquand & Pudet (1996).

bRogers & Shi (1995); Večeř (2001); Dubois & Lelièvre (2005).

(81)

PDEs for Asian Options (concluded)

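• For Asian puts:

\[
\frac{\partial u}{\partial t} + r \left( \frac{t}{T} - 1 - z \right) \frac{\partial u}{\partial z} + \frac{\left( \frac{t}{T} - 1 - z \right)^2 \sigma^2}{2}\, \frac{\partial^2 u}{\partial z^2} = 0
\]

with the same terminal condition.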

• One-dimensional PDEs lead to highly eﬃcient numerical methods.

(82)

Hedging

(83)

When Professors Scholes and Merton and I invested in warrants, Professor Merton lost the most money.

And I lost the least.

— Fischer Black (1938–1995)

(84)

Delta Hedge

• The delta (hedge ratio) of a derivative f is defined as

\[
\Delta \triangleq \frac{\partial f}{\partial S}.
\]

• Thus

Δf ≈ Δ × ΔS

for relatively small changes in the stock price, ΔS.

• A delta-neutral portfolio is hedged as it is immunized against small changes in the stock price.

• A trading strategy that dynamically maintains a delta-neutral portfolio is called delta hedge.

(85)

Delta Hedge (concluded)

• Delta changes with the stock price.

• A delta hedge needs to be rebalanced periodically in order to maintain delta neutrality.

• In the limit where the portfolio is adjusted continuously,

“perfect” hedge is achieved and the strategy becomes self-ﬁnancing.

(86)

Implementing Delta Hedge

• We want to hedge N short derivatives.

• Assume the stock pays no dividends.

• The delta-neutral portfolio maintains N × Δ shares of stock plus B borrowed dollars such that

−N × f + N × Δ × S − B = 0.

• At the next rebalancing point, when the delta is Δ′, buy N × (Δ′ − Δ) shares to maintain N × Δ′ shares.

• Delta hedge is the discrete-time analog of the

continuous-time limit and will rarely be self-ﬁnancing.
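The procedure above can be sketched as a small simulation. This is an illustrative setup, not the slides’ own experiment: it assumes N = 1 short European call priced and hedged with the Black-Scholes formula, a stock following geometric Brownian motion with drift μ, and purchases financed by borrowing at rate r; all function names and parameter values are hypothetical. The terminal value of the hedged position is the hedging error, and its root-mean-square should shrink as rebalancing becomes more frequent.

```python
import math
import random


def norm_cdf(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def bs_call_and_delta(S, X, r, sigma, tau):
    """Black-Scholes call price and delta (tau > 0, no dividends)."""
    d1 = (math.log(S / X) + (r + 0.5 * sigma**2) * tau) / (sigma * math.sqrt(tau))
    price = S * norm_cdf(d1) - X * math.exp(-r * tau) * norm_cdf(d1 - sigma * math.sqrt(tau))
    return price, norm_cdf(d1)


def hedge_error(S0, X, r, mu, sigma, T, n_steps, rng):
    """Terminal value of: short one call, long delta shares, B dollars borrowed."""
    dt = T / n_steps
    S = S0
    f, delta = bs_call_and_delta(S, X, r, sigma, T)
    B = delta * S - f                      # so that -f + delta*S - B = 0 initially
    for i in range(1, n_steps + 1):
        z = rng.gauss(0.0, 1.0)
        S *= math.exp((mu - 0.5 * sigma**2) * dt + sigma * math.sqrt(dt) * z)
        B *= math.exp(r * dt)              # the loan accrues interest
        if i < n_steps:                    # rebalance to the new delta
            _, new_delta = bs_call_and_delta(S, X, r, sigma, T - i * dt)
            B += (new_delta - delta) * S   # borrow to buy (or repay when selling)
            delta = new_delta
    return delta * S - B - max(S - X, 0.0)


def rms_hedge_error(n_steps, n_paths=400, seed=0):
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_paths):
        e = hedge_error(100.0, 100.0, 0.05, 0.10, 0.2, 1.0, n_steps, rng)
        total += e * e
    return math.sqrt(total / n_paths)


if __name__ == "__main__":
    for n in (4, 20, 100):
        print(n, rms_hedge_error(n))
```

The error does not vanish for any finite number of rebalancing dates, which is one way to see why the discrete-time strategy is rarely self-financing.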