2. for any $0 \le t_0 < t_1 < \cdots < t_n$, the random variables $X(t_k) - X(t_{k-1})$ for $1 \le k \le n$ are independent.b
3. for $0 \le s < t$, $X(t) - X(s)$ is normally distributed with mean $\mu(t - s)$ and variance $\sigma^2(t - s)$, where $\mu$ and $\sigma \ne 0$ are real numbers.
aRobert Brown (1773–1858).
bSo X(t) − X(s) is independent of X(r) for r ≤ s < t.
Brownian Motion (concluded)
• The existence and uniqueness of such a process are guaranteed by Wiener’s theorem.a
• This process will be called a (μ, σ) Brownian motion with drift μ and variance $\sigma^2$.
• The (0, 1) Brownian motion is called the Wiener process.
aNorbert Wiener (1894–1964). He received his Ph.D. from Harvard in 1912.
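The defining properties suggest a direct way to sample a path on a time grid: add independent normal increments. The following is a minimal Python sketch, assuming X(0) = 0 (the starting value is not restated in this part of the notes); the function name `simulate_brownian_motion` and the grid are illustrative choices.

```python
import math
import random

def simulate_brownian_motion(mu, sigma, times, rng):
    """Sample X at the given increasing times, starting from X(0) = 0 (assumed).

    Each increment X(t) - X(s) is drawn independently from
    N(mu * (t - s), sigma**2 * (t - s)).
    """
    path = []
    x, prev_t = 0.0, 0.0
    for t in times:
        dt = t - prev_t
        x += rng.gauss(mu * dt, sigma * math.sqrt(dt))
        path.append(x)
        prev_t = t
    return path

rng = random.Random(0)                      # fixed seed for reproducibility
grid = [k / 100 for k in range(1, 101)]     # illustrative grid on (0, 1]
wiener_path = simulate_brownian_motion(0.0, 1.0, grid, rng)   # (0, 1) case: Wiener process
drift_path = simulate_brownian_motion(0.5, 0.2, grid, rng)    # a (0.5, 0.2) Brownian motion
```

Because the increments over disjoint intervals are independent, the grid may be as coarse or as fine as desired without changing the distribution of the sampled values.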
Ito Process
a• A shorthandb is the following stochastic differential equation for the Ito differential $dX_t$,
$$dX_t = a(X_t, t)\,dt + b(X_t, t)\,dW_t. \qquad (14)$$
– Or simply
$$dX_t = a_t\,dt + b_t\,dW_t.$$
aIto (1944).
bPaul Langevin (1872–1946) in 1904.
Ito Process (concluded)
• dW is normally distributed with mean zero and variance dt.
• An equivalent form of Eq. (14) is
$$dX_t = a_t\,dt + b_t\sqrt{dt}\,\xi, \qquad (15)$$
where $\xi \sim N(0, 1)$.
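Read with a small but finite step in place of dt, Eq. (15) becomes a simulation recipe (commonly called the Euler-Maruyama scheme, a name not used in these notes). The sketch below is one possible rendering; the function name `euler_maruyama` and the coefficient functions in the example are hypothetical.

```python
import math
import random

def euler_maruyama(a, b, x0, horizon, n_steps, rng):
    """Approximate one path of dX = a(X, t) dt + b(X, t) dW on [0, horizon]."""
    dt = horizon / n_steps
    x, t = x0, 0.0
    path = [x0]
    for _ in range(n_steps):
        xi = rng.gauss(0.0, 1.0)                            # xi ~ N(0, 1)
        x += a(x, t) * dt + b(x, t) * math.sqrt(dt) * xi    # Eq. (15) with a finite dt
        t += dt
        path.append(x)
    return path

# Hypothetical coefficients for illustration: a = 0.05 x and b = 0.2 x.
path = euler_maruyama(lambda x, t: 0.05 * x, lambda x, t: 0.2 * x,
                      100.0, 1.0, 250, random.Random(42))
```

The scheme is only an approximation for a finite step; its accuracy depends on the step size and on how smooth a and b are.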
Modeling Stock Prices
• The most popular stochastic model for stock prices has been the geometric Brownian motion,
$$\frac{dS}{S} = \mu\,dt + \sigma\,dW.$$
• The continuously compounded rate of return $X \equiv \ln S$ follows
$$dX = (\mu - \sigma^2/2)\,dt + \sigma\,dW$$
by Ito’s lemma.a
aConsistent with Lemma 1 (p. 84).
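The step from S to ln S can be spelled out. The following is a sketch of the standard calculation, assuming the usual one-dimensional form of Ito's lemma for a twice-differentiable function f(S) (the lemma itself is not restated in this part of the notes).

```latex
\begin{align*}
% Ito's lemma applied to f(S), with dS = \mu S\,dt + \sigma S\,dW:
df &= f'(S)\,dS + \tfrac{1}{2} f''(S)\,(dS)^2,
  \qquad (dS)^2 = \sigma^2 S^2\,dt. \\
% Take f(S) = \ln S, so f'(S) = 1/S and f''(S) = -1/S^2:
d\ln S &= \frac{1}{S}\left(\mu S\,dt + \sigma S\,dW\right)
          - \frac{1}{2}\,\frac{1}{S^2}\,\sigma^2 S^2\,dt \\
       &= \left(\mu - \frac{\sigma^2}{2}\right)dt + \sigma\,dW.
\end{align*}
```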
[Figure: Y(t) plotted against time t for 0 ≤ t ≤ 1.]
Local-Volatility Models
• The more general deterministic volatility model posits
$$\frac{dS}{S} = (r_t - q_t)\,dt + \sigma(S, t)\,dW,$$
where instantaneous volatility $\sigma(S, t)$ is called the local volatility function.a
• One needs to recover σ(S, t) from the implied volatilities.
aDerman and Kani (1994); Dupire (1994).
[Figure: Implied Vol Surface. Axes: Strike ($), Time to Maturity (yr), Implied Vol (%).]
[Figure: Local Vol Surface. Axes: Stock ($), Time (yr), Local Vol (%).]
Implied Trees
• The trees for the local volatility model are called implied trees.a
• Their construction requires an implied volatility surface.
• An exponential-sized implied tree exists.b
• How to construct a valid implied tree with efficiency has been open for a long time.c
aDerman & Kani (1994); Dupire (1994); Rubinstein (1994).
bCharalambous, Christofides, & Martzoukos (2007).
cRubinstein (1994); Derman & Kani (1994); Derman, Kani, & Chriss (1996); Jackwerth & Rubinstein (1996); Jackwerth (1997); Coleman, Kim, Li, & Verma (2000); Li (2000/2001); Moriggia, Muzzioli, & Torricelli (2009).
Implied Trees (concluded)
• The problem is solved for separable local volatilities σ.a
– The local-volatility function $\sigma(S, t)$ is separableb if $\sigma(S, t) = \sigma_1(S)\,\sigma_2(t)$.
• A general solution is close.c
aLok (D99922028) & Lyuu (2015, 2016).
bRebonato (2004); Brace, Gątarek, & Musiela (1997).
cLok (D99922028) & Lyuu (2016).
The Hull-White Model
• Hull and White (1987) postulate the following model,
$$\frac{dS}{S} = r\,dt + \sqrt{V}\,dW_1,$$
$$dV = \mu_v V\,dt + bV\,dW_2.$$
• Above, V is the instantaneous variance.
• They assume $\mu_v$ depends on V and t (but not S).
The SABR Model
• Hagan, Kumar, Lesniewski, and Woodward (2002) postulate the following model,
$$\frac{dS}{S} = r\,dt + S^{\theta} V\,dW_1,$$
$$dV = bV\,dW_2,$$
for 0 ≤ θ ≤ 1.
The Hilliard-Schwartz Model
• Hilliard and Schwartz (1996) postulate the following general model,
$$\frac{dS}{S} = r\,dt + f(S)\,V^{a}\,dW_1,$$
$$dV = \mu(V)\,dt + bV\,dW_2,$$
for some well-behaved function f (S) and constant a.
Heston’s Stochastic-Volatility Model
• Heston (1993) assumes the stock price follows
$$\frac{dS}{S} = (\mu - q)\,dt + \sqrt{V}\,dW_1, \qquad (16)$$
$$dV = \kappa(\theta - V)\,dt + \sigma\sqrt{V}\,dW_2. \qquad (17)$$
– V is the instantaneous variance, which follows a square-root process.
– dW1 and dW2 have correlation ρ.
– The riskless rate r is constant.
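Eqs. (16) and (17) can be simulated step by step once a rule is chosen for what to do when the discretized variance turns negative. The sketch below uses a full-truncation Euler scheme, which is a discretization choice made here for illustration and not something these notes prescribe; the function name `simulate_heston` and all parameter values are hypothetical.

```python
import math
import random

def simulate_heston(s0, v0, mu, q, kappa, theta, sigma, rho, horizon, n_steps, rng):
    """Return (S_T, V_T) from one full-truncation Euler path of Eqs. (16)-(17)."""
    dt = horizon / n_steps
    sqrt_dt = math.sqrt(dt)
    log_s, v = math.log(s0), v0
    for _ in range(n_steps):
        z1 = rng.gauss(0.0, 1.0)
        # Build z2 so that corr(z1, z2) = rho.
        z2 = rho * z1 + math.sqrt(1.0 - rho * rho) * rng.gauss(0.0, 1.0)
        v_pos = max(v, 0.0)   # use zero in place of a negative variance
        # Step ln S: d ln S = (mu - q - V/2) dt + sqrt(V) dW1.
        log_s += (mu - q - 0.5 * v_pos) * dt + math.sqrt(v_pos) * sqrt_dt * z1
        v += kappa * (theta - v_pos) * dt + sigma * math.sqrt(v_pos) * sqrt_dt * z2
    return math.exp(log_s), max(v, 0.0)

s_T, v_T = simulate_heston(100.0, 0.04, 0.05, 0.0, 2.0, 0.04, 0.3, -0.7,
                           1.0, 250, random.Random(2))
```

Stepping ln S rather than S keeps the simulated stock price positive regardless of the step size.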
Heston’s Stochastic-Volatility Model (concluded)
• It may be the most popular continuous-time stochastic-volatility model.a
• For American options, we will need a tree for Heston’s model.b
• The known trees are all $O(n^3)$-sized.
aChristoffersen, Heston, & Jacobs (2009).
bLeisen (2010); Beliaeva & Nawalka (2010); Chou (R02723073) (2015).
Why Are Trees for Stochastic-Volatility Models Difficult?
• The CRR tree is 2-dimensional.a
• The constant volatility makes the span from any node fixed.
• But a tree for a stochastic-volatility model must be 3-dimensional.
– Every node is associated with a pair consisting of a stock price and a volatility.
Why Are Trees for Stochastic-Volatility Models Difficult: Binomial Case?
Why Are Trees for Stochastic-Volatility Models Difficult: Trinomial Case?
Why Are Trees for Stochastic-Volatility Models Difficult? (concluded)
• Locally, the tree looks fine for one time step.
• But the volatility regulates the spans of the nodes on the stock-price plane.
• Unfortunately, those spans differ from node to node because the volatility varies.
• So two time steps from now, the branches will not combine!
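A two-step calculation makes the failure to recombine concrete. The sketch below assumes CRR-style moves, u = exp(vol × √Δt) and d = 1/u, with the volatility allowed to differ at the node reached after the first move; the function name and the volatility values are hypothetical.

```python
import math

def two_step_prices(s0, dt, vol_at_root, vol_after_up, vol_after_down):
    """Prices reached by the up-down and down-up paths when each node
    uses its own volatility in u = exp(vol * sqrt(dt)), d = 1/u."""
    u_root = math.exp(vol_at_root * math.sqrt(dt))
    up_then_down = s0 * u_root / math.exp(vol_after_up * math.sqrt(dt))
    down_then_up = s0 / u_root * math.exp(vol_after_down * math.sqrt(dt))
    return up_then_down, down_then_up

# Constant volatility: both paths land on the same node.
same = two_step_prices(100.0, 0.01, 0.2, 0.2, 0.2)
# Volatility differs from node to node: the two paths end at different prices.
apart = two_step_prices(100.0, 0.01, 0.2, 0.3, 0.15)
print(same, apart)
```

Each pair of paths that fails to meet adds a node, which is why the node count can grow much faster than in the constant-volatility tree.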
Complexities of Stochastic-Volatility Models
• A few stochastic-volatility models suffer from subexponential ($c^{\sqrt{n}}$) tree size.
• Examples include the Hull-White (1987), Hilliard-Schwartz (1996), and SABR (2002) models.a
aChiu (R98723059) (2012).
Trees
I love a tree more than a man.
— Ludwig van Beethoven (1770–1827)
Trinomial Tree
• Set up a trinomial approximation to the geometric Brownian motiona
$$\frac{dS}{S} = r\,dt + \sigma\,dW.$$
• The three stock prices at time Δt are S, Su, and Sd, where ud = 1.
• Let the mean and variance of the stock price be SM and $S^2V$, respectively.
aBoyle (1988).
[Figure: one step of the trinomial tree. Over a period Δt, the price S moves to Su, S, or Sd with probabilities pu, pm, and pd, respectively.]
Trinomial Tree (continued)
• By Eqs. (5) on p. 24,
$$M \equiv e^{r\Delta t},$$
$$V \equiv M^2\left(e^{\sigma^2\Delta t} - 1\right).$$
• Impose the matching of mean and that of variance:
$$1 = p_u + p_m + p_d,$$
$$SM = \left(p_u u + p_m + (p_d/u)\right) S,$$
$$S^2V = p_u(Su - SM)^2 + p_m(S - SM)^2 + p_d(Sd - SM)^2.$$
Trinomial Tree (concluded)
• Use linear algebra to verify that
$$p_u = \frac{u\left(V + M^2 - M\right) - (M - 1)}{(u - 1)\left(u^2 - 1\right)},$$
$$p_d = \frac{u^2\left(V + M^2 - M\right) - u^3(M - 1)}{(u - 1)\left(u^2 - 1\right)}.$$
– We must also make sure the probabilities lie between 0 and 1.
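The closed forms can be checked numerically against the three matching conditions. The sketch below is one way to do so; the function name `trinomial_probabilities` and the sample inputs are illustrative, and the range check mirrors the requirement that the probabilities lie between 0 and 1.

```python
import math

def trinomial_probabilities(r, sigma, dt, u):
    """Branching probabilities (pu, pm, pd) matching mean SM and variance S^2 V."""
    m = math.exp(r * dt)                                   # M
    v = m * m * (math.exp(sigma * sigma * dt) - 1.0)       # V
    denom = (u - 1.0) * (u * u - 1.0)
    pu = (u * (v + m * m - m) - (m - 1.0)) / denom
    pd = (u * u * (v + m * m - m) - u ** 3 * (m - 1.0)) / denom
    pm = 1.0 - pu - pd
    if min(pu, pm, pd) < 0.0 or max(pu, pm, pd) > 1.0:
        raise ValueError("probabilities fall outside [0, 1]; choose a different u")
    return pu, pm, pd

# Illustrative inputs: r = 5%, sigma = 20%, dt = 0.01, and u chosen by hand.
u = 1.03
pu, pm, pd = trinomial_probabilities(0.05, 0.2, 0.01, u)
mean_ratio = pu * u + pm + pd / u     # should reproduce M = exp(r * dt)
print(pu, pm, pd, mean_ratio)
```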
A Trinomial Tree
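• Use $u = e^{\lambda\sigma\sqrt{\Delta t}}$, where λ ≥ 1 is a tunable parameter.
• Then
$$p_u \to \frac{1}{2\lambda^2} + \frac{\left(r - \sigma^2/2\right)\sqrt{\Delta t}}{2\lambda\sigma},$$
$$p_d \to \frac{1}{2\lambda^2} - \frac{\left(r - \sigma^2/2\right)\sqrt{\Delta t}}{2\lambda\sigma}.$$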
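With u fixed by λ, the tree can be used for pricing by backward induction. The sketch below prices a European call using the exact moment-matching probabilities; the function name, the default λ = 1.5, and the sample contract are illustrative assumptions, not values taken from these notes.

```python
import math

def trinomial_european_call(s0, strike, r, sigma, maturity, n, lam=1.5):
    """European call value on an n-step trinomial tree with u = exp(lam*sigma*sqrt(dt))."""
    dt = maturity / n
    u = math.exp(lam * sigma * math.sqrt(dt))
    m = math.exp(r * dt)
    v = m * m * (math.exp(sigma * sigma * dt) - 1.0)
    denom = (u - 1.0) * (u * u - 1.0)
    pu = (u * (v + m * m - m) - (m - 1.0)) / denom
    pd = (u * u * (v + m * m - m) - u ** 3 * (m - 1.0)) / denom
    pm = 1.0 - pu - pd
    if min(pu, pm, pd) < 0.0 or max(pu, pm, pd) > 1.0:
        raise ValueError("invalid probabilities; adjust lam or n")
    # Terminal nodes are s0 * u**j for j = -n..n (ud = 1 makes the tree recombine).
    values = [max(s0 * u ** j - strike, 0.0) for j in range(-n, n + 1)]
    disc = math.exp(-r * dt)
    for step in range(n, 0, -1):
        # values[i] is node j = i - step; the node below it in time has
        # children at indices i (down), i + 1 (middle), and i + 2 (up).
        values = [disc * (pd * values[i] + pm * values[i + 1] + pu * values[i + 2])
                  for i in range(2 * step - 1)]
    return values[0]

price = trinomial_european_call(100.0, 100.0, 0.05, 0.2, 1.0, 200)
print(price)
```

The tree has 2k + 1 nodes at step k, so the work grows quadratically in n.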
Barrier Options Priced by Trinomial Trees
[Figure: down-and-in call value (vertical axis, 5.61 to 5.66) over a horizontal range of 0 to 200.]
Multivariate Contingent Claims
• They depend on two or more underlying assets.
• The basket call on m assets has the terminal payoff
$$\max\left(\sum_{i=1}^{m} \alpha_i S_i(\tau) - X,\; 0\right).$$
Multivariate Contingent Claims (continued)
Name                      Payoff
Exchange option           max(S1(τ) − S2(τ), 0)
Better-off option         max(S1(τ), ..., Sk(τ), 0)
Worst-off option          min(S1(τ), ..., Sk(τ), 0)
Binary maximum option     I{ max(S1(τ), ..., Sk(τ)) > X }
Maximum option            max(max(S1(τ), ..., Sk(τ)) − X, 0)
Minimum option            max(min(S1(τ), ..., Sk(τ)) − X, 0)
Spread option             max(S1(τ) − S2(τ) − X, 0)
Basket average option     max((S1(τ) + ··· + Sk(τ))/k − X, 0)
Multi-strike option       max(S1(τ) − X1, ..., Sk(τ) − Xk, 0)
Pyramid rainbow option    max(| S1(τ) − X1 | + ··· + | Sk(τ) − Xk | − X, 0)
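Several of these payoffs translate directly into one-line functions of the terminal prices. The sketch below covers a subset, plus the basket call; the function names and the sample prices are illustrative.

```python
def exchange_payoff(s1, s2):
    return max(s1 - s2, 0.0)

def binary_maximum_payoff(prices, strike):
    return 1.0 if max(prices) > strike else 0.0       # indicator I{max > X}

def maximum_option_payoff(prices, strike):
    return max(max(prices) - strike, 0.0)

def minimum_option_payoff(prices, strike):
    return max(min(prices) - strike, 0.0)

def spread_payoff(s1, s2, strike):
    return max(s1 - s2 - strike, 0.0)

def basket_average_payoff(prices, strike):
    return max(sum(prices) / len(prices) - strike, 0.0)

def basket_call_payoff(weights, prices, strike):
    return max(sum(a * s for a, s in zip(weights, prices)) - strike, 0.0)

def multi_strike_payoff(prices, strikes):
    return max(max(s - x for s, x in zip(prices, strikes)), 0.0)

def pyramid_rainbow_payoff(prices, strikes, strike):
    return max(sum(abs(s - x) for s, x in zip(prices, strikes)) - strike, 0.0)

terminal = [90.0, 105.0, 120.0]     # hypothetical terminal prices S1, S2, S3
print(maximum_option_payoff(terminal, 100.0), basket_average_payoff(terminal, 100.0))
```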
Multivariate Contingent Claims (concluded)
• Trees for multivariate contingent claims typically have size exponential in the number of assets.
• This is called the curse of dimensionality.
Numerical Methods
All science is dominated by the idea of approximation.
— Bertrand Russell
Monte Carlo Simulation
a• Monte Carlo simulation is a sampling scheme.
• In many important applications within finance and without, Monte Carlo is one of the few feasible tools.
aA top 10 algorithm according to Dongarra and Sullivan (2000).
Monte Carlo Option Pricing
• For the pricing of European options, we sample the stock prices.
• Then we average the payoffs.
• The variance of the estimator is now 1/N of that of the original random variable.
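The sample-then-average recipe and the 1/N variance reduction can be seen in a few lines. The sketch below assumes the stock follows a geometric Brownian motion with drift r (a risk-neutral setting, which is an assumption of this example) and prices a European call; the function name and the contract parameters are illustrative.

```python
import math
import random

def mc_european_call(s0, strike, r, sigma, maturity, n_paths, rng):
    """Return (price estimate, standard error) from n_paths sampled terminal prices."""
    drift = (r - 0.5 * sigma * sigma) * maturity
    vol = sigma * math.sqrt(maturity)
    disc = math.exp(-r * maturity)
    total = total_sq = 0.0
    for _ in range(n_paths):
        s_t = s0 * math.exp(drift + vol * rng.gauss(0.0, 1.0))   # sample the stock price
        payoff = disc * max(s_t - strike, 0.0)
        total += payoff
        total_sq += payoff * payoff
    mean = total / n_paths                                       # average the payoffs
    variance = (total_sq / n_paths - mean * mean) * n_paths / (n_paths - 1)
    return mean, math.sqrt(variance / n_paths)                   # estimator variance is variance / N

price, std_err = mc_european_call(100.0, 100.0, 0.05, 0.2, 1.0, 50000, random.Random(7))
print(price, std_err)
```

Since the variance of the estimator falls like 1/N, its standard error falls only like 1/√N: ten times as many paths shrink the error by a factor of about 3.2.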
How about American Options?
• Standard Monte Carlo simulation is inappropriate for American options because of early exercise.
• It is difficult to determine the early-exercise point based on one single path.
• But Monte Carlo simulation can be modified to price American options with small biases.a
• The least-squares Monte Carlo (LSM) method of Longstaff and Schwartz can be easily parallelized.b
aLongstaff and Schwartz (2001).
bHuang (B96902079, R00922018) (2013); Chen (B97902046,
Delta and Common Random Numbers
• In estimating delta ∂f/∂S, it is natural to start with the finite-difference estimate
$$e^{-r\tau}\,\frac{E[\,P(S + \epsilon)\,] - E[\,P(S - \epsilon)\,]}{2\epsilon}.$$
– P(x) is the terminal payoff of the derivative security when the underlying asset’s initial price equals x.
• Use simulation to estimate $E[\,P(S + \epsilon)\,]$ first.
• Use another simulation to estimate $E[\,P(S - \epsilon)\,]$.
• Finally, apply the formula to approximate the delta.
Delta and Common Random Numbers (concluded)
• This method is not recommended because of its high variance.
• A much better approach is to use common random numbers to lower the variance:
$$e^{-r\tau}\,E\left[\frac{P(S + \epsilon) - P(S - \epsilon)}{2\epsilon}\right].$$
• Here, the same random numbers are used for $P(S + \epsilon)$ and $P(S - \epsilon)$.
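The effect of reusing the random numbers is easy to measure. The sketch below computes per-path central-difference delta estimates for a European call under an assumed geometric Brownian motion with drift r, once with independent draws and once with common draws; the function names, the bump size, and the contract are illustrative.

```python
import math
import random
import statistics

def terminal_price(s0, r, sigma, tau, z):
    """Terminal GBM price started at s0 and driven by the standard normal draw z."""
    return s0 * math.exp((r - 0.5 * sigma ** 2) * tau + sigma * math.sqrt(tau) * z)

def delta_samples(s0, strike, r, sigma, tau, eps, n_paths, rng, common=True):
    """Per-path central-difference delta estimates for a European call."""
    disc = math.exp(-r * tau)
    samples = []
    for _ in range(n_paths):
        z_up = rng.gauss(0.0, 1.0)
        # Reuse the same draw (common random numbers) or take a fresh one.
        z_down = z_up if common else rng.gauss(0.0, 1.0)
        up = max(terminal_price(s0 + eps, r, sigma, tau, z_up) - strike, 0.0)
        down = max(terminal_price(s0 - eps, r, sigma, tau, z_down) - strike, 0.0)
        samples.append(disc * (up - down) / (2.0 * eps))
    return samples

crn = delta_samples(100.0, 100.0, 0.05, 0.2, 1.0, 0.5, 20000, random.Random(1), common=True)
ind = delta_samples(100.0, 100.0, 0.05, 0.2, 1.0, 0.5, 20000, random.Random(1), common=False)
print(statistics.mean(crn), statistics.variance(crn))
print(statistics.mean(ind), statistics.variance(ind))
```

With independent draws the two payoffs do not cancel, so dividing their difference by a small bump inflates the noise; with common draws the difference itself is small on every path.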
Gamma
• The finite-difference formula for gamma $\partial^2 f/\partial S^2$ is
$$e^{-r\tau}\,E\left[\frac{P(S + \epsilon) - 2 \times P(S) + P(S - \epsilon)}{\epsilon^2}\right].$$
• Choosing an $\epsilon$ of the right magnitude can be challenging.
– If $\epsilon$ is too large, inaccurate Greeks result.
– If $\epsilon$ is too small, unstable Greeks result.
• This phenomenon is sometimes called the curse of differentiation.a
aAït-Sahalia and Lo (1998); Bondarenko (2003).
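The trade-off in the bump size can be explored by running the second-difference estimator over a range of bumps. The sketch below does this for a European call under an assumed geometric Brownian motion with drift r, using common random numbers on each path; the function name, the bump sizes, and the contract are illustrative.

```python
import math
import random

def gamma_estimate(s0, strike, r, sigma, tau, eps, n_paths, rng):
    """Central second-difference gamma of a European call with common random numbers."""
    disc = math.exp(-r * tau)
    total = 0.0
    for _ in range(n_paths):
        growth = math.exp((r - 0.5 * sigma ** 2) * tau
                          + sigma * math.sqrt(tau) * rng.gauss(0.0, 1.0))
        payoff = lambda s: max(s * growth - strike, 0.0)   # P(s) along this path
        total += payoff(s0 + eps) - 2.0 * payoff(s0) + payoff(s0 - eps)
    return disc * total / (eps * eps * n_paths)

# Same seed for every bump size, so only the bump changes between runs.
estimates = {eps: gamma_estimate(100.0, 100.0, 0.05, 0.2, 1.0, eps, 20000, random.Random(5))
             for eps in (20.0, 5.0, 0.5, 0.01)}
print(estimates)
```

For a tiny bump, only the few paths whose terminal price lands next to the strike contribute, and each contribution is divided by the square of the bump, which is the source of the instability.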
Gamma (continued)
• In general, suppose
$$\frac{\partial^i}{\partial\theta^i}\,e^{-r\tau}E[\,P(S)\,] = e^{-r\tau}E\left[\frac{\partial^i P(S)}{\partial\theta^i}\right]$$
holds for all i > 0, where θ is a parameter of interest.
– A common requirement is Lipschitz continuity.a
• Then formulas for the Greeks become integrals.
• As a result, we avoid $\epsilon$, finite differences, and resimulation.
Gamma (concluded)
• This is indeed possible for a broad class of payoff functions.a
• In queueing networks, this is called infinitesimal perturbation analysis (IPA).b
aTeng (R91723054) (2004) and Lyuu and Teng (R91723054) (2011).
bCao (1985); Ho and Cao (1985).
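For a concrete case, take θ to be the initial stock price and i = 1. Under an assumed geometric Brownian motion with drift r, the terminal price is proportional to the initial price, so the derivative of a call payoff along a path is S_T/S when S_T exceeds the strike and 0 otherwise. The sketch below averages that derivative; the function name and the contract are illustrative, and this is only the first-order (delta) case, not the general method of the cited works.

```python
import math
import random

def pathwise_delta(s0, strike, r, sigma, tau, n_paths, rng):
    """Delta of a European call as the discounted mean of dP/dS0 along each path."""
    disc = math.exp(-r * tau)
    total = 0.0
    for _ in range(n_paths):
        s_t = s0 * math.exp((r - 0.5 * sigma ** 2) * tau
                            + sigma * math.sqrt(tau) * rng.gauss(0.0, 1.0))
        if s_t > strike:
            total += s_t / s0        # derivative of max(S_T - K, 0) with respect to S0
    return disc * total / n_paths

delta = pathwise_delta(100.0, 100.0, 0.05, 0.2, 1.0, 20000, random.Random(11))
print(delta)
```

No bump size appears and a single simulation suffices. The call payoff is Lipschitz continuous, which is what allows the derivative to be moved inside the expectation here.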
Interest Rate Models
[Meriwether] scoring especially high marks in mathematics — an indispensable subject for a bond trader.
— Roger Lowenstein, When Genius Failed (2000)
Bond market terminology was designed less to convey meaning than to bewilder outsiders.
— Michael Lewis, The Big Short (2011)
The Vasicek Model
a• The short rate follows
dr = β(μ − r) dt + σ dW.
• The short rate is pulled to the long-term mean level μ at rate β.
• Superimposed on this “pull” is a normally distributed stochastic term σ dW .
aVasicek (1977).
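The pull toward μ and the superimposed normal noise are both visible in a simple discretization. The sketch below applies an Euler step to the Vasicek dynamics; the function name and the parameter values are illustrative, and the Euler step is an approximation chosen for brevity.

```python
import math
import random

def simulate_vasicek(r0, beta, mu, sigma, horizon, n_steps, rng):
    """Euler path of dr = beta * (mu - r) dt + sigma dW."""
    dt = horizon / n_steps
    r = r0
    path = [r0]
    for _ in range(n_steps):
        pull = beta * (mu - r) * dt                          # drift toward the mean level mu
        noise = sigma * math.sqrt(dt) * rng.gauss(0.0, 1.0)  # normally distributed shock
        r += pull + noise
        path.append(r)
    return path

rates = simulate_vasicek(0.10, 1.0, 0.05, 0.01, 10.0, 1000, random.Random(9))
print(rates[0], rates[-1])
```

Because the shocks are normal and do not shrink as r falls, nothing in this model prevents the simulated rate from becoming negative.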
The Cox-Ingersoll-Ross Model
a• It is the following square-root short rate model:
$$dr = \beta(\mu - r)\,dt + \sigma\sqrt{r}\,dW. \qquad (18)$$
• The diffusion differs from the Vasicek model by a multiplicative factor $\sqrt{r}$.
• The parameter β determines the speed of adjustment.
• The short rate can reach zero only if $2\beta\mu < \sigma^2$.
aCox, Ingersoll, and Ross (1985).
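The zero-boundary condition is a one-line parameter check, and the square-root diffusion needs some care when discretized because an Euler step can overshoot below zero. The sketch below pairs the check with a full-truncation Euler path; the truncation rule is a discretization choice made here, and the function names and parameter values are illustrative.

```python
import math
import random

def zero_is_reachable(beta, mu, sigma):
    """True when 2 * beta * mu < sigma**2, the condition under which r can reach zero."""
    return 2.0 * beta * mu < sigma ** 2

def simulate_cir(r0, beta, mu, sigma, horizon, n_steps, rng):
    """Full-truncation Euler path of dr = beta * (mu - r) dt + sigma * sqrt(r) dW."""
    dt = horizon / n_steps
    r = r0
    path = [r0]
    for _ in range(n_steps):
        r_pos = max(r, 0.0)     # use zero in place of a negative discretized rate
        r += beta * (mu - r_pos) * dt + sigma * math.sqrt(r_pos * dt) * rng.gauss(0.0, 1.0)
        path.append(max(r, 0.0))
    return path

print(zero_is_reachable(0.5, 0.04, 0.1), zero_is_reachable(0.5, 0.04, 0.3))
cir_rates = simulate_cir(0.03, 0.5, 0.04, 0.3, 5.0, 500, random.Random(13))
```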
The Ho-Lee Model
a• The continuous-time limit of the Ho-Lee model is dr = θ(t) dt + σ dW.
• This is Vasicek’s model with the mean-reverting drift replaced by a deterministic, time-dependent drift.
• A nonflat term structure of volatilities can be achieved if the short rate volatility is also made time varying,
dr = θ(t) dt + σ(t) dW.
The Black-Derman-Toy Model
a• The continuous-time limit of the BDT model is
$$d\ln r = \left(\theta(t) + \frac{\sigma'(t)}{\sigma(t)}\,\ln r\right)dt + \sigma(t)\,dW.$$
• This model is extensively used by practitioners.
• The BDT short rate process is the lognormal binomial interest rate process.
• Lognormal models preclude negative short rates.
aBlack, Derman, and Toy (BDT) (1990), but essentially finished in 1986 according to Mehrling (2005).
The Black-Karasinski Model
a• The BK model stipulates that the short rate follows d ln r = κ(t)(θ(t) − ln r) dt + σ(t) dW.
• This explicitly mean-reverting model depends on time through κ( · ), θ( · ), and σ( · ).
• The BK model hence has one more degree of freedom than the BDT model.
• The speed of mean reversion κ(t) and the short rate volatility σ(t) are independent.
The Extended Vasicek Model
a• The extended Vasicek model adds time dependence to the original Vasicek model,
dr = (θ(t) − a(t) r) dt + σ(t) dW.
• Like the Ho-Lee model, this is a normal model.
• Many European-style securities can be evaluated analytically.
• Efficient numerical procedures can be developed for American-style securities.
aHull and White (1990).
The Hull-White Model
• The Hull-White model is the following special case, dr = (θ(t) − ar) dt + σ dW.
[Figure: trinomial tree for the Hull-White model, with nodes labeled (0, 0), (1, 1), (1, 0), and (1, −1); the node spacing is Δt in time and Δr in the short rate.]
The Extended CIR Model
• In the extended CIR model the short rate follows
$$dr = (\theta(t) - a(t)\,r)\,dt + \sigma(t)\sqrt{r}\,dW.$$
• The functions θ(t), a(t), and σ(t) are implied from market observables.
• With constant parameters, there exist analytical solutions to a small set of interest rate-sensitive securities.