### Biases in Pricing Continuously Monitored Options with Monte Carlo

*• We are asked to price a continuously monitored*
*up-and-out call with barrier H.*

• The Monte Carlo method samples the stock price at $n$ discrete time points $t_1, t_2, \ldots, t_n$.

• A sample path

$$S(t_0), S(t_1), \ldots, S(t_n)$$

is produced.

– Here, $t_0 = 0$ is the current time, and $t_n = T$ is the expiration time of the option.

### Biases in Pricing Continuously Monitored Options with Monte Carlo (continued)

• If all of the sampled prices are below the barrier, this sample path pays $\max(S(t_n) - X, 0)$.

*• Repeating these steps and averaging the payoﬀs yield a*
Monte Carlo estimate.

1: *C := 0;*

2: **for i = 1, 2, 3, . . . , N do**

3: *P := S; hit := 0;*

4: **for j = 1, 2, 3, . . . , n do**

5: $P := P \times e^{(r-\sigma^2/2)(T/n)+\sigma\sqrt{T/n}\;\xi}$; {By Eq. (120) on p. 853.}

6: **if P ≥ H then**

7: hit := 1;

8: break;

9: **end if**

10: **end for**

11: **if hit = 0 then**

12: *C := C + max(P − X, 0);*

13: **end if**

14: **end for**

15: *return Ce*^{−rT}*/N ;*
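The pseudocode above can be sketched in Python as follows. This is a minimal vectorized version, not the lecture's own implementation: the function name `up_and_out_call_discrete`, the `seed` argument, and the use of NumPy are assumptions made for illustration. It generates all $N$ paths at once instead of breaking out of the inner loop on the first hit.

```python
import numpy as np

def up_and_out_call_discrete(S, X, H, r, sigma, T, n, N, seed=0):
    """Monte Carlo price of an up-and-out call, checking the barrier
    only at the n sampled time points (hence biased for continuous
    monitoring)."""
    rng = np.random.default_rng(seed)
    dt = T / n
    drift = (r - 0.5 * sigma**2) * dt
    vol = sigma * np.sqrt(dt)
    xi = rng.standard_normal((N, n))            # one row of shocks per path
    # Row i holds S(t_1), ..., S(t_n) for path i.
    paths = S * np.exp(np.cumsum(drift + vol * xi, axis=1))
    alive = (paths < H).all(axis=1)             # hit = 0 iff every sample is below H
    payoff = np.where(alive, np.maximum(paths[:, -1] - X, 0.0), 0.0)
    return np.exp(-r * T) * payoff.mean()
```

For example, `up_and_out_call_discrete(100, 100, 120, 0.05, 0.2, 1.0, 10, 50000)` prices a one-year option monitored at 10 dates. Raising `n` lowers the bias discussed next, at a proportional cost in time and memory.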

### Biases in Pricing Continuously Monitored Options with Monte Carlo (continued)

*• This estimate is biased.*^{a}

**– Suppose none of the sampled prices on a sample path**
*equals or exceeds the barrier H.*

**– It remains possible for the continuous sample path**
*that passes through them to hit the barrier between*
sampled time points (see plot on next page).

**– Hence knock-out probabilities are underestimated.**

aShevchenko (2003).

[Plot: a continuous sample path that hits the barrier $H$ between sampled time points.]

### Biases in Pricing Continuously Monitored Options with Monte Carlo (concluded)

*• The bias can be lowered by increasing the number of*
observations along the sample path.

* – For trees, the knock-out probabilities may decrease*
as the number of time steps is increased.

*• However, even daily sampling may not suﬃce.*

*• The computational cost also rises as a result.*

### Brownian Bridge Approach to Pricing Barrier Options

*• We desire an unbiased estimate which can be calculated*
eﬃciently.

*• The above-mentioned payoﬀ should be multiplied by the*
*probability p that a continuous sample path does not*
hit the barrier conditional on the sampled prices.

*• This methodology is called the Brownian bridge*
approach.

• Formally, we have

$$p \triangleq \mathrm{Prob}[\, S(t) < H,\ 0 \le t \le T \mid S(t_0), S(t_1), \ldots, S(t_n) \,].$$

### Brownian Bridge Approach to Pricing Barrier Options (continued)

• As a barrier is hit over a time interval if and only if the maximum stock price over that period is at least $H$,

$$p = \mathrm{Prob}\left[\, \max_{0 \le t \le T} S(t) < H \;\middle|\; S(t_0), S(t_1), \ldots, S(t_n) \,\right].$$

*• Luckily, the conditional distribution of the maximum*
over a time interval given the beginning and ending
stock prices is known.

### Brownian Bridge Approach to Pricing Barrier Options (continued)

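**Lemma 21** Assume $S$ follows $dS/S = \mu\,dt + \sigma\,dW$ and define^{a}

$$\zeta(x) \triangleq \exp\left[ -\frac{2\ln(x/S(t))\,\ln(x/S(t+\Delta t))}{\sigma^2 \Delta t} \right].$$

(1) If $H > \max(S(t), S(t+\Delta t))$, then

$$\mathrm{Prob}\left[\, \max_{t \le u \le t+\Delta t} S(u) < H \;\middle|\; S(t), S(t+\Delta t) \,\right] = 1 - \zeta(H).$$

(2) If $h < \min(S(t), S(t+\Delta t))$, then

$$\mathrm{Prob}\left[\, \min_{t \le u \le t+\Delta t} S(u) > h \;\middle|\; S(t), S(t+\Delta t) \,\right] = 1 - \zeta(h).$$

^{a}Here, $\Delta t$ is an arbitrary positive real number.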

### Brownian Bridge Approach to Pricing Barrier Options (continued)

*• Lemma 21 gives the probability that the barrier is not*
hit in a time interval, given the starting and ending
stock prices.

*• For our up-and-out*^{a} *call, choose n = 1.*

• As a result,

$$p = \begin{cases} 1 - \exp\left[ -\dfrac{2\ln(H/S(0))\,\ln(H/S(T))}{\sigma^2 T} \right], & \text{if } H > \max(S(0), S(T)), \\[2ex] 0, & \text{otherwise.} \end{cases}$$

### Brownian Bridge Approach to Pricing Barrier Options (continued)

*The following algorithm works for up-and-out and*
down-and-out calls.

1: *C := 0;*

2: **for i = 1, 2, 3, . . . , N do**

3: $P := S \times e^{(r-q-\sigma^2/2)\,T+\sigma\sqrt{T}\;\xi()}$;

4: **if (S < H and P < H) or (S > H and P > H) then**

5: $C := C + \max(P - X, 0) \times \left\{ 1 - \exp\left[ -\dfrac{2\ln(H/S) \times \ln(H/P)}{\sigma^2 T} \right] \right\}$;

6: **end if**

7: **end for**

8: *return Ce*^{−rT}*/N ;*
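A Python sketch of the algorithm above follows. The function name `barrier_call_bridge`, the `seed` argument, and the vectorized NumPy form are illustrative assumptions. Only $S(T)$ is simulated, and each payoff is weighted by the Lemma 21 probability of not touching $H$.

```python
import numpy as np

def barrier_call_bridge(S, X, H, r, q, sigma, T, N, seed=0):
    """Brownian bridge Monte Carlo price of a continuously monitored
    knock-out call (up-and-out if S < H, down-and-out if S > H)."""
    rng = np.random.default_rng(seed)
    xi = rng.standard_normal(N)
    P = S * np.exp((r - q - 0.5 * sigma**2) * T + sigma * np.sqrt(T) * xi)
    # Step 4: S and P must lie on the same side of the barrier.
    same_side = ((S < H) & (P < H)) | ((S > H) & (P > H))
    # Step 5: probability of no hit given the two endpoints.
    expo = -2.0 * np.log(H / S) * np.log(H / P) / (sigma**2 * T)
    survive = np.where(same_side, 1.0 - np.exp(np.minimum(expo, 0.0)), 0.0)
    payoff = np.maximum(P - X, 0.0) * survive
    return np.exp(-r * T) * payoff.mean()
```

The `np.minimum(expo, 0.0)` guard only matters for paths that fail the same-side test, where the exponent would be positive; those paths are zeroed by `np.where` anyway, so the guard merely avoids overflow warnings.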

### Brownian Bridge Approach to Pricing Barrier Options (concluded)

*• The idea can be generalized.*

*• For example, we can handle more complex barrier*
options.

• Consider an up-and-out call with barrier $H_i$ for the time interval $(t_i, t_{i+1}]$, $0 \le i < n$.

*• This option contains n barriers.*

*• Multiply the probabilities for the n time intervals to*
obtain the desired probability adjustment term.
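The product of interval probabilities can be sketched as below. The helper name `survival_probability` and its argument layout are hypothetical; it applies Lemma 21(1) on each interval with that interval's own barrier. As a simplification, it treats a sampled price at either end of an interval that is at or above $H_i$ as a knock-out.

```python
import math

def survival_probability(times, prices, barriers, sigma):
    """Probability that an up-and-out option with barrier barriers[i] on
    (times[i], times[i+1]] is never knocked out, conditional on the
    sampled prices S(t_0), ..., S(t_n)."""
    p = 1.0
    for i, H in enumerate(barriers):
        s0, s1 = prices[i], prices[i + 1]
        if H <= max(s0, s1):        # barrier already reached at a sample point
            return 0.0
        dt = times[i + 1] - times[i]
        zeta = math.exp(-2.0 * math.log(H / s0) * math.log(H / s1) / (sigma**2 * dt))
        p *= 1.0 - zeta             # Lemma 21(1) on this interval
    return p
```

Multiplying the path's payoff by this value gives the adjusted payoff for the multi-barrier option.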

### Pricing Barrier Options without Brownian Bridge

• Let $T_h$ denote the amount of time for a process $X_t$ to hit $h$ for the first time.

• It is called the first passage time or the first hitting time.

• Suppose $X_t$ is a $(\mu, \sigma)$ Brownian motion:

$$dX_t = \mu\,dt + \sigma\,dW_t, \quad t \ge 0.$$

### Pricing Barrier Options without Brownian Bridge (continued)

• The first passage time $T_h$ follows the inverse Gaussian (IG) distribution with probability density function^{a}

$$\frac{|\, h - X(0) \,|}{\sigma x^{3/2}\sqrt{2\pi}}\; e^{-(h - X(0) - \mu x)^2/(2\sigma^2 x)}.$$

• For pricing a barrier option with barrier $H$ by simulation, the density function becomes

$$\frac{|\, \ln(H/S(0)) \,|}{\sigma x^{3/2}\sqrt{2\pi}}\; e^{-\left[\, \ln(H/S(0)) - (r - \sigma^2/2)\,x \,\right]^2/(2\sigma^2 x)}.$$

aA. N. Borodin & Salminen (1996), with Laplace transform

### Pricing Barrier Options without Brownian Bridge (concluded)

*• Draw an x from this distribution.*^{a}

*• If x > T , a knock-in option fails to knock in, whereas a*
knock-out option does not knock out.

*• If x ≤ T , the opposite is true.*

*• If the barrier option survives at maturity T , then draw*
*an S(T ) to calculate its payoﬀ.*

*• Repeat the above process many times to average the*
discounted payoﬀ.

^{a}The IG distribution can be very efficiently sampled (Michael, Schucany, & Haas, 1976).
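The draw of the first passage time might look like the sketch below. It relies on NumPy's `Generator.wald`, which samples the inverse Gaussian distribution parameterized by its mean and shape. The function name `first_passage_times` is an assumption. The sketch also assumes the log-price drifts toward the barrier, so that the density above integrates to one; when the drift points away from the barrier, the barrier may never be hit and this sampler does not apply as written.

```python
import math
import numpy as np

def first_passage_times(S0, H, r, sigma, N, seed=0):
    """Draw N first passage times of ln S to ln H under
    d ln S = (r - sigma^2/2) dt + sigma dW."""
    rng = np.random.default_rng(seed)
    a = abs(math.log(H / S0))                 # distance to the barrier in log terms
    nu = r - 0.5 * sigma**2
    toward = nu if H > S0 else -nu            # drift component pointing at the barrier
    if toward <= 0.0:
        raise ValueError("sketch assumes the drift points toward the barrier")
    # IG with mean a/toward and shape a^2/sigma^2 has the density shown above.
    return rng.wald(mean=a / toward, scale=a**2 / sigma**2, size=N)
```

With draws `x`, the mask `x <= T` marks paths on which a knock-out option is knocked out (and a knock-in option is knocked in) before maturity `T`.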

### Brownian Bridge Approach to Pricing Lookback Options

^{a}

• By Lemma 21(1) (p. 876),

$$F_{\max}(y) \triangleq \mathrm{Prob}\left[\, \max_{0 \le t \le T} S(t) < y \;\middle|\; S(0), S(T) \,\right] = 1 - \exp\left[ -\frac{2\ln(y/S(0))\,\ln(y/S(T))}{\sigma^2 T} \right].$$

*• So F*_{max} is the conditional distribution function of the
maximum stock price.

aEl Babsiri & Noel (1998).

### Brownian Bridge Approach to Pricing Lookback Options (continued)

• A random variable with that distribution can be generated by $F_{\max}^{-1}(x)$, where $x$ is uniformly distributed over $(0, 1)$.^{a}

• In other words,

$$x = 1 - \exp\left[ -\frac{2\ln(y/S(0))\,\ln(y/S(T))}{\sigma^2 T} \right].$$

^{a}This is called the inverse-transform technique (see p. 259 of the textbook).

### Brownian Bridge Approach to Pricing Lookback Options (continued)

• Equivalently,

$$\ln(1 - x) = -\frac{2\ln(y/S(0))\,\ln(y/S(T))}{\sigma^2 T} = -\frac{2}{\sigma^2 T}\left\{ [\, \ln y - \ln S(0) \,]\,[\, \ln y - \ln S(T) \,] \right\}.$$

### Brownian Bridge Approach to Pricing Lookback Options (continued)

• There are two solutions for $\ln y$.

• But only one is consistent with $y \ge \max(S(0), S(T))$:

$$\ln y = \frac{\ln(S(0)\,S(T)) + \sqrt{\left( \ln\frac{S(T)}{S(0)} \right)^2 - 2\sigma^2 T \ln(1 - x)}}{2}.$$

### Brownian Bridge Approach to Pricing Lookback Options (concluded)

The following algorithm works for the lookback put on the maximum.

1: *C := 0;*

2: **for i = 1, 2, 3, . . . , N do**

3: $P := S \times e^{(r-q-\sigma^2/2)\,T+\sigma\sqrt{T}\;\xi()}$; {By Eq. (120) on p. 853.}

4: $Y := \exp\left[ \dfrac{\ln(SP) + \sqrt{\left( \ln\frac{P}{S} \right)^2 - 2\sigma^2 T \ln[\, 1 - U(0,1) \,]}}{2} \right]$;

5: *C := C + (Y − P );*

6: **end for**

7: *return Ce*^{−rT}*/N ;*
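The lookback algorithm above can be sketched in Python as follows. The function names `sample_terminal_and_max` and `lookback_put_bridge` and the `seed` argument are assumptions for illustration. The helper draws $S(T)$ and then the conditional maximum by the inverse-transform formula.

```python
import numpy as np

def sample_terminal_and_max(S, r, q, sigma, T, N, seed=0):
    """Draw N pairs (S(T), max of S over [0, T]) via the Brownian bridge."""
    rng = np.random.default_rng(seed)
    xi = rng.standard_normal(N)
    P = S * np.exp((r - q - 0.5 * sigma**2) * T + sigma * np.sqrt(T) * xi)
    U = rng.random(N)                          # uniform on [0, 1), so 1 - U > 0
    root = np.sqrt(np.log(P / S) ** 2 - 2.0 * sigma**2 * T * np.log(1.0 - U))
    Y = np.exp(0.5 * (np.log(S * P) + root))   # step 4 of the algorithm
    return P, Y

def lookback_put_bridge(S, r, q, sigma, T, N, seed=0):
    """Monte Carlo price of a lookback put on the maximum, payoff Y - S(T)."""
    P, Y = sample_terminal_and_max(S, r, q, sigma, T, N, seed)
    return np.exp(-r * T) * np.mean(Y - P)
```

Since the square root is at least $|\ln(P/S)|$, each sampled maximum is at least $\max(S, P)$, which is the root selected on the previous slide.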

### Pricing Lookback Options without Brownian Bridge

*• Suppose we do not draw S(T ) in simulation.*

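• Now, the distribution function of the maximum logarithmic stock price is^{a}

$$\mathrm{Prob}\left[\, \max_{0 \le t \le T} \ln\frac{S(t)}{S(0)} < y \,\right] = 1 - N\!\left( \frac{-y + \left( r - q - \frac{\sigma^2}{2} \right) T}{\sigma\sqrt{T}} \right) - e^{\frac{2y\left( r - q - \frac{\sigma^2}{2} \right)}{\sigma^2}}\, N\!\left( \frac{-y - \left( r - q - \frac{\sigma^2}{2} \right) T}{\sigma\sqrt{T}} \right), \quad y \ge 0.$$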

*• The inverse of that is much harder to calculate.*

aA. N. Borodin & Salminen (1996).

### Variance Reduction

*• The statistical eﬃciency of Monte Carlo simulation can*
be measured by the variance of its output.

*• If this variance can be lowered without changing the*
expected value, fewer replications are needed.

*• Methods that improve eﬃciency in this manner are*
called variance-reduction techniques.

*• Such techniques become practical when the added costs*
are outweighed by the reduction in sampling.

### Variance Reduction: Antithetic Variates

• We are interested in estimating $E[\, g(X_1, X_2, \ldots, X_n) \,]$.

• Let $Y_1$ and $Y_2$ be random variables with the same distribution as $g(X_1, X_2, \ldots, X_n)$.

• Then

$$\mathrm{Var}\left[ \frac{Y_1 + Y_2}{2} \right] = \frac{\mathrm{Var}[\, Y_1 \,]}{2} + \frac{\mathrm{Cov}[\, Y_1, Y_2 \,]}{2}.$$

– $\mathrm{Var}[\, Y_1 \,]/2$ is the variance of the Monte Carlo method with two independent replications.

• The variance $\mathrm{Var}[\, (Y_1 + Y_2)/2 \,]$ is smaller than $\mathrm{Var}[\, Y_1 \,]/2$ when $Y_1$ and $Y_2$ are negatively correlated.

### Variance Reduction: Antithetic Variates (continued)

*• For each simulated sample path X, a second one is*
*obtained by reusing the random numbers on which the*
ﬁrst path is based.

*• This yields a second sample path Y .*

*• Two estimates are then obtained: One based on X and*
*the other on Y .*

*• If N independent sample paths are generated, the*
*antithetic-variates estimator averages over 2N*

estimates.

### Variance Reduction: Antithetic Variates (continued)

• Consider the process $dX = a_t\,dt + b_t\sqrt{dt}\;\xi$.

• Let $g$ be a function of $n$ samples $X_1, X_2, \ldots, X_n$ on the sample path.

• We are interested in $E[\, g(X_1, X_2, \ldots, X_n) \,]$.

• Suppose one simulation run has realizations $\xi_1, \xi_2, \ldots, \xi_n$ for the normally distributed fluctuation term $\xi$.

• This generates samples $x_1, x_2, \ldots, x_n$.

• The estimate is then $g(\boldsymbol{x})$, where $\boldsymbol{x} \triangleq (x_1, x_2, \ldots, x_n)$.

### Variance Reduction: Antithetic Variates (concluded)

• The antithetic-variates method does not sample $n$ more numbers from $\xi$ for the second estimate $g(\boldsymbol{x}')$.

• Instead, generate the sample path $\boldsymbol{x}' \triangleq (x_1', x_2', \ldots, x_n')$ from $-\xi_1, -\xi_2, \ldots, -\xi_n$.

• Compute $g(\boldsymbol{x}')$.

• Output $(g(\boldsymbol{x}) + g(\boldsymbol{x}'))/2$.

• Repeat the above steps as many times as required by accuracy.
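As an illustration of the steps above, here is a minimal sketch for a European call, where a single normal draw per path suffices ($n = 1$). The function name `antithetic_call` and the choice of payoff are assumptions, not part of the slides. It returns the estimate together with the sample variance of the paired outputs and of the plain outputs so the two can be compared.

```python
import numpy as np

def antithetic_call(S, X, r, sigma, T, N, seed=0):
    """Antithetic-variates estimate of a European call price."""
    rng = np.random.default_rng(seed)
    xi = rng.standard_normal(N)

    def discounted_payoff(z):
        ST = S * np.exp((r - 0.5 * sigma**2) * T + sigma * np.sqrt(T) * z)
        return np.exp(-r * T) * np.maximum(ST - X, 0.0)

    g_x = discounted_payoff(xi)        # estimate from xi
    g_x_prime = discounted_payoff(-xi) # estimate from -xi, no new random numbers
    paired = 0.5 * (g_x + g_x_prime)   # output (g(x) + g(x')) / 2
    return paired.mean(), paired.var(ddof=1), g_x.var(ddof=1)
```

Because the call payoff is monotone in $\xi$, $g(\boldsymbol{x})$ and $g(\boldsymbol{x}')$ are negatively correlated, so the paired variance should come out below half the plain variance.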

### Variance Reduction: Conditioning

*• We are interested in estimating E[ X ].*

• Suppose there is a random variable $Z$ such that $E[\, X \mid Z = z \,]$ can be efficiently and precisely computed.

*• E[ X ] = E[ E[ X | Z ] ] by the law of iterated conditional*
expectations.

*• Hence the random variable E[ X | Z ] is also an unbiased*
*estimator of E[ X ].*

### Variance Reduction: Conditioning (concluded)

• As

$$\mathrm{Var}[\, E[\, X \mid Z \,] \,] \le \mathrm{Var}[\, X \,],$$

$E[\, X \mid Z \,]$ has a variance no larger than that of observing $X$ directly.

*• First, obtain a random observation z on Z.*

*• Then calculate E[ X | Z = z ] as our estimate.*

**– There is no need to resort to simulation in computing**
*E[ X* *| Z = z ].*

• The procedure can be repeated a few times to reduce the variance.

### Control Variates

*• Use the analytic solution of a “similar” yet “simpler”*

problem to improve the solution.

• Suppose we want to estimate $E[\, X \,]$ and there exists a random variable $Y$ with a known mean $\mu \triangleq E[\, Y \,]$.

• Then $W \triangleq X + \beta(Y - \mu)$ can serve as a "controlled" estimator of $E[\, X \,]$ for any constant $\beta$.

– However $\beta$ is chosen, $W$ remains an unbiased estimator of $E[\, X \,]$ as

$$E[\, W \,] = E[\, X \,] + \beta E[\, Y - \mu \,] = E[\, X \,].$$

### Control Variates (continued)

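• Note that

$$\mathrm{Var}[\, W \,] = \mathrm{Var}[\, X \,] + \beta^2\, \mathrm{Var}[\, Y \,] + 2\beta\, \mathrm{Cov}[\, X, Y \,]. \tag{121}$$

• Hence $W$ is less variable than $X$ if and only if

$$\beta^2\, \mathrm{Var}[\, Y \,] + 2\beta\, \mathrm{Cov}[\, X, Y \,] < 0. \tag{122}$$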

### Control Variates (concluded)

*• The success of the scheme clearly depends on both β*
*and the choice of Y .*

– American options can be priced by choosing $Y$ to be the otherwise identical European option and $\mu$ the Black-Scholes formula.^{a}

– Arithmetic Asian options can be priced by choosing $Y$ to be the otherwise identical geometric Asian option's price and $\beta = -1$.

*• This approach is much more eﬀective than the*
antithetic-variates method.^{b}

aHull & White (1988).

bBoyle, Broadie, & Glasserman (1997).

### Choice of *Y*

*• In general, the choice of Y is ad hoc,*^{a} and experiments
must be performed to conﬁrm the wisdom of the choice.

*• Try to match calls with calls and puts with puts.*^{b}

*• On many occasions, Y is a discretized version of the*
*derivative that gives μ.*

**– Discretely monitored geometric Asian option vs. the**
continuously monitored version.^{c}

*• The discrepancy can be large (e.g., lookback options).*^{d}

aBut see Dai (B82506025, R86526008, D8852600), C. Chiu (B90201037, R94922072), & Lyuu (2015, 2018).

bContributed by Ms. Teng, Huei-Wen (R91723054) on May 25, 2004.

c

### Optimal Choice of *β*

• Equation (121) on p. 897 is minimized when $\beta = -\mathrm{Cov}[\, X, Y \,]/\mathrm{Var}[\, Y \,]$.

– It is called beta in the book.

• For this specific $\beta$,

$$\mathrm{Var}[\, W \,] = \mathrm{Var}[\, X \,] - \frac{\mathrm{Cov}[\, X, Y \,]^2}{\mathrm{Var}[\, Y \,]} = \left( 1 - \rho_{X,Y}^2 \right) \mathrm{Var}[\, X \,],$$

where $\rho_{X,Y}$ is the correlation between $X$ and $Y$.

### Optimal Choice of *β (continued)*

*• Note that the variance can never be increased with the*
optimal choice.

*• Furthermore, the stronger X and Y are correlated, the*
greater the reduction in variance.

*• For example, if this correlation is nearly perfect (±1),*
*we could control X almost exactly.*

### Optimal Choice of *β (continued)*

*• Typically, neither Var[ Y ] nor Cov[ X, Y ] is known.*

*• Therefore, we cannot obtain the maximum reduction in*
variance.

*• We can guess these values and hope that the resulting*
*W does indeed have a smaller variance than X.*

*• A second possibility is to use the simulated data to*
estimate these quantities.

**– How to do it eﬃciently in terms of time and space?**
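One way to estimate $\beta$ from the simulated data is sketched below. The function name `control_variate_estimate` and the toy problem in the usage note are assumptions chosen for illustration. The sketch makes two passes over stored samples for simplicity, so it does not answer the time-and-space question above. Estimating $\beta$ from the same samples also introduces a small bias that vanishes as the sample grows.

```python
import numpy as np

def control_variate_estimate(x, y, mu):
    """Estimate E[X] with control variate Y of known mean mu,
    using beta = -Cov[X, Y] / Var[Y] estimated from the samples."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    cov = np.cov(x, y, ddof=1)          # 2 x 2 sample covariance matrix
    beta = -cov[0, 1] / cov[1, 1]
    w = x + beta * (y - mu)             # controlled estimator W
    return w.mean(), beta, w.var(ddof=1)
```

For instance, to estimate $E[\, e^U \,]$ for $U$ uniform on $(0, 1)$ with $Y = U$ and $\mu = 1/2$, draw `u = np.random.default_rng(0).random(100000)` and call `control_variate_estimate(np.exp(u), u, 0.5)`. Note that $X$ and $Y$ are computed from the same draws, so they are correlated.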

### Optimal Choice of *β (concluded)*

• Observe that $-\beta$ has the same sign as the correlation between $X$ and $Y$.

• Hence, if $X$ and $Y$ are positively correlated, then $\beta < 0$ and $X$ is adjusted downward whenever $Y > \mu$ and upward otherwise.

• The opposite is true when $X$ and $Y$ are negatively correlated, in which case $\beta > 0$.

• Suppose a suboptimal $\beta + \epsilon$ is used instead.

• The variance increases by only $\epsilon^2\, \mathrm{Var}[\, Y \,]$.^{a}

### A Pitfall

*• A potential pitfall is to sample X and Y independently.*

*• In this case, Cov[ X, Y ] = 0.*

*• Equation (121) on p. 897 becomes*

*Var[ W ] = Var[ X ] + β*^{2} *Var[ Y ].*

*• So whatever Y is, the variance is increased!*

*• Lesson: X and Y must be correlated.*

### Problems with the Monte Carlo Method

*• The error bound is only probabilistic.*

• The probabilistic error bound of $O(1/\sqrt{N})$ does not benefit from regularity of the integrand function.

• The requirement that the points be independent random samples is wasteful because of clustering.

• In reality, pseudorandom numbers generated by completely deterministic means are used.

• Monte Carlo simulation exhibits a great sensitivity to the seed of the pseudorandom-number generator.

*Matrix Computation*

To set up a philosophy against physics is rash;

philosophers who have done so have always ended in disaster.

— Bertrand Russell

### Definitions and Basic Results

• Let $A \triangleq [\, a_{ij} \,]_{1 \le i \le m,\, 1 \le j \le n}$, or simply $A \in \mathbb{R}^{m \times n}$, denote an $m \times n$ matrix.

• It can also be represented as $[\, a_1, a_2, \ldots, a_n \,]$ where $a_i \in \mathbb{R}^m$ are vectors.

– Vectors are column vectors unless stated otherwise.

• $A$ is a square matrix when $m = n$.

• The rank of a matrix is the largest number of linearly independent columns.

### Definitions and Basic Results (continued)

• A square matrix $A$ is said to be symmetric if $A^{\mathrm{T}} = A$.

• A real $n \times n$ matrix $A \triangleq [\, a_{ij} \,]_{i,j}$ is diagonally dominant if $|\, a_{ii} \,| > \sum_{j \ne i} |\, a_{ij} \,|$ for $1 \le i \le n$.

– Such matrices are nonsingular.

• The identity matrix is the square matrix $I \triangleq \mathrm{diag}[\, 1, 1, \ldots, 1 \,]$.

### Definitions and Basic Results (concluded)

*• A matrix has full column rank if its columns are linearly*
independent.

• A real symmetric matrix $A$ is positive definite if

$$x^{\mathrm{T}} A x = \sum_{i,j} a_{ij}\, x_i x_j > 0$$

for any nonzero vector $x$.

• A matrix $A$ is positive definite if and only if there exists a matrix $W$ such that $A = W^{\mathrm{T}} W$ and $W$ has full column rank.

### Cholesky Decomposition

*• Positive deﬁnite matrices can be factored as*
*A = LL*^{T}*,*

called the Cholesky decomposition.

**– Above, L is a lower triangular matrix.**

### Generation of Multivariate Distribution

• Let $\boldsymbol{x} \triangleq [\, x_1, x_2, \ldots, x_n \,]^{\mathrm{T}}$ be a vector random variable with a positive definite covariance matrix $C$.

• As usual, assume $E[\, \boldsymbol{x} \,] = \boldsymbol{0}$.

• This covariance structure can be matched by $P\boldsymbol{y}$.

– $\boldsymbol{y} \triangleq [\, y_1, y_2, \ldots, y_n \,]^{\mathrm{T}}$ is a vector random variable with a covariance matrix equal to the identity matrix.

– $C = PP^{\mathrm{T}}$ is the Cholesky decomposition of $C$.^{a}

^{a}What if $C$ is not positive definite? See Y. Y. Lai (R93942114) & Lyuu (2007).

### Generation of Multivariate Distribution (concluded)

• For example, suppose

$$C = \begin{bmatrix} 1 & \rho \\ \rho & 1 \end{bmatrix}.$$

• Then

$$P = \begin{bmatrix} 1 & 0 \\ \rho & \sqrt{1 - \rho^2} \end{bmatrix}$$

as $PP^{\mathrm{T}} = C$.^{a}

^{a}Recall Eq. (28) on p. 179.

### Generation of Multivariate Normal Distribution

• Suppose we want to generate the multivariate normal distribution with a covariance matrix $C = PP^{\mathrm{T}}$.

– First, generate independent standard normal distributions $y_1, y_2, \ldots, y_n$.

– Then $P\,[\, y_1, y_2, \ldots, y_n \,]^{\mathrm{T}}$ has the desired distribution.

– These steps can then be repeated.
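The two steps above can be sketched with NumPy, whose `numpy.linalg.cholesky` returns the lower triangular factor $P$ with $C = PP^{\mathrm{T}}$. The function name `sample_multivariate_normal` and the `seed` argument are assumptions for illustration.

```python
import numpy as np

def sample_multivariate_normal(C, N, seed=0):
    """Draw N zero-mean normal vectors with covariance matrix C.
    Returns an array of shape (n, N): one sample per column."""
    C = np.asarray(C, dtype=float)
    rng = np.random.default_rng(seed)
    P = np.linalg.cholesky(C)                    # lower triangular, C = P P^T
    y = rng.standard_normal((C.shape[0], N))     # independent N(0, 1) entries
    return P @ y                                 # each column is P y
```

With `C = [[1, 0.6], [0.6, 1]]` the factor is `[[1, 0], [0.6, 0.8]]`, which agrees with the $2 \times 2$ example since $\sqrt{1 - 0.6^2} = 0.8$.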

### Multivariate Derivatives Pricing

*• Generating the multivariate normal distribution is*
essential for the Monte Carlo pricing of multivariate
derivatives (pp. 809ﬀ).

• For example, the rainbow option on $k$ assets has payoff

$$\max(\max(S_1, S_2, \ldots, S_k) - X, 0)$$

at maturity.

*• The closed-form formula is a multi-dimensional integral.*^{a}

aJohnson (1987); C. Y. Chen (D95723006) & Lyuu (2009).

### Multivariate Derivatives Pricing (concluded)

• Suppose $dS_j/S_j = r\,dt + \sigma_j\,dW_j$, $1 \le j \le k$, where $C$ is the correlation matrix for $dW_1, dW_2, \ldots, dW_k$.

• Let $C = PP^{\mathrm{T}}$.

• Let $\xi$ consist of $k$ independent random variables from $N(0, 1)$.

• Let $\xi' = P\xi$.

• Similar to Eq. (120) on p. 853, for each asset $1 \le j \le k$,

$$S_{i+1} = S_i\, e^{(r - \sigma_j^2/2)\,\Delta t + \sigma_j \sqrt{\Delta t}\;\xi_j'}.$$

### Least-Squares Problems

• The least-squares (LS) problem is concerned with

$$\min_{x \in \mathbb{R}^n} \|\, Ax - b \,\|,$$

where $A \in \mathbb{R}^{m \times n}$, $b \in \mathbb{R}^m$, and $m \ge n$.

• The LS problem is called regression analysis in statistics and is equivalent to minimizing the mean-square error.

• Often written as

$$Ax = b.$$

### Polynomial Regression

• In polynomial regression, $x_0 + x_1 x + \cdots + x_n x^n$ is used to fit the data $\{\, (a_1, b_1), (a_2, b_2), \ldots, (a_m, b_m) \,\}$.

• This leads to the LS problem,

$$\begin{bmatrix} 1 & a_1 & a_1^2 & \cdots & a_1^n \\ 1 & a_2 & a_2^2 & \cdots & a_2^n \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 1 & a_m & a_m^2 & \cdots & a_m^n \end{bmatrix} \begin{bmatrix} x_0 \\ x_1 \\ \vdots \\ x_n \end{bmatrix} = \begin{bmatrix} b_1 \\ b_2 \\ \vdots \\ b_m \end{bmatrix}.$$

*• Consult p. 273 of the textbook for solutions.*
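As a minimal sketch, the polynomial LS problem can be solved with NumPy: `numpy.vander` builds the matrix of powers shown above and `numpy.linalg.lstsq` minimizes $\|\, Ax - b \,\|$. The function name `polynomial_least_squares` is an assumption, and this is not necessarily the solution method the textbook describes.

```python
import numpy as np

def polynomial_least_squares(a, b, n):
    """Fit x_0 + x_1 a + ... + x_n a^n to the data (a_i, b_i) by least squares.
    Returns the coefficients [x_0, x_1, ..., x_n]."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    A = np.vander(a, N=n + 1, increasing=True)   # row i is [1, a_i, ..., a_i^n]
    x, *_ = np.linalg.lstsq(A, b, rcond=None)
    return x
```

When the data lie exactly on a polynomial of degree at most $n$ and there are at least $n + 1$ distinct $a_i$, the fit recovers its coefficients.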

### American Option Pricing by Simulation

*• The continuation value of an American option is the*
conditional expectation of the payoﬀ from keeping the
option alive now.

*• The option holder must compare the immediate exercise*
value and the continuation value.

*• In standard Monte Carlo simulation, each path is*
treated independently of other paths.

*• But the decision to exercise the option cannot be*
reached by looking at one path alone.

### The Least-Squares Monte Carlo Approach

*• The continuation value can be estimated from the*
cross-sectional information in the simulation by using
least squares.^{a}

*• The result is a function (of the state) for estimating the*
continuation values.

*• Use the function to estimate the continuation value for*
each path to determine its cash ﬂow.

*• This is called the least-squares Monte Carlo (LSM)*
approach.

aLongstaﬀ & Schwartz (2001).

### The Least-Squares Monte Carlo Approach (concluded)

*• The LSM is provably convergent.*^{a}

*• The LSM can be easily parallelized.*^{b}

**– Partition the paths into subproblems and perform**
LSM on each of them independently.

**– The speedup is close to linear (i.e., proportional to**
the number of cores).

*• Surprisingly, accuracy is not aﬀected.*

^{a}Clément, Lamberton, & Protter (2002); Stentoft (2004).

bK. Huang (B96902079, R00922018) (2013); C. W. Chen (B97902046, R01922005) (2014); C. W. Chen (B97902046, R01922005), K. Huang

### A Numerical Example

*• Consider a 3-year American put on a*
non-dividend-paying stock.

*• The put is exercisable at years 0, 1, 2, and 3.*

*• The strike price X = 105.*

*• The annualized riskless rate is r = 5%.*

**– The annual discount factor hence equals 0.951229.**

*• The current stock price is 101.*

*• We use 8 price paths to illustrate the algorithm.*

### A Numerical Example (continued)

Stock price paths

| Path | Year 0 | Year 1 | Year 2 | Year 3 |
|---|---|---|---|---|
| 1 | **101** | **97.6424** | **92.5815** | 107.5178 |
| 2 | **101** | **101.2103** | 105.1763 | **102.4524** |
| 3 | **101** | 105.7802 | **103.6010** | 124.5115 |
| 4 | **101** | **96.4411** | **98.7120** | 108.3600 |
| 5 | **101** | 124.2345 | **101.0564** | **104.5315** |
| 6 | **101** | **95.8375** | **93.7270** | **99.3788** |
| 7 | **101** | 108.9554 | **102.4177** | **100.9225** |

[Plot: stock price (95 to 125) against year (0 to 3) for paths 1 through 8.]

### A Numerical Example (continued)

*• We use the basis functions 1, x, x*^{2}.
**– Other basis functions are possible.**^{a}

*• The plot next page shows the ﬁnal estimated optimal*
exercise strategy given by LSM.

*• We now proceed to tackle our problem.*

*• The idea is to calculate the cash ﬂow along each path,*
*using information from all paths.*

^{a}Laguerre polynomials, Hermite polynomials, Legendre polynomials, Chebyshev polynomials, Gegenbauer polynomials, or Jacobi polynomials.

[Plot: stock price (95 to 125) against year (0 to 3) for paths 1 through 8, showing the final estimated optimal exercise strategy given by LSM.]

### A Numerical Example (continued)

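Cash flows at year 3

| Path | Year 0 | Year 1 | Year 2 | Year 3 |
|---|---|---|---|---|
| 1 | — | — | — | 0 |
| 2 | — | — | — | 2.5476 |
| 3 | — | — | — | 0 |
| 4 | — | — | — | 0 |
| 5 | — | — | — | 0.4685 |
| 6 | — | — | — | 5.6212 |
| 7 | — | — | — | 4.0775 |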

### A Numerical Example (continued)

*• The cash ﬂows at year 3 are the exercise value if the put*
is in the money.

*• Only 4 paths are in the money: 2, 5, 6, 7.*

*• Some of the cash ﬂows may not occur if the put is*
exercised earlier, which we will ﬁnd out later.

• Incidentally, the European counterpart has a value of

$$0.951229^3 \times \frac{2.5476 + 0.4685 + 5.6212 + 4.0775}{8} = 1.3680.$$

### A Numerical Example (continued)

*• We move on to year 2.*

*• For each state that is in the money at year 2, we must*
decide whether to exercise it.

*• There are 6 paths for which the put is in the money: 1,*
3, 4, 5, 6, 7 (p. 923).

*• Only in-the-money paths will be used in the regression*
because they are where early exercise is relevant.

**– If there were none, move on to year 1.**

### A Numerical Example (continued)

*• Let x denote the stock prices at year 2 for those 6 paths.*

*• Let y denote the corresponding discounted future cash*
*ﬂows (at year 3) if the put is not exercised at year 2.*

### A Numerical Example (continued)

Regression at year 2

| Path | $x$ | $y$ |
|---|---|---|
| 1 | 92.5815 | 0 × 0.951229 |
| 2 | — | — |
| 3 | 103.6010 | 0 × 0.951229 |
| 4 | 98.7120 | 0 × 0.951229 |
| 5 | 101.0564 | 0.4685 × 0.951229 |
| 6 | 93.7270 | 5.6212 × 0.951229 |
| 7 | 102.4177 | 4.0775 × 0.951229 |

### A Numerical Example (continued)

*• We regress y on 1, x, and x*^{2}.

*• The result is*

*f (x) = 22.08* *− 0.313114 × x + 0.00106918 × x*^{2}*.*

*• f(x) estimates the continuation value conditional on the*
stock price at year 2.

*• We next compare the immediate exercise value and the*
estimated continuation value.^{a}

aThe *f(102.4177) entry on the next page was corrected by Mr. Tu,*
Yung-Szu (B79503054, R83503086) on May 25, 2017.

### A Numerical Example (continued)

Optimal early exercise decision at year 2

| Path | Exercise | Continuation |
|---|---|---|
| 1 | 12.4185 | $f(92.5815) = 2.2558$ |
| 2 | — | — |
| 3 | 1.3990 | $f(103.6010) = 1.1168$ |
| 4 | 6.2880 | $f(98.7120) = 1.5901$ |
| 5 | 3.9436 | $f(101.0564) = 1.3568$ |
| 6 | 11.2730 | $f(93.7270) = 2.1253$ |
| 7 | 2.5823 | $f(102.4177) = 1.2266$ |

### A Numerical Example (continued)

*• Amazingly, the put should be exercised in all 6 paths: 1,*
3, 4, 5, 6, 7.

*• Now, any positive cash ﬂow at year 3 vanishes for these*
paths as the put is exercised before year 3 (p. 923).

**– They are paths 5, 6, 7.**

*• The cash ﬂows on p. 927 become the ones on next slide.*

### A Numerical Example (continued)

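Cash flows at years 2 & 3

| Path | Year 0 | Year 1 | Year 2 | Year 3 |
|---|---|---|---|---|
| 1 | — | — | 12.4185 | 0 |
| 2 | — | — | 0 | 2.5476 |
| 3 | — | — | 1.3990 | 0 |
| 4 | — | — | 6.2880 | 0 |
| 5 | — | — | 3.9436 | 0 |
| 6 | — | — | 11.2730 | 0 |
| 7 | — | — | 2.5823 | 0 |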

### A Numerical Example (continued)

*• We move on to year 1.*

*• For each state that is in the money at year 1, we must*
decide whether to exercise it.

*• There are 5 paths for which the put is in the money: 1,*
2, 4, 6, 8 (p. 923).

*• Only in-the-money paths will be used in the regression*
because they are where early exercise is relevant.

**– If there were none, move on to year 0.**

### A Numerical Example (continued)

*• Let x denote the stock prices at year 1 for those 5 paths.*

*• Let y denote the corresponding discounted future cash*
ﬂows if the put is not exercised at year 1.

*• From p. 935, we have the following table.*

### A Numerical Example (continued)

Regression at year 1

| Path | $x$ | $y$ |
|---|---|---|
| 1 | 97.6424 | 12.4185 × 0.951229 |
| 2 | 101.2103 | 2.5476 × 0.951229² |
| 3 | — | — |
| 4 | 96.4411 | 6.2880 × 0.951229 |
| 5 | — | — |
| 6 | 95.8375 | 11.2730 × 0.951229 |
| 7 | — | — |
| 8 | 104.1475 | 0 × 0.951229 |

### A Numerical Example (continued)

*• We regress y on 1, x, and x*^{2}.

*• The result is*

*f (x) =* *−420.964 + 9.78113 × x − 0.0551567 × x*^{2}*.*

*• f(x) estimates the continuation value conditional on the*
stock price at year 1.

*• We next compare the immediate exercise value and the*
estimated continuation value.

### A Numerical Example (continued)

Optimal early exercise decision at year 1

| Path | Exercise | Continuation |
|---|---|---|
| 1 | 7.3576 | $f(97.6424) = 8.2230$ |
| 2 | 3.7897 | $f(101.2103) = 3.9882$ |
| 3 | — | — |
| 4 | 8.5589 | $f(96.4411) = 9.3329$ |
| 5 | — | — |
| 6 | 9.1625 | $f(95.8375) = 9.83042$ |
| 7 | — | — |
| 8 | 0.8525 | $f(104.1475) = -0.551885$ |

### A Numerical Example (continued)

*• The put should be exercised for 1 path only: 8.*

**– Note that f(104.1475) < 0.**

*• Now, any positive future cash ﬂow vanishes for this path.*

**– But there is none.**

*• The cash ﬂows on p. 935 become the ones on next slide.*

*• They also conﬁrm the plot on p. 926.*

### A Numerical Example (continued)

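Cash flows at years 1, 2, & 3

| Path | Year 0 | Year 1 | Year 2 | Year 3 |
|---|---|---|---|---|
| 1 | — | 0 | 12.4185 | 0 |
| 2 | — | 0 | 0 | 2.5476 |
| 3 | — | 0 | 1.3990 | 0 |
| 4 | — | 0 | 6.2880 | 0 |
| 5 | — | 0 | 3.9436 | 0 |
| 6 | — | 0 | 11.2730 | 0 |
| 7 | — | 0 | 2.5823 | 0 |
| 8 | — | 0.8525 | 0 | 0 |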

### A Numerical Example (continued)

*• We move on to year 0.*

• The continuation value is, from p. 942,

$$\begin{aligned} (\,&12.4185 \times 0.951229^2 + 2.5476 \times 0.951229^3 \\ +\,&1.3990 \times 0.951229^2 + 6.2880 \times 0.951229^2 \\ +\,&3.9436 \times 0.951229^2 + 11.2730 \times 0.951229^2 \\ +\,&2.5823 \times 0.951229^2 + 0.8525 \times 0.951229\,)/8 = 4.66263. \end{aligned}$$

### A Numerical Example (concluded)

*• As this is larger than the immediate exercise value of*
105 *− 101 = 4,*

the put should not be exercised at year 0.

*• Hence the put’s value is estimated to be 4.66263.*

*• Compare this with the European put’s value of 1.3680*
(p. 928).
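The key steps of the example can be re-traced with a short script. This is a sketch using only numbers that appear in the tables above; the variable and function names (`fit_continuation`, `cash_flows`, and so on) are assumptions. It redoes the year-2 regression on the six in-the-money paths, then discounts the final cash flows to year 0.

```python
import numpy as np

d = 0.951229          # annual discount factor
X = 105.0             # strike price
S0 = 101.0            # current stock price

# Year-2 regression data for in-the-money paths 1, 3, 4, 5, 6, 7.
x2 = np.array([92.5815, 103.6010, 98.7120, 101.0564, 93.7270, 102.4177])
y2 = d * np.array([0.0, 0.0, 0.0, 0.4685, 5.6212, 4.0775])

def fit_continuation(x, y):
    """Least-squares fit of y on the basis functions 1, x, x^2."""
    A = np.vander(x, N=3, increasing=True)
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef

coef2 = fit_continuation(x2, y2)
continuation2 = np.vander(x2, N=3, increasing=True) @ coef2
exercise2 = X - x2
exercise_at_year2 = exercise2 > continuation2   # exercise iff worth more now

# Final cash flows per path as (year, amount), from the last cash flow table.
cash_flows = [(2, 12.4185), (3, 2.5476), (2, 1.3990), (2, 6.2880),
              (2, 3.9436), (2, 11.2730), (2, 2.5823), (1, 0.8525)]
continuation0 = sum(cf * d**yr for yr, cf in cash_flows) / len(cash_flows)
american_put = max(continuation0, X - S0)

# European counterpart: year-3 payoffs only.
european_put = d**3 * (2.5476 + 0.4685 + 5.6212 + 4.0775) / 8
```

The year-1 regression follows the same pattern with the year-1 table as input.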

*Time Series Analysis*

The historian is a prophet in reverse.

— Friedrich von Schlegel (1772–1829)

### GARCH Option Pricing

^{a}

*• Options can be priced when the underlying asset’s*
return follows a GARCH process.

*• Let S*_{t}*denote the asset price at date t.*

*• Let h*^{2}_{t}*be the conditional variance of the return over*
*the period [ t, t + 1) given the information at date t.*

**– “One day” is merely a convenient term for any**
*elapsed time Δt.*

aARCH (autoregressive conditional heteroskedastic) is due to Engle (1982), co-winner of the 2003 Nobel Prize in Economic Sciences. GARCH (generalized ARCH) is due to Bollerslev (1986) and Taylor (1986). A Bloomberg quant said to me on Feb 29, 2008, that GARCH is seldom

### GARCH Option Pricing (continued)

• Adopt the following risk-neutral process for the price dynamics:^{a}

$$\ln\frac{S_{t+1}}{S_t} = r - \frac{h_t^2}{2} + h_t\,\epsilon_{t+1}, \tag{123}$$

where

$$h_{t+1}^2 = \beta_0 + \beta_1 h_t^2 + \beta_2 h_t^2\,(\epsilon_{t+1} - c)^2, \tag{124}$$

$$\epsilon_{t+1} \sim N(0, 1) \text{ given information at date } t,$$

$$r = \text{daily riskless return},$$

$$c \ge 0.$$

aDuan (1995).

### GARCH Option Pricing (continued)

*• The ﬁve unknown parameters of the model are c, h*_{0}*, β*_{0},
*β*_{1}*, and β*_{2}.

*• It is postulated that β*_{0}*, β*_{1}*, β*_{2} *≥ 0 to make the*
conditional variance positive.

*• There are other inequalities to satisfy (see text).*

*• The above process is called the nonlinear asymmetric*
GARCH (or NGARCH) model.

### GARCH Option Pricing (continued)

*• It captures the volatility clustering in asset returns ﬁrst*
noted by Mandelbrot (1963).^{a}

– When $c = 0$, a large $\epsilon_{t+1}$ results in a large $h_{t+1}$, which in turn tends to yield a large $h_{t+2}$, and so on.

• It also captures the negative correlation between the asset return and changes in its (conditional) volatility.^{b}

– For $c > 0$, a positive $\epsilon_{t+1}$ (good news) tends to decrease $h_{t+1}$, whereas a negative $\epsilon_{t+1}$ (bad news) tends to do the opposite.

a“*. . . large changes tend to be followed by large changes—of either*
sign—and small changes tend to be followed by small changes *. . . ”*

bNoted by Black (1976): Volatility tends to rise in response to “bad news” and fall in response to “good news.”

### GARCH Option Pricing (continued)

*• This is called the leverage eﬀect.*

**– A falling stock price raises the ﬁxed costs, relatively**
speaking.^{a}

• With $y_t \triangleq \ln S_t$ denoting the logarithmic price, the model becomes

$$y_{t+1} = y_t + r - \frac{h_t^2}{2} + h_t\,\epsilon_{t+1}. \tag{125}$$

• The pair $(y_t, h_t^2)$ completely describes the current state.

aBlack (1992).

### GARCH Option Pricing (concluded)

• The conditional mean and variance of $y_{t+1}$ are clearly

$$E[\, y_{t+1} \mid y_t, h_t^2 \,] = y_t + r - \frac{h_t^2}{2}, \tag{126}$$

$$\mathrm{Var}[\, y_{t+1} \mid y_t, h_t^2 \,] = h_t^2. \tag{127}$$

• Finally, given $(y_t, h_t^2)$, the correlation between $y_{t+1}$ and $h_{t+1}^2$ equals

$$-\frac{2c}{\sqrt{2 + 4c^2}},$$

which is negative for $c > 0$.

### GARCH Model: Inferences

• Suppose the parameters $c$, $h_0$, $\beta_0$, $\beta_1$, and $\beta_2$ are given.

• Then we can recover $h_1, h_2, \ldots, h_n$ and $\epsilon_1, \epsilon_2, \ldots, \epsilon_n$ from the prices

$$S_0, S_1, \ldots, S_n$$

under the GARCH model (123) on p. 948.

*• This property is useful in statistical inferences.*
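The recovery can be sketched as follows. Solving Eq. (123) for $\epsilon_{t+1}$ gives $\epsilon_{t+1} = [\, \ln(S_{t+1}/S_t) - r + h_t^2/2 \,]/h_t$, after which Eq. (124) yields $h_{t+1}^2$. The function names `simulate_ngarch` and `recover_ngarch` and the parameter values in the usage note are assumptions for illustration; the simulator is included only to produce prices to invert.

```python
import numpy as np

def simulate_ngarch(S0, h0, r, beta0, beta1, beta2, c, n, seed=0):
    """Simulate S_0..S_n, h_0..h_n and eps_1..eps_n under Eqs. (123)-(124)."""
    rng = np.random.default_rng(seed)
    eps = rng.standard_normal(n)              # eps[t] plays the role of eps_{t+1}
    S = np.empty(n + 1)
    h2 = np.empty(n + 1)
    S[0], h2[0] = S0, h0**2
    for t in range(n):
        S[t + 1] = S[t] * np.exp(r - 0.5 * h2[t] + np.sqrt(h2[t]) * eps[t])
        h2[t + 1] = beta0 + beta1 * h2[t] + beta2 * h2[t] * (eps[t] - c) ** 2
    return S, np.sqrt(h2), eps

def recover_ngarch(S, h0, r, beta0, beta1, beta2, c):
    """Recover h_0..h_n and eps_1..eps_n from the prices S_0..S_n."""
    n = len(S) - 1
    eps = np.empty(n)
    h2 = np.empty(n + 1)
    h2[0] = h0**2
    for t in range(n):
        # Invert Eq. (123) for eps_{t+1}, then apply Eq. (124).
        eps[t] = (np.log(S[t + 1] / S[t]) - r + 0.5 * h2[t]) / np.sqrt(h2[t])
        h2[t + 1] = beta0 + beta1 * h2[t] + beta2 * h2[t] * (eps[t] - c) ** 2
    return np.sqrt(h2), eps
```

For example, simulating with `simulate_ngarch(100.0, 0.01, 0.0002, 1e-5, 0.8, 0.1, 0.5, 250)` and passing the resulting prices to `recover_ngarch` with the same parameters should reproduce the simulated $h_t$ and $\epsilon_t$ up to rounding.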