Biases in Pricing Continuously Monitored Options with Monte Carlo
• We are asked to price a continuously monitored up-and-out call with barrier H.
• The Monte Carlo method samples the stock price at n discrete time points t1, t2, . . . , tn.
• A sample path
S(t0), S(t1), . . . , S(tn) is produced.
– Here, t0 = 0 is the current time, and tn = T is the expiration time of the option.
Biases in Pricing Continuously Monitored Options with Monte Carlo (continued)
• If all of the sampled prices are below the barrier, this sample path pays max(S(tn) − X, 0).
• Repeating these steps and averaging the payoffs yield a Monte Carlo estimate.
C := 0;
for i = 1, 2, 3, . . . , N do
    P := S; hit := 0;
    for j = 1, 2, 3, . . . , n do
        P := P × exp[ (r − σ²/2)(T/n) + σ √(T/n) ξ ]; {By Eq. (117) on p. 841; ξ ∼ N(0, 1) is drawn afresh at each step.}
        if P ≥ H then
            hit := 1;
            break;
        end if
    end for
    if hit = 0 then
        C := C + max(P − X, 0);
    end if
end for
return C e^{−rT}/N;
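The pseudocode above can be sketched in Python as follows. This is a minimal illustration rather than a reference implementation: the function name, the argument names, and the `seed` parameter are assumptions made for this example, and the discounted average is returned as the estimate.

```python
import math
import random


def up_and_out_call_discrete(s0, x, h, r, sigma, t, n, n_paths, seed=0):
    """Monte Carlo estimate of an up-and-out call, checking the barrier
    only at the n sampled time points (hence biased for continuous monitoring)."""
    rng = random.Random(seed)          # seeded generator (an assumption, for reproducibility)
    dt = t / n
    drift = (r - 0.5 * sigma ** 2) * dt
    vol = sigma * math.sqrt(dt)
    total = 0.0
    for _ in range(n_paths):
        p = s0
        hit = False
        for _ in range(n):
            p *= math.exp(drift + vol * rng.gauss(0.0, 1.0))
            if p >= h:                 # knocked out at a sampled time point
                hit = True
                break
        if not hit:
            total += max(p - x, 0.0)
    return math.exp(-r * t) * total / n_paths
```

A call such as `up_and_out_call_discrete(100.0, 100.0, 120.0, 0.05, 0.2, 1.0, 4, 20000)` checks the barrier at only 4 dates, so it misses every crossing between them.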
Biases in Pricing Continuously Monitored Options with Monte Carlo (continued)
• This estimate is biased (Shevchenko, 2003).
– Suppose none of the sampled prices on a sample path equals or exceeds the barrier H.
– It remains possible for the continuous sample path that passes through them to hit the barrier between sampled time points (see plot on next page).
– Hence knock-out probabilities are underestimated, and the up-and-out call is overpriced.
[Figure: a continuous sample path that crosses the barrier H between sampled time points.]
Biases in Pricing Continuously Monitored Options with Monte Carlo (concluded)
• The bias can be lowered by increasing the number of observations along the sample path.
– For trees, the knock-out probabilities may decrease as the number of time steps is increased.
• However, even daily sampling may not suffice.
• The computational cost also rises as a result.
Brownian Bridge Approach to Pricing Barrier Options
• We desire an unbiased estimate which can be calculated efficiently.
• The above-mentioned payoff should be multiplied by the probability p that a continuous sample path does not hit the barrier conditional on the sampled prices.
• This methodology is called the Brownian bridge approach.
• Formally, we have
$$p \triangleq \mathrm{Prob}\left[\, S(t) < H,\ 0 \le t \le T \mid S(t_0), S(t_1), \ldots, S(t_n) \,\right].$$
Brownian Bridge Approach to Pricing Barrier Options (continued)
• As a barrier is hit over a time interval if and only if the maximum stock price over that period is at least H,
$$p = \mathrm{Prob}\left[\, \max_{0 \le t \le T} S(t) < H \;\middle|\; S(t_0), S(t_1), \ldots, S(t_n) \,\right].$$
• Luckily, the conditional distribution of the maximum over a time interval given the beginning and ending stock prices is known.
Brownian Bridge Approach to Pricing Barrier Options (continued)
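Lemma 21 Assume S follows dS/S = μ dt + σ dW and define
$$\zeta(x) \triangleq \exp\left[ -\frac{2 \ln(x/S(t)) \, \ln(x/S(t+\Delta t))}{\sigma^2 \Delta t} \right].$$
(1) If H > max(S(t), S(t + Δt)), then
$$\mathrm{Prob}\left[\, \max_{t \le u \le t+\Delta t} S(u) < H \;\middle|\; S(t), S(t+\Delta t) \,\right] = 1 - \zeta(H).$$
(2) If h < min(S(t), S(t + Δt)), then
$$\mathrm{Prob}\left[\, \min_{t \le u \le t+\Delta t} S(u) > h \;\middle|\; S(t), S(t+\Delta t) \,\right] = 1 - \zeta(h).$$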
Brownian Bridge Approach to Pricing Barrier Options (continued)
• Lemma 21 gives the probability that the barrier is not hit in a time interval, given the starting and ending stock prices.
• For our up-and-out call (so S(0) < H), choose n = 1.
• As a result,
$$p = \begin{cases} 1 - \exp\left[ -\dfrac{2 \ln(H/S(0)) \, \ln(H/S(T))}{\sigma^2 T} \right], & \text{if } H > \max(S(0), S(T)), \\ 0, & \text{otherwise.} \end{cases}$$
Brownian Bridge Approach to Pricing Barrier Options (continued)
The following algorithm works for up-and-out and down-and-out calls.
C := 0;
for i = 1, 2, 3, . . . , N do
    P := S × exp[ (r − q − σ²/2) T + σ √T ξ( ) ];
    if (S < H and P < H) or (S > H and P > H) then
        C := C + max(P − X, 0) × { 1 − exp[ −2 ln(H/S) × ln(H/P) / (σ²T) ] };
    end if
end for
return C e^{−rT}/N;
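As a rough sketch, the algorithm above translates to Python as shown here. The function and parameter names and the `seed` argument are assumptions made for this example. Only the terminal price is simulated, and each payoff is weighted by the Brownian bridge probability of not hitting the barrier.

```python
import math
import random


def barrier_call_bridge(s0, x, h, r, q, sigma, t, n_paths, seed=0):
    """Knock-out call (up-and-out if s0 < h, down-and-out if s0 > h)
    priced with the one-step Brownian bridge adjustment."""
    rng = random.Random(seed)          # seeded generator (an assumption, for reproducibility)
    total = 0.0
    for _ in range(n_paths):
        z = rng.gauss(0.0, 1.0)
        p = s0 * math.exp((r - q - 0.5 * sigma ** 2) * t + sigma * math.sqrt(t) * z)
        same_side = (s0 < h and p < h) or (s0 > h and p > h)
        if same_side:
            # Probability that the continuous path between s0 and p never touches h.
            survive = 1.0 - math.exp(
                -2.0 * math.log(h / s0) * math.log(h / p) / (sigma ** 2 * t)
            )
            total += max(p - x, 0.0) * survive
    return math.exp(-r * t) * total / n_paths
```

Because no intermediate prices are simulated, each replication costs a single normal draw.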
Brownian Bridge Approach to Pricing Barrier Options (concluded)
• The idea can be generalized.
• For example, we can handle more complex barrier options.
• Consider an up-and-out call with barrier $H_i$ for the time interval $(t_i, t_{i+1}]$, 0 ≤ i < n.
• This option thus contains n barriers.
• Multiply the probabilities for the n time intervals to obtain the desired probability adjustment term.
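The product of per-interval probabilities might be computed as in the sketch below. The function name and the assumption of equally spaced time points (a common `dt`) are choices made for this illustration. Each factor applies Lemma 21(1) to one interval with that interval's own barrier.

```python
import math


def no_hit_probability(prices, barriers, sigma, dt):
    """Probability that an up-barrier is never hit, given sampled prices.

    prices   -- S(t_0), S(t_1), ..., S(t_n)
    barriers -- H_0, ..., H_{n-1}; H_i applies on the interval (t_i, t_{i+1}]
    dt       -- common interval length (equal spacing is an assumption here)
    """
    prob = 1.0
    for i, h in enumerate(barriers):
        a, b = prices[i], prices[i + 1]
        if h <= max(a, b):             # barrier reached at a sampled point
            return 0.0
        prob *= 1.0 - math.exp(
            -2.0 * math.log(h / a) * math.log(h / b) / (sigma ** 2 * dt)
        )
    return prob
```

The payoff of a surviving sample path is then multiplied by this probability, exactly as in the single-interval case.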
Pricing Barrier Options without Brownian Bridge
• Let Th denote the amount of time for a process Xt to hit h for the first time.
• It is called the first passage time or the first hitting time.
• Suppose Xt is a (μ, σ) Brownian motion:
dXt = μ dt + σ dWt, t ≥ 0.
Pricing Barrier Options without Brownian Bridge (continued)
• The first passage time $T_h$ follows the inverse Gaussian (IG) distribution with probability density function:ᵃ
$$\frac{|\, h - X(0) \,|}{\sigma x^{3/2} \sqrt{2\pi}} \; e^{-(h - X(0) - \mu x)^2/(2\sigma^2 x)}, \quad x > 0.$$
• For pricing a barrier option with barrier H by simulation, the density function becomes
$$\frac{|\, \ln(H/S(0)) \,|}{\sigma x^{3/2} \sqrt{2\pi}} \; e^{-[\ln(H/S(0)) - (r - \sigma^2/2)\, x]^2/(2\sigma^2 x)}, \quad x > 0.$$
aA. N. Borodin & Salminen (1996), with Laplace transform
√
Pricing Barrier Options without Brownian Bridge (concluded)
• Draw an x from this distribution.a
• If x > T , a knock-in option fails to knock in, whereas a knock-out option does not knock out.
• If x ≤ T , the opposite is true.
• If the barrier option survives at maturity T , then draw an S(T ) to calculate its payoff.
• Repeat the above process many times to average the discounted payoff.
aThe IG distribution can be very efficiently sampled (Michael, Schu-
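Drawing the first passage time could look like the sketch below. It uses the standard transformation-with-rejection sampler for the inverse Gaussian distribution with mean m and shape λ. The parametrization m = |h − X(0)|/|μ| and λ = (h − X(0))²/σ² and the function names are assumptions of this example. It also assumes the drift points toward the barrier, because otherwise the barrier may never be hit and the density above does not integrate to one.

```python
import math
import random


def sample_inverse_gaussian(mean, shape, rng):
    """One draw from IG(mean, shape) by transformation with rejection."""
    nu = rng.gauss(0.0, 1.0)
    y = nu * nu
    x = (mean + mean * mean * y / (2.0 * shape)
         - (mean / (2.0 * shape)) * math.sqrt(4.0 * mean * shape * y + mean * mean * y * y))
    # Choose between the two roots with the appropriate probability.
    if rng.random() <= mean / (mean + x):
        return x
    return mean * mean / x


def sample_first_passage_time(x0, h, mu, sigma, rng):
    """First time a (mu, sigma) Brownian motion started at x0 hits h.

    Assumption: the drift points toward the barrier, i.e. mu * (h - x0) > 0.
    """
    dist = abs(h - x0)
    if dist == 0.0 or mu * (h - x0) <= 0.0:
        raise ValueError("this sketch requires drift toward a barrier away from x0")
    return sample_inverse_gaussian(dist / abs(mu), dist * dist / (sigma * sigma), rng)
```

For the stock price, one would pass x0 = ln S(0), h = ln H, and mu = r − σ²/2, then compare the draw with T as described above.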
Brownian Bridge Approach to Pricing Lookback Options
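• By Lemma 21(1) (p. 864), for y ≥ max(S(0), S(T)) (El Babsiri & Noel, 1998),
$$F_{\max}(y) \triangleq \mathrm{Prob}\left[\, \max_{0 \le t \le T} S(t) < y \;\middle|\; S(0), S(T) \,\right] = 1 - \exp\left[ -\frac{2 \ln(y/S(0)) \, \ln(y/S(T))}{\sigma^2 T} \right].$$
• So $F_{\max}$ is the conditional distribution function of the maximum stock price.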
Brownian Bridge Approach to Pricing Lookback Options (continued)
• A random variable with that distribution can be generated by $F_{\max}^{-1}(x)$, where x is uniformly distributed over (0, 1). This is called the inverse-transform technique (see p. 259 of the textbook).
• In other words, solve for y in
$$x = 1 - \exp\left[ -\frac{2 \ln(y/S(0)) \, \ln(y/S(T))}{\sigma^2 T} \right].$$
Brownian Bridge Approach to Pricing Lookback Options (continued)
• Equivalently,
$$\ln(1 - x) = -\frac{2 \ln(y/S(0)) \, \ln(y/S(T))}{\sigma^2 T} = -\frac{2}{\sigma^2 T} \left\{ [\, \ln y - \ln S(0) \,] \, [\, \ln y - \ln S(T) \,] \right\}.$$
Brownian Bridge Approach to Pricing Lookback Options (continued)
• This is a quadratic equation in ln y, so there are two solutions for ln y.
• But only one is consistent with y ≥ max(S(0), S(T)):
$$\ln y = \frac{\ln(S(0)\, S(T)) + \sqrt{\left( \ln \dfrac{S(T)}{S(0)} \right)^2 - 2\sigma^2 T \ln(1 - x)}}{2}.$$
Brownian Bridge Approach to Pricing Lookback Options (concluded)
The following algorithm works for the lookback put on the maximum.
C := 0;
for i = 1, 2, 3, . . . , N do
    P := S × exp[ (r − q − σ²/2) T + σ √T ξ( ) ]; {By Eq. (117) on p. 841.}
    Y := exp{ [ ln(SP) + √( (ln(P/S))² − 2σ²T ln[ 1 − U(0, 1) ] ) ] / 2 };
    C := C + (Y − P);
end for
return C e^{−rT}/N;
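A Python sketch of this algorithm is given below. The helper `sample_max`, the argument names, and the `seed` parameter are assumptions made for this example. The maximum is drawn by the inverse-transform formula for ln y derived above.

```python
import math
import random


def sample_max(s0, p, sigma, t, u):
    """Maximum price over [0, t] given the endpoints s0 and p.

    u is a uniform draw in [0, 1); u = 0 returns max(s0, p).
    """
    root = math.sqrt(math.log(p / s0) ** 2 - 2.0 * sigma ** 2 * t * math.log(1.0 - u))
    return math.exp(0.5 * (math.log(s0 * p) + root))


def lookback_put_on_max(s0, r, q, sigma, t, n_paths, seed=0):
    """Monte Carlo estimate of a lookback put paying (maximum price - terminal price)."""
    rng = random.Random(seed)          # seeded generator (an assumption, for reproducibility)
    total = 0.0
    for _ in range(n_paths):
        z = rng.gauss(0.0, 1.0)
        p = s0 * math.exp((r - q - 0.5 * sigma ** 2) * t + sigma * math.sqrt(t) * z)
        y = sample_max(s0, p, sigma, t, rng.random())
        total += y - p
    return math.exp(-r * t) * total / n_paths
```

Since `rng.random()` returns values in [0, 1), the logarithm of 1 − u is always defined.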
Pricing Lookback Options without Brownian Bridge
• Suppose we do not draw S(T ) in simulation.
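• Now, the distribution function of the maximum logarithmic stock price is (A. N. Borodin & Salminen, 1996)
$$\mathrm{Prob}\left[\, \max_{0 \le t \le T} \ln \frac{S(t)}{S(0)} < y \,\right] = 1 - N\left( \frac{-y + \left( r - q - \frac{\sigma^2}{2} \right) T}{\sigma \sqrt{T}} \right) - e^{2 (r - q - \sigma^2/2)\, y/\sigma^2} \, N\left( \frac{-y - \left( r - q - \frac{\sigma^2}{2} \right) T}{\sigma \sqrt{T}} \right), \quad y \ge 0.$$
• The inverse of that is much harder to calculate.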
Variance Reduction
• The statistical efficiency of Monte Carlo simulation can be measured by the variance of its output.
• If this variance can be lowered without changing the expected value, fewer replications are needed.
• Methods that improve efficiency in this manner are called variance-reduction techniques.
• Such techniques become practical when the added costs are outweighed by the reduction in sampling.
Variance Reduction: Antithetic Variates
• We are interested in estimating E[ g(X1, X2, . . . , Xn) ].
• Let Y1 and Y2 be random variables with the same distribution as g(X1, X2, . . . , Xn).
• Then
$$\mathrm{Var}\left[ \frac{Y_1 + Y_2}{2} \right] = \frac{\mathrm{Var}[\, Y_1 \,]}{2} + \frac{\mathrm{Cov}[\, Y_1, Y_2 \,]}{2}.$$
– Var[ Y1 ]/2 is the variance of the Monte Carlo method with two independent replications.
• The variance Var[ (Y1 + Y2)/2 ] is smaller than Var[ Y1 ]/2 when Y1 and Y2 are negatively correlated.
Variance Reduction: Antithetic Variates (continued)
• For each simulated sample path X, a second one is obtained by reusing the random numbers on which the first path is based.
• This yields a second sample path Y .
• Two estimates are then obtained: One based on X and the other on Y .
• If N independent sample paths are generated, the antithetic-variates estimator averages over 2N estimates.
Variance Reduction: Antithetic Variates (continued)
• Consider the process $dX = a_t \, dt + b_t \sqrt{dt} \; \xi$.
• Let g be a function of n samples X1, X2, . . . , Xn on the sample path.
• We are interested in E[ g(X1, X2, . . . , Xn) ].
• Suppose one simulation run has realizations
ξ1, ξ2, . . . , ξn for the normally distributed fluctuation term ξ.
• This generates samples x1, x2, . . . , xn.
• The estimate is then g(x), where $x \triangleq (x_1, x_2, \ldots, x_n)$.
Variance Reduction: Antithetic Variates (concluded)
• The antithetic-variates method does not sample n more numbers from ξ for the second estimate g(x′).
• Instead, generate the sample path $x' \triangleq (x'_1, x'_2, \ldots, x'_n)$ from −ξ1, −ξ2, . . . , −ξn.
• Compute g(x′).
• Output (g(x) + g(x′))/2.
• Repeat the above steps as many times as required by accuracy.
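The steps above can be illustrated with a small Python sketch. For simplicity it prices a European call from a single terminal draw (n = 1). The contract parameters, function names, and `seed` argument are assumptions of this example. The second function averages two independent replications so that both estimators use two payoff evaluations per output.

```python
import math
import random
import statistics


def call_payoff(z, s0=100.0, x=100.0, r=0.05, sigma=0.2, t=1.0):
    """Discounted call payoff driven by one standard normal draw z."""
    st = s0 * math.exp((r - 0.5 * sigma ** 2) * t + sigma * math.sqrt(t) * z)
    return math.exp(-r * t) * max(st - x, 0.0)


def antithetic_estimates(n_pairs, seed=0):
    """Each output averages the payoffs from z and its mirror image -z."""
    rng = random.Random(seed)
    out = []
    for _ in range(n_pairs):
        z = rng.gauss(0.0, 1.0)
        out.append(0.5 * (call_payoff(z) + call_payoff(-z)))
    return out


def independent_pair_estimates(n_pairs, seed=0):
    """Each output averages the payoffs from two independent draws."""
    rng = random.Random(seed)
    return [
        0.5 * (call_payoff(rng.gauss(0.0, 1.0)) + call_payoff(rng.gauss(0.0, 1.0)))
        for _ in range(n_pairs)
    ]
```

Comparing `statistics.variance` of the two lists shows the effect of the negative correlation between the payoffs from z and −z.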
Variance Reduction: Conditioning
• We are interested in estimating E[ X ].
• Suppose there is a random variable Z such that E[ X | Z = z ] can be efficiently and precisely computed.
• E[ X ] = E[ E[ X | Z ] ] by the law of iterated conditional expectations.
• Hence the random variable E[ X | Z ] is also an unbiased estimator of E[ X ].
Variance Reduction: Conditioning (concluded)
• As
Var[ E[ X | Z ] ] ≤ Var[ X ],
the estimator E[ X | Z ] has a variance no larger than that of observing X directly.
• First, obtain a random observation z on Z.
• Then calculate E[ X | Z = z ] as our estimate.
– There is no need to resort to simulation in computing E[ X | Z = z ].
• The procedure can be repeated a few times to reduce the variance.
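A toy illustration of conditioning, not taken from the slides: let X be the indicator of Z + W > c with Z and W independent standard normals. Then E[ X | Z = z ] = 1 − Φ(c − z) is available in closed form. All names in the sketch are assumptions of this example.

```python
import math
import random
import statistics


def norm_cdf(x):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def crude_and_conditional(c, n, seed=0):
    """Return n crude observations of X and n observations of E[X | Z]."""
    rng = random.Random(seed)
    crude, cond = [], []
    for _ in range(n):
        z = rng.gauss(0.0, 1.0)
        w = rng.gauss(0.0, 1.0)
        crude.append(1.0 if z + w > c else 0.0)   # X itself
        cond.append(1.0 - norm_cdf(c - z))        # E[X | Z = z] = Prob[W > c - z]
    return crude, cond
```

Both lists have the same expectation, Prob[ Z + W > c ], but the second has the smaller sample variance. The Brownian bridge adjustment for barrier options is an instance of the same idea.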
Control Variates
• Use the analytic solution of a “similar” yet “simpler” problem to improve the solution.
• Suppose we want to estimate E[ X ] and there exists a random variable Y with a known mean μ ≜ E[ Y ].
• Then W ≜ X + β(Y − μ) can serve as a “controlled” estimator of E[ X ] for any constant β.
– However β is chosen, W remains an unbiased estimator of E[ X ] as
E[ W ] = E[ X ] + βE[ Y − μ ] = E[ X ].
Control Variates (continued)
• Note that
Var[ W ] = Var[ X ] + β² Var[ Y ] + 2β Cov[ X, Y ]. (118)
• Hence W is less variable than X if and only if
β² Var[ Y ] + 2β Cov[ X, Y ] < 0. (119)
Control Variates (concluded)
• The success of the scheme clearly depends on both β and the choice of Y .
– American options can be priced by choosing Y to be the otherwise identical European option and μ the Black-Scholes formula (Hull & White, 1988).
– Arithmetic Asian options can be priced by choosing Y to be the otherwise identical geometric Asian option’s price and β = −1.
• This approach is much more effective than the antithetic-variates method.
Choice of Y
• In general, the choice of Y is ad hoc,a and experiments must be performed to confirm the wisdom of the choice.
• Try to match calls with calls and puts with puts.b
• On many occasions, Y is a discretized version of the derivative that gives μ.
– Discretely monitored geometric Asian option vs. the continuously monitored version.c
• The discrepancy can be large (e.g., lookback options).d
aBut see Dai (B82506025, R86526008, D8852600), C. Chiu (B90201037, R94922072), & Lyuu (2015, 2018).
bContributed by Ms. Teng, Huei-Wen (R91723054) on May 25, 2004.
cPriced by formulas (55) on p. 434.
Optimal Choice of β
• Equation (118) on p. 885 is minimized when β = −Cov[ X, Y ]/Var[ Y ].
– It is called beta in the book.
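• For this specific β,
$$\mathrm{Var}[\, W \,] = \mathrm{Var}[\, X \,] - \frac{\mathrm{Cov}[\, X, Y \,]^2}{\mathrm{Var}[\, Y \,]} = \left( 1 - \rho_{X,Y}^2 \right) \mathrm{Var}[\, X \,],$$
where $\rho_{X,Y}$ is the correlation between X and Y.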
Optimal Choice of β (continued)
• Note that the variance can never be increased with the optimal choice.
• Furthermore, the stronger X and Y are correlated, the greater the reduction in variance.
• For example, if this correlation is nearly perfect (±1), we could control X almost exactly.
Optimal Choice of β (continued)
• Typically, neither Var[ Y ] nor Cov[ X, Y ] is known.
• Therefore, we cannot obtain the maximum reduction in variance.
• We can guess these values and hope that the resulting W does indeed have a smaller variance than X.
• A second possibility is to use the simulated data to estimate these quantities.
– How to do it efficiently in terms of time and space?
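One possible answer to the time and space question is to accumulate running sums in a single pass, which needs constant memory. The sketch below does this for a toy problem chosen for this example (not from the slides): X = e^U with control Y = U, where U is uniform on (0, 1) and μ = 1/2. Estimating β from the same sample introduces a small bias, and naive sums of squares can lose precision for badly scaled data, so treat this as an illustration only.

```python
import math
import random


def control_variate_estimate(n, seed=0):
    """Estimate E[e^U] for U ~ Uniform(0, 1) using Y = U as the control variate."""
    rng = random.Random(seed)
    mu = 0.5                                    # known mean of the control Y
    sx = sy = sxx = syy = sxy = 0.0
    for _ in range(n):                          # one pass, constant memory
        u = rng.random()
        x, y = math.exp(u), u
        sx += x
        sy += y
        sxx += x * x
        syy += y * y
        sxy += x * y
    mean_x, mean_y = sx / n, sy / n
    var_x = sxx / n - mean_x ** 2
    var_y = syy / n - mean_y ** 2
    cov_xy = sxy / n - mean_x * mean_y
    beta = -cov_xy / var_y                      # sample version of the optimal beta
    estimate = mean_x + beta * (mean_y - mu)    # sample mean of W = X + beta (Y - mu)
    var_w = var_x - cov_xy ** 2 / var_y         # estimated Var[W] per observation
    return estimate, beta, var_x, var_w
```

Here the exact answer is e − 1, and because e^U and U are strongly positively correlated, the estimated β is negative and the estimated Var[ W ] is far below Var[ X ].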
Optimal Choice of β (concluded)
• Observe that −β has the same sign as the correlation between X and Y .
• Hence, if X and Y are positively correlated, then β < 0, and X is adjusted downward whenever Y > μ and upward otherwise.
• The opposite is true when X and Y are negatively correlated, in which case β > 0.
• Suppose a suboptimal β + ε is used instead.
• The variance increases by only ε² Var[ Y ] (Han & Y. Lai, 2010).
A Pitfall
• A potential pitfall is to sample X and Y independently.
• In this case, Cov[ X, Y ] = 0.
• Equation (118) on p. 885 becomes
Var[ W ] = Var[ X ] + β2 Var[ Y ].
• So whatever Y is, the variance is increased!
• Lesson: X and Y must be correlated.
Problems with the Monte Carlo Method
• The error bound is only probabilistic.
• The probabilistic error bound of O(1/√N) does not benefit from regularity of the integrand function.
• The requirement that the points be independent random samples is wasteful because of clustering.
• In reality, pseudorandom numbers generated by completely deterministic means are used.
• Monte Carlo simulation exhibits a great sensitivity to the seed of the pseudorandom-number generator.
Matrix Computation
To set up a philosophy against physics is rash;
philosophers who have done so have always ended in disaster.
— Bertrand Russell
Definitions and Basic Results
• Let $A \triangleq [\, a_{ij} \,]_{1 \le i \le m,\, 1 \le j \le n}$, or simply $A \in \mathbb{R}^{m \times n}$, denote an m × n matrix.
• It can also be represented as $[\, a_1, a_2, \ldots, a_n \,]$ where $a_i \in \mathbb{R}^m$ are vectors.
– Vectors are column vectors unless stated otherwise.
• A is a square matrix when m = n.
• The rank of a matrix is the largest number of linearly independent columns.
Definitions and Basic Results (continued)
• A square matrix A is said to be symmetric if $A^{\mathrm{T}} = A$.
• A real n × n matrix $A \triangleq [\, a_{ij} \,]_{i,j}$ is diagonally dominant if $|\, a_{ii} \,| > \sum_{j \ne i} |\, a_{ij} \,|$ for 1 ≤ i ≤ n.
– Such matrices are nonsingular.
• The identity matrix is the square matrix $I \triangleq \mathrm{diag}[\, 1, 1, \ldots, 1 \,]$.
Definitions and Basic Results (concluded)
• A matrix has full column rank if its columns are linearly independent.
• A real symmetric matrix A is positive definite if
$$x^{\mathrm{T}} A x = \sum_{i,j} a_{ij} x_i x_j > 0$$
for any nonzero vector x.
• A matrix A is positive definite if and only if there exists a matrix W such that $A = W^{\mathrm{T}} W$ and W has full column rank.
Cholesky Decomposition
• Positive definite matrices can be factored as $A = L L^{\mathrm{T}}$, called the Cholesky decomposition.
– Above, L is a lower triangular matrix.
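For illustration, a textbook row-by-row Cholesky factorization can be sketched in a few lines of Python. The function name and the use of nested lists for matrices are assumptions of this example. A production code would normally call a numerical library instead.

```python
import math


def cholesky(a):
    """Return the lower triangular L with a = L L^T for a symmetric
    positive definite matrix a given as a list of rows."""
    n = len(a)
    l = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(l[i][k] * l[j][k] for k in range(j))
            if i == j:
                d = a[i][i] - s
                if d <= 0.0:
                    raise ValueError("matrix is not positive definite")
                l[i][j] = math.sqrt(d)
            else:
                l[i][j] = (a[i][j] - s) / l[j][j]
    return l
```

For the 2 × 2 correlation matrix with off-diagonal entry 0.6, the result has rows [1, 0] and [0.6, 0.8].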
Generation of Multivariate Distribution
• Let $x \triangleq [\, x_1, x_2, \ldots, x_n \,]^{\mathrm{T}}$ be a vector random variable with a positive definite covariance matrix C.
• As usual, assume E[ x ] = 0.
• This covariance structure can be matched by P y.
– $y \triangleq [\, y_1, y_2, \ldots, y_n \,]^{\mathrm{T}}$ is a vector random variable with a covariance matrix equal to the identity matrix.
– $C = P P^{\mathrm{T}}$ is the Cholesky decomposition of C.
– What if C is not positive definite? See Y. Y. Lai (R93942114) & Lyuu (2007).
Generation of Multivariate Distribution (concluded)
• For example, suppose
$$C = \begin{bmatrix} 1 & \rho \\ \rho & 1 \end{bmatrix}.$$
• Then
$$P = \begin{bmatrix} 1 & 0 \\ \rho & \sqrt{1 - \rho^2} \end{bmatrix}$$
as $P P^{\mathrm{T}} = C$ (recall Eq. (29) on p. 178).
Generation of Multivariate Normal Distribution
• Suppose we want to generate the multivariate normal distribution with a covariance matrix C = P PT.
– First, generate independent standard normal random variables y1, y2, . . . , yn.
– Then $P \, [\, y_1, y_2, \ldots, y_n \,]^{\mathrm{T}}$ has the desired distribution.
– These steps can then be repeated.
Multivariate Derivatives Pricing
• Generating the multivariate normal distribution is essential for the Monte Carlo pricing of multivariate derivatives (pp. 797ff).
• For example, the rainbow option on k assets has payoff max(max(S1, S2, . . . , Sk) − X, 0)
at maturity.
• The closed-form formula is a multi-dimensional integral.a
aJohnson (1987); C. Y. Chen (D95723006) & Lyuu (2009).
Multivariate Derivatives Pricing (concluded)
• Suppose $dS_j/S_j = r \, dt + \sigma_j \, dW_j$, 1 ≤ j ≤ k, where C is the correlation matrix for $dW_1, dW_2, \ldots, dW_k$.
• Let $C = P P^{\mathrm{T}}$.
• Let ξ consist of k independent random variables from N(0, 1).
• Let ξ′ = P ξ.
• Similar to Eq. (117) on p. 841, for each asset 1 ≤ j ≤ k, the price of asset j moves from $S_i$ to $S_{i+1}$ by
$$S_{i+1} = S_i \, e^{(r - \sigma_j^2/2)\, \Delta t + \sigma_j \sqrt{\Delta t} \; \xi'_j}.$$
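For k = 2 the recipe above can be sketched as follows, using the explicit 2 × 2 factor with rows [1, 0] and [ρ, √(1 − ρ²)]. The function and argument names are assumptions of this example. A single time step to maturity is used because the rainbow payoff depends only on the terminal prices.

```python
import math
import random


def rainbow_max_call(s1, s2, x, r, sigma1, sigma2, rho, t, n_paths, seed=0):
    """Monte Carlo estimate of a call on the maximum of two correlated assets."""
    rng = random.Random(seed)          # seeded generator (an assumption, for reproducibility)
    sq = math.sqrt(1.0 - rho * rho)
    total = 0.0
    for _ in range(n_paths):
        z1 = rng.gauss(0.0, 1.0)
        z2 = rng.gauss(0.0, 1.0)
        e1 = z1                        # first row of P times (z1, z2)
        e2 = rho * z1 + sq * z2        # second row of P times (z1, z2)
        a = s1 * math.exp((r - 0.5 * sigma1 ** 2) * t + sigma1 * math.sqrt(t) * e1)
        b = s2 * math.exp((r - 0.5 * sigma2 ** 2) * t + sigma2 * math.sqrt(t) * e2)
        total += max(max(a, b) - x, 0.0)
    return math.exp(-r * t) * total / n_paths
```

With ρ = 1 and identical assets the estimate collapses to that of a vanilla call, and lowering ρ raises the value of the option on the maximum.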
Least-Squares Problems
• The least-squares (LS) problem is concerned with
$$\min_{x \in \mathbb{R}^n} \| Ax - b \|,$$
where $A \in \mathbb{R}^{m \times n}$, $b \in \mathbb{R}^m$, and m ≥ n.
• The LS problem is called regression analysis in statistics and is equivalent to minimizing the mean-square error.
• It is often written as
Ax = b.
Polynomial Regression
• In polynomial regression, $x_0 + x_1 x + \cdots + x_n x^n$ is used to fit the data $\{ (a_1, b_1), (a_2, b_2), \ldots, (a_m, b_m) \}$.
• This leads to the LS problem,
$$\begin{bmatrix} 1 & a_1 & a_1^2 & \cdots & a_1^n \\ 1 & a_2 & a_2^2 & \cdots & a_2^n \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 1 & a_m & a_m^2 & \cdots & a_m^n \end{bmatrix} \begin{bmatrix} x_0 \\ x_1 \\ \vdots \\ x_n \end{bmatrix} = \begin{bmatrix} b_1 \\ b_2 \\ \vdots \\ b_m \end{bmatrix}.$$
• Consult p. 273 of the textbook for solutions.
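One simple way to solve this LS problem, shown here only as a sketch, is through the normal equations AᵀA x = Aᵀb with Gaussian elimination. The function names are assumptions of this example. The normal equations become ill-conditioned for high degrees or badly scaled data, in which case an orthogonalization method such as QR is usually preferred.

```python
def solve_linear(m, rhs):
    """Solve a square linear system by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [v] for row, v in zip(m, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda row: abs(aug[row][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(col + 1, n):
            factor = aug[row][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[row][c] -= factor * aug[col][c]
    x = [0.0] * n
    for row in range(n - 1, -1, -1):
        tail = sum(aug[row][c] * x[c] for c in range(row + 1, n))
        x[row] = (aug[row][n] - tail) / aug[row][row]
    return x


def polyfit(a, b, degree):
    """Least-squares coefficients x_0, ..., x_degree for the data (a_i, b_i)."""
    rows = [[ai ** k for k in range(degree + 1)] for ai in a]   # the matrix A above
    size = degree + 1
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(size)] for i in range(size)]
    atb = [sum(r[i] * bi for r, bi in zip(rows, b)) for i in range(size)]
    return solve_linear(ata, atb)
```

Fitting data generated exactly by a quadratic with `degree=2` recovers its three coefficients up to rounding error.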
American Option Pricing by Simulation
• The continuation value of an American option is the conditional expectation of the payoff from keeping the option alive now.
• The option holder must compare the immediate exercise value and the continuation value.
• In standard Monte Carlo simulation, each path is treated independently of other paths.
• But the decision to exercise the option cannot be reached by looking at one path alone.
The Least-Squares Monte Carlo Approach
• The continuation value can be estimated from the cross-sectional information in the simulation by using least squares.a
• The result is a function (of the state) for estimating the continuation values.
• Use the function to estimate the continuation value for each path to determine its cash flow.
• This is called the least-squares Monte Carlo (LSM) approach.
aLongstaff & Schwartz (2001).
The Least-Squares Monte Carlo Approach (concluded)
• The LSM is provably convergent.a
• The LSM can be easily parallelized.b
– Partition the paths into subproblems and perform LSM on each of them independently.
– The speedup is close to linear (i.e., proportional to the number of cores).
• Surprisingly, accuracy is not affected.
aCl´ement, Lamberton, & Protter (2002); Stentoft (2004).
bK. Huang (B96902079, R00922018) (2013); C. W. Chen (B97902046, R01922005) (2014); C. W. Chen (B97902046, R01922005), K. Huang (B96902079, R00922018) & Lyuu (2015).
A Numerical Example
• Consider a 3-year American put on a non-dividend-paying stock.
• The put is exercisable at years 0, 1, 2, and 3.
• The strike price X = 105.
• The annualized riskless rate is r = 5%.
– The annual discount factor hence equals 0.951229.
• The current stock price is 101.
• We use only 8 price paths to illustrate the algorithm.
A Numerical Example (continued)
Stock price paths
Path  Year 0  Year 1    Year 2    Year 3
1     101     97.6424   92.5815   107.5178
2     101     101.2103  105.1763  102.4524
3     101     105.7802  103.6010  124.5115
4     101     96.4411   98.7120   108.3600
5     101     124.2345  101.0564  104.5315
6     101     95.8375   93.7270   99.3788
7     101     108.9554  102.4177  100.9225
8     101     104.1475  113.2516  115.0994
[Figure: the 8 stock price paths; horizontal axis: year (0 to 3), vertical axis: stock price (95 to 125).]
A Numerical Example (continued)
• We use the basis functions 1, x, x².
– Other basis functions are possible, such as Laguerre, Hermite, Legendre, Chebyshev, Gegenbauer, and Jacobi polynomials.
• The plot next page shows the final estimated optimal exercise strategy given by LSM.
• We now proceed to tackle our problem.
• The idea is to calculate the cash flow along each path, using information from all paths.
[Figure: the final estimated optimal exercise strategy given by LSM on the 8 paths; horizontal axis: year (0 to 3), vertical axis: stock price (95 to 125).]
A Numerical Example (continued)
Cash flows at year 3
Path Year 0 Year 1 Year 2 Year 3
1 — — — 0
2 — — — 2.5476
3 — — — 0
4 — — — 0
5 — — — 0.4685
6 — — — 5.6212
7 — — — 4.0775
8 — — — 0
A Numerical Example (continued)
• The cash flows at year 3 are the exercise value if the put is in the money.
• Only 4 paths are in the money: 2, 5, 6, 7.
• Some of the cash flows may not occur if the put is exercised earlier, which we will find out step by step.
• Incidentally, the European counterpart has a value of
$$0.951229^3 \times \frac{2.5476 + 0.4685 + 5.6212 + 4.0775}{8} = 1.3680.$$
A Numerical Example (continued)
• We move on to year 2.
• For each state that is in the money at year 2, we must decide whether to exercise it.
• There are 6 paths for which the put is in the money: 1, 3, 4, 5, 6, 7 (p. 911).
• Only in-the-money paths will be used in the regression because they are where early exercise is relevant.
– If there were none, we would move on to year 1.
A Numerical Example (continued)
• Let x denote the stock prices at year 2 for those 6 paths.
• Let y denote the corresponding discounted future cash flows (at year 3) if the put is not exercised at year 2.
A Numerical Example (continued)
Regression at year 2
Path  x         y
1     92.5815   0 × 0.951229
2     -         -
3     103.6010  0 × 0.951229
4     98.7120   0 × 0.951229
5     101.0564  0.4685 × 0.951229
6     93.7270   5.6212 × 0.951229
7     102.4177  4.0775 × 0.951229
8     -         -
A Numerical Example (continued)
• We regress y on 1, x, and x².
• The result is
f(x) = 22.08 − 0.313114 × x + 0.00106918 × x².
• f(x) estimates the continuation value conditional on the stock price at year 2.
• We next compare the immediate exercise value and the continuation value.a
aThe f(102.4177) entry on the next page was corrected by Mr. Tu, Yung-Szu (B79503054, R83503086) on May 25, 2017.
A Numerical Example (continued)
Optimal early exercise decision at year 2
Path  Exercise  Continuation
1     12.4185   f(92.5815) = 2.2558
2     -         -
3     1.3990    f(103.6010) = 1.1168
4     6.2880    f(98.7120) = 1.5901
5     3.9436    f(101.0564) = 1.3568
6     11.2730   f(93.7270) = 2.1253
7     2.5823    f(102.4177) = 1.2266
8     -         -
A Numerical Example (continued)
• Amazingly, the put should be exercised in all 6 paths: 1, 3, 4, 5, 6, 7.
• Now, any positive cash flow at year 3 should be set to zero or overridden for these paths as the put is exercised before year 3 (p. 911).
– They are paths 5, 6, 7.
• The cash flows on p. 915 become the ones on next slide.
A Numerical Example (continued)
Cash flows at years 2 & 3
Path Year 0 Year 1 Year 2 Year 3
1 — — 12.4185 0
2 — — 0 2.5476
3 — — 1.3990 0
4 — — 6.2880 0
5 — — 3.9436 0
6 — — 11.2730 0
7 — — 2.5823 0
8 — — 0 0
A Numerical Example (continued)
• We move on to year 1.
• For each state that is in the money at year 1, we must decide whether to exercise it.
• There are 5 paths for which the put is in the money: 1, 2, 4, 6, 8 (p. 911).
• Only in-the-money paths will be used in the regression because they are where early exercise is relevant.
– If there were none, we would move on to year 0.
A Numerical Example (continued)
• Let x denote the stock prices at year 1 for those 5 paths.
• Let y denote the corresponding discounted future cash flows if the put is not exercised at year 1.
• From p. 923, we have the following table.
A Numerical Example (continued)
Regression at year 1
Path  x         y
1     97.6424   12.4185 × 0.951229
2     101.2103  2.5476 × 0.951229²
3     -         -
4     96.4411   6.2880 × 0.951229
5     -         -
6     95.8375   11.2730 × 0.951229
7     -         -
8     104.1475  0
A Numerical Example (continued)
• We regress y on 1, x, and x².
• The result is
f(x) = −420.964 + 9.78113 × x − 0.0551567 × x².
• f(x) estimates the continuation value conditional on the stock price at year 1.
• We next compare the immediate exercise value and the continuation value.
A Numerical Example (continued)
Optimal early exercise decision at year 1
Path  Exercise  Continuation
1     7.3576    f(97.6424) = 8.2230
2     3.7897    f(101.2103) = 3.9882
3     -         -
4     8.5589    f(96.4411) = 9.3329
5     -         -
6     9.1625    f(95.8375) = 9.83042
7     -         -
8     0.8525    f(104.1475) = −0.551885
A Numerical Example (continued)
• The put should be exercised for 1 path only: 8.
– Note that f(104.1475) < 0.
• Now, any positive future cash flow should be set to zero or overridden for this path.
– But there is none.
• The cash flows on p. 923 become the ones on next slide.
• They also confirm the plot on p. 914.
A Numerical Example (continued)
Cash flows at years 1, 2, & 3
Path Year 0 Year 1 Year 2 Year 3
1 — 0 12.4185 0
2 — 0 0 2.5476
3 — 0 1.3990 0
4 — 0 6.2880 0
5 — 0 3.9436 0
6 — 0 11.2730 0
7 — 0 2.5823 0
8 — 0.8525 0 0
A Numerical Example (continued)
• We move on to year 0.
• The continuation value is, from p. 930,
$$\frac{(12.4185 + 1.3990 + 6.2880 + 3.9436 + 11.2730 + 2.5823) \times 0.951229^2 + 2.5476 \times 0.951229^3 + 0.8525 \times 0.951229}{8} = 4.66263.$$
A Numerical Example (concluded)
• As this is larger than the immediate exercise value of 105 − 101 = 4,
the put should not be exercised at year 0.
• Hence the put’s value is estimated to be 4.66263.
• Compare this with the European put’s value of 1.3680 (p. 916).
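The whole numerical example can be replayed with a short Python sketch. It hard-codes the 8 paths, the strike, and the discount factor from the tables above. The function names, the centering of x before the regression (a numerical-stability choice that does not change the fitted values), and the rule of skipping a regression with fewer than 3 in-the-money paths are assumptions of this sketch rather than part of the LSM description.

```python
DISCOUNT = 0.951229   # annual discount factor from the example
STRIKE = 105.0
PATHS = [
    [101, 97.6424, 92.5815, 107.5178],
    [101, 101.2103, 105.1763, 102.4524],
    [101, 105.7802, 103.6010, 124.5115],
    [101, 96.4411, 98.7120, 108.3600],
    [101, 124.2345, 101.0564, 104.5315],
    [101, 95.8375, 93.7270, 99.3788],
    [101, 108.9554, 102.4177, 100.9225],
    [101, 104.1475, 113.2516, 115.0994],
]


def solve3(a, b):
    """Solve a 3x3 linear system by Gaussian elimination with partial pivoting."""
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(3):
        pivot = max(range(col, 3), key=lambda row: abs(m[row][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for row in range(col + 1, 3):
            factor = m[row][col] / m[col][col]
            for c in range(col, 4):
                m[row][c] -= factor * m[col][c]
    x = [0.0] * 3
    for row in (2, 1, 0):
        tail = sum(m[row][c] * x[c] for c in range(row + 1, 3))
        x[row] = (m[row][3] - tail) / m[row][row]
    return x


def fit_quadratic(xs, ys):
    """Least-squares fit on the span of 1, x, x^2, computed in centered coordinates."""
    mean = sum(xs) / len(xs)
    rows = [[1.0, x - mean, (x - mean) ** 2] for x in xs]
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    atb = [sum(r[i] * y for r, y in zip(rows, ys)) for i in range(3)]
    c = solve3(ata, atb)
    return lambda x: c[0] + c[1] * (x - mean) + c[2] * (x - mean) ** 2


def lsm_put(paths, strike, discount):
    """Least-squares Monte Carlo value of an American put exercisable at each year."""
    last = len(paths[0]) - 1
    # cash[i] = (year, amount): the single cash flow currently assigned to path i
    cash = [(last, max(strike - p[last], 0.0)) for p in paths]
    for t in range(last - 1, 0, -1):
        itm = [i for i, p in enumerate(paths) if strike - p[t] > 0.0]
        if len(itm) < 3:               # too few points for a quadratic fit (assumption)
            continue
        xs = [paths[i][t] for i in itm]
        ys = [cash[i][1] * discount ** (cash[i][0] - t) for i in itm]
        f = fit_quadratic(xs, ys)
        for i in itm:
            exercise = strike - paths[i][t]
            if exercise > f(paths[i][t]):
                cash[i] = (t, exercise)    # overrides any later cash flow
    continuation = sum(amount * discount ** year for year, amount in cash) / len(paths)
    return max(strike - paths[0][0], continuation)
```

Calling `lsm_put(PATHS, STRIKE, DISCOUNT)` exercises paths 1, 3, 4, 5, 6, 7 at year 2 and path 8 at year 1, and returns approximately 4.6626, in line with the value derived above.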