
$\ell_1$-Penalized Likelihood Smoothing of Volatility Processes allowing for Abrupt Changes

David Neto$^{(a)}$, Sylvain Sardy$^{(b)}$ and Paul Tseng$^{(c)}$

April 2009

Abstract

We consider the problem of estimating the volatility of a financial asset from a time series record of length $T$. We believe the underlying volatility process is smooth, possibly stationary, and with potential abrupt changes due to market news. By drawing parallels between time series and regression models, in particular between stochastic volatility models and Markov random field smoothers, we propose a semiparametric estimator of volatility. Our Bayesian posterior mode estimate is the solution to an $\ell_1$-penalized likelihood optimization that we solve with an interior point algorithm that is efficient, since its complexity is bounded by $O(T^{3/2})$. We apply our volatility estimator to real financial data, diagnose the model, and perform back-testing to investigate the forecasting power of the method by comparison to (I)GARCH.

Keywords: $\ell_1$ penalty, Markov random field, stochastic volatility model, smoothing, wavelet, extreme value theory, forecasting, GARCH.

JEL classification: C14, C22, C61.

(a) Department of Econometrics, University of Geneva, CH-1211 Geneva 4, Switzerland. E-mail: david.neto@unige.ch.

(b) Department of Mathematics, University of Geneva, CH-1211 Geneva 4, Switzerland. E-mail: Sylvain.Sardy@unige.ch.

(c) Department of Mathematics, University of Washington, Seattle, WA. E-mail: tseng@math.washington.edu.


1 Introduction

Suppose we observe values $y_t$ of a financial asset (stock returns, interest rates or exchange rates) at regularly spaced times $t = 1, \ldots, T$. Our goal is to estimate an important intrinsic characteristic of the financial asset of interest, the evolution of the conditional variance of the stochastic return process, so as to assess the past, present and future risk of the asset. To that aim, many stochastic models have been proposed. Standard among them is to assume a data generating process for $y_t$. Most models assume $y_t = \sigma_t \epsilon_t$, where $\sigma_t$ is a measure of volatility and $\epsilon_t$ is standard Gaussian white noise. Without additional assumptions, the maximum likelihood estimate is $\hat\sigma_t^{\mathrm{MLE}} = |y_t|$, which is practically useless due to its nonsmoothness (i.e., high variability).

A temporal structure for $\sigma_t$ is assumed to regularize the maximum likelihood estimation, so as to obtain a smoother estimate of $\sigma_t$ while capturing the stylized features (Rydberg 2000) observed in financial econometrics, like heavy tails of the marginal distributions, volatility clustering, or evolution with possible abrupt changes (e.g., peaks, discontinuities). The popular GARCH-type models (Engle 1982, Bollerslev 1986) are parametric and enjoy good estimation properties. Stochastic volatility models (Taylor 1986, Taylor 1994) are powerful semiparametric alternatives which fit the stylized features more flexibly. In particular, the log-normal stochastic volatility model seems to capture the leptokurticity of the marginal distributions of financial data better than the standard GARCH model (Shephard 1996). Stochastic volatility models are the empirical discrete-time analogs of continuous-time models in finance theory and, in particular, option pricing (Hull and White 1987). They can also be viewed as an Euler discretization of a diffusion. The general discrete model of Andersen (1994) includes GARCH and stochastic volatility models as particular cases. It defines the discrete polynomial stochastic autoregressive volatility process as:

$$y_t = \sigma_t \epsilon_t, \qquad (1)$$
$$\sigma_t^q = \varphi(h_t), \qquad (2)$$
$$h_t = \omega + \phi h_{t-1} + (\gamma + \psi h_{t-1})\eta_t, \qquad (3)$$

where $q \in \{1, 2\}$, $\varphi(\cdot)$ is a positive, continuous and strictly monotone function, the autoregressive process $h_t$ has positive coefficients $\psi, \phi, \gamma$ such that $\psi + \phi > 0$, and the errors $\eta_t$ and $\epsilon_t$ are i.i.d. and mutually independent with mean-variance pairs $(1, \sigma_\eta^2)$ and $(0, 1)$, respectively. This process includes, for instance, the GARCH process (corresponding to $q = 2$, $\varphi(h_t) = h_t$, $\eta_t = \epsilon_{t-1}^2$ and $\gamma = 0$) and the stochastic volatility model (SVM), corresponding to $\varphi(h_t) = \exp(h_t)$, $q = 1$ and $\psi = 0$. Often $\eta_t$ and $\epsilon_t$ are assumed to be Gaussian.
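To make the data generating process concrete, the following minimal sketch simulates the SVM special case of (1)–(3) ($q = 1$, $\varphi(\cdot) = \exp(\cdot)$, $\psi = 0$) with Gaussian innovations; the parameter values are illustrative and not taken from the paper.

```python
import numpy as np

# Sketch of the SVM special case of (1)-(3): q = 1, varphi = exp, psi = 0,
# so h_t = omega + phi*h_{t-1} + gamma*eta_t. Parameter values are illustrative.
rng = np.random.default_rng(0)
T, omega, phi, gamma = 1000, 0.0, 0.95, 0.2

h = np.zeros(T)
for t in range(1, T):
    h[t] = omega + phi * h[t - 1] + gamma * rng.standard_normal()  # AR(1) log-volatility

sigma = np.exp(h)                    # sigma_t = varphi(h_t), equation (2) with q = 1
y = sigma * rng.standard_normal(T)   # y_t = sigma_t * eps_t, equation (1)
```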


One challenge for these models has been how to infer the parameters $(\psi, \phi, \gamma, \omega)$ and how to estimate the hidden conditional volatility $\sigma_t$ within the sample (smoothing) and out of sample (forecasting). While parametric GARCH models have been relatively easy to fit owing to the small number of parameters, this is not the case for semiparametric SVMs, which represent the returns as a nonlinear hidden Markov chain. Several methods have been proposed to estimate the latter data generating process, many of which are based on calculating or estimating the conditional expectations

$$(\hat\psi, \hat\phi, \hat\gamma, \hat\omega) = \mathrm{E}\big((\psi, \phi, \gamma, \omega) \mid y; \sigma_\eta^2\big) \quad \text{and} \quad \hat h_t = \mathrm{E}\big(h_t \mid y; \sigma_\eta^2\big), \quad t = 1, \ldots, T, \qquad (4)$$

where $y = (y_1, \ldots, y_T)$ are the returns up to time $T$. For a linear and Gaussian data generating process, the Kalman filter calculates a closed form expression for the posterior means (4). The quasi-maximum likelihood approach of Harvey, Ruiz and Shephard (1994) applies Kalman filtering to the logarithm of the squared observations, $\log y_t^2 = 2h_t + \zeta_t$, where $\zeta_t = \log \epsilon_t^2$. However, the data generating process is non-Gaussian, and the poor approximation of $\zeta_t$ by a Gaussian distribution leads to poor estimation of $\sigma_t$, particularly when the variance $\sigma_\eta^2$ of the log-volatility innovations $\eta_t$ in (3) is small (Jacquier, Polson and Rossi 1994). Another possible approach is to generate by Monte Carlo an ergodic chain from the exact posterior distribution given the observations (Jacquier et al. 1994) and to calculate the componentwise averages to estimate the conditional expectations (4). However, this may require generating long sample paths before reaching convergence. Estimating the variance $\sigma_\eta^2$ of the log-volatility innovations also remains a difficult issue.

Based on a GARCH model, the estimated persistence of the conditional volatilities of many financial time series is seemingly high, which also makes the inference difficult. For example, the second moment of an Integrated GARCH (Engle and Bollerslev 1986) does not exist even if the stationarity conditions are met. One also wonders whether the volatility process is really stationary, and in fact it is now believed that such strong persistence is due to model misspecification. For instance, Diebold (1986) argued that the persistence may be due to the instability of the unconditional variance. Lastrapes (1989) observed that the persistence of exchange-rate volatility depends on U.S. monetary policy regimes. This statement is confirmed by Lamoureux and Lastrapes (1990), who obtained a substantially lower persistence measure (i.e., the sum of the GARCH parameters, $\psi + \phi$ in equation (3)) when dummy variables, corresponding to periods over which the volatility seems to be stationary, are included in the GARCH model. Cai (1994) and Hamilton and Susmel (1994) used Markov switching ARCH models to take into account the regime effect in


the volatility, and in both cases the persistence is captured by the stochastic process which describes the state of the system, rather than by the GARCH parameters. Another widely used approach (see for instance Chan and Maheu (2002) and Chernov, Gallant, Ghysels and Tauchen (1999)) adds a jump process to capture large discrete changes. However, the empirical evidence suggests that jump models capture additional extremes in the marginal distribution of the returns rather than the persistence. Starica (2003) shares the concerns stated early on by Diebold (1986) about persistence and misspecification, and discusses the danger of assuming global stationarity. Starica and Granger (2005) proposed a nonstationary approach to estimating volatility by considering the log-absolute returns as a locally stationary process as defined in Dahlhaus (1997). Their methodology considers the underlying process as a slowly and smoothly varying process with sharp changes.

The aforementioned results motivate the features we aim to reproduce with our proposed model and estimator, namely that volatility evolves smoothly except for occasional abrupt changes whose transient effect prevents the market from returning quickly to its pre-shock level. However, while Starica and Granger's approach is based on successive stationarity tests over intervals of varying width to identify the times of regime switching, our $\ell_1$-based estimator finds the desired segmentation via the selection of a single parameter. Moreover, the solution of our $\ell_1$-penalized likelihood optimization problem is calculated with a fast algorithm. This paper is organized as follows. In Section 2.1 we define our volatility estimator and discuss its link to wavelet and Markov random field smoothing. In Section 2.2 we use the connection to smoothers to present a selection rule for the smoothing parameter which guarantees smoothness of the volatility estimate. In Section 2.3 we show that the volatility estimate can be efficiently computed using an interior-point algorithm. Section 2.4 illustrates the performance of the estimator on a simulated time series. In Section 3 we apply our estimator to the DOW JONES and NASDAQ, analyze the results, and evaluate the forecasting performance. Section 4 draws some conclusions and points to future extensions.

2 Smoothing the volatility

2.1 $\ell_1$-penalized likelihood estimator (fSVM)

We consider a discrete-time SVM with Laplace innovations $\eta_t$, whose density $f_\eta(x) = \frac{\lambda}{2}\exp(-\lambda|x|)$ has heavier tails than the Gaussian density, to better reflect sudden changes of regime due to important market news. More precisely,


this model corresponds to (1)–(3) with $\psi = 0$, $q = 1$, $\gamma = 1$ and $\omega = \mu(1 - \phi)$. Hence the variance of the log-volatility innovations is $\sigma_\eta^2 = 2/\lambda^2$, and this SVM is related to the variance gamma (VG) models (Madan and Seneta 1990) for modeling the non-Gaussian nature of stock market returns and future index prices. Indeed the VG model can be expressed by equation (1), where $\sigma_t$ follows a Gamma process, and in its continuous version writes as $S(t) = S(0)\exp(L(t))$, where $S(t)$ are the stock prices and $L(t)$ is a Laplace motion (Kotz, Kozubowski and Podgórski 2001). The Laplace motion is defined as a Brownian motion, denoted $B(\tau)$, evaluated at a random time distributed as a Gamma process, denoted $\gamma(t)$; this Brownian motion is said to be subordinated to the process $\gamma(t)$, so that $L(t) \overset{d}{=} B(\gamma(t))$. Another appeal of the Laplace motion is that it can be written as a compound Poisson process with independent and random jumps; in this sense it is a pure jump process, able to capture abrupt changes. Compared with the VG model, our model includes an extra parameter to capture the persistence in volatility.

Using Bayes' theorem with (1) for the noise model and (2)–(3) for the prior assumption on the volatility process, we derive the posterior distribution of the volatility given the returns $y$. More precisely, by considering the negative log-posterior distribution, we define the maximum a posteriori estimate as the solution to

$$\min_{\phi, \mu, h} \; \sum_{t=1}^{T} \Big[ \log \varphi(h_t) - \log f\big(y_t/\varphi(h_t)\big) \Big] + \lambda \sum_{t=2}^{T} \big| h_t - (\mu + \phi(h_{t-1} - \mu)) \big|, \qquad (5)$$

where $f$ denotes the density function of the error term $\epsilon_t$, $(h_t)_{t=1}^T$ are the $\varphi^{-1}$-transformed volatilities, $\mu$ is an average volatility measure, and $\phi$ is the persistence parameter. The function $\varphi(\cdot)$ is a strictly monotone function that maps the estimand $h_t$ to the volatility $\sigma_t = \varphi(h_t)$. For instance, the exponential function used by the original stochastic volatility model conveniently maps $\mathbb{R}$ into $\mathbb{R}_+$ for a positive volatility. Our estimator has the advantage that the positivity constraint is already enforced by the first logarithmic term in (5), which acts as a barrier against negativity. Hence we can consider a broader class of links, for instance the power transform $h_t = (\sigma_t^\delta - 1)/\delta = \varphi^{-1}(\sigma_t)$ (Box and Cox 1982), which includes as special cases the exponential link (take $\delta \to 0$) and the linear link (take $\delta = 1$).

The first sum in (5) is the negative log-likelihood for (1)–(2), and the second sum stems from the autoregressive process prior (3) with Laplace $\eta$-innovations. The estimator is akin to a Tikhonov regularization (Tikhonov 1963) of the erratic maximum likelihood estimate, but using an $\ell_1$-based penalty and a sufficiently large penalty parameter $\lambda > 0$ to enforce smoothness.
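For concreteness, here is a minimal sketch (ours, not the authors' Matlab code) of the objective in (5) for Gaussian $\epsilon$-innovations and the exponential link, for which the $t$-th likelihood term reduces to $h_t + \frac{1}{2}y_t^2 e^{-2h_t}$ up to a constant:

```python
import numpy as np

def fsvm_objective(h, mu, phi, y, lam):
    """Objective (5) for Gaussian eps-innovations and the link exp(.):
    the t-th likelihood term is h_t + 0.5*y_t^2*exp(-2*h_t), up to a constant,
    plus the lambda-weighted l1 penalty on the AR(1)-prior residuals."""
    nll = np.sum(h + 0.5 * y**2 * np.exp(-2.0 * h))
    penalty = lam * np.sum(np.abs(h[1:] - (mu + phi * (h[:-1] - mu))))
    return nll + penalty
```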


The $\ell_1$-penalized likelihood formulation of the estimator (5) draws a connection to two nonparametric function estimation techniques. One is Markov random field smoothing (Geman and Geman 1984, Besag 1986, Sardy and Tseng 2004), which solves the similar problem

$$\min_h \; -l(h; y) + \lambda \sum_{t=2}^{T} |h_t - h_{t-1}|,$$

where $-l(\cdot\,;\cdot)$ is the negative log-likelihood of a noisy signal/image and $h = (h_1, \ldots, h_T)$. Another is wavelet smoothing (Donoho and Johnstone 1994). For instance, soft-Waveshrink, for a wavelet matrix $\Phi$, solves

$$\min_{h = \Phi\alpha} \; \frac{1}{2}\|y - h\|_2^2 + \lambda\|\alpha\|_1,$$

where $\alpha$ are the wavelet coefficients (Donoho, Johnstone, Hoch and Stern 1992).

The main appeal of soft-Waveshrink is that it has near minimax properties for a class of loss functions and smoothness classes (Donoho, Johnstone, Kerkyacharian and Picard 1995) for a simple selection of $\lambda$ that only depends on $T$, the so-called universal rule (Donoho and Johnstone 1994). We exploit this connection to propose in Section 2.2 a selection rule for $\lambda$ that leads to a smooth estimation of the volatility process.

How does our smoothing approach differ from existing approaches? We aim at estimating a smooth evolution of volatility in time with possible abrupt changes of regime, which seems more realistic for finance applications, while existing approaches aim at estimating the true coefficients of some assumed erratic volatility data generating process. To that aim we propose a selection of $\lambda$ (hence of $\sigma_\eta^2$) borrowing ideas from wavelet smoothing. Our estimator also differs in the way it is computed, since we solve a convex optimization problem (5) (see Section 2.3) instead of solving an integration problem (4) approximately by sampling an ergodic Markov chain.

2.2 Selection of the smoothing parameter λ

The $\ell_1$-regularization parameter $\lambda \ge 0$ in (5) controls the smoothing: when $\lambda = 0$, the solution is the wiggly maximum likelihood estimate $\hat\sigma_t^{\mathrm{MLE}} = |y_t|$ for $t = 1, \ldots, T$, while when $\lambda$ tends to infinity the estimate $h_t$ tends to either a linear function for $\phi = 1$ or a function that is asymptotically constant for $\phi < 1$. We see that the selection of $\lambda$ (or equivalently of $\sigma_\eta^2$ in (3)) is crucial. In Gaussian wavelet smoothing, the universal penalty $\lambda_T^{\mathrm{wave}} = \sqrt{2\log T}$ is a simple but surprisingly efficient choice, as it endows the $\ell_1$-penalized likelihood wavelet estimator with near minimax properties for a class of loss functions and smoothness measures


(Donoho and Johnstone 1994, Donoho et al. 1995). Borrowing from wavelet smoothing, we derive a universal penalty $\lambda_T^{\mathrm{fSVM}}$ for the maximum a posteriori estimator (5) such that the estimated volatility is a smooth fit to the data.

With $\lambda$ selected by the universal rule, our stochastic volatility model (1)–(3) can be seen as a functional data generating process, in the sense that it leads to an estimated volatility process which is a smooth, rather than erratic, function of time fitting the volatility of the financial asset. Hence our estimator derives from a functional stochastic volatility model (fSVM).

In wavelet smoothing, the universal parameter is chosen such that, when the underlying signal is piecewise constant, the estimate is also piecewise constant with probability tending to one as the sample size tends to infinity. Likewise here, we set the universal parameter so that, when the true volatility is piecewise constant (i.e., persistence with $\phi = 1$ on each interval) over $K_T$ successive times, the volatility estimate is also piecewise constant with probability tending to one.

In Appendix A we derive the universal parameter $\lambda_T^{\mathrm{fSVM}} = \sqrt{K_T \log(n_T \log n_T)}$ with $n_T = T/K_T$, for standard Gaussian $\epsilon$-innovations, $\phi = 1$, $K_T \sim \log T$ and the link $\varphi(\cdot) = \exp(\cdot)$. Deriving the universal parameter for $\phi = 1$ is valid since we expect strong persistence in practice, and this avoids the derivation of a $\phi$-dependent universal parameter.
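A minimal sketch of this selection rule follows; the paper does not specify how $K_T \sim \log T$ is rounded to an integer, so the rounding below is our assumption.

```python
import numpy as np

def universal_penalty(T):
    """Universal penalty lambda_T^fSVM = sqrt(K_T * log(n_T * log(n_T)))
    with K_T ~ log(T) and n_T = T / K_T. Rounding K_T to the nearest
    integer (at least 2) is our assumption, not fixed by the paper."""
    K = max(2, int(round(np.log(T))))
    n = T / K
    return np.sqrt(K * np.log(n * np.log(n)))
```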

2.3 Optimization issues

We study here how to solve (5) to obtain the proposed estimate. To solve (5) in $(\phi, \mu, h)$, we use a decomposition approach that alternately solves in $(\mu, h)$ with $\phi$ held fixed, and in $\phi$ with $(\mu, h)$ held fixed. This alternating minimization approach, though not guaranteed to converge to a global minimum, works well in practice. How is each subproblem solved? For fixed $(\mu, h)$, the objective function of (5) is convex piecewise-linear in $\phi$, and the minimum can be found by, e.g., sorting the breakpoints (see the sketch after equation (6) below). In what follows, we focus on solving in $(\mu, h)$ with $\phi$ held fixed. The resulting subproblem can be written compactly as

$$\min_{h, \mu} \; \sum_{t=1}^{T} g_t(h_t) + \pi\big(B_\phi h + \mu(\phi - 1)\mathbf{1}\big), \qquad (6)$$

where $\mathbf{1}$ denotes the $T$-vector of ones, $g_t(h_t) = \log \varphi(h_t) - \log f(y_t/\varphi(h_t))$, $(B_\phi h)_t = h_{t+1} - \phi h_t$ for $t = 1, \ldots, T-1$, and $\pi(\cdot) = \lambda\|\cdot\|_1$. We will focus on the link $\varphi(\cdot) = \exp(\cdot)$, and it can be seen that $g_t(h_t) = h_t + \frac{1}{2}y_t^2\exp(-2h_t) + \text{constant}$ is strictly convex for Gaussian $\epsilon$-innovations. Hence (6) is a convex optimization problem.
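As for the $\phi$-step of the alternating scheme above: for fixed $(\mu, h)$, the problem $\min_\phi \sum_{t=2}^{T} |h_t - \mu - \phi(h_{t-1} - \mu)|$ is a one-dimensional least absolute error fit, whose minimizer is a weighted median of the breakpoint ratios. The sketch below is our formulation of the breakpoint-sorting step the paper alludes to (the function name is ours):

```python
import numpy as np

def phi_step(h, mu):
    """Minimize sum_{t>=2} |h_t - mu - phi*(h_{t-1} - mu)| over phi.

    The objective is convex piecewise-linear in phi, so a minimizer sits at
    a breakpoint b_t/a_t; sorting the breakpoints and taking the weighted
    median (weights |a_t|) finds it."""
    a = h[:-1] - mu
    b = h[1:] - mu
    keep = a != 0                          # terms with a_t = 0 are constant in phi
    r, w = b[keep] / a[keep], np.abs(a[keep])
    order = np.argsort(r)                  # sort the breakpoints r_t
    r, w = r[order], w[order]
    cw = np.cumsum(w)
    k = np.searchsorted(cw, 0.5 * cw[-1])  # first breakpoint holding half the weight
    return r[k]
```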


The case $\phi = 1$ can be solved efficiently using the IDM algorithm. Specifically, Theorem 3 of Sardy and Tseng (2004) applies with $g_t^*(u_t) = \frac{1}{2}(1 - u_t)\big(\log\frac{1 - u_t}{y_t^2} - 1\big)$ for Gaussian $\epsilon$-innovations, so that a dual coordinate descent algorithm can be employed on the dual problem in $u = (u_1, \ldots, u_T)$. We consider below the more challenging and interesting case $\phi \ne 1$, for which the IDM algorithm is impractical. As we show below, in this case (6) can be efficiently solved by a primal-dual interior-point algorithm.

2.3.1 Dual formulation

Using $\varphi(\cdot) = \exp(\cdot)$, we derive in Appendix B the dual of the primal subproblem (6). It has the general form

$$\min_x \; \sum_{t=1}^{Q} q_t(x_t) \quad \text{s.t.} \quad Ax = b, \; x \ge 0, \qquad (7)$$

where $x$ is the dual vector, the matrix $A$ has $Q = 3T - 2$ columns, and $q_t(\cdot)$ is a function assumed to be convex and twice differentiable on $(0, \infty)$ with $\lim_{\xi \to 0} q_t(\xi) = q_t(0)$, with $q_t'(\cdot)$ concave, and which satisfies

$$(\xi + \delta)\big( q_t'(\xi + \delta) - q_t'(\xi) - q_t''(\xi)\delta \big) \ge -\kappa q_t''(\xi)\delta^2 \quad \text{whenever} \quad \frac{|\delta|}{\xi} \le \rho, \qquad (8)$$

for some $\kappa > 0$ and $0 < \rho < 1$. In particular, Appendix C shows that (8) is satisfied with $\kappa = \frac{1}{2(1-\rho)}$ for Gaussian noise and the exponential link. Specifically, we can take $q_t(x_t) = x_t\log(x_t) + c_t x_t$ with $c_t = -(1 + \log(y_t^2))$, $t = 1, \ldots, T$.

2.3.2 Log-barrier problem

The log-barrier problem, parameterized by $\varepsilon > 0$, is

$$\min \; \sum_{t=1}^{Q} q_t(x_t) - \varepsilon\log(x_t) \quad \text{s.t.} \quad Ax = b, \; x > 0,$$

with Karush-Kuhn-Tucker conditions

$$Ax = b, \quad x > 0, \quad q'(x) - \varepsilon X^{-1}\mathbf{1} - A^\top u = 0,$$

where $q'(x) = (q_t'(x_t))_{t=1}^{Q}$ and $X = \mathrm{diag}(x_1, \ldots, x_Q)$. This can be rewritten as

$$Ax = b, \quad x > 0, \quad s = q'(x) - A^\top u, \quad Xs = \varepsilon\mathbf{1}. \qquad (9)$$


The exact solution of (9) traces the central path as $\varepsilon$ ranges over $(0, \infty)$. The primal-dual interior-point algorithm solves the equations approximately using a damped Newton method and decreases $\varepsilon$ after each iteration. Specifically, $(x, u)$ is an approximate solution of (9) if it belongs, together with $\varepsilon$, to the following so-called "wide neighborhood" of the central path:

$$\mathcal{N}(\tau) = \left\{ (x, u, \varepsilon) \;\middle|\; Ax = b, \; x > 0, \; s = q'(x) - A^\top u, \; \min_t x_t s_t \ge \tau\varepsilon, \; \varepsilon = \frac{x^\top s}{Q} \right\},$$

with $0 < \tau < 1$; see Wright (1997) and references therein.

2.3.3 Primal-dual interior-point algorithm

The algorithm begins with any $(x, u, \varepsilon) \in \mathcal{N}(\tau)$. Then it solves the Newton equations

$$X d_s + S d_x = \delta\varepsilon\mathbf{1} - Xs, \qquad (10)$$
$$A d_x = 0, \qquad (11)$$
$$q''(x) d_x - A^\top d_u = d_s, \qquad (12)$$

for $(d_x, d_s, d_u)$, where $s = q'(x) - A^\top u$ and $0 < \delta < 1$. Let

$$x[\alpha] = x + \alpha d_x, \quad u[\alpha] = u + \alpha d_u, \quad s[\alpha] = q'(x[\alpha]) - A^\top u[\alpha] \quad \forall \alpha > 0.$$

Let $\nu$ and $\bar\alpha$ be given by (26) and (27) in Appendix D. Then, beginning with $\alpha = 1$, it checks whether

$$(x[\alpha], u[\alpha]) \in \mathcal{N}(\tau), \qquad \varepsilon[\alpha] = \frac{x[\alpha]^\top s[\alpha]}{Q} \le (1 - \bar\alpha\nu)\varepsilon, \qquad (13)$$

and if not, it decreases $\alpha$ by some factor $0 < \varrho < 1$ and repeats, until (13) is satisfied. Then we update

$$(x^{\mathrm{new}}, u^{\mathrm{new}}, \varepsilon^{\mathrm{new}}) \longleftarrow (x[\alpha], u[\alpha], \varepsilon[\alpha]),$$

and re-iterate until $\varepsilon \le \varepsilon_{\mathrm{final}}$. In our implementation we use $\tau = 10^{-4}$, $\delta = 0.5$, $\varrho = 0.7$, $\rho = 0.99$, and we initialize with $h = \alpha\mathbf{1} - c$ and $w = \alpha\lambda\mathbf{1}$, where $c = (c_1, \ldots, c_T)$ (which uniquely determine $u$ and $x$), $s = q'(x) - A^\top u$, and $\varepsilon = \frac{x^\top s}{Q}$, with $0 < \alpha < 1$ chosen so that $(x, u, \varepsilon) \in \mathcal{N}(\tau)$.

Our code is available from the authors and is fast, since its complexity is bounded by $O(Q^{3/2})$, as we now show.


2.3.4 Iteration complexity

Appendix D shows that (13) is satisfied when $\alpha = \bar\alpha$ given by (27) in Appendix D. Thus $\varepsilon$ decreases by a factor of at most $1 - \bar\alpha\nu$ after each iteration so that, after $k$ iterations, $\varepsilon \le (1 - \bar\alpha\nu)^k \varepsilon_{\mathrm{init}}$. Thus $\varepsilon \le \varepsilon_{\mathrm{final}}$ whenever $k \ge \log\big(\frac{\varepsilon_{\mathrm{init}}}{\varepsilon_{\mathrm{final}}}\big)\frac{1}{-\log(1 - \bar\alpha\nu)}$. Since $\log(1 - \bar\alpha\nu) \le -\bar\alpha\nu$ and, by (27), $\frac{1}{\bar\alpha} = O(\kappa Q^{3/2} + Q)$, this shows that the number of iterations until termination is at most

$$\log\left(\frac{\varepsilon_{\mathrm{init}}}{\varepsilon_{\mathrm{final}}}\right)\frac{1}{\bar\alpha\nu} = O\left( (\kappa Q^{3/2} + Q)\log\left(\frac{\varepsilon_{\mathrm{init}}}{\varepsilon_{\mathrm{final}}}\right) \right),$$

where $Q = 3T - 2$ and $T$ is the length of the time series. While there have been previous studies of path-following algorithms for entropic optimization of the form (7) and (8) (Potra and Ye 1993, Tseng 1992), these algorithms use the so-called "narrow neighborhood", which is not practically efficient. To our knowledge, this is the first study of a path-following algorithm for entropic optimization that uses the wide neighborhood and is practically and theoretically efficient. Specifically, when (7) is a linear or a convex quadratic program, i.e., $\kappa = 0$ in (8), the above complexity result is the best known for an algorithm using the wide neighborhood; see Wright (1997, Theorem 5.11).

2.4 Simulated time series

We simulate data from a smooth volatility function with periods of abrupt changes of regime and volatility peaks, as one may expect in the financial markets. To that aim we take the sum of two classical functions in wavelet smoothing, the blocks and bumps functions (Donoho and Johnstone 1994), rescaled so that the volatility ranges over $\sigma_t \in [0.1, 10]$. The log of the volatility function $\sigma_t$ for $t = 1, \ldots, 5000$ is the curve plotted in Figure 1 (c), where the dots are the maximum likelihood estimates $\log\hat\sigma_t^{\mathrm{MLE}} = \log|y_t|$. Figure 1 (a) shows the simulated returns $y_t$, and (b) shows the empirical autocorrelation function (acf) of the absolute returns, which reflects phenomena observed on real financial time series, such as volatility clustering, nonstationarity or long memory. Figure 1 (e) shows the fSVM penalized likelihood estimate of the log-volatilities, the solution to (5) with $\varphi(\cdot) = \exp(\cdot)$ using the universal penalty $\lambda_T^{\mathrm{fSVM}}$ derived in Section 2.2, while (d) shows the acf of the fitted absolute residuals $|y_t/\hat\sigma_t|$ and (f) shows the quantile-quantile plot of the fitted residuals. We see with these simulated data that the estimated volatility captures the important features of the true volatility, and that the fitted residual process matches well the i.i.d. standard Gaussian distribution.
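A sketch of this simulation design, assuming the standard blocks/bumps parameters of Donoho and Johnstone (1994) (as in WaveLab) and a linear rescaling of their sum onto $[0.1, 10]$; the paper does not spell out the exact rescaling used:

```python
import numpy as np

# Standard Donoho-Johnstone test-function parameters (as used in WaveLab);
# we assume these match the rescaled blocks/bumps mixture used in the paper.
POS   = np.array([0.10, 0.13, 0.15, 0.23, 0.25, 0.40, 0.44, 0.65, 0.76, 0.78, 0.81])
BLK_H = np.array([4, -5, 3, -4, 5, -4.2, 2.1, 4.3, -3.1, 2.1, -4.2])
BMP_H = np.array([4, 5, 3, 4, 5, 4.2, 2.1, 4.3, 3.1, 5.1, 4.2])
BMP_W = np.array([0.005, 0.005, 0.006, 0.01, 0.01, 0.03, 0.01, 0.01, 0.005, 0.008, 0.005])

def blocks(r):  # piecewise-constant test function
    return sum(h * (1 + np.sign(r - p)) / 2 for p, h in zip(POS, BLK_H))

def bumps(r):   # peaked test function
    return sum(h / (1 + np.abs((r - p) / w))**4 for p, h, w in zip(POS, BMP_H, BMP_W))

T = 5000
r = np.arange(T) / T
f = blocks(r) + bumps(r)
sigma = 0.1 + 9.9 * (f - f.min()) / (f.max() - f.min())  # rescale to [0.1, 10]
rng = np.random.default_rng(0)
y = sigma * rng.standard_normal(T)                       # y_t = sigma_t * eps_t
```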


[Figure 1 here.]

Figure 1: Volatility simulation with $T = 5000$. Top: simulated returns $y_t$ (dots), and empirical autocorrelation function (acf) of $|y_t|$; Middle: true log-volatilities $\log\sigma_t$ (line) with log-absolute-returns $\log|y_t|$ (dots), and acf of residuals $|y_t/\hat\sigma_t|$; Bottom: log-volatilities (line) estimated with the universal penalty $\lambda_T^{\mathrm{fSVM}}$, with $\log|y_t|$ (dots), and Gaussian qq-plot of residuals.


3 Applications

We illustrate our methodology on the DOW JONES and NASDAQ stock index returns. Data are the $T = 4846$ daily closing prices $P_t$ from 29 December 1989 until 23 March 2009. Stock returns $y_t$ are computed as $100\log(P_t/P_{t-1})$. Figures 2 and 3 display the log-returns for both indices in their top-left graphs. In Section 3.1 we first consider the entire sequence of returns, apply our volatility estimator to it and analyze the result. Note that the estimation for such time series took less than 2 minutes running Matlab code on a standard computer. In Section 3.2 we evaluate and compare the forecasting performance of our method to that of GARCH and IGARCH for short and long horizons.

3.1 Volatility smoothing

The volatility clustering feature is reflected by the strong autocorrelation of the absolute returns in Figures 2 and 3 (b). The results of our volatility estimation are presented in graphs (d) on the log scale, and graphs (e)-(f) are diagnostic plots on the residuals. As expected, the autocorrelation in the absolute rescaled residuals $|y_t/\hat\sigma_t|$ has been removed (see graphs (e)), and normality holds quite well, except in the extreme negative tail (see graphs (f)). This asymmetry is certainly due to the fact that our methodology does not yet model the leverage effect on the volatility (Nelson 1991, Glosten, Jagannathan and Runkle 1993), i.e., volatility responds more strongly to negative shocks (bad news) than to positive shocks. To capture this asymmetry, model (1)-(3) can be modified as in Omori, Chib, Shephard and Nakajima (2007).

The persistence parameter estimates are $\hat\phi = 1.0003$ for the DOW JONES and $\hat\phi = 1.0001$ for the NASDAQ. These high values of $\phi$ lead us to conclude that the volatility may be nonstationary. However, in the out-of-sample forecast exercise presented below, the estimates $\hat\phi$ are less than 1.00 and quite stable over the period from November 2001 until just before the recent subprime crisis, where the estimate of $\phi$ rises dramatically. Therefore, the fact that the full-sample estimate of $\phi$ reaches above unity is probably due to the instability related to the recent crashes (see Figure 4). To derive the variance-covariance matrix of the vector of parameters, and in particular of $\phi$, we observe that, for given $\lambda$ and $(h_t)_{t=1}^{T}$, the second term of the objective function (5) is a least absolute error regression problem of $(h_t)_{t=2}^{T}$ on $(h_t)_{t=1}^{T-1}$ with slope $\phi$ and intercept $\tilde\mu = \mu(1 - \phi)$. Bassett and Koenker (1978) provide asymptotic theory for the least absolute error fit. Hence the asymptotic covariance is given by $V((\hat{\tilde\mu}, \hat\phi)^\top) = \frac{1}{\lambda^2}Q^{-1}$ with $\lambda = \lambda_T^{\mathrm{fSVM}}$ of Section 2.2, where $Q = X^\top X$ with $X = (\mathbf{1}, h_{-1})$ and $h_{-1} = (h_1, \ldots, h_{T-1})^\top$, under the standard assumption that $\lim_{T\to\infty}\frac{1}{T-1}X^\top X$ is positive definite. Here the vector $h_{-1}$ is not observable, but it is estimated by our procedure, so we compute the matrix $Q$ from the estimated log-volatilities to obtain an approximate covariance matrix for the two estimated parameters of the model. For the time series considered, the estimated standard deviations of $\hat\phi$ are 0.0055 for the DOW JONES and 0.0045 for the NASDAQ.
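In code, this approximate covariance is a one-liner; a sketch (the function name is ours), taking the estimated log-volatilities and the universal penalty as inputs:

```python
import numpy as np

def phi_stderr(h_hat, lam):
    """Standard errors of (mu_tilde, phi) from the Bassett-Koenker asymptotic
    covariance (1/lam^2) * (X'X)^{-1}, with X = (1, h_{-1}) built from the
    estimated log-volatilities."""
    X = np.column_stack([np.ones(len(h_hat) - 1), h_hat[:-1]])
    cov = np.linalg.inv(X.T @ X) / lam**2
    return np.sqrt(np.diag(cov))
```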


[Figure 2 here.]

Figure 2: DOW JONES: (a) returns $y_t$ from 29 December 1989 until 23 March 2009; (b) empirical autocorrelation function (acf) of $|y_t|$; (c) acf of $y_t$; (d) log-absolute-returns $\log|y_t|$ and estimated volatilities $\log\hat\sigma_t$; (e) acf of standardized absolute residuals $|y_t/\hat\sigma_t|$; (f) Gaussian qq-plot of standardized residuals.


[Figure 3 here.]

Figure 3: NASDAQ: (a) returns $y_t$ from 29 December 1989 until 23 March 2009; (b) empirical autocorrelation function (acf) of $|y_t|$; (c) acf of $y_t$; (d) log-absolute-returns $\log|y_t|$ and estimated volatilities $\log\hat\sigma_t$; (e) acf of standardized absolute residuals $|y_t/\hat\sigma_t|$; (f) Gaussian qq-plot of standardized residuals.



Figure 4 compares the evolution of the estimated log-volatilities for the DOW JONES and the NASDAQ. As expected, the NASDAQ is more volatile than the DOW JONES, except during the recent crashes of Autumn 2008, which affected all economic sectors. Several important market turbulence periods are identified and correspond to the main abrupt changes estimated by fSVM.

[Figure 4 here.]

Figure 4: DOW JONES and NASDAQ estimated log-volatilities from 29 December 1989 until 23 March 2009, with the main turbulence periods annotated (Autumn 1997 Asian crisis, 1998 Russian crisis, dot-com bubble, summer 2007 subprime crisis, Autumn 2008 crashes).


3.2 Volatility forecasting

To evaluate the forecasting performance of the various models, we calibrate on time series running from day one until day $t = 3000 + (k-1)H/2$, $k = 1, 2, \ldots$, and forecast the volatility out-of-sample from day $t+1$ until day $t+H$, for a short horizon $H = 20$ business days (one month) and a long horizon $H = 120$ days (six months). The calibration up to time $t$ provides parameter estimates for forecasting the volatility beyond time $t$ using the following formulas for the different models, with the notation $\sigma_{t+j}^2 = \mathrm{E}(\varphi(h_{t+j}) \mid y_1, \ldots, y_t)$:

for fSVM:
$$\sigma_{t+j}^{2,\mathrm{fSVM}} = \exp\big(2(\mu + \phi^j(h_t - \mu))\big), \quad j = 1, \ldots, H,$$

for GARCH:
$$\sigma_{t+j}^{2,\mathrm{GARCH}} = (\psi + \phi)^j\sigma_t^2 + \omega\sum_{k=0}^{j-1}(\psi + \phi)^k, \quad j = 1, \ldots, H,$$

and for IGARCH:
$$\sigma_{t+j}^{2,\mathrm{IGARCH}} = \sigma_t^2 + j\omega, \quad j = 1, \ldots, H.$$
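A sketch of the three forecast recursions, assuming the calibrated parameter estimates are given as inputs (the function names are ours):

```python
import numpy as np

def forecast_fsvm(h_t, mu, phi, H):
    """fSVM volatility forecasts sigma^2_{t+j}, j = 1..H."""
    j = np.arange(1, H + 1)
    return np.exp(2 * (mu + phi**j * (h_t - mu)))

def forecast_garch(sig2_t, omega, psi, phi, H):
    """GARCH(1,1) variance forecasts, with the geometric sum in closed
    form (assumes psi + phi != 1; otherwise use the IGARCH formula)."""
    j = np.arange(1, H + 1)
    a = psi + phi
    return a**j * sig2_t + omega * (1 - a**j) / (1 - a)

def forecast_igarch(sig2_t, omega, H):
    """IGARCH variance forecasts (unit persistence)."""
    return sig2_t + omega * np.arange(1, H + 1)
```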

For each model, we then calculate the median absolute error (MAE) forecasting measure between the forecasted and realized volatilities,

$$\mathrm{MAE}(H) = \mathrm{median}\left\{ \left| \sum_{j=1}^{H}\hat\sigma_{t+j}^{2,*} - \sum_{j=1}^{H} y_{t+j}^2 \right|, \; t = 3001 + (k-1)H/2, \; k = 1, 2, \ldots \right\},$$

where "$*$" stands for fSVM, GARCH or IGARCH. We report in Table 1 the relative MAE with respect to that of fSVM (i.e., $\mathrm{MAE}^{\mathrm{(I)GARCH}}/\mathrm{MAE}^{\mathrm{fSVM}}$), so that a ratio larger than 1 means better forecasting for fSVM. The results show that fSVM outperforms GARCH for forecasting, especially at long horizons.

Consequently, our methodology appears better suited to the purposes of option pricing and portfolio management.

4 Conclusion

This paper proposes an original and complete new way to estimate the volatility of returns. Our approach combines the dynamics of the well-known SV models with a Markov random field smoother in order to estimate the hidden Markov chain. The proposed estimator is based on Bayesian posterior mode estimation.


Table 1: Volatility forecast comparison based on relative median absolute errors (MAE) for short ($H = 20$) and long ($H = 120$) horizons.

                             DOW JONES           NASDAQ
Horizon                   H = 20   H = 120   H = 20   H = 120
MAE GARCH / MAE fSVM        1.1      1.9       0.8      1.8
MAE IGARCH / MAE fSVM       1.2      3.0       0.8      2.7

Its performance is illustrated through simulations and empirical applications based on the Dow Jones and Nasdaq. Our volatility forecast outperforms GARCH's forecast, especially at long horizons, and therefore our methodology looks more suitable for option pricing models or portfolio management, which require long-term forecasts. As a major topic for future research, the multivariate extension is probably the most promising, especially given our selection of the smoothing parameter, the speed of our algorithm, and the good forecasting performance. Another natural extension would consist in capturing the leverage effect, or asymmetry in the marginal distribution of the returns.

APPENDIX

A Universal penalty

Assuming for simplicity that $T$ is a multiple of $K$, let $B_K$ be the matrix operating finite differences skipping every other $K$, i.e., $B_K h = 0$ iff $h$ is piecewise constant taking identical values at $K$ successive points:

$$h_{K(i-1)+1} = \ldots = h_{K(i-1)+K}, \quad i = 1, \ldots, n_T = T/K. \qquad (14)$$

The Karush-Kuhn-Tucker first-order optimality conditions for (5) with $\phi = 1$ and skipping every other $K$ differences in the penalty are

$$1 - y_t^2\exp(-2h_t) + (B_K^\top w)_t = 0, \quad t = 1, \ldots, T, \qquad (15)$$
$$\|w\|_\infty \le \lambda, \qquad (16)$$

where $\|w\|_\infty = \max(|w_1|, \ldots, |w_T|)$. The solution $w$ to (15)-(16) subject to $K$ successive identical values has entries

$$w_{K(i-1)+k} = k - K\frac{\sum_{j=1}^{k} y_{K(i-1)+j}^2}{\sum_{j=1}^{K} y_{K(i-1)+j}^2}, \quad i = 1, \ldots, n_T, \; k = 1, \ldots, K, \qquad (17)$$


provided $\lambda$ is large enough. The smallest possible $\lambda$ allowing a solution is $\lambda_y = \|w\|_\infty$ for $w$ given in (17). The goal of the universal penalty $\lambda_T$ is to control the extremal behavior of $\lambda_y$, so that $P(\|w\|_\infty \le \lambda_T) \to 1$ as $T \to \infty$ when the true underlying volatility is a constant $\sigma_0$, i.e., $y_t$ i.i.d. $N(0, \sigma_0^2)$. The vector $w$ can be broken into $n_T = T/K$ independent blocks $w^i = (w_{i1}, \ldots, w_{i(K-1)})$, each of which converges to a Brownian bridge process. To see this, consider the first block, for which

$$P(\|w^1\|_\infty \le \lambda) = P\left( \max_{k=1,\ldots,K-1}\left| k - K\frac{\sum_{j=1}^{k} Z_j^2}{\sum_{j=1}^{K} Z_j^2} \right| \le \lambda \right),$$

where $Z_j = y_j/\sigma_0$ is i.i.d. $N(0, 1)$. Note that $Z_j^2 + Z_{j+1}^2 \overset{d}{=} E_l$, with $j = 2(l-1)+1$ and $l = 1, \ldots, K/2$, where the $E_l$ are i.i.d. Exp(1/2), so

$$\frac{\sum_{j=1}^{k} Z_j^2}{\sum_{j=1}^{K} Z_j^2} \overset{d}{=} \frac{\sum_{l=1}^{k/2} E_l}{\sum_{l=1}^{K/2} E_l} \overset{d}{=} U_l,$$

where $0 \le U_1 \le \ldots \le U_{K/2} \le 1$ are distributed as Uniform(0,1) order statistics from a sample of size $K/2$ (Shorack and Wellner 1986, p. 496). Moreover, the uniform quantile process converges to a Brownian bridge process $W(r)$, so

$$P(\|w^1\|_\infty \le \lambda) = P\left( \sup_{r \in [0,1]} |W(r)| \le \lambda/\sqrt{2K} \right) = 1 - 2\sum_{k=1}^{\infty}(-1)^{k+1}\exp\big(-2k^2(\lambda/\sqrt{2K})^2\big) \ge 1 - 2\exp\big(-2(\lambda/\sqrt{2K})^2\big).$$

Consequently,

$$P(\|w\|_\infty \le \lambda_T) \ge \Big(1 - 2\exp\big(-2(\lambda_T/\sqrt{2K})^2\big)\Big)^{n_T} = \exp(-2/\log n_T)$$

with $\lambda_T^{\mathrm{fSVM}} = \sqrt{2K}\sqrt{\tfrac{1}{2}\log(n_T\log n_T)}$, which tends to one at a slow rate as $T$ grows. We choose $K = K_T \sim \log T$ for the block size to grow slowly with $T$.

B Derivation of the dual problem

Letting $\theta = (h, \mu)$, the primal problem (6) has the form

$$\min_\theta \; g(\theta) + \pi(B\theta),$$

where $g(\theta) = \sum_{t=1}^{T} g_t(h_t) + g_0(\mu)$, $g_0(\mu) = 0$, and $B = [B_\phi \;\; (\phi - 1)\mathbf{1}]$. Thus the Fenchel dual (Rockafellar 1970) has the form

$$\min_w \; g^*(B^\top w) + \pi^*(-w),$$

with $g^*(\eta) = \sum_{t=1}^{T} g_t^*(\eta_t) + g_0^*(\eta_0)$. Here $^*$ denotes the convex conjugate, i.e., $g^*(\eta) = \sup_\theta \theta^\top\eta - g(\theta)$. Straightforward calculation yields $g_t^*(u_t) = \frac{1}{2}(1 - u_t)\big(\log\frac{1 - u_t}{y_t^2} - 1\big)$ when $g_t(h_t) = h_t + \frac{1}{2}y_t^2\exp(-2h_t)$ (Gaussian noise), and $g_t^*(u_t) = (1 - u_t)\big(\log\frac{1 - u_t}{y_t} - 1\big)$ when $g_t(h_t) = h_t + y_t\exp(-h_t)$ (Laplace noise), for $t = 1, \ldots, T$. Moreover, $g_0^*(\eta_0) = 0$ if $\eta_0 = 0$ and otherwise equals $\infty$. Similarly, $\pi^*(-w) = 0$ if $\|w\|_\infty \le \lambda$ and otherwise equals $\infty$. This yields the dual problem

$$\min \; \sum_{t=1}^{T} g_t^*(u_t) \quad \text{s.t.} \quad u = B_\phi^\top w, \; \mathbf{1}^\top w = 0, \; -\lambda\mathbf{1} \le w \le \lambda\mathbf{1}.$$

Then letting $z = \mathbf{1} - u$, $w_1 = \lambda\mathbf{1} - w$, $w_2 = \lambda\mathbf{1} + w$, and $c_t = -(1 + \log(y_t^2))$ (Gaussian) or $c_t = -(1 + \log(y_t))$ (Laplace), and upon eliminating $w$, the above dual problem is equivalent to

$$\min_{w_1, w_2, z} \; \sum_{t=1}^{T} z_t\log(z_t) + c_t z_t \quad \text{s.t.} \quad B_\phi^\top w_2 + z = \mathbf{1} + \lambda B_\phi^\top\mathbf{1}, \; \mathbf{1}^\top w_2 = \lambda(T-1), \; w_1 + w_2 = 2\lambda\mathbf{1}, \; z \ge 0, \; w_1 \ge 0, \; w_2 \ge 0,$$

where $x = (z, w_1, w_2)$ are the dual variables. This has the form (7) with

$$A = \begin{bmatrix} I & 0 & B_\phi^\top \\ 0 & 0 & \mathbf{1}^\top \\ 0 & I & I \end{bmatrix}, \quad b = \begin{bmatrix} \mathbf{1} + \lambda B_\phi^\top\mathbf{1} \\ \lambda(T-1) \\ 2\lambda\mathbf{1} \end{bmatrix}, \quad q_t(x_t) = \begin{cases} z_t\log(z_t) + c_t z_t & \text{if } t \le T; \\ 0 & \text{else.} \end{cases}$$

C Checking the condition (8)

It is easily seen that condition (8) with $\kappa = \frac{1}{2(1-\rho)}$ is satisfied by $q_t(x_t) = 0$ for any $0 < \rho < 1$. Below we show that it is also satisfied by $q_t(x_t) = x_t\log(x_t) + c_t x_t$. We have $q_t'(x_t) = \log(x_t) + 1 + c_t$ and $q_t''(x_t) = 1/x_t$. Since $|\delta|/\xi \le \rho < 1$, we have the series expansion

$$\log(\xi + \delta) - \log\xi - \frac{\delta}{\xi} = \sum_{k=2}^{\infty}\frac{(-1)^{k+1}}{k}\frac{\delta^k}{\xi^k}.$$

When $\delta < 0$, we have $0 < \xi + \delta \le \xi$. Moreover, all terms in the series are negative and the series can be bounded from below by $-\frac{\delta^2}{\xi^2}\sum_{k=2}^{\infty}\frac{\rho^{k-2}}{2} = -\frac{1}{2(1-\rho)}\frac{\delta^2}{\xi^2}$. Thus

$$(\xi + \delta)\left( \log(\xi + \delta) - \log\xi - \frac{\delta}{\xi} \right) \ge -\frac{\xi + \delta}{2(1-\rho)}\frac{\delta^2}{\xi^2} \ge -\frac{\delta^2}{2(1-\rho)\xi},$$

implying (8) with $\kappa = \frac{1}{2(1-\rho)}$. When $\delta \ge 0$, we have $0 < \xi + \delta \le \xi(1 + \rho) \le \frac{\xi}{1-\rho}$. Moreover, the series alternates in sign and can be written as $-\frac{\delta^2}{2\xi^2} + \sum_{k=3,5,\ldots}\big(\frac{1}{k} - \frac{\delta}{(k+1)\xi}\big)\frac{\delta^k}{\xi^k} \ge -\frac{\delta^2}{2\xi^2}$, each grouped term being nonnegative since $\delta/\xi \le \rho < 1$. Multiplying by $\xi + \delta \le \frac{\xi}{1-\rho}$, we again obtain (8) with $\kappa = \frac{1}{2(1-\rho)}$.

D Iteration complexity analysis

Fix any $(x, u, \varepsilon) \in \mathcal{N}(\tau)$. Let $(d_x, d_s, d_u)$ be the solution of (10)–(12), where $s = q'(x) - A^\top u$ and $0 < \delta < 1$. We show that (13) is satisfied when $\alpha = \bar\alpha$, where $\bar\alpha$ and $\nu$ are given by (27) and (26).

As in the proof of Wright (1997, Lemma 5.10), we first bound $\|D_x d_s\|$, where $D_x = \mathrm{diag}(d_x)$. Letting $D = X^{-1/2}S^{1/2}$, we rewrite (10) as

$$D^{-1}d_x + D d_s = (XS)^{-1/2}r,$$

where for simplicity we denote $r = \delta\varepsilon\mathbf{1} - Xs$. Left multiplying (12) by $d_x^\top$ and using $q_t''(x_t) \ge 0$ (since $q_t(\cdot)$ is convex) and (11) yields $0 \le d_x^\top q''(x)d_x = d_x^\top d_s$. Then, by Wright (1997, Lemma 5.3), we have

$$\begin{aligned}
\|D_x d_s\| &= \|(D^{-1}D_x)(D d_s)\| \\
&\le 2^{-3/2}\|D^{-1}d_x + D d_s\|^2 \\
&= 2^{-3/2}\|(XS)^{-1/2}r\|^2 \\
&= 2^{-3/2}\|(XS)^{-1/2}\delta\varepsilon\mathbf{1} - (XS)^{1/2}\mathbf{1}\|^2 \\
&= 2^{-3/2}\left( \sum_t \frac{\delta^2\varepsilon^2}{x_t s_t} - 2\delta\varepsilon Q + x^\top s \right) \\
&\le 2^{-3/2}\left( \frac{Q\delta^2\varepsilon^2}{\tau\varepsilon} - 2\delta\varepsilon Q + \varepsilon Q \right) = \vartheta Q\varepsilon, \qquad (18)
\end{aligned}$$

where we let $\vartheta = 2^{-3/2}\big( \frac{\delta^2}{\tau} - 2\delta + 1 \big)$.

Let $\bar d_x = X^{-1}d_x$. We next bound $\|\bar d_x\|$. Left multiplying (10) by $\bar d_x^\top$ and using $d_x^\top d_s = d_x^\top q''(x)d_x$ gives

$$d_x^\top q''(x)d_x + \bar d_x^\top S X \bar d_x = \bar d_x^\top r.$$

Since $(x, u, \varepsilon) \in \mathcal{N}(\tau)$, so that $x_t s_t \ge \tau\varepsilon$ for all $t$, this and the Cauchy-Schwarz inequality yield $\tau\varepsilon\|\bar d_x\|^2 \le \|\bar d_x\|\|r\|$, so that

$$\|\bar d_x\| \le \frac{\|r\|}{\tau\varepsilon}. \qquad (19)$$

Since $z = \frac{Xs}{\varepsilon}$ satisfies $z \ge 0$ and $\mathbf{1}^\top z = Q$, we have

$$\|r\| \le \max_{z \ge 0}\big\{ \varepsilon\|\delta\mathbf{1} - z\| \;\big|\; \mathbf{1}^\top z = Q \big\} = \varepsilon\sqrt{(Q - \delta)^2 + (Q - 1)\delta^2} \le \varepsilon Q, \qquad (20)$$

where the equality is due to the maximum of a convex function being attained at an extreme point, which in this case is a $Q$ multiple of a unit coordinate vector.

Fix any $\alpha > 0$ with $\alpha\|\bar d_x\|_\infty \le \rho$. By (19) and (20), this occurs whenever

$$\alpha \le \frac{\rho\tau}{Q}. \qquad (21)$$

We have $Ax[\alpha] = Ax + \alpha A d_x = b$ and

$$\begin{aligned}
X[\alpha]s[\alpha] &= X[\alpha]\big( q'(x + \alpha d_x) - A^\top(u + \alpha d_u) \big) \\
&= X[\alpha]\big( q'(x + \alpha d_x) - q'(x) - \alpha q''(x)d_x \big) + X[\alpha]\big( s + \alpha q''(x)d_x - \alpha A^\top d_u \big) \\
&= X[\alpha]\big( q'(x + \alpha d_x) - q'(x) - \alpha q''(x)d_x \big) + Xs + \alpha S d_x + \alpha X d_s + \alpha^2 D_x d_s \\
&= X[\alpha]\big( q'(x + \alpha d_x) - q'(x) - \alpha q''(x)d_x \big) + (1 - \alpha)Xs + \alpha\delta\varepsilon\mathbf{1} + \alpha^2 D_x d_s \\
&\ge -\kappa\alpha^2 q''(x)D_x d_x + (1 - \alpha)\tau\varepsilon\mathbf{1} + \alpha\delta\varepsilon\mathbf{1} + \alpha^2 D_x d_s \\
&\ge -\kappa\alpha^2\|q''(x)D_x d_x\|_1\mathbf{1} + \big((1 - \alpha)\tau + \alpha\delta\big)\varepsilon\mathbf{1} - \alpha^2\|D_x d_s\|\mathbf{1} \\
&= -\kappa\alpha^2 d_x^\top d_s\mathbf{1} + \big((1 - \alpha)\tau + \alpha\delta\big)\varepsilon\mathbf{1} - \alpha^2\|D_x d_s\|\mathbf{1} \\
&\ge -\kappa\alpha^2\vartheta Q^{3/2}\varepsilon\mathbf{1} + \big((1 - \alpha)\tau + \alpha\delta\big)\varepsilon\mathbf{1} - \alpha^2\vartheta Q\varepsilon\mathbf{1}, \qquad (22)
\end{aligned}$$

where the third equality uses (12); the fourth equality uses (10); the first inequality uses $x[\alpha] > 0$, (8), and $x_t s_t \ge \tau\varepsilon$ for all $t$ (since $(x, u, \varepsilon) \in \mathcal{N}(\tau)$); and the last inequality uses $d_x^\top d_s \le \|D_x d_s\|_1 \le \sqrt{Q}\|D_x d_s\|$ and (18). Similarly, we have

$$\begin{aligned}
x[\alpha]^\top s[\alpha] &= x[\alpha]^\top\big( q'(x + \alpha d_x) - q'(x) - \alpha q''(x)d_x \big) + (1 - \alpha)x^\top s + \alpha\delta\varepsilon Q + \alpha^2 d_x^\top d_s \\
&\le (1 - \alpha)\varepsilon Q + \alpha\delta\varepsilon Q + \alpha^2\vartheta Q^{3/2}\varepsilon, \qquad (23)
\end{aligned}$$

where the inequality uses the concavity of $q_t'(\cdot)$ as well as $d_x^\top d_s \le \sqrt{Q}\|D_x d_s\|$ and (18). Comparing the two bounds (22) and (23), we see that $X[\alpha]s[\alpha] \ge \frac{\tau}{Q}x[\alpha]^\top s[\alpha]\mathbf{1}$ whenever

$$-\kappa\alpha^2\vartheta Q^{3/2} + \big((1 - \alpha)\tau + \alpha\delta\big) - \alpha^2\vartheta Q \ge \tau\big(1 - \alpha + \alpha\delta + \alpha^2\vartheta Q^{1/2}\big)$$


or, equivalently,

$$\alpha \le \frac{\delta(1 - \tau)}{\vartheta(\kappa Q^{3/2} + Q + \tau Q^{1/2})}. \qquad (24)$$

Moreover, by (23), we have

$$\varepsilon[\alpha] = \frac{x[\alpha]^\top s[\alpha]}{Q} \le \big(1 - \alpha(1 - \delta - \alpha\vartheta Q^{1/2})\big)\varepsilon \le (1 - \alpha\nu)\varepsilon$$

whenever

$$\alpha \le \frac{\rho(1 - \delta)}{\vartheta Q^{1/2}}, \qquad (25)$$

where

$$\nu = (1 - \delta)(1 - \rho). \qquad (26)$$

The minimum of the three bounds (21), (24), and (25) is

$$\bar\alpha = \min\left\{ \frac{\rho\tau}{Q}, \; \frac{\delta(1 - \tau)}{\vartheta(\kappa Q^{3/2} + Q + \tau Q^{1/2})}, \; \frac{\rho(1 - \delta)}{\vartheta Q^{1/2}} \right\}. \qquad (27)$$

References

Andersen, T. G. (1994), “Stochastic autoregressive volatility: A framework for volatility modeling,” Mathematical Finance, 4, 75–102.

Bassett, G. Jr., and Koenker, R. (1978), "Asymptotic theory of least absolute error regression," Journal of the American Statistical Association, 73, 618–622.

Besag, J. (1986), "On the statistical analysis of dirty pictures (with discussion)," Journal of the Royal Statistical Society, Series B, 48, 192–236.

Bollerslev, T. (1986), "Generalized autoregressive conditional heteroskedasticity," Journal of Econometrics, 31, 307–327.

Box, G. E. P., and Cox, D. R. (1982), "An analysis of transformations revisited, rebutted," Journal of the American Statistical Association, 77, 209–210.

Cai, J. (1994), "A Markov model of switching-regime ARCH," Journal of Business and Economic Statistics, 12, 309–316.

Chan, H. W., and Maheu, J. M. (2002), "Conditional jump dynamics in stock market returns," Journal of Business and Economic Statistics, 20, 377–389.


Chernov, M., Gallant, A. R., Ghysels, E., and Tauchen, G. (1999), "A new class of stochastic volatility models with jumps: Theory and estimation," Technical Report, CIRANO, Montréal.

Dahlhaus, R. (1997), "Fitting time series models to nonstationary processes," Annals of Statistics, 25, 1–37.

Diebold, F. X. (1986), "Modeling the persistence of conditional variances: A comment," Econometric Reviews, 5, 51–56.

Donoho, D. L., and Johnstone, I. M. (1994), "Ideal spatial adaptation via wavelet shrinkage," Biometrika, 81, 425–455.

Donoho, D. L., Johnstone, I. M., Hoch, J. C., and Stern, A. S. (1992), "Maximum entropy and the nearly black object (with discussion)," Journal of the Royal Statistical Society, Series B, 54, 41–81.

Donoho, D. L., Johnstone, I. M., Kerkyacharian, G., and Picard, D. (1995), "Wavelet shrinkage: Asymptopia? (with discussion)," Journal of the Royal Statistical Society, Series B, 57, 301–369.

Engle, R. F. (1982), "Autoregressive conditional heteroscedasticity with estimates of the variance of United Kingdom inflation," Econometrica, 50, 987–1007.

Engle, R. F., and Bollerslev, T. (1986), "Modelling the persistence of conditional variance," Econometric Reviews, 5, 1–50.

Geman, S., and Geman, D. (1984), "Stochastic relaxation, Gibbs distributions, and the Bayesian restoration of images," IEEE Transactions on Pattern Analysis and Machine Intelligence, 6, 721–741.

Glosten, L. R., Jagannathan, R., and Runkle, D. (1993), "On the relation between the expected value and the volatility of the nominal excess return on stocks," Journal of Finance, 48, 1779–1801.

Hamilton, J. D., and Susmel, R. (1994), "Autoregressive conditional heteroskedasticity and changes in regime," Journal of Econometrics, 64, 307–333.

Harvey, A. C., Ruiz, E., and Shephard, N. (1994), "Multivariate stochastic variance models," Review of Economic Studies, 61, 247–264.


Hull, J., and White, A. (1987), "The pricing of options on assets with stochastic volatilities," Journal of Finance, 42, 281–300.

Jacquier, E., Polson, N. G., and Rossi, P. E. (1994), "Bayesian analysis of stochastic volatility models (with discussion)," Journal of Business and Economic Statistics, 12, 371–389.

Kotz, S., Kozubowski, T. J., and Podgórski, K. (2001), The Laplace Distribution and Generalizations: a Revisit with Applications to Communications, Economics, Engineering, and Finance, Birkhäuser.

Lamoureux, C. G., and Lastrapes, W. D. (1990), "Persistence in variance, structural change, and the GARCH model," Journal of Business and Economic Statistics, 8, 225–234.

Lastrapes, W. D. (1989), "Exchange rate volatility and U.S. monetary policy: An ARCH application," Journal of Money, Credit, and Banking, 21, 66–77.

Madan, D. B., and Seneta, E. (1990), "The variance gamma (V.G.) model for share market returns," The Journal of Business, 63, 511–524.

Nelson, D. B. (1991), "Conditional heteroskedasticity in asset returns: A new approach," Econometrica, 59, 347–370.

Omori, Y., Chib, S., Shephard, N., and Nakajima, J. (2007), "Stochastic volatility with leverage: fast and efficient likelihood inference," Journal of Econometrics, 140, 425–449.

Potra, F., and Ye, Y. (1993), "A quadratically convergent polynomial interior-point algorithm for solving entropy optimization problems," SIAM Journal on Optimization, 3, 843–860.

Rockafellar, R. (1970), Convex Analysis, Princeton: Princeton University Press.

Rydberg, T. H. (2000), "Realistic statistical modelling of financial data," International Statistical Review, 68(3), 233–258.

Sardy, S., and Tseng, P. (2004), "On the statistical analysis of smoothing by maximizing dirty Markov random field posterior distributions," Journal of the American Statistical Association, 99, 191–204.

Shephard, N. (1996), "Statistical aspects of ARCH and stochastic volatility," in Time Series Models in Econometrics, Finance and Other Fields, eds. D. R. Cox, D. V. Hinkley and O. E. Barndorff-Nielsen, London: Chapman and Hall, pp. 1–67.


Shorack, G. R., and Wellner, J. (1986), Empirical Processes with Applications to Statistics, Wiley.

Starica, C. (2003), "Is GARCH(1,1) as good a model as the Nobel prize accolades would imply?," Working paper.

Starica, C., and Granger, C. (2005), "Nonstationarities in stock returns," The Review of Economics and Statistics, 87, 503–522.

Taylor, S. (1986), Modelling Financial Time Series, John Wiley.

Taylor, S. J. (1994), "Modelling stochastic volatility: a review and comparative study," Mathematical Finance, 4, 183–204.

Tikhonov, A. N. (1963), "Solution of incorrectly formulated problems and the regularization method," Soviet Math. Dokl., 4, 1035–1038.

Tseng, P. (1992), "Global linear convergence of a path-following algorithm for some monotone variational inequality problems," Journal of Optimization Theory and Applications, 75, 265–279.

Wright, S. (1997), Primal-Dual Interior-Point Methods, Philadelphia: SIAM.
