
National Science Council (Executive Yuan) Research Project Final Report

Indirect Inference on Predictive Regression

Research Results Report (Condensed Version)

Project Type: Individual
Project Number: NSC 100-2410-H-004-069-
Execution Period: August 1, 2011 to July 31, 2012
Executing Institution: Department of International Business, National Chengchi University
Principal Investigator: Biing-Shen Kuo (郭炳伸)
Project Participants: PhD student part-time assistant: 莊珮玲
Public Access Information: This project involves patents or other intellectual property rights; it will be available for public inquiry one year after completion.

November 9, 2012


Chinese Abstract: This paper adopts an indirect inference method to study the estimation bias problem of predictive regression models with a generalized AR(p) explanatory variable. Simulation studies show that the proposed method effectively reduces estimation bias at a small cost of increased variance. Hence, relative to least-squares estimation, the asymptotic t-statistic built on the indirect inference estimator suffers from much less size distortion. Chinese Keywords: bias correction; simulation-based estimation; autoregression

English Abstract: In this paper, an indirect inference method is introduced to address the problem of estimation bias in predictive regression with a general AR(p) regressor. Simulation studies show that the proposed procedure works quite well to reduce bias, at little cost of increase in variance. As a result, the asymptotic t-test based on the indirect inference estimator is subject to much less size distortion than that of the least-squares counterpart. English Keywords: Bias correction; Simulation-based estimation; Autoregression


Indirect Inference on Predictive Regression

Biing-Shen Kuo

National Chengchi University

Jhih-Gang Chen

National Chengchi University

Oct. 2012

Abstract

In this paper, an indirect inference method is introduced to address the problem of estimation bias in predictive regression with a general AR(p) regressor. Simulation studies show that the proposed procedure works quite well to reduce bias, at little cost of increase in variance. As a result, the asymptotic t-test based on the indirect inference estimator is subject to much less size distortion than that of the least-squares counterpart.

KEY WORDS: Bias correction; Simulation-based estimation; Autoregression.

Corresponding author, Dept. of International Business, National Chengchi University, Taipei 116, Taiwan;

Tel: (02)29393091 ext. 81029, Fax: (02)29387699, Email: bsku@nccu.edu.tw.

Dept. of International Business, National Chengchi University, Taipei 116, Taiwan; Tel: (02)29393091 ext.


1 INTRODUCTION

This paper is concerned with the estimation of the predictive regression model, which has applications that abound in empirical finance and economics. Specifically, the dependent variable in the model usually reflects an asset's price change, while the regressor is some associated lagged variable implied by economic theorems or hypotheses. For example, international finance researchers utilize the predictive regression to investigate the ability of monetary fundamentals to forecast future exchange rate returns, and to examine the long-run relationship between economic fundamentals and the exchange rate (Mark, 1995; Kilian, 1999; Mark and Sul, 2001; Groen, 2000, 2005; Engel and West, 2005; Engel, Mark, and West, 2008; Molodtsova and Papell, 2009). In empirical finance, a continually popular topic is whether stock returns (either real or nominal) can be predicted by lagged financial ratios such as the dividend-price ratio, the book-to-market ratio, etc. (Fama and French, 1988; Campbell and Shiller, 1988; Hodrick, 1992; Lewellen, 2004; Welch and Goyal, 2008; Campbell and Thompson, 2008). In macroeconomics, Staiger, Stock, and Watson (1997) and Stock and Watson (1999) examined whether the inflation rate can be predicted by the lagged unemployment rate, as the well-known Phillips curve postulates.

While the structure of the predictive regression model is simple, the estimation can be troublesome. Usually, the predictor variable in the model displays strong autoregressive behavior, and the innovations in the predictive regression are contemporaneously correlated with those from the predictor process. As pointed out by Mankiw and Shapiro (1986) and Stambaugh (1986), these two common features together lead to a biased least-squares estimate of the predictive coefficient in finite samples. Meanwhile, when the predictor variable is nearly or actually integrated, the asymptotic distribution of the least-squares estimator tends to be nonstandard. In consequence, the conventional t-test suffers from size distortion and brings about misleading inference on the predictability.

Many efforts have been made in the literature to develop robust test methods. Mark (1995) and Kilian (1999) secure the critical values by bootstrapping the test statistics under the estimated null data generating process. Besides, as the predictor variable is persistent, some studies derive new tests under the local-to-unity framework. Among these, Cavanagh, Elliott, and Stock (1995) proposed three asymptotically conservative tests based on sup-bound, Bonferroni, and Scheffé-type confidence intervals. Torous, Valkanov, and Yan (2004) applied the Scheffé test to stock returns, and found evidence of predictability at shorter rather than longer horizons.


Lewellen (2004) proposed a test that gains more power by combining the conditional and unconditional tests, and that is distinguished by its imposition of a unit root on the predictor variable. Campbell and Yogo (2006) developed a new efficient Bonferroni test that is asymptotically valid under fairly general assumptions on the dynamics of the predictor variable.

On the contrary, the literature seems to pay less attention to bias correction of the point estimation. Stambaugh (1986, 1999) derived the O(T⁻¹) bias formula of the least-squares estimator in the predictive regression model with an AR(1) predictor, and this formula is often utilized by researchers, such as Stambaugh (1999) himself, Kothari and Shanken (1997), and Lewellen (2004), to obtain bias-adjusted estimators. Under the same model, Amihud and Hurvich (2004) obtained bias-reduced estimation via an augmented regression in which the estimated innovations of the predictor variable are included in the predictive regression. However, as shown in this paper, many of the predictor variables used to predict stock returns act more like an AR(p) process than an AR(1). In these cases, the existing AR(1)-based methods usually provide poor bias correction.

So far, nothing has been done to correct the estimation bias in predictive regression models with a more general AR(p) regressor, except the recent work of Amihud, Hurvich, and Wang (2010). The bias-correction method suggested by Amihud, Hurvich, and Wang (2010) is exactly an extension of Amihud and Hurvich (2004), but the model being considered differs from those usually used in empirical research. Specifically, they allow the predictor variable to be characterized by an AR(p) model; meanwhile, all p lags (rather than a single lag) of the predictor variable are also included in the predictive regression. Although they claim that failure to do so could result in extremely low asymptotic power for the predictability test, it remains interesting and important to develop a bias-correction method for the frequently-considered "single-lag" predictive regression model.

The present paper proposes a simulation-based method, known as "indirect inference" in the literature, to address the bias problem in the predictive regression model with a general AR(p) predictor. This methodology was first introduced by Smith (1993) and Gouriéroux, Monfort, and Renault (1993), and is applicable to a wide range of structural models. In particular, it has been proven useful for reducing finite sample bias in various dynamic models (e.g. MacKinnon and Smith, 1998; Gouriéroux, Renault, and Touzi, 2000; Gouriéroux, Phillips, and Yu, 2010), including dynamic panel models.

Relative to conventional bias-correction methods, the main advantage of indirect inference is its capability to reduce bias without any explicit form of the bias function. Instead, the bias function is implicitly calibrated via simulation with an auxiliary model. This appears profoundly desirable under the framework we consider, because the least-squares bias function becomes much more complicated when the AR order (p) is large, and no exact or approximate bias formula has been provided in the literature.

Our simulation studies indicate that the indirect inference method indeed provides good bias correction to the least-squares, although small increases in variance arise as the cost in some cases. Besides, the indirect inference estimator is proven to be consistent and asymptotically normal, and a new t-test for the predictive coefficient is then derived. As a benefit of the bias reduction, the new test involves little size distortion in most cases considered, and leads to more reliable inference.

The rest of the paper is organized as follows. Section 2 specifies the predictive regression model with an AR(p) regressor, and discusses the least-squares bias. Section 3 develops the indirect inference method and the corresponding asymptotic theory. Section 4 evaluates the performance of the indirect inference estimator relative to least-squares by means of simulation. Section 5 provides empirical illustrations of the indirect inference method by predicting S&P 500 equity returns with various lagged financial variables, and Section 6 concludes.

2 PREDICTIVE REGRESSION AND LEAST-SQUARES

We consider a predictive model usually seen in the financial literature, where the dependent variable $y_t$ is predicted by a lagged autoregressive variable $x_{t-1}$. The model is specified as
$$y_t = \alpha + \beta x_{t-1} + u_t, \tag{1}$$
$$x_t = \rho_0 + \rho_1 x_{t-1} + \cdots + \rho_p x_{t-p} + v_t, \tag{2}$$
where both $y_t$ and $x_t$ are observed at $t = 0, 1, \cdots, T$, and the innovation vector $(u_t, v_t)'$ is assumed to be serially independent and bivariate normal, i.e.
$$\begin{pmatrix} u_t \\ v_t \end{pmatrix} \overset{iid}{\sim} N\left( \begin{pmatrix} 0 \\ 0 \end{pmatrix}, \begin{pmatrix} \sigma_u^2 & \sigma_{uv} \\ \sigma_{uv} & \sigma_v^2 \end{pmatrix} \right). \tag{3}$$

For stationarity, we further assume that all roots of the polynomial $\lambda^p - \sum_{i=1}^{p} \rho_i \lambda^{p-i}$ lie within the unit circle. Let $y = (y_1, \cdots, y_T)'$, $X = [\imath_T, x]$, $x = (x_0, \cdots, x_{T-1})'$, and let $\imath_T$ denote a $T \times 1$ vector of ones. Then the least-squares estimator of (1) is given by
$$\begin{pmatrix} \hat{\alpha}^{LS} \\ \hat{\beta}^{LS} \end{pmatrix} = (X'X)^{-1} X' y,$$
and the finite-sample bias of $\hat{\beta}^{LS}$ has the expression in the following proposition.

Proposition 1. Under the assumption that $x_t$ is covariance-stationary, we have
$$E(\hat{\beta}^{LS} - \beta) = \frac{\sigma_{uv}}{\sigma_v^2} E\left[ \frac{\sum_{t=1}^{T} (x_{t-1} - \bar{x}) v_t}{\sum_{t=1}^{T} (x_{t-1} - \bar{x})^2} \right] = \frac{\sigma_{uv}}{\sigma_v^2} f(\rho_1, \cdots, \rho_p, T), \tag{4}$$
where $\bar{x} = \sum_{t=1}^{T} x_{t-1}/T$, and $f(\cdot) = E\left[ \sum_{t=1}^{T} (x_{t-1} - \bar{x}) v_t / \sum_{t=1}^{T} (x_{t-1} - \bar{x})^2 \right]$ is $O(T^{-1})$.

Remark. Observing that the numerator and the denominator of the expectation term in (4) are correlated, the finite-sample bias of $\hat{\beta}^{LS}$ will generally be nonzero. Besides, the bias can be critical when the sample size is relatively small, and will tend to reduce prediction accuracy and inference credibility in applications.

Consider the simplest case where $x_t$ is AR(1): we have $v_t = x_t - \rho_0 - \rho_1 x_{t-1}$, and thus the expectation term in (4) equals the bias of the least-squares estimator of $\rho_1$, i.e. $E(\hat{\rho}_1^{LS} - \rho_1)$, which has an $O(T^{-1})$ approximation $-(1 + 3\rho_1)/T$.¹ This leads to
$$E(\hat{\beta}^{LS} - \beta) = -\frac{\sigma_{uv}}{\sigma_v^2} \left( \frac{1 + 3\rho_1}{T} \right) + O(T^{-2}),$$
which is also the result of Stambaugh (1999). Researchers may construct bias-corrected estimators according to this bias formula; one example is that suggested by Kothari and Shanken (1997), which takes the form
$$\hat{\beta}^{KS} = \hat{\beta}^{LS} + \frac{\hat{\sigma}_{uv}^{LS}}{\hat{\sigma}_v^{LS2}} \left( \frac{1 + 3\hat{\rho}_1^{bc}}{T} \right), \tag{5}$$
where $\hat{\sigma}_{uv}^{LS}$ and $\hat{\sigma}_v^{LS2}$ are estimated based on the least-squares residuals in (1) and (2), and $\hat{\rho}_1^{bc} = (T\hat{\rho}_1^{LS} + 1)/(T - 3)$ is a bias-adjusted estimator of $\rho_1$. Although this kind of estimator is supposed to correct the bias well, it has the limitation that the predictor process must be AR(1).

¹ The bias of the least-squares estimator in an AR(1) model has been derived by Marriott and Pope (1954) and Kendall (1954).
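To fix ideas, the following minimal sketch computes $\hat{\beta}^{KS}$ from (5) for a given sample. It is only an illustration of the formula, not code from the original papers: the function name is ours, and the simple moment estimates of $\sigma_{uv}$ and $\sigma_v^2$ (dividing by $T$) are one of several possible degrees-of-freedom conventions.

```python
import numpy as np

def ks_estimator(y, x):
    """Bias-adjusted estimator (5): beta_KS = beta_LS + (s_uv/s_v^2)(1 + 3*rho_bc)/T."""
    T = len(y) - 1                                   # observations t = 0, ..., T
    Z = np.column_stack([np.ones(T), x[:-1]])        # regressors: constant and x_{t-1}
    a_ls, b_ls = np.linalg.lstsq(Z, y[1:], rcond=None)[0]
    u = y[1:] - a_ls - b_ls * x[:-1]                 # LS residuals of (1)
    r0, r1 = np.linalg.lstsq(Z, x[1:], rcond=None)[0]
    v = x[1:] - r0 - r1 * x[:-1]                     # LS residuals of the AR(1) fit (2)
    s_uv, s_v2 = np.mean(u * v), np.mean(v ** 2)     # simple moment estimates (a convention choice)
    rho_bc = (T * r1 + 1) / (T - 3)                  # bias-adjusted AR(1) coefficient
    return b_ls + (s_uv / s_v2) * (1 + 3 * rho_bc) / T
```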


With a general AR(p) predictor, no exact or approximate form of the expectation term $f(\rho_1, \cdots, \rho_p, T)$ has been derived, and bias correction based on the AR(1) assumption is expected to be less accurate. To see this, we perform some simulations to illustrate the bias performance of the AR(1)-based estimator $\hat{\beta}^{KS}$ when the true process of the predictor is an AR(2). Table 1 details the settings of the simulations and shows the results. One noteworthy point is that we fix the value of the largest characteristic root ($\lambda_1$) of the AR(2) process, and allow the other root ($\lambda_2$) to vary across simulations. Thus, the predictor process is actually AR(1) when $\lambda_2 = 0$, and is less like an AR(1) as $\lambda_2$ departs from zero. We may jump directly to the last column, which reports the absolute value of the relative bias ratio between $\hat{\beta}^{KS}$ and $\hat{\beta}^{LS}$. Roughly speaking, these values can be treated as the portion of the least-squares bias that is not corrected by $\hat{\beta}^{KS}$. The pattern is clear and verifies our concern: the farther $\lambda_2$ is from zero, the smaller the portion of the least-squares bias that can be taken care of by the AR(1)-based estimator $\hat{\beta}^{KS}$.

3 INDIRECT INFERENCE APPROACH

We attempt to correct the estimation bias by the indirect inference method, introduced by Gouriéroux, Monfort, and Renault (1993) and Smith (1993). The major advantage of the procedure is that an explicit bias function or its expansion is not required; the bias function is instead calibrated via simulation. Under the framework of the predictive regression considered here, this feature is profoundly desirable, as little is known about the least-squares bias function.

To illustrate the basic idea of indirect inference, we define the so-called binding function that maps $\Theta$ into $b_T(\Theta)$, given the sample size $T$, by
$$b_T(\theta) = E[\hat{\beta}^{LS}(\theta)], \tag{6}$$
where $\theta = (\beta, \sigma_{uv}, \sigma_v^2, \rho_1, \cdots, \rho_p) \in \Theta \subset R^{3+p}$. Under the assumption that the binding function $b_T(\theta)$ is uniformly continuous and one-to-one, the indirect inference estimator could be obtained by
$$\hat{\theta}^{II} = b_T^{-1}(\hat{\beta}^{LS}).$$
However, the one-to-one assumption is obviously violated here, and thus the binding function $b_T$ is not invertible. In fact, we need $p + 2$ more conditions that help define the relationship between all the elements of $\theta$.

To get around this, we propose a two-stage estimation procedure, inspired by the fact that the dependent variable $\{y_t\}_{t=0}^{T}$ never enters the process of $x_t$. This implies that the estimation of the autoregressive coefficients can be isolated from that of the rest of $\theta$, and can be done as a first stage. We then come back to the estimation of the predictive regression conditional on the estimates from the first stage. The full estimation procedure is elaborated as follows.

1. Indirect Inference on Autoregression

Let $\Gamma^0 = (\rho_1^0, \cdots, \rho_p^0)$ denote the true value of the autoregressive coefficient vector $\Gamma = (\rho_1, \cdots, \rho_p)$ in (2). Given any $\Gamma \in S$, with $S$ being the subset of $R^p$ that satisfies the stationarity conditions, we can draw simulated paths according to (2).³ This is achieved by drawing independent simulated innovation paths $\{v_t^h\}_{t=0}^{T}$, $h = 1, \cdots, H$, from $N(0, \sigma_v^2)$, and computing the desired simulated paths $\{x_t^h\}_{t=0}^{T}$.⁴ Let $\hat{\Gamma}^{LS}$ and $\hat{\Gamma}^{LS,h}$ denote the least-squares estimates of $\Gamma$ with the true data $\{x_t\}_{t=0}^{T}$ and with the $h$-th simulated path $\{x_t^h\}_{t=0}^{T}$, respectively. Then the indirect inference estimator of $\Gamma$ is defined by
$$\hat{\Gamma}^{II} = \arg\min_{\Gamma \in S} \left\| \hat{\Gamma}^{LS} - \frac{1}{H} \sum_{h=1}^{H} \hat{\Gamma}^{LS,h}(\Gamma) \right\|, \tag{7}$$
where $\| \cdot \|$ is some finite-dimensional distance metric. One might note that when the number of simulated paths tends to infinity (i.e. $H = \infty$),
$$\hat{\Gamma}^{II} = \arg\min_{\Gamma \in S} \left\| \hat{\Gamma}^{LS} - E\left[ \hat{\Gamma}^{LS,h} \mid \Gamma = \hat{\Gamma}^{II} \right] \right\|.$$
In other words, if we define a vector binding function as $b_T(\Gamma) = E[\hat{\Gamma}^{LS}(\Gamma)]$, then
$$\hat{\Gamma}^{II} = b_T^{-1}(\hat{\Gamma}^{LS}).$$
Given $\hat{\Gamma}^{II}$, we can further obtain the estimates of $\rho_0$ and $\sigma_v$ by
$$\hat{\rho}_0^{II} = \frac{\sum_{t=p}^{T} \left( x_t - \sum_{i=1}^{p} \hat{\rho}_i^{II} x_{t-i} \right)}{T - p + 1}, \qquad \hat{\sigma}_v^{II2} = \frac{\sum_{t=p}^{T} (\hat{v}_t^{II})^2}{T - 2p},$$
where $\hat{v}_t^{II} = x_t - \hat{\rho}_0^{II} - \sum_{i=1}^{p} \hat{\rho}_i^{II} x_{t-i}$.

³ As shown by Shaman and Stine (1988), the bias function of the least-squares estimator of $\Gamma$ depends on neither the variance of the residual ($\sigma_v^2$) nor the drift coefficient $\rho_0$. Thus, the two parameters can be set arbitrarily when drawing the simulated paths.

⁴ In general, initial values $(x_0^h, \cdots, x_{p-1}^h)$ are required when drawing the simulated paths. In practice, we can generate much longer simulated paths, say $\{x_t^h\}_{t=-m}^{T}$, with initial condition $(x_{-m+p-1}^h, \cdots, x_{-m}^h) = (x_0, \cdots, x_{p-1})$, and then drop the first $m$ observations. With the stationarity property of the process, the effect of the initial condition diminishes as $m \to \infty$.
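The first stage can be sketched as follows. This is an illustrative implementation of (7), not the authors' code: it uses a generic Nelder-Mead search with a quadratic distance metric (the fixed-point algorithm actually used in the paper is described in Section 5), and the values of H, the burn-in length, and the seed are arbitrary. Innovations are drawn once and reused across candidate values of $\Gamma$ so that the objective is deterministic.

```python
import numpy as np
from scipy.optimize import minimize

def ols_ar(x, p):
    """LS estimates (rho_0, rho_1, ..., rho_p) of an AR(p) with drift fitted to x."""
    Y = x[p:]
    Z = np.column_stack([np.ones(len(Y))] + [x[p - i:len(x) - i] for i in range(1, p + 1)])
    return np.linalg.lstsq(Z, Y, rcond=None)[0]

def indirect_inference_ar(x, p, H=500, burn=200, seed=0):
    """First-stage estimator (7): match the LS estimate on the real data to the
    average LS estimate over H simulated AR(p) paths (sigma_v and rho_0 are set
    arbitrarily, cf. footnote 3)."""
    rng = np.random.default_rng(seed)
    T = len(x) - 1
    V = rng.standard_normal((H, T + 1 + burn))       # innovations drawn once
    g_ls = ols_ar(x, p)[1:]                          # LS slopes on the real data

    def mean_sim_ls(gamma):
        est = np.zeros((H, p))
        for h in range(H):
            xs = np.zeros(T + 1 + burn)
            for t in range(p, len(xs)):
                xs[t] = xs[t - p:t][::-1] @ gamma + V[h, t]
            est[h] = ols_ar(xs[burn:], p)[1:]
        return est.mean(axis=0)                      # (1/H) * sum_h Gamma_LS,h(gamma)

    obj = lambda g: np.sum((g_ls - mean_sim_ls(g)) ** 2)
    return minimize(obj, g_ls, method="Nelder-Mead").x
```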

2. Indirect Inference on Predictive Regression

The estimation in this stage is conditional on $\{\hat{\sigma}_v^{II2}, \hat{\Gamma}^{II}\}$. More specifically, we treat $\{\hat{\sigma}_v^{II2}, \hat{\Gamma}^{II}\}$ as the true values of $\{\sigma_v^2, \Gamma\}$, and then the binding function (6) degenerates into
$$b_T\left(\theta \mid \sigma_v^2 = \hat{\sigma}_v^{II2}, \Gamma = \hat{\Gamma}^{II}\right) = b_T\left(\beta, \sigma_{uv}; \hat{\sigma}_v^{II2}, \hat{\Gamma}^{II}\right) = E\left[\hat{\beta}^{LS} \mid \beta, \sigma_{uv}; \hat{\sigma}_v^{II2}, \hat{\Gamma}^{II}\right].$$
However, it remains non-one-to-one. To make the indirect inference procedure applicable, we allow the covariance coefficient $\sigma_{uv}$ to be data-based when drawing the simulated paths. That is, given any value of $\beta$, we let $\sigma_{uv} = \tilde{\sigma}_{uv}(\beta)$, which is computed as
$$\tilde{\sigma}_{uv}(\beta) = \frac{\sum_{t=p}^{T} \tilde{u}_t \hat{v}_t^{II}}{\sqrt{(T - p - 1)(T - 2p)}},$$
where $\tilde{u}_t = y_t - \tilde{\alpha} - \beta x_{t-1}$, and $\tilde{\alpha} = \sum_{t=1}^{T} (y_t - \beta x_{t-1})/T$. The binding function further becomes
$$b_T\left(\beta, \tilde{\sigma}_{uv}(\beta); \hat{\sigma}_v^{II2}, \hat{\Gamma}^{II}\right) = E\left[\hat{\beta}^{LS} \mid \beta, \tilde{\sigma}_{uv}(\beta); \hat{\sigma}_v^{II2}, \hat{\Gamma}^{II}\right], \tag{8}$$
in which $\beta$ is the only parameter left to be estimated. Now, given some value of $\beta$, we are able to draw the simulated paths $\{y_t^h, x_t^h \mid \beta, \tilde{\sigma}_{uv}(\beta), \hat{\sigma}_v^{II}, \hat{\Gamma}^{II}\}_{t=0}^{T}$ according to (1) and (2).⁵ Letting $\hat{\beta}^{LS}$ and $\hat{\beta}^{LS,h}$ respectively denote the least-squares estimates of $\beta$ with the true data $\{y_t, x_t\}_{t=0}^{T}$ and with the $h$-th simulated path, the "two-stage indirect inference estimator" of $\beta$ is defined as
$$\hat{\beta}^{II} = \arg\min_{\beta \in R} \left| \hat{\beta}^{LS} - \frac{1}{H} \sum_{h=1}^{H} \hat{\beta}^{LS,h}\left(\beta, \tilde{\sigma}_{uv}(\beta); \hat{\sigma}_v^{II}, \hat{\Gamma}^{II}\right) \right|. \tag{9}$$
Again, when $H = \infty$,
$$\hat{\beta}^{II} = \arg\min_{\beta \in R} \left| \hat{\beta}^{LS} - E\left[\hat{\beta}^{LS,h} \mid \beta, \tilde{\sigma}_{uv}(\beta); \hat{\sigma}_v^{II}, \hat{\Gamma}^{II}\right] \right|.$$
And given (8), we have
$$\hat{\beta}^{II} = b_T^{-1}\left(\hat{\beta}^{LS}; \tilde{\sigma}_{uv}(\hat{\beta}^{II}), \hat{\sigma}_v^{II2}, \hat{\Gamma}^{II}\right). \tag{10}$$

⁵ When drawing simulated paths, we set $\alpha = \tilde{\alpha}$, $\sigma_u^2 = \tilde{\sigma}_u^2 = \sum_{t=1}^{T} \tilde{u}_t^2/(T-2)$, and $\rho_0 = \hat{\rho}_0^{II}$, even though they do not enter the binding function and thus almost never affect the indirect inference estimation when $H$ is large enough.
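A sketch of the second stage, conditioning on first-stage outputs, might look as follows. Again this is an illustration of (9) rather than the authors' implementation: the scalar search, H, burn-in, and seed are arbitrary choices; the inputs gamma, rho0, and sig_v are assumed to be the first-stage estimates as numpy values; and the covariance matrix built from $\tilde{\sigma}_u^2$, $\tilde{\sigma}_{uv}(\beta)$, and $\hat{\sigma}_v^{II2}$ is assumed to stay positive definite over the search range.

```python
import numpy as np
from scipy.optimize import minimize_scalar

def ols_beta(y, x):
    """LS slope from regressing y_t on a constant and x_{t-1}."""
    Z = np.column_stack([np.ones(len(y) - 1), x[:-1]])
    return np.linalg.lstsq(Z, y[1:], rcond=None)[0][1]

def stage2_ii(y, x, gamma, rho0, sig_v, H=300, burn=200, seed=1):
    """Second-stage estimator (9), conditional on the first-stage estimates;
    sigma_uv is recomputed from the data at every candidate beta, as in the text."""
    p, T = len(gamma), len(y) - 1
    rng = np.random.default_rng(seed)
    E = rng.standard_normal((H, T + 1 + burn, 2))    # raw normals, drawn once
    beta_ls = ols_beta(y, x)
    # first-stage AR residuals v^II_t for t = p, ..., T
    vII = x[p:] - rho0 - np.column_stack([x[p - i:len(x) - i] for i in range(1, p + 1)]) @ gamma

    def mean_sim_beta(beta):
        alpha = np.mean(y[1:] - beta * x[:-1])       # alpha~ in the text
        u = y[1:] - alpha - beta * x[:-1]            # u~_t for t = 1, ..., T
        s_uv = (u[p - 1:] @ vII) / np.sqrt((T - p - 1) * (T - 2 * p))
        s_u2 = np.sum(u ** 2) / (T - 2)              # footnote 5 convention
        L = np.linalg.cholesky(np.array([[s_u2, s_uv], [s_uv, sig_v ** 2]]))
        sims = np.zeros(H)
        for h in range(H):
            e = E[h] @ L.T                           # correlated (u_t, v_t) draws
            xs = np.zeros(T + 1 + burn)
            for t in range(p, len(xs)):
                xs[t] = rho0 + xs[t - p:t][::-1] @ gamma + e[t, 1]
            xs = xs[burn:]
            ys = np.empty(T + 1)
            ys[0] = 0.0                              # y_0 never enters the LS fit
            ys[1:] = alpha + beta * xs[:-1] + e[burn + 1:, 0]
            sims[h] = ols_beta(ys, xs)
        return sims.mean()

    obj = lambda b: (beta_ls - mean_sim_beta(b)) ** 2
    return minimize_scalar(obj, bracket=(beta_ls - 1.0, beta_ls + 1.0)).x
```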


The asymptotic behavior of $\hat{\beta}^{II}$ can be summarized by the following theorem.

Theorem 2. Under the stationarity assumption on the predictor $x_{t-1}$, and when the number of simulated paths $H = \infty$, we have
$$\sqrt{T}(\hat{\beta}^{II} - \beta) \Rightarrow N\left(0, \sigma_u^2 Q^{-1}\right),$$
where $Q = \mathrm{var}(x_t)$, which is a function of the autoregression coefficients. This means the indirect inference estimator is consistent and has an asymptotic distribution identical to that of the least-squares estimator.

4 SIMULATION STUDY

In this section, we investigate the relative finite-sample performance of the indirect inference estimator against the least-squares estimator by means of simulations. For each simulation, the numbers of replications and simulated paths ($H$) are set to 5,000 and 10,000, respectively.

We begin with simulations under arbitrary parameter settings to examine how the estimators perform as the values of some important parameters that determine the least-squares bias vary. According to Proposition 1, these important parameters should include the autoregressive coefficients that characterize the predictor variable, the covariance of $u_t$ and $v_t$, and the variance of $v_t$ relative to that of $u_t$.

The first simulation design illustrates the effect of the predictor's autoregressive structure on the least-squares bias. The effect is complicated to analyze when the AR order is high, because it depends on all of the autoregressive coefficients. To simplify the analysis, the predictor is assumed to follow an AR(2) model with the largest characteristic root fixed at 0.95, on purpose to feature the high persistence usually seen in the financial forecasting literature. The full data generating process (DGP) is constituted by (1), (2) and (3) with $T = 80$, $\alpha = \beta = \rho_0 = \sigma_u = \sigma_v = 1$, $\sigma_{uv} = -0.8$, $\rho_1 = \lambda_1 + \lambda_2$, $\rho_2 = -\lambda_1 \lambda_2$, $\lambda_1 = 0.95$, and $\lambda_2 = (-0.9, -0.8, \cdots, 0.9)$.
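For concreteness, a draw from this DGP can be generated as in the sketch below, where the function name and the burn-in are ours and $\lambda_2 = 0.5$ is just one illustrative value from the grid.

```python
import numpy as np

def simulate_dgp(T=80, alpha=1.0, beta=1.0, rho0=1.0, s_u=1.0, s_v=1.0,
                 s_uv=-0.8, lam1=0.95, lam2=0.5, burn=200, seed=0):
    """One draw from (1)-(3) with an AR(2) predictor parameterized by its roots."""
    rho1, rho2 = lam1 + lam2, -lam1 * lam2           # rho_1 = l1 + l2, rho_2 = -l1*l2
    rng = np.random.default_rng(seed)
    L = np.linalg.cholesky(np.array([[s_u ** 2, s_uv], [s_uv, s_v ** 2]]))
    e = rng.standard_normal((T + 1 + burn, 2)) @ L.T # iid bivariate normal (u_t, v_t)
    x = np.zeros(T + 1 + burn)
    for t in range(2, len(x)):
        x[t] = rho0 + rho1 * x[t - 1] + rho2 * x[t - 2] + e[t, 1]
    x = x[burn:]
    y = np.empty(T + 1)
    y[0] = 0.0                                       # initial y is immaterial for the LS fit
    y[1:] = alpha + beta * x[:-1] + e[burn + 1:, 0]
    return y, x
```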

Figure 1 depicts the simulation results by graphing some statistics as functions of $\lambda_2$. The least-squares estimator appears to be biased over the whole parameter space considered, and the bias function is nonlinear in the small root $\lambda_2$. It is expected that the bias function would be even more complicated as the AR order increases. On the contrary, the indirect inference estimator does not require an explicit form of the bias function of the base estimator (LS), and would remain applicable even when the process of the predictor involves a much higher lag order. As shown by Figure 1(b), the variance of the indirect inference estimator is larger than that of the least-squares, although the difference can be very minor when $\lambda_2 > 0$. This is generally true, and can be explained by the fact that indirect inference requires additional estimation of the predictor process while least-squares does not.

When conducting a conventional t-test for the predictive coefficient in finite samples, researchers may encounter a size-distortion problem attributable to two causes. The first is the estimation bias, which leads to a horizontal shift of the null distribution; the other is the tendency of the explanatory variable(s) to depart from stationarity, which results in a change in the shape of the distribution. In general, the only one we can take care of in the estimation phase is the bias. Thus, we focus on the size distortion induced by the bias, presuming the shape of the null distribution is not altered. To this end, it is more meaningful to investigate the standardized bias (SB), calculated as the ratio of the bias to the standard deviation of each estimator, rather than the raw value of the bias, which seems very small relative to the true $\beta (= 1)$. This is because the standardized bias has the sense of the horizontal distance by which the null distribution of the t-test statistic is shifted. As shown by Figure 1(c), the standardized bias of the least-squares is much more critical than the counterpart of the indirect inference estimator. Figure 1(d) further illustrates the size distortion of a two-tailed t-test with a nominal size of 5%, assuming that the actual null distribution is $N(0,1) + SB$. It shows that the test based on the indirect inference estimator would be preferred, because it involves very little size distortion induced by the estimation bias.

The next simulation investigates how the estimators perform when the covariance of $u_t$ and $v_t$ varies. The parameter settings of the DGP are $T = 80$, $\alpha = \beta = \rho_0 = \sigma_u = \sigma_v = 1$, $\lambda_1 = 0.95$, $\lambda_2 = 0$, and $\sigma_{uv} = (-1, -0.9, \cdots, 1)$. Although $\lambda_2 = 0$ implies that the predictor is AR(1), we still fit an AR(2) model in the first-stage estimation of the indirect inference, even though it would be less efficient. Simulation results are displayed in Figure 2. Consistent with Proposition 1, the bias of the least-squares is linear in the covariance parameter $\sigma_{uv}$ and equals zero when $\sigma_{uv} = 0$. The indirect inference estimator, in contrast, still appears to have virtually no bias irrespective of the value of $\sigma_{uv}$. Interestingly, the change of $\sigma_{uv}$ has very little effect on the standard deviation of the least-squares, so the consequent standardized bias seems to also be linear, but actually is not.


Figure 1: Finite-sample properties of $\hat{\beta}^{LS}$ and $\hat{\beta}^{II}$ as $\lambda_2$ varies. Panels: (a) bias; (b) variance; (c) standardized bias; (d) size distortion induced by bias.

Again, we compute the size distortion of the t-test under the assumption that the null distribution is shifted horizontally by a distance equal to the standardized bias of each estimator, without altering its shape. For the least-squares, the test appears to be well sized when $\sigma_{uv} = 0$, but the extent of distortion rises as the absolute value of $\sigma_{uv}$ increases. Benefiting from the bias reduction, the test based on the indirect inference estimator involves little size distortion for all values of $\sigma_{uv}$.

Figure 3 shows the simulation results with various $\sigma_u/\sigma_v$ ratios, holding other parameters unchanged. The parameter settings are $T = 80$, $\alpha = \beta = \rho_0 = 1$, $\lambda_1 = 0.95$, $\lambda_2 = 0$, $\sigma_v = 1$, $\sigma_u = (1, 2, \cdots, 10)$, and $\sigma_{uv} = -0.8\sigma_u$. The setting of $\sigma_{uv}$ implies a fixed correlation coefficient between $u_t$ and $v_t$ equal to $-0.8$. As shown by Figure 3(a), the absolute value of the least-squares bias increases linearly with the $\sigma_u/\sigma_v$ ratio. This can be explained by rewriting (4) in Proposition 1 as
$$E(\hat{\beta}^{LS} - \beta) = \frac{\sigma_{uv}}{\sigma_u \sigma_v} \cdot \frac{\sigma_u}{\sigma_v} E\left[ \frac{\sum_{t=1}^{T} (x_{t-1} - \bar{x}) v_t}{\sum_{t=1}^{T} (x_{t-1} - \bar{x})^2} \right] = -0.8 \cdot \frac{\sigma_u}{\sigma_v} E\left[ \frac{\sum_{t=1}^{T} (x_{t-1} - \bar{x}) v_t}{\sum_{t=1}^{T} (x_{t-1} - \bar{x})^2} \right],$$
and observing that the expectation term is fixed. Besides, the variance of the least-squares also increases with the $\sigma_u/\sigma_v$ ratio. This is because the unexplained signal from the residual


Figure 2: Finite-sample properties of $\hat{\beta}^{LS}$ and $\hat{\beta}^{II}$ as $\sigma_{uv}$ varies. Panels: (a) bias; (b) variance; (c) standardized bias; (d) size distortion induced by bias.

$u_t$ is strengthened as $\sigma_u$ becomes larger, while the signal from the explanatory variable (the predictor) remains unchanged. According to Theorem 2, the standard deviation of the least-squares has a linear relationship with $\sigma_u$ asymptotically, and this also seems to hold in our finite-sample simulations. As a result, the standardized bias of the least-squares appears to be constant as the $\sigma_u/\sigma_v$ ratio varies, and so does the size distortion of the t-test due to the shift of the null distribution. The indirect inference estimator continues to perform well in terms of bias, standardized bias, and the size of the t-test, although a small increase in variance is present as the cost.

We now turn to simulations based on DGPs estimated from real data, which may tell us more about the estimators in situations we actually encounter. There are fourteen simulation designs. For each simulation, the pseudo data are generated from (1), (2) and (3) with parameter values obtained from the empirical study in the next section. Specifically, we estimate 14 models in which the annual equity premium on the S&P 500 index is predicted by a different lagged variable, and then treat the indirect inference estimates of the parameters as the true values of the DGP.


Figure 3: Finite-sample properties of $\hat{\beta}^{LS}$ and $\hat{\beta}^{II}$ as $\sigma_u$ varies. Panels: (a) bias; (b) variance; (c) standardized bias; (d) size distortion induced by bias.

As mentioned earlier, indirect inference on the predictive regression model involves a two-stage estimation procedure, and the first-stage estimation of the autoregressive model has a great influence on the second-stage estimation of the desired predictive coefficient. Thus, we would first like to see how well the widely-known estimation bias problem of the autoregression can be addressed by indirect inference. Table 2 reports the simulation results of the first-stage estimation; the results of the least-squares estimator are also presented for comparison purposes. Overall, the least-squares bias for the autoregressive models is economically significant. On the contrary, the indirect inference estimator has virtually no bias, even in the cases where the predictor series features a large root and/or a high lag order. Besides, the indirect inference estimator appears to have a smaller root mean square error (RMSE) than the least-squares when the predictor variable is AR(1). The same does not seem to be true when the series is characterized by a higher-order model, because the RMSE is then almost determined by the variance, on which measure the indirect inference estimator tends to be slightly inferior.


Table 3 reports the simulation results for the estimation of the predictive regression. It is apparent that the indirect inference estimator has very little bias (in absolute value) compared with that of the least-squares in all cases. Besides, the indirect inference estimator appears to have a slightly larger variance than that of the least-squares, which can be viewed as the cost of the bias reduction. On the measure of RMSE, neither estimator dominates the other in all cases. This is because the two estimators are respectively predominant on the bias and the variance, which together make up the RMSE.

As before, we compute the standardized bias as the ratio of the bias to the standard deviation of each estimator, which gives a better sense of the magnitude of the bias. The standardized bias of the indirect inference estimator is far smaller than the counterpart of the least-squares in all cases. Judging from the previous simulation results, it is expected that testing based on the indirect inference estimator should be well immune from the bias-induced size distortion that usually accompanies the least-squares-based test.

Unlike the simulations with arbitrary settings, we here actually construct the t-statistic based on each estimator and simulate it, in order to account for the "complete" size distortion arising from both the horizontal shift and the change of shape of the null distribution caused by the autoregressive properties of the predictor. According to the asymptotic results in Theorem 2, the t-statistics are constructed as
$$t^{LS} = \frac{\hat{\beta}^{LS} - \beta}{\sqrt{\left( \sum_{t=1}^{T} (x_{t-1} - \bar{x})^2 \right)^{-1} \hat{\sigma}_u^{LS2}}} \quad \text{and} \quad t^{II} = \frac{\hat{\beta}^{II} - \beta}{\sqrt{\left( \sum_{t=1}^{T} (x_{t-1} - \bar{x})^2 \right)^{-1} \hat{\sigma}_u^{II2}}}, \tag{11}$$
respectively.
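Either statistic in (11) reduces to the same computation given the corresponding estimates of $\beta$ and $\sigma_u^2$; a minimal sketch (the function and argument names are ours):

```python
import numpy as np

def t_stat(beta_hat, beta_null, x, s_u2):
    """t-statistic of the form (11): (beta_hat - beta_null) / sqrt(s_u2 / SSX)."""
    xl = x[:-1]                                      # lagged regressors x_0, ..., x_{T-1}
    ssx = np.sum((xl - xl.mean()) ** 2)              # sum of squared demeaned regressors
    return (beta_hat - beta_null) / np.sqrt(s_u2 / ssx)
```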

The last four columns in Table 3 report the realized sizes when 5% and 10% nominal sizes are used for both left-tailed and right-tailed t-tests. Roughly speaking, the degree of size distortion of each test is consistent with the value of the standardized bias of the estimator on which the test is based. For cases such as the regressions where the excess stock return is predicted by the book-to-market ratio (b/m), default yield spread (dfy), or dividend-price ratio (d/p), the least-squares estimator tends to involve an extremely large standardized bias, and the $t^{LS}$-test encounters a critical size distortion problem. On the contrary, the $t^{II}$-test appears to be well sized in all cases, with exceptions for the left-tailed tests under the regression models where either b/m or d/p serves as the predictor of the excess stock return. Since the indirect inference estimator has very little standardized bias in these exceptional cases, we can attribute the size distortion to the change in shape of the null distribution induced by the high persistence of the predictor. Interestingly, the left-tailed tests with worse realized size are usually not employed in practice, because a positive coefficient is expected when the predictor is b/m or d/p. So far, we may conclude that testing based on the indirect inference estimator is more reliable than that based on the least-squares.

5 EMPIRICAL ILLUSTRATION

In this section, we illustrate the indirect inference method using the annual equity premium prediction models considered by Goyal and Welch (2008).⁶ In each model, the annual S&P 500 equity premium is predicted by a single lagged variable listed in Table 4. The data set being used had been updated to 2008 by Amit Goyal and is available on his website.

Assuming the observations of the equity premium and the predictor variable actually come from the model system (1), (2) and (3), the indirect inference estimate of the predictive coefficient can be obtained following the estimation procedure described in Section 3. However, two problems may be encountered when applying the procedure. First, the maximum lag order (p) of the predictor's process is usually unknown in practice. To get around this, we suggest a sequential method to determine it. Specifically, we start by fitting the predictor variable with an "AR(0)" model with a drift, and test the least-squares residuals with Breusch-Godfrey LM tests for no first- to fourth-order serial correlation. If none of the four tests rejects at the 10% level of significance, the lag order of the autoregressive model is determined to be 0. Otherwise, we increase the lag order by 1 and repeat the tests until the least-squares residuals no longer exhibit significant serial correlation.
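A sketch of this sequential rule, using the Breusch-Godfrey test as implemented in statsmodels (the cap max_p and the function name are ours):

```python
import numpy as np
import statsmodels.api as sm
from statsmodels.stats.diagnostic import acorr_breusch_godfrey

def select_ar_order(x, max_p=8, alpha=0.10):
    """Grow p until the BG LM tests at orders 1-4 all fail to reject at level alpha."""
    for p in range(max_p + 1):
        Y = x[p:]
        if p == 0:
            Z = np.ones((len(Y), 1))                 # "AR(0)": drift only
        else:
            Z = sm.add_constant(np.column_stack(
                [x[p - i:len(x) - i] for i in range(1, p + 1)]))
        res = sm.OLS(Y, Z).fit()
        pvals = [acorr_breusch_godfrey(res, nlags=q)[1] for q in range(1, 5)]
        if min(pvals) > alpha:                       # no significant serial correlation
            return p
    return max_p                                     # fall back if nothing passes
```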

Secondly, it is usually impossible to solve minimization problems such as (7) and (9) by the widely-used grid search method. For (9), the predictive coefficient $\beta$ is known only to lie in the set of real numbers, which is unbounded if no other economic constraint is imposed, and the grid search method is consequently not applicable. For (7) with a lag order higher than 1, even though the value of the autoregressive coefficient vector is bounded by the stationarity conditions, it is necessary but knotty to transform the conditions expressed in terms of the characteristic roots into conditions on the autoregressive coefficients. Moreover, multi-parameter grid search is time-consuming, and in this case the computation time tends to grow geometrically as the lag order increases.

⁶ We do not consider the monthly predictive regression models as in Goyal and Welch (2008), because extremely high autoregressive lag orders are selected for the predictors based on either the criterion we use in this paper, AIC, or SBC. Besides, models with predictors such as the default return spread (dfr) and the long term return (ltr) are also excluded, because a zero autoregressive lag order is selected with our criterion.


To overcome such difficulties, we adapt the algorithm suggested by Gouriéroux, Renault, and Touzi (2000) for the indirect inference estimation of an AR(2) model. We use the same notation as in Section 3. For the general case of the AR(p) model (2), suppose the real data are generated with the unknown true value of the autoregressive parameter vector $\Gamma = \Gamma^0$. We define a function
$$g_\lambda(\Gamma) = \Gamma + \lambda \left[ \hat{\Gamma}^{LS} - \frac{1}{H} \sum_{h=1}^{H} \hat{\Gamma}^{LS,h}(\Gamma) \right], \tag{12}$$
where $\hat{\Gamma}^{LS}$ is the least-squares estimate of $\Gamma$ with the true data, and $\hat{\Gamma}^{LS,h}(\Gamma)$ denotes the least-squares estimate of $\Gamma$ with the $h$-th simulated path $\{x_t^h\}_{t=0}^{T}$ drawn from the AR(p) model with some value of $\Gamma$. When $H = \infty$, we hope (12) is a strong contraction for some $\lambda$, with $\hat{\Gamma}^{II}$ as the unique fixed point. Thus, for a given $\hat{\Gamma}^{LS}$, we can construct the sequence $\{\hat{\Gamma}^{(n)}\}_{n \geq 0}$:
$$\hat{\Gamma}^{(0)} = \hat{\Gamma}^{LS} \quad \text{and} \quad \hat{\Gamma}^{(n+1)} = g_\lambda(\hat{\Gamma}^{(n)}),$$
and expect the convergence of the sequence to $\hat{\Gamma}^{II}$. The same algorithm is also used by Tanizaki, Hamori, and Matsubayashi (2005), with a slight modification that the contraction parameter $\lambda$ is allowed to depend on the iteration number $n$.⁷ In the applications of this paper, the contraction parameter is set as $\lambda = 0.2$, following Gouriéroux, Renault, and Touzi (2000). We obtain convergence in all cases considered with the terminal condition $\sum_{i=1}^{p} |\hat{\rho}_i^{(n+1)} - \hat{\rho}_i^{(n)}| < 10^{-4}$. The same algorithm can also be applied to solve the minimization problem (9), because the predictive coefficient $\beta$ serves as the only argument.
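A generic sketch of the iteration, where mean_sim_ls(gamma) is assumed to return $(1/H)\sum_h \hat{\Gamma}^{LS,h}(\Gamma)$, e.g. the helper of the same name in the first-stage sketch of Section 3:

```python
import numpy as np

def fixed_point_ii(gamma_ls, mean_sim_ls, lam=0.2, tol=1e-4, max_iter=500):
    """Iterate (12): Gamma(n+1) = Gamma(n) + lam * (Gamma_LS - mean simulated LS)."""
    g_ls = np.asarray(gamma_ls, dtype=float)
    g = g_ls.copy()
    for _ in range(max_iter):
        g_next = g + lam * (g_ls - mean_sim_ls(g))   # g_lambda in (12)
        if np.sum(np.abs(g_next - g)) < tol:         # terminal condition from the text
            return g_next
        g = g_next
    raise RuntimeError("fixed-point iteration did not converge")
```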

Now we turn to the empirical results. Table 4 reports the autoregressive order of each predictor process selected with the sequential LM test. Half of the 14 predictor processes appear to have a lag order higher than 1, which lends support to our concern that bias correction of the least-squares estimator based on the formula derived under the AR(1)-predictor assumption could often be inappropriate. Table 5 shows the estimation results for the predictor processes. As shown by the previous simulations, the least-squares estimator of the autoregressive coefficients can be severely biased. Thus, it is not surprising to see big differences between the least-squares and the indirect inference estimates, as the latter is able to take care of the possible bias. The last column reports the largest characteristic root implied by the estimates of the autoregressive coefficients. All of the implied roots are smaller than unity irrespective of the estimation method. Besides, in 13 of the 14 cases, the largest root implied by the indirect inference estimates is larger than that implied by the least-squares estimates. The only exception is the process of the term spread (tms), which has a much smaller largest root than the other processes. This suggests, though it requires future verification, that the least-squares is inclined to underestimate a large root.

⁷ In Tanizaki, Hamori, and Matsubayashi (2005), the contraction parameter is set as $\lambda^{(n)} = c^{n-1}$ with $c = 0.9$.

Table 6 reports the estimation results for the predictive regressions. Comparing the least-squares and the indirect inference estimates, we find an upward bias correction in the four cases in which the dividend yield (d/y), investment capital ratio (i/k), T-Bill rate (tbl), or net equity expansion (ntis) serves as the predictor, and a downward bias correction in the other cases. To go a step further, the estimates of the covariance between the innovations in the predictive regression and the autoregression are reported in Table 7. According to Proposition 1 and the signs of the ratio $\hat{\sigma}_{uv}/\hat{\sigma}_v^2$, we can expect a negative estimate of the expectation term in the proposition for all cases. As a result, the direction of the bias correction implied by the indirect inference estimate is always the same as the sign of $\hat{\sigma}_{uv}/\hat{\sigma}_v^2$.

Table 6 also reports the t-statistics based on both estimators, computed using formula (11). The values of the two t-statistics can differ considerably, and the indirect inference based t-test, which is bias-corrected, does not always lower or raise the significance when compared with the least-squares based one. At the 10% level, the two tests lead to the same inference about the predictive coefficient in all cases, with the only exception being the model where the equity premium is predicted by the dividend-price ratio (d/p), for which significance is not observed under the indirect inference based test. Given the simulation results in the previous section, one might prefer to believe the testing results based on the indirect inference estimator. However, as shown in the last four columns, the residual in the predictive regression appears to be serially correlated in six of the cases. For these cases, both tests might be unreliable even if we reconstruct each test with a consistent estimator of the asymptotic covariance matrix. For example, consider the simple model

$$y_t = \alpha + \beta x_{t-1} + u_t,$$
$$x_t = \theta + \rho x_{t-1} + v_t = \theta/(1-\rho) + v_t + \rho v_{t-1} + \rho^2 v_{t-2} + \cdots,$$
$$u_t = \gamma u_{t-1} + w_t = w_t + \gamma w_{t-1} + \gamma^2 w_{t-2} + \cdots,$$
where $|\rho| < 1$, $|\gamma| < 1$, and the innovation vector $(w_t, v_t)'$ is serially independent and normally distributed with mean zero and covariance matrix $\begin{pmatrix} \sigma_w^2 & \sigma_{wv} \\ \sigma_{wv} & \sigma_v^2 \end{pmatrix}$. Since $E(w_t v_{t+j}) = 0$ for all $j \neq 0$, we have
$$E(x_{t-1} u_t) = \frac{\gamma \sigma_{wv}}{1 - \rho\gamma},$$

which is generally nonzero regardless of the sample size. As a result, the least-squares estimator of $\beta$ is expected to be inconsistent, and the conventional t-test would be accompanied by an asymptotic size distortion. Certainly, the indirect inference approach would suffer from the same problem, as it is not designed to take account of the serial correlation. Although there exists a possibility of adapting our approach to this kind of model, it is left as an issue for future research.
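The covariance above is easy to verify by simulation; a sketch with hypothetical parameter values:

```python
import numpy as np
from scipy.signal import lfilter

# Monte Carlo check of E(x_{t-1} u_t) = gamma * sigma_wv / (1 - rho * gamma)
rng = np.random.default_rng(0)
rho, gamma, s_wv = 0.9, 0.5, -0.3                    # hypothetical parameter values
T = 1_000_000
L = np.linalg.cholesky(np.array([[1.0, s_wv], [s_wv, 1.0]]))
wv = rng.standard_normal((T, 2)) @ L.T               # serially independent (w_t, v_t)
u = lfilter([1.0], [1.0, -gamma], wv[:, 0])          # u_t = gamma*u_{t-1} + w_t
x = lfilter([1.0], [1.0, -rho], wv[:, 1])            # x_t = rho*x_{t-1} + v_t (theta = 0)
print(np.mean(x[:-1] * u[1:]))                       # Monte Carlo estimate
print(gamma * s_wv / (1 - rho * gamma))              # theory: about -0.2727
```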

6 CONCLUDING REMARKS

It is well known that the least-squares estimator is biased when the predictive regression contains an autoregressive predictor whose innovations are contemporaneously correlated with those driving the dependent variable. In consequence, the conventional asymptotic tests, such as the t-test, suffer from size distortion and lead to invalid inference. For the purpose of bias reduction, we have proposed the indirect inference method, which is simulation-based and applicable without any explicit form of the bias function or its expansion. Simulation studies show that the bias can be effectively reduced even when the autoregressive order of the predictor is high. Thanks to the small bias, the problem of size distortion is much alleviated when the t-test is built on the indirect inference estimator. The proposed method is also applied to investigate the predictability of stock returns, but little evidence of predictability is found. Although inference based on our approach is supposed to be more reliable, some evidence of serial correlation emerges to challenge the white noise assumption in our model. If this is the case, either the least-squares or the indirect inference could produce misleading inference. Fortunately, in the spirit of simulation, the indirect inference method is expected to be able to deal with serial correlation after some modifications. One drawback of the simulation-based estimation method is its compute-intensive nature. However, with the rapid advancement of computer technology and the development of numerical methods, the computation cost is becoming a minor problem in practice, and we can reasonably expect widespread use of indirect inference in the near future.


APPENDIX: PROOFS

A Proof of Proposition 1

Following Stambaugh (1999), $u = (u_0, \cdots, u_T)'$ can be decomposed as
$$u = \frac{\sigma_{uv}}{\sigma_v^2} v + \varepsilon,$$
where $v = (v_0, \cdots, v_T)'$. With the i.i.d. normality assumption, we have $E(\varepsilon \mid X) = E(\varepsilon \mid x_{-p}, x_{-p+1}, \cdots, x_{-1}, v_0, \cdots, v_T) = 0$. Thus,
$$E\begin{pmatrix} \hat{\alpha}^{LS} - \alpha \\ \hat{\beta}^{LS} - \beta \end{pmatrix} = E[(X'X)^{-1} X' u] = \frac{\sigma_{uv}}{\sigma_v^2} E[(X'X)^{-1} X' v],$$
where the second element can be expressed explicitly as
$$E(\hat{\beta}^{LS} - \beta) = \frac{\sigma_{uv}}{\sigma_v^2} E\left[ \frac{\sum_{t=1}^{T} (x_{t-1} - \bar{x}) v_t}{\sum_{t=1}^{T} (x_{t-1} - \bar{x})^2} \right].$$
The expectation term in the equation will generally be $O(T^{-1})$, because $\hat{\beta}^{LS}$ is $T^{1/2}$-consistent, as shown by Theorem 2 (cf. MacKinnon and Smith, 1998). Besides, the expectation term does not depend on $\sigma_v^2$, as a result of the fact that $\sigma_v$ serves as a scale parameter for both $x_t$ and $v_t$ for all $t$.

B Proof of Theorem 2

Utilizing (4) and (10), we have
$$\hat{\beta}^{II} = \hat{\beta}^{LS} - \frac{\hat{\sigma}_{uv}^{II}}{\hat{\sigma}_v^{II2}} E\left[ \frac{\sum_{t=1}^{T} (x_{t-1}^h - \bar{x}^h) v_t^h}{\sum_{t=1}^{T} (x_{t-1}^h - \bar{x}^h)^2} \,\middle|\, \Gamma = \hat{\Gamma}^{II} \right] = \hat{\beta}^{LS} - \frac{\hat{\sigma}_{uv}^{II}}{\hat{\sigma}_v^{II2}} f(\hat{\Gamma}^{II}, T), \tag{13}$$
where $E[\cdot] = f(\hat{\Gamma}^{II}, T)$, $\{x_{t-1}^h, v_t^h\}_{t=1}^{T}$ denotes the $h$-th simulated path based on the autoregressive coefficient vector $\Gamma = \hat{\Gamma}^{II}$, and $\bar{x}^h = \sum_{t=1}^{T} x_{t-1}^h / T$. Given that $\hat{\Gamma}^{II}$ is a consistent estimator of the true $\Gamma$ and $x_t$ is covariance-stationary, $x_t^h$ will meet the stationarity condition as well when $T \to \infty$.⁸ By the law of large numbers (LLN),
$$T^{-1} \sum_{t=1}^{T} \left( x_{t-1}^h - \bar{x}^h \right)^2 \overset{p}{\to} Q^h, \tag{14}$$
where $Q^h = \mathrm{var}(x_t^h)$. Moreover, since $(x_{t-1}^h - \bar{x}^h) v_t^h$ is a martingale difference sequence, the central limit theorem (CLT; Corollary 5.25 of White, 1984) leads to
$$T^{-1/2} \sum_{t=1}^{T} \left( x_{t-1}^h - \bar{x}^h \right) v_t^h \Rightarrow N\left(0, \hat{\sigma}_v^{II2} Q^h\right). \tag{15}$$
Combining (14) and (15) gives that $\sqrt{T} f(\hat{\Gamma}^{II}, T) \to 0$ as $T \to \infty$. And together with (13), it is revealed that $\sqrt{T}(\hat{\beta}^{II} - \beta)$ and $\sqrt{T}(\hat{\beta}^{LS} - \beta)$ share the same asymptotics.

What is left to be proven is the $\sqrt{T}$ asymptotics of $\hat{\beta}^{LS}$ or $\hat{\beta}^{II}$. Observing that
$$\sqrt{T}(\hat{\beta}^{LS} - \beta) = \frac{T^{-1/2} \sum_{t=1}^{T} (x_{t-1} - \bar{x}) u_t}{T^{-1} \sum_{t=1}^{T} (x_{t-1} - \bar{x})^2},$$
and making use of arguments similar to (14) and (15) yields the desired results:
$$T^{-1} \sum_{t=1}^{T} (x_{t-1} - \bar{x})^2 \overset{p}{\to} Q, \qquad T^{-1/2} \sum_{t=1}^{T} (x_{t-1} - \bar{x}) u_t \Rightarrow N\left(0, \sigma_u^2 Q\right),$$
where $Q = \mathrm{var}(x_t)$.

⁸ As shown by Shaman and Stine (1988), the bias function of the least-squares estimator $\hat{\Gamma}^{LS}$ is an $O(T^{-1})$ function of $T$ and $\Gamma$ itself. Thus, the indirect inference estimator can be expressed as $\hat{\Gamma}^{II} = \hat{\Gamma}^{LS} + O(T^{-1})$.

REFERENCES

Amihud, Y., and Hurvich, C. M. (2004), “Predictive Regression: A Reduced-Bias Estimation Method,” Journal of Financial and Quantitative Analysis, 39, 813−841.

Amihud, Y., Hurvich, C. M., and Wang, Y. (2010), “Predictive Regression With Order-p Autoregressive Predictors,” Journal of Empirical Finance, 17, 513−525.

Campbell, J.Y., and Shiller, R.J. (1988), “The Dividend-Price Ratio and Expectations of Future

Dividends and Discount Factors,” The Review of Financial Studies, 1, 195−228.

Campbell, J.Y., and Thompson, S.B. (2008), “Predicting Excess Stock Returns Out of Sample:

Can Anything Beat the Historical Average?” Review of Financial Studies, 21, 1509−1531.


Campbell, J. Y., and Yogo, M. (2006), “Efficient Tests of Stock Return Predictability,” Journal

of Financial Economics, 81, 27−60.

Cavanagh, C. L., Elliott, G., and Stock, J. H. (1995), “Inference in Models With Nearly Integrated Regressors,” Econometric Theory, 11, 1131−1147.

Engel, C., and West, K.D. (2005), “Exchange Rate and Fundamentals,” Journal of Political

Economy, 113, 485−517.

Engel, C., Mark, N.C., and West, K.D. (2008), “Exchange Rate Models Are Not as Bad as You Think,” NBER Chapters, in NBER Macroeconomics Annual 2007, Volume 22, National

Bureau of Economic Research, Inc., pp. 381−441.

Fama, E. F., and French, K. R. (1988), “Dividend Yields and Expected Stock Returns,” Journal

of Financial Economics, 22, 3−25.

Gouriéroux, C., Monfort, A., and Renault, E. (1993), “Indirect Inference,” Journal of Applied Econometrics, 8, S85−S118.

Gouriéroux, C., Renault, E., and Touzi, N. (2000), “Calibration by Simulation for Small Sample Bias Correction,” in Simulation-Based Inference in Econometrics: Methods and Applications, eds. Mariano, R. S., Schuermann, T., and Weeks, M., Cambridge University Press, pp. 328−358.

Gouriéroux, C., Phillips, P. C. B., and Yu, J. (2010), “Indirect Inference for Dynamic Panel Models,” Journal of Econometrics, 157, 68−77.

Goyal, A., and Welch, I. (2008), “A Comprehensive Look at the Empirical Performance of Equity Premium Prediction,” The Review of Financial Studies, 21, 1455−1508.

Groen, J. J. J. (2000), “The Monetary Exchange Rate Model as a Long-Run Phenomenon,” Journal of International Economics, 52, 299−319.

Groen, J.J.J. (2005), “Exchange Rate Predictability and Monetary Fundamentals in a Small Multi-Country Panel,” Journal of Money, Credit, and Banking, 37, 495−516.

Kendall, M. G. (1954), “Note on the Bias in the Estimation of Autocorrelation,” Biometrika,

41, 403−404.

Kothari, S., and Shanken, J. (1997), “Book-to-Market, Dividend Yield, and Expected Market Returns: A Time-Series Analysis,” Journal of Financial Economics, 44, 169−203.

(24)

Hodrick, R. J. (1992), “Dividend Yields and Expected Stock Returns: Alternative Procedures for Inference and Measurement,” The Review of Financial Studies, 5, 257−286.

Kilian L. (1999), “Exchange Rates and Monetary Fundamentals: What Do We Learn From Long Horizon Regressions?” Journal of Applied Econometrics, 14, 491−510.


Lewellen, J. (2004), “Predicting Returns With Financial Ratios,” Journal of Financial Economics, 74, 209−235.

MacKinnon, J. G., and Smith Jr., A. A. (1998), “Approximate Bias Correction in Econometrics,”

Journal of Econometrics, 85, 205−230.

Mankiw, N.G., and Shapiro, M.D. (1986), “Do We Reject Too Often? Small Sample Properties of Tests of Rational Expectations Models,” Economics Letters, 20, 139−145.

Mark, N. C. (1995), “Exchange Rates and Fundamentals: Evidence on Long-Horizon Predictability,” American Economic Review, 85, 201−218.

Mark, N. C., and Sul, D. (2001), “Nominal Exchange Rates and Monetary Fundamentals: Evidence From a Seventeen Country Panel,” Journal of International Economics, 53, 29−52.

Marriott, F. H. C., and Pope, J. A. (1954), “Bias in the Estimation of Autocorrelations,” Biometrika, 41, 393−402.

Molodtsova, T., and Papell, D. H. (2009), ”Out-of-Sample Exchange Rate Predictability With Taylor Rule Fundamentals,” Journal of International Economics, 77, 167−180.

Shaman, P., and Stine, R. A. (1988), “The Bias of Autoregressive Coefficient Estimators,” Journal of the American Statistical Association, 83, 842−848.

Staiger, D., Stock, J.H., and Watson, M.W. (1997), “The NAIRU, Unemployment and Monetary Policy,” Journal of Economic Perspectives, 11, 33−49.

Stambaugh, R. (1986), “Bias in Regressions With Lagged Stochastic Regressors,” Unpublished

Manuscript, University of Chicago, Chicago, IL.

Stambaugh, R. F. (1999), “Predictive Regressions,” Journal of Financial Economics, 54, 375−421.

Stock, J. H., and Watson, M. W. (1999), “Forecasting Inflation,” Journal of Monetary Economics, 44, 293−335.

(25)

Tanizaki, H., Hamori, S., and Matsubayashi, Y. (2005), “On Least-Squares Bias in the AR(p) Models: Bias Correction Using the Bootstrap Methods,” Statistical Papers, 47, 109−124.

Torous, W., Valkanov, R., and Yan, S. (2004), “On Predicting Stock Returns with Nearly Integrated Explanatory Variables,” Journal of Business, 77, 937−966.

White, H. (1984), Asymptotic Theory for Econometricians, Academic Press.

Table 1: Bias of AR(1)-Based Estimator When Predictor is AR(2)

λ2      bias(β̂^LS)   bias(β̂^KS)   |bias(β̂^KS)/bias(β̂^LS)|
−0.8    0.0219        0.0147       0.671
−0.7    0.0322        0.0194       0.602
−0.6    0.0361        0.0183       0.506
−0.5    0.0396        0.0172       0.434
−0.4    0.0398        0.0134       0.338
−0.3    0.0399        0.0103       0.258
−0.2    0.0400        0.0078       0.196
−0.1    0.0384        0.0042       0.111
 0.0    0.0363        0.0008       0.022
 0.1    0.0350       −0.0011       0.032
 0.2    0.0332       −0.0028       0.085
 0.3    0.0303       −0.0051       0.168
 0.4    0.0275       −0.0065       0.236
 0.5    0.0242       −0.0079       0.326
 0.6    0.0213       −0.0082       0.386
 0.7    0.0182       −0.0079       0.436
 0.8    0.0151       −0.0071       0.469

1 The DGP is (1), (2), and (3), with T = 80, α = ρ0 = σu = σv = 1, β = 0, σuv = −0.8, ρ1 = λ1 + λ2, and ρ2 = −λ1λ2. The two characteristic roots are set as λ1 = 0.9 and λ2 = (−0.8, −0.7, ..., 0.8).
2 The formula for β̂^KS is given in (5) in the text, with ρ̂1^LS obtained from an AR(1) regression of the predictor.


Table 2: Simulation Results: Autoregression of Predictor

Predictor    T    p   Largest Root   Method   |Bias|   Variance   RMSE
b/m         88    1   0.886          LS       0.044    0.004      0.078
                                     II       0.001    0.005      0.069
dfy         90    1   0.828          LS       0.041    0.005      0.081
                                     II       0.000    0.006      0.074
d/y        137    1   0.980          LS       0.034    0.001      0.049
                                     II       0.008    0.001      0.032
e/p        137    1   0.785          LS       0.026    0.003      0.062
                                     II       0.000    0.003      0.058
i/k         62    1   0.773          LS       0.058    0.009      0.112
                                     II       0.001    0.010      0.102
lty         90    1   0.977          LS       0.052    0.003      0.073
                                     II       0.016    0.002      0.046
svar       124    1   0.711          LS       0.025    0.005      0.072
                                     II       0.001    0.005      0.069
tms         89    2   0.407          LS       0.036    0.023      0.153
                                     II       0.002    0.024      0.155
d/p        137    3   0.966          LS       0.049    0.026      0.163
                                     II       0.006    0.027      0.164
infl        90    4   0.920          LS       0.098    0.054      0.241
                                     II       0.006    0.059      0.243
tbl         89    4   0.997          LS       0.084    0.075      0.280
                                     II       0.041    0.081      0.286
ntis        82    5   0.794          LS       0.078    0.082      0.289
                                     II       0.006    0.091      0.302
d/e        137    6   0.927          LS       0.094    0.060      0.250
                                     II       0.004    0.064      0.254
eqis        82    7   0.831          LS       0.153    0.110      0.338
                                     II       0.011    0.125      0.354

1 Estimated model: x_t = ρ0 + ρ1 x_{t−1} + ··· + ρp x_{t−p} + v_t, where x_t is the predictor in the predictive regression model.
2 |Bias|, Variance, and RMSE are defined as Σ_{i=1}^p |E(ρ̂_i − ρ_i)|, Σ_{i=1}^p Var(ρ̂_i), and the square root of Σ_{i=1}^p E[(ρ̂_i − ρ_i)²], respectively.


Table 3: Simulation Results: Predictive Regression

Predictor    T      β      Method  Bias(β̂)  Var(β̂)   RMSE(β̂)  |Bias(β̂)|/std(β̂)  Size_L 5%  Size_R 5%  Size_L 10%  Size_R 10%
b/m         88    0.131    LS      0.049     0.007     0.097     0.575              1.9%       12.5%       3.3%       21.3%
                           II     −0.002     0.008     0.088     0.022             10.5%        5.2%      16.8%       10.0%
dfy         90   −0.642    LS      1.005     6.454     2.732     0.396              2.6%        9.6%       5.75%      17.4%
                           II      0.015     6.781     2.604     0.006              6.1%        4.8%      12.5%       10.0%
d/y        137    0.077    LS     −0.002     0.001     0.037     0.062              6.0%        4.7%      11.2%        9.2%
                           II      0.000     0.001     0.037     0.008              5.2%        5.5%      10.2%       10.7%
e/p        137    0.074    LS      0.005     0.001     0.039     0.139              4.2%        6.7%       8.1%       12.4%
                           II      0.000     0.001     0.039     0.006              5.5%        5.2%      10.6%       10.0%
i/k         62  −13.2      LS     −0.149    37.710     6.143     0.024              5.5%        4.9%      11.0%        9.9%
                           II      0.038    37.974     6.162     0.006              5.3%        5.4%      10.4%       10.6%
lty         90   −0.584    LS      0.138     1.124     1.069     0.130              4.1%        6.5%       8.2%       11.9%
                           II      0.018     1.140     1.068     0.017              5.5%        5.3%      11.0%       10.0%
svar       124    0.085    LS      0.062     0.162     0.407     0.155              3.5%        6.4%       7.5%       11.8%
                           II     −0.005     0.164     0.405     0.011              5.3%        4.8%      10.7%        9.4%
tms         89    1.497    LS      0.072     2.308     1.521     0.047              4.7%        5.6%       9.3%       11.1%
                           II      0.002     2.315     1.522     0.001              5.2%        5.0%      10.3%       10.3%
d/p        137    0.031    LS      0.030     0.002     0.050     0.761              0.7%       15.3%       2.1%       26.0%
                           II      0.001     0.002     0.042     0.014             10.9%        5.5%      18.0%       10.3%
infl        90   −0.218    LS      0.016     0.236     0.487     0.032              4.8%        5.4%       9.6%       10.4%
                           II      0.001     0.238     0.488     0.002              5.3%        5.1%      10.2%       10.1%
tbl         89   −0.592    LS     −0.126     0.395     0.641     0.200              7.3%        3.4%      13.8%        7.0%
                           II     −0.036     0.403     0.636     0.057              5.4%        5.1%      10.6%        9.9%
ntis        82   −1.450    LS     −0.024     1.485     1.219     0.020              5.3%        5.0%      10.4%        9.8%
                           II     −0.011     1.491     1.221     0.009              5.2%        5.1%      10.3%       10.1%
d/e        137   −0.001    LS      0.007     0.002     0.047     0.157              3.9%        6.8%       7.8%       12.7%
                           II      0.000     0.002     0.048     0.001              5.7%        5.3%      11.1%       10.0%
eqis        82   −0.470    LS      0.006     0.062     0.249     0.025              5.2%        5.5%      10.1%       10.5%
                           II      0.001     0.062     0.249     0.003              5.4%        5.3%      10.5%       10.1%

1 Boldface denotes a value of |Bias(β̂)|/std(β̂) larger than 0.2.
2 Size_L,α and Size_R,α are respectively the realized sizes of the left-tailed and right-tailed tests with a nominal size of α.


Table 4: Predictors for S&P500 Equity Premium and the AR-Order Selections

Predictor   Definition   Time Span   AR-Order   χ²[1]   χ²[2]   χ²[3]   χ²[4]

b/m Book to Market 1921-2008 1 1.511 3.447 3.920 5.890

dfy Default Yield Spread 1919-2008 1 2.488 2.472 2.837 3.612

d/y Dividend Yield 1872-2008 1 0.016 2.983 3.219 3.613

e/p Earning Price Ratio 1872-2008 1 0.694 1.688 1.723 3.431

i/k Investment Capital Ratio 1947-2008 1 1.592 2.108 3.596 5.545

lty Long Term Yield 1919-2008 1 1.466 1.416 2.434 3.575

svar Stock Variance 1885-2008 1 2.569 4.469 4.729 5.945

tms Term Spread 1920-2008 2 0.607 0.782 5.727 5.219

d/p Dividend Price Ratio 1872-2008 3 0.609 1.081 2.723 2.756

infl Inflation 1919-2008 4 0.150 1.413 2.856 3.317

tbl T-Bill Rate 1920-2008 4 2.538 2.580 2.906 3.617

ntis Net Equity Expansion 1927-2008 5 0.055 1.525 2.363 3.027

d/e Dividend Payout Ratio 1872-2008 6 0.536 2.185 5.484 7.250

eqis Pct Equity Issuing 1927-2008 7 2.193 2.412 2.343 4.192

1 See Goyal and Welch (2008) and Amit Goyal's website for detailed variable descriptions.
2 χ²[q] is the Breusch-Godfrey LM test statistic, with a null hypothesis that there is no serial correlation up to order q.
3 Boldface denotes significance at the 10% level.
4 The lag order is selected by sequentially increasing the lag order of the AR model from zero until none of the four LM tests rejects at the 10% level.


Table 5: Estimation of the Autoregression

Predictor  Method    ρ̂0      ρ̂1      ρ̂2      ρ̂3      ρ̂4      ρ̂5      ρ̂6      ρ̂7    Largest Root
b/m        LS       0.089   0.842                                                          0.842
           II       0.063   0.886                                                          0.886
dfy        LS       0.003   0.787                                                          0.787
           II       0.002   0.828                                                          0.828
d/y        LS      −0.179   0.946                                                          0.946
           II      −0.071   0.980                                                          0.980
e/p        LS      −0.647   0.760                                                          0.760
           II      −0.581   0.785                                                          0.785
i/k        LS       0.010   0.717                                                          0.717
           II       0.008   0.773                                                          0.773
lty        LS       0.002   0.962                                                          0.962
           II       0.001   0.977                                                          0.977
svar       LS       0.010   0.685                                                          0.685
           II       0.009   0.711                                                          0.711
tms        LS       0.007   0.691  −0.181                                                  0.426
           II       0.007   0.709  −0.166                                                  0.407
d/p        LS      −0.314   0.807  −0.164   0.261                                          0.932
           II      −0.163   0.829  −0.158   0.281                                          0.966
infl       LS       0.007   0.671  −0.097  −0.120   0.303                                  0.866
           II       0.005   0.700  −0.076  −0.125   0.351                                  0.920
tbl        LS       0.003   1.062  −0.351   0.047   0.158                                  0.933
           II      −0.000   1.111  −0.349   0.045   0.190                                  0.997
ntis       LS       0.005   0.626  −0.006   0.019   0.041  −0.025                          0.671
           II       0.004   0.650   0.012   0.020   0.063  −0.012                          0.794
d/e        LS      −0.126   0.799  −0.250   0.115  −0.053  −0.215   0.362                  0.892
           II      −0.088   0.818  −0.238   0.114  −0.036  −0.229   0.402                  0.927
eqis       LS       0.075   0.504   0.142   0.023   0.268  −0.280  −0.103   0.044          0.819
           II       0.060   0.529   0.165   0.014   0.302  −0.309  −0.090   0.065          0.831


Table 6: Estimation of the Predictive Regression

Predictor    T    Method    α̂        β̂        t_β̂     χ²[1]   χ²[2]   χ²[3]   χ²[4]
b/m         88    LS      −0.046     0.180     2.259   2.459   2.607   3.432   3.658
                  II      −0.018     0.131     1.645   2.691   2.846   3.683   4.030
dfy         90    LS       0.052     0.327     0.122   0.751   2.314   2.085   4.328
                  II       0.064    −0.642    −0.240   1.067   2.531   2.283   4.379
d/y        137    LS       0.280     0.075     1.877   0.088   4.683   5.968   6.914
                  II       0.286     0.077     1.930   0.081   4.680   5.952   6.882
e/p        137    LS       0.255     0.079     1.843   0.533   7.102   8.830  10.148
                  II       0.241     0.074     1.725   0.569   7.149   8.863  10.171
i/k         62    LS       0.534   −13.343    −2.156   0.119   1.943   2.207   3.344
                  II       0.529   −13.225    −2.137   0.120   1.936   2.195   3.339
lty         90    LS       0.080    −0.460    −0.591   0.415   2.428   2.158   4.482
                  II       0.086    −0.584    −0.750   0.442   2.450   2.19    4.489
svar       124    LS       0.043     0.150     0.337   0.096   5.011   6.062   7.177
                  II       0.045     0.084     0.190   0.118   4.967   6.039   7.141
tms         89    LS       0.038     1.559     1.025   0.498   1.739   1.810   3.580
                  II       0.038     1.497     0.984   0.506   1.744   1.816   3.588
d/p        137    LS       0.239     0.061     1.655   0.939   5.317   6.773   8.181
                  II       0.143     0.031     0.837   1.819   6.166   7.745   9.180
infl        90    LS       0.062    −0.210    −0.458   0.671   2.637   2.729   4.701
                  II       0.062    −0.218    −0.475   0.686   2.651   2.751   4.718
tbl         89    LS       0.087    −0.714    −1.025   0.427   1.975   2.016   4.045
                  II       0.082    −0.592    −0.852   0.458   1.998   2.041   4.084
ntis        82    LS       0.080    −1.464    −1.764   0.151   1.110   0.963   7.879
                  II       0.080    −1.450    −1.748   0.164   1.123   0.961   7.842
d/e        137    LS       0.048     0.006     0.115   0.247   5.955   6.974   8.266
                  II       0.044    −0.001    −0.009   0.281   5.946   7.010   8.298
eqis        82    LS       0.141    −0.463    −2.408   0.121   0.692   0.738   2.406
                  II       0.143    −0.470    −2.441   0.114   0.687   0.752   2.435

1 Boldface denotes significance at the 10% level when a two-tailed test is conducted.


Table 7: Estimation of the Residual Covariance Matrix

Predictor  Method   σ̂u²            σ̂v²            σ̂uv             σ̂uv/σ̂v²
b/m        LS      3.574 × 10⁻²   1.992 × 10⁻²   −2.196 × 10⁻²    −1.103
           II      3.590 × 10⁻²   2.005 × 10⁻²   −2.210 × 10⁻²    −1.103
dfy        LS      3.811 × 10⁻²   2.784 × 10⁻⁵   −6.717 × 10⁻⁴   −24.131
           II      3.812 × 10⁻²   2.794 × 10⁻⁵   −6.741 × 10⁻⁴   −24.131
d/y        LS      3.257 × 10⁻²   1.984 × 10⁻²    1.453 × 10⁻³     0.073
           II      3.257 × 10⁻²   2.002 × 10⁻²    1.464 × 10⁻³     0.073
e/p        LS      3.260 × 10⁻²   7.081 × 10⁻²   −1.439 × 10⁻²    −0.203
           II      3.260 × 10⁻²   7.090 × 10⁻²   −1.441 × 10⁻²    −0.203
i/k        LS      2.697 × 10⁻²   5.917 × 10⁻⁶    2.033 × 10⁻⁵     3.436
           II      2.697 × 10⁻²   5.956 × 10⁻⁶    2.041 × 10⁻⁵     3.427
lty        LS      3.780 × 10⁻²   5.945 × 10⁻⁵   −1.425 × 10⁻⁴    −2.396
           II      3.780 × 10⁻²   5.962 × 10⁻⁵   −1.438 × 10⁻⁴    −2.412
svar       LS      3.447 × 10⁻²   9.273 × 10⁻⁴   −2.401 × 10⁻³    −2.590
           II      3.447 × 10⁻²   9.282 × 10⁻⁴   −2.404 × 10⁻³    −2.590
tms        LS      3.700 × 10⁻²   1.208 × 10⁻⁴   −3.096 × 10⁻⁴    −2.562
           II      3.700 × 10⁻²   1.208 × 10⁻⁴   −3.096 × 10⁻⁴    −2.582
d/p        LS      3.275 × 10⁻²   4.139 × 10⁻²   −3.015 × 10⁻²    −0.728
           II      3.291 × 10⁻²   4.175 × 10⁻²   −3.042 × 10⁻²    −0.729
infl       LS      3.802 × 10⁻²   9.331 × 10⁻⁴   −2.603 × 10⁻⁴    −0.279
           II      3.802 × 10⁻²   9.430 × 10⁻⁴   −2.755 × 10⁻⁴    −0.292
tbl        LS      3.700 × 10⁻²   1.904 × 10⁻⁴    3.344 × 10⁻⁴     1.756
           II      3.701 × 10⁻²   1.957 × 10⁻⁴    3.439 × 10⁻⁴     1.757
ntis       LS      3.767 × 10⁻²   1.919 × 10⁻⁴    9.756 × 10⁻⁵     0.509
           II      3.767 × 10⁻²   1.930 × 10⁻⁴    5.340 × 10⁻⁵     0.277
d/e        LS      3.342 × 10⁻²   5.095 × 10⁻²   −1.352 × 10⁻²    −0.265
           II      3.342 × 10⁻²   5.120 × 10⁻²   −1.369 × 10⁻²    −0.267
eqis       LS      3.647 × 10⁻²   4.165 × 10⁻³   −5.725 × 10⁻⁴    −0.137
           II      3.647 × 10⁻²   4.195 × 10⁻³   −6.826 × 10⁻⁴    −0.163

