
This section develops a shrinkage estimator designed to control the estimation risks associated with the predictive regression. We first discuss why such an estimator is needed.

3.1 Sources of estimation risk

OLS estimation of the slope coefficient in predictive regressions tends to carry more estimation risk than in ordinary regressions. This results from some distinctive characteristics of predictive regressions.

First, the predictor xit, the deviation of the exchange rate from fundamentals, is documented to be a highly persistent process. The high degree of persistence in forecasting variables, however, may generate a biased estimate of βi in small samples, although do not alter the

degree of persistence in forecasting variables for a given sample size, as demonstrated by Stambaugh (1999).
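The small-sample bias can be illustrated with a short Monte Carlo sketch. The setup is an assumption made for illustration and is not the paper's design: a single predictive regression y_{t+1} = β x_t + u_{t+1} with an AR(1) predictor x_{t+1} = φ x_t + v_{t+1}, whose innovations u and v are negatively correlated. The function name and all parameter values (phi, corr_uv, T, reps, seed) are hypothetical.

```python
import numpy as np

def simulate_ols_slopes(beta=0.0, phi=0.95, corr_uv=-0.9, T=50, reps=2000, seed=0):
    """Return OLS slope estimates from `reps` simulated predictive regressions."""
    rng = np.random.default_rng(seed)
    cov = np.array([[1.0, corr_uv], [corr_uv, 1.0]])
    slopes = np.empty(reps)
    for r in range(reps):
        shocks = rng.multivariate_normal(np.zeros(2), cov, size=T + 1)
        u, v = shocks[:, 0], shocks[:, 1]
        # Persistent AR(1) predictor (assumed process, started at zero).
        x = np.zeros(T + 1)
        for t in range(1, T + 1):
            x[t] = phi * x[t - 1] + v[t]
        # y_{t+1} depends on x_t; u_{t+1} is correlated with v_{t+1}.
        y = beta * x[:-1] + u[1:]
        xc = x[:-1] - x[:-1].mean()
        slopes[r] = xc @ (y - y.mean()) / (xc @ xc)
    return slopes

true_beta = 0.0
slopes = simulate_ols_slopes(beta=true_beta)
bias = slopes.mean() - true_beta
print(f"mean OLS bias with T=50: {bias:.3f}")
```

Under these assumed settings the average OLS slope lies above the true value, even though the estimator is consistent as T grows. The sign of the bias depends on the sign of the assumed innovation correlation.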

Second, the predictor has much smaller variability than the dependent variable (Mark and Sul, 2004). A small variance may not affect the asymptotic properties of the estimators. In finite samples, however, a small variance in the regressors implies a small signal-to-noise ratio and hence imprecise estimates of βi. This imprecision cannot be safely ignored once another key feature of predictive regressions, described next, is taken into account.

Third, the dependent variable is the long-horizon change in the log exchange rate, which amounts to overlapping sums of short-horizon changes in the log exchange rate. Valkanov (2003) has shown that, in some parametric settings, the OLS estimator of βi might be inconsistent when the small variance of the predictors is considered together with the overlaps in the dependent variable. In other words, the small-sample distribution of the estimates of βi becomes more dispersed as the forecast horizon k increases. This is another reflection of the imprecise estimation of βi due to the small variance.
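To make the overlap concrete, the following minimal sketch builds k-period changes from a log exchange rate series and checks that they equal rolling sums of one-period changes. The series values and the helper name `long_horizon_changes` are hypothetical.

```python
import numpy as np

def long_horizon_changes(log_s, k):
    """k-period changes s_{t+k} - s_t, one per starting date t."""
    log_s = np.asarray(log_s, dtype=float)
    return log_s[k:] - log_s[:-k]

log_s = np.array([0.00, 0.01, 0.03, 0.02, 0.05, 0.04])  # made-up log rates
one_period = np.diff(log_s)
k = 3
direct = long_horizon_changes(log_s, k)
# The same quantity written as rolling sums of k one-period changes.
rolling = np.convolve(one_period, np.ones(k), mode="valid")
print(direct, rolling)
```

Adjacent k-period observations share k − 1 of their one-period changes, so the dependent variable is serially correlated by construction, and more so as k grows.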

The estimation risk thus takes the form of either bias or inefficiency, according to whether it is due to the high persistence or the small variability. The lack of consensus among empirical findings in the literature may arise because the econometric methods adopted deal with only some of these risks. It is thus important to have a comprehensive approach capable of controlling the overall estimation risk. This is the subject of the next subsection.

3.2 A shrinkage estimator

To reduce the estimation risks in the context of predictive regressions, we consider a shrinkage estimator for βi defined by

\[
\tilde{\beta}_i = \omega_i \hat{\beta}_i + (1 - \omega_i)\,\bar{\beta},
\]

where ωi is the ‘shrinking factor’ that assigns weights to the two estimators in combination, βˆi is the OLS estimator, and the grand average estimator is given by

\[
\bar{\beta} = \frac{1}{N} \sum_{i=1}^{N} \hat{\beta}_i,
\]


which takes the average of all the OLS estimators of the slope coefficients of the N countries.
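As a minimal sketch of the definition above, the function below combines a vector of OLS slopes with their grand average. The slope values and the common weight of 0.5 are made up for illustration; in the paper the weight ωi is country-specific and must be estimated.

```python
import numpy as np

def shrinkage_estimates(beta_ols, omega):
    """Weighted average of each OLS slope and the grand average of all slopes."""
    beta_ols = np.asarray(beta_ols, dtype=float)
    # omega may be a scalar or one weight per country.
    omega = np.broadcast_to(np.asarray(omega, dtype=float), beta_ols.shape)
    beta_bar = beta_ols.mean()
    return omega * beta_ols + (1.0 - omega) * beta_bar

beta_ols = np.array([0.10, 0.40, -0.20, 0.30])  # hypothetical OLS slopes
beta_tilde = shrinkage_estimates(beta_ols, omega=0.5)
print(beta_tilde)
```

Setting the weight to 1 returns the OLS estimates unchanged, and setting it to 0 returns the grand average for every country, which are the two extremes the estimator interpolates between.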

As defined, the shrinkage estimator is a Stein-like estimator that linearly combines two alternative estimators differing in their bias and precision characteristics.

Stein (1956) showed that shrinking sample means toward a fixed constant can, under some conditions, reduce estimation risk. The key is to create a tradeoff between bias and accuracy by combining a biased estimator, represented by the fixed target, with the unbiased estimator, represented by the sample mean. Statistical decision theory suggests the existence of an interior optimum in the tradeoff between bias and precision.

Taking a properly weighted average of the unbiased and the biased estimators is one feasible approach to attaining the optimal tradeoff. The idea of shrinkage has been widely applied to the problem of portfolio selection (for example, Jorion, 1985; Dumas and Jacquillat, 1990) and to prediction (Copas, 1983).

Our shrinkage estimator inherits the idea of Stein’s (1956) seminal work. To see this, it is important to recognize that 1) the individual OLS estimator of the slope coefficient is asymptotically unbiased but carries large estimation errors due to the small variance and high persistence of the predictors, and that 2) the opposite is true for the grand average of all the cross-sectional slope estimators, which is subject to larger bias but smaller estimation errors because of the averaging. The intuition of the theory thus suggests that a weighted average of the OLS estimator and the grand mean estimator is likely to yield an estimator with less risk in the context of predictive regressions.

The suggested estimator for the slope coefficients uses cross-sectional information in a way similar to the panel-based estimators adopted by Groen (2000) and Mark and Sul (2001). Like Stein, we are concerned with the estimation of several unknown coefficients, and it is natural that a Stein-like estimator makes the best use of the information available. The implicit assumption underlying the use of cross-sectional information in our estimator, however, is very different from that of the panel-based estimators. The panel-based estimators are built on the assumption that the slope coefficients are the same for all countries in the cross-section, so pooling the data produces a single estimate. The shrinkage estimator, on the other hand, allows for a separate slope estimate for each country, as the OLS estimator does, but makes use of

estimator and the pooled estimator are the same to reduce the estimation errors, but differ in how the cross-sectional information is processed. Yet our shrinkage estimator has the advantage of producing more reasonable slope estimates. When the slope coefficients are truly the same across all countries, the difference between the slope estimates from the two approaches should be insignificant, because by definition the shrinkage estimator embraces this extreme situation. But if the slope parameters differ across countries, the panel approach does not give consistent slope estimates, because a false constraint is imposed in estimation, while our shrinkage estimator still does, as will be shown below.

The construction of the shrinkage estimator raises a few questions.

3.3 Shrinkage target

What is the rationale behind the choice of the grand average as the shrinkage target? The grand average has been commonly employed in the literature. Lindley (1962) demonstrates the risk dominance of the Stein estimator when the sample mean is shrunk toward the grand mean estimator. Our choice reflects a belief that the slope coefficients are different but close to each other. Like the constant target in the case of sample mean estimation, the grand average plays the role of a restricted estimator, while the OLS estimator is its unrestricted counterpart. Here the constraint imposed on our estimator is that the slope coefficients of all the countries are the same. Conventional econometric wisdom suggests that imposing any structure or restriction in estimation brings efficiency gains, whether or not the restriction holds. Consequently, our shrinkage estimator performs best when the slope coefficients coincide.

Alternative shrinkage targets, in the form of other structures or constraints, might exist. A shrinkage target that works well for one data set need not work well for another. It is thus difficult to tell a priori which one works well without checking out-of-sample performance. This study looks at a dataset of aggregate time series from major developed countries. Within this group, the transmission of shocks to fundamentals into exchange rates, if any, may vary with the macroeconomic environment or the aggregate policy in each country, but at a similar speed.

The assumption of slope heterogeneity is therefore considered a reasonable one. We will show the advantages of the chosen target through simulations.


3.4 Shrinking factor

Application of the shrinkage estimator is not possible without knowing the shrinking factor ωi. This subsection is devoted to how to determine and estimate it. The major goal is to minimize the estimation risk of the proposed estimator in finite samples. This requires specifying a loss function. We propose to use the quadratic loss function, which captures the tradeoff between bias and estimation errors through a quadratic distance between the true and the estimated slope coefficient. It gives rise to a risk function that calculates the mean squared error (MSE) as follows:

\[
\Re(\tilde{\beta}_i, \beta_i) = \mathrm{MSE}(\tilde{\beta}_i) = E\left[ (\tilde{\beta}_i - \beta_i)^2 \right].
\]

We can describe the ‘optimal’ shrinking factor further, provided that the probability structure between the estimators is given. Without assuming normality, suppose the OLS estimator and the grand average in finite samples are jointly distributed as:

\[
\begin{pmatrix} \hat{\beta}_i \\ \bar{\beta} \end{pmatrix}
\sim P\left(
\begin{pmatrix} \beta_i + \gamma_{i,ls} \\ \beta_i + \gamma_{i,g} \end{pmatrix},
\begin{pmatrix} \sigma_{i,ls}^2 & \rho_i \\ \rho_i & \sigma_{i,g}^2 \end{pmatrix}
\right),
\]

where P can be any distribution function as long as the first and second moments exist and are finite, and βi is the true value of slope coefficient i. Note that γi,ls is the bias of the OLS estimator due to the high degree of persistence in the predictors, and γi,g = (1/N) Σj βj − βi is the distance of the true grand average from the true slope coefficient i.

Therefore, when the slope coefficients are all the same, γi,g = 0. An important feature of the covariance matrix is that it allows the two estimators to be correlated through ρi. Such a correlation term is unusual when the shrinkage estimator is understood in an empirical Bayesian vein, where it combines a prior, given by the grand average estimate, with the sample information contained in the least squares estimate.

The typical empirical Bayesian approach assumes independence between prior and sample information, regardless of whether the prior is estimated from the same dataset as the grand

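estimations that takes place without considering the link.2 With some algebra,3 the optimal shrinking factor is

\[
\omega_i = 1 - \frac{\sigma_{i,ls}^2 + \gamma_{i,ls}^2 - \rho_i - \gamma_{i,ls}\gamma_{i,g}}
{\sigma_{i,ls}^2 + \gamma_{i,ls}^2 + \sigma_{i,g}^2 + \gamma_{i,g}^2 - 2\rho_i - 2\gamma_{i,ls}\gamma_{i,g}}.
\]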
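The algebra behind this weight can be outlined briefly. This is only a sketch under the moment assumptions stated above, reading ρi as the covariance between the two estimators (which is what the expression for ωi implies); the shorthand a_i, b_i, A_i, B_i, C_i is introduced here for convenience and is not the paper's notation. The full derivation is in the paper's appendix.

```latex
\begin{align*}
a_i &= \hat{\beta}_i - \beta_i, \qquad b_i = \bar{\beta} - \beta_i, \\
A_i &= E[a_i^2] = \sigma_{i,ls}^2 + \gamma_{i,ls}^2, \\
B_i &= E[b_i^2] = \sigma_{i,g}^2 + \gamma_{i,g}^2, \\
C_i &= E[a_i b_i] = \rho_i + \gamma_{i,ls}\gamma_{i,g}, \\
\mathrm{MSE}(\tilde{\beta}_i) &= E\big[(\omega_i a_i + (1-\omega_i) b_i)^2\big]
  = \omega_i^2 A_i + (1-\omega_i)^2 B_i + 2\omega_i(1-\omega_i) C_i, \\
0 &= \tfrac{1}{2}\,\frac{\partial\,\mathrm{MSE}(\tilde{\beta}_i)}{\partial \omega_i}
  = \omega_i (A_i + B_i - 2C_i) - (B_i - C_i), \\
\omega_i &= \frac{B_i - C_i}{A_i + B_i - 2C_i}
  = 1 - \frac{A_i - C_i}{A_i + B_i - 2C_i}.
\end{align*}
```

The denominator equals E[(a_i − b_i)^2], which is non-negative, so the first-order condition identifies a minimum whenever the two estimators are not identical.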

The expression offers some intuition about how the weight is allocated between the combined estimators.

A higher weight is assigned to the grand average when the OLS estimator has a higher bias and variance. But the estimator is drawn toward the OLS estimator when the correlation between the combined estimators is high. When γi,g or σi,g2 increases, indicating that the true slope coefficients are more scattered than the prior prescribes, the OLS estimator receives more weight. This is because the gains from combining information are small in both cases. Thus, very importantly, the weight adapts itself to the data.
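These comparative statics can be checked numerically with a small sketch of the weight formula. The function name and every input value below are hypothetical, and `cov` stands for ρi read as the covariance between the two estimators.

```python
def optimal_weight(var_ls, bias_ls, var_g, bias_g, cov):
    """MSE-minimizing weight on the OLS estimator, given the moments of both estimators."""
    a = var_ls + bias_ls**2        # second moment of the OLS error
    b = var_g + bias_g**2          # second moment of the grand-average error
    c = cov + bias_ls * bias_g     # cross moment of the two errors
    denom = a + b - 2.0 * c
    if denom <= 0.0:
        raise ValueError("the two estimators are indistinguishable")
    return (b - c) / denom

# Made-up baseline moments for one country.
base = optimal_weight(var_ls=0.04, bias_ls=0.05, var_g=0.01, bias_g=0.02, cov=0.0)
# A noisier OLS estimator shifts weight toward the grand average.
noisy_ols = optimal_weight(var_ls=0.08, bias_ls=0.05, var_g=0.01, bias_g=0.02, cov=0.0)
# A grand average further from the true slope shifts weight toward OLS.
far_target = optimal_weight(var_ls=0.04, bias_ls=0.05, var_g=0.01, bias_g=0.10, cov=0.0)
# A noisier grand average also shifts weight toward OLS.
noisy_target = optimal_weight(var_ls=0.04, bias_ls=0.05, var_g=0.03, bias_g=0.02, cov=0.0)
print(base, noisy_ols, far_target, noisy_target)
```

In practice none of these moments is known, which is why the weight has to be estimated from the data.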

It is crucial, both in theory and in applications, to estimate the shrinking factor consistently. Consistency of the weight is a prerequisite for establishing the consistency of the proposed estimator.4 Further, the weight controls the quality of the information carried by the combined estimators. Correct inference from the data depends heavily on a precisely estimated weight. A technical innovation of this paper is a consistent bootstrap procedure for estimating the weight parameter. The available estimator suggested by Judge and Mittelhammer (2004), though consistent, relies on an asymptotic argument, and its finite-sample performance can be satisfactory only when the sample size is sufficiently large.

This might not be the case in this study, because a panel of short time series is under investigation. Bootstrap estimators have typically proven more robust than their asymptotic counterparts. A more important justification for the bootstrap procedure is that no general consistent estimator has been proposed for the bias parameters such as γi,ls and γi,g. We appeal to the bootstrap procedure to render estimation and inference feasible in the subsequent empirical analysis. To save space, a detailed description of the bootstrap algorithm is given in the appendix.
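The paper's bootstrap algorithm is in its appendix and is not reproduced here. Purely to illustrate the idea, the sketch below uses a generic residual bootstrap under assumptions of our own: a one-period horizon, an AR(1) predictor in each country, the OLS estimates treated as the truth in the bootstrap world, and time indices resampled jointly so that the pairing of regression and predictor residuals is preserved. All function names, the simulated panel and the clipping of the weight to [0, 1] are hypothetical choices.

```python
import numpy as np

def ols(x, y):
    """Intercept, slope and residuals from regressing y on a constant and x."""
    xc = x - x.mean()
    slope = xc @ (y - y.mean()) / (xc @ xc)
    intercept = y.mean() - slope * x.mean()
    return intercept, slope, y - intercept - slope * x

def bootstrap_weights(x, y, n_boot=500, seed=0):
    """x: (N, T+1) predictors; y: (N, T), with y[i, t] regressed on x[i, t]."""
    rng = np.random.default_rng(seed)
    N, T = y.shape
    fit_y = [ols(x[i, :-1], y[i]) for i in range(N)]         # predictive regression
    fit_x = [ols(x[i, :-1], x[i, 1:]) for i in range(N)]     # assumed AR(1) for x
    beta_hat = np.array([f[1] for f in fit_y])
    beta_star = np.empty((n_boot, N))
    for b in range(n_boot):
        idx = rng.integers(0, T, size=T)  # same draws for u, v and all countries
        for i in range(N):
            (a_y, b_y, u), (a_x, phi, v) = fit_y[i], fit_x[i]
            xs = np.empty(T + 1)
            xs[0] = x[i, 0]
            for t in range(T):
                xs[t + 1] = a_x + phi * xs[t] + v[idx[t]]
            ys = a_y + b_y * xs[:-1] + u[idx]
            beta_star[b, i] = ols(xs[:-1], ys)[1]
    # Bootstrap errors of the OLS slope and of the grand average, country by country.
    err_ls = beta_star - beta_hat
    err_g = beta_star.mean(axis=1, keepdims=True) - beta_hat
    A = (err_ls**2).mean(axis=0)
    B = (err_g**2).mean(axis=0)
    C = (err_ls * err_g).mean(axis=0)
    return np.clip((B - C) / (A + B - 2.0 * C), 0.0, 1.0)

# Simulated panel with made-up parameters, only to exercise the function.
rng = np.random.default_rng(1)
N, T = 4, 60
x = np.zeros((N, T + 1))
for t in range(T):
    x[:, t + 1] = 0.9 * x[:, t] + rng.normal(size=N)
y = 0.1 * x[:, :-1] + rng.normal(size=(N, T))
omega_hat = bootstrap_weights(x, y, n_boot=200)
print(omega_hat)
```

The bootstrap moments replace the unknown variances, covariance and bias terms in the weight formula, so no separate estimator of the bias parameters is needed. A long-horizon version would also have to preserve the overlap structure of the dependent variable, which this sketch does not attempt.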

2 One more difference is that the Bayesian approach is in general not optimized with respect to any particular loss function.

3 A detailed derivation is given in the appendix.

4 We show in the appendix that the shrinkage estimator is consistent, given a consistent estimator of the shrinking factor.

