
Conditional Variance Estimation in Heteroscedastic Regression Models

Lu-Hung Chen1, Ming-Yen Cheng2 and Liang Peng3

Abstract

First, we propose a new method for estimating the conditional variance in heteroscedastic regression models. For heavy tailed innovations, this method is in general more efficient than either the local linear or the local likelihood estimator. Secondly, we apply a variance reduction technique to improve the inference for the conditional variance. The proposed methods are investigated through their asymptotic distributions and numerical performance.

Keywords. Conditional variance, local likelihood estimation, local linear estimation, log-transformation, variance reduction, volatility.

Short title. Conditional Variance.

1Department of Mathematics, National Taiwan University, Taipei 106, Taiwan. Email:

r91090@csie.ntu.edu.tw

2Department of Mathematics, National Taiwan University, Taipei 106, Taiwan. Email:

cheng@math.ntu.edu.tw

3School of Mathematics, Georgia Institute of Technology, Atlanta GA 30332-0160, USA. Email:

peng@math.gatech.edu


1 Introduction

Let $\{(Y_i, X_i)\}$ be a two-dimensional strictly stationary process having the same marginal distribution as $(Y, X)$. Let $m(x) = E(Y|X = x)$ and $\sigma^2(x) = \mathrm{Var}(Y|X = x) > 0$ be respectively the regression function and the conditional variance, and write
$$Y_i = m(X_i) + \sigma(X_i)\,\varepsilon_i. \tag{1.1}$$
Thus $E(\varepsilon_i|X_i) = 0$ and $\mathrm{Var}(\varepsilon_i|X_i) = 1$. When $X_i = Y_{i-1}$, model (1.1) includes ARCH(1) time series (Engle, 1982) and AR(1) processes with ARCH(1) errors as special cases. See Borkovec (2001) and Borkovec and Klüppelberg (2001) for probabilistic properties, and Ling (2004) and Chan and Peng (2005) for statistical properties, of such AR(1) processes with ARCH(1) errors.

There is an extensive literature on estimating the nonparametric regression function $m(x)$; see for example Fan and Gijbels (1996). Here we are interested in estimating the volatility function $\sigma^2(x)$. Several methods have been proposed; see Fan and Yao (1998) and the references cited therein. More specifically, Fan and Yao (1998) proposed to first estimate $m(x)$ by the local linear technique, i.e., $\hat m(x) = \hat a$ where
$$(\hat a, \hat b) = \arg\min_{(a, b)} \sum_{i=1}^{n} \big\{Y_i - a - b(X_i - x)\big\}^2 K\Big(\frac{X_i - x}{h_2}\Big), \tag{1.2}$$
where $K$ is a density function and $h_2 > 0$ is a bandwidth, and then to estimate $\sigma^2(x)$ by $\hat\sigma_1^2(x) = \hat\alpha_1$ where
$$(\hat\alpha_1, \hat\beta_1) = \arg\min_{(\alpha_1, \beta_1)} \sum_{i=1}^{n} \big\{\hat r_i - \alpha_1 - \beta_1(X_i - x)\big\}^2 W\Big(\frac{X_i - x}{h_1}\Big), \tag{1.3}$$
where $\hat r_i = \{Y_i - \hat m(X_i)\}^2$, $W$ is a density function and $h_1 > 0$ is a bandwidth. The main drawback of this conditional variance estimator is that it is not always positive.
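To fix ideas, a minimal sketch of this two-step procedure follows; it is an illustration only (the kernel, bandwidths and data below are placeholders, not the choices used later in the paper), with the local linear fit written in its closed form.

```python
import numpy as np

def epanechnikov(u):
    """A symmetric density kernel; any other choice of K or W would do."""
    return 0.75 * (1.0 - u**2) * (np.abs(u) <= 1.0)

def loclin(x0, X, Z, h, kernel=epanechnikov):
    """Closed-form local linear intercept at x0, i.e. \\hat a in (1.2) or \\hat alpha_1 in (1.3)."""
    d = X - x0
    w = kernel(d / h)
    s0, s1, s2 = w.sum(), (w * d).sum(), (w * d**2).sum()
    t0, t1 = (w * Z).sum(), (w * d * Z).sum()
    return (s2 * t0 - s1 * t1) / (s0 * s2 - s1**2 + 1e-12)  # tiny ridge avoids 0/0 in sparse windows

def sigma1_sq(x0, X, Y, h1, h2):
    """Fan-Yao estimator \\hat sigma_1^2(x): local linear fit (1.3) of squared residuals
    from the pilot mean fit (1.2).  Note the result is not guaranteed to be positive."""
    m_hat = np.array([loclin(xi, X, Y, h2) for xi in X])
    r_hat = (Y - m_hat)**2
    return loclin(x0, X, r_hat, h1)

# toy usage with arbitrary bandwidths
rng = np.random.default_rng(0)
X = rng.uniform(-2.0, 2.0, 200)
Y = 0.5 * X + (0.4 * np.exp(-2.0 * X**2) + 0.2) * rng.standard_normal(200)
print(sigma1_sq(0.0, X, Y, h1=0.4, h2=0.3))
```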

Recently, Yu and Jones (2004) studied the local Normal likelihood estimation mentioned in Fan and Yao (1998), with an expansion for $\log\sigma^2(x)$ instead of $\sigma^2(x)$. The main advantage of doing so is that the resulting conditional variance estimator is always positive. More specifically, the estimator proposed by Yu and Jones (2004) is $\hat\sigma_2^2(x) = \exp\{\hat\alpha_2\}$, where
$$(\hat\alpha_2, \hat\beta_2) = \arg\min_{(\alpha_2, \beta_2)} \sum_{i=1}^{n} \Big[\hat r_i \exp\{-\alpha_2 - \beta_2(X_i - x)\} + \alpha_2 + \beta_2(X_i - x)\Big] W\Big(\frac{X_i - x}{h_1}\Big). \tag{1.4}$$
Yu and Jones (2004) derived the asymptotic limit of $\hat\sigma_2^2(x)$ under very strict conditions, which require independence of the $(X_i, Y_i)$'s and $\varepsilon_i \sim N(0, 1)$.
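The minimisation in (1.4) has no closed form, so in practice it is carried out numerically. The sketch below is one possible implementation, not the one used by Yu and Jones (2004); it assumes the squared residuals $\hat r_i$ and kernel $W$ from the previous sketch are available.

```python
import numpy as np
from scipy.optimize import minimize

def epanechnikov(u):
    return 0.75 * (1.0 - u**2) * (np.abs(u) <= 1.0)

def sigma2_sq(x0, X, r_hat, h1, kernel=epanechnikov):
    """Local Normal likelihood estimator \\hat sigma_2^2(x) = exp(\\hat alpha_2) from (1.4),
    always positive by construction."""
    d = X - x0
    w = kernel(d / h1)
    keep = w > 0
    d, w, r = d[keep], w[keep], r_hat[keep]

    def local_neg_lik(par):
        a, b = par
        lin = np.clip(a + b * d, -30.0, 30.0)       # guard against overflow in exp()
        return np.sum(w * (r * np.exp(-lin) + lin))

    a0 = np.log(np.average(r, weights=w) + 1e-12)   # crude starting value for alpha_2
    fit = minimize(local_neg_lik, np.array([a0, 0.0]), method="Nelder-Mead")
    return np.exp(fit.x[0])
```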

Motivated by empirical evidence that financial data may have heavy tails (see, for example, Mittnik and Rachev, 2000), we propose the following new estimator for $\sigma^2(x)$. Rewrite (1.1) as
$$\log r_i = \nu(X_i) + \log(\varepsilon_i^2/d), \tag{1.5}$$
where $r_i = \{Y_i - m(X_i)\}^2$, $\nu(x) = \log(d\sigma^2(x))$ and $d$ satisfies $E\{\log(\varepsilon_i^2/d)\} = 0$. Based on the above equation, we first estimate $\nu(x)$ by $\hat\nu(x) = \hat\alpha_3$ where
$$(\hat\alpha_3, \hat\beta_3) = \arg\min_{(\alpha_3, \beta_3)} \sum_{i=1}^{n} \big\{\log(\hat r_i + n^{-1}) - \alpha_3 - \beta_3(X_i - x)\big\}^2 W\Big(\frac{X_i - x}{h_1}\Big). \tag{1.6}$$
Note that we employ $\log(\hat r_i + n^{-1})$ instead of $\log\hat r_i$ to avoid $\log(0)$. Next, noting that $E(\varepsilon_i^2|X_i) = 1$ and $r_i = \exp\{\nu(X_i)\}\,\varepsilon_i^2/d$, we estimate $d$ by
$$\hat d = \Big[\frac{1}{n}\sum_{i=1}^{n} \hat r_i \exp\{-\hat\nu(X_i)\}\Big]^{-1}.$$
Therefore our new estimator for $\sigma^2(x)$ is defined as
$$\hat\sigma_3^2(x) = \exp\{\hat\nu(x)\}/\hat d.$$
Intuitively, the log-transformation in (1.5) makes the data less skewed, and thus the new estimator may handle heavy tailed errors more efficiently than estimators such as $\hat\sigma_1^2(x)$ and $\hat\sigma_2^2(x)$. Peng and Yao (2003) investigated this effect in least absolute deviations estimation of the parameters in ARCH and GARCH models. Note that this new estimator $\hat\sigma_3^2(x)$ is always positive.
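A minimal sketch of the new estimator follows, assuming the squared residuals $\hat r_i$ from a pilot mean fit are available; the local linear helper repeats the one sketched above and the bandwidth is a placeholder.

```python
import numpy as np

def epanechnikov(u):
    return 0.75 * (1.0 - u**2) * (np.abs(u) <= 1.0)

def loclin(x0, X, Z, h, kernel=epanechnikov):
    """Local linear intercept at x0, as in the earlier sketch."""
    d = X - x0
    w = kernel(d / h)
    s0, s1, s2 = w.sum(), (w * d).sum(), (w * d**2).sum()
    return (s2 * (w * Z).sum() - s1 * (w * d * Z).sum()) / (s0 * s2 - s1**2 + 1e-12)

def sigma3_sq(x0, X, r_hat, h1):
    """Log-transform estimator \\hat sigma_3^2(x) = exp(\\hat nu(x)) / \\hat d from (1.5)-(1.6)."""
    n = len(X)
    z = np.log(r_hat + 1.0 / n)                          # log(\hat r_i + n^{-1}) avoids log(0)
    nu_at_data = np.array([loclin(xi, X, z, h1) for xi in X])
    d_hat = 1.0 / np.mean(r_hat * np.exp(-nu_at_data))   # \hat d
    return np.exp(loclin(x0, X, z, h1)) / d_hat          # always positive
```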

We organize this paper as follows. In Section 2, the asymptotic distribution of the new estimator is given and some theoretical comparisons with $\hat\sigma_1^2(x)$ and $\hat\sigma_2^2(x)$ are addressed as well. In Section 3, we apply the variance reduction technique of Cheng et al. (2007) to the conditional variance estimators $\hat\sigma_1^2(x)$ and $\hat\sigma_3^2(x)$ and provide the limiting distributions. Bandwidth selection for the proposed estimators is discussed in Section 4. A simulation study and a real application are presented in Section 5. All proofs are given in Section 6.

2 Asymptotic Normality

To derive the asymptotic limit of our new estimator, we impose the following regularity conditions. Denote by $p(\cdot)$ the marginal density function of $X$.

(C1) For a given point $x$, the functions $E(Y^3|X = z)$, $E(Y^4|X = z)$ and $p(z)$ are continuous at the point $x$, and $\ddot m(z) = \frac{d^2}{dz^2}m(z)$ and $\ddot\sigma^2(z) = \frac{d^2}{dz^2}\sigma^2(z)$ are uniformly continuous on an open set containing the point $x$. Further, assume $p(x) > 0$;

(C2) $E\{Y^{4(1+\delta)}\} < \infty$ for some $\delta \in (0, 1)$;

(C3) The kernel functions $W$ and $K$ are symmetric density functions each with a bounded support in $(-\infty, \infty)$. Further, there exists $M > 0$ such that $|W(x_1) - W(x_2)| \le M|x_1 - x_2|$ for all $x_1$ and $x_2$ in the support of $W$, and $|K(x_1) - K(x_2)| \le M|x_1 - x_2|$ for all $x_1$ and $x_2$ in the support of $K$;

(C4) The strictly stationary process $\{(Y_i, X_i)\}$ is absolutely regular, i.e.,
$$\beta(j) := \sup_{i \ge 1} E\Big\{\sup_{A \in \mathcal F_{i+j}^{\infty}} \big|P(A\,|\,\mathcal F_1^{i}) - P(A)\big|\Big\} \to 0 \quad \text{as } j \to \infty,$$
where $\mathcal F_i^j$ is the $\sigma$-field generated by $\{(Y_k, X_k) : k = i, \ldots, j\}$, $j \ge i$. Further, for the same $\delta$ as in (C2),
$$\sum_{j=1}^{\infty} j^2\,\beta^{\delta/(1+\delta)}(j) < \infty;$$

(C5) As $n \to \infty$, $h_i \to 0$ and $\liminf_{n\to\infty} nh_i^4 > 0$ for $i = 1, 2$.


Our main result is as follows.

Theorem 1. Under the regularity conditions (C1)–(C5), we have
$$\sqrt{nh_1}\,\big\{\hat\sigma_3^2(x) - \sigma^2(x) - \theta_{n3}\big\} \xrightarrow{d} N\big(0,\; p(x)^{-1}\sigma^4(x)\lambda^2(x)R(W)\big),$$
where $\lambda^2(x) = E\{(\log(\varepsilon^2/d))^2\,|\,X = x\}$, $R(W) = \int W^2(t)\,dt$ and
$$\theta_{n3} = \frac{1}{2}h_1^2\,\sigma^2(x)\,\ddot\nu(x)\int t^2 W(t)\,dt - \frac{1}{2}h_1^2\,\sigma^2(x)\,E\{\ddot\nu(X_1)\}\int t^2 W(t)\,dt + o(h_1^2 + h_2^2).$$

Next we compare $\hat\sigma_3^2(x)$ with $\hat\sigma_1^2(x)$ and $\hat\sigma_2^2(x)$ in terms of their asymptotic biases and variances. It follows from Fan and Yao (1998) and Yu and Jones (2004) that, for $i = 1, 2$,
$$\sqrt{nh_1}\,\big\{\hat\sigma_i^2(x) - \sigma^2(x) - \theta_{ni}\big\} \xrightarrow{d} N\big(0,\; p(x)^{-1}\sigma^4(x)\bar\lambda^2(x)R(W)\big),$$
where $\bar\lambda^2(x) = E\{(\varepsilon^2 - 1)^2\,|\,X = x\}$,
$$\theta_{n1} = \frac{1}{2}h_1^2\,\ddot\sigma^2(x)\int t^2 W(t)\,dt + o(h_1^2 + h_2^2) \quad\text{and}\quad \theta_{n2} = \frac{1}{2}h_1^2\,\sigma^2(x)\,\frac{d^2}{dx^2}\{\log\sigma^2(x)\}\int t^2 W(t)\,dt + o(h_1^2 + h_2^2).$$

Remark 1. If $\hat\nu(X_i)$ in $\hat d$ is replaced by another local linear estimate with either a smaller order of bandwidth than $h_1$ or a higher order kernel than $W$, then the asymptotic squared bias of $\hat\sigma_3^2(x)$ is the same as that of $\hat\sigma_2^2(x)$, which may be larger or smaller than the asymptotic squared bias of $\hat\sigma_1^2(x)$; see Yu and Jones (2004) for detailed discussions.

Remark 2. Suppose that, given $X = x$, $\varepsilon$ has a $t$-distribution with $m$ degrees of freedom. Then the ratios of $\lambda^2(x)$ to $\bar\lambda^2(x)$ are 0.269, 0.497, 0.674, 0.848, 1.001 for $m = 5, 6, 7, 8, 9$, respectively. That is, $\hat\sigma_3^2(x)$ may have a smaller variance than both $\hat\sigma_1^2(x)$ and $\hat\sigma_2^2(x)$ when $\varepsilon$ has a heavy tailed distribution.
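For any candidate innovation law, the two variance factors can be approximated by simulation, as in the rough sketch below; the $t_m$ innovation is scaled to unit variance here, which is one possible convention, so the resulting ratios are illustrative rather than a reproduction of the figures quoted in the remark.

```python
import numpy as np

def variance_factors(eps):
    """Monte Carlo approximations of lambda^2 = Var(log eps^2), the factor for sigma3,
    and bar-lambda^2 = E(eps^2 - 1)^2, the factor for sigma1 and sigma2.
    (Since d is chosen so that E log(eps^2/d) = 0, lambda^2 is simply this variance.)"""
    lam2 = np.var(np.log(eps**2))
    lam2_bar = np.mean((eps**2 - 1.0)**2)
    return lam2, lam2_bar

rng = np.random.default_rng(0)
for m in (5, 7, 9):
    eps = rng.standard_t(m, size=1_000_000) / np.sqrt(m / (m - 2))  # unit-variance t_m
    lam2, lam2_bar = variance_factors(eps)
    print(m, lam2 / lam2_bar)   # smaller ratios favour the log-transform estimator
```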

Remark 3. Note that the regularity conditions (C1)–(C5) were employed by Fan and Yao (1998) as well. However, it follows from the proof in Section 6 that we could replace condition (C2) by $E\{|\log(\varepsilon^2)|^{2+\delta}\} < \infty$ for some $\delta > 0$ in deriving the asymptotic normality of $\hat\nu(x)$. Condition (C2) is only employed to ensure that $\hat d - d = O_p(n^{-1/2})$. As a matter of fact, we only need $\hat d - d = o_p\{(nh_1)^{-1/2}\}$ to derive the asymptotic normality of $\hat\sigma_3^2(x)$. In other words, the asymptotic normality of $\hat\sigma_3^2(x)$ may still hold even when $E(Y^4) = \infty$. This is different from other conditional variance estimators such as $\hat\sigma_1^2(x)$ and $\hat\sigma_2^2(x)$, which require at least $E(Y^4) < \infty$ to ensure asymptotic normality.

3 Variance Reduced Estimation

Here we apply the variance reduction techniques proposed by Cheng et al. (2007), in the context of nonparametric estimation of $m(x)$, to the conditional variance estimators $\hat\sigma_1^2(x)$ and $\hat\sigma_3^2(x)$. The reason why we do not consider $\hat\sigma_2^2(x)$ here is that its asymptotic normality was derived by Yu and Jones (2004) under much more stringent conditions than those required by the other two estimators. The idea of our variance reduction strategy is to construct a linear combination of either $\hat\sigma_1^2(z)$ or $\hat\sigma_3^2(z)$ at three points around $x$ such that the asymptotic bias is unchanged. The details are given below.

For any given point $x$, let $\beta_{x,0}, \beta_{x,1}, \beta_{x,2}$ be a grid of equally spaced points, with bin width $\gamma h_1 = \beta_{x,1} - \beta_{x,0}$, such that $x = \beta_{x,1} + l\gamma h_1$ for some $l \in [-1, 1]$. Then, as in Cheng et al. (2007), our variance reduction estimators for $\sigma^2(x)$ are defined as
$$\tilde\sigma_j^2(x) = \frac{l(l-1)}{2}\,\hat\sigma_j^2(\beta_{x,0}) + (1 - l^2)\,\hat\sigma_j^2(\beta_{x,1}) + \frac{l(l+1)}{2}\,\hat\sigma_j^2(\beta_{x,2}), \tag{3.1}$$
for $j = 1$ and $3$. Suppose that $\mathrm{Supp}(\sigma)$ is bounded, say $\mathrm{Supp}(\sigma) = [0, 1]$. Since $\beta_{x,0} < x < \beta_{x,2}$, the points $\beta_{x,0}$ and $\beta_{x,2}$ would fall outside $\mathrm{Supp}(\sigma)$ if $x$ is close to the endpoints. Therefore we take $\gamma(x) = \min\big\{\gamma,\; x/\{(1 + l)h_1\},\; (1 - x)/\{(1 - l)h_1\}\big\}$ so that $\beta_{x,0}, \beta_{x,1}, \beta_{x,2} \in \mathrm{Supp}(\sigma) = [0, 1]$ at all times.

The following theorem gives the asymptotic limits of the variance reduced estimators.

Theorem 2. Under the conditions of Theorem 1, for an interior point $x$ we have
$$\sqrt{nh_1}\,\big\{\tilde\sigma_1^2(x) - \sigma^2(x) - \theta_{n1}\big\} \xrightarrow{d} N\Big(0,\; \big\{R(W) - l^2(1 - l^2)C(\gamma)\big\}\,p(x)^{-1}\sigma^4(x)\bar\lambda^2(x)\Big)$$
and
$$\sqrt{nh_1}\,\big\{\tilde\sigma_3^2(x) - \sigma^2(x) - \theta_{n3}\big\} \xrightarrow{d} N\Big(0,\; \big\{R(W) - l^2(1 - l^2)C(\gamma)\big\}\,p(x)^{-1}\sigma^4(x)\lambda^2(x)\Big),$$
where $C(s, t) = \int W(u - st)\,W(u + st)\,du$ and $C(\gamma) = \frac{3}{2}C(0, \gamma) - 2C(\frac{1}{2}, \gamma) + \frac{1}{2}C(1, \gamma)$.

Hence $\tilde\sigma_j^2(x)$ has the same asymptotic bias as $\hat\sigma_j^2(x)$ for $j = 1, 3$. Note that $0 \le l^2(1 - l^2) \le 1/4$ for all $l \in [-1, 1]$, and the maximum is attained at $l = \pm 2^{-1/2}$. Moreover, for a symmetric kernel $W$ the quantity $C(\gamma)$ is nonnegative for all $\gamma \ge 0$; in addition, $0 \le C(\gamma) \le (3/2)R(W)$ and $C(\gamma)$ is increasing in $\gamma$ if $W$ is symmetric and concave; see Cheng et al. (2007). So the variance reduction estimators have smaller asymptotic variances and asymptotic mean squared errors.

By choosing $l = \pm 2^{-1/2}$ we achieve the most variance reduction, regardless of what $h_1$, $\gamma$ and $W$ are, and the resulting estimators are
$$\tilde\sigma_{j,(1)}^2(x) = \tfrac{1}{4}(1 - 2^{1/2})\,\hat\sigma_j^2\big(x - (1 + 2^{-1/2})\gamma h_1\big) + \tfrac{1}{2}\,\hat\sigma_j^2\big(x - 2^{-1/2}\gamma h_1\big) + \tfrac{1}{4}(1 + 2^{1/2})\,\hat\sigma_j^2\big(x - (2^{-1/2} - 1)\gamma h_1\big) \tag{3.2}$$
and
$$\tilde\sigma_{j,(2)}^2(x) = \tfrac{1}{4}(1 + 2^{1/2})\,\hat\sigma_j^2\big(x + (2^{-1/2} - 1)\gamma h_1\big) + \tfrac{1}{2}\,\hat\sigma_j^2\big(x + 2^{-1/2}\gamma h_1\big) + \tfrac{1}{4}(1 - 2^{1/2})\,\hat\sigma_j^2\big(x + (2^{-1/2} + 1)\gamma h_1\big) \tag{3.3}$$
for $j = 1$ and $3$.

Either of the variance reduction estimators $\tilde\sigma_{j,(1)}^2(x)$ and $\tilde\sigma_{j,(2)}^2(x)$ uses more information from data points on one side of $x$ than on the other; see (3.2) and (3.3). One way to balance this finite sample bias effect is to take the average
$$\tilde\sigma_{j,(3)}^2(x) = \tfrac{1}{2}\big\{\tilde\sigma_{j,(1)}^2(x) + \tilde\sigma_{j,(2)}^2(x)\big\} \tag{3.4}$$
for $j = 1$ and $3$ (a sketch of these combinations is given below). When $\mathrm{Supp}(\sigma) = [0, 1]$, to keep the points $\beta_{x,0}, \beta_{x,1}, \beta_{x,2}$ with $l = \pm 2^{-1/2}$ all within the data range $[0, 1]$ we let $\gamma(x) = \min\big\{\gamma,\; x/\{(1 + 2^{-1/2})h_1\},\; (1 - x)/\{(1 + 2^{-1/2})h_1\}\big\}$ for a positive constant $\gamma$, say $\gamma = 1$.
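Concretely, given any pointwise base estimator (for instance the `sigma3_sq` sketch from Section 1), the combinations (3.2)–(3.4) with the boundary-adjusted $\gamma(x)$ can be formed as in the following illustrative sketch.

```python
import numpy as np

def vr_combine(x, base_est, h1, gamma=1.0, support=(0.0, 1.0)):
    """Variance-reduced estimator tilde-sigma_{j,(3)}^2(x) of (3.2)-(3.4): the average of
    the two l = +-2^{-1/2} combinations of a base estimator evaluated at three points
    around x.  `base_est` is any callable z -> \\hat sigma_j^2(z)."""
    lo, hi = support
    s = 2.0 ** (-0.5)
    # shrink gamma near the endpoints so all evaluation points stay inside the support
    g = max(0.0, min(gamma, (x - lo) / ((1.0 + s) * h1), (hi - x) / ((1.0 + s) * h1)))
    w = np.array([0.25 * (1.0 - np.sqrt(2.0)), 0.5, 0.25 * (1.0 + np.sqrt(2.0))])
    off = np.array([-(1.0 + s), -s, 1.0 - s])        # offsets in units of g*h1, as in (3.2)
    est1 = sum(wi * base_est(x + oi * g * h1) for wi, oi in zip(w, off))   # (3.2)
    est2 = sum(wi * base_est(x - oi * g * h1) for wi, oi in zip(w, off))   # (3.3), mirrored
    return 0.5 * (est1 + est2)                        # (3.4)
```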

Theorem 3. Under the conditions of Theorem 1, for an interior point $x$ we have
$$\sqrt{nh_1}\,\big\{\tilde\sigma_{1,(3)}^2(x) - \sigma^2(x) - \theta_{n1}\big\} \xrightarrow{d} N\Big(0,\; \big\{R(W) - C(\gamma)/4 - D(\gamma)/2\big\}\,p(x)^{-1}\sigma^4(x)\bar\lambda^2(x)\Big)$$
and
$$\sqrt{nh_1}\,\big\{\tilde\sigma_{3,(3)}^2(x) - \sigma^2(x) - \theta_{n3}\big\} \xrightarrow{d} N\Big(0,\; \big\{R(W) - C(\gamma)/4 - D(\gamma)/2\big\}\,p(x)^{-1}\sigma^4(x)\lambda^2(x)\Big),$$
where
$$D(\gamma) = R(W) - \frac{C(\gamma)}{4} - \frac{1}{16}\Big\{4(1 + \sqrt{2})\,C(\sqrt{2} - 1, \gamma/2) + (3 + 2\sqrt{2})\,C(2 - \sqrt{2}, \gamma/2) + 2\,C(\sqrt{2}, \gamma/2)$$
$$\qquad\qquad + 4(1 - \sqrt{2})\,C(\sqrt{2} + 1, \gamma/2) + (3 - 2\sqrt{2})\,C(\sqrt{2} + 2, \gamma/2)\Big\}.$$

Remark 4. Note that, for any kernel $W$, $0 \le D(\gamma) \le (5/8)R(W)$. Hence $\tilde\sigma_{j,(3)}^2(x)$ has a smaller asymptotic variance than both $\tilde\sigma_{j,(1)}^2(x)$ and $\tilde\sigma_{j,(2)}^2(x)$ for $j = 1$ and $3$.
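The constants $R(W)$, $C(\gamma)$ and $D(\gamma)$ are one-dimensional integrals, so the bounds above, and the multiplier $\{R(W) - C(\gamma)/4 - D(\gamma)/2\}^{1/5}$ used later in (4.1), can be checked numerically; the sketch below does this for the Epanechnikov kernel (any other symmetric density could be substituted).

```python
import numpy as np

def W(u):
    """Epanechnikov kernel used in the numerical study; R(W) = 0.6 for this choice."""
    return 0.75 * (1.0 - u**2) * (np.abs(u) <= 1.0)

u = np.linspace(-6.0, 6.0, 120001)
du = u[1] - u[0]

def C_st(s, t):
    """C(s, t) = int W(u - s t) W(u + s t) du, by a simple Riemann sum."""
    return np.sum(W(u - s * t) * W(u + s * t)) * du

def C_gamma(g):
    return 1.5 * C_st(0.0, g) - 2.0 * C_st(0.5, g) + 0.5 * C_st(1.0, g)

def D_gamma(g):
    r2 = np.sqrt(2.0)
    cov = (4.0 * (1.0 + r2) * C_st(r2 - 1.0, g / 2.0)
           + (3.0 + 2.0 * r2) * C_st(2.0 - r2, g / 2.0)
           + 2.0 * C_st(r2, g / 2.0)
           + 4.0 * (1.0 - r2) * C_st(r2 + 1.0, g / 2.0)
           + (3.0 - 2.0 * r2) * C_st(r2 + 2.0, g / 2.0)) / 16.0
    return C_st(0.0, 0.0) - C_gamma(g) / 4.0 - cov

RW = C_st(0.0, 0.0)
for g in (0.5, 1.0, 2.0):
    # variance factor of the averaged estimator and the bandwidth multiplier in (4.1)
    factor = RW - C_gamma(g) / 4.0 - D_gamma(g) / 2.0
    print(g, round(C_gamma(g), 4), round(D_gamma(g), 4), round(factor ** 0.2, 4))
```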

Remark 5. In Cheng et al. (2007) the variance reduction techniques are applied to nonparametric estimation of the regression m(x). The results in Theorems 2 and 3 are nontrivial given the theory developed therein.

Remark 6. When estimating the conditional variance $\sigma^2(x)$, there is no gain, in asymptotic terms, from replacing $\hat m(X_i)$ with the variance reduced regression estimator of Cheng et al. (2007) in the squared residuals $\hat r_i = \{Y_i - \hat m(X_i)\}^2$, $i = 1, \ldots, n$.

Remark 7. In (1.6), the term $n^{-1}$ is added to avoid $\log 0$, and it can be replaced by $n^{-\eta}$, for any $\eta > 0$, without affecting the theoretical results in Theorems 1–3. However, in finite samples, too small a value of $\eta$ would increase the bias and too large a value of $\eta$ would increase the variability. In the simulation study summarized in Section 5.1, we also experimented with $\eta = 0.5$ and $2$. We found that $\eta = 0.5$ is undesirable, as the corresponding MADE boxplot always lies above the others even though it is narrower. With $\eta = 2$, the MADE boxplot is wider than the others, and the performance is not stable: the MADE boxplot is lower than the others only in some settings, not all.

4 Bandwidth Selection

In the construction of our estimator $\hat\sigma_3^2(x)$, two bandwidths are needed: $h_2$ is the bandwidth in (1.2) used to obtain the squared residuals $\hat r_1, \ldots, \hat r_n$, and $h_1$ is the bandwidth in (1.6) used to estimate the conditional variance. Since (1.2) and (1.6) are local linear fits based on the data $(X_i, Y_i)$, $i = 1, \ldots, n$, and $(X_i, \log(\hat r_i + n^{-1}))$, $i = 1, \ldots, n$, respectively, we suggest employing the same bandwidth procedure in the two steps.

Let $\hat h(X_1, \ldots, X_n; Y_1, \ldots, Y_n)$ denote any data-driven bandwidth rule for the local linear fitting (1.2).

1. Take $h_2 = \hat h(X_1, \ldots, X_n; Y_1, \ldots, Y_n)$ in the local linear regression (1.2) to obtain the regression estimates $\hat m(X_i)$, $i = 1, \ldots, n$, and the squared residuals $\hat r_i = \{Y_i - \hat m(X_i)\}^2$, $i = 1, \ldots, n$.

2. Use the bandwidth $h_1 = \hat h\big(X_1, \ldots, X_n; \log(\hat r_1 + n^{-1}), \ldots, \log(\hat r_n + n^{-1})\big)$ in the local linear fitting (1.6) to get $\hat\sigma_3^2(x)$ (a sketch of the two steps follows).
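A rough sketch of the two-step procedure is given below; since the plug-in rule used in Section 5 is not reproduced here, a simple leave-one-out cross-validation search over a grid stands in for $\hat h$, purely as an illustrative substitute.

```python
import numpy as np

def epanechnikov(u):
    return 0.75 * (1.0 - u**2) * (np.abs(u) <= 1.0)

def loclin(x0, X, Z, h):
    d = X - x0
    w = epanechnikov(d / h)
    s0, s1, s2 = w.sum(), (w * d).sum(), (w * d**2).sum()
    return (s2 * (w * Z).sum() - s1 * (w * d * Z).sum()) / (s0 * s2 - s1**2 + 1e-12)

def h_hat(X, Z, grid):
    """Stand-in for a data-driven rule: leave-one-out CV over a bandwidth grid."""
    def cv(h):
        return np.mean([(Z[i] - loclin(X[i], np.delete(X, i), np.delete(Z, i), h))**2
                        for i in range(len(X))])
    return min(grid, key=cv)

def two_step_bandwidths(X, Y, grid):
    """Steps 1-2: select h2 for the mean fit, form squared residuals, then select h1
    for the log-residual fit (1.6).  Returns (h1, h2, r_hat) for use in sigma3_sq."""
    h2 = h_hat(X, Y, grid)
    m_hat = np.array([loclin(xi, X, Y, h2) for xi in X])
    r_hat = (Y - m_hat)**2
    h1 = h_hat(X, np.log(r_hat + 1.0 / len(X)), grid)
    return h1, h2, r_hat
```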

A simple modification of the above bandwidth procedure can be used to implement our variance reduction estimator $\tilde\sigma_{3,(3)}^2(x)$. Comparing Theorem 1 and Theorem 3, the asymptotically optimal global (or local) bandwidths of $\tilde\sigma_{3,(3)}^2(x)$ and $\hat\sigma_3^2(x)$ differ by the constant multiplier $\{R(W) - C(\gamma)/4 - D(\gamma)/2\}^{1/5}$, which depends only on the known $W$ and $\gamma$. Therefore, no matter whether $\hat h$ is a global or a local bandwidth, the modification proceeds by using, in step 2 above,
$$h_1 = \{R(W) - C(\gamma)/4 - D(\gamma)/2\}^{1/5}\,\hat h\big(X_1, \ldots, X_n; \log(\hat r_1 + n^{-1}), \ldots, \log(\hat r_n + n^{-1})\big) \tag{4.1}$$
in (1.6) to obtain $\hat\sigma_3^2(z)$, $z \in \{\beta_{x,0}, \beta_{x,1}, \beta_{x,2}\}$ with $l = \pm 2^{-1/2}$. Then one can form the linear combinations specified in (3.2), (3.3) and (3.4) to get $\tilde\sigma_{3,(3)}^2(x)$.

For the conditional variance estimator $\hat\sigma_1^2(x)$, Fan and Yao (1998) recommended a bandwidth principle analogous to the one specified above for $\hat\sigma_3^2(x)$. To modify the bandwidth rule of $\hat\sigma_1^2(x)$ for use in $\tilde\sigma_{1,(3)}^2(x)$, apply the same constant factor adjustment as in (4.1) when computing $\hat\sigma_1^2(z)$ for $z \in \{\beta_{x,0}, \beta_{x,1}, \beta_{x,2}\}$ with $l = \pm 2^{-1/2}$.

5 Numerical Study

The five estimators $\hat\sigma_1^2(x)$, $\hat\sigma_2^2(x)$, $\hat\sigma_3^2(x)$, $\tilde\sigma_{1,(3)}^2(x)$ and $\tilde\sigma_{3,(3)}^2(x)$ are compared based on their finite sample performances via a simulation study and an application to the motorcycle data set.

5.1 Simulation

Consider the regression model

$$Y_i = aX_i + 2\exp(-16X_i^2) + \sigma(X_i)\,\varepsilon_i, \tag{5.1}$$
where $\sigma(x) = 0.4\exp(-2x^2) + 0.2$, $a = 0.5, 1, 2$ or $4$, $X_i \sim \mathrm{Uniform}[-2, 2]$, and $\varepsilon_i$ is independent of $X_i$ and follows either the $N(0, 1)$ or the $(1/\sqrt{3})\,t_3$ distribution.

For each of the settings, 1000 samples of size $n = 200$ were generated from model (5.1). The plug-in bandwidth of Ruppert et al. (1995) was employed as the bandwidth selector $\hat h$ in Section 4. To implement $\hat\sigma_2^2(x)$, $h_1$ was taken as the data-driven bandwidth given in Yu and Jones (2004). Both $K$ and $W$, the kernels in the regression and conditional variance estimation stages, were taken as the Epanechnikov kernel $K(u) = (3/4)(1 - u^2)I(|u| < 1)$. The parameter $\gamma$ in $\tilde\sigma_{1,(3)}^2(x)$ and $\tilde\sigma_{3,(3)}^2(x)$ was set to 1.



Figure 1: MADE boxplots under model (5.1). In each panel, the MADE boxplots of $\hat\sigma_1(\cdot)$, $\hat\sigma_2(\cdot)$, $\hat\sigma_3(\cdot)$, $\tilde\sigma_{1,(3)}(\cdot)$ and $\tilde\sigma_{3,(3)}(\cdot)$ are arranged from left to right. The left and right columns respectively give the results for the Normal and $t$ errors. From the top, the rows correspond to $a = 0.5, 1, 2$ and $4$.

Performance of an estimator $\hat\sigma(\cdot)$ of $\sigma(\cdot)$ is measured by the mean absolute deviation error or the mean squared deviation error, respectively defined by
$$\mathrm{MADE}(\hat\sigma) = \frac{1}{g}\sum_{i=1}^{g}\big|\hat\sigma(x_i) - \sigma(x_i)\big|, \qquad \mathrm{MSDE}(\hat\sigma) = \frac{1}{g}\sum_{i=1}^{g}\big\{\hat\sigma(x_i) - \sigma(x_i)\big\}^2,$$
where $\{x_i, i = 1, \ldots, g\}$ is a grid on $[-2, 2]$ with $g = 101$. Here we measure performance by the MADE or MSDE of estimating $\sigma(\cdot)$ rather than $\sigma^2(\cdot)$, since the latter would seriously down-weight errors in estimating small values of $\sigma^2(\cdot)$.
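For reference, the simulation setup and the MADE criterion can be sketched as follows; the crude windowed variance estimate at the end is only a self-contained placeholder for the estimators sketched earlier.

```python
import numpy as np

def true_sigma(x):
    return 0.4 * np.exp(-2.0 * x**2) + 0.2

def simulate(n=200, a=0.5, t_errors=False, rng=None):
    """One sample from model (5.1) with N(0,1) or unit-variance t_3 innovations."""
    rng = rng or np.random.default_rng()
    X = rng.uniform(-2.0, 2.0, n)
    eps = rng.standard_t(3, n) / np.sqrt(3.0) if t_errors else rng.standard_normal(n)
    Y = a * X + 2.0 * np.exp(-16.0 * X**2) + true_sigma(X) * eps
    return X, Y

def made(sigma_hat_vals, grid):
    """Mean absolute deviation error of estimates of sigma(.) over the grid."""
    return np.mean(np.abs(sigma_hat_vals - true_sigma(grid)))

grid = np.linspace(-2.0, 2.0, 101)
X, Y = simulate(a=0.5, t_errors=True, rng=np.random.default_rng(1))
# placeholder: a crude windowed standard deviation; plug in any sigma-hat sketch instead
sig_hat = np.array([Y[np.abs(X - g) < 0.3].std() for g in grid])
print(made(sig_hat, grid))
```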

Figure 1 presents the MADE boxplots of the five estimators $\hat\sigma_j(\cdot)$, $j = 1, 2, 3$, and $\tilde\sigma_{j,(3)}(\cdot)$, $j = 1, 3$. Under all of the configurations, $\tilde\sigma_{j,(3)}(\cdot)$ outperforms $\hat\sigma_j(\cdot)$ for $j = 1, 3$, and our log-transform based methods improve on the Fan and Yao (1998) estimator. When the error distribution is Normal, the Yu and Jones (2004) estimator $\hat\sigma_2(\cdot)$ is the best in the sense that its MADE median is the lowest. The reason for this optimality is that $\hat\sigma_2(\cdot)$ is derived from a local Normal likelihood model, which in this case coincides with the true error model. However, its MADE boxplot is always much wider than those of the other four, and this instability is intrinsic to local likelihood methods. Interestingly, our estimator $\tilde\sigma_{3,(3)}(\cdot)$ is nearly optimal even under Normal errors: compared to $\hat\sigma_2(\cdot)$, the MADE median is roughly the same and the MADE upper quartile is lower.


Figure 2: MADE boxplots of predictions under model (5.2). The layout is the same as in Figure 1, except that the top and bottom rows respectively represent the two- and three-step predictions here.


Further, $\hat\sigma_3(\cdot)$ and $\tilde\sigma_{3,(3)}(\cdot)$ become decisively better than any of the others under $t$ errors. Therefore $\hat\sigma_3(\cdot)$ and $\tilde\sigma_{3,(3)}(\cdot)$ are very robust against heavy tailed errors, whereas $\hat\sigma_2(\cdot)$ is not robust against departures from Normality. Although $\tilde\sigma_{1,(3)}(\cdot)$ performs better than $\hat\sigma_1(\cdot)$ all the time, its behavior under different error distributions is predetermined by that of $\hat\sigma_1(\cdot)$. The MSDE boxplots are not given here, but they lead to similar conclusions.

Another setting we considered is the nonlinear time series model
$$X_{t+1} = 0.235X_t(16 - X_t) + \varepsilon_t, \tag{5.2}$$
where $\varepsilon_t \sim N(0, 0.3^2)$ is independent of $X_t$. From model (5.2), 500 samples of size $n = 500$ were simulated. The conditional variance of $X_{t+1}$ given the past data $X_t, X_{t-1}, \ldots$ is a constant function. Hence we investigate estimation of the conditional variances in the two-step and three-step prediction problems, with $Y_t = X_{t+2}$ and $Y_t = X_{t+3}$ respectively. Figure 2 presents the MADE boxplots, which convey similar conclusions as in the previous example. In particular, $\tilde\sigma_{3,(3)}(\cdot)$ is quite reliable and robust.
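A short sketch of how a path of (5.2) can be simulated and the multi-step prediction pairs formed follows; the starting value, burn-in length and clipping safeguard are illustrative choices, not part of the model.

```python
import numpy as np

def simulate_map(n=500, burn=100, rng=None):
    """Simulate the quadratic-map series (5.2) with N(0, 0.3^2) innovations."""
    rng = rng or np.random.default_rng()
    x, path = 8.0, []                      # arbitrary starting value, discarded with the burn-in
    for _ in range(n + burn):
        x = 0.235 * x * (16.0 - x) + 0.3 * rng.standard_normal()
        x = min(max(x, 0.0), 16.0)         # numerical safeguard against rare noise-driven escape
        path.append(x)
    return np.array(path[burn:])

def prediction_pairs(series, steps):
    """Pairs (X_t, Y_t) with Y_t = X_{t+steps}, used for the multi-step prediction problems."""
    return series[:-steps], series[steps:]

s = simulate_map(rng=np.random.default_rng(2))
X2, Y2 = prediction_pairs(s, 2)            # two-step prediction data
X3, Y3 = prediction_pairs(s, 3)            # three-step prediction data
```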

In the construction of $\hat\sigma_3(\cdot)$ and $\tilde\sigma_{3,(3)}(\cdot)$, the term $n^{-1}$ is added in (1.6) to avoid $\log 0$. We also experimented with the term $n^{-1}$ replaced by $n^{-0.5}$ or $n^{-2}$. To save space, the results are not given here. We found that with $n^{-0.5}$ the MADE boxplot is always above the others even though it is narrower. When using $n^{-2}$, the MADE boxplot is wider than the others, and the performance is not stable, with the MADE boxplot lower than the others only in some settings.

5.2 An Application

The estimators $\hat\sigma_1^2(x)$, $\hat\sigma_2^2(x)$, $\hat\sigma_3^2(x)$, $\tilde\sigma_{1,(3)}^2(x)$ and $\tilde\sigma_{3,(3)}^2(x)$ were employed to estimate the conditional variance for the motorcycle data given by Schmidt et al. (1981). The covariate $X$ is the time (in milliseconds) after a simulated impact on motorcycles and the response variable $Y$ is the head acceleration (in g) of a test object.



Figure 3: Motorcycle data. Panel (a) depicts the motorcycle data and the local linear regression estimate $\hat m(\cdot)$. The absolute residuals are plotted against the design points in panels (b)–(f), which respectively show the estimates $\hat\sigma_1(\cdot)$, $\hat\sigma_2(\cdot)$, $\hat\sigma_3(\cdot)$, $\tilde\sigma_{1,(3)}(\cdot)$ and $\tilde\sigma_{3,(3)}(\cdot)$ (solid lines) and the 10%, 50% and 90% variability curves of their resampled versions (dashed lines).

The sample size is 132. As before, the squared residuals $\hat r_1, \ldots, \hat r_n$ are used to estimate the conditional variance. We took $\hat h$ in Section 4 to be the bandwidth selector of Ruppert et al. (1995). The bandwidth $h_2$ in $\hat m(\cdot)$ was then 4.0145, and the bandwidth $h_1$ in $\hat\sigma_1^2(\cdot)$, $\hat\sigma_3^2(\cdot)$, $\tilde\sigma_{1,(3)}^2(\cdot)$ and $\tilde\sigma_{3,(3)}^2(\cdot)$ was 6.1775, 4.5763, 5.1053 and 3.7821, respectively. The bandwidth $h_1$ in $\hat\sigma_2^2(\cdot)$ was selected by the method of Yu and Jones (2004) and was 13.4188. In Figure 3, panel (a) depicts the original data and the local linear regression estimate $\hat m(\cdot)$, and the solid lines in panels (b)–(f) are respectively the estimates $\hat\sigma_1(\cdot)$, $\hat\sigma_2(\cdot)$, $\hat\sigma_3(\cdot)$, $\tilde\sigma_{1,(3)}(\cdot)$ and $\tilde\sigma_{3,(3)}(\cdot)$. In Figure 4, panel (a) plots the residuals, and panels (b)–(f) depict the Normal Q-Q plots of the residuals divided by the estimates of the conditional standard deviations. Panels (b) and (e) suggest that the motorcycle data have a heavy left tail in the error distribution. Panels (d) and (f) show that our estimators $\hat\sigma_3(\cdot)$ and $\tilde\sigma_{3,(3)}(\cdot)$ effectively correct the heavy left tail. Panel (c) indicates a departure from normality when $\hat\sigma_2(\cdot)$ is applied to this data.
