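A New Consistent Conditional Moment Test

Research Project Report, National Science Council, Executive Yuan

Research Report (Condensed Version)

Project type: Individual project

Project number: NSC 96-2415-H-004-008-

Project period: August 1, 2007 (ROC year 96) to October 31, 2008 (ROC year 97)

Institution: Department of Economics, National Chengchi University

Principal investigator: 林馨怡

Project participants: Master's student and part-time research assistant: 周東慶

Availability: This project report is open to public inquiry

December 11, 2008 (ROC year 97)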


1 Introduction

Many economic and econometric models are represented by conditional moment restrictions, for example, the rational expectation model, the market disequilibrium model, the conditional probability model, the discrete choice model and the nonlinear simultaneous equations model. The validity of those model specifications is evaluated by testing the associated moment conditions. The resulting test is the conditional moment test or M-test which has been developed by Newey (1985), Tauchen (1985), and White (1987). However, these conditional moment tests may not be consistent because they check only necessary conditions of the conditional moment restrictions and there exist alternatives that cannot be detected by these testing procedures. Therefore, our research focuses on constructing consistent conditional moment tests.

There is an abundant literature on constructing conditional moment tests. Nonparametric tests apply a nonparametric estimator to the conditional moment function, which equals zero under the null. See, for example, Eubank and Spiegelman (1990), Lewbel (1995), Hong and White (1995), Fan and Li (1996), Li and Wang (1998), Chen and Fan (1999), Zheng (1998a, 1998b, 2000), Horowitz and Spokoiny (2001), Delgado and González Manteiga (2001), Li, Hsiao, and Zinn (2003) and Tripathi and Kitamura (2003), among others. However, the statistics of nonparametric tests depend on a subjective choice of smoothing parameters and can be computationally costly. Another approach to conditional moment testing is based on infinitely many unconditional moment functions, with uncountably many weight functions indexed by continuous nuisance parameters (Stinchcombe and White, 1998). A consistent conditional moment test can then be obtained by checking these orthogonality conditions. Bierens (1982, 1984, 1990), de Jong (1996), Bierens and Ploberger (1997), and Bierens and Ginther (2001) choose the exponential function as the weight function, while Stute (1997), Stute, Thies and Zhu (1998), Koul and Stute (1999), and Stute and Zhu (2002) use the indicator function. The former tests are henceforth called the Bierens test and the latter the Stute test.

Neither the Bierens test nor the Stute test is asymptotically pivotal in general; that is, their limiting distributions depend on model characteristics, so critical values cannot be tabulated. For example, the auxiliary nuisance parameters in the exponential weight function of the Bierens test make the limiting distribution depend on the data generating process. Although Bierens and Ploberger (1997) derive case-independent upper bounds for the critical values to solve this problem, their test may be too conservative in practice. The Stute test is also not asymptotically pivotal because of the estimation effect (Durbin, 1973), and it is case dependent. Stute, González Manteiga and Presedo Quindimil (1998), Whang (2000, 2001, 2004), and Dominguez and Lobato (2006) avoid the problem by using the bootstrap to approximate the limiting distributions. Stute, Thies and Zhu (1998), Koul and Stute (1999) and Stute and Zhu (2002) employ the martingale transformation of Khmaladze (1981) to obtain asymptotically distribution-free test statistics. However, because it is difficult to define martingales with a multidimensional time parameter properly, these tests, except for that of Song (2007), cannot be carried out with multivariate regressors. The test of Song (2007) requires nonparametric estimation of the conditional moment function, and high-dimensional nonparametric estimation is very complicated to compute. Kuan and Lin (2008) propose a test that removes the estimation effect by centering with an average of empirical processes and thus obtain an asymptotically pivotal test. Their test can be applied with multivariate regressors, and the limiting distribution is the sup-norm of a multi-parameter Kiefer-type process.

This paper proposes a consistent conditional moment test that checks an infinite set of unconditional moment conditions with indicator weight functions. The test statistic is based on the subsampling marked empirical process, computed with subsample size $b$ instead of the whole sample size $n$, where $b < n$. Subsampling, investigated by Politis and Romano (1994) and Politis, Romano and Wolf (1999), is a method of estimating the distribution of an estimator or test statistic by drawing subsamples from the original data. Existing studies in the literature include Andrews and Guggenberger (2005), Chernozhukov and Fernández-Val (2005), Guggenberger and Wolf (2004), Hong and Scaillet (2006), Linton, Maasoumi and Whang (2005) and Whang (2004). Subsampling has not previously been used to construct the test statistic itself, which is what this paper does. The advantages of the proposed test are as follows. (i) Instead of computing the sample average of the conditional moment function with the whole sample, the test statistic is obtained from the subsampling marked empirical process. The estimation effect disappears when the ratio of the subsample size to the whole sample size is zero asymptotically. Therefore, the proposed test does not suffer from the Durbin problem and is asymptotically pivotal. (ii) The Stute test uses a martingale transformation that applies only to models with a univariate regressor, whereas the proposed test subsamples the marked empirical process and can be applied to models with multivariate regressors. (iii) Unlike tests based on nonparametric smoothing, the proposed test avoids user-chosen smoothing parameters. (iv) The test statistic can be computed with any $\sqrt{n}$-consistent estimator, so different estimation methods can be applied. (v) The proposed test does not use the bootstrap, the martingale transformation, or nonparametric methods, which significantly simplifies the computation of the test statistic. One disadvantage is that, although the test is powerful against local alternatives at rate $b^{-1/2}$, it is incapable of detecting local alternatives at rate $n^{-1/2}$. Our test shares this disadvantage with most nonparametric tests. The Monte Carlo simulations show that the proposed test has good finite-sample performance and is robust to different values of $b$.

This paper is arranged as follows. Section 2 presents the conditional moment restriction and the proposed test. Section 3 shows the consistency of the test and its asymptotic behavior under different local alternatives. Section 4 reports the Monte Carlo simulation results, and Section 5 concludes. All proofs are given in the Appendix.

2 A New Test

2.1 Conditional Moment Restrictions

Consider the general conditional moment restriction

$$\mathbb{E}[m(y, x, \beta_o) \mid x] = 0, \qquad (1)$$

where $\mathbb{E}[\cdot \mid x]$ denotes the expectation conditional on the information set of $x$, $m(\cdot)$ is a function of the data, and $\{y, x\}$ is a sequence of random variables with $x = (x_1, \cdots, x_k)'$ and parameters $\beta \in B$ with $B \subset \mathbb{R}^k$. Such conditional moment restrictions arise from existing models such as parametric nonlinear regression models, where $m(y, x, \beta_o)$ is the difference between $y$ and $g(x', \beta_o)$, with $g(\cdot)$ a nonlinear function. To test the conditional moment restriction, the null and alternative hypotheses are as follows. The null hypothesis is that the conditional moment function equals zero almost surely:

$$H_0: \mathbb{P}\big(\mathbb{E}(m(y, x, \beta_o) \mid x) = 0\big) = 1, \quad \text{for some } \beta_o \in B.$$

The alternative hypothesis is that, for all $\beta \in B$, $\mathbb{E}(m(y, x, \beta) \mid x) \neq 0$ with positive probability:

$$H_1: \mathbb{P}\big(\mathbb{E}(m(y, x, \beta) \mid x) = 0\big) < 1, \quad \text{for all } \beta \in B,$$

with $B \subset \mathbb{R}^k$ a compact set.

As shown by Stinchcombe and White (1998), the conditional moment restriction (1) is equivalent to infinitely many unconditional moment conditions

$$\mathbb{E}[m(y, x, \beta_o)\,\omega(x, \xi)] = 0, \quad \forall \xi \in \mathbb{R}^k, \qquad (2)$$

where $\omega(\cdot, \xi)$ is a weight function from an infinite family indexed by the continuous parameter $\xi$; $\omega(\cdot)$ may be any analytic function that is not a polynomial. Testing (2) therefore yields a consistent conditional moment test. For example, Bierens (1982, 1984, 1990), de Jong (1996), Bierens and Ploberger (1997) and Bierens and Ginther (2001) use the exponential weight function $\omega(x, \xi) = \exp(x'\xi)$ for their integrated conditional moment test. Stute (1997), Stute, Thies and Zhu (1998), Koul and Stute (1999) and Stute and Zhu (2002) take the indicator function

$$\omega(x, \xi) = \mathbf{1}_{\{x \le \xi\}} := \mathbf{1}_{\{x_1 \le \xi_1\}} \cdots \mathbf{1}_{\{x_k \le \xi_k\}},$$

where $\mathbf{1}_A$ denotes the indicator function of the event $A$. The present paper adopts the indicator function, so the conditional moment restriction (1) can be rewritten as the infinitely many unconditional moment conditions

$$\mathbb{E}[m(y, x, \beta_o)\,\mathbf{1}_{\{x \le \xi\}}] = 0, \quad \forall \xi = (\xi_1, \cdots, \xi_k)' \in \mathbb{R}^k. \qquad (3)$$

The above moment functions allow for multivariate regressors.
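To make the multivariate indicator weight concrete, the following minimal sketch evaluates the product of componentwise indicators and the sample analogue of the unconditional moment for a single evaluation point. The function names `indicator_weight` and `sample_moment` are hypothetical and chosen only for illustration; they do not come from the paper.

```python
import numpy as np


def indicator_weight(x, xi):
    """Return 1{x_i <= xi} for each row x_i, i.e. the product of the
    componentwise indicators 1{x_i1 <= xi_1} * ... * 1{x_ik <= xi_k}."""
    x = np.asarray(x, dtype=float)
    if x.ndim == 1:                      # univariate regressor: make it (n, 1)
        x = x[:, None]
    xi = np.asarray(xi, dtype=float)     # evaluation point, shape (k,)
    return np.all(x <= xi, axis=1).astype(float)


def sample_moment(m, x, xi):
    """Sample analogue of E[m * 1{x <= xi}] at one evaluation point xi.
    `m` holds the values of the moment function for each observation."""
    m = np.asarray(m, dtype=float)
    return float(np.mean(m * indicator_weight(x, xi)))
```

An observation contributes to the moment only when every component of its regressor vector lies at or below the corresponding component of the evaluation point, which is what allows the same construction to be used for any number of regressors.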

2.2 Test Statistics

The specification test considered in this paper examines the infinitely many unconditional moment conditions (3), which are equivalent to the conditional moment restriction (1); it is therefore a consistent conditional moment test. To test whether the moment function $\mathbb{E}[m(y, x, \beta_o)\mathbf{1}_{\{x \le \xi\}}]$ equals zero, it is natural to consider the normalized sample average of the moment function:

$$M_n(\xi; \beta_o) := \frac{1}{\sqrt{n}} \sum_{i=1}^{n} m(y_i, x_i, \beta_o)\,\mathbf{1}_{\{x_i \le \xi\}},$$

with $\{y_i, x_i\}_{i=1}^{n}$ a sequence of random variables and $\mathbf{1}_{\{x_i \le \xi\}} = \mathbf{1}_{\{x_{i1} \le \xi_1\}} \cdots \mathbf{1}_{\{x_{ik} \le \xi_k\}}$. The function $m(y_i, x_i, \beta)\mathbf{1}_{\{x_i \le \xi\}}$ is the marked empirical process with marks given by the moment function $m$, and $M_n$ is the normalized sum of the marked empirical processes over the sample of size $n$. Our objective is to test whether $M_n(\xi; \beta_o)$ is close to zero. If $M_n(\xi; \beta_o)$ is close to zero, we do not reject the null hypothesis; otherwise, we reject the null hypothesis and conclude that the conditional moment restriction does not hold.

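Since the true parameter $\beta_o$ is unknown, we replace $\beta_o$ by a consistent estimator $\hat\beta_n$, and the sample average of the marked empirical processes becomes

$$M_n(\xi; \hat\beta_n) = \frac{1}{\sqrt{n}} \sum_{i=1}^{n} m(y_i, x_i, \hat\beta_n)\,\mathbf{1}_{\{x_i \le \xi\}}.$$

Now rewrite the process $M_n$ as

$$M_n(\xi; \hat\beta_n) = M_n(\xi; \beta_o) + \frac{1}{\sqrt{n}} \sum_{i=1}^{n} \big[m(y_i, x_i, \hat\beta_n) - m(y_i, x_i, \beta_o)\big]\mathbf{1}_{\{x_i \le \xi\}}.$$

If $m(y_i, x_i, \beta)$ is once differentiable with first derivative $\nabla_\beta m(y_i, x_i, \beta_o)$, then

$$\begin{aligned} M_n(\xi; \hat\beta_n) &= M_n(\xi; \beta_o) + \frac{1}{\sqrt{n}} \sum_{i=1}^{n} \nabla_\beta m(y_i, x_i, \beta_o)(\hat\beta_n - \beta_o)\mathbf{1}_{\{x_i \le \xi\}} + o_p(1) \\ &= M_n(\xi; \beta_o) + \sqrt{n}(\hat\beta_n - \beta_o)\,\frac{1}{n} \sum_{i=1}^{n} \nabla_\beta m(y_i, x_i, \beta_o)\mathbf{1}_{\{x_i \le \xi\}} + o_p(1). \end{aligned}$$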

Thus $M_n(\xi; \hat\beta_n)$ and $M_n(\xi; \beta_o)$ are not asymptotically equivalent, because of the second term on the right-hand side of the second equality. This term is the estimation effect discussed in Durbin (1973). It depends on model characteristics, so the test based on $M_n(\xi; \hat\beta_n)$ is not asymptotically pivotal; this is the well-known Durbin problem. To eliminate the estimation effect, Stute, Thies and Zhu (1998), Koul and Stute (1999), and Stute and Zhu (2002) use the martingale transformation of Khmaladze (1981). However, because it is difficult to define martingales with a multidimensional time parameter properly, these tests are limited to univariate $x$; see also Bai (2003). Song (2007) extends this approach to multiparameter processes, but because nonparametric estimation of the conditional moment restriction is required, his test is complicated to compute.

This paper considers a subsampling version of the $M_n$ process. Instead of using the whole sample of size $n$ to compute the process, we use a subsample of size $b$ to compute the sample average and construct the following process, for $b < n$:

$$M_b(\xi; \hat\beta_n) := \frac{1}{\sqrt{b}} \sum_{i=1}^{b} m(y_i, x_i, \hat\beta_n)\,\mathbf{1}_{\{x_i \le \xi\}},$$

where $\hat\beta_n$ can be any $\sqrt{n}$-consistent estimator for the model of interest, computed from the whole sample. $M_b$ permits the following expansion:

$$\begin{aligned} M_b(\xi; \hat\beta_n) &= M_b(\xi; \beta_o) + \frac{1}{\sqrt{b}} \sum_{i=1}^{b} \nabla_\beta m(y_i, x_i, \beta_o)(\hat\beta_n - \beta_o)\mathbf{1}_{\{x_i \le \xi\}} + o_p(1) \\ &= M_b(\xi; \beta_o) + \sqrt{\frac{b}{n}}\,\sqrt{n}(\hat\beta_n - \beta_o) \left[\frac{1}{b} \sum_{i=1}^{b} \nabla_\beta m(y_i, x_i, \beta_o)\mathbf{1}_{\{x_i \le \xi\}}\right] + o_p(1). \end{aligned}$$

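If $b \to \infty$, $n \to \infty$ and $b/n \to 0$, then under some regularity conditions the second term on the right-hand side of the second equality above converges to zero: $\sqrt{n}(\hat\beta_n - \beta_o)$ is $O_p(1)$, the subsample average of the derivatives converges in probability, and the factor $\sqrt{b/n}$ vanishes. Thus, $M_b(\xi; \hat\beta_n)$ and $M_b(\xi; \beta_o)$ are asymptotically equivalent.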
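The role of the factor $\sqrt{b/n}$ can be illustrated numerically. The sketch below assumes the subsample size is taken as the integer part of $n^p$ (the rounding rule is an assumption, not stated in the paper), and the helper names `subsample_size` and `estimation_effect_scale` are hypothetical.

```python
import math


def subsample_size(n, p):
    """Subsample size b = floor(n**p); taking the floor is an assumption."""
    return int(math.floor(n ** p))


def estimation_effect_scale(n, p):
    """The factor sqrt(b/n) that multiplies the estimation-effect term."""
    return math.sqrt(subsample_size(n, p) / n)


# For a fixed p < 1 the factor shrinks as the whole sample size n grows.
for n in (100, 10_000, 1_000_000):
    print(n, subsample_size(n, 0.7), estimation_effect_scale(n, 0.7))
```

For any fixed $p < 1$ the printed factor decreases with $n$, which is the mechanism by which the estimation effect is removed; for $p$ close to 1 the decrease is slow, so the finite-sample behavior may still be affected.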

Subsampling the marked empirical process annihilates the estimation effect. By contrast, Kuan and Lin (2008) consider a partial sum of the data and eliminate the estimation effect by centering a sequentially marked empirical process. Let $D(\mathbb{R}^k)$ be the space of càdlàg functions on $\mathbb{R}^k$ endowed with the Skorohod topology. Here, $M_b$ is in $D(\mathbb{R}^k)$. In what follows, let $\Rightarrow$ denote convergence in distribution and $\xrightarrow{p}$ denote convergence in probability. The following assumptions are sufficient for the weak convergence of the subsampling marked empirical processes.

[A1] $\{y_i, x_i\}_{i=1}^{n}$ is ergodic and strictly stationary, where $x_i$ has continuous distribution function $F$ with density function $f$.

[A2] (i) $\mathbb{E}[m(y_i, x_i, \beta)^2 \mid x_i] < \infty$,

(ii) $\mathbb{E}\,m(y_i, x_i, \beta)^4 = \kappa < \infty$,

(iii) $\mathbb{E}[m(y_i, x_i, \beta)^4 \|x_i\|^{1+\eta}] < \infty$, for some $\eta > 0$.

[A3] The conditional density $f_{x_i \mid \mathcal{F}_{i-1}}$ is bounded and continuous, where $\mathcal{F}_{i-1}$ is the $\sigma$-algebra generated by $x_1, \cdots, x_{i-1}$.

[A4] $m(\cdot)$ is once continuously differentiable in a neighborhood of $\beta_o$ and satisfies

$$\mathbb{E}\left[\sup_{\beta \in B_o} |\nabla_\beta m(y_i, x_i, \beta)|\right] < \infty,$$

where $B_o$ denotes a neighborhood of $\beta_o$.

[A5] $\hat\beta_n$ is a $\sqrt{n}$-consistent estimator; that is, $\sqrt{n}(\hat\beta_n - \beta_o) = O_p(1)$.

Assumption [A1] permits data with weak dependence. The conditions in [A2] restrict the dependence of the moment function. Given [A2](i), the conditional variance function $\sigma^2(x_i)$ of $m(y_i, x_i, \beta)$ is defined by

$$\sigma^2(u) := \mathrm{var}[m(y_i, x_i, \beta) \mid x_i = u].$$

For $\xi = (\xi_1, \cdots, \xi_k)'$ and $u = (u_1, \cdots, u_k)'$, define

$$V(\xi) := \mathbb{E}[\sigma^2(x_i)\mathbf{1}_{\{x_i \le \xi\}}] = \int_{-\infty}^{\xi} \sigma^2(u)\,F(du),$$

with $\int_{-\infty}^{\xi} := \int_{-\infty}^{\xi_1} \cdots \int_{-\infty}^{\xi_k}$. Following Koul and Stute (1999), assumptions [A2](ii) and (iii) together with [A3] are required to obtain uniform tightness in the space $D[-\infty, \infty]$. Assumption [A4] is a standard smoothness assumption; it can be relaxed to allow non-smooth moment functions by imposing stochastic equicontinuity on $m$. Assumption [A5] is weak and is satisfied by most existing estimation methods. In the following, we obtain the weak convergence of $M_b$.

Theorem 2.1. Under $H_0$ and assumptions [A1]-[A5], if $b \to \infty$, $n \to \infty$ and $b/n \to 0$, then

$$M_b(\xi; \hat\beta_n) \Rightarrow B(V(\xi)).$$

The limiting distribution of $M_b$ is a centered Gaussian process, namely a multi-parameter Brownian motion on $[0, 1]^k$ with covariance function

$$V(\xi_1 \wedge \xi_2) = \int_{-\infty}^{\xi_1 \wedge \xi_2} \sigma^2(u)\,F(du),$$

where $\int_{-\infty}^{\xi_1 \wedge \xi_2} = \int_{-\infty}^{\xi_{11} \wedge \xi_{21}} \cdots \int_{-\infty}^{\xi_{1k} \wedge \xi_{2k}}$. In particular, when $x_i$ is univariate, the process $B$ is the standard Brownian motion. The limit of $M_b(\xi; \beta_o)$ and that of $M_b(\xi; \hat\beta_n)$ are the same, and the Durbin problem disappears because $b$ diverges to infinity more slowly than $n$. In addition, $V(\xi)$ plays an important role in the test. Since $V(\xi)$ still depends on the distribution of $x_i$ and on $\sigma^2$, the process $M_b(\xi; \hat\beta_n)$ is not asymptotically distribution free. When $\sigma^2(x_i) = \sigma_0^2$ is a constant (the conditional homoskedasticity case), we obtain $V(\xi) = \sigma_0^2 F(\xi)$. We follow Koul and Stute (1999) and consider an estimator of $V(\xi)$:

$$\hat V_n(\xi) = \frac{1}{n} \sum_{i=1}^{n} m^2(y_i, x_i, \beta)\,\mathbf{1}_{\{x_i \le \xi\}}, \quad \xi \in \mathbb{R}^k.$$

Tests for $H_0$ can be based on an appropriate scaling of $M_b$. Consider a consistent estimator $\hat\sigma_b^2$ of $\sigma_0^2$ in the homoskedastic case. We have the "scale invariant" version of the subsampling marked empirical process:

$$\tilde M_b(\xi; \hat\beta_n) := \frac{1}{\sqrt{b}}\,\hat\sigma_b^{-1} \sum_{i=1}^{b} m(y_i, x_i, \hat\beta_n)\,\mathbf{1}_{\{x_i \le \xi\}}.$$

Theorem 2.2. Under $H_0$ and assumptions [A1]-[A5], if $b \to \infty$, $n \to \infty$, $b/n \to 0$ and $\hat\sigma_b^2 \to \sigma_0^2$, then

$$\tilde M_b(\xi; \hat\beta_n) \Rightarrow B(F(\xi)),$$

with $B(\cdot)$ a standard Brownian sheet.

The computational counterpart of the scale-invariant process $\tilde M_b(\xi; \hat\beta_n)$ is

$$\tilde M_b(x_j; \hat\beta_n) := \frac{1}{\sqrt{b}}\,\hat\sigma_b^{-1} \sum_{i=1}^{b} m(y_i, x_i, \hat\beta_n)\,\mathbf{1}_{\{x_i \le x_j\}}, \quad j = 1, \cdots, n,$$

where each realization $x_j$ is used as a value of $\xi$ in the indicator function. Consider two goodness-of-fit statistics, the Kolmogorov-Smirnov and Cramér-von Mises test statistics:

$$KS_n = \sup_{x_j \in \mathbb{R}^k} \big|\tilde M_b(x_j; \hat\beta_n)\big|,$$

$$CM_n = \frac{1}{n} \sum_{j=1}^{n} \tilde M_b(x_j; \hat\beta_n)^2.$$

By Theorem 2.2 and the continuous mapping theorem, for large $n$ and with $\omega \in [0, 1]^k$,

$$KS_n \Rightarrow \sup_{\xi \in \mathbb{R}^k} \big|B(F(\xi))\big| = \sup_{0 \le \omega \le 1} \big|B(\omega)\big|,$$

and

$$CM_n = \int_{-\infty}^{\infty} \tilde M_b(\xi; \hat\beta_n)^2\,F(d\xi) \Rightarrow \int_{-\infty}^{\infty} B(F(\xi))^2\,F(d\xi) = \int_0^1 B(\omega)^2\,d\omega.$$

Critical values for $KS_n$ and $CM_n$ can be found in the existing literature; see Shorack and Wellner (1986). Note that the proposed test is asymptotically pivotal: its limiting distribution does not depend on the data generating process. We therefore have the following corollary.

Corollary 2.3. Under the assumptions of Theorem 2.2,

$$KS_n \Rightarrow \sup_{0 \le \omega \le 1} \big|B(\omega)\big|, \qquad CM_n \Rightarrow \int_0^1 B(\omega)^2\,d\omega,$$

with $B(\cdot)$ the standard Brownian sheet.
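The two statistics can be computed with a few lines of array code. The sketch below is an illustration under several assumptions that are not fixed by the text: the model is a linear regression without intercept, so that $m(y_i, x_i, \beta) = y_i - x_i'\beta$; the subsample is the first $b$ observations; $\hat\sigma_b^2$ is the mean squared residual over the subsample; and the supremum is taken as the maximum absolute value over the observed points. The function name `ks_cm_statistics` is hypothetical.

```python
import numpy as np


def ks_cm_statistics(y, X, b):
    """Kolmogorov-Smirnov and Cramer-von Mises statistics built from the
    scaled subsampling marked empirical process, for a linear model."""
    y = np.asarray(y, dtype=float)
    X = np.asarray(X, dtype=float)
    if X.ndim == 1:                       # univariate regressor: make it (n, 1)
        X = X[:, None]
    n = X.shape[0]
    if not 0 < b <= n:
        raise ValueError("b must satisfy 0 < b <= n")

    # Root-n-consistent estimator from the WHOLE sample (least squares).
    beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)

    # Marks (residuals) and scale estimate from the subsample of size b.
    resid_b = y[:b] - X[:b] @ beta_hat
    sigma_b = np.sqrt(np.mean(resid_b ** 2))

    # indicator[i, j] = 1{x_i <= x_j componentwise}; i runs over the
    # subsample, j over all n observed evaluation points.
    indicator = np.all(X[:b, None, :] <= X[None, :, :], axis=2)

    # Scaled process evaluated at every observed x_j.
    m_tilde = resid_b @ indicator / (np.sqrt(b) * sigma_b)

    ks = float(np.max(np.abs(m_tilde)))
    cm = float(np.mean(m_tilde ** 2))
    return ks, cm
```

The indicator array has $b \times n$ entries, so memory use grows with both sizes; for large samples the evaluation points could be processed in chunks instead.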

3 Power of the Tests

To investigate the power of the proposed test, two types of alternatives are considered. One is the general (fixed) alternative

$$H_1: \mathbb{E}[m(y, x, \beta_o) \mid x] = \mu(x) \neq 0, \quad \forall\, \xi = (\xi_1, \cdots, \xi_k) \in \mathbb{R}^k,$$

and the other is the sequence of local alternatives

$$H_1^L: \mathbb{E}[m(y, x, \beta_o) \mid x] = \frac{\delta(x)}{\sqrt{b}},$$

with $\delta(x) \neq 0$. We then have the following theorem.

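Theorem 3.1. Suppose assumptions [A1]-[A5] hold, and $b \to \infty$, $n \to \infty$ and $b/n \to 0$. Then:

(i) Under the fixed alternative $H_1$:

$$\tilde M_b(\xi; \hat\beta_n) \to \infty.$$

(ii) Under the local alternatives $H_1^L$:

$$\tilde M_b(\xi; \hat\beta_n) \Rightarrow B(F(\xi)) + \sigma_0^{-1}\,\mathbb{E}[\delta(x_i)\mathbf{1}_{\{x_i \le \xi\}}].$$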

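By Theorem 3.1 and the continuous mapping theorem, we have the following corollary.

Corollary 3.2. Suppose assumptions [A1]-[A5] hold, and $b \to \infty$, $n \to \infty$ and $b/n \to 0$. Then:

(i) Under the fixed alternative $H_1$:

$$KS_n \to \infty, \qquad CM_n \to \infty.$$

(ii) Under the local alternatives $H_1^L$:

$$KS_n \Rightarrow \sup_{\xi \in \mathbb{R}^k} \Big|B(F(\xi)) + \sigma_0^{-1}\,\mathbb{E}[\delta(x_i)\mathbf{1}_{\{x_i \le \xi\}}]\Big|,$$

$$CM_n \Rightarrow \int_0^1 \Big(B(F(\xi)) + \sigma_0^{-1}\,\mathbb{E}[\delta(x_i)\mathbf{1}_{\{x_i \le \xi\}}]\Big)^2 d\omega,$$

where $\omega = F(\xi)$.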

The first part of Corollary 3.2 gives the consistency of the proposed test, and the second part implies that the test has nontrivial power against local alternatives at rate $b^{-1/2}$. However, consider local alternatives at rate $n^{-1/2}$:

$$H_2^L: \mathbb{E}[m(y, x, \beta_o) \mid x] = \frac{\delta(x)}{\sqrt{n}}.$$

For these alternatives we have the following theorem.

Theorem 3.3. Suppose assumptions [A1]-[A5] hold, and $b \to \infty$, $n \to \infty$ and $b/n \to 0$. Under the local alternatives $H_2^L$:

$$\tilde M_b(\xi; \hat\beta_n) \Rightarrow B(F(\xi)).$$

Under the local alternatives $H_2^L$, the limiting distribution of $\tilde M_b(\xi; \hat\beta_n)$ is the same as that under the null hypothesis; see Theorem 2.2. The proposed test therefore has no local power against alternatives at rate $n^{-1/2}$ and cannot detect them, a disadvantage it shares with most nonparametric tests; see the discussion in Whang (2000).
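A short worked step may help show why the rate matters. The sketch below assumes, as in the proof of Theorem 3.1, that $\mathbb{E}|\delta(x_i)\mathbf{1}_{\{x_i \le \xi\}}| < \infty$, so that a law of large numbers applies to the subsample average; it is an illustration under that assumption rather than a complete proof.

```latex
% Drift term of the scaled process under local alternatives at rate n^{-1/2}
\frac{1}{\sqrt{b}}\,\sigma_0^{-1}\sum_{i=1}^{b}\frac{\delta(x_i)}{\sqrt{n}}\,
  \mathbf{1}_{\{x_i\le\xi\}}
  = \sqrt{\frac{b}{n}}\;\sigma_0^{-1}
    \left[\frac{1}{b}\sum_{i=1}^{b}\delta(x_i)\,\mathbf{1}_{\{x_i\le\xi\}}\right]
  \xrightarrow{p} 0
  \quad\text{as } b/n \to 0,
% because the bracket converges in probability to E[\delta(x_i) 1{x_i <= \xi}].
% Under H_1^L the same term carries the factor \sqrt{b/b} = 1 and does not vanish.
```

The same factor $\sqrt{b/n}$ that removes the estimation effect therefore also removes the drift generated by $n^{-1/2}$ alternatives, which is the price paid for an asymptotically pivotal statistic.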


4 Monte Carlo Simulations

This section reports simulation results on the finite-sample performance of the test statistic $KS_n$. We consider the following data generating processes (DGPs):

(A) $y_i = x_{i1} + e_i$,
(B) $y_i = x_{i1} + 5 + e_i$,
(C) $y_i = x_{i1} + \exp(z_i) + e_i$,
(D) $y_i = x_{i2} + x_{i3} + e_i$,
(E) $y_i = x_{i2} + x_{i3} + 5 + e_i$,
(F) $y_i = x_{i2} + x_{i3} + \exp(z_i) + e_i$.

Here $x_{i1}$, $x_{i2}$, $x_{i3}$ and $z_i$ are independent and identically distributed (i.i.d.) $N(0, 1)$, and $e_i$ is i.i.d. $N(0, \sigma_0^2)$ with $\sigma_0^2 = 1, 2, 3, 4$. DGPs (A) and (D) satisfy the null hypothesis for the fitted linear model, while (B), (C), (E) and (F) are alternatives. The test statistic $KS_n$ for one regressor is

$$KS_1 = \max_j \left| \frac{1}{\sqrt{b}}\,\hat\sigma_1^{-1} \sum_{i=1}^{b} (y_i - x_{i1}'\hat\beta_1)\,\mathbf{1}_{\{x_{i1} \le x_j\}} \right|,$$

for DGPs (A), (B) and (C), and the statistic for two regressors is

$$KS_2 = \max_j \left| \frac{1}{\sqrt{b}}\,\hat\sigma_2^{-1} \sum_{i=1}^{b} (y_i - x_{i2}'\hat\beta_2 - x_{i3}'\hat\beta_3)\,\mathbf{1}_{\{x_{i2} \le x_{j1}\}}\mathbf{1}_{\{x_{i3} \le x_{j2}\}} \right|,$$

for DGPs (D), (E) and (F), where $\hat\beta_1$, $\hat\beta_2$, $\hat\beta_3$ are least squares estimates, $\hat\sigma_1^2 = b^{-1} \sum_{i=1}^{b} (y_i - x_{i1}'\hat\beta_1)^2$ and $\hat\sigma_2^2 = b^{-1} \sum_{i=1}^{b} (y_i - x_{i2}'\hat\beta_2 - x_{i3}'\hat\beta_3)^2$. In each simulation experiment, the number of replications is 2000 and the significance level is 0.05. We consider different values of $b$, chosen by the formula $b = n^p$ with $p = 0.5, 0.55, \cdots, 0.95$.
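One cell of such an experiment can be sketched as follows for the one-regressor DGPs. Several choices here are assumptions made for illustration and are not stated in the paper: $b$ is the integer part of $n^p$, the subsample is the first $b$ observations, and the 5% critical value 2.2414 is taken as the approximate 95% quantile of the supremum of the absolute value of a standard Brownian motion on $[0, 1]$ (the paper refers to Shorack and Wellner (1986) for critical values). The names `ks1` and `rejection_frequency` are hypothetical, and the default number of replications is kept small for speed.

```python
import numpy as np

# Assumed 5% critical value of sup |B(w)| over [0, 1] (univariate case).
CRIT_95 = 2.2414


def ks1(y, x, b):
    """KS statistic for the one-regressor model y = x * beta + e."""
    beta_hat = (x @ y) / (x @ x)               # whole-sample least squares
    resid_b = y[:b] - beta_hat * x[:b]         # residuals on the subsample
    sigma_b = np.sqrt(np.mean(resid_b ** 2))   # scale estimate on the subsample
    indicator = x[:b, None] <= x[None, :]      # shape (b, n): 1{x_i <= x_j}
    m_tilde = resid_b @ indicator / (np.sqrt(b) * sigma_b)
    return float(np.max(np.abs(m_tilde)))


def rejection_frequency(dgp, n, p, reps=200, sigma0_sq=1.0, seed=0):
    """Share of replications in which KS1 exceeds the assumed critical value."""
    rng = np.random.default_rng(seed)
    b = int(np.floor(n ** p))                  # assumed rounding rule for b
    rejections = 0
    for _ in range(reps):
        x = rng.standard_normal(n)
        e = np.sqrt(sigma0_sq) * rng.standard_normal(n)
        if dgp == "A":                         # null model
            y = x + e
        elif dgp == "B":                       # fixed shift alternative
            y = x + 5.0 + e
        elif dgp == "C":                       # random-shift alternative
            y = x + np.exp(rng.standard_normal(n)) + e
        else:
            raise ValueError("dgp must be 'A', 'B' or 'C'")
        rejections += ks1(y, x, b) > CRIT_95
    return rejections / reps
```

A call such as `rejection_frequency("A", 200, 0.7)` gives a rough size estimate and `rejection_frequency("B", 200, 0.7)` a rough power estimate; with so few replications the figures are noisy and are not expected to reproduce the tables exactly.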

Table 1, with $\sigma_0^2 = 1$, reports the rejection frequencies of the tests for different values of $n$ and $p$. For DGPs (A) and (D), the rejection frequencies are the finite-sample sizes of the test. In the column for DGP (A), all values are close to the significance level 0.05 except at $p = 0.5$. In the column for DGP (D), however, the proposed test is under-sized for large $p$. The sizes of the test are lower when the number of regressors increases or when $b$ increases. For DGPs (B), (C), (E) and (F), the rejection frequencies are the finite-sample powers of the proposed test. In the columns for DGPs (B) and (E), which are fixed alternatives, the finite-sample powers are 1, showing that the test has good power for all values of $n$ and $b$ (or $p$). In addition, the columns for DGPs (C) and (F) show that the test performs well when the departure from the null is a random variable. For DGP (C), the test has good power for large values of $p$, and the power approaches 1 as $n$ increases. For DGP (F), the power is lower for $n = 100$, and it increases as $n$ and $b$ (or $p$) increase. To sum up, the test has correct size with one regressor and is slightly under-sized with two regressors. Against fixed alternatives, the power is very good. The power of the test increases with both $n$ and $b$. Table 2 reports the rejection frequencies of the test for the six DGPs with different $\sigma_0^2$ and $p$. The sample size is 500. The finite-sample performance in Table 2 is similar to that in Table 1. Moreover, the power of the test decreases as the variance of the error term increases.

5 Conclusions

This paper proposes a consistent conditional moment test based on infinitely many unconditional moment conditions. The test statistic is built on a subsampling marked empirical process, and the Durbin problem is eliminated because the subsample size diverges more slowly than the whole sample size. We thus obtain an asymptotically pivotal test. The proposed test is consistent against a general type of alternatives and is powerful against local alternatives at rate $b^{-1/2}$. However, it is not powerful against $n^{-1/2}$ local alternatives. In addition, the test performs well in finite-sample simulations, and its power is good for most values of $b$.


Table 1: Rejection frequencies of the conditional moment tests

                     KS1                      KS2
  n     p      (A)    (B)    (C)       (D)    (E)    (F)
100   0.50   0.076  1.000  0.742     0.057  1.000  0.596
      0.55   0.067  1.000  0.817     0.043  1.000  0.661
      0.60   0.057  1.000  0.877     0.037  1.000  0.775
      0.65   0.048  1.000  0.947     0.038  1.000  0.847
      0.70   0.044  1.000  0.977     0.036  1.000  0.919
      0.75   0.047  1.000  0.992     0.031  1.000  0.968
      0.80   0.049  1.000  0.999     0.024  1.000  0.985
      0.85   0.042  1.000  1.000     0.021  1.000  0.995
      0.90   0.046  1.000  0.999     0.025  1.000  0.999
      0.95   0.041  1.000  1.000     0.018  1.000  0.999
200   0.50   0.063  1.000  0.863     0.046  1.000  0.783
      0.55   0.045  1.000  0.937     0.038  1.000  0.867
      0.60   0.051  1.000  0.978     0.040  1.000  0.950
      0.65   0.053  1.000  0.992     0.031  1.000  0.981
      0.70   0.039  1.000  0.997     0.035  1.000  0.990
      0.75   0.054  1.000  1.000     0.023  1.000  0.998
      0.80   0.048  1.000  1.000     0.028  1.000  1.000
      0.85   0.043  1.000  1.000     0.023  1.000  1.000
      0.90   0.042  1.000  1.000     0.025  1.000  1.000
      0.95   0.049  1.000  1.000     0.022  1.000  1.000
500   0.50   0.068  1.000  0.964     0.057  1.000  0.950
      0.55   0.042  1.000  0.989     0.030  1.000  0.982
      0.60   0.047  1.000  0.998     0.035  1.000  0.994
      0.65   0.044  1.000  0.999     0.036  1.000  0.998
      0.70   0.045  1.000  1.000     0.040  1.000  0.999
      0.75   0.040  1.000  1.000     0.030  1.000  1.000
      0.80   0.055  1.000  1.000     0.028  1.000  1.000
      0.85   0.040  1.000  1.000     0.029  1.000  1.000
      0.90   0.036  1.000  1.000     0.019  1.000  1.000
      0.95   0.035  1.000  1.000     0.028  1.000  1.000

Note: The significance level is 0.05 and $b = n^p$. The values in the 3rd and 6th columns are the finite-sample sizes, and the values in the 4th, 5th, 7th and 8th columns are the finite-sample powers of the proposed test.


Table 2: Rejection frequencies of the conditional moment tests

                         KS1                      KS2
$\sigma_0^2$   p      (A)    (B)    (C)       (D)    (E)    (F)
  2          0.50   0.058  1.000  0.906     0.037  1.000  0.862
             0.55   0.053  1.000  0.972     0.041  1.000  0.955
             0.60   0.043  1.000  0.995     0.036  1.000  0.984
             0.65   0.044  1.000  0.999     0.037  1.000  0.998
             0.70   0.049  1.000  1.000     0.039  1.000  0.999
             0.75   0.044  1.000  1.000     0.029  1.000  1.000
             0.80   0.052  1.000  1.000     0.030  1.000  1.000
             0.85   0.040  1.000  1.000     0.033  1.000  1.000
             0.90   0.040  1.000  1.000     0.022  1.000  1.000
             0.95   0.036  1.000  1.000     0.023  1.000  1.000
  3          0.50   0.060  1.000  0.843     0.050  1.000  0.797
             0.55   0.050  1.000  0.942     0.039  1.000  0.915
             0.60   0.054  1.000  0.987     0.037  1.000  0.972
             0.65   0.051  1.000  0.999     0.039  1.000  0.994
             0.70   0.048  1.000  1.000     0.034  1.000  0.999
             0.75   0.037  1.000  1.000     0.033  1.000  1.000
             0.80   0.046  1.000  1.000     0.038  1.000  1.000
             0.85   0.052  1.000  1.000     0.030  1.000  1.000
             0.90   0.040  1.000  1.000     0.023  1.000  1.000
             0.95   0.041  1.000  1.000     0.027  1.000  1.000
  4          0.50   0.048  1.000  0.776     0.042  1.000  0.703
             0.55   0.053  1.000  0.897     0.042  1.000  0.845
             0.60   0.044  1.000  0.969     0.037  1.000  0.949
             0.65   0.042  1.000  0.992     0.031  1.000  0.988
             0.70   0.048  1.000  0.999     0.033  1.000  0.999
             0.75   0.046  1.000  1.000     0.035  1.000  1.000
             0.80   0.053  1.000  1.000     0.038  1.000  1.000
             0.85   0.047  1.000  1.000     0.027  1.000  1.000
             0.90   0.043  1.000  1.000     0.024  1.000  1.000
             0.95   0.042  1.000  1.000     0.029  1.000  1.000

Note: The significance level is 0.05 and $b = n^p$. The values in the 3rd and 6th columns are the finite-sample sizes, and the values in the 4th, 5th, 7th and 8th columns are the finite-sample powers of the proposed test.


Appendix

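Proof of Theorem 2.1. By assumption [A4], the subsampling marked empirical process $M_b$ permits the Taylor expansion

$$\frac{1}{\sqrt{b}} \sum_{i=1}^{b} m(y_i, x_i, \hat\beta_n)\mathbf{1}_{\{x_i \le \xi\}} = \frac{1}{\sqrt{b}} \sum_{i=1}^{b} m(y_i, x_i, \beta_o)\mathbf{1}_{\{x_i \le \xi\}} + \frac{1}{\sqrt{b}} \sum_{i=1}^{b} \nabla_\beta m(y_i, x_i, \beta_o)(\hat\beta_n - \beta_o)\mathbf{1}_{\{x_i \le \xi\}} + o_p(1).$$

Because $b/n \to 0$ and by assumption [A5],

$$\sqrt{b}(\hat\beta_n - \beta_o) = \sqrt{\frac{b}{n}}\,\sqrt{n}(\hat\beta_n - \beta_o) \xrightarrow{p} 0.$$

In addition, by assumptions [A1] and [A4], Hölder's inequality and the ergodic theorem, we have the following law of large numbers for ergodic and stationary sequences:

$$\frac{1}{b} \sum_{i=1}^{b} \nabla_\beta m(y_i, x_i, \beta_o)\mathbf{1}_{\{x_i \le \xi\}} \xrightarrow{p} \mathbb{E}\big[\nabla_\beta m(y_i, x_i, \beta_o)\mathbf{1}_{\{x_i \le \xi\}}\big].$$

We then obtain

$$\frac{1}{\sqrt{b}} \sum_{i=1}^{b} \nabla_\beta m(y_i, x_i, \beta_o)(\hat\beta_n - \beta_o)\mathbf{1}_{\{x_i \le \xi\}} = \left[\frac{1}{b} \sum_{i=1}^{b} \nabla_\beta m(y_i, x_i, \beta_o)\mathbf{1}_{\{x_i \le \xi\}}\right] \sqrt{b}(\hat\beta_n - \beta_o) \xrightarrow{p} 0.$$

Therefore,

$$\frac{1}{\sqrt{b}} \sum_{i=1}^{b} m(y_i, x_i, \hat\beta_n)\mathbf{1}_{\{x_i \le \xi\}} = \frac{1}{\sqrt{b}} \sum_{i=1}^{b} m(y_i, x_i, \beta_o)\mathbf{1}_{\{x_i \le \xi\}} + o_p(1),$$

so that $M_b(\xi; \hat\beta_n)$ and $M_b(\xi; \beta_o)$ are asymptotically equivalent. Estimating the parameter $\beta$ does not affect the limiting distribution of the statistic, and the Durbin problem does not arise.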

The process $M_b$ belongs to the Skorohod space $D(\mathbb{R}^k)$, and the weak convergence of $M_b(\xi; \beta_o)$ in $D(\mathbb{R}^k)$ to a continuous limit is implied by the tightness of $M_b$ and the finite-dimensional convergence of $M_b(\xi; \beta_o)$. In the following, we follow Bickel and Wichura (1971), Koul and Stute (1999) and Domínguez and Lobato (2004) to show first the tightness of $M_b$ and then the weak convergence of $M_b(\xi; \beta_o)$. Let $I_1 = (s^1, t^1] = \times_{j=1}^{k} (s^1_j, t^1_j]$ and $I_2 = (s^2, t^2] = \times_{j=1}^{k} (s^2_j, t^2_j]$ be two subsets of $\mathbb{R}^k$. Then $I_1$ and $I_2$ are neighbor subsets if and only if, for some $j^* \in \{1, 2, \cdots, k\}$, $(s^1_{j^*}, t^1_{j^*}] \neq (s^2_{j^*}, t^2_{j^*}]$ and $\times_{j \neq j^*} (s^1_j, t^1_j] = \times_{j \neq j^*} (s^2_j, t^2_j]$. The process $M_b$ indexed by a parameter in $\mathbb{R}^k$ has an associated process indexed by the intervals as follows: for $h = 1, 2$,

$$M_b(I_h; \beta) := \frac{1}{\sqrt{b}} \sum_{i=1}^{b} m(y_i, x_i; \beta)\mathbf{1}_{\{x_i \in I_h\}} = \sum_{e_1=0}^{1} \cdots \sum_{e_k=0}^{1} (-1)^{k - \sum_{j=1}^{k} e_j}\, M_b\big(s^h_1 + e_1(t^h_1 - s^h_1), \cdots, s^h_k + e_k(t^h_k - s^h_k); \beta\big),$$

which is the increment of $M_b$ around $I_h$. Let $m_i = m(y_i, x_i; \beta)$. Following Bickel and Wichura (1971, Theorem 3 and Example II), if

$$\mathbb{E}\big[M_b(I_1; \beta)^2 M_b(I_2; \beta)^2\big] = \frac{1}{b^2}\,\mathbb{E}\left[\left(\sum_{i=1}^{b} m_i \mathbf{1}_{\{x_i \in I_1\}}\right)^2 \left(\sum_{i=1}^{b} m_i \mathbf{1}_{\{x_i \in I_2\}}\right)^2\right]$$

is bounded, then for any $\lambda > 0$ and $\gamma > 1$,

$$\mathbb{P}(M_b \ge \lambda) \le \lambda^{-4} \mu(I_1 \cup I_2)^{\gamma},$$

with some measure $\mu$. The above result asserts that the process $M_b$ is tight.

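Let $\mathcal{F}_i$ denote the natural filtration. Under $H_0$ and assumption [A1], $\{m_i \mathbf{1}_{\{x_i \le \tau\}}, \mathcal{F}_{i-1}\}$ is a strictly stationary and ergodic martingale difference sequence. When a subindex appears only once in the summation, the corresponding term is zero by the law of iterated expectations and the martingale difference property. Moreover, since $I_1$ and $I_2$ are disjoint sets, when a subindex appears more than twice, the corresponding term is zero. Therefore, similar to Koul and Stute (1999) and Domínguez and Lobato (2004),

$$\mathbb{E}\big[M_b(I_1; \beta)^2 M_b(I_2; \beta)^2\big] = \frac{1}{b^2}\,\mathbb{E}\left[\sum_{i=1}^{b} m_i^2 \mathbf{1}_{\{x_i \in I_1\}} \left(\sum_{j=1}^{i-1} m_j \mathbf{1}_{\{x_j \in I_2\}}\right)^2\right] + \frac{1}{b^2}\,\mathbb{E}\left[\sum_{i=1}^{b} m_i^2 \mathbf{1}_{\{x_i \in I_2\}} \left(\sum_{j=1}^{i-1} m_j \mathbf{1}_{\{x_j \in I_1\}}\right)^2\right].$$

The first and second terms in the above equation are similar, differing only in the indexing set $I_h$; we therefore focus on the first term. By assumption [A2](i),

$$\begin{aligned} \frac{1}{b^2} \sum_{i=1}^{b} \mathbb{E}\left[m_i^2 \mathbf{1}_{\{x_i \in I_1\}} \left(\sum_{j=1}^{i-1} m_j \mathbf{1}_{\{x_j \in I_2\}}\right)^2\right] &= \frac{1}{b^2} \sum_{i=1}^{b} \mathbb{E}\left[\sigma^2(x_i, \mathcal{F}_{i-1}) \mathbf{1}_{\{x_i \in I_1\}} \left(\sum_{j=1}^{i-1} m_j \mathbf{1}_{\{x_j \in I_2\}}\right)^2\right] \\ &= \frac{1}{b^2} \sum_{i=1}^{b} \mathbb{E}\left[\int_{I_1} \sigma^2(u, \mathcal{F}_{i-1}) f_{x_i \mid \mathcal{F}_{i-1}}(u)\,du \left(\sum_{j=1}^{i-1} m_j \mathbf{1}_{\{x_j \in I_2\}}\right)^2\right]. \end{aligned}$$

By Fubini's theorem, the above expression equals

$$\frac{1}{b^2} \sum_{i=1}^{b} \int_{I_1} \mathbb{E}\left[\sigma^2(u, \mathcal{F}_{i-1}) f_{x_i \mid \mathcal{F}_{i-1}}(u) \left(\sum_{j=1}^{i-1} m_j \mathbf{1}_{\{x_j \in I_2\}}\right)^2\right] du.$$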

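Using the Cauchy-Schwarz inequality, we have

$$\frac{1}{b^2} \sum_{i=1}^{b} \int_{I_1} \mathbb{E}\left[\sigma^2(u, \mathcal{F}_{i-1}) f_{x_i \mid \mathcal{F}_{i-1}}(u) \left(\sum_{j=1}^{i-1} m_j \mathbf{1}_{\{x_j \in I_2\}}\right)^2\right] du \le \frac{1}{b^2} \sum_{i=1}^{b} \int_{I_1} \Big\{\mathbb{E}\big[\sigma^2(u, \mathcal{F}_{i-1}) f_{x_i \mid \mathcal{F}_{i-1}}(u)\big]^2\Big\}^{1/2} \left\{\mathbb{E}\left(\sum_{j=1}^{i-1} m_j \mathbf{1}_{\{x_j \in I_2\}}\right)^4\right\}^{1/2} du.$$

Burkholder's inequality and the moment inequality yield, with some constant $C$,

$$\mathbb{E}\left(\sum_{j=1}^{i-1} m_j \mathbf{1}_{\{x_j \in I_2\}}\right)^4 \le C\,\mathbb{E}\left(\sum_{j=1}^{i-1} m_j^2 \mathbf{1}^2_{\{x_j \in I_2\}}\right)^2 \le C (i-1)^2\,\mathbb{E}\big(m_1^4 \mathbf{1}_{\{x_1 \in I_2\}}\big).$$

It follows that

$$\begin{aligned} \frac{1}{b^2} \sum_{i=1}^{b} \mathbb{E}\left[m_i^2 \mathbf{1}_{\{x_i \in I_1\}} \left(\sum_{j=1}^{i-1} m_j \mathbf{1}_{\{x_j \in I_2\}}\right)^2\right] &\le \frac{1}{b^2} \sum_{i=1}^{b} \int_{I_1} \Big\{\mathbb{E}\big[\sigma^2(u, \mathcal{F}_{i-1}) f_{x_i \mid \mathcal{F}_{i-1}}(u)\big]^2\Big\}^{1/2} \Big\{C (i-1)^2\,\mathbb{E}\big(m_1^4 \mathbf{1}_{\{x_1 \in I_2\}}\big)\Big\}^{1/2} du \\ &= \frac{1}{b^2} \Big\{C\,\mathbb{E}\big(m_1^4 \mathbf{1}_{\{x_1 \in I_2\}}\big)\Big\}^{1/2} \sum_{i=1}^{b} (i-1) \int_{I_1} \Big\{\mathbb{E}\big[\sigma^2(u, \mathcal{F}_{i-1}) f_{x_i \mid \mathcal{F}_{i-1}}(u)\big]^2\Big\}^{1/2} du. \end{aligned}$$

Here $\mathbb{E}(m_1^4 \mathbf{1}_{\{x_1 \in I_2\}}) \le \mathbb{E}(m_1^4)$, which is bounded by assumption [A2](ii). In addition, from Koul and Stute (1999), $\int_{I_1} \{\mathbb{E}[\sigma^2(u) f_{x_i \mid \mathcal{F}_{i-1}}(u)]^2\}^{1/2}\,du$ is bounded by assumptions [A2](iii) and [A3]; see the more detailed discussion therein. Therefore, under $H_0$ and assumptions [A1]-[A3], the process $M_b$ is tight. Note that our assumptions [A2](ii) and (iii) are similar to assumption (A)(a) in Koul and Stute (1999). Domínguez and Lobato (2004) use stricter conditions (see their [A7] and [A8]) and Hölder's inequality to obtain the boundedness of $\mathbb{E}(m_1^4 \mathbf{1}_{\{x_1 \in I_1\}})$.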

Under assumptions [A1] and [A2](i), and by a central limit theorem for ergodic stationary martingale difference sequences, we have for any $\tau \in \mathbb{R}^k$,

$$M_b(\tau; \beta_o) \Rightarrow N(0, V(\tau)).$$

For $\xi_1, \xi_2 \in \mathbb{R}^k$,

$$\mathrm{Cov}\big(M_b(\xi_1; \beta_o), M_b(\xi_2; \beta_o)\big) = \frac{1}{b} \sum_{i=1}^{b} \mathbb{E}\big[m(y_i, x_i; \beta_o)^2 \mathbf{1}_{\{x_i \le \xi_1\}} \mathbf{1}_{\{x_i \le \xi_2\}}\big] \xrightarrow{p} \int_{-\infty}^{\xi_1 \wedge \xi_2} \sigma^2(u)\,F(du) = V(\xi_1 \wedge \xi_2),$$

where the first equality holds by the martingale difference property. Since $V(\xi)$ is nondecreasing and nonnegative, $M_b$ is asymptotically distributed as $B(V(\xi))$, where $B(\cdot)$ is a standard Brownian sheet. □

Proof of Theorem 2.2. In this proof, we show that using a consistent estimator ˆσ_b2 to replace σ₀2does not affect the asymptotics of the scale invariant subsampling marked empirical process. Rewrite the process ˜Mb(ξ; ˆβn) :

1 √ bσˆ −1 b b X i=1 m yi, xi, ˆβn 11{xi≤ξ} = ˆσ−1_b − σ−1₀ 1 √ b b X i=1 m yi, xi, ˆβn 11{xi≤ξ}+ 1 √ bσ −1 0 b X i=1 m yi, xi, ˆβn 11{xi≤ξ}.

Since ˆσ_b−1− σ₀−1= op(1), and by Theorem 2.1,

b−1/2 b X i=1 m yi, xi, ˆβn 11{xi≤ξ}= Op(1), then ˜ Mb(ξ; ˆβn) = 1 √ bσ −1 0 b X i=1 m yi, xi, ˆβn 11{xi≤ξ}+ op(1). Denote ˜ M_bo(ξ, β) := √1 bσ −1 0 b X i=1 m yi, xi, β 11{xi≤ξ}.

The processes $\tilde M_b(\xi;\hat\beta_n)$ and $\tilde M_b^o(\xi,\hat\beta_n)$ have the same limiting distribution. In addition, similar to the proof of Theorem 2.1, replacing $\beta_o$ by $\hat\beta_n$ in $\tilde M_b^o$ does not affect the asymptotics of $\tilde M_b^o$. It follows that

$$\tilde M_b(\xi;\hat\beta_n) = \tilde M_b^o(\xi;\beta_o) + o_p(1),$$

and it suffices to focus on the limiting behavior of $\tilde M_b^o(\xi;\beta_o)$. The tightness of $M_b$ can be obtained as in Theorem 2.1, as $\sigma_0^2$ is continuous. Since $\{m(y_i,x_i,\beta)\,\mathbf{1}_{\{x_i\le\xi\}}, \mathcal{F}_{i-1}\}$ is a martingale difference sequence, we then use the central limit theorem for ergodic and stationary martingale difference sequences to obtain the limiting distribution, which is a Gaussian process with zero mean and, for $\xi_1, \xi_2 \in \mathbb{R}^k$,

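$$\mathrm{Cov}\big(\tilde M_b^o(\xi_1,\beta_o),\, \tilde M_b^o(\xi_2,\beta_o)\big) = \frac{1}{b}\,\sigma_0^{-2} \sum_{i=1}^{b} \mathbb{E}\big[m(y_i,x_i;\beta_o)^2\, \mathbf{1}_{\{x_i\le\xi_1\}} \mathbf{1}_{\{x_i\le\xi_2\}}\big] \xrightarrow{p} \int_{-\infty}^{\xi_1\wedge\xi_2} F(du) = F(\xi_1\wedge\xi_2).$$

Hence, $\tilde M_b(\xi;\hat\beta_n) \Rightarrow B(F(\xi))$, with $B$ a Brownian sheet. $\Box$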
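The key step of this proof, that replacing $\sigma_0$ by a consistent estimator changes the process only by an $o_p(1)$ term, can be illustrated by simulation. This is a minimal sketch under assumptions not taken from the paper: scalar $x_i$ i.i.d. Uniform(0, 1), a homoskedastic moment function with $\sigma_0 = 2$ evaluated at the true parameter, and the scale estimator $\hat\sigma_b^2 = b^{-1}\sum_i m_i^2$. The function names are illustrative.

```python
import math
import random


def sup_gap(b, rng, sigma0=2.0):
    """Sup over xi of |feasible - infeasible| process for one simulated subsample."""
    xs = [rng.random() for _ in range(b)]
    # Homoskedastic moment function with conditional mean zero (H0 holds).
    ms = [sigma0 * rng.gauss(0.0, 1.0) for _ in range(b)]
    # Assumed scale estimator: sigma_hat^2 = b^{-1} * sum_i m_i^2.
    sigma_hat = math.sqrt(sum(m * m for m in ms) / b)
    # sup_xi |b^{-1/2} sum_i m_i 1{x_i <= xi}|; the step function only jumps
    # at the observed x_i, so it suffices to scan the sorted partial sums.
    running, sup_partial = 0.0, 0.0
    for _, m in sorted(zip(xs, ms)):
        running += m
        sup_partial = max(sup_partial, abs(running))
    sup_partial /= math.sqrt(b)
    # The two processes differ exactly by the factor (1/sigma_hat - 1/sigma0).
    return abs(1.0 / sigma_hat - 1.0 / sigma0) * sup_partial


def mean_gap(b, reps=200, seed=1):
    rng = random.Random(seed)
    return sum(sup_gap(b, rng) for _ in range(reps)) / reps


gap_small_b = mean_gap(50)
gap_large_b = mean_gap(1000)
print("mean sup-gap, b = 50:  ", gap_small_b)
print("mean sup-gap, b = 1000:", gap_large_b)
```

The average gap should shrink as $b$ grows, consistent with the first term of the decomposition being $o_p(1) \cdot O_p(1)$.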

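Proof of Theorem 3.1. $\tilde M_b(\xi;\hat\beta_n)$ and $\tilde M_b^o(\xi;\beta_o)$ are asymptotically equivalent from (4). It suffices to discuss the limit of $\tilde M_b^o(\xi;\beta_o)$ under two different types of alternatives.

For part (i), rewrite $\tilde M_b^o(\xi;\beta_o)$:

$$\frac{1}{\sqrt b}\,\sigma_0^{-1} \sum_{i=1}^{b} m(y_i,x_i,\beta_o)\,\mathbf{1}_{\{x_i\le\xi\}} = \frac{1}{\sqrt b}\,\sigma_0^{-1} \sum_{i=1}^{b} \big[m(y_i,x_i,\beta_o) - \mu(x_i)\big]\mathbf{1}_{\{x_i\le\xi\}} + \frac{1}{\sqrt b}\,\sigma_0^{-1} \sum_{i=1}^{b} \mu(x_i)\,\mathbf{1}_{\{x_i\le\xi\}}.$$

Under $H_1$ and assumptions [A1]–[A5], by the previous proofs, the first term on the right-hand side converges to $B(F(\xi))$. In addition, if $\mathbb{E}|\mu(x_i)\mathbf{1}_{\{x_i\le\xi\}}| < \infty$, the second term $b^{-1/2}\sigma_0^{-1}\sum_{i=1}^{b}\mu(x_i)\mathbf{1}_{\{x_i\le\xi\}}$ behaves asymptotically like

$$\frac{1}{\sqrt b}\,\sigma_0^{-1} \sum_{i=1}^{b} \mathbb{E}\big[\mu(x_i)\mathbf{1}_{\{x_i\le\xi\}}\big] = \sqrt b\,\sigma_0^{-1}\, \mathbb{E}\big[\mu(x_i)\mathbf{1}_{\{x_i\le\xi\}}\big],$$

which diverges as $b \to \infty$ whenever the expectation is nonzero. Hence $\tilde M_b^o(\xi;\beta_o) \to \infty$, and thus $\tilde M_b(\xi;\hat\beta_n) \to \infty$.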

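For part (ii), rewrite $\tilde M_b^o(\xi;\beta_o)$:

$$\frac{1}{\sqrt b}\,\sigma_0^{-1} \sum_{i=1}^{b} m(y_i,x_i,\beta_o)\,\mathbf{1}_{\{x_i\le\xi\}} = \frac{1}{\sqrt b}\,\sigma_0^{-1} \sum_{i=1}^{b} \Big[m(y_i,x_i,\beta_o) - \frac{\delta(x_i)}{\sqrt b}\Big]\mathbf{1}_{\{x_i\le\xi\}} + \frac{1}{b}\,\sigma_0^{-1} \sum_{i=1}^{b} \delta(x_i)\,\mathbf{1}_{\{x_i\le\xi\}}.$$

Under $H_1^L$ and assumptions [A1]–[A5], by the previous proofs, the first term on the right-hand side converges to $B(F(\xi))$. If $\mathbb{E}|\delta(x_i)\mathbf{1}_{\{x_i\le\xi\}}| < \infty$, the probability limit of $b^{-1}\sigma_0^{-1}\sum_{i=1}^{b}\delta(x_i)\mathbf{1}_{\{x_i\le\xi\}}$ is $\sigma_0^{-1}\mathbb{E}[\delta(x_i)\mathbf{1}_{\{x_i\le\xi\}}]$. Therefore, under $H_1^L$, $\tilde M_b(\xi;\hat\beta_n)$ converges to a Brownian sheet process plus a non-zero constant term $\sigma_0^{-1}\mathbb{E}[\delta(x_i)\mathbf{1}_{\{x_i\le\xi\}}]$. $\Box$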
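The contrast between the two parts of this proof (a drift of order $\sqrt b$ under a fixed alternative, and a bounded shift under a local alternative) can be seen in a small simulation. This is a sketch under assumptions not made in the paper: scalar $x_i$ i.i.d. Uniform(0, 1), known $\sigma_0 = 1$, the constant choices $\mu(x) = 0.5$ and $\delta(x) = 0.5$, and a sup-type functional of the process as summary statistic. All names are illustrative.

```python
import math
import random


def sup_stat(xs, ms, sigma0=1.0):
    # sup_xi |b^{-1/2} sigma0^{-1} sum_i m_i 1{x_i <= xi}|, scanned over sorted x_i.
    running, best = 0.0, 0.0
    for _, m in sorted(zip(xs, ms)):
        running += m
        best = max(best, abs(running))
    return best / (sigma0 * math.sqrt(len(xs)))


def mean_stat(b, drift, reps=200, seed=2):
    # drift(x, b) is the conditional mean of m given x; it is zero under H0.
    rng = random.Random(seed)
    total = 0.0
    for _ in range(reps):
        xs = [rng.random() for _ in range(b)]
        ms = [drift(x, b) + rng.gauss(0.0, 1.0) for x in xs]
        total += sup_stat(xs, ms)
    return total / reps


def null_drift(x, b):
    return 0.0


def fixed_drift(x, b):
    # Hypothetical fixed alternative: mu(x) = 0.5.
    return 0.5


def local_drift(x, b):
    # Hypothetical local alternative: delta(x) / sqrt(b) with delta(x) = 0.5.
    return 0.5 / math.sqrt(b)


results = {}
for b in (100, 400):
    results[b] = {
        "null": mean_stat(b, null_drift),
        "fixed": mean_stat(b, fixed_drift),
        "local": mean_stat(b, local_drift),
    }
    print(b, results[b])
```

Under the fixed alternative the average statistic should roughly double when $b$ is quadrupled, while under the local alternative it should stay of the same order as under the null.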

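Proof of Theorem 3.3. Similar to the proof of Theorem 3.1, rewrite $\tilde M_b^o(\xi;\beta_o)$:

$$\frac{1}{\sqrt b}\,\sigma_0^{-1} \sum_{i=1}^{b} m(y_i,x_i,\beta_o)\,\mathbf{1}_{\{x_i\le\xi\}} = \frac{1}{\sqrt b}\,\sigma_0^{-1} \sum_{i=1}^{b} \Big[m(y_i,x_i,\beta_o) - \frac{\delta(x_i)}{\sqrt n}\Big]\mathbf{1}_{\{x_i\le\xi\}} + \frac{1}{\sqrt b \sqrt n}\,\sigma_0^{-1} \sum_{i=1}^{b} \delta(x_i)\,\mathbf{1}_{\{x_i\le\xi\}}.$$

The second term on the right-hand side of the above equation satisfies

$$\frac{1}{\sqrt b \sqrt n}\,\sigma_0^{-1} \sum_{i=1}^{b} \delta(x_i)\,\mathbf{1}_{\{x_i\le\xi\}} = \frac{\sqrt b}{\sqrt n} \Big[\sigma_0^{-1}\, \frac{1}{b} \sum_{i=1}^{b} \delta(x_i)\,\mathbf{1}_{\{x_i\le\xi\}}\Big] \xrightarrow{p} 0,$$

with $b/n \to 0$ and $b^{-1}\sum_{i=1}^{b}\delta(x_i)\mathbf{1}_{\{x_i\le\xi\}} \xrightarrow{p} \mathbb{E}[\delta(x_i)\mathbf{1}_{\{x_i\le\xi\}}]$. Therefore, $\tilde M_b(\xi;\hat\beta_n)$ converges to a Brownian sheet process under both $H_0$ and $H_2^L$. $\Box$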
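The vanishing shift in this proof is simple arithmetic in the factor $\sqrt{b/n}$. The sketch below tabulates it under a hypothetical subsample rule $b = \lfloor n^{0.6} \rfloor$, with $\sigma_0 = 1$ and $\mathbb{E}[\delta(x_i)\mathbf{1}_{\{x_i\le\xi\}}] = 1$; none of these specific choices come from the paper.

```python
import math


def local_shift(n, b, mean_delta=1.0, sigma0=1.0):
    # Approximate shift: sqrt(b / n) * sigma0^{-1} * E[delta(x) 1{x <= xi}].
    return math.sqrt(b / n) * mean_delta / sigma0


sample_sizes = [10**2, 10**3, 10**4, 10**5, 10**6]
# Hypothetical rule b = floor(n^0.6): b grows without bound while b / n -> 0.
subsample_sizes = [int(n ** 0.6) for n in sample_sizes]
shifts = [local_shift(n, b) for n, b in zip(sample_sizes, subsample_sizes)]

for n, b, s in zip(sample_sizes, subsample_sizes, shifts):
    print(f"n = {n:>7}, b = {b:>5}, shift = {s:.4f}")
```

The shift shrinks toward zero as $n$ grows, which is why the subsampling process has the same Brownian sheet limit under $H_0$ and under alternatives of order $n^{-1/2}$.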

References

Andrews, D. and P. Guggenberger (2005). Hybrid and size-corrected subsample methods, unpublished manuscript, Cowles Foundation, Yale University.

Bai, J. (2003). Testing parametric conditional distributions of dynamic models, Review of Economics and Statistics, 85, 531–549.

Bickel, P. and M. Wichura (1971). Convergence criteria for multiparameter stochastic processes and some applications, Annals of Mathematical Statistics, 42, 1656–1670.

Bierens, H. (1982). Consistent model specification tests, Journal of Econometrics, 20, 105–134.

Bierens, H. (1984). Model specification testing of time series regressions, Journal of Econometrics, 26, 323–353.

Bierens, H. (1990). A consistent conditional moment test of functional form, Econometrica, 58, 1443–1458.


Bierens, H. and D. Ginther (2001). Integrated conditional moment testing of quantile regression models, Empirical Economics, 26, 307–324.

Bierens, H. and W. Ploberger (1997). Asymptotic theory of integrated conditional moment tests, Econometrica, 65, 1129–1151.

Chen, X. and Y. Fan (1999). Consistent hypothesis testing in semiparametric and nonparametric models for econometric time series, Journal of Econometrics, 91, 373–401.

Chernozhukov, V. and I. Fernández-Val (2005). Subsampling inference on quantile regression process, Sankhyā, 67, 253–276.

de Jong, R. (1996). On the Bierens test under data dependence, Journal of Econometrics, 72, 1–32.

Delgado, M. and W. González-Manteiga (2001). Significance testing in nonparametric regression based on the bootstrap, Annals of Statistics, 29, 1469–1507.

Domínguez, M. and I. Lobato (2004). Consistent estimation of models defined by conditional moment restrictions, Econometrica, 72, 1601–1615.

Durbin, J. (1973). Weak convergence of the sample distribution function when parameters are estimated, Annals of Statistics, 1, 279–290.

Eubank, R. and C. Spiegelman (1990). Testing the goodness-of-fit of a linear model via nonparametric regression techniques, Journal of the American Statistical Association, 85, 387–392.

Fan, Y. and Q. Li (1996). Consistent model specification tests: omitted variables and semi-parametric functional forms, Econometrica, 64, 865–890.

Guggenberger, P. and M. Wolf (2004). Subsampling tests of parameter hypotheses and overidentifying restrictions with possible failure of identification, working paper, Department of Economics, UCLA.

Hong, H. and O. Scaillet (2006). A fast subsampling method for nonlinear dynamics models, Journal of Econometrics, 133, 557–578.

Hong, Y. and H. White (1995). Consistent specification testing via nonparametric series regression, Econometrica, 63, 1133–1159.

Horowitz, J. and V. Spokoiny (2001). An adaptive, rate-optimal test of a parametric mean-regression model against a nonparametric alternative, Econometrica, 69, 599–631.


Probability and its applications, XXVI, 240–257.

Koul, H. and W. Stute (1999). Nonparametric model checks for time series, Annals of Statistics, 27, 204–236.

Kuan, C.-M. and H.-y. Lin (2008). A consistent and asymptotically pivotal test for conditional moment restrictions, working paper.

Lewbel, A. (1995). Consistent nonparametric hypothesis tests with an application to Slutsky symmetry, Journal of Econometrics, 67, 379–401.

Li, Q. and S. Wang (1998). A simple consistent bootstrap test for a parametric regression function, Journal of Econometrics, 87, 145–165.

Li, Q., C. Hsiao, and J. Zinn (2003). Consistent specification tests for semiparametric/nonparametric models based on series estimation methods, Journal of Econometrics, 112, 295–325.

Linton, O., E. Maasoumi, and Y.-J. Whang (2005). Consistent testing for stochastic dominance under general sampling schemes, Review of Economic Studies, 72, 735–765.

Newey, W. (1985). Maximum likelihood specification testing and conditional moment tests, Econometrica, 53, 1047–1070.

Politis, D. N. and J. P. Romano (1994). Large sample confidence regions based on subsamples under minimal assumptions, Annals of Statistics, 22, 2031–2050.

Politis, D. N., J. P. Romano, and M. Wolf (1999). Subsampling, New York: Springer.

Shorack, G. and J. Wellner (1986). Empirical Processes with Applications to Statistics, New York: John Wiley & Sons.

Song, K. (2007). Testing semiparametric conditional moment restrictions using conditional martingale transforms, working paper.

Stinchcombe, M. and H. White (1998). Consistent specification testing with nuisance parameters present only under the alternative, Econometric Theory, 14, 295–325.

Stute, W. (1997). Nonparametric model checks for regression, Annals of Statistics, 25, 613–641.

Stute, W., W. González Manteiga and M. Presedo Quindimil (1998). Bootstrap approximations in model checks for regression, Journal of the American Statistical Association, 93, 141–149.

Stute, W., S. Thies, and L. Zhu (1998). Model checks for regression: an innovation process approach, Annals of Statistics, 26, 1916–1934.


Stute, W. and L. Zhu (2002). Model checks for generalized linear models, Scandinavian Journal of Statistics, 29, 535–545.

Tauchen, G. (1985). Diagnostic testing and evaluation of maximum likelihood models, Journal of Econometrics, 30, 415–443.

Tripathi, G. and Y. Kitamura (2003). Testing conditional moment restrictions, Annals of Statistics, 31, 2059–2095.

Whang, Y. (2000). Consistent bootstrap tests of parametric regression functions, Journal of Econometrics, 98, 27–46.

Whang, Y. (2001). Consistent specification testing for conditional moment restrictions, Economics Letters, 71, 299–306.

Whang, Y. (2004). Consistent specification testing for quantile regression models, working paper.

White, H. (1987). Specification testing in dynamic models. In T. Bewley (ed.), Advances in Econometrics–Fifth World Congress, 1, New York: Cambridge University Press.

Zheng, J. (1998a). A consistent nonparametric test of parametric regression models under conditional quantile restrictions, Econometric Theory, 14, 123–138.

Zheng, J. (1998b). Consistent specification testing for conditional symmetry, Econometric Theory, 14, 139–149.

Zheng, J. (2000). A consistent test of conditional parametric distributions, Econometric Theory, 16, 667–691.
