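National Science Council, Executive Yuan: Research Project Final Report
A Distribution-Free Test for Model Specification
Research Report (Condensed Version)
Project type: Individual
Project number: NSC 98-2410-H-004-056-
Project period: August 1, 2009 to July 31, 2010
Host institution: Department of Economics, National Chengchi University
Principal investigator: 林馨怡
Disposition: This project is open to public inquiry
August 19, 2010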
1 Introduction
Many economic and econometric models are represented by conditional moment restrictions, for example, the rational expectation model, the market disequilibrium model, the conditional probability model, the discrete choice model, and the nonlinear simultaneous equations model. The validity of such models is assessed by testing the conditional moment restrictions. Examples include the conditional moment tests, or M-tests, developed by Newey (1985), Tauchen (1985), and White (1987). However, these tests may not be consistent because they check only necessary conditions of the conditional moment restrictions. There is an abundant literature on constructing consistent conditional moment tests. One technique is to employ a nonparametric test; see, for example, Delgado and González Manteiga (2001), Li, Hsiao, and Zinn (2003), Horowitz and Spokoiny (2001), Tripathi and Kitamura (2003), and Zheng (2000), among others. Nonparametric tests usually require a subjective choice of smoothing parameters and may be computationally costly. Another technique for constructing a consistent conditional moment test is based on infinitely many unconditional orthogonality restrictions with uncountably many weight functions indexed by continuous nuisance parameters (Stinchcombe and White, 1998). This technique is called the integrated function approach because it uses integrated measures of dependence of the orthogonality restrictions. For these types of tests, Bierens (1982, 1984, 1990), Bierens and Ploberger (1997), Bierens and Ginther (2001), and de Jong (1996) employ the exponential function as the weight function, while Koul and Stute (1999), Stute (1997), Stute, Thies and Zhu (1998), and Stute and Zhu (2002) employ the indicator function.
In general, tests based on the integrated function approach are not asymptotically pivotal. That is, their limiting distributions depend on model characteristics and critical values cannot be tabulated. For example, the limiting distribution of tests employing the exponential weight function depends on the data generating process (DGP) of the auxiliary nuisance parameters. Although Bierens and Ploberger (1997) derive case-independent upper bounds of the critical values to address this problem, their test may be too conservative in practice. Meanwhile, tests employing the indicator weight function are not asymptotically pivotal because of estimation effects (Durbin, 1973), and their limiting distributions are case dependent. Domínguez and Lobato (2006), Stute, González Manteiga and Presedo Quindimil (1998), and Whang (2000, 2001, 2004) avoid the problem by using bootstrapping techniques to approximate the limiting distribution. Alternatively, Khmaladze and Koul (2004), Stute, Thies and Zhu (1998), Koul and Stute (1999), Stute and Zhu (2002), and Song (2009) employ the martingale transformation technique of Khmaladze (1981) to obtain asymptotically distribution-free test statistics. However, these tests usually perform poorly in finite samples because of the curse of dimensionality. Recently, Escanciano (2006) and Lavergne and Patilea (2008) propose tests that break the curse of dimensionality. The former test is based on the integrated function technique and uses projections, while the latter is based on the smoothing nonparametric technique.
Accordingly, this paper proposes a consistent conditional moment test that is asymptotically pivotal. The proposed test is based on the integrated function approach, and the test statistic is obtained through a subsampling marked empirical process that uses a subsample of size $b$ instead of the whole sample of size $n$, with $b < n$. Subsampling, as defined by Politis and Romano (1994) and Politis, Romano and Wolf (1999), is a method for estimating the distribution of an estimator or test statistic by drawing subsamples from the original data. Andrews and Guggenberger (2005), Chernozhukov and Fernández-Val (2005), Guggenberger and Wolf (2004), Hong and Scaillet (2006), Linton, Maasoumi and Whang (2005) and Whang (2004) have employed subsampling techniques for estimating the distribution of estimators. Instead of computing the sample average of the conditional moment function with the whole sample, the test statistic is obtained from the subsampling marked empirical process with subsample size $b$. The estimation effect disappears when the ratio of the subsample size to the whole sample size is zero asymptotically. Therefore, the proposed test does not suffer from the estimation effect problem and is asymptotically pivotal. Further, multiple regressors may be employed in the test. Thus, the proposed test can be viewed as a complement to Escanciano (2006) and Lavergne and Patilea (2008) in breaking the curse of dimensionality. Additionally, any $\sqrt{n}$-consistent estimator and different estimation methods may be employed to compute the test statistic. Bootstrapping, martingale transformation, or nonparametric techniques are not required, which simplifies computation of the test statistic. However, the proposed test has power against local alternatives at rate $b^{-1/2}$ but is incapable of detecting local alternatives at rate $n^{-1/2}$. Monte Carlo simulations show good finite sample performance, and the proposed test is robust to different values of $b$.
The remainder of this paper is organized as follows. Section 2 presents the conditional moment restriction and the proposed test. Section 3 shows the consistency of the proposed test and its asymptotic behavior under different local alternatives. Section 4 reports the results of Monte Carlo simulations. Section 5 concludes. All proofs are presented in the Appendix.
2 A New Test
2.1 Conditional Moment Restrictions
Consider the general conditional moment restrictions
$$\mathbb{E}[m(Y, X, \theta_o) \mid X] = 0, \tag{1}$$
where $\mathbb{E}[\cdot \mid X]$ denotes the expectation conditional on the information set of $X$, the function $m(\cdot)$ is well-defined, $\{Y, X\}$ is a sequence of random variables with $X = (X_1, \cdots, X_k)'$, and the parameters satisfy $\theta \in \Theta$ with $\Theta \subset \mathbb{R}^k$. The conditional moment restrictions can be obtained from existing models such as the parametric nonlinear regression model, where $m(Y, X, \theta_o)$ is the difference between $Y$ and $g(X, \theta_o)$, with $g(\cdot)$ being a nonlinear function. To test the conditional moment restrictions, the null and alternative hypotheses are as follows. The null hypothesis is that the conditional moment function equals zero almost surely:
$$H_0: P\big(\mathbb{E}[m(Y, X, \theta_o) \mid X] = 0\big) = 1, \quad \text{for some } \theta_o \in \Theta,$$
and the alternative hypothesis is that, for all $\theta \in \Theta$, $\mathbb{E}[m(Y, X, \theta) \mid X] \neq 0$ with positive probability:
$$H_1: P\big(\mathbb{E}[m(Y, X, \theta) \mid X] = 0\big) < 1, \quad \text{for all } \theta \in \Theta,$$
with $\Theta \subset \mathbb{R}^k$ a compact set.
As shown by Stinchcombe and White (1998), the conditional moment condition (1) is equivalent to infinitely many unconditional moment conditions
$$\mathbb{E}[m(Y, X, \theta_o)\,\omega(X, x)] = 0, \quad \forall x \in \mathbb{R}^k, \tag{2}$$
where $\omega(\cdot)$ is a weight function indexed by the continuous parameter $x$, and $\omega(\cdot)$ may be any analytic function that is not polynomial. A consistent conditional moment test can be constructed by testing (2). For example, Bierens (1982, 1984, 1990), de Jong (1996), Bierens and Ploberger (1997), and Bierens and Ginther (2001) employ the exponential weight function $\omega(X, x) = \exp(X'x)$ for their integrated conditional moment test. Meanwhile, Stute (1997), Stute, Thies and Zhu (1998), Koul and Stute (1999) and Stute and Zhu (2002) employ the indicator function
$$\omega(X, x) = \mathbf{1}_{\{X \le x\}} := \mathbf{1}_{\{X_1 \le x_1\}} \cdots \mathbf{1}_{\{X_k \le x_k\}},$$
where $\mathbf{1}_A$ denotes the indicator function of the event $A$. This paper employs the indicator function, so the conditional moment restrictions (1) can be rewritten as the following infinitely many unconditional moment conditions:
$$\mathbb{E}[m(Y, X, \theta_o)\mathbf{1}_{\{X \le x\}}] = 0, \quad \forall x \in \mathbb{R}^k, \tag{3}$$
wherein multivariate regressors may be employed; see Khmaladze and Koul (2004), Escanciano (2006), and Song (2009).
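The two weight functions discussed above can be written down directly. The following Python sketch is only an illustration: the function names and the toy data are hypothetical and do not come from the paper.

```python
import numpy as np

def indicator_weight(X, x):
    """Indicator weight 1{X_1 <= x_1} * ... * 1{X_k <= x_k}, one value per row of X."""
    X = np.atleast_2d(np.asarray(X, dtype=float))
    x = np.asarray(x, dtype=float)
    return np.all(X <= x, axis=1).astype(float)

def exponential_weight(X, x):
    """Exponential weight exp(X'x), one value per row of X."""
    X = np.atleast_2d(np.asarray(X, dtype=float))
    x = np.asarray(x, dtype=float)
    return np.exp(X @ x)

# Hypothetical data: three observations of a two-dimensional regressor.
X = np.array([[0.2, -1.0], [1.5, 0.3], [-0.4, 0.9]])
x = np.array([1.0, 0.5])
w_ind = indicator_weight(X, x)    # 1 only when both coordinates are below x
w_exp = exponential_weight(X, x)  # strictly positive for every observation
```

The indicator weight is bounded and needs no tuning beyond the evaluation point $x$, which is one practical reason for preferring it in a sketch like this.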
2.2 Test Statistics
The specification test employed in this paper examines the infinitely many unconditional moment conditions (3), which are equivalent to the conditional moment restriction (1). Thus, the specification test is a consistent conditional moment test. To test whether the moment function $\mathbb{E}[m(Y, X, \theta_o)\mathbf{1}_{\{X \le x\}}]$ equals zero, the normalized sample average of the moment function,
$$M_n(x; \theta_o) := \frac{1}{\sqrt{n}} \sum_{i=1}^{n} m(Y_i, X_i, \theta_o)\mathbf{1}_{\{X_i \le x\}},$$
with $\{Y_i, X_i\}_{i=1}^{n}$ a sequence of random variables and $\mathbf{1}_{\{X_i \le x\}} = \mathbf{1}_{\{X_{i1} \le x_1\}} \cdots \mathbf{1}_{\{X_{ik} \le x_k\}}$, is employed. The function $m(Y_i, X_i, \theta)\mathbf{1}_{\{X_i \le x\}}$ is the marked empirical process with the marks given by the moment function $m$. The function $M_n$ is the average of the marked empirical process with sample size $n$. If $M_n(x; \theta_o)$ is close to zero for all $x$, then the null hypothesis is not rejected. Otherwise, the null hypothesis is rejected and the conditional moment restriction does not hold.
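As a rough illustration of the definition of $M_n$, the Python sketch below evaluates the process at a set of points $x$ for given marks $m(Y_i, X_i, \theta)$. The function name, argument layout, and toy numbers are assumptions made for this example only.

```python
import numpy as np

def marked_empirical_process(marks, X, points):
    """Evaluate M_n(x) = n^{-1/2} * sum_i marks[i] * 1{X_i <= x} at each row of `points`.

    `marks` holds the values m(Y_i, X_i, theta); the inequality X_i <= x is componentwise.
    """
    marks = np.asarray(marks, dtype=float)
    n = marks.shape[0]
    X = np.asarray(X, dtype=float).reshape(n, -1)                      # shape (n, k)
    points = np.asarray(points, dtype=float).reshape(-1, X.shape[1])   # shape (J, k)
    # below[j, i] is True when every coordinate of X_i is <= the matching coordinate of points[j]
    below = np.all(X[None, :, :] <= points[:, None, :], axis=2)
    return below @ marks / np.sqrt(n)

# Hypothetical toy data: three residual-type marks and a single regressor.
marks = np.array([1.0, -2.0, 3.0])
X = np.array([0.5, 1.5, 2.5])
values = marked_empirical_process(marks, X, points=[1.0, 3.0])
```

At $x = 1$ only the first observation contributes, while at $x = 3$ all three marks are summed, so the process is a step function that jumps at the observed regressor values.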
Since the true parameter $\theta_o$ is unknown, we replace $\theta_o$ by a consistent estimator, $\hat\theta_n$. Thus the sample average of the marked empirical process is
$$M_n(x; \hat\theta_n) = \frac{1}{\sqrt{n}} \sum_{i=1}^{n} m(Y_i, X_i, \hat\theta_n)\mathbf{1}_{\{X_i \le x\}}.$$
Rewrite the process $M_n$ as
$$M_n(x; \hat\theta_n) = M_n(x; \theta_o) + \frac{1}{\sqrt{n}} \sum_{i=1}^{n} \big[m(Y_i, X_i, \hat\theta_n) - m(Y_i, X_i, \theta_o)\big]\mathbf{1}_{\{X_i \le x\}}.$$
If $m(Y_i, X_i, \theta)$ is once differentiable with first derivative $\nabla_\theta m(Y_i, X_i, \theta_o)$, then
$$\begin{aligned}
M_n(x; \hat\theta_n) &= M_n(x; \theta_o) + \frac{1}{\sqrt{n}} \sum_{i=1}^{n} \nabla_\theta m(Y_i, X_i, \theta_o)(\hat\theta_n - \theta_o)\mathbf{1}_{\{X_i \le x\}} + o_p(1) \\
&= M_n(x; \theta_o) + \sqrt{n}(\hat\theta_n - \theta_o)\, \frac{1}{n} \sum_{i=1}^{n} \nabla_\theta m(Y_i, X_i, \theta_o)\mathbf{1}_{\{X_i \le x\}} + o_p(1).
\end{aligned}$$
Thus, $M_n(x; \hat\theta_n)$ and $M_n(x; \theta_o)$ are not asymptotically equivalent because of the second term on the right-hand side of the second equality. This term is the estimation effect discussed in Durbin (1973); it depends on model characteristics, which makes the test based on $M_n(x; \hat\theta_n)$ not asymptotically pivotal.
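To make the estimation effect concrete, consider the linear regression moment function as a worked example. This special case is an illustration chosen here, not one singled out at this point in the paper; it assumes i.i.d. data with $\mathbb{E}|X_i| < \infty$.

```latex
\begin{aligned}
m(Y_i, X_i, \theta) &= Y_i - X_i'\theta, \qquad \nabla_\theta m(Y_i, X_i, \theta) = -X_i, \\
M_n(x; \hat\theta_n) &= M_n(x; \theta_o)
  - \Big[\frac{1}{n}\sum_{i=1}^{n} X_i' \mathbf{1}_{\{X_i \le x\}}\Big] \sqrt{n}(\hat\theta_n - \theta_o), \\
\frac{1}{n}\sum_{i=1}^{n} X_i \mathbf{1}_{\{X_i \le x\}} &\xrightarrow{p} \mathbb{E}\big[X_i \mathbf{1}_{\{X_i \le x\}}\big].
\end{aligned}
```

Because $m$ is linear in $\theta$, the expansion holds exactly, without a remainder. The drift term is the product of an $O_p(1)$ quantity and a limit that depends on the distribution of $X_i$, which is why critical values based on $M_n(x; \hat\theta_n)$ cannot be tabulated once and for all.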
Stute (1999) and Stute and Zhu (2002) employ the martingale transformation technique for univariate regressors, and Khmaladze and Koul (2004) and Song (2009) employ the same technique for multivariate regressors. However, the martingale transformation requires a nonparametric estimate of the conditional moment function, which is complicated to compute in high dimensions and depends on user-chosen parameters. In addition, its finite sample performance is poor because of the curse of dimensionality. To address the subjective choice of parameters and the curse of dimensionality, Escanciano (2006) proposes a consistent conditional moment test using projections, and his test has excellent empirical power in finite samples. However, the limiting distribution of Escanciano's test must be approximated by bootstrapping and is not asymptotically pivotal.
Thus, this paper employs a subsampling version of the $M_n$ process to construct a consistent conditional moment test that is asymptotically pivotal. Instead of employing the whole sample of size $n$ to compute the marked empirical process, a subsample of size $b$ is employed to compute the sample average and construct the process, for $b < n$:
$$M_b(x; \hat\theta_n) := \frac{1}{\sqrt{b}} \sum_{i=1}^{b} m(Y_i, X_i, \hat\theta_n)\mathbf{1}_{\{X_i \le x\}},$$
where $\hat\theta_n$ can be any $\sqrt{n}$-consistent estimator associated with the model of interest, computed with sample size $n$. Employing $M_b$ gives
$$\begin{aligned}
M_b(x; \hat\theta_n) &= M_b(x; \theta_o) + \frac{1}{\sqrt{b}} \sum_{i=1}^{b} \nabla_\theta m(Y_i, X_i, \theta_o)(\hat\theta_n - \theta_o)\mathbf{1}_{\{X_i \le x\}} + o_p(1) \\
&= M_b(x; \theta_o) + \sqrt{\frac{b}{n}}\, \sqrt{n}(\hat\theta_n - \theta_o) \left[\frac{1}{b} \sum_{i=1}^{b} \nabla_\theta m(Y_i, X_i, \theta_o)\mathbf{1}_{\{X_i \le x\}}\right] + o_p(1).
\end{aligned} \tag{4}$$
If $b \to \infty$, $n \to \infty$ and $b/n \to 0$, and some regularity conditions hold, then the second term on the right-hand side of the second equality of (4) converges to zero. Thus, $M_b(x; \hat\theta_n)$ and $M_b(x; \theta_o)$ are asymptotically equivalent: subsampling the marked empirical process eliminates the estimation effect. Let $D(\mathbb{R}^k)$ be the space of càdlàg functions on $\mathbb{R}^k$ endowed with the Skorohod topology; $M_b$ is in $D(\mathbb{R}^k)$. Let $\Rightarrow$ denote convergence in distribution and $\xrightarrow{p}$ denote convergence in probability. The following assumptions are sufficient for the weak convergence of the subsampling marked empirical process.
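[A1] $\{Y_i, X_i\}_{i=1}^{n}$ is independent and identically distributed (i.i.d.), where $X_i$ has the bounded
[A2] (i) $\mathbb{E}[m(Y_i, X_i, \theta)^2 \mid X_i] < \infty$,
(ii) $\mathbb{E}[m(Y_i, X_i, \theta)^4] = \kappa < \infty$,
(iii) $\mathbb{E}[m(Y_i, X_i, \theta)^4 \|X_i\|^{1+\eta}] < \infty$, for some $\eta > 0$.
[A3] $m(\cdot)$ is once continuously differentiable in a neighborhood of $\theta_o$ and satisfies $\mathbb{E}\big[\sup_{\theta \in \Theta_o} |\nabla_\theta m(Y_i, X_i, \theta)|\big] < \infty$, where $\Theta_o$ denotes a neighborhood of $\theta_o$.
[A4] $\hat\theta_n$ is a $\sqrt{n}$-consistent estimator; that is, $\sqrt{n}(\hat\theta_n - \theta_o) = O_p(1)$.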
The assumptions in [A2] restrict the dependence of the moment function. Given [A2] (i), the conditional variance function $\sigma^2(X_i)$ of $m(Y_i, X_i, \theta)$ is defined by
$$\sigma^2(u) := \operatorname{var}[m(Y_i, X_i, \theta) \mid X_i = u].$$
For $x = (x_1, \cdots, x_k)'$ and $u = (u_1, \cdots, u_k)'$, define
$$V(x) := \mathbb{E}[\sigma^2(X_i)\mathbf{1}_{\{X_i \le x\}}] = \int_{-\infty}^{x} \sigma^2(u) F(du),$$
with $\int_{-\infty}^{x} := \int_{-\infty}^{x_1} \cdots \int_{-\infty}^{x_k}$. Assumptions [A1] and [A2] are required to obtain uniform tightness in the space $D[-\infty, \infty]$. Assumption [A3] is a standard smoothness assumption; it can be relaxed to allow a non-smooth moment function by considering the stochastic equicontinuity of $m$. Assumption [A4] is weak and is satisfied by most existing estimation methods. The weak convergence of $M_b$ is obtained as follows.
Theorem 2.1. Under $H_0$ and given assumptions [A1]-[A4], if $b \to \infty$, $n \to \infty$ and $b/n \to 0$, then
$$M_b(x; \hat\theta_n) \Rightarrow B(V(x)),$$
where $B(\cdot)$ is a Gaussian process with mean zero and covariance function $V(x_1 \wedge x_2)$.
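To get a feel for the condition $b/n \to 0$, the factor $\sqrt{b/n}$ that multiplies the estimation effect in (4) can be tabulated for the rule $b = n^p$ used later in the simulations. The Python sketch below assumes that $b$ is rounded down to an integer; the paper does not state a rounding rule, and the function names are illustrative.

```python
import math

def subsample_size(n, p):
    # Assumption: b = n**p is rounded down to the nearest integer.
    return int(math.floor(n ** p))

def estimation_effect_scale(n, p):
    """Return sqrt(b/n), the factor multiplying the estimation effect in (4)."""
    b = subsample_size(n, p)
    return math.sqrt(b / n)

# The factor shrinks as n grows for a fixed exponent p < 1.
scales = [estimation_effect_scale(n, 0.8) for n in (100, 10_000, 1_000_000)]
```

For exponents $p$ close to one the factor decays slowly, so in moderate samples some estimation effect remains even though it vanishes asymptotically.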
The limiting distribution of $M_b$ is a centered Gaussian process, namely a multi-parameter Brownian motion process on $[0, 1]^k$ with covariance function
$$V(x_1 \wedge x_2) = \int_{-\infty}^{x_1 \wedge x_2} \sigma^2(u) F(du),$$
where $\int_{-\infty}^{x_1 \wedge x_2} = \int_{-\infty}^{x_{11} \wedge x_{21}} \cdots \int_{-\infty}^{x_{1k} \wedge x_{2k}}$. In particular, when $X_i$ is univariate, the process $B$ is the standard Brownian motion process. The limit of $M_b(x; \theta_o)$ and that of $M_b(x; \hat\theta_n)$ coincide because $b$ diverges to infinity more slowly than $n$. Note that $V(x)$ plays an important role in the proposed test. Since $V(x)$ still depends on the distribution of $X_i$ and on $\sigma^2$, the process $M_b(x; \hat\theta_n)$ is not asymptotically distribution-free. For the general conditional heteroskedasticity case, the scale-invariant version of the subsampling marked empirical process is considered:
$$\tilde{M}_b(x; \hat\theta_n) := \frac{1}{\sqrt{b}} \sum_{i=1}^{b} \hat\sigma(X_i)^{-1} m(Y_i, X_i, \hat\theta_n)\mathbf{1}_{\{X_i \le x\}},$$
where $\hat\sigma(X_i)^2$ is a consistent estimator of $\sigma(X_i)^2$. The scaled version of the statistic is also considered in many studies, such as Khmaladze and Koul (2004), Koul and Stute (1999), Stute (1997), Stute, Thies and Zhu (1998), and Song (2009). When $\sigma^2(X_i) = \sigma_0^2$ is a constant (the conditional homoskedasticity case), $\tilde{M}_b(x; \hat\theta_n)$ simplifies to
$$\frac{1}{\sqrt{b}}\, \hat\sigma_b^{-1} \sum_{i=1}^{b} m(Y_i, X_i, \hat\theta_n)\mathbf{1}_{\{X_i \le x\}},$$
with $\hat\sigma_b^2 = b^{-1} \sum_{i=1}^{b} m(Y_i, X_i, \hat\theta_n)^2$ a consistent estimator for $\sigma_0^2$.
Theorem 2.2. Under $H_0$ and given assumptions [A1]-[A4], if $b \to \infty$, $n \to \infty$, $b/n \to 0$ and $\hat\sigma(X_i)^2 - \sigma(X_i)^2 = o_p(b^{-1/2})$, then
$$\tilde{M}_b(x; \hat\theta_n) \Rightarrow B(F(x)),$$
with $B(\cdot)$ a Gaussian process with mean zero and covariance function $F(x_1 \wedge x_2)$.
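The computational counterpart of the scale-invariant process $\tilde{M}_b(x; \hat\theta_n)$ is
$$\tilde{M}_b(X_j; \hat\theta_n) := \frac{1}{\sqrt{b}} \sum_{i=1}^{b} \hat\sigma(X_i)^{-1} m(Y_i, X_i, \hat\theta_n)\mathbf{1}_{\{X_i \le X_j\}}, \quad j = 1, \cdots, n,$$
where each realization $X_j$ is used as an $x$ in the indicator function. Consider two goodness-of-fit statistics, the Kolmogorov-Smirnov and Cramér-von Mises test statistics:
$$KS_n = \sup_{X_j \in \mathbb{R}^k} \big|\tilde{M}_b(X_j; \hat\theta_n)\big|, \quad \text{and} \quad CM_n = \frac{1}{n} \sum_{j=1}^{n} \tilde{M}_b(X_j; \hat\theta_n)^2.$$
By Theorem 2.2 and the continuous mapping theorem, for large $n$ and with $\tau \in [0, 1]^k$,
$$KS_n \Rightarrow \sup_{x \in \mathbb{R}^k} \big|B(F(x))\big| = \sup_{\tau \in [0,1]^k} \big|B(\tau)\big|,$$
and
$$CM_n = \int_{-\infty}^{\infty} \tilde{M}_b(x; \hat\theta_n)^2 F(dx) \Rightarrow \int_{-\infty}^{\infty} B(F(x))^2 F(dx) = \int_{[0,1]^k} B(\tau)^2 d\tau.$$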
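The two statistics can be sketched in Python for a univariate linear regression under conditional homoskedasticity. Several choices below are assumptions of this example rather than statements from the paper: the model has no intercept, the subsample consists of the first $b$ observations, and $b = n^{0.8}$ is rounded down.

```python
import numpy as np

def ks_cm_statistics(y, x, b):
    """KS_n and CM_n for the model y = beta * x + e with homoskedastic scaling."""
    y = np.asarray(y, dtype=float)
    x = np.asarray(x, dtype=float)
    # Full-sample least squares slope (no intercept): plays the role of theta_hat_n.
    beta_hat = (x @ y) / (x @ x)
    resid = y - beta_hat * x
    # Assumption: the subsample is the first b observations.
    m_b, x_b = resid[:b], x[:b]
    sigma_hat = np.sqrt(np.mean(m_b ** 2))
    # Sort the subsample once, then read off partial sums of the marks.
    order = np.argsort(x_b)
    cum_marks = np.concatenate(([0.0], np.cumsum(m_b[order])))
    # n_below[j] counts subsample regressors that are <= X_j, for all n evaluation points.
    n_below = np.searchsorted(x_b[order], x, side="right")
    process = cum_marks[n_below] / (np.sqrt(b) * sigma_hat)
    return np.max(np.abs(process)), np.mean(process ** 2)

# Simulated data satisfying the null hypothesis.
rng = np.random.default_rng(0)
n = 200
x = rng.normal(size=n)
y = x + rng.normal(size=n)
b = int(n ** 0.8)
ks, cm = ks_cm_statistics(y, x, b)
```

Sorting the subsample once keeps the cost at roughly $O(n \log n)$, whereas building the full indicator matrix would take $O(nb)$ operations.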
The critical values of the test statistics $KS_n$ and $CM_n$ can be found in the existing literature, such as Shorack and Wellner (1986) and Khmaladze and Koul (2004). Note that the proposed test is asymptotically pivotal: its limiting distribution does not depend on the DGP. This result is summarized in the following corollary.
Corollary 2.3. Under all the assumptions in Theorem 2.2,
$$KS_n \Rightarrow \sup_{\tau \in [0,1]^k} \big|B(\tau)\big|, \quad \text{and} \quad CM_n \Rightarrow \int_{[0,1]^k} B(\tau)^2 d\tau,$$
where $B(\cdot)$ is a Gaussian process with mean zero and covariance $\tau_1 \wedge \tau_2$.
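Because the limits in Corollary 2.3 do not depend on the DGP, critical values can also be approximated once by simulation. The Python sketch below does this for the univariate case $k = 1$ only, approximating standard Brownian motion on $[0, 1]$ by a scaled Gaussian random walk. The number of paths, the grid size, and the seed are arbitrary choices for the example, and the discretization slightly understates the supremum.

```python
import numpy as np

def simulate_sup_abs_bm(n_paths=2000, n_steps=500, seed=0):
    """Draws of sup_{t in [0,1]} |B(t)| from a random-walk approximation of Brownian motion."""
    rng = np.random.default_rng(seed)
    increments = rng.normal(scale=np.sqrt(1.0 / n_steps), size=(n_paths, n_steps))
    paths = np.cumsum(increments, axis=1)   # B(1/n_steps), ..., B(1) along each row
    return np.max(np.abs(paths), axis=1)

sup_draws = simulate_sup_abs_bm()
crit_95 = np.quantile(sup_draws, 0.95)      # simulated 5% critical value for KS_n when k = 1
```

A test at the 5% level would then reject when $KS_n$ exceeds `crit_95`; tabulated values from the references cited above can be used instead when higher accuracy is needed.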
3 Power of the Tests
To investigate the power of the proposed test, two types of alternatives are considered. One is the general alternative,
$$H_1: \mathbb{E}[m(Y, X, \theta_o) \mid X] = \mu(X) \neq 0,$$
and the other is the sequence of local alternatives,
$$H_{1L}: \mathbb{E}[m(Y, X, \theta_o) \mid X] = \frac{\delta(X)}{\sqrt{b}},$$
with $\delta(X) \neq 0$. Under $H_1$, the proposed test statistic diverges, which gives the test its power.
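Theorem 3.1. Assume all the conditions of Theorem 2.2 hold and $b \to \infty$, $n \to \infty$ and $b/n \to 0$. Then:
(i) Under the fixed alternative $H_1$:
$$\tilde{M}_b(x; \hat\theta_n) \to \infty.$$
(ii) Under the local alternatives $H_{1L}$:
$$\tilde{M}_b(x; \hat\theta_n) \Rightarrow B(F(x)) + \mathbb{E}[\sigma(X_i)^{-1}\delta(X_i)\mathbf{1}_{\{X_i \le x\}}].$$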
By Theorem 3.1 and the continuous mapping theorem, under the fixed alternative $H_1$, $KS_n \to \infty$ and $CM_n \to \infty$. Therefore, the proposed test is consistent. Moreover, under the local alternatives $H_{1L}$,
$$KS_n \Rightarrow \sup_{x \in \mathbb{R}^k} \Big|B(F(x)) + \mathbb{E}[\sigma(X_i)^{-1}\delta(X_i)\mathbf{1}_{\{X_i \le x\}}]\Big|,$$
and
$$CM_n \Rightarrow \int_{-\infty}^{\infty} \Big(B(F(x)) + \mathbb{E}[\sigma(X_i)^{-1}\delta(X_i)\mathbf{1}_{\{X_i \le x\}}]\Big)^2 F(dx).$$
This shows that the proposed test has nontrivial power against local alternatives $H_{1L}$ at rate $b^{-1/2}$. Now consider local alternatives at rate $n^{-1/2}$:
$$H_{2L}: \mathbb{E}[m(Y, X, \theta_o) \mid X] = \frac{\delta(X)}{\sqrt{n}}.$$
Under the local alternatives $H_{2L}$, the limiting distribution of $\tilde{M}_b(x; \hat\theta_n)$ is the same as that under the null hypothesis (see Theorem 2.2). Thus, the proposed test has no power against local alternatives at rate $n^{-1/2}$.
Theorem 3.2. Assume assumptions [A1]-[A4] hold and $b \to \infty$, $n \to \infty$ and $b/n \to 0$. Under the local alternatives $H_{2L}$:
$$\tilde{M}_b(x; \hat\theta_n) \Rightarrow B(F(x)).$$
4 Monte Carlo Simulations
The finite sample performance of the test statistic $KS_n$ is examined. The following DGPs are considered, where (A) and (D) satisfy the null hypothesis:
(A) $y_i = x_{i1} + e_i$,
(B) $y_i = x_{i1} + 5 + e_i$,
(C) $y_i = x_{i1} + \exp(z_i) + e_i$,
(D) $y_i = x_{i2} + x_{i3} + e_i$,
(E) $y_i = x_{i2} + x_{i3} + 5 + e_i$,
(F) $y_i = x_{i2} + x_{i3} + \exp(z_i) + e_i$.
Here $x_{i1}$, $x_{i2}$, $x_{i3}$ and $z_i$ are i.i.d. $N(0, 1)$ and $e_i$ is i.i.d. $N(0, \sigma_0^2)$ with $\sigma_0^2 = 1, 2, 3, 4$. The test statistic $KS_n$ for one regressor is:
$$KS_1 = \max_j \left| \frac{1}{\sqrt{b}}\, \hat\sigma_1^{-1} \sum_{i=1}^{b} (y_i - x_{i1}'\hat\beta_1)\mathbf{1}_{\{x_{i1} \le x_j\}} \right|,$$
for DGPs (A), (B) and (C), and the test statistic for two regressors is:
$$KS_2 = \max_j \left| \frac{1}{\sqrt{b}}\, \hat\sigma_2^{-1} \sum_{i=1}^{b} (y_i - x_{i2}'\hat\beta_2 - x_{i3}'\hat\beta_3)\mathbf{1}_{\{x_{i2} \le x_{j1}\}}\mathbf{1}_{\{x_{i3} \le x_{j2}\}} \right|,$$
for DGPs (D), (E) and (F), where $\hat\beta_1$, $\hat\beta_2$, $\hat\beta_3$ are least squares estimates, $\hat\sigma_1^2 = b^{-1} \sum_{i=1}^{b} (y_i - x_{i1}'\hat\beta_1)^2$ and $\hat\sigma_2^2 = b^{-1} \sum_{i=1}^{b} (y_i - x_{i2}'\hat\beta_2 - x_{i3}'\hat\beta_3)^2$. In each simulation experiment, the number of replications is 2000 and the significance level is 0.05. Different values of $b$ are employed, chosen by the formula $b = n^p$ with $p = 0.5, 0.55, \cdots, 0.95$.

Table 1, given $\sigma_0^2 = 1$, reports the rejection frequencies of the tests for different values of $n$ and $p$. For DGPs (A) and (D), the rejection frequencies are the finite sample sizes of the test. In the column for DGP (A), all values are close to the significance level 0.05 except those for $p = 0.5$. However, in the column for DGP (D), the proposed test is under-sized for large $p$. Thus, when the number of regressors increases, or when $b$ increases, the finite sample size of the test is lower. For DGPs (B), (C), (E) and (F), the rejection frequencies are the finite sample powers of the proposed test. In the columns for DGPs (B) and (E), which are fixed alternatives, the finite sample powers are 1. Thus, the test has good power for different values of $n$ and $b$ (or $p$). In addition, the columns for DGPs (C) and (F) show that the test performs well when the deviation from the null is a random variable. For DGP (C), the power is good for large values of $p$, and it approaches 1 as $n$ increases. For DGP (F), the finite sample powers are lower for $n = 100$, and they increase as $n$ and $b$ (or $p$) increase.

In summary, the proposed test has correct finite sample size for one regressor and is slightly under-sized for two regressors in the regression model. Under fixed alternatives, the power is very good, and the finite sample power increases with both $n$ and $b$. Table 2 reports the rejection frequencies of the test for the six DGPs with different $\sigma_0^2$ and $p$; the sample size is 500. The finite sample performance in Table 2 is similar to that in Table 1. Moreover, when the variance of the error term increases, the finite sample power of the test decreases.
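A scaled-down version of this experiment for DGPs (A) and (B) can be sketched in Python as follows. This is not the paper's simulation code. It assumes a regression without intercept, takes the first $b$ observations as the subsample, rounds $b = n^p$ down, uses 500 replications instead of 2000, and uses 2.2414 as an approximate 5% critical value of $\sup_{t \in [0,1]} |B(t)|$ obtained from the known distribution of the supremum of Brownian motion rather than from the paper.

```python
import numpy as np

# Assumption: approximate 5% critical value of sup |B(t)| on [0, 1]; not taken from the paper.
CRIT_5PCT = 2.2414

def ks1(y, x, b):
    """KS_1 with a no-intercept least squares fit and homoskedastic scaling."""
    beta_hat = (x @ y) / (x @ x)               # estimated on the full sample
    resid = y - beta_hat * x
    m_b, x_b = resid[:b], x[:b]                # assumption: first b observations
    sigma_hat = np.sqrt(np.mean(m_b ** 2))
    order = np.argsort(x_b)
    cum_marks = np.concatenate(([0.0], np.cumsum(m_b[order])))
    n_below = np.searchsorted(x_b[order], x, side="right")
    return np.max(np.abs(cum_marks[n_below])) / (np.sqrt(b) * sigma_hat)

def rejection_frequency(shift, n=200, p=0.8, reps=500, seed=0):
    """Share of replications with KS_1 above the critical value for y = x + shift + e."""
    rng = np.random.default_rng(seed)
    b = int(n ** p)
    rejections = 0
    for _ in range(reps):
        x = rng.normal(size=n)
        y = x + shift + rng.normal(size=n)
        rejections += int(ks1(y, x, b) > CRIT_5PCT)
    return rejections / reps

size_a = rejection_frequency(shift=0.0)    # DGP (A): empirical size
power_b = rejection_frequency(shift=5.0)   # DGP (B): empirical power
```

With these settings the empirical size should land in the neighborhood of the nominal level and the power against the constant shift should be close to one, though exact figures will differ from Table 1 because of the assumptions listed above.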
Table 1: Rejection frequencies of the conditional moment tests

| n | p | KS1 (A) | KS1 (B) | KS1 (C) | KS2 (D) | KS2 (E) | KS2 (F) |
|---|---|---|---|---|---|---|---|
| 100 | 0.50 | 0.076 | 1.000 | 0.742 | 0.057 | 1.000 | 0.596 |
| 100 | 0.55 | 0.067 | 1.000 | 0.817 | 0.043 | 1.000 | 0.661 |
| 100 | 0.60 | 0.057 | 1.000 | 0.877 | 0.037 | 1.000 | 0.775 |
| 100 | 0.65 | 0.048 | 1.000 | 0.947 | 0.038 | 1.000 | 0.847 |
| 100 | 0.70 | 0.044 | 1.000 | 0.977 | 0.036 | 1.000 | 0.919 |
| 100 | 0.75 | 0.047 | 1.000 | 0.992 | 0.031 | 1.000 | 0.968 |
| 100 | 0.80 | 0.049 | 1.000 | 0.999 | 0.024 | 1.000 | 0.985 |
| 100 | 0.85 | 0.042 | 1.000 | 1.000 | 0.021 | 1.000 | 0.995 |
| 100 | 0.90 | 0.046 | 1.000 | 0.999 | 0.025 | 1.000 | 0.999 |
| 100 | 0.95 | 0.041 | 1.000 | 1.000 | 0.018 | 1.000 | 0.999 |
| 200 | 0.50 | 0.063 | 1.000 | 0.863 | 0.046 | 1.000 | 0.783 |
| 200 | 0.55 | 0.045 | 1.000 | 0.937 | 0.038 | 1.000 | 0.867 |
| 200 | 0.60 | 0.051 | 1.000 | 0.978 | 0.040 | 1.000 | 0.950 |
| 200 | 0.65 | 0.053 | 1.000 | 0.992 | 0.031 | 1.000 | 0.981 |
| 200 | 0.70 | 0.039 | 1.000 | 0.997 | 0.035 | 1.000 | 0.990 |
| 200 | 0.75 | 0.054 | 1.000 | 1.000 | 0.023 | 1.000 | 0.998 |
| 200 | 0.80 | 0.048 | 1.000 | 1.000 | 0.028 | 1.000 | 1.000 |
| 200 | 0.85 | 0.043 | 1.000 | 1.000 | 0.023 | 1.000 | 1.000 |
| 200 | 0.90 | 0.042 | 1.000 | 1.000 | 0.025 | 1.000 | 1.000 |
| 200 | 0.95 | 0.049 | 1.000 | 1.000 | 0.022 | 1.000 | 1.000 |
| 500 | 0.50 | 0.068 | 1.000 | 0.964 | 0.057 | 1.000 | 0.950 |
| 500 | 0.55 | 0.042 | 1.000 | 0.989 | 0.030 | 1.000 | 0.982 |
| 500 | 0.60 | 0.047 | 1.000 | 0.998 | 0.035 | 1.000 | 0.994 |
| 500 | 0.65 | 0.044 | 1.000 | 0.999 | 0.036 | 1.000 | 0.998 |
| 500 | 0.70 | 0.045 | 1.000 | 1.000 | 0.040 | 1.000 | 0.999 |
| 500 | 0.75 | 0.040 | 1.000 | 1.000 | 0.030 | 1.000 | 1.000 |
| 500 | 0.80 | 0.055 | 1.000 | 1.000 | 0.028 | 1.000 | 1.000 |
| 500 | 0.85 | 0.040 | 1.000 | 1.000 | 0.029 | 1.000 | 1.000 |
| 500 | 0.90 | 0.036 | 1.000 | 1.000 | 0.019 | 1.000 | 1.000 |
| 500 | 0.95 | 0.035 | 1.000 | 1.000 | 0.028 | 1.000 | 1.000 |

Note: The significance level is 0.05 and $b = n^p$. The values in the 3rd and 6th columns are the finite sample sizes, and the values in the 4th, 5th, 7th and 8th columns are the finite sample powers of the proposed test.
Table 2: Rejection frequencies of the conditional moment tests

| $\sigma_0^2$ | p | KS1 (A) | KS1 (B) | KS1 (C) | KS2 (D) | KS2 (E) | KS2 (F) |
|---|---|---|---|---|---|---|---|
| 2 | 0.50 | 0.058 | 1.000 | 0.906 | 0.037 | 1.000 | 0.862 |
| 2 | 0.55 | 0.053 | 1.000 | 0.972 | 0.041 | 1.000 | 0.955 |
| 2 | 0.60 | 0.043 | 1.000 | 0.995 | 0.036 | 1.000 | 0.984 |
| 2 | 0.65 | 0.044 | 1.000 | 0.999 | 0.037 | 1.000 | 0.998 |
| 2 | 0.70 | 0.049 | 1.000 | 1.000 | 0.039 | 1.000 | 0.999 |
| 2 | 0.75 | 0.044 | 1.000 | 1.000 | 0.029 | 1.000 | 1.000 |
| 2 | 0.80 | 0.052 | 1.000 | 1.000 | 0.030 | 1.000 | 1.000 |
| 2 | 0.85 | 0.040 | 1.000 | 1.000 | 0.033 | 1.000 | 1.000 |
| 2 | 0.90 | 0.040 | 1.000 | 1.000 | 0.022 | 1.000 | 1.000 |
| 2 | 0.95 | 0.036 | 1.000 | 1.000 | 0.023 | 1.000 | 1.000 |
| 3 | 0.50 | 0.060 | 1.000 | 0.843 | 0.050 | 1.000 | 0.797 |
| 3 | 0.55 | 0.050 | 1.000 | 0.942 | 0.039 | 1.000 | 0.915 |
| 3 | 0.60 | 0.054 | 1.000 | 0.987 | 0.037 | 1.000 | 0.972 |
| 3 | 0.65 | 0.051 | 1.000 | 0.999 | 0.039 | 1.000 | 0.994 |
| 3 | 0.70 | 0.048 | 1.000 | 1.000 | 0.034 | 1.000 | 0.999 |
| 3 | 0.75 | 0.037 | 1.000 | 1.000 | 0.033 | 1.000 | 1.000 |
| 3 | 0.80 | 0.046 | 1.000 | 1.000 | 0.038 | 1.000 | 1.000 |
| 3 | 0.85 | 0.052 | 1.000 | 1.000 | 0.030 | 1.000 | 1.000 |
| 3 | 0.90 | 0.040 | 1.000 | 1.000 | 0.023 | 1.000 | 1.000 |
| 3 | 0.95 | 0.041 | 1.000 | 1.000 | 0.027 | 1.000 | 1.000 |
| 4 | 0.50 | 0.048 | 1.000 | 0.776 | 0.042 | 1.000 | 0.703 |
| 4 | 0.55 | 0.053 | 1.000 | 0.897 | 0.042 | 1.000 | 0.845 |
| 4 | 0.60 | 0.044 | 1.000 | 0.969 | 0.037 | 1.000 | 0.949 |
| 4 | 0.65 | 0.042 | 1.000 | 0.992 | 0.031 | 1.000 | 0.988 |
| 4 | 0.70 | 0.048 | 1.000 | 0.999 | 0.033 | 1.000 | 0.999 |
| 4 | 0.75 | 0.046 | 1.000 | 1.000 | 0.035 | 1.000 | 1.000 |
| 4 | 0.80 | 0.053 | 1.000 | 1.000 | 0.038 | 1.000 | 1.000 |
| 4 | 0.85 | 0.047 | 1.000 | 1.000 | 0.027 | 1.000 | 1.000 |
| 4 | 0.90 | 0.043 | 1.000 | 1.000 | 0.024 | 1.000 | 1.000 |
| 4 | 0.95 | 0.042 | 1.000 | 1.000 | 0.029 | 1.000 | 1.000 |

Note: The significance level is 0.05 and $b = n^p$. The values in the 3rd and 6th columns are the finite sample sizes, and the values in the 4th, 5th, 7th and 8th columns are the finite sample powers of the proposed test.
Table 3: Empirical powers of tests

| DGP | n = 100, KS2 | n = 100, ES | n = 200, KS2 | n = 200, ES | n = 500, KS2 | n = 500, ES |
|---|---|---|---|---|---|---|
| (G) | 0.970 | 0.949 | 0.973 | 0.953 | 0.965 | 0.947 |
| (H) | 0.980 | 0.944 | 0.979 | 0.944 | 0.947 | 0.949 |

Note: The significance level is 0.05 and $b = n^{0.8}$. The values are the finite sample powers of the proposed test (KS2) and Escanciano's (2006) test (ES).
Next, the finite sample powers of the proposed test and Escanciano's (2006) test are compared. Escanciano's test requires the wild bootstrap, and the number of wild bootstrap replications in the simulation is 500. In addition, to simplify the computation, $A^{(0)}_{ijr} = \pi$ is employed. The two DGPs with two regressors are:
(G) $y_i = (x_{i1} + x_{i2}) + (x_{i1} + x_{i2})\exp(-0.1(x_{i1} + x_{i2})^2) + e_i$,
(H) $y_i = (x_{i1} + x_{i2}) + x_{i1}x_{i2} + e_i$,
with $x_{i1}$, $x_{i2}$, $e_i$ i.i.d. $N(0, 1)$. The finite sample powers of the proposed test and Escanciano's test are reported in Table 3 for sample sizes $n = 100$, 200, and 500. The finite sample powers of the proposed test are higher than those of Escanciano's test in all scenarios, except when $n = 500$ for DGP (H). This result shows that the proposed test has good finite sample power.
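For readers who want to reproduce the two-regressor setting, the Python sketch below generates one sample from each of DGPs (G) and (H) and computes $KS_2$ with $b = n^{0.8}$. It is a sketch under assumptions: the fitted model has no intercept, the subsample is the first $b$ observations, and $b$ is rounded down. It does not implement Escanciano's test and does not supply critical values for $k = 2$.

```python
import numpy as np

def ks2(y, X, b):
    """KS_2 for a linear model in two regressors with homoskedastic scaling."""
    beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)    # full-sample least squares
    resid = y - X @ beta_hat
    m_b, X_b = resid[:b], X[:b]                         # assumption: first b observations
    sigma_hat = np.sqrt(np.mean(m_b ** 2))
    # below[j, i] is True when both coordinates of X_i (i < b) are <= those of X_j.
    below = np.all(X_b[None, :, :] <= X[:, None, :], axis=2)
    process = below @ m_b / (np.sqrt(b) * sigma_hat)
    return np.max(np.abs(process))

rng = np.random.default_rng(0)
n = 200
X = rng.normal(size=(n, 2))
e = rng.normal(size=n)
s = X[:, 0] + X[:, 1]
y_g = s + s * np.exp(-0.1 * s ** 2) + e      # DGP (G)
y_h = s + X[:, 0] * X[:, 1] + e              # DGP (H)
b = int(n ** 0.8)
ks2_g = ks2(y_g, X, b)
ks2_h = ks2(y_h, X, b)
```

Each statistic would be compared with a critical value of $\sup_{\tau \in [0,1]^2} |B(\tau)|$ taken from the literature cited in Section 2.2.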
5 Conclusions
This paper proposes a consistent conditional moment test based on infinitely many unconditional moment restrictions. The test statistic is built on a subsampling marked empirical process, and the resulting test is asymptotically pivotal. The proposed test is consistent against general alternatives and has power against local alternatives at rate $b^{-1/2}$. In addition, the test performs well in finite sample simulations, and its power is good for most values of $b$. However, the proposed test still requires a choice of $b$, and future work might consider an optimal choice of $b$.
Appendix
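Proof of Theorem 2.1. Given assumption [A3], the subsampling marked empirical process $M_b$ admits the Taylor expansion
$$\frac{1}{\sqrt{b}} \sum_{i=1}^{b} m(Y_i, X_i, \hat\theta_n)\mathbf{1}_{\{X_i \le x\}} = \frac{1}{\sqrt{b}} \sum_{i=1}^{b} m(Y_i, X_i, \theta_o)\mathbf{1}_{\{X_i \le x\}} + \frac{1}{\sqrt{b}} \sum_{i=1}^{b} \nabla_\theta m(Y_i, X_i, \theta_o)(\hat\theta_n - \theta_o)\mathbf{1}_{\{X_i \le x\}} + o_p(1).$$
Because $b/n \to 0$ and given assumption [A4],
$$\sqrt{b}(\hat\theta_n - \theta_o) = \sqrt{\frac{b}{n}}\, \sqrt{n}(\hat\theta_n - \theta_o) \xrightarrow{p} 0.$$
In addition, given assumptions [A1] and [A4] and Hölder's inequality, the law of large numbers for i.i.d. sequences gives
$$\frac{1}{b} \sum_{i=1}^{b} \nabla_\theta m(Y_i, X_i, \theta_o)\mathbf{1}_{\{X_i \le x\}} \xrightarrow{p} \mathbb{E}\big[\nabla_\theta m(Y_i, X_i, \theta_o)\mathbf{1}_{\{X_i \le x\}}\big].$$
Then we obtain
$$\frac{1}{\sqrt{b}} \sum_{i=1}^{b} \nabla_\theta m(Y_i, X_i, \theta_o)(\hat\theta_n - \theta_o)\mathbf{1}_{\{X_i \le x\}} = \left[\frac{1}{b} \sum_{i=1}^{b} \nabla_\theta m(Y_i, X_i, \theta_o)\mathbf{1}_{\{X_i \le x\}}\right] \sqrt{b}(\hat\theta_n - \theta_o) \xrightarrow{p} 0.$$
Therefore,
$$\frac{1}{\sqrt{b}} \sum_{i=1}^{b} m(Y_i, X_i, \hat\theta_n)\mathbf{1}_{\{X_i \le x\}} = \frac{1}{\sqrt{b}} \sum_{i=1}^{b} m(Y_i, X_i, \theta_o)\mathbf{1}_{\{X_i \le x\}} + o_p(1),$$
so $M_b(x; \hat\theta_n)$ and $M_b(x; \theta_o)$ are asymptotically equivalent. Thus, estimating the parameter $\theta$ does not affect the limiting distribution of the statistic, and the estimation effect problem does not appear.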
The process $M_b$ belongs to the Skorohod space $D(\mathbb{R}^k)$, and the weak convergence of $M_b(x; \theta_o)$ in $D(\mathbb{R}^k)$ to a continuous limit is determined by the tightness of $M_b$ and the finite dimensional convergence of $M_b(x; \theta_o)$. In the following, the results of Bickel and Wichura (1971), Koul and Stute (1999) and Domínguez and Lobato (2006) are employed to show the tightness of $M_b$ and then the weak convergence of $M_b(x; \theta_o)$. Define $I_1 = (s^1, t^1] = \times_{j=1}^{k} (s^1_j, t^1_j]$ and $I_2 = (s^2, t^2] = \times_{j=1}^{k} (s^2_j, t^2_j]$ as two subsets of $\mathbb{R}^k$. Then $I_1$ and $I_2$ are neighbor subsets if and only if, for some $j^* \in \{1, 2, \cdots, k\}$, $(s^1_{j^*}, t^1_{j^*}] \neq (s^2_{j^*}, t^2_{j^*}]$ and $\times_{j \neq j^*} (s^1_j, t^1_j] = \times_{j \neq j^*} (s^2_j, t^2_j]$.
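Thus, the process $M_b$ indexed by a parameter in $\mathbb{R}^k$ has an associated process indexed by the intervals as follows, for $h = 1, 2$:
$$M_b(I_h; \theta) := \frac{1}{\sqrt{b}} \sum_{i=1}^{b} m(Y_i, X_i; \theta)\mathbf{1}_{\{X_i \in I_h\}} = \sum_{e_1=0}^{1} \cdots \sum_{e_k=0}^{1} (-1)^{k - \sum_{j=1}^{k} e_j} M_b\big(s^h_1 + e_1(t^h_1 - s^h_1), \cdots, s^h_k + e_k(t^h_k - s^h_k); \theta\big),$$
which is the increment of $M_b$ around $I_h$. Denote $m(Y_i, X_i; \theta) = m_i$. By Bickel and Wichura (1971, Theorem 3 and Example II), if
$$\mathbb{E}\big[M_b(I_1; \theta)^2 M_b(I_2; \theta)^2\big] = \frac{1}{b^2}\, \mathbb{E}\left[\left(\sum_{i=1}^{b} m_i \mathbf{1}_{\{X_i \in I_1\}}\right)^2 \left(\sum_{i=1}^{b} m_i \mathbf{1}_{\{X_i \in I_2\}}\right)^2\right]$$
is bounded, then for any $\lambda > 0$ and $\gamma > 1$,
$$P(M_b \ge \lambda) \le \lambda^{-4} \mu(I_1 \cup I_2)^{\gamma},$$
with some measure $\mu$. Thus, the process $M_b$ is tight.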
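The inclusion-exclusion representation of the increment $M_b(I_h; \theta)$ can be checked numerically. The Python sketch below compares the direct sum over observations falling in a rectangle $(s, t]$ with the alternating sum of the process evaluated at the $2^k$ corners; the data, rectangle, and function names are hypothetical.

```python
import itertools
import numpy as np

def process_at(marks, X, x):
    """M_b(x) = b^{-1/2} * sum_i marks[i] * 1{X_i <= x} (componentwise inequality)."""
    return np.sum(marks * np.all(X <= x, axis=1)) / np.sqrt(len(marks))

def increment_direct(marks, X, s, t):
    """Increment over the rectangle (s, t], summing marks of observations inside it."""
    inside = np.all((X > s) & (X <= t), axis=1)
    return np.sum(marks * inside) / np.sqrt(len(marks))

def increment_inclusion_exclusion(marks, X, s, t):
    """Same increment as an alternating sum of the process over the 2^k corners."""
    k = len(s)
    total = 0.0
    for e in itertools.product((0, 1), repeat=k):
        corner = s + np.array(e) * (t - s)
        sign = -1.0 if (k - sum(e)) % 2 else 1.0
        total += sign * process_at(marks, X, corner)
    return total

# Hypothetical data: b = 50 observations of a two-dimensional regressor.
rng = np.random.default_rng(1)
X = rng.normal(size=(50, 2))
marks = rng.normal(size=50)
s, t = np.array([-0.5, -0.5]), np.array([0.7, 1.0])
direct = increment_direct(marks, X, s, t)
via_corners = increment_inclusion_exclusion(marks, X, s, t)
```

The two numbers agree up to floating point error because $\mathbf{1}_{\{X_i \in (s, t]\}}$ factors into a product of differences of one-dimensional indicators.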
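Under $H_0$ and given assumption [A1], when a subindex appears only once in the summation, the corresponding term is zero by the law of iterated expectations and the i.i.d. assumption. Moreover, since $I_1$ and $I_2$ are disjoint sets, when a subindex appears more than twice, the corresponding term is zero. Therefore,
$$\mathbb{E}\big[M_b(I_1; \theta)^2 M_b(I_2; \theta)^2\big] = \frac{1}{b^2}\, \mathbb{E} \sum_{i=1}^{b} m_i^2 \mathbf{1}_{\{X_i \in I_1\}} \left(\sum_{j=1}^{i-1} m_j \mathbf{1}_{\{X_j \in I_2\}}\right)^2 + \frac{1}{b^2}\, \mathbb{E} \sum_{i=1}^{b} m_i^2 \mathbf{1}_{\{X_i \in I_2\}} \left(\sum_{j=1}^{i-1} m_j \mathbf{1}_{\{X_j \in I_1\}}\right)^2.$$
The first and second terms are similar and differ only in the indexing set $I_h$; we therefore focus on the first term. Under $H_0$ and given assumption [A2](i),
$$\begin{aligned}
\frac{1}{b^2} \sum_{i=1}^{b} \mathbb{E}\left[m_i^2 \mathbf{1}_{\{X_i \in I_1\}} \left(\sum_{j=1}^{i-1} m_j \mathbf{1}_{\{X_j \in I_2\}}\right)^2\right]
&= \frac{1}{b^2} \sum_{i=1}^{b} \mathbb{E}\left[\sigma^2(X_i)\mathbf{1}_{\{X_i \in I_1\}} \left(\sum_{j=1}^{i-1} m_j \mathbf{1}_{\{X_j \in I_2\}}\right)^2\right] \\
&= \frac{1}{b^2} \sum_{i=1}^{b} \mathbb{E}\left[\int_{I_1} \sigma^2(u) f(u)\, du \left(\sum_{j=1}^{i-1} m_j \mathbf{1}_{\{X_j \in I_2\}}\right)^2\right].
\end{aligned}$$
By Fubini's theorem, the above expression equals
$$\frac{1}{b^2} \sum_{i=1}^{b} \int_{I_1} \mathbb{E}\left[\sigma^2(u) f(u) \left(\sum_{j=1}^{i-1} m_j \mathbf{1}_{\{X_j \in I_2\}}\right)^2\right] du.$$
By the Cauchy-Schwarz inequality,
$$\frac{1}{b^2} \sum_{i=1}^{b} \int_{I_1} \mathbb{E}\left[\sigma^2(u) f(u) \left(\sum_{j=1}^{i-1} m_j \mathbf{1}_{\{X_j \in I_2\}}\right)^2\right] du \le \frac{1}{b^2} \sum_{i=1}^{b} \int_{I_1} \Big\{\mathbb{E}\big[\sigma^2(u) f(u)\big]^2\Big\}^{1/2} \left\{\mathbb{E}\left(\sum_{j=1}^{i-1} m_j \mathbf{1}_{\{X_j \in I_2\}}\right)^4\right\}^{1/2} du.$$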
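Burkholder's inequality and the moment inequality yield, for some constant $C$,
$$\mathbb{E}\left(\sum_{j=1}^{i-1} m_j \mathbf{1}_{\{X_j \in I_2\}}\right)^4 \le C\, \mathbb{E}\left(\sum_{j=1}^{i-1} m_j^2 \mathbf{1}^2_{\{X_j \in I_2\}}\right)^2 \le C (i-1)^2\, \mathbb{E}\big(m_1^4 \mathbf{1}_{\{X_1 \in I_2\}}\big).$$
Thus,
$$\begin{aligned}
\frac{1}{b^2} \sum_{i=1}^{b} \mathbb{E}\left[m_i^2 \mathbf{1}_{\{X_i \in I_1\}} \left(\sum_{j=1}^{i-1} m_j \mathbf{1}_{\{X_j \in I_2\}}\right)^2\right]
&\le \frac{1}{b^2} \sum_{i=1}^{b} \int_{I_1} \Big\{\mathbb{E}\big[\sigma^2(u) f(u)\big]^2\Big\}^{1/2} \Big\{C (i-1)^2\, \mathbb{E}\big(m_1^4 \mathbf{1}_{\{X_1 \in I_2\}}\big)\Big\}^{1/2} du \\
&= \frac{1}{b^2} \Big\{C\, \mathbb{E}\big(m_1^4 \mathbf{1}_{\{X_1 \in I_2\}}\big)\Big\}^{1/2} \sum_{i=1}^{b} (i-1) \int_{I_1} \Big\{\mathbb{E}\big[\sigma^2(u) f(u)\big]^2\Big\}^{1/2} du.
\end{aligned}$$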
Here $\mathbb{E}(m_1^4 \mathbf{1}_{\{X_1 \in I_2\}}) \le \mathbb{E}(m_1^4)$, which is bounded by assumption [A2] (ii). In addition, from Koul and Stute (1999), $\int_{I_1} \{\mathbb{E}[\sigma^2(u) f(u)]^2\}^{1/2} du$ is bounded by assumptions [A1] and [A2](iii). Therefore, under $H_0$ and given assumptions [A1]–[A2], the process $M_b$ is tight. Note that our assumptions [A2] (ii) and (iii) are similar to assumption (A)(a) in Koul and Stute (1999). Given assumptions [A1] and [A2] (i), a central limit theorem for i.i.d. sequences implies that, for any $x \in \mathbb{R}^k$, $M_b(x; \theta_o)$ converges in distribution to a normal random variable with mean zero and variance $V(x)$. For $x_1, x_2 \in \mathbb{R}^k$,
$$\operatorname{Cov}\big(M_b(x_1; \theta_o), M_b(x_2; \theta_o)\big) = \frac{1}{b} \sum_{i=1}^{b} \mathbb{E}\big[m(Y_i, X_i; \theta_o)^2 \mathbf{1}_{\{X_i \le x_1\}}\mathbf{1}_{\{X_i \le x_2\}}\big] \xrightarrow{p} \int_{-\infty}^{x_1 \wedge x_2} \sigma^2(u) F(du) = V(x_1 \wedge x_2),$$
where the first equality holds by the i.i.d. property. Since $V(x)$ is nondecreasing and nonnegative, $M_b$ is asymptotically distributed as $B(V(x))$, where $B(\cdot)$ is a multi-parameter Brownian motion process. $\Box$
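Proof of Theorem 2.2. Here it is shown that replacing $\sigma(X_i)^2$ by a consistent estimator $\hat\sigma(X_i)^2$ does not affect the asymptotics of the scale-invariant subsampling marked empirical process. The process $\tilde{M}_b(x; \hat\theta_n)$ may be rewritten as
$$\frac{1}{\sqrt{b}} \sum_{i=1}^{b} \hat\sigma(X_i)^{-1} m(Y_i, X_i, \hat\theta_n)\mathbf{1}_{\{X_i \le x\}} = \frac{1}{\sqrt{b}} \sum_{i=1}^{b} \big[\hat\sigma(X_i)^{-1} - \sigma(X_i)^{-1}\big] m(Y_i, X_i, \hat\theta_n)\mathbf{1}_{\{X_i \le x\}} + \frac{1}{\sqrt{b}} \sum_{i=1}^{b} \sigma(X_i)^{-1} m(Y_i, X_i, \hat\theta_n)\mathbf{1}_{\{X_i \le x\}}.$$
The first term of the above equation satisfies
$$\left|\frac{1}{\sqrt{b}} \sum_{i=1}^{b} \big[\hat\sigma(X_i)^{-1} - \sigma(X_i)^{-1}\big] m(Y_i, X_i, \hat\theta_n)\mathbf{1}_{\{X_i \le x\}}\right| \le \sqrt{b}\, \sup_{X_i} \big|\hat\sigma(X_i)^{-1} - \sigma(X_i)^{-1}\big|\, \sup_{Y_i, X_i} \big|m(Y_i, X_i, \hat\theta_n)\big|.$$
Therefore, given $\hat\sigma(X_i) - \sigma(X_i) = o_p(b^{-1/2})$ and assumption [A2] (i),
$$\frac{1}{\sqrt{b}} \sum_{i=1}^{b} \hat\sigma(X_i)^{-1} m(Y_i, X_i, \hat\theta_n)\mathbf{1}_{\{X_i \le x\}} = \frac{1}{\sqrt{b}} \sum_{i=1}^{b} \sigma(X_i)^{-1} m(Y_i, X_i, \hat\theta_n)\mathbf{1}_{\{X_i \le x\}} + o_p(1).$$
Let $\tilde{M}^{\sigma}_b(x; \theta) := b^{-1/2} \sum_{i=1}^{b} \sigma(X_i)^{-1} m(Y_i, X_i, \theta)\mathbf{1}_{\{X_i \le x\}}$. Similar to the proof of Theorem 2.1, replacing $\theta_o$ by $\hat\theta_n$ in $\tilde{M}^{\sigma}_b(x; \hat\theta_n)$ does not affect its asymptotics. It therefore suffices to focus on the limiting behavior of $\tilde{M}^{\sigma}_b(x; \theta_o)$. The tightness of the process follows as in Theorem 2.1 since $\sigma(X_i)^2$ is continuous. Using the Lindeberg-Lévy central limit theorem for i.i.d. sequences, for $x_1, x_2 \in \mathbb{R}^k$,
$$\operatorname{Cov}\big(\tilde{M}^{\sigma}_b(x_1, \theta_o), \tilde{M}^{\sigma}_b(x_2, \theta_o)\big) = \frac{1}{b} \sum_{i=1}^{b} \mathbb{E}\big[\sigma(X_i)^{-2} m(Y_i, X_i; \theta_o)^2 \mathbf{1}_{\{X_i \le x_1\}}\mathbf{1}_{\{X_i \le x_2\}}\big] \xrightarrow{p} \int_{-\infty}^{x_1 \wedge x_2} F(du) = F(x_1 \wedge x_2).$$
Hence, $\tilde{M}_b(x; \hat\theta_n) \Rightarrow B(F(x))$, with $B$ a multi-parameter Brownian motion process. $\Box$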
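Proof of Theorem 3.1. By Theorem 2.2, $\tilde M_b(x;\hat\theta_n)$ and $\tilde M_b^{\sigma}(x;\theta_o)$ are asymptotically equivalent. It suffices to discuss the limit of $\tilde M_b^{\sigma}(x;\theta_o)$ under two different types of alternatives.
For part (i), $\tilde M_b^{\sigma}(x;\theta_o)$ may be rewritten as:
$$
\frac{1}{\sqrt{b}}\sum_{i=1}^{b}\sigma(X_i)^{-1}\, m(Y_i,X_i,\theta_o)\,\mathbf{1}\{X_i\le x\}
= \frac{1}{\sqrt{b}}\sum_{i=1}^{b}\sigma(X_i)^{-1}\left[m(Y_i,X_i,\theta_o)-\mu(X_i)\right]\mathbf{1}\{X_i\le x\}
+ \frac{1}{\sqrt{b}}\sum_{i=1}^{b}\sigma(X_i)^{-1}\mu(X_i)\,\mathbf{1}\{X_i\le x\}.
$$
Under $H_1$ and given assumptions [A1]–[A4], by the previous proofs, the first term of the above equation converges to $B(F(x))$. In addition, if $\mathbb{E}|\sigma(X_i)^{-1}\mu(X_i)\mathbf{1}\{X_i\le x\}| < \infty$, the i.i.d. assumption and the law of large numbers give $b^{-1}\sum_{i=1}^{b}\sigma(X_i)^{-1}\mu(X_i)\mathbf{1}\{X_i\le x\} \xrightarrow{p} \mathbb{E}[\sigma(X_i)^{-1}\mu(X_i)\mathbf{1}\{X_i\le x\}]$, so that the second term $b^{-1/2}\sum_{i=1}^{b}\sigma(X_i)^{-1}\mu(X_i)\mathbf{1}\{X_i\le x\}$ behaves like $\sqrt{b}\,\mathbb{E}[\sigma(X_i)^{-1}\mu(X_i)\mathbf{1}\{X_i\le x\}]$. As $b \to \infty$, this term diverges wherever the expectation is non-zero, so $\tilde M_b^{\sigma}(x;\theta_o) \to \infty$ and thus $\tilde M_b(x;\hat\theta_n) \to \infty$.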
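For part (ii), $\tilde M_b^{\sigma}(x;\theta_o)$ may be rewritten as:
$$
\frac{1}{\sqrt{b}}\sum_{i=1}^{b}\sigma(X_i)^{-1}\, m(Y_i,X_i,\theta_o)\,\mathbf{1}\{X_i\le x\}
= \frac{1}{\sqrt{b}}\sum_{i=1}^{b}\sigma(X_i)^{-1}\left[m(Y_i,X_i,\theta_o)-\frac{\delta(X_i)}{\sqrt{b}}\right]\mathbf{1}\{X_i\le x\}
+ \frac{1}{b}\sum_{i=1}^{b}\sigma(X_i)^{-1}\delta(X_i)\,\mathbf{1}\{X_i\le x\}.
$$
Under $H_{1L}$ and given assumptions [A1]–[A4], if $\mathbb{E}|\sigma(X_i)^{-1}\delta(X_i)\mathbf{1}\{X_i\le x\}| < \infty$, the probability limit of $b^{-1}\sum_{i=1}^{b}\sigma(X_i)^{-1}\delta(X_i)\mathbf{1}\{X_i\le x\}$ is $\mathbb{E}[\sigma(X_i)^{-1}\delta(X_i)\mathbf{1}\{X_i\le x\}]$. Therefore, under $H_{1L}$, $\tilde M_b(x;\hat\theta_n)$ converges to a multi-parameter Brownian motion process plus a non-zero constant term $\mathbb{E}[\sigma(X_i)^{-1}\delta(X_i)\mathbf{1}\{X_i\le x\}]$. $\Box$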
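The different behavior of the drift term in parts (i) and (ii) of the proof of Theorem 3.1 can be illustrated with a small simulation. This is a rough sketch under hypothetical choices that do not come from the paper: scalar $X \sim U(0,1)$, $\sigma(x) = 1 + x$, and $\mu(x) = \delta(x) = x$, evaluated at $x = 1$, where $\mathbb{E}[X/(1+X)] = 1 - \ln 2$. The helper `drift_terms` is an illustrative name only.

```python
import math
import random


def drift_terms(x_obs, shift, sigma, x_eval):
    """Return (b^{-1/2} * S, b^{-1} * S), where
    S = sum_i shift(X_i) / sigma(X_i) * 1{X_i <= x_eval}.
    The first entry is the drift under the fixed alternative (part (i)),
    the second is the drift under the local alternative (part (ii))."""
    b = len(x_obs)
    s = sum(shift(xi) / sigma(xi) for xi in x_obs if xi <= x_eval)
    return s / math.sqrt(b), s / b


def sigma(x):
    return 1.0 + x  # hypothetical conditional standard deviation


def shift(x):
    return x  # hypothetical mu(x) in part (i) or delta(x) in part (ii)


limit = 1.0 - math.log(2.0)  # E[X / (1 + X)] for X ~ U(0,1)

rng = random.Random(1)
results = {}
for b in (100, 1000, 10000):
    x_obs = [rng.random() for _ in range(b)]
    results[b] = drift_terms(x_obs, shift, sigma, x_eval=1.0)
    print(b, results[b], limit)
```

In this setup the first entry grows roughly like $\sqrt{b}$ times the limit, while the second entry settles near the limit itself, matching the divergence in part (i) and the finite shift in part (ii).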
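Proof of Theorem 3.2. The proof is similar to that of Theorem 3.1. Here $\tilde M_b^{\sigma}(x;\theta_o)$ may be rewritten as:
$$
\frac{1}{\sqrt{b}}\sum_{i=1}^{b}\sigma(X_i)^{-1}\, m(Y_i,X_i,\theta_o)\,\mathbf{1}\{X_i\le x\}
= \frac{1}{\sqrt{b}}\sum_{i=1}^{b}\sigma(X_i)^{-1}\left[m(Y_i,X_i,\theta_o)-\frac{\delta(X_i)}{\sqrt{n}}\right]\mathbf{1}\{X_i\le x\}
+ \frac{1}{\sqrt{b}\sqrt{n}}\sum_{i=1}^{b}\sigma(X_i)^{-1}\delta(X_i)\,\mathbf{1}\{X_i\le x\}.
$$
The second term on the right-hand side of the above equation satisfies
$$
\frac{1}{\sqrt{b}\sqrt{n}}\sum_{i=1}^{b}\sigma(X_i)^{-1}\delta(X_i)\,\mathbf{1}\{X_i\le x\}
= \frac{\sqrt{b}}{\sqrt{n}}\left[\frac{1}{b}\sum_{i=1}^{b}\sigma(X_i)^{-1}\delta(X_i)\,\mathbf{1}\{X_i\le x\}\right]
\xrightarrow{p} 0,
$$
with $b/n \to 0$ and $b^{-1}\sum_{i=1}^{b}\sigma(X_i)^{-1}\delta(X_i)\mathbf{1}\{X_i\le x\} \xrightarrow{p} \mathbb{E}[\sigma(X_i)^{-1}\delta(X_i)\mathbf{1}\{X_i\le x\}]$. Therefore, $\tilde M_b(x;\hat\theta_n)$ converges to a multi-parameter Brownian motion process under both $H_0$ and $H_{2L}$. $\Box$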
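The factor $\sqrt{b}/\sqrt{n}$ that drives the proof of Theorem 3.2 can be tabulated for a concrete subsample rule. The sketch below assumes, purely for illustration, that the subsample size is chosen as $b = \lfloor n^{\gamma} \rfloor$ with $\gamma = 0.5$; the paper only requires $b/n \to 0$, and the function name is hypothetical.

```python
import math


def local_drift_scale(n, gamma=0.5):
    """Return (b, sqrt(b / n)) for the assumed rule b = floor(n ** gamma), 0 < gamma < 1."""
    b = max(1, math.floor(n ** gamma))
    return b, math.sqrt(b / n)


# The scale multiplying the local drift shrinks as the full sample size n grows.
scales = [local_drift_scale(n) for n in (10**2, 10**4, 10**6)]
for b, scale in scales:
    print(b, scale)
```

Under this rule the factor is approximately $n^{-1/4}$, so the drift induced by alternatives of order $n^{-1/2}$ vanishes in the subsample process even though it would not vanish in a full-sample statistic.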
References
Andrews, D. and P. Guggenberger (2005). Hybrid and size-corrected subsample methods, unpublished manuscript, Cowles Foundation, Yale University.
Bickel, P. and M. Wichura (1971). Convergence criteria for multiparameter stochastic processes and some applications, Annals of Mathematical Statistics, 42, 1656–1670.
Bierens, H. (1982). Consistent model specification tests, Journal of Econometrics, 20, 105–134.
Bierens, H. (1984). Model specification testing of time series regressions, Journal of Econometrics, 26, 323–353.
Bierens, H. (1990). A consistent conditional moment test of functional form, Econometrica, 58, 1443–1458.
Bierens, H. and D. Ginther (2001). Integrated conditional moment testing of quantile regression models, Empirical Economics, 26, 307–324.
Bierens, H. and W. Ploberger (1997). Asymptotic theory of integrated conditional moment tests, Econometrica, 65, 1129–1151.
Chernozhukov, V. and I. Fernández-Val (2005). Subsampling inference on quantile regression processes, Sankhyā, 67, 253–276.
de Jong, R. (1996). On the Bierens test under data dependence, Journal of Econometrics, 72, 1–32.
Delgado, M. and W. González-Manteiga (2001). Significance testing in nonparametric regression based on the bootstrap, Annals of Statistics, 29, 1469–1507.
Domínguez, M. and I. Lobato (2006). Consistent estimation of models defined by conditional moment restrictions, Econometrica, 72, 1601–1615.
Durbin, J. (1973). Weak convergence of the sample distribution function when parameters are estimated, Annals of Statistics, 1, 279–290.
Escanciano, C. (2006). A consistent diagnostic test for regression models using projections, Econometric Theory, 22, 1030–1051.
Guggenberger, P. and M. Wolf (2004). Subsampling tests of parameter hypotheses and overidentifying restrictions with possible failure of identification, working paper, Department of Economics, UCLA.
Hong, H. and O. Scaillet (2006). A fast subsampling method for nonlinear dynamic models, Journal of Econometrics, 133, 557–578.
Horowitz, J. and V. Spokoiny (2001). An adaptive, rate-optimal test of a parametric mean-regression model against a nonparametric alternative, Econometrica, 69, 599–631.
Khmaladze, E. (1981). Martingale approach in the theory of goodness-of-fit tests, Theory of Probability and its Applications, 26, 240–257.
Khmaladze, E. and H. Koul (2004). Martingale transforms goodness-of-fit tests in regression models, Annals of Statistics, 32, 995–1034.
Koul, H. and W. Stute (1999). Nonparametric model checks for time series, Annals of Statistics, 27, 204–236.
Lavergne, P. and V. Patilea (2008). Breaking the curse of dimensionality in nonparametric testing, Journal of Econometrics, 143, 103–122.
Li, Q., C. Hsiao, and J. Zinn (2003). Consistent specification tests for semiparametric/nonparametric models based on series estimation methods, Journal of Econometrics, 112, 295–325.
Linton, O., E. Maasoumi, and Y.-J. Whang (2005). Consistent testing for stochastic dominance under general sampling schemes, Review of Economic Studies, 72, 735–765.
Newey, W. (1985). Maximum likelihood specification testing and conditional moment tests, Econometrica, 53, 1047–1070.
Politis, D. N. and J. P. Romano (1994). Large sample confidence regions based on subsamples under minimal assumptions, Annals of Statistics, 22, 2031–2050.
Politis, D. N., J. P. Romano, and M. Wolf (1999). Subsampling, New York: Springer.
Shorack, G. and J. Wellner (1986). Empirical Processes with Applications to Statistics, New York: John Wiley & Sons.
Song, K. (2009). Testing semiparametric conditional moment restrictions using conditional martingale transforms, Journal of Econometrics, forthcoming.
Stinchcombe, M. and H. White (1998). Consistent specification testing with nuisance parameters present only under the alternative, Econometric Theory, 14, 295–325.
Stute, W. (1997). Nonparametric model checks for regression, Annals of Statistics, 25, 613–641.
Stute, W., W. González Manteiga, and M. Presedo Quindimil (1998). Bootstrap approximations in model checks for regression, Journal of the American Statistical Association, 93, 141–149.
Stute, W., S. Thies, and L. Zhu (1998). Model checks for regression: an innovation process approach, Annals of Statistics, 26, 1916–1934.
Stute, W. and L. Zhu (2002). Model checks for generalized linear models, Scandinavian Journal of Statistics, 29, 535–545.
Tauchen, G. (1985). Diagnostic testing and evaluation of maximum likelihood models, Journal of Econometrics, 30, 415–443.
Tripathi, G. and Y. Kitamura (2003). Testing conditional moment restrictions, Annals of Statistics, 31, 2059–2095.
Whang, Y. (2000). Consistent bootstrap tests of parametric regression functions, Journal of Econometrics, 98, 27–46.
Whang, Y. (2001). Consistent specification testing for conditional moment restrictions, Economics Letters, 71, 299–306.
Whang, Y. (2004). Consistent specification testing for quantile regression models, working paper.
White, H. (1987). Specification testing in dynamic models. In T. Bewley (ed.), Advances in
Zheng, J. (2000). A consistent test of conditional parametric distributions, Econometric Theory, 16, 667–691.