Model Specification Tests for Quantile Regression


1 Introduction

In statistical applications, one often wishes to know the entire conditional distribution, which provides more information on the conditional behavior of the random variables of interest. The entire conditional distribution can be reformulated in terms of conditional quantile processes. Once the conditional quantile processes are specified, it is important to check their validity. Efforts have been devoted to constructing consistent tests for specifications of conditional quantile functions. The tests of Zheng (1998) and Horowitz and Spokoiny (2002) follow the nonparametric approach, which relies on nonparametric smoothing. The tests of Koul and Stute (1999), Bierens and Ginther (2001), and Whang (2005) follow the nuisance parameter approach, which considers infinitely many unconditional moment conditions. However, all of these tests concern the specification of a quantile regression model at a given quantile. To the best of our knowledge, there is no consistent specification test for an entire conditional distribution.

This paper extends the existing literature by proposing a consistent specification test of conditional quantile processes. The proposed test is a Kolmogorov-Smirnov goodness-of-fit test based on the sum of marked empirical processes, in which the empirical process of the regressors is marked by functions of the quantile regression residuals. The Kolmogorov-Smirnov test statistic is not asymptotically pivotal and suffers from the Durbin problem of goodness-of-fit testing. To solve this problem, I use a subsampling method (Politis and Romano, 1994; Politis, Romano and Wolf, 1999) to approximate the limiting distribution of the proposed test. Subsampling estimates the distribution of a test statistic by resampling the original data. Like the bootstrap, subsampling treats the data as if they were the population. Unlike the bootstrap, it is consistent under weaker requirements. In addition, inference in quantile regression models usually involves density estimation, which is a nuisance parameter problem. The proposed test avoids density estimation and is thus computationally simpler.

This paper is organized as follows. Section 2 introduces a consistent model check for conditional quantile processes. Section 3 discusses the subsampling approximation of the limiting distribution of the test statistic and considers the local power properties of the proposed test. Section 4 concludes. All proofs are given in the Appendix.

2 The Proposed Test

In this section, a model check for a conditional quantile process is considered. The null hypothesis of the specification test for a conditional quantile process being correctly specified is

$$H_0:\; Q_Y(\tau \mid X) = X'\beta_\tau, \quad \forall \tau \in (0, 1),$$

where $Q_Y(\cdot \mid X)$ is the conditional quantile function of the variable $Y$ given the variables $X \in \mathbb{R}^k$, and $\beta_\tau \in \mathbb{R}^k$ is a parameter vector. The conditional quantile process specification under the null hypothesis can be represented as conditional moment conditions across quantiles:

$$\mathbb{E}\big[\tau - 1\{Y - X'\beta_\tau \le 0\} \,\big|\, X\big] = 0, \quad \forall \tau \in (0, 1), \tag{1}$$

with $1\{\cdot\}$ the indicator function. The conditional moment conditions are equivalent to an infinite set of unconditional moment conditions with weight functions indexed by nuisance parameters; see Bierens (1982), Stute (1997), Stinchcombe and White (1998), Stute, Thies and Zhu (1998), and Koul and Stute (1999), among others. It follows that one can obtain an infinite set of unconditional moment conditions across quantiles:

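$$\mathbb{E}\big[(\tau - 1\{Y_i - X_i'\beta_\tau \le 0\})\,1\{X_i \le x\}\big] = 0, \quad \forall \tau \in (0, 1),\; x \in \mathbb{R}^k, \tag{2}$$

with $X_i \in \mathbb{R}^k$, $x = (x_1, \dots, x_k)$ and $1\{X_i \le x\} = 1\{X_{i1} \le x_1\}\,1\{X_{i2} \le x_2\} \cdots 1\{X_{ik} \le x_k\}$.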

Because equation (2) is necessary and sufficient for (1), one can test the null hypothesis by checking whether the sample counterpart of the left-hand side of (2) differs from zero. Let $(0, 1) \times \mathbb{R}^k$ be the parameter set. Consider the stochastic process, for $\tau \in (0, 1)$ and $x \in \mathbb{R}^k$,

$$R_n(\beta_\tau, \tau, x) := \frac{1}{\sqrt{n}} \sum_{i=1}^{n} (\tau - 1\{Y_i - X_i'\beta_\tau \le 0\})\,1\{X_i \le x\}.$$

The process $R_n(\beta_\tau, \tau, x)$ is the sum of marked empirical processes, with the empirical process of the regressors marked by functions of the quantile regression residuals. Its sample paths take their values in the Skorohod space $D$ on the parameter set $(0, 1) \times \mathbb{R}^k$. In what follows, I assume that the space $D$ is endowed with the Skorohod topology. The weak convergence of $R_n(\beta_\tau, \tau, x)$ in the space $D$ is implied by finite-dimensional convergence and the functional Donsker theorem. Since $\beta_\tau$ is unknown and is replaced by its quantile regression estimator $\hat\beta_\tau$, the proposed test statistic is based on the following marked empirical process:

$$R_n(\hat\beta_\tau, \tau, x) = \frac{1}{\sqrt{n}} \sum_{i=1}^{n} (\tau - 1\{Y_i - X_i'\hat\beta_\tau \le 0\})\,1\{X_i \le x\}.$$
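As a purely illustrative sketch, the marked empirical process above can be evaluated numerically for a given coefficient vector. The function names `residual_mark` and `marked_process` are hypothetical, the data are plain Python lists, and the coefficient vector is assumed to come from a separate quantile regression fit that is not shown here.

```python
import math


def residual_mark(y, x_row, beta, tau):
    """Mark tau - 1{y - x'beta <= 0} for a single observation."""
    fitted = sum(b * v for b, v in zip(beta, x_row))
    return tau - (1.0 if y - fitted <= 0 else 0.0)


def marked_process(Y, X, beta, tau, x_point):
    """Evaluate n^{-1/2} * sum_i mark_i * 1{X_i <= x_point}.

    The indicator 1{X_i <= x_point} is componentwise, as in the text.
    `beta` is assumed to be supplied by the caller (hypothetical input).
    """
    n = len(Y)
    total = 0.0
    for y, x_row in zip(Y, X):
        if all(v <= c for v, c in zip(x_row, x_point)):
            total += residual_mark(y, x_row, beta, tau)
    return total / math.sqrt(n)


# Toy data: every residual is exactly zero, so each mark equals tau - 1.
Y = [1.0, 2.0, 3.0, 4.0]
X = [[1.0], [2.0], [3.0], [4.0]]
value = marked_process(Y, X, beta=[1.0], tau=0.5, x_point=[2.0])
```

In the toy call, two observations satisfy the indicator and each carries the mark 0.5 - 1, so the scaled sum is negative; a correctly specified model would instead produce marks that roughly cancel.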

Further, consider the following assumptions.

Assumption [A1] The conditional distribution functions of the $Y_i$, $F_{y|X}(\cdot)$, are absolutely continuous, with continuous densities $f_{y|X}(\cdot)$ uniformly bounded away from $0$ and $\infty$.

Assumption [A2] $\{Y_i, X_i\}$ is a strictly stationary and $\alpha$-mixing sequence with mixing coefficients $\{\alpha(d)\}$ satisfying

$$\sum_{d=1}^{\infty} d^{\,Q-2}\,\alpha(d)^{s/(Q+s)} < \infty$$

for some even integer $Q \ge 2$ and some $s > 0$. In addition, the distribution of $X_i$, $F_{X_i}(\cdot)$, is absolutely continuous with respect to Lebesgue measure, and $\mathbb{E}\|X_i\|^2 < \infty$ for all $i \ge 1$.

Assumption [A3] $\hat\beta_\tau$ has the following linear representation:

$$\sqrt{n}\,(\hat\beta_\tau - \beta_\tau) = \left( \frac{1}{n} \sum_{i=1}^{n} \mathbb{E}\big[f_{y|X}(X_i'\beta_\tau)\,X_i X_i'\big] \right)^{-1} \frac{1}{\sqrt{n}} \sum_{i=1}^{n} (\tau - 1\{Y_i - X_i'\beta_\tau \le 0\})\,X_i + o(1),$$

and $\sqrt{n}\,(\hat\beta_\tau - \beta_\tau)$ converges in distribution to a limit $d_0(\tau)$.
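As a worked illustration of the summability condition in Assumption [A2] (an example added for concreteness, not one taken from the text), suppose the mixing coefficients decay geometrically, $\alpha(d) = c\,\rho^{d}$ with $c > 0$ and $0 < \rho < 1$. Then

$$\sum_{d=1}^{\infty} d^{\,Q-2}\,\alpha(d)^{s/(Q+s)} = c^{\,s/(Q+s)} \sum_{d=1}^{\infty} d^{\,Q-2}\, r^{d}, \qquad r := \rho^{\,s/(Q+s)} \in (0, 1),$$

and the series converges by the ratio test, because

$$\frac{(d+1)^{Q-2}\, r^{d+1}}{d^{\,Q-2}\, r^{d}} = r \left(1 + \frac{1}{d}\right)^{Q-2} \longrightarrow r < 1.$$

Under this assumed decay rate the condition therefore holds for every even integer $Q \ge 2$ and every $s > 0$.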

Under these assumptions, Theorem 1 below gives the weak convergence of the sum of marked empirical processes, $R_n(\hat\beta_\tau, \tau, x)$.

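Theorem 1. Given Assumptions [A1]–[A3], and under the null hypothesis, for all $\tau \in (0, 1)$ and $x \in \mathbb{R}^k$,

$$R_n(\hat\beta_\tau, \tau, x) \Rightarrow Q_\infty(\tau, x) + \Delta' d_0(\tau),$$

where $\Rightarrow$ denotes weak convergence, $Q_\infty(\tau, x)$ is the limiting distribution of $R_n(\beta_\tau, \tau, x)$, and, as in the proof in the Appendix, $\Delta = \frac{1}{n} \sum_{i=1}^{n} \mathbb{E}\big[f_{y|X}(X_i'\beta_\tau)\,X_i\,1\{X_i \le x\}\big]$.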

A Kolmogorov-Smirnov goodness-of-fit test based on the sum of marked empirical processes is proposed as follows. The Kolmogorov-Smirnov statistic based on $R_n(\hat\beta_\tau, \tau, x)$ is

$$K_n := \sup_{\tau \in T}\, \sup_{x \in \mathbb{R}^k} \big| R_n(\hat\beta_\tau, \tau, x) \big|.$$

Corollary 2. Under the conditions of Theorem 1,

$$K_n \Rightarrow K_0, \quad \text{with} \quad K_0 := \sup_{\tau \in T}\, \sup_{x \in \mathbb{R}^k} \big| Q_\infty(\tau, x) + \Delta' d_0(\tau) \big|.$$
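The Kolmogorov-Smirnov statistic can be approximated numerically along the following lines. This is a minimal sketch under stated assumptions: the hypothetical function `ks_statistic` takes a dictionary `betas` of already-fitted coefficient vectors on a finite grid of quantiles (the fitting step is not shown), and it takes the supremum over $x$ only at the observed rows of `X`. With a single regressor that inner supremum is exact, because the process is a step function in $x$ that jumps only at the observations; with several regressors it is only an approximation.

```python
import math


def ks_statistic(Y, X, betas):
    """Approximate sup over tau and x of |R_n(beta_tau, tau, x)|.

    betas: dict mapping each tau on a finite grid to a coefficient list
           (hypothetical input, assumed fitted elsewhere).
    The sup over x is evaluated at the observed rows of X only.
    """
    n = len(Y)
    best = 0.0
    for tau, beta in betas.items():
        # Marks tau - 1{Y_i - X_i'beta <= 0} for this quantile.
        marks = [
            tau - (1.0 if y - sum(b * v for b, v in zip(beta, row)) <= 0 else 0.0)
            for y, row in zip(Y, X)
        ]
        for x_point in X:
            total = sum(
                m
                for m, row in zip(marks, X)
                if all(v <= c for v, c in zip(row, x_point))
            )
            best = max(best, abs(total) / math.sqrt(n))
    return best


Y = [1.0, 2.0, 3.0, 4.0]
X = [[1.0], [2.0], [3.0], [4.0]]
k_n = ks_statistic(Y, X, {0.25: [1.0], 0.5: [1.0]})
```

The double loop over observations costs on the order of $n^2$ operations per quantile, which is acceptable for a sketch; sorting the regressor and using cumulative sums would be the natural refinement in the single-regressor case.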

3 Subsampling Approximation

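Let $D(\alpha) = \mathbb{P}(K_0 \le \alpha)$ denote the limiting distribution function of $K_n$, from which critical values are obtained. In this section, I show that $D(\alpha)$ can be approximated by subsampling. The subsampling method recomputes the statistic on subsamples of size $b$ drawn from the original data without replacement; here the subsamples are the $n - b + 1$ blocks of $b$ consecutive observations. Define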

$$R_{n,b,i}(\tau, x) := \frac{1}{\sqrt{b}} \sum_{t=i}^{i+b-1} (\tau - 1\{y_t - X_t'\hat\beta_b(\tau) \le 0\})\,1\{X_t \le x\}, \quad i = 1, \dots, n - b + 1,$$

where $\hat\beta_b(\tau)$ is the quantile regression estimator computed from $\{y_t, X_t\}_{t=i}^{i+b-1}$, $i = 1, \dots, n - b + 1$. Let

$$K_{n,b,i} = \sup_{\tau \in T}\, \sup_{x \in \mathbb{R}^k} \big| R_{n,b,i}(\tau, x) \big|.$$

One can approximate $D(\alpha)$ by the following subsampling approximation:

$$D_{n,b}(\alpha) := \frac{1}{n - b + 1} \sum_{i=1}^{n-b+1} 1\{K_{n,b,i} \le \alpha\}.$$
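The subsampling approximation above can be sketched in a few lines. The names `subsample_statistics` and `subsample_cdf` are hypothetical, and `statistic` stands for any function that maps one block of consecutive observations to a number; in the setting of this paper it would refit the quantile regression on the block and return the block statistic, a step that is assumed rather than implemented here.

```python
def subsample_statistics(data, b, statistic):
    """Apply `statistic` to each of the n - b + 1 blocks of b consecutive observations."""
    n = len(data)
    if not 1 <= b <= n:
        raise ValueError("block size b must satisfy 1 <= b <= n")
    return [statistic(data[i:i + b]) for i in range(n - b + 1)]


def subsample_cdf(stats, w):
    """Share of subsample statistics that are <= w (the empirical distribution at w)."""
    return sum(1 for k in stats if k <= w) / len(stats)


# Toy illustration with a stand-in statistic (the block maximum).
stats = subsample_statistics([1, 2, 3, 4, 5], b=3, statistic=max)
share = subsample_cdf(stats, 4)
```

Blocks are kept contiguous so that the dependence structure of the original sequence is preserved within each subsample.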

Theorem 3. Under the conditions of Theorem 1, if $b \to \infty$, $n \to \infty$ and $b/n \to 0$, then

$$D_{n,b}(\alpha) \xrightarrow{\;p\;} D(\alpha).$$

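Furthermore, consider the local alternatives

$$H_a:\; \mathbb{E}\big[\tau - 1\{y - X'\beta_\tau \le 0\} \,\big|\, X\big] = \frac{\delta(X, \tau)}{\sqrt{n}},$$

with $\delta(X, \tau) \ne 0$. In the Appendix, I show that

$$R_n(\hat\beta_\tau, \tau, x) = \frac{1}{\sqrt{n}} \sum_{i=1}^{n} (\tau - 1\{Y_i - X_i'\beta_\tau \le 0\})\,1\{X_i \le x\} + \Delta' \sqrt{n}\,(\hat\beta_\tau - \beta_\tau) + o(1).$$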

Under $H_a$ and the regularity conditions, one has

$$\mathbb{E}\big[R_n(\hat\beta_\tau, \tau, x)\big] = \frac{1}{n} \sum_{i=1}^{n} \delta(X_i, \tau).$$

In addition, under $H_a$ and all the regularity conditions, if $b/n \to 0$,

$$\mathbb{E}\big[R_{n,b,i}(\tau, x)\big] = \frac{\sqrt{b}}{\sqrt{n}} \left[ \frac{1}{b} \sum_{t=i}^{i+b-1} \delta(X_t, \tau) \right] \to 0.$$
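To see the rate at work, consider an illustrative block-size rule that is not prescribed in the text: $b = n^{\gamma}$ with $0 < \gamma < 1$. Then

$$\sqrt{\frac{b}{n}} = n^{-(1-\gamma)/2} \longrightarrow 0,$$

so that, for example, $n = 10{,}000$ and $\gamma = 1/2$ give $b = 100$ and $\sqrt{b/n} = 0.1$. Provided the bracketed average of $\delta$ stays bounded, the drift of the subsample process is shrunk by this factor, whereas the drift of the full-sample process is not.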

Therefore, the distribution of $K_n$ differs from the distribution of $K_{n,b,i}$ under the local alternatives. Let $cv(1 - \alpha) = \inf\{w : D_{n,b}(w) \ge 1 - \alpha\}$ denote the critical value of $K_{n,b,i}$. The power of the proposed test under $H_a$ follows: under the local alternatives,

$$\mathbb{P}\big(K_n \ge cv(1 - \alpha)\big) \xrightarrow{\;p\;} 1.$$
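The critical value defined above is the empirical $(1 - \alpha)$-quantile of the subsample statistics, which suggests the following sketch. The function names `subsample_critical_value` and `reject_null` are hypothetical, and the list `stats` is assumed to hold subsample statistics that were computed elsewhere; the numbers below are made up for illustration only.

```python
import math


def subsample_critical_value(stats, alpha):
    """Smallest w whose empirical distribution value reaches 1 - alpha.

    Equivalent to the order statistic of rank ceil((1 - alpha) * m),
    where m is the number of subsample statistics.
    """
    ordered = sorted(stats)
    m = len(ordered)
    # Small tolerance guards against floating-point error in (1 - alpha) * m.
    rank = max(1, math.ceil((1 - alpha) * m - 1e-12))
    return ordered[rank - 1]


def reject_null(k_n, stats, alpha=0.05):
    """Reject when the full-sample statistic reaches the subsampling critical value."""
    return k_n >= subsample_critical_value(stats, alpha)


# Made-up subsample statistics, for illustration only.
stats = [0.2, 0.5, 0.1, 0.9, 0.4, 0.3, 0.8, 0.7, 0.6, 1.0]
cv = subsample_critical_value(stats, alpha=0.1)
```

Using the order statistic directly avoids scanning candidate values of $w$, since the empirical distribution only changes at the observed subsample statistics.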

4 Conclusion

In this article, I have proposed a new consistent specification test of conditional quantile processes and shown that the limiting distribution of the proposed test can be approximated by the subsampling method under very weak conditions. The proposed testing procedure avoids both the Durbin problem of goodness-of-fit testing and the nuisance parameter problem of quantile regression models. The subsampling approximation makes the specification test computationally simpler.

Acknowledgments


Appendix

Before proving Theorem 1, I first prove Lemma A.1 and Lemma A.2. Define

$$v_n(\beta, \tau, x) := \frac{1}{\sqrt{n}} \sum_{i=1}^{n} \Big\{ (\tau - 1\{Y_i - X_i'\beta \le 0\})\,1\{X_i \le x\} - \mathbb{E}\big[(\tau - 1\{Y_i - X_i'\beta \le 0\})\,1\{X_i \le x\}\big] \Big\}.$$

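Lemma A.1. Under Assumptions [A.1] and [A.2], for each $\varepsilon > 0$ there exists a $\xi > 0$ such that

$$\lim_{n \to \infty} \left\| \sup_{\rho((\beta_1, \tau_1, x_1), (\beta_2, \tau_2, x_2)) < \xi} \big| v_n(\beta_1, \tau_1, x_1) - v_n(\beta_2, \tau_2, x_2) \big| \right\|_Q < \varepsilon, \tag{3}$$

where

$$\rho\big((\beta_1, \tau_1, x_1), (\beta_2, \tau_2, x_2)\big) = \Big( \mathbb{E}\Big[ \big( (1\{Y_i - X_i'\beta_1 \le 0\} - \tau_1)\,1\{X_i \le x_1\} - (1\{Y_i - X_i'\beta_2 \le 0\} - \tau_2)\,1\{X_i \le x_2\} \big)^2 \Big] \Big)^{1/2}$$

and $\|\cdot\|_Q$ denotes the $L^Q$ norm.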

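Proof of Lemma A.1: The result can be obtained by extending the results of Whang (2005), which must be modified for quantile regression models across quantiles. Under Assumptions [A.1] and [A.2], we verify the conditions of Andrews and Pollard (1994, p.121). First, the strong mixing condition holds by Assumption [A.2]. Second,

$$
\begin{aligned}
& \sup_{i \ge 1} \mathbb{E} \sup_{B((\beta, \tau, x), r)} \Big| (1\{Y_i - X_i'\beta^* \le 0\} - \tau^*)\,1\{X_i \le x^*\} - (1\{Y_i - X_i'\beta \le 0\} - \tau)\,1\{X_i \le x\} \Big|^2 \\
&= \sup_{i \ge 1} \mathbb{E} \sup_{B((\beta, \tau, x), r)} \Big| (1\{Y_i - X_i'\beta^* \le 0\} - 1\{Y_i - X_i'\beta \le 0\})\,1\{X_i \le x\} + (\tau - \tau^*)\,1\{X_i \le x\} \\
&\qquad\qquad + (1\{Y_i - X_i'\beta^* \le 0\} - \tau^*)(1\{X_i \le x^*\} - 1\{X_i \le x\}) \Big|^2 \\
&= \sup_{i \ge 1} \mathbb{E} \sup_{B((\beta, \tau, x), r)} \Big[ (1\{Y_i - X_i'\beta^* \le 0\} - 1\{Y_i - X_i'\beta \le 0\})^2\,1\{X_i \le x\} + (\tau - \tau^*)^2\,1\{X_i \le x\} \\
&\qquad\qquad + (1\{Y_i - X_i'\beta^* \le 0\} - \tau^*)^2 (1\{X_i \le x^*\} - 1\{X_i \le x\})^2 \\
&\qquad\qquad + 2(\tau - \tau^*)(1\{Y_i - X_i'\beta^* \le 0\} - 1\{Y_i - X_i'\beta \le 0\})\,1\{X_i \le x\} \\
&\qquad\qquad + 2(1\{Y_i - X_i'\beta^* \le 0\} - 1\{Y_i - X_i'\beta \le 0\})\,1\{X_i \le x\}\,(1\{Y_i - X_i'\beta^* \le 0\} - \tau^*)(1\{X_i \le x^*\} - 1\{X_i \le x\}) \\
&\qquad\qquad + 2(\tau - \tau^*)\,1\{X_i \le x\}\,(1\{Y_i - X_i'\beta^* \le 0\} - \tau^*)(1\{X_i \le x^*\} - 1\{X_i \le x\}) \Big] \\
&\le C_1\,\mathbb{E}\|X_i\|\,\|\beta^* - \beta\| + |\tau^* - \tau|^2 + C_2\,\|x^* - x\| + 2C_3\,|\tau^* - \tau|\,\mathbb{E}\|X_i\|\,\|\beta^* - \beta\| \\
&\qquad + 2C_4\,\mathbb{E}\|X_i\|\,\|\beta^* - \beta\|\,\|x^* - x\| + 2C_6\,|\tau^* - \tau|\,\|x^* - x\| \\
&\le C r,
\end{aligned}
$$

where $B((\beta, \tau, x), r)$ is the ball of radius $r$ around $(\beta, \tau, x)$, the first inequality follows from Assumption [A1] and Taylor expansions, and the second inequality follows from Assumption [A2]; hence the bracketing condition holds. Therefore, equation (3) holds by Theorem 2.2 of Andrews and Pollard (1994).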

Lemma A.2. Under Assumptions [A.1], [A.2] and [A.3],

$$\sup_{x \in \mathbb{R}^k}\, \sup_{\tau \in T} \big| v_n(\hat\beta_\tau, \tau, x) - v_n(\beta_\tau, \tau, x) \big| \xrightarrow{\;p\;} 0.$$

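Proof of Lemma A.2: The following argument also modifies Whang (2005). For every $\varepsilon > 0$ there exists a $\xi > 0$ such that

$$
\begin{aligned}
& \lim_{n \to \infty} \mathbb{P}\Big( \sup_{x \in \mathbb{R}^k} \sup_{\tau \in T} \big| v_n(\hat\beta_\tau, \tau, x) - v_n(\beta_\tau, \tau, x) \big| > \varepsilon \Big) \\
&\le \lim_{n \to \infty} \mathbb{P}\Big( \sup_{x \in \mathbb{R}^k} \sup_{\tau \in T} \big| v_n(\hat\beta_\tau, \tau, x) - v_n(\beta_\tau, \tau, x) \big| > \varepsilon,\; \sup_{x \in \mathbb{R}^k} \sup_{\tau \in T} \rho\big((\hat\beta_\tau, \tau, x), (\beta_\tau, \tau, x)\big) < \xi \Big) \\
&\qquad + \lim_{n \to \infty} \mathbb{P}\Big( \sup_{x \in \mathbb{R}^k} \sup_{\tau \in T} \rho\big((\hat\beta_\tau, \tau, x), (\beta_\tau, \tau, x)\big) \ge \xi \Big) \\
&\le \lim_{n \to \infty} \mathbb{P}\Big( \sup_{\rho((\hat\beta_\tau, \tau, x), (\beta_\tau, \tau, x)) < \xi} \big| v_n(\hat\beta_\tau, \tau, x) - v_n(\beta_\tau, \tau, x) \big| > \varepsilon \Big) \to 0,
\end{aligned}
$$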

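where the last convergence to zero follows from Lemma A.1, and the last term on the right-hand side of the first inequality is zero because

$$
\begin{aligned}
\sup_{x \in \mathbb{R}^k} \sup_{\tau \in T} \rho\big((\hat\beta_\tau, \tau, x), (\beta_\tau, \tau, x)\big)^2
&= \sup_{x \in \mathbb{R}^k} \sup_{\tau \in T} \mathbb{E}\Big[ \big(1\{Y_i - X_i'\hat\beta_\tau \le 0\} - 1\{Y_i - X_i'\beta_\tau \le 0\}\big)^2\, 1\{X_i \le x\}^2 \Big] \\
&\le \sup_{\tau \in T} \mathbb{E}\Big[ \big(1\{Y_i - X_i'\hat\beta_\tau \le 0\} - 1\{Y_i - X_i'\beta_\tau \le 0\}\big)^2 \Big] \\
&\le C\,\mathbb{E}\|X_i\|\,\|\hat\beta_\tau - \beta_\tau\| \to 0,
\end{aligned}
$$

where the last inequality follows from Assumptions [A.1] and [A.2] and the last convergence holds by Assumption [A.3].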


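Proof of Theorem 1. Lemmas A.1 and A.2 suffice to show Theorem 1. By the definition of $v_n(\cdot)$, one has

$$
\begin{aligned}
R_n(\hat\beta_\tau, \tau, x)
&= \frac{1}{\sqrt{n}} \sum_{i=1}^{n} (\tau - 1\{Y_i - X_i'\hat\beta_\tau \le 0\})\,1\{X_i \le x\} \\
&= v_n(\hat\beta_\tau, \tau, x) - v_n(\beta_\tau, \tau, x) + v_n(\beta_\tau, \tau, x) + \frac{1}{\sqrt{n}} \sum_{i=1}^{n} \mathbb{E}\big[(\tau - 1\{Y_i - X_i'\hat\beta_\tau \le 0\})\,1\{X_i \le x\}\big] \\
&= v_n(\beta_\tau, \tau, x) + \frac{1}{\sqrt{n}} \sum_{i=1}^{n} \mathbb{E}\big[(\tau - 1\{Y_i - X_i'\hat\beta_\tau \le 0\})\,1\{X_i \le x\}\big] + o(1) \\
&= v_n(\beta_\tau, \tau, x) + \frac{1}{\sqrt{n}} \sum_{i=1}^{n} \mathbb{E}\big[(\tau - 1\{Y_i - X_i'\beta_\tau \le 0\})\,1\{X_i \le x\}\big] \\
&\qquad + \left( \frac{1}{n} \sum_{i=1}^{n} \nabla_\beta\, \mathbb{E}\big[(\tau - 1\{Y_i - X_i'\beta_\tau \le 0\})\,1\{X_i \le x\}\big] \right)' \sqrt{n}\,(\hat\beta_\tau - \beta_\tau) + o(1) \\
&= \frac{1}{\sqrt{n}} \sum_{i=1}^{n} (\tau - 1\{Y_i - X_i'\beta_\tau \le 0\})\,1\{X_i \le x\} + \left( \frac{1}{n} \sum_{i=1}^{n} \mathbb{E}\big[f_{y|X}(X_i'\beta_\tau)\,X_i\,1\{X_i \le x\}\big] \right)' \sqrt{n}\,(\hat\beta_\tau - \beta_\tau) + o(1) \\
&= \frac{1}{\sqrt{n}} \sum_{i=1}^{n} (\tau - 1\{Y_i - X_i'\beta_\tau \le 0\})\,1\{X_i \le x\} + \Delta' \sqrt{n}\,(\hat\beta_\tau - \beta_\tau) + o(1) \\
&\Rightarrow Q_\infty(\tau, x) + \Delta' d_0(\tau),
\end{aligned}
$$

where the third equality holds by Lemma A.2, the fourth equality follows from a Taylor expansion, and the last convergence holds by Assumption [A.3] and finite-dimensional convergence.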

Proof of Corollary 2. By the continuous mapping theorem and Theorem 1, the result follows.

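Proof of Theorem 3. First note that

$$
\begin{aligned}
\mathbb{E}\big[D_{n,b}(\alpha)\big]
&= \mathbb{E}\left[ \frac{1}{n - b + 1} \sum_{i=1}^{n-b+1} 1\{K_{n,b,i} \le \alpha\} \right] \\
&= \frac{1}{n - b + 1} \sum_{i=1}^{n-b+1} \mathbb{E}\big[1\{K_{n,b,i} \le \alpha\}\big] \\
&= \frac{1}{n - b + 1} \sum_{i=1}^{n-b+1} \mathbb{P}\big(K_{n,b,i} \le \alpha\big) \\
&= \frac{1}{n - b + 1} \sum_{i=1}^{n-b+1} \mathbb{P}\big(K_{n,b,1} \le \alpha\big) \\
&= \mathbb{P}\big(K_{n,b,1} \le \alpha\big) \\
&\to \mathbb{P}\big(K_0 \le \alpha\big) = D(\alpha),
\end{aligned}
$$

where the fourth equality follows from stationarity and the final limit holds as $b \to \infty$, since $K_{n,b,1}$ is the test statistic computed from $b$ observations. Furthermore, the variance satisfies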

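$$
\begin{aligned}
\mathrm{Var}\big[D_{n,b}(\alpha)\big]
&= \mathrm{Var}\left[ \frac{1}{n - b + 1} \sum_{i=1}^{n-b+1} 1\{K_{n,b,i} \le \alpha\} \right] \\
&= \frac{1}{(n - b + 1)^2} \sum_{i=1}^{n-b+1} \mathrm{Var}\big(1\{K_{n,b,i} \le \alpha\}\big) + \frac{2}{(n - b + 1)^2} \sum_{i=1}^{n-b+1} \sum_{j=1}^{n-b} \mathrm{Cov}\big(1\{K_{n,b,i} \le \alpha\}, 1\{K_{n,b,i+j} \le \alpha\}\big) \\
&= \frac{1}{n - b + 1} \left[ s_{n-b+1,0} + 2 \sum_{j=1}^{b-1} s_{n-b+1,j} \right] + \frac{2}{(n - b + 1)^2} \sum_{i=1}^{n-b+1} \sum_{j=b}^{n-b} \mathrm{Cov}\big(1\{K_{n,b,i} \le \alpha\}, 1\{K_{n,b,i+j} \le \alpha\}\big),
\end{aligned}
$$

with

$$s_{n-b+1,j} = \frac{1}{n - b + 1} \sum_{i=1}^{(n-b+1)-j} \mathrm{Cov}\big(1\{K_{n,b,i} \le \alpha\}, 1\{K_{n,b,i+j} \le \alpha\}\big).$$

First, if $b/n \to 0$, then

$$\frac{1}{n - b + 1} \left[ s_{n-b+1,0} + 2 \sum_{j=1}^{b-1} s_{n-b+1,j} \right] = o(1).$$

Second, by Hall and Heyde (1980, p.277), for $j \ge b$, $\big|\mathrm{Cov}\big(1\{K_{n,b,i} \le \alpha\}, 1\{K_{n,b,i+j} \le \alpha\}\big)\big| \le 4\alpha(j)$, where $\alpha(j)$ denotes the mixing coefficient, and then

$$\frac{2}{(n - b + 1)^2} \sum_{i=1}^{n-b+1} \sum_{j=b}^{n-b} \mathrm{Cov}\big(1\{K_{n,b,i} \le \alpha\}, 1\{K_{n,b,i+j} \le \alpha\}\big) \le \frac{8}{n - b + 1} \sum_{j=b}^{n-b} \alpha(j) \to 0, \quad \text{as } n \to \infty,$$

where the last convergence follows from the mixing assumption. It follows that $\mathrm{Var}[D_{n,b}(\alpha)] \to 0$. Combined with the limit of $\mathbb{E}[D_{n,b}(\alpha)]$ obtained above, Chebyshev's inequality gives $D_{n,b}(\alpha) \xrightarrow{\;p\;} D(\alpha)$.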


References

Andrews, D. and D. Pollard (1994). An introduction to functional central limit theorems for dependent stochastic processes, International Statistical Review, 62, 119–132.

Bierens, H. (1982). Consistent model specification tests, Journal of Econometrics, 20, 105–134.

Bierens, H. and D. Ginther (2001). Integrated conditional moment testing of quantile regression models, Empirical Economics, 26, 307–324.

Hall, P. and C. C. Heyde (1980). Martingale Limit Theory and Its Application (Academic Press, San Diego).

Horowitz, J. and V. Spokoiny (2002). An adaptive rate-optimal test of linearity for median regression models, Journal of the American Statistical Association, 97, 822–835.

Koul, H. and W. Stute (1999). Nonparametric model checks for time series, Annals of Statistics, 27, 204–236.

Politis, D. and J. Romano (1994). Large sample confidence regions based on subsamples under minimal assumptions, Annals of Statistics, 22, 2031–2050.

Politis, D., J. Romano, and M. Wolf (1999). Subsampling (Springer, New York).

Stinchcombe, M. and H. White (1998). Consistent specification testing with nuisance parameters present only under the alternative, Econometric Theory, 14, 295–325.

Stute, W. (1997). Nonparametric model checks for regression, Annals of Statistics, 25, 613–641.

Stute, W., S. Thies, and L. Zhu (1998). Model checks for regression: an innovation process approach, Annals of Statistics, 26, 1916–1934.

Whang, Y. (2005). Consistent specification testing for quantile regression models, in D. Corbae, S. N. Durlauf, and B. E. Hansen eds. Econometric Theory and Practice: Frontiers of Analysis and Applied Research (Cambridge University Press).

Zheng, J. (1998). A consistent nonparametric test of parametric regression models under conditional quantile conditions, Econometric Theory, 14, 123–138.