National Science Council, Executive Yuan: Research Project Final Report
The Endogeneity Problem in Quantile Regression Analysis of Panel Data
Research Report (Condensed Version)
Project type: Individual
Project number: NSC 100-2410-H-004-071-
Project period: August 1, 2011 to July 31, 2012
Institution: Department of Economics, National Chengchi University (國立政治大學經濟學系)
Principal investigator: 林馨怡
Project participants: master's student, part-time assistant: 李婉璘; doctoral student, part-time assistant: 劉冬威
Public information: this project is open to public inquiry
October 31, 2012 (ROC year 101)
Chinese abstract: none. Chinese keywords: none.
English abstract: This project develops a two-stage estimation of a panel data quantile regression model with an endogenous explanatory variable. The regressors in the model include a lagged endogenous dependent variable and other explanatory variables that are correlated with the fixed effects. In the estimation, the control function approach is used, and a penalized quantile regression method for panel data is applied in the second stage. The Monte Carlo simulation shows that the proposed estimation effectively reduces the endogeneity bias and performs better than other estimators in finite samples.
English keywords: control function, endogeneity, panel data, quantile regression
1 Introduction
The panel data model with individual fixed effects is widely employed in empirical studies. The classical panel data model usually investigates the mean relationship between the dependent variable and the explanatory variables, and therefore cannot describe effects that differ across the distribution of the dependent variable. It is thus desirable to have an econometric method that enables modeling of heterogeneous relationships. Koenker (2004) proposes quantile regression for the panel data model, which provides a complete description of the heterogeneous effects of the explanatory variables on the dependent variable. Note that in most applications the number of observations on each individual is relatively modest; therefore, the fixed effects in this project do not allow a distributional shift and do not depend on the quantiles. The usual approach of specifying individual dummy variables for the fixed effects in order to obtain estimates of the common model parameters is not available in the quantile regression model for panel data; the fixed effects must be estimated directly, and the incidental parameter problem arises in the estimation. Koenker (2004) uses a penalized quantile regression (PQR) method, in which a penalized objective function improves the estimation of the common model parameters by controlling the variability introduced by the fixed effects. See also Lamarche (2010) and Galvao and Montes-Rojas (2010).
In practical applications, the variables of interest are often endogenous, which makes conventional methods inconsistent and hence inappropriate for recovering the heterogeneous effects of the variables. Several studies focus on obtaining consistent estimators in quantile regression models with endogenous regressors. Amemiya (1982) and Powell (1983) suggest the two-stage least absolute deviations estimation, which is analogous to two-stage least squares estimation. Chesher (2003, 2005, 2007) and Ma and Koenker (2006) consider the scope of quantile regression methods for structural econometric models. Blundell and Powell (2007) and Lee (2007) propose the control function approach to address possible endogeneity problems. In addition, the instrumental variable quantile regression (IVQR) of Chernozhukov and Hansen (2005, 2006, 2008) is widely applied in empirical research.
When quantile regression is applied to a panel data model with endogenous variables, Arias et al. (2001), following the control function approach, suggest a two-stage estimation method. Moreover, Harding and Lamarche (2009) and Galvao (2011) introduce the IVQR method of Chernozhukov and Hansen (2005) for panel data models. Galvao and Montes-Rojas (2010) consider the IVQR method and the PQR method of Koenker (2004) for the dynamic panel data model. Lin (2010) combines the PQR method of Koenker (2004) and the fitted value approach to propose a two-stage estimation for the dynamic panel data quantile regression model with fixed effects. While most studies apply the IVQR method to the endogeneity problem in quantile regression, computing the IVQR estimators is very complicated in empirical practice.
2 Control Function
This project aims to extend the control function approach for quantile regression to the panel data model with unobservable heterogeneity. The panel data model with fixed effects has the form
$$Y_{it} = \alpha_i + X_{it}\beta + Z_{1,it}'\gamma_1 + U_{it}, \quad i = 1, \dots, N, \; t = 1, \dots, T,$$
where $Y_{it}$ is a real-valued dependent variable; $\alpha_i$ is a time-invariant fixed effect intended to capture individual-specific effects or unobserved heterogeneity not adequately controlled for by the other covariates in the model; $X_{it}$ is a real-valued, continuously distributed, endogenous explanatory variable; $Z_{1,it}$ is a $(d_{Z_1} \times 1)$ vector of exogenous explanatory variables; $\beta$ and $\gamma_1$ are unknown parameters; and $U_{it}$ is the error term. If the number of observations $T$ is large for each individual, we may estimate a distributional shift $\alpha_i(\tau)$ for each individual. In most applications, however, $T$ is relatively modest (for example, when each individual is observed over only a short time series), and it is not feasible to estimate a distributional individual effect.
Consider the following model for the endogenous variable:
$$X_{it} = \mu + Z_{it}'\gamma + V_{it}, \tag{1}$$
where $Z_{it} \equiv (Z_{1,it}, Z_{2,it})$ is a $(d_Z \times 1)$ vector of exogenous explanatory variables, $\mu$ is an unknown parameter, $\gamma \equiv [\gamma_1, \gamma_2]$ is a $(d_Z \times 1)$ vector of unknown parameters, and $V_{it}$ is a real-valued unobserved random variable. For identification it is assumed that at least one component of $Z_{it}$ is not included in $Z_{1,it}$ and that $\gamma_2 \neq 0$; here $d_{Z_1}$ denotes the dimension of $Z_{1,it}$. When $v_{it}$ is the value of $V_{it}$ that satisfies (1),
$$Q_{U_{it}|X_{it},Z_{it}}(\tau \mid x_{it}, z_{it}) = Q_{U_{it}|V_{it},Z_{it}}(\tau \mid v_{it}, z_{it}),$$
where $Q_{U_{it}|X_{it},Z_{it}}(\tau \mid x_{it}, z_{it})$ denotes the $\tau$th quantile of $U_{it}$ conditional on $X_{it} = x_{it}$ and $Z_{it} = z_{it}$, and the other expressions are understood similarly. In addition, assume quantile independence of $U_{it}$ from $Z_{it}$ conditional on $v_{it}$, and quantile independence of $V_{it}$ from $Z_{it}$:
$$Q_{U_{it}|V_{it},Z_{it}}(\tau \mid v_{it}, z_{it}) = Q_{U_{it}|V_{it}}(\tau \mid v_{it}) \tag{2}$$
and
$$Q_{V_{it}|Z_{it}}(\theta \mid z_{it}) = 0 \tag{3}$$
almost surely. Under assumption (2), the panel data model for the $\tau$th conditional quantile function of the response of the $t$th observation on the $i$th individual, $Y_{it}$, is
$$Q_{Y_{it}|X_{it},Z_{1,it}}(\tau \mid x_{it}, z_{it}) = \alpha_i + x_{it}\beta(\tau) + z_{1,it}'\gamma_1(\tau) + Q_{U_{it}|V_{it}}(\tau \mid v_{it}), \quad i = 1, \dots, N, \; t = 1, \dots, T. \tag{4}$$
In model (4), the $\alpha_i$'s have a pure location shift effect on the conditional quantiles, and their effects do not depend on the quantile $\tau$. The effects of the covariates $X_{it}$ and $Z_{1,it}$ are permitted to depend on the quantile $\tau$. In addition, since the variable $V_{it}$ is stochastically dependent on $U_{it}$, further assume that the conditional quantile function of $U_{it}$ given $V_{it}$ is a linear function of $V_{it}$. Then (4) can be rewritten as
$$Q_{Y_{it}|X_{it},Z_{1,it}}(\tau \mid x_{it}, z_{it}) = \alpha_i + x_{it}\beta(\tau) + z_{1,it}'\gamma_1(\tau) + v_{it}\phi(\tau), \quad i = 1, \dots, N, \; t = 1, \dots, T, \tag{5}$$
with $\phi(\tau)$ an unknown parameter. Note that the ordinary least squares method eliminates the fixed effect by taking first differences of the model in order to obtain consistent parameter estimates. In the quantile regression framework, no such transformation is available, since the quantile function is not a linear operator. This suggests that $\beta(\tau)$, $\gamma_1(\tau)$ and $\phi(\tau)$ could be estimated by the penalized quantile regression for panel data of Koenker (2004). In applications, $v_{it}$ is unobserved. By assumption (3),
$$Q_{X_{it}|Z_{it}}(\theta \mid z_{it}) = \mu(\theta) + z_{it}'\gamma(\theta),$$
and $v_{it}$ can be estimated consistently by the residual of a linear $\theta$th quantile regression of $X$ on $(1, Z)$. Therefore, $\beta(\tau)$ and $\gamma_1(\tau)$ can be estimated by a two-step procedure. The first step constructs the estimated residuals $\hat{V}$ from the linear quantile regression of $X$ on $(1, Z)$. The second step is the penalized quantile regression for panel data of $Y$ on $X$, $Z_1$ and $\hat{V}$. This approach corrects for endogeneity by adding estimates of $V$ as an additional explanatory variable, and can be viewed as a variant of the control function approach.
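The remark in this section that the quantile function is not a linear operator can be illustrated with a toy calculation. The sketch below uses made-up numbers (not data from this project) for two periods of an outcome on three individuals: sums commute with first differencing, which is why differencing works for mean regression, whereas medians do not.

```python
import statistics

# Toy outcomes for three individuals in two periods (made-up numbers).
y1 = [1, 2, 9]
y2 = [3, 0, 1]

# First difference, individual by individual.
diff = [a - b for a, b in zip(y1, y2)]  # [-2, 2, 8]

# Sums (and hence means) are linear: differencing commutes with them.
sum_of_difference = sum(diff)
difference_of_sums = sum(y1) - sum(y2)

# Medians are not linear: the median of the difference is not the
# difference of the medians.
median_of_difference = statistics.median(diff)
difference_of_medians = statistics.median(y1) - statistics.median(y2)

print(sum_of_difference, difference_of_sums)
print(median_of_difference, difference_of_medians)
```

Because the two median quantities disagree, differencing the data does not remove $\alpha_i$ from a conditional quantile function in the way it does from a conditional mean.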
3 Estimation
The estimation procedure consists of two steps. The data consist of independent and identically distributed (i.i.d.) observations $\{y_{it}, x_{it}, z_{it} : i = 1, \dots, N, \; t = 1, \dots, T\}$. The first step constructs the estimated residuals $\hat{v}_{it}(\theta) = x_{it} - \hat{\mu}(\theta) - z_{it}'\hat{\gamma}(\theta)$ ($i = 1, \dots, N$, $t = 1, \dots, T$) by a linear quantile regression of $X$ on $(1, Z)$, where $(\hat{\mu}(\theta), \hat{\gamma}(\theta))$ is a solution to
$$\min_{\mu, \gamma} \sum_{i=1}^{N} \sum_{t=1}^{T} \rho_\theta(x_{it} - \mu - z_{it}'\gamma),$$
where $\theta \in (0, 1)$ and $\rho_\theta(u) = u(\theta - \mathbb{1}\{u < 0\})$ is the piecewise linear quantile loss function, or "check" function, of Koenker and Bassett (1978). The second step is a penalized linear quantile regression of $y_{it}$ on $(x_{it}, z_{1,it}, v_{it})$, using the estimated residuals $\hat{v}_{it}$ in place of the unobserved $v_{it}$. In this project, the second step is carried out via the penalized quantile regression for panel data of Koenker (2004). When the covariates of the model contain an intercept, the penalized quantile regression estimates model (5) for several quantiles simultaneously:
$$\min_{\alpha, \beta, \gamma_1, \phi} \sum_{k=1}^{q} \sum_{i=1}^{N} \sum_{t=1}^{T} \omega_k \rho_{\tau_k}\big(y_{it} - \alpha_i - x_{it}\beta(\tau_k) - z_{1,it}'\gamma_1(\tau_k) - \hat{v}_{it}\phi(\tau_k)\big) + \lambda \sum_{i=1}^{N} |\alpha_i|, \tag{6}$$
where $\rho_\tau$ is again the check function for $\tau \in (0, 1)$. When $N$ is large relative to $T$, the $\ell_1$ shrinkage is advantageous in controlling the variability introduced by the large number of estimated $\alpha_i$ parameters. For $\lambda \to 0$, (6) becomes
$$\min_{\alpha, \beta, \gamma_1, \phi} \sum_{k=1}^{q} \sum_{i=1}^{N} \sum_{t=1}^{T} \omega_k \rho_{\tau_k}\big(y_{it} - \alpha_i - x_{it}\beta(\tau_k) - z_{1,it}'\gamma_1(\tau_k) - \hat{v}_{it}\phi(\tau_k)\big),$$
which is similar to the objective function of the panel data quantile regression using the instrumental variables method of Galvao (2011) and Harding and Lamarche (2009). When $\lambda \to \infty$, $\hat{\alpha}_i \to 0$ for all $i = 1, \dots, N$, and an estimate of the model purged of the fixed effects is obtained. The weights $\omega_k$ control the relative influence of the $q$ quantiles $\{\tau_1, \dots, \tau_q\}$ on the estimation of the $\alpha_i$ parameters. Note that since $z_{1,it}$ contains an intercept, we obtain $q$ quantile-specific estimates of the intercept.
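As a small illustration of the check function used in both steps, the sketch below evaluates $\rho_\tau(u) = u(\tau - \mathbb{1}\{u < 0\})$ and confirms on a toy sample that minimizing the summed check loss over a constant recovers a sample quantile. The data, the grid search and the helper names (`check_loss`, `total_loss`, `argmin_on_grid`) are illustrative assumptions, not part of the project's code.

```python
def check_loss(u, tau):
    """Check function rho_tau(u) = u * (tau - 1{u < 0})."""
    return u * (tau - (1.0 if u < 0 else 0.0))


def total_loss(q, data, tau):
    """Summed check loss when every observation is fitted by the constant q."""
    return sum(check_loss(y - q, tau) for y in data)


def argmin_on_grid(data, tau, grid):
    """Grid value with the smallest summed check loss (illustration only)."""
    return min(grid, key=lambda q: total_loss(q, data, tau))


data = [2.0, 7.0, 1.0, 9.0, 4.0]           # toy sample (made-up numbers)
grid = [k / 10 for k in range(0, 101)]     # candidate constants 0.0, 0.1, ..., 10.0

best_median = argmin_on_grid(data, 0.5, grid)   # minimiser for tau = 0.5
best_q25 = argmin_on_grid(data, 0.25, grid)     # minimiser for tau = 0.25
print(best_median, best_q25)
```

For $\tau = 0.5$ the loss is half the sum of absolute deviations, so the minimizer is the sample median; lowering $\tau$ penalizes negative residuals more heavily and moves the minimizer toward the lower order statistics.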
Under Assumptions 1-4 in the paper, we can obtain the asymptotic normality of the proposed estimator. Please contact the author for the complete paper.
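The two-step procedure of this section can be sketched in code as follows. This is a minimal sketch under several simplifying assumptions that are not taken from the report: a single quantile ($q = 1$, $\omega_1 = 1$), one scalar exogenous covariate in $Z_1$ plus an intercept, one excluded instrument $Z_2$, a toy data-generating process, and the quantile regressions written as linear programs and solved with `scipy.optimize.linprog`. The function names `quantile_lp` and `two_step_cf` are hypothetical.

```python
import numpy as np
from scipy.optimize import linprog


def quantile_lp(y, X, tau, D=None, lam=0.0):
    """Minimise sum_i rho_tau(y_i - X_i b - D_i a) + lam * sum_j |a_j| as an LP."""
    n, p = X.shape
    if D is None:
        D = np.zeros((n, 0))
    m = D.shape[1]
    # Non-negative variables, in order: b+, b-, a+, a-, u+, u-.
    A_eq = np.hstack([X, -X, D, -D, np.eye(n), -np.eye(n)])
    cost = np.concatenate([
        np.zeros(2 * p),            # unpenalised coefficients
        lam * np.ones(2 * m),       # l1 penalty on a = a+ - a-
        tau * np.ones(n),           # weight on positive residuals
        (1.0 - tau) * np.ones(n),   # weight on negative residuals
    ])
    res = linprog(cost, A_eq=A_eq, b_eq=y, bounds=(0, None), method="highs")
    sol = res.x
    b = sol[:p] - sol[p:2 * p]
    a = sol[2 * p:2 * p + m] - sol[2 * p + m:2 * p + 2 * m]
    return b, a


def two_step_cf(y, x, z1, z2, ids, tau, theta=0.5, lam=1.0):
    """Two-step control function estimate at a single quantile tau."""
    n = len(y)
    ones = np.ones(n)
    # Step 1: theta-th quantile regression of x on (1, z1, z2); keep residuals.
    Z = np.column_stack([ones, z1, z2])
    g, _ = quantile_lp(x, Z, theta)
    v_hat = x - Z @ g
    # Step 2: penalised quantile regression of y on (x, 1, z1, v_hat),
    # with l1-penalised individual effects entered as dummy columns.
    D = (ids[:, None] == np.unique(ids)[None, :]).astype(float)
    W = np.column_stack([x, ones, z1, v_hat])
    coef, alpha = quantile_lp(y, W, tau, D=D, lam=lam)
    return {"beta": coef[0], "intercept": coef[1], "gamma1": coef[2],
            "phi": coef[3], "alpha": alpha}


# Toy panel (assumed design, not the report's Monte Carlo design).
rng = np.random.default_rng(0)
N, T = 30, 5
ids = np.repeat(np.arange(N), T)
z1 = rng.normal(size=N * T)
z2 = rng.normal(size=N * T)
v = rng.normal(size=N * T)
u = v + 0.3 * rng.normal(size=N * T)           # u depends on v, so x is endogenous
x = z1 + 2.0 * z2 + v
alpha_true = rng.normal(size=N)
y = alpha_true[ids] + 1.0 * x + 0.5 * z1 + u   # true beta = 1

fit = two_step_cf(y, x, z1, z2, ids, tau=0.5, lam=1.0)
fit_big = two_step_cf(y, x, z1, z2, ids, tau=0.5, lam=1e3)
print(round(float(fit["beta"]), 3), round(float(fit["phi"]), 3))
```

With a very large `lam` the penalized individual effects are shrunk to zero, which mirrors the $\lambda \to \infty$ case of (6); a moderate `lam` leaves some of them non-zero. Estimating several quantiles jointly with weights $\omega_k$ would require stacking the linear program across the $\tau_k$, which this sketch does not attempt.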
4 Monte Carlo Simulations
In this section, a Monte Carlo study investigates the small-sample properties of the estimators. We compare the bias and root mean squared error (RMSE) of the following estimators: (1) the estimator proposed in this project; (2) the penalized QR estimator of Koenker (2004); (3) the penalized QR estimator using the IVQR method of Galvao and Montes-Rojas (2010). Three models are considered in this section: (A) the pure location shift model,
$$y_{it} = \eta_i + \beta x_{it} + u_{it};$$
(B) the location-scale shift model I,
$$y_{it} = \eta_i + \beta x_{it} + (\gamma_0 x_{it}) u_{it};$$
and (C) the location-scale shift model II,
$$y_{it} = \eta_i + \beta x_{it} + (1 + \gamma_1 x_{it}) u_{it}.$$
The error term $u_{it}$ follows the normal distribution $N(0, \sigma_u^2)$ with $\sigma_u^2 = 1, 3, 5$, the heavy-tailed $t$-distribution with 3 degrees of freedom ($t_3$ distribution), or the $\chi^2$-distribution with 3 degrees of freedom ($\chi^2_3$ distribution).
The regressor $x_{it}$ is generated according to $x_{it} = \mu_i + \xi_{it}$, where the fixed effect
$$\mu_i = e_{1i} + \frac{1}{T} \sum_{t=1}^{T} x_{it}, \quad e_{1i} \sim N(0, \sigma_{e_1}^2),$$
and $\xi_{it}$ follows the same distribution as $u_{it}$. The fixed effect $\eta_i$ is generated as
$$\eta_i = e_{2i} + \frac{1}{T} \sum_{t=1}^{T} {}_{it}, \quad e_{2i} \sim N(0, \sigma_{e_2}^2).$$
From the above specification of the fixed effects, there is correlation between the individual effects and the explanatory variables, which ensures that the random effects estimator is inconsistent. In the simulation, $T = 10$, $N = 50$, and the number of replications is 2000. In addition, the parameters are $\alpha = \{0.3, 0.4, 0.5, 0.6, 0.7\}$, $\beta = 1$, and $\sigma_{e_1} = \sigma_{e_2} = 1$. For the location-scale shift models, we use $\gamma_0 = 0.5$ and $\gamma_1 = 0.1$.
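The outcome equations (A) to (C) and the three error distributions can be sketched as below. This is only an illustration: the fixed effects $\eta_i$ and the regressor $x_{it}$ are passed in as inputs rather than generated, and the values used in the usage lines are placeholders, not the design of this section. The function names `draw_errors` and `outcome` are hypothetical.

```python
import numpy as np


def draw_errors(rng, dist, size, sigma2=1.0):
    """Draw u_it from one of the error distributions considered in this section."""
    if dist == "normal":
        return rng.normal(0.0, np.sqrt(sigma2), size)
    if dist == "t3":
        return rng.standard_t(3, size)
    if dist == "chi2_3":
        return rng.chisquare(3, size)
    raise ValueError(f"unknown distribution: {dist}")


def outcome(model, eta, x, u, beta=1.0, gamma0=0.5, gamma1=0.1):
    """Generate y_it for model 'A', 'B' or 'C'; eta has shape (N,), x and u (N, T)."""
    eta = np.asarray(eta)[:, None]   # broadcast each eta_i across the T periods
    if model == "A":                 # pure location shift
        return eta + beta * x + u
    if model == "B":                 # location-scale shift I
        return eta + beta * x + (gamma0 * x) * u
    if model == "C":                 # location-scale shift II
        return eta + beta * x + (1.0 + gamma1 * x) * u
    raise ValueError(f"unknown model: {model}")


# Usage with placeholder inputs (N = 50, T = 10 as in the simulation settings).
rng = np.random.default_rng(0)
N, T = 50, 10
x = rng.normal(size=(N, T))      # placeholder regressor
eta = rng.normal(size=N)         # placeholder fixed effects
u = draw_errors(rng, "normal", (N, T), sigma2=1.0)
u_t3 = draw_errors(rng, "t3", (N, T))

y_a = outcome("A", eta, x, u)
y_b = outcome("B", eta, x, u)
y_c = outcome("C", eta, x, u)
```

In model (A) the regressor shifts only the location of $y_{it}$, whereas in (B) and (C) it also scales the error, so the effect of $x_{it}$ on the conditional quantiles varies with $\tau$.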
The Monte Carlo simulation shows that the proposed estimation effectively reduces the endogeneity bias and performs better than the other estimators in finite samples. We report all the results in the paper.
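The two comparison criteria used in this section, bias and RMSE, can be computed from the replication-level estimates as in the sketch below. The four estimates are made-up numbers for illustration only; they are not results from this project, and `bias_and_rmse` is a hypothetical helper name.

```python
import math


def bias_and_rmse(estimates, truth):
    """Bias and root mean squared error of replication-level estimates."""
    errors = [e - truth for e in estimates]
    bias = sum(errors) / len(errors)
    rmse = math.sqrt(sum(err ** 2 for err in errors) / len(errors))
    return bias, rmse


# Made-up estimates of beta from four hypothetical replications (true beta = 1).
bias, rmse = bias_and_rmse([0.9, 1.1, 1.3, 1.1], truth=1.0)
print(bias, rmse)
```

Bias measures the average deviation from the true parameter, while RMSE also reflects the dispersion of the estimates across replications.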
5 Conclusions
This project develops a two-stage estimation of a panel data quantile regression model with an endogenous explanatory variable. The regressors in the model include a lagged endogenous dependent variable and other explanatory variables that are correlated with the fixed effects. In the estimation, the control function approach is used, and a penalized quantile regression method for panel data is applied in the second stage. The Monte Carlo simulation shows that the proposed estimation effectively reduces the endogeneity bias and performs better than the other estimators in finite samples. The proposed approach is easy to implement and effective in several practical applications.
References
Amemiya, T. (1982). Two stage least absolute deviations estimators, Econometrica, 50, 689–711.
Arias, O., Hallock, K.F. and Sosa-Escudero, W. (2001). Individual heterogeneity in the returns to schooling: Instrumental variables quantile regression using twins data, Empirical Economics, 26, 7-40.
Blundell, R. and Powell, J.L. (2007). Censored regression quantiles with endogenous regressors, Journal of Econometrics, 141, 65–83.
Chernozhukov, V. and Hansen, C. (2005). An IV model of quantile treatment effects, Econometrica, 73, 245–261.
Chernozhukov, V. and Hansen, C. (2006). Instrumental quantile regression inference for structural and treatment effect models, Journal of Econometrics, 132, 497–525.
Chernozhukov, V. and Hansen, C. (2008). Instrumental variable quantile regression: A robust inference approach, Journal of Econometrics, 142, 379–398.
Chesher, A. (2003). Identification in nonseparable models, Econometrica, 71, 1405–1441.
Chesher, A. (2005). Nonparametric identification under discrete variation, Econometrica, 73, 1525–1550.
Chesher, A. (2007). Endogeneity and discrete outcomes. Cemmap Working Papers.
Galvao, A.F. (2011). Quantile regression for dynamic panel data, Journal of Econometrics, 164, 142–157.
Galvao, A.F. and Montes-Rojas, G.V. (2010). Penalized quantile regression for dynamic panel data, Journal of Statistical Planning and Inference, 140, 3476–3497.
Harding, M. and Lamarche, C. (2009). A quantile regression approach for estimating panel data models using instrumental variables, Economics Letters, 104, 133–135.
Koenker, R. (2004). Quantile regression for longitudinal data, Journal of Multivariate Analysis, 91, 74–89.
Koenker, R. and Bassett, G. (1978). Regression quantiles, Econometrica, 46, 33–50.
Lamarche, C. (2010). Robust penalized quantile regression estimation for panel data, Journal of Econometrics, 157, 396–408.
Lee, S. (2007). Endogeneity in quantile regression models: A control function approach, Journal of Econometrics, 141, 1131–1158.
Lin, H.Y. (2010). Dynamic panel quantile regression with an application to deficit and inflation. Working Paper.
Ma, L. and Koenker, R. (2006). Quantile regression methods for recursive structural equation models, Journal of Econometrics, 134, 471–506.
Powell, J. (1983). The asymptotic normality of two-stage least absolute deviations estimators, Econometrica, 51, 1569–1575.
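NSC-Funded Project: Dissemination Form for Derived R&D Results
Date: 2012/10/29
NSC-funded project. Project title: The Endogeneity Problem in Quantile Regression Analysis of Panel Data. Principal investigator: 林馨怡. Project number: 100-2410-H-004-071-. Field: Mathematical and quantitative methods. No R&D results dissemination data.
Summary Table of Research Results, Research Projects of Year 100 (2011)
Principal investigator: 林馨怡. Project number: 100-2410-H-004-071-. Project title: The Endogeneity Problem in Quantile Regression Analysis of Panel Data.
Quantitative results. "Achieved" is the number actually achieved (accepted or published); "Expected" is the expected total (including the number achieved); "Share" is the actual contribution of this project. The remarks column (qualitative notes, such as results shared among several projects or results featured as a journal cover story) is empty for all rows.
Scope     Category                          Item                        Achieved  Expected  Share  Unit
Domestic  Publications                      Journal papers              0         0         100%   papers
Domestic  Publications                      Research/technical reports  1         1         100%   papers
Domestic  Publications                      Conference papers           0         0         100%   papers
Domestic  Publications                      Books                       0         0         100%
Domestic  Patents                           Applications pending        0         0         100%   cases
Domestic  Patents                           Granted                     0         0         100%   cases
Domestic  Technology transfer               Cases                       0         0         100%   cases
Domestic  Technology transfer               Royalties                   0         0         100%   thousand NT$
Domestic  Project personnel (ROC nationals) Master's students           1         1         100%   person-times
Domestic  Project personnel (ROC nationals) Doctoral students           1         1         100%   person-times
Domestic  Project personnel (ROC nationals) Postdoctoral researchers    0         0         100%   person-times
Domestic  Project personnel (ROC nationals) Full-time assistants        0         0         100%   person-times
Foreign   Publications                      Journal papers              0         1         100%   papers
Foreign   Publications                      Research/technical reports  0         0         100%   papers
Foreign   Publications                      Conference papers           0         0         100%   papers
Foreign   Publications                      Books                       0         0         100%   chapters/volumes
Foreign   Patents                           Applications pending        0         0         100%   cases
Foreign   Patents                           Granted                     0         0         100%   cases
Foreign   Technology transfer               Cases                       0         0         100%   cases
Foreign   Technology transfer               Royalties                   0         0         100%   thousand NT$
Foreign   Project personnel (foreign)       Master's students           0         0         100%   person-times
Foreign   Project personnel (foreign)       Doctoral students           0         0         100%   person-times
Foreign   Project personnel (foreign)       Postdoctoral researchers    0         0         100%   person-times
Foreign   Project personnel (foreign)       Full-time assistants        0         0         100%   person-times
Other results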