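Hierarchical Factor Models with Non-stationary Series

Ministry of Science and Technology Research Project Report

Final Report

Hierarchical Factor Models with Non-stationary Series (Year 2)

Project type: Individual project
Project number: NSC 102-2410-H-004-019-MY2
Project period: August 1, 2014 to July 31, 2015
Host institution: Department of Economics, National Chengchi University
Principal investigator: 徐士勛 (Shih-Hsun Hsu)
Project participants:
Master's student, part-time assistant: 李勁宏
Master's student, part-time assistant: 張伊婷
Master's student, part-time assistant: 陳建安
Master's student, part-time assistant: 游書豪
Doctoral student, part-time assistant: 徐兆璿
Handling:
1. Public information: this project is open to public inquiry.
2. Has this research found anything seriously detrimental to the public interest: No.
3. Is this report recommended as a policy reference for government agencies: No.

October 28, 2015 (Republic of China year 104)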

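Abstract (translated from the Chinese): With the emergence of large, high-dimensional data sets, the factor model, which assumes that the fluctuations of all variables are determined by only a few important common factors, and its variants have received growing attention in the recent literature. In particular, to make the economic meaning of the factors clearer, the "hierarchical factor model", which incorporates more economic structure, has been introduced into empirical research. However, the hierarchical factor models in the current literature carry two important implicit limitations: first, they can only analyze stationary data; second, the number of factors in each layer must be specified by the researcher in advance so that the subsequent estimation (such as maximum likelihood or Bayesian estimation) can proceed. This two-year project attempts to overcome both limitations while retaining the structural advantages of the hierarchical factor model. Our model allows for possibly non-stationary data, factors and disturbances, and the number of factors in each layer is determined objectively by the data. Moreover, the estimation method we propose relies only on principal component analysis and proceeds layer by layer. Compared with the estimation methods used in most of the literature, it is relatively easy to implement. We have fully examined this model and its extensions, the corresponding limiting properties, and the corresponding simulation designs. In essence, because most data are non-stationary and these non-stationary features may contain common trends worth investigating, the analytical framework proposed in this project is a new attempt in the literature and can compete with the existing hierarchical factor models.

Keywords (translated from the Chinese): common factor, non-stationarity, hierarchical factor model, principal component analysis, variance decomposition

English abstract: When facing large-dimensional data, the factor model, which assumes that the main fluctuations of all variables of interest are driven by only a few common factors, has become popular, and many variants have been introduced in the literature. In particular, to gain a better understanding of the factors, the so-called top-down hierarchical factor model is established by imposing more economic structure on the factors. Nevertheless, there are two limitations in the existing hierarchical factor models: (1) they work for stationary data only, and (2) the number of factors in each layer must be presumed by researchers before employing maximum likelihood estimation or Bayesian methods. This paper aims to get around these limitations while keeping the advantages of the top-down hierarchical factor model. Non-stationary data as well as non-stationary factors and idiosyncratic errors are allowed, the number of factors in each layer is determined by the data instead of being presumed by researchers, and the proposed estimation procedure can be implemented by applying principal component analysis recursively from the top layer to the bottom layer. The corresponding asymptotic properties of the proposed approach are discussed in detail, and good finite-sample performance is shown by Monte Carlo simulations. In essence, the proposed framework is new in the literature and can be a comparable alternative to the existing top-down hierarchical factor models when the data are possibly non-stationary.

English keywords: common factor, non-stationarity, hierarchical factor model, principal component analysis, variance decomposition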

Hierarchical Factor Models with Possible Non-stationary Components

Shih-Hsun Hsu∗

Department of Economics, National Chengchi University

This Version: October 2015

∗ Corresponding author. E-mail: shhsu@nccu.edu.tw; TEL: +886-2-2939-3091 ext. 51667; Address: Department of Economics, National Chengchi University, Taipei 116, Taiwan. Research support from the National Science Council of the Republic of China (Taiwan), NSC 102-2410-H-004-019-MY2, is gratefully acknowledged.

Abstract

When facing large-dimensional data, the factor model, which assumes that the main fluctuations of all variables of interest are driven by only a few common factors, has become popular, and many variants have been introduced in the literature. In particular, to gain a better understanding of the factors, the so-called top-down hierarchical factor model is established by imposing more economic structure on the factors. Nevertheless, there are two limitations in the existing hierarchical factor models: (1) they work for stationary data only, and (2) the number of factors in each layer must be presumed by researchers before employing maximum likelihood estimation or Bayesian methods. This paper aims to get around these limitations while keeping the advantages of the top-down hierarchical factor model. Non-stationary data as well as non-stationary factors and idiosyncratic errors are allowed, the number of factors in each layer is determined by the data instead of being presumed by researchers, and the proposed estimation procedure can be implemented by applying principal component analysis recursively from the top layer to the bottom layer. The corresponding asymptotic properties of the proposed approach are discussed in detail, and good finite-sample performance is shown by Monte Carlo simulations. In essence, the proposed framework is new in the literature and can be a comparable alternative to the existing top-down hierarchical factor models when the data are possibly non-stationary.

Keywords: common factor, non-stationarity, hierarchical factor model, principal component analysis, variance decomposition.

1 Introduction

As information technology improves, thousands of time series with long spans are now available, and how to efficiently extract useful information from these large-dimensional data for inference has drawn the attention of researchers. In particular, the so-called factor model, which assumes that the main fluctuations of all variables of interest are driven by only a few common factors, has become popular in the literature. This project also focuses on issues related to the factor model and its variants.

For stationary time series data $\{y_{it}\}$ ($i = 1, \dots, N_y$; $t = 1, \dots, T$), the conventional factor model assumes that they are driven by a $q \times 1$ vector of unobserved common factors, that is,

$$y_{it} = \Lambda_i' F_t + \varepsilon_{it}, \qquad i = 1, \dots, N_y,\ t = 1, \dots, T, \tag{1}$$

where $F_t = (F_{1t}, F_{2t}, \dots, F_{qt})'$ is a $q \times 1$ vector of unobserved common factors at time $t$, $\Lambda_i = (\lambda_{1i}, \lambda_{2i}, \dots, \lambda_{qi})'$ is the corresponding $q \times 1$ vector of factor loadings, and $\varepsilon_{it}$ is referred to as the idiosyncratic error. In the literature, this model and its variants have been widely employed for inference and forecasting, and they generally perform quite well with large-dimensional data; see, e.g., Stock and Watson (2002, 2006) and Bai and Ng (2008), among others. As is well known, however, because it lacks economic structure, one deficiency of the conventional factor model is that the "meaning" of the factors is sometimes not straightforward, since they merely combine information from all variables with particular weights.

2 The hierarchical factor model

To gain a better understanding of the factors, the so-called top-down hierarchical factor model has been introduced in the literature,^1 especially in the analysis of international business cycles; see, e.g., Gregory et al. (1997) and Kose et al. (2003). To illustrate, let $N$ denote the number of countries, $M$ the number of time series per country, $R$ the number of regions, and $T$ the length of the time series. Given the stationary data $\{y_{i,t}\}$ ($i = 1, \dots, M \times N$; $t = 1, \dots, T$), the three-layer, top-down hierarchical factor model of Kose et al. (2003) for analyzing international business cycles is

$$y_{i,t} = a_i + \lambda_i^W F_t^W + \lambda_i^R F_{r,t}^R + \lambda_i^C F_{n,t}^C + \varepsilon_{i,t}, \tag{2}$$

$$\Phi^W(L) F_t^W = \boldsymbol{u}_t^W, \qquad \Phi_r^R(L) F_{r,t}^R = \boldsymbol{u}_{r,t}^R, \qquad \Phi_n^C(L) F_{n,t}^C = \boldsymbol{u}_{n,t}^C, \tag{3}$$

$$i = 1, \dots, M \times N, \quad t = 1, \dots, T, \quad r = 1, \dots, R, \quad n = 1, \dots, N,$$

where $F_t^W$, $F_{r,t}^R$ and $F_{n,t}^C$, orthogonal to each other, are the world (W), regional (R) and country-specific (C) factors; $\lambda_i^W$, $\lambda_i^R$ and $\lambda_i^C$ are the corresponding factor loadings; $\Phi^W(L)$, $\Phi_r^R(L)$ and $\Phi_n^C(L)$ are generic lag polynomials; and $\boldsymbol{u}_t^W$, $\boldsymbol{u}_{r,t}^R$ and $\boldsymbol{u}_{n,t}^C$ are the corresponding factor innovations, respectively.^2 In particular, it is worth noting that a shock should be labeled as worldwide, regional or country-specific depending on its effects rather than its origin; see, e.g., the illustration of this type of model in Forni and Reichlin (2001) or Kose et al. (2003).

^1 The so-called "multi-level" or "multi-layer" factor models are essentially similar to hierarchical factor models. Besides the top-down specification, there is also a hierarchical factor model with a bottom-up specification in the literature; see, e.g., Diebold, Li and Yue (2008), Ng and Moench (2010) and Moench et al. (2011).

As is easily seen from specification (2), the "meaning" of the factors in each layer is now much easier to understand: they are worldwide, regional or country-specific. Moreover, whereas all variables in the conventional factor model (1) are driven by the same $q$ common factors, the fluctuation of the $i$-th variable $y_{i,t}$ in (2) is explained by only three factors (one for the world, one for its region and one for its country), even though the whole system of $M \times N$ time series is driven by $1 + R + N$ factors. From this viewpoint, the top-down hierarchical factor model can be viewed as a "restricted" factor model obtained by imposing some zero loadings in the conventional factor model (1). Furthermore, because the factors of the different layers are orthogonal, the variance of $y_{i,t}$ can be decomposed as

$$\mathrm{Var}(y_{i,t}) = (\lambda_i^W)^2\,\mathrm{Var}(F_t^W) + (\lambda_i^R)^2\,\mathrm{Var}(F_{r,t}^R) + (\lambda_i^C)^2\,\mathrm{Var}(F_{n,t}^C) + \mathrm{Var}(\varepsilon_{i,t}),$$

which provides an easy way to understand how the fluctuations of the factors contribute to the variance of $y_{i,t}$.
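As a purely illustrative sketch of this variance decomposition, the following Python snippet computes the share of $\mathrm{Var}(y_{i,t})$ attributable to each layer. The function name `variance_shares` and all numerical values are hypothetical and are not taken from Kose et al. (2003); the only assumption carried over from the text is that the factors and the error are mutually orthogonal.

```python
def variance_shares(loadings, factor_vars, error_var):
    """Split Var(y) into the shares due to each layer's factor and the error.

    `loadings` and `factor_vars` are dicts keyed by layer name. The factors
    and the idiosyncratic error are assumed to be mutually orthogonal, so
    the contributions simply add up.
    """
    contributions = {
        layer: loadings[layer] ** 2 * factor_vars[layer] for layer in loadings
    }
    contributions["idiosyncratic"] = error_var
    total = sum(contributions.values())
    return {name: value / total for name, value in contributions.items()}


# Hypothetical loadings and variances, chosen only for illustration.
shares = variance_shares(
    loadings={"world": 0.5, "region": 1.0, "country": 2.0},
    factor_vars={"world": 4.0, "region": 1.0, "country": 0.5},
    error_var=1.0,
)
for name, share in shares.items():
    print(f"{name:>13}: {share:.2f}")
```

With these made-up numbers the country factor accounts for the largest share, even though the world factor has the largest variance, because the shares depend on the squared loadings as well as on the factor variances.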

^2 For comparison with the model proposed below, we have changed some of the notation and the presentation of the model in Kose et al. (2003). Their original model specification and notation can be found in equations (1)–(4) on page 1219 of Kose et al. (2003).

The model specification above can be viewed as a state-space model for the unobservable factors, with (2) as the measurement equation and (3) as the corresponding transition equations. Given the distributions of the idiosyncratic errors $\varepsilon_{i,t}$ and of the factor innovations (usually assumed to be normal), most of the existing methods for estimating the unobserved factors and loadings in hierarchical factor models are therefore either maximum likelihood estimation (MLE) with the Kalman filter or Bayesian estimation by Gibbs sampling. More examples of top-down hierarchical factor models, estimation procedures and properties can be found in, for example, Gregory et al. (1997), Forni and Reichlin (2001), Brook and Del Negro (2006), Kose et al. (2003, 2008), Fu (2007), Del Negro and Otrok (2008), Stock and Watson (2008), Beck et al. (2009), Crucini et al. (2011), He and Liao (2011), Hirata et al. (2011), Mumtaz et al. (2011), Lee (2012), and Bai and Wang (2012).

2.1 The limitations of the existing analysis of hierarchical factor models

As far as the existing hierarchical factor models are concerned, the "meaning" of the factors is much clearer than in conventional factor models, and most of the related empirical studies are impressive and insightful; see, e.g., Kose et al. (2003, 2008) and Crucini et al. (2011). Nevertheless, two limitations are worth noting. The first is that all the existing hierarchical factor models in the literature work only for stationary data. Clearly, not all raw data are stationary, and transforming non-stationary data into stationary data is somewhat arbitrary and even tedious when the data are large-dimensional. Moreover, the co-movement of non-stationary variables via co-integration is crucial for capturing the long-run relationships between variables and has been widely documented in the literature, yet the existing hierarchical factor models cannot take this possibility into account and therefore cannot exploit such long-run relationships for inference. The second limitation is that the number of factors in each layer must be presumed by researchers before employing typical estimation methods such as the Kalman filter or Gibbs sampling. The resulting inference might be misleading if the presumed number of factors for any layer is wrong. In addition, the estimation procedures for the existing hierarchical factor models, either MLE with the Kalman filter or Bayesian estimation by Gibbs sampling, are sometimes tedious because obtaining convergent estimates is usually time-consuming.

[...] previous work is to get around the above limitations while keeping the advantages of the top-down hierarchical factor model. Non-stationary data as well as non-stationary factors and idiosyncratic errors are allowed, the number of factors in each layer is determined by the data instead of being presumed by researchers, and the estimation procedure can be implemented by applying principal component analysis (PCA) recursively.^3 From this perspective, the proposed model specification, estimation procedure and corresponding asymptotics are new in the literature and can be a comparable alternative to the existing top-down hierarchical factor models.

3 The Proposed

To illustrate the proposed methodology for hierarchical factor models with non-stationary components more clearly, we consider a two-layer, top-down hierarchical factor model in what follows. We specify the model, compare it with the existing hierarchical factor models, and propose a corresponding procedure for consistently estimating the unobservable factors and loadings. Extensions to multi-layer models are quite straightforward, so we focus on the two-layer model in this paper.

3.1 Model Specification

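More specifically, let $\boldsymbol{Y}_t$ be an $N \times 1$ vector of standardized variables at time $t$ that can be partitioned into two classes, $\boldsymbol{X}_t$ ($N_x \times 1$) and $\boldsymbol{Z}_t$ ($N_z \times 1$), where $N_x + N_z = N$. A two-layer common factor model for $\boldsymbol{Y}_t$ is

$$\boldsymbol{Y}_t = \begin{bmatrix} \boldsymbol{X}_t \\ \boldsymbol{Z}_t \end{bmatrix} = \begin{bmatrix} \boldsymbol{\alpha}_x \\ \boldsymbol{\alpha}_z \end{bmatrix} + \begin{bmatrix} \boldsymbol{\lambda}_x^{c\prime} \\ \boldsymbol{\lambda}_z^{c\prime} \end{bmatrix} \boldsymbol{F}_t^c + \begin{bmatrix} \boldsymbol{\lambda}^{x\prime} \boldsymbol{F}_t^x \\ \boldsymbol{\lambda}^{z\prime} \boldsymbol{F}_t^z \end{bmatrix} + \begin{bmatrix} \boldsymbol{\varepsilon}_t^x \\ \boldsymbol{\varepsilon}_t^z \end{bmatrix}, \tag{4}$$

where $\boldsymbol{\alpha}_x$ ($N_x \times 1$) and $\boldsymbol{\alpha}_z$ ($N_z \times 1$) are vectors of constants; $\boldsymbol{F}_t^c$, identified as the global factors, is a $q_c \times 1$ vector of factors common to all variables; $\boldsymbol{F}_t^x$ ($q_x \times 1$) and $\boldsymbol{F}_t^z$ ($q_z \times 1$), orthogonal to $\boldsymbol{F}_t^c$, are the factors specific to the classes $\boldsymbol{X}$ and $\boldsymbol{Z}$, respectively; $\boldsymbol{\lambda}_x^c$ ($q_c \times N_x$), $\boldsymbol{\lambda}_z^c$ ($q_c \times N_z$), $\boldsymbol{\lambda}^x$ ($q_x \times N_x$) and $\boldsymbol{\lambda}^z$ ($q_z \times N_z$) are the corresponding matrices of loadings for each layer; and $\boldsymbol{\varepsilon}_t^x = [\varepsilon_{1,t}^x, \dots, \varepsilon_{N_x,t}^x]'$ and $\boldsymbol{\varepsilon}_t^z = [\varepsilon_{1,t}^z, \dots, \varepsilon_{N_z,t}^z]'$ are the idiosyncratic errors related to $\boldsymbol{X}$ and $\boldsymbol{Z}$. Besides, similar to the specification of Bai and Ng (2004) for the conventional factor model, the dynamics of the factors and errors are specified as

$$(1 - L)\boldsymbol{F}_t^m = \boldsymbol{\Phi}^m(L)\,\boldsymbol{u}_t^m, \qquad m = c, x, z, \tag{5}$$

$$(1 - \rho_i^m L)\,\varepsilon_{i,t}^m = \Theta_i^m(L)\,\nu_{i,t}^m, \qquad m = x, z,\ i = 1, \dots, N_m, \tag{6}$$

where, for $m = c, x, z$, $\boldsymbol{u}_t^m$ are the factor innovations with corresponding lag polynomials $\boldsymbol{\Phi}^m(L) = \sum_{l=0}^{\infty} \phi_l^m L^l$; and, for $m = x, z$ and $i = 1, \dots, N_m$, $\nu_{i,t}^m$ are the error innovations with corresponding lag polynomials $\Theta_i^m(L) = \sum_{l=0}^{\infty} \theta_{i,l}^m L^l$.

^3 Beck et al. (2009) consider a similar estimation procedure: they also estimate factors and loadings from the top layer down by employing PCA and OLS recursively. However, their method works only for stationary data, and no asymptotic properties are provided in their paper.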

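In contrast to the three-layer model (2), the two-layer specification in (4) implies that $x_{i,t} = \alpha_{x,i} + \boldsymbol{\lambda}_{x,i}^{c\prime} \boldsymbol{F}_t^c + \boldsymbol{\lambda}_i^{x\prime} \boldsymbol{F}_t^x + \varepsilon_{i,t}^x$ and $z_{i,t} = \alpha_{z,i} + \boldsymbol{\lambda}_{z,i}^{c\prime} \boldsymbol{F}_t^c + \boldsymbol{\lambda}_i^{z\prime} \boldsymbol{F}_t^z + \varepsilon_{i,t}^z$; the variables are driven by the same common factors from the first layer and by their own class-specific factors from the second layer. Instead of the top-down structure, the bottom-up hierarchical factor model proposed in Diebold, Li and Yue (2008), Ng and Moench (2010) or Moench et al. (2011) is specified as

$$\text{first layer: } \begin{cases} \boldsymbol{X}_t = \ddot{\boldsymbol{\alpha}}_x + \ddot{\boldsymbol{\lambda}}^{x\prime} \ddot{\boldsymbol{F}}_t^x + \ddot{\boldsymbol{\varepsilon}}_t^x \\ \boldsymbol{Z}_t = \ddot{\boldsymbol{\alpha}}_z + \ddot{\boldsymbol{\lambda}}^{z\prime} \ddot{\boldsymbol{F}}_t^z + \ddot{\boldsymbol{\varepsilon}}_t^z \end{cases}, \qquad \text{second layer: } \begin{cases} \ddot{\boldsymbol{F}}_t^x = \ddot{\boldsymbol{\lambda}}_x^{c\prime} \ddot{\boldsymbol{F}}_t^c + \ddot{\boldsymbol{\varepsilon}}_t^{cx} \\ \ddot{\boldsymbol{F}}_t^z = \ddot{\boldsymbol{\lambda}}_z^{c\prime} \ddot{\boldsymbol{F}}_t^c + \ddot{\boldsymbol{\varepsilon}}_t^{cz} \end{cases}.$$

Compared with the top-down structure, it is obvious that the factors $\ddot{\boldsymbol{F}}_t^x$ and $\ddot{\boldsymbol{F}}_t^z$ in the first layer are correlated with the second-layer factor $\ddot{\boldsymbol{F}}_t^c$, since $\ddot{\boldsymbol{F}}_t^c$ is the common factor of $\ddot{\boldsymbol{F}}_t^x$ and $\ddot{\boldsymbol{F}}_t^z$.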
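The contrast between the two structures can be illustrated with a small simulation. The sketch below is only an illustration under assumed settings (scalar factors, a bottom-up loading of 1.0, standard normal shocks, and a sample size of 5000, all hypothetical): in the top-down design the class-specific factor is drawn independently of the global factor, whereas in the bottom-up design the class factor is built from the global factor and is therefore correlated with it.

```python
import numpy as np

rng = np.random.default_rng(2)
T = 5000
global_factor = rng.standard_normal(T)

# Top-down: the class-specific factor is drawn independently of the global one.
top_down_fx = rng.standard_normal(T)

# Bottom-up: the class factor loads on the global factor
# (hypothetical loading 1.0) plus its own shock.
bottom_up_fx = 1.0 * global_factor + rng.standard_normal(T)

corr_top_down = np.corrcoef(global_factor, top_down_fx)[0, 1]
corr_bottom_up = np.corrcoef(global_factor, bottom_up_fx)[0, 1]
print(f"top-down  correlation with the global factor: {corr_top_down:+.3f}")
print(f"bottom-up correlation with the global factor: {corr_bottom_up:+.3f}")
```

Under these assumptions the population correlation in the bottom-up design is $1/\sqrt{2}$, while it is zero in the top-down design.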

The notable features of this model specification are as follows. First, the proposed specification is quite general and flexible, since non-stationarity is allowed via the factors in (5) or via the idiosyncratic errors when $\rho_i^m = 1$ in (6); cf. (3). Moreover, weak correlations between the idiosyncratic errors and the factor innovations, and possible dependence between $\boldsymbol{F}_t^x$ and $\boldsymbol{F}_t^z$, are unrestricted. Second, the numbers of factors in each layer, $q_c$, $q_x$ and $q_z$, are not presumed and will be determined by the data. Third, we do not need to worry about how to transform non-stationary data into stationary data when facing large-dimensional data. Fourth, whether a series is stationary depends on the composite effect of the common factors of each layer and its idiosyncratic error. If at least one of the factors driving a series is non-stationary, the series is non-stationary regardless of its idiosyncratic error. On the other hand, if all of its factors are stationary, the series is stationary only when its idiosyncratic error is stationary. Last but not least, by applying standard unit root tests within the proposed model, we can tell whether the non-stationarity is pervasive (due to the global factors $\boldsymbol{F}_t^c$), class-specific (due to the factors $\boldsymbol{F}_t^x$ or $\boldsymbol{F}_t^z$), variable-specific (due to the idiosyncratic error), or some combination of these. In addition, we can further address the issue of co-integration for each layer within this framework.

3.2 Estimation Procedure

For the proposed hierarchical factor models with non-stationary components, we consider a new estimation procedure that involves continuously updating PCA, extending the method of Bai and Ng (2004), which deals with conventional factor models only. In what follows, for an unknown parameter (or variable) of interest, say $\theta$, we denote its estimator at the $m$-th iteration of the continuously updating PCA by $\hat{\theta}(m)$. The proposed estimation procedure consists of the following steps:

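Step 1: Since not all variables in $\boldsymbol{X}_t$ and $\boldsymbol{Z}_t$ are stationary, we first difference the variables in (4):

$$\Delta\boldsymbol{Y}_t = \begin{bmatrix} \Delta\boldsymbol{X}_t \\ \Delta\boldsymbol{Z}_t \end{bmatrix} = \begin{bmatrix} \boldsymbol{\lambda}_x^{c\prime} \\ \boldsymbol{\lambda}_z^{c\prime} \end{bmatrix} \Delta\boldsymbol{F}_t^c + \begin{bmatrix} \boldsymbol{\lambda}^{x\prime} \Delta\boldsymbol{F}_t^x \\ \boldsymbol{\lambda}^{z\prime} \Delta\boldsymbol{F}_t^z \end{bmatrix} + \begin{bmatrix} \Delta\boldsymbol{\varepsilon}_t^x \\ \Delta\boldsymbol{\varepsilon}_t^z \end{bmatrix}, \tag{7}$$

where $\Delta Q_t = Q_t - Q_{t-1}$ for $Q_t = \boldsymbol{X}_t, \boldsymbol{Z}_t, \boldsymbol{F}_t^c, \boldsymbol{F}_t^x, \boldsymbol{F}_t^z, \boldsymbol{\varepsilon}_t^x$ and $\boldsymbol{\varepsilon}_t^z$.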

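Step 2: Under the identification condition $N^{-1}(\boldsymbol{\lambda}_x^c \boldsymbol{\lambda}_x^{c\prime} + \boldsymbol{\lambda}_z^c \boldsymbol{\lambda}_z^{c\prime}) = I_{q_c}$, where $I_{q_c}$ is the $q_c \times q_c$ identity matrix, standard PCA yields the initial estimators for $[\boldsymbol{\lambda}_x^c\ \boldsymbol{\lambda}_z^c]'$, say $[\hat{\boldsymbol{\lambda}}_x^c(0)\ \hat{\boldsymbol{\lambda}}_z^c(0)]'$: they are $\sqrt{N}$ times the eigenvectors corresponding to the $q_c$ largest eigenvalues of the $N \times N$ matrix $\sum_{t=2}^{T} \Delta\boldsymbol{Y}_t \Delta\boldsymbol{Y}_t'$. Consequently, in the first layer of this hierarchical factor model, the initial estimator for $\Delta\boldsymbol{F}_t^c$ is

$$\widehat{\Delta\boldsymbol{F}_t^c}(0) = N^{-1}\left[\hat{\boldsymbol{\lambda}}_x^c(0)\,\Delta\boldsymbol{X}_t + \hat{\boldsymbol{\lambda}}_z^c(0)\,\Delta\boldsymbol{Z}_t\right], \qquad t = 2, \dots, T.$$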

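Step 3: For the second layer of the proposed hierarchical factor model, the initial estimators for the loadings $\boldsymbol{\lambda}^x$ and $\boldsymbol{\lambda}^z$, say $\hat{\boldsymbol{\lambda}}^x(0)$ and $\hat{\boldsymbol{\lambda}}^z(0)$, as well as the initial estimators for the factors $\Delta\boldsymbol{F}_t^x$ and $\Delta\boldsymbol{F}_t^z$, say $\widehat{\Delta\boldsymbol{F}_t^x}(0)$ and $\widehat{\Delta\boldsymbol{F}_t^z}(0)$, can then be obtained by applying PCA as in Step 2 to $\Delta\boldsymbol{X}_t - \hat{\boldsymbol{\lambda}}_x^c(0)'\,\widehat{\Delta\boldsymbol{F}_t^c}(0)$ and $\Delta\boldsymbol{Z}_t - \hat{\boldsymbol{\lambda}}_z^c(0)'\,\widehat{\Delta\boldsymbol{F}_t^c}(0)$, respectively.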

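Step 4: Given the initial estimators for the factors and loadings of the second layer from the previous steps, we again apply PCA to the data

$$\begin{bmatrix} \Delta\boldsymbol{X}_t - \hat{\boldsymbol{\lambda}}^x(0)'\,\widehat{\Delta\boldsymbol{F}_t^x}(0) \\ \Delta\boldsymbol{Z}_t - \hat{\boldsymbol{\lambda}}^z(0)'\,\widehat{\Delta\boldsymbol{F}_t^z}(0) \end{bmatrix}, \tag{8}$$

which are the data after purging the estimated components of the second layer. The resulting PC estimators for the loadings ($[\boldsymbol{\lambda}_x^c\ \boldsymbol{\lambda}_z^c]'$) and factors ($\Delta\boldsymbol{F}_t^c$) in the first layer are denoted by $[\hat{\boldsymbol{\lambda}}_x^c(1)\ \hat{\boldsymbol{\lambda}}_z^c(1)]'$ and $\widehat{\Delta\boldsymbol{F}_t^c}(1)$, respectively.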

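Step 5: Given the updated PC estimators $[\hat{\boldsymbol{\lambda}}_x^c(1)\ \hat{\boldsymbol{\lambda}}_z^c(1)]'$ and $\widehat{\Delta\boldsymbol{F}_t^c}(1)$ in the first layer, the updated PC estimators for the second layer can then be obtained by applying PCA to the purged data $\Delta\boldsymbol{X}_t - \hat{\boldsymbol{\lambda}}_x^c(1)'\,\widehat{\Delta\boldsymbol{F}_t^c}(1)$ and $\Delta\boldsymbol{Z}_t - \hat{\boldsymbol{\lambda}}_z^c(1)'\,\widehat{\Delta\boldsymbol{F}_t^c}(1)$, respectively. This gives the loading estimators $\hat{\boldsymbol{\lambda}}^x(1)$ and $\hat{\boldsymbol{\lambda}}^z(1)$, as well as the factor estimators $\widehat{\Delta\boldsymbol{F}_t^x}(1)$ and $\widehat{\Delta\boldsymbol{F}_t^z}(1)$.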

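Step 6: Continuously update the PC estimators by iterating between Step 4 and Step 5 until the estimators converge, say at the $m^*$-th iteration. The resulting convergent estimators are denoted by $\hat{\boldsymbol{\lambda}}_x^c \equiv \hat{\boldsymbol{\lambda}}_x^c(m^*)$, $\hat{\boldsymbol{\lambda}}_z^c \equiv \hat{\boldsymbol{\lambda}}_z^c(m^*)$, $\hat{\boldsymbol{\lambda}}^x \equiv \hat{\boldsymbol{\lambda}}^x(m^*)$, $\hat{\boldsymbol{\lambda}}^z \equiv \hat{\boldsymbol{\lambda}}^z(m^*)$, $\widehat{\Delta\boldsymbol{F}_t^c} \equiv \widehat{\Delta\boldsymbol{F}_t^c}(m^*)$, $\widehat{\Delta\boldsymbol{F}_t^x} \equiv \widehat{\Delta\boldsymbol{F}_t^x}(m^*)$ and $\widehat{\Delta\boldsymbol{F}_t^z} \equiv \widehat{\Delta\boldsymbol{F}_t^z}(m^*)$. Consequently, the estimators for $\boldsymbol{F}_t^c$, $\boldsymbol{F}_t^x$ and $\boldsymbol{F}_t^z$ are, respectively,

$$\hat{\boldsymbol{F}}_t^c = \sum_{s=2}^{t} \widehat{\Delta\boldsymbol{F}_s^c} = N^{-1}\left[\hat{\boldsymbol{\lambda}}_x^c \boldsymbol{X}_t + \hat{\boldsymbol{\lambda}}_z^c \boldsymbol{Z}_t\right], \qquad \hat{\boldsymbol{F}}_t^x = \sum_{s=2}^{t} \widehat{\Delta\boldsymbol{F}_s^x}, \qquad \hat{\boldsymbol{F}}_t^z = \sum_{s=2}^{t} \widehat{\Delta\boldsymbol{F}_s^z},$$

for $t = 2, \dots, T$.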

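Step 7: Given the estimators for the factors and loadings from the previous steps, we have

$$\widehat{\Delta\boldsymbol{\varepsilon}_t^x} = \Delta\boldsymbol{X}_t - \hat{\boldsymbol{\lambda}}_x^{c\prime}\,\widehat{\Delta\boldsymbol{F}_t^c} - \hat{\boldsymbol{\lambda}}^{x\prime}\,\widehat{\Delta\boldsymbol{F}_t^x}, \qquad \widehat{\Delta\boldsymbol{\varepsilon}_t^z} = \Delta\boldsymbol{Z}_t - \hat{\boldsymbol{\lambda}}_z^{c\prime}\,\widehat{\Delta\boldsymbol{F}_t^c} - \hat{\boldsymbol{\lambda}}^{z\prime}\,\widehat{\Delta\boldsymbol{F}_t^z},$$

and the estimators of the idiosyncratic errors are then

$$\hat{\boldsymbol{\varepsilon}}_t^x = \sum_{s=2}^{t} \widehat{\Delta\boldsymbol{\varepsilon}_s^x}, \qquad \hat{\boldsymbol{\varepsilon}}_t^z = \sum_{s=2}^{t} \widehat{\Delta\boldsymbol{\varepsilon}_s^z}.$$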

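Step 8: Given all the estimators from the previous steps, the estimators for the constants are

$$\hat{\boldsymbol{\alpha}}_x = \boldsymbol{X}_t - \hat{\boldsymbol{\lambda}}_x^{c\prime} \hat{\boldsymbol{F}}_t^c - \hat{\boldsymbol{\lambda}}^{x\prime} \hat{\boldsymbol{F}}_t^x - \hat{\boldsymbol{\varepsilon}}_t^x, \qquad \hat{\boldsymbol{\alpha}}_z = \boldsymbol{Z}_t - \hat{\boldsymbol{\lambda}}_z^{c\prime} \hat{\boldsymbol{F}}_t^c - \hat{\boldsymbol{\lambda}}^{z\prime} \hat{\boldsymbol{F}}_t^z - \hat{\boldsymbol{\varepsilon}}_t^z.$$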
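The iteration in Steps 2 to 6 can be sketched in Python as follows. This is a minimal sketch under stated assumptions rather than the implementation used for the reported results: the function names (`pca`, `hierarchical_pca`, `r_squared`), the stopping rule based on the relative change in the sum of squared residuals, and the simulated design (one factor per layer, loadings drawn from N(1, 1), unit-variance errors, $T = 200$, $N_x = N_z = 40$) are all hypothetical. The panels are stored as $T \times N$ arrays, so the loading matrices in the code are $N \times q$, the transpose of the convention in the text.

```python
import numpy as np


def pca(data, q):
    """PC estimates for a T x N panel: factors (T x q) and loadings (N x q)."""
    n = data.shape[1]
    _, eigvec = np.linalg.eigh(data.T @ data)        # eigenvalues ascending
    loadings = np.sqrt(n) * eigvec[:, ::-1][:, :q]   # sqrt(N) x top-q eigenvectors
    return data @ loadings / n, loadings


def hierarchical_pca(dx, dz, qc, qx, qz, tol=1e-10, max_iter=500):
    """Alternate between the two layers on differenced panels dx, dz (T x N)."""
    nx = dx.shape[1]
    dy = np.hstack([dx, dz])
    fc, lc = pca(dy, qc)                             # Step 2: initial global layer
    fc_initial, ssr_path = fc.copy(), []
    for _ in range(max_iter):
        common = fc @ lc.T
        fx, lx = pca(dx - common[:, :nx], qx)        # Steps 3 and 5: class layers
        fz, lz = pca(dz - common[:, nx:], qz)
        specific = np.hstack([fx @ lx.T, fz @ lz.T])
        ssr_path.append(float(np.sum((dy - common - specific) ** 2)))
        if len(ssr_path) > 1 and ssr_path[-2] - ssr_path[-1] <= tol * ssr_path[-2]:
            break                                    # Step 6: assumed stopping rule
        fc, lc = pca(dy - specific, qc)              # Step 4: update global layer
    return fc_initial, fc, fx, fz, ssr_path


def r_squared(estimate, truth):
    """R^2 from regressing one series on the other (with an intercept)."""
    return float(np.corrcoef(estimate[:, 0], truth[:, 0])[0, 1] ** 2)


# Hypothetical design: the simulated panels play the role of differenced data.
rng = np.random.default_rng(0)
T, nx, nz = 200, 40, 40
d_fc, d_fx, d_fz = (rng.standard_normal((T, 1)) for _ in range(3))
load = [1.0 + rng.standard_normal((n, 1)) for n in (nx, nx, nz, nz)]
dx = d_fc @ load[0].T + d_fx @ load[1].T + rng.standard_normal((T, nx))
dz = d_fc @ load[2].T + d_fz @ load[3].T + rng.standard_normal((T, nz))

fc_initial, fc, fx, fz, ssr_path = hierarchical_pca(dx, dz, 1, 1, 1)
fc_level = np.cumsum(fc, axis=0)                     # cumulate differences (Step 6)
print("R2 of initial global factor :", round(r_squared(fc_initial, d_fc), 3))
print("R2 after iterating          :", round(r_squared(fc, d_fc), 3))
```

Each pass replaces one layer by the best low-rank fit to the data purged of the other layer, so the sum of squared residuals cannot increase from one pass to the next; this is why its relative change is used here as a convergence check.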

Essentially, the consistent estimators for these possibly non-stationary factors are obtained by continuously updating the PC estimators for the first-differenced data from the top layer to the bottom layer. Note also that the number of factors in each layer cannot be consistently estimated by applying the information criteria of Bai and Ng (2002) directly in sequence. We therefore consider the following modified procedure, as suggested in Wang (2010).

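For the first layer, the estimator for the number of factors $q_c$ can be constructed as

$$\hat{q}_c = \hat{q}_1 + \hat{q}_2 - \hat{q}_3, \tag{9}$$

where

$$\hat{q}_1 = \arg\min_{0 \le q \le \bar{q}} \ln(S_x(q)) + q \cdot P(N_x, T),$$
$$\hat{q}_2 = \arg\min_{0 \le q \le \bar{q}} \ln(S_z(q)) + q \cdot P(N_z, T),$$
$$\hat{q}_3 = \arg\min_{0 \le q \le \bar{q}} \ln(S_c(q)) + q \cdot P(N_c, T),$$

with some pre-specified value $\bar{q}$, penalty function $P(N, T)$, and

$$S_x(q) = (N_x T)^{-1} \sum_{t=2}^{T} \left(\Delta\boldsymbol{X}_t - \boldsymbol{\chi}_x(q)\right)' \left(\Delta\boldsymbol{X}_t - \boldsymbol{\chi}_x(q)\right),$$
$$S_z(q) = (N_z T)^{-1} \sum_{t=2}^{T} \left(\Delta\boldsymbol{Z}_t - \boldsymbol{\chi}_z(q)\right)' \left(\Delta\boldsymbol{Z}_t - \boldsymbol{\chi}_z(q)\right),$$
$$S_c(q) = (N_c T)^{-1} \sum_{t=2}^{T} \left(\Delta\boldsymbol{Y}_t - \boldsymbol{\chi}_c(q)\right)' \left(\Delta\boldsymbol{Y}_t - \boldsymbol{\chi}_c(q)\right),$$

are the normalized sums of squared residuals between the specified data and the corresponding common components. Specifically, $\boldsymbol{\chi}_x(q)$, $\boldsymbol{\chi}_z(q)$ and $\boldsymbol{\chi}_c(q)$ are common components of rank $q$, estimated by applying PCA to the data $\Delta\boldsymbol{X}_t$, $\Delta\boldsymbol{Z}_t$ and $\Delta\boldsymbol{Y}_t$, respectively. Given $\hat{q}_c$, the estimator for the number of factors in the first layer, the numbers of factors in the second layer can be determined accordingly as

$$\hat{q}_x = \hat{q}_1 - \hat{q}_c, \qquad \hat{q}_z = \hat{q}_2 - \hat{q}_c. \tag{10}$$

As claimed in Wang (2010), $\hat{q}_c$, $\hat{q}_x$ and $\hat{q}_z$ are consistent for $q_c$, $q_x$ and $q_z$, respectively.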
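A minimal Python sketch of the selection rule in (9) and (10) is given below. The text leaves the penalty $P(N, T)$ unspecified, so the code assumes the $IC_{p2}$-type penalty $\frac{N+T}{NT}\ln(\min\{N, T\})$ of Bai and Ng (2002); this choice, the function names, $\bar{q} = 8$, and the simulated design ($T = 100$, $N_x = N_z = 80$, $q_c = 1$, $q_x = 2$, $q_z = 3$, loadings drawn from N(1, 1), unit-variance errors) are assumptions made for illustration, and the simulated panels play the role of the first-differenced data.

```python
import numpy as np


def ssr_by_rank(data, q_max):
    """S(q), q = 0..q_max: mean squared residual of a T x N panel after
    removing its best rank-q (principal component) approximation."""
    t, n = data.shape
    sv2 = np.linalg.svd(data, compute_uv=False) ** 2
    total = sv2.sum()
    return np.array([(total - sv2[:q].sum()) / (n * t) for q in range(q_max + 1)])


def select_rank(data, q_max):
    """argmin over q of ln S(q) + q * P(N, T)."""
    t, n = data.shape
    # Assumed penalty: the IC_p2-type choice of Bai and Ng (2002).
    penalty = (n + t) / (n * t) * np.log(min(n, t))
    criterion = np.log(ssr_by_rank(data, q_max)) + np.arange(q_max + 1) * penalty
    return int(np.argmin(criterion))


def select_layers(dx, dz, q_max=8):
    """Return the estimates (qc, qx, qz) implied by equations (9) and (10)."""
    q1 = select_rank(dx, q_max)                    # rank of the X panel
    q2 = select_rank(dz, q_max)                    # rank of the Z panel
    q3 = select_rank(np.hstack([dx, dz]), q_max)   # rank of the stacked panel
    q_common = q1 + q2 - q3                        # equation (9)
    return q_common, q1 - q_common, q2 - q_common  # equation (10)


def draw_loadings(rng, n, q):
    """Hypothetical loadings: normal with mean one and unit variance."""
    return 1.0 + rng.standard_normal((n, q))


rng = np.random.default_rng(0)
T, nx, nz = 100, 80, 80
fc = rng.standard_normal((T, 1))   # one global factor
fx = rng.standard_normal((T, 2))   # two factors specific to X
fz = rng.standard_normal((T, 3))   # three factors specific to Z
dx = (fc @ draw_loadings(rng, nx, 1).T + fx @ draw_loadings(rng, nx, 2).T
      + rng.standard_normal((T, nx)))
dz = (fc @ draw_loadings(rng, nz, 1).T + fz @ draw_loadings(rng, nz, 3).T
      + rng.standard_normal((T, nz)))

estimates = select_layers(dx, dz)
print("estimated (qc, qx, qz):", estimates)
```

The bookkeeping works because the X panel has rank $q_c + q_x$, the Z panel has rank $q_c + q_z$, and the stacked panel has rank $q_c + q_x + q_z$, so the global factors are the only ones counted twice in $\hat{q}_1 + \hat{q}_2$.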

4 Asymptotic Properties

Because the proposed estimation procedure, which constructs factors and loadings sequentially from the top layer to the bottom layer, is new in the literature on hierarchical factor models with possibly non-stationary components, the corresponding properties need to be established carefully. Let $\|A\| = \operatorname{trace}(A'A)^{1/2}$. Extending the assumptions of Bai and Ng (2004) and Wang (2010), the regularity conditions for the proposed estimation to be valid are:

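A-1: (i) $\boldsymbol{\lambda}_x^c$ ($q_c \times N_x$), $\boldsymbol{\lambda}_z^c$ ($q_c \times N_z$), $\boldsymbol{\lambda}^x$ ($q_x \times N_x$) and $\boldsymbol{\lambda}^z$ ($q_z \times N_z$) are non-random and satisfy $\|\boldsymbol{\lambda}_x^c\| < \infty$, $\|\boldsymbol{\lambda}_z^c\| < \infty$, $\|\boldsymbol{\lambda}^x\| < \infty$ and $\|\boldsymbol{\lambda}^z\| < \infty$; (ii) $(\boldsymbol{\lambda}_x^c \boldsymbol{\lambda}_x^{c\prime} + \boldsymbol{\lambda}_z^c \boldsymbol{\lambda}_z^{c\prime})/(N_x + N_z)$ converges to a $q_c \times q_c$ positive definite matrix, $(\boldsymbol{\lambda}^x \boldsymbol{\lambda}^{x\prime})/N_x$ converges to a $q_x \times q_x$ positive definite matrix, and $(\boldsymbol{\lambda}^z \boldsymbol{\lambda}^{z\prime})/N_z$ converges to a $q_z \times q_z$ positive definite matrix; (iii) the rank of $[\boldsymbol{\lambda}_x^{c\prime}\ \boldsymbol{\lambda}^{x\prime}]$ is $q_c + q_x$ and the rank of $[\boldsymbol{\lambda}_z^{c\prime}\ \boldsymbol{\lambda}^{z\prime}]$ is $q_c + q_z$.

A-2: (i) $\mathrm{Var}[\Delta\boldsymbol{F}_t^{c\prime}\ \Delta\boldsymbol{F}_t^{x\prime}\ \Delta\boldsymbol{F}_t^{z\prime}]'$ has rank $q_c + q_x + q_z$. For $m = c, x, z$, (ii) $\boldsymbol{u}_t^m \sim iid(0, \boldsymbol{\Sigma}_{u^m})$, $\mathbb{E}\|\boldsymbol{u}_t^m\|^4 < \infty$; (iii) $\mathrm{Var}(\Delta\boldsymbol{F}_t^m) = \sum_{l=0}^{\infty} \phi_l^m \boldsymbol{\Sigma}_{u^m} \phi_l^{m\prime} > 0$; (iv) $\sum_{l=0}^{\infty} l\,\|\phi_l^m\| < \infty$; (v) $\boldsymbol{\Phi}^m(1)$ has rank $q_1^m$, $0 \le q_1^m \le q_m$.

A-3: For $m = x, z$, (i) $\nu_{i,t}^m \sim iid(0, \sigma_{\nu^m}^2)$, $\mathbb{E}|\nu_{i,t}^m|^8 < \infty$, $\sum_{l=0}^{\infty} l\,\|\theta_{i,l}^m\| < \infty$, $\Theta_i^m(1)^2 \sigma_{\nu^m}^2 > 0$; (ii) $\sum_{i=1}^{N_m} |\mathbb{E}[\nu_{i,t}^m \nu_{j,t}^m]| < \infty$ for all $j$; (iii) $\mathbb{E}\left|N_m^{-1/2} \sum_{i=1}^{N_m} \left[\nu_{i,t}^m \nu_{i,s}^m - \mathbb{E}[\nu_{i,t}^m \nu_{i,s}^m]\right]\right|^4 < \infty$ for all $(t, s)$; (iv) $\sum_{i=1}^{N_x} \sum_{i=1}^{N_z} \lvert \mathbb{E}[\nu^x \cdots$ [...]

A-4: (i) $\boldsymbol{u}_t^c$ is independent of $\boldsymbol{u}_t^x$, $\boldsymbol{u}_t^z$ and $\nu_{i,t}^m$, $m = x, z$; (ii) for $m = x, z$, $\boldsymbol{u}_t^m$ is independent of $\nu_{i,t}^m$.

A-5: For $m = c, x, z$, $\mathbb{E}[\|\boldsymbol{F}_0^m\|] < \infty$ and $\mathbb{E}[|\nu_{i,0}^m|] < \infty$, $i = 1, \dots, N_m$.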

Note that, because of the multi-layer top-down factor structure, the above regularity conditions are slightly different from those in Bai and Ng (2004), who focus on a model with a one-layer factor structure; we need to regulate some relationships among factors, factor innovations and errors within and between layers. First, as in Bai and Ng (2002), we only consider non-random factor loadings for each layer for simplicity in Assumption A-1(i). Assumption A-1(ii) ensures that $\boldsymbol{F}_t^c$, $\boldsymbol{F}_t^x$ and $\boldsymbol{F}_t^z$ make nontrivial contributions to the variances of $\boldsymbol{Y}_t$ (all variables), of $\boldsymbol{X}_t$ (class-specific variables) and of $\boldsymbol{Z}_t$ (class-specific variables), respectively. Assumption A-1(iii), as in Assumption B of Wang (2010), guarantees enough heterogeneity among the individual variables within class $x$ or $z$ when responding to both the common and the class-specific factors, so that the factor structure for each layer is identifiable. It also implies that only the global factors $\boldsymbol{F}_t^c$ are common to all variables, while the class-specific factors $\boldsymbol{F}_t^x$ and $\boldsymbol{F}_t^z$ are not pervasive enough to be counted as global factors. Positive definite short-run variances and reduced-rank long-run variances of $\Delta\boldsymbol{F}_t^m$ are regulated under Assumption A-2 for all factors; $q_1^m$ common stochastic trends and $q_m - q_1^m$ stationary factors are allowed for $m = c, x, z$. For the error innovations, Assumption A-3 allows some weak serial and cross-sectional correlation in $(1 - \rho_i^m L)\varepsilon_{i,t}^m$, since $\rho_i^m$ can differ across $i$ in class $m$, $m = x, z$; similarly, the weak correlations between class-specific error innovations are regulated in A-3(iv). Under Assumption A-4, the global factor innovations $\boldsymbol{u}_t^c$ are independent of the other factor and error innovations, and the class-specific factor and error innovations are also independent within class $m$, $m = x, z$, while all these innovations themselves are serially independent as regulated in Assumptions A-2(ii) and A-3(i). Assumption A-5 regulates some initial conditions, as commonly used in unit root analysis. For more detailed discussions of these assumptions, see Bai and Ng (2004) and Wang (2010).

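Given these assumptions, the following result holds.

Theorem 4.1 Given Assumptions A-1 to A-5, and $N_x/N \to \delta$, $0 < \delta < 1$, as $N, N_x, N_z, T \to \infty$:

(a) $\hat{q}_m \xrightarrow{P} q_m$ if (i) $P(N_m, T) \to 0$ and (ii) $C_{N_m,T}^2\, P(N_m, T) \to \infty$, where $C_{N_m,T} = \min\{\sqrt{N_m}, \sqrt{T}\}$.

(b) When $q_m = 1$, $\hat{\boldsymbol{F}}_t^m \xrightarrow{P} \delta \boldsymbol{F}_t^m$ with some constant $\delta$, and $\hat{\varepsilon}_{it}^m \xrightarrow{P} \varepsilon_{it}^m$ for all $i = 1, \dots, N_m$;

(c) When $q_m > 1$, the space spanned by $\hat{\boldsymbol{F}}_t^m$ is a consistent estimate of the space spanned by $\boldsymbol{F}_t^m$;

(d) $\hat{\boldsymbol{\alpha}}_x \xrightarrow{P} \boldsymbol{\alpha}_x$ and $\hat{\boldsymbol{\alpha}}_z \xrightarrow{P} \boldsymbol{\alpha}_z$.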

In essence, Theorem 4.1 (a) is obtained by applying Theorem 2 of Bai and Ng (2002) and the results claimed by Wang (2010); Theorem 4.1 (b) and (c) can be proved by using Lemma 1 and Lemma 2 of Bai and Ng (2004); and the proof of Theorem 4.1 (d) is straightforward provided that Theorem 4.1 (a), (b) and (c) hold. Note that, compared with the estimation and asymptotics for the PANIC model of Bai and Ng (2004), the major concern about the proposed procedure is that the "estimation errors" of the factors and loadings for the first layer affect the properties of the estimators for the second layer, and the induced composite effect further affects the estimation in Step 7 and Step 8. However, these "estimation errors" become smaller after each iteration of the continuously updating PC procedure in practice, and can be neglected in theory if the above regularity conditions and the growth rates of $N_m$ and $T$ hold.

Since the proposed model allows for possibly non-stationary components, it is important and interesting to detect whether they are stationary based on the proposed estimates. For the estimates of the idiosyncratic errors of every variable (Step 7) and the estimates of the factors of each layer when $q_c = q_x = q_z = 1$, Theorem 4.1 (b) guarantees that it is valid to employ standard unit root tests, such as the Dickey-Fuller test or the Phillips-Perron test, for inference. The problem is much more complicated, however, when the number of factors in a layer is greater than one. Take the first layer of the model as an example and, without loss of generality, assume that there are $q_1^c$ non-stationary factors and $(q_c - q_1^c)$ stationary ones when $q_c > 1$. Even though $q_c$ can be consistently estimated by $\hat{q}_c$ via (9), the number of common non-stationary trends will still be overstated if we test each of the first $\hat{q}_c$ factor estimates individually for the presence of a unit root. The reason is that, when there is more than one factor, we can only consistently estimate the space spanned by the factors rather than identify the factors themselves; cf. Theorem 4.1 (c). Any rotation of these factors by a full-rank matrix yields the same space. This immediately implies that more than one common non-stationary trend would be detected after rotating the factors even if $q_1^c$ is exactly equal to one; a similar argument can be found in Bai and Ng (2004). This complicated case is currently under consideration. Moreover, if some estimated factor in a layer is found to be stationary while some variables within this layer are non-stationary, this factor represents a co-integration relationship between these non-stationary variables. Such a long-run relationship may help us make further inference on various topics.
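The rotation argument can be made concrete with a small numerical sketch. The code below is an illustration only: it uses the first-order sample autocorrelation as a crude proxy for persistence, not a formal unit root test, and the rotation angle, sample size and function name `lag1_autocorr` are hypothetical choices. It starts from one random-walk factor and one white-noise factor and shows that, after a full-rank rotation spanning the same space, both rotated series inherit the stochastic trend.

```python
import numpy as np


def lag1_autocorr(series):
    """Sample first-order autocorrelation of a one-dimensional series."""
    centered = series - series.mean()
    return float(centered[1:] @ centered[:-1] / (centered @ centered))


rng = np.random.default_rng(1)
T = 2000
trend = np.cumsum(rng.standard_normal(T))     # I(1) factor (random walk)
stationary = rng.standard_normal(T)           # I(0) factor (white noise)
factors = np.column_stack([trend, stationary])

theta = np.pi / 4                             # hypothetical rotation angle
rotation = np.array([[np.cos(theta), -np.sin(theta)],
                     [np.sin(theta), np.cos(theta)]])
rotated = factors @ rotation                  # same column space as `factors`

print("original:", [round(lag1_autocorr(factors[:, j]), 3) for j in range(2)])
print("rotated :", [round(lag1_autocorr(rotated[:, j]), 3) for j in range(2)])
```

Testing each rotated column separately would therefore point to two trending series, although the underlying system contains only one common stochastic trend.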

5 Simulations

Some selected simulation results are reported in what follows. In the first simulation, we investigate the performance of the estimated number of factors for each layer based on (9) and (10). For the proposed model (4), the settings for the factors and error innovations are

$$\boldsymbol{F}_t^c \sim N(\boldsymbol{0}, I_{q_c}), \quad \boldsymbol{F}_t^x \sim N(\boldsymbol{0}, \sigma^2 I_{q_x}), \quad \boldsymbol{F}_t^z \sim N(\boldsymbol{0}, \sigma^2 I_{q_z}), \quad \boldsymbol{\varepsilon}_t^x \sim N(\boldsymbol{0}, I_{N_x}), \quad \boldsymbol{\varepsilon}_t^z \sim N(\boldsymbol{0}, I_{N_z}),$$

where $e_q$ is the $q \times 1$ vector of ones and $I_q$ is the $q \times q$ identity matrix; all the loadings follow the normal distribution with mean one and unit variance. The settings for the parameters in the selected case are $T = 50$ or $100$, $N_x = N_z = 20$ or $80$, $q_c = 1$, $q_x = 2$ and $q_z = 3$. The accuracy rates of the procedure suggested in Wang (2010) for various values of $\sigma$ are reported in Table 1.

The results in Table 1 show that the procedure works quite well, except in the case $\sigma = 0.25$. When $\sigma = 0.25$, the fluctuation of the class-specific factors is much smaller than that of the corresponding error innovations, which have unit variance, so PCA, given its objective, cannot succeed in separating the common factors from the errors. However, the performance improves markedly once $\sigma \ge 0.5$. In sum, this simulation shows that the procedure based on (9) and (10) yields very good estimates of the number of common factors in each layer, provided that the fluctuation of the factors relative to that of the corresponding error innovations is not too small.

Table 1: Accurate rates for selecting the number of common factors. (A) T = 50 Nx = Nz =20 σ = 0.25 0.5 0.75 1 1.25 1.5 2 F FFc_t 0.5396 0.9373 0.9986 0.9990 0.9993 0.9992 0.9994 FFFx_t 0.0595 0.9315 0.9992 0.9993 0.9994 0.9996 0.9997 F FFz_t 0.0559 0.9342 0.9992 0.9997 0.9999 0.9996 0.9997 (B) T = 50 Nx =Nz =80 σ = 0.25 0.5 0.75 1 1.25 1.5 2 F FFc_t 0.2178 0.9992 1.0000 1.0000 1.0000 1.0000 1.0000 FFFx_t 0.0702 0.9992 1.0000 1.0000 1.0000 1.0000 1.0000 F FFz_t 0.0643 0.9992 1.0000 1.0000 1.0000 1.0000 1.0000 (C) T = 100 Nx =Nz =20 σ = 0.25 0.5 0.75 1 1.25 1.5 2 F FFc_t 0.5610 0.9813 0.9999 1.0000 1.0000 1.0000 1.0000 FFFx_t 0.0786 0.9816 1.0000 1.0000 1.0000 1.0000 1.0000 F FFz_t 0.0804 0.9809 0.9999 1.0000 1.0000 1.0000 1.0000 (D) T = 100 Nx = Nz =80 σ = 0.25 0.5 0.75 1 1.25 1.5 2 F FFc_t 0.2851 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 FFFx_t 0.2404 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 F FFz_t 0.2403 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000


Table 2: $R^2$ for the estimated and true factors.

                     T = 50, N_x = N_z = 20     T = 200, N_x = N_z = 80
σ                    0.5     1       2          0.5     1       2
$\hat{F}^c_t(0)$     0.94    0.73    0.19       0.95    0.74    0.17
$\hat{F}^c_t$        0.98    0.98    0.98       0.98    0.98    0.98
$\hat{F}^x_t(0)$     0.82    0.78    0.49       0.85    0.81    0.51
$\hat{F}^x_t$        0.84    0.87    0.74       0.86    0.89    0.78
$\hat{F}^z_t(0)$     0.83    0.77    0.48       0.85    0.81    0.52
$\hat{F}^z_t$        0.84    0.86    0.73       0.86    0.89    0.78

NOTE: $\hat{F}^m_t(0)$ and $\hat{F}^m_t$ represent the initial and the convergent estimates of the factor $F^m_t$, $m = c, x, z$.

In the second simulation, we investigate the performance of the estimation procedure proposed in Section 3.2. We compare the initial PC estimates of the factors ($\hat{F}^m_t(0)$ for $m = c, x, z$) with the continuously updated PC estimates ($\hat{F}^m_t$ for $m = c, x, z$). In this selected result, we consider $q_c = q_x = q_z = 1$ and ($T = 50$, $N_x = N_z = 20$) or ($T = 200$, $N_x = N_z = 80$). We report the $R^2$ from regressing the true factor on the estimated one. Basically, the $R^2$ of the initial PC estimates depends heavily on $\sigma$, the variance of the class-specific factors. In particular, when $\sigma = 2$, so that the class-specific factors are twice as volatile as the common factor $F^c_t$, the $R^2$ values for the initial PC estimate $\hat{F}^c_t(0)$ are below 0.2. This reflects the fact that applying PCA once cannot separate the common factor from the class-specific factors if the latter are much more volatile. However, once we continuously update the estimates based on Steps 4 and 5 of the procedure until they converge, the results in Table 2 show that the $R^2$ for $\hat{F}^c_t$ is 0.98 even in the case $\sigma = 2$. The proposed estimation procedure can thus yield consistent estimates for the factors in each layer.
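The dependence of the initial estimate on $\sigma$ can be sketched as follows. This is an illustration under stated assumptions, not the report's simulation code: the data-generating process (one common factor, one class-specific factor per block, loadings drawn from N(1, 1), unit-variance errors), the reading of $\sigma$ as the standard deviation of the class-specific factors, and all function names are hypothetical. Only the initial pooled PC estimate is computed; the updating in Steps 4 and 5 is not reproduced here.

```python
import numpy as np


def simulate_two_block_panel(T, N, sigma, rng):
    """Two blocks (x and z) sharing one common factor (assumed design)."""
    Fc = rng.standard_normal(T)               # common factor, unit variance
    Fx = sigma * rng.standard_normal(T)       # class-specific factor of block x
    Fz = sigma * rng.standard_normal(T)       # class-specific factor of block z

    def block(F_class):
        lam_c = 1.0 + rng.standard_normal(N)  # loadings on the common factor
        lam_m = 1.0 + rng.standard_normal(N)  # loadings on the class factor
        E = rng.standard_normal((T, N))       # idiosyncratic errors
        return np.outer(Fc, lam_c) + np.outer(F_class, lam_m) + E

    return Fc, np.hstack([block(Fx), block(Fz)])


def first_pc(X):
    """First principal component (a T-vector) of the demeaned panel."""
    Xc = X - X.mean(axis=0)
    U, _, _ = np.linalg.svd(Xc, full_matrices=False)
    return U[:, 0]


def r_squared(y, x):
    """R^2 from regressing y on x with an intercept (squared correlation)."""
    return float(np.corrcoef(y, x)[0, 1] ** 2)


def mean_initial_r2(sigma, T=200, N=80, reps=20, seed=0):
    """Average R^2 between the true common factor and the pooled first PC."""
    rng = np.random.default_rng(seed)
    values = []
    for _ in range(reps):
        Fc, X = simulate_two_block_panel(T, N, sigma, rng)
        values.append(r_squared(Fc, first_pc(X)))
    return float(np.mean(values))


if __name__ == "__main__":
    for s in (0.5, 1.0, 2.0):
        print(f"sigma = {s}: mean R^2 of initial PC estimate = {mean_initial_r2(s):.2f}")
```

In this sketch the pooled first principal component mixes the common and class-specific factors, and the mixing becomes more severe as $\sigma$ grows, which is the qualitative pattern of the initial estimates in Table 2.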


6 Concluding Remarks

In this paper, we study hierarchical factor models with possibly non-stationary components. We show how to estimate this kind of factor model (as well as the number of factors in each layer), state the conditions required for consistent estimation, and derive the corresponding asymptotic theory. Several simulations are designed to investigate the finite-sample performance of the proposed procedure, and the selected results (as well as other unreported ones) show that the proposed procedure works very well. Two empirical studies based on the proposed model are also considered: one focuses on the global CDS (Credit Default Swap) indexes, and the other deals with the GVAR dataset, which involves several real- and nominal-sector variables from around the world. The preliminary results for the estimated global and regional factors are quite interesting, and the corresponding academic papers are now in progress.

References

Bai, J. and P. Wang (2012), Identification and estimation of dynamic factor models, Working paper.

Bai, J. and S. Ng (2002), Determining the number of factors in approximate factor models, Econometrica, 70, 191–221.

Bai, J. and S. Ng (2004), A PANIC attack on unit roots and cointegration, Econometrica, 72(4), 1127–1177.

Bai, J. and S. Ng (2008), Large dimensional factor analysis, Foundations and Trends in Econometrics, 3, 89–163.

Beck, G. W., K. Hubrich and M. Marcellino (2009), Regional inflation dynamics within and across Euro Area and a comparison with the United States, Economic Policy, 143–184.

Bernanke, B.S., J. Boivin and P. Eliasz (2005), Measuring the effects of monetary policy: A factor-augmented vector autoregressive (FAVAR) approach, Quarterly Journal of Economics, 120, 387–422.


comovement, Review of Finance, 10, 69–98.

Crucini, M.J., M.A. Kose and C. Otrok (2011), What are the driving forces of international business cycles? Review of Economic Dynamics, 14, 156–175.

Del Negro, M. and C. Otrok (2008), Dynamic factor models with time-varying parameters: measuring changes in international business cycles, Working paper.

Diebold, F.X., C. Li and V. Z. Yue (2008), Global yield curve dynamics and interactions: a dynamic Nelson-Siegel approach, Journal of Econometrics, 146, 351–363.

Forni, M. and L. Reichlin (2001), Federal policies and local economies: Europe and the US, European Economic Review, 45, 109–134.

Fu, D. (2007), National, regional and metro-specific factors of the U.S. housing market, Working paper.

Gregory, A.W., A.C. Head, and J. Raynauld (1997), Measuring world business cycles, International Economic Review, 38(3), 677–701.

He, D. and W. Liao (2011), Asian business cycle synchronization, Pacific Economic Review, 17(1), 106–135.

Hirata, H., M.A. Kose and C. Otrok (2011), Regionalization vs. globalization, Working paper.

Kose, M.A., C. Otrok, and C.H. Whiteman (2003), International business cycles: world, region, and country-specific factors, The American Economic Review, 93(4), 1216–1239.

Kose, M.A., C. Otrok, and C.H. Whiteman (2008), Understanding the evolution of world business cycles, Journal of International Economics, 75, 110–130.

Lee, J. (2012), Measuring business cycle comovements in Europe: Evidence from a dynamic factor model with time-varying parameters, Economics Letters, 115, 438–440.

Moench, E., S. Ng and M. Potter (2011), Dynamic hierarchical factor models, Working paper.

Mumtaz, H., S. Simonelli and P. Surico (2011), International comovements, business cycle and inflation: A historical perspective, Review of Economic Dynamics, 14, 176–198.

Ng, S. and E. Moench (2010), A hierarchical factor analysis of US housing market dynamics, Working paper.


Pesaran, M.H. and Y. Shin (1999), An autoregressive distributed lag modelling approach to cointegration analysis, in S. Strom (ed), Econometrics and Economic Theory in the 20th Century: The Ragnar Frisch Centennial Symposium, Cambridge, Cambridge University Press, Ch.11, 371–413.

Stock, J.H. and M.W. Watson (1998), Diffusion indexes, NBER Working Paper 6702.

Stock, J.H. and M.W. Watson (2002), Macroeconomic forecasting using diffusion indexes, Journal of Business and Economic Statistics, 20, 147–162.

Stock, J.H. and M.W. Watson (2006), Forecasting with many predictors, In G. Elliott, C. W. J. Granger, and A. Timmermann (Eds.): Handbook of Economic Forecasting, Volume 1, Amsterdam: Elsevier, 515–554.

Stock, J.H. and M.W. Watson (2008), The evolution of national and regional factors in U.S., Working paper.

Wang, P. (2010), Large Dimensional Factor Models with a Multi-Level Factor Structure: Identification, Estimation and Inference, Working paper.


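Ministry of Science and Technology Funded Project: Dissemination Data Sheet for Derived R&D Results

Date: 2015/10/21

Ministry of Science and Technology funded project

Project title: Hierarchical Factor Models with Non-Stationary Series
Principal investigator: 徐士勛
Project number: 102-2410-H-004-019-MY2
Field: Mathematical and Quantitative Methods

No R&D results available for dissemination.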


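Summary Table of Research Outcomes for the Year 102 Research Project

Principal investigator: 徐士勛
Project number: 102-2410-H-004-019-MY2
Project title: Hierarchical Factor Models with Non-Stationary Series

Columns: achieved (accepted or published); expected total (including achieved); actual contribution of this project; unit; remarks (qualitative notes, e.g. outcomes shared by several projects, or outcomes featured as a journal cover story).

Domestic
Item                                      Achieved  Expected  Contribution  Unit              Remarks
Publications
  Journal articles                        0         0         100%          articles
  Research/technical reports              1         1         100%                            Remark 1
  Conference papers                       0         0         100%
  Books                                   0         0         100%          chapters/volumes
Patents
  Applications pending                    0         0         100%          cases
  Granted                                 0         0         100%
Technology transfer
  Cases                                   0         0         100%          cases
  Royalties                               0         0         100%          thousand NTD
Project personnel (domestic nationals)
  Master's students                       4         0         100%          person-times      Remark 2
  Doctoral students                       1         2         100%                            Remark 2
  Postdoctoral researchers                0         0         100%
  Full-time assistants                    0         0         100%

Remark 1: We divide the research results of this project into two parts, one focusing on the theoretical model and its properties and the other on empirical studies. The related academic papers are currently being prepared.

Remark 2: To keep the project running smoothly and to accommodate the actual availability of students, the two part-time doctoral assistant positions originally requested for each year were changed to one doctoral student and four master's-level part-time assistants.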


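Item                                      Achieved  Expected  Contribution  Unit
  Conference papers                       0         0         100%
  Books                                   0         0         100%          chapters/volumes
Patents
  Applications pending                    0         0         100%          cases
  Granted                                 0         0         100%
Technology transfer
  Cases                                   0         0         100%          cases
  Royalties                               0         0         100%          thousand NTD
Project personnel (foreign nationals)
  Master's students                       0         0         100%          person-times
  Doctoral students                       0         0         100%
  Postdoctoral researchers                0         0         100%
  Full-time assistants                    0         0         100%

Other outcomes (outcomes that cannot be expressed quantitatively, such as organizing academic activities, receiving awards, important international collaborations, the international impact of the research results, and other concrete benefits that assist the development of industrial technology; to be described in text): None

Additional items for Department of Science Education projects
Item                                                               Quantity  Name or brief description
Assessment tools (qualitative and quantitative)                    0
Courses/modules                                                    0
Computer and network systems or tools                              0
Teaching materials                                                 0
Activities/competitions held                                       0
Seminars/workshops                                                 0
E-newsletters, websites                                            0
Number of participants (audience) in dissemination of results      0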

