A novel algorithm for dynamic factor analysis


Jih-Jeng Huang (a), Gwo-Hshiung Tzeng (b,c,*), Chorng-Shyong Ong (a)

(a) Department of Information Management, National Taiwan University, Taipei, Taiwan, ROC

(b) Institute of Management of Technology, National Chiao Tung University, 1001 Ta-Hsuch Road, Hsinchu 300, Taiwan, ROC

(c) Department of Business Administration, Kainan University, Taoyuan, Taiwan, ROC

Abstract

In this paper, a dynamic factor model is proposed to extract the dynamic factors from time series data. To deal with the problem of scaling, the cross-correlation matrices (CCM) are first employed to cluster the time series data. Then, the dynamic factors are extracted using a revised independent component analysis (ICA). In addition, a numerical study is used to demonstrate the proposed method. On the basis of the simulated results, we conclude that the proposed method can extract effective dynamic factors.

Keywords: Dynamic factor model; Factor analysis; Cross-correlation matrices (CCM); Independent component analysis (ICA); Time series

(*) Corresponding author. Address: Institute of Management of Technology, National Chiao Tung University, 1001 Ta-Hsuch Road, Hsinchu 300, Taiwan, ROC.

E-mail address: u5460637@ms16.hinet.net (G.-H. Tzeng).


1. Introduction

Dynamic factor analysis (DFA), which was proposed by Engle and Watson [1,2], is a dimension-reduction approach for extracting the common trends of time series data. The mathematical formulation of DFA can be described as follows. Let the multivariate time series vector at time $t$ be $y_t$. Then, the dynamic factor model can be formulated as

$$y_t = C\alpha_t + \varepsilon_t, \tag{1}$$

$$\alpha_t = \alpha_{t-1} + \eta_t, \tag{2}$$

where $C$ denotes the factor loading matrix, $\alpha_t \sim N(a_t, m_t)$ denotes the common trends at time $t$, $\varepsilon_t \sim N(0, \sigma_\varepsilon)$ is the noise component of the observations, with $\sigma_\varepsilon$ the diagonal error covariance matrix, and $\eta_t \sim N(0, \sigma_\eta)$ is the noise component of the trends. In addition, $\alpha_t$, $\varepsilon_t$, and $\eta_t$ are independent of each other.
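To make Eqs. (1) and (2) concrete, the following is a minimal simulation sketch. It assumes scalar noise standard deviations (`sigma_eps`, `sigma_eta`), a zero initial trend, and a hypothetical function name `simulate_dfm`; none of these choices come from the paper.

```python
import numpy as np

def simulate_dfm(C, n_steps, sigma_eps=0.1, sigma_eta=1.0, seed=0):
    """Simulate y_t = C a_t + eps_t with random-walk trends a_t = a_{t-1} + eta_t.

    C is an (n_series, n_trends) loading matrix. The scalar noise scales and
    the zero starting value of the trends are illustrative assumptions.
    """
    rng = np.random.default_rng(seed)
    C = np.asarray(C, dtype=float)
    n_series, n_trends = C.shape
    # Trend equation (2): cumulative sum of Gaussian innovations.
    eta = rng.normal(0.0, sigma_eta, size=(n_steps, n_trends))
    alpha = np.cumsum(eta, axis=0)
    # Observation equation (1): loadings times trends plus noise.
    eps = rng.normal(0.0, sigma_eps, size=(n_steps, n_series))
    y = alpha @ C.T + eps
    return y, alpha

# Three observed series driven by one common trend.
C = np.array([[1.0], [0.5], [-0.8]])
y, alpha = simulate_dfm(C, n_steps=200)
```

With a single trend and small observation noise, the three simulated series are close to scaled copies of the same random walk, which is the structure that DFA tries to recover.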

Although DFA has been successfully applied in economics [3–6] and psychology [7,8], two main problems should be considered when adopting DFA in practice. First, the computational cost of estimating the parameters in DFA is usually heavy. Several papers have reported that DFA is only suitable for small-scale time series data [9,10]. Although several algorithms, such as the Markov chain Monte Carlo method [11,12] and the EM algorithm [9,10], have been proposed to deal with this problem, these methods cannot truly overcome the problem of scaling. Second, the conventional DFA only extracts the linear common trends among time series data using second-order statistics. However, the information in higher-order statistics should also be considered in order to reflect the complex systems encountered in practice.

In this paper, a novel algorithm is proposed to deal with both problems simultaneously. First, in order to overcome the problem of scaling, the cross-correlation matrices (CCM) [13] are used to cluster the time series variables into segments. Next, a revised independent component analysis (ICA) is used to extract the dynamic factors from the different segments. Sixteen daily indices of stock markets and foreign currency exchange rates from 1995 to 1997 are used to demonstrate the proposed method. In addition, the dynamic factors are used to predict the daily indices, and the results are compared with those of the dynamic regression model. On the basis of the simulated results, we conclude that the proposed method can extract the important common trends among time series data and provide accurate predictions.

The remainder of this paper is organized as follows. The dynamic factor model is proposed in Section 2. A numerical example, which is used to illustrate the proposed method and compare with the dynamic regression model, is presented in Section 3. Discussion and conclusions are in the last section.


2. Dynamic factor model

In order to derive the dynamic factors, the CCM [13] is first employed to calculate the correlation of the multivariate time series so that we can cluster the variables into several segments to reduce the computational cost. Next, the dynamic factors can be derived using the ICA approach.

2.1. Cross-correlation matrices

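Consider the multivariate time series $Z_t$ and the mean vector $\mu$. Then the cross-covariance matrices at the $l$th lag can be defined as

$$
R_l = \operatorname{Cov}(Z_t, Z_{t-l}) = E\left[(Z_t - \mu)(Z_{t-l} - \mu)'\right]
= E\left\{
\begin{bmatrix} z_{1t} - \mu_1 \\ z_{2t} - \mu_2 \\ \vdots \\ z_{kt} - \mu_k \end{bmatrix}
\left[ z_{1(t-l)} - \mu_1,\; z_{2(t-l)} - \mu_2,\; \ldots,\; z_{k(t-l)} - \mu_k \right]
\right\}
= \begin{bmatrix}
v_{11}(l) & v_{12}(l) & \cdots & v_{1k}(l) \\
v_{21}(l) & v_{22}(l) & \cdots & v_{2k}(l) \\
\vdots & \vdots & \ddots & \vdots \\
v_{k1}(l) & v_{k2}(l) & \cdots & v_{kk}(l)
\end{bmatrix}. \tag{3}
$$

On the basis of the cross-covariance matrices, we can obtain the CCM as follows:

$$
P_l = \begin{bmatrix}
\rho_{11}(l) & \rho_{12}(l) & \cdots & \rho_{1k}(l) \\
\rho_{21}(l) & \rho_{22}(l) & \cdots & \rho_{2k}(l) \\
\vdots & \vdots & \ddots & \vdots \\
\rho_{k1}(l) & \rho_{k2}(l) & \cdots & \rho_{kk}(l)
\end{bmatrix}, \tag{4}
$$

where

$$
\rho_{ij}(l) = \frac{v_{ij}(l)}{\left[v_{ii}(l)\, v_{jj}(l)\right]^{1/2}}. \tag{5}
$$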

By detecting the coefficients of the CCM, we can cluster the correlated time series variables into several segments. Next, we can introduce the ICA method and present how the dynamic factors can be obtained using ICA.
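As an illustration of how a sample CCM might be computed, the sketch below estimates the lag-$l$ cross-covariances from data and scales them to correlations. It is an assumption-laden sketch rather than the authors' implementation: the function name `cross_correlation_matrix` is hypothetical, and the scaling uses the lag-0 variances, which is the usual sample convention and differs from Eq. (5) as printed, where the lag-$l$ diagonal terms appear in the denominator.

```python
import numpy as np

def cross_correlation_matrix(Z, lag=1):
    """Sample cross-correlation matrix at a given lag.

    Z has shape (T, k): T observations of k series. Entry (i, j) estimates
    the correlation between z_{i,t} and z_{j,t-lag}. Scaling by lag-0
    variances is an assumption (the common sample convention).
    """
    Z = np.asarray(Z, dtype=float)
    T = Z.shape[0]
    D = Z - Z.mean(axis=0)                 # subtract the mean vector
    V = D[lag:].T @ D[:T - lag] / T        # lag-l cross-covariance matrix
    var0 = (D * D).sum(axis=0) / T         # lag-0 variances
    return V / np.sqrt(np.outer(var0, var0))

# Example: three random walks, two of which share a common component.
rng = np.random.default_rng(1)
common = np.cumsum(rng.normal(size=500))
Z = np.column_stack([
    common + rng.normal(scale=0.5, size=500),
    common + rng.normal(scale=0.5, size=500),
    np.cumsum(rng.normal(size=500)),
])
P1 = cross_correlation_matrix(Z, lag=1)
```

At `lag=0` the function reduces to the ordinary correlation matrix, which gives a quick sanity check on the scaling.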


2.2. Independent component analysis

ICA [14,15] is a statistical tool for extracting the independent components (ICs) from an observed multivariate time series. ICA has been applied to many real-world problems such as signal processing [16,17], magnetoencephalography (MEG) [18], and image analysis [19,20]. The concepts of ICA can be described as follows. Let a time signal vector be $x_t = \{x_1, x_2, \ldots, x_n\}$. The ICA model can then be formulated as

$$x_t = A s_t, \tag{6}$$

where $A$ denotes the unknown mixing matrix and $s_t$ denotes the sources. The problem of ICA is to extract the IC vector, $y_t$, as an estimate of the source vector, $s_t$, from the observed signal vector, $x_t$.

We can depict the problem above as shown in Fig. 1.

In order to derive the ICs, we can calculate the demixing matrix, $W$, such that

$$y_t = W x_t = W A s_t. \tag{7}$$

Therefore, if we can find $W = A^{-1}$, then $y_t = s_t$, and perfect separation occurs. It should be highlighted that the conventional ICA only deals with random variables and cannot handle time series data. In this paper, a revised ICA, which was proposed in [21,22], is adopted to deal with non-stationary and temporally correlated data.
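To illustrate how temporal structure can be used to estimate a demixing matrix $W$, the sketch below implements AMUSE, a simple second-order blind source separation procedure (whitening followed by an eigendecomposition of a symmetrized lagged covariance matrix). This is a stand-in chosen for brevity, not the revised ICA of [21,22]; the function name `amuse`, the chosen lag, and the synthetic sources are all assumptions.

```python
import numpy as np

def amuse(X, lag=1):
    """Separate temporally correlated sources from mixtures X (n_signals, T).

    Returns (Y, W) with Y = W @ (X - mean). Requires lag >= 1 and sources
    whose autocorrelations at the chosen lag are distinct.
    """
    X = np.asarray(X, dtype=float)
    Xc = X - X.mean(axis=1, keepdims=True)
    T = Xc.shape[1]
    # Step 1: whiten the mixtures so that their covariance is the identity.
    d, E = np.linalg.eigh(Xc @ Xc.T / T)
    Q = (E / np.sqrt(d)).T
    Z = Q @ Xc
    # Step 2: diagonalize the symmetrized lagged covariance of the whitened data.
    C = Z[:, lag:] @ Z[:, :-lag].T / (T - lag)
    C = (C + C.T) / 2.0
    _, V = np.linalg.eigh(C)
    W = V.T @ Q
    return W @ Xc, W

# Two sinusoidal sources with different frequencies, mixed as in Eq. (6).
t = np.arange(2000)
S = np.vstack([np.sin(2 * np.pi * 0.01 * t),
               np.sin(2 * np.pi * 0.037 * t + 1.0)])
A = np.array([[1.0, 0.5],
              [0.3, 1.0]])
X = A @ S
Y, W = amuse(X, lag=5)
```

As with any blind separation method, the recovered components match the sources only up to ordering, sign, and scale, so $WA$ is expected to be close to a scaled permutation matrix rather than the identity.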

In addition, although ICA and principal component analysis (PCA) share some common characteristics, such as building a generative model and performing dimension reduction, PCA only processes the second-order dependencies in the data. ICA, in contrast, is a generalization of PCA that separates the higher-order dependencies in the data. Moreover, conventional PCA can only deal with random variable data rather than time series data. The proposed algorithm is depicted in Fig. 2.

In the next section, a numerical study is used to demonstrate the proposed method.

3. Numerical study

In this section, 16 daily indices of stock markets and foreign currency exchange rates from 1995 to 1997, including Amsterdam, Frankfurt, Hong Kong, London, New York, Paris, Singapore, Tokyo, and so on, are used to extract the dynamic factors. These daily indices are plotted in Fig. 3.

Fig. 1. [Diagram: source signal $s_t$ → $A$ (mixing matrix) → $x_t$ → $W$ (demixing matrix) → IC $y_t$.]

To cluster the indices above and reduce the computational cost, the CCM is used to calculate the correlation among the indices, as shown in Table 2. On the basis of the CCM, we can cluster these indices into three segments, as shown in Table 1.
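The clustering rule itself is not spelled out in the text, so the following sketch shows only one plausible way to group variables from a correlation matrix: link any two variables whose absolute correlation reaches a threshold and take the connected groups. The function name `cluster_by_correlation` and the threshold of 0.7 are hypothetical, and the rule is not claimed to reproduce Table 1. The example values are four entries taken from Table 2 as printed.

```python
import numpy as np

def cluster_by_correlation(P, names, threshold=0.7):
    """Group variables linked by |correlation| >= threshold (connected components).

    The thresholding rule is an illustrative assumption, not the authors' rule.
    """
    P = np.asarray(P, dtype=float)
    k = len(names)
    parent = list(range(k))

    def find(i):
        # Follow parent links to the representative of i's group.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(k):
        for j in range(i + 1, k):
            if abs(P[i, j]) >= threshold:
                parent[find(i)] = find(j)   # merge the two groups

    groups = {}
    for i in range(k):
        groups.setdefault(find(i), []).append(names[i])
    return sorted(groups.values(), key=len, reverse=True)

names = ["AMSTEOE", "DAXINDX", "JAPDOWA", "AUSTRUS"]
P = np.array([
    [1.000, 0.997, 0.048, 0.219],
    [0.997, 1.000, 0.017, 0.256],
    [0.048, 0.017, 1.000, 0.739],
    [0.219, 0.256, 0.739, 1.000],
])
clusters = cluster_by_correlation(P, names, threshold=0.7)
```

On this four-variable subset, the two European stock indices fall into one group and JAPDOWA and AUSTRUS into another. The result depends strongly on the threshold, which would need to be chosen with care on the full 16-variable matrix.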

Next, we extract the dynamic factors from the segments using ICA. Since cluster 1 contains many indices, we extract two dynamic factors from cluster 1, as shown in Fig. 4. However, only one dynamic factor is extracted from each of clusters 2 and 3, as shown in Figs. 5 and 6.

Next, the dynamic regression (DR) model is employed to test the efficiency of the proposed method. First, we select six variables as dependent variables; in each of six dynamic regression models, the other 15 variables are used to predict the dependent variable. Next, we use the same dependent variables but take the dynamic factors as the independent variables in another six dynamic regression models. Finally, we use the Akaike information criterion (AIC), the Hannan-Quinn criterion (HQC), the corrected AIC (AICC), and the Schwarz Bayesian criterion (SBC) to compare the proposed method with the dynamic regression model, as shown in Table 3.
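For reference, the four criteria can be computed from a fitted model's maximized log-likelihood using their standard textbook definitions, as in the sketch below. The function name `information_criteria` is hypothetical, and the values in Table 3 may be reported on a different scale (for example, divided by the number of observations), so this sketch is not expected to reproduce them.

```python
import math

def information_criteria(log_likelihood, n_obs, n_params):
    """Standard definitions of AIC, HQC, AICC, and SBC (smaller is better).

    log_likelihood: maximized log-likelihood of the fitted model.
    n_obs: number of observations; n_params: number of estimated parameters.
    """
    base = -2.0 * log_likelihood
    aic = base + 2.0 * n_params
    hqc = base + 2.0 * n_params * math.log(math.log(n_obs))
    sbc = base + n_params * math.log(n_obs)
    # Small-sample correction of the AIC.
    aicc = aic + 2.0 * n_params * (n_params + 1) / (n_obs - n_params - 1)
    return {"AIC": aic, "HQC": hqc, "AICC": aicc, "SBC": sbc}

ic = information_criteria(log_likelihood=-100.0, n_obs=50, n_params=3)
```

All four criteria share the same goodness-of-fit term and differ only in how heavily they penalize the number of parameters, which is why a model with 15 regressors can fit slightly better yet score worse on SBC than a model with one or two factors.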

On the basis of the simulated results, we can conclude that the dynamic factor model achieves almost the same accuracy as the dynamic regression model. This indicates that the dynamic factors can be extracted and that they reflect the common trends of the multivariate time series. Next, we provide an in-depth discussion based on our implementation.

Fig. 2. [Flow diagram: time series $x_{1t}, x_{2t}, \ldots, x_{nt}$ → CCM → Cluster 1, Cluster 2, ..., Cluster m → ICA → DF 1, DF 2, ..., DF k.]

4. Discussion and conclusions

Dynamic factor analysis is a useful tool for extracting the common trends among time series data. The dynamic factors are useful for the decision-maker.

Fig. 3. The trend chart of the 16 daily indices.

Table 1
Cluster for multivariate time series

| Cluster | Variables |
|---|---|
| Cluster 1 | AMSTEOE, DAXINDX, FRCAC40, FTSE100, HNGKNGI, SPCOMP, DTCHGUS, FRNFRUS, GERMDUS, JAPYNUS, SWISFUS |
| Cluster 2 | JAPDOWA, AUSTRUS, CDNDLUS |
| Cluster 3 | SNGALLS, BRITPUS |


Table 2
The cross-correlation matrix at the first lag

| Variable | AMSTEOE | DAXINDX | FRCAC40 | FTSE100 | HNGKNGI | JAPDOWA | SNGALLS | SPCOMP | AUSTRUS | BRITPUS | CDNDLUS | DTCHGUS | FRNFRUS | GERMDUS | JAPYNUS | SWISFUS |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| AMSTEOE | 1.000 | 0.997 | 0.984 | 0.980 | 0.820 | 0.048 | 0.497 | 0.982 | 0.219 | 0.582 | 0.326 | 0.964 | 0.946 | 0.963 | 0.836 | 0.918 |
| DAXINDX | 0.997 | 1.000 | 0.984 | 0.977 | 0.803 | 0.017 | 0.521 | 0.975 | 0.256 | 0.581 | 0.359 | 0.966 | 0.953 | 0.965 | 0.827 | 0.918 |
| FRCAC40 | 0.984 | 0.984 | 1.000 | 0.957 | 0.799 | 0.027 | 0.508 | 0.956 | 0.243 | 0.619 | 0.340 | 0.963 | 0.953 | 0.962 | 0.798 | 0.938 |
| FTSE100 | 0.980 | 0.977 | 0.957 | 1.000 | 0.815 | 0.060 | 0.475 | 0.991 | 0.226 | 0.531 | 0.253 | 0.918 | 0.889 | 0.917 | 0.851 | 0.860 |
| HNGKNGI | 0.820 | 0.803 | 0.799 | 0.815 | 1.000 | 0.365 | 0.081 | 0.826 | 0.215 | 0.413 | 0.093 | 0.759 | 0.730 | 0.761 | 0.714 | 0.765 |
| JAPDOWA | 0.048 | 0.017 | 0.027 | 0.060 | 0.365 | 1.000 | 0.599 | 0.091 | 0.739 | 0.400 | 0.336 | 0.041 | 0.022 | 0.048 | 0.269 | 0.021 |
| SNGALLS | 0.497 | 0.521 | 0.508 | 0.475 | 0.081 | 0.599 | 1.000 | 0.449 | 0.650 | 0.648 | 0.525 | 0.490 | 0.520 | 0.482 | 0.248 | 0.435 |
| SPCOMP | 0.982 | 0.975 | 0.956 | 0.991 | 0.826 | 0.091 | 0.449 | 1.000 | 0.197 | 0.564 | 0.248 | 0.919 | 0.887 | 0.918 | 0.865 | 0.867 |
| AUSTRUS | 0.219 | 0.256 | 0.243 | 0.226 | 0.215 | 0.739 | 0.650 | 0.197 | 1.000 | 0.295 | 0.531 | 0.190 | 0.238 | 0.184 | 0.062 | 0.094 |
| BRITPUS | 0.582 | 0.581 | 0.619 | 0.531 | 0.413 | 0.400 | 0.648 | 0.564 | 0.295 | 1.000 | 0.264 | 0.541 | 0.548 | 0.534 | 0.412 | 0.597 |
| CDNDLUS | 0.326 | 0.359 | 0.340 | 0.253 | 0.093 | 0.336 | 0.525 | 0.248 | 0.531 | 0.264 | 1.000 | 0.423 | 0.463 | 0.419 | 0.223 | 0.374 |
| DTCHGUS | 0.964 | 0.966 | 0.963 | 0.918 | 0.759 | 0.041 | 0.490 | 0.919 | 0.190 | 0.541 | 0.423 | 1.000 | 0.993 | 1.000 | 0.849 | 0.976 |
| FRNFRUS | 0.946 | 0.953 | 0.953 | 0.889 | 0.730 | 0.022 | 0.520 | 0.887 | 0.238 | 0.548 | 0.463 | 0.993 | 1.000 | 0.993 | 0.797 | 0.971 |
| GERMDUS | 0.963 | 0.965 | 0.962 | 0.917 | 0.761 | 0.048 | 0.482 | 0.918 | 0.184 | 0.534 | 0.419 | 1.000 | 0.993 | 1.000 | 0.850 | 0.976 |
| JAPYNUS | 0.836 | 0.827 | 0.798 | 0.851 | 0.714 | 0.269 | 0.248 | 0.865 | 0.062 | 0.412 | 0.223 | 0.849 | 0.797 | 0.850 | 1.000 | 0.834 |
| SWISFUS | 0.918 | 0.918 | 0.938 | 0.860 | 0.765 | 0.021 | 0.435 | 0.867 | 0.094 | 0.597 | 0.374 | 0.976 | 0.971 | 0.976 | 0.834 | 1.000 |


For example, by extracting the important factors, the decision-maker can understand future trends and manage strategic planning effectively.

In this paper, the 16 daily indices of stock markets and foreign currency exchange rates are used to extract the dynamic factors. Since the computational cost of dynamic factor analysis is heavy, the CCM is first used to cluster the 16 indices into three segments. Next, the dynamic factors are extracted from each cluster. In the first cluster, two dynamic factors are extracted. From the shape of the first two dynamic factors, it can be seen that the two factors move in opposite directions. This can be interpreted as two opposing forces controlling the indices of cluster 1. On the other hand, the third and the fourth dynamic factors, which are extracted from cluster 2 and cluster 3, seem to reflect the short-term and the long-term cycle trends.

[Fig. 4: two panels, y₁ and y₂, plotted over observations 100 to 700.]

Fig. 5. The third dynamic factor derived from cluster 2.

In addition, we use the dynamic factors to predict the daily indices and compare the results with those of the dynamic regression model. On the basis of Table 3, it can be seen that the dynamic factors can provide accurate predictions. That is, the proposed method can extract the important common trends among time series data. Finally, the problem of scaling can be overcome using the proposed method.

Fig. 6. The fourth dynamic factor derived from cluster 3.

Table 3
The comparison of the dynamic regression model and the proposed method

| Model | Dependent | Independent | AIC | HQC | AICC | SBC |
|---|---|---|---|---|---|---|
| DFA | AMSTEOE | F1 and F2 | 4.2023 | 4.2094 | 4.2023 | 4.2207 |
| DR | AMSTEOE | Others | 4.1956 | 4.2335 | 4.1966 | 4.2939 |
| DFA | JAPDOWA | F3 | 15.1852 | 15.1899 | 15.1852 | 15.1975 |
| DR | JAPDOWA | Others | 13.2089 | 13.2467 | 13.2098 | 13.3071 |
| DFA | SNGALLS | F4 | 8.6634 | 8.6587 | 8.6634 | 8.6511 |
| DR | SNGALLS | Others | 3.0958 | 3.1336 | 3.0967 | 3.1940 |
| DFA | DAXINDX | F1 and F2 | 7.2486 | 7.2557 | 7.2487 | 7.2671 |
| DR | DAXINDX | Others | 7.2401 | 7.2779 | 7.2410 | 7.3383 |
| DFA | AUSTRUS | F3 | 5.7006 | 5.6959 | 5.7006 | 5.6884 |
| DR | AUSTRUS | Others | 10.0310 | 9.9955 | 10.0302 | 9.9389 |
| DFA | BRITPUS | F4 | 11.5517 | 11.5469 | 11.5516 | 11.5394 |


References

[1] R.F. Engle, M.W. Watson, A one-factor multivariate time series model of metropolitan wage rates, Journal of the American Statistical Association 76 (376) (1981) 774–781.

[2] M.W. Watson, R.F. Engle, Alternative algorithms for estimation of dynamic MIMIC, factor, and time varying coefficient regression models, Journal of Econometrics 23 (3) (1983) 385–400.

[3] A.C. Harvey, Forecasting, Structural Time Series Models and the Kalman Filter, Cambridge, Cambridge University Press, 1989.

[4] A.W. Gregory, A.C. Head, Common and country-specific fluctuations in productivity, investment, and the current account, Journal of Monetary Economics 44 (3) (1999) 423–451.

[5] A. Gregory, A. Head, J. Raynauld, Measuring world business cycles, International Economic Review 38 (3) (1997) 677–701.

[6] S.C. Norrbin, D.E. Schlagenhauf, The role of international factors in the business cycle: a multi-country study, Journal of International Economics 40 (1–2) (1996) 85–104.

[7] P.C.M. Molenaar, Dynamic factor analysis for the analysis of multivariate time series, Psychometrika 50 (1) (1985) 181–202.

[8] P.C.M. Molenaar, J.G. de Gooijer, B. Schmitz, Dynamic factor analysis of nonstationary multivariate time series, Psychometrika 57 (3) (1992) 333–349.

[9] A.F. Zuur, R.J. Fryer, I.T. Jolliffe, R. Dekker, J.J. Beukema, Estimating common trends in multivariate time series using dynamic factor analysis, Environmetrics 14 (7) (2003) 665–685.

[10] A.F. Zuur, I.D. Tuck, N. Bailey, Dynamic factor analysis to estimate common trends in fisheries time series, Canadian Journal of Fisheries and Aquatic Sciences 60 (5) (2003) 542–552.

[11] O. Aguilar, G. Huerta, R. Prado, M. West, Bayesian inference on latent structure in time series, Bayesian Statistics 6 (1) (1998) 1–16.

[12] M. West, P.J. Harrison, Bayesian Forecasting and Dynamic Models, New York, Springer-Verlag, 1997.

[13] G.C. Tiao, R.S. Tsay, Multiple time series modeling and extended sample cross correlations, Journal of Business and Economic Statistics 1 (1) (1983) 43–56.

[14] C. Jutten, J. Herault, Blind separation of sources, Part I: An adaptive algorithm based on neuromimetic architecture, Signal Processing 24 (1) (1991) 1–10.

[15] P. Comon, Independent component analysis, a new concept? Signal Processing 36 (3) (1994) 287–314.

[16] A. Bell, T. Sejnowski, An information-maximization approach to blind separation and blind deconvolution, Neural Computation 7 (6) (1995) 1129–1159.

[17] S. Ikeda, N. Murata, A method of ICA in time frequency domain, in: Proceedings of International Workshop on Independent Component Analysis and Blind Signal Separation, Aussois, France, 1999, pp. 365–370.

[18] R. Vigario, Extraction of ocular artifacts from EEG using independent component analysis, Electroencephalography and Clinical Neurophysiology 103 (3) (1997) 395–404.

[19] A. Bell, T. Sejnowski, The 'independent components' of natural scenes are edge filters, Vision Research 37 (23) (1997) 3327–3338.

[20] B.A. Olshausen, D.J. Field, Emergence of simple-cell receptive field properties by learning a sparse code for natural images, Nature 381 (6583) (1996) 607–609.

[21] S. Choi, A. Cichocki, Blind separation of nonstationary and temporally correlated sources from noisy mixtures, in: Proceeding of IEEE NNSP, Sydney, Australia, 2000, pp. 405–414.

[22] J.V. Stone, Blind source separation using temporal predictability, Neural Computation 13 (7)