
2. Methods

2.5. Data analysis

2.5.5. Granger causality

To examine the connectivity of the human brain in the drowsy state and the alert state, we used the Granger causality (GC) algorithm to obtain the causal relationships between signals. The concept of causality was formulated by Granger for econometric purposes [43]. It was based on the notion that causes precede their effects in the future evolution of events and, conversely, that the future cannot affect the present. Granger causality assumed that one time series y(t) caused another time series x(t) if including y(t-1), y(t-2), ..., y(t-p) significantly improved the prediction of x(t).

Here, t was the time point and p was the model order. Therefore, the basic assumption of Granger causality was that the participating time series could be described by an autoregressive model (4). For multi-channel EEG data, a time-variant multivariate autoregressive model (tvMVAR) was used to fit all time series [64]. Consider a set (K components) of processes

X_t, represented as:

X_t = A_{t-1} * X_{t-1} + ... + A_{t-p} * X_{t-p} + E_t    (4)

where t represented the current time point; X_t, X_{t-1}, ..., X_{t-p} were K x 1 observed data vectors; p was the model order; and E_t was a zero-mean Gaussian noise vector. A_{t-1}, ..., A_{t-p} were the K x K coefficient matrices of the tvMVAR, representing the linear time-lagged dependence. At a given time, the diagonal elements of A represented the self-connectivity of each channel, and the off-diagonal elements represented the inter-connectivity between channels. If the prediction of X_t for a component α became more accurate by adding a specific component β, we could say that component β caused component α. In other words, if the error matrix E_t was kept at its minimum, the coefficients relating component β to component α indicated how much β caused α. The following four steps, namely order selection, Kalman filtering, Granger causality value calculation, and statistical testing, determined whether a connection existed between two time series components. The purpose of the first three steps was to estimate the coefficients of the observed data and to analyze the strength of the causal relationship between each pair of components at each time sample and in each drowsy stage. The statistical testing then determined whether the causal relationship was established.
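The core idea above, that adding the past of y should reduce the prediction error of x, can be illustrated with a minimal least-squares sketch. This is not the tvMVAR/Kalman procedure used in this study; the function name and toy data are assumptions for illustration only.

```python
import numpy as np

def granger_improvement(x, y, p):
    """Prediction error of x(t) from its own past alone (restricted)
    versus its own past plus the past of y (full), following Granger's
    notion that y causes x if its lags improve the prediction of x."""
    n = len(x)
    own_lags = [x[p - k:n - k] for k in range(1, p + 1)]
    other_lags = [y[p - k:n - k] for k in range(1, p + 1)]
    target = x[p:]

    def resid_var(columns):
        A = np.column_stack(columns)
        coef, *_ = np.linalg.lstsq(A, target, rcond=None)
        return float(np.var(target - A @ coef))

    return resid_var(own_lags), resid_var(own_lags + other_lags)

# Toy data where y drives x with one lag.
rng = np.random.default_rng(0)
y = rng.standard_normal(500)
x = np.empty(500)
x[0] = 0.0
for t in range(1, 500):
    x[t] = 0.5 * y[t - 1] + 0.1 * rng.standard_normal()

e_restricted, e_full = granger_improvement(x, y, p=1)
print(e_restricted > e_full)  # the past of y improves the prediction of x
```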

2.5.5.1. Order selection

Before estimating the coefficients of the observed-measurement model, it was necessary to select the model order giving the best prediction for the GC analysis of each drowsy stage. The Bayesian information criterion (BIC) [65] was used to choose the model order p that yielded a well-fit tvMVAR. The penalty of each order was calculated by the BIC criterion, and the model order for which the BIC penalty reached its minimum was selected.

BIC(p) = de_p + log(tn) * p / n    (5)

where p was the candidate model order, de_p was the prediction error at order p, tn was the trial number of the observed data, and n was the sample number of the trial. The prediction error de was computed by the QR decomposition algorithm, which is often used to solve linear least-squares problems; therefore, the de value had to be computed once for each candidate order p. The BIC value of order p was the penalty of the observed-data model prediction; a lower value meant a better prediction and a better-fitting model.
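A single-channel sketch of this order search might look as follows. The thesis computes de via QR decomposition, while this illustration uses an ordinary least-squares fit; the function name and toy parameters are assumptions.

```python
import numpy as np

def bic_order(x, max_order, tn):
    """Select the model order with the lowest BIC penalty, following
    function 5: BIC(p) = de_p + log(tn) * p / n, where de_p is the
    prediction error at order p, tn the trial number, and n the
    sample number of the trial."""
    n = len(x)
    penalties = []
    for p in range(1, max_order + 1):
        A = np.column_stack([x[p - k:n - k] for k in range(1, p + 1)])
        target = x[p:]
        coef, *_ = np.linalg.lstsq(A, target, rcond=None)
        de = float(np.var(target - A @ coef))   # prediction error de_p
        penalties.append(de + np.log(tn) * p / n)
    return int(np.argmin(penalties)) + 1        # order of minimum penalty

# Toy usage on AR(1) data, with 36 trials assumed as in Figure 10.
rng = np.random.default_rng(0)
x = np.empty(1000)
x[0] = 0.0
for t in range(1, 1000):
    x[t] = 0.8 * x[t - 1] + rng.standard_normal()
order = bic_order(x, max_order=6, tn=36)
```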


Figure 10.The BIC testing in subject 01 (S01) with 36 trials.

Figure 10 shows that the shape of the curve was very similar across all trials from one subject. The order with the lowest penalty was selected as the model order, marked by the red circle in Figure 10. In this example, 4 was selected as the model order for estimating the model coefficients in the next step.

2.5.5.2. Kalman filter

Figure 11 shows the steps of the Granger causality analysis after order selection, using a bivariate example. Applying the Kalman filter was the next step.

The purpose of the Kalman filter was to take measurements observed over time, which contain noise, random variations, and other inaccuracies, and to produce a model whose values tended to be closer to the true values of the measurements and their associated calculated values [66,67]. In GC analysis, the advantage of an adaptive filter such as the Kalman filter was that the autoregressive model of the observed data could vary over time. The Kalman filter could detect changes in the linear combination of the model along the time series at each prediction; therefore, the assumption of stationary input data was unnecessary, and one set of model coefficients was obtained at each time sample [68]. Moreover, Milde et al. [51] reported that this adaptive way of estimating the model coefficients was more suitable than recursive least squares (RLS) [69], the other adaptive filter used in GC analysis.

The filtering procedure produced estimates of the true values of the measurements and their associated calculated values by predicting a value, estimating the uncertainty of the prediction, and computing a weighted average of the predicted value and the measured value. The estimates produced by this method tend to be closer to the true values than the original measurements, because the weighted average has a better estimated uncertainty than either of the values that went into it. Assume there was an m-dimensional vector autoregressive model of order p; its state-space functions could be written as below:

Y(t) = H(t) * X(t) + ε(t)    (6)


where t was the current time point, Y(t) was the observed data, H(t) was the observed-measurement matrix with dimension m x m^2 p, and X_t was the m^2 p x m^2 p state-variable matrix, which in this study also represented the model coefficient matrix. Γ(t) was the transition matrix, whose size was also m^2 p x m^2 p; it was an identity matrix, since the coefficients were assumed to evolve according to random walks. K_t was the Kalman gain matrix, P_{t|t} was the a priori covariance matrix of the estimation error, P_{t+1|t} was the a posteriori covariance matrix, R(t) was the observation-noise covariance, R(t) = ε ε^T, W(t) was the zero-mean white process noise, and Q(t) was the state-noise covariance. Functions 6 and 7 were the basic state-space form. Y(t) and H(t) were known observation data, and the aim of the Kalman filter in this study was to estimate the model coefficients X_t from Y(t) and H(t), in other words, to solve function 6. Functions 8, 9, and 10 formed the innovation part of the filter, preparing for the next time-point prediction, and functions 11 and 12 formed the prediction part. The filter processed the observed data time sample by time sample, so the target coefficient matrix X_t could be solved at each time sample. The initialization is shown in functions 13 and 14.
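A minimal single-channel sketch of this recursion (functions 6 to 12) follows, assuming a random-walk state with an identity transition matrix. The noise levels q and r, the function name, and the reduction of the m-channel formulation to one channel are all simplifying assumptions.

```python
import numpy as np

def kalman_tvar(y, p, q=1e-4, r=1.0):
    """Track time-varying AR(p) coefficients of one channel with a
    Kalman filter: the state x holds the p lag coefficients, the
    measurement row H(t) holds the p most recent samples, and the
    state evolves as a random walk (identity transition)."""
    n = len(y)
    x = np.zeros(p)                  # coefficient estimate (state)
    P = np.eye(p)                    # estimation-error covariance
    Q = q * np.eye(p)                # state-noise covariance Q(t)
    coeffs = np.zeros((n, p))
    for t in range(p, n):
        H = y[t - p:t][::-1][None, :]        # lags 1..p as a 1 x p row
        P_pred = P + Q                       # prediction: random walk
        innov = y[t] - (H @ x).item()        # innovation
        S = (H @ P_pred @ H.T).item() + r    # innovation variance
        K = (P_pred @ H.T / S).ravel()       # Kalman gain K_t
        x = x + K * innov                    # updated coefficients
        P = P_pred - np.outer(K, (H @ P_pred).ravel())
        coeffs[t] = x
    return coeffs

# Toy usage: the filter should settle near the true AR(1) coefficient.
rng = np.random.default_rng(0)
y = np.empty(2000)
y[0] = 0.0
for t in range(1, 2000):
    y[t] = 0.8 * y[t - 1] + rng.standard_normal()
coeffs = kalman_tvar(y, p=1)
```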


Figure 11. Diagram of the Granger causality procedure for a bivariate time series after order selection. There were two components in this example, component A (comp A) and component B (comp B). The two components were processed by the Kalman filter (blue blocks), with the order length as the window length, to estimate the model coefficients. The next step calculated the GC value at each time point from the filter coefficients (green blocks). Significance testing followed the GC value calculation (pink blocks): each GC value along the time series was compared with the distribution of GC values in stage 0; if the value fell within that distribution, it was set to 0, otherwise it kept its original GC value. The last step was the decision on the causal relationship: if the non-zero values greatly outnumbered the zeros among the GC values along the time series, the connection (causal relationship) was established.

2.5.5.3. Granger causality value

After filtering, the original autoregressive model of the observed data could be analyzed by GC. We used a three-component example to explain how to obtain the GC value from the coefficient matrix X_t in function 6. Assume the autoregressive model solved by the Kalman filter in the previous section was:

Y(t) = Σ_{i=1}^{p} X(i) * Y(t - i) + ε(t)    (14)

As in function 6, Y(t) was the observed data with m components, t was the current time point, p was the model order, and X(i) represented the linear time-lagged dependence, referred to as the coefficient matrix from lag 1 to order p. For clarity, function 14 was simplified to order 1 with only three components, and the equation was expanded as the following function:


y_1(t) = x_11 * y_1(t-1) + x_12 * y_2(t-1) + x_13 * y_3(t-1) + ε_1(t)
y_2(t) = x_21 * y_1(t-1) + x_22 * y_2(t-1) + x_23 * y_3(t-1) + ε_2(t)
y_3(t) = x_31 * y_1(t-1) + x_32 * y_2(t-1) + x_33 * y_3(t-1) + ε_3(t)

The diagonal elements x_11, x_22, x_33 represented the self-connectivity at lag 1, and the other elements represented the inter-component connectivity at lag 1. In the coefficient matrix above, these coefficients were positively related to the causal relationship. For instance, y_1(t) was composed of y_1(t-1), y_2(t-1), and y_3(t-1) with coefficients x_11, x_12, and x_13, respectively. That meant the relative strength of influence from component y_2 to y_1 at time point t-1 was expressed by the ratio abs(x_12) / (abs(x_11) + abs(x_12) + abs(x_13)), and the influence from y_3 was expressed by the ratio abs(x_13) / (abs(x_11) + abs(x_12) + abs(x_13)).
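This ratio can be read directly off a lag-1 coefficient matrix. A minimal sketch with an assumed three-component matrix (the function name and values are illustrative):

```python
import numpy as np

def influence_ratio(X, target, source):
    """Relative strength of influence from `source` onto `target`:
    the absolute coefficient abs(x_{target,source}) divided by the sum
    of absolute coefficients in the target's row, as described above."""
    row = np.abs(X[target])
    return float(row[source] / row.sum())

# Lag-1 coefficient matrix of the three-component example:
# row 0 holds x11, x12, x13 for predicting y1(t).
X = np.array([[0.5, 0.3, 0.2],
              [0.0, 0.7, 0.1],
              [0.2, 0.0, 0.6]])
ratio = influence_ratio(X, target=0, source=1)  # abs(x12) / row sum, ≈ 0.3
```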

Therefore, the GC value was designed as [70,71]:



Since the drowsiness pattern differed from person to person, the statistical testing was performed trial by trial within each subject. The EEG data in stage 0, the baseline, were regarded as the reference for testing the significance of the GC values in stages 1 to 4. Every single GC value along the time series was compared with the stage 0 distribution: if a GC value from stages 1 to 4 fell outside the 95% confidence interval of stage 0, the GC value was significant and was kept; otherwise it was set to 0. To avoid the influence of extreme values from a few trials, the median was used to average across all trials. After taking the median, the causal relationship between two components was established only if the number of non-zero GC values was much larger than the number of zeros along the whole time series, as in the example of the connection from comp A to comp B in Figure 11.
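The thresholding against the stage-0 distribution and the final decision rule can be sketched as follows; the percentile-based 95% interval and all names are assumptions for illustration.

```python
import numpy as np

def threshold_gc(gc_series, stage0_values, ci=95.0):
    """Keep only significant GC values: a value survives when it lies
    outside the central ci% interval of the stage-0 (baseline) GC
    distribution; otherwise it is set to 0."""
    tail = (100.0 - ci) / 2.0
    lo, hi = np.percentile(stage0_values, [tail, 100.0 - tail])
    gc = np.asarray(gc_series, dtype=float)
    return np.where((gc < lo) | (gc > hi), gc, 0.0)

def connection_established(gc_thresholded):
    """Declare the causal relationship when the non-zero (significant)
    values outnumber the zeros along the time series."""
    nonzero = np.count_nonzero(gc_thresholded)
    return nonzero > gc_thresholded.size - nonzero

# Toy usage: baseline GC values cluster near 0.1, so only the values
# far outside that distribution survive the test.
rng = np.random.default_rng(1)
stage0 = 0.1 + 0.01 * rng.standard_normal(1000)
gc = np.array([0.10, 0.50, 0.55, 0.60])
kept = threshold_gc(gc, stage0)
```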

