
3. Classifications of the Transient Brain Dynamics in Single Trials

3.4. Data Analysis

Fig. 3-7 shows the system flowchart for processing the ERP signals. After collecting high-fidelity EEG signals, a low-pass filter is first used to remove line noise and higher-frequency (>50 Hz) noise. A first calibration based on the time-domain overlap-added averaged ERP and the ERP image is performed to demonstrate the validity of the collected EEG signals. To remove a wide variety of artifacts and to allow on-line use, unlike traditional time-domain overlap-added averaging methods for processing ERP data, the measured ERP signals are further analyzed in single trials using the ICA algorithm (described in Chapter 2-3). The ICA is also used to select possible ERP features related to the traffic-light stimuli based on the time sequences of the ERPs and the corresponding scalp distributions of the ICA components. After extraction of the single-trial ERP signal, we design a novel temporal matching filter to solve the time-alignment problem caused by the variation of the subject's response in each single trial. The PCA algorithm is then applied to the filtered ERP data to reduce the dimensionality and select the representative components. Finally, we develop a fuzzy neural network (FNN) model (Chapter 2-6), compared against the Learning Vector Quantization (LVQ) method and the Back-propagation Neural Network (BPNN) model, to classify the ERP data corresponding to different stimuli on-line. The classified results can be used as control and feedback commands in vehicle safety-driving systems.
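For concreteness, the front end of this flowchart might be sketched as follows; the sampling rate, channel count, filter order, and the use of scikit-learn's FastICA in place of the infomax ICA of Chapter 2-3 are illustrative assumptions only, not the exact settings used in the experiments.

```python
# Illustrative front end of the Fig. 3-7 pipeline (all parameters are assumptions).
import numpy as np
from scipy.signal import butter, filtfilt
from sklearn.decomposition import FastICA

fs = 500.0                                    # assumed sampling rate (Hz)
n_ch = 30                                     # assumed number of EEG channels
eeg = np.random.randn(n_ch, 10 * int(fs))     # placeholder scalp recordings (channels x samples)

# 1. Low-pass filtering (<50 Hz) to suppress line noise and high-frequency noise
b, a = butter(4, 50.0 / (fs / 2.0), btype='low')
eeg_f = filtfilt(b, a, eeg, axis=1)

# 2. ICA decomposition into independent component activations u (rows = components);
#    FastICA is used here as a stand-in for the infomax ICA described in Chapter 2-3.
ica = FastICA(n_components=n_ch, random_state=0)
u = ica.fit_transform(eeg_f.T).T

# 3. Single-trial epoching, the temporal matching filter, PCA, and the FNN/LVQ/BPNN
#    classifiers described in Sections 3.4.1-3.4.5 would follow from here.
```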


3.4.1. ICA Decompositions of the ERP Data

A brief schematic depiction of the ICA decomposition is shown in Fig. 3-8. In our experiment, we assume that the multi-channel EEG recordings are mixtures of underlying brain sources and artifactual signals. As discussed in Chapter 2-3, we suppose that the number of sources is the same as the number of sensors, and that the sources contributing to the scalp EEG are statistically independent; that is, if there are N sensors, the ICA algorithm can separate N source components. The conduction from the sources to the EEG sensors is assumed to be instantaneous and linear, so that the measured signals are linear mixtures and the propagation delays are negligible. We also assume that the signal sources of muscle, eye, and cardiac activity are not time-locked to the sources of EEG activity, which is regarded as reflecting the synaptic activity of cortical neurons. Therefore, the time courses of the sources are assumed to be independent. The important fact used to distinguish a source, si, from the mixtures, xi, is that the activity of each source is statistically independent of the other sources, i.e., the mutual information between any two sources, si and sj, is zero. The task of the ICA algorithm is to recover a version, u, of the original sources S by finding a square matrix W that linearly inverts the mixing process, identical save for scaling and permutation. For EEG analysis, the rows of the input matrix X are the EEG signals recorded at the different electrodes, the rows of the output data matrix u = WX are the time courses of activation of the ICA components, and the columns of the inverse matrix W-1 give the projection strengths of the respective components onto the scalp sensors. The scalp topographies of the components provide information about the locations of the sources (e.g., eye activity should project mainly to frontal sites, whereas the visual event-related potential projects to central-to-posterior areas).

“Corrected” EEG signals can then be derived as X' = W-1u', where u' is the matrix of activation waveforms u with the rows corresponding to artifactual components set to zero.
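As a minimal numerical illustration of the relations u = WX and X = W-1u, the following sketch uses a placeholder unmixing matrix standing in for one estimated by the ICA of Chapter 2-3, and removes an assumed artifactual component by zeroing its row of u before back-projection:

```python
# Minimal sketch: component activations u = W X and back-projection X = W^-1 u,
# with a placeholder unmixing matrix W (normally obtained from the ICA algorithm).
import numpy as np

rng = np.random.default_rng(0)
X = rng.standard_normal((4, 1000))            # placeholder scalp data (4 channels x 1000 samples)
W = rng.standard_normal((4, 4))               # placeholder unmixing matrix

u = W @ X                                     # rows: time courses of the ICA components
A = np.linalg.inv(W)                          # columns: scalp projections of the components

u_clean = u.copy()
u_clean[0, :] = 0.0                           # suppose component 0 is an artifact: zero it
X_clean = A @ u_clean                         # "corrected" EEG after removing the artifact

assert np.allclose(A @ u, X)                  # with nothing removed, back-projection recovers X
```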


Figure 3-8. Schematic depiction of ICA decomposition of EEG signals.

3.4.2. Temporal Matching Filter

For single-trial analysis of ERP signals in the time domain, the amplitude and latency of the ERP (P300) are important parameters for ERP classification. Owing to the time-varying and non-stationary properties of the P300 in each single trial of the same stimulus, one frequently encountered problem in classification is the time-alignment problem, defined as the trial-to-trial variation of the P300 latency. Many psychophysiological factors lead to this time-alignment phenomenon of the single-trial ERP signals for one subject, such as the cognitive state of the subject at that moment and the different response behavior in each trial of the same stimulus.

This latency variation spreads the features of the ERP across trials (for example, signals caused by the same stimulus could be assigned to different principal components in PCA because of the time-alignment problem) and decreases the recognition rate. To solve this problem, we process the single-trial ERP signals with a short-term technique based on the maximum magnitude of the cross-correlation function and propose a novel temporal matching filter. Fig. 3-9 shows the concept diagram of this matching filter. After collecting high-fidelity ERP signals, the temporal matching filter is obtained by averaging the first N single trials as the standard pattern of the P300 for each subject. We then calculate the cross-correlation between the matching filter and each subsequent single trial, and find the lag at which the magnitude of the cross-correlation function is maximal. Finally, the original single-trial sequence is shifted to a new time sequence according to this maximum cross-correlation lag. The detailed algorithm is listed below:

1. Given the input single-trial source components ui (i is the trial index), calculate the average of the first N trials, ū = (u1 + u2 + … + uN)/N, as the standard pattern of the matching filter.

2. Find the lag of maximum cross-correlation between the standard pattern ū and each subsequent single trial uj by computing k* = arg max_k |xcorr(ū, uj)(k)|.

3. Shift the original single-trial sequence uj by k* samples to obtain the temporally aligned trial.


Figure 3-9. The use of the matching filter for temporal alignment of the single-trial ERP.
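A minimal numpy sketch of this matching-filter alignment is given below; the epoch length, the number of template trials, and the maximum allowed shift are assumptions chosen for illustration, not the values used in the experiments.

```python
# Sketch of the temporal matching filter (epoch length, N, and max shift are assumptions).
import numpy as np

def align_trials(trials, n_template=10, max_shift=50):
    """Align single-trial ERP epochs by the lag of maximum |cross-correlation|
    with a template averaged from the first n_template trials."""
    template = trials[:n_template].mean(axis=0)          # standard pattern of the filter
    lags = np.arange(-max_shift, max_shift + 1)
    aligned = trials.copy()
    for i, trial in enumerate(trials):
        xcorr = [np.dot(template, np.roll(trial, k)) for k in lags]
        k_best = lags[int(np.argmax(np.abs(xcorr)))]     # lag of maximum magnitude
        aligned[i] = np.roll(trial, k_best)              # circular shift used for simplicity
    return aligned

# Example: synthetic trials with a latency-jittered P300-like bump around sample 300
rng = np.random.default_rng(0)
t = np.arange(700)
trials = np.stack([np.exp(-((t - 300 - rng.integers(-30, 31)) ** 2) / (2 * 20.0 ** 2))
                   + 0.1 * rng.standard_normal(t.size) for _ in range(40)])
aligned = align_trials(trials)
```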

3.4.3. Principal Component Analysis (PCA)

Given the observed zero-mean data matrix X(t) = [x1(t) x2(t) … xN(t)]T, principal component analysis (PCA) finds an orthogonal N×N matrix P = [p1 p2 … pN], where each pi is an N×1 vector, that determines a transformation of variables, X(t) = PY(t), such that the new variables Y(t) = [y1(t) y2(t) … yN(t)]T are uncorrelated and arranged in order of decreasing variance.

Let SX = XXT/(N−1) be the covariance matrix of X(t), and let D be a diagonal matrix whose diagonal entries are the eigenvalues of SX arranged in decreasing order; P is then the orthogonal matrix of the corresponding eigenvectors, so that SX = PDPT. The columns p1, p2, … of P are called the principal components of the data. The kth principal component pk determines the new variable yk(t) in the following way: the entries of pk express the weights with which the original variables contribute to yk(t), i.e., yk(t) = pkTX(t). Using a cutoff at the first K principal components, the observed data matrix may thus be reduced in dimensionality from N to K without much loss of information [93].
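A short numpy sketch of this computation follows, assuming an N-variable, T-sample zero-mean data matrix and keeping the first K components; the dimensions are arbitrary placeholders.

```python
# PCA sketch: eigen-decomposition of the covariance matrix and reduction to K components.
import numpy as np

rng = np.random.default_rng(0)
N, T, K = 8, 1000, 3                          # N variables, T samples, keep K components
X = rng.standard_normal((N, T))
X -= X.mean(axis=1, keepdims=True)            # zero-mean data matrix

S = X @ X.T / (T - 1)                         # covariance matrix of X(t)
eigvals, P = np.linalg.eigh(S)                # S = P D P^T (eigh returns ascending order)
P = P[:, np.argsort(eigvals)[::-1]]           # reorder columns so variances decrease

Y = P.T @ X                                   # uncorrelated variables, y_k(t) = p_k^T X(t)
X_reduced = P[:, :K].T @ X                    # projection onto the first K principal components
```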

3.4.4. Learning Vector Quantization (LVQ)

Learning vector quantization (LVQ) is a supervised competitive learning algorithm derived from vector quantization (VQ) and Kohonen's self-organizing map (SOM) [105-106]; the network “discovers” structure in the data by finding how the data are clustered.

The goal of the LVQ algorithm is to approximate the distribution of a class using a reduced number of codebook vectors, where the algorithm seeks to minimize classification errors. The algorithm belongs to the class of neural-network learning algorithms, though it works quite differently from conventional feed-forward networks such as back-propagation. The neural network for learning vector quantization consists of two layers: an input layer and an output layer. It represents a set of reference vectors, whose coordinates are the weights of the connections leading from the input neurons to an output neuron. Hence, one may also say that each output neuron corresponds to one reference vector.

The learning method of learning vector quantization is often called competitive learning, because it works as follows [107]: for each training pattern, the reference vector that is closest to it is determined. The corresponding output neuron is called the winner neuron. The weights of the connections to this neuron - and this neuron only: the winner takes all - are then adapted.

The direction of the adaptation depends on whether the class of the training pattern and the class assigned to the reference vector coincide. If they coincide, the reference vector is moved closer to the training pattern; otherwise it is moved farther away. This movement of the reference vector is controlled by a parameter called the learning rate, which states how far the reference vector is moved as a fraction of its distance to the training pattern. Usually the learning rate is decreased over time, so that initial changes are larger than changes made in later epochs of the training process. Learning may be terminated when the positions of the reference vectors hardly change anymore.
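The winner-takes-all adaptation just described corresponds to the basic LVQ1 rule; the following sketch illustrates it, with the codebook size per class and the learning-rate schedule chosen as assumptions for illustration.

```python
# Sketch of the basic LVQ1 rule (codebook size and learning-rate schedule are assumptions).
import numpy as np

def train_lvq1(X, y, n_per_class=2, epochs=30, lr0=0.3, seed=0):
    """Move the winning reference vector toward (same class) or away from
    (different class) each training pattern, with a decreasing learning rate."""
    X = np.asarray(X, dtype=float)
    rng = np.random.default_rng(seed)
    refs, labels = [], []
    for c in np.unique(y):
        idx = rng.choice(np.flatnonzero(y == c), n_per_class, replace=False)
        refs.append(X[idx])
        labels += [c] * n_per_class
    refs, labels = np.vstack(refs), np.array(labels)

    for epoch in range(epochs):
        lr = lr0 * (1.0 - epoch / epochs)                          # decreasing learning rate
        for xi, yi in zip(X, y):
            winner = np.argmin(np.linalg.norm(refs - xi, axis=1))  # closest reference vector
            sign = 1.0 if labels[winner] == yi else -1.0           # attract or repel
            refs[winner] += sign * lr * (xi - refs[winner])
    return refs, labels

def predict_lvq(refs, labels, X):
    d = np.linalg.norm(X[:, None, :] - refs[None, :, :], axis=2)   # distances to all references
    return labels[np.argmin(d, axis=1)]
```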

3.4.5. Back-propagation Neural Network Model (BPNN)

Another algorithm that has contributed greatly to the popularity of neural networks is the back-propagation algorithm [93, 108]. The principal advantages of back-propagation are its simplicity and reasonable speed. Back-propagation is well suited to pattern-recognition and classification problems. In essence, the back-propagation network is a perceptron with multiple layers, a different threshold function in the artificial neuron, and a more robust and capable learning rule. Back-propagation can train multilayer feed-forward networks with differentiable transfer functions to perform function approximation, pattern association, and pattern classification. The term back-propagation refers to the process by which the derivatives of the network error with respect to the network weights and biases are computed. This process can be used with a number of different optimization strategies. The architecture of a multilayer network is not completely constrained by the problem to be solved: the number of inputs to the network and the number of neurons in the output layer are fixed by the problem, whereas the number of layers between the network inputs and the output layer, and the sizes of those layers, are up to the designer.

A two-layer sigmoid/linear network can approximate virtually any functional relationship between inputs and outputs, provided the sigmoid layer has enough neurons.
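As an illustration of such a network, the following sketch trains a two-layer sigmoid/linear network by back-propagation of the squared error; the hidden-layer size, learning rate, and number of epochs are arbitrary assumptions, not the settings used for the ERP classifier.

```python
# Sketch of a two-layer (sigmoid hidden / linear output) network trained by back-propagation.
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train_bpnn(X, T, n_hidden=10, lr=0.05, epochs=500, seed=0):
    """Gradient-descent back-propagation on the mean squared error between
    the network outputs and the target matrix T (one row per pattern)."""
    rng = np.random.default_rng(seed)
    W1 = 0.1 * rng.standard_normal((X.shape[1], n_hidden)); b1 = np.zeros(n_hidden)
    W2 = 0.1 * rng.standard_normal((n_hidden, T.shape[1])); b2 = np.zeros(T.shape[1])

    for _ in range(epochs):
        H = sigmoid(X @ W1 + b1)              # forward pass: hidden activations
        Y = H @ W2 + b2                       # linear output layer
        dY = (Y - T) / len(X)                 # derivative of the mean squared error
        dW2, db2 = H.T @ dY, dY.sum(axis=0)
        dH = (dY @ W2.T) * H * (1.0 - H)      # back-propagate through the sigmoid layer
        dW1, db1 = X.T @ dH, dH.sum(axis=0)
        W1 -= lr * dW1; b1 -= lr * db1        # gradient-descent weight updates
        W2 -= lr * dW2; b2 -= lr * db2
    return W1, b1, W2, b2
```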