3. Methods
3.3 Computation of Self-Organization Maps
. (3)
Smoothing
We smoothed the power-spectrum data with a componentwise moving average.
The moving window spanned 10% of the epochs recorded for a given case and subject, and it was shifted by one epoch at a time. The window moved circularly, wrapping from the last epoch back to the first. For example, if a case contained 100 EEG epochs, the first window covered epochs 1 through 10, the second window covered epochs 2 through 11, and the last window covered epoch 100 followed by epochs 1 through 9. This moving average was used to reduce the influence of artifacts in the EEG signals across all epochs of that case.
Thus, for each case and subject, the Frontal and Motor EEG signals were of sufficient quality to serve as input data for the SOM model.
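The circular smoothing described above can be sketched as follows. This is a minimal illustration using only the Python standard library, not the code used in this study; the function name `circular_moving_average`, the list-of-lists data layout, and the `fraction` parameter are assumptions made for the example.

```python
def circular_moving_average(epochs, fraction=0.10):
    """Smooth each spectral component across epochs with a wrap-around window.

    epochs: list of equal-length lists, one power spectrum per epoch
            (hypothetical layout chosen for this sketch).
    """
    n = len(epochs)
    window = max(1, round(fraction * n))  # e.g. 10 epochs when n == 100
    n_components = len(epochs[0])
    smoothed = []
    for start in range(n):
        # Indices wrap past the last epoch back to the first (circular shift).
        idx = [(start + offset) % n for offset in range(window)]
        smoothed.append([
            sum(epochs[i][c] for i in idx) / window
            for c in range(n_components)
        ])
    return smoothed
```

With 100 epochs, the window starting at the last epoch averages epoch 100 together with epochs 1 through 9, which matches the example in the text.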
3.3 Computation of Self-Organization Maps
Self-Organizing Maps (SOM) offer an approach to brain activity that provides a mechanism for visualizing the complex distribution of cognitive states. A map is defined by k neurons (locations) arranged as a 1-, 2-, or 3-D lattice, which makes the topographic organization of the data easy to see. Increasing the number of locations k increases the accuracy of the labeling results. During the unsupervised learning (training) process, each neuron in the map holds an n-dimensional reference vector, where n is the dimension of the input data. When training is complete, the topographic organization of the map adequately represents the input space, so similar inputs project onto neighboring neurons in the map. In this way the map acquires a structure determined by the input data. The lattice topology can take different shapes, such as rectangular or hexagonal. In this research, we chose k = 625 (a rectangular lattice of 25 × 25 neurons). The maps were initialized, trained, and evaluated with the SOM Toolbox for MATLAB [47].
The initial step in the SOM routine is to define a random distribution of neurons, with one reference vector per neuron. All reference vectors m_i (i = 1, ..., k) are given random initial values and have the same dimension as the input space. Here, each neuron in the map has a reference vector of 500 coefficients. As shown in Fig. 3-5, for each input vector x(t), the reference vector m_c(t) with the minimum Euclidean distance from the input is found by equation (4):
$$c = \arg\min_{i} \lVert x(t) - m_i(t) \rVert, \quad i = 1, \dots, k \tag{4}$$
The input data does not become part of a group at this stage; it is simply used to adjust the location of the SOM node in the input space.
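The random initialization and the best-matching search of equation (4) can be illustrated with a short sketch. It assumes reference vectors stored as plain Python lists and initial values drawn uniformly from [0, 1); the function names `init_reference_vectors` and `best_matching_unit` are hypothetical, and the study itself used the SOM Toolbox for MATLAB rather than this code.

```python
import math
import random

def init_reference_vectors(k, n_dims, seed=0):
    """Create k reference vectors with random initial values (assumed uniform)."""
    rng = random.Random(seed)
    return [[rng.random() for _ in range(n_dims)] for _ in range(k)]

def best_matching_unit(x, reference_vectors):
    """Return the index c whose reference vector is closest to x (Euclidean)."""
    return min(
        range(len(reference_vectors)),
        key=lambda i: math.dist(x, reference_vectors[i]),
    )
```

For the map described here, `init_reference_vectors(625, 500)` would give one 500-coefficient vector for each of the 625 neurons.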
As shown in Fig. 3-6, the best-matching model vector m_c and the model vectors m_i in its circular neighborhood are then moved toward the input vector according to equation (5):
$$m_i(t+1) = m_i(t) + \alpha(t)\, h_{ci}(t)\, [x(t) - m_i(t)] \tag{5}$$
where h_ci(t) is the neighborhood function centered on the best-matching neuron c. The magnitude of the learning coefficient α(t) decreases monotonically, and the size of the neighborhood of m_c also decreases with successive inputs. At the beginning of self-organization the neighborhood on the map is wide, while at the end only the nearest neighbors of m_c(t) are modified. Learning consisted of two phases. In the first phase, the learning coefficient α(t) decreased from 1 to 0 in 75000 steps, while the radius of the neighborhood decreased from 25 to 1. In the second phase, α(t) decreased from 0.1 to 0 in 50000 steps, while the neighborhood radius decreased from 6 to 1. In both phases, the samples were presented in random order.
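One training phase of the kind described above might look like the sketch below. Several details are assumptions rather than facts from this study: the learning coefficient and radius are decayed linearly, the radius is used as the width (sigma) of a Gaussian neighborhood function, and the names `linear_decay` and `train_phase` are hypothetical. The update inside the loop is the standard SOM rule, in which each reference vector moves toward the input by a fraction α(t) times the neighborhood weight.

```python
import math
import random

def linear_decay(start, end, step, n_steps):
    """Linearly interpolate from start (step 0) toward end (step n_steps)."""
    return start + (end - start) * step / n_steps

def train_phase(vectors, grid, samples, n_steps, alpha0, radius0, radius1, rng):
    """Run one phase in place. grid[i] is the (row, col) position of neuron i."""
    for t in range(n_steps):
        x = rng.choice(samples)                      # random presentation
        alpha = linear_decay(alpha0, 0.0, t, n_steps)
        sigma = linear_decay(radius0, radius1, t, n_steps)
        # Best-matching neuron: minimum Euclidean distance to the input.
        c = min(range(len(vectors)), key=lambda i: math.dist(x, vectors[i]))
        for i, m in enumerate(vectors):
            # Squared lattice distance between neuron i and the winner c.
            d2 = (grid[i][0] - grid[c][0]) ** 2 + (grid[i][1] - grid[c][1]) ** 2
            h = math.exp(-d2 / (2.0 * sigma ** 2))   # Gaussian neighborhood
            for j in range(len(m)):
                m[j] += alpha * h * (x[j] - m[j])
```

Under these assumptions, the two phases in the text would correspond to a call with `n_steps=75000, alpha0=1.0, radius0=25, radius1=1` followed by a call with `n_steps=50000, alpha0=0.1, radius0=6, radius1=1` on the same `vectors`.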
Fig. 3-5: The properties of the map
The input vector X is mapped onto the map at the neuron whose reference vector mc(t) has the minimum Euclidean distance from X.
Fig. 3-6: Illustration of the neighborhood size
The reference vectors of the best-matching neuron and its neighboring neurons are adapted toward the input vector. (A) The neighborhood function used in this study was the Gaussian function. (B) The coverage of different neighborhood sizes.
At the end of the training phase, the map is not useful until each neuron in the map is labeled. To label all neurons in the map, the input data were pre-classified into designated categories. Each neuron holds a model vector, and the Euclidean distance from each labeled pattern to every model vector was computed. Each pattern was located in the neuron with the smallest distance and so became a best stimulus for that neuron. This procedure closely resembles the way sites in the brain are labeled by the stimulus features that maximally excite the neurons there. The labeling procedure was applied to all neurons in the map, after which a voting scheme was run. Each neuron was finally assigned the label of the class whose patterns elicited a maximal response with the highest frequency. In other words, several patterns may be located in one neuron after the locating step. For example, if the patterns located in one neuron were distributed as (case-1: 1, case-2: 0, case-3: 1, case-4: 7, case-5: 1), the neuron was assigned the label 'case-4', because case-4 patterns were the most frequent in that neuron. If no pattern was located in a neuron, we did not label that neuron. This labeling method eventually produces a well-ordered partition of the map such that groups or clusters of neurons respond maximally to the same class of patterns. Such a topographic map creates similarity relationships and can be used for pattern classification.
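The voting scheme for labeling can be sketched as follows. The function name `label_neurons` and the use of `None` for an unlabeled neuron are choices made for this illustration. The text does not say how ties were resolved; this sketch simply keeps the class that was counted first.

```python
import math
from collections import Counter

def label_neurons(vectors, patterns, labels):
    """Assign each neuron the most frequent class among the patterns it wins.

    Neurons that win no pattern are returned as None (unlabeled).
    """
    votes = [Counter() for _ in vectors]
    for x, label in zip(patterns, labels):
        # Locate the pattern in the neuron with the smallest distance.
        c = min(range(len(vectors)), key=lambda i: math.dist(x, vectors[i]))
        votes[c][label] += 1
    return [v.most_common(1)[0][0] if v else None for v in votes]
```

A neuron that collects one case-1 pattern, one case-3 pattern, seven case-4 patterns, and one case-5 pattern is labeled 'case-4', as in the example above.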
The maps were trained as described above. The structure of each trained map represented the characteristics of all EEG epochs in each case. Some neurons in the middle of each main area remained unlabeled. Each neuron was assigned a reference vector at the start of training, and each reference vector was adapted during the learning phase. The neighborhood size shrank as training progressed, and every neuron within the area of influence was adapted.
Through this learning mechanism, the structure of all maps became more consistent with the input data. Although no data were located in the unlabeled neurons, each unlabeled neuron still held a reference vector, which must be similar to the reference vectors or EEG epochs located in the neighboring neurons.
Based on this learning principle, we labeled each unlabeled neuron. If an unlabeled neuron lay in the middle of an array, as in Fig. 3-7, the Euclidean distance from its reference vector to that of each neighboring neuron was computed. The neighbor at the minimum distance was identified, and the unlabeled neuron was given that neighbor's label, since the labels of two such close neurons must be the same. After this step, no unlabeled neurons remained in the trained maps, and better maps were obtained. These maps can serve as a database for recognizing the EEG signals of distraction effects.
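The filling of unlabeled neurons can be sketched as below, under assumptions that the text does not confirm: neurons are stored in row-major order on a rectangular lattice, the neighbors of a neuron are the up to eight adjacent lattice positions, only already labeled neighbors are considered, and passes are repeated until no further neuron can be labeled. The function name `fill_unlabeled` is hypothetical.

```python
import math

def fill_unlabeled(vectors, labels, n_rows, n_cols):
    """Give each unlabeled neuron (None) the label of its closest labeled neighbor."""
    labels = list(labels)
    changed = True
    while changed and None in labels:
        changed = False
        snapshot = list(labels)  # labels as they stood before this pass
        for i, label in enumerate(snapshot):
            if label is not None:
                continue
            r, c = divmod(i, n_cols)
            best = None  # (distance, label) of the closest labeled neighbor
            for dr in (-1, 0, 1):
                for dc in (-1, 0, 1):
                    rr, cc = r + dr, c + dc
                    if (dr, dc) == (0, 0):
                        continue
                    if not (0 <= rr < n_rows and 0 <= cc < n_cols):
                        continue
                    j = rr * n_cols + cc
                    if snapshot[j] is None:
                        continue
                    d = math.dist(vectors[i], vectors[j])
                    if best is None or d < best[0]:
                        best = (d, snapshot[j])
            if best is not None:
                labels[i] = best[1]
                changed = True
    return labels
```

Repeating the pass lets a cluster of several adjacent unlabeled neurons be filled from its border inward; the loop stops early if no labeled neighbor exists anywhere.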