Experiment Results - Automatic Classification System for Brain Neural Signal Sources

4-1. Cross Validation Results of Training Data

Table 4-1 reports the average accuracy and standard deviation of four different
machine learning models under 10-fold cross validation. SVMRBF achieves the highest
accuracy, about 92.6%. The second-best model is RBFNN, whose accuracy reaches about
90.6%. The other two neural networks, MLPTAN and MLPRBF, reach similar accuracies
(83.5% and 83.8%). In addition, the standard deviations of all four models are low
and consistent (none exceeds 0.0133 in Table 4-1). This suggests that the models were
trained with a suitable structure and well-chosen parameters. The training results
therefore indicate that automatic systems for selecting useful independent components
may be feasible. They are a strong indication that, with adequate training data, it
may be possible to design a “universal machine” for selecting good (useful)
independent components. Such systems would be extremely useful for real-time
applications. Table 4-2 lists the training parameters used for the performance
evaluation on the testing data.

Table 4-1. The accuracy of 10-fold cross validation.

Supervised Method    Accuracy Rate ± Standard Deviation
MLPTAN               83.5% ± 0.0082
MLPRBF               83.8% ± 0.0133
RBFNN                90.6% ± 0.0085
SVMRBF               92.6% ± 0.0037
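
The "accuracy ± standard deviation" entries of Table 4-1 summarize per-fold accuracies. The following is a minimal sketch of that computation, not the study's implementation: the fold split is contiguous, the standard deviation is the population form (the source does not say which form was used), and `train_midpoint` / `predict_midpoint` are hypothetical toy stand-ins for the MLP, RBF and SVM models, run on synthetic data.

```python
from statistics import mean, pstdev


def kfold_indices(n_samples, k=10):
    """Split range(n_samples) into k contiguous folds of near-equal size."""
    base, extra = divmod(n_samples, k)
    folds, start = [], 0
    for i in range(k):
        size = base + (1 if i < extra else 0)
        folds.append(list(range(start, start + size)))
        start += size
    return folds


def cross_validate(samples, labels, train_fn, predict_fn, k=10):
    """Return (mean accuracy, population standard deviation) over k folds."""
    accuracies = []
    for test_idx in kfold_indices(len(samples), k):
        held_out = set(test_idx)
        train_idx = [i for i in range(len(samples)) if i not in held_out]
        model = train_fn([samples[i] for i in train_idx],
                         [labels[i] for i in train_idx])
        hits = sum(predict_fn(model, samples[i]) == labels[i] for i in test_idx)
        accuracies.append(hits / len(test_idx))
    return mean(accuracies), pstdev(accuracies)


# Toy stand-in classifier (hypothetical): threshold at the midpoint of class means.
def train_midpoint(xs, ys):
    pos = [x for x, y in zip(xs, ys) if y == 1]
    neg = [x for x, y in zip(xs, ys) if y == 0]
    return (mean(pos) + mean(neg)) / 2


def predict_midpoint(midpoint, x):
    return 1 if x >= midpoint else 0


# Synthetic 1-D data, not data from the study.
samples = [i / 40 for i in range(40)]
labels = [1 if x >= 0.5 else 0 for x in samples]
acc, std = cross_validate(samples, labels, train_midpoint, predict_midpoint)
print(f"{acc:.1%} ± {std:.4f}")
```

Any of the four models could be plugged in through `train_fn` and `predict_fn`; in practice the data would also be shuffled before splitting so that each fold contains both classes.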

Table 4-2. The parameters of supervised methods.

4-2. Evaluation Performance of Testing Data

To evaluate the four trained machine learning models, we collected EEG datasets from
10 other subjects and applied ICA to extract the testing components. Each subject
again has 28 components in each session. With real-life application in mind, we test
the accuracy of each subject individually. Figure 4-1 plots the average accuracy over
the testing datasets of the 10 subjects against the threshold score. The dotted lines
in different colors show the average accuracy obtained with each trained model for
thresholds from 0.1 to 0.9. The short vertical bars at each threshold mark the
standard deviation of the accuracy across the 10 subjects. Figure 4-1 leads to two
major findings. The first is that the classification accuracy of all four models
increases as the threshold rises from 0.1 to 0.6 and then decreases. RBFNN achieves
the best accuracy (92.5%) at threshold 0.6, and according to the performance curves
the locally optimal thresholds 0.6 and 0.7 give the better classification
performance. Table 4-3 lists the classification accuracy of each subject at threshold
0.6. In each entry, the numerator in brackets is the number of correctly classified
components and the denominator is the total of 28 extracted components.

The second finding is that SVMRBF (the black line) gives the most stable
classification performance: its accuracy is above 85% at every tested threshold.
Therefore, if the globally optimal threshold cannot be found, SVMRBF can serve as a
general model for useful component selection, since in these tests it keeps the
classification accuracy at 85% or above. In other words, an accuracy of about 85%
corresponds to four misclassified components out of 28 (24/28 ≈ 85.7%).
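
The threshold curves of Figure 4-1 can be sketched as a simple sweep. This is an illustration under stated assumptions, not the study's code: each model is assumed to output a score in [0, 1] per component, a component is labeled useful when its score is at or above the threshold, and the scores and labels below are made up.

```python
def accuracy_at_threshold(scores, labels, threshold):
    """Fraction of components classified correctly at one threshold."""
    predictions = [1 if s >= threshold else 0 for s in scores]
    correct = sum(p == y for p, y in zip(predictions, labels))
    return correct / len(labels)


def sweep_thresholds(scores, labels, thresholds=None):
    """Map each threshold (default 0.1 ... 0.9) to its accuracy."""
    if thresholds is None:
        thresholds = [round(0.1 * k, 1) for k in range(1, 10)]
    return {t: accuracy_at_threshold(scores, labels, t) for t in thresholds}


# Hypothetical scores for 8 components (1 = useful, 0 = bad); not study data.
scores = [0.95, 0.80, 0.65, 0.55, 0.45, 0.30, 0.20, 0.05]
labels = [1, 1, 1, 0, 0, 0, 0, 0]

curve = sweep_thresholds(scores, labels)
best_threshold = max(curve, key=curve.get)
for t, acc in curve.items():
    print(f"threshold {t:.1f}: accuracy {acc:.3f}")
```

Averaging such a curve over the 10 subjects, and taking the standard deviation at each threshold, would give the lines and vertical bars described for Figure 4-1.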

Figure 4-1. Average accuracy of the testing datasets from 10 subjects at different threshold scores.

Table 4-3. Classification accuracy of each subject under threshold 0.6.

Subjects S1 S2 S3 S4 S5 S6 S7 S8 S9 S10 Average

In real-life application, we do not know whether a newly observed component is useful
or useless. In other words, the desired output of each observation is unknown.
Therefore, we calculate the positive predictive value (PPV) from the confusion matrix
to assess the predictive performance. In this study, the PPV is the useful component
predictive rate (UCPR). Equation (28) gives the PPV formula, in which TP denotes the
number of true positives and FP the number of false positives.

\[
\mathrm{PPV} = \frac{TP}{TP + FP} \tag{28}
\]

Here TP is the number of true useful components and FP is the number of false ones.
In the UCPR equation, we have the assumption

that the FP value is as small as possible, because these false useful components
always come from bad components (noise). Figure 4-2 plots the average UCPR over the
10 subjects at different thresholds. According to Figure 4-1, RBFNN achieves the best
accuracy when the threshold is set to 0.6. Hence, Table 4-4 lists the UCPR of each
subject at threshold 0.6, and RBFNN again achieves the best UCPR (86%). In Table 4-4,
the numerator is the number of true useful components and the denominator is the sum
of true and false useful components. Across the four models, a subject has at most
five false useful components (SVMRBF on S1, 7/12); under RBFNN the maximum is two.
Because useful components are fewer than bad components, a few ambiguous bad
components that are misclassified as useful can change the UCPR significantly. Hence,
we need to increase the number of useful components to improve the stability and
performance of the automatic scheme.

Figure 4-2. Average useful component predictive rate from 10 subjects at different threshold scores.

Table 4-4. Useful component predictive rate of each subject under threshold 0.6.

Subjects  S1    S2    S3    S4   S5    S6    S7    S8    S9     S10  Average
MLPTAN    6/9   8/10  8/11  4/5  6/9   8/8   6/8   8/9   12/12  5/8  71/89 (79.8%)
MLPRBF    6/10  9/12  8/11  4/6  8/11  7/9   6/8   8/10  12/12  6/7  74/96 (77.1%)
RBFNN     6/7   8/10  8/10  4/5  6/8   8/9   7/9   9/9   12/12  6/7  74/86 (86.0%)
SVMRBF    7/12  9/11  9/12  4/7  8/11  8/10  6/10  9/12  12/13  6/8  78/106 (73.6%)
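
As a worked check of the "Average" column, the sketch below recomputes the pooled UCPR, TP / (TP + FP), from the per-subject entries of Table 4-4. The numbers are copied from the table; the helper names `ppv` and `pooled_ucpr` are illustrative only.

```python
def ppv(tp, fp):
    """Positive predictive value TP / (TP + FP)."""
    return tp / (tp + fp)


# (true useful, predicted useful) for subjects S1..S10, copied from Table 4-4.
table_4_4 = {
    "MLPTAN": [(6, 9), (8, 10), (8, 11), (4, 5), (6, 9),
               (8, 8), (6, 8), (8, 9), (12, 12), (5, 8)],
    "MLPRBF": [(6, 10), (9, 12), (8, 11), (4, 6), (8, 11),
               (7, 9), (6, 8), (8, 10), (12, 12), (6, 7)],
    "RBFNN": [(6, 7), (8, 10), (8, 10), (4, 5), (6, 8),
              (8, 9), (7, 9), (9, 9), (12, 12), (6, 7)],
    "SVMRBF": [(7, 12), (9, 11), (9, 12), (4, 7), (8, 11),
               (8, 10), (6, 10), (9, 12), (12, 13), (6, 8)],
}


def pooled_ucpr(entries):
    """Sum TP and predicted-useful counts over subjects, then take the ratio."""
    tp = sum(t for t, _ in entries)
    predicted = sum(p for _, p in entries)
    return tp, predicted, ppv(tp, predicted - tp)


for name, entries in table_4_4.items():
    tp, predicted, rate = pooled_ucpr(entries)
    worst_fp = max(p - t for t, p in entries)  # largest FP count for one subject
    print(f"{name}: {tp}/{predicted} ({rate:.1%}), max FP per subject = {worst_fp}")
```

The column is therefore a pooled ratio over all 10 subjects rather than the mean of the ten per-subject rates.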

In Figure 4-2, the UCPR increases monotonically with the threshold, which shows that
a higher threshold gives better predictive performance. In addition, although SVMRBF
achieves the best training performance, it does not achieve the best accuracy in the
testing evaluation. Its predictive performance is also less stable for real-life
application than that of the other three models.

In summary, we suggest RBFNN as the better model for the automatic scheme of useful
component selection. It also gives the better predictive performance for real-life
application. For the RBFNN threshold, we suggest that a value of 0.6 is sufficient
for good performance. Since good components are far fewer than bad components, the
training process may give more weight to the noise components. We plan to explore
data replication as well as the generation of additional data through rotation or
negation. All of this is part of future investigation.
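
One possible reading of the planned data replication and negation is sketched below. It is purely illustrative and not part of the study: each component is assumed to be a plain list of feature values, the minority (useful) class is replicated until the classes are balanced, and a replica is negated half of the time on the assumption that the features are sign-sensitive and that a sign-flipped component is equally valid. Rotation is left out because the source does not specify what would be rotated. The function name `augment_minority` is hypothetical.

```python
import random


def augment_minority(features, labels, minority=1, use_negation=True, seed=0):
    """Replicate minority-class rows (optionally negated) until classes balance."""
    rng = random.Random(seed)
    minority_rows = [f for f, y in zip(features, labels) if y == minority]
    n_majority = sum(1 for y in labels if y != minority)
    new_features = [list(f) for f in features]
    new_labels = list(labels)
    if not minority_rows:
        return new_features, new_labels  # nothing to replicate
    deficit = max(n_majority - len(minority_rows), 0)
    for _ in range(deficit):
        row = list(rng.choice(minority_rows))
        if use_negation and rng.random() < 0.5:
            row = [-v for v in row]  # sign-flipped copy of the component
        new_features.append(row)
        new_labels.append(minority)
    return new_features, new_labels


# Made-up feature vectors: 2 useful (label 1) and 4 bad (label 0) components.
features = [[0.2, -0.1], [0.5, 0.4], [1.0, 1.0], [0.9, 0.8], [0.7, 0.6], [0.3, 0.2]]
labels = [1, 1, 0, 0, 0, 0]
balanced_features, balanced_labels = augment_minority(features, labels)
print(balanced_labels.count(1), balanced_labels.count(0))
```

Replicas should be generated from the training folds only, so that copies of a test component never leak into training.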
