4-1. Cross Validation Results of Training Data
Table 4-1 reports the mean accuracy and standard deviation of 10-fold cross validation
for the four machine learning models. SVMRBF achieves the highest accuracy, about
92.6%, followed by RBFNN at about 90.6%. The other two neural networks, MLPTAN and
MLPRBF, perform similarly (83.5% and 83.8%). The standard deviations of all four
models are low and consistent (between 0.0037 and 0.0133), which suggests that the
models were trained to a stable structure with well-chosen parameters. The training
results therefore indicate that automatic systems for selecting useful independent
components may be feasible. This is a strong indication that, with adequate training
data, it may be possible to design a “universal machine” for selecting good (useful)
independent components. Such systems would be extremely useful for real-time
applications. Table 4-2 lists the training parameters used for the performance
evaluation on the testing data.
Table 4-1. The accuracy of 10-fold cross validation.
Supervised Method    Accuracy Rate ± Standard Deviation
MLPTAN               83.5% ± 0.0082
MLPRBF               83.8% ± 0.0133
RBFNN                90.6% ± 0.0085
SVMRBF               92.6% ± 0.0037
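The "mean ± standard deviation" entries of a 10-fold cross validation can be sketched as follows. This is a minimal illustration, not the procedure used in this study: the function names, the random fold assignment, the use of the sample standard deviation, and the toy midpoint classifier are all assumptions made for the example.

```python
import random
import statistics


def k_fold_indices(n_samples, k=10, seed=0):
    """Shuffle the sample indices and deal them into k nearly equal folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def cross_validate(samples, labels, train_fn, predict_fn, k=10, seed=0):
    """Return the accuracy measured on each held-out fold."""
    accuracies = []
    for held_out in k_fold_indices(len(samples), k, seed):
        held = set(held_out)
        train_idx = [i for i in range(len(samples)) if i not in held]
        model = train_fn([samples[i] for i in train_idx],
                         [labels[i] for i in train_idx])
        correct = sum(predict_fn(model, samples[i]) == labels[i]
                      for i in held_out)
        accuracies.append(correct / len(held_out))
    return accuracies


def summarize(accuracies):
    """Mean and sample standard deviation of the fold accuracies (fractions)."""
    return statistics.mean(accuracies), statistics.stdev(accuracies)


# Toy stand-in for a real classifier (assumption): a cut point halfway
# between the two class means of a one-dimensional feature.
def train_midpoint(xs, ys):
    mean0 = statistics.mean(x for x, y in zip(xs, ys) if y == 0)
    mean1 = statistics.mean(x for x, y in zip(xs, ys) if y == 1)
    return (mean0 + mean1) / 2


def predict_midpoint(cut, x):
    return 1 if x > cut else 0


# Synthetic data for illustration only; not the EEG components of this study.
xs = [i / 100 for i in range(100)]
ys = [1 if x >= 0.5 else 0 for x in xs]
fold_acc = cross_validate(xs, ys, train_midpoint, predict_midpoint)
mean_acc, std_acc = summarize(fold_acc)
print(f"{100 * mean_acc:.1f}% ± {std_acc:.4f}")
```

The printed format mirrors Table 4-1, with the accuracy as a percentage and the standard deviation as a fraction.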
Table 4-2. The parameters of supervised methods.
4-2. Evaluation Performance of Testing Data
To evaluate the structures trained with the four machine learning models, we collected
EEG datasets from 10 other subjects and applied ICA to extract the testing components.
Each subject again has 28 components in each session. With real-life application in
mind, we test the accuracy for each subject individually. Figure 4-1 plots the average
testing accuracy over the 10 subjects as a function of the threshold score. Each
colored dotted line shows the average testing accuracy of one trained structure for
thresholds from 0.1 to 0.9. The short vertical line at each threshold point shows the
standard deviation of the accuracy across the 10 subjects. Figure 4-1 leads to two
major findings. The first finding is that the classification accuracy of all four
models increases as the threshold rises from 0.1 to 0.6 and then decreases. The RBFNN
model achieves the best accuracy (92.5%) at threshold 0.6, and the performance curves
show that threshold values of 0.6 and 0.7 give the best classification performance.
Table 4-3 lists the classification accuracy of each subject at threshold 0.6. In each
bracketed fraction, the numerator is the number of correctly classified components
and the denominator is the total of 28 extracted components.
The other finding is that SVMRBF (the black line) gives the most stable classification
performance: its accuracy is above 85% at every tested threshold. Therefore, if the
globally optimal threshold cannot be found, SVMRBF can serve as a general model for
useful component selection, since its accuracy on these testing datasets stays at or
above 85%. An accuracy of 85% corresponds to about four misclassified components out
of 28.
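The threshold sweep behind Figure 4-1 and the "four out of 28" arithmetic can be sketched as follows. This is an illustration under stated assumptions: each classifier is assumed to output a score in [0, 1], a component is assumed to be labeled useful when its score exceeds the threshold, and the scores and labels below are invented toy values, not data from this study.

```python
def accuracy_at_threshold(scores, labels, threshold):
    """Label a component useful (1) when its score exceeds the threshold."""
    predictions = [1 if s > threshold else 0 for s in scores]
    correct = sum(p == y for p, y in zip(predictions, labels))
    return correct / len(labels)


def sweep_thresholds(scores, labels):
    """Accuracy for thresholds 0.1, 0.2, ..., 0.9."""
    thresholds = [round(0.1 * k, 1) for k in range(1, 10)]
    return {t: accuracy_at_threshold(scores, labels, t) for t in thresholds}


def misclassified(accuracy, n_components=28):
    """Approximate number of misclassified components at a given accuracy."""
    return round(n_components * (1 - accuracy))


# Hypothetical classifier scores and true labels (1 = useful, 0 = bad).
scores = [0.05, 0.2, 0.35, 0.55, 0.65, 0.8, 0.95]
labels = [0, 0, 0, 0, 1, 1, 1]

curve = sweep_thresholds(scores, labels)
best = max(curve, key=curve.get)
print(best, curve[best])
print(misclassified(0.85))  # 28 * 0.15 = 4.2, i.e. about 4 components
```

In this toy example the accuracy peaks at an intermediate threshold, which is the same qualitative shape as the curves described for Figure 4-1.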
Figure 4-1. Average accuracy of testing datasets from 10 subjects across threshold scores.
Table 4-3. Classification accuracy of each subject under threshold 0.6.
Subjects S1 S2 S3 S4 S5 S6 S7 S8 S9 S10 Average
In real-life application, we do not know whether a newly observed component is useful
or useless; that is, the desired output for each observation is unknown. We therefore
calculate the positive predictive value (PPV) from the confusion matrix to assess
predictive performance. In this study, the PPV is called the useful component
predictive rate (UCPR). Equation (28) gives the PPV formula, in which TP is the number
of true positives and FP is the number of false positives.
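\[
\mathrm{PPV} = \frac{TP\,(\text{True Useful Components})}{TP\,(\text{True Useful Components}) + FP\,(\text{False Useful Components})} \tag{28}
\]
where TP means the number of true useful components
and FP means the number of false ones. In the UCPR equation, we have the assumption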
that the FP value is as small as possible, because false useful components always
come from bad components (noise). Figure 4-2 plots the average UCPR over the 10
subjects as a function of the threshold. According to Figure 4-1, RBFNN gives the
best accuracy when the threshold is set to 0.6. We therefore list the UCPR of each
subject at threshold 0.6 in Table 4-4 and find that RBFNN also gives the best UCPR
(86%). In Table 4-4, the numerator is the number of true useful components and the
denominator is the number of true useful components plus false ones. The number of
false useful components per subject ranges from zero to five across the four models
and is at most two for RBFNN. Because useful components are fewer than bad
components, a few uncertain bad components that are misclassified as useful can
change the UCPR substantially. Hence, we need to increase the population of useful
components to improve the stability and performance of the automatic scheme.
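As a small illustration of the UCPR defined above, the rate can be computed directly from true and predicted labels. The function name, the 1/0 label encoding, and the example session are assumptions made for this sketch.

```python
def useful_component_predictive_rate(y_true, y_pred):
    """UCPR = TP / (TP + FP), with 1 = useful and 0 = bad component."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if p == 1 and t == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if p == 1 and t == 0)
    if tp + fp == 0:
        return None  # nothing was predicted useful, so the rate is undefined
    return tp / (tp + fp)


# Hypothetical session of 28 components: 7 are predicted useful,
# and 6 of those 7 are truly useful (TP = 6, FP = 1).
y_true = [1] * 6 + [0] * 22
y_pred = [1] * 6 + [1] + [0] * 21

rate = useful_component_predictive_rate(y_true, y_pred)
print(f"UCPR = {100 * rate:.1f}%")
```

With so few components predicted useful, one additional false positive changes the rate by several percentage points, which is why the UCPR is sensitive to a small number of misclassified bad components.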
Figure 4-2. Average useful component predictive rate from 10 subjects across threshold scores.
Table 4-4. Useful component predictive rate of each subject under threshold 0.6.
Subjects  S1    S2    S3    S4   S5    S6    S7    S8    S9     S10  Average
MLPTAN    6/9   8/10  8/11  4/5  6/9   8/8   6/8   8/9   12/12  5/8  71/89 (79.8%)
MLPRBF    6/10  9/12  8/11  4/6  8/11  7/9   6/8   8/10  12/12  6/7  74/96 (77.1%)
RBFNN     6/7   8/10  8/10  4/5  6/8   8/9   7/9   9/9   12/12  6/7  74/86 (86.0%)
SVMRBF    7/12  9/11  9/12  4/7  8/11  8/10  6/10  9/12  12/13  6/8  78/106 (73.6%)
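The Average column of Table 4-4 is consistent with pooling the counts over the 10 subjects (summing numerators and denominators) rather than averaging the per-subject rates. The short check below recomputes it from the table entries; the variable and function names are chosen for this sketch only.

```python
# Per-subject (true useful, predicted useful) counts copied from Table 4-4.
table_4_4 = {
    "MLPTAN": [(6, 9), (8, 10), (8, 11), (4, 5), (6, 9),
               (8, 8), (6, 8), (8, 9), (12, 12), (5, 8)],
    "MLPRBF": [(6, 10), (9, 12), (8, 11), (4, 6), (8, 11),
               (7, 9), (6, 8), (8, 10), (12, 12), (6, 7)],
    "RBFNN": [(6, 7), (8, 10), (8, 10), (4, 5), (6, 8),
              (8, 9), (7, 9), (9, 9), (12, 12), (6, 7)],
    "SVMRBF": [(7, 12), (9, 11), (9, 12), (4, 7), (8, 11),
               (8, 10), (6, 10), (9, 12), (12, 13), (6, 8)],
}


def pooled_ucpr(rows):
    """Sum TP and TP + FP over subjects, then take the ratio."""
    true_useful = sum(tp for tp, _ in rows)
    predicted_useful = sum(total for _, total in rows)
    return true_useful, predicted_useful, true_useful / predicted_useful


for model, rows in table_4_4.items():
    tp, total, rate = pooled_ucpr(rows)
    max_fp = max(t - n for n, t in rows)  # largest per-subject FP count
    print(f"{model}: {tp}/{total} ({100 * rate:.1f}%), max FP = {max_fp}")
```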
In Figure 4-2, the UCPR increases monotonically with the threshold, which shows that
a higher threshold gives better predictive performance. In addition, although SVMRBF
gives the best training performance, it does not give the best accuracy in the testing
evaluation. Its predictive performance is also less stable for real-life application
than that of the other three models.
In summary, we suggest RBFNN as the better model for the automatic scheme of useful
component selection, since it also gives the better predictive performance for
real-life application. For the RBFNN model, we suggest that a threshold of 0.6 is
sufficient for good performance. Because good components are far fewer than bad
components, the training process may give more weight to the noise components. We
plan to explore data replication as well as the generation of additional data through
rotation or negation. All of these are part of future investigation.
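One possible reading of the replication and negation ideas is sketched below. It is an assumption-laden illustration, not a method evaluated in this study: it assumes binary labels, assumes that each component is represented by a numeric feature vector for which a sign flip is meaningful, and uses a hypothetical function name. Rotation is not sketched because its intended form is not specified here.

```python
from itertools import cycle


def balance_minority(samples, labels, minority=1, negate=True):
    """Add copies (optionally sign-flipped) of minority samples until
    both classes have the same number of samples."""
    minority_samples = [s for s, y in zip(samples, labels) if y == minority]
    out_samples, out_labels = list(samples), list(labels)
    if not minority_samples:
        return out_samples, out_labels  # nothing to replicate
    majority_count = len(samples) - len(minority_samples)
    deficit = majority_count - len(minority_samples)
    source = cycle(minority_samples)
    for _ in range(max(deficit, 0)):
        s = next(source)
        out_samples.append([-v for v in s] if negate else list(s))
        out_labels.append(minority)
    return out_samples, out_labels


# Hypothetical feature vectors: one useful component (1), three bad ones (0).
samples = [[1.0, -2.0], [0.5, 0.5], [3.0, 1.0], [2.0, 2.0]]
labels = [1, 0, 0, 0]

new_samples, new_labels = balance_minority(samples, labels)
print(new_labels.count(1), new_labels.count(0))
```

Setting `negate=False` gives plain replication, so the two options can be compared on the same training set.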