The Proposed Methods
4.1 Off-line analysis
4.1.1 Data preparation
Subject
Before implementing the on-line automated sleep staging system, we analyzed the data set off-line to choose the method used at each step of automated sleep staging and to verify the staging accuracy. The data sets were obtained from the sleep center of Sang-Mei Hospital in Taichung, Taiwan. There are 25 records from 10 males and 15 females. These subjects may have some kind of sleep disorder, but this is not expected to affect our experimental results.
Records
The PSG was recorded digitally with the Sandman® Digital20™ system. Data were collected by nocturnal polysomnography, including four EEG channels (C3/A2, C4/A1, O1/A2, O2/A1), chin EMG, left and right EOG, EKG, SpO2, airflow and sleep position. The records were saved as EDF (European Data Format) files, a standard format in sleep research. Each subject has two nocturnal sleep records of about 6 to 8 hours each. On the second night, continuous positive airway pressure (CPAP) is applied, so the subject sleeps better than on the first night and the sleep cycles are likely to include more complete stages. Therefore, the recording made with CPAP is used as the training data and the other recording as the testing data. These records were previously scored by sleep specialists, and the manual scores are taken as the ground truth for comparison with the results of the proposed automated staging method. We extracted the EEG signal of the C3/A2 channel from the PSG records as the data set for automated scoring.
4.1.2 Experimental results
The data set includes 25 subjects. All records were scored by sleep specialists; these manual scores serve as our ground truth and are compared with the results from our system. For the off-line analysis, we classified sleep stages into three classes: Wake, REM and NREM.
With three classes, the accuracy of the proposed automated sleep staging system ranges from 69.5% to 93.8%, with an average of 85.04%. Table 4.1 shows the results for the 25 subjects: the total accuracy and the REM accuracy of every record.
Total accuracy is the fraction of epochs whose predicted label is the same as the ground truth. REM accuracy is the fraction of epochs labeled REM in the ground truth that are also predicted as REM. The REM accuracy of some records is very low; this will be discussed in the next chapter.
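The two measures can be sketched in a few lines. This is a minimal illustration, not the system's own code: the function names are hypothetical, and the denominator of the REM accuracy is taken to be the number of manually scored REM epochs, which is consistent with the 142/183 reported for Subject s1902.

```python
def total_accuracy(truth, predicted):
    """Return (matching epochs, all epochs)."""
    if len(truth) != len(predicted):
        raise ValueError("label sequences must have the same length")
    matches = sum(t == p for t, p in zip(truth, predicted))
    return matches, len(truth)


def rem_accuracy(truth, predicted):
    """Return (epochs that are REM in both, epochs that are REM in the ground truth)."""
    if len(truth) != len(predicted):
        raise ValueError("label sequences must have the same length")
    rem_truth = sum(t == "REM" for t in truth)
    rem_both = sum(t == "REM" and p == "REM" for t, p in zip(truth, predicted))
    return rem_both, rem_truth


def as_percent(hits, total):
    """Format a count pair as in Table 4.1; '–' marks a record with no such epochs."""
    if total == 0:
        return "–"
    return f"{100 * hits / total:.1f}% ({hits}/{total})"


# Illustrative labels only, not taken from any record.
truth = ["Wake", "REM", "REM", "NREM", "NREM", "REM"]
predicted = ["Wake", "REM", "NREM", "NREM", "REM", "REM"]
print(as_percent(*total_accuracy(truth, predicted)))  # 66.7% (4/6)
print(as_percent(*rem_accuracy(truth, predicted)))    # 66.7% (2/3)
```

Returning the counts rather than a single ratio keeps the record with no REM epochs representable without dividing by zero.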
Table 4.1: Total accuracy and REM accuracy of the 25 subjects. Total accuracy counts the epochs whose predicted label is the same as the ground truth; REM accuracy counts the epochs labeled REM in both the ground truth and the prediction, out of the epochs labeled REM in the ground truth. The REM accuracy of Subject s0402 is shown as –, which means there is no REM state in this record.
Subject  Total accuracy    REM accuracy       Subject  Total accuracy    REM accuracy
s0102    84.7% (728/859)   77.0% (117/152)    s2102    82.9% (636/767)   100.0% (63/63)
s0302    88.6% (616/695)   87.3% (89/102)     s2202    81.7% (686/840)   54.4% (136/250)
s0402    77.7% (483/622)   –                  s2302    84.9% (699/823)   56.4% (119/211)
s0502    91.4% (620/678)   90.5% (143/158)    s2402    86.8% (688/793)   63.4% (102/161)
s0602    76.4% (616/806)   40.4% (88/218)     s2502    86.8% (665/766)   83.7% (41/49)
s0802    80.9% (637/787)   100.0% (81/81)     s2602    93.7% (731/780)   85.7% (96/112)
s1202    81.1% (664/819)   58.6% (163/278)    s2702    90.0% (764/849)   85.0% (142/167)
s1402    86.2% (702/814)   89.0% (219/246)    s2802    90.7% (640/706)   74.5% (82/110)
s1502    90.9% (687/756)   86.1% (99/115)     s2902    86.9% (672/773)   64.5% (107/166)
s1702    88.3% (722/818)   67.8% (99/146)     s3102    84.9% (656/773)   96.4% (135/140)
s1802    83.7% (769/919)   58.9% (201/341)    s3202    86.2% (676/784)   58.8% (100/170)
s1902    93.8% (804/857)   77.6% (142/183)    s3302    78.6% (582/740)   65.0% (104/160)
s2002    83.5% (641/768)   69.0% (172/228)
Table 4.3 is the confusion matrix of all 25 records, which contain 19592 epochs in total. The element in row i and column j counts the number of times class j was classified as class i. Diagonal elements count correct classifications and off-diagonal elements count misclassifications. The last column of the matrix is the positive predictive value (PPV), which is used to evaluate the accuracy of the classification. The relationships among the terms are shown in Table 4.2. PPV is defined as the ratio of TP to the sum of TP and FP, that is, PPV = TP / (TP + FP).
Table 4.2: Relationships among terms. Definitions of true positive (TP), false positive (FP), true negative (TN) and false negative (FN).
                          Condition (ground truth)
                          positive               negative
Test outcome   positive   True Positive (TP)     False Positive (FP)
               negative   False Negative (FN)    True Negative (TN)
Table 4.3: Confusion matrix of the 25 subjects. The element in row i and column j counts the number of times class j was classified as class i. Diagonal elements count correct classifications and off-diagonal elements count misclassifications. The last column of the matrix is the positive predictive value of the classification result, defined as the ratio of true positives to the total number of epochs that the automated scoring assigned to that state. The bottom-right entry is the overall proportion of correctly classified epochs. There are 19592 epochs in total across the 25 subjects.
                          Manual scoring
                    Wake     REM    NREM   Total     PPV
Automated   Wake    1567     161     290    2018   77.7%
scoring     REM       65    2840     880    3785   75.0%
            NREM     404    1006   12379   13789   89.8%
            Total   2036    4007   13549   19592   85.6%
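The PPV column can be recomputed from the counts in Table 4.3 as a check. The sketch below is illustrative only; the names `CONFUSION`, `ppv_per_class` and `overall_accuracy` are hypothetical and not part of the described system. Rows are the automated scores and columns the manual scores, so each row sum is TP + FP for that class.

```python
CLASSES = ["Wake", "REM", "NREM"]

# Counts from Table 4.3. Rows: automated scoring; columns: manual scoring.
CONFUSION = [
    [1567, 161, 290],
    [65, 2840, 880],
    [404, 1006, 12379],
]


def ppv_per_class(matrix):
    """PPV = TP / (TP + FP): diagonal entry divided by its row sum."""
    return [row[i] / sum(row) for i, row in enumerate(matrix)]


def overall_accuracy(matrix):
    """Correctly classified epochs (the diagonal) divided by all epochs."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total


for name, ppv in zip(CLASSES, ppv_per_class(CONFUSION)):
    print(f"{name}: PPV = {100 * ppv:.1f}%")
print(f"Overall accuracy = {100 * overall_accuracy(CONFUSION):.2f}%")
```

The three PPV values reproduce the last column of the table (77.7%, 75.0%, 89.8%). The overall value is 16786/19592, about 85.68%, which the table lists as 85.6%.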
Figure 4.1 shows the hypnogram of the sleep structure for Subject s1902. The accuracy of this record is 93.8%, the highest of the 25 records. The first and second panels show the first half of the night, and the third and fourth panels show the second half. The blue line is the result of manual scoring, which we take as the ground truth, and the red line is the result of automated scoring by our system. Table 4.4 is the confusion matrix for Subject s1902. The diagonal entries count the correctly classified epochs.
Figure 4.1: Hypnogram of Subject s1902. The horizontal axis is time in hours. The accuracy of the record is 93.8%. The first and second panels show the first half of the night, and the third and fourth panels show the second half. The blue line and the red line are the results of manual scoring and of automated scoring by our system, respectively.
Table 4.4: Confusion matrix of Subject s1902.
                          Manual scoring
                    Wake     REM    NREM   Total     PPV
Automated   Wake      28       2       4      34   82.4%
scoring     REM        0     142       6     148   95.9%
            NREM       2      39     634     675   93.9%
            Total     30     183     644     857   93.8%