This section presents the performance evaluation of the brain extraction methods. Table 2.1 lists the results of the proposed method and the other brain extraction algorithms on the first IBSR data set. In general, MLS and our method performed better than the others. Considering sensitivity and specificity jointly, the accuracy indices of BET and BSE were moderate among the five methods evaluated. In this experiment, HWA did not significantly outperform the proposed method in any accuracy criterion (𝑝 > 0.05). Note that the performance indices of each method in Table 2.1 exclude the cases in which (1) the JSC value between the extracted brain volume and the ground truth was smaller than 0.6 (three cases for BSE); (2) the program terminated without producing any result (three cases for HWA); or (3) the extraction result was blank (one case for BSE and one case for MLS). When these cases (seven in total) were excluded for every method, all methods achieved slightly larger JSC values, which means better overlap between the extracted brain regions and the ground truths, as shown in Table 2.2. HWA showed a marked improvement in sensitivity because four additional poor cases were omitted. With these seven cases excluded, the advantages of BSE and MLS over our method became significant in terms of specificity (𝑝 = 0.001) and JSC (𝑝 = 0.024), respectively.
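The overlap indices reported here can be computed per subject from two binary masks. The following is a minimal sketch, assuming the standard definitions (JSC as intersection over union, sensitivity as TP/(TP+FN), specificity as TN/(TN+FP)). The function names, the flat 0/1 mask representation, and the toy data are hypothetical and are not taken from any of the evaluated tools. The sketch also applies the three exclusion rules described above.

```python
from statistics import mean


def confusion_counts(extracted, truth):
    """Count TP, FP, FN, TN over two equal-length sequences of 0/1 voxel labels."""
    if len(extracted) != len(truth):
        raise ValueError("masks must have the same number of voxels")
    tp = sum(1 for e, t in zip(extracted, truth) if e and t)
    fp = sum(1 for e, t in zip(extracted, truth) if e and not t)
    fn = sum(1 for e, t in zip(extracted, truth) if not e and t)
    tn = len(truth) - tp - fp - fn
    return tp, fp, fn, tn


def overlap_metrics(extracted, truth):
    """Return (JSC, sensitivity, specificity) for one subject (assumed standard definitions)."""
    tp, fp, fn, tn = confusion_counts(extracted, truth)
    union = tp + fp + fn
    jsc = tp / union if union else 0.0
    se = tp / (tp + fn) if (tp + fn) else 0.0
    sp = tn / (tn + fp) if (tn + fp) else 0.0
    return jsc, se, sp


def is_unsatisfactory(extracted, truth, jsc_threshold=0.6):
    """Exclusion rules from the text: no output, blank output, or JSC below 0.6."""
    if extracted is None:        # program terminated without a result
        return True
    if not any(extracted):       # blank extraction
        return True
    return overlap_metrics(extracted, truth)[0] < jsc_threshold


# Toy 1-D "volumes" with 10 voxels each (hypothetical data).
truth = [0, 0, 1, 1, 1, 1, 1, 0, 0, 0]
results = {
    "good": [0, 0, 1, 1, 1, 1, 1, 1, 0, 0],   # one false-positive voxel
    "poor": [1, 1, 1, 0, 0, 0, 0, 1, 1, 1],   # mostly wrong
    "blank": [0] * 10,
    "crashed": None,
}

# Keep only the satisfactory cases, then average the JSC over them.
kept = {
    name: overlap_metrics(mask, truth)
    for name, mask in results.items()
    if not is_unsatisfactory(mask, truth)
}
mean_jsc = mean(m[0] for m in kept.values())
```

In this toy run only the "good" case survives the exclusion rules, which mirrors how dropping failed cases raises the averaged indices.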
To verify that manually removing the slices containing the neck or shoulder regions in the first experiment did not substantially affect the performance of BSE, MLS, and the proposed method, we applied these three algorithms again to extract the brain volumes from the original IBSR images. The results indicated that these three algorithms produced similar extraction outcomes whether or not the excess non-brain slices were removed.
Table 2.3 lists the results of the proposed method and the other extraction algorithms on the second IBSR data set. Our method generally performed better than the others in all accuracy criteria except sensitivity. HWA achieved the best sensitivity in detecting brain tissues at the expense of relatively low specificity. BET, MLS, and
[Figure 2.8 image: panels (a) to (d); result labels: BET, HWA, Our method]
Figure 2.8: Excess non-brain tissues affect the extraction accuracy of BET and HWA. The MR images of the first IBSR data set contain neck areas, as shown on the left of (a) and (c). In this case, BET and HWA cannot extract the brain volumes well, as shown in the middle of (a) and (c). Manually removing several inferior non-brain slices, as shown on the left of (b) and (d), helps BET and HWA produce better extraction results, as shown in the middle of (b) and (d). In contrast, the proposed method is relatively robust to the excess non-brain tissues, as shown on the right of (a) to (d).
Table 2.1: Performance evaluation using the first IBSR data set after excluding the unsatisfactory results of each brain extraction algorithm.

Method           𝐽𝑆𝐶               𝑆e                𝑆p                𝑝m                𝑝f                Time (sec.)
BET              0.878 (.017)^A    0.983 (.023)      0.987 (.004)^a    0.016 (.022)      0.107 (.021)^A    11.4 (.8)∗
BSE^4            0.900 (.025)      0.954 (.035)^A    0.993 (.003)      0.044 (.034)^A    0.055 (.017)^B    3.6 (1.0)+
HWA^3            0.752 (.037)^A    0.974 (.068)      0.970 (.008)^A    0.022 (.056)      0.226 (.026)^A    96.1 (24.7)∗
MLS^1            0.922 (.025)      0.989 (.010)      0.992 (.006)      0.011 (.009)      0.067 (.031)      228.6 (28.0)∗
Proposed method  0.910 (.018)      0.986 (.013)      0.991 (.005)      0.013 (.013)      0.077 (.027)      14.4 (3.6)∗

mean (standard deviation)
^A The proposed method is superior to the compared method with p < .01
^a The proposed method is superior to the compared method with p < .05
^B The compared method is superior to the proposed method with p < .01
The superscripts in the first column indicate the number of the excluded cases. The marks “∗” and “+” indicate that the experiments were executed on an AMD Opteron 240 processor running Linux and an AMD XP 2400+ processor running Windows XP, respectively.
Table 2.2: Performance evaluation using the first IBSR data set without considering all the cases which caused unsatisfactory results.

Method             𝐽𝑆𝐶               𝑆e                𝑆p                𝑝m                𝑝f                Time (sec.)
BET^7              0.881 (.017)^A    0.981 (.026)      0.988 (.003)^A    0.017 (.024)      0.102 (.021)^A    11.5 (.9)∗
BSE^7              0.905 (.025)      0.954 (.035)^A    0.994 (.002)^B    0.044 (.034)^A    0.051 (.010)^B    3.7 (1.0)+
HWA^7              0.762 (.012)^A    0.999 (.001)^b    0.967 (.008)^A    0.001 (.001)^b    0.237 (.013)^A    91.6 (16.2)∗
MLS^7              0.922 (.023)^b    0.989 (.010)      0.992 (.006)      0.011 (.010)      0.067 (.030)^b    230.2 (30.0)∗
Proposed method^7  0.911 (.014)      0.988 (.015)      0.991 (.004)      0.012 (.014)      0.077 (.023)      19.9 (2.9)∗

mean (standard deviation)
^A The proposed method is superior to the compared method with p < .01
^a The proposed method is superior to the compared method with p < .05
^B The compared method is superior to the proposed method with p < .01
^b The compared method is superior to the proposed method with p < .05
The superscripts in the first column indicate the number of the excluded cases. The marks “∗” and “+” indicate that the experiments were executed on an AMD Opteron 240 processor running Linux and an AMD XP 2400+ processor running Windows XP, respectively.
BSE did not differ significantly in any accuracy criterion, except for the specificity of BSE: BSE had significantly lower specificity in detecting non-brain regions than BET (𝑝 = 0.046) and MLS (𝑝 = 0.02).
Tables 2.1 to 2.3 also list the average execution time of the extraction methods on the first and second IBSR data sets. Both experiments show that BSE achieved the best efficiency, followed by BET and our method, even though BSE was executed on a relatively low-end processor. HWA and MLS had clearly the longest processing times among the compared methods. Note that MLS might achieve better efficiency if the algorithm were implemented in C/C++ instead of Java.
For each brain extraction method, the probabilities of false classification for brain and non-brain voxels, 𝑝m and 𝑝f, were calculated to evaluate its extraction risk. Fig. 2.9a shows the risk profiles of the first experiment when the risk ratio 𝑐 between 𝑝m and 𝑝f ranged from 1 to 10. MLS and our method clearly have relatively low extraction risks. BET and HWA perform better than BSE when the risk ratio is larger than 1.8 and 8.0, respectively. This figure also illustrates the extraction risk for the results excluding the seven subjects that caused markedly poor results. The performance of the proposed method, BSE, and MLS improves slightly. The extraction risk of HWA decreases rapidly due to its high sensitivity to the inclusion of brain tissues. The risk profiles of the second experiment, shown in Fig. 2.9b, indicate that the proposed method has the lowest extraction risk among all algorithms if the risk ratio is smaller than 6. HWA performs better than BSE, MLS, BET, and our method if the risk ratio is larger than 1.6, 2.0, 3.0, and 6.0, respectively.
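The crossover points quoted for the second experiment can be checked approximately from the mean 𝑝m and 𝑝f values in Table 2.3. The sketch below assumes a linear risk, risk(𝑐) = 𝑐·𝑝m + 𝑝f. This form is consistent with the reported crossovers, but it is an assumption here because the risk definition is not restated in this section. The function names are hypothetical.

```python
# Mean (p_m, p_f) per method, copied from Table 2.3 (second IBSR data set).
TABLE_2_3 = {
    "BET": (0.038, 0.071),
    "BSE": (0.041, 0.119),
    "HWA": (0.0002, 0.186),
    "MLS": (0.060, 0.061),
    "Proposed method": (0.021, 0.064),
}


def risk(p_m, p_f, c):
    """ASSUMED linear risk: a missed brain voxel costs c times a false inclusion."""
    return c * p_m + p_f


def crossover(method_a, method_b, table=TABLE_2_3):
    """Risk ratio c at which both methods have equal assumed risk, or None if parallel."""
    pm_a, pf_a = table[method_a]
    pm_b, pf_b = table[method_b]
    if pm_a == pm_b:
        return None
    # Solve c * pm_a + pf_a = c * pm_b + pf_b for c.
    return (pf_a - pf_b) / (pm_b - pm_a)


def lowest_risk_method(c, table=TABLE_2_3):
    """Name of the method with the smallest assumed risk at risk ratio c."""
    return min(table, key=lambda name: risk(*table[name], c))


for other in ("BSE", "MLS", "BET", "Proposed method"):
    print(f"HWA vs {other}: crossover at c = {crossover('HWA', other):.2f}")
```

Because the table means are rounded, the computed crossovers only approximate the values read from Fig. 2.9b. Under the same assumption, `lowest_risk_method` selects the proposed method at small risk ratios and HWA at large ones.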
Table 2.3: Performance evaluation for brain extraction algorithms using the second IBSR data set.

Method           𝐽𝑆𝐶               𝑆e                  𝑆p                𝑝m                  𝑝f                Time (sec.)
BET              0.891 (.052)^a    0.959 (.042)        0.989 (.005)      0.038 (.038)        0.071 (.031)      17.9 (2.9)∗
BSE              0.838 (.083)^A    0.957 (.042)^a      0.973 (.030)^a    0.041 (.041)^a      0.119 (.104)      14.9 (0.5)+
HWA              0.814 (.040)^A    0.9997 (.0003)^B    0.965 (.016)^A    0.0002 (.0002)^B    0.186 (.040)^A    101.5 (9.7)∗
MLS              0.878 (.081)^a    0.938 (.099)        0.989 (.007)      0.060 (.098)        0.061 (.037)      485.8 (86.1)∗
Proposed method  0.915 (.018)      0.978 (.011)        0.990 (.003)      0.021 (.011)        0.064 (.022)      27.4 (1.9)∗

mean (standard deviation)
^A The proposed method is superior to the compared method with p < .01
^a The proposed method is superior to the compared method with p < .05
^B The compared method is superior to the proposed method with p < .01
The marks “∗” and “+” indicate that the experiments were executed on an AMD Opteron 240 processor running Linux and an AMD XP 2400+ processor running Windows XP, respectively.
Figure 2.9: Extraction risk evaluation using (a) the first IBSR data set and (b) the second IBSR data set. The mark “∗” indicates that some failed or extremely poor segmentation case(s) are not included. After excluding all of these cases for each method, the risk profiles are shown as the dashed lines in (a).