
Chapter 5 Applications and Results

5.1.3 Results

The database of ultrasound images consisted of 538 pathologically proven thyroid nodules (322 benign and 216 malignant) and was used to evaluate the performance of our proposed method. The training and testing datasets contained 275 and 263 cases, respectively. The results are reported in qualitative and quantitative terms.

Qualitative analysis

A step-by-step visualization showed the results of applying our proposed method to ultrasound images, following the schematic diagram in Fig. 3-2.

Two clinical thyroid nodules, one benign and one malignant, were used for this demonstration in Fig. 5-3 and Fig. 5-4, respectively. To compare the original images, AB, and MB, the final AB results for the same two nodules were shown in the third row of Fig. 5-5. In both cases AB was very similar to the corresponding MB (Fig. 5-5, second row).

Fig. 5-3. Result of applying our proposed method to a benign thyroid nodule, following the steps of Fig. 3-2. (A) Original thyroid nodule on the ultrasound image. (B) Four extreme nodes manually placed at the approximate location of the nodule. (C) Region of interest (ROI) automatically generated from the major axis and minor axis. (D) Reference boundary points. (E) New cut points and new radial lines defined by the centers of the eight 45-degree sectors in the ROI. (F) Direction searching method. (G) Outlier elimination method. (H) Inner product method. (I) Smoothing. (J) Linking.

Fig. 5-4. Result of applying our proposed method to a malignant thyroid nodule, following the steps of Fig. 3-2. (A) Original thyroid nodule on the ultrasound image. (B) Four extreme nodes manually placed at the approximate location of the nodule. (C) Region of interest (ROI) automatically generated from the major axis and minor axis. (D) Reference boundary points. (E) New cut points and new radial lines defined by the centers of the eight 45-degree sectors in the ROI. (F) Direction searching method. (G) Outlier elimination method. (H) Inner product method. (I) Smoothing. (J) Linking.

Fig. 5-5. Comparison between the boundary generated by our proposed method and the gold-standard boundary delineated by an experienced radiologist. The first row shows the original images, the second row the corresponding gold-standard boundaries, and the third row the corresponding automatic boundaries.

To show that our proposed method is applicable to complicated cases with variation in the nodule's inner tissues, Fig. 5-6 demonstrated that AB agreed well with the corresponding MB in cases of weak boundary (Fig. 5-6, first row), blurred boundary (Fig. 5-6, second row), missing boundary (Fig. 5-6, third row), inhomogeneity (Fig. 5-6, fourth row), and cysts in the nodule (Fig. 5-6, fifth row).

Fig. 5-7 compared the gold standard (Fig. 5-7, second row), our proposed method (Fig. 5-7, third row), WM (Fig. 5-7, fourth row), ACM (Fig. 5-7, fifth row), and DRLSE (Fig. 5-7, sixth row) on one benign nodule (Case A), one malignant nodule (Case B), and another malignant nodule with inhomogeneity (Case C).

Fig. 5-6. Comparison between the boundary generated by our proposed method and the gold-standard boundary delineated by an experienced radiologist in complicated cases. The first row (Case A) shows a nodule with a weak boundary. The second row (Case B) shows a nodule with a blurred boundary. The third row (Case C) shows a nodule with a missing boundary. The fourth row (Case D) shows a nodule with intensity inhomogeneity. The fifth row (Case E) shows a nodule containing cysts.

Fig. 5-7. Screenshots of identical views of one benign nodule (Case A), one malignant nodule (Case B), and another malignant nodule with inhomogeneity (Case C), comparing our proposed method with other standard methods. The first row shows the original images, the second row the gold-standard boundaries delineated by an experienced radiologist, the third row the automatic boundaries from our proposed method, the fourth row the boundaries from the Watershed Model, the fifth row the boundaries from the Active Contour Model, and the sixth row the boundaries from Distance

Quantitative analysis

The performance results for the 538 thyroid nodules were summarized in Table 5-2. In the testing dataset (263/538), HD and MD were 17.36±10.69 pixels and 4.69±3.00 pixels, respectively. The NHD and NMD were 3.65%±1.15% and 1.02%±0.40%, respectively. These small boundary errors indicated close agreement between AB and MB. The five overlapping area metrics measured the similarity between the two areas delineated by AB and MB. The TPR reached 93.66%±5.07%, whereas the FPR was 7.68%±5.76%. These results demonstrated that our proposed method delineated thyroid nodules with high sensitivity and high specificity.
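The metrics reported above can be illustrated with a minimal sketch. The definitions used here are assumptions based on common usage in the segmentation literature, not the definitions given earlier in this work: HD is the symmetric Hausdorff distance, MD is the average of the two directed mean nearest-point distances, and the area ratios TPR, FPR, and FNR are normalized by the MB area. The function names and the toy data are hypothetical.

```python
from math import hypot

def directed_distances(a, b):
    """For each point in contour a, distance to its nearest point in contour b."""
    return [min(hypot(ax - bx, ay - by) for bx, by in b) for ax, ay in a]

def boundary_errors(ab, mb):
    """Return (HD, MD) in pixels for two contours given as lists of (x, y)."""
    d_ab = directed_distances(ab, mb)
    d_mb = directed_distances(mb, ab)
    hd = max(max(d_ab), max(d_mb))  # assumed: symmetric Hausdorff distance
    md = (sum(d_ab) / len(d_ab) + sum(d_mb) / len(d_mb)) / 2  # assumed: symmetric mean
    return hd, md

def area_metrics(ab_pixels, mb_pixels):
    """Overlap ratios for two regions given as sets of (x, y) pixels."""
    tp = len(ab_pixels & mb_pixels)  # area inside both boundaries
    fp = len(ab_pixels - mb_pixels)  # area inside AB only
    fn = len(mb_pixels - ab_pixels)  # area inside MB only
    return {
        "TPR": tp / len(mb_pixels),
        "FPR": fp / len(mb_pixels),  # assumed: normalized by the MB area
        "FNR": fn / len(mb_pixels),
        "JSI": tp / (tp + fp + fn),
        "PPV": tp / len(ab_pixels),
    }

# Toy data: a 10x10 square region and the same square shifted one pixel in x.
mb_region = {(x, y) for x in range(10) for y in range(10)}
ab_region = {(x + 1, y) for x, y in mb_region}
metrics = area_metrics(ab_region, mb_region)

# Toy contours: four corner points, with AB offset from MB by (3, 4).
mb_contour = [(0, 0), (0, 10), (10, 10), (10, 0)]
ab_contour = [(x + 3, y + 4) for x, y in mb_contour]
hd, md = boundary_errors(ab_contour, mb_contour)
```

With these definitions TPR and FNR sum to 1, which matches the TPR and FNR columns in Table 5-2.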

To investigate the generalization ability of our proposed method, we compared its performance on the training dataset (275/538) and the testing dataset (263/538) with the Mann-Whitney U test. The testing performance was similar to the training performance: the differences in HD, MD, and all five overlapping area metrics were not statistically significant (p ≥ 0.184), and only NHD and NMD differed significantly (p < 0.005) (Table 5-2). This indicated that our proposed method was not over-fitted to the training dataset and could be applied generally to the testing dataset.
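The comparison above relies on the Mann-Whitney U test. The following is a minimal sketch of that test using the normal approximation with tie correction and no continuity correction, which is reasonable for samples as large as 275 and 263 cases. It is not the statistical software used in this work, and the sample values are made up for illustration.

```python
from math import erfc, sqrt

def mann_whitney_u(x, y):
    """Return (U statistic of x, two-sided p-value) via the normal approximation."""
    n1, n2 = len(x), len(y)
    pooled = sorted([(v, 0) for v in x] + [(v, 1) for v in y])
    ranks = [0.0] * len(pooled)
    tie_term = 0
    i = 0
    while i < len(pooled):
        j = i
        while j < len(pooled) and pooled[j][0] == pooled[i][0]:
            j += 1
        average_rank = (i + 1 + j) / 2  # mean of the 1-based ranks i+1 .. j
        for k in range(i, j):
            ranks[k] = average_rank
        t = j - i
        tie_term += t ** 3 - t  # accumulates the tie correction
        i = j
    rank_sum_x = sum(r for r, (_, group) in zip(ranks, pooled) if group == 0)
    u = rank_sum_x - n1 * (n1 + 1) / 2
    n = n1 + n2
    mean_u = n1 * n2 / 2
    var_u = n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    if var_u == 0:
        return u, 1.0  # all values identical, no evidence of a difference
    z = (u - mean_u) / sqrt(var_u)
    return u, erfc(abs(z) / sqrt(2))  # two-sided tail of the standard normal

# Hypothetical samples: clearly separated groups versus identical groups.
u_sep, p_sep = mann_whitney_u([1, 2, 3, 4, 5], [6, 7, 8, 9, 10])
u_same, p_same = mann_whitney_u([1, 2, 3], [1, 2, 3])
```

A small p-value, as for the separated groups, suggests the two samples come from different distributions. A large p-value, as for the identical groups, gives no evidence of a difference.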

To see whether our proposed method was biased toward benign or malignant nodules, we analyzed its performance separately on the 161 benign nodules and 102 malignant nodules in the testing dataset (263/538). The results were listed in Table 5-3. Our proposed method achieved good results for both benign and malignant nodules, and the Mann-Whitney U test showed no statistically significant difference between them.

We compared our proposed method with WM, ACM, and DRLSE using the same testing dataset. In the comparison study we applied the packaged functions provided by the original authors directly to non-preprocessed images. The respective performances of these methods were listed in Table 5-4. We then analyzed the differences in performance between our proposed method and WM, ACM, and DRLSE in turn with the Mann-Whitney U test. The p-values were summarized in Table 5-5.

To evaluate whether our proposed method was insensitive to the adopted parameters, we analyzed the effect of different parameter values on performance. The results of the sensitivity analysis for the tunable parameters “distance w”, “angle θThreshold”, “parameter a”, and “inner product b” were summarized in Table 5-6, Table 5-7, Table 5-8, and Table 5-9, respectively. The performance varied with the values of w and θThreshold. The larger the w, the worse the result. θThreshold affected the performance in the opposite direction: the smaller the θThreshold, the worse the result. In contrast, our proposed method was insensitive to a and b. All performance metrics were almost the same regardless of the values of a and b.
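The trend for θThreshold can be checked directly against the mean values reported in Table 5-7. The short sketch below uses only those published means; the helper name is hypothetical. It also suggests that the trend is a trade-off: NHD and TPR improve as θThreshold grows, while FPR rises and PPV falls.

```python
# Mean values from Table 5-7: (theta_threshold, NHD %, TPR %, FPR %, PPV %)
rows = [
    (45, 5.01, 76.51, 0.86, 98.99),
    (90, 4.71, 81.73, 1.70, 98.16),
    (135, 4.04, 88.54, 3.75, 96.21),
    (180, 3.65, 93.66, 7.68, 92.82),
]

def is_monotonic(values, increasing):
    """True if each value is strictly above (or below) the one before it."""
    pairs = list(zip(values, values[1:]))
    if increasing:
        return all(later > earlier for earlier, later in pairs)
    return all(later < earlier for earlier, later in pairs)

nhd_falls = is_monotonic([r[1] for r in rows], increasing=False)
tpr_rises = is_monotonic([r[2] for r in rows], increasing=True)
fpr_rises = is_monotonic([r[3] for r in rows], increasing=True)
ppv_falls = is_monotonic([r[4] for r in rows], increasing=False)
```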

All performance experiments were run on an Intel Core i7-4500U 1.80 GHz processor with 8 GB RAM. On the same dataset, our proposed method took 3.30 seconds per nodule, whereas WM and ACM took 5.68 and 0.79 seconds per nodule, respectively. DRLSE was the slowest at 16.01 seconds per nodule.

Table 5-2. The individual performance of our proposed method from training and testing datasets by using boundary error metrics and overlapping area metrics. ( HD = Hausdorff distance, MD = mean absolute distance, NHD = normalized HD, NMD = normalized MD, TPR = true positive area ratio, FPR = false positive area ratio, FNR = false negative area ratio, JSI = Jaccard similarity index, PPV = positive predictive value. Unit for HD and MD is pixel. All values are represented as average±standard deviation. p-values are

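| Cases | NHD (%) | NMD (%) | TPR (%) | FPR (%) | FNR (%) | JSI (%) | PPV (%) |
|---|---|---|---|---|---|---|---|
| Benign (161/263) | 3.59±1.08 | 0.99±0.37 | 93.98±4.93 | 7.90±5.52 | 6.02±4.93 | 87.19±4.14 | 92.62±4.59 |
| Malignant (102/263) | 3.75±1.24 | 1.06±0.45 | 93.15±5.25 | 7.34±6.11 | 6.85±5.25 | 86.91±4.86 | 93.13±4.98 |
| p-value | 0.390 | 0.337 | 0.116 | 0.250 | 0.116 | 0.935 | 0.254 |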

Table 5-3. The individual performance of our proposed method from benign and malignant cases by using boundary error metrics and overlapping area metrics. (HD = Hausdorff distance, MD = mean absolute distance, NHD = normalized HD, NMD = normalized MD, TPR = true positive area ratio, FPR = false positive area ratio, FNR = false negative area ratio, JSI = Jaccard similarity index, PPV = positive predictive value.

Boundary Error Metrics

| Cases | Dataset | HD (pixels) | MD (pixels) | NHD (%) | NMD (%) |
|---|---|---|---|---|---|
| 538 | Training (275/538) | 18.07±10.31 | 4.30±2.62 | 3.23±1.12 | 0.78±0.29 |
| | Testing (263/538) | 17.36±10.69 | 4.69±3.00 | 3.65±1.15 | 1.02±0.40 |
| | p-value | 0.267 | 0.184 | <0.005 | <0.005 |

Overlapping Area Metrics

| Cases | Dataset | TPR (%) | FPR (%) | FNR (%) | JSI (%) | PPV (%) |
|---|---|---|---|---|---|---|
| 538 | Training (275/538) | 94.22±4.38 | 7.84±5.39 | 5.78±4.38 | 87.47±3.97 | 92.66±4.49 |
| | Testing (263/538) | 93.66±5.07 | 7.68±5.76 | 6.34±5.07 | 87.08±4.44 | 92.82±4.75 |
| | p-value | 0.337 | 0.542 | 0.337 | 0.575 | 0.555 |

All values are represented as average±standard deviation. p-values are obtained with

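| Method | NHD (%) | NMD (%) | TPR (%) | FPR (%) | FNR (%) | JSI (%) | PPV (%) |
|---|---|---|---|---|---|---|---|
| Our proposed method | 3.65±1.15 | 1.02±0.40 | 93.66±5.07 | 7.68±5.76 | 6.34±5.07 | 87.08±4.44 | 92.82±4.75 |
| Watershed model | 6.22±1.87 | 1.73±0.42 | 94.93±3.00 | 17.50±7.16 | 5.07±3.00 | 81.02±4.24 | 84.82±5.07 |
| Active contour model | 4.60±1.69 | 1.60±0.67 | 86.93±7.35 | 8.31±11.15 | 13.07±7.35 | 80.56±5.84 | 92.62±8.10 |
| DRLSE | 4.22±1.03 | 1.47±0.36 | 87.61±6.31 | 7.75±7.34 | 12.39±6.31 | 81.37±4.02 | 92.61±5.85 |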

Table 5-4. The comparison of our proposed method with other methods by using the same dataset. (HD = Hausdorff distance, MD = mean absolute distance, NHD = normalized HD, NMD = normalized MD, TPR = true positive area ratio, FPR = false positive area ratio, FNR = false negative area ratio, JSI = Jaccard similarity index, PPV = positive predictive value. All values are represented as average±standard deviation.)

| Dataset | Method | NHD | NMD | TPR | FPR | FNR | JSI | PPV |
|---|---|---|---|---|---|---|---|---|
| | | <0.005 | <0.005 | <0.005 | <0.005 | <0.005 | <0.005 | <0.005 |
| | Our proposed method vs. DRLSE | <0.005 | <0.005 | <0.005 | 0.174 | <0.005 | <0.005 | 0.529 |

Table 5-5. Mann-Whitney U test p-values for method comparisons. (HD = Hausdorff distance, MD = mean absolute distance, NHD = normalized HD, NMD = normalized MD, TPR = true positive area ratio, FPR = false positive area ratio, FNR = false negative area ratio, JSI = Jaccard similarity index, PPV = positive predictive value. All values are represented as average±standard deviation.)

| Dataset | w | NHD (%) | NMD (%) | TPR (%) | FPR (%) | FNR (%) | JSI (%) | PPV (%) |
|---|---|---|---|---|---|---|---|---|
| | 5 | 6.26±1.75 | 2.17±0.93 | 79.10±10.28 | 2.87±3.89 | 20.90±10.28 | 76.78±8.61 | 96.94±3.59 |

Table 5-6. Performance of different values of distance w. (HD = Hausdorff distance, MD = mean absolute distance, NHD = normalized HD, NMD = normalized MD, TPR = true positive area ratio, FPR = false positive area ratio, FNR = false negative area ratio, JSI = Jaccard similarity index, PPV = positive predictive value. All values are represented as average±standard deviation.)

| Dataset | θThreshold | NHD (%) | NMD (%) | TPR (%) | FPR (%) | FNR (%) | JSI (%) | PPV (%) |
|---|---|---|---|---|---|---|---|---|
| Testing (263/538) | 45 | 5.01±1.08 | 1.92±0.59 | 76.51±6.40 | 0.86±1.62 | 23.49±6.40 | 75.83±5.84 | 98.99±1.74 |
| | 90 | 4.71±1.23 | 1.59±0.61 | 81.73±7.06 | 1.70±2.56 | 18.27±7.06 | 80.32±6.07 | 98.16±2.53 |
| | 135 | 4.04±1.32 | 1.19±0.51 | 88.54±6.37 | 3.75±3.92 | 11.46±6.37 | 85.34±5.38 | 96.21±3.64 |
| | 180 | 3.65±1.15 | 1.02±0.40 | 93.66±5.07 | 7.68±5.76 | 6.34±5.07 | 87.08±4.44 | 92.82±4.75 |

Table 5-7. Performance of different values of angle θThreshold. (HD = Hausdorff distance, MD = mean absolute distance, NHD = normalized HD, NMD = normalized MD, TPR = true positive area ratio, FPR = false positive area ratio, FNR = false negative area ratio, JSI = Jaccard similarity index, PPV = positive predictive value. All values are represented as average±standard deviation.)

Table 5-8. Performance of different values of parameter a. (HD = Hausdorff distance, MD = mean absolute distance, NHD = normalized HD, NMD = normalized MD, TPR = true positive area ratio, FPR = false positive area ratio, FNR = false negative area ratio, JSI = Jaccard similarity index, PPV = positive predictive value. All values are represented as average±standard deviation.)

Table 5-9. Performance of different values of inner product b. (HD = Hausdorff distance, MD = mean absolute distance, NHD = normalized HD, NMD = normalized MD, TPR = true positive area ratio, FPR = false positive area ratio, FNR = false negative area ratio, JSI = Jaccard similarity index, PPV = positive predictive value. All values are represented as average±standard deviation.)

5.2 Breast tumor detection in 3D ultrasound imaging

5.2.1 Materials

Automated 3D Breast Ultrasound Images

The ABUS images used in this study were acquired with an ACUSON-S2000 Automated Breast Volume Scanner (ABVS) (Siemens Medical Solutions USA, Inc., Malvern, PA) at the Breast Center of National Taiwan University Hospital. The scanner consisted of a flexible arm with a broadband linear array transducer (5-14 MHz, with a center frequency of 11 MHz) at its end, a touchscreen, and a 3D imaging workstation. Scans were performed with the patient in the supine position using the original default settings of the Siemens ACUSON-S2000 ABVS. The transducer moved over the breast at a constant speed to obtain a 3D volume of imaging data covering a large segment of the breast. The medial, anterior-posterior, and lateral regions of each breast underwent standardized examinations, yielding a total of six ABUS image passes covering both breasts.

Each examination consisted of 318 2D transverse plane images with a 0.525 mm slice-to-slice interval. The sagittal and coronal plane images were then reconstructed by the ABVS Workplace system to form a 3D volume. However, the number of slices in the two reconstructed planes varied between patients.
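The slice geometry above fixes the extent of each pass along the scan direction. The sketch below works this out from the two stated figures. It assumes zero-based slice indices with the first slice at 0 mm, and the names are hypothetical.

```python
SLICE_COUNT = 318          # transverse slices per pass, as stated above
SLICE_INTERVAL_MM = 0.525  # slice-to-slice interval, as stated above

def slice_position_mm(index):
    """Position of a transverse slice along the scan direction.

    Assumes zero-based indices with the first slice at 0 mm.
    """
    if not 0 <= index < SLICE_COUNT:
        raise IndexError(f"slice index {index} outside 0..{SLICE_COUNT - 1}")
    return index * SLICE_INTERVAL_MM

# Distance between the first and last slice: 317 intervals of 0.525 mm.
scan_span_mm = slice_position_mm(SLICE_COUNT - 1)
```

Under these assumptions the first and last slices lie about 166.4 mm apart.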

Patients

Informed consent was obtained from all patients recruited in this study, and the study was approved by the National Taiwan University Hospital Research Ethics Committee.

For performance validation, 176 passes from women were used in this study, comprising 132 abnormal passes and 44 normal passes. The 132 abnormal passes contained 162 tumors, all proven by histopathological examination of biopsy specimens, including 79 benign and 83 malignant tumors. The details of the tumor characteristics were shown in Table 5-10. For realism and standardization, the passes of transverse plane images in which a tumor was visible were used for analysis.

All tumors were annotated by an experienced radiologist to serve as the gold standard. To support the generalization of our proposed system, 44 normal passes were also included in the database. All normal passes were confirmed by multiple examination modalities.

Histopathologic diagnosis

| Benign | No. | Malignant | No. |
|---|---|---|---|
| Fibroadenoma | 55 | Invasive ductal carcinomas | 67 |
| Fibrocystic disease | 20 | Ductal carcinomas in situ | 9 |
| Phyllodes tumor | 2 | Invasive lobular carcinomas | 5 |
| Intraductal papilloma | 2 | Malignant phyllodes tumor | 2 |

Table 5-10. Patient clinical data and tumor characteristics. (Size was the longest length among the tumors.)
