National Taiwan University
College of Electrical Engineering and Computer Science
Department of Computer Science and Information Engineering
Master Thesis
Tumor Detection for Automated Whole Breast Ultrasound Image
WEI-WEN HSU (徐位文)
Advisor: RUEY-FENG CHANG (張瑞峰), Ph.D.
July, 2011
Oral Examination Committee Certification
ACKNOWLEDGEMENTS
First, I would like to express my sincere appreciation to my advisor, Dr. Ruey-Feng Chang, for his tireless guidance, patient training, and kind discussions during my graduate studies. In addition, I am grateful to my seniors in the Medical Imaging Laboratory, especially Jimmy Shen and Kai Yang, for their advice and help during the research for this thesis. Last but not least, I want to thank my family and friends for their encouragement and support. I sincerely dedicate this thesis to my parents, whom I admire and respect the most in the world.
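Chinese Abstract
In recent years, breast cancer has been a leading cause of cancer death among women. However, early detection and treatment can substantially reduce breast cancer mortality. Among the many breast examination instruments, breast ultrasound combined with mammography plays a complementary role in breast cancer examination. In recent years, automated whole breast ultrasound has been the subject of many studies, and it performs well in displaying and recording tumor location. Because each case produces a very large amount of three-dimensional image data, reviewing it slice by slice costs physicians a great deal of time. Therefore, this thesis proposes a computer-aided breast tumor detection system that detects suspicious tumor regions to assist physicians in diagnosis. The tumor detection system adopted in this thesis is region-based. First, the fast 3-D mean shift method segments the 3-D image data into regions and removes noise from the image. Next, fuzzy c-means clustering groups these regions according to their gray levels. Because tumors generally have lower gray levels in ultrasound images, the regions assigned to the darkest cluster are regarded as suspicious tumor regions. Afterwards, suspicious regions whose gray levels differ by less than a threshold are merged to present the final segmentation result. Furthermore, to distinguish tumors from non-tumors, seven features are extracted from each segmented suspicious region and combined for classification, reducing the number of non-tumors misjudged as tumors. In this experiment, most of the tumors could be found. With an average of 4.92 non-tumors misjudged as tumors per case, the tumor detection rate of the system is 89.04% (130/146), and the detection rate for malignant tumors reaches 94.03% (63/67). We hope the system can assist physicians in diagnosis.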
ABSTRACT
Breast cancer has been the major cause of cancer death among women in recent years. Nonetheless, early detection and improved treatment can significantly reduce breast cancer mortality. Breast ultrasound (US) is a very important imaging modality that complements mammography in breast cancer detection. Recently, the automated whole breast ultrasound (ABUS) system has been developed to provide proper orientation and documentation of breast lesions.
Because a large three-dimensional (3-D) volume image is obtained for each case, the physician needs to spend a lot of time reviewing all the slice images. Therefore, a computer-aided tumor detection system is proposed to find suspicious tumor regions and assist the physician in diagnosis. A region-based ABUS tumor detection method is adopted in this study. First, the fast 3-D mean shift method segments the 3-D volume image into regions and removes the speckle noise.
Subsequently, fuzzy c-means (FCM) clustering classifies these regions into different classes according to their intensities. Because tumors are usually darker than normal tissues in US, the regions classified into the darkest cluster by the FCM are regarded as the suspicious tumor regions in this study. After FCM, suspicious regions whose intensities differ by less than a merging threshold are merged to present the segmented results.
Moreover, in order to discriminate the real tumors from the non-tumor regions, seven features are extracted from the suspicious tumor regions and a classification method is adopted with 10-fold cross-validation to reduce the false positives (FPs). In the experiments, most of the tumors are found by this system: the sensitivity is 89.04% (130/146) with 4.92 FPs per case. Furthermore, the detection rate for malignant tumors is 94.03% (63/67). The proposed tumor detection system could therefore assist physicians in diagnosis.
Table of Contents
Oral Examination Committee Certification
ACKNOWLEDGEMENTS
Chinese Abstract
ABSTRACT
Table of Contents
List of Figures
List of Tables
Chapter 1 Introduction
Chapter 2 Materials
Chapter 3 Region-based ABUS Tumor Detection
3.1 Region segmentation using fast 3-D mean shift method
3.2 Region classification using fuzzy c-means clustering
3.3 False-positive reduction
Chapter 4 Experimental Results and Discussion
4.1 Experimental Results
4.2 Discussion
Chapter 5 Conclusion and Future Works
References
List of Figures
Fig. 1 The screen of the SomoVu ViewStation.
Fig. 2 The region-based ABUS tumor detection.
Fig. 3 (a) The original image (b) The image after applying fast 3-D mean shift method (c) The regions after fast 3-D mean shift method
Fig. 4 The FCM clustering result. (a) the first cluster (b) the second cluster (c) the third cluster (d) the fourth cluster. The tumor is circled in (d).
Fig. 5 The regions in Fig. 3(c) classified into the darkest cluster by the FCM
Fig. 6 Results of different merging threshold values (a) THtumor= 8 (b) THtumor= 4.
Fig. 7 A bounding box of the suspicious tumor region.
Fig. 8 Flat and narrow fat (a) the original slice image (b) white areas are the suspicious regions.
Fig. 9 The FROC curve of the proposed system.
Fig. 10 A true-positive case of 1.9 cm infiltrating duct carcinoma. (a) The original image (b) The white areas are the suspicious tumor regions before FP reduction. (c) The white area is the result after FP reduction and the solid circle indicates the position of the real tumor.
Fig. 11 A true-positive case of 2.0 cm infiltrating duct carcinoma. (a) The original image (b) The white area is the suspicious tumor region before FP reduction. (c) The white area is the result after FP reduction and the solid circle indicates the position of the real tumor.
Fig. 12 A true-positive case of 1.8 cm fibroadenoma. (a) The original image (b) White areas are the suspicious tumor regions before FP reduction. (c) The white areas are the results after FP reduction. The solid circle indicates the position of the real tumor and the dotted circle indicates the FP.
Fig. 13 A true-positive case of 1.3 cm tubular adenoma. (a) The original image (b) The white areas are the suspicious tumor regions before FP reduction. (c) The white area is the result after FP reduction and the solid circle indicates the position of the real tumor.
Fig. 14 A true-positive case of 3.2 cm infiltrating duct carcinoma. (a) The original image (b) The white areas are the suspicious tumor regions before FP reduction. (c) The white area is the result after FP reduction and the solid circle indicates the position of the real tumor.
Fig. 15 A true-positive case of 3.5 cm infiltrating duct carcinoma. (a) The original image (b) The white areas are the suspicious tumor regions before FP reduction. (c) The white area is the result after FP reduction and the solid circle indicates the position of the real tumor.
Fig. 16 A true-positive case of 7.0 cm DCIS. (a) The original image (b) The white area is the suspicious tumor region before FP reduction. (c) The white area is the result after FP reduction and the solid circle indicates the position of the real tumor.
Fig. 17 A true-positive case of 5.4 cm phyllodes tumor. (a) The original image (b) The white area is the suspicious tumor region before FP reduction. (c) The white area is the result after FP reduction and the solid circle indicates the position of the real tumor.
Fig. 18 A false-negative case of 3.1 cm fibroadenoma. (a) The original image (b) The white areas are the suspicious tumor regions before FP reduction. (c) The result after FP reduction; the solid circle indicates the position of the real tumor.
Fig. 19 A false-negative case of 2.0 cm DCIS. (a) The original image (b) The white area is the suspicious tumor region before FP reduction. (c) The result after FP reduction; the solid circle indicates the position of the real tumor.
Fig. 20 A false-negative case of 2.5 cm infiltrating duct carcinoma. (a) The original image in A-view (b) The white areas are the suspicious tumor regions before FP reduction from (a). (c) The original image in C-view. (d) The white areas are the suspicious tumor regions before FP reduction from (c).
Fig. 21 A false-positive example. (a) The arrow indicates the rib region in the original image. (b) The white areas are the suspicious tumor regions before FP reduction. (c) The white area is the result after FP reduction and the dotted circle indicates the FP. The rib is misclassified as the suspicious tumor region in this case.
List of Tables
Table 1 The results for 113 cases with the tumors
Table 2 The sensitivity rates of tumor detection for benign and malignant tumors
Table 3 The sensitivity rate of different sizes for benign and malignant tumors
Table 4 Median value and p-value of Mann-Whitney U test for each feature
Chapter 1 Introduction
Based on breast cancer incidence rates in recent years, one in eight women will develop breast cancer in her lifetime [1], and breast cancer ranks first among the estimated causes of cancer death in females [2]. Earlier screening and improved treatment are the most likely reasons for the reduced mortality rates [2]. Early detection of the cancer provides a better chance of proper treatment [3]. Therefore, physicians suggest that women over forty years old have a breast cancer examination every year. Mammography [4-6] and breast ultrasound (US) [7, 8] are currently the effective methods for breast cancer detection and diagnosis. The diagnostic results from these modalities greatly assist physicians in deciding on the proper treatment for patients.
Mammography is the common modality for detecting breast lesions at an early stage. However, mammography is less effective for screening women with dense breasts [9-11], and it has drawbacks such as a high false-positive (FP) rate and low specificity. Therefore, breast US is an important adjunct to mammography for evaluating dense breasts. Breast US does not use ionizing radiation, which may harm the human body, and it is non-invasive, efficient, relatively inexpensive, and convenient [12-14]. Additionally, it offers interactive visualization of anatomical tissues in real time. Owing to these advantages, breast US is an appropriate screening tool for breast tumor detection.
The major limitation of conventional US imaging is that the US probe cannot cover the full width of the breast; therefore, its scanning region of interest (ROI) is smaller than that of other imaging modalities. Automated breast ultrasound (ABUS) [11, 15-17] is a new technique for scanning the whole breast. The study of Chou et al. [17] indicates several advantages of ABUS. ABUS could provide better reproducibility for follow-up studies and potential information about breast lesions. For radiologists, ABUS is easy to use without long training, and it shows excellent intra- and inter-observer agreement.
However, the physician needs a lot of time to review the three-dimensional (3-D) images and to diagnose the ABUS images. When a large number of patients are examined, the physician may become tired and misdiagnosis may occur. To reduce misdiagnosis, a computer-aided detection (CADe) system could serve as a second reader to assist the physician's diagnosis and improve diagnostic accuracy [18, 19].
Several CADe systems for breast US have been proposed in recent years. Mogatadakala et al. [20] proposed a method using order statistic features from multiresolution decompositions of energy-normalized subregions to discriminate tumor regions from the surrounding normal regions. Drukker et al. [21, 22] investigated an automatic lesion detection system in two stages. The lesion candidates were first detected with radial gradient index filtering and segmented with a region growing method. Subsequently, a Bayesian neural network was adopted for lesion classification. In the study of Ikedo et al. [23], a tumor detection system for whole breast US images was proposed. Two features, the edge direction and the density difference, were applied for tumor detection. Recently, Chang et al. [24] proposed a CADe system for multipass automated breast US. To improve the image quality, anisotropic diffusion and stick filters were applied to reduce the speckle noise and to enhance the tumor contour. Subsequently, the gray-level slicing method was adopted to segment suspicious lesion regions. Finally, seven features were used as the criteria to discriminate real lesions from non-lesion regions.
A pixel-based tumor detection method takes a lot of processing time, and the speckle noise also affects its detection results. Therefore, a region-based ABUS tumor detection method is adopted in this study. First, the fast 3-D mean shift method [25] is adopted to segment the 3-D image into regions according to the local information. Subsequently, fuzzy c-means (FCM) clustering [26] is used to classify the regions into different classes according to their intensities. Because tumors are usually darker than normal tissues in US, the dark regions can be regarded as the suspicious tumor regions. To discriminate the real tumors from the non-tumor regions, seven features are extracted from the suspicious tumor regions and a classification method is adopted to reduce the false positives.
Chapter 2 Materials
The data used in the study were acquired between June 2007 and June 2008 at Seoul National University Hospital. A total of 146 biopsy-proven lesions (size range 0.2-7.0 cm; mean 1.64±1.15 cm) in 113 female subjects (age range 20-79 years; mean 45.18±10.59 years) were used to evaluate the performance of the tumor detection system. The 146 lesions include 79 benign and 67 malignant lesions. The 79 benign lesions include 44 fibrocystic changes, 31 fibroadenomas, and 4 papillomas. The 67 malignant lesions include 59 infiltrating carcinomas and 8 ductal carcinomas in situ (DCIS). The subjects were scanned with the SomoVu ScanStation (U-Systems, San Jose, CA, USA), as shown in Fig. 1. The SomoVu ScanStation comprises a View Station and a Scan Station. The patient, in the supine position, is scanned by the Scan Station with a 10 MHz linear transducer whose width is 15.4 cm, and the acquired data are transferred to the View Station. Although the transducer is wider than a conventional ultrasound transducer, more than one pass is needed to cover the entire breast of the patient. The pixel resolutions of the acquired 3-D image are 0.285 mm in the transverse direction, 0.086 mm in the sagittal direction, and 0.6 mm in the coronal direction.
Fig. 1 The screen of the SomoVu ViewStation.
Chapter 3
Region-based ABUS Tumor Detection
Because an ABUS image contains a very large number of pixels and US images contain speckle noise, a pixel-based method would take a long time to detect the tumors, and the speckle noise would degrade its performance. Hence, a region-based method is adopted in this study, both to reduce the detection time and to limit the effect of speckle noise. First, the fast 3-D mean shift method [25] is applied to segment the 3-D image into regions. Afterward, fuzzy c-means (FCM) clustering [26] is used to classify the regions into different classes according to their gray levels. Because tumors are darker than normal tissue, the dark regions are the suspicious tumor regions. However, not all the suspicious regions are real tumors, so seven features are extracted from these suspicious regions and a classification method is used to reduce the false positives. The system flowchart is shown in Fig. 2.
Fig. 2 The region-based ABUS tumor detection. [Flowchart: Region segmentation using fast 3-D mean shift method → Region classification using fuzzy c-means clustering → False-positive reduction → Detection results]
3.1 Region segmentation using fast 3-D mean shift method
In this stage, the fast 3-D mean shift method is applied for region segmentation; that is, pixels with a certain local homogeneity are gathered together as a region. US images also contain a lot of speckle noise, which affects the performance of tumor detection. Since the mean shift method can ignore outliers in the data [27], it is very useful for removing the speckle noise in US images. Accordingly, the mean shift method is adopted in this study. After the region segmentation, each pixel in a region is replaced by the mean intensity of that region.
The mean shift method [28, 29] is a feature-space analysis technique with the ability of clustering a discrete data set over the feature-space. We assume that a data set with n data points lying on the feature space and a spherical window of radius r are given. For each data point, the mean shift method computes the mean of the points that lie within the window and then shifts the window to the place of the mean, repeating the above move until convergence. With convergence guaranteed, each point can be associated with a certain peak that represents a cluster.
The mean shift procedure estimates a probability density function using the Parzen window density estimator [30]. Given a discrete data set with n data points denoted as $x_i$, $i = 1, 2, \ldots, n$, in the d-dimensional feature space $R^d$, the kernel density estimation with Epanechnikov kernel K(x) [29, 31] at the point x is given by

$$\hat{f}(x) = \frac{1}{n} \sum_{i=1}^{n} K(x - x_i), \tag{1}$$

where $K(x - x_i) = c\,k\!\left(\left\| \frac{x - x_i}{h} \right\|^2\right)$, k is the profile with respect to K, c is a normalization constant, and h is the bandwidth of the kernel.
For searching the local peak in density distribution, we focus on the gradient of this kernel density estimation:
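
$$\nabla \hat{f}(x) = \frac{1}{n} \sum_{i=1}^{n} \nabla K(x - x_i) = \frac{c}{n} \sum_{i=1}^{n} \nabla k\!\left(\left\| \frac{x - x_i}{h} \right\|^2\right). \tag{2}$$
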
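Let $g(x) = -k'(x)$; since $\nabla k\!\left(\left\| \frac{x - x_i}{h} \right\|^2\right) = \frac{2}{h^2}\,(x_i - x)\, g\!\left(\left\| \frac{x - x_i}{h} \right\|^2\right)$, the gradient can be represented by:

$$\nabla \hat{f}(x) = \frac{2c}{n h^2} \left[ \sum_{i=1}^{n} g\!\left(\left\| \frac{x - x_i}{h} \right\|^2\right) \right] \left[ \frac{\sum_{i=1}^{n} x_i\, g\!\left(\left\| \frac{x - x_i}{h} \right\|^2\right)}{\sum_{i=1}^{n} g\!\left(\left\| \frac{x - x_i}{h} \right\|^2\right)} - x \right] \tag{3}$$
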
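and the mean shift vector is derived:

$$m(x) = \frac{\sum_{i=1}^{n} x_i\, g\!\left(\left\| \frac{x - x_i}{h} \right\|^2\right)}{\sum_{i=1}^{n} g\!\left(\left\| \frac{x - x_i}{h} \right\|^2\right)} - x. \tag{4}$$
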
The algorithm uses formula (4) to update the window center in each iteration.
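The iteration in formula (4) can be sketched in a few lines. This is a minimal illustration and not the thesis implementation: it assumes 1-D data and the Epanechnikov kernel, for which g is constant inside the window, so the update x + m(x) reduces to the mean of the points lying within the bandwidth h. The function name `mean_shift_mode`, the tolerance, and the sample data are hypothetical.

```python
def mean_shift_mode(points, start, h, tol=1e-6, max_iter=100):
    """Follow the mean shift vector m(x) from `start` until it vanishes."""
    x = float(start)
    for _ in range(max_iter):
        # Epanechnikov profile: g is constant for |x - x_i| < h and zero outside,
        # so the weighted mean in formula (4) is a plain mean over the window.
        window = [p for p in points if abs(p - x) < h]
        if not window:
            break                          # empty window: nothing to shift towards
        new_x = sum(window) / len(window)  # new window center, x + m(x)
        if abs(new_x - x) < tol:           # m(x) is (almost) zero: a density peak
            return new_x
        x = new_x
    return x


data = [1.0, 1.2, 0.8, 5.0, 5.1, 4.9]        # two well-separated groups
low_peak = mean_shift_mode(data, start=0.5, h=1.0)
high_peak = mean_shift_mode(data, start=5.6, h=1.0)
```

Points that converge to the same peak belong to the same cluster, which is how the procedure groups pixels into regions.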
When the mean shift method is applied to image processing, the mean shift clustering is carried out over the combined spatial-range domain [32]. For a 3-D gray-level image, the mean shift method works on each pixel over the four-dimensional feature space, i.e., three dimensions for the spatial domain and one dimension for the range (gray-level) domain; that is, for the combined spatial-range domain, the kernel in formula (1) becomes:

$$K_{h_s, h_r}(x) = \frac{C}{h_s^{3}\, h_r}\; k\!\left(\left\| \frac{x^{s}}{h_s} \right\|^2\right) k\!\left(\left\| \frac{x^{r}}{h_r} \right\|^2\right) \tag{5}$$

where xs is the spatial part, xr is the range part of a feature vector, hs and hr are the kernel bandwidths of spatial domain and range domain respectively, and C is the corresponding normalization constant.
However, 3-D volume data sets are so large that applying the 3-D mean shift method directly could take a lot of processing time. To reduce the processing time, the fast 3-D mean shift [25] is applied in our implementation. The fast 3-D mean shift propagates 2-D information from slice to slice in a straightforward manner.
Instead of using a four-dimensional window directly, the fast 3-D mean shift method uses a three-dimensional window in each 2-D image slice. Let the k-th slice of the original 3-D volume image be denoted as fk and its segmented result after applying the 2-D mean shift method be denoted as gk. The algorithm computes the difference of each corresponding pixel between fk and fk+1. If the difference is less than the threshold THz, the pixel in fk+1 is replaced by its corresponding pixel value in gk. After all pixels in fk+1 have been checked, the 2-D mean shift method is applied to generate the segmented result gk+1. Supposing that the 3-D image has M slices, the fast 3-D mean shift method executes the above 2-D mean shift procedure from k=1 to k=M, and the segmented set is {g1 ,…, gM}. Because the ABUS volume is large, the image is also down-sampled to one-eighth of the original size to further reduce the processing time. The fast 3-D mean shift method is applied to the down-sampled image, and the result is then enlarged back to the original size as an approximation of the full-resolution result.
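The slice-to-slice propagation described above can be sketched as follows. This is an illustrative outline under stated assumptions, not the implementation of [25]: the 2-D mean shift step is passed in as a callable `segment_2d` (hypothetical), the volume is indexed as `volume[k]` for slice k, and the "difference" is taken as the absolute gray-level difference.

```python
import numpy as np


def propagate_slices(volume, segment_2d, th_z):
    """Apply a 2-D segmentation slice by slice, reusing the previous result."""
    volume = np.asarray(volume, dtype=float)
    out = np.empty_like(volume)
    out[0] = segment_2d(volume[0])                 # g_1 from f_1
    for k in range(1, volume.shape[0]):
        nxt = volume[k].copy()                     # f_{k+1}
        # Pixels that barely changed between f_k and f_{k+1} inherit g_k.
        close = np.abs(volume[k] - volume[k - 1]) < th_z
        nxt[close] = out[k - 1][close]
        out[k] = segment_2d(nxt)                   # g_{k+1}
    return out


# Stand-in for the 2-D mean shift step (assumption): quantize to multiples of 10.
quantize = lambda s: np.round(s / 10.0) * 10.0
demo = np.array([[[12.0, 48.0]], [[12.5, 30.0]]])  # two slices of size 1 x 2
result = propagate_slices(demo, quantize, th_z=1.0)
```

In the demo, the first pixel changes by only 0.5 between the two slices, so the second slice reuses the segmented value from the first slice before it is segmented itself.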
Three parameters, hs, hr, and THz, determine the segmentation results of the fast 3-D mean shift method. In this study, the parameters are set to (hs, hr, THz)=(7, 15, 1), and the segmentation result is shown in Fig. 3. Note that the number of regions is large and the regions are small in Fig. 3(b) because the chosen parameters are small. If larger parameters were chosen, the boundary between tumor and non-tumor regions might disappear, and tumors would then be connected with other regions. Besides, the most fragmented parts are the regions with higher gray-level intensity, such as the fibroglandular tissue, and those regions are not considered suspicious tumor regions.
Fig. 3 (a) The original image (b) The image after applying fast 3-D mean shift method (c) The regions after fast 3-D mean shift method
3.2 Region classification using fuzzy c-means clustering
After the segmented regions are obtained, the FCM [26, 33] is adopted to classify them into several clusters according to each region's intensity value. Because tumors are darker than normal tissues, the regions classified into the darkest cluster by the FCM are regarded as the suspicious tumor regions in this study. These suspicious regions can then be merged with their neighboring suspicious regions to represent a tumor. However, the suspicious regions in the darkest cluster include not only tumor regions but also darker non-tumor regions such as fat, shadowing, and anechoic regions. To avoid merging the darker non-tumor regions with the tumor regions, a merging threshold value THtumor is used. That is, if the intensity difference between two suspicious regions is smaller than THtumor, the regions are merged. Otherwise, one of the suspicious regions might be only a darker non-tumor region.
In this study, the regions are classified into 4 clusters. The FCM clustering result is shown in Fig. 4, and the regions in Fig. 3(c) classified into the darkest cluster by the FCM are shown in Fig. 5. In Fig. 6, the darkest regions are merged with different merging threshold values. With THtumor=4, the darker non-tumor region is not merged with the tumor region, as shown in Fig. 6(b).
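As a rough illustration of this step, the sketch below runs a plain fuzzy c-means on the mean intensities of the regions and returns the regions whose strongest membership is in the darkest cluster. It is not the thesis code: the fuzzifier m = 2, the evenly spaced initial centers, the fixed iteration count, and all function names are assumptions made for the example.

```python
def memberships(values, centers, m=2.0, eps=1e-9):
    """Fuzzy membership of every value in every cluster (each row sums to 1)."""
    rows = []
    for v in values:
        inv = [(abs(v - c) + eps) ** (-2.0 / (m - 1.0)) for c in centers]
        total = sum(inv)
        rows.append([w / total for w in inv])
    return rows


def fcm_1d(values, n_clusters=4, m=2.0, iters=100):
    """Alternate membership and center updates, starting from evenly spaced centers."""
    lo, hi = min(values), max(values)
    centers = [lo + (hi - lo) * j / (n_clusters - 1) for j in range(n_clusters)]
    for _ in range(iters):
        u = memberships(values, centers, m)
        centers = [
            sum(u[i][j] ** m * v for i, v in enumerate(values))
            / sum(u[i][j] ** m for i in range(len(values)))
            for j in range(n_clusters)
        ]
    return centers, memberships(values, centers, m)


def darkest_regions(values, n_clusters=4):
    """Indices of the regions assigned to the cluster with the lowest center."""
    centers, u = fcm_1d(values, n_clusters)
    dark = centers.index(min(centers))
    return [i for i, row in enumerate(u) if row.index(max(row)) == dark]


region_means = [10, 12, 11, 50, 52, 90, 91, 130, 128]  # hypothetical region intensities
suspicious = darkest_regions(region_means)
```

With these hypothetical intensities, the three regions near gray level 11 form the darkest of the four clusters and are returned as the suspicious candidates.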
Fig. 4 The FCM clustering result. (a) the first cluster (b) the second cluster (c) the third cluster (d) the fourth cluster. The tumor is circled in (d).
Fig. 5 The regions in Fig. 3(c) classified into the darkest cluster by the FCM
Fig. 6 Results of different merging threshold values (a) THtumor= 8 (b) THtumor= 4.
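The effect of the merging threshold can be illustrated with a small union-find sketch. The assumptions here are ours: each suspicious region is summarized by its mean intensity, adjacency is given as a list of index pairs, and two adjacent regions are merged when their original means differ by less than THtumor. The function name and sample values are hypothetical.

```python
def merge_suspicious(means, adjacent_pairs, th_tumor):
    """Group adjacent suspicious regions whose mean intensities differ by < th_tumor."""
    parent = list(range(len(means)))

    def find(i):
        # Walk to the root of i's group, shortening the path on the way.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for a, b in adjacent_pairs:
        if abs(means[a] - means[b]) < th_tumor:
            parent[find(a)] = find(b)

    groups = {}
    for i in range(len(means)):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())


means = [20, 22, 27, 40]          # hypothetical mean gray levels
pairs = [(0, 1), (1, 2), (2, 3)]  # hypothetical neighboring regions
loose = merge_suspicious(means, pairs, th_tumor=8)
strict = merge_suspicious(means, pairs, th_tumor=4)
```

In this toy case the larger threshold joins region 2 to regions 0 and 1, while the smaller threshold keeps it separate, which mirrors the difference between Fig. 6(a) and Fig. 6(b).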
3.3 False-positive reduction
After the above analyses, the suspicious tumor regions are detected; however, not all the suspicious regions are real tumors. Hence, several features of these regions are extracted and used to distinguish the real tumors from the others in order to reduce the number of false positives (FPs). These features are the region's volume, intensity mean, standard deviation of intensity values, neighborhood mean difference, volume ratio, long-short axis ratio, and standard deviation of radii.
Among the suspicious tumor regions, very small regions might be noise and very large regions might be anechoic regions. Therefore, the region volume is used as a feature for removing non-tumor regions. In addition, two physical features are obtained by computing the mean and the standard deviation of each region's gray-level intensity. In general, the anechoic regions are darker than the tumor regions and their standard deviations are much smaller; on the other hand, some tumors, especially malignant tumors, have large standard deviations because these tumors are disordered and highly heterogeneous.
Moreover, the relations between each region and its neighborhood are also used as features. A bounding box is a rectangular parallelepiped that circumscribes the suspicious tumor region, as shown in Fig. 7. The neighborhood mean difference is the difference in mean intensity between the suspicious region and the area inside the bounding box but outside the suspicious region. Since most tumors are surrounded by brighter tissue, the neighborhood mean difference is expected to be larger for tumors. As a result, a region whose neighborhood mean difference is small is likely to be a non-tumor region and is screened out. Another feature, the volume ratio, is the ratio of the volume of the suspicious region to the volume of its bounding box. Because tumors are usually close to ellipsoidal in shape, a tumor region usually occupies a certain proportion of the bounding box, and its volume ratio is much larger than that of a skewed shadow.
Fig. 7 A bounding box of the suspicious tumor region.
Finally, two features that describe the shapes of regions are measured. The long-short axis ratio is defined as the ratio of the length of the longest edge to that of the shortest edge of the bounding box. Compared with tumors, whose shapes are close to ellipsoids, most fat regions are flat and narrow, as shown in Fig. 8. Therefore, the long-short axis ratio is commonly larger for fat regions than for tumors. The last feature is the standard deviation of radii. It measures the diversity of the distances from the centroid to each surface pixel of the suspicious region. If the diversity of the radii is small, the region is close to a sphere in shape.
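The seven features described in this section can be computed from a boolean region mask along the following lines. This is a sketch under our own assumptions rather than the thesis code: volume is counted in voxels, the anisotropic voxel spacing is ignored, surface voxels are those with at least one 6-connected neighbor outside the region, and the function name `region_features` and the dictionary keys are hypothetical.

```python
import numpy as np


def region_features(volume, mask):
    """Seven illustrative features for one suspicious region given as a boolean mask."""
    volume = np.asarray(volume, dtype=float)
    mask = np.asarray(mask, dtype=bool)
    coords = np.argwhere(mask)                       # voxel indices of the region
    lo, hi = coords.min(axis=0), coords.max(axis=0) + 1
    box = tuple(slice(a, b) for a, b in zip(lo, hi))
    inside = volume[box][mask[box]]                  # intensities of the region
    outside = volume[box][~mask[box]]                # rest of the bounding box
    edges = hi - lo                                  # bounding-box edge lengths

    # Surface voxels: region voxels with at least one 6-neighbor outside the region.
    padded = np.pad(mask, 1, mode="constant", constant_values=False)
    interior = padded.copy()
    for axis in range(3):
        for shift in (-1, 1):
            interior &= np.roll(padded, shift, axis=axis)
    surface = np.argwhere(mask & ~interior[1:-1, 1:-1, 1:-1])
    radii = np.linalg.norm(surface - coords.mean(axis=0), axis=1)

    return {
        "volume": int(mask.sum()),
        "mean": float(inside.mean()),
        "sd": float(inside.std()),
        "neighborhood_mean_difference":
            float(outside.mean() - inside.mean()) if outside.size else 0.0,
        "volume_ratio": float(mask.sum() / np.prod(edges)),
        "long_short_axis_ratio": float(edges.max() / edges.min()),
        "sd_of_radii": float(radii.std()),
    }


# Toy volume: a dark 5-voxel region (gray level 20) in bright tissue (gray level 100).
vol = np.full((5, 5, 5), 100.0)
region = np.zeros((5, 5, 5), dtype=bool)
region[1:4, 1:3, 1] = True
region[3, 2, 1] = False
vol[region] = 20.0
features = region_features(vol, region)
```

For a real ABUS volume the voxel counts and edge lengths would have to be scaled by the pixel resolutions given in Chapter 2 before the shape features are compared across directions.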
Fig. 8 Flat and narrow fat (a) the original slice image (b) white areas are the suspicious regions.
The binary logistic regression model [34] is used to classify the suspicious regions into tumor and non-tumor based on the proposed seven features. The predicted values from the binary logistic regression lie between 0 and 1. A threshold value THlogistic is chosen to classify the suspicious regions into the two categories. If the predicted value of a suspicious region is greater than the chosen threshold, the region is regarded as a tumor; otherwise, the region is considered a non-tumor and is ignored.
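The decision rule can be written out as a short sketch. The weights and bias below are purely hypothetical placeholders (in practice they come from fitting the model to training data), and the default threshold of 0.54 is the operating point reported for THlogistic in Section 4.1.

```python
import math


def tumor_probability(features, weights, bias):
    """Logistic model: squash a weighted sum of the features into (0, 1)."""
    z = bias + sum(w * f for w, f in zip(weights, features))
    return 1.0 / (1.0 + math.exp(-z))


def is_tumor(features, weights, bias, th_logistic=0.54):
    """Keep the region as a tumor candidate only if the prediction exceeds the threshold."""
    return tumor_probability(features, weights, bias) > th_logistic


# Hypothetical two-feature model, for illustration only.
weights, bias = [1.0, -1.0], 0.0
kept = is_tumor([2.0, 0.0], weights, bias)      # prediction about 0.88
dropped = is_tumor([0.0, 2.0], weights, bias)   # prediction about 0.12
```

Raising THlogistic removes more false positives at the cost of missing more tumors, which is the trade-off traced by the FROC curve.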
Chapter 4
Experimental Results and Discussion
The proposed tumor detection system for ABUS images is implemented in MATLAB 2008a (The MathWorks, Natick, MA) on the Microsoft Windows 7 operating system (Microsoft, Redmond, WA). The program runs on an Intel Core i7 2.67 GHz CPU with 4 GB of RAM.
4.1 Experimental Results
In this experiment, 146 lesions from 113 patients are used to estimate the performance of the proposed tumor detection method. In the false-positive reduction stage, the binary logistic regression model [34] is adopted with 10-fold cross-validation. The detection results of the proposed method are shown in Table 1. In this table, the numbers of true positives (TPs), false negatives (FNs), and false positives (FPs) are listed for each case. The sensitivity rates for benign and malignant tumors are listed in Table 2. The total number of tumors is 146, of which 67 are malignant and 79 are benign. The sensitivity rate of tumor detection is 89.04% with 4.92 FPs per case. The sensitivity rate for malignant tumors is 94.03%, and the sensitivity rate for benign tumors is 84.81%. The sensitivity rates of different sizes for benign and malignant tumors are listed in Table 3.
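The reported rates follow directly from the counts in Table 1 and Table 2; the short check below recomputes them. The helper name `percent` is ours.

```python
def percent(detected, total):
    """Detection rate in percent, rounded to two decimals."""
    return round(100.0 * detected / total, 2)


overall = percent(130, 146)          # all tumors
malignant = percent(63, 67)          # malignant tumors
benign = percent(67, 79)             # benign tumors
fps_per_case = round(556 / 113, 2)   # total FPs divided by the number of cases
```

The four values come out as 89.04, 94.03, 84.81, and 4.92, matching the figures quoted in the text.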
For the statistical analysis of the proposed features in false-positive reduction, the Kolmogorov-Smirnov test [35] is first applied to check whether each feature is normally distributed. If a feature is normally distributed, the mean and standard deviation are calculated for the tumors and non-tumors, and the difference between the two groups is evaluated with Student's t test. If the distribution of a feature is not normal, the median is listed and the Mann-Whitney U test [35] is used. A p-value of less than 0.05 is considered to indicate a statistically significant difference. All the proposed features are determined to be non-normally distributed by the Kolmogorov-Smirnov test. Thus, the Mann-Whitney U test is applied, and the median and p-value of each feature are listed in Table 4. The free-response operating characteristic (FROC) curve [36] is also adopted to show the performance of the tumor detection system. The FROC curve, shown in Fig. 9, is generated from the predicted values of the binary logistic regression using different values of the threshold THlogistic. At THlogistic=0.54, the sensitivity rate of tumor detection is 89.04% with 4.92 FPs per case. Note that the number of FPs was 63.32 per case before the false-positive reduction.
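How the points of an FROC curve arise from the predicted values can be sketched as follows. The simplifying assumption, ours and not stated in the thesis, is that each real tumor corresponds to at most one candidate region, so counting the true candidates above the threshold counts the detected tumors. The scores and labels below are made up for illustration.

```python
def froc_points(scores, is_true_tumor, n_tumors, n_cases, thresholds):
    """One (FPs per case, sensitivity) point per threshold value."""
    points = []
    for th in thresholds:
        tp = sum(1 for s, t in zip(scores, is_true_tumor) if t and s > th)
        fp = sum(1 for s, t in zip(scores, is_true_tumor) if not t and s > th)
        points.append((fp / n_cases, tp / n_tumors))
    return points


# Five hypothetical candidate regions from two cases containing two real tumors.
scores = [0.9, 0.6, 0.3, 0.7, 0.2]
labels = [True, True, False, False, False]
curve = froc_points(scores, labels, n_tumors=2, n_cases=2, thresholds=[0.5, 0.8])
```

Lowering the threshold moves along the curve towards higher sensitivity and more FPs per case; in this toy example the threshold 0.5 finds both tumors with 0.5 FPs per case, while 0.8 finds one tumor with no FPs.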
According to Table 1, most of the tumors identified by the radiologists are found by the proposed tumor detection system with a low FP rate per case. Eight true-positive examples are shown in Fig. 10 - Fig. 17, three false-negative cases are shown in Fig. 18 - Fig. 20, and Fig. 21 shows a false-positive example. In these figures, the solid circles indicate the positions of the real tumors and the dotted circles indicate the FPs after FP reduction.
Table 1 The results for 113 cases with the tumors
Case No. False positive False Negative True positive Benign(0) /Malignant(1)
1 1 0 1 1
2 6 0 1 0
3 5 0 2 0
4 3 0 1 0
5 9 0 3 0
6 10 0 1 1
7 0 0 1 1
8 6 0 1 0
9 1 0 1 1
10 5 1 0 1
11 7 0 1 0
12 14 0 3 1
13 5 0 3 0
14 2 0 1 1
15 5 1 0 1
16 10 0 2 1
17 3 0 1 1
18 2 1 0 1
19 6 0 1 0
20 2 1 0 0
21 1 0 2 0
22 1 0 1 1
23 2 0 1 1
24 2 0 1 0
25 6 0 1 1
26 2 1 0 0
27 6 1 0 0
28 6 0 1 1
29 12 0 1 0
30 9 0 1 0
31 13 0 3 0
32 7 1 2 0
33 6 0 1 1
34 5 0 3 1
35 6 0 2 1
36 10 0 1 1
37 6 0 1 1
38 3 0 2 0
39 3 0 1 0
40 5 0 1 1
41 5 0 1 1
42 3 0 1 0
43 9 0 2 0
44 3 0 1 1
45 11 0 1 0
46 6 1 0 0
47 6 0 2 0
48 8 0 1 0
49 6 0 1 1
50 0 1 1 0
51 8 0 1 0
52 4 0 2 1
53 2 0 1 1
54 4 0 1 1
55 3 0 1 1
56 9 0 2 0
57 9 0 1 0
58 0 0 1 1
59 3 0 1 0
60 8 0 2 1
61 2 1 1 1
62 4 0 1 1
63 7 0 1 1
64 5 0 1 1
65 3 0 1 1
66 7 0 2 0
67 2 0 1 0
68 0 0 1 0
69 2 1 0 0
70 5 0 1 1
71 2 0 1 1
72 7 0 1 1
73 4 0 1 0
74 5 0 1 1
75 5 1 1 0
76 8 0 1 0
77 7 1 1 0
78 11 0 1 1
79 4 0 1 0
80 6 0 1 1
81 4 0 2 1
82 0 0 1 0
83 9 0 1 1
84 6 0 1 1
85 8 0 2 1
86 2 0 1 1
87 4 0 1 1
88 6 0 1 0
89 5 0 1 1
90 14 0 1 1
91 3 0 1 0
92 0 0 1 1
93 4 0 1 1
94 2 0 1 1
95 7 1 0 0
96 6 0 1 1
97 5 1 2 1
98 1 0 1 1
99 0 0 1 1
100 11 0 1 0
101 3 0 1 0
102 9 0 1 1
103 2 0 1 1
104 2 0 1 1
105 4 0 1 1
106 3 0 2 0
107 6 0 1 1
108 2 0 1 1
109 3 1 1 1
110 2 0 1 1
111 3 0 1 0
112 3 0 1 0
113 1 0 1 0
Total 556 16 130 B:79/M:67
Table 2 The sensitivity rates of tumor detection for benign and malignant tumors
Tumor       Number   Detected   Missed   Sensitivity
Benign      79       67         12       84.81%
Malignant   67       63         4        94.03%
Table 3 The sensitivity rate of different sizes for benign and malignant tumors
            < 1.0 cm         1.0 - 2.0 cm     2.0 - 3.0 cm     ≥ 3.0 cm
Benign      87.23% (41/47)   81.48% (22/27)   100% (2/2)       66.67% (2/3)
Malignant   100% (3/3)       100% (16/16)     90% (27/30)      94.44% (17/18)
Total       88% (44/50)      88.37% (38/43)   90.63% (29/32)   90.48% (19/21)
Table 4 Median value and p-value of Mann-Whitney U test for each feature
Feature                        Median (non-tumor regions)   Median (tumor regions)   p-value
Volume                         26.01                        201.30                   <0.001*
Mean                           37.85                        33.20                    <0.001*
SD                             1.49                         3.26                     <0.001*
Volume Ratio                   0.17                         0.26                     <0.001*
Neighborhood Mean Difference   8.66                         21.20                    <0.001*
Long-short axis ratio          3.05                         2.07                     <0.001*
Variance of radiuses           0.99                         1.12                     0.19
* The difference was statistically significant.
Fig. 9 The FROC curve of the proposed system. [Plot: sensitivity (0% to 100%) versus FPs per case (0 to 35).]
Fig. 10 A true-positive case of 1.9 cm infiltrating duct carcinoma. (a) The original image (b) The white areas are the suspicious tumor regions before FP reduction. (c) The white area is the result after FP reduction and the solid circle indicates the position of the real tumor.
Fig. 11 A true-positive case of 2.0 cm infiltrating duct carcinoma. (a) The original image. (b) The white area is the suspicious tumor region before FP reduction. (c) The white area is the result after FP reduction, and the solid circle indicates the position of the real tumor.
Fig. 12 A true-positive case of 1.8 cm fibroadenoma. (a) The original image. (b) The white areas are the suspicious tumor regions before FP reduction. (c) The white areas are the results after FP reduction. The solid circle indicates the position of the real tumor, and the dotted circle indicates the FP.
Fig. 13 A true-positive case of 1.3 cm tubular adenoma. (a) The original image. (b) The white areas are the suspicious tumor regions before FP reduction. (c) The white area is the result after FP reduction, and the solid circle indicates the position of the real tumor.
Fig. 14 A true-positive case of 3.2 cm infiltrating duct carcinoma. (a) The original image. (b) The white areas are the suspicious tumor regions before FP reduction. (c) The white area is the result after FP reduction, and the solid circle indicates the position of the real tumor.
Fig. 15 A true-positive case of 3.5 cm infiltrating duct carcinoma. (a) The original image. (b) The white areas are the suspicious tumor regions before FP reduction. (c) The white area is the result after FP reduction, and the solid circle indicates the position of the real tumor.
Fig. 16 A true-positive case of 7.0 cm DCIS. (a) The original image. (b) The white area is the suspicious tumor region before FP reduction. (c) The white area is the result after FP reduction, and the solid circle indicates the position of the real tumor.
Fig. 17 A true-positive case of 5.4 cm phyllodes tumor. (a) The original image. (b) The white area is the suspicious tumor region before FP reduction. (c) The white area is the result after FP reduction, and the solid circle indicates the position of the real tumor.
Fig. 18 A false-negative case of 3.1 cm fibroadenoma. (a) The original image. (b) The white areas are the suspicious tumor regions before FP reduction. (c) The result after FP reduction; the solid circle indicates the position of the real tumor.
Fig. 19 A false-negative case of 2.0 cm DCIS. (a) The original image. (b) The white area is the suspicious tumor region before FP reduction. (c) The result after FP reduction; the solid circle indicates the position of the real tumor.
Fig. 20 A false-negative case of 2.5 cm infiltrating duct carcinoma. (a) The original image in A-view. (b) The white areas are the suspicious tumor regions before FP reduction from (a). (c) The original image in C-view. (d) The white areas are the suspicious tumor regions before FP reduction from (c).
Fig. 21 A false-positive example. (a) The arrow indicates the rib region in the original image. (b) The white areas are the suspicious tumor regions before FP reduction. (c) The white area is the result after FP reduction, and the dotted circle indicates the FP. The rib is misclassified as a suspicious tumor region in this case.
4.2 Discussion
US has been shown to be a useful tool for breast tumor detection and a useful adjunct to mammography, especially for women with dense breast tissue [10, 11]. ABUS has become a popular screening tool in clinical practice because it is operator-independent, easy to learn, time-efficient, and more reproducible for follow-up studies [17]. Because 3-D US images contain a large amount of data, tumor detection is not an easy task for the physician, and misdiagnosis may occur. Therefore, CADe systems have been proposed to assist the physician's diagnosis.
In this study, the proposed region-based method reduces the influence of noise, and the strategy of merging the tumor regions after FCM prevents the tumor regions from connecting with non-tumor regions, which improves the segmentation results. In the true-positive examples in Fig. 10 to Fig. 14, the tumors are surrounded by other darker regions that could connect with the tumor region and distort the segmentation. In the proposed method, the suspicious tumor regions generated by FCM are merged only when they fall within the merging threshold THtumor, so that the real tumor is segmented separately from the other non-tumor regions. These segmentation results show that the proposed method segments the tumor regions well.
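The merging strategy discussed above can be illustrated with a small Python sketch. It assumes that each suspicious region is summarized by its mean intensity, that a list of neighboring region pairs is available, and that two neighboring regions are merged when their original mean intensities differ by at most the threshold. The adjacency list, the comparison of original rather than updated means, and all names and values below are assumptions made for illustration, not the exact procedure of this study.

```python
def merge_regions(mean_intensity, adjacency, th_tumor):
    """Group neighboring regions whose mean intensities differ by <= th_tumor.

    mean_intensity: dict mapping region id -> mean gray level.
    adjacency: iterable of (region_a, region_b) pairs that touch each other.
    Returns a sorted list of sorted region-id groups.
    """
    parent = {r: r for r in mean_intensity}

    def find(r):
        # Union-find lookup with path halving.
        while parent[r] != r:
            parent[r] = parent[parent[r]]
            r = parent[r]
        return r

    for a, b in adjacency:
        if abs(mean_intensity[a] - mean_intensity[b]) <= th_tumor:
            root_a, root_b = find(a), find(b)
            if root_a != root_b:
                parent[root_b] = root_a

    groups = {}
    for r in mean_intensity:
        groups.setdefault(find(r), set()).add(r)
    return sorted(sorted(g) for g in groups.values())


# Hypothetical suspicious regions: id -> mean gray level.
means = {1: 20.0, 2: 22.0, 3: 30.0, 4: 21.0}
neighbors = [(1, 2), (2, 3), (3, 4)]
small = merge_regions(means, neighbors, th_tumor=3.0)
large = merge_regions(means, neighbors, th_tumor=10.0)
print(small)
print(large)
```

With the smaller threshold only regions 1 and 2 are joined, whereas the larger threshold joins all four regions into one, which mirrors the trade-off between a partial tumor and a tumor merged with its dark surroundings.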
However, it is very difficult to choose a single merging threshold THtumor that fits all cases, and a few false-negative cases are caused by the merging threshold. In the false-negative case shown in Fig. 18, the merging threshold is too small, so the segmented region covers only part of the real tumor; this narrow region was classified as a non-tumor region in the FP reduction. Conversely, the same merging threshold is too large for the case shown in Fig. 19. The tumor in this case lies directly below the nipple, so the shadows from the nipple seriously affect the segmentation. As a result, with the same merging threshold, the tumor region merges with the shadows and the segmentation is distorted.
Another false-negative case is shown in Fig. 20, in which the malignant tumor is directly adjacent to the anechoic region on its right. Even though the segmentation result appears to perform very well in the A-view, the tumor region connects with another darker region at the tumor edge. The FP reduction treats the merged region as a non-tumor region because the boundary between these two regions is ambiguous, as shown in Fig. 20(c) and (d).
In the false-positive case shown in Fig. 21, the rib has a shape that looks like a tumor, so it is misclassified and becomes an FP. In our detection results, some parts of the ribs are segmented as suspicious regions that look like tumors; therefore, further features should be used to classify these tumor-like suspicious regions as non-tumor.
Chapter 5
Conclusion and Future Works
In this study, an automatic 3-D region-based CADe system for ABUS images was proposed. First, the fast 3-D mean shift method was adopted to segment the 3-D image into several regions. Subsequently, the FCM method was applied to group these regions into classes according to their intensities. Because tumor regions are usually darker than other tissue regions, the darkest regions were regarded as the suspicious tumor regions in our study. Because the suspicious tumor regions contained many FPs, seven features were used to reduce them. In the experiments, the sensitivity of the CADe system was 89.04% (130/146 lesions) with 4.92 FPs per case. The results show that the 3-D region-based CADe system performs well and can provide reliable assistance for diagnosis.
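As an illustration of the clustering step summarized above, the following Python sketch applies a standard one-dimensional fuzzy c-means to region mean intensities and keeps the regions assigned to the darkest cluster. It is a minimal sketch under assumptions: the number of clusters (3), the fuzzifier m = 2, the evenly spaced initial centers, and the intensity values are all hypothetical choices and are not the settings of this study.

```python
def fuzzy_c_means_1d(values, num_clusters=3, m=2.0, max_iter=100, tol=1e-6):
    """Standard fuzzy c-means on scalar values; returns (centers, memberships)."""
    lo, hi = min(values), max(values)
    # Deterministic start: centers spread evenly over the value range.
    centers = [lo + (hi - lo) * (i + 0.5) / num_clusters
               for i in range(num_clusters)]
    exponent = 2.0 / (m - 1.0)
    memberships = []
    for _ in range(max_iter):
        memberships = []
        for x in values:
            dists = [abs(x - c) for c in centers]
            if 0.0 in dists:
                # The value sits exactly on one or more centers.
                row = [1.0 if d == 0.0 else 0.0 for d in dists]
                total = sum(row)
                row = [u / total for u in row]
            else:
                row = [1.0 / sum((d_i / d_j) ** exponent for d_j in dists)
                       for d_i in dists]
            memberships.append(row)
        new_centers = []
        for i in range(num_clusters):
            weights = [row[i] ** m for row in memberships]
            new_centers.append(
                sum(w * x for w, x in zip(weights, values)) / sum(weights))
        shift = max(abs(a - b) for a, b in zip(centers, new_centers))
        centers = new_centers
        if shift < tol:
            break
    return centers, memberships


# Hypothetical mean intensities of nine segmented regions.
region_means = [10.0, 12.0, 11.0, 40.0, 42.0, 41.0, 80.0, 82.0, 81.0]
centers, memberships = fuzzy_c_means_1d(region_means)
darkest = min(range(len(centers)), key=lambda i: centers[i])
# Keep the regions whose highest membership is in the darkest cluster.
suspicious = [k for k, row in enumerate(memberships)
              if max(range(len(row)), key=lambda i: row[i]) == darkest]
print(centers, suspicious)
```

Each region receives a graded membership in every cluster, and only the final assignment to the darkest cluster is used to select the suspicious regions in this sketch.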
Although the final results of tumor detection are acceptable, the proposed method could be further improved. After the FP reduction, some ribs are still misclassified as tumors because their segmented regions are tumor-like. Further features should be used to discriminate these ribs from tumors. Because the ribs are usually located below the central horizontal line of the A-view slice image, features that describe the position of the segmented region may help to further reduce the FP rate of the detection system.
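One possible form of such a position feature is sketched below in Python. It is only a hypothetical example of the future work suggested above: it assumes that row 0 is the top (skin side) of the A-view slice and that a region is described by the row indices of its voxels; the function names and values are illustrative and were not evaluated in this study.

```python
def relative_depth(region_rows, slice_height):
    """Mean row of a region, normalized to [0, 1] (0 = top, 1 = bottom)."""
    if not region_rows or slice_height < 2:
        raise ValueError("need a non-empty region and a slice height >= 2")
    centroid_row = sum(region_rows) / len(region_rows)
    return centroid_row / (slice_height - 1)


def below_center(region_rows, slice_height):
    """True when the region centroid lies below the central horizontal line."""
    return relative_depth(region_rows, slice_height) > 0.5


# Hypothetical regions in a slice that is 401 rows high.
deep_region = [300, 310, 320]     # rib-like position
shallow_region = [50, 60, 70]     # closer to the skin
print(relative_depth(deep_region, 401), below_center(deep_region, 401))
print(relative_depth(shallow_region, 401), below_center(shallow_region, 401))
```

A value of this kind could be appended to the existing feature set, although whether it actually lowers the FP rate would have to be verified experimentally.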
References
[1] N. Howlader, et al., SEER Cancer Statistics Review, 1975-2008: National Cancer Institute, 2011.
[2] A. Jemal, et al., "Global cancer statistics," CA: A Cancer Journal for Clinicians, vol. 61, pp. 69-90, 2011.
[3] B. O. Anderson, et al., "Breast cancer in limited-resource countries: an overview of the Breast Health Global Initiative 2005 guidelines," Breast J, vol. 12 Suppl 1, pp. S3-15, Jan-Feb 2006.
[4] T. E. Wilson, et al., "Breast cancer in the elderly patient: early detection with mammography," Radiology, vol. 190, pp. 203-207, 01 1994.
[5] M. A. Roubidoux, et al., "Bilateral breast cancer: early detection with mammography," Radiology, vol. 196, pp. 427-431, 08 1995.
[6] S. Buseman, et al., "Mammography screening matters for young women with breast carcinoma: evidence of downstaging among 42-49-year-old women with a history of previous mammography screening," Cancer, vol. 97, pp. 352-8, Jan 15 2003.
[7] E. A. Sickles, et al., "Benign breast lesions: ultrasound detection and diagnosis," Radiology, vol. 151, pp. 467-470, 05 1984.
[8] T. M. Kolb, et al., "Comparison of the performance of screening mammography, physical examination, and breast US and evaluation of factors that influence them: an analysis of 27,825 patient evaluations," Radiology, vol. 225, pp. 165-175, 10 2002.
[9] N. F. Boyd, et al., "Heritability of mammographic density, a risk factor for breast cancer," N.Engl.J.Med., vol. 347, pp. 886-894, 09/19/ 2002.
[10] P. Crystal, et al., "Using sonography to screen women with mammographically dense breasts," AJR Am.J.Roentgenol., vol. 181, pp. 177-182, 07 2003.
[11] K. M. Kelly, et al., "Breast cancer detection using automated whole breast ultrasound and mammography in radiographically dense breasts," European Radiology, vol. 20, pp. 734-742, 2010.
[12] K. Flobbe, et al., "The role of ultrasonography as an adjunct to mammography in the detection of breast cancer. a systematic review," Eur.J.Cancer, vol. 38, pp. 1044-1050, 05 2002.
[13] W. K. Moon, et al., "Multifocal, multicentric, and contralateral breast cancers: bilateral whole-breast US in the preoperative evaluation of patients," Radiology, vol. 224, pp. 569-576, 08 2002.
[14] K. J. Taylor, et al., "Ultrasound as a complement to mammography and breast examination to characterize breast masses," Ultrasound Med.Biol., vol. 28, pp. 19-26, 01 2002.
[15] R.-F. Chang, et al., "Whole breast computer-aided screening using free-hand ultrasound," International Congress Series, vol. 1281, pp. 1075-1080, 2005.
[16] E. Wenkel, et al., "Automated breast ultrasound: lesion detection and BI-RADS classification--a pilot study," Rofo, vol. 180, pp. 804-8, Sep 2008.
[17] C. Yi-Hong, et al., "Automated Full-field Breast Ultrasonography: The Past and The Present," Journal of Medical Ultrasound, vol. 15, pp. 31-44, 2007.
[18] M. A. Helvie, et al., "Sensitivity of noncommercial computer-aided detection system for mammographic breast cancer detection: Pilot clinical trial," Radiology, vol. 231, pp. 208-214, Apr 2004.
[19] L. A. Khoo, et al., "Computer-aided detection in the United Kingdom National Breast Screening Programme: prospective study," Radiology, vol. 237, pp. 444-9, Nov 2005.
[20] K. V. Mogatadakala, et al., "Detection of breast lesion regions in ultrasound images using wavelets and order statistics," Medical Physics, vol. 33, pp. 840-849, 2006.
[21] K. Drukker, et al., "Computerized lesion detection on breast ultrasound," Medical Physics, vol. 29, pp. 1438-1446, 2002.
[22] K. Drukker, et al., "Computerized detection and classification of cancer on breast ultrasound," Academic Radiology, vol. 11, pp. 526-535, 2004.
[23] Y. Ikedo, et al., "Development of a fully automatic scheme for detection of masses in whole breast ultrasound images," Medical Physics, vol. 34, pp. 4378-4388, 2007.
[24] R.-F. Chang, et al., "Rapid image stitching and computer-aided detection for multipass automated breast ultrasound," Medical Physics, vol. 37, 2010.
[25] G. F. Dominguez, et al., "Fast 3D mean shift filter for CT images," in Image Analysis, Proceedings, Halmstad, Sweden, 2003, pp. 438-445.
[26] J. C. Bezdek, et al., "Fcm - the Fuzzy C-Means Clustering-Algorithm," Computers & Geosciences, vol. 10, pp. 191-203, 1984.
[27] D. G. R. Bradski and A. Kaehler, Learning OpenCV, 1st ed.: O'Reilly Media, Inc., 2008.
[28] Y. Z. Cheng, "Mean Shift, Mode Seeking, and Clustering," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 17, pp. 790-799, Aug 1995.
[29] D. Comaniciu and P. Meer, "Mean shift: A robust approach toward feature space analysis," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 24, pp. 603-619, May 2002.
[30] E. Parzen, "On Estimation of a Probability Density Function and Mode," The Annals of Mathematical Statistics, vol. 33, pp. 1065-1076, 1962.
[31] D. W. Scott, "Multivariate Density Estimation," ed: John Wiley & Sons, Inc., 1992.
[32] D. Comaniciu and P. Meer, "Mean shift analysis and applications," in Computer Vision, 1999. The Proceedings of the Seventh IEEE International Conference on, 1999, pp. 1197-1203 vol.2.
[33] M. J. Sabin, "Convergence and Consistency of Fuzzy C-Means Isodata Algorithms," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 9, pp. 661-668, Sep 1987.
[34] D. Hosmer and S. Lemeshow, Applied logistic regression (Wiley Series in probability and statistics): Wiley Interscience, 2000.
[35] A. Field, Discovering Statistics Using SPSS, 2nd ed. London: SAGE Publications, 2005.
[36] J. S. Suri, Advances in diagnostic and therapeutic ultrasound imaging. Boston; London: Artech House, 2008.