腦部結構磁振造影影像量化分析與模板建立

(1)

行政院國家科學委員會專題研究計畫成果報告

腦部結構磁振造影影像量化分析與模板建立

研究成果報告(精簡版)

計畫類別：個別型

計畫編號： NSC 98-2221-E-009-147-

執行期間： 98 年 08 月 01 日至 99 年 07 月 31 日

執行單位：國立交通大學資訊工程學系（所）

計畫主持人：陳永昇

共同主持人：陳麗芬、葉子成、謝仁俊

計畫參與人員：碩士級-專任助理人員：顏毓萱

學士級-專任助理人員：謝心怡

碩士班研究生-兼任助理人員：王清偉

碩士班研究生-兼任助理人員：李國維

碩士班研究生-兼任助理人員：王億婷

碩士班研究生-兼任助理人員：林家萍

博士班研究生-兼任助理人員：鄭詠成

報告附件：出席國際會議研究心得報告及發表論文

處理方式：本計畫可公開查詢

中華民國 99 年 10 月 31 日

(2)

Multiresolutional Graph Cuts for Brain Extraction

from MR Images

Yi-Ting Wang

Department of Computer Science National Chiao Tung University

Hsinchu, Taiwan Email: [email protected]

Yong-Sheng Chen

Department of Computer Science

National Chiao Tung University Hsinchu, Taiwan Email: [email protected]

Li-Fen Chen

Institute of Brain Science National Yang-Ming University

Taipei, Taiwan Email: [email protected]

Abstract—We proposed a multiresolutional brain extraction framework which utilize graph cuts technique to classify head magnetic resonance (MR) images into brain and non-brain regions. Our goal is to achieve both high sensitivity and specificity results of brain extraction. Started with an extracted brain with high sensitivity and low specificity, we refine the segmentation results by trimming non-brain regions in a coarse-to-fine manner. The extracted brain at the coarser level will be propagated to the finer level to estimate foreground/background seeds as constraints. The short-cut problem of graph cuts is reduced by the proposed pre-determined foreground from the coarser level. In order to consider the impact of the intensity inhomogeneities, we estimated the intensity distribution locally by partitioning volume images of each resolution into different numbers of smaller cubes. The graph cuts method is individually applied for each cube. The proposed method was compared to four existing methods, Brain Surface Extractor, Brain Extraction Tool, Hybrid Watershed algorithm, and ISTRIP, by using four data sets, the first and the second IBSR data set of the Internet Brain Segmentation Repository, BrainWeb phantom images from the Montreal Neurological Institute, and healthy subjects collected by Taipei Veterans General Hospital. The performance evaluation for brain extraction, our method outperforms others for the first/second IBSR data set and BrainWeb phantom data set, and performs comparably with the BET and ISTRIP methods when using the VGHTPE data set.

I. INTRODUCTION

Accurate brain extraction is important to voxel-based mor-phology (VBM) [1] and neuroimaging studies. VBM of mag-netic resonance (MR) image is a powerful and non-invasive method to investigate human brain in vivo, it compares the concentration of GM or WM regionally in cross-sectional (the same time between groups) or longitudinal (the same group at different times) manner to describe ageing [2] or diseases, such as schizophrenia, bipolar disorder, and Alzheimer’s [3] [4]. The VBM procedure can be separated into following steps: brain extraction, intensity inhomogeneity correction, tissue segmentation, registration, and voxel based analysis. Brain extraction is the work of excluding non-brain tissues such as skull, muscle, and fat. Intensity inhomogeneity correction, tissue segmentation, registration, and voxel based analysis are all benefit from brain extraction. And the accuracy of above processes is important to neuroimaging studies, such as cortical thickness analysis [5].

A. Related works of brain extraction

Many methods have been proposed to solve the brain extraction problem. These methods can be classified into four classifications: intensity-based approaches, edge-based approaches, deformable approaches, and hybrid approaches.

Intensity-based approaches use the threshold of brain/non-brain to achieve brain/non-brain extraction. This method is sensitive to noise or intensity inhomogeneity. Watershed (WAT) describes the high intensities as hill and the low intensities as valley, and if a valley between two regions then they are separated. But watershed may results in over-segmentation caused by noise that separate brain into too many regions. To solve over-segmentation, it’s necessary to merge those regions that belong to the same structure by postprocess [7]. Pre-flooded WAT [7] proposed a pre-flooding approach to tackle the problem of over-segmentation. They inverted the intensity value of original image, and merged basin with height less than certain intensity value. The disadvantage of this method is that sometimes the results contain some eye skull or exclude parts of the cerebellum.

Edge-based approaches use gradient to locate the bound-ary between brain/non-brain, but may cause some errors in temporal since there are strong gradient to separate tempo-ral and cerebellum. Brain Surface Extractor (BSE) [8] uses anisotropic diffusion filtering to smooth noise, Marr-Hildreth edge detector to produce closed contours, and refines the result by mathematical morphology.

Deformable approaches is a method of brain extraction by deforming a rough surface to the brain surface. BET [6] finds a brain/non-brain threshold, initial a triangular tessellation of a sphere’s surface at center of gravity (COG), and finally iteratively deforms to the brain surface. ISTRIP [10] initializes an ellipsoid by finding the effective intensity range, then uses Wendland’s radial basis function (RBFs) deforming to the brain surface.

Hybrid approaches combine methods of different types to segment brain/non-brain. For the purpose of correcting the method of WAT, HWA combines pre-flooded WAT and deformable surface model to get better accuracy of brain extraction. HWA first perform the pre-flooded WAT to obtain an rough brain extraction and this brain as initial of deformable

(3)

Foreground seeds Background seeds S2 S1 Goal of boundary S1 Short-cut S2

Fig. 1. Short-cut problem of graph cuts. S1 is the goal of boundary and

S2 is the short-cut path. Although the edge weights of n-links on the goal

of boundary boundary S1 are smaller than those on the short-cut path S2,

however, the boundary S1is to long cause the cost is bigger than short-cut

S2. Therefore, graph cuts may result in short-cut S2instead of boundary S1.

surface model, then warp a accurately segmented brain atlas to the result of the pre-flooded WAT by rigid registration and deform to the accurate surface in the end. HWA is benefits of robust by using the pre-flooded WAT and geometric constraints of deformable surface model but is too conservative and still containing lots of non-brain.

B. Graph cuts

Graph cuts is a method for separating foreground and background and is similar to max-flow min-cut theorem. The difference is that in graph cuts, each vertices has an edge (t-link) connected to source S, an edge connected to sink T and an edge (n-link) between neighboring vertices.

When we use graph cuts, the first step is to select foreground seeds and background seeds. And then we assign edge weights as capacity. The weight of n-link is according the difference of gray value or color between neighboring points. The weight of t-link is according the similarity to foreground seeds and background seeds. If the intensity of a node is like to fore-ground seeds than the weight of t-link connected to source S is big that means this node has more chance be classified as foreground and vise versa. Finally, according to the weights of t-link and n-link, we can get a cut to partition foreground and background by graph cuts algorithm.

Short-cut is a well-known problem in graph cuts and mostly occur in thin and long region. Short-cut problem means that the cost of desired boundary is bigger than the short-cut path that misclassification happened. Figure1 is an example of the short-cut problem. This problem can be solved by combining geodesic information and graph cuts algorithm [13].

Based on [12], many applications of graph cuts were published. Whatever no methods are perfect, therefore, user interactive tools often come with graph cuts [14] [15] [16] [17] [18]. Video Snap Cut [19] mentioned that graph cuts with global statistics can’t handle complexity scene therefore they segment with localized classifiers. The user will roughly define the foreground and background on key frame, and get a initial result. The localized classifiers then surround along

the initial result. Each localized classifiers will combine local information to get a segmentation. These localized classifiers will propagate to next frame based on optical flow and update their local statistic. For refining the result of graph cuts, [17] also provides a tool to compute local color distribution.

For decreasing the cost of graph cuts algorithm, Multilevel banded graph cuts (BGC) [20] run full grid graph cuts with the coarsest image and get an initial object , then refine the initial object in a coarse-to-fine manner and graph cuts only operate in a band which between eroded and dilated the object. In this way, we can decrease memory usage and time cost but might exclude some thin structures such as blood vessels showed in the experiment of 3-D heart and pulmonary artery CT volume. To tackle the problem of BGC, [21] presented a accurate multilevel BGC by identifying thin structures and adding to BGC system when fining the lower resolution object. C. Motivation of this work

In the existing brain extraction methods, some have high sensitivities whereas some have high specificities. The high sensitivity stands for extracting large percent of the brain region, and the high specificity value rejects large percent of the non-brain region. The pursuits of high sensitivity and high specificity are generally conflicting. How to comprise between high sensitivity and high specificity is essential to accurate brain extraction. HWA method has the advantage of high sensitivity, but the disadvantage is its low specificity. In this work, we planned to develop an accurate brain extraction algorithm by using the brain region estimated by HWA method as the initial. While maintaining high sensitivity, our algorithm achieves high specificity by trimming non-brain regions from the initials with multiresolutional graph cuts.

II. METHOD

A. Flowchart of multiresolutional brain extraction

The proposed brain extraction method is composed of three major steps, as shown in Figure 2. First, we apply the HWA method to initialize the brain region as the roughly extracted brain. In the image of the roughly extracted brain, we remove non-WM voxels with high intensities by segmenting the WM tissue with region growing and excluding those voxels brighter than the segmented WM tissue. This volume is denoted by VL, where VL represents that this volume is at level L, in this thesis. The deleted voxels are usually non-brain voxels, such as fat.

The second step is the multiresolutional graph cuts and the advantage of multiresolutional framework to this work is that we can use the result of coarser levels as constraints to finer levels. We downsample VL image into lower-resolution ones, and from fine to coarse levels are VL−1, VL−2_{, . . . , V}0_{, as}

shown in Figure 3. The eight voxels in L − l level (a 2 ∗ 2 ∗ 2 cube) are combined into one voxel in L − (l + 1) level, l = 0, 1, . . . , L − 1. We start the multiresolutional graph cuts from the coarsest level. The shape of brain in this level is much easier to obtain, because of the brighter voxels not belonging to brain are blurred. To prevent short-cut problem in graph cut,

(4)

Fig. 2. Flowchart of the proposed brain extraction method.

the best foreground seeds should cover the object as much as possible. Hence, we take the WM as our foreground seeds at the level 0. And the estimation of foreground/background seeds at finer levels are constrained by upper level extraction results.

At the final step, we fill holes, which are background areas surrounded by foreground areas in coronal view. Since the darker voxels in the ventricles might be classified as background by the graph cuts algorithm and the deleted voxels in the first step might include some voxels of WM, and resulting in scattered holes.

There will be more details of removing the bright voxels in non-WM regions, multiresolutional graph cut and the final step in the following sections for the above steps.

B. Removal of bright voxels in non-WM regions

The foreground seeds are determined by intensity distribu-tion, which may be affected by the voxels with high intensities. Therefore, before application of graph cuts algorithm, we delete those bright voxels according to the estimated WM intensity distribution. We use 3-D region growing and Otsu’s [25] method to estimate the WM tissue in the way similar to [22]. The other reason of removing bright voxels in non-WM regions is that they may be classified as foreground by graph cuts algorithm, because the intensities in non-WM brain regions should not be larger than those of WM regions.

To estimate WM, in our observation, we can partition the results of rough brain extraction into four categories from darker to brighter with three thresholds which were found

(5)

Fig. 3. The multilevel image pyramid. (a) (b) (c) (d) (e) (f) threshold 1 threshold 2 threshold 3

Fig. 4. The four categories by Otsu’s method of the initial brain extraction. (a) The result of initial brain extraction. (b) The histogram of single slice and can be classified into four categories by threshold 1, threshold 2 and threshold 3. The horizontal axis is the gray value and the vertical axis is the number of voxels. (c) The first category is below threshold 1 and composed of voxels with gray values like CSF and non-head voxels. This image is shown the nonzero voxels with gray value below threshold 1. (d) The second category is between threshold 1 and threshold 2 and composed of voxels with gray values between those of CSF and GM. (e) The third category is between threshold 2 and threshold 3 and composed of voxels with gray values of GM. (f) The fourth category is greater than threshold 3 and composed of voxels with gray values of WM.

using Otsu’s method, as shown in Figure 4. The first category is composed of voxels with gray values like CSF and non-head voxels. The second category is composed of voxels with gray valuess between those of CSF and GM. The third category is majorly composed of voxels with gray values of GM. The fourth category is majorly composed of voxels with gray values of WM. When gathering statistics of the distribution of gray values, we only consider voxels with gray values lower than T , where T is the gray value higher than 98 percent of total nonzero voxels (similar to [6]). The reason to exclude those voxels with particularly high values is that they may dominate the third threshold of Otsu’s [25] method.

For each slice, we apply the Otsu’s method described above to determine the three thresholds and the largest threshold is used to segment WM voxels. We start 3-D region growing to

Fig. 5. The results of segmented white matter tissue in coronal slice. Top row are the results of the initial brain extraction. Bottom row are the results of the segmented white matter tissue.

find WM with voxels higher than the threshold between WM and GM for each slices located between 2/5 and 3/5 middle transverse slices. These starting points of region growing locate at the center region of each slices.The center region at superior and inferior transverse slices may be non-brain region, therefore we limit start points between 2/5 and 3/5 middle transverse slices. Since the connectivity of 3-D region growing is higher than that of 2-D region growing, we use a strict threshold t as the stopping criterion of region growing, to avoid from growing WM region to brighter non-brain region. Thus threshold t represents the maximum intensity difference between 3-D neighboring voxels. To improve the accuracy of WM segmentation, we perform 2-D region growing starting from the results of 3-D region growing, and uses the third threshold of Otsu’s method and T as the stopping criterion. Figure 5 demonstrates the results of WM segmentation.

To avoid the influence of bright voxels in non-WM region on the intensity distribution, we remove those voxels with gray values larger than the average of estimated WM subtracted by the estimated WM standard deviation. We call this volume VL_{, as shown in the bottom row of Figure 6. The reason why}

we remove those voxels with high gray values is that those voxels most likely belong to a non-brain region, such as fat . These kind of voxels are determined as foregrounds while doing the graph cuts algorithm. Removal of the brighter voxels in the non-WM areas can improve the estimation of WM as well as the brain extraction result.

C. Graph cuts for brain extraction

1) Edge weights in the graph cuts method: The proposed method construct a graph at each level and apply the graph cuts algorithm to extract the brain region. In the constructed graph, neighboring voxels have a connected edge, which is denoted by n-link. Besides, each voxel has edges, denoted by t-link, connected to two specific vertices: foreground terminal S and background terminal T sas shown in Figure 7. There are two kinds of n-links that need to be assigned with weights, one

(6)

Fig. 6. The result of deleting non-WM voxels with gray values higher than estimated WM. Top row: the results of initial brain extraction. Bottom row: the result of deleting the voxels with gray values higher than estimated WM.

is the inter-n-link and the other is the intra-n-link. Inter-n-link connects voxels in consecutive slices, shown as black lines in Figure 7, and the intra-n-link connects voxels in the same slice, shown as orange lines in Figure 7. For segmentation of foreground and background, we need to assign edge weight for each edge in the constructed graph and apply graph cuts algorithm. According to the weights of t-link and n-link, the graph cuts algorithm will find a global optimal path to classify foreground and background by minimizing the energy function: E = X (xi,xj)N A(xi, xj) + X (xi,S)N,or(xi,T )N B(xi) . (1)

The xi is a voxel in the image. N is a set of the

neigh-boring voxels xi and xj in the path of the graph cut, and

A(xi, xj)/B(xi) is the function to assign the weight of

n-link/t-link, respectively.

The finest level has different equation of edge weights from other levels. To assign the weight of n-link at coarser levels, the edge weight is assigned as:

A(xi, xj) = wn× exp(−

(Ixi− Ixj) 2

2α2 ) , (2)

where wn is a parameter representing the importance of

n-link, Ixi and Ixj are the intensity of two adjacent voxels, xi

and xj, connected by this link. If the difference between Ixi

and Ixj is large, then the weight of this link becomes small.

Thus, the cut path is more likely to go through this link. The weight of n-link at the finest level is assigned as: A(xi, xj) = wn×exp(− (Ixi− Ixj) 2 2α2 )+wd×( D(xi) + D(xj) 2 ) . (3) In the finest level, we add a distance term D(x) to Eq. (2). The D(x) is defined as the least-squares distance from x to the contour of the extracted brain which is upsampled from the coarser level to the finest level. In this work we use a library which is based on kd-trees and box-decomposition

trees for approximate nearest neighbor searching (ANN) [24] to estimate the distance term.

The parameter value of wn and wd for inter slice is half

of the value in intra slice, in order to keep the relationship of different slices but not too strong.

For the weight of t-link connected to foreground terminal is defined as:

B(xi) = wt×

PF(xi)

PF(xi) + PB(xi)

, (4) where wt stands for the importance of this term in contrast

to n-link. For each voxel x, PF(xi) and PB(xi) are its

like-lihood functions to foreground seeds and background seeds, respectively: PF(xi) = 1 p2πσf exp(−(Ixi− µf) 2 2σ2 f ) , (5) PB(xi) = 1 √ 2πσb exp(−(Ixi− µb) 2 2σ2 b ) . (6) The σf is the standard deviation of foreground seeds, whereas

and σb is set as twice the standard deviation of background

seeds for increasing the separability of foreground and back-ground by increasing the backback-ground range. The µf and µb

are the average of foreground seeds and background seeds. If voxel Ixi is more likely a background seed, then the t-link

connected to the foreground terminal is much easier to be cut, then x has more chance to be classified as a background. The weights of foreground seeds connecting to foreground terminal are set to be infinity and they will be classified as foreground. Similarly, t-link connected to background terminal is defined as:

B(xi) = wt×

PB(xi)

PF(xi) + PB(xi)

. (7) The weights of background seeds connecting to back-ground terminal are set to be infinity and they will be classified as background. The edge weight calculation func-tions from Eq. 1 to Eq. 7 were modified from those in [12]. We added a distance term in Eq. 3 at the origi-nal resolution level. We use the graph cuts algorithm pro-posed in [23] and the implementation can be found in http://www.cs.ucl.ac.uk/staff/V.Kolmogorov/software.html. D. Multiresolutional graph cuts for brain extraction

For the purpose of multiresolutional graph cut, we down-sample the VL (the highest resolution) at level L into coaser levels, VL−1, VL−2, . . . , V0. The equation of downsampling is defined as: V_x,y,zL−l = 1 8 1 X i=0 1 X j=0 1 X k=0 V_{2x+i,2y+j,2z+k}L−(l−1) , for 1 ≤ l ≤ L . (8) The VL−l

x,y,z is the intensity at coordinate (x, y, z) and at level

L − l. The number of voxels at L − (l − 1) level are 8 times of voxels at L − l level.

(7)

Fig. 7. Demonstration of n-link and t-link. Blue lines are the t-link connecting to background terminal. Green lines are the t-link connecting to foreground terminal. Orange lines are the intra slice n-link (link in the same slice). Black lines are the inter slice n-link (link in different slices).

1) Graph cuts at the coarsest level: To get the shape of brain, we start running graph cut with V0_{, which is at the}

lowest resolution. At this level, the shape of brain is much easier to obtain, because of the brighter voxels that are not belonging to brain were blurred. On the other hand, the brighter voxels (WM) belonging to the brain area are set as foreground seeds, which always be classified as foreground. We use the region growing method, describe in Section 2.1, to segment the WM tissue in V0as foreground seeds, as voxels overlaid with warm colors in the middle row of Figure 8. On the other hand, we use the voxels with non-zero gray values and below the lowest threshold as background seeds, as shown as voxels overlaid with cool colors in the middle row of Figure 8. After the graph cut of the entire brain was completed, we get a foreground object, O0, as show in bottom row of Figure 8. Then erode O0 one voxel to get a smaller volume, E0, as a constraint to the middle resolution level.

Fig. 8. Demonstration of foreground seeds, background seeds and results at the lowest resolution level. Top row: the volumes are V0, individually. Middle row: the warm colors are foreground seeds and cool colors are background seeds. Bottom row: the extracted brain at the lowest resolution level.

2) Graph cuts at the middle levels: The graph cuts at middle levels is to refine the extracted brain from the lowest resolution level. At middle dimension level l, 0 < l < L, to get a pre-determined foreground area we first upsample the eroded brain region, El−1_{, at coarser level l − 1 to level l the}

pre-determined foreground area El. The upsampling equation of the eroded volume is defined as:

E_x,y,z0l = El−1x 2, y 2, z 2 , (9)

where E0l_x,y,zis the intensity at coordinate (x, y, z) and at level l. The weights of t-link connecting to foreground terminal in E0l, as shown in third row of Figure 9, are set to be infinity. This ensures E0l must be classified as foreground. The pre-determined foreground area can reduce the short-cut problem and prevent cut at WM/GM boundary.

Considering local gray value distribution, we partition Vl into 2l_{× 2}l_{× 2}l_{cubes (each dimension is divided into smaller}

overlapping region, as shown in Figure 10. Then we perform graph cut for each cube individually. Each cube has 1/4 in length overlapped with its adjacent cubes, this reduce the inconsistent border between two cubes. Each cube has its own foreground and background seeds. We estimate the three thresholds with each cube by Otsu’s method. If the voxel is brighter than the third threshold and contains in E0l then it is be defined as foreground seed, as shown as voxels overlaid with warm colors in the second row of Figure 9. If the voxel intensity is nonzero and smaller than the first threshold then it will be defined as background seed, as shown as voxels overlaid with cool colors in the second row of Figure 9. The bottom row of Figure 9 is the extracted brain at this resolution and we denote this volume as Ol. The eroded Ol is defined as El_{. For the overlapped region between neighboring cubes,}

we integrate the results by logical OR operation.

3) Graph cuts at the finest level: Graph cuts applied at the finest level is to get the final extracted brain. At the finest level, the pre-determined foreground area is the upsampled EL−1_,

as shown in the third row of Figure 11. We partition VL _into

2L_{× 2}L_{× 2}L_{cubes individually and in our method the L = 2.}

Then we perform graph cuts algorithm with each cube. The way of finding foreground seeds and background seeds is the same as in the middle levels, as shown in the second row of Figure 11. We also upsampling OL−1to this level, called O0L, as shown in top row cool colors of Figure11. The equation of upsampling OL−1 is defined as:

O0L_x,y,z= P1 k=−1O L−1 x 2, y 2,z2+k 3 , (10) where O0L_x,y,z is the intensity at coordinate (x, y, z) and at level L. Different from middle and the coarsest levels, we add a distance term: wd×(

D(xi)+D(xj)

2 ), in Eq. (2.3). The distance

term D(x) is defined as the least-squares distance from x to contour of O0L. The reason is that, the brighter voxels belonging to non-brain region are similar to foreground seeds, so they might be classified as foreground. But in the middle and the coarsest levels, those brighter voxels are blurred, hence those voxels can be judged as background seeds. Therefore, we use the contour of O0L as a factor of the edge weight in this level because when the n-link is closer to the contour, it should has more chance to be cut. For the overlapped region between neighboring cubes, we integrate the results by logical OR operation.

(8)

Fig. 9. Demonstration of foreground seeds, background seeds and results at middle levels. Top row: the volume at middle levels. Second row: the warm color is foreground seeds and cool color is background seeds. Third row: the pre-determined foreground which is eroded and upsampled the result at the lowest resolution level. Bottom row: the extracted brain at middle levels.

Fig. 10. Demonstration the unit of graph cut at level 1. The region surrounded by yellow, purple, green, and blue dotted line are the unit of graph cut, individually. Each unit perform graph cut once. The left image of this figure is the view in 2-D and the right one is the view in 3-D.

E. Postprocessing to fill holes that dark brain region be classified as background by graph cuts

The final step is to fill holes which are the dark brain areas classified as background regions by graph cuts algorithm, such as ventricles, as shown in Figure 12. The voxels surrounded by foreground regions, as shown in the yellow circle of Figure 13 also need to be filled, because these voxels were wrongly deleted in Step 1 of the system flowchart, since they failed to be included in WM region growing process due to their irregular intensities. The postprocessing is illustrated in Figure 13. We find the contour of the extracted brain from the highest resolution level, including the removed dark brain region, as

Fig. 11. Demonstration of foreground seeds, background seeds and results at the finest level. Top row: the volume at the highest level and the cool color is contour of O0L. Second row: the warm color is foreground seed and cool color is background seed. Third row: the pre-determined foreground which is eroded and upsampled the result at the middle level. Bottom row: the extracted brain which is at the finest level.

Fig. 12. The red circle is the dark volume may be classified as background that need to be post-processed

shown in Figure 13 (b). Then we take the outermost contour, as shown in Figure 13 (c), and incorporate the voxels inside the contour to fill dark brain regions. Figure 13 (d) shows the result of postprocessing. Notice that this postprocessing is applied on coronal slices.

III. RESULTS

A. Image data sets for performance evaluation

We applied four data sets to compare the accuracy of brain region extracted by using the existing methods, HWA [9], BSE [8], BET [6], ISTRIP [10], and the proposed method. The four data sets are the first and the second Internet Brain Segmentation Repository (IBSR) data set, Taipei Veterans General Hospital (VGATPE) data set, and BrainWeb phantom

(9)

Fig. 13. (a) The red circle is the dark voxels in brain but be classified as background, and yellow circle is the voxels deleted in step1of flowchart but belong to brain region, that both need to be post-processed. (b) We find contour of the extracted brain from the highest resolution level. (c) Take the outermost contour. (d) Fill the voxels which are inside the outermost contour and the dark voxels in brain are filled.

data set.

The first IBSR data set consist of 20 MR images, each with around 60 coronal slices, matrix size 256 mm × 256 mm, FOV 256 mm × 256 mm, and slice thickness 3.1 mm. The most images in this data set involve apparent intensity inhomogeneity and the large region of neck. The large region of neck may bias the estimation of parameters in the HWA and BET method. Therefore, the large region of neck were manually removed before estimation [10].

The second IBSR data set is composed of 18 MR images, each with around 128 coronal slices, matrix size 256 mm × 256 mm, FOV 240 mm × 240 mm, and slice thickness 1.5 mm. Each image has brain region determined by manual segmentation. The IBSR data sets as well as the brain region images are from the Montreal Neurological Institute and are available at http://www.cma.mgh.harvard.edu/ibsr.

The VGATPE data set contains 18 MR volumes of healthy subjects collected by Taipei Veterans General Hospital, each with around 124 coronal slices, matrix size 256 mm × 256 mm, FOV 260 mm × 260 mm, and slice thickness 1.5 mm. The ages of these subjects are between 22 and 57 years old. The brain region images of all subjects are segmented manually.

BrainWeb phantom images from the Montreal Neurological Institute are simulated by various levels of noise and intensity non-uniformity. The noise levels are 0%, 1%, 3%, 5%, 7%, and 9%. Each noise level contains three levels of intensity non-uniformity, which are 0%, 20%, and 40%. Each image with around 217 coronal slices, matrix size 181 mm × 181 mm, FOV 181 mm × 181 mm, and slice thickness 1mm. B. Evaluation metrics for brain extraction

This work used five coefficients to compare the accuracy of the brain regions extracted from existing methods.

The Jaccard similarity coefficient (JSC) is to measure the similarity of ground truth A and the extracted brain B by the following equation:

J SC(A, B) = |A ∩ B|

|A ∪ B| , (11) where |x| denotes the voxel number of x. The value of JSC is between 0 and 1 and the larger JSC value means the better overlap of A and B.

The higher sensitivity Se and higher specificity Sp means

more accurate extraction of brain region. Sensitivity Se is to

measure the ratio of extracted brain region to total brain region: Se=

T P

T P + F N . (12) If more brain voxels are obtained, the Sevalue will be higher.

Specificity Sp is to measure the ratio of rejected non-brain

region to total non-brain region: Sp=

T N

T N + F P . (13) If more non-brain voxels are rejected the Sp value will be

higher. The true positive rate, TP, is the number of voxels extracted as brain correctly. The number of voxels incorrectly extracted as brain is denoted by FP. The true negative rate, TN, is the number of voxels extracted as non-brain correctly. The number of voxels incorrectly extracted as non-brain region is denoted by FN.

Two probabilities of miss rate, pmand pf, were defined in

[8]. The probability of missed detection of the brain voxels is defined as:

pm=

|A − B|

|A ∪ B| . (14) The probability of miss rejection of the non-brain voxels is defined as:

pf =

|B − A|

|A ∪ B| . (15) The lower these two miss rates means the better brain extrac-tion.

BrainWeb provides discrete anatomical model to label tis-sues and the CSF region provided by BrainWeb contains ventricle areas as well as peripheral CSF areas, which are gen-erally considered as non-brain regions. Therefore, we evaluate the brain extraction algorithms with this data set according to two criteria. One only considers WM and GM, the other considers WM, GM, and CSF as the total brain region. C. Experimental results

Tables I, II, II, III, IV, V, and VI list the performance evaluation results of BET, BSE, HWA, ISTRIP and the pro-posed method for different data sets. In addition to the Table I, in other cases, although HWA has the highest Se, its poor

performance of Sp results in its worse performance of JSC.

Since some of the brain volumes in the first IBSR data set has apparent intensity inhomogeneity and artifacts, therefore, not all brain volumes have the satisfied result. As [10], Table I excludes the extracted brains which the JSC value is below 0.6

(10)

or the result is blank. The number of excluded brains are four, three, and five for the BSE, HWA, and the proposed method, respectively. Performance evaluation using the first IBSR data set without including all the failure cases as shown in Table II, and the number of excluded cases is seven. In Table I and Table II we has the similar JSC value with ISTRIP, In terms of Seand Sp, the proposed method has better sensitivity than

ISTRIP. However, ISTRIP can extract the all brain volumes in the first IBSR data set.

The proposed method outperforms others for the second IBSR set, in terms of JSC, Se, and Sp. As listed in Table III,

compared to HWA, our method maintains the high sensitivity of HWA as much as possible and improves the low specificity of HWA to the highest one. Our method achieves the highest JSC, the second highest Se, and the highest Sp. In this data

set, the total amount of voxels in an MR volume is 8388168, and the brain region is about one-eighth of the total voxels. Therefore, 0.01 voxels Se of the value means there are about

10000 of the extracted brains overlapped with the ground truth. The 0.01 voxels Sp of the value means there are about 70000

of the extracted brain regions overlapped with the non-brain regions of the ground truth. The 0.01 voxels JSC of the value means there are more than 10000 of the extracted brains overlapped with the ground truth.

When using the VGHTPE data set, although BSE has the highest Sp and JSC values, it has the lowest Se. This means

BSE may exclude many brain voxles. Comparison of BET, ISTRIP, and the proposed method, BET has the highest Sp

and the lowest Se. On the contrary, ISTRIP has the highest Se

and the lowest Sp. The proposed method has the intermediate

values of Sp and Se. The results of performance evaluation

for VGHTPE data set are listed in Table IV.

The performance evaluation using the BrainWeb phantom data set and only considering WM and GM is listed in Table V. Although BSE has the highest Sp, but has the lowest Se.

The proposed method has the best performance compared with BET and ISTRIP, in terms of JSC, Se, and Sp.

Since the ground truth of the CSF region provided by BrainWeb contains peripheral CSF area, the values of Sp

when considering CSF, as listed in Table VI, higher than those in Table V. In Table VI, the higher Se means that the

segmentation result may contain more peripheral CSF voxels. Therefore, we only concern the Sp value with this data set,

and we have the similar Sp of BET which has the highest

value.

The parameters of the second IBSR data set are as same as those used in [10]. For each brain extraction method, except BSE, the parameters of the VGHTPE and BrainWeb phantom data sets are the same. The fractional intensity threshold of BET was set as 0.7; the parameters of HWA were set to the default values with surface-shrink option turned on; the intensity contrast of ISTRIP was set as 0.08 with ”further remove non-brain tissues with high intensity” option turned on; the weight of intra n-link was set as 0.35, t-link was set as 0.65 and the distance term was set as 0.07 in the proposed method. The edge constant, diffusion iteration, and diffusion constant,

Fig. 14. The extracted brain image using the proposed method with the subject in the second IBSR data set.

erosion size of BSE when using the VGHTPE (BrainWeb phantom) data set were set to be 25 (increased with the degree of non-uniformity), 3 ,0.62 and 1 (2), respectively.

The extracted brain image using the proposed method with the subject in the second IBSR data set is shown in Figure 14. In our method we bring up three points, distance term at the highest level, pre-determined foreground and application of graph cuts algorithm localized by using smaller cubes to local estimation of intensity histogram. The left column of Figure 15 illustrations the extracted brains without using the distance term, and the right column is the extracted brain using distance term. The difference is obvious, the border of the extracted brains in the right column are more smooth. Figure 16 shows the extracted brain of the proposed method overlaid by another without considering pre-determined foreground. The brain region of proposed method is larger than the one without using pre-determined foreground. Some brain extraction methods segment brain volume in a slice-by-slice manner, however we use smaller cubes to do this work. The left column of Figure 17 shows the extracted brain of the proposed method overlaid with a brain extracted by graph cuts algorithm in a slice-by-slice manner, and the right column shows the extracted brain of proposed method. The brain region of proposed method is larger than the one using the slice-by-slice manner. Since the upper part of the image in the right column of Figure 17 is darker than the bottom part of image in transverse view, therefore, applying graph cuts algorithm localized by using the local estimated histogram is preferable than slice-by-slice manner.

The performance evaluations of the proposed method with-out distance term/predetermined term and in a slice-by-slice manner using the second IBSR data set are listed in Table VII. Although the proposed method without predetermined term has the highest JSC value and specificity, but it has lower sensitivity than the proposed method with the three factors. According to the value of JSC, Se, and Sp, distance term,

pre-determined foreground and application of graph cuts algorithm localized by using smaller cubes to local estimated intensity histogram are important factors in our method.

(11)

TABLE I

PERFORMANCE EVALUATION USING THE FIRSTIBSRDATA SET AND EXCLUDING THE FAILURE RESULTS OF EACH BRAIN EXTRACTION METHOD.

Method JSC Se Sp Pm Pf BET 0.878 (0.017) 0.983 (0.023) 0.987 (0.004) 0.016 (0.022) 0.107 (0.021) BSE4 0.900 (0.025) 0.954 (0.035) 0.993 (0.003) 0.044 (0.034) 0.055 (0.017) ISTRIP 0.910 (0.018) 0.986 (0.013) 0.991 (0.005) 0.013 (0.013) 0.077 (0.027) HWA3 _{0.752 (0.037)} _{0.974 (0.068)} _{0.970 (0.008)} _{0.022 (0.056)} _{0.226 (0.026)} Our method5 _{0.910 (0.021)} _{0.989 (0.012)} _{0.991 (0.003)} _{0.010 (0.011)} _{0.079 (0.020)}

The superscripts in the first column indicate the excluded cases. TABLE II

PERFORMANCE EVALUATION USING THE FIRSTIBSRDATA SET WITHOUT INCLUDING ALL THE FAILURE CASES.

Method JSC Se Sp Pm Pf BET7 _{0.881 (0.017)} _{0.981 (0.026)} _{0.988 (0.003)} _{0.017 (0.024)} _{0.102 (0.021)} BSE7 _{0.905 (0.025)} _{0.954 (0.035)} _{0.994 (0.002)} _{0.044 (0.034)} _{0.051 (0.010)} ISTRIP7 _{0.911 (0.014)} _{0.988 (0.015)} _{0.991 (0.004)} _{0.012 (0.014)} _{0.077 (0.023)} HWA7 _{0.762 (0.012)} _{0.999 (0.001)} _{0.967 (0.008)} _{0.001 (0.001)} _{0.237 (0.014)} Our method7 _{0.910 (0.021)} _{0.991 (0.008)} _{0.991 (0.003)} _{0.009 (0.007)} _{0.081 (0.020)}

The superscripts in the first column indicate the excluded cases. TABLE III

PERFORMANCE EVALUATION USING THE SECONDIBSRDATA SET.

Method JSC Se Sp Pm Pf BET 0.891 (0.052) 0.959 (0.042) 0.989 (0.005) 0.038 (0.038) 0.071 (0.031) BSE 0.838 (0.083) 0.957 (0.042) 0.973 (0.030) 0.041 (0.041) 0.119 (0.104) ISTRIP 0.915 (0.018) 0.978 (0.011) 0.990 (0.003) 0.021 (0.011) 0.064 (0.022) HWA 0.814 (0.040) 0.9997 (0.0003) 0.965 (0.016) 0.0002 (0.0002) 0.186 (0.040) Our method 0.930 (0.011) 0.981 (0.016) 0.991 (0.005) 0.018 (0.015) 0.052 (0.017) TABLE IV

PERFORMANCE EVALUATION USING THEVGHTPEDATA SET.

Method JSC Se Sp Pm Pf BET 0.921 (0.009) 0.971 (0.013) 0.994 (0.001) 0.028 (0.012) 0.051 (0.011) BSE 0.933 (0.008) 0.958 (0.011) 0.997 (0.001) 0.041 (0.011) 0.026 (0.008) ISTRIP 0.915 (0.011) 0.984 (0.012) 0.991 (0.002) 0.015 (0.011) 0.07 (0.015) HWA 0.847 (0.019) 0.998 (0.002) 0.979 (0.003) 0.002 (0.002) 0.151 (0.019) Our method 0.920 (0.010) 0.982 (0.009) 0.992 (0.002) 0.017 (0.009) 0.063 (0.017) TABLE V

PERFORMANCE EVALUATION USING THEBRAINWEB PHANTOM DATA SET AND ONLY CONSIDERING ONLYWMANDGM.

Method JSC Se Sp Pm Pf BET 0.832 (0.010) 0.994 (0.002) 0.945 (0.005) 0.005 (0.001) 0.163 (0.011) BSE 0.855 (0.041) 0.989 (0.005) 0.954 (0.018) 0.010 (0.004) 0.136 (0.044) ISTRIP 0.811 (0.005) 0.998 (0.0003) 0.934 (0.002) 0.001 (0.0003) 0.188 (0.005) HWA 0.696 (0.005) 0.9995 (0.0001) 0.876 (0.003) 0.0003 (9.00947E-05) 0.304 (0.005) Our method 0.840 (0.025) 0.997 (0.001) 0.946 (0.011) 0.003 (0.001) 0.157 (0.026) IV. DISCUSSION

The program execution speed of the existing methods for brain extraction which compared in this work are fast (less than a minute). The program execution speed of the proposed method is about 5 minutes for an MR volume. Regarding to the performance evaluation for brain extraction, our method outperforms others for the second IBSR data set. Therefore,

the time cost of the proposed method is not expensive. The compared method in this work, BET, HWA, and ISTRIP are performed on Genuine Intel CPU 2.93GHz×16 processor running Linux, BSE and our method are performed on Intel Core2 Quad XP 2.83GHz processor on Windows system. Most of the running time of our method spent on the Otsu’s method, since it was performed frequently. Although the speed of

(12)

TABLE VI

PERFORMANCE EVALUATION USING THEBRAINWEB PHANTOM DATA SET AND CONSIDERINGWM, GMANDCSF.

Method JSC Se Sp Pm Pf BET 0.928 (0.009) 0.944 (0.011)) 0.993 (0.001) 0.055 (0.011) 0.017 (0.003) BSE 0.870 (0.031) 0.897 (0.009) 0.988 (0.018) 0.0999 (0.012) 0.030 (0.041) ISTRIP 0.934 (0.001) 0.963 (0.004) 0.988 (0.001) 0.036 (0.004) 0.031 (0.003) HWA 0.850 (0.003) 0.993 (0.002) 0.936 (0.002) 0.006 (0.002) 0.144 (0.004) Our method 0.912 (0.007) 0.934 (0.017) 0.991 (0.005) 0.064 (0.0178) 0.024 (0.013) TABLE VII

PERFORMANCE EVALUATION OF THE PROPOSED METHOD WITHOUT DISTANCE TERM/PREDETERMINED TERM AND IN A SLICE-BY-SLICE MANNER USING THE SECONDIBSRDATA SET.

Method JSC Se Sp Pm Pf

Our method 0.930 (0.011) 0.981 (0.016) 0.991 (0.005) 0.018 (0.015) 0.052 (0.017) Our method without distance term 0.911 (0.023) 0.975 (0.017) 0.989 (0.007) 0.024 (0.016) 0.066 (0.026) Our method in slice-by-slice manner 0.904 (0.020) 0.953 (0.024) 0.991(0.006) 0.045 (0.023) 0.051 (0.022) Our method without predetermined term 0.933 (0.012) 0.979 (0.016) 0.992 (0.004) 0.020 (0.016) 0.047 (0.016)

Fig. 15. The result of extracted brain without using distance term at the highest level. The left column is the extracted brain without considering the contour factor. The right column is the extracted brain and consider the contour factor corresponding to the left column.

multiresolutional graph cuts with narrow band should be faster, but we use multiresolutional graph cuts to extracted brain instead of using multiresolutional graph cuts with narrow band. The reason is that the multiresolutional graph cuts with narrow band will limit the refinement of the results in a band and may miss the thin structure.

In our method we perform graph cuts algorithm individually for each cube, the advantage is to consider the influence of the intensity inhomogeneity. In the finest level, when the cube contains only a small portion of brain region. The intensity of foreground seeds may be similar to the non-brain region.

Fig. 16. The extracted brain of proposed method overlaid by another without considering pre-determined foreground factor.

Fig. 17. The left column shows the extracted brain of proposed method overlaid with a brain extracted by graph cuts algorithm in a slice-by-slice manner, and the right column shows the extracted brain of proposed method.

Therefore the results may contain non-brain and have irregular border, as shown in the red window of Figure. 18. The distance term can correct this problem, but if the distance term is too big then the result may not be smooth.

The advantage of the distance term at the finest level is its capability to constrain the shape of brain boundary. However, it has disadvantages. In some cases the extracted brain at middle levels are better than that at the highest level. Since the contour

(13)

Fig. 18. An example of the possible disadvantage by using localized graph cuts.

Fig. 19. An example of the possible disadvantage by using the distance term in the finest level. (a) The extracted brain at the highest level and the red windows show the non-brain voxles. (b) The yellow line shows the contour used in distance term at the highest level, and the positions corresponding to the red windows in (a) are inconsistent with the brain area. (c) The extracted brain at middle levels. (a) and (b) are shrunk as the size of (c).

used in distance term is upsampled from the coarser level, the contour may not be entirely consistent with the border of the brain at the highest level, as illustrated in Figure 19.

V. CONCLUSION

We have proposed an automatic method based on multireso-lutional graph cuts technique to segment the head MR images into brain and non-brain regions.

The contributions of this work are 1) using distance term at the highest level to constrain the brain contour, 2) using pre-determined foreground from the coarser level to avoid excluding the brain voxels, 3) and applying graph cuts al-gorithm individually for smaller cubes to tackle the intensity inhomogeneity problem.

We have compared to the existing methods and com-pared with our method by using four data sets. Regarding the performance evaluation for brain extraction, our method outperforms others for the first/second IBSR data set and BrainWeb phantom data set, and comparably with the BET and ISTRIP methods for the VGHTPE data sets.

REFERENCES

[1] John Ashburner and Karl J. Friston. Voxel-Based Morphometry-The Methods. NeuroImage, 11:805-821, 2000.

[2] Catriona D. Good, Ingrid S. Johnsrude, John Ashburner, John Ashburner, Karl J. Friston, and Richard S. J. Frackowiak. A voxel based morpho-metric study of ageing in 465 normal adult human brains. NeuroImage, 14:21-36, 2001.

[3] M. Kubicki, M. E. Shenton, D. F. Salisbury, Y. Hirayasu, K. Kasai, R. Kikinis, F. A. Jolesz, and R. W. McCarley. Voxel-Based Morphometric Analysis of Gray Matter in First Episode Schizophrenia. NeuroImage, 17:1711-1719, 2002.

[4] Andrew M. McIntosh, Dominic E. Job, T. William J. Moorhead, Lesley K. Harrison, Karen Forrester, Stephen M. Lawrie, and Eve C. Johnstone. Voxel-based morphometry of patients with schizophrenia or bipolar disorder and their unaffected relatives. Biological Psychiatry, 56:544-552, 2004.

[5] Paul M. Thompson, Agatha D. Lee, Rebecca A. Dutton, Jennifer A. Geaga, Kiralee M. Hayashi, Mark A. Eckert, Ursula Bellugi, Albert M. Galaburda, Julie R. Korenberg, Debra L. Mills, Arthur W. Toga, and Allan L. Reiss. Abnormal Cortical Complexity and Thickness Profiles Mapped in Williams Syndrome. Neuroscience, 25(16):4146-4158, 2005. [6] Stephen M. Smith. Fast Robust Automated Brain Extraction. Human

Brain Mapping, 17:143-155, 2002.

[7] Horst K. Hahn and Heinz-Otto Peitgen. The Skull Stripping Problem in MRI Solved by a Single 3D Watershed Transform. MICCAI, 134-143, 2000.

[8] David W. Shattuck, Stephanie R. Sandor-Leahy, Kirt A. Schaper, David A. Rottenberg, and Richard M. Leahy. Magnetic Resonance Image Tissue Classification Using a Partial Volume Model. NeuroImage, 13:856-876, 2001.

[9] F. S´egonne, E. Busa, M. Glessner, D. Salat,A.M. Dale, H.K. Hahn, and B. Fischl. A hybrid approach to the skull stripping problem in MRI. NeuroImage, 22:1060-1075, 2004.

[10] Jia-Xiu Liu, Yong-Sheng Chen, Li-Fen Chen. Accurate and robust extraction of brain regions using a deformable model based on radial basis functions. Joural of Neuroscience Methods, 183:155-166, 2009. [11] Thomas H. Cormen, Charles E. Leiserson, Ronald L. Rivest, and Clifford

Stein. Introduction To Algorithms, Second Edition. 2001.

[12] Yuri Y. Boykov and Marie-Pierre Jolly. Interactive Graph Cuts for Optimal Boundary & Region Segmentation of Objects in N-D Images. Internation Conference on Computer Vision,vol.I,p.105, 2001.

[13] Brian L. Price, Bryan Morse, Scott Cohen. Geodesic Graph Cut for Interactive Image Segmentation. Conference on Computer Vision and Pattern Recognition, 2010.

[14] Yin Li, Jian Sun, Chi-Keung Tang, Heung-Yeung Shum. Lazzy Snap-ping. ACM, 2004.

[15] Carsten Rother, Vladimir Kolmogorov and Andrew Blake. ”GrabCut”-Interactive Foreground Extraction using Iterated Graph Cuts. ACM Trans-actions on Graphics, 23:309-314, 2004.

[16] Jean Stawiaski, Etienne Decenci`ere and Franc¸ois Bidault. Interactive Liver Tumor Segmentation Using. MICCAI, 2008.

[17] Yin Li, Jian Sun, Heung-Yeung Shum. Video Object Cut and Paste. ACM Transactions on Graphics, 24:595-600, 2005.

[18] J. Reese and W. Barrett. Image Editing with Intelligent Paint. EURO-GRAPHICS, 2002.

[19] Xue Bai, Jue Wang, David Simons, Guillermo Sapiro. Video SnapCut: Robust Video Object Cutout Using Localized Classifiers. ACM SIG-GRAPH, 2009.

[20] Herve Lombaert, Yiyong Sun, Leo Grady, Chenyang Xu. A Multilevel Banded Graph Cuts Method for Fast Image Segmentation. ICCV, 259-265, 2005.

[21] Ali Kemal Sinop and Leo Grady. Accurate Banded Graph Cut Segmen-tation of Thin Structures Using Laplacian Pyramids. 896-903MICCAI, 2006.

[22] Suresh A. Sadananthan, Weili Zheng, Michael W.L. Chee, Vitali Zagorodnov. Skull stripping using graph cuts. NeuroImage, 49:225-239, 2010.

[23] Yuri Boykov and Vladimir Kolmogorov. An Experimental Comparison of Min-Cut/Max-Flow Algorithms for Energy Minimization in Vision. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2004. [24] David M. Mount and Sunil Arya. ANN: A Library for Approximate Nearest Neighbor Searching, Internet: www.iceengg.edu/staff.html, Jan 27, 2010 [Sep. 17, 2010].

[25] Ostu N. Threshold selection method from gray-level histograms. IEEE Trans Syst Man Cybern, 9:62-66, 1979.

(14)

1

國科會補助專題研究計畫項下出席國際學術會議心得報告

日期： 2010 年 6 月 14 日

一、參加會議經過

Human Brain Mapping 是由人腦造影國際組織所籌辦的年度會議，為腦科學研究領域最重要的

會議之一，每年均吸引兩千餘人參與。今年會議在西班牙巴塞隆納舉行，我們在會議前一天抵達，

克服時差後旋即全程參與會議，並發表了研究論文：“Gamma oscillation patterns for different facial

expressions in early visual processing: an MEG study”。這篇論文的重點在我們利用腦磁圖來研究當

受試者接受不同人臉表情（包括正向與負向情緒）影像刺激時，其早期視覺處理相關的腦部迦瑪

活動模式的不同。

圖：壁報論文發表

計畫編號

NSC 98-2221-E-009-147-

計畫名稱