
Institute of Computer Science and Engineering

Three-Dimensional Tomography Reconstruction for X-Ray Microscopy

Student: Chang-Chieh Cheng

Advisors: Dr. Yu-Tai Ching, Dr. Yeukuang Hwu

Three-Dimensional Tomography Reconstruction for X-Ray Microscopy

Student: Chang-Chieh Cheng

Advisors: Dr. Yu-Tai Ching, Dr. Yeukuang Hwu

A Dissertation Submitted to the Institute of Computer Science and Engineering, College of Computer Science, National Chiao Tung University, in Partial Fulfillment of the Requirements for the Degree of Doctor of Philosophy in Computer and Information Science

April 2014
Hsinchu, Taiwan

Three-Dimensional Tomography Reconstruction for X-Ray Microscopy

Student: Chang-Chieh Cheng

Advisors: Dr. Yu-Tai Ching, Dr. Yeukuang Hwu

Institute of Computer Science and Engineering, College of Computer Science, National Chiao Tung University

Abstract

Because high-resolution images are difficult to acquire, the information extracted from them is highly valuable. X-ray microscopy exploits the short wavelength and high penetration of hard X-rays to obtain nanometer-resolution images of thick specimens, which no other imaging technique can achieve. Conventional X-ray imaging devices form images from X-ray absorption; phase imaging based on the refractive index, however, can effectively capture and enhance the contrast between the materials of an object, and the imaging process is not affected by the highly X-ray-absorbing structures present in some biological specimens. This advanced technique is uniquely valuable in biomedical research, but it also raises many problems that must be solved. This study addresses several of these problems and proposes new image processing methods to solve them. During tomographic acquisition, the rotation of the object introduces nonsystematic jitter in the images; we propose a feature-point alignment method to align the X-ray projection images so that the tomographic images can be successfully reconstructed. To overcome the limited field of view of high-resolution imaging, we propose an image stitching technique that composes several small-field-of-view images into one large image that fully covers the object. The feasibility of this method for three-dimensional tomographic reconstruction of large objects is verified with a set of simulated data. Finally, we use Fourier volume rendering to browse the reconstructed three-dimensional volume data quickly, and also apply texture-based volume rendering, which offers both visual quality and accuracy, to visualize the volume data. In particular, we propose a fast data classification method for Fourier volume rendering that can efficiently change the weights of the transfer function and reflect the changes in the rendered result in real time.

Three-Dimensional Tomography Reconstruction for X-Ray Microscopy

Student: Chang-Chieh Cheng

Advisors: Dr. Yu-Tai Ching, Dr. Yeukuang Hwu

Institute of Computer Science and Engineering, College of Computer Science, National Chiao Tung University

ABSTRACT

High-resolution images that push the performance limit of an imaging method are difficult to acquire, and the crucial information extracted from them is therefore extremely valuable. X-ray microscopy takes advantage of the short wavelength and high penetration of hard X-rays and has demonstrated nanometer-level resolution on thick specimens that no other technique can provide. With devices that obtain contrast from differences in the refractive index rather than from X-ray absorption, phase imaging can effectively enhance the contrast of materials that do not contain highly X-ray-absorbing structures, such as biological specimens. This cutting-edge level of performance has found unique value in biomedical research in particular, but it has also highlighted several problems that require attention. This study addresses several of these issues and tackles them with new image processing approaches. An image alignment method based on a feature-recognition alignment algorithm is implemented to eliminate the problem caused by the nonsystematic jitter of the sample rotation mechanism, so that tomography reconstruction can be performed on the aligned projection images. To overcome the small field of view associated with high resolution and a limited number of imaging pixels, another image alignment method is developed to stitch small field-of-view images into a larger one covering a larger area of the specimen. This method is tested on a large number of simulated projection images of a phantom to demonstrate the possibility of 3D tomography reconstruction of a large object. Finally, to visualize the reconstructed tomographic images, a method based on the Fourier volume rendering (FVR) algorithm is designed to achieve the visual quality and precision of the commonly used texture-based volume rendering (TVR) method. An efficient scheme for adjusting the weights of the transfer function in FVR drastically reduces the processing time and computational resources required for voxel classification and makes routine application possible.


Acknowledgement

First of all, I would like to express my sincere gratitude to my thesis advisor, Prof. Yu-Tai Ching, for his support and guidance throughout my research. My sincere appreciation is extended to Dr. Yeukuang Hwu for his support and advice on synchrotron radiation.

I would also like to thank all members of MIPLAB for their assistance in the laboratory experiment.

Finally, I would like to dedicate this work to my parents and brother. Without their support, patience and love, this study would not have been completed.


Three-Dimensional Tomography Reconstruction for X-Ray Microscopy

Student: Chang-Chieh Cheng¹

Advisors: Yu-Tai Ching¹, Yeukuang Hwu²

1. Department of Computer Science, National Chiao Tung University, Hsinchu, Taiwan.

Contents

Abstract
Acknowledgement
Chapter 1 Introduction
Chapter 2 Tomography Reconstruction
  2.1 Filtered Back Projection
  2.2 Algebraic Reconstruction Technique
Chapter 3 Visualization
  3.1 Texture-Based Volume Rendering
  3.2 Fourier Volume Rendering
    3.2.1 Transfer Function Design for FVR
    3.2.2 Bézier Curve Transfer Function
    3.2.3 B-Spline Transfer Function
    3.2.4 Increasing the Control Points
    3.2.5 Results of FVR
Chapter 4 High-Resolution Volume Reconstruction
  4.1 Methods
    4.1.1 Sample Preparation and Image Acquisition
    4.1.2 The Alignment Method
    4.1.3 Horizontal Displacement Estimation
  4.2 Results
    4.2.1 The Phantom
Chapter 5 High-Resolution Large Volume Reconstruction
  5.1 Stitching
  5.2 Results
  5.3 Discussion
Chapter 6 Conclusions and Future Work
  6.1 Summary
  6.2 Future Work
Appendix
  Appendix A Background Removal
  Appendix B Image Similarity
    B.1 Entropy
    B.2 Mutual Information
    B.3 Two-Image Alignment

List of Figures

1.1 The structure of the synchrotron X-ray microscope
1.2 The synchrotron X-ray microscope in NSRRC
1.3 The synchrotron X-ray images
1.4 The stitched image of a cancer cell
2.1 The tomography reconstruction
2.2 Fourier slice theorem
2.3 The FBP algorithm
2.4 The pixel area covered by a ray
2.5 The results of FBP
2.6 The theory of ART
2.7 The tomographic images of FBP and SIRT
3.1 Texture mapping
3.2 The proxy polygons in TVR
3.3 A result of TVR
3.4 The FVR algorithm
3.5 A spline with m control points
3.6 The clustered control points
3.7 The results of FVR with Bézier curve transfer function (CT chest)
3.8 The results of FVR with B-spline transfer function (CT chest)
3.9 The results of FVR with B-spline transfer function (CT chest)
3.10 The results of FVR with B-spline transfer function (HeLa1)
3.11 The results of FVR with B-spline transfer function (HeLa2)
4.1 The misalignment of tomography
4.2 The feature tracking between two X-ray projection images I_{i-1} and I_i
4.3 The simulated X-ray image of the phantom data
4.4 Feature loci of the phantom data
4.5 One slice in the phantom data
4.6 X-ray projection images of HeLa cells
4.7 The loci of the reliable projected feature points of HeLa1 and HeLa2
4.8 One slice in the reconstructed tomographic images of HeLa1 and HeLa2
4.9 The 3D volume rendering of the reconstructed volume
4.10 Eight examples of successful alignment (1)
4.11 Eight examples of successful alignment (2)
4.12 The first example of unsuccessful alignment
4.13 The second example of unsuccessful alignment
5.1 The acquisition for a large image
5.2 The search area for alignment
5.3 The alignment of multiple references
5.4 The stitch order for a large image
5.5 The construction of Gaussian-Laplacian pyramid
5.6 The reconstruction of Gaussian-Laplacian pyramid
5.7 The image blending
5.8 Image stitching for the phantom
5.9 The tomographic result reconstructed from the stitched images of the phantom
A.1 The procedure of background removal

List of Tables

2.1 The time comparisons of FBP and SIRT
3.1 The computing time of two cases using the B-spline as a transfer function

Chapter 1

Introduction

Analyzing high-resolution images from any imaging method is of very high value, particularly for images obtained at the cutting edge of performance. When an imaging method is pushed to its performance limit, it is unlikely to be improved by further technical advancement alone. Image processing methods can repeatedly offer invaluable support, advancing the imaging technique over barriers that cannot be resolved otherwise. X-ray microscopy is an emerging imaging method that takes advantage of the uniquely short wavelength and high penetration of X-rays and offers nanometer resolution in 3D on thick specimens (hundreds of µm), which is currently not available from other imaging techniques. Its importance in biomedical research has been boosted by the high-quality X-ray photons generated by synchrotron accelerators [1]. Characteristics of synchrotron radiation, such as very high intensity, coherence, and energy tunability, are all very valuable for high-performance X-ray imaging [2]. In some energy ranges, such as soft X-rays (wavelengths longer than 1 nm), no other source provides a sufficient number of photons for useful imaging applications. X-ray microscopy using hard X-rays suffers from low-efficiency optics, so the high-intensity X-rays from a synchrotron are the only choice for high-performance imaging. Using a nanofabricated Fresnel zone plate (FZP) [3] with a 20 nm outermost zone and a thick (>500 nm) Au (gold) structure, 16.5 nm resolution has been demonstrated with synchrotron X-rays. By inserting a ring device at the back focal plane, Zernike phase contrast can enhance images of materials that show no strong X-ray absorption. Fig. 1.1 shows the structure of a synchrotron X-ray microscope based on an FZP and Zernike phase contrast. Three-dimensional imaging is achieved by rotating the specimen and imaging the transmitted X-rays at different angles as "projection images". Several devices, such as the capillary condenser, the pinhole, and the beamstop, are required to provide the illumination conditions for the FZP. An X-ray microscope of this type is installed at the BL01B beamline of the National Synchrotron Radiation Research Center (NSRRC, Hsinchu) and provided all the images used in this study. Fig. 1.2 shows the X-ray microscope in NSRRC. This beamline is capable of performing X-ray imaging at different resolutions and specimen sizes. In addition to the X-ray microscope, a phase contrast micro-radiology setup can produce X-ray micrographs of live animals at µm resolution. Fig. 1.3 shows three images taken with these X-ray imaging apparatuses. Specifically, Fig. 1.3(a) shows a high-resolution 1024 × 1024 pixel X-ray micrograph of an EMT (epithelial-mesenchymal transition) cell, Fig. 1.3(b) shows a 1600 × 1200 pixel phase contrast X-ray image of the tip of a toothpick, and Fig. 1.3(c) shows a montage of 5 × 7 phase contrast microradiology images, each 1024 × 1024 pixels, of a mouse head. The pixel sizes of these images are 11.78 nm (a), 2.87 µm (b), and 2.87 µm (c). These images showcase the superior performance of this X-ray imaging in resolution, contrast, and specimen size compared with conventional X-ray imaging and other imaging techniques.

After taking projection images at a large number of rotation angles (typically > 100, ideally covering 180 degrees), many processing methods can be used to generate a 3D model from these projection images [4]. For example, the most common method, filtered back projection (FBP), transforms the sinogram, which consists of a stack of image lines taken from different projection images, into a reconstructed slice. Putting these slices together, one obtains a 3D data matrix of the absorption of each voxel. Visualization tools can then be used to produce a "model" that recreates the 3D structure and is accurate, in terms of spatial resolution, to the level of the X-ray microscopy imaging.

The most advanced X-ray microscopy, reaching 16.5 nm resolution as mentioned above, nevertheless suffers, as a technology pushed to the state of the art, from a number of technical problems, and many of these issues cannot be solved by improvements to the hardware alone. One of the most severe problems, due to the extremely high resolution of the imaging, is the lack of sufficient precision in the rotation movement. It is not possible with current technology to guarantee that two consecutive rotated images are produced precisely about the pre-assigned rotation axis. Even with the best rotation mechanism, the axis wobbles and jitters, moving the rotation axis µm away from its otherwise fixed position, which is intolerable for images with 20 nm resolution and 2-5 nm pixel size. Such inaccuracy cannot be calibrated because it is not systematic; it therefore not only reduces the resolution of the reconstructed model but most likely also causes the complete failure of the tomography reconstruction. The current solution to this problem is to manually recognize specific features in the individual micrographs, track their positions through the rotation motion, and then align the images accordingly, which is not only time consuming but also difficult to execute in most cases.

Because the pixel size must match the resolution, the size of the object is often larger than the field of view of an individual image. Although tomography reconstruction is still possible, it loses the reference to blank areas and therefore the absolute reference for the absorption coefficient. This makes quantitative analysis based on pixel values difficult, if not impossible. There is therefore a strong desire to obtain images that cover the whole specimen without losing resolution. The most common method is to stitch together a series of images taken at different locations on the specimen. Aligning these images without a precise record of the specimen position, again limited by the mechanical movement, requires a similar image processing effort. Fig. 1.4 shows a high-resolution image of a cancer cell stitched from a 3 × 3 patch of 1024 × 1024 pixel images. Without a full set of such stitched images over the full range of rotation angles, we tested our tomography process with a set of phantom data generated to simulate large-scale tomography reconstruction.

In this study, we also tackle the issue of visualization, which is required to analyze the 3D volume data reconstructed from the X-ray projection images. Volume rendering is a commonly used technique that generates a two-dimensional image from volume data under given viewing parameters. Texture-based volume rendering (TVR) [5], which uses graphics hardware and texture mapping to accelerate the integration of ray casting, is the most popular visualization technique in recent years. Fourier volume rendering (FVR) [6], [7] is another real-time visualization method, based on the Fourier transform. Theoretically, FVR can be faster than TVR; in practice, however, voxel classification in FVR is much more demanding than in TVR because the rendering is carried out in the frequency domain. This study proposes a method that defines the transfer function with a B-spline and demonstrates a marked improvement in the efficiency of voxel classification in FVR.

Figure 1.1: The structure of the synchrotron X-ray microscope.

This thesis is organized as follows. Chapter 2 describes in detail the two tomography reconstruction algorithms used in this study, and Chapter 3 presents two visualization techniques, TVR and FVR. The proposed algorithm for high-resolution volume reconstruction is described in Chapter 4. The proposed method for tomography reconstruction of large high-resolution volumes is presented in Chapter 5. Chapter 6 summarizes this thesis with further discussion and proposed future work. Appendix A describes the background removal method, and Appendix B describes the similarity estimation of two images.

Figure 1.2: The synchrotron X-ray microscope in NSRRC.

Figure 1.3: The synchrotron X-ray images. (a) An EMT cell. (b) A toothpick. (c) A mouse head.

Figure 1.4: The stitched image of a cancer cell.


Chapter 2

Tomography Reconstruction

It is known that a series of X-ray projections around an object can be used to reconstruct 3D volume data using an appropriate reconstruction algorithm [4] (Fig. 2.1). In this study, two tomography reconstruction methods are used to reconstruct three-dimensional volumes. For completeness, the following sections review these two reconstruction methods.

2.1 Filtered Back Projection

The Radon transform [8] describes the intensities gathered by a parallel-beam X-ray projection apparatus. Let f(x, y) be a two-dimensional function representing a slice of an object. Given a projection angle θ, the Radon transform is defined as follows:

$$p_\theta(x') = \int_{-\infty}^{\infty} f(y'\sin\theta + x'\cos\theta,\; -y'\cos\theta + x'\sin\theta)\,dy'. \quad (2.1)$$

$p_\theta(x')$ can be expressed in a θ-x plane called the sinogram. The middle figure of Fig. 2.1 shows an example of a sinogram generated from the Shepp-Logan phantom [9]. The sinogram is an image in the plane of θ and x; each row of the sinogram is retrieved from the X-ray images at the same y coordinate.
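To make the sampling concrete, the sketch below simulates a sinogram from a discrete 2D image; it is our own illustration (the function name and arguments are hypothetical), not the acquisition pipeline, and it approximates the line integral of Eq. (2.1) by rotating the slice and summing along one axis.

```python
import numpy as np
from scipy.ndimage import rotate

def simulate_sinogram(image, thetas_deg):
    # Each projection p_theta(x') of Eq. (2.1) is approximated by rotating
    # the slice by -theta and integrating (summing) along the y' axis.
    rows = [rotate(image, -theta, reshape=False, order=1).sum(axis=0)
            for theta in thetas_deg]
    return np.stack(rows)  # shape: (n_angles, n_detectors)
```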

The filtered back projection (FBP) algorithm, based on the Fourier slice theorem [10], estimates a tomographic image from a sinogram. The Fourier slice theorem in 2D space is described as follows. Given a 2D function in the spatial domain, f(x, y) (Fig. 2.2(a)), let $p_\theta(x')$ be the projection of f(x, y) in direction θ, where

$$x' = x\cos\theta + y\sin\theta, \quad (2.2)$$

and

$$y' = -x\sin\theta + y\cos\theta. \quad (2.3)$$

Let F(u, v) be the Fourier transform of f(x, y) (Fig. 2.2(b)), and $P_\theta(w)$ be the 1D Fourier transform of $p_\theta(x')$. $P_\theta(w)$ is a line segment with orientation θ in F(u, v) passing through the origin. Using the Fourier slice theorem, the projection of f(x, y) along orientation θ can be obtained by taking the inverse Fourier transform of $P_\theta(w)$. To apply the fast Fourier transform (FFT) [11], a conversion between the Cartesian coordinate system (x-y plane) and the polar coordinate system (w-θ plane) is required. Let $\mathcal{F}_2^{-1}$ denote the 2D inverse Fourier transform; the relation between f(x, y) and F(u, v) can be expressed as follows:

$$f(x, y) = \mathcal{F}_2^{-1}\{F(u, v)\} = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} F(u, v)\, e^{2\pi(ux + vy)i}\, du\, dv, \quad (2.4)$$

where $i = \sqrt{-1}$. Let $u = w\cos\theta$ and $v = w\sin\theta$ to convert the u-v coordinate system to the w-θ coordinate system. Eq. (2.4) can be rewritten as follows:

$$f(x, y) = \int_0^{2\pi}\int_0^{\infty} F(w, \theta)\, e^{2\pi w(x\cos\theta + y\sin\theta)i}\, w\, dw\, d\theta = \int_0^{\pi}\left(\int_{-\infty}^{\infty} |w|\, F(w, \theta)\, e^{2\pi w(x\cos\theta + y\sin\theta)i}\, dw\right) d\theta. \quad (2.5)$$

Define

$$q_\theta(x') = \int_{-\infty}^{\infty} |w|\, F(w, \theta)\, e^{2\pi w x' i}\, dw. \quad (2.6)$$

$q_\theta(x')$ is the inverse Fourier transform of a filtered line segment. This line segment passes through the origin of the frequency domain, and its angle from the horizontal axis is θ. A ramp waveform, |w|, is multiplied with this line segment; |w| is called the ramp filter because a multiplication in the frequency domain is equal to a filtering process in the spatial domain. Finally, the FBP equation is obtained as follows:

$$f(x, y) = \int_0^{\pi} q_\theta(x\cos\theta + y\sin\theta)\, d\theta. \quad (2.7)$$

Eq. (2.7) is the fundamental theory of FBP. The implementation of FBP is described as follows.

1. Apply the 1D FFT, $\mathcal{F}_1$, to each row of the sinogram $p_\theta(x')$: $P_\theta(w) = \mathcal{F}_1\{p_\theta(x')\}$.

2. Multiply $P_\theta(w)$ by the ramp filter |w|.

3. Calculate the filtered projections $q_\theta$ by applying $\mathcal{F}_1^{-1}$ to the filtered $P_\theta(w)$.

4. Back project $q_\theta$ for all θ onto the output image.
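As a minimal sketch of Steps 1-4 (not the GPU implementation used in this study), the following Python code filters each sinogram row with |w| and back projects with nearest-pixel accumulation instead of the area weighting of Fig. 2.4; all names here are our own.

```python
import numpy as np

def fbp(sinogram, thetas_deg, size):
    n_det = sinogram.shape[1]
    # Steps 1-3: apply the ramp filter |w| to each row in the frequency domain.
    w = np.abs(np.fft.fftfreq(n_det))
    filtered = np.real(np.fft.ifft(np.fft.fft(sinogram, axis=1) * w, axis=1))
    # Step 4: back project every filtered row onto the output grid.
    recon = np.zeros((size, size))
    ys, xs = np.mgrid[0:size, 0:size] - size / 2.0
    for q, theta in zip(filtered, np.deg2rad(thetas_deg)):
        xp = xs * np.cos(theta) + ys * np.sin(theta)   # x' = x cos(t) + y sin(t)
        idx = np.clip(np.round(xp + n_det / 2.0).astype(int), 0, n_det - 1)
        recon += q[idx]
    return recon
```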

Fig. 2.3 illustrates the FBP algorithm. According to the projection angle θ, the back projection in Step 4 projects a ray for each element of $q_\theta$ onto the output image, as shown in Fig. 2.4(a). If a pixel of the output image is covered by a ray of $q_\theta(x')$, the covered area is estimated and the value of $q_\theta(x')$ multiplied by the covered area is added to this pixel. Fig. 2.4(b) shows the estimation of the pixel area covered by a ray. Given a re-projected ray with several uniformly spaced sampling points on it, for each pixel f(x, y) whose perpendicular distance d to the ray is less than the pixel width, find the nearest two sampling points on the ray, $R_a$ and $R_b$. The areas of two rectangles, $W_a$ and $W_b$, can then be calculated from $R_a$, $R_b$, x, and y, and the area covered by the ray can be estimated as $W_a + W_b$.

FBP can be implemented on a parallel computing machine [12]. In this study, FBP was implemented on the GPU (graphics processing unit) and on a multi-core processor. The GPU is a multi-processor chip with a SIMD (single instruction, multiple data) architecture. The GPU and CPU used in this study were an nVidia GTX 760 and an Intel i7 3.4 GHz with four cores, respectively. In the experiments, the test image was the Shepp-Logan phantom with 1024 × 1024 pixels, as shown in Fig. 2.5(a). The interval between two successive projection angles was 1°, and the range of projection angles was 0° to 179°. The CPU in single-core mode took 13.6 seconds to produce the reconstructed image, the CPU in four-core mode took 2.9 seconds, and the GPU took 0.1 second. These experiments show that parallel computing can drastically reduce the computational cost of FBP.

If the number of projections is insufficient, FBP produces a poor result. Fig. 2.5(b) shows an FBP result from an incomplete sinogram with a large angle between projections: the image size was 1024 × 1024 pixels, the interval between two successive projection angles was 10°, and the range of projection angles was 0° to 179°. Fig. 2.5(c) shows an FBP result from an incomplete sinogram with an insufficient angular range: the image size was 1024 × 1024 pixels, the interval between two successive projection angles was 1°, and the range of projection angles was 20° to 159°. The main reason for these two poor results is that the number of projections is far smaller than the size of the output image.

Figure 2.1: The tomography reconstruction. The middle figure shows a sinogram created from a series of X-ray projection images, where the object is a Shepp-Logan phantom and the projection range is 0 to π. A tomography reconstruction algorithm can then be applied to generate the tomographic image.

Figure 2.2: Fourier slice theorem. (a) $p_\theta(x')$ is the projection of f(x, y) along θ + π/2. (b) In the frequency domain, the line segment $P_\theta(w)$ is the 1D Fourier transform of $p_\theta(x')$.

Figure 2.3: The FBP algorithm.

Figure 2.4: The pixel area covered by a ray. (a) In back projection, each filtered projection is re-projected onto the output image; the number in each cell represents the area covered by the ray. (b) The approximation of the pixel area covered by a ray.

Figure 2.5: The results of FBP. The object was the Shepp-Logan phantom. (a) A high-quality result; the range and interval of projection angles were [0°, 179°] and 1°, respectively. (b) The result with a large angle between projections; the range and interval were [0°, 179°] and 10°, respectively. (c) The result with an insufficient angular range; the range and interval were [20°, 159°] and 1°, respectively.


2.2 Algebraic Reconstruction Technique

The algebraic reconstruction technique (ART) is an iterative method of tomography reconstruction [13]. There are two main differences between ART and FBP. First, all estimation in ART is carried out in the spatial domain rather than in the frequency domain. Second, ART uses several iterations to correct the errors in the reconstructed image. In most cases, ART achieves a better reconstruction than FBP; however, its computational cost is much higher.

Fig. 2.6 shows how ART works. Assume that the output image has N pixels and the sinogram contains M projections. Let the reconstructed image and the pixel areas covered by the jth ray be two N-dimensional vectors, $\vec{f} = [f_1, f_2, \cdots, f_N]$ and $\vec{w}^j = [w^j_1, w^j_2, \cdots, w^j_N]$. The projection measured from the jth ray, $p_j$, can then be expressed as follows:

$$p_j = \vec{w}^j \cdot \vec{f}. \quad (2.8)$$

Assume that $\vec{f}_i$ is the result of the ith iteration. The line $\overline{OU}$ in Fig. 2.6 is the normalized vector of $\vec{w}^j$,

$$\overline{OU} = \frac{\vec{w}^j}{\sqrt{\vec{w}^j \cdot \vec{w}^j}}, \quad (2.9)$$

and

$$|\overline{OF}| = \vec{f}_{i-1} \cdot \overline{OU}. \quad (2.10)$$

The lengths of $\overline{OA}$ and $\overline{HG}$ can then be obtained from the following equations:

$$|\overline{OA}| = \overline{OU} \cdot \overline{OC} = \frac{p_j}{\sqrt{\vec{w}^j \cdot \vec{w}^j}}, \quad (2.11)$$

$$|\overline{HG}| = |\overline{OF}| - |\overline{OA}| = \frac{\vec{w}^j \cdot \vec{f}_{i-1} - p_j}{\sqrt{\vec{w}^j \cdot \vec{w}^j}}. \quad (2.12)$$

Finally, the result of the ith round, $\vec{f}_i$, is obtained from the following equation:

$$\vec{f}_i = \vec{f}_{i-1} + \frac{p_j - \vec{w}^j \cdot \vec{f}_{i-1}}{\vec{w}^j \cdot \vec{w}^j}\, \vec{w}^j. \quad (2.13)$$

Eq. (2.13) is the recursion for ART. The implementation of ART consists of four steps. First, calculate $\vec{w}^j$ for the jth ray. The second step, called forward projection, estimates a projection value as the inner product of $\vec{f}_{i-1}$ and $\vec{w}^j$. The correction is then obtained by subtracting $\vec{w}^j \cdot \vec{f}_{i-1}$ from $p_j$. Finally, this correction is back projected onto $\vec{f}_{i-1}$. These four steps are executed iteratively until the correction is smaller than a threshold or the number of iterations exceeds a limit. However, the original ART is not practical because it iterates over each single projection, so the computational time is extremely high. Eq. (2.13) can be rewritten as the following equation:

$$\vec{f}_i = \vec{f}_{i-1} + \sum_{j=1}^{K} \frac{p_j - \vec{w}^j \cdot \vec{f}_{i-1}}{\vec{w}^j \cdot \vec{w}^j}\, \vec{w}^j. \quad (2.14)$$

In Eq. (2.14), the output of each iteration is updated by K rays to achieve better performance than Eq. (2.13). This improved method is called the simultaneous iterative reconstruction technique (SIRT). Many SIRT software systems have been developed; Tomo3D is a parallel-computing SIRT system [14, 15] that can run on multi-core CPUs and GPUs, and this study used Tomo3D to reconstruct the large tomographic images. Table 2.1 shows the time comparisons of FBP on the GPU and Tomo3D. The two test data sets, named HeLa and Kidney, were a HeLa cell and a mouse kidney, respectively. For HeLa, the range of projection angles was ±70°, the angle interval was 1°, and the output size was 1024 × 1024 pixels. For Kidney, the range of projection angles was ±90°, the angle interval was 1°, and the output size was 4880 × 4880 pixels. Note that Tomo3D used 30 iterations to reconstruct each result. Fig. 2.7 shows the images reconstructed by FBP and Tomo3D. As shown in Fig. 2.7, the quality of SIRT is better than that of FBP when the image size is large.
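For illustration, a compact SIRT-style update following Eq. (2.14) can be written as below; the dense system matrix and the helper names are our own simplifications (real systems such as Tomo3D use sparse, parallelized projectors).

```python
import numpy as np

def sirt(W, p, n_iters=30):
    """Minimal SIRT sketch for Eq. (2.14).

    W : (M, N) array; row j holds the pixel areas covered by ray j.
    p : (M,) array of measured projections.
    Returns the flattened reconstructed image f of length N.
    """
    f = np.zeros(W.shape[1])
    row_norms = np.einsum('ij,ij->i', W, W)  # w_j . w_j for every ray
    row_norms[row_norms == 0] = 1.0          # guard against empty rays
    for _ in range(n_iters):
        residual = (p - W @ f) / row_norms   # per-ray correction
        f = f + W.T @ residual               # back project all rays at once
    return f
```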

Figure 2.6: The theory of ART. The output image has N pixels, and the sinogram contains M projections. $\vec{f} = [f_1, f_2, \cdots, f_N]$ is the reconstructed image, $\vec{w}^j = [w^j_1, w^j_2, \cdots, w^j_N]$ contains the pixel areas covered by the jth ray, and $p_j$ is the projection measured from the jth ray.

Figure 2.7: The tomographic images of FBP and SIRT. (a) and (b) are images of a HeLa cell; (c) and (d) are images of a mouse kidney. (a) and (c) were reconstructed by FBP on the GPU; (b) and (d) were reconstructed by Tomo3D.

Table 2.1: The time comparisons of FBP and SIRT.

Data     Sinogram size (pixels)   Output size (pixels)   FBP on GPU (sec)   Tomo3D (sec)
HeLa     1024 × 140               1024 × 1024            0.1                0.61
Kidney   4880 × 180               4880 × 4880            9.1                63.6

The run time statistics were obtained using an Intel i7 3.4 GHz CPU, 12 GB main memory, and an nVidia GTX 760 GPU running Windows 8.1; Tomo3D used 30 iterations.


Chapter 3

Visualization

Visualization is a technique that generates images, animations, or figures, enabling users to understand and analyze a multidimensional data set. Volume visualization techniques can be divided into two types: surface rendering and ray-casting rendering. Surface rendering requires geometric primitives to construct isosurfaces, which represent the set of points with a selected constant value. These isosurfaces are then rendered using a shading algorithm. Lorensen and Cline [16] proposed the marching cubes method, in which 15 primitive types of polygons are defined to create isosurfaces within a unit cube.

Unlike surface rendering, ray-casting rendering produces rendered images by imitating X-rays passing through an object. The 2D projections are obtained by integrating the voxel values along lines in the view direction; the integral follows the optical model proposed by Max [17]. This study uses two ray-casting rendering methods, texture-based volume rendering and Fourier volume rendering, to verify the reconstructed volume data.

3.1 Texture-Based Volume Rendering

Texture-based volume rendering (TVR) [5] is the most popular visualization technique in recent years. This method uses graphics hardware (GPU) and texture mapping to accelerate the integration of ray casting. A texture is a data set stored in the video memory of the graphics hardware. To use a texture, a proxy polygon is constructed, and each vertex of the proxy polygon is assigned a texture coordinate; note that each component of a texture coordinate lies in [0, 1]. During rendering, the graphics hardware uses linear interpolation to compute the texture coordinate of each pixel in the drawing area of the proxy polygon. According to these interpolated texture coordinates, the texture is then mapped onto the drawing area of the proxy polygon. Fig. 3.1 shows the procedure of texture mapping.

Before volume rendering, a volume must be stored in the video memory as a three-dimensional texture. Given a viewpoint and a bounding cube of the volume, TVR constructs a set of proxy polygons that are parallel to the view plane and whose vertices all lie on the edges of the bounding cube. Fig. 3.2 shows the proxy polygons for TVR. These proxy polygons are used to sample voxels from the three-dimensional texture during rendering. In the voxel sampling stage, the color and opacity of each voxel can be decided by a transfer function to enhance the region of interest. Next, the polygons are sorted by their distances to the viewpoint; as shown in Fig. 3.2, Polygon N is the farthest polygon and Polygon 1 is the nearest. Finally, the rendering result is generated by drawing the proxy polygons from Polygon N to Polygon 1 with alpha blending.

Fig. 3.3 shows a rendering result of two volumes of 256³ voxels, a tomographic head and a car engine. This result was rendered by TVR with Phong shading [18] and shadowing [19].
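The proxy-polygon pass can be emulated on the CPU for a fixed axis-aligned view. The sketch below is our own simplification, with tf_color and tf_alpha assumed to be user-supplied transfer functions; it composites slices back to front with the standard alpha-blending "over" operator.

```python
import numpy as np

def composite_slices(volume, tf_color, tf_alpha):
    """Back-to-front compositing of axis-aligned slices (a CPU stand-in
    for the GPU proxy-polygon pass; voxel values assumed in [0, 1])."""
    h, w = volume.shape[1:]
    img = np.zeros((h, w, 3))
    for slice_ in volume[::-1]:              # Polygon N first, Polygon 1 last
        c = tf_color(slice_)                 # (h, w, 3) color per voxel
        a = tf_alpha(slice_)[..., None]      # (h, w, 1) opacity per voxel
        img = img * (1.0 - a) + c * a        # the alpha-blending "over" operator
    return img
```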

To enhance the region of interest, designing an appropriate transfer function is vital, and numerous practical and effective transfer functions have been proposed. For example, Engel et al. [20] proposed the pre-integrated 2D transfer function to improve rendering quality. Kindlmann et al. [21] used the surface curvature to determine the contour thickness for non-photorealistic volume rendering. Lum and Ma [22] proposed the lighting transfer function to enhance the boundary surfaces of the region of interest. Caban and Rheingans [23] designed a transfer function that considers the texture of a feature. Additionally, Correa and Ma developed a number of multidimensional transfer functions for various purposes [24], [25], [26]. To conveniently modify a transfer function, Wu and Qu [27] devised three operations for combining transfer functions. Zhou and Takatsuka [28] utilized the contour tree and a residue flow model to automatically generate a harmonic color transfer function.

Because TVR samples volume data from the video memory, the original TVR method cannot render volume data larger than about 1024³ voxels. To solve this problem, a large volume is divided into several blocks before data loading, and these blocks are loaded into the video memory according to their visibility. Two data structures have been proposed for large volume data: Kniss et al. used level of detail (LOD) to construct a multi-resolution data structure for a large volume [29], and Wang et al. used entropy to divide a large volume into several blocks [30].

Figure 3.1: Texture mapping. The image on the left is a 2D texture of a tomographic head. The image on the right is a proxy polygon. Each vertex of the proxy polygon is assigned a texture coordinate; the range of texture coordinates is (0, 0) to (1, 1).

Figure 3.2: The proxy polygons in TVR. Each proxy polygon is parallel to the view plane, and each vertex lies exactly on the edges of the bounding cube.

Figure 3.3: A result of TVR. This result was generated by TVR with Phong shading and shadowing.


3.2 Fourier Volume Rendering

Fourier volume rendering (FVR) [6], [31], [7] is a volume rendering method based on the Fourier slice theorem. For 3D volume data, the FVR algorithm requires O(n² log n) time to generate a result. Because this is less than the O(n³) time of spatial-domain methods, FVR is well suited to real-time rendering with a preprocessing step. The Fourier slice theorem, described in Section 2.1, also holds in 3D space (Fig. 3.4). Let F(u, v, w) be the Fourier transform of the 3D volume f(x, y, z). Given the view direction v, the projection $P_v(x, y)$ is the projection along v. Using the Fourier slice theorem, $P_v(x, y)$ is obtained by taking the inverse Fourier transform of $P_v(u', v')$, where $P_v(u', v')$ comprises the frequency signals on a 2D plane passing through the origin of F(u, v, w).

The FVR method can be summarized in the following equations. Given the volume data $f(\mathbf{x})$, $\mathbf{x} \in \mathbb{R}^3$, we define $\Pi_v$ as an operator that performs FVR from the viewing direction v. The FVR of $f(\mathbf{x})$ from v can be presented as:

$$I = \Pi_v(f(\mathbf{x})) = \mathcal{F}_2^{-1}(\mathcal{F}_3(f(\mathbf{x}))\,\delta_v). \quad (3.1)$$

In the first part of Eq. (3.1), I is the projection of $f(\mathbf{x})$ from the viewing direction v; in the second part, $\delta_v$ restricts the spectrum of the 3D Fourier transform, $\mathcal{F}_3$, to a plane passing through the origin and perpendicular to v. We then take the 2D inverse Fourier transform, $\mathcal{F}_2^{-1}$, to obtain the projection I.

Provided the Fourier transform of the volume data is available, the inverse Fourier transform of a slice of the frequency domain that passes through the origin and is perpendicular to the view direction is the projection of the volume along the view direction. This projection is exactly the result of direct volume rendering. Because only the inverse Fourier transform of a slice must be calculated, a volume of n³ voxels can be rendered in O(n² log n) time. To accelerate rendering further, Viola et al. [32] presented a GPU implementation of FVR that exploits parallel computing.

Because resampling in the frequency domain can produce replicas in the spatial domain [10], Malzbender designed several filters to reduce the resulting artifacts [7]. Levoy [31] presented three shading models for FVR, including depth cueing, directional lighting, and specular reflections. Combined with Levoy's shading model, Entezari et al. [33] used spherical harmonic functions to approximate cubic illumination shading for FVR.
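As an illustration of Eq. (3.1) (a naive CPU sketch with names of our own choosing, not the GPU implementation of Viola et al.), the code below extracts the central slice perpendicular to the view direction by linear resampling and takes its 2D inverse FFT; the simple order-1 interpolation is exactly the kind of resampling that produces the replica artifacts mentioned above.

```python
import numpy as np
from scipy.ndimage import map_coordinates

def fvr_project(F3, view_dir):
    """Minimal FVR sketch: slice the shifted 3D spectrum F3
    (np.fft.fftshift(np.fft.fftn(volume)), cubic volume assumed)
    through the origin, perpendicular to view_dir, then invert in 2D."""
    n = F3.shape[0]
    # Build an orthonormal basis (e1, e2) spanning the slice plane.
    v = np.asarray(view_dir, float); v /= np.linalg.norm(v)
    a = np.array([1.0, 0, 0]) if abs(v[0]) < 0.9 else np.array([0, 1.0, 0])
    e1 = np.cross(v, a); e1 /= np.linalg.norm(e1)
    e2 = np.cross(v, e1)
    # Sample the plane through the spectrum's center (origin after fftshift).
    u, w = np.meshgrid(np.arange(n) - n / 2, np.arange(n) - n / 2)
    pts = u[..., None] * e1 + w[..., None] * e2 + n / 2      # (n, n, 3)
    coords = pts.transpose(2, 0, 1)                          # (3, n, n)
    real = map_coordinates(F3.real, coords, order=1)
    imag = map_coordinates(F3.imag, coords, order=1)
    # The central slice of a real volume is Hermitian, so the result is real.
    return np.fft.ifft2(np.fft.ifftshift(real + 1j * imag)).real
```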


For any volume rendering algorithm that processes volume data in the spatial domain, a transfer function can be applied to the volume and the rendered result recalculated within the same time bound. For FVR, however, which works on frequency-domain data, re-rendering the volume after applying a transfer function within the same O(n² log n) time bound is challenging. A naive approach would be either to apply the transfer function to the volume data and recompute the Fourier transform, or to implement the convolution operation in the frequency domain; the time required for both approaches exceeds O(n² log n). To improve the re-rendering time after the application of a transfer function, Nagy et al. [34] proposed a binary classification transfer function for FVR, using the Fourier series of a step function to eliminate unwanted voxels.

Figure 3.4: The FVR algorithm. In the frequency domain, the 2D frequency function $P_v(u', v')$ passing through the origin of F(u, v, w) is extracted. The projection $p_v(x, y)$ is obtained by taking the 2D inverse Fourier transform of $P_v(u', v')$.

3.2.1 Transfer Function Design for FVR

This study proposes a voxel classification method for FVR [35, 36]. The method uses Bézier curves and B-splines to construct transfer functions such that the voxel classification can be adjusted immediately at rendering time. Let $f(x) = a \cdot f_1(x) + b \cdot f_2(x)$, where $f_1(x)$ and $f_2(x)$ are two integrable functions. The Fourier transform, $\mathcal{F}$, possesses the linearity property shown in Eq. (3.2):

$$\mathcal{F}(f(x)) = \mathcal{F}(a \cdot f_1(x) + b \cdot f_2(x)) = a \cdot \mathcal{F}(f_1(x)) + b \cdot \mathcal{F}(f_2(x)), \quad (3.2)$$

where a and b ∈ ℝ are two constants. This property can be used to efficiently implement the shading model and transfer function of FVR.

The shading model of FVR proposed by Levoy [31] takes advantage of this linearity. Let g be a shading model applied to the volume data $f(\mathbf{x})$. From Eq. (3.1), the FVR of the resulting volume can be presented as

$$I = \Pi_v(g(f(\mathbf{x}))) = \mathcal{F}_2^{-1}(\mathcal{F}_3(g(f(\mathbf{x})))\,\delta_v). \quad (3.3)$$

If $g(f(\mathbf{x}))$ can be decomposed into a linear combination of m terms, then:

$$g(f(\mathbf{x})) = h_0 \cdot g_0(f(\mathbf{x})) + h_1 \cdot g_1(f(\mathbf{x})) + \ldots + h_{m-1} \cdot g_{m-1}(f(\mathbf{x})), \quad (3.4)$$

where $h_i$ is the shading factor. Substituting Eq. (3.4) into $g(f(\mathbf{x}))$ in Eq. (3.3), we obtain Eq. (3.5):

$$I = \Pi_v(h_0 \cdot g_0(f(\mathbf{x})) + \ldots + h_{m-1} \cdot g_{m-1}(f(\mathbf{x}))) = \mathcal{F}_2^{-1}\big((h_0 \cdot \mathcal{F}_3(g_0(f(\mathbf{x}))) + \ldots + h_{m-1} \cdot \mathcal{F}_3(g_{m-1}(f(\mathbf{x}))))\,\delta_v\big) = \sum_{i=0}^{m-1} h_i \cdot \Pi_v(g_i(f(\mathbf{x}))). \quad (3.5)$$

In Eq. (3.5), the rendered result is the summation of m weighted FVR results. Therefore, recomputing the 3D Fourier transform is not required if $\Pi_v(g_i(f(\mathbf{x})))$, $i = 0, 1, \ldots, m-1$, are available.
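Eq. (3.5) is what makes interactive reclassification cheap: the per-basis projections are computed once per view, and any change of the weights is just a weighted sum of cached 2D images. A sketch of this idea, reusing the hypothetical fvr_project above:

```python
import numpy as np

def precompute_basis_projections(volume, basis_funcs, view_dir):
    # One 3D FFT and one slice extraction per basis term g_i -- done once.
    return [fvr_project(np.fft.fftshift(np.fft.fftn(g(volume))), view_dir)
            for g in basis_funcs]

def rerender(cached, weights):
    # Changing transfer-function weights re-renders with no 3D FFT at all.
    return sum(w * p for w, p in zip(weights, cached))
```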

3.2.2 Bézier Curve Transfer Function

A Bézier curve defined by m control points is shown in Eq. (3.6):

$$B(u) = \sum_{i=0}^{m-1} w_i\, b_{i,m-1}(u), \quad (3.6)$$

where u and $w_i$ ∈ ℝ, and

$$b_{i,m-1}(u) = \binom{m-1}{i}\, (1-u)^{m-i-1}\, u^i. \quad (3.7)$$

Figure 3.5: A spline with m control points can be used to present the transfer function. The shape of a spline curve can be modified by changing the heights of the control points in a graphical user interface.

As shown in Fig. 3.5, a Bézier curve is defined in the U-W coordinate system, where the U-axis corresponds to the voxel value and the W-axis corresponds to the shading weight of the voxel value. The Bézier curve consists of m control points, $p_i = (u_i, w_i)$, $i = 0, 1, \ldots, m-1$, where $u_i = u_0 + ih$ and $h = \frac{1}{m-1}$. By adjusting $w_i$, we edit the transfer function. Combining Eqs. (3.6), (3.3), and (3.5), FVR with a Bézier curve transfer function is written as

$$I = \mathcal{F}_2^{-1}\Big(\mathcal{F}_3\Big(\sum_{i=0}^{m-1} w_i\, b_{i,m-1}(f(\mathbf{x}))\Big)\delta_v\Big) = \mathcal{F}_2^{-1}\Big(\Big(\sum_{i=0}^{m-1} w_i\, \mathcal{F}_3(b_{i,m-1}(f(\mathbf{x})))\Big)\delta_v\Big) = \sum_{i=0}^{m-1} w_i\, \Pi_v(b_{i,m-1}(f(\mathbf{x}))). \quad (3.8)$$

According to Eq. (3.8), the FVR of a volume after applying the Bézier curve transfer function is the summation of the m weighted FVR results.
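For completeness, the Bernstein bases of Eq. (3.7) can be evaluated voxel-wise to produce the m volumes $b_{i,m-1}(f(\mathbf{x}))$ whose projections are cached; a small sketch of our own, assuming voxel values normalized to [0, 1]:

```python
from math import comb
import numpy as np

def bernstein_volumes(volume, m):
    # b_{i,m-1}(u) = C(m-1, i) (1-u)^(m-1-i) u^i applied to every voxel.
    u = np.clip(volume, 0.0, 1.0)
    return [comb(m - 1, i) * (1.0 - u) ** (m - 1 - i) * u ** i
            for i in range(m)]
```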


3.2.3 B-Spline Transfer Function

One disadvantage of the Bézier curve transfer function is that it does not allow good localized control of the curve shape: modifying one control point may alter the shape of a significant portion of the curve. Compared to the Bézier curve, a B-spline [37] is more useful for controlling the curve shape. A k-degree B-spline is defined in Eq. (3.9):

$$R(u) = \sum_{i=0}^{m-1} w_i\, r_{i,k}(u), \quad (3.9)$$

where

$$r_{i,k}(u) = \frac{s_{i,k}(u)}{\sum_{j=0}^{m-1} s_{j,k}(u)} \quad (3.10)$$

and $s_{i,k}(u)$ is the B-spline basis function. Let a non-decreasing sequence $T = \{t_i \mid t_i \le t_{i+1},\ i = 1, \ldots, l-1\}$ be the knot vector. $s_{i,k}(u)$ is recursively defined as in Eq. (3.11):

$$s_{i,k}(u) = \frac{u - t_i}{t_{i+k} - t_i}\, s_{i,k-1}(u) + \frac{t_{i+k+1} - u}{t_{i+k+1} - t_{i+1}}\, s_{i+1,k-1}(u), \quad (3.11)$$

and

$$s_{i,0}(u) = \begin{cases} 1 & \text{if } t_i \le u < t_{i+1}, \\ 0 & \text{otherwise.} \end{cases} \quad (3.12)$$

Given the volume data $f(\mathbf{x})$, taking Eq. (3.9) as the transfer function and applying FVR to the resulting volume, we have

$$I = \Pi_v(R(f(\mathbf{x}))). \quad (3.13)$$

Substituting Eq. (3.9) into Eq. (3.13), and according to Eq. (3.5), we have

$$I = \mathcal{F}_2^{-1}\Big(\mathcal{F}_3\Big(\sum_{i=0}^{m-1} w_i\, r_{i,k}(f(\mathbf{x}))\Big)\delta_v\Big) = \mathcal{F}_2^{-1}\Big(\Big(\sum_{i=0}^{m-1} w_i\, \mathcal{F}_3(r_{i,k}(f(\mathbf{x})))\Big)\delta_v\Big) = \sum_{i=0}^{m-1} w_i\, \Pi_v(r_{i,k}(f(\mathbf{x}))). \quad (3.14)$$

The FVR of a volume after applying the B-spline transfer function is the summation of the m weighted FVR results.
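The recursion of Eqs. (3.11)-(3.12) translates directly into code; the sketch below is our own (with t a plain sequence of knot values) and guards the conventional 0/0 = 0 cases:

```python
import numpy as np

def bspline_basis(i, k, u, t):
    # Cox-de Boor recursion for s_{i,k}(u); Eq. (3.12) is the base case.
    if k == 0:
        return np.where((t[i] <= u) & (u < t[i + 1]), 1.0, 0.0)
    left = 0.0
    if t[i + k] != t[i]:
        left = (u - t[i]) / (t[i + k] - t[i]) * bspline_basis(i, k - 1, u, t)
    right = 0.0
    if t[i + k + 1] != t[i + 1]:
        right = ((t[i + k + 1] - u) / (t[i + k + 1] - t[i + 1])
                 * bspline_basis(i + 1, k - 1, u, t))
    return left + right
```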

3.2.4 Increasing the Control Points

Using additional control points enables the transfer function to be easily shaped to a desired form. Unfortunately, the memory space requirements are linearly proportional to the number of control points. Consequently, the volume data may not fit into the GPU memory. To overcome this problem, we propose to increase the number of control points, but then cluster the control points into groups. Control points in the same group will either share identical shading factors, or their shading factors can be obtained through interpolation. The memory required then depends on the number of groups, which are manageable in size.

Given n control points in q clusters, where the ith cluster contains $m_i$ control points with weight $w_i$, the formula of the spline curve is given in Eq. (3.15):

$$R(u) = \sum_{i=0}^{q-1} \sum_{j=0}^{m_i-1} w_i\, \alpha_j\, r_{\hat{i}(p_j),k}(u), \quad (3.15)$$

where $\alpha_j$ is a constant and $\hat{i}(p_j)$ is the index of the control point $p_j$. Applying Eq. (3.15) to Eq. (3.5), we have:

$$I = \sum_{i=0}^{q-1} w_i\, \Pi_v\Big(\sum_{j=0}^{m_i-1} \alpha_j\, r_{\hat{i}(p_j),k}(f(\mathbf{x}))\Big). \quad (3.16)$$

The control points in the same cluster share the same shading weight; thus, only one copy of the volume data is stored in the GPU memory per cluster. In Fig. 3.6, the curve consists of eight control points, which can be clustered into four groups: $G_0 = \{p_0\}$, $G_1 = \{p_1, p_3, p_5\}$, $G_2 = \{p_2, p_4\}$, and $G_3 = \{p_6, p_7\}$. In this case, only four copies of the volume data are required.
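The clustering amounts to a small bookkeeping step: per-group weights are broadcast to the control points they contain. A sketch of our own, using the grouping of Fig. 3.6:

```python
def expand_group_weights(groups, group_weights, m):
    # groups, e.g. [[0], [1, 3, 5], [2, 4], [6, 7]] as in Fig. 3.6; every
    # control point inherits its group's weight (Eq. 3.15 with alpha_j = 1).
    w = [0.0] * m
    for members, gw in zip(groups, group_weights):
        for p in members:
            w[p] = gw
    return w
```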

If the weight of a control point can be obtained by interpolation from the weights of other control points, the control point can also be clustered into a group. Given a sequence of control points $p_c$, $0 \le i \le c \le j < m$, the weight of the control point $p_d$, $i < d < j$, can be obtained through polynomial interpolation:

$$w_d = \sum_{c=i,\, c \neq d}^{j} w_c\, l_c(u_d). \quad (3.17)$$

In Eq. (3.17), $l_c$ is an interpolation basis function, such as linear interpolation, spline interpolation, or Lagrange interpolation. For ease of explanation, we assume only one control point is obtained through interpolation; generalizing to more than one control point is not difficult. Applying the transfer function to a volume, Eq. (3.14) becomes:

$$I = w_d\, \Pi_v(r_{d,k}(f(\mathbf{x}))) + \sum_{c=0,\, c \neq d}^{m-1} w_c\, \Pi_v(r_{c,k}(f(\mathbf{x}))) = \sum_{c=i,\, c \neq d}^{j} w_c\, l_c(u_d)\, \Pi_v(r_{d,k}(f(\mathbf{x}))) + \sum_{c=0,\, c \neq d}^{m-1} w_c\, \Pi_v(r_{c,k}(f(\mathbf{x}))) = \sum_{c=i,\, c \neq d}^{j} w_c\, \Pi_v\big(l_c(u_d)\, r_{d,k}(f(\mathbf{x})) + r_{c,k}(f(\mathbf{x}))\big) + \sum_{0 \le c < i} w_c\, \Pi_v(r_{c,k}(f(\mathbf{x}))) + \sum_{j < c < m} w_c\, \Pi_v(r_{c,k}(f(\mathbf{x}))). \quad (3.18)$$

The simplest polynomial interpolation is linear interpolation. Given three control points $p_i = (u_i, w_i)$, $p_j = (u_j, w_j)$, and $p_d = (u_d, w_d)$, where $u_i < u_d < u_j$, the weight $w_d$ can be obtained by

$$w_d = (1-\beta)\, w_i + \beta\, w_j, \quad (3.19)$$

where $\beta = (u_d - u_i)/(u_j - u_i)$. Applying the transfer function to the volume, Eq. (3.14) becomes

$$I = w_i\, \Pi_v(r_{i,k}(f(\mathbf{x}))) + w_d\, \Pi_v(r_{d,k}(f(\mathbf{x}))) + w_j\, \Pi_v(r_{j,k}(f(\mathbf{x}))) = w_i\, \Pi_v(r_{i,k}(f(\mathbf{x}))) + w_j\, \Pi_v(r_{j,k}(f(\mathbf{x}))) + \big((1-\beta)\, w_i + \beta\, w_j\big)\, \Pi_v(r_{d,k}(f(\mathbf{x}))) = w_i\, \Pi_v\big(r_{i,k}(f(\mathbf{x})) + (1-\beta)\, r_{d,k}(f(\mathbf{x}))\big) + w_j\, \Pi_v\big(r_{j,k}(f(\mathbf{x})) + \beta\, r_{d,k}(f(\mathbf{x}))\big). \quad (3.20)$$

Figure 3.6: The clustered control points. Eight control points are clustered into four groups. In this case, only four copies of the volume data are required.

Equation (3.20) shows that recomputing the Fourier transform of the volume data is not required if the weights of the control points are modified.

3.2.5 Results of FVR

We developed a GUI (graphical user interface) software system with the proposed transfer function for FVR. Using this software, users can easily modify the transfer function and change the view direction. Our implementation environment was as follows:

• CPU: Pentium 4, 2.4 GHz,

• Memory: 2 GB,

• GPU: nVidia GeForce 8800 GT,

• Video memory: 768 MB,

• Viewport size: 512 × 512 pixels.

To evaluate the performance, we used two sets of volume data of different sizes, 128³ and 256³ voxels. We measured the rendering time and the pre-processing time, which included computing a 3D FFT. The performance for the different cases is shown in Table 3.1. Because modern GPUs support parallel computing, increasing the volume size and the number of control points only slightly affects the frame rate. However, the pre-processing time is linearly proportional to the volume size and the number of control points.

This study presents rendered results for several data sets. The first data set was CT-scan human chest volume data of 256³ voxels. The rendered results obtained using the Bézier curve and the B-spline transfer functions, each defined by six control points, are shown in Figs. 3.7 and 3.8. In both figures, the images on the left show the rendered results obtained by applying the transfer functions shown on the right; the transfer functions in the first and second rows were designed to depict the lung and bone structures, respectively. As shown in Fig. 3.8, using a B-spline as the transfer function can significantly enhance the lung capillaries. The results in Fig. 3.9 show that a transfer function with more control points can achieve better rendering; here we used a B-spline transfer function with 20 control points clustered into six groups. Comparing the images in Figs. 3.8 and 3.9, the lung capillaries are significantly further enhanced.

We also tested two volume data sets of HeLa cells, HeLa1 and HeLa2 (see Section 4.1.3). These data sets were reconstructed from synchrotron radiation images aligned by the proposed alignment method (see Chapter 4). The size of HeLa1 and HeLa2 is 256³ voxels. Fig. 3.10 shows the results for HeLa1, obtained using a B-spline transfer function with 25 control points. Among these control points, three, $P_0$ to $P_2$, were used to adjust the weights of membrane voxels. The cluster $G_0$ contained the first two control points, used to adjust the background voxels. The cluster $G_1$ contained the middle six control points, used to adjust the weights of the voxels with intensities between membrane and particles. The cluster $G_2$ contained the remaining 11 control points, used to adjust the weights of particle voxels. In the top row, we mainly enhanced the weights of $P_1$ and $P_2$ to render the cell membrane. The bottom row shows the results for the particles, for which the weight of $G_2$ was enhanced.

Fig. 3.11 shows the results for HeLa2, obtained using a B-spline transfer function with 50 control points. Among these control points, three, $P_0$ to $P_2$, were used to adjust the weights of membrane voxels. The cluster $G_0$ contained the first two control points, used to adjust the background voxels. The cluster $G_1$ contained the middle seven control points, used to adjust the weights of the voxels with intensities between membrane and particles. The cluster $G_2$ contained the remaining 38 control points, used to adjust the weights of particle voxels. In the top row, we mainly enhanced the weights of $P_1$ and $P_2$ to render the cell membrane. The bottom row shows the results for the particles; the gray scales of the particles are between 0.5 and 1.0, so the weight of $G_2$ was enhanced to render them.

One disadvantage of the proposed method is the significant amount of memory required. A user can more easily design a curve shape if more control points are used; however, the memory required increases in linear proportion to the number of control points, and GPU memory is limited. Currently, commercially available GPUs can process a volume of 256³ voxels if six control points are used. When additional control points are required, the control points can be clustered into groups if they possess the same weight or if their weights can be obtained through interpolation. Each group requires one copy of the volume, so clustering keeps the required memory at a manageable size.

One limitation of the proposed method is that transfer functions containing negative values are not allowed; this issue is left for future research. Another future work is to design an efficient sampling method for extracting a slice of the frequency-domain volume, because the computing time required to sample a slice of the frequency-domain data is longer than that required to perform the inverse 2D Fourier transform. A more efficient sampling method is required to improve the overall performance.


Figure 3.7: The results of FVR with the Bézier curve transfer function (CT chest). The data set was the CT-scan human chest of 256³ voxels. The left image in the first row depicts the lung structure; the image on the right shows the transfer function. We tried to enhance the gray scales between 0.4 and 0.65, but the gray scales between 0.25 and 0.8 were also enhanced, so the lung capillaries are blurred. The left image in the second row shows the bone structures, obtained by enhancing the gray scales between 0.9 and 1.0.


Figure 3.8: The results of FVR with the B-spline transfer function (CT chest). The same volume data as in Fig. 3.7, rendered using the B-spline transfer function. The first row shows the lung structure and the second row shows the bone structure. The B-spline transfer function made it easier to control the curve shape; the lung capillaries are shown more clearly than in Fig. 3.7.


Figure 3.9: The results of FVR with the B-spline transfer function (CT chest). The same volume data as in Fig. 3.7, rendered using a B-spline transfer function defined by 20 control points clustered into six groups, $G_0$ to $G_5$. The first row shows the lung structure, with the gray scales between 0.4 and 0.65 enhanced. The second row shows the bone structure, with the gray scales between 0.7 and 1.0 enhanced. Compared to the results shown in Fig. 3.8, the lung capillaries and backbone structure are further enhanced.


Figure 3.10: The results of FVR with the B-spline transfer function (HeLa1). This is the volume of a HeLa cell with 256³ voxels. The results were obtained using a B-spline transfer function with 25 control points. $P_0$ to $P_2$ were used to adjust the weights of membrane voxels, and the cluster $G_2$ contained the final 11 control points, used to adjust the weights of particle voxels. In the top row, the weights of $P_1$ and $P_2$ were enhanced to render the cell membrane; in the bottom row, the weight of $G_2$ was enhanced to render the particles.

Figure 3.11: The results of FVR with the B-spline transfer function (HeLa2). This is the volume of a HeLa cell with 256³ voxels. The results were obtained using a B-spline transfer function with 50 control points. $P_0$ to $P_2$ were used to adjust the weights of membrane voxels, and the cluster $G_2$ contained the final 38 control points, used to adjust the weights of particle voxels. In the top row, the weights of $P_1$ and $P_2$ were enhanced to render the cell membrane; in the bottom row, the weight of $G_2$ was enhanced to render the particles.

Table 3.1: The computing time of two cases using the B-spline as a transfer function.

Data size (voxels)           |        128³         |        256³
Number of control points     |   6     4     1     |   6      4      1
Memory required (MB)         |  96    64    16     | 768    512    128
Pre-processing (seconds)     | 18.1  12.1    3     | 167.4  109    8.7
Frame rate (FPS)             | 64.1  65.8  66.46   | 59.2   62.41  66.26


Chapter 4

High-Resolution Volume Reconstruction

Theoretically, a volume image can be reconstructed from a sequence of high-resolution projections. In practice this breaks down, because mechanical imprecision becomes a problem when the resolution increases to a certain level, such as that required for cell tomography. When the pixel size is approximately 10 nm, a slight mechanical vibration can prevent accurate reconstruction. Fig. 4.1 demonstrates this problem. In Fig. 4.1(a), four X-ray images of a cancer cell are taken from different projection angles; the yellow lines mark the Y-axis positions of a black particle. Because of the mechanical vibration, the Y-axis positions of the black particle differ between the images. Fig. 4.1(b) shows the sinogram constructed from one row in each of the 140 X-ray images of the cancer cell; the sinogram is not smooth, and reconstruction is not possible. Fig. 4.1(c) shows the FBP result.

In our cell tomography experiments, the pixel size of each image is 11.78 nm. Rotating the object holder can cause a 5 to 30 pixel difference in position because of the mechanical instability. Although the position of the object holder can be calibrated during image acquisition, the calibration process can take an unacceptably long time, causing the object to receive excessive X-rays. The TXM controller provided by Xradia (hereinafter called "Xradia") was designed to solve the misalignment problem; unfortunately, manual adjustments are generally required to obtain satisfactory tomography reconstruction. A similar problem also exists in electron microscope tomography, where using the cross-correlation function to align the images is a common solution [38]; a software system, "SPIDER", was implemented based on the cross-correlation function [39]. However, cross-correlation alignment does not consider the projection model, which limits the quality of the tomography reconstruction.

This study presents a feature-based alignment approach for calibrating the displacement caused by mechanical vibration [40]. Because a synchrotron X-ray is a parallel-beam projection, the resulting displacement can be decomposed into vertical and horizontal components. The proposed method aligns the images in the vertical direction by direct image alignment. Calibrating the images in the horizontal direction is more complex: in addition to matching feature points, the matched feature points must form sine-wave-shaped loci. We propose approximating the loci of the matched feature points in the x-θ coordinate system by sine waves using least-squares curve fitting; the deviation between the loci and the fitted sine waves provides the information for horizontal calibration.

The following two sections respectively describe the proposed method and the results.

4.1 Methods

4.1.1 Sample Preparation and Image Acquisition

HeLa cells were used in this study. The cells were grown on Kapton film and stained by the endocytosis [41] of 250 µM (micromolar) gold nanoparticles. The cells were then fixed in a container using a mixture of paraformaldehyde and glutaraldehyde [42].

The synchrotron microscope used in this study was built at the National Synchrotron Radiation Research Center, Hsinchu, Taiwan. The CCD size is 2048×2048 pixels, and the field of view is 24 µm. Each projection image was taken after a 1° rotation. To prevent the object holder from becoming perceptible in the images (which occurs when the object holder is nearly parallel to the X-ray beam), the rotation range of the scan was limited.


Figure 4.1: The misalignment in tomography. (a) Four X-ray images of a cancer cell acquired from different projection angles; the yellow lines mark the Y-axis positions of a black particle, which differ because of the mechanical vibration. (b) The sinogram extracted from the 140 misaligned X-ray images. (c) The FBP result of (b).


In total, 140 projection images were acquired. The size of each image was 1024 × 1024 pixels, and the pixel size was 11.78 nm. The exposure time of each image was 1 second.

4.1.2 The Alignment Method

A projected feature point is a pronounced mark in an X-ray projection image. Alignment is accomplished by first aligning the projected feature points in the vertical direction and then in the horizontal direction. Because the rotation axis is vertical and the beam is parallel, the vertical position of each projected feature point should remain unchanged across projections; vertically aligning each image to the previous image is therefore sufficient. The horizontal location of a feature point, however, varies among the projection images, so calibrating the horizontal direction is a more difficult task than the vertical alignment. The following subsections describe these steps.

Vertical Direction Alignment

For each pair of projection images, the sum of the intensity values on each row is calculated. The sums of the rows form histograms that should be similar in a pair of consecutive images. The vertical displacement can be calculated by minimizing the difference between the histograms.

Given an N × M image I(x, y), 0 ≤ x < N, 0 ≤ y < M, with I(x, y) ∈ [0, 1], the vertical histogram h is calculated by

h(I, y) = \frac{1}{N} \sum_{x=0}^{N-1} I(x, y).    (4.1)

Assume that I_a is the unaligned image and I_b is the reference image. The vertical correction of I_a is \hat{y}, which can be estimated by

\hat{y} = \arg\min_{y'} \sum_{y=0}^{M-1} \left( h(I_a, y + y') - h(I_b, y) \right)^2.    (4.2)

To achieve the most favorable results, the image is preprocessed to enhance the features. In this experiment, the estimated correction is more accurate when the images are enhanced

(57)

by applying the edge detection method [43].
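Equations (4.1) and (4.2) translate directly into a short search over candidate shifts. The following sketch is a minimal illustration in Python with NumPy; the function names, the wrap-around boundary handling of np.roll, and the ±40-pixel search range (chosen to cover the 5 to 30 pixel jitter mentioned above) are assumptions, not part of the original implementation.

```python
import numpy as np

def vertical_histogram(image):
    # Eq. (4.1): mean intensity of each row; x runs along axis 1.
    return image.mean(axis=1)

def estimate_vertical_shift(unaligned, reference, max_shift=40):
    # Eq. (4.2): find the shift s minimizing the squared difference
    # between the two row histograms. np.roll wraps around at the
    # borders, a simplification acceptable for small shifts.
    ha = vertical_histogram(unaligned)
    hb = vertical_histogram(reference)
    best_shift, best_cost = 0, np.inf
    for s in range(-max_shift, max_shift + 1):
        cost = np.sum((np.roll(ha, -s) - hb) ** 2)  # h(Ia, y + s) vs. h(Ib, y)
        if cost < best_cost:
            best_shift, best_cost = s, cost
    return best_shift
```

In practice the images would first be edge-enhanced, as noted above, before the histograms are computed.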

Horizontal Direction Calibration

The horizontal calibration is based on the fact that the projected feature points form sine-shaped loci in the x-θ coordinate system. This calibration involves three steps: detecting the projected feature points, matching the projected feature points to construct a set of loci, and fitting the loci to sine curves to adjust the horizontal displacement of the images.

Detecting Projected Feature Points

Feature point extraction is a fundamental step in image stitching, object recognition, and feature-based image alignment [44]. Researchers have proposed many feature detection methods. The corner detection method proposed by Harris and Stephens [45] is commonly used to extract corner-shaped regions in an image. To achieve scale invariance, Kadir and Brady [46] selected the salient region possessing the maximum entropy from the image scale-space as the feature. Lowe [47] proposed the scale-invariant feature transform (SIFT) algorithm, which selects local extrema from the difference-of-Gaussian (DoG) pyramid of an image. The SIFT algorithm uses the gradient location-orientation histogram as a feature descriptor to achieve rotation and illumination invariance. Researchers have proposed several improved versions of the SIFT algorithm: Bay et al. used the Haar wavelet to expedite feature detection [48], Rady et al. proposed an entropy-based feature detection [49], and Suri et al. combined mutual-information alignment with the SIFT algorithm [50].

In this study, a modified SIFT algorithm was employed to automatically extract the projected feature points contained in the X-ray images. The typical SIFT implementation describes a feature by its location, size, the orientation of the sampling region, and the image gradient histogram in the sampling region. Because the proposed method matches the projected feature points in two X-ray images based on mutual information [51, 52, 53], each projected feature point in this study contained the entropy


of the sampling region rather than the image gradient histogram. To reduce the noise and the number of low-contrast projected feature points, the entropy of each selected projected feature point must exceed a given threshold; the experiments in this study used a threshold between 0.5 and 1.0. Because the features in the objects are gold nanoparticles, the size and orientation of the sampling region were fixed in this implementation.
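The entropy test can be sketched as follows (Python with NumPy; the function names, the patch radius, and the bin count are illustrative assumptions, and the entropy is normalized to [0, 1] so that the 0.5 to 1.0 threshold range stated above applies).

```python
import numpy as np

def normalized_patch_entropy(patch, bins=64):
    # Shannon entropy of the intensity distribution in a patch whose
    # values lie in [0, 1], divided by log2(bins) to map it into [0, 1].
    hist, _ = np.histogram(patch, bins=bins, range=(0.0, 1.0))
    p = hist / max(hist.sum(), 1)
    p = p[p > 0]
    return float(-(p * np.log2(p)).sum() / np.log2(bins))

def keep_salient_points(points, image, radius=8, threshold=0.75):
    # Keep only candidate feature points whose fixed-size sampling
    # region has entropy above the threshold; points too close to the
    # image border are skipped for simplicity.
    h, w = image.shape
    kept = []
    for x, y in points:
        if radius <= x < w - radius and radius <= y < h - radius:
            patch = image[y - radius:y + radius + 1, x - radius:x + radius + 1]
            if normalized_patch_entropy(patch) >= threshold:
                kept.append((x, y))
    return kept
```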

Matching Projected Feature Points

Let F_i, i = 1, \ldots, m, be the sets of projected feature points in the m projection images. The projected feature points are classified into k groups. In the ideal case, each group is the set of projected feature points that are the projections of one feature (i.e., one gold nanoparticle) in the object from the various angles. Because the rotation angle between two consecutive images is small, matched projected feature points are in proximity and have similar mutual information. However, the distance between two matched projected feature points depends on the distance between the feature and the rotation axis of the object; this means that a single affine transform cannot match all the projected feature points in two images. Therefore, this study presents a greedy method for classifying the projected feature points. For each pair of images, the random sample consensus (RANSAC) method [54] was first applied to compute an initial alignment of the two images, and a tracking method was then employed to match the projected feature points in the next image.

Several feature tracking methods are available [44]. The proposed method is designed based on Shafique and Shah's method [55], a greedy algorithm for tracking moving objects in videos, and Tang and Tao's method [56], which integrates the hidden Markov model to eliminate unreliable matches.

Given the projected feature point sets F_{i-1} and F_i, the RANSAC method was applied to compute a translation matrix T_{i,i-1} such that, for a sufficient number of projected feature points p in F_{i-1} and q in F_i, |T_{i,i-1}(q) - p| is less than a given threshold ε. Applying the translation matrices T_{i,i-1}, i = 2, \ldots, m, to the consecutive images achieves the initial alignment.
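Under the parallel-beam assumption that two consecutive projections differ by (approximately) a pure translation, the RANSAC step can be sketched as follows (Python with NumPy; the function name, the candidate-pair input, and the parameter values are illustrative assumptions rather than the original implementation).

```python
import numpy as np

def ransac_translation(pts_prev, pts_curr, pairs, eps=3.0, iters=500, seed=None):
    # pts_prev, pts_curr: (n, 2) arrays of projected feature point
    # coordinates in images i-1 and i; pairs: candidate correspondences
    # (a, b) meaning pts_prev[a] <-> pts_curr[b].
    rng = np.random.default_rng(seed)
    prev = pts_prev[[a for a, _ in pairs]]
    curr = pts_curr[[b for _, b in pairs]]
    best_t, best_count = np.zeros(2), -1
    for _ in range(iters):
        k = rng.integers(len(pairs))
        t = prev[k] - curr[k]                 # translation from one sampled pair
        errs = np.linalg.norm(curr + t - prev, axis=1)
        count = int((errs < eps).sum())       # pairs with |T(q) - p| < eps
        if count > best_count:
            best_t, best_count = t, count
    return best_t, best_count
```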

