應用非結構性網格之平行化三維PIC-FEM程式的研究與發展

(1)

國

立

交

通

大

學

機

械

工

程

學

系

博士論文

應用非結構性網格之平行化三維 PIC-FEM 程式

的研究與發展

Development of a Parallelized PIC-FEM Code Using a

Three-Dimensional Unstructured Mesh and Its Applications

研究生：許國賢

指導教授：吳宗信博士

(2)

應用非結構性網格之平行化三維 PIC-FEM 程式的研究與發展

Development of a Parallelized PIC-FEM Code Using a

Three-Dimensional Unstructured Mesh and Its Applications

研究生：許國賢 Student：Kuo-Hsien Hsu

指導教授：吳宗信博士 Advisor：Dr. Jong-Shinn Wu

國立交通大學

機械工程學系

博士論文

A Thesis

Submitted to Institute of Mechanical Engineering Collage of Engineering National Chiao Tung University

In Partial Fulfillment of the Requirements for the Degree of

Doctor of Philosophy in

Mechanical Engineering July 2006

Hsinchu, Taiwan, Republic of China

中華民國九十五年七月

(3)

誌謝

首先由衷感謝論文指導老師吳宗信教授在學術研究上的細心及辛苦的指導。在兩年碩士班和五年博士班的學習期間，學生深切地感受到老師對學術研究的熱情與願景，淺移默化間也成為激勵學生本身不斷追求進步的動力；在待人處事的生活哲學上，老師更是樹立了我們學習的好榜樣，時時刻刻提醒我們身為國家社會一份子所應擔當的責任與義務。感謝口試委員傅武雄主任、蔡惠峰副主任、陳彥升研究員、洪哲文教授以及江仲驊教授給與本論文撰寫及研究成果的建議。在此特別感謝大學專題指導老師江仲驊教授，多年來的關心與鼓勵。江老師總是在電話筒的那端耐心傾聽學生那不善表達的心情告白，能親自邀請老師擔任口試委員，是學生心目中對您最大的敬意。感謝劉晉良教授、黃楓南教授與鄭宗杰博士分別在數值方法及實驗研究上的指導與討論。感謝曾經是 GDKL 實驗室和現在 MuST 實驗室一同打拼的夥伴們：許佑霖學長、邵雲龍學長，曾坤樟學長、連又永同學、李允民學弟、周欣芸學妹、洪捷粲學弟、李富利學弟、許哲維學弟、鄭凱文學弟以及所有碩士班的學弟妹們。感謝主內弟兄陳百彥學弟在論文發表上的協助和主前的代禱，讓我能安心和專心在程式上的發展。特別感謝邵雲龍學長、洪捷粲學弟和李富利學弟在我博士論文最後階段，全力的協助與幫忙。謝謝你們大家豐富了我的研究生活，更讓我享受到 team work 的甘甜。

(4)

深深感謝父親許忠裕先生和母親陳幸雪女士在我求學生涯的過程，在經濟上和生活上的不間斷的支持和鼓勵。感謝我的岳父吳義行先生和岳母鄭秀月女士，願意讓他們的寶貝女兒嫁給一個還在求學的學生；謝謝您們對我們的信任以及信仰上的相互扶持；感謝您們無怨無悔的愛，希望這份畢業的榮耀，是我這兩位父親今年父親節最特別的禮物。感謝小妹淑貞、妹婿碩賢和小弟志豪，在我外出求學的期間，對父母的孝順與付出。感謝文博、敏慈、大勇和美嬋對我們的鼓勵，也謝謝你們總是開心地跟我們分享姪子和姪女可愛的成長照片。特別感謝我最親愛的老婆美蔓，總是陪伴在我身邊，給予我最直接的幫助；謝謝妳每天在上帝面前為我所作的禱告，因著妳的愛，所有的逆境遭遇，我都能甘之如飴。感謝光復教會的牧者、傳道與所有弟兄姐妹在日常生活中彼此的幫忙與代禱，引領著我一次又一次經歷上帝那奇妙的恩典。最後，深願將這所有的榮耀歸於上帝祢自己的名。許國賢謹誌 2006 年 7 月于風城交大

(5)

Development of a Parallelized PIC-FEM Code Using a

Three-Dimensional Unstructured Mesh and Its Applications

Student: Kuo-Hsien Hsu Advisor: Prof. Jong-Shinn Wu Department of Mechanical Engineering

National Chiao-Tung University

Abstract

A general parallel three-dimensional electrostatic particle-in-cell scheme with finite element method (PIC-FEM) using an unstructured mesh is proposed and verified in this dissertation. A multi-level graph-partitioning technique is used to dynamically decompose the computational domain to improve the parallel performance during runtime. Completed parallelized PIC-FEM code is used to simulate several important physical problems, including field emission, DC/RF gas discharge and DC/RF magnetron plasmas. In this thesis, research is divided into three different phases.

In the first phase, a parallelized three-dimensional electrostatic Poisson’s

equation solver using Galerkin finite element method using an unstructured mesh is developed and validated. In addition, a parallelized three-dimensional vector potential magnetostatic Poisson’s equation solver is developed and validated. Furthermore, these two solvers are coupled, respectively, with a parallel adaptive mesh refinement (PAMR) module, to automatically improve the resolution of solution near where the property gradient is large. In both solvers, resulting algebraic equations are solved using either the parallel conjugate gradient method with a subdomain-by-subdomain scheme for more processors (>10) or the direct sparse matrix solver for fewer processors (<10). Parallel speedup test for solvers using parallel conjugate gradient

(6)

method is performed on a HP-IA64 cluster system up to 32 processors at NCHC of Taiwan. Results show that parallel efficiency can reach 84% and 75% at 32 processors for the electrostatic Poisson’s equation solver and magnetostatic vector Poisson’s equation solver, respectively.

In the second phase, a general parallel three-dimensional PIC-FEM code is

developed and validated. This PIC-FEM code integrates the parallelized Poisson’s equation solver, developed in the first phase, with the PIC and Monte Carlo collision (MCC) schemes on an unstructured tetrahedral mesh. Charged particles are traced either cell-by-cell on an unstructured mesh. This is achieved using leap-frog time-integration method and Boris rotational scheme when magnetic field is involved. Charge assignment and force (field) interpolation between charged particles and grid points is implemented using the same interpolation function originated from the FEM. In addition, dynamic domain decomposition (DDD) with weighting based on number of particles is used to balance the workload among processors during runtime. Study of parallel performance of the parallelized PIC-FEM code is performed on the HP-IA64 clusters. Results using DDD show that parallel efficiency can reach 83% at 32 processors.

In the third phase, the parallelized PIC-FEM code is used to simulate several

important problems to demonstrate its superior capability in handling practical problems. These problems include field emission from a CNT /silicon based emitter under external electric field and magnetic field, two typical three-dimensional DC and RF gas discharge plasmas, and two typical three-dimensional DC and RF magnetron plasmas with permanent magnets.

(7)

Keyword: Particle-In-Cell with Monte-Carlo collision, finite element method, unstructured mesh, parallel, graph-partitioning, field emission, DC and RF gas discharge plasmas, DC and RF magnetron plasmas

(8)

應用非結構性網格之平行化三維 PIC-FEM 程式的研究與發展

學生：許國賢指導教授：吳宗信國立交通大學機械工程學系

摘要

本論文研究目的主要是發展與驗證一平行化三維粒子式 Particle-In-Cell (PIC-FEM)和蒙地卡羅法程式，並使用 multi-level 圖形切割技術於三維非結構性網格的動態區域切割法。此程式可應用於許多重要物理問題上的模擬，包括場發射元件、直流氬氣放電電漿、射頻氬氣放電電漿、直流氬氣磁控式電漿和射頻氬氣磁控式電漿。本研究主要可分為三部分。第一部份：使用Galerkin 有限元素法來分別離散靜電 Poisson 方程式來求解三維靜電場問題以及靜磁向量 Poisson 方程式來求解三維靜磁場問題。當平行電腦叢集數目大於10 時，程式使用的平行化 conjugate gradient method 來求解矩陣問題；反之，當平行電腦叢集數目小於 10 時，則使用直接矩陣法 MUMPS 來求解矩陣問題。再者，將以上所發展的平行化靜電和

靜磁程式結合平行可調適網格再切割模組(parallel adaptive mesh refinement

module, PAMR)；因此，當計算區域出現位勢場變化較大處，計算網格將會自動被切割以獲得更正確的數值解。兩者程式平行效率測試於國家高速電腦提供的 32 台 HP-IA64 平行電腦叢集上進行，結果顯示靜電和靜磁程式的平行化效率於 32 台電腦腦叢集下分別可達到 84%和 75%。

(9)

論文的第二部份：主要是結合第一部份發展的三維平行化靜電和靜磁程式及三維平行PIC-FEM 程式於三維非結構性四面體網格上。PIC-FEM 程式使用蛙跳法與Boris 法來計算粒子運動方程式。因計算網格的不同，粒子追蹤法可利用非結構性網格關係來追蹤運動粒子軌跡。帶電粒子和計算網格點之間的權重函數為有限元素法中的形狀函數。再者，使用動態區域切割法來平均分散平行電腦間的工作量以改善平行化效率。最後模擬一近似一維的直流氬氣放電電漿和射頻氬氣放電電漿來驗展程式的正確性，並模擬三維射頻氬氣放電電漿來測試程式平行效率，測試結果顯示，如使用粒子數目為動態區域切割法的權重，於32 台 HP-IA64 電腦腦叢集下還可達到83%平行效率。論文最後部份，為展現三維平行 PIC-FEM 程式在許多重要物理問題上優秀的模擬能力，其應用問題包括三維奈米碳管式場發射元件與矽式場發射元件、三維直流氬氣放電電漿、三維射頻氬氣放電電漿、三維直流氬氣磁控式電漿和射頻氬氣磁控式電漿。關鍵字：粒子式和蒙地卡羅法、有限元素法、非結構性網格、平行化、圖形切割法、場發射元件、直流氬氣放電電漿、射頻氬氣放電電漿、直流氬氣磁控式電漿和射頻氬氣磁控式電漿

(10)

List of Tables

Table 1. Main excellent features of a field emission display………120 Table 2. Table 2. Time breakdown and speedup of Poisson’s equation solver at the

different number of processors………121 Table 3. Evolution of simulation parameters at different levels of mesh refinement.

(EMAX is the local maximum electric field strength at the surface of CNT field

emitter) ………122 Table 4. Evolution of simulation parameters at different levels of mesh refinement.

(BMAX is the local maximum magnetic field strength at the center of magnet

arrays)………123 Table 5. The important geometrical parameters of CNT triode- and tetrode-type field

emitters……….124 Table 6. Characteristics of device performance for different focus types…………..125

(14)

List of Figures

Figure 1.1 Representation of the parameter space in plasma etching. The key internal plasma properties (middle) are the bridge between externally controlled

variables (top) and the figures of merit (bottom)……….126

Figure 2.1 Element equation from a typical element (e) are used for each element in the mesh………...127

Figure 2.2 A three-Dimensional _{C -linear standard tetrahedral element………….128}0 Figure 2.3 A three-Dimensional _{C -linear standard hexahedral element…………129}0 Figure 2.4 (a) Vertex-based. (b) edge-based (c) element-based partition of 4 × 3 mesh into two sub-regions………..130

Figure 2.5 An L-shape domain subdivided into three sub-domains………...131

Figure 2.6 Sketch of graph and mesh……….132

Figure 2.7. Flowchart of the parallel mesh refinement module………133

Figure 2.8 Flowchart of the coupled PPES-PAMR method………134

Figure 2.9 The flowchart of parallel FEM……….135

Figure 2.10 Contours of the potential distribution of (a) a grounded conducting sphere (Φ= 0 Volts) immersed in a uniform electric field (Er =10Volts/m) and (b) uniform positive charges distribution between two infinite grounded conducting plates (Φ= 0 Volts)………..136

Figure 2.11 Schematic diagram of the simulation domain for a typical CNT triode-type field emitter within a periodic cell. The important geometrical parameters are: R=500 nm, r=10 nm, he=600 nm, h=500 nm, L=49.3 μ m, d=200 nm and W=25 μm………137

Figure 2.12 Surface mesh distribution of a typical single CNT triode-type field emitter within a periodic cell. Only ¼ of a periodic cell is simulated for the study of parallel performance of the Poisson’s equation solver………...138

Figure 2.13 Parallel speedup as a function of the number of processors on the PC-cluster system (maximum 32 processors) for CNT triode-type field emitter with gate voltage 150 volts, anode voltage 400 volts and the grounded cathode………..139

Figure 2.14 Close-up of the unstructured adaptive surface mesh at different levels for a single CNT triode-type field emitter with gate voltage 150 volts, anode voltage 400 volts and the grounded cathode (ε_ref =0.08). (a) Level-0 (7006 nodes). (b) Level-1 (22750 nodes). (c) Level-2 (34927 nodes). (d) Level-5 (61241 nodes)………..………140

(15)

Figure 3.1 Flowchart of the coupled PVPES-PAMR method………141

Figure 3.2 Arrangement of magnetization vectors of each permanent magnet segment for producing uniform flux density in the center of the permanent magnet array consisting of eight segments………142

Figure 3.3 Surface mesh distribution of the permanent magnet array consisting of eight segments……….………143

Figure 3.4 Figure 3.4 (a) Contour of magnetic flux density and (b) magnetic flux lines of permanent magnetic arrays (see Fig. 3.2)………144

Figure 3.5 Unstructured adaptive mesh distribution at different levels the permanent magnet array consisting of eight segments. (ε_ref =0.08). (a) Level-0 (7845 nodes). (b) Level-1 (38364 nodes). (c) Level-2 (54355 nodes). (d) Level-4 (98743 nodes). (e) Level-5 (108415 nodes). (f) Level-6 (108840 nodes)………145

Figure 3.6 Parallel speedup as a function of the number of processors on the PC-cluster system (maximum 32 processors) for of the permanent magnet array consisting of eight segments……….146

Figure 4.1 The flowchart of conventional PIC-MCC method……….……147

Figure 4.2 Schematic of leap-frog integration method………148

Figure 4.3.Collisions in argon plasma………149

Figure 4.4. Nanbu’s method to sample a collisional event……….……150

Figure 4.5 The electron-molecule cross sections in argon………..…151

Figure 4.6 Vector diagram for scattering collision………..…………152

Figure 4.7 The ion-molecule cross sections in argon……….153

Figure 4.8 Flowchart of parallel PIC-FEM……….…154

Figure 4.9 PIC-MCC module of parallel PIC-FEM………155

Figure 4.10 Flowchart of parallel PIC-FEM with dynamic domain decomposition..156

Figure 4.11 Sketch of the decomposed domain and the local data structure……..157

Figure 4.12 Schematic overview of the basic plasma processes in a magnetron plasma………..………...158

Figure 4.13 Schematic diagram of the spatial regions present in D.C. glow discharges, (a) at short cathode–anode distance and/or low pressure; (b) at longer inter electrode distance and/or higher pressure………..…159

Figure 4.14 (a) Potential and (b) electric profiles of quasi-1D DC glow discharge...160

Figure 4.15 Ion and electron number densities of quasi-1D DC glow discharge..…161

Figure 4.16 The net charge density of quasi-1D DC glow discharge……….162

(16)

Figure 4.18 Energy distribution functions of a impinging on the cathode of quasi-1D DC glow discharge……….…164 Figure 4.19 The potential profile of quasi-1D RF glow discharge…….………….165 Figure 4.20 Ion and electron number densities of quasi-1D RF glow discharge…166 Figure 4.21 Ion and electron kinetic energies of quasi-1D DC glow discharge…167 Figure 4.22 Electron energy probability functions of (a) 50mtorr and (b) 20mtorr in

the bulk region of quasi-1D DC glow discharge………168 Figure 4.23 Sketch of the 3D RF gas discharge plasma………….……….169 Figure 4.24 Parallel speedup as a function of number of processors for 3D RF plasma at different numbers of particle on HP IA-64 clusters machine (maximum 32 processors)………...………...170 Figure 4.25 Evolution of domain decomposition using 20 processors, during the

simulation for a RF gas discharge plasma (a) initial (b) intermediate (c) final……….……171 Figure 4.26 Time breakdown of various steps in PIC-FEM on 32 processors with (a)

10 particles per cell (b) 40 particles per cell………..……….172 Figure 5.1 Schematic diagram of the simulation domain for a typical CNT triode-type

field emitter within a periodic cell. The important geometrical parameters are: R=500 nm, r=10 nm, he=600 nm, h=500 nm, L=49.3 μm, d=200 nm and W=25 μm……….……….173 Figure 5.2 Contours of the (a) electric potential and the (b) electric field distribution near the tip of the CNT triode-type field emitter with gate voltage 150 volts, anode voltage 400 volts and the grounded cathode………...………174 Figure 5.3 FN plot of the field emission characteristics of CNT triode-type field

emitter (height is 600 nm) with gate voltage 110-160 volts, anode voltage 400 volts and the grounded cathode.(_S _{≡ slope}₌₋3244.25φ3/2/β _, _φ₌ 4.52 eV)……….175 Figure 5.4 Trajectories of the emitted electrons inside the periodic cell of CNT

triode-type field emitter with the grounded cathode, anode voltage 400 volts and two different gate voltages: (a) 110 volts (b) 160 volts……….……….176 Figure 5.5 Effect of the gate voltage on the emission current for two different CNT triode-type field emitter heights with anode voltage 400 volts and the grounded cathode………...……….177

(17)

Figure 5.6 Schematic diagram of the simulation domain for a typical CNT tetrode-type field emitter within a periodic cell. The important parameters are: R=500 nm, R =1500 nm, r=10 nm, he=600 nm, _f h₁=500 nm, h₂=500 nm, L=48.6 μm, d₁=200 nm, d₂=200 nm and W=25 μm………178 Figure 5.7 Comparisons of the trajectories of the emitted electrons between (a) CNT

triode-type field emitter with the grounded cathode, anode voltage 400 volts and the gate voltage150 volts and tetrode-type field emitter with the additional three different focusing voltages: (b) 5 volts (c) 0 volts (d) –5 volts………..………179 Figure 5.8 Perspective view of the structure of the magnetic focusing carbon nanotube

field emission arrays……….…………180 Figure 5.9 Schematic diagram of the 1/4 simulation domain for a typical CNT-based

triode-type field emitter within a periodic cell. The important geometrical parameters are: R=500nm, r=10nm, he=600nm, h=500nm, d=200nm, L=0.9mm and W=0.3mm……….………181 Figure 5.10 LHS shows surface mesh distribution of a single CNT triode-type field

emitter within a periodic cell. RHS shows surface mesh distribution of the CNT field emitter and equipotential lines near the tip for Vg=120V. Unstructured tetrahedral adaptive mesh is ideal for the simulation structure, which consists of a smaller emitter within a larger periodic cell…….…182 Figure 5.11. Field emission I-V characteristic of a single gated CNT field emitter

without the externally applied downward magnetic field………183 Figure 5.12 Snapshots and trajectories of electrons for (a) Bz = 0 T, (b) Bz = -0.2 T,

(c) Bz = -0.5 T, and (d) Bz=-1.0 T. The gate voltage and the anode voltage are fixed to 120V and 1 kV, respectively……….…184 Figure 5.13 Dependence of electron beam diameter at the anode on the flux density of magnetic focusing field………185 Figure 5.14 (a) SEM image and (b) surface mesh distribution of a single silicon based

field emitter within a periodic cell………..186 Figure 5.15 Contours of the (a) electric potential and the (b) electric field distribution near the tip of the single silicon based field emitter with anode voltage 200 volts……….187 Figure 5.16 Field emission I-V characteristic and F-N plot of single silicon based field

emitter with work function 4.5eV and 4.9eV………..…188 Figure 5.17 (a) The surface mesh plot and (b) domain decomposition profile of 3D

(18)

Figure 5.18 Contours of the (a) electric potential and the (b) electric field distribution of 3D DC glow discharge………190 Figure 5.19 Contours of (a) electron and (b) ion number densities of 3D DC glow

discharge……….191 Figure 5.20 Contours of (a) electron and (b) ion kinetic energies of 3D DC glow

discharge………192 Figure 5.21 (a) The surface mesh plot and (b) domain decomposition profile of 3D RF

gas discharge plasma………193 Figure 5.22 Sketch and boundary condition of the 3D RF discharge plasma enclosed

by a dielectric chamber wall………194 Figure 5.23 Potential contour of 3D RF discharge plasma enclosed by a dielectric

chamber wall………...195 Figure 5.24 Contours of (a) electron and (b) ion number densities of 3D RF discharge

plasma enclosed by a dielectric chamber wall……….196 Figure 5.25 Contours of (a) electron and (b) ion kinetic energies of 3D RF discharge

plasma enclosed by a dielectric chamber wall……….197 Figure 5.26 Sketch and surface mesh distribution of the 3D DC magnetron

plasma……….…….198 Figure 5.27 Contours of magnetic flux density with magnetization (a) 0.25 T (b) 0.5T

(c) 0.75 T (d) 1.0 T of permanent magnet systems ………...……...199 Figure 5.28 Potential contours of 3D RF magnetron plasma with (a) M=0.125T, γ =0.06, (b) M=0.125T, γ=0.1, and (c) M=0.1875T, γ=0.0…………..200 Figure 5.29 Contours of (a) electron and (b) ion number densities of 3D DC

magnetron plasma with M=0.125T andγ=0.06…………..………201 Figure 5.30 Contours of (a) electron and (b) ion number densities of 3D DC

magnetron plasma with M=0.125T andγ=0.1…....………202 Figure 5.31 Contours of (a) electron and (b) ion number densities of 3D DC

magnetron plasma with M=0.1875T andγ=0.06………….…………203 Figure 5.32 Contours of (a) electron and (b) ion kinetic energies of 3D DC magnetron

plasma with M=0.125T andγ=0.06………204 Figure 5.33 Contours of (a) electron and (b) ion kinetic energies of 3D DC magnetron

plasma with M=0.125T andγ=0.1…..………...205 Figure 5.34 Contours of (a) electron and (b) ion kinetic energies of 3D DC magnetron

plasma with M=0.1875T andγ=0.06……….206 Figure 5.35 Potential contours of 3D RF magnetron plasma with (a) M=0.125T, γ=0

(b) M=0.125T, γ=0.06 (c) M=0.25T, γ=0.06……….207 Figure 5.36 Contours of (a) electron and (b) ion number densities of 3D RF magnetron

(19)

Figure 5.37 Contours of (a) electron and (b) ion number densities of 3D RF magnetron plasma with M=0.125T and γ=0.06………209 Figure 5.38 Contours of (a) electron and (b) ion number densities of 3D RF

magnetron plasma with M=0.25T and γ=0.06………..210 Figure 5.39 Contours of (a) electron and (b) ion kinetic energies of 3D RF magnetron

plasma with M=0.125T and γ=0.0………..211 Figure 5.40 Contours of (a) electron and (b) ion kinetic energies of 3D RF magnetron

plasma with M=0.125T and γ=0.06………212 Figure 5.41 Contours of (a) electron and (b) ion kinetic energies of 3D RF magnetron

(20)

Chapter 1 Introduction

1.1 Motivation and Background

Understanding of several important physical and engineering problems, for example, field emission [Itoh et. al., 2004] and low-temperature rarefied (low-pressure) plasma [Lieberman and Lichtenberg, 1994], requires the consideration of space-charged effects self-consistently due to the motion of charged particles. Some common characteristics of these problems may include three-dimensional, geometrically complicated, thermally non-equilibrium and low-pressure. In addition to studying these problems experimentally, modeling through computer simulation may become one of the most efficient ways to understand the underlying physics due to the rapid advancement of the modern computer technology. Modeling technique that assumes thermally equilibrium, such as fluid modeling [Kushner, 2005], fails in correctly capturing the important physical features of the-above mentioned problems.

Until 1960s, the plasma physicists had devised an important simulation technique, Particle-In-Cell (PIC) [Birdsall and Langdon, 1991], which equivalently solves the collisionless Boltzmann equation for charged particles self-consistently. However, a self-consistent PIC simulation is often computationally intensive even at low pressure. In general, the conventional PIC solves the fields with finite difference method using

(21)

a structured mesh. This is not flexible enough or becomes awkward to simulate the device with complicated geometries. In addition, the structured mesh is often difficult to manage efficiently when dynamic domain decomposition is required in a typical parallel particle simulation, such as the PIC method. In contrast, not only does the finite element method (FEM), which often uses an unstructured mesh, offer much greater flexibility in handling the complicated geometry, but also it provides excellent flexibility in dynamic domain decomposition for parallel computing. In addition, mesh refinement that is important in several simulations can be easily coupled to the finite element method using an unstructured mesh. However, there are some disadvantages by using unstructured mesh for PIC method. Firstly, particle tracing on an unstructured mesh may slow down the simulation. Secondly, the programming may become complicated. Nevertheless, there were very few studies directed along this line. Taking all these intertwining factors together, it is still very valuable to develop a PIC simulation code using an unstructured mesh. Therefore, in this thesis our goal is to develop a parallelized three-dimensional PIC-FEM code considering Monte-Carlo collision, which can be run on memory-distributed parallel machines, such as PC clusters.

1.2 Literature Surveys

(22)

low-pressure plasma, past studies of modeling and simulation in these two disciplines are reviewed in detail in the following in turn.

1.2.1 Modeling of Field Emission

Field emission display (FED) is the new type of flat-panel display and its working principle is similar with traditional CRTs. The colored light is generated from the phosphor, which is excited by electrons. The electrons are emitted from cathode. Instead of thermo-ionic emission in CRT, the electrons in FED are emitted by a cold pixel electron source that typically consistsof a large array of low-work-function emitter micro-tips. Moreover, FED needs the lower power input then CRT since there is a power-inefficient deflection system in CRT to steer the emitted electrons. Furthermore, when FED is compared with TFT LCD, the FED also exhibits the some better performances than TFT LCD. For example, FED offer a superior viewing angle and are several microseconds quicker in response time. In addition, FED also has the potential for high brightness and contrast. The advantages of applying FED in display technology include lower driving voltage, higher lighting frequency, and possibly, better display resolution. In general, three types of FED can be classified depend on their structures, which are diode, triode, and tetrode types. Table 1 also shows the main features of these different type of FED [Itoh et. al., 2004].

(23)

tremendous interest in the past few years for their remarkable field emission (FE) properties such as high aspect ratios, whisker-like in shape for optimum geometrical field enhancement, high electrical conductivity, and extraordinary environment stability, e.g., [de Heer et. al., 1995], [Rao et. al, 2000], [Nishimura et. al., 2004], and

[Nilsson et. al, 2000]. Therefore, CNTs have great potential to be used as field emission cathodes for various applications of vacuum microelectronic devices, including field-emission displays (FED), e.g., [Wang et. al.,1997], [Fowler and Nordheim, 1928], and [Spindt,1968], high-frequency microwave amplifier, e.g., [Choi

et. al,1999], and [Pirio et. al,2002], electron microscopy and parallel electron beam lithography (EBL) [Hong et. al., 1994], to name a few.

Most of the FE devices applied the famous Spindt-type structure [Wang et. al., 1997], which has a metallic or silicon etched field emitter with an integrated gate electrode aperture surrounding the emitter tip to control the extraction of emission current. The multiple carbon nanotubes based field emission cathodes within the integrated gate electrode aperture have been reported in many papers over the past six years, e.g., [Fowler and Nordheim,1928], [Spindt, 1968], [Lei et. al.,1998], and [Hu and Huang, 2003]. For some applications, such as electron beam lithography and microscopy, individual gated carbon nanotube field emitter was specifically fabricated to eliminate the screening effects and to optimize the emitted current and electron

(24)

beam diameter [Lan et. al., 2004]. The electrons emitted from a very small area on the top of CNT inherently spread with a large dispersion angle. Thus, an appropriate electron-beam focusing system is necessary for developing a well-focused electron beam source.

From the Fowler-Nordheim law [Fowler and Nordheim, 1928], the magnitude of the electron flux emitted from the surface depends upon the local electric field at the surface and the work function of the solid. In addition to finding materials with lower work functions, enhancing the local electric field near the surface is one of the most critical tasks in improving field emission properties. As a trial-and-error method is often expensive in terms of time and cost, a computer simulation may speed up the design process by revealing the detailed physics with the FED. In practice, the geometry of the field emitter and the gates involved in the FED design is three-dimensional and often very complicated, e.g., [Spindt, 1968], [Choi et. al., 1999], and [Pirio et. al, 2002].

In the past, several numerical studies have been conducted for the prediction of field emission properties, e.g., [Hong et. al.,1994], [Wang et. al,1997], [Lei et. al., 1998], [Hu and Huang, 2003], [Lan et. al.,2000], and [Lan et. al.,2004]. Most of these studies use either the 2-D or 3-D finite difference method, e.g., [Wang et. al.,1997], [Lei et. al.,1998], [Hu and Huang, 2003], [Lan et. al.,2000], and [Lan et. al., 2004], or

(25)

the 2-D finite element approach [Hong et. al., 1994] for discretizing the electrostatic Poisson’s equation. As mentioned earlier, a practical FED design often involves three-dimensional objects with a complicated geometry, rendering the use of the finite-difference method as very difficult or unsuitable. The finite-element or finite-volume method using unstructured grids should represent the best choice for the numerical method in this regard. In addition, parallel processing can be necessary in simulating the practical three-dimensional design of field emitters or when including space-charged effect with high emission currents in the Particle-In-Cell (PIC) method ,e.g., [Hu and Huang, 2003], [Lan et. al.,2000], and [Lan et. al., 2004]. Otherwise, in Ref.,e.g., [Hu and Huang, 2003], [Lan et. al.,2000], and [Lan et.

al.,2004], the computational time for a typical run to emit only a few electrons can take up to one week. Also, the accuracy of the electron-flux prediction from the emitters strongly depends on the accuracy of the local electrical field at the surface, which makes the grid resolution at the surface a critical issue in the simulation. In the following, the very similar concerns are also arising from the low-temperature plasma simulation

1.2.2 Modeling of Low-temperature Plasma

Plasma is ionized gas. Hence, it consists of ions and electrons, as well as neutral species. The ionization degree of plasma varies from 100% (fully ionized plasma) to

(26)

very low values (e.g. ₁₀−4_–₁₀−6_{; weakly ionized plasma). Besides the space plasma,}

the laboratory plasma is divided into two main groups, which are the high-temperature plasma (fusion plasma), and the low-temperature plasma (gas discharges plasma). Moreover, two sub-groups of gas discharge plasma are classified depended on its working gas pressure, which are thermal equilibrium plasma and non-thermal equilibrium plasma [Lieberman and Lichtenberg, 1994]. The efficient energy exchange between the plasma species due to many collisions occur for high-pressure plasma. Thermal equilibrium discharge is typically used for applications where heat is required, such as for cutting, spraying, welding. On the other hand, for low gas pressure plasma, different temperatures of the plasma species due to its inefficient energy transfer. Non-thermal equilibrium plasma is typically used for applications where heat is not desirable. In recent years, this field of non-thermal equilibrium plasma applications has rapidly expanded due to its non-equilibrium aspect of the plasma. The latter sub-group of gas discharge plasmas is also the second subject of this dissertation.

Some important operating parameters for obtaining different non-equilibrium conditions are briefly summarized as follows [Economou, 2000]:

z The chemical input of working gas and its corresponding gas pressure z The imposed external electromagnetic field structure

(27)

z The configurations of plasma chamber and electrodes z The temporal behavior (e.g. pulsing the plasma)

One can realizes that the non-equilibrium plasma conditions are strongly influenced by any one or many of these summarized parameters. For example, the representation of the parameter space in plasma etching is shown in Fig. 1.1. It clearly illustrates that the above-mentioned operating parameters (in the top block) determine the following key plasma properties (listed in the middle block) including the electron velocity distribution function (EVDF), the space and time variation of electron, etc. Finally, these properties may dominate the figures of merit (listed in the bottom block) including the etching (or deposition) rate, uniformity, etc. Therefore, a computer simulation code should use such many parameters as inputs to help optimize the expected non-equilibrium plasma conditions easily and understand the underlying physics.

Dimensionality of plasma reactor simulations ranges from zero-dimensional to three-dimensional. Low dimensional simulations, such as zero-dimensional, e.g., [Font et. al.,1998], [Meeks and Shon, 1995], and [Deshmukh. and Economou, 1992],

and one-dimensional models, e.g., [Midha and Economou, 2000], [Kline et. al. 1989], and [Nedelea and Urbassek, 2004], are best choice in handling very complicated chemistry [Meeks et. al., 1997]. However, two-dimensional simulations can address

(28)

the important aspect of reaction uniformity across the wafer radius, e.g., [Shon and Lee, 2002], [Shon et. al.,1999], and [Shon et. al.,1998]. Three-dimensional simulations are useful for studying azimuthal asymmetries in the reactor due to non-axisymmetric power deposition, or non-axisymmetric gas inlets and pumping ports, e.g., [Kushner et. al.,1996], and [Kushner, 1997].

In general, there are three kinds of plasma simulation approaches; the first is the fluid model, the second is the kinetic model and the third is the hybrid model. In addition, Maxwell’s equations for electromagnetic fields also need to be solved self-consistently coupled with the plasma densities and currents from plasma simulations. For the fluid model, one need to solve the equations, which are derived after taking the moments of the Boltzmann equation with some assumptions regarding, e.g., [Meyyappan, 1994], and [Gogolides and Sawin, 1992]. They are species continuity equation, species momentum equation and species energy equation. Related publications of fluid model could be found in numerous articles, e.g., [Ventzek et. al.,1993], [Lymberopoulos et. al.,1995], [Bukowski and Graves, 1996], and [Ventzek et. al.,1994], and are not repeated here. Unlike fluid model, kinetic approach yields the particle distribution functions as an output of the simulation. Especially, it is more accurate than fluid model at low pressures when the species mean free path is comparable to or longer than a characteristic length scale or for

(29)

highly non-equilibrium situations. However, Kinetic approach is computationally intensive as compared to fluid model. One of the well-established kinetic approaches is the Particle-In-Cell with Monte-Carlo Collisions (PIC-MCC) method, e.g., [Birdsall, 1991], and [Vahedi et. al.,1993]. In the past two decades, PIC-MCC method has long been used to study the nonlinear kinetic problems in space and laboratory plasma physics. For self-consistent treatment of the plasma and the background gas, Nanbu combined the Direct Simulation Monte Carlo method (DSMC) with PIC-MCC, e.g., [Nanbu, 2000], and [Serikov et. al.,1997].

Each time step in the PIC-MCC consists of four major steps: charge extrapolation, force interpolation from the solution of the Maxwell’s equations, particle movement, and Monte-Carlo collisions. Briefly speaking, based on the particle positions, charges are assigned to each mesh point and current densities are assigned to the faces between the mesh points. Maxwell's equations are then solved to compute the electric and magnetic fields on the grid. The force on the particles is obtained from the fields at these gird points by interpolation based on the particle position. Particles are then moved according to Newton's law. Particle collisions are handled stochastically in a Monte Carlo module in-between field adjusting time steps. The details of the PIC-MCC method will be given in chapter 4.

(30)

1991], and [Birdsall, 1991], a structured mesh is usually construed for the computational domain. However, until very recently, there have been few developments of electrostatic PIC method using a three-dimensional unstructured mesh, mostly designed for thruster plume simulations due to their complicated computational geometry. A hybrid PIC-DSMC code using unstructured mesh, called AQUILA, which has been developed by [Santi et al., 2003] on hall thruster plume simulation. They obtained the improved current density results from better the unstructured mesh resolution. AQUILA uses finite element method to discretize Poisson’s equation with electrons from Boltzmann relation, and then uses Newton-Raphson method to solve the non-liner resulting matrix. [Spirkin et al., 2004]

has also developed a three-dimensional Particle-In-Cell code on a unstructured tetrahedral mesh with finite volume method. This PIC code was applied to the simulation of the flow inside the segmented micro-channel of a directional-retarding potential analyzer. Results show the flow characteristics of the ions and electrons and provide estimates of the collected current by the micro-plate.

Parallel PIC-MCC method has been previously studies by various researchers using different schemes, e.g., [Seidel et al.,2002], [Kawamura et al.,2000], [Decyk, 2002], [Walker,1991], [Walker, 1990], [Lee and Azari, 1992], [Akarsu et al.,1996], [Decyk and Norton, 2004], and [Liewer and Decyk, 1989], since it is the most

(31)

computationally demanding compared with other models. Most parallel PIC-MCC schemes, e.g., [Kawamura et al.,2000], [Seidel et al.,2002], and [Lee and Azari, 1992]

employ a Eulerian decomposition scheme in which just paralleling the particle processing without paralleling the field solver since the field solver can be a small percentage of the work load especially when FFT methods are used. In this report, for a fixed number of grid points, the speedup just for this parallel particle processing became more linear with increasing particle number on 2 and 4 CPU symmetric multiprocessor (SMP) machines and on a distributed network of workstations (NOW).

In the past, there have been very few studies concerning on developing dynamic load-balancing technique for particle-based PIC-MCC code, e.g., [Seidel et al.,2002] , [Decyk and Norton, 2004], and [Liewer and Decyk, 1989]. In Seidel’s work [Seidel et

al.,2002], he has proposed a method in which the code will dynamically repartition the computational domain and intends to balance the workload among processors under the framework of structured mesh. Decyk et al. have developed a new algorithm just for PIC method on concurrent processors with distributed memory, which named the general concurrent PIC algorithm (GCPIC). In this algorithm, the physical domain of particle simulation was divided into sub-domains, which are equal to the number of processors. The sub-domain can be re-created to keep the processor loads of particle computations balance (dynamic load balancing) during the transient

(32)

period of the simulation, which was called primary decomposition. Again, each sub-domain may have equal numbers of particles, however, unequal numbers of grid numbers. Thus, GCPIC used secondary decomposition to divide the physical domain into number of processors equal sub-domains with equally number of grid points under MIMD paradigm for computing field solver efficiently. However, these reviews showed that parallel PIC methods are not suitable using the SIMD paradigm.

1.3 Objectives of the Thesis

Specific objectives of the present thesis are briefly summarized as follows: 1. To develop and verify a parallelized three-dimensional Poisson’s equation

solver using FEM for predicting electrostatic distribution;

2. To develop and verify a parallelized three-dimensional vector potential Poisson’s equation solver using FEM for predicting magnetostatic distribution;

3. To develop and verify a parallelized three-dimensional PIC-FEM code using an unstructured mesh by combining the Poisson’s equation solvers mentioned in the above;

4. To apply the completed PIC-FEM code to simulate field emission, DC/RF gas discharge plasma and DC/RF magnetron plasma and compare with

(33)

experimental data wherever available.

1.4 Organization of the Thesis

The chapters of this thesis are organized as follows.

Chapter 2 details a parallel three-dimensional electrostatic field solver formulated via Galerkin finite element method based on the subdomain-by-subdomain non-overlapping domain decomposition method. After finite element assembling procedure, the resulting matrix is stored in compressed sparse row format and is solved using either the parallel conjugate gradient method or a direct matrix solver, MUMPS. A parallel adaptive mesh refinement module (PAMR) is also coupled into the developed electrostatic field solver for obtaining better solution, especially, when there is a large variation of potential in the computational domain. Some benchmark problems are presented for demonstrating the accuracy and applicability of the electrostatic field solver. In the end of this chapter, the parallel performances of the Poisson’s equation solver are studied and their time breakdown is analyzed in detail.

Chapter 3 details a parallel three-dimensional magnetostatic field solver. This solver is developed and paralledized follows the techniques presented in chapter 2. Also, PAMR is coupled into the developed magnetostatic field solver. Some

(34)

benchmark problems are presented for demonstrating the accuracy and applicability of the parallel magnetostatic field solver. In the end of this chapter, the parallel performances of these solvers are studied and their time breakdown is analyzed.

Chapter 4 presents the proposed parallel three-dimensional PIC-FEM method using an unstructured mesh. The PIC-FEM is developed follows the main principles of the conventional PIC-MCC method. In addition, the parallel implementation of PIC-FEM is done via domain decomposition method. Dynamic domain decomposition is developed for alleviating the load between the processors. Two benchmark problems are presented for demonstrating the accuracy and applicability of the parallel PIC-FEM code. In the end of this chapter, the parallel performance of the PIC-FEM code using DDD is studied and its time breakdown is analyzed in detail.

In chapter 5, the proposed parallel three-dimensional PIC-FEM code is used to simulate three different realistic problems. They are: three-dimensional field emission display (FED), three-dimensional DC/RF gas discharge plasma, and three-dimensional DC/RF magnetron plasma. Most of the simulated results are then compared with the previous studied available in the literature. Finally, the conclusions of this work and some possible directions for the future research work are presented in the chapter 6in turn.

(35)

Chapter 2 The Parallel Computing of Finite Element Method for

Three-Dimensional Electrostatic Field Problems

This chapter begins with the introduction to background of computational electromagnetic. In solving electrostatic problems, only the finite element Galerkin weighted residual method (GWRM) is chosen and introduced for using either a tetrahedral or a hexahedral mesh. Globe coordinate and local coordinate shape functions are used for tetrahedral and hexahedral meshes, respectively. Before the parallel computing of FEM, the computational domain is firsts decomposed into a number of non-overlapping sub-domains using multi-level graph-partitioning library, METIS. Then, some preprocessing procedure is used in converting the output data from graph-partitioning tool into the input data for field solvers. The second step is that the Poisson’s equation is formulated via GWRM using subdomain-by-subdomain method (SBS). After applying the assembling procedure of FEM, only the non-zero entries of the system of equation are stored in compressed sparse row format and solve the matrix using either parallel conjugate gradient method or direct matrix solver, MUMPS. The parallel adaptive mesh refinement module is then coupled with the parallel Poisson’s equation solver in order to improve the resolution of solution

(36)

near where the property gradient is large. Some benchmark problems are presented for demonstrating the accuracy and applicability of the Poisson’s equation solver. In the end of this chapter, the parallel performance of the Poisson’s equation solver is studied and its time breakdown is analyzed in detail.

2.1 Background of Computational Electromagnetic

For the whole electromagnetic theory, it could be regarded as the logical deduction from the Maxwell’s equations, which were first formulated by James Clerk Maxwell in 1873 [Cheng, 1995]. They are:

t B E ∂ ∂ − = × ∇ r r (2-1) J t E c B r r r _µ + ∂ ∂ − = × ∇ 1₂ (2-2) ε ρ = ⋅ ∇ Er (2-3) 0 = ⋅ ∇ Br (2-4) where Er is the electric field intensity, Br is the magnetic flux density, Jr is the current density, εis the dielectric permittivity of the medium, µ is the dielectric permeability of the medium, c is the speed of light and ρis the volume charge density. In this thesis, we only consider the electrostatic (ES) and magnetostatic (MS) field problems, hence, the Maxwell’s equations can be simplified to the scalar and vector Poisson’s equations for ES and MS fields, respectively. They are:

(37)

ε ρ φ =− ∇2 (2-5) J Ar =−r ∇2 1 µ (2-6)

where φ is electric potential and Ar is the magnetic vector potential. From these two equations, it is clear that the electrostatic fields are contributed from the static electric charges and the magnetostatic fields are due to motion of electric charges with uniform velocity (direct current) or external magnets. The details in solving the magnetostatic field problems are presented in the next chapter.

In the past years, Maxwell’s equations have long been studied in dealing with electromagnetic problems. Two different approaches for solving Maxwell’s equations are analytic and numerical approached. For the simple structured device, there are many analytic solutions available which could easily derived from the series expansions, separation of variables, Bessel and Legandre polynomials, Laplace transformations, and the like [Umashankar, 1993]. However, there is almost no solution available when one consider a device with a complicated structure that involve a number of conductors, dielectric, permanent magnets, and semiconductors of arbitrary shapes, etc. Thus, an extremely accurate numerical method for solving Poisson’s equation is required to model these complicated structures.

Fortunately, with the developments in numerical techniques in the past two decades, nowadays it is possible to solve large-scale electromagnetic problems

(38)

numerically within reasonable time limits. The numerical methods can be generally divided into three separate groups, which are integral method, differential method, and variational method. For the integral method, it is based on the basis of the superposition principle and one can integrate such effects at any point obtaining the potential at that point. The well-known integral method is the boundary element method [Kythe, 1995], which is particularly suitable for problems with material homogeneity. Finite difference method [Sullivan, 2000] is the most popular among the differential methods. For this method, the computational domain is usually divided into structured mesh and the values of a scalar potential field are sought at all grid points. However, this method usually suffers from many problems when considering a complicated structured case, especially in generating the structured mesh on object with arbitrary geometrical shape, imposing the boundary conditions, interface approximation of muti-material region. For the variational method, it is based on the differential or integral form of the field equations to be solved [Zeidler, 1985]. The well-known variational method is the finite element method, which is widely used in many fields, e.g., [Beltzer, 1990], [Winslow, 1967], [Silvester and Pelosi, 1994], and [Jin, 2002], which also include the electromagnetic problems [Winslow, 1967].

(39)

2.2 Finite Element Method (FEM)

2.2.1 Background

The FEM is the numerical technique for obtaining approximate numerical solution of the partial differential equations (PDEs) arising from any physical system subjected to its boundary conditions. For FEM, the computational domain is first divided into smaller non-overlapping regions called elements (cells), and a trial function is postulated over each of the elements. For example, the trial solution with global coordinates for a three-dimensional tetrahedral mesh is:

(

x y z a

)

a a N

(

x y z

)

a N

(

x y z

)

a N

(

x y z

)

U~ , , ; = ₀ + ₁ ₁ , , + ₂ ₂ , , +_L+ _n _n , , (2-7) Where x, y, z are the independent variables in the problems. The functions N

(

x,y,z

)

are known functions called trial functions. The coefficients, a , are undetermined _i parameters and n is called degree of freedom (DOF). After formulating the PDEs using Galerkin weighted residual method with the trial solution, the element equations are obtained for a typical element. These element equations can then be used for other the elements over the domain as shown in Fig. 2.1. Then, larger sets of algebraic equations, which are called system equations, are formed after assembling these element equations. Moreover, the matrix structure of such huge system equations is sparse in essence; the matrix may be solved efficiently only storing the non-zero entries.

(40)

2.2.2 The Galerkin Weighted Residual Method (GWRM)

In FEM, we employ Galerkin weighted residual method (GWRM) with the three-dimensional C0_{-linear shape-function for a tetrahedral mesh and for a}

hexahedral mesh. In GWRM, a weighted average of residual over the entire domain should be zero and the trial function is the element shape function associated with each ai. The representations of GWRM with three-dimensional C0-linear global

coordinate shape function and local coordinate shape function are given in Eqs. (2-8a) and Eqs. (2-8a), respectively.

(

, , ;

) (

, ,

)

=0

∫∫∫

R x y z a Ni x y z dxdydz (2-8a)

(

, , ;

) (

, ,

)

=0

∫∫∫

Rξ η ρ a Ni ξ η ρ dxdydz (2-8b) where R(x,y,z;a) or R

(

ξ,η,ρ;a

)

is the residual of the governing equation.

2.3 Calculation of Electrostatic (ES) Field

Since the concept of GWRM is introduced, this section begins with derivation of the element equation of Poisson’s equation for typical ES fields (as shown in Eqs. (2-5)). A trial solution is first constructed to approximate the exact electric potential,φ

( )

r ≈U~

( )

r;α , which can be written with shape function using global coordinate for a tetrahedral mesh,

(

)

_∑

(

)

= = 4 1 , , ; , , ~ j j jN x y z a a z y x U (2-9)

(41)

steps for formulating Eqs.(2-5) using GWRM with a three-dimensional tetrahedral mesh in global coordinates are described in detail as follows.

Step 1: Derive the typical element equation of Eqs. (2-5) using GWRM.

4 1 ) ( ) ~ ~ ~ ( 0 2 ) ( 2 ) ( 2 2 ) ( 2 − = − = ∂ ∂ + ∂ ∂ + ∂ ∂

∫∫∫

i dxdydz N dxdydz N z U y U x U e i e i e e e ε ρ (2-10)

Step 2: Integrate by parts.

x U x N x U N x x U N e e i e e i e e i _∂ ∂ ∂ ∂ − ∂ ∂ ∂ ∂ = ∂ ∂ ( ) ( ) ( ) ) ( 2 ) ( 2 ) ( ~ ) ~ ( ~ (2-11a) y U y N y U N y y U N e e i e e i e e i _∂ ∂ ∂ ∂ − ∂ ∂ ∂ ∂ = ∂ ∂ ( ) ( ) ( ) ) ( 2 ) ( 2 ) ( ~ ₍ ~ ₎ ~ (2-11b) z U z N z U N z z U N e e i e e i e e i _∂ ∂ ∂ ∂ − ∂ ∂ ∂ ∂ = ∂ ∂ () ( ) () ) ( 2 ) ( 2 ) ( ~ ₍ ~ ₎ ~ (2-11c) Substituting Eqs.(2-11) into the RHS of Eqs. (2-10), and Eqs. (2-10) becomes

4 1 ) ( )] ~ ( ) ~ ( ) ~ [( )] ~ ( ) ~ ( ) ~ ( [ ) ( 0 ) ( ) ( ) ( ) ( ) ( ) ( ) ( ) ( ) ( ) ( ) ( ) ( − = − = ∂ ∂ ∂ ∂ + ∂ ∂ ∂ ∂ + ∂ ∂ ∂ ∂ − ∂ ∂ ∂ ∂ + ∂ ∂ ∂ ∂ + ∂ ∂ ∂ ∂

∫∫∫

i dxdydz N dxdydz z N z U y N y U x N x U dxdydz N z U z N y U y N x U x e i e i e e i e e i e e i e e i e e i e ε ρ (2-12)

Eqs. (2-12) can be reformulated using divergence theorem, it may be written,

∫∫

∫∫∫

⋅ ∂ ∂ + ∂ ∂ + ∂ ∂ + = ∂ ∂ ∂ ∂ + ∂ ∂ ∂ ∂ + ∂ ∂ ∂ ∂ A e e e e i e i e i e e i e e i e dA n k z U j y U i x U N dxdydz N dxdydz z N z U y N y U x N x U r v v v ~ ~ ₎ ~ ( ) ( )] ~ ( ) ~ ( ) ~ [( ) ( ) ( ) ( ) ( ) ( 0 ) ( ) ( ) ( ) ( ) ( ) ( ε ρ 4 1 ) ( ) ( ) ( ) ( ) ( 0 ) ( ) ( ) ( 0 − = − = ⋅ − =

∫∫

∫∫∫

∫∫

∫∫∫

i dA N dxdydz N dA n N dxdydz N A e n e i e i A e e i e i τ ε ρ τ ε ρ r r r (2-13)

(42)

where τrn(e) is the outward-normal component of the flux. All load terms are placed on the

RHS; this includes the interior load and the boundary fluxes. Step 3: Substitute the trial solution into element equations.

Inserting Eqs. (2-9) into Eqs. (2-13) yields,

4 1 , 4 1 ) ( ] )] ( ) ( ) [( ) ( ) ( ) ( 0 ) ( ) ( ) ( ) ( ) ( ) ( − = − = − = ∂ ∂ ∂ ∂ + ∂ ∂ ∂ ∂ + ∂ ∂ ∂ ∂

∫∫

∫∫∫

j i dA N dxdydz N a dxdydz z N z N y N y N x N x N A e n e i e i j e j e i e j e i e j e i τ ε ρ _r (2-14)

Writing it also in matrix form,

⎥ ⎥ ⎥ ⎥ ⎥ ⎦ ⎤ ⎢ ⎢ ⎢ ⎢ ⎢ ⎣ ⎡ = ⎥ ⎥ ⎥ ⎥ ⎦ ⎤ ⎢ ⎢ ⎢ ⎢ ⎣ ⎡ ⎥ ⎥ ⎥ ⎥ ⎥ ⎦ ⎤ ⎢ ⎢ ⎢ ⎢ ⎢ ⎣ ⎡ ) ( ) ( 2 ) ( 1 2 1 ) ( ) ( 2 ) ( 1 ) ( 2 ) ( 22 ) ( 21 ) ( 1 ) ( 12 ) ( 11 e n e e n e nn e n e n e n e e e n e e F F F a a a K K K K K K K K K M M L M L M M L L (2-15)

Where the coefficient matrix and load matrix are defined as:

∫∫

∫∫∫

− = ∂ ∂ ∂ ∂ + ∂ ∂ ∂ ∂ + ∂ ∂ ∂ ∂ = A e n e i e i e i e j e i e j e i e j e i e ij dA N dxdydz N F dxdydz z N z N y N y N x N x N K ) ( ) ( ) ( 0 ) ( ) ( ) ( ) ( ) ( ) ( ) ( ) ( ) ( )] ( ) ( ) [( τ ε ρ _r

Step 4: Substitute the 3-D C0-linear shape function for a tetrahedral mesh into element equation.

The LHS of Eqs. (2-14) may be reformulated,

∫∫∫

+ + = dxdydz K e j e i e j e i e j e i e j i ( ) ) ( ) ( ) ( ) ( ) ( ) ( ) ( , b b c c d d (2-16) Where 4 1 ) ( 6 1 ) , , ( ( ) ( ) ( ) ( ) ) ( ₌ ₊ _x₊ _y₊ _z _i₌ ₋ V z y x N e i e i e i e i e i a b c d (2-17a)

(43)

m m m l l l k k k e i z y x z y x z y x = ) ( a (2-17b) m m l l k k e i z y z y z y 1 1 1 ) ( ₌₋ b (2-17c) m m l l k k e i z x z x z x 1 1 1 ) ( ₌₋ c (2-17d) 1 1 1 ) ( m m l l k k e i y x y x y x − = d (2-17e)

Where V is the volume of cell and the subscripts i, k, l, m have the values 1, 2, 3, 4 (see Fig. 2.2) for N₁(e)(x,y,z) and are permuted cyclically

for ()( , , ) 2 x y z N e _, ( )₍ _, _, ₎ 3 x y z N e _and ()₍ _, _, ₎ 4 x y z N e _.

Step 5: Evaluate the interior load term and boundary flux term of Eqs .(2-14). For coupling with Particle-In-Cell method in later chapter, here we interpolate charges from the particles to the nodes [Santi et. al., 2003], that is: 4 1 ) , , ( ) ( ) ( ₌

∑

_q _N _x _y _z _i₌ ₋ k k k k e i k e i ρ (2-18)

Where the subscript k represents charged particle properties. SubstitutingEqs. (2-18)

(44)

∑

∑ ∫∫∫

∫∫∫

= = = = 4 1 ) ( 0 4 1 ) ( ) ( 0 ) ( 0 4 1 ) ( i e i i e i e i e i V dxdydz N dxdydz N ρ ε ρ ε ε ρ (2-19)

Where V is the volume of cell. Now consider the boundary flux integral, we assume the flux is constant and move (e)

n

τr outside the integrals:

3 ) ( ) ( ) ( ) ( ) ( ) ( e e n A e i e n A e n e i dA N dA N ∆ = =

_∫∫

∫∫

τ τ τ r r r (2-20)

Where _{∆ is the face area of the element.}(e)

Step 6: Assemble the element equations into a system equation Using an assembly procedure,

) ( ) ( ( ) () , e i T j e j i T _K _a _W _F W = (2-21)

The system equation is formed, ( ) ( ) , s i j s j i a F K = (2-22)

After these theoretical developments, we may apply the boundary conditions to Eqs.(2-22).

For a three-dimensional hexahedral mesh with local coordinate shape function, the same steps 1-3 and 6 are repeated for Eqs.(2-5), andthe step 4-5 are be modified as follows.

(45)

Step 4: Substitute the 3-D C0-linear shape function for a hexahedral mesh into element equation.

The Eqs. (2-15) is transformed into a form appropriate for numerical evaluation. ρ η ξ ρ η ξ ρ η ξ d d d J z N z N z N d d d J y N y N y N d d d J x N x N x N K e e k e j e i e e k e j e i e e k e j e i e ij ) ( ) ( ) ( 1 1 1 1 1 1 ) ( ) ( ) ( ) ( 1 1 1 1 1 1 ) ( ) ( ) ( ) ( 1 1 1 1 1 1 ) ( ) ( ∂ ∂ ∂ ∂ ∂ ∂ + ∂ ∂ ∂ ∂ ∂ ∂ + ∂ ∂ ∂ ∂ ∂ ∂ =

∫ ∫ ∫

− − − − − − − − − (2-23)

∫∫

∫ ∫ ∫

− = − − − A e n e i e e i e i N J d d d N dA F 1 ( ) ( ) ( ) 1 ) ( 1 1 1 1 0 ) ( ₍ ₎ _ξ _η _ρ _τ ε ρ _r (2-24)

Where )N_i(ξ,η,ρ is the shape function with local coordinate, ) 1 )( 1 )( 1 ( ) , , ( ₈1 i i i i N ξ η ρ = +ξξ +ηη +ρρ (2-25)

And ξ_i, η_i and ρ_i are the local coordinate of grid node for a standard hexahedral element (as shown in Fig.2-3). And, J(e) is the Jacobian [Shao,

2006]: ⎥ ⎥ ⎥ ⎥ ⎥ ⎥ ⎥ ⎥ ⎥ ⎥ ⎥ ⎦ ⎤ ⎢ ⎢ ⎢ ⎢ ⎢ ⎢ ⎢ ⎢ ⎢ ⎢ ⎢ ⎣ ⎡ ⎥ ⎥ ⎥ ⎥ ⎥ ⎥ ⎥ ⎦ ⎤ ⎢ ⎢ ⎢ ⎢ ⎢ ⎢ ⎢ ⎣ ⎡ ∂ ∂ ∂ ∂ ∂ ∂ ∂ ∂ ∂ ∂ ∂ ∂ ∂ ∂ ∂ ∂ ∂ ∂ ∂ ∂ ∂ ∂ ∂ ∂ ∂ ∂ ∂ ∂ ∂ ∂ ∂ ∂ ∂ ∂ ∂ ∂ ∂ ∂ ∂ ∂ ∂ ∂ ∂ ∂ ∂ ∂ ∂ ∂ = 8 8 8 7 7 7 6 6 6 5 5 5 4 4 4 3 3 3 2 2 2 1 1 1 8 7 6 5 4 3 2 1 8 7 6 5 4 3 2 1 8 7 6 5 4 3 2 1 ] [ Z Y X Z Y X Z Y X Z Y X Z Y X Z Y X Z Y X Z Y X N N N N N N N N N N N N N N N N N N N N N N N N J ρ ρ ρ ρ ρ ρ ρ ρ η η η η η η η η ξ ξ ξ ξ ξ ξ ξ ξ (2-26)

應用非結構性網格之平行化三維PIC-FEM程式的研究與發展

國

立

交

通

大

學

機

械

工

程

學

系

博士論文

應用非結構性網格之平行化三維 PIC-FEM 程式

的研究與發展

Development of a Parallelized PIC-FEM Code Using a

Three-Dimensional Unstructured Mesh and Its Applications

研 究 生：許國賢

指導教授：吳宗信博士

應用非結構性網格之平行化三維 PIC-FEM 程式的研究與發展

Development of a Parallelized PIC-FEM Code Using a

Three-Dimensional Unstructured Mesh and Its Applications

研 究 生：許國賢 Student：Kuo-Hsien Hsu

指導教授：吳宗信 博士 Advisor：Dr. Jong-Shinn Wu

國 立 交 通 大 學

機 械 工 程 學 系

博 士 論 文

中華民國九十五年七月

誌謝

Development of a Parallelized PIC-FEM Code Using a

Three-Dimensional Unstructured Mesh and Its Applications

Abstract

應用非結構性網格之平行化三維 PIC-FEM 程式的研究與發展

摘要

Table of Contents

List of Tables

List of Figures

Chapter 1

Introduction

1.1 Motivation and Background

1.2 Literature Surveys

1.3 Objectives of the Thesis

1.4 Organization of the Thesis

Chapter 2

The Parallel Computing of Finite Element Method for

Three-Dimensional Electrostatic Field Problems

2.1 Background of Computational Electromagnetic

2.2 Finite Element Method (FEM)

(

)

(

)

(

)

(

)

(

)

(

) (

)

∫∫∫

(

) (

)

∫∫∫

(

)

2.3 Calculation of Electrostatic (ES) Field

( )

( )

(

)

∑

(

)

∫∫∫

∫∫∫

∫∫∫

研究生：許國賢

研究生：許國賢 Student：Kuo-Hsien Hsu

指導教授：吳宗信博士 Advisor：Dr. Jong-Shinn Wu

國立交通大學

機械工程學系

博士論文

_∑

_∫∫