Parallel Performance of the Parallel PIC-FEM Method Using DDD

Chapter 3. The Parallel Computing of Finite Element Method for Three-

4.5 Parallel Performance of the Parallel PIC-FEM Method Using DDD

Dynamic Domain Decomposition

In order to study the parallel performance of the current parallel PIC-FEM code using DDD, the three-dimensional RF gas discharge plasma with different numbers of particle is used as the test problem. This parallel performance is studied using 32 processors on the HP IA-64 clusters at National Center for High-performance Computing (NCHC), which is a memory-distributed machine. In the following, the simulation conditions, dynamic domain decomposition, parallel performance, and time breakdown analysis of the parallel PIC-FEM code with DDD is presented in turn.

Simulation Conditions

Fig. 4.23 illustrates the sketch of the 3D RF discharge plasma. Considering the

discharge sustained between two parallel circular electrodes in a grounded cylindrical

chamber by 20mm under an operation argon pressure of 50mtorr. Eqs. (4.21) is used for the cathode potential with V_rf =500Vand f =13.56MHz . The computational domain is divided using an unstructured mesh (~165,000 cells). Initially, the spatial distributions of the ion and electron number densities are assumed uniform to each

other. The electron timestep is 3.695×10⁻¹¹s and the number of sub-cycling is 10.

The particle velocities are sampled from the Maxwellian distribution at a

= =

where k is the Boltzmann constant. The secondary electron emission coefficient is 0.

The ions and electrons incident on the solid surfaces are always neutralized. In addition, two different numbers of particles are considered for simulation, which are 10 particles per cell and 40 particles per cell.

Parallel Performance

Results of parallel speedup and efficiency of the 3D RF gas discharge plasma computation, with different numbers of particle, as a function of the number of processors are presented in Fig. 4.24. In Fig. 4.24, there are five curves with circle, square, diamond and triangle symbols with respect to static domain decomposition (SDD), dynamically domain decomposition (DDD) at intervals of 1500∆t with different numbers of particles. And the linear dash line with square symbol represents the ideal case. As expected, the parallel performance of those using dynamic domain decomposition is much better than those using static dynamic domain decomposition.

Several trends for different numbers of particle are described in detail as follows.

10 Particles per Cell

Linear speedup occurs clearly for number of processors less than or equal to 10 is shown in Fig. 4.24. However, the efficiency decreases with increasing number of processors (up to 32) as expected, due to heavy FEM solver communication among processors and particle load unbalance, if only the static domain decomposition is

applied. As the number of processors increases over 10, FEM begin to play a more important role than the particle load unbalance. Thus, the parallel efficiency decreases monotonously with increasing number of processors (up to 32) even if dynamic domain decomposition is used. However, for the this problem the parallel efficiency using dynamic domain decomposition improves appreciably in the range of 5-10%, as compared with static domain decomposition.

40 Particles per Cell

Similarly, linear speedup exists for this problem up to 10 processors, if static domain decomposition is activated (see Fig. 4.24). However, the quasi-linear speedup (up to 28 processors) is seen if the dynamic domain decomposition is deactivated, which demonstrates the effectiveness of implementing dynamic domain decomposition in particle load unbalance. However, as the number of processors is over 28, this quasi-linear speedup decreases due to increasing FEM solver communication among processors. For this problem, parallel performance using dynamic domain decomposition is generally improve 10-30% parallel efficiency than that using static domain decomposition as the number of processors is less than or equal to 32. Note that approximately 85% of parallel efficiency can be reached at processor numbers of 64 for this problem.

Typical evolutions of dynamic domain decomposition using graph-partitioning

technique are shown in Fig. 4.25. METIS used to form the initial partition by assigning the unitary weight on each vertex, and ParMetis is used to repartition at constant time interval. It is clear that region covered by each sub-domain (processor) changes as the simulation proceeds due to repartitioning among processors when the initial size of each domain is approximately the same. There exists a smallest sub-domain in the middle of the camber due to the presence of highest density in this region (see Fig. 4.25). In addition, the size of the sub-domains near the electrodes is generally larger as compared with others due to the rarefied conditions caused by the sheath potential. It clearly demonstrates that the current implementation of dynamic domain decomposition is very effective in dealing with such plasma system.

Time Breakdown Analysis

Fig. 4.26 illustrates the typical fraction of time spending in PIC-FEM

computation and dynamic domain decomposition of 32 processors. It can be seen that, for the both problem the cost of repartition is very small and can be neglected by comparing with “useful” PIC-FEM. For problem of 10 particles per cell, the average fraction of time spending in FEM solver is larger than the fraction time for the particle movement. This explains the rapid decrease of parallel efficiency at this condition even if using dynamic domain decomposition for load balance (See Fig. 4.24).

Fraction time for parallel communication (is almost proportional to FEM solver)

among processors is large in PIC-FEM, which causes the parallel efficiency increase dramatically with the number of processors.

On the contrast, for the problem of 40 particles per cell, the fraction of time for particle movement is larger then the fraction of time for FEM solver. For this problem, parallel efficiency using dynamic domain decomposition could be generally improved 10-30% than that using static domain decomposition as the number of processors is less than or equal to 32. This shows the current parallel PIC-FEM method may be scalable at least for the large numbers of particles.

4.6 Some Remarks

In this chapter, the proposed parallel 3D PIC-FEM code using an unstructured mesh with DDD mainly follows the major steps of conventional PIC-MCC code in order to keep first principal and self-consistent approach. This code has successfully been verified in simulating quasi-1D DC and RF gas discharge plasma since the results agree with the previous studies very well. Parallel speedup of PIC-FEM shows that better speedup is performed with larger number of particles in both SDD and DDD cases. And, when the number of particles is fixed, DDD performed better speedup than SDD. The reasons were clearly explained from the detailed time breakdown analysis.

Chapter 5 Applications to Realistic Problems

In this chapter, the proposed parallel 3D PIC-FEM code is used to simulate three different realistic problems. They are: 3D field emission display (FED), 3D DC/RF gas discharge plasma, and 3D DC/RF magnetron plasma. The backgrounds of these studying cases can be found in the chapter 1 of this thesis, here, we only interest in the simulation work using PIC-FEM code. In simulating FED, Monte-Carlo collision module does not take into account since the gas pressure is very low. When the space-charge effect is ignored, only parallel Poisson’s equation solver is used to solve the electrostatic field once, the particles are then moved and collected when on the anode surface. Two studying cases of FED simulation without considering space-charge effect are presented. However, whenever there is a high charge density distribution in FED cell, space-charge effect has to be considered properly, and a studying case is simulated and compared with the experimental data. In simulating 3D DC and RF gas discharge, the simulation conditions is similar with those in previous 1D simulation work, the main different is the computational domain is pure 3D, which is close to the geometry of practical plasma chamber. From the results, one can clearly see the important 3D geometry effect on spatial distribution of plasma

macroparameter. In simulating 3D DC and RF magnetron plasma, two concentric cylindrical magnets behind the cathode, which is used to confine electrons for providing higher ionization rate. Therefore, before plasma simulation, this magnetic field induced by the permanent magnets has to be solved in advance using vector Poisson’s equation solver. Once the magnetic field is obtained, and then the following simulation work is similar with those in 3D DC/RF gas discharge. Different cases with different magnetization (M) and different secondary electron emission coefficient (γ) on the spatial distribution of plasma macroparameter are studied.

5.1 Simulation on Field Emission Display (FED)

In this section, FED is simulated using the PIC method without considering space charge-effect and with considering space-charge effect in turn. Thus a completed parallel Poisson’s equation solver with parallel adaptive mesh refinement is used to compute the electric field distribution of a CNT-based field emitter without considering space-charge effect as the first simulation case.

The generally accepted Fowler-Nordheim theory [Fowler and Nordheim, 1928]

for a clean metal surface relates the field emission’s current density, J, to the electric field at the tip surface of the emitter, E, in volts/nm and the work function of the emitter, φ, in electron volts (eV) by the equation,

( ) ( )

_⎟⎟ and y is the image charge lowering the contribution to the work function. The functions t

( )

y and v

( )

y are approximated by t²

( )

y =1.1, v

( )

y =0.95−y².

5.1.1. FED Simulation Without Space-Charge Effect

The Electron trajectory from the emitter surface to the anode surface is traced on the unstructured mesh based on the computed electric field distribution from the Poisson’s equation solver, by using the cell-by-cell particle tracking technique. The current density is then computed as the time average of the accumulated charges due to electron flow reaching the anode surface. In the following, two different cases are studied. The first studying case is to predict the FED emission current and investigate the spatial distribution of electron trajectory under different applied voltages. The second studying case is to add the external uniform magnetic field into FED to demonstrate the focus-ability and FED emission current are also strongly influenced by the magnetic field.

CaseⅠ

Fig. 5.1 depicts the simulation domain for a typical CNT triode-type field emitter

within a periodic cell. Only ¼ of the full emitter is used due to the intrinsic symmetry with Neumann boundary conditions applied at all symmetric planes. Important geometrical conditions (also summarized in part in table 5) include a tip radius of 10 nm, an emitter height of 600 and 400 nm, a distance of 0.5 µm between the gate and the cathode, a gate radius of 0.5µm above the emitter, a distance of 50 µm between the anode and the cathode, a thickness of the gate of 0.2 µm, and the half width of each cell measuring 25 µm. The applied voltage of the gate ranges from 110 to 190 volts, while the cathode and anode are grounded and applied with 400 volts, respectively. The refined final number of nodes used for the simulation is approximately 90,000. The typical results of the predicted potential distribution along with electric field distribution (gate voltage=150 volts, height= 600 nm) are shown in Fig. 5.2a and Fig. 5.2b, respectively. The maximal value of the electric field can reach

up to ~11.47 V/nm at the emitter tip when the gate voltage is 150 volts.

The predicted current and voltage data with an emitter height of 600 nm are presented in Fowler-Nordheim format in Fig. 5.3, with an anode voltage of 400 volts.

It is clear that the computed I-V data follow the Fowler-Nordheim law very well as the gate voltage varies from 110 to 160 volts. The fitted field enhancement factor

( V E d

β = ) is 26.1, where V is the applied cathode voltage, and d is the vacuum gap in

the field emission diode configuration. The corresponding electron trajectories are illustrated in Fig. 5.4 at two different gate voltages (110 and 160 volts) with a height of 600 nm. The results show that the spreading angle of electrons from the tip increases with the increasing gate voltage. This is attributed to the fact that the area of the tip surface which has a larger local electric field increases as the applied voltage increases, which results in the greater emission of electrons from the side of the emitter near the tip. As will be shown later, adding a focusing gate can help to effectively reduce the spreading angle.

The effects of CNT height and gate voltage to the emission current under an applied voltage of 400 volts are presented in Fig. 5.5, with the CNT measuring 400 and 600 nm, respectively. The results show that the turn-on voltage increases with the decreasing height of the CNT emitter. Also, the emission current increases dramatically with the given CNT height. This is reasonable since the larger the height of the CNT, the larger the local electric field which results at the tip surface (shorter anode-cathode distance with the same voltage difference), which in turn induces greater emission of electrons.

Fig. 5.6 shows schematically the same field emitter as shown in Fig. 5.1 with an additional focusing gate in-between the gate electrode and anode. Most geometrical

conditions (also summarized in part in table 6) are the same as those in Fig. 5.1, except for the distance between the focusing electrode and the gate electrode measuring 0.5 µm, the thickness of the focusing electrode measuring 0.2 µm, and the radius of the hole in the center of the focusing electrode which is 1.5 µm. Similar to that in the previous case without the focusing gate, only ¼ of a periodic cell is used for the simulation. Fig. 5.7b to Fig.5.7d present a comparison of the focusing effects of electron trajectories using different focusing electrode voltages (5, 0, –5 volts).

Likewise, data involving the absence of focusing electrode are presented for the purpose of comparison (Fig. 5.7a). The results show that the addition of a focusing electrode above the gate electrode can effectively reduce the spreading angle of the electron trajectories, which can possibly increase the resolution and the intensity at the anode. Among the cases simulated, focusing the electrode with 5 volts represents the best choice in focusing the electron flows at the anode.

Case Ⅱ

Another simulation case for parallel Poisson’s equation solver is the magnetic focusing structure consists of a solenoid (or a permanent magnet) outside of the FE device, as shown in Fig. 5.8, which is used to induce the tunable magnetic flux density (Bz), which is assumed uniformly in space. A 1/4 simulation domain of a single gated cathode structure is shown in Fig. 5.9, while a typical final adaptive

refined mesh (91930 nodes) is shown in Fig. 5.10. Important geometrical conditions include a tip radius of 10 nm, emitter height of 600 nm, distance of 0.5 µm between the gate and the cathode, gate radius of 0.5µm above the emitter, distance of 900 µm between the anode and the cathode, thickness of the gate of 0.2 µm, and the half width of each cell measuring 300 µm. The applied voltage of the gate ranges from 50 to 120 volts, while the cathode and anode are grounded and applied with 1,000 volts, respectively.

Without the externally applied magnetic focusing field, the simulated anode current versus gate voltage (I-V) curve is shown in Fig. 5.11, which displays a turn-on voltage of approximate 95V. Note the turn-on voltage is defined as the gate voltage at which the current to anode is 1 µA. The anode current plotted in Fowler-Nordheim coordinate (FN plot) is also shown as an inset to Fig. 5.11 .The linearity of FN plot clearly shows that the computed I-V data follow the Fowler-Nordheim model very well. The corresponding electron snapshots and trajectories with the gate voltage of 120 V are illustrated in Fig. 12 (a), which will be explained shortly.

Furthermore, we simulate the electron trajectories considering the presence of the externally applied downward magnetic field in the range of 0-1 Tesla to study influence of magnetic field to the electron focusing. In Fig. 12 (a)~(d) several 3-D

electron snapshots and trajectories are presented at the gate voltage of 120V, the anode voltage of 1kV, and the different magnetic flux density of 0T, -0.2 T, -0.5T, -1T, respectively. Based on the simulated electron trajectories, the maximum diameter of beam spot on the anode plane can be estimated. The dependence of electron beam diameter on the magnetic flux density is shown in Fig. 5.13, which demonstrates an Airy-function like structure. It is clear that the electron beam diameter rapidly decreases from 500µm down to less than 100µm as the magnetic flux density increases from zero to ~0.3T. At Bz = -0.35 T the beam spot size is estimated as 52µm, which is a minimum in the present simulation conditions. The over focusing of electron beam, as shown in Fig. 5.12(c), is observed in some high magnetic flux density region and the oscillation amplitude in electron beam diameter diminishes as the magnetic field becomes very large. At very large value of magnetic field the electron beam size eventually converges to ~70µm. The total emission current and anode current with magnetic focusing field shown in table 6 are the same as the results without magnetic field. From the simulation, we can find that this magnetic focusing design can optimally suppress the electron beam dispersion under a well-controlled magnetic field and the emission current to anode will not decrease by using this magnetic focusing method.

The above computational examples only serve to demonstrate the capability of

the current parallel Poisson’s equation solver using FEM with parallel adaptive mesh refinement in predicting field emission properties with complicated geometries.

5.1.3. FED Simulation With Space-Charge Effect

In this subsection, PIC method is used for considering the space-charge effect in simulating the silicon field emission diode. Fig. 5.14 shows the SEM image and surface mesh distribution for a typical silicon based field emitter within a periodic cell.

Only 1/4 of the full emitter is used due to the intrinsic symmetry with Neumann boundary conditions applied on all symmetric planes. A conical etched single emitter has been used for our modeling. Important geometrical conditions include an emitter height of 400 nm, a distance of 20 nm between the anode and the cathode, and the half width of each cell measuring 25 µm. The applied voltage of the anode probe ranges from 140 to 320 volts, while the cathode are grounded The refined final number of nodes used for the simulation is approximately 96,326. Fig. 5.15 shows the simulated potential and electric field profile with anode voltage 200 volts.

The simulated and experimental anode current versus gate voltage (I-V) curve is shown in Fig. 5.16. Fig.5.16 shows simulations with work function is 4.5eV agree very well with measurement after turn-on. And before turn-on, simulations using work function is 4.9eV agree very well with measurements probably due to the contamination on the tip surface. It also displays a turn-on voltage of approximate 175

volts. Note the turn-on voltage is defined as the gate voltage at which the current to anode is 1 µA. The anode current plotted in Fowler-Nordheim coordinate (FN plot) is also shown as an inset to Fig. 5.16. The linearity of FN plot clearly shows that the computed and experimental I-V data follow the Fowler-Nordheim model very well.

The above computational examples serve to demonstrate the capability of the current PIC-FEM code with parallel adaptive mesh refinement in predicting field emission properties with complicated geometries.

5.2 Simulation on Gas Discharge Plasma

In this section, PIC-FEM code is used to investigate the structure of typical 3D DC and RF gas discharge plasma. The argon discharge under a low pressure has been simulated as taking place between two cylindrical electrodes in a dielectric cylindrical chamber. In the following, the related simulation conditions and results are given in turn.

5.2.1 Three-dimensional DC gas discharge plasma

Consider the discharge sustained between two parallel cylindrical electrodes by

在文檔中應用非結構性網格之平行化三維PIC-FEM程式的研究與發展 (頁 97-0)