3.2 Reactive ion etch
3.2.1 Plasma in RIE …
Plasma can be generally described as an ionized gas composed of ions, free electrons, and a variety of neutral species. It contains approximately equal concentrations of positively charged particles (positive ions) and negatively charged particles (electrons and negative ions). While this description encompasses a wide range of plasma types and conditions, a specific class called a "low pressure glow discharge" is typically used in semiconductor processing. This type of plasma is weakly ionized (most of the gas molecules are neutral), operates at low pressure (1 mTorr to 1 Torr), and is nonequilibrium: the electrons carry most of the energy while the ions remain near room temperature. The advantage of this kind of plasma is that surface chemical reactions can take place under nonequilibrium conditions and at low temperature. The plasma effectively controls the main etching characteristics, such as the etching rate, anisotropy, selectivity and uniformity.
It is well known that the plasma process is affected by chemical and physical features. Several features of the plasma are described in this section, starting with the thermal velocities of the free electrons and ions inside the plasma. These thermal velocities are described by the following equations:
$v_e = \left(\frac{e T_e}{m}\right)^{1/2}$, $v_i = \left(\frac{e T_i}{M}\right)^{1/2}$
where e is the electron charge (also taken as the charge of the ionized particle), Te and Ti are the electron and ion temperatures, and m and M are the masses of the electron and ion, respectively.
For typical semiconductor processing plasmas, Te is between 1 and 10 V (here, temperature and voltage can be considered equivalent through the relation $qV = k_B T$; therefore 300 K ≡ 0.026 V), while Ti is near room temperature. The ion mass is also much larger than the electron mass. For these two reasons ve >> vi, and during the time immediately after the plasma is ignited, the electrons are lost to the chamber walls much more rapidly than the ions. This leaves the bulk plasma with a net positive charge, which sets up an electric field from the plasma to the walls. The high electric field region near the walls is called the sheath.
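The temperature-voltage equivalence and the velocity gap between electrons and ions can be checked numerically. The following is a minimal sketch, assuming the thermal velocity form sqrt(eT/m) with T in volts; the electron temperature of 3 V and the argon-like ion mass of 40 amu are illustrative assumptions and are not values taken from the equipment studied here.

```python
import math

# Physical constants (SI units).
K_B = 1.380649e-23        # Boltzmann constant, J/K
Q_E = 1.602176634e-19     # elementary charge, C
M_E = 9.1093837015e-31    # electron mass, kg
AMU = 1.66053906660e-27   # atomic mass unit, kg


def kelvin_to_volts(t_kelvin):
    """Express a temperature in volts through q*V = k_B*T."""
    return K_B * t_kelvin / Q_E


def thermal_velocity(t_volts, mass_kg):
    """Thermal velocity sqrt(e*T/m), with the temperature T given in volts."""
    return math.sqrt(Q_E * t_volts / mass_kg)


room_temp_v = kelvin_to_volts(300.0)

# Assumed example values: Te = 3 V, ions of 40 amu at room temperature.
v_e = thermal_velocity(3.0, M_E)
v_i = thermal_velocity(room_temp_v, 40.0 * AMU)

print(f"300 K is equivalent to {room_temp_v:.4f} V")
print(f"v_e = {v_e:.3e} m/s, v_i = {v_i:.3e} m/s, ratio = {v_e / v_i:.0f}")
```

Under these assumptions the electron velocity exceeds the ion velocity by several orders of magnitude, which is consistent with electrons reaching the chamber walls first.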
The sheath thickness is one of the plasma features. This thickness is on the order of 1 mm, and the sheath is the only separated charge region. All other areas can be considered quasi-neutral over a length scale larger than ~15 µm. The potential across the sheath relative to a floating surface (the floating potential) is typically 10-30 V and accelerates ions that enter the sheath towards the floating surface. Since the ion temperature in the bulk plasma is low (≈ 0.026 V), the energy of the ions bombarding the chamber walls is on the order of several Te. If the wafers to be processed are placed on one of the walls, these ions will strike the wafers with high velocity and at nearly normal incidence. This ionic bombardment is the basis for anisotropic etching.
3.2.2 Cooling System in RIE
In the reactive ion etch process, a large amount of heat is generated by electron strikes and heavy ion bombardment, and the resulting high temperature can cause photoresist reticulation. Consequently, temperature control is needed to keep the wafer healthy and to protect the photoresist. Because heat transfer is poor at low pressure, a cooling system is required. Helium backside cooling is commonly used, as shown in figure 3.4: either a clamp ring or an electrostatic chuck (E-chuck) holds the wafer, and helium is pressurized at the wafer backside so that heat is transferred from the wafer to the water-cooled chuck.
Figure 3.4: Helium backside cooling system
3.3 Case Study: Oxide RIE in PSC
This thesis studied one reactive ion etching process at Powerchip Semiconductor Corp. (PSC), where a 2300 Exelan Flex (figure 3.5) is used to etch silicon dioxide films. This equipment is a complete package with an integral power supply, RIE chamber, and internal cooling system.
Figure 3.5: 2300 Exelan Flex, etching equipment in many VLSI factories.
This advanced reactive ion etching equipment is made up of many different sections (figure 3.6). Some of these are maintenance and controller sections. The main electronics monitor the status and manipulate the values of controlled variables such as pressure, temperature, gas mass flow, voltage, current, and RF power.
Figure 3.6: Sketch of the 2300 Exelan Flex equipment, including some controllers
The other main section of the 2300 Exelan Flex is the RIE chamber (figure 3.7). The main etching chamber is configured inside the vacuum chamber for optimal efficiency. The two chambers are separated by quartz confinement rings, which provide pressure control for the main chamber. Like a typical RIE chamber (section 3.2), the main chamber consists of two parallel plates, an ESC, an RF power supply, and a pumping system.
Figure 3.7: Sketch of the main etching chamber inside RIE chamber
The purpose of the oxide reactive ion etch is to create trenches, which are filled with metal to connect wafer layers. To achieve this, three main oxide etching steps are applied during the oxide etch process, which contains eleven steps in total. Before the main etching process, a good plasma environment must be prepared by the stabilization steps (stable step, plasma strike step, and plasma ramp-up step).
The first step in the RIE recipe starts by automatically opening the high vacuum pump and then closing the rough vacuum pump, so that a vacuum of 1 mTorr is achieved inside the RIE chamber. After that, the open area between the vacuum chamber and the main etch chamber is minimized by the quartz confinement rings. At this instant the fluorine gas
Figure 3.8: Process recipe for oxide film etching in RIE
The objective of the second step is to strike the plasma. Plasma is initiated in the system by applying RF power to the wafer platter. This power is typically applied at a few hundred watts at 27 MHz and 2 MHz. These frequencies create an oscillating electric field that ionizes a few gas molecules by stripping them of electrons, creating plasma as shown in equation 3.10. In step three, the RF power is increased to enhance the plasma until it reaches the required density.
$e + CF_n \rightarrow CF_{n-1} + F + e$ (3.10)
Recipe steps listed in figure 3.8:
Step 1: Stable
Step 2: Plasma Strike
Step 3: Plasma Ramp up
Step 4: Oxide Main Etching 1
Step 5: Oxide Main Etching 2
Step 6: Plasma Ramp down
Step 7: Stable
Step 8: Plasma Strike
Step 9: Plasma Ramp up
Step 10: Oxide Main Etching 3
Step 11: Plasma Ramp down
After the plasma is enhanced, the main etching process occurs through physical and chemical mechanisms. Physical mechanisms play a larger role than chemical ones in silicon oxide etching processes. In addition, free fluorine radicals are the main etchant for this etching process. When etching oxide, the oxygen byproduct can react with C to free more fluorine (equation 3.10). The reactions between the etchant atoms and the silicon oxide wafer surface are shown in the following equations:
Adsorption on Substrate
As shown in figure 3.8, there are three main etching steps. The first oxide etch step outlines the wafer surface shape from the rounded mask. The second main etch step provides the pattern etch. The third main oxide etch reaches the deepest depth to prepare the contact areas between two layers, so it needs a higher plasma density than the previous etching steps. The stabilization steps are therefore repeated with the same procedure but with different RF power and gas flow values to prepare a new plasma.
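The structure of the eleven-step recipe can be written down as a small data structure, which makes the grouping into stabilization, main etch and ramp-down steps explicit. This is only an illustrative sketch: the step names follow figure 3.8, while the `RECIPE` name and the category labels are hypothetical and are not part of the equipment software.

```python
from collections import Counter

# (step number, step name, category). Names follow figure 3.8;
# the category labels are an illustrative grouping.
RECIPE = [
    (1, "Stable", "stabilization"),
    (2, "Plasma Strike", "stabilization"),
    (3, "Plasma Ramp up", "stabilization"),
    (4, "Oxide Main Etching 1", "main etch"),
    (5, "Oxide Main Etching 2", "main etch"),
    (6, "Plasma Ramp down", "ramp down"),
    (7, "Stable", "stabilization"),
    (8, "Plasma Strike", "stabilization"),
    (9, "Plasma Ramp up", "stabilization"),
    (10, "Oxide Main Etching 3", "main etch"),
    (11, "Plasma Ramp down", "ramp down"),
]

# Count how many steps fall into each category.
counts = Counter(category for _, _, category in RECIPE)

# Collect the step numbers of the main etching steps.
main_etch_steps = [number for number, _, category in RECIPE
                   if category == "main etch"]

print(dict(counts))
print("main etch steps:", main_etch_steps)
```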
3.4 RIE Factors
For the reactive ion etching system in this case study, forty different signals were collected from the 2300 Exelan Flex equipment at PSC. Twenty-two of these signals were carefully chosen because they directly impact the wafer status. Figure 3.9 shows the twenty-one main factors, excluding the time factor. Table 3.2 provides further details about the RIE factors.
Table 3.2: RIE factors and their clarification
Figure 3.9: Main factors of oxide RIE.
3.4.1 RF Power
The energy of the ions bombarding the substrate can be increased by employing an additional capacitive RF source. Thus the etch rate increases linearly with the RF power. RF power is the most important knob for controlling the etch rate: when the etch rate is out of specification, the first response of the engineers is to check the RF system. Four signals related to RF power are available from the 2300 Exelan Flex (table 3.2): 2 MHz power supply, 27 MHz power supply, 27 MHz reflected power and 2 MHz reflected power. In
Figure 3.10: Etch rate and DC bias as a function of the RF power.
3.4.2 Pressure
Gas pressure in RIE is typically maintained in a range between a few millitorr and a few hundred millitorr by adjusting the gas flow rates or an exhaust orifice. Normally, as pressure is decreased below about 100 mTorr, the potential across the discharge characteristically increases. At very low pressure, physical etching mechanisms tend to dominate (figure 3.11) because of high ion energy, low reactant density, and long mean free paths.
According to the kinetic theory of gases, the mean free path of a gas molecule at constant temperature is inversely proportional to the pressure. So when the pressure decreases, the mean free paths of the species increase, and the energetic particles in the plasma can more easily transfer their kinetic energy to the atoms at the silicon oxide film surface.
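The inverse relation between pressure and mean free path can be illustrated with the hard-sphere expression from kinetic theory, λ = k_B T / (√2 π d² p). The sketch below assumes a gas temperature of 300 K and an effective molecular diameter of 3.7e-10 m; both values are illustrative assumptions and do not describe the actual etch gas mixture.

```python
import math

K_B = 1.380649e-23       # Boltzmann constant, J/K
PA_PER_MTORR = 0.133322  # one millitorr expressed in pascals


def mean_free_path(pressure_mtorr, temp_k=300.0, diameter_m=3.7e-10):
    """Hard-sphere mean free path in metres at the given pressure.

    temp_k and diameter_m are assumed illustrative defaults.
    """
    pressure_pa = pressure_mtorr * PA_PER_MTORR
    cross_section = math.pi * diameter_m ** 2
    return K_B * temp_k / (math.sqrt(2.0) * cross_section * pressure_pa)


# Sweep the pressure range quoted for RIE (a few mTorr to ~1 Torr).
for p in (1.0, 10.0, 100.0, 1000.0):
    print(f"{p:7.1f} mTorr -> mean free path {mean_free_path(p) * 1e3:.3f} mm")
```

Each tenfold reduction in pressure lengthens the mean free path by a factor of ten, so at a few millitorr the species travel centimetre-scale distances between collisions under these assumptions.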
Figure 3.11 Qualitative effect of pressure on ion energy and the etching mechanism [24]
CHAPTER 4 PRINCIPAL COMPONENT ANALYSIS
Principal component analysis (PCA) is an important analysis technique in multivariate statistics. It was first suggested in 1901 by Pearson [36] and formally developed by Hotelling [37]. The main idea of PCA is to represent a number of correlated variables by a smaller number of uncorrelated variables called principal components (PCs). The first principal component accounts for as much of the variability in the data as possible; the second PC is the linear combination with the second largest variance that is orthogonal to the first PC, and so on. There are as many PCs as original variables. For many datasets, the first several PCs explain most of the variance, so that the rest can be disregarded with minimal loss of information.
The objectives of using PCA are to reduce the dimensionality of a data set and to identify new underlying variables that are orthogonal.
To enhance the performance of the prediction model in this study, PCA is used to represent the RIE factors, since simple neural networks with few nodes and connections tend to have better generalization capability. In this chapter, the PCA technique automatically extracts three principal components (PCs) from all twenty-two RIE factors. Table 4.1 shows the RIE factors and their abbreviations.
Table 4.1 RIE factors and their abbreviations.
x1    Bias Voltage            x12   Gas 4
x2    ESC Clamp Voltage       x13   Gas 7
x3    ESC Current1            x14   He Flow Inner
x4    ESC Current2            x15   He Flow Outer
x5    ESC Temperature         x16   He Pressure Inner
x6    Foreline Pressure       x17   He Pressure Outer
x7    Forward Power 27MHz     x18   Pressure
x8    Forward Power 2MHz      x19   Process Time
x9    Gas 1                   x20   Reflect Power 27MHz
x10   Gas 10                  x21   Reflect Power 2MHz
x11   Gas 11                  x22   Top Plate Temperature
It is important to treat each step separately in PCA, because each etching step has different inherent physical and chemical characteristics. Considering the overall process characteristics and the objective of model simplicity, it was decided that utilizing one PCA for each of the eleven steps might yield a better solution than utilizing a single PCA for the entire process. In this thesis, principal component analysis was applied to the 90 training wafers. The principal components are found by computing the sample covariance* matrix and selecting its eigenvectors (loading vectors) for the k largest eigenvalues, as shown in figure 4.1.
* Covariance matrix Cov(X) is a good choice to capture the dependence between
The covariance matrix Cov(X) can be obtained by:
where X is the data matrix with n samples (rows) and 22 variables (columns), each column representing one of the RIE factors; X_i represents vector i of the data matrix; and (X − X̄) stands for subtracting the mean value of each column from the corresponding column.
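The computation described here, centering each column and forming the covariance matrix, can be sketched in a few lines. The example below assumes the usual sample covariance with the 1/(n − 1) normalization, which is an assumption about the convention used in this thesis, and it uses a toy data matrix with two variables rather than the twenty-two RIE factors.

```python
def column_means(X):
    """Mean of each column of a row-major data matrix."""
    n = len(X)
    return [sum(row[j] for row in X) / n for j in range(len(X[0]))]


def sample_covariance(X):
    """Sample covariance matrix of X (rows = samples, columns = variables).

    Uses the 1/(n - 1) normalization, which is assumed here.
    """
    n, p = len(X), len(X[0])
    means = column_means(X)
    # Subtract the mean of each column from the corresponding column.
    centered = [[row[j] - means[j] for j in range(p)] for row in X]
    return [[sum(r[a] * r[b] for r in centered) / (n - 1) for b in range(p)]
            for a in range(p)]


# Toy data: four samples, two variables (illustrative values only).
X = [[2.0, 1.0], [4.0, 3.0], [6.0, 5.0], [8.0, 9.0]]
cov = sample_covariance(X)

for row in cov:
    print([round(value, 4) for value in row])
```

The result is a symmetric matrix whose diagonal holds the variance of each variable; for the RIE data it would be 22 by 22.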
To find the eigenvalues and eigenvectors of the covariance matrix, Cov(X) is decomposed using singular value decomposition (SVD), as shown in the following equation:
$\mathrm{Cov}(X) = V \Lambda V^{T}$ (4.2)
where $V = [v_1 \; v_2 \; \cdots \; v_{22}]$ is a 22 by 22 unitary matrix of the corresponding eigenvectors
eigenvalues for the eleven steps. Figure 4.2 shows the decrease in the eigenvalues λ for each etching step.
The variance of the ith PC is equal to the ith largest eigenvalue of the covariance matrix [38]. Because of this important property, the φ percentage (equation 4.3) is used as a guide in choosing an appropriate number of PCs. The goal is to choose as small a value of k as possible while achieving a reasonably high percentage of PC variance. The φ values shown in table 4.3 give the cumulative proportion of the variance explained by the first k PCs. φ is larger than 88% when k equals three, thus three principal components are selected to characterise the twenty-two RIE factors by using equation 4.4.
The score equations (T = XV) identify the three principal components in terms of the RIE factors. The score equations for each etch step are shown in Appendix II.
Figure 4.2 Eigenvalues of the covariance matrix for RIE steps
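The selection of k and the score calculation can be sketched as follows. The sketch assumes that the φ percentage is the cumulative share of the eigenvalue sum, φ(k) = 100 · (λ1 + … + λk) / (λ1 + … + λp); this definition is an assumption, since it is the usual one for cumulative explained variance. The eigenvalues, the function names and the tiny loading matrix are hypothetical and are not the thesis data.

```python
def cumulative_variance_percent(eigenvalues):
    """phi(k) for k = 1..p: cumulative share of the eigenvalue sum, in percent."""
    ordered = sorted(eigenvalues, reverse=True)
    total = sum(ordered)
    running = 0.0
    phi = []
    for lam in ordered:
        running += lam
        phi.append(100.0 * running / total)
    return phi


def smallest_k(eigenvalues, threshold_percent):
    """Smallest number of PCs whose cumulative share reaches the threshold."""
    phi = cumulative_variance_percent(eigenvalues)
    for k, value in enumerate(phi, start=1):
        if value >= threshold_percent:
            return k
    return len(phi)


def scores(X_centered, V_k):
    """T = X V: project centered samples onto the loading vectors.

    V_k is given row-major with one column per retained loading vector.
    """
    k = len(V_k[0])
    return [[sum(x * V_k[j][c] for j, x in enumerate(row)) for c in range(k)]
            for row in X_centered]


# Hypothetical eigenvalues, for illustration only.
eigs = [9.0, 6.0, 3.0, 1.0, 0.6, 0.4]
phi = cumulative_variance_percent(eigs)
k = smallest_k(eigs, 88.0)
print("phi:", [round(value, 1) for value in phi])
print("smallest k reaching 88%:", k)

# Tiny projection example: two variables, one retained loading vector.
T = scores([[1.0, 2.0], [3.0, -1.0]], [[1.0], [0.0]])
print("scores:", T)
```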
CHAPTER 5 PREDICTION MODELS FOR RIE
The computing, telecommunications, aerospace, automotive and consumer electronics industries all rely heavily on integrated circuits (ICs). Next-generation IC manufacturing equipment will require dramatic improvements in cost, quality, throughput, and flexibility. Reducing manufacturing cost involves increasing chip yield, reducing cycle time, maintaining consistent product quality, improving equipment reliability, and maintaining stringent process control. Since IC fabrication consists of hundreds of steps, maintaining product quality requires the control of thousands of variables. Process steps are performed in sequence, and yield loss may occur at every step. Analyzing wafer defects is the regular method for evaluating semiconductor technologies. Wafer defects carry a lot of wafer status information, which can be analyzed in order to characterize the quality of processes and products. If the prediction model accurately predicts the wafer status, repeated etching failures can be prevented, process yield greatly enhanced, inspection cost reduced and profit increased.
The experimental process of this study is depicted in figure 5.1, and four models are included: offline back-propagation neural network (BPNN), offline principal component analysis BPNN (PCABPNN), online BPNN and online PCABPNN. These models have the potential to reduce the overall cost of ownership of semiconductor equipment by increasing the wafer yield and the throughput of product wafers. They do not depend upon monitor wafers or expensive metrology; rather, they enable inexpensive real-time wafer-to-wafer control applications in RIE. The capability of the four prediction models to predict the wafer status correctly is discussed in this chapter.
5.1 Data Collection from RIE.
The most important step in semiconductor process modelling is the collection of data. It is essential to gather a sufficient sample of representative data; otherwise it is impossible to train a neural network or any other type of model. In this study, the parameter data of the 2300 Exelan Flex machine were collected by engineers at the Powerchip Semiconductor Corp. (PSC) factory based on their experience. These parameters include chamber temperature and pressure, forward and reflected RF power, DC bias and gas flow rates (table 3.2).
Figure 5.2 Percentage of the training and testing wafers in this case study.
Figure 5.2 presents the split of the one hundred twenty wafers collected from the 2300 Exelan Flex machine into training and testing sets, with a ratio of training to testing wafers of three to one. Fourteen wafers (12%) of the ninety training wafers (75%) are defective (unopened etch), and five wafers (4%) of the thirty testing wafers (25%) are defective (unopened etch).
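The quoted percentages can be reproduced with a short calculation. It suggests that all four percentages are expressed relative to the full set of one hundred twenty wafers, which is an inference from the arithmetic rather than something stated explicitly; the variable names are illustrative.

```python
total_wafers = 120

# Wafer counts as quoted in the text.
groups = {
    "training": 90,
    "testing": 30,
    "defective training": 14,
    "defective testing": 5,
}

# Percentage of each group relative to all collected wafers, rounded.
percent = {name: round(100 * count / total_wafers)
           for name, count in groups.items()}

for name, value in percent.items():
    print(f"{name}: {groups[name]} wafers = {value}% of {total_wafers}")
```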
5.2 Data Preparation.
This section explains the details of the data preparation performed in this study; data preparation techniques are used to obtain good prediction results. Figure 5.3 shows the variation in the number of data points (process time) for each step. For example, in step five the number of data points ranges from 10 to 140, with an average of 60. Because of this dissimilarity, it is hard to decide the BPNN inputs. Therefore three different data preparation techniques are suggested: raw data, sampling data and statistical summary data.
• Raw data preparation: the minimum number of data points among the collected wafers is one hundred eighty-four, so the first one hundred eighty data points are used as raw data inputs for the offline prediction models, and twenty data points are used as raw data inputs for the online prediction models.
• Sampling: non-symmetric sampling is the second suggested preparation technique. It covers all etching steps (figure 3.8) while focusing on the three main etching steps (steps 4, 5 and 10). Table 5.1 shows the number of captured samples in each step: two samples are captured from each stabilization step (steps 1, 2, 3, 7, 8 and 9) and two from each plasma ramp-down step (steps 6 and 11), while more than half of the captured samples come from the main etching steps (steps 4, 5 and 10). As a result, thirty-four captured samples cover all etching steps. These thirty-four samples are used as inputs for the offline prediction models.
Table 5.1: Number of suggested sampling for each step in the sampling technique
Figure 5.4: The position of captured samples.
Figure 5.4 illustrates the position of captured samples, where the first data point of each step is captured, and the suggested sampling rate for the main etching steps is five
the online prediction models. These samples include the first data point of each stabilization step, the third point after the beginning of each of the first three stabilization steps, and the end point of the third step (which is the first data point of the fourth step).
• Statistical summary preparation: the last suggested preparation technique, which depends on mean and standard deviation values. The following equations show the calculation of the two statistical summary values.
Figure 5.5: Statistical summary data preparation technique
Because many samples have no data for the sixth step, this step is excluded. This means that ten steps are statistically summarized and applied in the offline prediction models.
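The statistical summary technique can be sketched as follows: for one RIE factor, the data points of each step are reduced to a mean and a standard deviation, and step 6 is skipped. This is a minimal sketch under stated assumptions: the `summarize_factor` helper and the layout of `trace_by_step` are hypothetical, and the sample standard deviation with n − 1 in the denominator (as computed by `statistics.stdev`) is assumed because the normalization used in this study is not stated here.

```python
from statistics import mean, stdev

SKIPPED_STEPS = {6}  # step 6 lacks data for many wafers, so it is left out


def summarize_factor(trace_by_step):
    """Return [mean, std, mean, std, ...] over steps 1..11 except skipped ones.

    trace_by_step maps a step number to the list of readings of one factor.
    """
    features = []
    for step in range(1, 12):
        if step in SKIPPED_STEPS:
            continue
        values = trace_by_step[step]
        features.append(mean(values))
        # stdev needs at least two points; fall back to 0.0 for a single reading.
        features.append(stdev(values) if len(values) > 1 else 0.0)
    return features


# Illustrative readings only: two data points per step.
trace = {step: [float(step), float(step) + 2.0] for step in range(1, 12)}
features = summarize_factor(trace)
print(len(features), "summary values for one factor")
```

With ten summarized steps, each factor contributes twenty values (ten means and ten standard deviations), regardless of how many data points each step originally contained.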
5.3 Architecture of prediction models
As stated before, many researchers have adopted BPNN to solve problems of categorization, prediction and examination of manufacturing processes, because of the advantages of BPNN such as being easy and fast to comprehend, high accuracy and fast recall speed. This study combines the back-propagation neural network (BPNN) and principal component analysis (PCA) to construct the four prediction models shown in table 5.2. The offline prediction models use all etching process steps to predict the wafer status after the end of the etching process. On the other hand, the online prediction models use only the first three stabilization steps. The online prediction models are capable of reducing the number of defective wafers more than the offline prediction models, because abnormalities of the etching process can be predicted as early as possible, before the end of the first main etching step (step 4).
Table 5.2: The methods used in each prediction model.
Model             PCA   BPNN   Notes
Offline BPNN            ※      Concerns all etching steps
Online BPNN             ※      Concerns the first three steps
Offline PCABPNN   ※     ※      Concerns all etching steps
Online PCABPNN    ※     ※      Concerns the first three steps
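The four model variants of table 5.2 can also be expressed as a small configuration structure, which is convenient when the same training code is run for each variant. The `ModelSpec` class and its field names are hypothetical and serve only to illustrate the table; every variant uses a BPNN, so only the PCA flag and the covered steps differ.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelSpec:
    """Illustrative description of one prediction model from table 5.2."""
    name: str
    uses_pca: bool
    steps: tuple  # recipe steps whose data feed the model


ALL_STEPS = tuple(range(1, 12))  # offline models: all eleven steps
FIRST_THREE = (1, 2, 3)          # online models: first three stabilization steps

MODELS = (
    ModelSpec("Offline BPNN", False, ALL_STEPS),
    ModelSpec("Online BPNN", False, FIRST_THREE),
    ModelSpec("Offline PCABPNN", True, ALL_STEPS),
    ModelSpec("Online PCABPNN", True, FIRST_THREE),
)

for model in MODELS:
    pca = "PCA + BPNN" if model.uses_pca else "BPNN only"
    print(f"{model.name}: {pca}, steps {model.steps[0]}-{model.steps[-1]}")
```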
5.3.1 BPNN
BPNN in this study consists of three layers of neurons: the input layer, hidden layer, and output layer. The input layer receives external information such as RIE processing factors or principal components. Predictions are produced at the output layer and expressed as binary values to represent the wafer status. Since the network output lies between zero and one, outputs smaller than the Min value are set to zero and outputs greater than the Max value are set to one. If the network output value is equal to one, the wafer status is good; otherwise it is bad (defected