Chapter 1 Introduction
Jian-Bin Chiu, Cheng-Ching Yu, et al. [9] applied the concept of the soft landing of a spacecraft to CMP operation, which allows the CMP operation to be formulated as a minimum-time optimal control problem. They treat the oxide surface as the landing surface, the polishing pad as the flying vehicle, and the removal rate as the vertical velocity.
The equations describing the removal can be expressed as:
\[ \frac{dH}{dt} = -RR, \qquad \frac{dRR}{dt} = u \]
where H is the thickness of material to be removed, RR is the removal rate, and u is the rate of change of the removal rate. Constraints are imposed on the removal rate and on its rate of change because the operating parameters of the CMP machine, e.g. platen speed, down force, and slurry flow rate, have physical limits. They also set the final conditions to H(tf) = 2000 Å and RR(tf) = 2000 Å/min in order to reduce dishing and erosion, according to the experimental data reported by K. Wijekoon, S. Tsai, et al. [10]. Fig. 1.2 shows that dishing and erosion are proportional to the pressure and relative velocity. Once the landing point is reached (H(tf) = 2000 Å), the polisher continues the removal at the lower removal rate (RR(tf) = 2000 Å/min) until the end point is detected. Fig. 1.3 shows the result of the optimal operation. Inspired by their work, we use dynamic programming as the method of optimal operation in this thesis.
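The landing phase of this formulation can be illustrated with a short calculation. The sketch below is not the method of [9]; it only assumes that the pad is already polishing at a maximum removal rate `rr_max` and then decelerates at the largest allowed rate `u_max` so that it arrives at the landing point with the prescribed removal rate. The function name and the numerical values of `h0`, `rr_max`, and `u_max` are hypothetical; only the landing conditions of 2000 Å and 2000 Å/min come from the text.

```python
def landing_profile(h0, h_f, rr_max, rr_f, u_max):
    """Cruise at rr_max, then decelerate at -u_max to arrive at (h_f, rr_f).

    Uses dH/dt = -RR and dRR/dt = u with the constant input u = -u_max
    during the deceleration phase.
    """
    # Time needed to slow down from rr_max to rr_f.
    t_decel = (rr_max - rr_f) / u_max
    # Thickness removed while decelerating (area under the RR(t) ramp).
    d_decel = (rr_max ** 2 - rr_f ** 2) / (2.0 * u_max)
    # Thickness removed at constant rr_max before the deceleration starts.
    d_cruise = (h0 - h_f) - d_decel
    if d_cruise < 0:
        raise ValueError("deceleration alone removes more than h0 - h_f")
    t_cruise = d_cruise / rr_max
    return t_cruise, t_decel, d_decel


# Hypothetical operating values; only h_f and rr_f are taken from the text.
t_cruise, t_decel, d_decel = landing_profile(
    h0=10000.0,      # initial thickness to remove, in angstroms (assumed)
    h_f=2000.0,      # landing thickness, in angstroms
    rr_max=4000.0,   # maximum removal rate, in angstroms/min (assumed)
    rr_f=2000.0,     # landing removal rate, in angstroms/min
    u_max=4000.0,    # maximum deceleration, in angstroms/min^2 (assumed)
)
print(t_cruise, t_decel, d_decel)
```

With these assumed values the deceleration must begin when 1500 Å remain above the landing thickness, which shows how the constraint on u fixes the switching point.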
1.2 Thesis Outline
In this study, we focus on the mechanical effects in CMP. The down force and rotational speed are taken as the operational parameters. Chapter 2 gives an overview of the CMP process, introduces the mechanical and chemical parameters, and presents the model used in the simulations. Chapter 3 introduces the method of dynamic programming and discusses the simulation data. Chapter 4 presents and discusses the experimental results obtained by operating the process with dynamic programming. Chapter 5 focuses on step height reduction through the down force, based on force redistribution, and presents the corresponding simulation data. Finally, Chapter 6 presents the conclusion and future work.
Chapter 2 An Overview of CMP Process and Model Representation
CMP has been used to polish a variety of materials for thousands of years, for example to produce optically flat, mirror-finished surfaces. More recently, optically flat and damage-free glass and semiconductor surfaces have been prepared by CMP processes. CMP is now being introduced to planarize the interlayer dielectric and the metal wiring that forms the interconnections between devices.
2.1 Introduction of CMP structure
A schematic of a CMP machine and the perspective front view of a typical CMP system are shown in Figs. 2.1, 2.2, and 2.3. Both the carrier and the pad rotate in the same direction at different speeds. The flowing slurry is carried onto the wafer surface through the pores of the pad surface. The slurry chemically attacks and softens the wafer surface, which is then removed by mechanical abrasion. The primary components of a CMP machine are as follows:
(1) Wafer Carrier:
The wafer carrier holds the wafer face down during CMP and brings the wafer in contact with the polishing pad. The carrier rotates in the same direction as the platen.
(2) Platen:
The rotating base on which the polishing pads are placed, sometimes referred to as the polishing “table”.
(3) Polish Arm:
The polish arm transports the wafer.
(4) Pad:
A pad which is mounted on a rotating platen and polishes the wafer. Polishing pads come in a variety of materials and are designed with a variety of surface features depending on the process results needed.
(5) Slurry:
An abrasive mixture containing particles of colloidal silica, alumina, or some other abrasive material suspended in a chemical compound and DI water.
Slurry is fed onto and through the polishing pad during CMP in order to remove material from the wafer surface.
(6) Pad Conditioning:
A process in which the polishing pad is “roughed up” by a diamond disc in order to reduce the effects of glazing. As shown in Fig. 2.4, conditioning enhances pad performance but reduces overall pad lifetime.
If only the mechanical abrasion is emphasized, the removal rate decreases and the wafer surface may peel off or be scratched. On the other hand, if only the chemical removal is emphasized, erosion of the dielectric or dishing of the metal lines results. Giving undue emphasis to either of them will not achieve global planarization. Therefore, how to combine mechanical abrasion and chemical reaction to obtain good performance and high throughput is an important current challenge.
2.2 CMP Process Parameters
As the name chemical mechanical polishing suggests, the primary parameters fall into two groups: the chemical aspect and the mechanical aspect.
2.2.1 Mechanical Parameters
The primary mechanical parameters are as follows:
(1) Platen Speed:
The platen speed affects slurry transport across the wafer and the transport of the reactants and products of chemical reactions to and from the wafer surface. It has been noted that the copper removal rate depends strongly on the platen speed. As shown in Fig. 2.5 [16], the removal rate increases as the platen speed increases.
(2) Carrier Speed:
When the carrier speed is the same as the platen speed, the best uniformity will be achieved.
(3) Down Force:
As shown in Fig. 2.6 [16], the removal rate increases with the down force, which shortens the polishing time and therefore raises throughput. The danger is that too much down force can cause defects such as scratches or gouges and may cause non-uniformity.
(4) Back Pressure:
Back pressure is sometimes used to give the wafer some curvature or shape during polishing. The idea is to produce an optimum wafer shape with respect to the pad underneath in order to improve the removal rate distribution across the wafer and the within-wafer non-uniformity (WIWNU).
(5) Pad Conditioning:
Several variables affect pad conditioning. For instance, the conditioning duration, the abrasiveness of the disc, and the down force all have an effect on the pad. A long conditioning duration may improve pad performance, but ultimately reduces pad lifetime.
(6) Slurry Flow Rate:
Slurry flow rate affects how quickly new chemicals and abrasives are delivered to the pad and how quickly reaction by-products and used abrasives are removed from the pad. It also affects how much slurry is on the pad and therefore the lubrication properties of the system.
Other mechanical parameters also affect the process, for instance polish oscillation, wafer mounting, and pad hardness.
2.2.2 Chemical Parameters
The primary chemical parameters are as follows:
(1) Abrasive Size
Abrasive size affects the removal rate and the surface damage. For example, experimental results show an inversely proportional relationship between the abrasive size and the material removal rate (Fig. 2.7 [11]).
(2) Abrasive Weight Concentration
Abrasive weight concentration also affects the removal rate. For instance, experimental results show a proportional relationship between the abrasive concentration and the material removal rate (Fig. 2.7 [11]).
(3) Abrasive Variety
Silicon dioxide (silica, SiO2) is the abrasive most commonly used for oxide polishing, while aluminum oxide (alumina, Al2O3) is the one most commonly used for metal polishing.
(4) Slurry Viscosity
The more viscous a material, the more it resists flow. High slurry viscosity results in poor transport of reactants and products to and from the wafer surface. It also affects lubrication of the wafer-pad interface.
(5) Oxidizer Concentration
As shown in Fig. 2.8 [12], in the region of low oxidizer concentration the rate of oxide generation is small and the passivation layer is removed as soon as it is formed. A maximal removal rate is reached when the rate of passivation equals the rate of mechanical abrasion. As the oxidizer concentration is increased further, the passivation layer changes its structure, which creates a barrier to mechanical abrasion and slows down the removal rate.
Other chemical parameters also affect the process, for instance slurry temperature, slurry buffering, and film hardness.
Many variables can affect CMP performance, as shown in Table 2-1. In addition, some factors that are difficult to control and monitor, such as the slurry transport under the wafer and the local temperature of the slurry, also have significant effects on CMP performance. The process parameters are also interrelated, so that a modification to one parameter has an impact on other process issues. For instance, increasing the platen speed or down force may increase the removal rate, yet at the same time create slurry flow rate and distribution problems. Therefore, the key problem is how to optimize the process parameter settings in order to obtain the desired results for the given film being planarized.
2.3 Model of Chemical Mechanical Polishing
The material removal model for CMP can be separated into two parts, a mechanical model and a chemical model. The chemical action of the slurry continuously softens the silicon oxide or oxidizes the metal surface to form a thin passive layer, which is immediately removed by the action of the slurry abrasives. The fresh silicon oxide or metal surface exposed by the abrasion is then rapidly repassivated and removed. This process of passivation-abrasion-repassivation continues until the desired thickness is reached. Based on this idea, a mechanical removal model and a chemical model can be developed independently for CMP, with the mechanical model considering only the mechanical removal of the passivation layer and the chemical model considering only the passivation of this layer.
2.3.1 Preston Equation
Preston proposed a simple model of material removal in glass polishing tools, postulating on the basis of experimental observation that the removal rate is proportional to the nominal applied pressure and to the relative velocity between the pad and the material being polished. The Preston equation [3] for the removal rate RR can be written as
RR = Kp P V
where P is the down pressure, V is the relative velocity of the wafer, and Kp is a constant representing the effect of the remaining parameters, such as the abrasive type and concentration and the nature of the chemicals and their concentrations. This equation has been widely used in CMP process control and consumable development for IC fabrication and manufacturing. However, it focuses on the mechanical removal of material, and some phenomena cannot be explained by it. For example, experimental results show that the pressure dependence of the removal rate for CMP with a soft pad is nonlinear. Moreover, what exactly is lumped into the all-purpose parameter Kp is unclear.
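The linear behaviour implied by the Preston equation can be checked with a few lines of code. This is a minimal sketch: the function name and the numerical values of Kp, P, and V are hypothetical and carry no physical calibration.

```python
def preston_removal_rate(kp, pressure, velocity):
    """Preston equation: RR = Kp * P * V."""
    return kp * pressure * velocity


# Hypothetical values chosen only to show the proportionality.
rr_base = preston_removal_rate(kp=2.0, pressure=3.0, velocity=4.0)
rr_double_pressure = preston_removal_rate(kp=2.0, pressure=6.0, velocity=4.0)
print(rr_base, rr_double_pressure)
```

Doubling the pressure doubles the predicted removal rate, which is exactly the linear dependence that the soft-pad experiments mentioned above do not follow.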
2.3.2 Luo and Dornfeld Equation
Luo and Dornfeld proposed a model to describe the interactions between the wafer, pad, and abrasives, which are quite different from those in conventional polishing or lapping processes because of the low pad hardness and the different size scales of the pad asperities and the polishing abrasives. They assumed that material removal occurs in the solid-solid contact mode rather than the hydrodynamic mode, as shown in Fig. 2.9. The Luo and Dornfeld equation [8] for the removal rate RR can be written as
\[ RR = C_1 \left\{ 1 - \Phi\left[ 3 - C_2 P_0^{1/3} \right] \right\} \sqrt{P_0}\, V + RR_C \]
where
P0 is the down force
V is the relative velocity of the wafer
C1 is a constant representing the effect of slurry abrasives (average size and size distribution), wafer and pad hardness, and pad roughness
C2 is a constant representing the effect of slurry chemicals, slurry abrasives, wafer size, wafer density, wafer hardness, pad material, and pad roughness
Φ is the normal cumulative distribution function, which represents the probability of active abrasives over the wafer-pad interface:
\[ \Phi(x) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{x} e^{-t^2/2}\, dt \]
RRC is the material removal due to chemical etch
The values C1 and C2 are independent of the down force P0 and the relative velocity V. This model also focuses primarily on the mechanical effect, particularly the abrasion due to abrasive-wafer and abrasive-pad contact, but it includes the chemical reaction at the wafer surface. It therefore describes the CMP process more comprehensively.
In the SiO2 CMP process the material removal due to chemical etch, RRC, is small compared with the mechanical removal, whereas in the Cu CMP process the removal due to chemical etch may need to be considered for more accurate results. Therefore, we ignored the chemical etch effect in the simulation of the SiO2 CMP process, and the removal rate can be written as
\[ RR = C_1 \left\{ 1 - \Phi\left[ 3 - C_2 P_0^{1/3} \right] \right\} \sqrt{P_0}\, V \]
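A small numerical sketch may help to see how this model differs from the Preston equation. It assumes the commonly quoted form of the Luo and Dornfeld model, RR = C1 {1 - Φ[3 - C2 P0^(1/3)]} sqrt(P0) V + RRC, with Φ the standard normal cumulative distribution function. The function names and the values of C1, C2, P0, and V are hypothetical and are not fitted to any data in this thesis.

```python
import math


def normal_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def luo_dornfeld_rr(p0, v, c1, c2, rr_chem=0.0):
    """Removal rate in the assumed Luo and Dornfeld form.

    rr_chem is the chemical-etch term; it defaults to zero, which
    corresponds to the simplification used for SiO2 polishing.
    """
    # Fraction of abrasives that are active at the given down force.
    active_fraction = 1.0 - normal_cdf(3.0 - c2 * p0 ** (1.0 / 3.0))
    return c1 * active_fraction * math.sqrt(p0) * v + rr_chem


# Hypothetical constants, chosen so that the argument of the CDF is zero
# at p0 = 1 and exactly half of the abrasives are active.
rr_one = luo_dornfeld_rr(p0=1.0, v=10.0, c1=2.0, c2=3.0)
rr_two = luo_dornfeld_rr(p0=2.0, v=10.0, c1=2.0, c2=3.0)
print(rr_one, rr_two, rr_two / rr_one)
```

In this example doubling P0 more than doubles the removal rate, because both the active fraction and the square-root term grow with the down force; this is the kind of nonlinear pressure dependence that a single constant Kp cannot capture.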
Chapter 3 Optimal Control Design: Dynamic Programming
Sociological, economic, and physical pressures in all areas of modern life have generated an accelerating demand for high-level decision-making based upon limited information about the processes being controlled. In the 1950s, a systematic and concerted mathematical study of such decision-making situations was initiated by Richard Bellman. This pioneering work was based upon the fundamental system-theoretic notion of feedback, i.e., that decision rules should be based upon the current (and perhaps past) states of the process under study. Bellman and his colleagues continued to develop the feedback decision-making concept under the name of “dynamic programming”. At that time, the majority of problems of true practical concern were computationally intractable because of the limited state of the computing art. Since then, rapid progress in computer technology, coupled with the development of refined computational procedures, has made dynamic programming practical for solving a wide variety of problems in economics, engineering, operations research, and mathematics itself.
3.1 Bellman’s Principle of Optimality
The fundamental concept of dynamic programming originated by Bellman is called the principle of optimality. This principle may be thought of as follows: given an optimal trajectory from point A to point C, the portion of the trajectory from any intermediate point B to point C must be the optimal trajectory from B to C. In Fig. 3.1, if the path Ⅰ-Ⅱ is the optimal path from A to C, then according to the principle of optimality path Ⅱ is the optimal path from B to C. The proof by contradiction for this case is immediate: assume that some other path, such as Ⅱ´, is the optimal path from B to C. Then path Ⅰ-Ⅱ´ has a lower cost than path Ⅰ-Ⅱ. However, this contradicts the fact that Ⅰ-Ⅱ is the optimal path from A to C, and hence Ⅱ must be the optimal path from B to C.
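The principle can be checked on a toy shortest-path problem. The graph below is purely illustrative: the node names and edge costs are hypothetical and are not taken from Fig. 3.1. The sketch computes the optimal path from every node to the goal and then confirms that the tail of the optimal path from A is itself the optimal path from the intermediate node it passes through.

```python
from functools import lru_cache

# Hypothetical staged graph: {node: {successor: edge cost}}.
EDGES = {
    "A": {"B1": 2, "B2": 4},
    "B1": {"D1": 7, "D2": 3},
    "B2": {"D1": 1, "D2": 5},
    "D1": {"C": 2},
    "D2": {"C": 4},
    "C": {},
}


@lru_cache(maxsize=None)
def best(node, goal="C"):
    """Return (minimum cost, optimal path) from node to goal."""
    if node == goal:
        return 0, (goal,)
    candidates = []
    for successor, edge_cost in EDGES[node].items():
        tail_cost, tail_path = best(successor, goal)
        candidates.append((edge_cost + tail_cost, (node,) + tail_path))
    # The cheapest candidate is optimal because each tail is already optimal.
    return min(candidates)


cost_from_a, path_from_a = best("A")
intermediate = path_from_a[1]
print(cost_from_a, path_from_a)
print(best(intermediate))
```

The recursion only ever extends tails that are already optimal, which is the principle of optimality used as a computational rule.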
3.2 Dynamic Programming
Consider a quantized state x∈X at stage (N-1). At this state, each of the M admissible decisions u(m)∈U is applied. For each of these decisions the cost at the current stage can be determined as
\[ L[x, u^{(m)}, N-1], \quad m = 1, 2, \ldots, M \]
Next, for each of these decisions the next state at stage N, denoted x(m), is determined from the system equation.
The next step is to compute the minimum cost at stage N for each of the states x(m). However, in general a particular state x(m) will not lie on one of the quantized states x∈X at which the optimal cost I(x, N) is defined. In fact, it may lie outside of the range of admissible states. In the latter case the decision is rejected as a candidate for the optimal decision for this state and stage. If a next state x(m) does fall within the range of allowable states, but not on a quantized value, then it is necessary to use some type of interpolation procedure to compute the minimum cost function at these points.
Assume, then, that the values of the minimum cost at the states x(m) can be expressed as a function of the values of the optimal cost at the quantized states x∈X:
\[ I[x^{(m)}, N] = P[x^{(m)}, N, I(x, N)], \quad \text{all } x \in X \]
where I(x, N) = L(x, N) if, as is often the case, no decision is made at the final stage k = N, so that the cost function at N depends only on the final state x(N).
The total cost of applying decision u(m) at state x, stage (N-1), can then be written as
\[ F_1^{(m)} = L[x, u^{(m)}, N-1] + I[x^{(m)}, N] \]
The minimization can be achieved by simply comparing the M quantities. According to the functional equation, the minimum value will be the minimum cost at state x, stage (N-1).
This procedure is repeated at each quantized state x∈X at stage (N-1). When this has been done, I(x, N-1) and û[x, N-1] are known for all x∈X. It is now possible to compute I(x, N-2) and û[x, N-2] for all x∈X based on knowledge of I(x, N-1).
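The general iterative procedure continues this process. Suppose that I(x, k+1) is known for all x∈X. Then I(x, k) and û[x, k] are computed for all x∈X from
\[ I(x, k) = \min_{u^{(m)} \in U} \left\{ L[x, u^{(m)}, k] + I[x^{(m)}, k+1] \right\} \tag{3.2} \]
where x(m) is the next state reached from x under decision u(m), and where I(x(m), k+1) is computed by interpolation on the known values I(x, k+1) for all x∈X:
\[ I[x^{(m)}, k+1] = P[x^{(m)}, k+1, I(x, k+1)], \quad \text{all } x \in X \]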
The optimal decision û[x, k] is the decision for which Eq. (3.2) takes on the minimum. The iterative procedure begins by computing û[x, N-1] and I(x, N-1) from the given boundary conditions I(x, N), and it continues until û[x, 0] and I(x, 0) have been computed.
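One stage of this iterative procedure can be sketched as follows. This is an illustration under stated assumptions rather than the program used in this thesis: linear interpolation stands in for the interpolation procedure P, and the function names, the grid, the decision set, the system equation x - u, and the stage cost 0.5u² in the usage example are all hypothetical.

```python
from bisect import bisect_right


def interp_cost(x_grid, cost_next, x):
    """Linearly interpolate cost_next (tabulated on ascending x_grid) at x."""
    j = bisect_right(x_grid, x) - 1
    if j >= len(x_grid) - 1:          # x coincides with the last grid point
        return cost_next[-1]
    w = (x - x_grid[j]) / (x_grid[j + 1] - x_grid[j])
    return (1.0 - w) * cost_next[j] + w * cost_next[j + 1]


def backward_step(x_grid, cost_next, decisions, step_cost, next_state):
    """Compute I(x, k) and the optimal decision at every grid state."""
    lo, hi = x_grid[0], x_grid[-1]
    cost_now, policy_now = [], []
    for x in x_grid:
        best_cost, best_u = float("inf"), None
        for u in decisions:
            x_next = next_state(x, u)
            if x_next < lo or x_next > hi:
                continue              # next state not admissible: reject u
            total = step_cost(x, u) + interp_cost(x_grid, cost_next, x_next)
            if total < best_cost:     # keep the smallest of the M quantities
                best_cost, best_u = total, u
        cost_now.append(best_cost)
        policy_now.append(best_u)
    return cost_now, policy_now


# Hypothetical example: terminal cost x^2, stage cost 0.5 * u^2.
x_grid = [0.0, 1.0, 2.0, 3.0, 4.0]
cost_final = [x * x for x in x_grid]
cost_prev, policy_prev = backward_step(
    x_grid,
    cost_final,
    decisions=[0.0, 0.5, 1.5],
    step_cost=lambda x, u: 0.5 * u * u,
    next_state=lambda x, u: x - u,
)
print(cost_prev)
print(policy_prev)
```

Calling `backward_step` repeatedly, each time feeding in the cost table it just produced, walks the recursion from stage N-1 back to stage 0.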
The complete results of dynamic programming are shown in Fig. 3.2. At each state and stage, the optimal decision is written below the grid point and the minimum cost above it. Finally, the optimal sequence of decisions is found by starting from the given x(0) and applying the system equation. This is called the recovery procedure, and these decisions are the inputs for our experiments. However, this sequence relies on the system equation alone, because no measurement of the state is available. Our simulated results were obtained in this manner.
If we could monitor the state and stage of the system, the dynamic programming solution, û[x, k], leads to a feedback control or decision policy configuration. One method of implementing this solution is simply to store all the values of û[x, k] in memory, monitor the state and stage of the system, and look up the appropriate value of û[x, k] as required. This type of implementation is attractive because the dynamic programming calculations can be done off-line, and the only operation that needs to be done during the decision interval is retrieval of the appropriate optimal decision.
The system configuration is as shown in Fig. 3.3.
3.3 Simulation Results
The simulation model rests on a simple idea: the rate of change of the thickness being polished equals the negative of the removal rate, and the removal rate is taken as the input. The equation is written as
\[ \dot{h} = \frac{dh}{dt} = -RR = -u \]
where h is the thickness, RR is the removal rate, and u is the input. The discretized version with a sampling period T is
\[ h(k+1) = h(k) - T \times u(k) \tag{3.3} \]
where k is the stage. We assumed only 7 values of the input (including 0) because of restrictions on the Westech 372M CMP machine, and the sampling period T was fixed to 1. A cost is assigned to each of these inputs at the current stage, and a cost is also assigned at the final stage N; in these costs, s is the weighting factor of the final state, q the weighting factor of the transient state, and r the weighting factor of the input. A quantized state h∈H and the admissible inputs u(m)∈U are then applied:
H = [6000 5999 5998 … 2 1 0] (jN = 6001)
U = [u(1) u(2) u(3) u(4) u(5) u(6) u(7)] (mN = 7)
As in Section 3.2, I(h, N-1) and û[h, N-1] are computed first for all h∈H, and I(h, N-2) and û[h, N-2] are then computed from knowledge of I(h, N-1). The iterative procedure continues until û[h, 0] and I(h, 0) have been computed. The program flowchart is shown in Fig. 3.4, and the complete results of dynamic programming can be plotted as in Fig. 3.2. Finally, the optimal sequence of inputs starting from the given h(0) is found from Eq. (3.3) by means of the recovery procedure.
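The whole procedure for the thickness model can be sketched in a few dozen lines. The sketch is not the program of Fig. 3.4. It assumes quadratic costs, q·h² + r·u² at each stage and s·h² at the final stage, which is one plausible reading of the weighting factors s, q, and r; the seven input values, the number of stages, and the weights are hypothetical. Because T = 1 and the inputs are integers, every next state falls on the grid H and no interpolation is needed.

```python
def solve_dp(h_max, inputs, n_stages, q, r, s, T=1):
    """Backward dynamic programming on the integer thickness grid 0..h_max."""
    inf = float("inf")
    cost = [[inf] * (h_max + 1) for _ in range(n_stages + 1)]
    policy = [[None] * (h_max + 1) for _ in range(n_stages)]
    # Final stage: assumed quadratic penalty on the remaining thickness.
    for h in range(h_max + 1):
        cost[n_stages][h] = s * h * h
    for k in range(n_stages - 1, -1, -1):
        for h in range(h_max + 1):
            for u in inputs:
                h_next = h - T * u            # discretized thickness model
                if h_next < 0:                # outside the admissible range
                    continue
                total = q * h * h + r * u * u + cost[k + 1][h_next]
                if total < cost[k][h]:
                    cost[k][h] = total
                    policy[k][h] = u
    return cost, policy


def recover(h0, policy, T=1):
    """Recovery procedure: follow the stored optimal inputs from h0."""
    h, thickness, sequence = h0, [h0], []
    for stage_policy in policy:
        u = stage_policy[h]
        sequence.append(u)
        h = h - T * u
        thickness.append(h)
    return sequence, thickness


# Hypothetical settings: 7 inputs including 0, 12 stages, integer weights.
INPUTS = [0, 100, 200, 300, 400, 500, 600]
cost, policy = solve_dp(6000, INPUTS, n_stages=12, q=1, r=5, s=100)
input_sequence, thickness = recover(6000, policy)
print(input_sequence)
print(thickness)
```

The `policy` table is also what a feedback implementation would store: given a measured thickness h at stage k, the optimal input is simply `policy[k][h]`.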
3.3.1 SiO2 CMP Process
Before simulating dynamic programming, we have to determine the two constants, C1 and C2, in the Luo and Dornfeld equation. Furthermore, following Yin’s thesis [19], we changed the exponent of V from 1 to 6/10, and V here denotes the rotational speed; the platen speed and carrier speed were both set equal to this rotational speed. Two sets of experimental removal rate results were used to solve for C1 and C2 by an iterative trial-and-error method. The process