Chemical mechanical planarization operation via dynamic programming
Chia-Shui Lin a,*, Yung-Chou Lee b
a Department of Mechanical Engineering, National Chiao Tung University, Hsin-Chu 30050, Taiwan
b Department of Electrical and Control Engineering, National Chiao Tung University, Hsin-Chu 30050, Taiwan
Received 28 April 2006; received in revised form 1 December 2006; accepted 5 February 2007 Available online 16 February 2007
Abstract
In this paper, the impact of down force and rotational speed on the non-planarization index during a SiO2 or Cu CMP process was investigated. Since the magnitudes of down force and rotational speed are limited, we chose the dynamic programming approach because it can perform optimization under constraints on the down force and rotational speed. The duration and magnitude of each input were computed from the chemical mechanical polishing model of Luo and Dornfeld [J. Luo, D.A. Dornfeld, IEEE Trans. Semiconduct. Manufact. 14(2) (2001) 112–132], with the other parameters fixed. Experiments on blanket wafers using dynamic programming operation and conventional constant removal rate operation were compared. The non-planarization index was improved consistently by dynamic programming operation relative to constant removal rate operation; the improvement over the constant removal rate base recipe ranged from 2% to 39% in all experiments, as shown in Tables 3 and 6. The thickness removal error was consistently smaller with constant removal rate operation than with dynamic programming operation in all experiments, as shown in Tables 3 and 6. To get the best performance in both planarization and thickness removal, it is recommended that the planarization step and the overpolish step in SiO2 and Cu CMP use different modes of operation, i.e., dynamic programming operation during the planarization step to minimize the non-planarization index and constant removal rate operation during the overpolish step to minimize the thickness removal error. The incremental time for eliminating the thickness removal error during the overpolish step can be calculated from the thickness error and the removal rate derived from Luo's removal rate model, based on the constant wafer pressure and platen speed at the end of the planarization step. Our contribution is a new approach for CMP. Standard CMP uses constant removal rate operation in both the planarization step and the overpolish step. Our new approach uses dynamic programming operation during the planarization step and constant removal rate operation during the overpolish step.
© 2007 Elsevier B.V. All rights reserved.
Keywords: Chemical mechanical planarization; Dynamic programming; Non-planarization index; Copper dishing; Oxide erosion
1. Introduction
Chemical mechanical planarization (CMP) is a widely accepted technique for providing a globally planarized surface in microelectronic wafer fabrication. CMP was developed during the early 1980s, when multilevel interconnect technology was pushed to the limits of circuit density and performance. This technique produces excellent planarization across the wafer surface and improves both the photolithography and deposition processes [1]. In recent years, device levels and densities have increased continuously; at the same time, the problem of resistance–capacitance (RC) time delays, which can appreciably slow down circuit speeds, must be solved. As a result, copper has emerged as the optimal interconnect material because of its low resistivity and high electromigration resistance compared with aluminum [2,3]. When Cu is used as the interconnect material, patterned Cu lines are produced by a damascene process. In the damascene process, the dielectric is patterned, followed by the barrier and metal deposition. The barrier is required to prevent the rapid diffusion of the Cu into the dielectric. The final step in this process is CMP, which removes the excess metal and provides global planarization. Fig. 1 schematically shows a single-layer Cu interconnect structure before and after CMP. Two key problems in Cu pattern wafer CMP, namely copper dishing and oxide erosion, generate surface non-planarity, which gives rise to problems in integrating multiple layers of metal. Copper and oxide thinning results in increased RC delay, which leads to inferior device performance. Therefore, we focus on experiments for SiO2 and Cu CMP.
* Corresponding author. Tel.: +886 35731796; fax: +886 35720634. E-mail address: chiaslin@mail.nctu.edu.tw (C.-S. Lin).
0167-9317/$ - see front matter © 2007 Elsevier B.V. All rights reserved. doi:10.1016/j.mee.2007.02.003
www.elsevier.com/locate/mee Microelectronic Engineering 84 (2007) 2817–2831
Several research efforts have been reported on modeling the CMP process, and the best-known equation is Preston's equation [4]. Preston's equation reflects the influence of process parameters including wafer pressure and relative velocity. In the last several years, revised forms of Preston's equation have concentrated on different elements of CMP. For example, Zhang and Busnaina [5] proposed an equation taking into account the normal stress and shear stress acting on the contact area between abrasive particles and wafer surfaces. Tseng and Wang [6] showed that the removal rate is proportional to P^{5/6} and V^{1/2}. Zhao and Shi [7,8] considered the effects of the pad hardness and the contact between wafer and pad. Luo and Dornfeld [9] assumed an indentation-sliding model for the penetration of the pad and included an empirical accommodation of the chemical reaction at the wafer surface. Compared with experimental results, the Luo and Dornfeld model predicts the removal rate more accurately. Therefore, the Luo and Dornfeld model is employed to predict the thickness removal rate in this paper.
Most of the research work on CMP is focused on the removal mechanism and slurry chemistry. Chiu et al. [10] applied the concept of the soft landing of a spacecraft to CMP operation, so that the CMP operation can be formulated as a minimum-time optimal control problem. They treat the oxide surface as the landing surface, the polishing pad as a flying vehicle, and the removal rate as the vertical velocity. The equations describing the thickness removal process can be expressed as:

\[
\begin{bmatrix} \dot{H} \\ \dot{RR} \end{bmatrix}
=
\begin{bmatrix} 0 & -1 \\ 0 & 0 \end{bmatrix}
\begin{bmatrix} H \\ RR \end{bmatrix}
+
\begin{bmatrix} 0 \\ 1 \end{bmatrix} a,
\qquad -a_{\max} \le a \le a_{\max}
\]

where H is the thickness of material to be removed, RR the removal rate, and a the rate of change of the removal rate. The constraints on the removal rate and on its rate of change are applied because the parameters of the CMP machine, e.g., platen speed, wafer pressure, and slurry flow rate, have physical limits. They also set the final condition to H(t_f) = 2000 Å and RR(t_f) = 2000 Å/min in order to reduce dishing and erosion, according to the experimental data reported by K. Wijekoon, S. Tsai et al. [17]. Fig. 2 shows that copper dishing and oxide erosion are proportional to platen speed and wafer pressure. Once the landing point is reached (H(t_f) = 2000 Å), the polisher continues the removal at the smaller removal rate (RR(t_f) = 2000 Å/min) until the end point is detected. Fig. 3 shows the result of optimal operation. Inspired by their work, we use dynamic programming as our method of optimal operation in this research.
Fig. 1. Schematics of a single-layer Cu interconnect: (a) before polishing, (b) ideal case after polishing and (c) real case after polishing. Labels: Cu, barrier layer, dielectric (SiO2), dishing, erosion.
Fig. 2a. Dependence of copper dishing and oxide erosion on platen speed. Wafer pressure was kept constant [17].
Fig. 2b. Dependence of copper dishing and oxide erosion on wafer pressure. Platen speed was held constant [17].
Lin and Chi [11] employed sliding-mode control to set the operation profile of the CMP process through a "Dynamic Tuning" method, so that the CMP process behaves more like a soft landing. However, experimental verification may be hard to carry out because of the continuous-time control and the lack of an available operation mechanism. Hence, dynamic programming control is employed here to deal with discrete-time control. Dynamic programming was developed by Bellman and his colleagues in the 1950s [12]. The method of dynamic programming is explained in Appendix A of this paper. It has the advantage of dealing with constrained inputs. This means that the problem of the available operation mechanism can be solved for constrained inputs and multiple finite-stage CMP. Experiments based on dynamic programming were carried out in this research for both SiO2 and Cu blanket wafers.
In this paper, the removal rate representation is presented in Section 2. In Section 3, the simulation results via dynamic programming provide the basis for the dynamic programming of wafer pressure and platen speed as part of the recipe for the CMP tool. The experimental results for CMP operation via dynamic programming are presented and discussed in Section 4. Section 5 gives the conclusions.
2. Removal rate representation
In this work, the equation proposed by Luo et al. [9] will be employed, that is

\[
RR = C_2 \left(1 - \Phi\left[3 - C_1 P_0^{1/3}\right]\right) \sqrt{P_0}\, V + RR_C,
\]

where RR denotes the removal rate; P_0 is the applied wafer pressure; V is the angular velocity of the wafer carrier and the pad platen; \Phi is the normal cumulative distribution function; C_1 and C_2 are constants representing the properties of the abrasives, pad and wafer; and RR_C is the material removal due to chemical reaction.
Fig. 3. Trajectory for RRmax = 9000 Å/min, Hsmall = 2000 Å, and RRsmall = 2000 Å/min [10].
2.1. SiO2 wafer CMP process
In the SiO2 wafer CMP process, the material removal due to chemical reaction, RR_C, is small compared with the mechanical removal, whereas the material removal due to chemical etching in the Cu wafer CMP process may need to be considered for more accurate results. Therefore, we ignored the chemical etch effect in the simulation of the SiO2 wafer CMP process. Furthermore, we changed the exponent of V from 1 to 6/10, based on Tsai's thesis [13]; the value of V is the common rotational speed of the wafer carrier and pad platen. The experiments were carried out on the Westech 372M CMP tool shown in Fig. 4. Experimental samples were prepared on p-type, (1 0 0)-oriented, 6 in. (150 mm) diameter silicon wafers. The thermally grown silicon dioxide film was obtained by wet oxidation (ASM/LB45 furnace system), in which the silicon was exposed to an ambient of H2 and O2 at 980 °C. The polishing sample for the blanket SiO2 wafer CMP experiment is a SiO2 film layer grown to a thickness of 9000 Å by this furnace system. The structure of the SiO2 blanket wafer is shown in Fig. 5. Two sets of experimental removal rate results were used to solve for the values of C_1 and C_2 by an iterative trial-and-error method, and the removal rate can be written as

\[
RR = 8257 \left(1 - \Phi\left[3 - 0.322\, P_0^{1/3}\right]\right) \sqrt{P_0}\, V^{6/10}.
\]

Fig. 6 shows the model prediction and experimental observations of the effects of the wafer pressure and platen speed.
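The paper does not spell out its trial-and-error iteration for C_1 and C_2, so the following is only a minimal sketch of one way two removal rate points can pin down both constants. It swaps in a plain bisection on C_1 (the ratio of two removal rates does not depend on C_2) and, since the measured points are not given in the text, it uses synthetic points generated from the published SiO2 constants. The function names, the bracket [0, 1] for C_1, and the units (psi, rpm, Å/min) are assumptions made for illustration.

```python
import math


def q_tail(x):
    """Upper tail 1 - Phi(x) of the standard normal distribution."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))


def removal_rate(p0, v, c1, c2, v_exp=0.6, rr_c=0.0):
    """C2 * (1 - Phi[3 - C1*P0^(1/3)]) * sqrt(P0) * V^v_exp + RR_C."""
    return c2 * q_tail(3.0 - c1 * p0 ** (1.0 / 3.0)) * math.sqrt(p0) * v ** v_exp + rr_c


def fit_c1_c2(point_a, point_b, v_exp=0.6, lo=0.0, hi=1.0, iters=100):
    """Recover (C1, C2) from two (P0, V, RR) points by bisection on C1.

    Assumes RR_C = 0 and that the ratio mismatch changes sign on [lo, hi].
    """
    (pa, va, rra), (pb, vb, rrb) = point_a, point_b

    def mismatch(c1):
        model_ratio = removal_rate(pa, va, c1, 1.0, v_exp) / removal_rate(pb, vb, c1, 1.0, v_exp)
        return model_ratio - rra / rrb

    if mismatch(lo) * mismatch(hi) > 0:
        raise ValueError("C1 is not bracketed by [lo, hi]")
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if mismatch(lo) * mismatch(mid) <= 0:
            hi = mid
        else:
            lo = mid
    c1 = 0.5 * (lo + hi)
    c2 = rra / removal_rate(pa, va, c1, 1.0, v_exp)  # C2 follows from either point
    return c1, c2


# Synthetic round trip (not measured data): generate two points from the
# published SiO2 constants at 7 psi and 2 psi, 30 rpm, then recover them.
points = [(p, 30.0, removal_rate(p, 30.0, 0.322, 8257.0)) for p in (7.0, 2.0)]
c1_fit, c2_fit = fit_c1_c2(points[0], points[1])
print(f"C1 = {c1_fit:.4f}, C2 = {c2_fit:.1f}")
```

With real measurements the two points would carry noise, so a least-squares fit over more than two points would likely be preferable to an exact two-point solve.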
Fig. 5. Structure of the SiO2 blanket wafer: 9000 Å SiO2 on a p-type dummy wafer.
2.2. Cu wafer CMP process
The experimental samples were prepared on p-type, (1 0 0)-oriented, 6 in. (150 mm) diameter silicon wafers. The thermally grown silicon dioxide film was obtained by wet oxidation (ASM/LB45 furnace system), in which the silicon was exposed to an ambient of H2 and O2 at 980 °C. The polishing sample for the blanket Cu wafer CMP experiment is a two-layer Cu/Ta film structure with thicknesses of 20,000/500 Å, sputter deposited by a ULVAC SBH-3308 RDE sputter system on a silicon wafer covered with a 1000 Å thick thermally grown SiO2 film. The 500 Å Ta underlayer is used as an adhesion promoter for the copper deposition, since copper itself does not adhere well to the thermal oxide. The structure of the Cu blanket wafer is shown in Fig. 7. Two sets of experimental removal rate results were used to solve for the values of C_1 and C_2, as we did for the SiO2 wafer CMP process. Furthermore, we retained the original exponent of 1 on V, based on our experimental data. The platen speed and carrier speed were both equal to V. The only difference was that we had to consider the chemical etching rate in the equation. The chemical etching rate was obtained by experiment, and the procedure is presented in Section 4.2. The measured etching rate was about 14 Å/min, which is quite small compared with the overall removal rate. The main reason is that we added a higher concentration of citric acid. According to Liu's thesis [14], as the concentration of citric acid in the HNO3-citric acid slurry increases, the etching rate of copper in the slurry is suppressed. The citric acid behaves like BTA (benzotriazole) in preventing copper corrosion in the HNO3-based solution. BTA is a common Cu corrosion inhibitor, since it can be adsorbed on the Cu surface to form a passivation layer. This is helpful for a Cu damascene structure because of the low etching rate in the recessed region of the Cu film. Furthermore, the removed thickness was only 6000 Å, so the contribution of chemical removal was expected to remain small. Therefore, we omitted the chemical removal rate here. Thus, the removal rate representation is

\[
RR = 30503 \left(1 - \Phi\left[3 - 0.0113\, P_0^{1/3}\right]\right) \sqrt{P_0}\, V.
\]

Fig. 8 shows the model prediction and experimental observations of the effects of the wafer pressure and platen speed.
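To give a feel for why the chemical term could be dropped, the sketch below evaluates the fitted Cu model at one illustrative operating point and compares it with the measured 14 Å/min etching rate. The operating point (3.5 psi, 30 rpm) simply combines the two held values quoted in Section 3.2.2, and the units (psi, rpm, Å/min) and function names are assumptions, not statements from the paper.

```python
import math


def q_tail(x):
    """Upper tail 1 - Phi(x) of the standard normal distribution."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))


def cu_removal_rate(p0_psi, v_rpm):
    """Fitted Cu model from the text, without the chemical term RR_C."""
    return 30503.0 * q_tail(3.0 - 0.0113 * p0_psi ** (1.0 / 3.0)) * math.sqrt(p0_psi) * v_rpm


ETCH_RATE = 14.0  # measured chemical etching rate quoted in the text, A/min

mechanical = cu_removal_rate(3.5, 30.0)  # illustrative operating point (assumed)
etch_fraction = ETCH_RATE / mechanical
print(f"mechanical removal: {mechanical:.0f} A/min, etch fraction: {etch_fraction:.2%}")
```

Under these assumptions the etching rate is well below one percent of the predicted mechanical removal rate, which is in line with the decision to omit RR_C.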
Fig. 7. Structure of the Cu blanket wafer: Cu (20,000 Å)/Ta (500 Å)/SiO2 (1000 Å) on a p-type dummy wafer.
3. Simulation via dynamic programming
3.1. Design
Following Larson and Casti [15], the procedure of dynamic programming can be summarized in three steps. First, calculate the equivalent discrete-time state equation that describes the CMP process; second, determine the cost function in order to compute the minimum cost and find the optimal control inputs; finally, carry out the iterative computation. We used a very simple concept to obtain the state-space model of the removal rate for our simulation: the rate of change of the thickness being polished is set by the removal rate, and we took the removal rate as the input. The equation is written as

\[
\dot{h} = \frac{dh}{dt} = -RR = -u
\]

where h is the thickness, RR the removal rate, and u the input. We assumed that there were only seven values of the input (including 0) because of restrictions on the Westech 372M CMP tool, and the sampling period T was fixed at 1 s. The cost function L was determined as

\[
L = \frac{1}{2} q\, h^2(k) + \frac{1}{2} r\, u^2(k) + \frac{1}{2} s\, h_N^2
\]

where k is the stage, N is the final stage, q is the weighting factor of the transient state, r is the weighting factor of the input, and s is the weighting factor of the final state. The weighting factor of the transient state is the penalty on the speed of approach to the final state. The weighting factor of the admissible input is the penalty on the translational wafer pressure and rotational torque supplied to the CMP tool. The weighting factor s of the final state is the penalty on the final state, which describes the extent of under-polishing. In this study, we want the admissible input to decrease slowly, following the concept of soft landing operation proposed by Chiu et al. [10]. The values of these weighting factors were determined by iterative trial and error to obtain a successively decreasing pattern of admissible inputs that also minimizes the cost function L.
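The three steps above can be sketched as a backward recursion on a small grid. This is a minimal illustration, not the authors' implementation: it assumes the remaining thickness evolves as h[k+1] = h[k] - u[k] with T = 1 stage, that the total cost is the sum of the stage terms over k plus the terminal term, and that the state lives on an integer grid. The seven input values, the grid size, the horizon, and the names `solve_dp` and `rollout` are hypothetical.

```python
def stage_cost(h, u, q, r):
    """Running cost 1/2*q*h^2 + 1/2*r*u^2 for one stage."""
    return 0.5 * q * h * h + 0.5 * r * u * u


def solve_dp(h0, inputs, n_stages, q, r, s):
    """Backward recursion for h[k+1] = h[k] - u[k] on the integer grid 0..h0.

    Returns the minimum total cost from h0 and a table policy[k][h] of
    optimal inputs. Inputs that would drive h below zero are not admissible,
    so `inputs` must contain 0.
    """
    cost_to_go = [0.5 * s * h * h for h in range(h0 + 1)]  # terminal cost
    policy = []
    for _ in range(n_stages):  # stages N-1, N-2, ..., 0
        new_cost, best_inputs = [], []
        for h in range(h0 + 1):
            cost, u = min(
                (stage_cost(h, u, q, r) + cost_to_go[h - u], u)
                for u in inputs
                if u <= h
            )
            new_cost.append(cost)
            best_inputs.append(u)
        cost_to_go = new_cost
        policy.append(best_inputs)
    policy.reverse()  # policy[0] now belongs to stage 0
    return cost_to_go[h0], policy


def rollout(h0, policy):
    """Apply the optimal policy from h0; return the input schedule and final h."""
    h, schedule = h0, []
    for table in policy:
        u = table[h]
        schedule.append(u)
        h -= u
    return schedule, h


# Hypothetical instance: seven admissible inputs (including 0), 30 stages.
INPUTS = [0, 1, 2, 3, 4, 5, 6]
cost, policy = solve_dp(h0=60, inputs=INPUTS, n_stages=30, q=1.0, r=100.0, s=10000.0)
schedule, h_final = rollout(60, policy)
print(schedule, h_final)
```

Changing the ratio of r to q and s shifts the schedule between aggressive early removal and a gentler approach, which is the knob the trial-and-error tuning of the weighting factors adjusts.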
3.2. Simulation results
3.2.1. SiO2 wafer CMP process
In Fig. 6, the error bars represent the non-planarization index (the standard deviation of the removal rates at the nine points on the wafer). The smaller the non-planarization index, the more uniform the removal rate over the entire wafer and the flatter the resulting surface. Decreasing the wafer pressure or platen speed decreases the non-planarization index. In order to reduce the wafer pressure or platen speed successively, we chose the weighting factors of the final state, transient state and input to be 10000, 1 and 100000, respectively, for the case of wafer pressure, and 10000, 1 and 700 for the case of platen speed. For the case of wafer pressure, the platen speed was held at 30 rpm. The initial thickness h(0) was 6000 Å, and the result is shown in Fig. 9. According to the result, the process terminated at the 363rd stage, and the input is 7 psi during stages 0–177, 4 psi during stages 178–225, 2.5 psi during stages 226–272 and 2 psi during stages 273–362. This was the basis of our SiO2 wafer experiment on wafer pressure. For the case of platen speed, the wafer pressure was held at 4 psi, and the result is also shown in Fig. 9.
Fig. 9. Simulation results for the SiO2 blanket wafer with the wafer pressure and the platen speed as the variable input separately (1 stage = 1 s).
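As a rough consistency check, the wafer-pressure schedule quoted above can be run through the fitted SiO2 model of Section 2.1 to see how much thickness it predicts. This is a sketch under stated assumptions: pressures in psi, speed in rpm, removal rate in Å/min, one stage lasting 1 s, and stage ranges taken as inclusive; the function names are illustrative.

```python
import math


def q_tail(x):
    """Upper tail 1 - Phi(x) of the standard normal distribution."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))


def sio2_removal_rate(p0_psi, v_rpm):
    """Fitted SiO2 model from Section 2.1 (chemical term neglected)."""
    return 8257.0 * q_tail(3.0 - 0.322 * p0_psi ** (1.0 / 3.0)) * math.sqrt(p0_psi) * v_rpm ** 0.6


# (wafer pressure in psi, first stage, last stage), platen speed held at 30 rpm
SCHEDULE = [(7.0, 0, 177), (4.0, 178, 225), (2.5, 226, 272), (2.0, 273, 362)]

removed = 0.0
for p0, first, last in SCHEDULE:
    stages = last - first + 1
    removed += sio2_removal_rate(p0, 30.0) / 60.0 * stages  # A/min -> A per 1 s stage

total_stages = sum(last - first + 1 for _, first, last in SCHEDULE)
print(f"{total_stages} stages, predicted removal {removed:.0f} A")
```

Under these assumptions the predicted removal comes out close to the 6000 Å initial thickness, which supports reading the schedule as the model-based plan for removing h(0).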
At the beginning of the CMP process, we wanted the removal rate to be high enough to reduce the process time, while still obtaining a flatter surface at the end of the process. According to Chen's thesis [16], the platen speed has a great influence on the non-uniformity. For these reasons, we applied both inputs. We chose the weighting factors of the final state, transient state, wafer pressure and platen speed to be 10000, 1, 50000 and 1000, respectively. The result of dynamic programming is shown in Fig. 10. The platen speed is decreased first, when the SiO2 film thickness is about 2000 Å, in order to get a better non-planarization index, followed by the wafer pressure. (The definition of the non-planarization index is given in Section 4.)
Fig. 11. Simulation results for the Cu blanket wafer with the wafer pressure and the platen speed as the variable input separately (1 stage = 1 s).
3.2.2. Cu wafer CMP process
In order to decrease the wafer pressure or platen speed in the same manner as for the SiO2 wafer, we chose the weighting factors of the final state, transient state and input to be 10000, 1 and 40000, respectively, for the case of wafer pressure, and 10000, 1 and 1200 for the case of platen speed. For the case of wafer pressure, the platen speed was held at 30 rpm. For the case of platen speed, the wafer pressure was held at 3.5 psi. The initial thickness h(0) was 6000 Å, and the result is shown in Fig. 11.
Since dishing and erosion are proportional to the pressure and platen speed [17], we expected that decreasing the wafer pressure and platen speed would reduce the dishing and erosion of the Cu wafer and also give a better non-planarization index at the end of the process. Nevertheless, only the Cu blanket wafer experiment was done, owing to the lack of Cu pattern wafers. We chose the weighting factors of the final state, transient state, wafer pressure and platen speed to be 10000, 1, 20000 and 1200, respectively. The result of dynamic programming is shown in Fig. 12.
4. Experiment for SiO2 and Cu blanket wafer
The experiments were carried out on the IPEC 372M CMP polisher. The slurry for the SiO2 wafer CMP process was prepared by diluting the commercial slurry, Cabot SS-25, with DI water in a ratio of one to one. The copper wafers were polished with a slurry of 2 wt% alumina abrasive, 2 vol% HNO3 and 0.01 M citric acid.
The CMP removal rates were monitored at nine points along two perpendicular diameters of the wafer, and the within-wafer non-planarization index is defined as:

\[
\text{Non-planarization index} = \left( \frac{1}{n-1} \sum_{i=1}^{n} \left(x_i - \bar{x}\right)^2 \right)^{1/2}
\]

where x_i is the removed thickness at point i and \bar{x} is the average removed thickness.
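The definition above is the sample standard deviation of the removed thickness over the measurement points. A minimal sketch follows; the nine values are hypothetical placeholders, not measured data, and the function name is illustrative.

```python
import math


def non_planarization_index(removed):
    """Sample standard deviation (n - 1 in the denominator) of removed thickness."""
    n = len(removed)
    mean = sum(removed) / n
    return math.sqrt(sum((x - mean) ** 2 for x in removed) / (n - 1))


# Hypothetical nine-point removed thicknesses in angstroms (not measured data).
sample = [5900, 6050, 6100, 5950, 6000, 6200, 5800, 6020, 5980]
npi = non_planarization_index(sample)
print(f"NPI = {npi:.1f} A")
```

The same number can be obtained from `statistics.stdev` in the Python standard library, which also uses the n - 1 denominator.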
4.1. SiO2 wafer CMP experiment
The experiments with dynamic programming were compared with experiments that removed the same thickness over the same polishing duration with constant removal rate operation. Since the removed thickness and the polishing duration were known, we could compute the value of the constant removal rate. The base recipe process parameters for the constant removal rate were found through experiment and are listed in Tables 1 and 2. The experimental results are listed in Table 3.
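The constant removal rate for each base recipe follows directly from the target thickness and the polishing duration. The short sketch below reproduces that arithmetic for the 6000 Å target and the six durations quoted in Sections 4.1 and 4.2; the function and variable names are illustrative.

```python
def required_removal_rate(thickness_angstrom, duration_s):
    """Constant removal rate (A/min) that removes the thickness in the given time."""
    return thickness_angstrom / duration_s * 60.0


# Polishing durations in seconds, as quoted in Sections 4.1 and 4.2.
DURATIONS_S = {"SiO2": [363, 334, 237], "Cu": [118, 85, 60]}

rates = {
    wafer: [round(required_removal_rate(6000, t)) for t in times]
    for wafer, times in DURATIONS_S.items()
}
print(rates)  # SiO2: 992, 1078, 1519 A/min; Cu: 3051, 4235, 6000 A/min
```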
4.1.1. Change wafer pressure as the admissible input
The duration of polishing was 363 s and the removed thickness was 6000 Å. The required removal rate was 992 Å/min. Because the platen speed was fixed here, the required removal rate was obtained by changing the wafer pressure, which was found to be 4.3 psi. From the experimental results listed in Table 3, the constant removal rate operation mode has the smaller thickness removal error, but the dynamic programming operation mode has a 39% better non-planarization index. The model prediction error at the lower wafer pressures caused the inaccuracy in thickness removal. It could be reduced by developing a more accurate model.
4.1.2. Change platen speed as the admissible input
The duration of polishing was 334 s and the removed thickness was 6000 Å. The required removal rate was 1078 Å/min. Because the wafer pressure was fixed here, the required removal rate was obtained by changing the platen speed, which was found to be 40 rpm. Table 3 reveals a phenomenon similar to that in the wafer pressure case. The dynamic programming operation mode has a 26% better non-planarization index. The thickness removed exceeded 6000 Å for dynamic programming because the model prediction was lower than the experimental data at the higher platen speeds.
Table 1
Process parameters of SiO2 CMP experiment (constant removal rate base recipe). Base recipe values are given for three cases: (A) wafer pressure as variable input in Fig. 9; (B) platen speed as variable input in Fig. 9; (C) both wafer pressure and platen speed as variable inputs in Fig. 10.
Fixed parameter       (A)            (B)            (C)
Wafer pressure        4.3 psi        4.0 psi        4.7 psi
Back pressure         1 psi          1 psi          1 psi
Carrier speed         30 rpm         40 rpm         52 rpm
Platen speed          30 rpm         40 rpm         52 rpm
Polish time           363 s          334 s          237 s
Slurry flow rate      150 ml/min     150 ml/min     150 ml/min
Pre-wet pad speed     28 rpm         28 rpm         28 rpm
Pre-wet duration      10 s           10 s           10 s
Pre-wet flow rate     300 ml/min     300 ml/min     300 ml/min
Pad                   Rodel IC1400   Rodel IC1400   Rodel IC1400
Table 2
Slurry formulation of SiO2 CMP
Species                          Concentration
Commercial slurry Cabot SS-25    50 vol%
4.1.3. Change wafer pressure and platen speed as the admissible inputs simultaneously
The duration of polishing was 237 s and the removed thickness was 6000 Å. The required removal rate was 1519 Å/min. In order to avoid one-sided emphasis on either of these two inputs, we increased the wafer pressure and platen speed simultaneously. The final values were found to be 4.7 psi and 52 rpm, respectively. The thickness removal error improved slightly, which might be caused by the combination of complementary model prediction errors from the platen speed model and the wafer pressure model. Nevertheless, the dynamic programming operation still had a 16% better non-planarization index.
Table 3
SiO2 wafer experimental results with three kinds of admissible inputs via dynamic programming operation (NPI: non-planarization index). The values in brackets belong to the constant removal rate operation.
Admissible input   Avg. removed thickness (Å)   Thickness removal error from 6000 Å (%)   NPI (Å)       NPI improvement (%)
Wafer pressure     5499 (5899)                  8.35 (1.68)                               435 (717)     39
Platen speed       6113 (5914)                  1.88 (1.43)                               518 (697)     26
Simultaneously     5875 (5957)                  2.08 (0.72)                               487 (580)     16
4.2. Cu wafer CMP experiment
We first explain how the removal rate due to chemical reaction was determined. The experiment was carried out in a circular glass container. Because the slurry would otherwise not be well stirred, we put a magnet at the bottom of the container to produce a flow of the slurry, in order to keep the reaction rate at the surface stable. The stirring also kept the abrasive suspended in the slurry and brought the experiment closer to the real polishing conditions. The Cu surface of the wafer faced the magnet and was kept at a distance from it. We immersed the wafer in the slurry for 5 min and assumed that the concentration of the slurry was constant during the chemical reaction. The base recipe process parameters for the constant removal rate were found through experiment and are listed in Tables 4 and 5. The experimental results are listed in Table 6.
The same comparison between dynamic programming and constant removal rate operation was made as for the SiO2 wafer CMP experiment. The parameters for the constant removal rate were also found through experiment.
Table 4
Process parameters of Cu CMP experiment (constant removal rate base recipe). Base recipe values are given for three cases: (A) wafer pressure as variable input in Fig. 11; (B) platen speed as variable input in Fig. 11; (C) both wafer pressure and platen speed as variable inputs in Fig. 12.
Fixed parameter       (A)            (B)            (C)
Back pressure         1 psi          1 psi          1 psi
Wafer pressure        5.1 psi        3.5 psi        4.9 psi
Carrier speed         30 rpm         50 rpm         57 rpm
Platen speed          30 rpm         50 rpm         57 rpm
Polish time           118 s          85 s           60 s
Slurry flow rate      150 ml/min     150 ml/min     150 ml/min
Pre-wet pad speed     28 rpm         28 rpm         28 rpm
Pre-wet duration      10 s           10 s           10 s
Pre-wet flow rate     300 ml/min     300 ml/min     300 ml/min
Pad                   Rodel IC1400   Rodel IC1400   Rodel IC1400
Table 5
Slurry formulation of Cu CMP
Species                             Concentration
Abrasive: Al2O3 (EXTEC 0.1 μm)      2 wt%
Oxidizer: HNO3                      2 vol%
Complex agent: citric acid          0.01 M
Dilution: DI water                  Remaining balance of slurry
Table 6
Cu wafer experimental results with three kinds of admissible inputs via dynamic programming operation (NPI: non-planarization index)
Admissible input   Avg. removed thickness (Å)   Thickness removal error from 6000 Å (%)   NPI (Å)        NPI improvement (%)
Wafer pressure     7491 (6721)                  24.85 (12.02)                             1204 (1225)    2
Platen speed       5406 (5948)                  9.9 (0.87)                                615 (806)      24
Simultaneously     6814 (5772)                  13.57 (3.8)                               1034 (1153)    10
4.2.1. Change wafer pressure as the admissible input
The duration of polishing was 118 s and the removed thickness was 6000 Å. The required removal rate was 3051 Å/min. Because the platen speed was fixed here, the required removal rate was obtained by changing the wafer pressure, which was found to be 5.1 psi. From the experimental results listed in Table 6, the constant removal rate operation has the smaller thickness removal error, but the dynamic programming operation has a slightly better non-planarization index. However, the difference in non-planarization index between dynamic programming operation and constant removal rate operation is very small and can be considered to be within the statistical error of the experimental data. It also shows that the wafer pressure is not a major factor influencing the non-planarization index in Cu wafer CMP. As shown in Fig. 8, the non-planarization index data at 3 psi and 5 psi are within statistical error of each other. This explains why there is no degradation in non-planarization index when the wafer pressure is decreased from 5.1 psi to 3 psi, i.e., when the Cu wafer CMP process is changed from constant removal rate operation mode to dynamic programming operation mode. The inaccuracy of the removed thickness may be caused by the lower predicted value of the removal rate at the smaller wafer pressures. This means that the model for the Cu wafer CMP process needs to be modified.
4.2.2. Change platen speed as the admissible input
The duration of polishing was 85 s and the removed thickness was 6000 Å. The required removal rate was 4235 Å/min. Because the wafer pressure was fixed here, the required removal rate was obtained by changing the platen speed, which was found to be 50 rpm. Table 6 shows that the dynamic programming operation is 24% better than the constant removal rate operation in terms of non-planarization index. The error in the removed thickness may be caused by a higher predicted removal rate at faster platen speeds and a lower predicted removal rate at slower platen speeds; there was not enough time for the higher removal rate at faster platen speed to compensate for the lower removal rate at slower platen speed. Therefore, the thickness removed in dynamic programming operation mode is less than that in constant removal rate operation mode. However, dynamic programming operation of the platen speed gave a significant improvement in non-planarization index. This shows that platen speed is a major parameter affecting the non-planarization index of the Cu wafer CMP process.
4.2.3. Change wafer pressure and platen speed as the admissible inputs simultaneously
The duration of polishing was 60 s and the removed thickness was 6000 Å. The required constant removal rate was 6000 Å/min. We simultaneously increased the wafer pressure and platen speed to 4.9 psi and 57 rpm, respectively. The dynamic programming operation still provides a 10% improvement in non-planarization index over constant removal rate operation.
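The percentages in Tables 3 and 6 can be reproduced from the removed thicknesses and non-planarization indexes. The sketch below assumes the thickness removal error is the absolute deviation from the 6000 Å target and that the improvement is measured relative to the constant removal rate value; the table headers suggest this, but the formulas are not written out in the paper. Function names are illustrative.

```python
TARGET = 6000.0  # target removed thickness, angstroms


def thickness_error_pct(removed):
    """Absolute deviation from the target, in percent of the target."""
    return abs(removed - TARGET) / TARGET * 100.0


def npi_improvement_pct(npi_dp, npi_const):
    """Reduction of the non-planarization index relative to constant-rate operation."""
    return (npi_const - npi_dp) / npi_const * 100.0


# (case, DP removed, constant-rate removed, DP NPI, constant-rate NPI), Tables 3 and 6
ROWS = [
    ("SiO2, wafer pressure", 5499, 5899, 435, 717),
    ("SiO2, platen speed", 6113, 5914, 518, 697),
    ("SiO2, simultaneously", 5875, 5957, 487, 580),
    ("Cu, wafer pressure", 7491, 6721, 1204, 1225),
    ("Cu, platen speed", 5406, 5948, 615, 806),
    ("Cu, simultaneously", 6814, 5772, 1034, 1153),
]

for name, dp_h, const_h, dp_npi, const_npi in ROWS:
    print(
        f"{name}: error {thickness_error_pct(dp_h):.2f}% "
        f"({thickness_error_pct(const_h):.2f}%), "
        f"NPI improvement {npi_improvement_pct(dp_npi, const_npi):.0f}%"
    )
```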
4.3. Discussion
Three cases of the CMP process, using wafer pressure, platen speed, and both wafer pressure and platen speed as admissible inputs, were examined in this study. In the SiO2 wafer CMP experiment, the non-planarization indexes of all three cases were improved and the errors in the removed thickness were at most 8.35% (Table 3). The model predicted the removal rate well. This illustrates that multi-step dynamic programming of SiO2 wafer CMP is feasible to implement. It could be further verified by experiment on the IPEC 372M CMP tool.
Slurry chemicals play an important role in the Cu wafer CMP process. The formation of a non-native passivation layer by the passivating chemical (e.g., citric acid) in the slurry and the dissolution of Cu or of the materials abraded from the surface layer by the abrasives are all determined by the chemical environment in the slurry [18]. In the Cu wafer CMP experiment, the results show that the platen speed is the main factor influencing the non-planarization index. When the platen speed was varied, the non-planarization index improved by at least 10%. The higher the platen speed, the faster the refresh rate of the slurry underneath the wafer and the larger the removal rate. A higher platen speed may also cause less uniform slurry transport across the wafer and thereby affect the non-uniformity of the removal rate. For this reason, the interactions between mechanical and chemical parameters need to be investigated more thoroughly. This also indicates that the current model is not sufficient to describe the entire Cu wafer CMP process. The errors in thickness removal with dynamic programming operation fall between about 10% and 25% (Table 6). There are two ways to reduce the error in thickness removal. The first is to modify the model and make it more comprehensive by including the interaction between mechanical polishing and chemical reaction. The second is to obtain more removal rate data by experiment, corresponding to every value of the admissible input, in order to fit a higher-degree polynomial regression model.
In the case of both SiO2 and Cu CMP, Tables 3 and 6 show that dynamic programming operation performs better than constant removal rate operation in terms of minimizing the non-planarization index.
From the removal rate model for the SiO2 wafer in Fig. 6, the standard deviation (or NPI) of the removal rate is smaller for lower wafer pressure and platen speed. This means that lower wafer pressures and platen speeds generate a smaller NPI, i.e., better planarization results, as reported in Table 3 for the SiO2 wafer. The relative improvement in the standard deviation of the removal rate (or relative improvement in NPI) from higher to lower wafer pressures and platen speeds is obvious in Fig. 6 for the SiO2 wafer.
From the removal rate model for the Cu wafer in Fig. 8, the standard deviation (or NPI) of the removal rate is smaller for lower wafer pressure and platen speed. This means that lower wafer pressures and platen speeds generate a smaller NPI, i.e., better planarization results, as reported in Table 6 for the Cu wafer. The relative improvement in the standard deviation of the removal rate (or relative improvement in NPI) from higher to lower wafer pressures and platen speeds is obvious in Fig. 8 for the Cu wafer.
The relative trend of improvement of the standard deviation (or NPI) for the SiO2 wafer in Fig. 6 is more pronounced than that for the Cu wafer in Fig. 8. This predicts that the benefit of dynamic programming operation for NPI improvement will also be more pronounced for the SiO2 wafer than for the Cu wafer.
The standard deviation of the removal rate data for the SiO2 wafer in Fig. 6 is smaller than that for the Cu wafer in Fig. 8. This explains why the percentage thickness removal error for the SiO2 wafer in Table 3 is smaller than that for the Cu wafer in Table 6: the numerator in the percentage computation is smaller. It also explains why the percentage NPI improvement for the SiO2 wafer in Table 3 is larger than that for the Cu wafer in Table 6: the denominator in the percentage calculation is smaller.
To get better performance in both global planarization and thickness removal, we recommend that the planarization step and the overpolish step in SiO2 and Cu CMP use different modes of operation: dynamic programming operation during the planarization step to minimize the non-planarization index, and constant removal rate operation during the overpolish step to minimize the thickness removal error. The incremental time for the overpolish step can be calculated from the thickness error and the removal rate derived from Luo's removal rate model, based on the constant wafer pressure and platen speed at the end of the planarization step.
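The incremental time calculation can be sketched as a division of the remaining thickness error by the constant removal rate. This is a minimal illustration only: the function name `overpolish_time`, the numbers, and the unit choice (angstroms and seconds) are assumptions, and the removal rate would in practice come from the removal rate model rather than being typed in by hand.

```python
def overpolish_time(target_removal, actual_removal, removal_rate):
    """Extra polish time (s) needed to remove the remaining thickness.

    target_removal, actual_removal: thickness removed, e.g. in angstroms
    removal_rate: constant removal rate in the same thickness unit per second
    """
    if removal_rate <= 0:
        raise ValueError("removal rate must be positive")
    thickness_error = target_removal - actual_removal
    # A negative error means the target is already exceeded: no extra time.
    return max(thickness_error, 0.0) / removal_rate


# Hypothetical numbers, for illustration only.
extra = overpolish_time(target_removal=6000.0, actual_removal=5400.0, removal_rate=30.0)
print(extra)  # 20.0
```

The sketch assumes the removal rate stays constant over the overpolish step, which is the premise of constant removal rate operation.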
5. Conclusions and future work

5.1. Conclusions
In this study, we focused on the mechanical parameters of the CMP process. The wafer pressure and platen speed were taken as the control parameters. We applied dynamic programming as the control method in CMP experiments on blanket SiO2 and Cu wafers. The influence of dynamic programming operation and of constant removal rate operation on the non-planarization index of the CMP process was compared. We arrived at the following conclusions:
(1) The non-planarization index is improved consistently by dynamic programming operation compared with constant removal rate operation. The dynamic programming operation gives a 2% to 39% improvement over the base recipe of constant removal rate in all experiments, as shown in Tables 3 and 6.
(2) The thickness removal error is consistently smaller with constant removal rate operation than with dynamic programming operation in all experiments, as shown in Tables 3 and 6.
(3) To get the best performance in both planarization and thickness removal, it is recommended that the planarization step and the overpolish step in SiO2 and Cu CMP use different modes of operation: dynamic programming operation during the planarization step to minimize the non-planarization index, and constant removal rate operation during the overpolish step to minimize the thickness removal error. The incremental time for eliminating the thickness removal error during the overpolish step can be calculated from the thickness error and the removal rate derived from Luo's removal rate model, based on the constant wafer pressure and platen speed at the end of the planarization step.
(4) The platen speed is the more consistent factor influencing the non-planarization index (about 25% improvement over the base recipe of constant removal rate) during the planarization step under dynamic programming operation, as shown in Tables 3 and 6. The thickness removal error (about 1%) is also at its minimum in the overpolish step under constant removal rate operation with constant platen speed and wafer pressure, as shown in the third row of Tables 3 and 6.
(5) In SiO2 CMP, dynamic programming of platen speed during the planarization step is followed by constant removal rate operation during the overpolish step using a platen speed of 20 rpm and a wafer pressure of 4.0 psi.
(6) In Cu CMP, dynamic programming of platen speed during the planarization step is followed by constant removal rate operation during the overpolish step using a platen speed of 30 rpm and a wafer pressure of 3.5 psi.
(7) The recommended best known method (BKM) for CMP planarization is to use dynamic programming operation of platen speed for coarse control of the non-planarization index during the planarization step, and then to use constant removal rate operation, with the constant platen speed and wafer pressure at the end of the planarization step, for fine control of the thickness removal error during the overpolish step.
5.2. Future work
Experimental verification of dynamic programming operation needs to be carried out on different kinds of patterned wafers in the future. The influence on copper dishing and oxide erosion can then be compared between dynamic programming and constant removal rate operation. It is conjectured that using dynamic programming operation in the planarization step and constant removal rate operation in the overpolish step, with minimum platen speed and/or wafer pressure, will minimize copper dishing and oxide erosion, according to the experimental results in Fig. 2a and Fig. 2b reported by K. Wijekoon, S. Tsai et al. [17].

Appendix A. Optimal control design: dynamic programming

Sociological, economic, and physical pressures in all areas of modern life have generated an accelerated demand for high-level decision-making based upon limited information about the processes being controlled. In the 1950s, a systematic and concerted mathematical study of such decision-making situations was initiated by Richard Bellman. This pioneering work was based upon the fundamental system-theoretic notion of feedback, i.e., that decision rules should be based upon the current (and perhaps past) states of the process under study. Bellman and his colleagues continued to develop the feedback decision-making concept under the name of "dynamic programming". At that time, the majority of problems of true practical concern were computationally intractable due to the limited state of the computing art. Since then, rapid progress in computer technology, coupled with the development of refined computational procedures, has made it practical to solve a wide variety of problems in economics, engineering, operations research, and mathematics itself.
A.1. Bellman’s principle of optimality
The fundamental concept of dynamic programming, originated by Bellman, is called the principle of optimality. Conceptually, the principle may be stated as follows: given an optimal trajectory from point A to point C, the portion of the trajectory from any intermediate point B to point C must be the optimal trajectory from B to C. In Fig. A.1, if the path I–II is the optimal path from A to C, then according to the principle of optimality path II is the optimal path from B to C. The proof by contradiction is immediate: assume that some other path, such as II′, is the optimal path from B to C; then path I–II′ has less cost than path I–II. However, this contradicts the fact that I–II is the optimal path from A to C, and hence II must be the optimal path from B to C.

Fig. A.1. Illustration of the principle of optimality.
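The principle can be checked numerically on a small example. The sketch below uses a hypothetical acyclic graph (node names and edge costs are illustrative and not taken from Fig. A.1) and verifies that the tail of the optimal path from A to C, starting at the intermediate node B, is itself the optimal path from B to C.

```python
# Hypothetical directed acyclic graph: EDGES[node][successor] = edge cost.
EDGES = {
    "A": {"B": 2, "D": 5},
    "B": {"C": 4, "D": 1},
    "D": {"C": 2},
    "C": {},
}


def best_path(node, goal):
    """Return (cost, path) of the cheapest path from node to goal."""
    if node == goal:
        return 0, [node]
    candidates = []
    for nxt, edge_cost in EDGES[node].items():
        tail_cost, tail_path = best_path(nxt, goal)
        candidates.append((edge_cost + tail_cost, [node] + tail_path))
    return min(candidates)


cost_ac, path_ac = best_path("A", "C")
cost_bc, path_bc = best_path("B", "C")
print(cost_ac, path_ac)          # 5 ['A', 'B', 'D', 'C']
print(path_ac[1:] == path_bc)    # True
```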
A.2. Dynamic programming
Consider a quantized state $x \in X$ at stage $(N-1)$. At this state, each of the admissible decisions $u^{(m)} \in U$ is applied, where

\[ X = [x_1\ x_2\ \ldots\ x_{n-1}\ x_n], \qquad U = [u_1\ u_2\ \ldots\ u_{M-1}\ u_M]. \]
For each of these decisions, the cost at the current stage can be determined as

\[ L^{(m)} = L[x, u^{(m)}, N-1] \qquad (m = 1, 2, \ldots, M). \]
Next, for each of these decisions, the next state at stage $N$ is determined from the system equation,

\[ x^{(m)}(N) = g[x, u^{(m)}, N-1] \qquad (m = 1, 2, \ldots, M). \]
The next step is to compute the minimum cost at stage $N$ for each of the states $x^{(m)}$. However, in general, a particular state $x^{(m)}$ will not lie on one of the quantized states $x \in X$ at which the optimal cost $I(x, N)$ is defined. In fact, it may lie outside the range of admissible states. In the latter case the decision is rejected as a candidate for the optimal decision for this state and stage. If a next state $x^{(m)}$ does fall within the range of allowable states, but not on a quantized value, then it is necessary to use some type of interpolation procedure to compute the minimum cost function at these points.
Assume that the values of the minimum cost at the states $x^{(m)}$ can be expressed as a function of the values of the optimal cost at the quantized states $x \in X$:

\[ I[x^{(m)}, N] = P[x^{(m)}, N, I(x, N)], \quad \text{all } x \in X, \]

where $I(x, N) = L(x, N)$ if, as is often the case, no decision is made at the final stage $k = N$, so that the cost function at $N$ depends only on the final state $x(N)$.
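One possible choice for the interpolation procedure $P$ is linear interpolation between the two neighboring quantized states. The sketch below is an assumption for illustration (the text only requires "some type of interpolation procedure"); the function name `interpolate_cost` and the sample grid are hypothetical. It also returns `None` for next states outside the admissible range, so that the corresponding decision can be rejected.

```python
def interpolate_cost(x_next, grid, cost):
    """Linearly interpolate the cost-to-go at x_next.

    grid: ascending list of quantized states
    cost: cost[i] is the optimal cost I(grid[i], N)
    Returns None when x_next lies outside the admissible range.
    """
    if x_next < grid[0] or x_next > grid[-1]:
        return None
    for i in range(len(grid) - 1):
        lo, hi = grid[i], grid[i + 1]
        if lo <= x_next <= hi:
            w = (x_next - lo) / (hi - lo)
            return (1 - w) * cost[i] + w * cost[i + 1]
    return cost[-1]  # reached only for a single-point grid


# Hypothetical grid with a quadratic final cost 0.5 * x**2.
grid = [0.0, 1.0, 2.0]
terminal = [0.5 * x * x for x in grid]
print(interpolate_cost(1.5, grid, terminal))  # 1.25
print(interpolate_cost(2.5, grid, terminal))  # None
```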
The total cost of applying decision $u^{(m)}$ at state $x$, stage $(N-1)$, can then be written as

\[ F_1^{(m)} = L[x, u^{(m)}, N-1] + I[x^{(m)}, N]. \]
The minimization can be achieved by simply comparing the $M$ quantities. According to the functional equation, the minimum value is the minimum cost at state $x$, stage $(N-1)$:

\[ I[x, N-1] = \min_{u^{(m)} \in U} \left\{ L[x, u^{(m)}, N-1] + I[x^{(m)}, N] \right\}. \tag{A.1} \]

The optimal decision at this state and stage, $\hat{u}[x, N-1]$, is the control $u^{(m)}$ for which the minimum in Eq. (A.1) is actually attained.
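The comparison of the $M$ candidate costs in Eq. (A.1) can be sketched as a single loop over the admissible decisions. The helper names (`stage_minimum`, `stage_cost`, `next_state`, `terminal_cost`) and the toy cost functions are hypothetical; in this toy case the next states fall on integer grid points, so no interpolation is needed.

```python
def stage_minimum(x, controls, stage_cost, next_state, cost_to_go):
    """Return (I[x, N-1], u_hat[x, N-1]) by comparing the candidate costs."""
    best_cost, best_u = None, None
    for u in controls:
        tail = cost_to_go(next_state(x, u))
        if tail is None:  # next state not admissible: reject this decision
            continue
        total = stage_cost(x, u) + tail
        if best_cost is None or total < best_cost:
            best_cost, best_u = total, u
    return best_cost, best_u


# Hypothetical toy problem on the integer states 0..5.
def stage_cost(x, u):
    return 0.5 * u * u


def next_state(x, u):
    return x - u


def terminal_cost(x):
    return float(x * x) if 0 <= x <= 5 else None


print(stage_minimum(3, [0, 1, 2], stage_cost, next_state, terminal_cost))  # (3.0, 2)
```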
This procedure is repeated at each quantized state $x \in X$ at stage $(N-1)$. When this has been done, $I(x, N-1)$ and $\hat{u}[x, N-1]$ are known for all $x \in X$. It is now possible to compute $I(x, N-2)$ and $\hat{u}[x, N-2]$ for all $x \in X$ based on knowledge of $I(x, N-1)$.

Fig. A.3. A controller based on retrieving the results of the dynamic programming computation from memory. [Block diagram with blocks MEMORY, CONTROLLER and SYSTEM; signals $x, k$ and $\hat{u}(x, k)$.]
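[Flowchart of the dynamic programming computation. Initialization: compute $I(h^{(j)}, N)$, $j = 1, 2, \ldots, j_N$. Start with stage $k = N-1$, state $j = 1$ and control $m = 1$. Compute the next state $h^{(m)} = g(h^{(j)}, u^{(m)}, k)$; interpolate in the $n$ state variables to find $I(h^{(m)}, k+1)$ in terms of the values $I(h^{(j)}, k+1)$, $j = 1, 2, \ldots, j_N$; set $F(h^{(j)}, u^{(m)}, k) = L(h^{(j)}, u^{(m)}, k) + I(h^{(m)}, k+1)$. If $m \neq m_N$, set $m = m+1$ and repeat. Otherwise set $I(h^{(j)}, k) = \min_{m = 1, 2, \ldots, m_N} F(h^{(j)}, u^{(m)}, k)$ and $\hat{u}(h^{(j)}, k) = u^{(\hat{m})}$ with $\hat{m} = \arg\min_{m = 1, 2, \ldots, m_N} F(h^{(j)}, u^{(m)}, k)$. If $j \neq j_N$, set $j = j+1$ and repeat; otherwise, if $k \neq 0$, set $k = k-1$ and repeat; if $k = 0$, stop.]

The general iterative procedure continues this process.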
Suppose that $I(x, k+1)$ is known for all $x \in X$. Then $I(x, k)$ and $\hat{u}[x, k]$ are computed for all $x \in X$ from

\[ I[x, k] = \min_{u^{(m)} \in U} \left\{ L[x, u^{(m)}, k] + I[x^{(m)}, k+1] \right\}, \tag{A.2} \]

where $x^{(m)}$ is determined from

\[ x^{(m)} = g[x, u^{(m)}, k] \]

and where $I(x^{(m)}, k+1)$ is computed by interpolation on the known values $I(x, k+1)$ for all $x \in X$:

\[ I[x^{(m)}, k+1] = P[x^{(m)}, k+1, I(x, k+1)], \quad \text{all } x \in X. \]
The optimal decision $\hat{u}[x, k]$ is the decision for which Eq. (A.2) takes on the minimum. The iterative procedure begins by computing $\hat{u}[x, N-1]$ and $I(x, N-1)$ from the given boundary conditions $I(x, N)$, and it continues until it arrives at $\hat{u}[x, 0]$ and $I(x, 0)$. The complete results of dynamic programming are shown in Fig. A.2. At each state of each stage, the optimal decision is written below and the minimum cost is written above. Finally, we can find the optimal sequence of decisions starting from the given $x(0)$ and the system equation. This is called the recovery procedure, and these decisions are the inputs for our experiments. Note that this relies on the system equation alone, because we lack a measurement of the state. Our simulated results were obtained in this manner.
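The backward iteration of Eq. (A.2) and the recovery procedure can be sketched together as follows. This is a minimal illustration under stated assumptions, not the program used for the experiments: the function names, the toy system $x(k+1) = x(k) - u(k)$, the weights and the integer grid are all hypothetical, and the grid is chosen so that every next state lies on a quantized value and no interpolation is needed.

```python
def solve_dp(states, controls, n_stages, step, stage_cost, final_cost):
    """Backward sweep of Eq. (A.2): returns cost[k][x] and policy[k][x]."""
    admissible = set(states)
    cost = [dict() for _ in range(n_stages + 1)]
    policy = [dict() for _ in range(n_stages)]
    for x in states:
        cost[n_stages][x] = final_cost(x)       # boundary condition I(x, N)
    for k in range(n_stages - 1, -1, -1):
        for x in states:
            best = None
            for u in controls:
                x_next = step(x, u, k)
                if x_next not in admissible:     # reject inadmissible decisions
                    continue
                total = stage_cost(x, u, k) + cost[k + 1][x_next]
                if best is None or total < best[0]:
                    best = (total, u)
            # Assumes every state has at least one admissible decision.
            cost[k][x], policy[k][x] = best
    return cost, policy


def recover(x0, policy, step):
    """Recovery procedure: roll the system equation forward from x(0)."""
    xs, us = [x0], []
    for k, table in enumerate(policy):
        u = table[xs[-1]]
        us.append(u)
        xs.append(step(xs[-1], u, k))
    return xs, us


# Hypothetical toy problem on an integer grid.
Q, R, S = 1.0, 1.0, 10.0
N_STAGES = 4
STATES = list(range(0, 9))
CONTROLS = [0, 1, 2, 3]


def step(x, u, k):
    return x - u


def stage_cost(x, u, k):
    return 0.5 * Q * x * x + 0.5 * R * u * u


def final_cost(x):
    return 0.5 * S * x * x


cost_table, policy_table = solve_dp(STATES, CONTROLS, N_STAGES, step, stage_cost, final_cost)
traj, inputs = recover(8, policy_table, step)
print(traj, inputs, cost_table[0][8])
```

Because the recovery step only uses the stored policy and the system equation, it corresponds to the open-loop use described above, in which the state is predicted rather than measured.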
If we could monitor the state and stage of the system, the dynamic programming solution, $\hat{u}[x, k]$, would lead to a feedback control or decision policy configuration. One method of implementing this solution is simply to store all the values of $\hat{u}[x, k]$ in memory, monitor the state and stage of the system, and look up the appropriate value of $\hat{u}[x, k]$ as required. This type of implementation is attractive because the dynamic programming calculations can be done off-line, and the only operation that needs to be done during the decision interval is retrieval of the appropriate optimal decision. The system configuration is shown in Fig. A.3.
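The memory-based configuration can be sketched as a table lookup inside a feedback loop. Everything here is a hypothetical stand-in: the stored table `POLICY` is a simple placeholder rule rather than a real dynamic programming result, the measured state is rounded to the nearest quantized state (an assumption), and the disturbance sequence is invented to show that the lookup reacts to the measured state.

```python
N_STAGES = 4
STATES = range(0, 9)
# Hypothetical stored table standing in for the off-line result u_hat[x, k].
POLICY = [{x: min(2, x) for x in STATES} for _ in range(N_STAGES)]


def controller(x_measured, k):
    """On-line step: look up the stored decision for the measured state and stage."""
    x_key = min(STATES, key=lambda s: abs(s - x_measured))  # nearest quantized state
    return POLICY[k][x_key]


def run_closed_loop(x0, disturbances):
    """Simulate x(k+1) = x(k) - u(k) + d(k) under table-lookup feedback."""
    x, history = x0, []
    for k in range(N_STAGES):
        u = controller(x, k)
        history.append((k, x, u))
        x = x - u + disturbances[k]
    return x, history


final_state, log = run_closed_loop(7, [0, 0.4, 0, -0.3])
print([u for _, _, u in log])  # [2, 2, 2, 1]
```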
A.3. Simulation Results
We used a very simple concept to obtain the equation for our simulation. The rate of decrease of the thickness being polished is equal to the removal rate, and we took the removal rate to be the input. The equation is written as

\[ \dot{h} = \frac{dh}{dt} = -RR = -u, \]

where $h$ is the thickness, $RR$ the removal rate, and $u$ the input. The difference equation version, computed using a sampling period $T$ (1 s), is

\[ h(k+1) = h(k) - T\,u(k), \tag{A.3} \]
where $k$ is the stage. We assumed that there were only seven values of the input (including 0) because of restrictions on the Westech 372M CMP machine, and the sampling period $T$ was fixed at 1 s. For each of these inputs, the cost at the current stage can be determined as

\[ L^{(m)} = \tfrac{1}{2} q h^2(k) + \tfrac{1}{2} r u^2(k) \qquad (m = 1, 2, \ldots, 7), \tag{A.4} \]
and the cost at the final stage $N$ was determined as

\[ I(x, N) = L(x, N) = \tfrac{1}{2} s h_N^2, \tag{A.5} \]
where $s$ is the weighting factor of the final state, $q$ the weighting factor of the transient state, and $r$ the weighting factor of the input. We then consider the quantized states $h \in H$ and the admissible inputs $u^{(m)} \in U$:

\[ H = [6000\ 5999\ 5998\ \ldots\ 2\ 1\ 0] \qquad (j_N = 6001), \]

\[ U = [u^{(1)}\ u^{(2)}\ u^{(3)}\ u^{(4)}\ u^{(5)}\ u^{(6)}\ u^{(7)}] \qquad (m_N = 7). \]
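The model and cost defined in Eqs. (A.3) to (A.5) can be evaluated for any given input sequence, which is a convenient check before running the full dynamic programming computation. The sketch below is illustrative only: the function name `rollout_cost`, the weights and the two short input sequences are hypothetical, and a very small initial thickness is used so that the arithmetic is easy to follow.

```python
def rollout_cost(h0, inputs, q, r, s, T=1.0):
    """Return (thickness trajectory, total cost) for a given input sequence."""
    h, cost = [h0], 0.0
    for u in inputs:
        cost += 0.5 * q * h[-1] ** 2 + 0.5 * r * u ** 2  # stage cost, Eq. (A.4)
        h.append(h[-1] - T * u)                           # system equation, Eq. (A.3)
    cost += 0.5 * s * h[-1] ** 2                          # final cost, Eq. (A.5)
    return h, cost


# Hypothetical weights and input sequences, for illustration only.
traj_fast, cost_fast = rollout_cost(4.0, [2.0, 2.0], q=1.0, r=1.0, s=10.0)
traj_slow, cost_slow = rollout_cost(4.0, [1.0, 1.0], q=1.0, r=1.0, s=10.0)
print(traj_fast, cost_fast)  # [4.0, 2.0, 0.0] 14.0
print(traj_slow, cost_slow)  # [4.0, 3.0, 2.0] 33.5
```

With these assumed weights the heavier final-state weighting $s$ penalizes the sequence that leaves thickness unremoved, which is the trade-off the three weighting factors are meant to express.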
We can obtain $I(h, N-1)$ and $\hat{u}[h, N-1]$ for all $h \in H$ by substituting Eqs. (A.3), (A.4) and (A.5) into Eq. (A.2), which is presented in Section A.2:

\[ I[6000, N-1],\ I[5999, N-1],\ \ldots,\ I[1, N-1],\ I[0, N-1] \]

\[ \hat{u}(6000, N-1),\ \hat{u}(5999, N-1),\ \ldots,\ \hat{u}(1, N-1),\ \hat{u}(0, N-1) \]
It is now possible to compute $I(h, N-2)$ and $\hat{u}[h, N-2]$ for all $h \in H$ based on knowledge of $I(h, N-1)$. The iterative procedure continues until $\hat{u}[h, 0]$ and $I(h, 0)$ have been computed. The program flowchart is shown in Fig. A.4, and the complete results of dynamic programming can be plotted as in Fig. A.2. Finally, we can find the optimal sequence of inputs starting from the given $h(0)$ and Eq. (A.3) by means of the recovery procedure.

References
[1] Texas Engineering Extension Service, CMP, TEEX, 2001.
[2] C.K. Hu, J.M.E. Harper, Materials Chemistry and Physics 52 (1998) 5–16.
[3] Y. Ein-Eli, E. Abelev, E. Rabkin, D. Starosvetsky, J. Electrochem. Soc. 150 (9) (2003) C646–C652.
[4] F. Preston, J. Soc. Glass Technol. 11 (1927) 214–247.
[5] F. Zhang, A. Busnaina, J. Feng, M.A. Fury, in: Proceedings of the Fourth International Chemical–Mechanical Planarization for ULSI Multilevel Interconnection Conference, 1999, pp. 61–64.
[6] W.T. Tseng, Y.L. Wang, J. Electrochem. Soc. 144 (2) (1997) L15–L17.
[7] B. Zhao, F.G. Shi, in: Proceedings of the Fourth International Chemical–Mechanical Planarization for ULSI Multilevel Interconnection Conference, 1999, pp. 13–22.
[8] F.G. Shi, B. Zhao, Appl. Phys. A 67 (1998) 249–252.
[9] J. Luo, D.A. Dornfeld, IEEE Trans. Semiconduct. Manufact. 14 (2) (2001) 112–132.
[10] J.-B. Chiu, C.-C. Yu, S.-H. Shen, Microelectron. Eng. 65 (2003) 345–356.
[11] C.-S. Lin, C.Y. Chi, Microelectron. Eng. 75 (2004) 285–296.
[12] R. Bellman, Dynamic Programming, Princeton University Press, Princeton, NJ, 1957.
[13] Hung-Yin Tsai, "Analysis of Non-uniformity Based on Material Removal and Kinematics in Chemical Mechanical Planarization," M.S. thesis, Department of Power Mechanical Engineering, National Tsing Hua University, 1999.
[14] Keng-Ming Liu, "Study on the Chemical Mechanical Polishing of Copper Thin Films," M.S. thesis, Department of Electronic Engineering, National Chiao Tung University, 2000.
[15] R.E. Larson, J.L. Casti, Principles of Dynamic Programming, Marcel Dekker, 1978.
[16] Junne-Dar Chan, "Tribochemical Reactions on Cu-film Influenced by Mechanical Stresses in Chemical Mechanical Polishing," M.S. thesis, Department of Mechanical Engineering, National Cheng Kung University, 2000.
[17] K. Wijekoon, S. Tsai, M. Chandrachood, B. Brown, F. Redeker, S. Nanjangud, G. Amico, in: SEMI/Japan Technical Symposium, 1998.
[18] J.M. Steigerwald, S.P. Murarka, D.J. Duquette, R.J. Gutmann, Mat.