Chemical mechanical planarization operation via dynamic programming


Chia-Shui Lin a,*, Yung-Chou Lee b

a Department of Mechanical Engineering, National Chiao Tung University, Hsin-Chu 30050, Taiwan
b Department of Electrical and Control Engineering, National Chiao Tung University, Hsin-Chu 30050, Taiwan

Received 28 April 2006; received in revised form 1 December 2006; accepted 5 February 2007
Available online 16 February 2007

Abstract

In this paper, the impact of the down force and rotational speed on the non-planarization index during a SiO2 or Cu CMP process was investigated. Since the magnitudes of down force and rotational speed are limited, we chose the dynamic programming approach because of its ability to perform optimization under constraints on the down force and rotational speed. The duration and magnitude of each input were computed from the chemical mechanical polishing model of Luo and Dornfeld [J. Luo, D.A. Dornfeld, IEEE Trans. Semiconduct. Manufact. 14(2) (2001) 112–132], with the other parameters fixed. Experiments on blanket wafers under dynamic programming operation and under conventional constant removal rate operation were compared. The non-planarization index was consistently better under dynamic programming operation than under constant removal rate operation; the improvement over the constant removal rate base recipe ranges from 2% to 39% in all experiments, as shown in Tables 3 and 6. The thickness removal error was consistently smaller under constant removal rate operation than under dynamic programming operation in all experiments, as shown in Tables 3 and 6. To get the best performance in both planarization and thickness removal, it is recommended that the planarization step and the overpolish step in SiO2 and Cu CMP use different modes of operation, i.e., dynamic programming operation during the planarization step to minimize the non-planarization index and constant removal rate operation during the overpolish step to minimize the thickness removal error. The incremental time needed to eliminate the thickness removal error during the overpolish step can be calculated from the thickness error and the removal rate given by Luo's removal rate model at the constant wafer pressure and platen speed in use at the end of the planarization step. Our contribution is a new approach for CMP. Standard CMP uses constant removal rate operation in both the planarization step and the overpolish step. Our new approach uses dynamic programming operation during the planarization step and constant removal rate operation during the overpolish step.

Keywords: Chemical mechanical planarization; Dynamic programming; Non-planarization index; Copper dishing; Oxide erosion

1. Introduction

Chemical mechanical planarization (CMP) is a widely accepted technique for providing a globally planarized surface in microelectronic wafer fabrication. CMP was developed during the early 1980s, when multilevel interconnect technology was pushed to the limits of circuit density and performance. This technique produces excellent planarization across the wafer surface and improves both the photo-lithography and deposition processes [1]. In recent years, device levels and densities have increased continuously, and at the same time the problem of resistance–capacitance (RC) time delays, which can appreciably slow down circuit speeds, has had to be solved. As a result, copper has emerged as the optimal interconnect material because of its low resistivity and high electromigration resistance compared with aluminum [2,3]. When Cu is used as an interconnect material, patterned Cu lines are produced by a damascene process. In the damascene process, the dielectric is patterned, followed by the barrier and metal deposition. The barrier is required to prevent the rapid diffusion of the Cu into the dielectric. The final step in this process is CMP, which removes the excess metal and provides global planarization. Fig. 1 schematically shows a single layer Cu interconnect structure before and after CMP. Two key problems in Cu pattern wafer CMP, namely copper dishing and oxide erosion, generate surface non-planarity, which gives rise to problems in integrating multiple layers of metal. Copper and oxide thinning results in increased RC delay, which leads to inferior device performance. Therefore, we focus on the experiments for SiO2 and Cu CMP.

* Corresponding author. Tel.: +886 35731796; fax: +886 35720634. E-mail address: chiaslin@mail.nctu.edu.tw (C.-S. Lin).

Microelectronic Engineering 84 (2007) 2817–2831, www.elsevier.com/locate/mee

Several research efforts have been reported on modeling the CMP process, and the best-known equation is Preston's equation [4]. Preston's equation reflects the influence of process parameters including wafer pressure and relative velocity. In the last several years, revised forms of Preston's equation have concentrated on different elements of CMP. For example, Zhang and Busnaina [5] proposed an equation taking into account the normal stress and shear stress acting on the contact area between abrasive particles and wafer surfaces. Tseng and Wang [6] showed that the removal rate is proportional to the terms P^{5/6} and V^{1/2}. Zhao and Shi [7,8] considered the effects of the pad hardness and the contact between wafer and pad. Luo and Dornfeld [9] assumed an indentation-sliding model for the penetration of the pad and included an empirical accommodation of chemical reaction at the wafer surface. Compared with experimental results, the Luo and Dornfeld model predicts the removal rate more accurately. Therefore, the Luo and Dornfeld model is employed to predict the thickness removal rate in this paper.

Most of the research work on CMP is focused on the removal mechanism and slurry chemistry. Chiu et al. [10] applied the concept of the soft landing of a spacecraft to CMP operation, so that the CMP operation can be formulated as a minimum time optimal control problem. They treat the oxide surface as the landing surface, the polishing pad as the flying vehicle, and the removal rate as the vertical velocity. The equations describing the thickness removal process can be expressed as:

\[
\begin{bmatrix} \dot{H} \\ \dot{RR} \end{bmatrix}
=
\begin{bmatrix} 0 & -1 \\ 0 & 0 \end{bmatrix}
\begin{bmatrix} H \\ RR \end{bmatrix}
+
\begin{bmatrix} 0 \\ 1 \end{bmatrix} a,
\qquad -a_{\max} \le a \le a_{\max},
\]

where H is the thickness of material to be removed, RR the removal rate, and a the rate of change of the removal rate. The constraints on the removal rate and on the rate of change of the removal rate are applied because the parameters of the CMP machine, e.g., platen speed, wafer pressure, and slurry flow rate, have physical limits. They also set the final condition to H(tf) = 2000 Å and RR(tf) = 2000 Å/min in order to reduce the dishing and erosion, according to the experimental data reported by Wijekoon, Tsai et al. [17]. Fig. 2 shows that copper dishing and oxide erosion are proportional to platen speed and wafer pressure. Once the landing point is reached (H(tf) = 2000 Å), the polisher continues the removal at the smaller removal rate (RR(tf) = 2000 Å/min) until the end point is detected. Fig. 3 shows the result of the optimal operation. Inspired by their work, we use dynamic programming as our method of optimal operation in this research.

Fig. 1. Schematics of a single layer Cu interconnect (Cu, barrier layer, SiO2 dielectric): (a) before polishing, (b) ideal case after polishing and (c) real case after polishing, showing dishing and erosion.

Fig. 2a. Dependence of copper dishing and oxide erosion on platen speed. Wafer pressure was kept constant [17].

Fig. 2b. Dependence of copper dishing and oxide erosion on wafer pressure. Platen speed was held constant [17].


Lin and Chi [11] employed sliding-mode control to set the operation profile of the CMP process through a "Dynamic Tuning" method, so that the CMP process behaves more like a soft landing. However, the experimental verification may be hard to carry out because of the continuous time control and the lack of an available operation mechanism. Hence, dynamic programming control is employed here to deal with discrete time control. Dynamic programming was developed by Bellman and his colleagues in the 1950s [12]. The method of dynamic programming is explained in Appendix A of this paper. It has the advantage of dealing with constrained inputs. That means the problem of the available operation mechanism can be solved for constrained inputs and multiple finite stages of CMP. Experiments based on dynamic programming were carried out in this research for both SiO2 and Cu blanket wafers.

In this paper, the removal rate representation is presented in Section 2. In Section 3, the simulation results via dynamic programming provide the basis for programming the wafer pressure and platen speed as part of the recipe for the CMP tool. The experimental results for CMP operation via dynamic programming are presented and discussed in Section 4. Section 5 gives the conclusions.

Fig. 3. Trajectory for RRmax = 9000 Å/min, Hsmall = 2000 Å, and RRsmall = 2000 Å/min [10].

2. Removal rate representation

In this work, the equation proposed by Luo et al. [9] is employed, that is

\[
RR = C_2 \left(1 - \Phi\!\left[3 - C_1 P_0^{1/3}\right]\right) \sqrt{P_0}\, V + RR_C,
\]

where RR denotes the removal rate; P_0 is the applied wafer pressure; V is the angular velocity of the wafer carrier and the pad platen; \Phi is the normal cumulative distribution function; C_1 and C_2 are constants representing the properties of the abrasives, pad and wafer; and RR_C is the material removal due to chemical reaction.

2.1. SiO2 wafer CMP process

In the SiO2 wafer CMP process, the material removal due to chemical reaction, RR_C, is small compared with the mechanical removal, whereas the material removal due to chemical etching in the Cu wafer CMP process may need to be considered for more accurate results. Therefore, we ignored the chemical etch effect in the simulation of the SiO2 wafer CMP process. Furthermore, following Tsai's thesis [13], we changed the exponent of V from 1 to 6/10; here V is the rotational speed shared by the wafer carrier and the pad platen. The experiments were carried out on the Westech 372M CMP tool, as shown in Fig. 4. Experimental samples were prepared on p-type, (1 0 0)-oriented, 6 in. (150 mm) diameter silicon wafers. The thermally grown silicon dioxide film was obtained by wet oxidation (ASM/LB45 furnace system), in which the silicon was exposed to an ambient of H2 and O2 at 980 °C. The polishing sample for the blanket SiO2 wafer CMP experiment is a SiO2 film layer grown to a thickness of 9000 Å by this furnace system. The structure of the SiO2 blanket wafer is shown in Fig. 5. Two sets of experimental removal rate results were used to solve for the values of C_1 and C_2 by iterative trial and error, and the removal rate can be written as

\[
RR = 8257 \left(1 - \Phi\!\left[3 - 0.322\, P_0^{1/3}\right]\right) \sqrt{P_0}\, V^{6/10}.
\]

Fig. 6 shows the model prediction and experimental observations of the effects of the wafer pressure and platen speed.
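As an illustration, the fitted SiO2 expression above can be evaluated numerically. The sketch below is not the authors' code; it assumes P_0 in psi, V in rpm and RR in Å/min, which is consistent with the units used in the recipe tables but is not stated explicitly for the formula. The function names are hypothetical.

```python
import math


def normal_cdf(x: float) -> float:
    """Standard normal cumulative distribution function (Phi)."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def sio2_removal_rate(pressure_psi: float, speed_rpm: float) -> float:
    """Fitted SiO2 removal rate; units assumed to be psi, rpm and Angstrom/min."""
    active = 1.0 - normal_cdf(3.0 - 0.322 * pressure_psi ** (1.0 / 3.0))
    return 8257.0 * active * math.sqrt(pressure_psi) * speed_rpm ** 0.6


if __name__ == "__main__":
    # Tabulate the prediction at 30 rpm for a few wafer pressures.
    for p in (2.0, 2.5, 4.0, 7.0):
        print(f"{p:4.1f} psi -> {sio2_removal_rate(p, 30.0):7.1f} A/min")
```

Under these unit assumptions the model gives a rate of the same order as the constant removal rates used later in the experiments, and the prediction increases with both pressure and speed.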

Fig. 5. The structure of the SiO2 blanket wafer (9000 Å SiO2 on a p-type dummy wafer).

2.2. Cu wafer CMP process

The experimental samples were prepared on p-type, (1 0 0)-oriented, 6 in. (150 mm) diameter silicon wafers. The thermally grown silicon dioxide film was obtained by wet oxidation (ASM/LB45 furnace system), in which the silicon was exposed to an ambient of H2 and O2 at 980 °C. The polishing sample for the blanket Cu wafer CMP experiment is a two-layer film structure of Cu/Ta with thicknesses of 20,000/500 Å, sputter deposited by a ULVAC SBH-3308 RDE sputter system on a silicon wafer covered with a 1000 Å thick thermally grown SiO2 film. The 500 Å Ta underlayer is used as an adhesion promoter for the copper deposition, since copper itself does not adhere well to the thermal oxide. The structure of the Cu blanket wafer is shown in Fig. 7. Two sets of experimental removal rate results were used to solve for the values of C_1 and C_2, as was done for the SiO2 wafer CMP process. Based on our experimental data, we kept the original exponent of 1 on V. The carrier speed was set equal to the platen speed, V. The only difference was that the chemical etching rate had to be considered in the equation. The chemical etching rate was obtained by experiment, following the procedure presented in Section 4.2. The measured etching rate was about 14 Å/min, which is quite small compared with the overall removal rate. The main reason was that we added a higher concentration of citric acid. According to Liu's thesis [14], as the concentration of citric acid in the HNO3-citric acid slurry increases, the etching rate of copper in the slurry is suppressed. Citric acid behaves like BTA (benzotriazole) in preventing copper corrosion in the HNO3-based solution. BTA is a common Cu corrosion inhibitor, since it can be adsorbed on the Cu surface to form a passivation layer. This is helpful for a Cu damascene structure because of the low etching rate at the recessed region of the Cu film. Furthermore, the removed thickness was only 6000 Å, so the contribution of the chemical removal rate was expected to be small. Therefore, we omitted the chemical removal rate here. Thus, the removal rate representation is

\[
RR = 30{,}503 \left(1 - \Phi\!\left[3 - 0.0113\, P_0^{1/3}\right]\right) \sqrt{P_0}\, V.
\]

Fig. 8 shows the model prediction and experimental observations of the effects of the wafer pressure and platen speed.

Fig. 7. The structure of the Cu blanket wafer (Cu 20,000 Å / Ta 500 Å / SiO2 1000 Å on a p-type dummy wafer).

3. Simulation via dynamic programming

3.1. Design

Following Larson and Casti [15], the procedure of dynamic programming can be summarized in three steps. First, calculate the equivalent discrete time state equation that describes the CMP process; second, determine the cost function used to compute the minimum cost and find the optimal control inputs; third, carry out the iterative computation. We used a very simple concept to obtain the state space model of the removal rate for our simulation. The thickness being polished decreases at the removal rate, and we took the removal rate as the input. The equation is written as

\[
\dot{h} = \frac{dh}{dt} = -RR = -u,
\]

where h is the thickness, RR the removal rate, and u the input. We assumed that there were only seven values of the input (including 0) because of restrictions on the Westech 372M CMP tool, and the sampling period T was fixed at 1 s. The cost function L was determined as

\[
L = \tfrac{1}{2}\, q\, h^2(k) + \tfrac{1}{2}\, r\, u^2(k) + \tfrac{1}{2}\, s\, h_N^2,
\]

where k is the stage, N is the final stage, q is the weighting factor of the transient state, r is the weighting factor of the input, and s is the weighting factor of the final state. The weighting factor of the transient state is the penalty on the speed of approach to the final state. The weighting factor of the admissible input is the penalty on the translational wafer pressure and rotational torque provided to the CMP tool. The weighting factor s of the final state is the penalty on the final state, which describes the extent of under-polishing. In this study, we want the admissible input to decrease slowly, based on the concept of soft landing operation proposed by Chiu et al. [10]. The values of these weighting factors were determined by iterative trial and error to obtain a successively decreasing pattern of admissible inputs that also minimizes the cost function L.
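The three steps above can be sketched as a backward recursion over a thickness grid followed by a forward pass. This is a minimal illustration under stated assumptions, not the authors' implementation: the state update is taken as h(k+1) = h(k) - u(k)T with h the thickness still to be removed, the cost is taken as the sum over stages of the q and r terms plus the terminal s term, the removal rate is used directly as the input, and the function name, grid step and input levels are hypothetical.

```python
import numpy as np


def dp_schedule(h0, inputs, n_stages, q, r, s, dh=1.0, T=1.0):
    """Backward dynamic programming for h[k+1] = h[k] - u[k]*T on a grid.

    h0 is the thickness to remove, inputs the admissible removal rates
    (must include 0). Returns the input sequence and the thickness left.
    """
    if 0 not in inputs:
        raise ValueError("inputs must include 0 so every state has a feasible move")
    steps = [int(round(u * T / dh)) for u in inputs]  # grid cells removed per stage
    if any(abs(st * dh - u * T) > 1e-9 for st, u in zip(steps, inputs)):
        raise ValueError("each input times T must be a multiple of dh")

    n_states = int(round(h0 / dh)) + 1
    idx = np.arange(n_states)
    h = idx * dh
    cost_to_go = 0.5 * s * h**2                       # terminal cost at stage N
    policy = np.zeros((n_stages, n_states), dtype=int)

    for k in range(n_stages - 1, -1, -1):             # backward recursion
        best = np.full(n_states, np.inf)
        for j, (u, step) in enumerate(zip(inputs, steps)):
            nxt = idx - step
            ok = nxt >= 0                             # do not polish below zero
            cand = np.full(n_states, np.inf)
            cand[ok] = 0.5 * q * h[ok]**2 + 0.5 * r * u**2 + cost_to_go[nxt[ok]]
            better = cand < best
            best[better] = cand[better]
            policy[k, better] = j
        cost_to_go = best

    i = n_states - 1                                  # forward pass from h0
    schedule = []
    for k in range(n_stages):
        j = int(policy[k, i])
        schedule.append(inputs[j])
        i -= steps[j]
    return schedule, i * dh
```

For example, `dp_schedule(10.0, [0.0, 1.0, 2.0], 20, q=1.0, r=1.0, s=100.0)` returns a 20-stage input sequence that removes the full thickness. Restricting the inputs to a small discrete set mirrors the seven admissible values mentioned above and keeps every transition on the grid, so no interpolation of the cost-to-go is needed.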

3.2. Simulation results

3.2.1. SiO2 wafer CMP process

In Fig. 6, the error bars represent the non-planarization index (the standard deviation of the removal rates at the nine points on the wafer). The smaller the non-planarization index, the more uniform the removal rate over the entire wafer and the flatter the resulting surface. Decreasing the wafer pressure or platen speed decreases the non-planarization index. In order to reduce the wafer pressure or platen speed successively, we chose the weighting factors of the final state, transient state and input to be 10000, 1 and 100000, respectively, for the case of wafer pressure, and 10000, 1 and 700 for the case of platen speed. For the case of wafer pressure, the platen speed was held at 30 rpm. The initial thickness h(0) was 6000 Å, and the result is shown in Fig. 9. According to the result, the process terminated at the 363rd stage, and the input is 7 psi during stages 0–177, 4 psi during stages 178–225, 2.5 psi during stages 226–272 and 2 psi during stages 273–362. This was the basis of our SiO2 wafer experiment on wafer pressure. For the case of platen speed, the wafer pressure was held at 4 psi, and the result is also shown in Fig. 9.

Fig. 9. Simulation result of SiO2 blanket wafer with the wafer pressure and the platen speed as the variable input separately (1 stage = 1 s).
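As a rough cross-check of the wafer-pressure schedule, the fitted SiO2 removal rate expression from Section 2.1 can be integrated over the four pressure segments. The sketch below assumes psi, rpm and Å/min units for the model, 1 stage = 1 s, and inclusive stage ranges (so stages 0–177 last 178 s); the names are hypothetical.

```python
import math


def sio2_removal_rate(pressure_psi: float, speed_rpm: float) -> float:
    """Fitted SiO2 model of Section 2.1; units assumed psi, rpm, Angstrom/min."""
    phi = 0.5 * (1.0 + math.erf((3.0 - 0.322 * pressure_psi ** (1.0 / 3.0)) / math.sqrt(2.0)))
    return 8257.0 * (1.0 - phi) * math.sqrt(pressure_psi) * speed_rpm ** 0.6


# (wafer pressure in psi, duration in s), read off the simulated schedule.
SCHEDULE = [(7.0, 178), (4.0, 48), (2.5, 47), (2.0, 90)]


def predicted_removal(schedule, speed_rpm: float = 30.0) -> float:
    """Total thickness (Angstrom) removed by a piecewise-constant pressure schedule."""
    return sum(sio2_removal_rate(p, speed_rpm) * t / 60.0 for p, t in schedule)


if __name__ == "__main__":
    total_time = sum(t for _, t in SCHEDULE)
    print(total_time, "s,", round(predicted_removal(SCHEDULE)), "A removed")
```

Under these assumptions the four segments add up to the 363 s polish time, and the predicted removal is close to the 6000 Å initial thickness.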

At the beginning of the CMP process, we wanted the removal rate to be high enough to reduce the process time, while still obtaining a flatter surface at the end of the process. According to Chen's thesis [16], the platen speed has a great influence on the non-uniformity. For these reasons, we attempted to apply both inputs. We chose the weighting factors of the final state, transient state, wafer pressure and platen speed to be 10000, 1, 50000 and 1000, respectively. The result of dynamic programming is shown in Fig. 10. The platen speed is decreased first, when the SiO2 film thickness is about 2000 Å, in order to get a better non-planarization index, and the wafer pressure is decreased afterwards. (The definition of the non-planarization index is given in Section 4.)

Fig. 11. Simulation result of Cu blanket wafer with the wafer pressure and the platen speed as the variable input separately (1 stage = 1 s).

3.2.2. Cu wafer CMP process

In order to decrease the wafer pressure or platen speed in the same manner as for the SiO2 wafer, we chose the weighting factors of the final state, transient state and input to be 10000, 1 and 40000, respectively, for the case of wafer pressure, and 10000, 1 and 1200 for the case of platen speed. For the case of wafer pressure, the platen speed was held at 30 rpm. For the case of platen speed, the wafer pressure was held at 3.5 psi. The initial thickness h(0) was 6000 Å, and the result is shown in Fig. 11.

Since dishing and erosion are proportional to the pressure and platen speed [17], we expected that decreasing the wafer pressure and platen speed would reduce the dishing and erosion of the Cu wafer and also give a better non-planarization index at the end of the process. Nevertheless, only the Cu blanket wafer experiment was done, owing to the lack of Cu pattern wafers. We chose the weighting factors of the final state, transient state, wafer pressure and platen speed to be 10000, 1, 20000 and 1200, respectively. The result of dynamic programming is shown in Fig. 12.

4. Experiment for SiO2 and Cu blanket wafer

The experiments were carried out on the IPEC 372M CMP polisher. The slurry for the SiO2 wafer CMP process was prepared by diluting the commercial slurry, Cabot SS-25, with DI water in a ratio of one to one. The copper wafers were polished with a slurry of 2 wt% alumina abrasive, 2 vol% HNO3 and 0.01 M citric acid.

The CMP removal rates were monitored at nine points along two perpendicular diameters of the wafer, and the within-wafer non-planarization index is defined as:

\[
\text{Non-planarization index} = \left( \frac{1}{n-1} \sum_{i=1}^{n} \left(x_i - \bar{x}\right)^2 \right)^{1/2},
\]

where x_i is the removed thickness at point i and \bar{x} is the average removed thickness.
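The definition above is the sample standard deviation of the removed thickness over the measurement points. A minimal sketch follows; the nine thickness values are made up for illustration and are not measurements from this work, and the function name is hypothetical.

```python
import math


def non_planarization_index(removed):
    """Sample standard deviation (n - 1 in the denominator) of removed thickness."""
    n = len(removed)
    if n < 2:
        raise ValueError("at least two measurement points are required")
    mean = sum(removed) / n
    return math.sqrt(sum((x - mean) ** 2 for x in removed) / (n - 1))


if __name__ == "__main__":
    # Hypothetical removed thickness (Angstrom) at nine points on one wafer.
    points = [5900, 6050, 6100, 5800, 6000, 6200, 5950, 6150, 5850]
    print(round(non_planarization_index(points), 1))
```

The same value is returned by `statistics.stdev` from the Python standard library, which uses the same n - 1 normalization.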

4.1. SiO2 wafer CMP experiment

The experiments with dynamic programming were compared with experiments that removed the same thickness in the same polishing duration under constant removal rate operation. Since the removed thickness and the duration of polishing were known, the value of the constant removal rate can be computed. The base recipe process parameters for the constant removal rate were found through experiment and are listed in Tables 1 and 2. The experimental results are listed in Table 3.

4.1.1. Change wafer pressure as the admissible input

The duration of polishing was 363 s and the removed thickness was 6000 Å. The required removal rate was 992 Å/min. Because the platen speed was fixed here, the required removal rate was obtained by adjusting the wafer pressure, which was found to be 4.3 psi. From the experimental results listed in Table 3, the constant removal rate operation mode gives the better thickness removal, but the dynamic programming operation mode gives a 39% better non-planarization index. The inaccuracy of the thickness removal was caused by the model prediction error at the lower wafer pressures. It could be improved by developing a more accurate model.
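The required constant removal rate follows directly from the target thickness and the polishing duration. As a worked check, the short sketch below reproduces the rate quoted above and the rates for the other two SiO2 cases from the polish times in Table 1; the function name is hypothetical.

```python
def required_removal_rate(thickness_angstrom: float, duration_s: float) -> float:
    """Constant removal rate (Angstrom/min) that removes the thickness in duration_s."""
    if duration_s <= 0:
        raise ValueError("duration must be positive")
    return thickness_angstrom / duration_s * 60.0


if __name__ == "__main__":
    # Polish times of the three SiO2 base recipes (Table 1), 6000 Angstrom target.
    for seconds in (363, 334, 237):
        print(seconds, "s ->", round(required_removal_rate(6000.0, seconds)), "A/min")
```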

4.1.2. Change platen speed as the admissible input

The duration of polishing was 334 s and the removed thickness was 6000 Å. The required removal rate was 1078 Å/min. Because the wafer pressure was fixed here, the required removal rate was obtained by adjusting the platen speed, which was found to be 40 rpm. Table 3 reveals a phenomenon similar to that of the wafer pressure case. The dynamic programming operation mode gives a 26% better non-planarization index. The thickness removal exceeded 6000 Å for dynamic programming because the model prediction was lower than the experimental data at the higher platen speeds.

4.1.3. Change wafer pressure and platen speed as the admissible inputs simultaneously

The duration of polishing was 237 s and the removed thickness was 6000 Å. The required removal rate was 1519 Å/min. In order to avoid one-sided emphasis on either of these two inputs, we increased the wafer pressure and platen speed simultaneously. The final values were found to be 4.7 psi and 52 rpm, respectively. The thickness removal improved slightly, which might be caused by the combination of complementary model prediction errors from the platen speed model and the wafer pressure model. Nevertheless, the dynamic programming operation still gave a 16% better non-planarization index.

Table 1
Process parameters of SiO2 CMP experiment (constant removal rate base recipe)

Fixed parameter | Base recipe value for wafer pressure as variable input in Fig. 9 | Base recipe value for platen speed as variable input in Fig. 9 | Base recipe value for both wafer pressure and platen speed as variable inputs in Fig. 10
Wafer pressure | 4.3 psi | 4.0 psi | 4.7 psi
Back pressure | 1 psi | 1 psi | 1 psi
Carrier speed | 30 rpm | 40 rpm | 52 rpm
Platen speed | 30 rpm | 40 rpm | 52 rpm
Polish time | 363 s | 334 s | 237 s
Slurry flow rate | 150 ml/min | 150 ml/min | 150 ml/min
Pre-wet pad speed | 28 rpm | 28 rpm | 28 rpm
Pre-wet duration | 10 s | 10 s | 10 s
Pre-wet flow rate | 300 ml/min | 300 ml/min | 300 ml/min
Pad | Rodel IC1400 | Rodel IC1400 | Rodel IC1400

Table 2
Slurry formulation of SiO2 CMP

Species | Concentration
Commercial slurry Cabot SS-25 | 50 vol%

4.2. Cu wafer CMP experiment

We first explain how the removal rate due to chemical reaction was determined. The experiment for determining the removal rate due to chemical reaction was carried out in a circular glass container. Because the slurry would otherwise not be well stirred, we placed a magnet at the bottom of the container to produce a flow of slurry and keep the reaction rate at the surface stable. This also kept the abrasive dispersed and suspended in the slurry and made the experiment closer to the real polishing conditions. The Cu surface of the wafer faced the magnet and was kept at a distance from it. The wafer was immersed in the slurry for 5 min, and the concentration of the slurry was assumed to be constant during the chemical reaction. The base recipe process parameters for the constant removal rate were found through experiment and are listed in Tables 4 and 5. The experimental results are listed in Table 6.

The same comparison between dynamic programming and constant removal rate operation was made as for the SiO2 wafer CMP experiment. The parameters for the constant removal rate were also found through experiment.

Table 3
SiO2 wafer experimental results with three kinds of admissible inputs via dynamic programming operation

Admissible input | Average removed thickness (Å) | Thickness removal error from 6000 Å (%) | Non-planarization index (Å) | Non-planarization index improvement (%)
Wafer pressure | 5499 (5899) | 8.35 (1.68) | 435 (717) | 39
Platen speed | 6113 (5914) | 1.88 (1.43) | 518 (697) | 26
Simultaneously | 5875 (5957) | 2.08 (0.72) | 487 (580) | 16

The values in brackets belong to the constant removal rate operation.

Table 4
Process parameters of Cu CMP experiment (constant removal rate base recipe)

Fixed parameter | Base recipe value for wafer pressure as variable input in Fig. 11 | Base recipe value for platen speed as variable input in Fig. 11 | Base recipe value for both wafer pressure and platen speed as variable inputs in Fig. 12
Back pressure | 1 psi | 1 psi | 1 psi
Wafer pressure | 5.1 psi | 3.5 psi | 4.9 psi
Carrier speed | 30 rpm | 50 rpm | 57 rpm
Platen speed | 30 rpm | 50 rpm | 57 rpm
Polish time | 118 s | 85 s | 60 s
Slurry flow rate | 150 ml/min | 150 ml/min | 150 ml/min
Pre-wet pad speed | 28 rpm | 28 rpm | 28 rpm
Pre-wet duration | 10 s | 10 s | 10 s
Pre-wet flow rate | 300 ml/min | 300 ml/min | 300 ml/min
Pad | Rodel IC1400 | Rodel IC1400 | Rodel IC1400

Table 5
Slurry formulation of Cu CMP

Species | Concentration
Abrasive Al2O3 (EXTEC 0.1 μm) | 2 wt%
Oxidizer HNO3 | 2 vol%
Complex agent Citric acid | 0.01 M
Dilution DI water | Remaining balance of slurry

Table 6
Cu wafer experimental results with three kinds of admissible inputs via dynamic programming operation

Admissible input | Average removed thickness (Å) | Thickness removal error from 6000 Å (%) | Non-planarization index (Å) | Non-planarization index improvement (%)
Wafer pressure | 7491 (6721) | 24.85 (12.02) | 1204 (1225) | 2
Platen speed | 5406 (5948) | 9.9 (0.87) | 615 (806) | 24
Simultaneously | 6814 (5772) | 13.57 (3.8) | 1034 (1153) | 10

4.2.1. Change wafer pressure as the admissible input

The duration of polishing was 118 s and the removed thickness was 6000 Å. The required removal rate was 3051 Å/min. Because the platen speed was fixed here, the required removal rate was obtained by adjusting the wafer pressure, which was found to be 5.1 psi. From the experimental results listed in Table 6, the constant removal rate operation gives better thickness removal, but the dynamic programming operation gives a slightly better non-planarization index. However, the difference in non-planarization index between dynamic programming operation and constant removal rate operation is very small and can be considered to lie within the statistical error of the experimental data. It also shows that the wafer pressure is not a major factor influencing the non-planarization index in Cu wafer CMP. As shown in Fig. 8, the non-planarization index data at 3 psi and 5 psi are within statistical error of each other. This explains why the non-planarization index does not change appreciably when the wafer pressure is decreased from 5.1 psi to 3 psi, i.e., when the Cu wafer CMP process is changed from constant removal rate operation mode to dynamic programming operation mode. The inaccuracy of the removed thickness may be caused by the model predicting too low a removal rate at the smaller wafer pressures. This means that the model for the Cu wafer CMP process needs to be modified.

4.2.2. Change platen speed as the admissible input

The duration of polishing was 85 s and the removed thickness was 6000 Å. The required removal rate was 4235 Å/min. Because the wafer pressure was fixed here, the required removal rate was obtained by adjusting the platen speed, which was found to be 50 rpm. Table 6 shows that the dynamic programming operation is 24% better than the constant removal rate operation in terms of non-planarization index. The error in the removed thickness may be caused by the model predicting too high a removal rate at the faster platen speeds and too low a removal rate at the slower speeds; there was not enough time for the higher removal rate at the faster platen speeds to compensate for the lower removal rate at the slower platen speeds. Therefore, the removed thickness under dynamic programming operation is less than that under constant removal rate operation. However, dynamic programming operation of the platen speed gave a significant improvement in the non-planarization index. This shows that platen speed is a major parameter affecting the non-planarization index of the Cu wafer CMP process.

4.2.3. Change wafer pressure and platen speed as the admissible inputs simultaneously

The duration of polishing was 60 s and the removed thickness was 6000 Å. The required constant removal rate was 6000 Å/min. We increased the wafer pressure and platen speed simultaneously, to 4.9 psi and 57 rpm, respectively. The dynamic programming operation still provides a 10% improvement in non-planarization index over constant removal rate operation.
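The derived columns of Tables 3 and 6 can be recomputed from the measured values. The sketch below assumes that the thickness removal error is the absolute deviation from the 6000 Å target relative to the target, and that the improvement is the reduction in non-planarization index relative to the constant removal rate value; these formulas are not stated explicitly in the text, but they reproduce the tabulated numbers. The function names are hypothetical.

```python
def thickness_error_pct(removed: float, target: float = 6000.0) -> float:
    """Absolute thickness removal error as a percentage of the target."""
    return abs(removed - target) / target * 100.0


def npi_improvement_pct(npi_dp: float, npi_const: float) -> float:
    """Relative reduction of the non-planarization index versus constant removal rate."""
    return (npi_const - npi_dp) / npi_const * 100.0


# (case, removed thickness DP, NPI DP, NPI constant removal rate), from Tables 3 and 6.
ROWS = [
    ("SiO2 wafer pressure", 5499, 435, 717),
    ("SiO2 platen speed", 6113, 518, 697),
    ("SiO2 simultaneously", 5875, 487, 580),
    ("Cu wafer pressure", 7491, 1204, 1225),
    ("Cu platen speed", 5406, 615, 806),
    ("Cu simultaneously", 6814, 1034, 1153),
]

if __name__ == "__main__":
    for name, removed, npi_dp, npi_const in ROWS:
        print(f"{name:22s} error {thickness_error_pct(removed):6.2f}%  "
              f"NPI improvement {npi_improvement_pct(npi_dp, npi_const):5.1f}%")
```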

4.3. Discussion

Three cases of the CMP process, using wafer pressure, platen speed, and both wafer pressure and platen speed as admissible inputs, were examined in this study. In the SiO2 wafer CMP experiment, the non-planarization indexes of all three cases were improved, and the errors of the removed thickness were within about 8.4% (Table 3). The model could predict the removal rate well. This illustrates that multi-step dynamic programming of SiO2 wafer CMP is feasible to implement. It could further be verified by experiment on the IPEC 372M CMP tool.

Slurry chemicals play an important role in the Cu wafer CMP process. The formation of a non-native passivation layer by the passivating chemical (e.g., citric acid) in the slurry and the dissolution of Cu or of the materials abraded from the surface layer by the abrasives are all determined by the chemical environment in the slurry [18]. In the Cu wafer CMP experiment, the result shows that the platen speed is the main factor influencing the non-planarization index. When the platen speed was varied, the non-planarization index improved by at least 10%. A higher platen speed gives a faster refresh rate of the slurry underneath the wafer and a larger removal rate. It may also make the transport of slurry over the entire wafer less uniform and thereby affect the uniformity of the removal rate. For this reason, the interactions between mechanical and chemical parameters need to be investigated more thoroughly. This also indicates that the current model is not sufficient to describe the entire Cu wafer CMP process. The errors in thickness removal under dynamic programming operation fall between about 10% and 25% (Table 6). There are two ways to reduce the error in thickness removal. The first is to modify the model and make it more comprehensive by including the interaction between mechanical polishing and chemical reaction. The second is to obtain, by experiment, more removal rate data corresponding to every value of the admissible input, in order to fit a higher degree polynomial regression model.

For both SiO2 and Cu CMP, Tables 3 and 6 show that dynamic programming operation performs better than constant removal rate operation in terms of minimizing the non-planarization index (NPI).

In the removal rate model for the SiO2 wafer in Fig. 6, the standard deviation (or NPI) of the removal rate is smaller for lower wafer pressure and platen speed. This means that lower wafer pressure and platen speed generate a smaller NPI, i.e., better planarization results, as reported in Table 3 for the SiO2 wafer. The relative improvement in the standard deviation of the removal rate (or relative improvement in NPI) from higher to lower wafer pressure and platen speed is obvious in Fig. 6 for the SiO2 wafer.

In the removal rate model for the Cu wafer in Fig. 8, the standard deviation (or NPI) of the removal rate is smaller for lower wafer pressure and platen speed. This means that lower wafer pressure and platen speed generate a smaller NPI, i.e., better planarization results, as reported in Table 6 for the Cu wafer. The relative improvement in the standard deviation of the removal rate (or relative improvement in NPI) from higher to lower wafer pressure and platen speed is obvious in Fig. 8 for the Cu wafer.

The relative trend of improvement of standard deviation (or NPI) for SiO2 wafer in Fig. 6 stands out more than that for Cu wafer in Fig. 8. This also predicts that the benefits of dynamic programming operation on NPI improvement will stand out more for SiO2 wafer than for Cu wafer.

The standard deviation of removal rate data for SiO2 wafer in Fig. 6 is smaller than that for Cu wafer in Fig. 8. This explains why the percentage thickness removal error for SiO2 wafer in Table 3 is smaller than that for Cu wafer in Table 6: the numerator in the percentage computation is smaller. It also explains why the percentage NPI improvement for SiO2 wafer in Table 3 is larger than that for Cu wafer in Table 6: the denominator in the percentage calculation is smaller.

To get better performance for both global planarization and thickness removal, we recommend that the planarization step and the overpolish step in SiO2 and Cu CMP use different modes of operation, i.e., dynamic programming operation during the planarization step for minimizing the non-planarization index and constant removal rate operation during the overpolish step for minimizing the thickness removal error. The incremental time calculation in the overpolish step can be done using the thickness error and the removal rate derived from Luo's removal rate model, based on constant wafer pressure and platen speed at the end of the planarization step.
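The incremental time calculation described above amounts to dividing the remaining thickness error by the removal rate at the fixed overpolish settings. A minimal sketch follows; the function name and the numbers are illustrative assumptions rather than values from the experiments, and in practice the removal rate would come from Luo's model at the chosen wafer pressure and platen speed.

```python
def overpolish_time(thickness_error, removal_rate):
    """Return the extra polish time (s) needed to remove thickness_error.

    thickness_error: film thickness still above target (length unit of choice)
    removal_rate: constant removal rate (same length unit per second) at the
                  fixed wafer pressure and platen speed of the overpolish step
    """
    if removal_rate <= 0:
        raise ValueError("removal rate must be positive")
    # No extra time is needed if the target was already reached or passed.
    return max(thickness_error, 0.0) / removal_rate


# Hypothetical numbers: 150 units of film left, removed at 30 units per second.
extra_time = overpolish_time(150.0, 30.0)
print(extra_time)
```

The sketch assumes the removal rate stays constant over the short overpolish interval, which is the premise of constant removal rate operation.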

5. Conclusions and future work

5.1. Conclusions

In this study, we focused on the mechanical parameters of the CMP process. The wafer pressure and platen speed were taken as the control parameters. We applied dynamic programming as the control method in CMP experiments with blanket SiO2 and Cu wafers. The influence of dynamic programming operation and constant removal rate operation on the non-planarization index of the CMP process was compared. We arrived at the following conclusions:

(1) The non-planarization index could be improved consistently by dynamic programming operation versus constant removal rate operation. Dynamic programming operation gives a 2% to 39% improvement over the base recipe of constant removal rate in all experiments, as shown in Tables 3 and 6.

(2) The thickness removal error is consistently smaller with constant removal rate operation than with dynamic programming operation in all experiments, as shown in Tables 3 and 6.

(3) To get the best performance of both planarization and thickness removal, it is recommended that the planarization step and the overpolish step in SiO2 and Cu CMP use different modes of operation, i.e., dynamic programming operation during the planarization step for minimizing the non-planarization index and constant removal rate operation during the overpolish step for minimizing the thickness removal error. The incremental time calculation for eliminating the thickness removal error during the overpolish step can be done using the thickness error and the removal rate derived from Luo's removal rate model, based on constant wafer pressure and platen speed at the end of the planarization step.

(4) The platen speed is the more consistent factor influencing the non-planarization index (about 25% improvement over the base recipe of constant removal rate) during the planarization step using dynamic programming operation, as shown in Tables 3 and 6. The removal thickness error (about 1%) is also at its minimum in the overpolish step using constant removal rate operation with constant platen speed and wafer pressure, as shown in the third row of Tables 3 and 6.

(5) In SiO2 CMP, dynamic programming of platen speed during the planarization step is followed by constant removal rate operation during the overpolish step using a platen speed of 20 rpm and a wafer pressure of 4.0 psi.

(6) In Cu CMP, dynamic programming of platen speed during the planarization step is followed by constant removal rate operation during the overpolish step using a platen speed of 30 rpm and a wafer pressure of 3.5 psi.

(7) The recommended best known method (BKM) for CMP planarization is to use dynamic programming operation of platen speed for coarse control of the non-planarization index during the planarization step, and then constant removal rate operation via constant platen speed and wafer pressure at the end of the planarization step for fine control of the thickness removal error during the overpolish step.

5.2. Future work

Experimental verification of dynamic programming operation needs to be carried out in the future on different kinds of patterned wafers. The influence on copper dishing and oxide erosion can then be compared between dynamic programming and constant removal rate operation. It is conjectured that using dynamic programming operation in the planarization step and constant removal rate operation in the overpolish step with minimum platen speed and/or wafer pressure will minimize copper dishing and oxide erosion, according to the experimental results in Fig. 2a and Fig. 2b reported by K. Wijekoon, S. Tsai et al. [17].

Appendix A. Optimal control design: dynamic programming

Sociological, economic, and physical pressures in all areas of modern life have generated an accelerated demand for high-level decision-making based upon limited information about the processes being controlled. In the 1950s, a systematic and concerted mathematical study of such decision-making situations was initiated by Richard Bellman. This pioneering work was based upon the fundamental system-theoretic notion of feedback, i.e., that decision rules should be based upon the current (and perhaps past) states of the process under study. Bellman and his colleagues continued to develop the feedback decision-making concept under the name of "dynamic programming". The majority of problems of true practical concern were computationally intractable due to the limited state of the computing art at that time. Since then, rapid progress in computer technology, coupled with the development of refined computational procedures, has made it practical to solve a wide variety of problems in economics, engineering, operations research, and mathematics itself.

A.1. Bellman’s principle of optimality

The fundamental concept of dynamic programming, originated by Bellman, is called the principle of optimality. Conceptually, the principle may be stated as follows: given an optimal trajectory from point A to point C, the portion of the trajectory from any intermediate point B to point C must be the optimal trajectory from B to C. In Fig. A.1, if the path I–II is the optimal path from A to C, then according to the principle of optimality path II is the optimal path from B to C. The proof by contradiction for this case is immediate: assume that some other path, such as II′, is the optimal path from B to C; then path I–II′ has less cost than path I–II. However, this contradicts the fact that I–II is the optimal path from A to C, and hence II must be the optimal path from B to C.

Fig. A.1. Illustration of the principle of optimality (points A, B, C; paths I, II, II′).
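The principle can be checked numerically on a small example. The sketch below uses a hypothetical four-node graph with made-up edge costs (the node name D and all numbers are assumptions for illustration only); it finds the cheapest path from A to C and confirms that the portion of that path starting at the intermediate point B is itself the cheapest path from B to C.

```python
from functools import lru_cache

# Hypothetical directed graph with illustrative edge costs.
EDGES = {
    "A": {"B": 2.0, "D": 5.0},
    "B": {"C": 4.0, "D": 1.0},
    "D": {"C": 2.0},
    "C": {},
}


@lru_cache(maxsize=None)
def best_path(start, goal="C"):
    """Return (cost, path) of the cheapest path from start to goal."""
    if start == goal:
        return 0.0, (goal,)
    candidates = []
    for nxt, cost in EDGES[start].items():
        tail_cost, tail = best_path(nxt, goal)
        candidates.append((cost + tail_cost, (start,) + tail))
    return min(candidates)


cost_ac, path_ac = best_path("A")
cost_bc, path_bc = best_path("B")
# The tail of the optimal A-to-C path from B onward is the optimal B-to-C path.
print(path_ac, path_ac[1:] == path_bc)
```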

A.2. Dynamic programming

Consider a quantized state $x \in X$ at stage $(N-1)$. At this state, each of the admissible decisions $u^{(m)} \in U$ is applied.

$$X = [x_1\ x_2\ \ldots\ x_{n-1}\ x_n], \qquad U = [u^{(1)}\ u^{(2)}\ \ldots\ u^{(M-1)}\ u^{(M)}]$$

For each of these decisions the cost at the current stage can be determined as

$$L^{(m)} = L[x, u^{(m)}, N-1] \qquad (m = 1, 2, \ldots, M)$$

Next, for each of these decisions, the next state at stage $N$ is determined from the system equation,

$$x^{(m)}(N) = g[x, u^{(m)}, N-1] \qquad (m = 1, 2, \ldots, M)$$

The next step is to compute the minimum cost at stage $N$ for each of the states $x^{(m)}$. However, in general, a particular state $x^{(m)}$ will not lie on one of the quantized states $x \in X$ at which the optimal cost $I(x, N)$ is defined. In fact, it may lie outside of the range of admissible states. In the latter case the decision is rejected as a candidate for the optimal decision for this state and stage. If a next state $x^{(m)}$ does fall within the range of allowable states, but not on a quantized value, then it is necessary to use some type of interpolation procedure to compute the minimum cost function at these points.

Assume that the values of the minimum cost at the states $x^{(m)}$ can be expressed as a function of the values of the optimal cost at the quantized states $x \in X$:

$$I[x^{(m)}, N] = P[x^{(m)}, N, I(x, N)], \qquad \text{all } x \in X$$

where $I(x, N) = L(x, N)$. As is often the case, no decision is made at the final stage $k = N$, and hence the cost function at $N$ depends only on the final state $x(N)$.

The total cost of applying decision $u^{(m)}$ at state $x$, stage $(N-1)$, can then be written as

$$F_1^{(m)} = L[x, u^{(m)}, N-1] + I[x^{(m)}, N]$$

The minimization can be achieved by simply comparing the $M$ quantities. According to the functional equation, the minimum value will be the minimum cost at state $x$, stage $(N-1)$:

$$I[x, N-1] = \min_{u^{(m)} \in U} \left\{ L[x, u^{(m)}, N-1] + I[x^{(m)}, N] \right\} \qquad (\mathrm{A.1})$$

The optimal decision at this state and stage, $\hat{u}[x, N-1]$, is the control $u^{(m)}$ for which the minimum in Eq. (A.1) is actually attained.

This procedure is repeated at each quantized state $x \in X$ at stage $(N-1)$. When this has been done, $I(x, N-1)$ and $\hat{u}[x, N-1]$ are known for all $x \in X$. It is now possible to compute $I(x, N-2)$ and $\hat{u}[x, N-2]$ for all $x \in X$ based on knowledge of $I(x, N-1)$.

Fig. A.3. A controller based on retrieving the results of the dynamic programming computation from memory (block diagram with blocks MEMORY, CONTROLLER and SYSTEM; signals $x$, $k$ and $\hat{u}(x, k)$).

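Fig. A.4. Program flowchart, with the following steps:

(1) Initialization: compute $I(h^{(j)}, N)$, $j = 1, 2, \ldots, jN$.

(2) Set the stage $k = N-1$, the state index $j = 1$ and the control index $m = 1$.

(3) Next state: $h^{(m)} = g(h^{(j)}, u^{(m)}, k)$.

(4) Interpolate in the $n$ state variables to find $I(h^{(m)}, k+1)$ in terms of the values $I(h^{(j)}, k+1)$, $j = 1, 2, \ldots, jN$.

(5) Compute $F(h^{(j)}, u^{(m)}, k) = L(h^{(j)}, u^{(m)}, k) + I(h^{(m)}, k+1)$.

(6) Is $m = mN$? If no, set $m = m + 1$ and repeat from the next-state computation.

(7) $I(h^{(j)}, k) = \min_{m = 1, 2, \ldots, mN} \{F(h^{(j)}, u^{(m)}, k)\}$; $\hat{u}(h^{(j)}, k) = u^{(\hat{m})}$, where $\hat{m} = \arg\min_{m = 1, 2, \ldots, mN} \{F(h^{(j)}, u^{(m)}, k)\}$.

(8) Is $j = jN$? If no, set $j = j + 1$ and repeat for the next state.

(9) Is $k = 0$? If no, set $k = k - 1$ and repeat for the previous stage; if yes, STOP.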

The general iterative procedure continues this process. Suppose that $I(x, k+1)$ is known for all $x \in X$. Then $I(x, k)$ and $\hat{u}[x, k]$ are computed for all $x \in X$ from

$$I[x, k] = \min_{u^{(m)} \in U} \left\{ L[x, u^{(m)}, k] + I[x^{(m)}, k+1] \right\} \qquad (\mathrm{A.2})$$

where $x^{(m)}$ is determined from

$$x^{(m)} = g[x, u^{(m)}, k]$$

and where $I(x^{(m)}, k+1)$ is computed by interpolation on the known values $I(x, k+1)$ for all $x \in X$:

$$I[x^{(m)}, k+1] = P[x^{(m)}, k+1, I(x, k+1)], \qquad \text{all } x \in X$$

The optimal decision $\hat{u}[x, k]$ is the decision for which Eq. (A.2) takes on the minimum. The iterative procedure begins by computing $\hat{u}[x, N-1]$ and $I(x, N-1)$ from the given boundary conditions $I(x, N)$, and it continues until it arrives at $\hat{u}[x, 0]$ and $I(x, 0)$. The complete results of dynamic programming are shown in Fig. A.2. At each state of each stage, the optimal decision is written below and the minimum cost is written above. Finally, we can find the optimal sequence of decisions starting from the given $x(0)$ and the system equation. This is called the recovery procedure, and these decisions are the inputs for our experiments. Because the state was not measured, this sequence relies entirely on the system equation; our simulated results were obtained in this manner.

If we could monitor the state and stage of the system, the dynamic programming solution, $\hat{u}[x, k]$, leads to a feedback control or decision policy configuration. One method of implementing this solution is to simply store all the values of $\hat{u}[x, k]$ in memory, monitor the state and stage of the system, and look up the appropriate value of $\hat{u}[x, k]$ as required. This type of implementation is attractive because the dynamic programming calculations can be done off-line, and the only operation that needs to be done during the decision interval is retrieval of the appropriate optimal decision. The system configuration is shown in Fig. A.3.
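The memory-based controller can be illustrated with a small lookup structure. This is only a sketch: the class name `PolicyTable`, the nearest-neighbour rounding of the measured state to a quantized state, the table contents and the disturbance are assumptions introduced for illustration.

```python
import numpy as np


class PolicyTable:
    """Stored optimal decisions from an off-line dynamic programming run."""

    def __init__(self, x_grid, u_table):
        self.x_grid = np.asarray(x_grid, dtype=float)    # quantized states
        self.u_table = np.asarray(u_table, dtype=float)  # shape (stages, states)

    def lookup(self, x_measured, k):
        """Return the stored decision for the quantized state nearest to x_measured."""
        i = int(np.argmin(np.abs(self.x_grid - x_measured)))
        return float(self.u_table[k, i])


# Hypothetical table: remove one unit per stage while any film remains.
table = PolicyTable(x_grid=[0, 1, 2, 3], u_table=[[0, 1, 1, 1]] * 3)

x = 3.0
for k in range(3):
    u = table.lookup(x, k)        # the on-line work is a single retrieval
    x = x - u + 0.2 * (k == 0)    # assumed disturbance at stage 0
print(x)
```

Because the decision is looked up from the measured state rather than replayed from a precomputed sequence, the controller keeps acting sensibly after the disturbance moves the state off the predicted trajectory.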

A.3. Simulation Results

We used a very simple concept to obtain the equation for our simulation. The rate at which the thickness being polished decreases equals the removal rate, and we took the removal rate as the input. The equation is written as

$$\dot{h} = \frac{dh}{dt} = -RR = -u$$

where $h$ is the thickness, $RR$ the removal rate, and $u$ the input. The difference equation version, computed using a sampling period of $T$ (1 s), is

$$h(k+1) = h(k) - T\,u(k) \qquad (\mathrm{A.3})$$

where $k$ is the stage. We assumed that there were only seven values of the input (including 0) because of restrictions on the Westech 372M CMP machine, and the sampling period $T$ was fixed at 1 s. For each of these inputs the cost at the current stage can be determined as

$$L^{(m)} = \tfrac{1}{2} q h^2(k) + \tfrac{1}{2} r u^2(k) \qquad (m = 1, 2, \ldots, 7) \qquad (\mathrm{A.4})$$

and the cost at the final stage $N$ was determined as

$$I(x, N) = L(x, N) = \tfrac{1}{2} s h_N^2 \qquad (\mathrm{A.5})$$

where $s$ is the weighting factor of the final state, $q$ the weighting factor of the transient state, and $r$ the weighting factor of the input. We then suppose that at a quantized state $h \in H$ the admissible inputs $u^{(m)} \in U$ are applied.

$$H = [6000\ 5999\ 5998\ \ldots\ 2\ 1\ 0] \qquad (jN = 6001)$$

$$U = [u^{(1)}\ u^{(2)}\ u^{(3)}\ u^{(4)}\ u^{(5)}\ u^{(6)}\ u^{(7)}] \qquad (mN = 7)$$

We can obtain $I(h, N-1)$ and $\hat{u}[h, N-1]$ for all $h \in H$ by substituting Eqs. (A.3), (A.4) and (A.5) into Eq. (A.2), which is presented in Section A.2:

$$I[6000, N-1],\ I[5999, N-1],\ \ldots,\ I[1, N-1],\ I[0, N-1]$$

$$\hat{u}(6000, N-1),\ \hat{u}(5999, N-1),\ \ldots,\ \hat{u}(1, N-1),\ \hat{u}(0, N-1)$$
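The whole computation for Eqs. (A.3) to (A.5), carried back to stage 0 and followed by the recovery procedure, might be sketched as below. The number of stages, the weighting factors and the seven input values are assumptions chosen for illustration, since the values used in the experiments are not stated in this appendix. The thickness grid is stored in ascending order so that `numpy.interp` can be used for the interpolation step.

```python
import numpy as np

# All numerical values below are illustrative assumptions.
T = 1.0                            # sampling period (s)
N = 120                            # number of stages
q, r, s = 1.0, 1.0, 100.0          # weighting factors
H = np.arange(0.0, 6001.0)         # quantized thickness 0..6000 (jN = 6001)
U = np.arange(0.0, 70.0, 10.0)     # seven admissible inputs, including 0

I = 0.5 * s * H**2                 # Eq. (A.5): cost at the final stage
policy = np.zeros((N, H.size))     # optimal input for every stage and state

for k in range(N - 1, -1, -1):     # stages N-1, N-2, ..., 0
    best = np.full(H.size, np.inf)
    for u in U:
        h_next = H - T * u         # Eq. (A.3)
        ok = h_next >= 0.0         # reject inputs that leave the admissible range
        F = np.full(H.size, np.inf)
        F[ok] = (0.5 * q * H[ok]**2 + 0.5 * r * u**2   # Eq. (A.4)
                 + np.interp(h_next[ok], H, I))        # next-stage cost
        better = F < best          # Eq. (A.2): keep the cheapest input so far
        best[better] = F[better]
        policy[k, better] = u
    I = best

# Recovery procedure: follow the stored decisions from h(0) = 6000.
h, inputs = 6000.0, []
for k in range(N):
    u = policy[k, int(round(h))]
    inputs.append(u)
    h = h - T * u
print(h, sum(inputs))
```

With these assumed weights the terminal penalty dominates, so the recovered input sequence removes the full 6000 units of thickness before the final stage.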

It is now possible to compute $I(h, N-2)$ and $\hat{u}[h, N-2]$ for all $h \in H$ based on knowledge of $I(h, N-1)$. The iterative procedure continues until $\hat{u}[h, 0]$ and $I(h, 0)$ have been computed. The program flowchart is shown in Fig. A.4, and the complete results of dynamic programming can be plotted as in Fig. A.2. Finally, we can find the optimal sequence of inputs starting from the given $h(0)$ and Eq. (A.3) by means of the recovery procedure.

References

[1] Texas Engineering Extension Service, CMP, TEEX, 2001.

[2] C.K. Hu, J.M.E. Harper, Mater. Chem. Phys. 52 (1998) 5–16.

[3] Y. Ein-Eli, E. Abelev, E. Rabkin, D. Starosvetsky, J. Electrochem. Soc. 150 (9) (2003) C646–C652.

[4] F. Preston, J. Soc. Glass Technol. 11 (1927) 214–247.

[5] F. Zhang, A. Busnaina, J. Feng, M.A. Fury, in: Proceedings of the Fourth International Chemical–Mechanical Planarization for ULSI Multilevel Interconnection Conference, 1999, pp. 61–64.

[6] W.T. Tseng, Y.L. Wang, J. Electrochem. Soc. 144 (2) (1997) L15–L17.

[7] B. Zhao, F.G. Shi, in: Proceedings of the Fourth International Chemical–Mechanical Planarization for ULSI Multilevel Interconnection Conference, 1999, pp. 13–22.

[8] F.G. Shi, B. Zhao, Appl. Phys. A 67 (1998) 249–252.

[9] J. Luo, D.A. Dornfeld, IEEE Trans. Semiconduct. Manufact. 14 (2) (2001) 112–132.

[10] J.-B. Chiu, C.-C. Yu, S.-H. Shen, Microelectron. Eng. 65 (2003) 345–356.

[11] C.-S. Lin, C.Y. Chi, Microelectron. Eng. 75 (2004) 285–296.

[12] R. Bellman, Dynamic Programming, Princeton University Press, Princeton, NJ, 1957.

[13] Hung-Yin Tsai, ‘‘Analysis of Non-uniformity Based on Material Removal and Kinematics in Chemical Mechanical Planarization,’’ M.S. thesis, Department of Power Mechanical Engineering, National Tsing Hua University, 1999.

[14] Keng-Ming Liu, "Study on the Chemical Mechanical Polishing of Copper Thin Films," M.S. thesis, Department of Electronic Engineering, National Chiao Tung University, 2000.

[15] R.E. Larson, J.L. Casti, Principles of Dynamic Programming, Marcel Dekker, 1978.

[16] Junne-Dar Chan, "Tribochemical Reactions on Cu-film Influenced by Mechanical Stresses in Chemical Mechanical Polishing," M.S. thesis, Department of Mechanical Engineering, National Cheng Kung University, 2000.

[17] K. Wijekoon, S. Tsai, M. Chandrachood, B. Brown, F. Redeker, S. Nanjangud, G. Amico, in: SEMI/Japan Technical Symposium, 1998.

[18] J.M. Steigerwald, S.P. Murarka, D.J. Duquette, R.J. Gutmann, Mat.