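Defect Analysis and Prediction for the Etching Process Using Neural Networks

National Chiao Tung University

Department of Mechanical Engineering

Master's Thesis

An Online/Offline Prediction Model for RIE Using Neural Networks

Graduate student: Nesrin Talat

September 2006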

Defect Analysis and Prediction for the Etching Process Using Neural Networks

An Online/Offline Prediction Model for RIE Using Neural Networks

Student: Nesrin Talat

Advisor: Dr. An Chen Lee

National Chiao Tung University

Department of Mechanical Engineering

Master's Thesis

A Thesis

Submitted to the Institute of Mechanical Engineering, College of Engineering,

National Chiao Tung University, in Partial Fulfillment of the Requirements

for the Degree of Master of Science

in

Mechanical Engineering

September 2006

Hsinchu, Taiwan, Republic of China

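Defect Analysis and Prediction for the Etching Process Using Neural Networks

Graduate student: Nesrin Talat

Advisor: Prof. An Chen Lee

Master's Thesis, Department of Mechanical Engineering, National Chiao Tung University

Abstract (in Chinese)

Modern semiconductor manufacturing involves thousands of process steps and parameters. A process monitoring mechanism plays an indispensable role in effectively monitoring and detecting abnormal process parameters. To raise output and yield and to reduce the occurrence of process defects, every process step is monitored for abnormalities, using defect detection on the process parameters and predicting defects one step ahead. The purpose of this thesis is to develop, for the oxide reactive ion etching (RIE) process, a method that monitors the process in real time and predicts whether the process parameters of the next lot or the next process step will be abnormal.

In this thesis, principal component analysis (PCA) and neural networks (NN) are used to analyze the process parameter data. The process factors are first processed by PCA to obtain the principal components, which are then fed into a neural network built for the RIE process parameters to predict whether an abnormality occurs. According to actual process needs and conditions, the RIE process monitoring model is built in two parts. The first part uses the information from all process steps to detect parameter abnormalities and is called the offline defect detection module. The other part uses real-time information from the first three process steps to predict abnormalities in the fourth process step and is called the online defect detection module.

Experimental validation with actual process parameter data and the two modules confirms that abnormal process parameters can indeed be detected and predicted. In addition, this monitoring mechanism reduces the time needed to clear an abnormality when one occurs and enables low-cost real-time control applications.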

An Online/Offline Prediction Model for RIE Using Neural Networks

Student: Nesrin Talat

Advisor: Dr. An Chen Lee

Institute of Mechanical Engineering

National Chiao Tung University

Abstract

The fabrication of modern semiconductor products requires thousands of processing steps. A key element in achieving high yields during semiconductor fabrication is to minimize the number of defective wafers. Therefore, detecting defective wafers and predicting the wafer status are very important issues.

In this study, the back-propagation neural network (BPNN) is the backbone of the prediction models, and the BPNN inputs were prepared in three different ways: raw data, using the first set of data points; samples captured from the original data; and statistical summary values, obtained by calculating the mean and standard deviation for each step.

Four prediction models are established to predict the wafer status: offline back-propagation neural network (BPNN), offline principal component analysis BPNN

reduce the overall cost of ownership of semiconductor equipment by increasing the wafer yield and throughput of product wafers, and not depend upon monitor wafers or expensive metrology; rather, it will enable inexpensive real-time wafer-to-wafer control applications in RIE.

This study establishes a method for deciding the significant process parameters that affect the wafer status in RIE by comparing the results of applying each process parameter alone in the BPNN. The significant parameters for all etching steps are combined in the offline BPNN to detect defective wafers. Furthermore, the significant parameters for the first three etching steps are combined in the online BPNN to forecast the wafer status. By modifying the significant parameters when the online BPNN model predicts a defective wafer, the down-time and mean-time-to-repair of the equipment can be decreased.

The evaluation results for the four models demonstrate that each model has its advantages and disadvantages under different BPNN input preparations. However, preparing a statistical summary as the BPNN input yields a lower prediction error. Therefore, using the statistical summary in the online PCABPNN is recommended to enable rapid prediction of wafer status in RIE, which greatly reduces the need for test wafers.


Acknowledgements

While this degree has been mostly a personal journey, its completion would not have been possible without the help, support and encouragement of many individuals. First of all, I would like to take this opportunity to extend my warmest gratitude to my advisor, Professor An Chen Lee, for his support, encouragement and patience throughout this degree. His unique insight, enthusiasm and broad-ranging interests have been truly inspirational.

I would also like to acknowledge Dr. Wen-Chin Chen for his time and for his helpful discussion and guidance on neural networks. I would like to thank the faculty of the mechanical department at the National Chiao Tung University for educating me and helping me realize my potential. I also want to thank the members of the cluster and automatic labs for their friendship and for providing help whenever I needed it.

My experience was enhanced, both intellectually and socially, by the valuable interaction with the PSC engineers.

Mom and Dad, your constant encouragement and support throughout the years have given me the strength and courage to do things I would have never dreamed possible. I would like to thank: my dad for teaching me that "there is always room for improvement", as well as my mom for teaching me patience and reminding me that "We are here on earth for a purpose", and my elder brother, Samer, for showing me how well and elegantly that knowledge can be used. In particular I would not have been able to get


Dedication

To some world peace is a dream; attractive, alluring,

desirable, fascinating,

perhaps even a little intoxicating; but still a dream...

Then there are those who make dreams come true

...


Abstract (in Chinese)
Abstract
Acknowledgements
Dedication
Table of Contents
List of Figures
List of Tables
List of Abbreviations

Chapter 1 Introduction
1.1 Defect reduction overview
1.2 Motivation
1.3 Problem statement
1.4 Thesis statement
1.5 Thesis approach
1.6 Thesis organization

Chapter 2 Neural Network
2.1 Related work

Chapter 3 Reactive Ion Etch
3.1 Etch process overview
3.2 Reactive ion etch
3.2.1 Plasma in RIE
3.2.2 Cooling system in RIE
3.3 Study case: oxide RIE in PSC
3.4 RIE parameters
3.4.1 RF power
3.4.2 Pressure

Chapter 4 Principal Component Analysis

Chapter 5 Prediction Models for RIE
5.1 Data collection from RIE
5.2 Data preparation
5.3 Architecture of prediction models
5.3.1 BPNN
5.3.2 Training
5.4 Significant factors
5.4.1 Significant factors for offline BPNN
5.4.2 Significant factors for online BPNN
5.5 Evaluations of prediction models
5.5.1 Offline BPNN prediction model
5.5.3 Offline PCABPNN prediction model
5.5.4 Online PCABPNN prediction model
5.6 Summary

Chapter 6 Conclusion
6.1 Conclusion
6.2 Future Extensions

References

Appendix I Terminology


List of Figures

Fig. 1.1 Fault density, yield, and output as a function of the time since the inception of a semiconductor technology node
Fig. 1.2 Wafer surface shape before and after etching process
Fig. 2.1 The artificial neural network cell
Fig. 2.2 The back propagation neural network model with one output
Fig. 2.3 Major steps of BPNN algorithm
Fig. 3.1 The four phenomenological etching mechanisms
Fig. 3.2 Reactive ion etch tool
Fig. 3.3 RIE mechanism
Fig. 3.4 Helium backside cooling
Fig. 3.5 2300 Exelan Flex, etching equipment utilized in many VLSI factories
Fig. 3.6 Sketch of 2300 Exelan Flex equipment, including some controllers
Fig. 3.7 Sketch of the main etching chamber inside RIE chamber
Fig. 3.8 Process recipe for oxide film etching in RIE
Fig. 3.9 Main factors of oxide RIE
Fig. 3.10 Etch rate and DC bias as a function of the RF power
Fig. 3.11 Qualitative effect of pressure on ion energy and the etching mechanism
Fig. 4.1 Flow chart of principal component analysis (PCA) algorithm
Fig. 4.2 Eigenvalues of the covariance matrix for RIE steps
Fig. 5.2 Percentage of the training and testing wafers in this case study
Fig. 5.3 Box plots of data input number versus each step of given data, and box plots of process time versus each step of given data
Fig. 5.4 The position of captured samples
Fig. 5.5 Statistical summary data preparation technique
Fig. 5.6 Offline BPNN prediction model
Fig. 5.7 Online BPNN prediction model
Fig. 5.8 Offline PCABPNN prediction model
Fig. 5.9 Result of testing 30 wafers in offline PCABPNN prediction model
Fig. 5.10 Online PCABPNN prediction model
Fig. 5.11 Result of testing 30 wafers in online PCABPNN prediction model
Fig. 5.12 Summary of error prediction in online/offline prediction models using several data preparation techniques


List of Tables

Table 3.1 Comparison between wet and dry etch characteristics
Table 3.2 RIE signals and their clarification
Table 4.1 RIE factors and their abbreviations
Table 4.2 Eigenvectors of the covariance matrix for the RIE steps
Table 4.3 PC variance percentage for RIE steps
Table 5.1 Number of suggested samples for each step in the sampling technique
Table 5.2 The methods used in each prediction model
Table 5.3 Significant factors of RIE process decided after applying the first 180 data points of each factor in BPNN
Table 5.4 Significant factors of RIE process decided after applying the 34 captured samples of each factor in BPNN
Table 5.5 Significant factors of RIE process decided after applying the twenty statistical summary values of each factor in BPNN
Table 5.6 Significant factors of the first three stabilization steps decided after applying the first 20 data points in BPNN
Table 5.7 Significant factors of the first three stabilization steps decided after applying the seven captured samples in BPNN
Table 5.8 Significant factors of the first three stabilization steps decided after applying the six statistical summary values of each factor in BPNN
Table 5.9 The performance of offline BPNN prediction model
Table 5.10 The performance of online BPNN prediction model
Table 5.11 The performance of offline PCABPNN prediction model
Table 5.12 The performance of online PCABPNN prediction model

List of Abbreviations

AENN : Auto-Encoder Neural Networks
AIC : Akaike's Information Criterion
BPNN : Back Propagation Neural Networks
DPM : Defects per Million
D-S : Dempster–Shafer theory
FIS : Final Information Statistic
IC : Integrated Circuit
IMC : Internal Model Control
LVQ : Learning Vector Quantization
MCM : Multi-Chip Modules
MPC : Model Predictive Control
MSM : Metal Semiconductor Metal
NIC : Network Information Criterion
NN : Neural Networks
NNIC : Neural Network Information Criterion
OES : Optical Emission Spectroscopy
PCA : Principal Component Analysis
PNN : Polynomial Neural Network
PSC : Power Ship Corp.
R&D : Research and Development
RBFN : Radial Basis Function Network
RGA : Residual Gas Analysis
RSM : Response Surface Model
SOV : Shutoff Valve
TSNN : Time Series Neural Networks

CHAPTER 1 INTRODUCTION

The technology of semiconductor fabrication is complex and requires many specialized process steps. These steps include wafer preparation, device fabrication, device test, and packaging. Device fabrication is the most complex manufacturing step in the production of semiconductor integrated circuits, and typically includes photolithography, ion implantation, etching, thermal treatments, chemical vapor deposition, physical vapor deposition, molecular beam epitaxy, electroplating, chemical mechanical polishing, wafer testing, and back grinding [1].

The growth in integrated circuit (IC) performance over the past 40 years has been both influential and ubiquitous, as described by Moore's law, which states that the number of transistors per chip (per IC) will double every year or two while the cost remains the same. The validity of Moore's law has aggressively driven the scaling down and integration of IC devices, so that smaller and smaller devices are enabled in larger and larger quantities [2]. These advances have also put more constraints on design manufacturability, because the manufacturing process has become less tolerant to variations, which can easily become a source of defects.


1.1 Defect Reduction Overview

In order to achieve low yield loss and high quality, semiconductor manufacturers seek to maximize their yield rates throughout the process life cycle. The intentions of the process life cycle phases differ; consequently, the technology requirements for defect detection in those phases differ as well and call for specific defect detection tools or recipes. Figure 1.1 illustrates the semiconductor life cycle phases, where the following three phases are defined by the defect reduction technology subgroup [3, 4]:

Figure 1.1: Fault density, yield, and output as a function of the time since the inception of a semiconductor technology node. [5]


• Process Research and Development (R&D): This phase is characterized by relatively low production rates and yields, experimental development of process parameters, and detailed characterization and identification of defects.

• Yield Ramp (Yield Enhancement): This phase improves the baseline yield for a given technology generation, starting from the R&D yield level and ending at volume production. During this phase, the yield moves from approximately 20% to 80%. There are two ways to achieve the required yield ramp: reduce the total number of defect and fault sources, and reduce the time needed to locate and fix each new defect source or mechanism.

• Volume Production: This phase represents the final stage of the semiconductor process life cycle, in which no further tuning of the process control parameters is attempted. The objective of using defect detection tools in this stage is to identify process excursions as rapidly as possible, which requires very-high-throughput tools and methods. In this phase the process is well seasoned, so the problems are frequently catastrophic and could involve shutting down the line when faults occur.

The rapid identification of defect and fault sources through integrated data management continues to be the essence of rapid defect learning [4]. Defects must be detected, analyzed, and eliminated within a brief period of time. In particular, visual inspection plays a fundamental role in defect detection. Visual inspection is often carried out by a human expert; however, new technology features have made this inspection unreliable. For this reason, many researchers have worked to develop automatic analysis of manufacturing processes, defect prediction models, and automatic optical inspection. Moreover, neural network models perform very well in defect prediction and automatic optical inspection.

1.2 Motivation

As semiconductor processing technology approaches the 0.1 μm feature size and 300 mm wafer diameter, customer demands for low defects per million (DPM) have not changed. Fundamentally, there are only two ways to meet these demands: first, allow fewer bad or weak dies to reach the customer; second, reduce the intrinsic number of bad or weak dies [6]. To meet these customer demands, the industry has strived to improve yield and equipment utilization.

To optimize the yield and equipment utilization during semiconductor fabrication, wafer defects should be minimized and proper processing of the wafers at each step should be confirmed. On the other hand, measuring each wafer after each step is extremely difficult because of the cost and the long time it consumes. At present, experts in this industry usually measure and monitor wafers periodically, especially right after performing preventive maintenance and changing machine settings. A final test is performed on each wafer after all steps. Thus, if an error occurs, it is very likely that many wafers are misprocessed without notice until very late. Because of the late notice, it is extremely difficult to trace back and locate the faulty step and diagnose the problem. Therefore, one can save considerable resources by predicting the wafer state after or during each step. In this research it has been demonstrated that it


Reactive ion etch (RIE) is a plasma etching method commonly used in VLSI, in which ions bombard and react with the wafer surface in a plasma environment. The plasma etching process is very difficult to control, since its physical mechanism is not well understood.

A wafer defect occurs in RIE when there is a sudden change in etching behavior. This change can be caused by operator errors or by machine errors such as a gas leak, power fault, or pressure fault. The main defect in oxide RIE is un-open etch, which costs 10%-20% yield loss at PSC. Un-open etch signifies that the etched opening in the wafer surface is inadequate (see Figure 1.2). This thesis explores various issues of wafer defects and monitoring, including un-open etch defect diagnosis, an offline defect prediction model, an online defect prediction model, and the differences between these two prediction models.

Many modeling techniques have been used to model plasma etching throughput. However, they are limited in their ability to predict yield quality and do not provide the flexibility to detect wafer defects during the etch process. This thesis focuses on improving RIE output by predicting wafer defects during and after the etching process. Predicting wafer status is supported by careful experimental design and implementation. In the long term, this study will improve etch yield in at least the following aspects:


(1) It will aid in identifying the dominant relationships between wafer defects and etch factors, so that very good product quality can be achieved in continuous production.

(2) It provides a fundamental understanding of the mechanisms involved in defect formation, which is important for reliability during the RIE process.

(3) It offers an offline model, which provides a practical estimate of wafer status after the etching process.

(4) In addition, it presents an online model to predict wafer status after the stabilization steps during the etching process.

(5) It performs wafer inspection for the RIE process without sacrificing wafer yield and without depending upon monitor wafers or expensive metrology, which is the main benefit of predicting the wafer status.

(6) It will enable inexpensive real-time wafer-to-wafer control applications in RIE.

1.3 Problem Statement

Reactive ion etching (RIE) is a key process in VLSI circuit fabrication, which combines physical ion bombardment and chemical reactions in plasma etching process. In addition, this critical technology is not only very expensive, but also difficult to control because it is not well understood [7]. In fact, a malfunctioning plasma etcher can generate


Silicon dioxide film etching is of major interest for IC interlayer dielectric materials and multi-chip modules (MCM). Etching a silicon dioxide film without any errors is a complex task. One of the most significant defects in this process is un-open etch (Figure 1.2). Process engineers have to ensure not only that the processes and products are high-yielding but also that errors in manufacturing are eliminated as quickly as possible through continuous and timely improvements.

Figure 1.2: Wafer surface shape before and after etching process.

1.4 Thesis Statement

This thesis attempts to gain an understanding of the relationship between the un-open etch defect and the RIE process, and to provide empirical models that predict wafer status. Predicting wafer status in RIE is important for enhancing yield, quality, and efficiency. In-depth analysis has been carried out to improve the prediction models so that they deliver practical benefits. Furthermore, the preparation of data for the neural network is discussed and applied in three different ways.


1.5 Thesis Approach

In the RIE process, the final yield is the result of intensive interactions between process factors, etchant molecules, and the wafer surface. In our approach, BPNN with and without PCA provides a comprehensive view of the process factors and links them to the wafer status during and after the RIE process. This approach therefore emphasizes the application of predicting wafer defects and evaluates the relationship between RIE factors and wafer defects.

To support a fuller understanding of this work, additional background information is given in the following chapters. To achieve the goal of predicting wafer defects, the following steps are performed:

Step one: Gain a full understanding of the RIE process and of additional background such as the plasma effect on the etching process, the back propagation neural network, and principal component analysis.

Step two: Analyze the RIE factors and their interaction with the un-open etch defect.

Step three: Consider the RIE data from PSC and properly prepare them in three different ways: raw data, sampling data, and statistical summary.

Step four: Apply the RIE factors in the BPNN, each factor alone, to find the significant factors for each classification.

Step five: Decide the combination of significant factors and apply this combination in the offline BPNN model.

Step eight: Build the online defect prediction model using the data of the first three etching steps, and follow the previous steps to obtain the online BPNN model and the online PCABPNN model.

Step nine: Evaluate the numerical and performance results of the online and offline prediction models, and clarify their advantages and disadvantages.
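The three data preparations named in step three (raw data, sampling data, and statistical summary) can be illustrated with a minimal sketch. The function names, the toy readings, the even spacing of the captured samples, and the use of the population standard deviation are all assumptions made for illustration; the thesis specifies its own sample positions and counts in Chapter 5.

```python
import statistics


def first_points(signal, n_points):
    """Raw-data preparation: keep the first n_points readings of one factor."""
    return list(signal[:n_points])


def captured_samples(signal, n_samples):
    """Sampling preparation: pick n_samples readings spread over the signal.

    Even spacing is an assumption of this sketch, not the thesis's rule.
    """
    if n_samples <= 0 or not signal:
        return []
    if n_samples == 1:
        return [signal[0]]
    last = len(signal) - 1
    return [signal[round(i * last / (n_samples - 1))] for i in range(n_samples)]


def statistical_summary(steps):
    """Summary preparation: mean and standard deviation of each step."""
    summary = []
    for readings in steps:
        summary.append(statistics.mean(readings))
        summary.append(statistics.pstdev(readings))  # population std (assumed)
    return summary


# Hypothetical readings of one RIE factor over three etching steps.
steps = [[1.0, 1.0, 1.0, 1.0], [2.0, 4.0, 4.0, 6.0], [5.0, 5.0, 7.0, 7.0]]
signal = [value for readings in steps for value in readings]

raw_inputs = first_points(signal, 5)         # first five readings
sampled_inputs = captured_samples(signal, 4)  # four evenly spaced readings
summary_inputs = statistical_summary(steps)   # two values per step
```

With three steps the summary holds six values (a mean and a standard deviation per step), so its length depends only on the number of steps and not on how many readings each step contains.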

1.6 Thesis Organization

This thesis presents an integrated framework to diagnose defects that occur during the RIE process. Chapter 2 reviews neural network models and related work. Chapter 3 describes the etch process, and Chapter 4 presents the application of PCA to RIE. Chapter 5 presents the offline defect prediction model for RIE using neural networks together with its simulation results, and also presents the online model. Finally, Chapter 6 concludes by summarizing the important contributions of this thesis and discussing some interesting future directions.


CHAPTER 2 NEURAL NETWORK

In recent years, neural networks have been successfully applied to pattern recognition, modelling, and function approximation. Furthermore, neural network models have potential in the modelling and control of non-linear dynamic systems, within schemes conventionally used for dynamic processes such as model predictive control (MPC), internal model control (IMC), and model inversion control. Several studies in semiconductor manufacturing investigate the application of artificial neural networks (ANN) to process optimization, control, and diagnosis [8-13].

On the other hand, semiconductor manufacturers have faced unprecedented operational dilemmas and challenges, and some back-end semiconductor manufacturers were forced to merge. To survive in such a tough industry environment, semiconductor front-end manufacturers must have superb products and production technology, and they must continuously improve their production capability and increase production efficiency in order to save costs and increase sales.


2.1 Related work

Neural networks have seen an explosion of interest over the last few decades and are being successfully applied in semiconductor inspection systems [9, 14-17]. The major reason for adopting neural networks is their potential in the modelling, control, and categorization of non-linear systems. Moreover, neural networks can learn arbitrary nonlinear mappings between noisy sets of input and output data. The back-propagation neural network (BPNN), also known as a feed-forward neural network or multilayer perceptron (MLP), currently uses the most popular learning rule in supervised learning.

Back propagation through time is a very powerful tool, with applications to problems of prediction, optimization, control, and diagnosis in the semiconductor manufacturing process [18-21]. Most of the literature adopts BPNN because of its easily understood theory, fast recall speed, and high learning accuracy. However, determining the network architecture and parameters is difficult. Since the function approximation ability of a neural network depends on its architecture, poor prediction can result from the complexity of the problem itself or from an improper selection of the architecture or parameters. The parameters of a neural network include the number of hidden layers, the number of hidden units, the learning rate, the momentum, etc. These factors have a great influence on the approximation ability of the neural network. Fogel [22] suggested the use of the final information statistic (FIS), based on Akaike's information criterion (AIC), to determine the number of hidden layers and neurons. Murata and Yoshizawa [23] and Onoda [24] proposed improved versions of AIC that apply statistical probability and an energy function to determine the number of neurons. Their methods are called the network information criterion (NIC) and the neural network information criterion (NNIC), respectively. The Taguchi method has also been used to design the parameters of neural networks in previous research. Khaw, Lim and Lim [25] applied the Taguchi method to design the parameters and verified that the method could design the optimal parameters quickly and robustly. Santos and Ludermir [26] applied factorial design to assist the design and implementation of a neural network.

Numerous researchers have studied pattern classification using BPNN for automatic inspection systems in the semiconductor industry [27-29]. Zoroofi et al. [27] used curve recognition to detect contamination on a wafer surface during semiconductor production. Three conventional classification models, a back-propagation technique, a minimum distance algorithm, and a maximum likelihood classifier, were used and their performance was compared. The results showed that the back-propagation classifier has the better classification performance. Su et al. [28] proposed a neural-network approach for semiconductor wafer post-sawing inspection. BPNN, radial basis function network (RBFN), and learning vector quantization (LVQ) were employed in the inspection models. The inspection results showed that both BPNN and LVQ give excellent predictions, with 100% accuracy. Chen et al. [29] used BPNN in the semiconductor etch process to identify and classify endpoint curves. By real-time monitoring of changes in the endpoint curve, product abnormalities can be detected immediately. The system can reduce the uncertainty in process curve classification and suggest a machine shut-down immediately when necessary. In this thesis, a back propagation neural network is used to identify and predict the wafer status during and after RIE. The next section deals with back propagation neural networks (BPNN).

2.2 Back Propagation Neural Networks Structures

Neural networks have emerged as an important tool to model and diagnose problems in complex manufacturing processes. Many types of neural networks can map the complex relationship between input and output through supervised training algorithms, such as associative memory networks, radial basis function networks, and the back propagation neural network (BPNN). The BPNN consists of an input layer, hidden layers, and an output layer, and is governed by several parameters, including the number of hidden layers, the number of hidden neurons, the learning rate, and the momentum. All of these parameters have a significant impact on the performance of the neural network.


Neural network cells (neurons) are the basic elements of neural networks [30]. In general, the neurons are connected by links that carry weights. A neural network may consist of multiple layers of neurons interconnected with neurons in other layers: one input layer, one or more hidden layers, and one output layer. As demonstrated in Figure 2.1, the inputs and the interconnection weights are processed by a weighted summation function to produce a sum, which is then passed to an activation function. The result of the activation function is the output of the neuron.

Figure 2.2: The Back propagation neural networks model with one output

Figure 2.2 illustrates the back propagation neural network structure, where the boxes represent the hidden neurons. Each processing neuron first calculates the weighted sum of its inputs plus a bias term (Eq. 2.1), and then generates an output $f(s_j)$ through an activation function; the sigmoidal function (Eq. 2.2) is used as the activation function in this case. Although there are several types of activation functions, sigmoidal functions are the most commonly used.

$$ s_j = \sum_{n=1}^{N} w_{n,j}\, x_n + b_j \tag{2.1} $$

$$ f(s) = \frac{1}{1 + e^{-s}} \tag{2.2} $$
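The weighted sum of Eq. 2.1 and the sigmoidal activation of Eq. 2.2 can be sketched for a single neuron as follows. The function names and the numeric inputs, weights, and bias are hypothetical values chosen only so that the result is easy to check by hand.

```python
import math


def sigmoid(s):
    """Sigmoidal activation of Eq. 2.2: f(s) = 1 / (1 + exp(-s))."""
    return 1.0 / (1.0 + math.exp(-s))


def neuron_output(inputs, weights, bias):
    """Weighted sum of Eq. 2.1 followed by the sigmoidal activation."""
    s = sum(w * x for w, x in zip(weights, inputs)) + bias
    return s, sigmoid(s)


# Hypothetical inputs, weights, and bias for one hidden neuron:
# s_j = 0.5 * 1.0 + (-0.25) * 2.0 + 0.0 = 0.0, and f(0) = 0.5.
s_j, out_j = neuron_output([1.0, 2.0], [0.5, -0.25], 0.0)
```

Because the sigmoid maps any weighted sum into the interval (0, 1), the neuron output stays bounded regardless of the scale of the weights.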

Next, the outputs are calculated by finding the weighted sum of all interconnected signals from the hidden layer plus a bias term, and then generating an output ($y$) through an activation function. The BPNN usually learns by making changes in its weights ($w_{i,j}$, $w_{j,k}$) when the mean square error (Eq. 2.3) is larger than an acceptable limit.

$$ E = \frac{1}{2} \sum_{k=1}^{K} (d_k - y_k)^2 \tag{2.3} $$

where $d_k$ is the desired output (actual output), $y_k$ is the BPNN output, and $k$ indexes the $K$ output neurons; in this thesis one output neuron is constructed. In the training process, the input and output variables chosen for network learning are presented to the model in a normalized form, and the weights between the hidden and output layers are adjusted first by using the expressions below:
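The error of Eq. 2.3 and the decision to keep adjusting the weights can be sketched as below. The label encoding (1.0 for the desired output), the network output of 0.8, and the acceptable limit of 0.01 are hypothetical values used only for illustration.

```python
def output_error(desired, predicted):
    """Error of Eq. 2.3: E = 1/2 * sum over k of (d_k - y_k)^2."""
    return 0.5 * sum((d - y) ** 2 for d, y in zip(desired, predicted))


# One output neuron, as in the thesis; the numbers are hypothetical.
error = output_error([1.0], [0.8])  # 0.5 * (1.0 - 0.8)^2, about 0.02
acceptable_limit = 0.01             # assumed threshold
needs_update = error > acceptable_limit
```

Here the error exceeds the assumed limit, so another weight update would be performed.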

$$ w_{j,k}^{(m)} = w_{j,k}^{(m-1)} + \Delta w_{j,k}^{(m)} \tag{2.4} $$

$$ \Delta w_{j,k}^{(m)} = \eta \, f'(s_k)\,(d_k - y_k)\, f(s_j) \tag{2.5} $$

where $\eta$ represents a damping or accelerating factor, $d_k$ comes from the input-output pairs of data available for training the network, $y_k$ is the output obtained from Equation 2.2 applied to the neurons of the output layer in iteration $m$, $s_k$ is the weighted sum entering output neuron $k$, and $f(s_j)$ is the output of hidden neuron $j$. Subsequently, as shown in the BPNN algorithm (see Figure 2.3), the weights between the input and hidden layers are changed (Eqs. 2.6 and 2.7). After presentation of the first input-output pair, the second pair is processed, and so on.

$$ w_{i,j}^{(m)} = w_{i,j}^{(m-1)} + \Delta w_{i,j}^{(m)} \tag{2.6} $$

$$ \Delta w_{i,j}^{(m)} = \eta \, f'(s_j) \left\{ f'(s_k)\,(d_k - y_k)\, w_{j,k} \right\} x_i \tag{2.7} $$
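The weight updates of Eqs. 2.4 to 2.7 can be illustrated with a minimal sketch of a network with one hidden layer and one output neuron. This is not the thesis's implementation: the network size, learning rate, number of epochs, random initialization, bias updates, and the toy logical-OR data are all assumptions, and the sketch uses the sigmoid identity f'(s) = f(s)(1 - f(s)).

```python
import math
import random


def sigmoid(s):
    """Sigmoidal activation f(s) = 1 / (1 + exp(-s))."""
    return 1.0 / (1.0 + math.exp(-s))


def forward(x, w_ih, b_h, w_ho, b_o):
    """Return the hidden outputs and the single network output y."""
    hidden = [sigmoid(sum(x[i] * w_ih[i][j] for i in range(len(x))) + b_h[j])
              for j in range(len(b_h))]
    y = sigmoid(sum(hidden[j] * w_ho[j] for j in range(len(hidden))) + b_o)
    return hidden, y


def total_error(samples, params):
    """Sum of the per-sample error 1/2 * (d - y)^2."""
    return sum(0.5 * (d - forward(x, *params)[1]) ** 2 for x, d in samples)


def train(samples, n_hidden=3, eta=0.5, epochs=2000, seed=0):
    """Per-sample back propagation; all hyperparameters are assumed values."""
    rng = random.Random(seed)
    n_in = len(samples[0][0])
    w_ih = [[rng.uniform(-1.0, 1.0) for _ in range(n_hidden)]
            for _ in range(n_in)]
    b_h = [0.0] * n_hidden
    w_ho = [rng.uniform(-1.0, 1.0) for _ in range(n_hidden)]
    b_o = 0.0
    initial_error = total_error(samples, (w_ih, b_h, w_ho, b_o))
    for _ in range(epochs):
        for x, d in samples:
            hidden, y = forward(x, w_ih, b_h, w_ho, b_o)
            # Output term f'(s_k) * (d_k - y_k).
            delta_k = y * (1.0 - y) * (d - y)
            # Hidden terms, computed from the output weights before they
            # change (a common convention, assumed here).
            delta_j = [hidden[j] * (1.0 - hidden[j]) * delta_k * w_ho[j]
                       for j in range(n_hidden)]
            # Hidden-to-output weights are adjusted first.
            for j in range(n_hidden):
                w_ho[j] += eta * delta_k * hidden[j]
            b_o += eta * delta_k  # bias update (assumed)
            # Then the input-to-hidden weights.
            for j in range(n_hidden):
                for i in range(n_in):
                    w_ih[i][j] += eta * delta_j[j] * x[i]
                b_h[j] += eta * delta_j[j]  # bias update (assumed)
    params = (w_ih, b_h, w_ho, b_o)
    return params, initial_error, total_error(samples, params)


# Toy data (logical OR), standing in for normalized process inputs.
samples = [([0.0, 0.0], 0.0), ([0.0, 1.0], 1.0),
           ([1.0, 0.0], 1.0), ([1.0, 1.0], 1.0)]
params, error_before, error_after = train(samples)
low = forward([0.0, 0.0], *params)[1]
high = forward([1.0, 1.0], *params)[1]
```

Each input-output pair triggers one update of both weight layers, matching the pair-by-pair presentation described above; batch updates would be an alternative design.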


CHAPTER 3 REACTIVE ION ETCH

Plasma etching is one of the crucially important processes in semiconductor fabrication. Plasma etching first appeared in the late 1960s to interconnect the current and future generations of integrated circuits [34]. In this chapter, a concise description of the etch process in general and of the reactive ion etch (RIE) process is provided in four sections: the first section describes the main etching types and characteristics; the second section clarifies the reactive ion etch (RIE) process; the 2300 Exelan Flex is explained in the third section; and some relations between RIE factors and wafer defects are presented in the last section.

To fully understand the etch process, it is necessary to be familiar with etch terms such as etch rate, etch profile, etch bias, selectivity, uniformity, residues, and so on. These terms and others are explained in Appendix I.

3.1 Etch Process Overview

The etching process is performed immediately after photolithography to remove undesired material from the wafer surface by either chemical or physical mechanisms. Usually a photoresist mask or another masking material is placed on top of the surface to protect specific regions of the wafer while permitting selective etching through the openings in the photoresist layer [1, 7, 35].

Two basic types of etch process are used in semiconductor manufacturing: wet etch, which uses liquid chemicals, and dry etch, which uses gaseous species. In wet etch, liquid chemicals such as acids, bases, and solvents react chemically with the undesired surface material, creating byproducts that either dissolve or vaporize away. Table 3.1 compares the characteristics of the wet and dry etch processes.

Table 3.1: Comparison between wet and dry etch characteristics

Dry etch, on the other hand, is the primary etching method in advanced wafer fabrication. The goal of the dry etching process is to form significant features such as gates, interconnect lines, and contact holes. Contact holes are filled with metal to connect the source and drain and to connect different metal layers. These features are formed on the chip by reproducing the image of a mask on the wafer surface with a high degree of integrity.

In general, dry etching mechanisms are divided into four basic phenomenological categories, as shown in Figure 3.1: sputtering, chemical etching, ion-enhanced energetic etching (RIE), and ion-enhanced inhibitor etching [1, 7, 35].

Figure 3.1: The four phenomenological etching mechanisms: (Ⅰ) sputtering, (Ⅱ) chemical etching, (Ⅲ) ion-enhanced energetic etching, (Ⅳ) ion-enhanced inhibitor etching.

The four phenomenological dry etching mechanisms are:

(Ⅰ) Sputtering: nonselective removal of surface atoms, in which ions of a non-reactive feedstock gas are driven by the plasma to strike the target surface.

(Ⅱ) Chemical etching: isotropic and selective removal of surface atoms, in which the plasma produces gaseous etchant atoms or molecules (free radicals) that react chemically with the surface layer and form gaseous, volatile etch products.

(Ⅲ) Ion-enhanced energetic etching (RIE): highly anisotropic etching of surface layers by the combined action of gaseous etchant and energetic ions.

(Ⅳ) Ion-enhanced inhibitor etching: the plasma supplies both chemical etchant and inhibitor precursor molecules. The inhibitor molecules adsorb or deposit on the substrate to form a protective layer or polymer film.

3.2 Reactive Ion Etch

Reactive ion etch is a suitable technique for removing material from the wafer surface and is widely used in VLSI manufacturing because the resist mask pattern can be transferred accurately to the film. Furthermore, RIE allows control of the etching profile and of the selectivity to the underlying layer.

A typical RIE system consists of a cylindrical vacuum chamber with a wafer platter situated in the bottom portion of the chamber (Figure 3.2). The wafer platter is electrically isolated from the rest of the chamber, which is usually grounded. Gas is introduced through small inlets in the top of the chamber and is evacuated to the vacuum pump system through the bottom of the chamber. The types and amounts of gas used are determined by the etch process.

The RIE procedure starts by evacuating the chamber until the pressure reaches the required level. The etchant gas is then introduced into the vacuum chamber while RF power is applied to increase the electron energy. Some of the etchant molecules dissociate on collision with electrons, as in Eq. (3.1), generating free radicals. These free radicals diffuse across the boundary layer to reach the wafer surface, where they are adsorbed (3.2). With the help of ion bombardment (3.4), the free radicals react very quickly with the surface atoms or molecules and form gaseous byproducts (3.3). The volatile byproducts desorb from the surface (3.5), diffuse across the boundary layer, enter the convection flow, and are pumped out of the chamber. RIE therefore combines two kinds of etch mechanism:

I- Chemical etching caused by suitable reactive chemicals

II- Physical etching caused by ion bombardment

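The following equations and Figure 3.3 clarify the RIE mechanism.

Etchant formation

$$e^{-} + \mathrm{F_2} \rightarrow 2\mathrm{F} + e^{-} \qquad (3.1)$$

Adsorption on substrate

$$n\mathrm{F} + \mathrm{Si_{surf}} \rightarrow \mathrm{Si}{-}n\mathrm{F} \qquad (3.2)$$

Chemical reaction

$$\mathrm{Si}{-}n\mathrm{F} \rightarrow \mathrm{SiF}_{x(\mathrm{ads})} \qquad (3.3)$$

Ion-assisted reaction

$$\mathrm{Si}{-}n\mathrm{F} \xrightarrow{\text{ions}} \mathrm{SiF}_{x(\mathrm{ads})} \qquad (3.4)$$

Product desorption

$$\mathrm{SiF}_{x(\mathrm{ads})} \rightarrow \mathrm{SiF}_{x(\mathrm{gas})} \qquad (3.5)$$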

Figure 3.3: RIE Mechanism

3.2.1 Plasma in RIE

A plasma can be described generally as an ionized gas composed of ions, free electrons, and a variety of neutral species. It contains approximately equal concentrations of positively charged particles (positive ions) and negatively charged particles (electrons and negative ions). While this description encompasses a wide range of plasma types and conditions, semiconductor processing typically uses a specific class called a "low-pressure glow discharge". This type of plasma is weakly ionized (most of the gas molecules are neutral), operates at low pressure (1 mTorr to 1 Torr), and is in nonequilibrium: the electrons carry most of the energy while the ions remain near room temperature. The advantage of this kind of plasma is that surface chemical reactions can take place under nonequilibrium conditions and at low temperature. The plasma effectively controls the main etching characteristics, such as etch rate, anisotropy, selectivity, and uniformity.

The plasma process is affected by both chemical and physical features. Several of these features are described in this section, beginning with the thermal velocities of the free electrons and ions inside the plasma, which are given by the following equations:

$$v_e = \left( \frac{e T_e}{m} \right)^{1/2} \qquad (3.6)$$

$$v_i = \left( \frac{e T_i}{M} \right)^{1/2} \qquad (3.7)$$

where $e$ is the electron charge (also taken as the charge of the ionized particle), $T_e$ and $T_i$ are the electron and ion temperatures, and $m$ and $M$ are the masses of the electron and ion, respectively. For typical semiconductor processing plasmas, $T_e$ is between 1 and 10 V (here temperature and voltage are treated as equivalent through the relation $V = k_B T / q$, so that 300 K ≡ 0.026 V), while $T_i$ is near room temperature. The ion mass is also much larger than the electron mass. For these two reasons $v_e \gg v_i$, and during the time immediately after the plasma is ignited, the electrons are lost to the chamber walls much more rapidly than the ions. This leaves the bulk plasma with a net positive charge, which sets up an electric field from the plasma to the walls. The high-field region near the walls is called the sheath.
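To give a feel for the size of the gap between the two velocities, Equations 3.6 and 3.7 can be evaluated numerically. The sketch below is illustrative only: the electron temperature of 3 V (chosen inside the 1 to 10 V range quoted above) and the ion mass of 19 amu (roughly a fluorine ion) are assumptions, not values from the thesis, and the function names are hypothetical.

```python
import math

E_CHARGE = 1.602176634e-19        # elementary charge, C
K_BOLTZMANN = 1.380649e-23        # Boltzmann constant, J/K
ELECTRON_MASS = 9.1093837015e-31  # kg
AMU = 1.66053906660e-27           # atomic mass unit, kg

def kelvin_to_volts(temperature_kelvin):
    """Convert a temperature in kelvin to its voltage equivalent, V = k_B * T / q."""
    return K_BOLTZMANN * temperature_kelvin / E_CHARGE

def thermal_velocity(temperature_volts, mass_kg):
    """Thermal velocity (e * T / m) ** 0.5 with T expressed in volts (Eqs. 3.6, 3.7)."""
    return math.sqrt(E_CHARGE * temperature_volts / mass_kg)

t_e = 3.0                      # assumed electron temperature, V
t_i = kelvin_to_volts(300.0)   # ions near room temperature
ion_mass = 19.0 * AMU          # assumed ion mass (about one fluorine atom)

v_e = thermal_velocity(t_e, ELECTRON_MASS)
v_i = thermal_velocity(t_i, ion_mass)
print(f"T_i = {t_i:.4f} V, v_e = {v_e:.3e} m/s, v_i = {v_i:.3e} m/s, ratio = {v_e / v_i:.0f}")
```

Under these assumptions the electron velocity exceeds the ion velocity by roughly three orders of magnitude, which is why the electrons reach the walls first.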

The sheath thickness is another plasma feature. It is on the order of 1 mm, and the sheath is the only region of separated charge; all other areas can be considered quasi-neutral over length scales larger than about 15 µm. The potential across the sheath relative to a floating surface (the floating potential) is typically 10-30 V and accelerates ions that enter the sheath toward the floating surface. Since the ion temperature in the bulk plasma is low (≈ 0.026 V), the energy of the ions bombarding the chamber walls is on the order of several $T_e$. If the wafers to be processed are placed on one of the walls, these ions strike the wafers at high velocity and at nearly normal incidence. This ion bombardment is the basis of anisotropic etching.

3.2.2 Cooling System in RIE

In the reactive ion etch process, a large amount of heat is generated by electron impact and heavy ion bombardment, and the resulting high temperature can cause photoresist reticulation. Temperature control is therefore needed to protect the wafer and its photoresist. Because heat transfer is poor at low pressure, a cooling system is required, and helium backside cooling is commonly used, as shown in Figure 3.4. The wafer is held by either a clamp ring or an electrostatic chuck (E-chuck), and helium is pressurized at the back of the wafer so that heat is transferred from the wafer to the water-cooled chuck.

3.3 Case Study: Oxide RIE in PSC

This thesis studies one of the reactive ion etching processes at Powerchip Semiconductor Corp. (PSC), where the 2300 Exelan Flex (Figure 3.5) is used to etch silicon dioxide films. This equipment is a complete package with an integral power supply, RIE chamber, and internal cooling system.

This advanced reactive ion etching equipment is made up of many different sections (Figure 3.6), among them the maintenance and controller sections. The main electronics monitor the status and manipulate the values of controlled variables such as pressure, temperature, gas mass flow, voltage, current, and RF power.

Figure 3.6: Sketch of the 2300 Exelan Flex equipment, including some controllers

The other main section of the 2300 Exelan Flex is the RIE chamber (Figure 3.7). The main etching chamber is configured inside the vacuum chamber for optimal efficiency. The two chambers are separated by quartz confinement rings, which provide pressure control for the main chamber. Like a typical RIE system (Section 3.2), this chamber consists of two parallel plates, an ESC, an RF power supply, and a pumping system.

Figure 3.7: Sketch of the main etching chamber inside the RIE chamber

The purpose of oxide reactive ion etch is to create trenches, which are filled with metal to connect wafer layers. To achieve this, three main oxide etching steps are applied within the oxide etch process, which contains eleven steps in total. Before the main etching process, a good plasma environment must be prepared by stabilization steps (stable step, plasma strike step, and plasma ramp-up step).

The first step in the RIE recipe starts by automatically opening the high-vacuum pump and then closing the rough vacuum pump, so that ultra-high vacuum (1 mTorr) is achieved inside the RIE chamber. The open area between the vacuum chamber and the main etch chamber is then minimized by the quartz confinement rings; at this instant the fluorine gas

Figure 3.8: Process recipe for oxide film etching in RIE

The objective of the second step is to strike the plasma. The plasma is initiated by applying RF power to the wafer platter. This power is typically a few hundred watts at 27 MHz and 2 MHz; these frequencies create an oscillating electric field that ionizes a few gas molecules by stripping them of electrons, creating a plasma as shown in Equation 3.10. In step three the RF power is increased to build up the plasma until it reaches the required density.

$$e^{-} + \mathrm{CF}_n \rightarrow \mathrm{CF}_{n-1} + \mathrm{F} + e^{-} \qquad (3.10)$$

[Figure 3.8 step labels: Step 1: Stable; Step 2: Plasma Strike; Step 3: Plasma Ramp up; Step 4: Oxide Main Etching 1; Step 6: Plasma Ramp down; Step 7: Stable; Step 8: Plasma Strike; Step 9: Plasma Ramp up]

After the plasma has been built up, the main etching process proceeds through physical and chemical mechanisms. In silicon oxide etching, physical mechanisms contribute more than chemical ones. Free fluorine radicals are the main etchant for this process. When etching oxide, the oxygen byproduct can react with C to free more fluorine (Equation 3.10). The reactions between the etchant atoms and the silicon oxide wafer surface are shown in the following equations:

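Adsorption on substrate

$$n\mathrm{F} + \mathrm{SiO_2} \rightarrow \mathrm{SiF}_n + \mathrm{O_2} \qquad (3.11)$$

Chemical reaction

$$\mathrm{Si}{-}n\mathrm{F} \rightarrow \mathrm{SiF}_{4(\mathrm{ads})}, \quad n < 4 \qquad (3.12)$$

Ion bombardment

$$\mathrm{Si}{-}n\mathrm{F} \xrightarrow{\text{ions}} \mathrm{SiF}_{x(\mathrm{ads})} \qquad (3.13)$$

Product desorption

$$\mathrm{SiF}_{x(\mathrm{ads})} \rightarrow \mathrm{SiF}_{x(\mathrm{gas})} \qquad (3.14)$$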

As shown in Figure 3.8, there are three main etching steps. The first oxide etch step outlines the shape of the wafer surface from the rounded mask. The second main etch step performs the pattern etch. The third main oxide etch reaches the greatest depth to prepare the contact areas between two layers, and therefore needs a higher plasma density than the previous etching steps. For this reason the stabilization steps are repeated with the same procedure but with different RF power and gas flow values to prepare a new plasma.

3.4 RIE Factors

For the reactive ion etching system in this case study, about forty different signals were collected from the 2300 Exelan Flex equipment at PSC. Twenty-two of these signals, those that directly affect the wafer status, were carefully chosen. Figure 3.9 shows the twenty-one main factors, excluding the time factor, and Table 3.2 gives further details of the RIE factors.

Figure 3.9: Main factors of oxide RIE.

3.4.1 RF Power

The energy of the ions bombarding the substrate can be increased by employing an additional capacitive RF source, so the etch rate increases linearly with the RF power. RF power is the most important knob for controlling the etch rate: when the etch rate is out of specification, the first thing engineers do is check the RF system. Four signals related to RF power are recorded on the 2300 Exelan Flex (Table 3.2): the 2 MHz power supply, the 27 MHz power supply, the 2 MHz reflected power, and the 27 MHz reflected power. In

Figure 3.10: Etch rate and DC bias as a function of the RF power.

3.4.2 Pressure

Gas pressure in RIE is typically maintained between a few millitorr and a few hundred millitorr by adjusting the gas flow rates or an exhaust orifice. Normally, as the pressure is decreased below about 100 mTorr, the potential across the discharge increases. At very low pressure, physical etching mechanisms tend to dominate (Figure 3.11) because of high ion energy, low reactant density, and long mean free paths.

According to the kinetic theory of gases, the mean free path of a gas molecule at constant temperature is inversely proportional to the pressure. So when the pressure decreases, the mean free path of the species increases, and the energetic particles in the plasma can more easily transfer their kinetic energy to the atoms at the surface of the silicon oxide film.
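The inverse relation between pressure and mean free path can be checked with the standard kinetic-theory expression $\lambda = k_B T / (\sqrt{2}\, \pi d^2 p)$. The sketch below is illustrative: the molecular diameter `d` of 3.7e-10 m and the gas temperature of 300 K are assumed values, not taken from the thesis, and `mean_free_path` is a hypothetical helper.

```python
import math

K_BOLTZMANN = 1.380649e-23   # Boltzmann constant, J/K
PA_PER_MTORR = 0.133322      # 1 mTorr expressed in pascal

def mean_free_path(pressure_mtorr, temperature_k=300.0, diameter_m=3.7e-10):
    """Kinetic-theory mean free path in metres for a hard-sphere gas."""
    pressure_pa = pressure_mtorr * PA_PER_MTORR
    return K_BOLTZMANN * temperature_k / (
        math.sqrt(2.0) * math.pi * diameter_m ** 2 * pressure_pa
    )

for p in (1.0, 10.0, 100.0):  # chamber pressures in mTorr
    print(f"{p:6.1f} mTorr -> mean free path {mean_free_path(p) * 1e3:.2f} mm")
```

Lowering the pressure by a factor of ten lengthens the mean free path by the same factor, which is the trend the paragraph above relies on.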


CHAPTER 4 PRINCIPAL COMPONENT ANALYSIS

Principal component analysis (PCA) is an important technique in multivariate statistics. It was first suggested by Pearson in 1901 [36] and formally developed by Hotelling [37]. The main idea of PCA is to represent a number of correlated variables by a smaller number of uncorrelated variables called principal components (PCs). The first PC accounts for as much of the variability in the data as possible; the second PC is the linear combination with the second largest variance that is orthogonal to the first PC, and so on. There are as many PCs as original variables. For many datasets the first several PCs explain most of the variance, so the rest can be disregarded with minimal loss of information. The objectives of using PCA are to reduce the dimensionality of a data set and to identify new underlying variables that are orthogonal.

To enhance the performance of the prediction model in this study, PCA is used to represent the RIE factors, since simple neural networks with few nodes and connections tend to generalize better. In this chapter, PCA extracts three principal components from all twenty-two RIE factors. Table 4.1 lists the RIE factors and their abbreviations.

Table 4.1: RIE factors and their abbreviations.

Symbol     RIE factor             Symbol     RIE factor
$x_1$      Bias Voltage           $x_{12}$   Gas 4
$x_2$      ESC Clamp Voltage      $x_{13}$   Gas 7
$x_3$      ESC Current1           $x_{14}$   He Flow Inner
$x_4$      ESC Current2           $x_{15}$   He Flow Outer
$x_5$      ESC Temperature        $x_{16}$   He Pressure Inner
$x_6$      Foreline Pressure      $x_{17}$   He Pressure Outer
$x_7$      Forward Power 27MHz    $x_{18}$   Pressure
$x_8$      Forward Power 2MHz     $x_{19}$   Process Time
$x_9$      Gas 1                  $x_{20}$   Reflect Power 27MHz
$x_{10}$   Gas 10                 $x_{21}$   Reflect Power 2MHz
$x_{11}$   Gas 11                 $x_{22}$   Top Plate Temperature

It is important to treat each step separately in PCA because each etching step has different inherent physical and chemical characteristics. Considering the overall process characteristics and the objective of model simplicity, it was decided that one PCA for each of the eleven steps would likely yield a better solution than a single PCA for the entire process. In this thesis, PCA was applied to the 90 training wafers. The principal components are found by computing the sample covariance matrix and selecting its eigenvectors (loading vectors) for the $k$ largest eigenvalues, as shown in Figure 4.1.

The covariance matrix Cov(X) can be obtained by:

$$\mathrm{Cov}(X) = \frac{(X - \bar{X})^{T} (X - \bar{X})}{n - 1} \qquad (4.1)$$

where $X = [X_1 \; X_2 \; \cdots \; X_{22}]$ is the data matrix with $n$ samples (rows) and 22 variables (columns), each column representing one of the RIE factors; $X_i$ is column vector $i$ of the data matrix; and $(X - \bar{X})$ denotes subtracting the mean value of each column from the corresponding column.

To find the eigenvalues and eigenvectors of the covariance matrix, Cov(X) is factorized by singular value decomposition (SVD), as shown in the following equation:

$$\mathrm{Cov}(X) = V \Lambda V^{T} \qquad (4.2)$$

where $V = [v_1 \; v_2 \; \cdots \; v_{22}]$ is a 22 by 22 unitary matrix of the corresponding eigenvectors, and

$$\Lambda = \begin{bmatrix} \lambda_1 & 0 & \cdots & 0 \\ 0 & \lambda_2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \lambda_{22} \end{bmatrix}$$

is the diagonal matrix of eigenvalues, obtained separately for each of the eleven steps. Figure 4.2 shows the decrease in the eigenvalues $\lambda$ for each etching step.

The variance of the $i$th PC is equal to the $i$th largest eigenvalue of the covariance matrix [38]. Because of this property, the percentage $\phi$ (Equation 4.3) is used as a guide in choosing an appropriate number of PCs. The goal is to choose $k$ as small as possible while still explaining a reasonably high percentage of the variance. The $\phi$ values in Table 4.3 give the cumulative proportion of the variance explained by the first $k$ PCs. Since $\phi$ is larger than 88% when $k$ equals three, three principal components are selected to characterise the twenty-two RIE factors, using Equation 4.4:

$$\phi = \frac{\sum_{i=1}^{k} \lambda_i}{\sum_{i=1}^{22} \lambda_i} \qquad (4.3)$$

$$PC = XV \qquad (4.4)$$

The score equations ($T = XV$) express the three principal components in terms of the RIE factors. The score equations for each etch step are given in Appendix II.
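The procedure of Equations 4.1 to 4.4 can be sketched with NumPy as follows. This is an illustration under stated assumptions, not the thesis code: the data are synthetic (90 rows by 22 columns, generated from three hidden factors plus noise), the scores are computed from the mean-centred data, `np.linalg.eigh` is used for the symmetric covariance matrix in place of an explicit SVD, and `pca_scores` is a hypothetical name.

```python
import numpy as np

def pca_scores(X, k=3):
    """Return the first k PC scores, the explained fraction phi, and all eigenvalues."""
    Xc = X - X.mean(axis=0)                   # subtract each column mean
    cov = Xc.T @ Xc / (X.shape[0] - 1)        # Eq. 4.1
    eigvals, eigvecs = np.linalg.eigh(cov)    # Cov = V diag(lambda) V^T, Eq. 4.2
    order = np.argsort(eigvals)[::-1]         # sort eigenvalues largest first
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    phi = eigvals[:k].sum() / eigvals.sum()   # Eq. 4.3
    scores = Xc @ eigvecs[:, :k]              # Eq. 4.4 restricted to k loading vectors
    return scores, phi, eigvals

# Synthetic stand-in for one etch step: 90 wafers, 22 correlated factors.
rng = np.random.default_rng(0)
latent = rng.normal(size=(90, 3))
mixing = rng.normal(size=(3, 22))
X = latent @ mixing + 0.1 * rng.normal(size=(90, 22))

scores, phi, eigvals = pca_scores(X, k=3)
print(scores.shape, round(float(phi), 3))
```

In the thesis setting one such decomposition would be computed per etch step, and the resulting three score columns would replace the twenty-two raw factors as network inputs.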


CHAPTER 5 PREDICTION MODELS FOR RIE

The computing, telecommunications, aerospace, automotive, and consumer electronics industries all rely heavily on integrated circuits (ICs). Next-generation IC manufacturing equipment will require dramatic improvements in cost, quality, throughput, and flexibility. Reducing manufacturing cost involves increasing chip yield, reducing cycle time, maintaining consistent product quality, improving equipment reliability, and maintaining stringent process control. Since IC fabrication consists of hundreds of steps, maintaining product quality requires the control of thousands of variables. Process steps are performed in sequence, and yield loss may occur at every step. Analyzing wafer defects is the usual method for evaluating semiconductor technologies: wafer defects carry a great deal of information about wafer status, which can be analyzed to characterize the quality of processes and products. If a prediction model predicts the wafer status accurately, repeated etching failures can be prevented, process yield can be greatly improved, inspection cost can be reduced, and profit can be increased.

The experimental process of this study is depicted in Figure 5.1. Four models are included: offline back-propagation neural network (BPNN), offline principal component analysis BPNN (PCABPNN), online BPNN, and online PCABPNN. These models have the potential to reduce the overall cost of ownership of semiconductor equipment by increasing the wafer yield and the throughput of product wafers. They do not depend on monitor wafers or expensive metrology; rather, they enable inexpensive real-time wafer-to-wafer control applications in RIE. This chapter discusses how well the four prediction models predict the wafer status.

5.1 Data Collection from RIE.

The most important step in semiconductor process modelling is the collection of data. It is essential to gather a sufficient sample of representative data; otherwise it is impossible to train a neural network or any other type of model. In this study, the parameter data of the 2300 Exelan Flex machine were collected by engineers at the Powerchip Semiconductor Corp. (PSC) factory on the basis of their experience. These parameters include chamber temperature and pressure, forward and reflected RF power, DC bias, and gas flow rates (Table 3.2).

Figure 5.2: Percentage of the training and testing wafers in this case study (good wafers trained: 76, 63%; good wafers tested: 25, 21%; defective wafers trained: 14, 12%; defective wafers tested: 5, 4%).

Figure 5.2 presents the training and testing percentages of the one hundred twenty wafers collected from the 2300 Exelan Flex machine. The ratio of training wafers to testing wafers is three to one. Of the ninety training wafers (75% of all wafers), fourteen (12% of all wafers) are defective wafers with unopened etch, and of the thirty testing wafers (25% of all wafers), five (4% of all wafers) are defective wafers with unopened etch.

5.2 Data Preparation.

This section explains the details of the data preparation performed in this study; data preparation techniques are used to obtain good prediction results. Figure 5.3 shows how the number of data points (process time) varies for each step. For example, in step five the number of data points ranges from 10 to 140, with an average of 60. Because of this variation, it is hard to decide on the BPNN inputs. Three different data preparation techniques are therefore proposed: raw data, sampling data, and statistical summary data.

• Raw data preparation: the smallest number of data points in the collected data is one hundred eighty-four, so the first one hundred eighty data points are used as raw data inputs for the offline prediction models, and twenty data points are used as raw data inputs for the online prediction models.

• Sampling: non-symmetric sampling is the second proposed preparation technique. It covers all etching steps (Figure 3.8) while focusing on the three main etching steps (steps 4, 5, and 10). Table 5.1 shows the number of samples captured in each step: two samples are captured from the stabilization steps (steps 1, 2, 3, 7, 8, and 9) and two from the plasma ramp-down steps (steps 6 and 11), while more than half of the samples are captured from the main etching steps (steps 4, 5, and 10). As a result, thirty-four captured samples cover all etching steps. These thirty-four samples are used as inputs for the offline prediction models.

Table 5.1: Number of suggested sampling for each step in the sampling technique

Figure 5.4: The position of captured samples.

Figure 5.4 illustrates the position of captured samples, where the first data point of each step is captured, and the suggested sampling rate for the main etching steps is five

the online prediction models. These samples include the first data point of each stabilization step, the third point after the beginning of each of the first three stabilization steps, and the end point of the third step (which is the first data point of the fourth step).

• Statistical summary preparation: the last proposed preparation technique, which is based on the mean and standard deviation values. The following equations show how the two statistical summary values are calculated.

$$\bar{X} = \frac{1}{N} \sum_{i=1}^{N} x_i \qquad (5.1)$$

$$\sigma = \sqrt{\frac{1}{N-1} \sum_{i=1}^{N} (x_i - \bar{X})^2} \qquad (5.2)$$

Figure 5.5: Statistical summary data preparation technique

Because many samples have no data for the sixth step, this step is excluded. The remaining ten steps are therefore statistically summarized and used in the offline prediction models.
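The statistical summary preparation can be sketched as below. The data layout is an assumption made for illustration: each wafer is a dictionary mapping a step number to a list of rows, with one row per data point and one column per RIE factor. The function name `summarize_wafer`, the toy values, and the choice to emit 0.0 for a step with a single data point are all hypothetical. The standard deviation uses the $N - 1$ form of Equation 5.2.

```python
from statistics import mean, stdev

def summarize_wafer(steps, skip_steps=(6,)):
    """Flatten one wafer into [mean, std, mean, std, ...] per step and factor."""
    features = []
    for step in sorted(steps):
        if step in skip_steps:           # step 6 is dropped, as in the text
            continue
        for column in zip(*steps[step]):  # iterate over factors (columns)
            features.append(mean(column))                                 # Eq. 5.1
            features.append(stdev(column) if len(column) > 1 else 0.0)    # Eq. 5.2
    return features

# Toy wafer with two factors and three steps (values are made up).
wafer = {
    1: [[1.0, 10.0], [3.0, 10.0]],
    2: [[2.0, 4.0], [2.0, 8.0], [2.0, 6.0]],
    6: [[5.0, 5.0]],
}
features = summarize_wafer(wafer)
print(features)
```

Because every step contributes exactly two numbers per factor, the length of the feature vector no longer depends on how many data points each step happened to record, which addresses the variation shown in Figure 5.3.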

5.3 Architecture of prediction models

As stated before, many researchers have adopted BPNN to solve problems of categorization, prediction, and examination of manufacturing processes because of its advantages: it is easy and fast to comprehend, highly accurate, and fast in recall. This study combines the back-propagation neural network (BPNN) and principal component analysis (PCA) to construct the four prediction models shown in Table 5.2. The offline prediction models use all etching process steps to predict the wafer status after the end of the etching process. The online prediction models, on the other hand, use only the first three stabilization steps. The online prediction models can reduce the number of defective wafers more than the offline models, because abnormalities of the etching process can be predicted as early as possible, before the end of the first main etching step (step 4).

Table 5.2: The methods used in each prediction model.

Model             PCA   BPNN   Notes
Offline BPNN            ※      Concerns all etching steps
Online BPNN             ※      Concerns the first three steps
Offline PCABPNN   ※     ※      Concerns all etching steps
Online PCABPNN    ※     ※      Concerns the first three steps

5.3.1 BPNN

The BPNN in this study consists of three layers of neurons: the input layer, the hidden layer, and the output layer. The input layer receives external information such as RIE processing factors or principal components. The output layer produces the predictions, which are expressed as binary values representing the wafer status. Since the network output lies between zero and one, outputs smaller than the Min value are set to zero and outputs greater than the Max value are set to one. An output of one means the wafer status is good; an output of zero means it is bad (defective wafer). When the network output lies between the Min and Max values, the network fails to predict the wafer status. This value range must be defined in order to determine whether the output value is close to the target value.

The BPNN also incorporates a hidden layer of neurons. These neurons do not interact with the outside world but assist in performing nonlinear feature extraction on the data passed between the input and output layers. The number of hidden layers was set to one in this application. With the BPNN structure described, the remaining matter to settle is training.

5.3.2 Training

As previously mentioned, the overall objective in training is to minimize the discrepancy between the real data and the output of the network. During training, the network learns to associate outputs with input patterns; this principle is referred to as supervised learning. Training continues until the maximum number of epochs is reached or the MSE of the network falls below $1 \times 10^{-6}$. The maximum number of epochs is set to 10000.
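The stopping rule can be written as a small loop that is independent of the network itself. In the sketch below, `train_until` and `make_toy_epoch` are hypothetical names; the toy epoch function simply halves a stored MSE on each call and stands in for one real pass of BPNN training over the training wafers.

```python
def train_until(run_epoch, mse_goal=1e-6, max_epochs=10000):
    """Call run_epoch() until its MSE drops below mse_goal or max_epochs is reached."""
    mse = float("inf")
    for epoch in range(1, max_epochs + 1):
        mse = run_epoch()
        if mse < mse_goal:
            return epoch, mse, "goal reached"
    return max_epochs, mse, "max epochs reached"

def make_toy_epoch(start=1.0, factor=0.5):
    """Stand-in for one training epoch: the MSE shrinks by a fixed factor per call."""
    state = {"mse": start}

    def run_epoch():
        state["mse"] *= factor
        return state["mse"]

    return run_epoch

epochs, final_mse, reason = train_until(make_toy_epoch())
print(epochs, final_mse, reason)
```

Returning the reason alongside the epoch count makes it easy to tell a converged model from one that merely ran out of epochs.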

After training, the performance of the prediction models is assessed with two tests. The first test measures the prediction error on a new data set: the data of twenty-five good wafers and five defective wafers make up the testing data set. Two types of error can occur in the first test: a Type I prediction error occurs when a good wafer is predicted as a bad wafer, and a Type II prediction error occurs when a defective wafer is predicted as a good wafer.

The second test is based on the recognition rate and the rejection rate. The recognition rate is the percentage of test samples that are recognized correctly, with the output value lying outside the zone between the Min and Max values. The rejection rate is the percentage of input samples that cannot be assigned to any class because the output value lies within that zone. The Min and Max values are determined for every prediction model after testing the training wafers: the Min value is the highest output value for a defective wafer, and the Max value is the lowest output value for a good wafer.
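The two tests can be sketched together in plain Python. The sketch makes several assumptions that the text does not settle: outputs exactly equal to Min or Max are treated as classified rather than rejected, the Min value is assumed to be smaller than the Max value, and all network outputs shown are made-up numbers. The function names are hypothetical.

```python
def zone_limits(train_outputs, train_is_good):
    """Min = highest output of a defective training wafer, Max = lowest output of a good one."""
    zone_min = max(o for o, good in zip(train_outputs, train_is_good) if not good)
    zone_max = min(o for o, good in zip(train_outputs, train_is_good) if good)
    return zone_min, zone_max

def classify(output, zone_min, zone_max):
    """Map a network output to 'good', 'defective', or 'rejected' (inside the zone)."""
    if output >= zone_max:
        return "good"
    if output <= zone_min:
        return "defective"
    return "rejected"

def evaluate(outputs, is_good, zone_min, zone_max):
    """Count recognized, rejected, Type I, and Type II cases on a test set."""
    counts = {"recognized": 0, "rejected": 0, "type_1": 0, "type_2": 0}
    for output, good in zip(outputs, is_good):
        predicted = classify(output, zone_min, zone_max)
        if predicted == "rejected":
            counts["rejected"] += 1
        elif (predicted == "good") == good:
            counts["recognized"] += 1
        elif good:
            counts["type_1"] += 1   # good wafer predicted as defective
        else:
            counts["type_2"] += 1   # defective wafer predicted as good
    n = len(outputs)
    counts["recognition_rate"] = counts["recognized"] / n
    counts["rejection_rate"] = counts["rejected"] / n
    return counts

# Made-up outputs for four training wafers and five test wafers.
zone_min, zone_max = zone_limits([0.05, 0.1, 0.9, 0.95], [False, False, True, True])
result = evaluate([0.97, 0.5, 0.02, 0.95, 0.03],
                  [True, True, False, False, True], zone_min, zone_max)
print(zone_min, zone_max, result)
```

Keeping the rejected cases separate from the two error types mirrors the distinction the text draws between a wrong prediction and a failure to predict.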

