INTRODUCTION - Application of an Improved Regression Tree to Semiconductor Yield Improvement

1.1 Motivation and objectives

In the semiconductor industry, wafer yield is tied not only to the quality of a company's products but also to its operating costs and competitiveness. High quality often strengthens a company's competitive power.

Each company's management therefore treats yield as a key production and profit target.

Semiconductor manufacturing, however, is quite complex: a product frequently passes through hundreds of process stations before it is complete, and its yield rate can be measured only after the whole process is finished. When the yield rate varies, tracing the cause is therefore an enormous challenge for engineers.

If a problem occurs at some process station at time t, the yield rates before and after time t follow two different distributions (Figure 1). Engineers plot a trend chart of the product's yield rate in the order in which the product was processed at that station. If there was a problem at time t, the chart shows a shift in the mean at that point, so an engineer would conclude that the problem influencing the yield rate arose at time t and would investigate it.

The traditional statistical process control (SPC) method is not suitable for this problem, for two primary reasons. First, a product must pass through hundreds of process stations before its yield rate can be measured. Second, semiconductor production does not necessarily follow FIFO (first in, first out), so the first product to enter production is not necessarily the first to complete the whole manufacturing process and have its yield rate measured. Hence, we must find a new way to monitor the yield rate in the process.

In different fields, there are many methods for solving the mean-shift problem. CPD (statistical Change-Point Detection) is widely used. For example, CPD has been used in biological research on nerves [11], and it has also been applied to monitor yield rates in the semiconductor industry [15].

Moreover, we can use CART (Classification And Regression Trees) to detect a mean shift. However, it is difficult to select the cost-complexity parameter, which controls the complexity of the model. In statistics, cross-validation (Seymour Geisser, 1929 – 2004) is commonly used to determine the cost-complexity parameter, but it is computationally expensive and performs poorly with small samples.
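To make the idea concrete, the following is a minimal sketch of how a single least-squares split, the basic step of a regression tree, can locate a mean shift when the observations are ordered by processing time. It is not the method proposed in this thesis, and it does not cover cost-complexity pruning or cross-validation. The function names and the yield values are hypothetical and chosen only for illustration.

```python
from typing import List, Tuple


def sse(values: List[float]) -> float:
    """Sum of squared deviations from the mean (0.0 for an empty list)."""
    if not values:
        return 0.0
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)


def best_split(y: List[float]) -> Tuple[int, float]:
    """Return (k, cost) such that y[:k] and y[k:] have the smallest
    total within-segment sum of squares; k = 0 means no split helps."""
    best_k, best_cost = 0, sse(y)
    for k in range(1, len(y)):
        cost = sse(y[:k]) + sse(y[k:])
        if cost < best_cost:
            best_k, best_cost = k, cost
    return best_k, best_cost


# Hypothetical yield rates ordered by processing time at one station;
# the mean drops after the sixth observation.
yields = [0.91, 0.93, 0.92, 0.90, 0.92, 0.91, 0.80, 0.82, 0.81, 0.79]
k, cost = best_split(yields)
print(k)  # prints 6: index of the first observation after the estimated shift
```

A full regression tree would apply this split recursively to each segment, which is where the choice of model size becomes necessary.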

Because manufacturing processes in the semiconductor industry are quite complex, outliers caused by human error are sometimes unavoidable. Engineers want to understand the overall trend of a process with the effect of these outliers excluded. When outliers appear, the outcome usually changes tremendously, so a method for dealing with outliers is needed. Figure 2 and Figure 3 show how outliers affect the outcomes.
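As a small illustration of this sensitivity, the sketch below compares where a single least-squares split falls with and without one gross outlier. The data and the helper name `split_index` are hypothetical, and the least-squares criterion is assumed here only as a common default, not as the criterion used in this thesis.

```python
from typing import List


def split_index(y: List[float]) -> int:
    """Index k (1 <= k < len(y)) minimizing SSE(y[:k]) + SSE(y[k:])."""
    def sse(seg: List[float]) -> float:
        m = sum(seg) / len(seg)
        return sum((v - m) ** 2 for v in seg)
    return min(range(1, len(y)), key=lambda k: sse(y[:k]) + sse(y[k:]))


clean = [0.0] * 5 + [5.0] * 5   # true mean shift after the fifth point
dirty = list(clean)
dirty[2] = -100.0               # one hypothetical gross outlier

print(split_index(clean), split_index(dirty))  # prints: 5 3
```

With the outlier present, the split moves next to the outlier instead of staying at the true shift, which is the kind of distortion the figures depict.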

Finally, in the semiconductor field, components are often measured with different instruments, so a method must locate the mean shift effectively in different situations, and its result must not depend on the unit of measurement.

Figure 1 A problem occurs at a process station at time t.

Figure 2 The outliers affect the outcomes (1). Axes: time, Y-value.

Figure 3 The outliers affect the outcomes (2). Axes: time, Y-value.

1.2 The procedure of research

Figure 4 The procedure of research

1. Understand the problem

In the semiconductor field, the mean-shift problem is most often addressed with CPD, a method that Taylor proposed in 2000 to find a change-point, mainly by means of cumulative sums (CUSUM). However, its rate of correct detection in simulation did not satisfy engineers' expectations. We therefore need to develop a method that achieves a high rate of correct detection on the mean-shift problem and can be applied widely in different settings.

2. Understand the regression tree

A regression tree is a fast method that uses binary splits (dichotomy) to partition the data quickly, which makes the overall structure of the data clear. However, the literature offers no good way to choose the size of the model. Moreover, we find that the partition is easily wrong when outliers exist, so dealing with outliers is also an important step.

3. Improve the regression tree

We propose a method to improve the regression tree. The new method draws mainly on Bayesian ideas, and it uses the concept of influential points from regression analysis to assess the influence of outliers.

4. Check the idea by simulation

Using different models, we examine the simulated classification rate of the new method. The simulation results show that the rate is high, that it is not affected by unbalanced data or by different units of measurement, and that it is less affected by outliers.
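The CUSUM idea mentioned in step 1 can be sketched as follows. This is a minimal illustration that assumes the change-point is estimated where the absolute cumulative sum of deviations from the overall mean is largest. It omits any significance assessment, and the function name and data are hypothetical.

```python
from typing import List


def cusum_change_point(x: List[float]) -> int:
    """Return the 0-based index of the first observation after the
    estimated mean shift, taken where |CUSUM| reaches its peak."""
    mean = sum(x) / len(x)
    running, best_abs, best_i = 0.0, -1.0, 0
    for i, value in enumerate(x):
        running += value - mean      # S_i = S_{i-1} + (x_i - mean)
        if abs(running) > best_abs:
            best_abs, best_i = abs(running), i
    return best_i + 1


# Hypothetical yield rates ordered by processing time at one station.
yields = [0.91, 0.93, 0.92, 0.90, 0.92, 0.91, 0.80, 0.82, 0.81, 0.79]
print(cusum_change_point(yields))  # prints 6
```

The cumulative sum rises while observations lie above the overall mean and falls once they drop below it, so its peak marks the last observation before the shift.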

1.3 Organization

This thesis is organized as follows. Chapter 1 outlines the motivation, objectives, and procedure of this research. Chapter 2 reviews the literature, covering two methods for detecting the location of a mean shift. Chapter 3 proposes our new method, which improves on CART and handles outliers from the viewpoint of influential points. Chapter 4 verifies the method with simulation results, and Chapter 5 concludes the thesis.
