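Introduction - Using Statistical Methods to Automatically Identify Tool Differences Based on Engineering Tolerance, with Application to Semiconductor Process Improvement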

The importance of semiconductor technology in today’s world can hardly be exaggerated. Semiconductor devices are essential components of almost all electronic products. Without semiconductors, most electronic products and systems could not be made or operated, and their influence on human society is profound. The global semiconductor industry had around US$230 billion in revenue in 2005 and continues to create new opportunities, socio-economic advancement and human development for nations and societies around the world [1]. Since building a modern wafer fabrication facility costs around US$3 billion, ramping up the yield rapidly to reach volume production is an extremely important source of competitive advantage in the hyper-competitive world of semiconductor manufacturing. The sooner a potentially lucrative circuit yields, the sooner the manufacturer generates a revenue stream. Likewise, rapidly identifying the cause of yield loss can restore a revenue stream and prevent the destruction of the materials in process [2] [3].

In the following, semiconductor manufacturing is briefly introduced. It follows a very complex process flow that is quite different from those of traditional manufacturing industries. It takes about 30-60 days to turn bare silicon wafers into integrated circuits, such as microprocessors or memory chips. In general, 25 wafers are processed together in a group called a lot, and each wafer ranges from 3 to 12 inches in diameter. Each wafer can contain thousands of dies, depending on the size of the die being produced. During manufacturing, each lot passes through more than 150 process steps, and each process step involves several tools. After each process step, the metrology systems collect physical and electrical data, such as film thickness, film uniformity, critical dimension, overlay, defect particle count, voltage, current and wafer sort, etc. At the end of all the process steps, the Wafer Acceptance Test (WAT) with 100–500 electrical test items and the Wafer Sort Test (WST) with 50–100 test items are performed sequentially on each wafer. The objectives of WAT and WST are device characteristic analysis and die functionality sorting, respectively. Since these test data also characterize the quality of manufacturing and the performance of the products, how to use them to improve the process itself becomes an interesting issue. However, because these data are huge and involve many variables (about 100M–1G for each lot), it is very time-consuming for engineers to analyze them and find the sources of variation in the production processes. Among the various analyses, tool comparison is one key task for engineers in yield improvement, and therefore an effective and time-efficient method for comparing tools is critical for rapidly improving yields [4]. In the following, we briefly review the conventional approaches in this area, including multiple comparison methods and clustering methods.

For multiple comparisons, the analysis of variance (ANOVA) for normal data and the Kruskal-Wallis test [5] for non-normal data are the two most popular statistical methods for testing for significant differences between population means among groups. For our tool partition problem, in order to compare the performance of different tools (i.e., groups) at each process step, engineers apply these two tests to the distribution of the metrology data associated with each tool. By quickly reviewing the test result for each individual step, tool differences may be detected at certain process steps, and an alarm is triggered for further checking or investigation. In general, the main purpose of this kind of testing procedure is to find the sources of variation (i.e., which process steps) and to identify possibly abnormal tools. After finding significant differences among tools at certain steps, engineers partition all the relevant tools into several homogeneous groups and further identify the best or the problematic groups of tools in order to enhance product quality or to exclude the worst tools; see, for example, [6]-[8]. This partition problem is an important and practical issue for engineers, but it cannot be handled simultaneously by the ANOVA or the Kruskal-Wallis test, which only indicate whether a difference exists. Some multiple pairwise comparison procedures, such as the methods suggested by Fisher [9], Tukey [10], Keuls [11], Duncan [12], Scheffé [13], and Dunnett [14] [15], provide useful information about the ranking or ordering of the group means, but these methods cannot directly partition different tools (or treatments) into homogeneous and non-overlapping groups. For example, suppose there are three tools to be partitioned and their sample means satisfy $Y_1 \le Y_2 \le Y_3$. Suppose that a multiple comparison procedure finds that the differences $|Y_1 - Y_2|$ and $|Y_2 - Y_3|$ are not significant but the difference $|Y_1 - Y_3|$ is significant. It is not clear how to partition these three tools into homogeneous but non-overlapping groups, since both $(Y_1, Y_2)$ and $(Y_2, Y_3)$ are reasonable homogeneous groups.
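The ambiguity in the three-tool example can be reproduced with a minimal Python sketch. It assumes a single critical difference (here the hypothetical value 0.5) below which two means are declared not significantly different; the tool names, the means and the helper `homogeneous_sets` are illustrative and are not taken from any of the cited procedures.

```python
def homogeneous_sets(means, critical_diff):
    """Return the maximal runs of sorted tools whose extreme means
    differ by less than critical_diff (an assumed, fixed threshold)."""
    order = sorted(means, key=means.get)
    sets = []
    for start in range(len(order)):
        end = start
        # Extend the run while the range of means stays below the threshold.
        while (end + 1 < len(order)
               and means[order[end + 1]] - means[order[start]] < critical_diff):
            end += 1
        run = order[start:end + 1]
        # Keep the run only if it is not contained in the previous maximal run.
        if not sets or not set(run) <= set(sets[-1]):
            sets.append(run)
    return sets


# Hypothetical sample means for three tools.
means = {"tool_1": 10.0, "tool_2": 10.4, "tool_3": 10.8}
groups = homogeneous_sets(means, critical_diff=0.5)
overlap = set(groups[0]) & set(groups[1])
print(groups, overlap)
```

With these numbers the two homogeneous sets share `tool_2`, so the pairwise results alone do not determine a non-overlapping partition.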

Another popular approach to partitioning is cluster analysis. Scott and Knott [16] suggested a procedure that starts by dividing the k means into two groups and then performs a test to decide whether the partition is acceptable. This approach is equivalent to the hypothesis testing problem of $H_0: \mu_1 = \mu_2 = \cdots = \mu_k$ against the alternative $H_1$ that the means take one common value on $P_1$ and a different common value on $P_2$, where $P_1$ and $P_2$ are disjoint and nonempty sets with $P_1 \cup P_2 = \{1, 2, \ldots, k\}$. If the test is significant at some chosen level $\alpha$, a similar testing procedure is then applied to each individual $P_i$, $i = 1, 2$. The procedure continues sequentially until no test is rejected (i.e., no further partition is necessary). Worsley [17] proposed a nonparametric version of Scott and Knott’s method. Although this approach is intuitive and easy to implement, the Type I error of the entire test is difficult to control because of the sequential testing, in particular when the number of splits gets larger. Moreover, the final partition may not be unique, since it depends strongly on the initial partition.
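The first step of such a divisive procedure, choosing the two candidate groups, can be sketched as follows. This is only an illustration under stated assumptions: the candidate split is taken to be the cut of the sorted means that maximizes the between-group sum of squares, every tool mean is given equal weight (a balanced design), and the significance test that decides whether to accept the split is omitted. The function name `best_binary_split` and the data are hypothetical.

```python
def best_binary_split(means):
    """Search all cuts of the sorted means and return
    (between-group sum of squares, left group, right group)
    for the cut with the largest between-group sum of squares."""
    ordered = sorted(means)
    grand_mean = sum(ordered) / len(ordered)
    best = None
    for cut in range(1, len(ordered)):
        left, right = ordered[:cut], ordered[cut:]
        mean_left = sum(left) / len(left)
        mean_right = sum(right) / len(right)
        between_ss = (len(left) * (mean_left - grand_mean) ** 2
                      + len(right) * (mean_right - grand_mean) ** 2)
        if best is None or between_ss > best[0]:
            best = (between_ss, left, right)
    return best


# Hypothetical tool means with an obvious gap between 10.2 and 12.0.
tool_means = [12.0, 10.1, 10.0, 12.1, 10.2]
between_ss, group_1, group_2 = best_binary_split(tool_means)
print(group_1, group_2)
```

In a full procedure the same search would be repeated inside each accepted group, which is where the sequential tests, and the difficulty of controlling the overall Type I error, come from.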

To overcome the difficulty of controlling the Type I error (the probability of erroneous grouping) in the sequential testing procedure, Calinski and Corsten [18] proposed two clustering methods that partition the tools (or treatments) in a balanced design by embedding simultaneous testing procedures based on the F test and the Studentized range test, respectively. Although this approach solves the problem of controlling the Type I error, it still has several disadvantages: when the number of observations for each tool is large, it produces too many groups with small differences between them; and it does not consider unbalanced data, so it generally loses power when usage differs considerably among tools.

Jolliffe [19] proposed an alternative method for cluster analysis. This approach uses a particular dissimilarity measure defined by the P-value of the Studentized range test for the difference between two group means. A larger P-value indicates that two groups are more similar, and a smaller P-value indicates that they are more distinguishable. One critical issue for this hierarchical clustering approach is determining the number of clusters, which is usually done subjectively.
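How the subjective choice affects the result can be illustrated with a small sketch. It assumes that the pairwise P-values are already available (the values below are made up for illustration, not computed from a Studentized range test) and that clusters are formed by single linkage, i.e. two tools end up together whenever a chain of pairs with P-value at or above a cutoff connects them. The linkage rule, the cutoff and the function `cluster_by_pvalue` are assumptions of this sketch, not details of the method in [19].

```python
def cluster_by_pvalue(tools, pvalues, cutoff):
    """Single-linkage grouping: merge two tools when their pairwise
    P-value is at least `cutoff` (larger P-value = more similar)."""
    parent = {tool: tool for tool in tools}

    def find(tool):
        # Follow parent links up to the representative of the cluster.
        while parent[tool] != tool:
            tool = parent[tool]
        return tool

    for (a, b), p in pvalues.items():
        if p >= cutoff:
            parent[find(a)] = find(b)

    clusters = {}
    for tool in tools:
        clusters.setdefault(find(tool), []).append(tool)
    return sorted(clusters.values())


tools = ["A", "B", "C", "D"]
# Hypothetical pairwise P-values, for illustration only.
pvalues = {
    ("A", "B"): 0.80, ("A", "C"): 0.02, ("A", "D"): 0.01,
    ("B", "C"): 0.04, ("B", "D"): 0.03, ("C", "D"): 0.60,
}
strict = cluster_by_pvalue(tools, pvalues, cutoff=0.05)
loose = cluster_by_pvalue(tools, pvalues, cutoff=0.03)
print(strict, loose)
```

Moving the cutoff from 0.05 to 0.03 changes the answer from two clusters to one, which is the kind of sensitivity that makes the number of clusters a subjective decision.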

Data mining approaches [20] [21], such as classification and regression trees (CART) [22] and neural networks [23], have also been used for the partition problem. Recently, some commercial data analysis software packages used in engineering (for example, Yield Dynamics, BI IBM, Odyssey YMS, and dataPOWER) have adopted these approaches for yield enhancement. However, these approaches involve supervised algorithms that rely on more complex initial parameter setups, and the partition results are usually sensitive to these setups [22]; it is therefore somewhat difficult for engineers to use them in practice. Appendix A briefly introduces the CART algorithm, which is compared with the proposed method in the simulation study.

To sum up, each of the above-mentioned methods has its own advantages and disadvantages relative to the others under different circumstances, but none of them can incorporate experts’ specific opinions into the statistical analysis. For example, in our tool partition problem, the engineers’ tolerance controls, which quantify the tolerance (minimum value) of tool differences, should be used in some way in determining the grouping structure. Another challenge for the tool partition problem in manufacturing is to account for unequal tool usage. Unequal usage of different tools at each process step makes the number of processed lots vary from tool to tool, which leads to unbalanced data in the statistical tests. Montgomery [24] noted that statistical tests for multiple comparisons lose power with unbalanced data. The goal of this thesis is to develop a tool partition method that incorporates experts’ specific opinions on the tolerance of tool differences and handles unbalanced data at the same time.

We formulate the tool partition problem as a hierarchical model [25] in which the metrology measurements from each tool follow a normal distribution with a tool-specific mean. It is reasonable to expect that the tool-specific means of “similar tools” are related in some way, for instance by viewing them as realizations from the same distribution. Under this setup, the engineers’ tolerance of tool differences can be naturally incorporated into the variance structure imposed on the tool-specific means. For such hierarchical models, Bayesian analysis is the most popular method of inference in the literature. We use Bayesian analysis to search for the possible grouping structures of different tools, implemented with the reversible jump Markov chain Monte Carlo (RJMCMC) algorithm proposed by Green [26]. We call the proposed procedure “tolerance control partitioning” (TCP); it partitions tools into several homogeneous groups subject to the engineers’ specified tolerance for mean differences.
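To make the two-level structure concrete, the following sketch simulates data from a generic model of this kind. It is not the model specified in Chapter 2: the way the tolerance enters (as a standard deviation `tolerance_sd` of tool means around a shared group mean), the noise level, the group assignments and all names are assumptions chosen only for illustration. Unequal lot counts per tool mimic unbalanced data.

```python
import random


def simulate_tools(group_of_tool, group_means, tolerance_sd,
                   noise_sd, lots_per_tool, seed=0):
    """Draw a mean for each tool around its group mean, then draw
    normally distributed measurements around that tool mean."""
    rng = random.Random(seed)
    data = {}
    for tool, group in group_of_tool.items():
        # Level 2: similar tools share a group mean; the spread is
        # governed by the (assumed) tolerance standard deviation.
        tool_mean = rng.gauss(group_means[group], tolerance_sd)
        # Level 1: measurements for each processed lot of this tool.
        data[tool] = [rng.gauss(tool_mean, noise_sd)
                      for _ in range(lots_per_tool[tool])]
    return data


# Hypothetical setup: four tools in two groups with unequal usage.
group_of_tool = {"T1": "g1", "T2": "g1", "T3": "g2", "T4": "g2"}
group_means = {"g1": 100.0, "g2": 103.0}
lots_per_tool = {"T1": 40, "T2": 5, "T3": 25, "T4": 12}

data = simulate_tools(group_of_tool, group_means, tolerance_sd=0.2,
                      noise_sd=1.0, lots_per_tool=lots_per_tool)
sample_means = {tool: sum(v) / len(v) for tool, v in data.items()}
print(sample_means)
```

A partition method would see only `data` and would have to recover which tools share a group, with the tolerance telling it how much spread among tool means still counts as "the same".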

The remainder of this thesis is structured as follows. Chapter 2 introduces the hierarchical model and gives guidelines for choosing the priors for the parameters and hyper-parameters in the Bayesian analysis. In Chapter 3, three simulation experiments are designed to illustrate the advantages of the TCP method. For comparison, the simulations also consider the partition results of the CART method with a pruning scheme (which also takes the tolerance of tool differences into account). In Chapter 4, two real examples from the semiconductor industry are analyzed for illustration: one concerns yield enhancement and the other Cp/Cpk enhancement. In addition, we propose two new ideas: integrating TCP with a statistical dashboard [4] for yield enhancement and with automatic process control (APC) [27]-[29] for Cp/Cpk enhancement. Chapter 5 gives conclusions and discussion of the TCP method. Chapter 6 addresses possible extensions of the TCP method as future work.
