A novel design of wafer yield model for semiconductor using a GMDH
polynomial and principal component analysis
Jun-Shuw Lin
Department of Industrial Engineering and Management, National Chiao Tung University, 1001 Dah-Hsei Road, Hsin-Chu 300, Taiwan, ROC
Keywords: Yield model; Defect cluster index; Group method of data handling (GMDH); Principal component analysis (PCA)
Abstract
According to previous studies, the Poisson model and the negative binomial model cannot accurately estimate the wafer yield. Many of the mathematical models proposed in past years are very complicated, and neural network models cannot provide an explicit equation for managers to use. This paper therefore constructs a new wafer yield model in the form of a handy polynomial by using the group method of data handling (GMDH). In addition to the defect cluster index (CIM), 12 critical electrical test parameters are considered simultaneously. Because GMDH should not be given too many input variables, principal component analysis (PCA) is used to reduce the 12 critical electrical test parameters to a manageable few dimensions without much loss of information. The proposed approach is validated by a case obtained from a DRAM company in Taiwan.
© 2011 Elsevier Ltd. All rights reserved.
1. Introduction
For integrated circuit (IC) manufacturers, the wafer yield is a key index for evaluating profit. Semiconductor manufacturing companies strive to achieve defect-free products and increase their profit rate by adopting advanced manufacturing, planning, and evaluating technologies. Among these performance technologies (Leachman, 1993), wafer yield prediction is one of the most widely researched approaches in the complicated semiconductor manufacturing system. Wafer yield prediction is very important for a semiconductor manufacturing factory in improving yield, decreasing cost, and maintaining a good relationship with customers (Kumar et al., 2006). For this reason, managing the wafer yield is an essential task for engineers.
As the wafer size increases, the clustering of defects becomes pronounced. Although the Poisson model is the simplest model to use, it assumes that defects occur independently and with constant probability in any small area on a wafer (Albin & Friedman, 1991). The negative binomial yield model (Stapper, 1973) includes a clustering index ($\alpha$), but the value of $\alpha$ can be very scattered and even negative, which makes analysis unwieldy (Cunningham, 1990). Numerous mathematical models have been developed to predict wafer yield in the last 40 years (Cunningham, 1990; Stapper, 1991; Stapper & Rosner, 1995), but these models are very complicated in practice. Neural networks have also been used to construct wafer yield models, but those models (Tong & Chao, 2008; Tong, Lee, & Su, 1997) require several parameters to be set (e.g., the number of neurons in the hidden layers, the momentum, and the learning rate) and cannot provide an explicit equation for managers to use. Thus, those neural network models are often difficult for managers without deep professional knowledge to use in wafer yield prediction.
On the basis of practicability, this paper constructs a new wafer yield model in the form of a handy polynomial by using the group method of data handling (GMDH) (Ivakhnenko, 1968, 1971). The proposed GMDH model does not need any statistical assumption and is easy to use. In addition to the defect cluster index (CIM) (Tong, Wang, & Chen, 2007), 12 critical electrical test parameters are considered simultaneously. Because GMDH should not be given too many input variables, principal component analysis (PCA) (Pearson, 1901) is used to reduce the 12 critical electrical test parameters to a manageable few dimensions without much loss of information.
Finally, a case of a DRAM company in Taiwan is used to demonstrate the effectiveness of the proposed approach. Comparisons are also made among the negative binomial yield model, the back-propagation neural network (BPNN) yield model, the general regression neural network (GRNN) yield model (Tong & Chao, 2008), and the proposed GMDH yield model to demonstrate that the proposed approach is superior.
2. Literature review
2.1. Yield models
The Poisson yield model assumes that the defects on a chip follow a Poisson probability distribution. Under this assumption, the probability that a chip has k defects is
0957-4174/$ - see front matter © 2011 Elsevier Ltd. All rights reserved. doi:10.1016/j.eswa.2011.09.146
E-mail address: junsoon1@hotmail.com
Expert Systems with Applications
\[ P(k) = \frac{e^{-\lambda_0} \lambda_0^{k}}{k!}, \quad k = 0, 1, 2, \ldots \tag{1} \]
where $\lambda_0$ is the average number of defects per chip, and $k$ is the number of defects per chip. The Poisson yield model can be obtained as
\[ Y = P(k = 0) = e^{-\lambda_0} \tag{2} \]
Cunningham (1990) indicated that the Poisson yield model is appropriate when the chip size is less than 0.25 cm². However, as the chip size increases, the conventional Poisson yield model frequently underestimates the actual wafer yield.
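As a small illustration of Eqs. (1) and (2), the following sketch evaluates the Poisson defect probability and the resulting yield. The function names and the numeric value of the average defect count are illustrative assumptions, not values from the case study.

```python
import math


def poisson_defect_probability(k: int, lam0: float) -> float:
    """Eq. (1): probability that a chip carries exactly k defects."""
    return math.exp(-lam0) * lam0 ** k / math.factorial(k)


def poisson_yield(lam0: float) -> float:
    """Eq. (2): yield is the probability of zero defects, exp(-lam0)."""
    return math.exp(-lam0)


if __name__ == "__main__":
    lam0 = 0.5  # hypothetical average number of defects per chip
    print("P(k = 1):", poisson_defect_probability(1, lam0))
    print("Poisson yield:", poisson_yield(lam0))
```

The yield is simply the k = 0 term of the distribution, so the model has a single parameter and cannot represent defect clustering.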
The negative binomial yield model proposed by Stapper (1973) is a widely applied yield model, which employs a gamma distribution for the defect density. The negative binomial yield model can be expressed as
\[ Y = \frac{1}{(1 + D_0 A / \alpha)^{\alpha}} \tag{3} \]
where $D_0$ is the average number of defects per unit area, $A$ is the chip area, and $\alpha$ is the cluster parameter. The value of $\alpha$ is calculated by the following equation:
\[ \alpha = \frac{\lambda^2}{\sigma^2 - \lambda} \tag{4} \]
where $\lambda$ is the mean number of defects per chip, and $\sigma^2$ is the variance. Cunningham (1990) indicated that the value of $\alpha$ can be quite scattered and sometimes negative when the negative binomial yield model is used to predict yield.
Other yield models are summarized in Stapper and Rosner (1995). Tong et al. (1997) proposed a neural network-based approach to predict the wafer yield. Langford, Liou, and Raghavan (2001) presented a simple robust windowing method for the Poisson yield model to extract the systematic and random components of yield from wafer probe bin map data. Liou et al. (2002) presented a statistical modeling of MOS devices for parametric yield prediction. Meyer and Park (2003) presented a center-satellite model to predict defect-tolerant yield in the embedded core context. Dupret and Kielbasa (2004) presented a partial least squares (PLS) regression model to predict the yield from measurements obtained during production. Kim and Baldwin (2005) presented a theoretical yield model for the assembly process of area array solder interconnects. Tong and Chao (2008) proposed a general regression neural network (GRNN) to predict the wafer yield with clustered defects.
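The behaviour of Eqs. (3) and (4) can be sketched as follows. The sketch assumes that the variance in Eq. (4) is the sample variance of per-chip defect counts; the function names and the defect counts are hypothetical. It shows how the estimated cluster parameter turns negative whenever the variance is smaller than the mean, which is the difficulty noted by Cunningham (1990).

```python
from statistics import mean, variance


def cluster_parameter(defects_per_chip):
    """Eq. (4): alpha = mean^2 / (variance - mean).

    Assumption: the variance is the sample variance (n - 1 denominator).
    """
    m = mean(defects_per_chip)
    v = variance(defects_per_chip)
    return m * m / (v - m)


def negative_binomial_yield(d0, area, alpha):
    """Eq. (3): Y = 1 / (1 + D0 * A / alpha) ** alpha."""
    return (1.0 + d0 * area / alpha) ** (-alpha)


if __name__ == "__main__":
    clustered = [0, 0, 0, 0, 5, 0, 0, 7, 0, 0]  # hypothetical counts, variance > mean
    even = [1, 1, 1, 1, 2, 1, 1, 1, 1, 1]       # hypothetical counts, variance < mean
    print("alpha (clustered):", cluster_parameter(clustered))
    print("alpha (even):", cluster_parameter(even))  # negative value
    print("yield:", negative_binomial_yield(d0=1.0, area=0.5, alpha=2.0))
```

For a very large positive cluster parameter, Eq. (3) approaches the Poisson yield exp(-D0 A), so the Poisson model is the no-clustering limit of the negative binomial model.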
2.2. Defect cluster index
The intensity of defect clustering on a wafer can be described by a defect cluster index. The cluster parameter ($\alpha$) of the negative binomial model, the variance/mean ratio (V/M), and the cluster index (CI), which requires no parametric assumption, are commonly used. The negative binomial yield model is as follows:
\[ Y = \frac{1}{(1 + \lambda / \alpha)^{\alpha}} \tag{5} \]
where $\alpha$ is the cluster parameter and $\lambda$ is the mean number of defects per chip. Earlier reports show that the cluster parameter $\alpha$ in the negative binomial model may be quite scattered and may even take a negative value when the model is used to forecast yield (Cunningham, 1990).
Tyagi and Bayoumi (1992, 1994) superimposed grids of various sizes on a wafer map to measure the intensity of defects distributed on a wafer. The defects contained within each grid cell can be used to judge the spatial distribution of defects. The distribution of defects follows a Poisson distribution if the defects are randomly distributed. Because the variance (V) and the mean (M) are equal in the Poisson distribution, the value of V/M equals 1 if the wafer defects are randomly scattered. The value of V/M exceeds 1 if the defects distributed on a wafer are clustered. The values of V/M depend on how the grids are selected and cannot indicate the gradualness of cross-wafer defect density variations.
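The variance/mean ratio described above can be sketched as follows. This is a simplified illustration, not the procedure of Tyagi and Bayoumi: it assumes defect coordinates lie in a square region of side `wafer_size` (a real wafer is round), and the function name, grid layout, and coordinates are hypothetical.

```python
from statistics import mean, pvariance


def variance_mean_ratio(defects, wafer_size, grid):
    """Count defects in a grid x grid lattice over a square of side wafer_size
    and return V/M of the per-cell counts."""
    if not defects:
        raise ValueError("at least one defect is required")
    cell = wafer_size / grid
    counts = [0] * (grid * grid)
    for x, y in defects:
        col = min(int(x / cell), grid - 1)  # clamp points on the far edge
        row = min(int(y / cell), grid - 1)
        counts[row * grid + col] += 1
    return pvariance(counts) / mean(counts)


if __name__ == "__main__":
    clumped = [(0.1, 0.1), (0.2, 0.3), (0.4, 0.4), (0.6, 0.2)]
    print(variance_mean_ratio(clumped, wafer_size=2.0, grid=2))  # all four in one cell
    print(variance_mean_ratio(clumped, wafer_size=2.0, grid=1))  # one cell only
```

Running the same defects through two grid sizes gives different ratios, which illustrates the dependence on grid selection mentioned in the text.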
Jun, Hong, Kim, Park, and Park (1999) proposed a cluster index based on the projected x and y coordinates of defect locations on a wafer. Defect clustering tends to show clumps in the x and the y coordinates, which result in a large variance in defect intervals. However, showing clumps either on the x-axis or on the y-axis does not necessarily represent the clustered defects. The clustering index CI can be calculated as
\[ CI = \min\left\{ \frac{S_V^2}{\bar{V}^2}, \frac{S_W^2}{\bar{W}^2} \right\} \tag{6} \]
where $V_i$ and $W_i$ are sequences of defect intervals on the x-axis and y-axis defined as
\[ V_i = X_{(i)} - X_{(i-1)}, \quad i = 1, 2, \ldots, n \tag{7} \]
\[ W_i = Y_{(i)} - Y_{(i-1)}, \quad i = 1, 2, \ldots, n \tag{8} \]
where $X_{(i)}$ and $Y_{(i)}$ denote the $i$th smallest defect coordinates on the x-axis and y-axis respectively, $X_{(0)} = Y_{(0)} = 0$, and $n$ is the number of defects on a wafer. The value of CI is close to 1 if the defects are randomly scattered, and the value of CI is expected to be greater than 1 if clustering of defects appears.
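A minimal sketch of Eqs. (6)-(8) follows. It assumes that the variance terms in Eq. (6) are sample variances of the intervals (n - 1 denominator) and that the bars denote interval means; the function names and coordinates are hypothetical.

```python
from statistics import mean, variance


def _scv_of_intervals(coords):
    """Squared coefficient of variation of the gaps between sorted coordinates,
    with the 0th order statistic fixed at 0 as in Eqs. (7)-(8)."""
    ordered = [0.0] + sorted(float(c) for c in coords)
    gaps = [b - a for a, b in zip(ordered, ordered[1:])]
    return variance(gaps) / mean(gaps) ** 2


def cluster_index(defects):
    """Eq. (6): CI = min(S_V^2 / Vbar^2, S_W^2 / Wbar^2)."""
    xs = [x for x, _ in defects]
    ys = [y for _, y in defects]
    return min(_scv_of_intervals(xs), _scv_of_intervals(ys))


if __name__ == "__main__":
    clumped = [(1.0, 1.0), (1.1, 1.2), (1.2, 1.1), (9.0, 9.0)]
    print("CI:", cluster_index(clumped))  # greater than 1 for this clumped layout
```

Taking the minimum over the two axes reflects the remark above that clumps on only one axis do not necessarily indicate clustered defects.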
3. Proposed approach
The construction of the proposed wafer yield model is described in the following subsections.
3.1. Group method of data handling (GMDH)
The GMDH (Ivakhnenko, 1968, 1971) can be expressed as a set of neurons in which different pairs of neurons in each layer are connected through a polynomial and thus produce new neurons in the next layer. In GMDH, the training set is divided into two parts: the model learning set $E_1$ and the model selecting set $E_2$. Let $X = (X_1, X_2, \ldots, X_n)$ and $y$ be the input vector and actual output, respectively. Given $M$ observations of multi-input, single-output data pairs $\{y_i, X_{i1}, X_{i2}, \ldots, X_{in}\}$, $i = 1, 2, \ldots, M$, in set $E_1$, I train a GMDH-type neural network to predict the output values $\hat{y}_i$:
\[ \hat{y}_i = \hat{f}(X_{i1}, X_{i2}, \ldots, X_{in}), \quad i = 1, 2, \ldots, M \tag{9} \]
The problem then becomes one of constructing a GMDH-type neural network so that
\[ \min \sum_{i=1}^{M} \left[ \hat{f}(X_{i1}, X_{i2}, \ldots, X_{in}) - y_i \right]^2 \tag{10} \]
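The connection between the input and output variables can be expressed by a complicated discrete form of the Volterra functional series,
\[ y = a_0 + \sum_{i=1}^{n} a_i X_i + \sum_{i=1}^{n} \sum_{j=1}^{n} a_{ij} X_i X_j + \sum_{i=1}^{n} \sum_{j=1}^{n} \sum_{k=1}^{n} a_{ijk} X_i X_j X_k + \cdots \tag{11} \]
which is also called the Kolmogorov–Gabor (K–G) polynomial (Madala & Ivakhnenko, 1994; Muller & Lemke, 2000). This full form can be approximated by a system of partial polynomials, in particular by the K–G polynomial of degree 2 consisting of only two variables (neurons), in the form of
\[ \hat{y} = G(X_p, X_q) = a_0 + a_1 X_p + a_2 X_q + a_3 X_p X_q + a_4 X_p^2 + a_5 X_q^2 \tag{12} \]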
In this manner, such a partial quadratic description is recursively used in a network of connected neurons to construct the general mathematical relation between the input and output variables given in Eq. (11). The coefficients $a_i$ in Eq. (12) are calculated by least squares (LS) (Madala & Ivakhnenko, 1994; Muller & Lemke, 2000). In this way, the coefficients of each quadratic function $G_i$ are chosen to optimally fit the output $y_i$ over the whole set $E_1$, that is
\[ \min \left[ \frac{\sum_{i=1}^{M} (y_i - G_i)^2}{M} \right] \tag{13} \]
In the GMDH algorithm, all possible pairs of independent variables out of the total $n$ input variables are taken in order to construct the polynomial in the form of Eq. (12) that best fits the dependent observations ($y_i$, $i = 1, 2, \ldots, M$) by LS. Therefore, $\binom{n}{2} = n(n-1)/2$ neurons will be constructed in the first hidden layer of the feed-forward network from the observations $\{(y_i, X_{ip}, X_{iq})\}$ for different $p, q \in \{1, 2, \ldots, n\}$. Likewise, it is now possible to construct $M$ data triples $\{(y_i, X_{ip}, X_{iq})\}$ from observations with such $p, q \in \{1, 2, \ldots, n\}$ in the form
\[ \begin{pmatrix} X_{1p} & X_{1q} & y_1 \\ X_{2p} & X_{2q} & y_2 \\ \vdots & \vdots & \vdots \\ X_{Mp} & X_{Mq} & y_M \end{pmatrix} \]
Using the quadratic sub-expression in the form of Eq. (12) for each row of the $M$ data triples, the following matrix equation is obtained: $A a = Y$, where $a$ is the vector of unknown coefficients of the quadratic polynomial in Eq. (12), $a = \{a_0, a_1, a_2, a_3, a_4, a_5\}^T$, and $Y = \{y_1, y_2, \ldots, y_M\}^T$ is the vector of observed output values. The matrix $A$ is
\[ A = \begin{pmatrix} 1 & X_{1p} & X_{1q} & X_{1p} X_{1q} & X_{1p}^2 & X_{1q}^2 \\ 1 & X_{2p} & X_{2q} & X_{2p} X_{2q} & X_{2p}^2 & X_{2q}^2 \\ \vdots & \vdots & \vdots & \vdots & \vdots & \vdots \\ 1 & X_{Mp} & X_{Mq} & X_{Mp} X_{Mq} & X_{Mp}^2 & X_{Mq}^2 \end{pmatrix} \]
The LS solution of these equations is
\[ a = (A^T A)^{-1} A^T Y \tag{14} \]
which determines the vector of the best coefficients of Eq. (12) for the whole set of M data triples. Note that this procedure is repeated for each neuron of the next hidden layer according to the connectivity topology of the network. In each layer, LS is used to estimate the parameters of the candidate models in set E1, and the external criterion is used to evaluate and select the candidate models in set E2. The process continues until the optimal model is found by the termination principle given by the theory of optimal complexity (Madala & Ivakhnenko, 1994): as model complexity increases, the value of the external criterion first decreases and then increases, and its global minimum corresponds to the optimal complexity.
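One layer of the procedure above can be sketched as follows. This is an illustrative sketch, not the NeuroShell implementation used later in the paper. It assumes NumPy is available, uses the mean squared error on the selecting set E2 as the external criterion (the text does not name a specific criterion), and solves Eq. (14) with `numpy.linalg.lstsq` rather than an explicit matrix inverse. The function names and the synthetic data are hypothetical.

```python
from itertools import combinations

import numpy as np


def design_matrix(xp, xq):
    """Rows [1, Xp, Xq, Xp*Xq, Xp^2, Xq^2], i.e. the matrix A of Section 3.1."""
    return np.column_stack([np.ones_like(xp), xp, xq, xp * xq, xp ** 2, xq ** 2])


def fit_neuron(xp, xq, y):
    """Least-squares coefficients a = (a0, ..., a5) of Eq. (12)."""
    a, *_ = np.linalg.lstsq(design_matrix(xp, xq), y, rcond=None)
    return a


def gmdh_layer(X_learn, y_learn, X_select, y_select, keep):
    """Fit one quadratic neuron per input pair on E1 and rank the neurons by
    mean squared error on E2 (assumed external criterion)."""
    candidates = []
    for p, q in combinations(range(X_learn.shape[1]), 2):
        a = fit_neuron(X_learn[:, p], X_learn[:, q], y_learn)
        pred = design_matrix(X_select[:, p], X_select[:, q]) @ a
        criterion = float(np.mean((y_select - pred) ** 2))
        candidates.append((criterion, p, q, a))
    candidates.sort(key=lambda c: c[0])
    return candidates[:keep]


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.normal(size=(60, 4))            # synthetic inputs, not wafer data
    y = 1.0 + 2.0 * X[:, 0] * X[:, 2]       # synthetic target
    best = gmdh_layer(X[:40], y[:40], X[40:], y[40:], keep=2)
    for criterion, p, q, a in best:
        print(f"pair ({p}, {q}): criterion = {criterion:.3e}")
```

In a full network, the outputs of the retained neurons would become the inputs of the next layer, and layers would be added until the external criterion stops decreasing.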
3.2. Principal component analysis (PCA)
Given a set of centered input vectors $x_t$ ($t = 1, 2, \ldots, l$ and $\sum_{t=1}^{l} x_t = 0$), each of which is of dimension $m$, $x_t = [x_t(1), x_t(2), \ldots, x_t(m)]^T$ (usually $m < l$), PCA (Pearson, 1901) linearly transforms each vector $x_t$ into a new one $s_t$ by
\[ s_t = U^T x_t \tag{15} \]
where $U$ is the $m \times m$ orthogonal matrix whose $i$th column $u_i$ is the $i$th eigenvector of the sample covariance matrix
\[ C = \frac{1}{l} \sum_{t=1}^{l} x_t x_t^T \tag{16} \]
In other words, PCA first solves the eigenvalue problem
\[ \lambda_i u_i = C u_i, \quad i = 1, 2, \ldots, m \tag{17} \]
where $\lambda_i$ is one of the eigenvalues of $C$ and $u_i$ is the corresponding eigenvector. Based on the estimated $u_i$, the components of $s_t$ are then calculated as the orthogonal transformations of $x_t$:
\[ s_t(i) = u_i^T x_t, \quad i = 1, 2, \ldots, m \tag{18} \]
The new components are called principal components. By using only the first several eigenvectors, sorted in descending order of the eigenvalues, the number of principal components in $s_t$ can be reduced; this is how PCA achieves dimension reduction. The principal components have the following properties: the $s_t(i)$ are uncorrelated, have sequentially maximum variances, and the mean squared approximation error in representing the original inputs by the first several principal components is minimal (Jolliffe, 1986).
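Eqs. (15)-(18) can be sketched with an eigendecomposition of the sample covariance matrix. The sketch assumes NumPy is available and that observations are stored one per row; the function name `pca` and the synthetic data are hypothetical.

```python
import numpy as np


def pca(X, n_components):
    """X is an l x m array with one observation per row.

    Returns (scores, eigenvalues in descending order, retained eigenvectors).
    """
    Xc = X - X.mean(axis=0)                 # centre so that the vectors sum to zero
    C = Xc.T @ Xc / Xc.shape[0]             # Eq. (16): C = (1/l) sum x_t x_t^T
    eigvals, eigvecs = np.linalg.eigh(C)    # Eq. (17); eigh returns ascending order
    order = np.argsort(eigvals)[::-1]       # re-sort to descending eigenvalues
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    U_k = eigvecs[:, :n_components]         # keep the first several eigenvectors
    scores = Xc @ U_k                       # Eq. (18): s_t(i) = u_i^T x_t
    return scores, eigvals, U_k


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    X = rng.normal(size=(200, 3))           # synthetic data, not wafer measurements
    scores, eigvals, U_k = pca(X, n_components=2)
    print("eigenvalues:", eigvals)
    print("score covariance:\n", scores.T @ scores / len(scores))
```

The printed score covariance is diagonal, with the retained eigenvalues on the diagonal, which is the uncorrelatedness property stated above.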
3.3. Defect cluster index (CIM)
In this study, I use the clustering index (CIM) proposed by Tong et al. (2007) to measure the clustering phenomenon of defects. CIM is obtained through the following five steps.
Step 1: Project the defect coordinates $(X_i, Y_i)$ onto a new axis obtained by rotating the x-axis counterclockwise by $\theta$. Suppose that a wafer has $n$ defects, and $(X_i, Y_i)$ denotes the x and y coordinates of the $i$th defect location in a two-dimensional space, $i = 1, \ldots, n$. These $n$ defects can then be projected onto a new axis obtained by rotating the x-axis counterclockwise by $\theta$ degrees. The new coordinate $X_{i,\theta}$ of the $i$th defect with respect to $\theta$ can be calculated as follows:
\[ X_{i,\theta} = \cos\theta \cdot X_i + \sin\theta \cdot Y_i \tag{19} \]
where $i$ denotes the $i$th defect and $\theta$ represents the rotation angle, with $0^\circ \le \theta \le 180^\circ$.
Step 2: Sort the $X_{i,\theta}$ values in ascending order and calculate the intervals between adjacent coordinate values. The intervals can be calculated as follows:
\[ V_{i,\theta} = X_{(i,\theta)} - X_{(i-1,\theta)} \tag{20} \]
where $X_{(0,\theta)} = 0$ and $V_{i,\theta}$ represents the $i$th interval between $X_{(i,\theta)}$ and $X_{(i-1,\theta)}$.
Step 3: Calculate the squared coefficient of variation (SCV) for $V_{i,\theta}$. The SCV for $V_{i,\theta}$ can be determined as follows:
\[ SCV_\theta = \frac{S_{V,\theta}^2}{\bar{V}_\theta^2} \tag{21} \]
where $SCV_\theta$ represents the squared coefficient of variation for $V_{i,\theta}$, $\bar{V}_\theta = \left(\sum_{i=1}^{n} V_{i,\theta}\right)/n$, and $S_{V,\theta}^2 = \left(\sum_{i=1}^{n} (V_{i,\theta} - \bar{V}_\theta)^2\right)/(n-1)$.
Step 4: Change the angle $\theta$ and calculate the corresponding $SCV_\theta$ value. The 180 $SCV_\theta$ values, with $\theta$ increased in steps of 1°, can be obtained through Steps 1–3.
Step 5: According to the $SCV_\theta$ values obtained from Step 4, the average $SCV_\theta$ value determines the clustering index (CIM), as follows:
\[ \mathrm{CIM} = \frac{\sum_{\theta=0}^{180} SCV_\theta}{180} \tag{22} \]
where CIM represents the defect cluster index. A larger CIM value indicates a stronger degree of defect clustering on a wafer.
3.4. Prepare the relative data per wafer
In this study, the defect count, the value of CIM, and the principal component scores are used as the input variables for GMDH. The actual wafer yield is the output variable for GMDH. Brief descriptions of how CIM, the principal component scores, and the actual wafer yield are obtained follow.
3.4.1. Calculate the value of CIM
The clustering of defects on a wafer influences the accuracy of a wafer yield model, and the CIM can effectively measure this clustering. The CIM is obtained by the five calculation steps introduced in Section 3.3.
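The five steps of Section 3.3 can be sketched as below. The sketch applies Eqs. (19)-(21) literally and makes two assumptions that should be checked against Tong et al. (2007): it averages over the 180 angles 0°, 1°, ..., 179° (Eq. (22) lists summation limits 0 to 180 but divides by 180), and it does not treat angles above 90°, where projected coordinates can be negative, or an angle at which the largest projected coordinate is exactly zero, in any special way. Function names and coordinates are hypothetical.

```python
import math
from statistics import mean, variance


def scv_at_angle(defects, theta_deg):
    """Steps 1-3: project, sort, take intervals, return the SCV for one angle."""
    t = math.radians(theta_deg)
    projected = sorted(math.cos(t) * x + math.sin(t) * y for x, y in defects)  # Eq. (19)
    ordered = [0.0] + projected                                # X_(0, theta) = 0
    gaps = [b - a for a, b in zip(ordered, ordered[1:])]       # Eq. (20)
    return variance(gaps) / mean(gaps) ** 2                    # Eq. (21), n - 1 variance


def cim(defects, angles=range(180)):
    """Steps 4-5: average the SCV values over the rotation angles (assumed 0..179)."""
    values = [scv_at_angle(defects, a) for a in angles]
    return sum(values) / len(values)


if __name__ == "__main__":
    pts = [(1.0, 5.0), (2.0, 1.0), (3.0, 9.0), (4.0, 2.0)]  # hypothetical defect map
    print("SCV at 0 degrees:", scv_at_angle(pts, 0))    # evenly spaced x values give 0
    print("SCV at 90 degrees:", scv_at_angle(pts, 90))
    print("CIM over two angles:", cim(pts, angles=[0, 90]))
```

Averaging over many projection directions is what distinguishes CIM from the two-axis index CI, which only inspects the x and y projections.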
3.4.2. Obtain the value of principal component scores
Use principal component analysis (PCA) to form new variables that are linear combinations of the original variables (i.e., the 12 critical electrical test parameters). Then substitute the standardized data of the original variables into the linear equations of the new variables to obtain the principal component scores.
3.4.3. Calculate the value of actual wafer yield
The actual yield value is obtained as the number of non-defective chips divided by the total number of chips on a wafer.
3.5. Verify the proposed model
The accuracy of neural networks can be measured by the root mean squared error (RMSE). The smaller the RMSE, the higher the accuracy. The RMSE can be calculated as
\[ RMSE = \sqrt{\frac{\sum_{i=1}^{n} (A_i - O_i)^2}{n}} \tag{23} \]
where $n$ represents the number of data points, $A_i$ represents the actual output value, and $O_i$ represents the predicted value. The general indicator for measuring the strength of the relationship between the actual and predicted outputs is Pearson's linear correlation coefficient $r$. In this study, both RMSE and $r$ are used to evaluate the performance of the wafer yield model.
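The two evaluation measures can be sketched in a few lines. The function names and the yield values are illustrative assumptions, not data from the case study.

```python
import math


def rmse(actual, predicted):
    """Eq. (23): root mean squared error between actual and predicted outputs."""
    n = len(actual)
    return math.sqrt(sum((a - o) ** 2 for a, o in zip(actual, predicted)) / n)


def pearson_r(actual, predicted):
    """Pearson's linear correlation coefficient between two equal-length sequences."""
    n = len(actual)
    mean_a = sum(actual) / n
    mean_o = sum(predicted) / n
    cov = sum((a - mean_a) * (o - mean_o) for a, o in zip(actual, predicted))
    dev_a = math.sqrt(sum((a - mean_a) ** 2 for a in actual))
    dev_o = math.sqrt(sum((o - mean_o) ** 2 for o in predicted))
    return cov / (dev_a * dev_o)


if __name__ == "__main__":
    actual = [0.82, 0.91, 0.76, 0.88]      # hypothetical actual yields
    predicted = [0.80, 0.93, 0.79, 0.85]   # hypothetical predicted yields
    print("RMSE:", rmse(actual, predicted))
    print("r:", pearson_r(actual, predicted))
```

RMSE measures the size of the prediction errors, while r measures only how linearly related the two series are, so the two are reported together.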
4. Implementation
In this section, a case of a DRAM company in Taiwan is used to demonstrate the effectiveness of the proposed approach. Comparisons are also made among the negative binomial yield model, the back-propagation neural network (BPNN) yield model, the general regression neural network (GRNN) yield model (Tong & Chao, 2008), and the proposed GMDH yield model to demonstrate that the proposed approach is superior.
Table 1
Part of the 12 critical electrical test parameters of the 111 wafers in this case.
No.   Parameter x1   Parameter x2   Parameter x3   ...   Parameter x12
1     1.1727         0.0658         322.9658       ...   2.1098
2     1.1662         0.0615         315.9007       ...   2.6533
3     1.1774         0.0682         322.2566       ...   2.3412
4     1.1695         0.0632         310.9355       ...   2.2145
5     1.1834         0.0595         313.5132       ...   2.4589
6     1.1607         0.0659         320.4290       ...   2.5098
...   ...            ...            ...            ...   ...
109   1.1808         0.0614         319.6768       ...   2.3325
110   1.1762         0.0604         321.3472       ...   2.5977
111   1.1849         0.0632         317.0603       ...   2.1104
Fig. 1. The scree plot of PCA.
4.1. PCA of 12 critical electrical test parameters
The case of a DRAM company in Taiwan comprises data for 111 8-in. wafers, and 12 critical electrical test parameters per wafer are considered. Part of the 12 critical electrical test parameters of the 111 wafers is listed in Table 1.
The computer software STATISTICA 6.0 is used to perform PCA. Fig. 1 shows the scree plot of PCA, Fig. 2 shows the eigenvalues, and Fig. 3 shows the eigenvectors. By Kaiser's rule, only those components whose eigenvalues are greater than 1 are retained. Therefore, 3 principal components are retained. According to Fig. 3, the 3 principal components can be calculated as follows:
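\[ \mathrm{Prin1} = 0.0267 x_1 - 0.3285 x_2 - 0.2843 x_3 + \cdots + 0.3363 x_{12} \tag{24} \]
\[ \mathrm{Prin2} = 0.0077 x_1 + 0.2830 x_2 + 0.3014 x_3 + \cdots + 0.2779 x_{12} \tag{25} \]
\[ \mathrm{Prin3} = 0.9385 x_1 + 0.0655 x_2 - 0.1783 x_3 + \cdots + 0.0534 x_{12} \tag{26} \]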
The standardized principal component scores of Prin1, Prin2, and Prin3 are partially listed in Table 2.
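The retention step can be sketched as follows. This is not the STATISTICA procedure used in the paper: it assumes NumPy is available, standardizes each parameter, decomposes the correlation matrix, and applies Kaiser's rule. Whether the paper's "standardized scores" are additionally rescaled by the eigenvalues is not stated, so the scores below are left unscaled. The function name and the synthetic data are hypothetical.

```python
import numpy as np


def kaiser_components(X):
    """Standardize the columns of X, eigendecompose the correlation matrix,
    and keep the components whose eigenvalues exceed 1 (Kaiser's rule)."""
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)   # standardized data
    R = np.corrcoef(Z, rowvar=False)                   # correlation matrix
    eigvals, eigvecs = np.linalg.eigh(R)               # ascending order
    order = np.argsort(eigvals)[::-1]
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    keep = eigvals > 1.0
    scores = Z @ eigvecs[:, keep]                      # principal component scores
    return eigvals, eigvecs[:, keep], scores


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    factors = rng.normal(size=(111, 2))                # synthetic data, not wafer data
    noise = 0.1 * rng.normal(size=(111, 4))
    X = np.column_stack([factors[:, 0], factors[:, 0],
                         factors[:, 1], factors[:, 1]]) + noise
    eigvals, vectors, scores = kaiser_components(X)
    print("eigenvalues:", eigvals)
    print("retained components:", vectors.shape[1])
```

Because the eigenvalues of a correlation matrix sum to the number of variables, an eigenvalue above 1 marks a component that explains more variance than any single standardized variable.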
Fig. 3. The eigenvectors of PCA.
Fig. 4. The five clustering patterns.
Table 2
The standardized principal component scores of Prin1, Prin2, and Prin3.
No.   Scores of Prin1   Scores of Prin2   Scores of Prin3
1     1.2071            0.6119            0.1203
2     0.6368            1.5014            0.0992
3     1.6939            0.1562            1.3042
...   ...               ...               ...
109   0.8735            1.3814            0.5404
110   0.6523            0.8559            1.4322
111   0.2593            1.6555            1.4796
4.2. Construct a new wafer yield model using GMDH
In this study, the computer software NeuroShell 2.0 is used to construct the proposed GMDH yield model. In this case of a DRAM company in Taiwan, one random pattern and four clustering patterns (i.e., bull's-eye pattern, edge pattern, bottom pattern, and crescent moon pattern) (Friedman, Hansen, Nair, & James, 1997) are considered, and these five patterns are shown in Fig. 4.
Eighty-nine wafers are randomly selected as training samples, and the remaining 22 wafers are the testing samples. The result of GMDH learning is shown in Fig. 5. From Fig. 5, the GMDH polynomial of the proposed yield model is given in Eq. (27)
Y = 8.8E-002*X5 - 0.15*X2 + X3 - 0.11*X4 - 3.8E-002 - 7.3E-002*X1
    - 1.1*X3^2 - 6.9*X4^2 - 0.61*X3^3 + 1.8*X4^3 + 8.5*X3*X4 - 1.3*X3*X5
    + 7*X4*X5 + 7.9*X3*X4*X5 + 0.1*X2^2 + 1.1*X2*X3 + 0.28*X2*X5
    - 1.4*X2*X3^2 - 6.5*X2*X4^2 - 0.77*X2*X3^3 + 1.7*X2*X4^3
    + 8.1*X2*X3*X4 - 1.3*X2*X3*X5 + 6.6*X2*X4*X5 + 7.5*X2*X3*X4*X5
    - 0.26*X1^2 + 0.12*X1^3 + 0.19*X5^2 + 0.15*X2^3 + 0.21*X5^3   (27)
where Y denotes the predicted wafer yield value, X1 is the defect count, X2 is the value of CIM, X3 is the score of Prin1, X4 is the score of Prin2, and X5 is the score of Prin3. The value of RMSE is $\sqrt{\mathrm{MSE}} = \sqrt{0.010540} = 0.1027$, and the value of the correlation coefficient is 0.9784.
4.3. Compare with other wafer yield models
Finally, the comparisons among the negative binomial yield model, the back-propagation neural network (BPNN) yield model, the general regression neural network (GRNN) yield model (Tong & Chao, 2008), and the proposed GMDH yield model are listed in Table 3. The scatter plots for the negative binomial yield model, the BPNN yield model, the GRNN yield model, and the proposed GMDH yield model are shown in Figs. 6–9.
Table 3 shows that the proposed GMDH model has the smallest RMSE and the largest correlation coefficient among the four models. Therefore, the predictive accuracy of the proposed model is superior in this case.
Table 3
Comparisons of RMSE and r between predictive and actual yield value.
Yield model                      RMSE     r
Negative binomial yield model    0.1443   0.9159
BPNN yield model                 0.1224   0.9308
GRNN yield model                 0.1189   0.9496
Proposed GMDH yield model        0.1027   0.9784
Fig. 5. The result of GMDH learning.
Fig. 6. The scatter plot in negative binomial yield model.
Fig. 7. The scatter plot in BPNN yield model.
Fig. 9. The scatter plot in GMDH yield model.
(Scatter-plot axes: actual yield value and predictive yield value, each ranging from 0.6 to 1.)
5. Conclusions
When the clustering of defects is pronounced, the conventional Poisson yield model cannot reasonably estimate the wafer yield. Neural network models require parameters to be set and cannot provide an explicit equation for managers to use.
On the basis of practicability, this paper constructs a new wafer yield model in the form of a handy polynomial by using GMDH, and the model can accurately predict the wafer yield. In addition to the defect cluster index (CIM), 12 critical electrical test parameters are considered simultaneously. In this study, PCA is used to reduce the 12 critical electrical test parameters to a manageable few dimensions without much loss of information.
The merits of the proposed approach are summarized as follows:
(1) The proposed GMDH yield model provides a handy polynomial for managers to use, and it does not require the parameter settings of neural networks.
(2) This study employs PCA to reduce the 12 critical electrical test parameters to a manageable few dimensions without much loss of information, which effectively simplifies the construction of variables.
(3) The proposed GMDH yield model learns quickly and predicts with high accuracy.
(4) The proposed GMDH yield model does not need any statistical assumption and is easy to use.
(5) The proposed GMDH yield model can help IC manufacturers manage the wafer yield and evaluate their process capability in relation to profit and loss.
Acknowledgement
The author thanks the National Chiao Tung University for its resourceful support.
References
Albin, S. L., & Friedman, D. J. (1991). Clustered defects in IC fabrication: impact on process control charts. IEEE Transactions on Semiconductor Manufacturing, 4(1), 36–42.
Cunningham, J. A. (1990). The use and evaluation of yield models in integrated circuit manufacturing. IEEE Transactions on Semiconductor Manufacturing, 3(2), 60–71.
Dupret, Y., & Kielbasa, R. (2004). Modeling semiconductor manufacturing yield by test data and partial least squares. In Proceedings of 16th international conference on microelectronics (pp. 404–407).
Friedman, D. J., Hansen, M. H., Nair, V. N., & James, D. A. (1997). Model-free estimation of defect clustering in integrated circuit fabrication. IEEE Transactions on Semiconductor Manufacturing, 10(3), 344–359.
Ivakhnenko, A. G. (1968). The group method of data handling; a rival of the method of stochastic approximation. Soviet Automatic Control, 13(3), 43–55.
Ivakhnenko, A. G. (1971). Polynomial theory of complex systems. IEEE Transactions on Systems, Man and Cybernetics, 1(4), 364–378.
Jolliffe, I. T. (1986). Principal component analysis. New York: Springer.
Jun, C. H., Hong, Y., Kim, S. Y., Park, K. S., & Park, H. (1999). A simulation-based semiconductor chip yield model incorporating a new defect cluster index. Microelectronics Reliability, 39(4), 451–456.
Kim, C., & Baldwin, D. F. (2005). A theoretical yield model for assembly process of area array solder interconnects packages with experimental verification. IEEE Transactions on Electronics Packaging Manufacturing, 28(4), 344–354.
Kumar, N., Kennedy, K., Gildersleeve, K., Abelson, R., Mastrangelo, C. M., & Montgomery, D. C. (2006). A review of yield modeling techniques for semiconductor manufacturing. International Journal of Production Research, 44(23), 5019–5036.
Langford, R. E., Liou, J. J., & Raghavan, V. (2001). The application and validation of a new robust windowing method for the Poisson yield model. In Advanced semiconductor manufacturing conference, IEEE/SEMI (pp. 157–160).
Leachman, R. C. (1993). The competitive semiconductor manufacturing survey. In IEEE international symposium on semiconductor manufacturing conference, Austin, Texas, USA (pp. 359–381). Piscataway, NJ: IEEE.
Liou, J. J., Zhang, Q., McMacken, J., Thomson, J. R., Stiles, K., & Layman, P. (2002). Statistical modeling of MOS devices for parametric yield prediction. Microelectronics Reliability, 42(4), 787–795.
Madala, H. R., & Ivakhnenko, A. G. (1994). Inductive learning algorithms for complex systems modeling. Boca Raton, FL: CRC Press.
Meyer, F. J., & Park, N. (2003). Predicting defect-tolerant yield in the embedded core context. IEEE Transactions on Computers, 52(11), 1470–1479.
Muller, J. A., & Lemke, F. (2000). Self-organising data mining: An intelligent approach to extract knowledge from data. Hamburg: Libri.
Pearson, K. (1901). On lines and planes of closest fit to systems of points in space. Philosophical Magazine, 2, 559–572.
Stapper, C. H. (1973). Defect density distribution for LSI yield calculations. IEEE Transactions on Electron Devices (Correspondence), 20(7), 655–657.
Stapper, C. H. (1991). On Murphy’s yield integral. IEEE Transactions on Semiconductor Manufacturing, 4(4), 294–297.
Stapper, C. H., & Rosner, R. J. (1995). Integrated circuit yield management and yield analysis: Development and implementation. IEEE Transactions on Semiconductor Manufacturing, 8(2), 95–102.
Tong, L. I., & Chao, L. C. (2008). Novel yield model for integrated circuit with clustered defects. Expert Systems with Applications, 34, 2334–2341.
Tong, L. I., Lee, W. I., & Su, C. T. (1997). Using a neural network-based approach to predict the wafer yield in integrated circuit manufacturing. IEEE Transactions on Components, Packaging, and Manufacturing Technology – Part C, 20(4), 288–294.
Tong, L. I., Wang, C. H., & Chen, D. L. (2007). Development of a new cluster index for wafer defects. International Journal of Advanced Manufacturing Technology, 31, 705–715.
Tyagi, A., & Bayoumi, M. A. (1992). Defect clustering viewed through generalized Poisson distribution. IEEE Transactions on Semiconductor Manufacturing, 5(3), 196–206.
Tyagi, A., & Bayoumi, M. A. (1994). The nature of defect patterns on integrated circuit wafer maps. IEEE Transactions on Reliability, 43(1), 22–29.