- Compacted BWI Similarity-computing algorithm:

Input: The relevant degree rdij of record Rj with a new query, the Compacted Mask Vector, and the Similarity Mapping List L.

Output: The similarity of Rj with a new record.

Step 1: Initialize a zero binary string of length r.

Step 2: For each i, 1 ≤ i ≤ $\sum_{k=1}^{r} el_k$, set the i-th position in the string to 1 if

AND(cMaski, rdij) = AND(cMaski, BWIN).

Step 3: Transform the binary string into an integer j.

Step 4: Get Lj from the Similarity Mapping List.

Step 5: Return Lj.
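The five steps above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the thesis implementation: bit vectors are held as Python integers, position 1 of the binary string is assumed to be the most significant bit, and the Similarity Mapping List is assumed to store, for each integer j, the sum of the weights whose positions are set in j. The helper names (`parse_bits`, `build_mapping_list`, `similarity`) are hypothetical. The masks, weights and rdi values are taken from Example 4.7, and the concatenated BWI_N is assumed to use the same field order as rdi₄.

```python
def parse_bits(text):
    """Parse a bit string such as '00000 10000' (spaces ignored) into an int."""
    return int(text.replace(" ", ""), 2)


def build_mapping_list(weights):
    """Assumed Similarity Mapping List: entry j is the summed weight of the bits set in j."""
    r = len(weights)
    mapping = []
    for j in range(2 ** r):
        flags = format(j, f"0{r}b")  # position 1 is the leftmost character
        mapping.append(round(sum(w for f, w in zip(flags, weights) if f == "1"), 10))
    return mapping


def similarity(rd, bwi_n, masks, mapping):
    """Steps 1-5: compare rd with bwi_n under each mask, then look up the similarity."""
    flags = "".join("1" if (m & rd) == (m & bwi_n) else "0" for m in masks)  # Steps 1-2
    j = int(flags, 2)                                                        # Step 3
    return mapping[j]                                                        # Steps 4-5


masks = [parse_bits(s) for s in (
    "00000 00000 11111 0000 0000 00000 00000",  # cMask2
    "00000 00000 00000 1111 1111 00000 00000",  # cMask3
    "00000 00000 00000 0000 0000 11111 11111",  # cMask4
)]
mapping = build_mapping_list([0.4, 0.4, 0.2])   # W2, W3, W4
bwi_n = parse_bits("00000 00000 10000 1000 0010 00001 00100")  # assumed field order
rdi_1 = parse_bits("00000 00000 10000 1000 0000 00000 00000")
rdi_4 = parse_bits("00000 00000 10000 1000 0010 00001 00100")

print(similarity(rdi_1, bwi_n, masks, mapping))  # 0.4
print(similarity(rdi_4, bwi_n, masks, mapping))  # 1.0
```

Precomputing the mapping list trades 2^r table entries for a single lookup per record, which is the point of Steps 3 and 4.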

EXAMPLE 4.7:

Continuing from Example 4.6, the BWI_N^M and BWI_N^D of a new query RN, which is <StepID=PS_1, ToolID=AWOX13, Yield=99.1>, are <ei₁¹=00000 ei₂¹=10000 ei₃¹=1000 ei₄¹=00001> and <ei₁²=00000 ei₃²=0010 ei₄²=00100>, respectively. Also assume that the weights W2, W3 and W4 are set to 0.4, 0.4 and 0.2, respectively. Each BWI_j^M of T_BWI in Table 4.3 is processed as follows.

• For BWI₁^M, the relevant degree rdm₁ = <00000 10000 1000 00000> since:

<00000 10000 1000 00001> (BWI_N^M) AND <00000 10000 1000 00000> (BWI₁^M)

<00000 10000 1000 00000> (rdm₁)

Since more than one bit in rdm₁ is "1", all BWI_i^D in Drill-Packet Matrix T_BWI^DP¹ are retrieved for further investigation. There are three BWI_i^D in T_BWI^DP¹: BWI₁^D, BWI₂^D and BWI₃^D. The relevant degree rdd₁ = <00000 0000 00000> since:

<00000 0010 00100> (BWI_N^D) AND <00000 1000 10000> (BWI₁^D)

<00000 0000 00000> (rdd₁)

The relevant degree rdd₂ = <00000 0000 00000> since:

<00000 0010 00100> (BWI_N^D) AND <00000 1000 10000> (BWI₂^D)

<00000 0000 00000> (rdd₂)

The relevant degree rdd3 = <00000 0000 00000> since:

<00000 0010 00100> (BWI_N^D) AND <00000 0100 10000> (BWI₃^D)

<00000 0000 00000> (rdd₃)

After the Compacted BWI Concatenate-rdi-result algorithm is executed, rdi₁, rdi₂ and rdi₃ are generated as follows:

rdi1= <00000 00000 10000 1000 0000 00000 00000>

rdi₂= <00000 00000 10000 1000 0000 00000 00000>

rdi3= <00000 00000 10000 1000 0000 00000 00000>

According to Definition 5.1, cMask2 = <00000 00000 11111 0000 0000 00000 00000> and cMask3 = <00000 00000 00000 1111 1111 00000 00000>. Since AND(cMask2, rdi1) = <00000 00000 10000 0000 0000 00000 00000> is equal to AND(cMask2, BWIN) = <00000 00000 10000 0000 0000 00000 00000>, whereas AND(cMask3, rdi1) = <00000 00000 00000 1000 0000 00000 00000> is not equal to AND(cMask3, BWIN) = <00000 00000 00000 1000 0010 00000 00000>, only W2 contributes, and the similarities of Records 1, 2 and 3 are each found to be 0.4.

Records 1, 2 and 3 are thus relevant records.

• For BWI₂^M, the relevant degree rdm₂ = <00000 10000 1000 00001> since:

<00000 10000 1000 00001> (BWI_N^M) AND <00000 10000 1000 00001> (BWI₂^M)

<00000 10000 1000 00001> (rdm₂)

Since more than one bit in rdm₂ is "1", all BWI_i^D in Drill-Packet Matrix T_BWI^DP² are retrieved for further investigation. There is only one BWI_i^D in T_BWI^DP², namely BWI₄^D. The relevant degree rdd₄ = <00000 0010 00100> since:

<00000 0010 00100> (BWI_N^D) AND <00000 0010 00100> (BWI₄^D)

<00000 0010 00100> (rdd₄)

After the Compacted BWI Concatenate-rdi-result algorithm is executed, rdi₄ is generated as follows:

rdi₄ = <00000 00000 10000 1000 0010 00001 00100>

According to Definition 5.1, cMask2 = <00000 00000 11111 0000 0000 00000 00000>, cMask3 = <00000 00000 00000 1111 1111 00000 00000> and cMask4 = <00000 00000 00000 0000 0000 11111 11111>. Since AND(cMask2, rdi4) is equal to AND(cMask2, BWIN), AND(cMask3, rdi4) is equal to AND(cMask3, BWIN), and AND(cMask4, rdi4) is equal to AND(cMask4, BWIN), the similarity of Record 4 is found to be 1. Record 4 is thus a relevant record.

• For BWI₃^M, the relevant degree rdm₃ = <00000 10000 1000 00001> since:

<00000 10000 1000 00001> (BWI_N^M) AND <00000 10000 1000 00001> (BWI₃^M)

<00000 10000 1000 00001> (rdm₃)

Since more than one bit in rdm₃ is "1", all BWI_i^D in Drill-Packet Matrix T_BWI^DP³ are retrieved for further investigation. There is only one BWI_i^D in T_BWI^DP³, namely BWI₅^D. The relevant degree rdd₅ = <00000 0000 00000> since:

<00000 0010 00100> (BWI_N^D) AND <00000 1000 00010> (BWI₅^D)

<00000 0000 00000> (rdd₅)

After the Compacted BWI Concatenate-rdi-result algorithm is executed, rdi₅ is generated as follows:

rdi₅ = <00000 00000 10000 1000 0000 00001 00000>

According to Definition 5.1, cMask2 = <00000 00000 11111 0000 0000 00000 00000>, cMask3 = <00000 00000 00000 1111 1111 00000 00000> and cMask4 = <00000 00000 00000 0000 0000 11111 11111>. Since AND(cMask2, rdi5) is equal to AND(cMask2, BWIN) but AND(cMask3, rdi5) is not equal to AND(cMask3, BWIN), the similarity of Record 5 is found to be 0.4. Record 5 is thus a relevant record.

• For the other BWI_i^M, where 6 ≤ i ≤ 14, the relevant degrees rdm_i are all equal to <00000 00000 0000 00000>. Since there is no "1" bit in rdm_i, all other records are filtered out using the Main Matrix only.
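The filtering performed in this example can be sketched as below. It is an illustrative Python fragment rather than the thesis code: `relevant_degree` and `drill_candidates` are hypothetical names, bit vectors are stored as integers, and the third main-matrix row is an invented value used only to show a record being filtered out by the Main Matrix.

```python
def parse_bits(text):
    """Parse a bit string such as '00000 10000' (spaces ignored) into an int."""
    return int(text.replace(" ", ""), 2)


def relevant_degree(bwi_query, bwi_record):
    """Relevant degree: bit-wise AND of the query index and the record index."""
    return bwi_query & bwi_record


def drill_candidates(bwi_n_m, main_rows):
    """Keep only the main-matrix rows whose relevant degree has at least one 1 bit."""
    kept = {}
    for row_id, bwi_m in main_rows.items():
        rdm = relevant_degree(bwi_n_m, bwi_m)
        if rdm != 0:
            kept[row_id] = rdm
    return kept


bwi_n_m = parse_bits("00000 10000 1000 00001")
main_rows = {
    1: parse_bits("00000 10000 1000 00000"),  # BWI_1^M from the example
    2: parse_bits("00000 10000 1000 00001"),  # BWI_2^M from the example
    6: parse_bits("00000 01000 0100 00010"),  # hypothetical row sharing no bits
}
kept = drill_candidates(bwi_n_m, main_rows)
print(sorted(kept))  # [1, 2]

# Drill-Packet step for one row: BWI_N^D AND BWI_2^D
bwi_n_d = parse_bits("00000 0010 00100")
rdd_2 = relevant_degree(bwi_n_d, parse_bits("00000 1000 10000"))
print(rdd_2)  # 0
```

Only rows that survive the Main Matrix test need their Drill-Packet entries read, which is where the method saves work.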

After the relevant records are sorted in decreasing order of similarity, the results are as shown in Table 4.18.

Table 4.18: Five relevant records and their similarities

Relevant Record   Record 1   Record 2   Record 3   Record 4   Record 5
Similarity        0.4        0.4        0.4        1          0.4

4.2.4 Analysis and Experiments of Compacted BWI Method

As we can see, the major difference between the Encapsulated BWI Similar-records-seeking algorithm (Algorithm 4.4) of the Encapsulated BWI method and the Compacted BWI Similar-records-seeking algorithm (Algorithm 4.10) of the Compacted BWI method is in Step 3. The computation time analysis (worst-case analysis) is shown below.

In the Encapsulated BWI method, the number of “AND” operations required is

[formula lost in extraction]

In the Compacted BWI method, the number of “AND” operations required is

[formula lost in extraction]

The number of extra “AND” operations is:

[formula lost in extraction]

In the worst-case analysis, the Compacted BWI method therefore uses more “AND” operations than the Encapsulated BWI method. However, the Encapsulated BWI method must process more bits than the Compacted BWI method.

In the Encapsulated BWI method, the total number of bits to be processed is

[formula lost in extraction]

In the Compacted BWI method, the total number of bits to be processed is

[formula lost in extraction]

The saved bits are:

[formula lost in extraction]

When the record size (|R|) of T is quite large, the Compacted BWI method can be applied, since the disk storage will be greatly reduced. Once the record size is smaller than [threshold expression lost in extraction], the Encapsulated BWI method should be used, since the Compacted BWI method requires extra processing time.

Chapter 5 Using BWI indexing in an Intelligent Manufacturing Defect Detection Method for the Time Issue

In this chapter, we introduce an implementation consisting of a reinforcement-learning root-cause learning system for defect detection, focused on the time aspect in manufacturing domains. This implementation employs the Sample Bit-Wise Indexing Method to encode the defect status of manufacturing products and hence accelerate data preprocessing. Additionally, a bit-based Genetic Algorithm is used to learn suitable weights for each computed signature, since the chromosome and the corresponding GA operators are appropriate for the bit operations of the BWI indexing method.

5.1 Problem Description

In recent years, the problem of detecting defects in the workshop has become increasingly important for manufacturers. In order to raise the quality of products, the root causes of low-quality situations must be found as soon as possible. Thus, process control, statistical analysis, and cause-methodology-analysis techniques have all been widely applied in addressing the problem [10][18][22][27][53][62][70]. However, it is very difficult to identify the root causes of defects because the types of causes vary widely. For example, in the semiconductor manufacturing industry there are many causes of low yields, among them machine failures, improper operation, improper parameters, manufacturing time problems, and scheduling and material problems. Many studies have been devoted to investigating these issues. The advent of advanced manufacturing technologies has led to overlong queues and increased manufacturing times in workshops, which may cause oxidation problems. Such problems are becoming more critical, but their diagnosis is usually very difficult and time-consuming. In this chapter, the manufacturing defect detection problem, time aspect, for manufacturing domains (MDDP-t) is formally modeled and defined. A root-cause evaluation function (RCEF), which is a linear combination of three probing functions defined independently according to the experiences of domain experts, is proposed to evaluate whether a specific machine is the root cause of a time problem. Determining the weights for these probing functions is considered a separate issue here, and a genetic algorithm (GA) with encoding and GA operations suitable for MDDP-t weight-learning problems is given to find appropriate weights for the probing functions. Several instances of MDDP-t with known root causes, some provided by the Taiwan Semiconductor Manufacturing Company (TSMC), are given as training examples. Experimental results show that the proposed approaches can ensure efficiency and accuracy.

Many technologies or methods are employed to identify the causative factors of manufacturing defects, including Statistical Process Control (SPC), Advanced Process Control (APC) [18][53], and Machine Learning (ML) approaches. However, the real problems are sometimes chaotic, little-understood, and may be caused by complex interactions among multiple factors. Therefore, root-cause sorting becomes a critical issue for all manufacturing enterprises, especially some high technology ones like semiconductor manufacturing corporations.

SPC and APC [10][53] are widely used in the semiconductor industry to monitor manufacturing behavior in workshops via motion and condition sensors. SPC monitors manufacturing by analyzing the statistical results of procedures, generating lists of meaningful results, and warning if the results are outside predefined control boundaries based on machine behaviors and expert experience. However, it sometimes issues warnings for good products (type-one error, a false alarm) and may fail to warn of defective products (type-two error, a missed detection). APC, an advanced revision of SPC, not only monitors the statistical results of machine behaviors [18][53], but also takes predefined actions to adjust machine behaviors when machines become unstable. Although APC seems more advanced than SPC, the resulting action-selection problem raises a separate issue that must be resolved.

Certain intelligent methods with self-learning abilities are employed to provide fault analysis and suggest solutions. In [53], a combination of self-organizing neural networks and rule induction was used to identify critical poor-yield factors from normally collected wafer manufacturing data, and the corresponding behavior model thus learned to predict possible behaviors. A decision-tree approach used to locate the root cause of yield loss in integrated circuits was reported in [59]. The utility of decision trees for yield analysis lies in pointing to process steps that may not be captured by analyses of parametric data.

5.2 Problem Definition of MDDP-t

As mentioned above, we are concerned with the time aspects of detecting which machines cause product defects. In this section, we first define various parameters used in this chapter, and then propose a formal definition of the “Manufacturing Defect Detection Problem, time aspects” (MDDP-t). Generally, quality baselines must exist for all products in order to ensure good manufacturing procedures. Taking an example from semiconductor manufacturing, the quality baseline for 150-nanometer yields is usually set to 90% or above in a well-tuned manufacturing fab. When yields become unstable and drop below the quality baseline, product engineers (“lot owners” in semiconductor manufacturing fabs) investigate to find the major reason (called the “root cause”) for the low-yield situation. For example, a product engineer may collect data on all low and normal product yields and identify suspect factors, e.g., abnormal machine behaviors, in-line metrologies, and processing and queuing times, which are the most likely root causes according to statistical- or data-analysis results. In this chapter, MDDP-t is considered a quadruple consisting of product manufacturing machine information (PM), product manufacturing time information (PT), product manufacturing yield information (PY), and a quality baseline (yθ). Notation 5.1 is defined as follows:

NOTATION 5.1:

M the set of machines;

cp number of products;

cm number of machines;

cs number of machine clusters;

sⁱ i-th machine cluster such that sⁱ = {mi,1, mi,2, …, mi,α(i)}, where 1 ≤ i ≤ cs, α(i) is the number of machines in sⁱ, and mi,j is the j-th machine in sⁱ, 1 ≤ j ≤ α(i);

pi product pi, 1 ≤ i ≤ cp;

yi quality of product pi, 1 ≤ i ≤ cp;

yθ acceptable product quality baseline;

pmi manufacturing information vector of product pi, pmi = <pm_i^1, pm_i^2, …, pm_i^cs>, where pi is processed by the pm_i^j-th machine in s^j and 1 ≤ i ≤ cp;

pti target manufacturing time vector for product pi, pti = <pt_i^1, pt_i^2, …, pt_i^cs>, where pt_i^j is the processing time for machine pm_i^j and 1 ≤ i ≤ cp;

pyi yield of product pi;

PM manufacturing procedure for products in P, where PM is a cp×cs matrix and PMi,j = pm_i^j;

PT product manufacturing time, where PT is a cp×cs matrix and PTi,j = pt_i^j;

PY product quality yield, where PY is a column matrix and PYi = py_i;

MDDP-t a given quadruple manufacturing defect detection problem involving time, where MDDP-t = (PM, PT, PY, yθ).

Table 5.1: An example of products passing through two machine clusters

Product   s¹     pt¹   s²     pt²   Y
p1        m1,1   10    m2,1   23    0.85
p2        m1,1   10    m2,1   23    0.86
p3        m1,1   11    m2,2   23    0.80
p4        m1,2   13    m2,3   60    0.60
p5        m1,2   12    m2,3   25    0.90
p6        m1,2   12    m2,3   66    0.60
p7        m1,3   10    m2,3   27    0.83
p8        m1,3   11    m2,3   25    0.65
p9        m1,3   11    m2,2   25    0.88
p10       m1,3   10    m2,2   23    0.85

EXAMPLE 5.1.

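As shown in Table 5.1, there are 10 products in this example (cp=10) and each product is processed by two machine clusters (cs=2), where s¹={m1,1, m1,2, m1,3} (α(1)=3) and s²={m2,1, m2,2, m2,3} (α(2)=3 and cm=6). Each product pi is processed by machine pmi in target time pti. Assume that the given yield threshold yθ is 0.7. According to the definitions given above, the machine indices and target times of the ten products in cluster s¹ (the first columns of PM and PT) are, respectively, <1,1,1,2,2,2,3,3,3,3> and <10,10,11,13,12,12,10,11,11,10>. Therefore, the manufacturing procedure, target time, and product yield matrices are

$$
PM = \begin{bmatrix} 1 & 1 \\ 1 & 1 \\ 1 & 2 \\ 2 & 3 \\ 2 & 3 \\ 2 & 3 \\ 3 & 3 \\ 3 & 3 \\ 3 & 2 \\ 3 & 2 \end{bmatrix}, \quad
PT = \begin{bmatrix} 10 & 23 \\ 10 & 23 \\ 11 & 23 \\ 13 & 60 \\ 12 & 25 \\ 12 & 66 \\ 10 & 27 \\ 11 & 25 \\ 11 & 25 \\ 10 & 23 \end{bmatrix}, \quad
PY = \begin{bmatrix} 0.85 \\ 0.86 \\ 0.80 \\ 0.60 \\ 0.90 \\ 0.60 \\ 0.83 \\ 0.65 \\ 0.88 \\ 0.85 \end{bmatrix}
$$

Finally, the MDDP-t instance for Table 5.1 is set to (PM, PT, PY, 0.7).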

Three probing functions, including Individual Machine, Intra-cluster, and Machine Behavior, are proposed to find possible root causes for given MDDP-t instances. The three probing functions are described in detail below:

1. Individual-Machine probing function (f1): This criterion considers individual machine behaviors in given datasets. If the low-product-yield percentage of one machine, especially one with an abnormal target time, is higher than that of other machines, it may be considered a root-cause candidate. For example, Figure 5.1 shows that machine m1,2 produces low yields for products p4 and p6, a low-yield percentage of 66% (two of its three products), which is obviously higher than that of machine m1,1, whose low-yield percentage is 0%.

Figure 5.1: Products processed by machines m1,1 and m1,2

Certain notation must be defined in order to calculate the parameters of this function:

NOTATION 5.2:

mvi,j the set of products processed by machine mi,j;

myi,j the set of low-yield products processed by machine mi,j;

mtyi,j the set of low-yield products with abnormal target time processed by machine mi,j.

The Individual-Machine probing function for machine mi,j is the product of the ratio of processed products (|mvi,j| / n) and the ratio of low-yield products processed with abnormal target time (a ratio with |mtyi,j| in the numerator). As mentioned above, a higher result from this function means a higher possibility of being a root cause.

Since applying conventional comparison and computation operators to generate mvi,j, myi,j, and mtyi,j may be time-consuming, we use the BWI indexing method to reduce the time required to compute this decision variable. The detailed notation and functions resulting from use of the BWI indexing method are defined as follows:

NOTATION 5.3:

mvi,j the machine-bit vector of machine mi,j, where mvi,j = <b1b2b3…b_cp>, mvi,j(k) is the k-th bit (bk) of mvi,j, and bk = 1 if pm_k^i = j for 1 ≤ i ≤ cs, 1 ≤ j ≤ α(i), and 1 ≤ k ≤ cp; otherwise, bk = 0;

mv^LY the machine-bit vector of low-yield products for the given MDDP-t instance, where mv^LY = <b1b2b3…b_cp> and bk = 1 if py_k < yθ for 1 ≤ k ≤ cp; otherwise, bk = 0;

mv_i,j^OC the abnormal target time machine-bit vector of machine mi,j, where mv_i,j^OC = <b1b2b3…b_cp> and bk = 1 if pt_k^i > μ(mi,j) + σ(mi,j) or pt_k^i < μ(mi,j) − σ(mi,j); otherwise, bk = 0;

myi,j the machine vector for low-yield products from machine mi,j, where myi,j = AND(mvi,j, mv^LY);

mtyi,j the machine vector for outlier products of machine mi,j, where mtyi,j = AND(myi,j, mv_i,j^OC);

count_one(x) 1-bit count in bit-vector x;

count_zero(x) 0-bit count in bit-vector x;

μ(mi,j) the average manufacturing time for machine mi,j.

The computation time is reduced since all comparison and computation operations use the bit-wise indexing method. The formulation of the Individual Machine probing function (f1) is thus:

Individual Machine probing function f1(mi,j) for an MDDP-t:

[formula lost in extraction]

2. Intra-cluster probing function (f2): The second criterion considers the slopes of machine behavior regression lines within machine clusters. Intra-cluster machine behavior is represented as a regression line of data points on a two-dimensional plane where the x and y axes are, respectively, the target time and yield of each product processed by the machine. A higher absolute slope value for the regression line means higher time-issue sensitivity for the corresponding machine. In other words, it may be a root-cause candidate in the time-issue problem. As shown in Figures 5.2(a) and 5.2(b), the absolute value of the machine-curve slope of mi,j is higher than that of mi,k. Therefore, machine mi,j has a higher possibility of being a root-cause candidate. The following definitions and functions are needed to calculate the parameters of this function:

Figure 5.2: The regression lines for (a) mi,j and (b) mi,k

Certain notation must be defined in order to calculate the parameters of this function:

offset(x, i) the offset of the i-th 1-bit (left to right) in bit-vector x;

evsi,j the set of data points for products processed by machine mi,j: evsi,j = {(x1, y1), (x2, y2), …, (x_n, y_n)}, where n = count_one(mvi,j);

regress(evsi,j) the regression line of evsi,j;

slope(regress(evsi,j)) the slope of regress(evsi,j).

For the example shown in Table 5.1, the bit operations give mv1,1(3)=1, count_one(mv1,1)=3, count_zero(mv1,1)=7 and offset(mv1,1, 3)=3, and the evaluation vector set for machine m1,1 is evs1,1={<0.85, 10>, <0.86, 10>, <0.80, 11>}.

Intra-cluster probing function f2(mi,j) for the MDDP-t problem:

[formula lost in extraction]
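As a rough illustration of the slope computation behind f2, the sketch below fits an ordinary least-squares line to the (target time, yield) points of each machine in cluster s¹ of Table 5.1 and compares the absolute slopes. Ordinary least squares is an assumption, as are the names `ols_slope` and `CLUSTER_1`; any normalization applied by f2 itself is not reproduced here.

```python
def ols_slope(points):
    """Slope of the least-squares line through (x, y) points; x must not be constant."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    return sxy / sxx


# (target time, yield) pairs from Table 5.1, grouped by machine in cluster s1
CLUSTER_1 = {
    "m1,1": [(10, 0.85), (10, 0.86), (11, 0.80)],
    "m1,2": [(13, 0.60), (12, 0.90), (12, 0.60)],
    "m1,3": [(10, 0.83), (11, 0.65), (11, 0.88), (10, 0.85)],
}

abs_slopes = {name: abs(ols_slope(pts)) for name, pts in CLUSTER_1.items()}
most_sensitive = max(abs_slopes, key=abs_slopes.get)
print(most_sensitive)  # m1,2
```

With so few points per machine the fitted slopes are only indicative, but the ranking shows how a steeper line flags a machine as more sensitive to target time.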

3. Machine Behavior probing function (f3): The third criterion considers similarities among machine behaviors in given datasets with respect to the time issue. The behavior of an arbitrary machine can be represented as a machine-behavior vector with count_one(mtyi,j) and count_one(mv_i,j^OC) as the respective x and y axes. The sum of the degrees of included angle between the machine-behavior vector of a machine in a machine cluster and all the other machine-behavior vectors is calculated. The machine with the highest sum has the highest possibility of being the root cause in that machine cluster. As shown in Figure 5.3, of the four machines in machine cluster sⁱ, the computed sum for machine mi,4 is obviously much higher than the others. Thus, machine mi,4 has a higher possibility of being the root cause in this example. The following definitions and functions must be defined in order to calculate the parameters for this function.

Figure 5.3: The machine-behavior vectors of machine cluster sⁱ

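inner_product(x, y) the inner product of machine-behavior vectors x and y;

θ(x, y) the included angle of machine-behavior vectors x and y, where $\theta(x, y) = \cos^{-1}\left(\frac{\mathrm{inner\_product}(x, y)}{|x|\,|y|}\right)$;

mx_inc(mi,j, mi,k) the included angle between the machine-behavior vectors of machines mi,j and mi,k, where mx_inc(mi,j, mi,k) is θ applied to the machine-behavior vectors of mi,j and mi,k.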

Therefore, formulation of Machine Behavior probing function f3(mi,j) is as follows:

Machine-behavior probing function f3(mi,j) of MDDP-t is:
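The angle-sum idea behind f3 can be sketched as follows. This is an illustration under assumptions: the included angle is computed with the standard arccosine of the normalized inner product, a zero vector is given an angle of 0 with every other vector, and the machine-behavior vectors below are made-up values, not data from the thesis. The function names are hypothetical.

```python
import math


def included_angle(u, v):
    """Angle in degrees between 2-D vectors u and v (0 if either is the zero vector)."""
    norm_u, norm_v = math.hypot(*u), math.hypot(*v)
    if norm_u == 0 or norm_v == 0:
        return 0.0
    cosine = (u[0] * v[0] + u[1] * v[1]) / (norm_u * norm_v)
    # Clamp to [-1, 1] to guard against floating-point overshoot before acos.
    return math.degrees(math.acos(max(-1.0, min(1.0, cosine))))


def angle_sums(vectors):
    """For each machine, sum the included angles to every other machine in the cluster."""
    return {
        name: sum(included_angle(vec, other)
                  for other_name, other in vectors.items() if other_name != name)
        for name, vec in vectors.items()
    }


# Hypothetical vectors: (count_one(mty), count_one(mv^OC)) for four machines
behavior = {"m_i,1": (1, 3), "m_i,2": (1, 3), "m_i,3": (2, 6), "m_i,4": (3, 3)}
sums = angle_sums(behavior)
print(max(sums, key=sums.get))  # m_i,4
```

The three parallel vectors contribute almost nothing to each other's sums, so the one machine pointing in a different direction stands out.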

Continuing from Example 5.1, the following machine bit-vectors are obtained: mv1,1=<1110000000>, mv1,2=<0001110000>, mv1,3=<0000001111>, mv2,1=<1100000000>, mv2,2=<0010000011>, and mv2,3=<0001111100>. The low-yield machine-bit vector mv^LY is <0001010100>, and the out-of-control machine-bit vectors of machines m1,1 and m1,2 are, respectively, mv_1,1^OC=<0010000000> and mv_1,2^OC=<0001000000>. Then my1,1 = <1110000000> AND <0001010100> = <0000000000> and my1,2 = <0001110000> AND <0001010100> = <0001010000>, and the corresponding mty1,1 and mty1,2, obtained by ANDing with the out-of-control vectors, are <0000000000> and <0001000000>.
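The bit-vectors in this example can be reproduced with a short Python sketch. It rests on a few assumptions that the text does not state outright: σ is taken as the population standard deviation, the out-of-control test is applied only to the products a machine actually processed, and product 1 is stored as the leftmost bit. The helper names (`pack`, `show`, `mv`, `mv_ly`, `mv_oc`) are hypothetical.

```python
from statistics import mean, pstdev

# Rows of Table 5.1: (machine in s1, pt1, machine in s2, pt2, yield)
TABLE = [
    (1, 10, 1, 23, 0.85), (1, 10, 1, 23, 0.86), (1, 11, 2, 23, 0.80),
    (2, 13, 3, 60, 0.60), (2, 12, 3, 25, 0.90), (2, 12, 3, 66, 0.60),
    (3, 10, 3, 27, 0.83), (3, 11, 3, 25, 0.65), (3, 11, 2, 25, 0.88),
    (3, 10, 2, 23, 0.85),
]
Y_THETA = 0.7
CP = len(TABLE)


def pack(flags):
    """Pack booleans into an int; product 1 becomes the leftmost bit."""
    value = 0
    for flag in flags:
        value = (value << 1) | int(flag)
    return value


def show(vec):
    """Render a bit-vector as a cp-character string."""
    return format(vec, f"0{CP}b")


def count_one(vec):
    return bin(vec).count("1")


def mv(cluster, machine):
    """Machine-bit vector: bit k is 1 if product k ran on this machine."""
    col = 2 * (cluster - 1)
    return pack(row[col] == machine for row in TABLE)


def mv_ly():
    """Low-yield bit vector: bit k is 1 if the yield of product k is below the baseline."""
    return pack(row[4] < Y_THETA for row in TABLE)


def mv_oc(cluster, machine):
    """Abnormal-target-time vector, using mean +/- one population std (assumed)."""
    col = 2 * (cluster - 1)
    times = [row[col + 1] for row in TABLE if row[col] == machine]
    if not times:
        return 0
    mu, sigma = mean(times), pstdev(times)
    return pack(row[col] == machine and abs(row[col + 1] - mu) > sigma for row in TABLE)


my_12 = mv(1, 2) & mv_ly()
mty_12 = my_12 & mv_oc(1, 2)
print(show(mv(1, 2)), show(my_12), show(mty_12))
# 0001110000 0001010000 0001000000
```

Once the vectors are packed, each my and mty value costs a single integer AND, and `count_one` gives the set sizes needed by the probing functions.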

As mentioned above, we use these probing functions as major criteria in evaluating MDDP-t according to experts’ experience in the semiconductor manufacturing domain. We then define a Root Cause Evaluation Function RCEF(mi,j), which is a linear combination of the three probing functions along with their corresponding weights wk used to identify the importance of each probing function in the RCEF, to compute the root-cause possibility of machine mi,j.

Root Cause Evaluation Function RCEF(mi,j) of MDDP-t:

$$RCEF(m_{i,j}) = \sum_{k=1}^{3} w_k \, f_k(m_{i,j})$$
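Since the RCEF is a plain weighted sum, a few lines of Python make it concrete. The probing-function values and weights below are invented for illustration (the thesis learns the weights with a GA), and `rcef` is a hypothetical helper name.

```python
def rcef(probe_values, weights):
    """Weighted sum of the probing-function values f1..f3 for one machine."""
    if len(probe_values) != len(weights):
        raise ValueError("one weight is needed per probing function")
    return sum(w * f for w, f in zip(weights, probe_values))


weights = (0.5, 0.3, 0.2)  # hypothetical w1, w2, w3
scores = {
    "m1,1": rcef((0.0, 0.2, 0.1), weights),  # hypothetical f1, f2, f3 values
    "m1,2": rcef((0.6, 0.8, 0.7), weights),
}
suspect = max(scores, key=scores.get)
print(suspect)  # m1,2
```

The machine with the highest score is the strongest root-cause candidate, so the choice of weights directly shapes the ranking.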

However, the corresponding weights W={w1, w2, w3} of these three RCEF probing functions require further investigation. A genetic algorithm is thus used to solve the weight-learning problem of the three given probing functions in order to determine suitable weights for the MDDP-t.

5.3 Genetic Algorithm for MDDP-t

The search space in a GA (Genetic Algorithm) consists of possible solutions to a problem [15]. A solution in the search space is called an individual and its genotype consists of a set of chromosomes represented by sequences of 0s and 1s. These chromosomes can dominate individual phenotypes. Each individual has an associated objective function called its fitness. A good individual is one that has a high or low fitness value, depending on whether the problem involves maximization or minimization. The strength of a chromosome in an individual is represented by its fitness value and the chromosomes of individuals are carried to the next generation. The set of individuals with associated fitness values is called the population. The population at a given stage in the GA is referred to as a generation. The best individual in each generation is the individual with the best discovered fitness value.

There are three main components in the GA while loop:

(1) selection/reproduction, the process of selecting good individuals from the current
