
2.2 The Hybrid Classification Tree (HCT)

2.2.1 The Separation Matrices Based Clustering Algorithm

2.2.1.4 The Clustering Algorithm

Figure 2.8. An example of using Algorithm I to build the clustering tree.

The separation-matrix-based clustering algorithm consists of two parts: the training part and the classification part. The training part, which prepares for the classification part, consists of three steps: (i) construct the separation matrices for all attributes; (ii) determine the cluster-splitting attributes and build the clustering tree; and (iii) throughout the clustering tree, generate the fuzzy if-then rules needed to classify a data pattern into the proper child cluster, based on a given set of training data patterns with known TCs. Steps (i) and (ii) were presented in the previous subsections. The details of step (iii), as well as the classification part, are described below.

1. The Fuzzy-Rule Generation Procedures of the Clustering Algorithm

The fuzzy rules for splitting a non-TC cluster by its splitting attribute are of the same type for every such cluster in our clustering algorithm. Thus, for the sake of explanation, we focus on generating the fuzzy rules for one cluster in the clustering tree. Let Cr_j denote a non-TC cluster and k denote the corresponding splitting attribute. Let x_k^s, s = 1,...,g, denote the kth attribute of the g data patterns x^s, s = 1,...,g, from the M_j known child clusters CCr_j1,...,CCr_jMj of Cr_j. These g data patterns form the training data set for splitting Cr_j. The fuzzy rules for splitting cluster Cr_j are of the following type.

For i1,...,K, where K denotes the number of fuzzy partitioned intervals on the range of the kth attribute values,

Rule R_i(Cr_j): If x_k^s is A_i^K, then x^s belongs to CCr_ji with CF_i^K,    (2.4)

where A_i^K is the ith partitioned fuzzy interval, CCr_ji is the consequent, i.e., one of the M_j child clusters, and CF_i^K is the grade of certainty of rule R_i(Cr_j).

What needs to be determined in the above rule are CCr_ji and CF_i^K. The procedures for determining them, called the fuzzy-rule generation procedures for splitting one cluster, are described below.

Let A_i^K be characterized by the nonnegative fuzzy membership function f_i^K(·). The membership function can be triangular, Gaussian, or of any other shape; in this chapter, we consider the triangular membership function. Then f_i^K(x_k^s) can be regarded as the grade of compatibility of x_k^s with respect to A_i^K. We define







β_{CCr_jl}(R_i(Cr_j)) = Σ_{x^s ∈ CCr_jl} f_i^K(x_k^s)    (2.5)

as the sum of the grades of compatibility, with respect to A_i^K, of the training patterns in child cluster CCr_jl. The algorithm for generating the fuzzy rules for splitting cluster Cr_j can then be stated as follows:

Algorithm II: Generation of the K fuzzy rules for splitting cluster Cr_j.

Step 0: Given g training data patterns x^s, s = 1,...,g, with known child clusters CCr_jl, l = 1,...,M_j, of the to-be-split cluster Cr_j and the corresponding splitting attribute k, set i = 1.

Step 1: Calculate the sum of the grades of compatibility of each child cluster CCr_jl, l = 1,...,M_j, with respect to A_i^K by (2.5).

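Step 2: Find the child cluster CCr_jx such that

β_{CCr_jx}(R_i(Cr_j)) = max{β_{CCr_j1}(R_i(Cr_j)),...,β_{CCr_jMj}(R_i(Cr_j))},

and set CCr_ji = CCr_jx.

Step 3: Determine CF_i^K by

CF_i^K = (β_{CCr_jx}(R_i(Cr_j)) − β̄) / Σ_{l=1}^{M_j} β_{CCr_jl}(R_i(Cr_j)),

where β̄ = Σ_{l≠x} β_{CCr_jl}(R_i(Cr_j)) / (M_j − 1) denotes the average of the sums of the grades of compatibility of the rest of the child clusters with respect to A_i^K.

Step 4: If i = K, stop; else, set i = i + 1 and return to Step 1.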
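The rule-generation loop of Algorithm II can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the K triangular membership functions are assumed to be symmetric with evenly spaced peaks over a range [lo, hi]; the certainty grade is assumed to be the winning child cluster's sum of compatibility grades minus the average of the other clusters' sums, divided by the total; and a rule whose winner is tied (or whose sums are all zero) is assumed to receive no consequent. The names `triangular_membership` and `generate_rules` are hypothetical.

```python
def triangular_membership(x, i, K, lo, hi):
    """Grade of compatibility f_i^K(x) for the i-th (1-based) of K triangles.

    Assumption: symmetric triangles with evenly spaced peaks on [lo, hi];
    requires K >= 2.
    """
    width = (hi - lo) / (K - 1)      # distance between adjacent peaks
    peak = lo + (i - 1) * width      # peak position of the i-th triangle
    return max(0.0, 1.0 - abs(x - peak) / width)


def generate_rules(values, child_labels, K, lo, hi):
    """Return K rules as (consequent child cluster, certainty grade) pairs.

    values[s] is the splitting attribute x_k^s of training pattern s and
    child_labels[s] is its known child cluster.
    """
    clusters = sorted(set(child_labels))
    rules = []
    for i in range(1, K + 1):
        # Step 1: sum the grades of compatibility per child cluster.
        beta = {c: 0.0 for c in clusters}
        for v, c in zip(values, child_labels):
            beta[c] += triangular_membership(v, i, K, lo, hi)

        # Step 2: the consequent is the child cluster with the largest sum.
        best = max(beta.values())
        winners = [c for c in clusters if beta[c] == best]
        total = sum(beta.values())
        if total == 0.0 or len(winners) > 1:
            rules.append((None, 0.0))  # assumed handling: undecidable rule
            continue
        winner = winners[0]

        # Step 3 (assumed formula): winner's sum minus the average of the
        # other sums, normalised by the total of all sums.
        others = [beta[c] for c in clusters if c != winner]
        beta_bar = sum(others) / len(others) if others else 0.0
        rules.append((winner, (best - beta_bar) / total))
    return rules
```

For instance, with attribute values 0.0 and 0.1 from one child cluster, 0.9 and 1.0 from another, and K = 2 on [0, 1], each rule points to the cluster nearest its peak with a certainty grade of about 0.9.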

2. Training Part of the Clustering Algorithm

Combining the construction of separation matrices, determination of the splitting attributes, building of the clustering tree and the above fuzzy rule generation procedures, we are ready to summarize the training procedures of the clustering algorithm using the training data set.

Algorithm III: Training procedures of the clustering algorithm.

Step 0: Given a set of training data patterns with known classes, compute μ_i^k and σ_i^k of each class C_i and each attribute k; compute the separation matrices [D(C_i, C_j)]_k based on (2.2) for each attribute k.

Step 1: Apply Algorithm I to obtain the splitting attributes and build the clustering tree.

Step 2: Use Algorithm II to generate the fuzzy rules for each cluster in the clustering tree.

3. Classification Part of the Clustering Algorithm

Once the fuzzy rules for splitting the clusters in the clustering tree have been generated, we can determine, at each cluster, the child cluster to which a new data pattern belongs by a fuzzy reasoning method.

Let the new data pattern be x, and let x_k be the kth attribute of x, corresponding to the splitting attribute k at cluster Cr_j. We define α_{CCr_jl}, the weighting grade of certainty of x_k with respect to the child cluster CCr_jl, as the sum, over all of the K trained rules whose consequent is CCr_jl, of the product of the grade of compatibility of x_k with respect to A_i^K and the grade of certainty of rule R_i(Cr_j). We can express

α_{CCr_jl} = Σ_{i : CCr_ji = CCr_jl} f_i^K(x_k) · CF_i^K.

Then the classification procedures for the new data pattern can be stated as follows.

Classification Procedures: The child cluster CCr_jy, with respect to which the weighting grade of certainty of x_k is maximum, is the concluded cluster of x, that is,

α_{CCr_jy} = max{α_{CCr_jl} : l = 1,...,M_j}.
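The fuzzy reasoning step at a single cluster can be sketched as below. This is an illustrative sketch only: it assumes the trained rules are available as a list of (consequent child cluster, certainty grade) pairs ordered by interval index, and that the membership functions are symmetric triangles with evenly spaced peaks on [lo, hi]. The function names and the rule representation are hypothetical.

```python
def triangular_membership(x, i, K, lo, hi):
    """Assumed triangular membership: i-th (1-based) of K evenly spaced peaks."""
    width = (hi - lo) / (K - 1)
    peak = lo + (i - 1) * width
    return max(0.0, 1.0 - abs(x - peak) / width)


def classify_child(x_k, rules, lo, hi):
    """Return (concluded child cluster, weighting grades of certainty).

    rules[i - 1] is the pair (consequent, CF) of rule R_i; a consequent of
    None (an undecidable rule, by assumption) is skipped.
    """
    K = len(rules)
    alpha = {}
    for i, (consequent, cf) in enumerate(rules, start=1):
        if consequent is None:
            continue
        # Add (grade of compatibility) x (grade of certainty) to the
        # weighting grade of the rule's consequent child cluster.
        grade = triangular_membership(x_k, i, K, lo, hi)
        alpha[consequent] = alpha.get(consequent, 0.0) + grade * cf
    if not alpha:
        return None, alpha
    # The concluded cluster has the maximum weighting grade of certainty.
    return max(alpha, key=alpha.get), alpha
```

Summing per consequent (rather than taking the single best rule) follows the definition of the weighting grade of certainty given in the text.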

Now, the procedures for classifying a new data pattern x into a TC can be stated as follows.

Algorithm IV: Classification procedures of the clustering algorithm.

Step 0: Given a new data pattern x = (x₁,...,x_n), where n denotes the total number of attributes, set the Present Cluster (PCr) = Cr_0.

Step 1: Use x_k, where k corresponds to the attribute used for splitting the PCr, and the classification procedures stated above to classify xinto a child cluster of PCr, we denote this child cluster by CPCr. If the CPCr is not a TC, set PCr=CPCr and repeat this step; otherwise, stop.

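In the document: Two Problems Concerning the Throughput and Yield of Wafer Fabrication and Testing Processes, and Their Solutions (pages 26-29)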