2.2 The Hybrid Classification Tree (HCT)

2.2.1 The Separation Matrices Based Clustering Algorithm

2.2.1.4 The Clustering Algorithm

Figure 2.8. An example of using Algorithm I to build the clustering tree.

The separation matrix based clustering algorithm consists of two parts: the training part and the classification part. The training part, which prepares for the classification part, consists of three steps: (i) construction of the separation matrices for all attributes, (ii) determination of the cluster splitting attributes and building of the clustering tree, and (iii) generation, throughout the clustering tree, of the fuzzy if-then rules needed to classify a data pattern into the proper child cluster, based on a given set of training data patterns with known TCs. Of these three steps, (i) and (ii) were presented in the previous subsections. The details of (iii), as well as the classification part, are described below.

1. The Fuzzy-Rule Generation Procedures of the Clustering Algorithm

The fuzzy rules for splitting a non-TC cluster using the corresponding splitting attribute in our clustering algorithm are all of the same type. Thus, for the sake of explanation, we will focus on generating the fuzzy rules for one cluster in the clustering tree. We let $Cr_j$ denote a non-TC cluster and $k$ denote the corresponding splitting attribute. We let $x_k^s$, $s = 1,\dots,g$, denote the $k$th attribute of the $g$ data patterns $x^s$, $s = 1,\dots,g$, from the $M_j$ known child clusters $CCr_{j1},\dots,CCr_{jM_j}$. These $g$ data patterns form the training data set for splitting $Cr_j$. The fuzzy rules for splitting cluster $Cr_j$ are of the following type.

For $i = 1,\dots,K$, where $K$ denotes the number of fuzzy partitioned intervals on the range of the $k$th attribute values,

Rule $R_i(Cr_j)$: If $x_k^s$ is $A_i^K$, then $x^s$ belongs to $CCr_j^i$ with $CF_i^K$,   (2.4)

where $A_i^K$ is the $i$th partitioned fuzzy interval, $CCr_j^i$ is the consequent, i.e. one of the $M_j$ child clusters, and $CF_i^K$ is the grade of certainty of rule $R_i(Cr_j)$.

What remains to be determined in the above rule are the consequent and the grade of certainty. The procedures for determining them are called the fuzzy-rule generation procedures for splitting one cluster and are described below.

Let $A_i^K$ be characterized by the nonnegative fuzzy membership function $f_i^K(\cdot)$. The membership function $f_i^K(\cdot)$ can be triangular, Gaussian, or any other shape. In this chapter, we consider the triangular membership function. Then, $f_i^K(x_k^s)$ can be considered as the grade of compatibility of $x_k^s$ with respect to $A_i^K$. We define

$$\beta_{CCr_{jl}}(R_i(Cr_j)) = \sum_{x^s \in CCr_{jl}} f_i^K(x_k^s) \qquad (2.5)$$

as the sum of the grades of compatibility of child cluster $CCr_{jl}$ with respect to $A_i^K$. Then the algorithm for generating fuzzy rules for splitting cluster $Cr_j$ can be stated as follows:

Algorithm II: Generation of the K fuzzy rules for splitting cluster Crj.

Step 0: Given $g$ training data patterns $x^s$, $s = 1,\dots,g$, with known child clusters $CCr_{jl}$, $l = 1,\dots,M_j$, of the to-be-split cluster $Cr_j$ and the corresponding splitting attribute $k$. Set $i = 1$.

Step 1: Calculate the sum of the grades of compatibility of child cluster $CCr_{jl}$, $l = 1,\dots,M_j$, with respect to $A_i^K$ by (2.5).

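Step 2: Find the child cluster $CCr_{jx}$ whose sum of the grades of compatibility with respect to $A_i^K$ is maximum, that is,

$$\beta_{CCr_{jx}}(R_i(Cr_j)) = \max_{l = 1,\dots,M_j} \beta_{CCr_{jl}}(R_i(Cr_j)).$$

[...] denotes the average of the sum of the grades of compatibility of the rest of the child clusters with respect to $A_i^K$.

Step 4: If $i = K$, stop; otherwise, set $i = i + 1$ and return to Step 1.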
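As an illustration of Algorithm II, the following Python sketch generates the K rules for one cluster. It is a minimal sketch under stated assumptions, not the implementation used in this work: attribute values are assumed to be normalized to [0, 1], the K triangular fuzzy intervals are assumed to be evenly spaced, a tie for the largest compatibility sum is assumed to yield no consequent, and the certainty grade is assumed to be (largest sum minus average of the rest) divided by the total, a heuristic common in fuzzy rule-based classification that this text does not confirm. The function names are hypothetical.

```python
from typing import List, Optional, Sequence, Tuple

Rule = Tuple[Optional[int], float]  # (index of consequent child cluster, certainty)


def triangular_membership(x: float, i: int, K: int) -> float:
    """Grade of compatibility of x with the i-th of K fuzzy intervals.

    ASSUMPTION: values lie in [0, 1] and the K triangles are evenly
    spaced with peaks at (i - 1) / (K - 1); requires K >= 2.
    """
    peak = (i - 1) / (K - 1)
    half_width = 1.0 / (K - 1)
    return max(0.0, 1.0 - abs(x - peak) / half_width)


def compatibility_sums(values: Sequence[float], labels: Sequence[int],
                       num_children: int, i: int, K: int) -> List[float]:
    """Equation (2.5): per child cluster, sum the membership grades of its patterns."""
    beta = [0.0] * num_children
    for x, child in zip(values, labels):
        beta[child] += triangular_membership(x, i, K)
    return beta


def generate_rules(values: Sequence[float], labels: Sequence[int],
                   num_children: int, K: int) -> List[Rule]:
    """Return one (consequent, certainty) pair for each fuzzy interval i = 1..K."""
    if K < 2 or num_children < 2:
        raise ValueError("this sketch needs K >= 2 and at least two child clusters")
    rules: List[Rule] = []
    for i in range(1, K + 1):
        beta = compatibility_sums(values, labels, num_children, i, K)  # Step 1
        total, best = sum(beta), max(beta)
        winners = [child for child, b in enumerate(beta) if b == best]  # Step 2
        if total == 0.0 or len(winners) != 1:
            # ASSUMPTION: no unique winner means the rule has no consequent.
            rules.append((None, 0.0))
            continue
        # ASSUMPTION: certainty = (best - average of the rest) / total.
        rest_average = (total - best) / (num_children - 1)
        rules.append((winners[0], (best - rest_average) / total))
    return rules


if __name__ == "__main__":
    # Four training values of the splitting attribute, two child clusters, K = 2.
    print(generate_rules([0.0, 0.1, 0.9, 1.0], [0, 0, 1, 1], num_children=2, K=2))
```

In this toy run the first fuzzy interval points to child cluster 0 and the second to child cluster 1, each with a certainty of about 0.9 under the assumed formula.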

2. Training Part of the Clustering Algorithm

Combining the construction of separation matrices, determination of the splitting attributes, building of the clustering tree and the above fuzzy rule generation procedures, we are ready to summarize the training procedures of the clustering algorithm using the training data set.

Algorithm III: Training procedures of the clustering algorithm.

Step 0: Given a set of training data patterns with known classes; compute $\mu_{ik}$ and $\sigma_{ik}$ of each class $C_i$ and each attribute $k$; compute the separation matrices $[D_k(C_i, C_j)]$ based on (2.2) for each attribute $k$.

Step 1: Apply Algorithm I to obtain the splitting attributes and build the clustering tree.

Step 2: Use Algorithm II to generate the fuzzy rules for each cluster in the clustering tree.

3. Classification Part of the Clustering Algorithm

Once the fuzzy rules for splitting the clusters in the clustering tree are generated, we can determine, at each cluster, the child cluster to which a new data pattern belongs based on a fuzzy reasoning method.

Let the new data pattern be $x$ and let $x_k$ be the $k$th attribute of $x$, corresponding to the splitting attribute $k$ at cluster $Cr_j$. We define $\alpha_{CCr_{jl}}$, the weighting grade of certainty of $x_k$ with respect to the child cluster $CCr_{jl}$, as the sum of the products of the grade of compatibility of $x_k$ with respect to $A_i^K$ and the grade of certainty of rule $R_i(Cr_j)$ over all of the $K$ trained rules whose consequent is $CCr_{jl}$. We can express

$$\alpha_{CCr_{jl}} = \sum_{i:\, CCr_j^i = CCr_{jl}} f_i^K(x_k)\, CF_i^K.$$

Then the classification procedures for the new data pattern can be stated as follows.

Classification Procedures: The child cluster $CCr_{jy}$, with respect to which the weighting grade of certainty of $x_k$ is maximum, is the concluded cluster of $x$, that is,

$$\alpha_{CCr_{jy}} = \max\{\alpha_{CCr_{j1}},\dots,\alpha_{CCr_{jM_j}}\}.$$
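The reasoning step at a single cluster can be sketched in Python as follows. This is a hedged illustration only: the evenly spaced triangular membership function on [0, 1] and the tie-breaking choice (the lowest child index wins) are assumptions, and the function names and the `(consequent, certainty)` rule representation are hypothetical.

```python
from typing import List, Optional, Sequence, Tuple

Rule = Tuple[Optional[int], float]  # (index of consequent child cluster, certainty)


def triangular_membership(x: float, i: int, K: int) -> float:
    """ASSUMPTION: K >= 2 evenly spaced triangular fuzzy intervals on [0, 1]."""
    peak = (i - 1) / (K - 1)
    half_width = 1.0 / (K - 1)
    return max(0.0, 1.0 - abs(x - peak) / half_width)


def classify_at_cluster(xk: float, rules: Sequence[Rule],
                        num_children: int) -> Tuple[int, List[float]]:
    """Return the concluded child index and all weighting grades of certainty.

    For each child cluster, add up (grade of compatibility) * (grade of
    certainty) over the trained rules whose consequent is that child.
    """
    K = len(rules)
    alpha = [0.0] * num_children
    for i, (consequent, certainty) in enumerate(rules, start=1):
        if consequent is None:  # rule without a consequent contributes nothing
            continue
        alpha[consequent] += triangular_membership(xk, i, K) * certainty
    # ASSUMPTION: on a tie, max() keeps the lowest child index.
    concluded = max(range(num_children), key=lambda child: alpha[child])
    return concluded, alpha


if __name__ == "__main__":
    trained_rules = [(0, 0.9), (1, 0.9)]  # hypothetical rules for K = 2
    print(classify_at_cluster(0.2, trained_rules, num_children=2))
```

With these hypothetical rules, an attribute value of 0.2 is much closer to the first fuzzy interval, so child cluster 0 receives the larger weighting grade and is the concluded cluster.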

Now, the procedures for classifying a new data pattern $x$ into a TC can be stated as follows.

Algorithm IV: Classification procedures of the clustering algorithm.

Step 0: Given a new data pattern $x = (x_1,\dots,x_n)$, where $n$ denotes the total number of attributes; set the present cluster $PCr = Cr_0$.

Step 1: Use $x_k$, where $k$ is the attribute used for splitting $PCr$, and the classification procedures stated above to classify $x$ into a child cluster of $PCr$; we denote this child cluster by $CPCr$. If $CPCr$ is not a TC, set $PCr = CPCr$ and repeat this step; otherwise, stop.