Mining association rules through integration of clustering analysis and ant colony system for health insurance database in Taiwan


R.J. Kuo a,*, S.Y. Lin a, C.W. Shih b

a Department of Industrial Engineering and Management, National Taipei University of Technology, Taipei 106, Taiwan, ROC

b Department of Industrial Engineering and Management, National Chiao-Tung University, Shin-Chu 300, Taiwan, ROC

Abstract

In addition to sharing and applying knowledge within a community, knowledge discovery has become an important issue in the knowledge economy era, and data mining plays an important role in it. This study therefore proposes a novel data mining framework that first clusters the data and then mines association rules. The first stage employs the ant system-based clustering algorithm (ASCA) and ant K-means (AK) to cluster the database, and the second stage applies an ant colony system-based association rule mining algorithm to discover useful rules for each group. The medical database provided by the National Health Insurance Bureau of the Taiwan Government is used to verify the proposed method. The evaluation results show that the proposed method not only extracts rules much faster, but also discovers more important rules.

2006 Published by Elsevier Ltd.

Keywords: Data mining; Ant colony system; Cluster; Association rule

1. Introduction

In recent years, human life has changed dramatically, especially through information technology, which has become an essential part of daily life. Its convenience makes it easier to store all kinds of information regarding science, medicine, finance, population statistics, marketing and so on. However, without a useful method to help us apply these data, they are only garbage instead of resources. Because of this demand, more and more researchers are paying attention to how to use data effectively as well as efficiently. This is what is called data mining.

Data mining, which draws on many areas such as database techniques, artificial intelligence, machine learning, neural networks, statistical techniques, pattern recognition and data visualization, is growing very quickly. Its objective is to automatically find hidden knowledge or information in large databases that may be helpful for making business or policy decisions. Data mining can be classified into several topics, such as classification, estimation, forecasting, clustering, association rules and sequential patterns (Peacock Peter, 1998). Among them, this study proposes a framework that integrates clustering analysis and association rule mining to discover useful rules from the database through the ant colony optimization system.

The proposed method therefore consists of two components: (1) clustering analysis and (2) association rule mining. The first stage employs the ant system-based clustering algorithm (ASCA) and ant K-means (AK) to cluster the database, and the second stage applies the ant colony system-based association rule mining algorithm to discover useful rules for each group. The database is clustered first because this can dramatically decrease the mining time. To assess the proposed method, a database provided by the National Health Insurance Plan of the Taiwan Government is used. This database has accumulated 12 million administrative and claims records and is the largest database in the world. Basically, this work is a cooperation between the National Health Research Institute and the National Health Insurance Bureau of the Taiwan Government to establish a National Health Insurance research database. The computational results show that the proposed method not only extracts useful rules faster, but also provides more precise rules for medical doctors.

* Corresponding author. Fax: +886 2 27763996. E-mail address: rjkuo@ntut.edu.tw (R.J. Kuo).

0957-4174/$ - see front matter 2006 Published by Elsevier Ltd. doi:10.1016/j.eswa.2006.08.035

Expert Systems with Applications 33 (2007) 794–808, www.elsevier.com/locate/eswa

The rest of this paper is organized as follows. Section 2 summarizes some general background on data mining, clustering analysis, association rules and the ant colony optimization system, and the proposed method is presented in Section 3. The results of applying the proposed method to real-world data are illustrated in Section 4. Finally, concluding remarks are made in Section 5.

2. Background

This section briefly reviews four areas of the literature: data mining, clustering analysis, association rule mining and the ant colony optimization system algorithm. Detailed information is presented in the following subsections.

2.1. Data mining

In a past study, Fayyad et al. defined knowledge discovery in databases (KDD) as a nontrivial process of identifying valid, novel, potentially useful, and ultimately understandable patterns in data (Fayyad, Piatetsky-Shapiro, & Smyth, 1996, 1997). The term process indicates that KDD is made up of several steps, which involve selection, preprocessing, transformation, data mining, and interpretation/evaluation. Data mining is a multi-disciplinary field at the intersection of statistics, machine learning, database management, and data visualization, and it provides a new perspective on data analysis (Peacock Peter, 1998).

The following five foundation-level analysis domains are the "reason why" of using data mining: summarization, predictive modeling, clustering/segmentation, classification, and link analysis (Peacock Peter, 1998). Link analysis refers to a family of methods employed to correlate patterns with each other, cross-sectionally or over time. In marketing, a link analysis model can provide information about buyers' behavior. Applying the same idea to medical behavior, link analysis can find patterns in patients' visits to doctors, which can be helpful for diagnosis and for deciding on drugs. If a medical analyst could find out which sets of items are most likely to be diagnosed in a particular group of patients, he could devise several treatment strategies based on the results of link analysis and make them more effective. Because of its importance in medical science, this study focuses on this issue.

2.2. Clustering analysis

The goal of clustering analysis is to group similar objects together. Many methods are applied in clustering analysis, such as hierarchical clustering, partition-based clustering, density-based clustering, and artificial intelligence-based clustering. This subsection describes artificial intelligence-based clustering, which includes artificial neural networks (ANN) and genetic algorithms (GA). The others are introduced in survey research (Bellaachia, Portnoy, Chen, & Elkahoun, 2002; Witten & Frank, 2000; Berkhin, 2002).

2.2.1. Applications of ANN in clustering analysis

The artificial neural network (ANN) is a system derived through models of a collection of simple nonlinear computing elements whose inputs and outputs are linked to form a network (Kohonen, 1991).

Kohonen's feature map (also called the Self-Organizing Feature Map, SOM) is the most widely applied unsupervised learning scheme. The SOM has two layers, an input layer and an output layer. The input layer is fully connected to the output layer, which is two-dimensional. Each output layer node measures the Euclidean distance of its weights to the incoming input vector. The weights of the winning node, the one with the smallest distance in the output layer, are adjusted to be closer to the input vector.

Because SOM can map high-dimensional input vectors into 2-D space, it makes the data easier to visualize and cluster. In other words, most applications of cluster analysis with SOM observe the mapping network visually and then determine the distribution of clusters. In recent years, many studies have improved the efficacy and efficiency of SOM. One example is the Double SOM, which adjusts at the learning stage and lets the nodes that have similar input vectors produce similar weight vectors and move close together (Su & Chang, 2000, 2001). However, visual observation of the mapping network may give different results for the same data. Resson proposed the adaptive double SOM (ADSOM) (Fayyad et al., 1996), which combines features of the popular SOM with two-dimensional position vectors that serve as a visualization tool to detect the number of clusters present in the data. ADSOM allows automatic detection of the number of clusters with a novel index based on hierarchical clustering of the final locations of the position vectors, thereby reducing the human error that could be incurred by counting clusters visually.

Adaptive resonance theory (ART) is another widely applied unsupervised learning scheme. ART includes ART1, which is applicable to binary input, and ART2, which deals with continuous input (Carpenter & Grossberg, 1987). Unlike the traditional SOM, an ART network can determine the actual number of clusters without any visual examination.

2.2.2. Application of GA in clustering analysis

A GA-based clustering algorithm was proposed by Maulik and Bandyopadhyay (2000). It can improve the results of conventional statistical methods, like K-means, that are


2000). Krishna and Murty proposed the Genetic K-means Algorithm (GKA) for clustering analysis, which defines a biased mutation operator specific to clustering called distance-based mutation. They proved, using finite Markov chain theory, that GKA can converge to the best known optimum (Krishna & Murty, 1999).

2.3. Association rule mining

The issue of mining association rules was first addressed in 1993 (Agrawal, Imielinski, & Swami, 1993a, 1993b).

They pointed out that there are some hidden relationships among the purchased items in transactional databases. For example, there are associations or relationships between items such as bread and milk, which are often purchased together in a single basket transaction. The mining results can help understand the customer’s purchase behavior, which might not have been previously perceived.

An association rule is of the form $X \Rightarrow Y$, where X and Y are both frequent itemsets in the given database and the intersection of X and Y is an empty set, i.e., $X \cap Y = \emptyset$. The support of the rule $X \Rightarrow Y$ is the percentage of transactions in the given database that contain both X and Y, i.e., $P(X \cup Y)$. The confidence of the rule $X \Rightarrow Y$ is the percentage of transactions in the given database containing X that also contain Y, i.e., $P(Y \mid X)$. Association rule mining is therefore used to find all the association rules among itemsets in a given database whose support and confidence satisfy the user-specified minimum support and minimum confidence. The problem of association rule mining can be divided into two sub-problems:

1. Finding frequent itemsets with their supports above the minimum support threshold.

2. Using frequent itemsets found in the step 1 to generate association rules that have a conﬁdence level above the minimum conﬁdence threshold.
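The support and confidence measures defined above can be illustrated with a minimal Python sketch. The function names `support` and `confidence` and the four toy transactions are hypothetical and chosen only for illustration; they are not part of the study.

```python
def support(itemset, transactions):
    """Fraction of transactions that contain every item of `itemset`."""
    itemset = frozenset(itemset)
    hits = sum(1 for t in transactions if itemset <= t)
    return hits / len(transactions)


def confidence(x, y, transactions):
    """Fraction of the transactions containing X that also contain Y."""
    sup_x = support(x, transactions)
    if sup_x == 0:
        return 0.0
    return support(frozenset(x) | frozenset(y), transactions) / sup_x


# Hypothetical toy data (assumption, not from the paper).
transactions = [
    frozenset({"bread", "milk"}),
    frozenset({"bread", "milk", "eggs"}),
    frozenset({"bread"}),
    frozenset({"milk", "eggs"}),
]

rule_support = support({"bread", "milk"}, transactions)
rule_confidence = confidence({"bread"}, {"milk"}, transactions)
```

A rule such as bread ⇒ milk would be reported only if both values reach the user-specified minimum support and minimum confidence.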

Therefore, many studies of association rule mining concentrate on developing efficient algorithms for frequent itemset discovery. The following subsections summarize some of the most popular algorithms for frequent itemset mining.

2.3.1. Apriori-like algorithm

Agrawal et al. (1993a, 1993b) proposed the well-known Apriori algorithm, which mines large itemsets to find the association rules among items. The algorithm employs a level-wise approach: it iteratively generates candidate k-itemsets from the previously found frequent (k − 1)-itemsets and then checks the supports of the candidates to form the frequent k-itemsets. The algorithm makes multiple passes over the database. The efficiency and correctness of the level-wise generation of frequent itemsets rest on an important property, called the Apriori property.

The first pass of the algorithm counts item occurrences to find the set of frequent 1-itemsets, denoted as $L_1$. A subsequent pass, say pass k, consists of two steps: the join step and the prune step. In the join step, a set of candidate k-itemsets (denoted as $C_k$) is generated by joining the frequent itemsets $L_{k-1}$ found in the (k − 1)th pass with itself. For example, Fig. 1 demonstrates how to find frequent itemsets with min_sup = 2.
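The level-wise join and prune passes can be sketched in Python as follows. This is a minimal illustration rather than the authors' implementation: the function name `apriori` and the four toy transactions are assumptions (the data of Fig. 1 is not reproduced here), and the prune step relies on the Apriori property that every subset of a frequent itemset must itself be frequent.

```python
from itertools import combinations


def apriori(transactions, min_sup):
    """Return {itemset: count} for every itemset with count >= min_sup."""
    transactions = [frozenset(t) for t in transactions]

    # Pass 1: count single items to obtain L1.
    counts = {}
    for t in transactions:
        for item in t:
            key = frozenset([item])
            counts[key] = counts.get(key, 0) + 1
    level = {s: c for s, c in counts.items() if c >= min_sup}
    frequent = dict(level)

    k = 2
    while level:
        prev = set(level)
        # Join step: combine pairs of frequent (k-1)-itemsets into k-itemsets.
        candidates = {a | b for a in prev for b in prev if len(a | b) == k}
        # Prune step: drop candidates that have an infrequent (k-1)-subset.
        candidates = {
            c for c in candidates
            if all(frozenset(s) in prev for s in combinations(c, k - 1))
        }
        # Scan the database to count the surviving candidates.
        counts = {c: sum(1 for t in transactions if c <= t) for c in candidates}
        level = {s: c for s, c in counts.items() if c >= min_sup}
        frequent.update(level)
        k += 1
    return frequent


# Hypothetical toy database (assumption), mined with min_sup = 2.
toy_db = [{"A", "C", "D"}, {"B", "C", "E"}, {"A", "B", "C", "E"}, {"B", "E"}]
frequent_itemsets = apriori(toy_db, min_sup=2)
```

Here min_sup is an absolute transaction count, matching the min_sup = 2 convention used in the text.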

2.3.2. FP-growth algorithm

Han, Pei, and Yin (2000) proposed a novel frequent pattern tree (FP-tree) structure, which contains all the compact information for mining frequent itemsets, and then proposed the FP-growth algorithm, which adopts a pattern segment growth approach to avoid generating a large number of candidate itemsets. Their mining method scans the whole database only twice and does not need to generate candidate itemsets, so it is very efficient.

2.3.3. Parallel mining

Parallel mining (Agrawal & Shafer, 1996) is another technique used to improve the classic algorithm of mining association rules, on the premise that multiple processors exist in the computing environment. The core idea of parallel mining is to separate the mining task into several sub-tasks so that each sub-task can be performed simultaneously on various processors, which are embedded in the same computer system or even spread over distributed systems. This improves the efficiency of the overall algorithm for mining association rules.

2.3.4. Sampling algorithm

A random sampling technique (Toivonen, 1996) was used to find association rules while reducing the database activity. The sampling algorithm applies the level-based method to the sample with a lower minimum support threshold to mine a superset of the large itemsets. This method produces exact association rules, but in some cases it does not generate all of them; that is, some association rules might be missed. This approach requires only one full pass over the database in most cases, and two passes in the worst case.

2.3.5. Lattice-based algorithm

Zaki (2000) organized the items into a lattice structure and presented a set of algorithms, including Eclat, MaxEclat, MaxClique, TopDown and AprClique, for identifying maximal large itemsets. All of these algorithms attempt to look ahead and identify long large itemsets early, which helps prune the number of candidate itemsets considered. There are two further approaches for mining long large itemsets: Lin and Kedem (2002) proposed the Pincer–Search algorithm, whereas Bayardo (1998) proposed the Max–Miner algorithm. Both algorithms attempt to discover long, large-scale patterns through the search effort. The greatest difference between the two methods lies in the generation of candidate itemsets. The Max–Miner approach generates the candidate itemsets in polynomial time, whereas in the Pincer–Search method it is an NP-hard problem to ensure that no long candidate itemset contains any known infrequent itemset.

2.3.6. Partition algorithm

For mining association rules, Savasere, Omiecinski, and Navathe (1995) introduced a partition algorithm that is fundamentally different from the classic algorithm. The partition algorithm first scans the database once to generate a set of all potentially large itemsets, and the supports of all these itemsets are then measured in a second scan of the database. The key to the correctness of the partition algorithm is that a potentially large itemset appears as a large itemset in at least one of the partitions. The algorithm logically divides the database into a number of non-overlapping partitions, each of which can be held in main memory. The partitions are considered individually, and all large itemsets for each partition are generated. These large itemsets are then merged to create a set of all potentially large itemsets, whose actual supports are counted in the second scan.
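The two-scan idea can be sketched in Python under simplifying assumptions. The names `local_large_itemsets` and `partition_mine`, the brute-force local miner, and the `max_len` cap on itemset size are all illustrative choices and not details of the published algorithm; the minimum support is expressed as a fraction so that the same threshold applies within each partition and globally.

```python
from collections import Counter
from itertools import combinations
from math import ceil


def local_large_itemsets(partition, min_frac, max_len=3):
    """Brute-force miner for one in-memory partition (illustrative only)."""
    counts = Counter()
    for t in partition:
        items = sorted(t)
        for k in range(1, min(max_len, len(items)) + 1):
            for combo in combinations(items, k):
                counts[frozenset(combo)] += 1
    threshold = ceil(min_frac * len(partition))
    return {s for s, c in counts.items() if c >= threshold}


def partition_mine(transactions, min_frac, n_parts=2, max_len=3):
    """Two-scan mining: local candidates first, then one global count."""
    transactions = [frozenset(t) for t in transactions]
    if not transactions:
        return {}
    size = ceil(len(transactions) / n_parts)
    parts = [transactions[i:i + size] for i in range(0, len(transactions), size)]

    # Scan 1: merge the locally large itemsets of every partition.
    candidates = set()
    for part in parts:
        candidates |= local_large_itemsets(part, min_frac, max_len)

    # Scan 2: measure the global support of each merged candidate.
    threshold = ceil(min_frac * len(transactions))
    result = {}
    for c in candidates:
        n = sum(1 for t in transactions if c <= t)
        if n >= threshold:
            result[c] = n
    return result


# Hypothetical toy database (assumption).
toy_db = [{"A", "C", "D"}, {"B", "C", "E"}, {"A", "B", "C", "E"}, {"B", "E"}]
large = partition_mine(toy_db, min_frac=0.5, n_parts=2)
```

An itemset that is large in the whole database must reach the same fractional threshold in at least one partition, which is why the merged candidate set cannot miss it.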

2.3.7. Cluster-decomposition association rule algorithm

Tsay and Chang-Chien (2004) and Zhang and Li (1993) proposed the cluster-based association rule (CBAR) algorithm, which creates cluster tables by scanning the database once and then clustering the transaction records by record length. The large itemsets are then generated by contrasts with the partial clusters. They found that the efficiency gain of CBAR grows as the database size increases or the minimum support decreases.

2.3.8. Proximus

Koyuturk, Grama, and Ramakrishnan (2005) proposed an efficient framework, PROXIMUS, for error-bounded compression of high-dimensional discrete-attribute data sets. Given a transaction set on a set of items, a binary transaction matrix T can be constructed by mapping transactions to rows and items to columns and setting entry $T_{ij}$ to 1 if item j is in transaction $T_i$. The matrix T is then decomposed into n different kinds of transaction sets by finding rank-one approximations of T: $[x_1, \ldots, x_n] \cdot [y_1, \ldots, y_n]^T$. In this way, the framework condenses the large transaction data into n different kinds of virtual transaction sets, each associated with a weight, defined as the number of non-zeros in the corresponding presence vector, i.e., the number of transactions that contain the corresponding pattern. The framework thus reduces the size of the transaction sets and, in other words, achieves higher performance in mining association rules from them.

2.4. Ant colony optimization algorithm

2.4.1. The concept of ant colony optimization system

In the real world, ants communicate with one another through a trail of chemicals called "pheromones", which they deposit when they search for food. Other ants then encounter the previously laid pheromones and decide with what probability to follow them. As more and more ants pass along the same path, the pheromone on the shorter path increases, whereas the pheromone on the other paths evaporates, as illustrated in Fig. 2.

2.4.2. Ant colony system

The ant colony system (ACS) is based on agents that simulate the natural behavior of ants, develop mechanisms of cooperation and learn from experience (Dorigo & Gambardella, 1997). The heuristic has been shown to be robust and versatile for different problems. In addition, ACS is a population-based heuristic that enables the exploration of the positive feedback between agents as a search mechanism.

ACS is a particular ACO algorithm. Real ants are able to communicate information concerning food sources via an aromatic essence: while searching for food, they secrete a pheromone to mark the path leading to the food source. The more pheromone there is on a path, the larger the probability that other ants will use that path, and therefore the pheromone trail on such a path grows faster and attracts more ants to follow. In the ACS, the way ants select a path is changed to what is called the ACS state transition rule. When ant k in city r moves to the next city s, the selection rule is

$$ s = \begin{cases} \arg\max_{u \in J_k(r)} \{ \tau(r,u) \cdot [\eta(r,u)]^{\beta} \}, & \text{if } q \le q_0 \\ S, & \text{otherwise} \end{cases} \qquad (1) $$

where q ($0 \le q \le 1$) is randomly produced and $q_0$ ($0 \le q_0 \le 1$) is a parameter of the system. S is the city selected by the random-proportional rule, which is defined as

$$ p_k(r,s) = \begin{cases} \dfrac{\tau(r,s) \cdot [\eta(r,s)]^{\beta}}{\sum_{u \in J_k(r)} \tau(r,u) \cdot [\eta(r,u)]^{\beta}}, & \text{if } s \in J_k(r) \\ 0, & \text{otherwise} \end{cases} \qquad (2) $$

where $\tau$ is called the pheromone trail and $\eta$ is 1/d, with d the distance between the nodes. $J_k(r)$ is the set of cities that ant k has not yet passed after reaching city r, and $\beta$ is another system parameter of the ant colony system.

Together, these two formulas are called the pseudo-random-proportional rule, and they follow the method of Ant-Q in the evolution of ant algorithms. There are three models for the state transition rule: pseudo-random, pseudo-random-proportional, and random-proportional. In Eq. (1), the case $q \le q_0$ is called exploitation; otherwise s is equal to S, which is called biased exploration.
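The state transition rule of Eqs. (1) and (2) can be sketched in Python as below. The function name `choose_next_city`, the nested-list layout of `tau` and `dist`, and the default values of `beta` and `q0` are assumptions made for illustration only.

```python
import random


def choose_next_city(r, unvisited, tau, dist, beta=2.0, q0=0.9, rng=random):
    """Pick the next city for an ant at city r (pseudo-random-proportional rule)."""
    # Attractiveness tau(r,u) * eta(r,u)^beta with eta = 1/d, as in Eqs. (1)-(2).
    scores = {u: tau[r][u] * (1.0 / dist[r][u]) ** beta for u in unvisited}

    if rng.random() <= q0:
        # Exploitation: take the most attractive unvisited city (Eq. (1)).
        return max(scores, key=scores.get)

    # Biased exploration: sample proportionally to the scores (Eq. (2)).
    cities = list(scores)
    weights = [scores[u] for u in cities]
    return rng.choices(cities, weights=weights, k=1)[0]


# Hypothetical 3-city instance (assumption): uniform pheromone, city 2 is closest to city 0.
tau = [[1.0] * 3 for _ in range(3)]
dist = [[0.0, 2.0, 1.0], [2.0, 0.0, 1.5], [1.0, 1.5, 0.0]]
greedy_choice = choose_next_city(0, [1, 2], tau, dist, q0=1.0)
```

Setting `q0` close to 1 makes the ants almost always exploit the best-looking edge, while smaller values give the random-proportional rule more weight.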

In the ACS, pheromone trail updating is divided into two parts: the ACS global and local updating rules. The ACS global updating rule corresponds to the ANT-cycle method in the ant system: once the ants have completed all their tours, the pheromone trail is renewed, which is called the offline method. The ACS local updating rule corresponds to the ANT-density method in the ant system: while an ant is walking, each step renews the pheromone trail once, which is called the online method.

The ACS global updating rule is presented as

$$ \tau(r,s) = (1-\alpha) \cdot \tau(r,s) + \alpha \cdot \Delta\tau(r,s) \qquad (3) $$

where

$$ \Delta\tau(r,s) = \begin{cases} 1/L_{gb}, & \text{if } (r,s) \in \text{global-best-tour} \\ 0, & \text{otherwise} \end{cases} \qquad (4) $$

In addition, $\alpha$ ($0 < \alpha < 1$) is called the pheromone decay parameter, and $L_{gb}$ is the shortest path from the first point to the current point (in the TSP). ACS thus follows the shortest path that the ants have found from the start to the current point; therefore, it can reach the optimal solution.

The ACS local updating rule is presented in the following equation:

$$ \tau(r,s) = (1-\rho) \cdot \tau(r,s) + \rho \cdot \Delta\tau(r,s) \qquad (5) $$

where $\rho$ ($0 < \rho < 1$) is called the pheromone evaporation parameter, and $\Delta\tau(r,s) = \tau_0$.

The ACS local updating rule is similar to the ACS global updating rule, but it adds a fixed quantity of pheromone every time. When the pheromone on the original path is larger than $\tau_0$, the pheromone value on the path decreases after the local updating rule is applied. This prevents a large number of ants from using the same path, which would cause the pheromone trails to stagnate. Ants perform local updating whenever they travel from one item to another, and global updating is performed once the ants have finished a tour.
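The two updating rules of Eqs. (3) to (5) can be sketched as follows. The function names, the nested-list pheromone matrix, the treatment of edges as directed, and the default parameter values are assumptions for illustration; the global update is applied to every edge with $\Delta\tau = 0$ off the best tour, following Eq. (4) as written here.

```python
def local_update(tau, r, s, rho=0.1, tau0=0.01):
    """Eq. (5): online update applied each time an ant walks edge (r, s)."""
    tau[r][s] = (1.0 - rho) * tau[r][s] + rho * tau0


def global_update(tau, best_tour, best_length, alpha=0.1):
    """Eqs. (3)-(4): offline update after all ants have completed their tours."""
    # Directed edges of the closed global-best tour (assumption: tour returns to start).
    best_edges = set(zip(best_tour, best_tour[1:] + best_tour[:1]))
    n = len(tau)
    for r in range(n):
        for s in range(n):
            delta = 1.0 / best_length if (r, s) in best_edges else 0.0
            tau[r][s] = (1.0 - alpha) * tau[r][s] + alpha * delta


# Hypothetical 3-city pheromone matrix (assumption).
tau = [[0.5] * 3 for _ in range(3)]
local_update(tau, 0, 1)                       # edge (0, 1) drops toward tau0
after_local = tau[0][1]
global_update(tau, [0, 1, 2], best_length=10.0)
```

Because the pheromone on edge (0, 1) starts above `tau0`, the local update lowers it, which is the stagnation-avoiding effect described in the text.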

2.4.3. Ant colony system in clustering analysis

Tsai, Wu, and Tsai (2002) proposed the algorithm that was named ant colony optimization with diﬀerently favor (ACODF). ACODF algorithm has the following desirable strategies:

1. It uses diﬀerently favorable ants to solve the clustering problem.

2. ACODF adopts the simulated annealing concept so that ants visit a decreasing number of cities and obtain local optimal solutions.

3. It utilizes tournament selection strategy to choose a path.

Every ant needs to visit only a few cities instead of all of them, and the number of cities visited is reduced at every iteration. After several iterations, the closer two nodes are, the higher the trail intensity between them will be; conversely, the farther apart they are, the lower the trail intensity will be. Ants therefore favor visiting the closer nodes and then reinforce the trail with their own pheromone. Finally, the clusters are built by dividing on the pheromone laid on the edges between the data points.

Kuo and his colleagues proposed the Ant System-based clustering algorithm (ASCA) (Kuo, Cha, Chou, Shih, & Chiu, 2003) and the Ant K-means algorithm (AK) (Kuo, Wang, Hu, & Chou, 2005) to solve the problem of clustering analysis. They combined these two algorithms into a two-stage clustering method, which uses ASCA to determine the number of clusters and then uses AK to optimize the clustering result. AK modifies K-means by locating objects in a cluster with a probability that is updated by the pheromone, while the pheromone updating rule is based on the total within cluster variance (TWCV).

Yang, Sun, and Huang (2002) applied the ant colony system (ACS) to the clustering problem. Their approach treats the data (objects or elements) as the ants, so each ant has different properties. Basically, the process of data clustering is the process of ants looking for food.

2.4.4. Ant in association rule mining

The use of the ant system for mining association rules is a very new application, although there have been many applications in data mining. Regarding the application of the ant colony system to mining association rules, Su (2002) adopted the technique and concept of the Ant System to develop association rules. The developed algorithm supports quality data, quantity data, and mixed data. According to its results, the ant system takes more time to run the data in the assigned cycle, and if the data is critical or has time constraints, it may not be feasible. Furthermore, some parameters of the ant algorithm need to be pre-determined, which may be time consuming. To resolve these problems, Kuo and Shih (accepted for publication) used the constraints concept to decrease the run time and let almost all of the parameters be known before running the model. In this study and in Shih (2004), algorithms based on the ant colony system are proposed to mine the association rules.

3. Methodology

The proposed framework is described in this section. The following subsections describe the problem definition and the proposed method: the Ant System-based Clustering Algorithm (ASCA), the Ant K-means Clustering Algorithm (AK), and the ACS-based association rule mining algorithm. The mining stages are shown in Fig. 3.

3.1. Clustering algorithm

3.1.1. Definitions and notations

The following terms and notations are used throughout this study:

• Let $E = \{O_1, O_2, \ldots, O_n\}$ be the set of n data or objects, where O denotes an object (or datum, item) collected from the database. Each object has k attributes, where k > 0 (see Fig. 4).

• $\alpha$: The relative importance of the trail, $\alpha \ge 0$.

• $\beta$: The relative importance of the visibility, $\beta \ge 0$.

• $\rho$: The pheromone decay parameter, $0 < \rho < 1$.

• Q: A constant.

• n: Number of objects.

• m: Number of ants.

• nc: Number of clusters.

• T: the set of used objects. The maximal number of objects recorded in the T array is n, i.e., $T = \{O_a, O_b, \ldots, O_t\}$, where a, b, ..., t are the points that the ant has visited.

• $T_k$: the set T built by ant k.

• $O_{center}(T)$: the object which is the center of all objects in T, i.e.,

$$ O_{center}(T) = \frac{1}{n_T} \sum_{O_i \in T} O_i \qquad (6) $$

where $n_T$ is the number of objects in T.

• TWCV: Total within cluster variance, i.e.,

$$ \sum_{k=1}^{nc} \sum_{i \in k} \left( O_i, O_{center}(T_k) \right)^2 \qquad (7) $$
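Eqs. (6) and (7) can be illustrated with a short Python sketch. It assumes that objects are numeric attribute vectors and that the pair $(O_i, O_{center}(T_k))$ in Eq. (7) denotes the Euclidean distance between an object and its cluster center; the function names `center` and `twcv` and the sample clusters are hypothetical.

```python
def center(objects):
    """Eq. (6): attribute-wise mean of the objects in a set T."""
    n = len(objects)
    dims = len(objects[0])
    return [sum(o[d] for o in objects) / n for d in range(dims)]


def twcv(clusters):
    """Eq. (7): sum of squared distances from each object to its cluster center.

    Assumption: the distance is Euclidean.
    """
    total = 0.0
    for members in clusters:
        c = center(members)
        for o in members:
            total += sum((x - m) ** 2 for x, m in zip(o, c))
    return total


# Hypothetical clusters (assumption): two objects in the first, one in the second.
clusters = [[[0.0, 0.0], [2.0, 0.0]], [[5.0, 5.0]]]
variance = twcv(clusters)
```

A smaller TWCV means tighter clusters, which is why both ASCA and AK use it as the stopping and update criterion.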

3.1.2. Ant system-based clustering algorithm (ASCA)

ASCA consists of four sub-procedures: Divide, Agglomerate_obj, Agglomerate, and Remove. The procedure is as follows. First, initialize the parameters and group all the objects into one cluster. Second, the sub-procedure Divide divides the cluster into several sub-clusters, plus some objects that do not belong to any sub-cluster, according to the consistency of the pheromone and some criterion. Third, Agglomerate_obj agglomerates these objects into the suitable sub-clusters. Fourth, Agglomerate merges two similar sub-clusters into one cluster. Fifth, Agglomerate_obj is run again. Sixth, after the similar objects have been agglomerated into the suitable clusters, the Remove sub-procedure tries to remove the dissimilar objects from each sub-cluster. Finally, calculate the total within cluster variance (TWCV). If TWCV has not changed, assign the non-clustered objects to the closest cluster and stop the procedure. Otherwise, repeat the sub-procedures Divide, Agglomerate_obj, Agglomerate, Agglomerate_obj, and Remove until TWCV no longer changes. The detailed ASCA algorithm is introduced in Kuo et al. (2003) and shown in Fig. 5.

Fig. 3. The mining stages of the study (data cleaning and selection; transformation; clustering with ASCA and AK into Cluster 1, ..., Cluster n; ACS-based association rule mining; generation of association rules).

[Fig. 4: table of objects O1, ..., On with their attribute values.]

3.1.3. Ant K-means clustering algorithm (AK)

Ant K-means Algorithm (AK) (Kuo et al., 2005) is the second stage of clustering. AK modifies K-means by locating the objects in a cluster with a probability that is modified by the pheromone, and the pheromone updating rule is based on the total within cluster variance. The process is as follows. The first step initializes the parameters, including the number of clusters and their centroids. Second, lay equal pheromone on each path. Third, each ant k chooses the centroid to move to with probability P, i.e.,

$$ P^k_{ij} = \frac{\tau_{ij}^{\alpha} \, \eta_{ij}^{\beta}}{\sum_{c}^{nc} \tau_{ic}^{\alpha} \, \eta_{ic}^{\beta}} \qquad (8) $$

where i is the start point, j is the end point (centroid) to which ant k chooses to move, c is a centroid and nc is the number of centroids. Therefore, if the value of $P_{ij}$ is larger than the others, ant k will move from point i to point j, i.e., object i belongs to centroid j. Fourth, update the pheromone by

$$ \tau_{ij} \leftarrow \tau_{ij} + \frac{Q}{TWCV} \qquad (9) $$

where Q is a constant and TWCV is the total within cluster variance. Then calculate $O_{center}(T_k)$, where k = 1, 2, 3, ..., nc, and after that calculate TWCV. If TWCV has changed, go back to the third step; otherwise, if TWCV is smaller than the smallest TWCV so far, replace it. Next, run the procedure Perturbation to escape from the local minimum. If the maximum number of iterations has not been reached, go back to the third step; otherwise, stop the algorithm. Fig. 6 shows the procedure of the Ant K-means algorithm.
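The third and fourth steps of AK, Eqs. (8) and (9), can be sketched in Python as below. The function names, the per-object lists `tau_i` and `eta_i` (one entry per centroid), and the choice to deposit pheromone only on the chosen object-to-centroid paths are assumptions for illustration; the text does not define the visibility $\eta_{ij}$ for AK, so the example simply supplies numeric values.

```python
def assignment_probabilities(tau_i, eta_i, alpha=1.0, beta=1.0):
    """Eq. (8): probability that object i is assigned to each centroid."""
    weights = [(t ** alpha) * (e ** beta) for t, e in zip(tau_i, eta_i)]
    total = sum(weights)
    return [w / total for w in weights]


def deposit(tau, assignment, twcv_value, Q=1.0):
    """Eq. (9): add Q / TWCV to the pheromone of each chosen path i -> j.

    Assumption: only the paths actually chosen receive pheromone.
    """
    for i, j in enumerate(assignment):
        tau[i][j] += Q / twcv_value


# Hypothetical example (assumption): one object, two centroids, equal pheromone.
probs = assignment_probabilities([1.0, 1.0], [1.0, 3.0])

# Two objects assigned to centroids 1 and 0, with a TWCV of 2.0.
tau = [[1.0, 1.0], [1.0, 1.0]]
deposit(tau, assignment=[1, 0], twcv_value=2.0)
```

Since the deposit is Q / TWCV, assignments that produce tighter clusters reinforce their paths more strongly.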


3.2. The ACS-based association rule mining

In this section, the ACS-based association rule mining proposed in Fayyad et al. (1996) is introduced.

3.2.1. Problem definition

Let $U = \{I_1, I_2, \ldots, I_n\}$ be a set of all items, where an item is an object with m attributes ($m \ge 1$), the so-called dimensions (e.g., weight, height, cost, etc.), as illustrated in Fig. 7. The value $k_j$ is the value of item $I_k$ on dimension $A_j$, $j \in \{1, 2, \ldots, m\}$.

Definition 3.1. Association Rules Mining

If $r(A_i, A_j) = \tau_{ij} \ge \tau_{threshold}$ ($-1 \le \tau_{ij} \le 1$) represents the degree of relation between $A_i$ and $A_j$, $\forall i, j = 1, \ldots, n$, with $r(A_i, A_j) = r(A_j, A_i)$ and $r(A_i, A_i) = 1$, then an association rule is an expression of the form $A_i \Leftrightarrow A_j$, for any $A_i, A_j \in A$, when $\tau_{ij} \ge \tau_{threshold}$, with $0 \le \tau_{threshold} \le 1$.

3.2.2. ACS-based association rule mining algorithm

The ACS-based association rule mining algorithm (Shih, 2004) was applied to mine the association rules. The n items form a complete graph, in which each pair of vertices is joined by an edge, as illustrated in Fig. 8. Let $\eta_{ij}$ be the frequency between items i and j; the value $\eta_{ij}$ on an edge thus represents the frequency between the two items it joins.
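The edge weights of the complete graph can be sketched in Python as follows, assuming that the frequency between two items is the number of transactions in which both appear (the algorithm summary later sets this quantity equal to the support of the pair). The function name `pair_frequencies` and the toy records are hypothetical.

```python
from collections import Counter
from itertools import combinations


def pair_frequencies(transactions):
    """Count, for every unordered item pair (i, j), the transactions containing both."""
    eta = Counter()
    for t in transactions:
        # Sort so that each pair is stored once, as (smaller, larger).
        for i, j in combinations(sorted(set(t)), 2):
            eta[(i, j)] += 1
    return eta


# Hypothetical records (assumption).
records = [["a", "b"], ["a", "b", "c"], ["b", "c"]]
eta = pair_frequencies(records)
```

Pairs that never co-occur are simply absent from the counter, which corresponds to an edge with zero frequency.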

Let $b_i(t)$ (i = 1, ..., n) be the number of ants at item i at time t and let $m = \sum_{i=1}^{n} b_i(t)$ be the total number of ants at time t. All ants will follow:

1. Ant chooses next item j to follow by the state transition rule that is deﬁned by

j¼ arg maxu2ZfsiuðtÞ g b

iug; if q 6 q0

S; otherwise

(

ð10Þ

where Z is the set of the ant unaccomplished tour and b

is a system parameter. In addition, q and q0are random

Fig. 6. The procedure of Ant K-means.

Attributes (dimensions)

k

1

= I

k

- A

1

itemID A

1

A

2

… A

m

I

k

(k

=

1

, k

2

,…, k

m

)

Fig. 7. Multi-dimensional items.

η

ij

η

jk

η

kz

η

zi

i

z

k

j

η

ik

η

jz

(10)

number uniformly distributed in [0, 1] and in parameter

(0 6 q061), which determines the relative

impor-tance of exploitation versus exploration, respectively. If q 6 q0, the item of unaccomplished tour j with

maxi-mum siuðtÞg b

iu value is put at position (exploitation);

otherwise the item is chosen according to S (biased exploration).

2. The random variable S is selected according to the prob-ability distribution of the random-proportional rule as following equation: pk ij¼ sijðtÞgbij P u2ZsiuðtÞg b iu if j2 Z 0; otherwise 8 < : ð11Þ

The resulting state transition rules refer to Eqs.(10) and

(11), and are called the pseudo-random-proportional

rule.

3. Ants can only choose a path that has never been used (increase tabu).

4. After traveling on a path, an ant will lay some pheromone on it (local updating).
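The pseudo-random-proportional rule of Eqs. (10) and (11) can be sketched as follows. This is a minimal illustration rather than the authors' implementation: the function name `choose_next_item`, the nested-list layout of the pheromone and frequency matrices, and the example values are all assumptions. The `unvisited` argument plays the role of the set Z, so items already in the tabu list are simply left out of it.

```python
import random

def choose_next_item(i, unvisited, tau, eta, beta, q0, rng=random):
    """Pick the next item for an ant located at item i.

    Assumes at least one unvisited item has a positive weight
    tau[i][u] * eta[i][u] ** beta; otherwise sampling is undefined.
    """
    weights = {u: tau[i][u] * eta[i][u] ** beta for u in unvisited}
    if rng.random() <= q0:
        # Exploitation (first branch of Eq. (10)): take the largest weight.
        return max(weights, key=weights.get)
    # Biased exploration (Eq. (11)): sample proportionally to the weights.
    items = list(weights)
    return rng.choices(items, weights=[weights[u] for u in items], k=1)[0]

# Hypothetical 3-item example: uniform pheromone, symmetric frequencies.
tau = [[1.0] * 3 for _ in range(3)]
eta = [[0.0, 0.4, 0.1],
       [0.4, 0.0, 0.3],
       [0.1, 0.3, 0.0]]

# With q0 = 1 the ant always exploits, so item 1 (larger frequency) is chosen.
print(choose_next_item(0, [1, 2], tau, eta, beta=2.0, q0=1.0))
```

Setting `q0` close to 1 makes the search greedy, while a small `q0` lets low-frequency item pairs be sampled more often.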

Let $\tau_{ij}(t)$ be the intensity of the pheromone trail on edge (i, j) at time t. An iteration is considered complete when an ant completes its tour, and the next iteration starts at time t + 1. The pheromone intensity is then updated according to

$$\tau_{ij}(t+1) = (1 - \alpha)\, \tau_{ij}(t) + \alpha\, \Delta\tau_{ij} \tag{12}$$

where $\alpha$ $(0 \le \alpha \le 1)$ is a coefficient for the percentage of pheromone remaining between time t and t + 1. For association rules, $\alpha$ is treated as a time-series coefficient, which decreases the impact of old data and regulates the coefficient $\Delta\tau_{ij}$. Therefore, if the data are independent of time, set $\alpha = 0$.

For mining association rules, U-correlation is used instead of the correlation coefficient, as defined below:

$$\Delta\tau_{ij}^{k}(t) = \begin{cases} \dfrac{1}{L_{gb}}, & \text{if the tour of the } k\text{th ant is global-high-frequency} \\ 0, & \text{otherwise} \end{cases} \tag{13}$$

$L_{gb}$ is the highest accumulated frequency among the tours of the first through the kth ants. It is also the pheromone intensity along the shortest path ij when each ant completes its trip at time t.
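The global update of Eqs. (12) and (13) might be sketched as below. The sketch rests on assumptions not stated in the source: edges are treated as undirected, the update is applied to every edge (so edges off the best tour only lose pheromone), and the names `global_update`, `best_tour_edges` and `l_gb` are hypothetical.

```python
def global_update(tau, best_tour_edges, l_gb, alpha):
    """Apply tau_ij <- (1 - alpha) * tau_ij + alpha * delta_ij in place.

    delta_ij is 1 / l_gb on the edges of the global-high-frequency tour
    and 0 elsewhere, following Eq. (13).
    """
    best = {frozenset(edge) for edge in best_tour_edges}  # undirected edges
    n = len(tau)
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            delta = 1.0 / l_gb if frozenset((i, j)) in best else 0.0
            tau[i][j] = (1 - alpha) * tau[i][j] + alpha * delta
    return tau

# Hypothetical example: three items, uniform initial pheromone.
tau = [[1.0] * 3 for _ in range(3)]
global_update(tau, best_tour_edges=[(0, 1)], l_gb=4.0, alpha=0.5)
print(tau[0][1], tau[0][2])  # edge on the best tour vs. edge off it
```

With `alpha = 0` the matrix is left unchanged, which matches the remark that time-independent data use no decay.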

Now let us summarize our algorithm as follows:

Step 1: Initialization

Set t = 0 {t is the time counter};

Set NC = 0 {NC is the iteration counter};

Set $\tau_{ij}(t) = c$, $\Delta\tau_{ij} = 0$ and $\tau_0 = c$, $\forall\, i, j = 1, \ldots, n$; $i \ne j$;

Set m = n and place the m ants on the n nodes (items);

Set $\beta = c$, $\rho = c$, $q = c$ and $\alpha = c \in [0, 1]$; tabu(s) = $\emptyset$.

Step 2: Multi-dimensional constraints test

Scan the database once and ﬁnd the complete set [SATc(U)] of itemsets satisfying C.

Step 3: Mining guided by ant colony system

1. Calculate $\eta_{ij}(t) = \mathrm{support}_{ij}(t)$ from the set [SATc(U)].

2. Choose the next item j by the state transition rule (Eq. (10)).

3. If $q > q_0$, choose the next edge (i, j) to move along with the transition probability given in Eq. (11).

4. Move the kth ant from node i to node j and insert that path into tabu(s).

5. As the kth ant moves from node i to node j, change the local pheromone trail by using

$$\tau_{ij}^{k} = (1 - \rho)\, \tau_{ij} + \rho\, \Delta\tau_{ij} \tag{14}$$

where $0 < \rho < 1$ and $\Delta\tau_{ij} = \tau_0$.

6. After running a cycle, update $\Delta\tau_{ij}^{k}(t)$ according to Eq. (13). Then calculate Eq. (12).

7. Set t = t + 1 and NC = NC + 1, and then repeat steps 1–5 until the termination condition is met.

8. Generate the association rules according to the mining results.

The flow and steps of ACS-based association rule mining are shown in Fig. 9.
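Two of the steps above, computing the frequencies from the data (step 1) and the local pheromone update of Eq. (14) (steps 4 and 5), can be sketched as follows. The sketch assumes that the support of an item pair is the fraction of records containing both items, that edges are undirected, and that records are lists of integer item indices; the names `pair_support` and `local_update` are hypothetical.

```python
from itertools import combinations

def pair_support(records, n_items):
    """eta[i][j] = fraction of records that contain both item i and item j.
    Assumes `records` is non-empty."""
    counts = [[0] * n_items for _ in range(n_items)]
    for record in records:
        for i, j in combinations(sorted(set(record)), 2):
            counts[i][j] += 1
            counts[j][i] += 1
    total = len(records)
    return [[c / total for c in row] for row in counts]

def local_update(tau, i, j, rho, tau0, tabu):
    """Record edge (i, j) in the tabu set and apply Eq. (14) with
    delta_tau = tau0. Returns the new pheromone value on the edge."""
    tabu.add(frozenset((i, j)))
    new_value = (1 - rho) * tau[i][j] + rho * tau0
    tau[i][j] = tau[j][i] = new_value
    return new_value

# Hypothetical data: three records over three items.
eta = pair_support([[0, 1], [0, 1, 2], [2]], n_items=3)
tau = [[1.0] * 3 for _ in range(3)]
tabu = set()
updated = local_update(tau, 0, 1, rho=0.1, tau0=0.5, tabu=tabu)
print(eta[0][1], updated)
```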

4. Model evaluation results and discussion

This section applies the proposed method to a real-world problem in order to evaluate it. The procedures and results are provided in the following subsections.

4.1. Data preparation and transformation

The National Health Insurance Plan of the Taiwan Government has accumulated 12 million administrative and claims data records. It is the largest database in the world. To respond rapidly and effectively to current and emerging health issues, the National Health Research Institutes (NHRI) cooperates with the National Health Insurance Bureau (NHIB) of Taiwan to establish a National Health Insurance research database. The NHRI is responsible for protecting the privacy and confidentiality of the data. It also routinely transfers the health insurance data from the NHIB so that health researchers can analyze them and improve the health of Taiwan's citizens.

A representative database was drawn from the entire database by systematic sampling. The size of the subset for each month is determined by the ratio of the amount of data in that month to that of the entire year. Systematic sampling is then performed for each month to choose a representative subset, and the sampling database is obtained by combining the subsets for the 12 months. The sampling database of the disease is around 0.2% of the entire database.
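The sampling scheme described above can be sketched as follows, assuming proportional allocation of the sample across months and a fixed-interval pick from a random start within each month. The function names and the toy numbers are hypothetical, and rounding the quotas means they need not add up exactly to the requested sample size.

```python
import random

def monthly_quota(month_sizes, sample_size):
    """Split the sample across months in proportion to each month's size."""
    total = sum(month_sizes)
    return [round(sample_size * size / total) for size in month_sizes]

def systematic_sample(records, k, rng=random):
    """Pick k records at a fixed interval, starting from a random offset.
    Assumes 0 <= k <= len(records)."""
    if k <= 0:
        return []
    step = len(records) / k
    start = rng.random() * step
    last = len(records) - 1
    return [records[min(int(start + m * step), last)] for m in range(k)]

# Hypothetical example with two months of 100 and 300 records.
quotas = monthly_quota([100, 300], sample_size=40)
january = systematic_sample(list(range(100)), quotas[0], random.Random(0))
print(quotas, january)
```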

In a medical database, the most complete and detailed information is the anamnesis data, which contain the disease name, prescription, detailed patient information, and so on. Using these data, we aim to find the association rules between diseases and also to detect fake cases by data mining technology. This process should increase medical quality as well as decrease the cost and waste of medical resources. In this study, the ACS-based association rule mining algorithm is employed to find hidden relationships among disease items in the western medicine database. This study is based in part on data from the National Health Insurance Research Database provided by the Bureau of National Health Insurance, Department of Health and managed by National Health Research Institutes. The interpretation and conclusions contained herein do not represent those of Bureau of National Health Insurance, Department of Health or National Health Research Institutes. Because executing ASCA and AK requires substantial computing resources, only 1000 records are mined in this study.

In the data preparation stage, records whose disease column holds an invalid value must be deleted, because this study is concerned with the relationships between diseases. For the same reason, only some of the thirty-seven columns in the original medicine database are needed, and the others are deleted. The remaining columns are "outpatient services," "outpatient services date," "patient's birthday," "international classification disease number 1–3 (ICD code)," and "patient's sex."

The international classification disease numbers comprise ICD-9-CM codes and A-codes, which can be classified by anatomy and etiology into 18 classifications, as shown in Table 1. The 18th classification includes the A-codes that cannot be classified by ICD-9-CM and the supplementary classifications, which have V-codes, E-codes, and M-codes.

[Flowchart of the ACS-based association rule mining procedure: initialization; calculate $\eta_{ij}(t)$; check the state transition rule ($q \le q_0$?); implement the act of exploitation or the act of biased exploration; tabu(s) = tabu(s) + 1; local pheromone renewal; global pheromone renewal; check the termination conditions; generate association rules. The two decision nodes have Yes/No branches.]

This study transformed the ICD codes of the National Health Insurance Research Database into 18 different attributes according to Table 1. The attributes are the input of the clustering analysis. To prevent the length of a record from influencing the attributes, and thereby the relationships between records, the ICD codes were encoded by Eq. (15):

$$D_k(i) = \frac{C \cdot S_k(i)}{C(i)} \tag{15}$$

where

$D_k(i)$ is the frequency of the kth dimension (classification) in the ith record, expressed relative to the maximum record length;

$S_k(i)$ is the frequency of the kth dimension (classification) in the ith record;

C is the maximum length over all records, and C(i) is the length of the ith record.

Table 2 is an example of encoding. Suppose C = 3 and the Data ID is 1.
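The encoding of Eq. (15) can be sketched in a few lines. The class ranges are copied from Table 1; everything else is an assumption made for illustration: the names `icd_class` and `encode`, the treatment of any code that is not a number inside the Table 1 ranges (for instance V-, E-, M- or A-codes) as class 18, and the representation of a record as a list of code strings.

```python
# ICD-9-CM category ranges for classes 1..17, as listed in Table 1.
ICD_RANGES = [
    (1, 139), (140, 239), (240, 279), (280, 289), (290, 319), (320, 389),
    (390, 459), (460, 519), (520, 579), (580, 629), (630, 677), (680, 709),
    (710, 739), (740, 759), (760, 779), (780, 799), (800, 999),
]

def icd_class(code):
    """Map an ICD code string to its class number (1..18).
    Codes outside the numeric ranges fall into class 18 (assumption)."""
    try:
        category = int(float(code))
    except ValueError:
        return 18
    for k, (low, high) in enumerate(ICD_RANGES, start=1):
        if low <= category <= high:
            return k
    return 18

def encode(codes, c_max):
    """Return the 18-dimensional vector D_k = C * S_k / C(i) of Eq. (15).
    Assumes the record holds at least one code."""
    counts = [0] * 18
    for code in codes:
        counts[icd_class(code) - 1] += 1
    return [c_max * s / len(codes) for s in counts]

# The record of Table 2, with C = 3.
vector = encode(["250.00", "461.90", "477.90"], c_max=3)
print(vector)  # 1.0 in dimension 03, 2.0 in dimension 08, 0.0 elsewhere
```

Scaling by C / C(i) gives every record the same total weight C, so a record with one diagnosis and a record with three are comparable in the clustering stage.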

4.2. Clustering analysis

After encoding, clustering analysis was carried out with the two-stage clustering algorithm, which consists of ASCA and AK. The agglomerate level of ASCA was tuned, generating different numbers of clusters, as shown in Fig. 10. The result with 3 clusters was chosen to lower the cost of association rule mining. AK was then used to refine the clusters. The curve of the TWCV is drawn in Fig. 11; the final TWCV is 3153.68.

4.3. Association rule mining and analysis

Before mining the association rules from the clustering results, the effects of the number of data and the number of items were compared. First, data sets containing 100, 200, 300, or 400 items were generated randomly. The association rules were then mined with the ACS-based association rule mining algorithm. The searching times and the information on the data sets are shown in Table 3 and Fig. 12(a) and (b).

Table 2
An example of encoding

Input data

| Data ID | ICD-9-CM | ICD-9-CM | ICD-9-CM |
|---|---|---|---|
| 1 | 250.00 | 461.90 | 477.90 |

Encoded

| Data ID | 01 | 02 | 03 | 04 | 05 | 06 | 07 | 08 | 09 | 10 | 11 | 12 | 13 | 14 | 15 | 16 | 17 | 18 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 2 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |

Fig. 10. The number of clusters with different agglomerate level. [Plot: number of clusters (y-axis) against agglomerate level (x-axis); data labels on the curve: 27, 13, 12, 9, 7, 3, 1.]

Fig. 11. The curve of the TWCV. [Plot: TWCV (y-axis) against iteration (x-axis).]

Table 1
The classification of ICD code

| No. | The disease items | The ICD code |
|---|---|---|
| 1 | Infectious and parasitic diseases | 001–139 |
| 2 | Neoplasm | 140–239 |
| 3 | Endocrine, nutritional, and metabolic diseases | 240–279 |
| 4 | Diseases of the blood-forming organs | 280–289 |
| 5 | Mental disorders | 290–319 |
| 6 | Diseases of the nervous systems and sense organs | 320–389 |
| 7 | Diseases of the circulatory system | 390–459 |
| 8 | Diseases of respiratory system | 460–519 |
| 9 | Diseases of the digestive system | 520–579 |
| 10 | Diseases of the genitourinary system | 580–629 |
| 11 | Complications of pregnancy, childbirth, and the puerperium | 630–677 |
| 12 | Diseases of the skin and subcutaneous tissue | 680–709 |
| 13 | Diseases of the musculoskeletal system and connective tissue | 710–739 |
| 14 | Congenital anomalies | 740–759 |
| 15 | Certain conditions originating in the perinatal period | 760–779 |
| 16 | Symptoms, signs, and ill-defined conditions | 780–799 |
| 17 | Injury and poisoning | 800–999 |


Fig. 12 shows that the number of items has a great influence on the searching time, whereas the amount of data has much less. It can therefore be inferred that the performance of the ACS-based association rule mining algorithm can be improved by reducing the number of items. Applying clustering analysis to reduce the number of items could thus improve the performance of the ACS-based association rule mining algorithm.

The ACS-based association rule mining system was executed on an Intel Pentium 4 3.0 GHz machine with 1024 MB of RAM. The searching times are shown in Table 4. The total searching time over all the clusters was 3.282 s, whereas mining the complete data took 3.875 s. Thus, mining the association rules from the separate clusters saves 15.31% of the time needed to mine the complete data. However, the clustering analysis itself takes more than 1 h, which is a very long time.
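As a check on these figures, the total and the relative saving can be recomputed from the per-cluster searching times reported in Table 4 (0.157 s, 2.75 s and 0.375 s) and the 3.875 s needed for the complete data:

$$0.157 + 2.75 + 0.375 = 3.282\ \mathrm{s}, \qquad \frac{3.875 - 3.282}{3.875} = \frac{0.593}{3.875} \approx 0.153$$

That is, a saving of roughly 15.3% in mining time, before the cost of the clustering stage is taken into account.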

Table 5 lists, for each cluster, the five association rules with the highest pheromone built up by the ACS-based association rule mining system, together with their support and confidence. Comparing the clustered with the non-clustered results shows that the support and confidence of the association rules are generally higher in the clustered data. Thus, the attributes underlying the association rules are more similar to each other, and the rules are easier to find. This makes it more helpful for studying the pathology of particular groups of patients. In summary, the proposed framework, which first applies clustering analysis and then mines the association rules with the ACS-based association rule mining algorithm, not only improves efficiency but also makes the rules hidden in the data easier to find.
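A short worked example shows why the support rises after clustering. The record count used here is inferred from the reported percentages and is not stated in the source. For the rule "Conjunctival xerosis ==> Chronic conjunctivitis, unspecified", Table 5 reports a support of 16.32653% in cluster 1, which holds 49 records (Table 4), and 0.8% in the complete data of 1000 records. Both values are consistent with the same 8 records:

$$\frac{8}{49} \approx 16.32653\%, \qquad \frac{8}{1000} = 0.8\%$$

The matching records are thus concentrated in a much smaller group, so the same rule stands out more clearly within its cluster than in the complete data.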

As mentioned above, the proposed method can find useful rules. For example, the rule "Essential hypertension, unspecified ==> Other and unspecified hyperlipidaemia" in cluster 2 has a low confidence, less than 10%, and could easily be overlooked in an analysis. Nevertheless, this rule is found by the proposed method.

Table 6 shows the result extracted from the complete data by the ACS-based association rule mining system. Most of its rules are the same as those extracted from the largest cluster, cluster 2, which contains 808 records, or 80.8% of the total data. However, the rules that experts consider useful, with a robust relationship, are extracted not only from this cluster but also from the others. For example, "Trichiasis ==> Conjunctivitis, unspecified", which Table 5 lists under cluster 1, is a rule with a robust relationship. To obtain such rules from the complete data, the pheromone threshold must be set to a lower value, which generates a very large number of rules. Many of the generated rules may then be readily overlooked because of the sheer amount of information. By contrast, with the proposed method it is easier to find the hidden rules, which may occur less often but have a robust relationship. In other words, the same effect can be achieved at a lower cost.

Although the proposed method can find useful and important rules and runs with higher efficiency, it also produces some useless rules and noise. For example, the rules "Conjunctival xerosis ==> Chronic conjunctivitis, unspecified" in cluster 1, and "Hypertrophy (benign) of prostate ==> Hypertensive heart disease, benign without congestive heart" and "Hypertrophy (benign) of prostate ==> Calculus of kidney" in cluster 3, contradict the results of past research. The possible reasons are summarized as follows:

Fig. 12. Performance of ACS-based association rule mining on various amounts of data and itemsets. (a) Performance on various amounts of data. (b) Performance on various amounts of itemsets. [Plots: in (a), number of data (left axis) and searching time in seconds (right axis) for data sets 1–12; in (b), number of items (left axis) and searching time in seconds (right axis) for data sets 1–12.]

Table 4
The searching time in different clusters

| Cluster | 1 | 2 | 3 | Total |
|---|---|---|---|---|
| Number of data | 49 | 808 | 143 | 1000 |
| Number of items | 68 | 507 | 162 | 642 |
| Searching time (s) | 0.157 | 2.75 | 0.375 | 3.875 |

Table 3
Searching time and information of different data sets

| Data set | 1 | 2 | 3 | 4 | 5 | 6 |
|---|---|---|---|---|---|---|
| Number of data | 62 | 111 | 116 | 397 | 305 | 167 |
| Number of items | 100 | 100 | 100 | 200 | 200 | 200 |
| Searching time (s) | 0.094 | 0.093 | 0.109 | 0.735 | 0.7 | 0.703 |

| Data set | 7 | 8 | 9 | 10 | 11 | 12 |
|---|---|---|---|---|---|---|
| Number of data | 364 | 689 | 870 | 597 | 1193 | 1614 |
| Number of items | 300 | 300 | 300 | 400 | 400 | 400 |
| Searching time (s) | 2.282 | 2.43 | 2.406 | 5.328 | 5.797 | 5.718 |

1. The ICD codes are not good enough to classify the diseases and cannot describe the relationships between the diseases.

2. The habits of Taiwanese in seeking medical treatment.

3. Errors in observing the mapping network by eye.

5. Conclusions

In the early 21st century, the development of science and technology has let medicine prosper and has greatly changed the environment. Thus, it is quite difficult to predict what will happen in the future. In particular, more and more previously unknown diseases, such as SARS and bird flu, have been found recently, and human beings have to fight germs ever harder. Therefore, developing a decision support system for patient treatment that extracts the important relationships or association rules between diseases has become a very critical issue. This can also provide another way, different from medicine and biology, to help diagnose diseases and find treatments.

According to the above findings, this study has developed a method that is able to discover more useful and accurate rules from the medical database quickly. To avoid losing knowledge when dividing the data, we divide the medical database into several clusters by ant colony system and then mine the hidden knowledge from the clustered data, also via ant colony system. This not only lets researchers pay more attention to important groups and find the hidden relations within those groups more easily, but also prevents important relationships from being ignored in the large database. The evaluation results using the National Health Insurance Database have shown the feasibility of the proposed method.

Although the results of this study show a promising application, some issues remain to be solved. Because this study mines only the relations between the ICD codes, it is suggested to add the numerical data of medical examinations and to fuzzify the numerical data in the preparation stage. In the clustering analysis stage, the proposed method used ASCA and AK to build the clusters. It may therefore be desirable to apply other clustering methods, such as ART2, ADSOM or other two-stage methods, to cluster the data. Besides, many similar rules are generated by the mining process, so it is feasible to apply other technology, such as fuzzy theory, to merge the similar rules.

Table 5
The result of the proposed method

| Cluster | Association rule | Pheromone (clustered) | Support (clustered) | Confidence (clustered) | Support (non-clustered) | Confidence (non-clustered) |
|---|---|---|---|---|---|---|
| 1 | Conjunctival xerosis ==> Chronic conjunctivitis, unspecified | 4.016184 | 16.32653 | 62.50000 | 0.80000 | 62.50000 |
| 1 | Conjunctivitis, unspecified ==> Other specified disorders of eye and adnexa | 3.011721 | 18.36735 | 33.33333 | 1.10000 | 27.27273 |
| 1 | Nonsenile cataract, unspecified ==> Conjunctivitis, unspecified | 1.996815 | 6.12245 | 66.66667 | 0.30000 | 66.66667 |
| 1 | Trichiasis ==> Conjunctivitis, unspecified | 1.603509 | 8.16327 | 50.00000 | 0.40000 | 50.00000 |
| 1 | Other specified disorders of eye and adnexa ==> Age-related macular degeneration | 1.012361 | 6.12245 | 33.33333 | 0.30000 | 33.33333 |
| 2 | Acute upper respiratory infections of unspecified site ==> Essential hypertension, unspecified | 8.640093 | 6.68317 | 16.66667 | 5.90000 | 15.25424 |
| 2 | Essential hypertension, unspecified ==> Other and unspecified hyperlipidaemia | 8.002158 | 15.47030 | 8.00000 | 14.10000 | 7.09220 |
| 2 | Chronic ischemic heart disease, unspecified ==> Diabetes mellitus | 7.202158 | 5.19802 | 21.42857 | 4.60000 | 19.56522 |
| 2 | Chronic ischemic heart disease, unspecified ==> Hypertensive heart disease, unspecified, without congestive | 6.988804 | 5.19802 | 16.66667 | 4.60000 | 15.21739 |
| 2 | Headache ==> Dizziness and giddiness | 5.999774 | 4.33168 | 17.14286 | 3.70000 | 16.21622 |
| 3 | Diabetes mellitus (no complication) ==> Chronic renal failure | 4.965272 | 11.18881 | 31.25000 | 11.70000 | 4.27350 |
| 3 | Urinary tract infection, site not specified ==> Vaginitis and vulvovaginitis, unspecified | 4.00394 | 16.08392 | 17.39130 | 2.30000 | 17.39130 |
| 3 | Hypertrophy (benign) of prostate ==> Hypertensive heart disease, benign without congestive heart | 3.973273 | 21.67832 | 12.90323 | 3.10000 | 12.90323 |
| 3 | Hypertrophy (benign) of prostate ==> Calculus of kidney | 3.840274 | 21.67832 | 12.90323 | 3.10000 | 12.90323 |
| 3 | Diabetes mellitus (no complication) ==> Essential hypertension, unspecified | 3.20137 | 11.18881 | 25.00000 | 11.70000 | 34.18803 |

Table 6
The result of the complete data

| Association rules | Pheromone | Support | Confidence |
|---|---|---|---|
| Asthma, unspecified, without mention of status asthmaticus ==> Allergic rhinitis case unspecified | 10.91202 | 2.7000% | 40.7407% |
| Essential hypertension ==> Hyperlipidemia | 9.998129 | 14.1000% | 7.0922% |
| Allergic rhinitis case unspecified ==> Acute bronchitis | 8.986932 | 4.3000% | 20.9302% |
| Allergic rhinitis case unspecified ==> Asthma, unspecified, without mention of status asthmaticus | 8.800364 | 4.3000% | 25.5814% |
| Menopausal syndrome ==> Osteoporosis | 7.998769 | 2.4000% | 33.3333% |
| Upper respiratory infection ==> Essential hypertension | 7.200365 | 5.9000% | 15.2542% |
| Chronic hepatitis ==> Diabetes mellitus | 6.400364 | 3.7000% | 21.6216% |
| Headache ==> vertigo | 5.999409 | 3.8000% | 15.7895% |
| Other and unspecified hyperlipidaemia ==> Hypertensive heart disease, unspecified, without congestive | 5.991731 | 4.0000% | 15.0000% |
| Upper respiratory infection ==> Headache | 5.952014 | 5.9000% | 10.1695% |

Acknowledgements

This study is partially supported by the National Science Council of Taiwan Government under Contract Number: NSC94-2416-H-027-001. Its support is appreciated.

References

Agrawal, R., Imielinski, T., & Swami, A. (1993a). Database mining: a performance perspective. IEEE Transactions on Knowledge and Data Engineering, 5(6), 914–925 (Special issue on Learning and Discovery in Knowledge-Based Databases).

Agrawal, R., Imielinski, T., & Swami, A. (1993b). Mining association rules between sets of items in large databases. In Proc. ACM-SIGMOD int. conf. management of data (SIGMOD’93), May, Washington, USA (pp. 207–216).

Agrawal, R., & Shafer, J. C. (1996). Parallel mining of association rules. IEEE Transactions on Knowledge and Data Engineering, 8(6), 962–969.

Bayardo, R. J. (1998). Efficiently mining long patterns from database. In Proceedings of the ACM SIGMOD international conference on management of data, Washington, USA (pp. 85–93).

Bellaachia, A., Portnoy, D., Chen, Y., & Elkahoun, A. G. (2002). E-CAST: a data mining algorithm for gene expression data. In 2nd workshop on data mining in bioinformatics, July (pp. 49–54).

Berkhin, P. (2002). Survey of clustering data mining techniques. Accrue Software, Inc. Available from http://www.accrue.com/products/ researchpapers.html.

Carpenter, G. A., & Grossberg, S. (1987). ART2: self-organization of stable category recognition codes for analog input pattern. Applied Optics, 26, 4919–4930.

Dorigo, M., & Gambardella, L. M. (1997). Ant colony system: a cooperative learning approach to the traveling salesman problem. IEEE Transactions on Evolutionary Computation, 1, 53–66.

Fayyad, U. (1997). Data mining and knowledge discovery in databases: implications for scientiﬁc databases. In Scientiﬁc and statistical database management, 1997 proceedings, ninth international conference (pp. 2–11).

Fayyad, U., Piatetsky-Shapiro, G., & Smyth, P. (1996). From data mining to knowledge discovery in database. American Association for Artiﬁcial Intelligence(August), 37–54.

Han, J., Pei, J., & Yin, Y. (2000). Mining frequent patterns without candidate generation. In Proceedings of the ACM SIGMOD international conference on management of data, Dallas, TX, USA (pp. 1–12).

Kohonen, T. (1991). Self-organizing maps: optimization approaches. In T. Kohonen, K. Makisara, O. Simula, & J. Kangas (Eds.), Artificial neural networks (pp. 981–990). Amsterdam, The Netherlands: Elsevier.

Koyuturk, M., Grama, A., & Ramakrishnan, N. (2005). Compression, clustering, and pattern discovery in very high-dimensional discrete-attribute data sets. IEEE Transactions on Knowledge and Data Engineering, 17(4), 447–461.

Krishna, K., & Murty, M. (1999). Genetic K-means algorithm. IEEE Transactions on Systems, Man, and Cybernetics—Part B: Cybernetics, 29(3), 433–439.

Kuo, R.J., Cha, C.L., Chou, S.H., Shih, C.W., & Chiu, C.Y. (2003). Integration of ant algorithm and case based reasoning for knowledge management. In Proceedings of International Conference on IJIE, November 10–12, 2003, Las Vegas, USA, in CD-R.

Kuo, R. J., & Shih, C. W. (accepted for publication). Association rule mining through the ant colony system for national health insurance research database in Taiwan. Computers and Mathematics with Applications.

Kuo, R. J., Wang, H. S., Hu, T.-L., & Chou, S. H. (2005). Application of ant K-means on clustering analysis in data mining. International Journal of Computers and Mathematics with Applications, 50, 1709–1724.

Lin, D., & Kedem, Z. (2002). Pincer-search: an efficient algorithm for discovering the maximum frequent set. IEEE Transactions on Knowledge and Data Engineering, 14(3), 553–566.

Maulik, H., & Bandyopadhyay, S. (2000). Genetic algorithm-based clustering technique. Pattern Recognition, 33, 1455–1465.

Peacock Peter, R. (1998). Data mining in marketing: Part 1. Marketing Management, 9–18.

Savasere, A., Omiecinski, E., & Navathe, S. (1995). An eﬃcient algorithm for mining associate rules in large databases. In Proceedings of the international conference on very large data bases, Zurich, Switzerland (pp. 432–444).

Shih, C. W. (2004). Applying ant colony system in data mining under multi-dimensional constraints. Master Thesis of National Taipei University of Technology, Taiwan, ROC.

Su, B. D. (2002). Discovering association rules through ant systems. Master Thesis of National Chin-Hwa University, Taiwan, ROC.

Su, M. C., & Chang, H. T. (2000). Fast self-organizing feature map algorithm. IEEE Transactions on Neural Networks, 11(3), 721–733.

Su, M. C., & Chang, H. T. (2001). A new model of self-organizing neural networks and its application in data projection. IEEE Transactions on Neural Networks, 12, 153–158.

Toivonen, H. (1996). Sampling large databases for association rules. In Proceedings of the international conference on very large data bases, Mumbai (Bombay), India (pp. 134–145).

Tsai, C. F., Wu, H. C., & Tsai, C. W. (2002). A new clustering approach for data mining in large databases. In Proceedings of the international symposium on parallel architectures, algorithms and networks (ISPAN'02) (pp. 1087–4089). IEEE Computer Society.

Tsay, Y. J., & Chang-Chien, Y. W. (2004). An eﬃcient cluster and decomposition algorithm for mining association rules. Information Sciences, 160, 161–171.

Witten, I. H., & Frank, E. (2000). Data mining: practical machine learning tools and techniques with java implementations. Morgan Kaufmann Publishers.

Yang, X. B., Sun, J. G., & Huang, D. (2002). A new clustering method based on ant colony algorithm. In Proceedings of the 4th world congress on intelligent control and automation, June (pp. 2222–2226).

Zhang, X., & Li, Y. (1993). Self-organizing map as a new method for clustering and data analysis. In Proc. IJCNN’93, int. joint conf. on neural networks (pp. 2448–2451).

Zaki, M. J. (2000). Scalable algorithms for association mining. IEEE Transactions on Knowledge and Data Engineering, 12(3), 372–390.