
Chaotic particle swarm optimization for data clustering

Li-Yeh Chuang (a), Chih-Jen Hsiao (b), Cheng-Hong Yang (b,c,*)

(a) Department of Chemical Engineering, I-Shou University, Kaohsiung 80041, Taiwan
(b) Department of Electronic Engineering, National Kaohsiung University of Applied Sciences, Kaohsiung 80708, Taiwan
(c) Department of Network Systems, Toko University, Chiayi 61363, Taiwan

(*) Corresponding author at: Department of Electronic Engineering, National Kaohsiung University of Applied Sciences, Kaohsiung 80708, Taiwan. Tel.: +886 7 381 4526x5639; fax: +886 7 383 6844. E-mail addresses: chuang@isu.edu.tw (L.-Y. Chuang), 1097305142@cc.kuas.edu.tw (C.-J. Hsiao), chyang@cc.kuas.edu.tw (C.-H. Yang).

Expert Systems with Applications 38 (2011) 14555–14563. doi:10.1016/j.eswa.2011.05.027

Keywords: Data clustering; Chaotic map; Particle swarm optimization

Abstract

Data clustering is a popular analysis tool for data statistics in several fields, including pattern recognition, data mining, machine learning, image analysis and bioinformatics, in which the information to be analyzed can be of any distribution in size and shape. Clustering is effective as a technique for discerning the structure of and unraveling the complex relationships between massive amounts of data. An improved technique which combines chaotic map particle swarm optimization with an acceleration strategy is proposed, since the results of K-means, one of the most widely used clustering algorithms, can be jeopardized by improper choices made in the initialization stage. Accelerated chaotic particle swarm optimization (ACPSO) searches through arbitrary data sets for appropriate cluster centers and can effectively and efficiently find better solutions. Comparisons of the clustering performance are obtained from tests conducted on six experimental data sets; the algorithms compared with ACPSO include PSO, CPSO, K-PSO, NM-PSO, K-NM-PSO and K-means clustering. The robust performance of ACPSO indicates that this method is an ideal alternative for solving the data clustering problem.

© 2011 Elsevier Ltd. All rights reserved.

1. Introduction

Clustering analysis is a very popular data mining technique. It is the process of grouping a set of objects into clusters so that objects within the same cluster are similar to each other but are dissimilar to objects in other clusters (Han, Kamber, & Tung, 2001; Jain, Murty, & Flynn, 1999; Maimon & Rokach, 2005). When a clustering algorithm has been applied to a set of objects, the resulting clusters can be used to reveal inherent structures resident in the data. The purpose of cluster analysis is to classify objects into groups that have some meaning in the context of a particular problem. More specifically, a set of patterns, usually represented by multi-dimensional vectors in a predefined space, is clumped into clusters based on similarity. If the number of clusters, K, is known a priori, clustering may be formulated as the distribution of n objects in an N-dimensional space among K groups in such a way that the objects in the same group are more similar with regard to certain criteria than those in different groups (Anderberg, 1973). This involves the minimization of some extrinsic optimization criterion.

Many clustering algorithms are based on evolutionary computation techniques, e.g., genetic algorithms (Bandyopadhyay & Maulik, 2002; Murthy & Chowdhury, 1996); however, particle swarm optimization is seldom chosen for solving clustering problems (Paterlini & Krink, 2006). The typical work flow of genetic algorithms starts with an initialization of a set of candidate solutions for the optimization problem. The candidates are subsequently put through genetic operations such as selection, crossover and mutation, and evolve towards a better solution. Particle swarm optimization (PSO) is a population-based algorithm (Kennedy & Eberhart, 1995). It simulates the behavior of birds flocking or fish schooling in order to achieve a self-evolving system. PSO searches automatically for the optimum solution in the search space by using a search process that is not random. Depending on the nature of different problems, a fitness function decides the best way to conduct this search. The PSO algorithm has rapidly become popular and has been applied in electric power systems (AlRashidi & El-Hawary, 2009), data clustering (Omran, Engelbrecht, & Salman, 2005), biclustering of microarray data (Liu, Li, Hu, & Chen, 2009), engineering design (He & Wang, 2007), etc.

Chaos can be described as a bounded nonlinear system with deterministic dynamic behavior that has ergodic and stochastic properties (Schuster & Just, 2005). It is very sensitive to the initial conditions and the parameters used. In other words, the cause and effect of chaos are not proportional to small differences in the initial values. In what is called the "butterfly effect", small variations of an initial variable can result in huge differences in the solutions after some iterations. Mathematically, chaos is random and unpredictable, yet it also possesses an element of regularity (Alatas, Akin, & Ozer, 2009).

Although evolutionary computation techniques do eventually locate the desired solution, they are severely limited by the high computational cost associated with a slow convergence rate. PSO applied to clusters in multi-dimensional space has shown outstanding performance. However, the rate of convergence when searching for global optima is still not sufficient (Kao, Zahara, & Kao, 2008). For this reason, we combined chaotic map particle swarm optimization (CPSO) with an accelerated convergence rate strategy, and introduce this accelerated chaotic map particle swarm optimization (ACPSO) in this research. The choice of chaotic sequences as a legitimate complement to PSO is justified by their unpredictability, i.e., by the ergodic properties and spread-spectrum characteristics of these sequences (Alatas et al., 2009). We used these characteristics of chaotic maps and adaptive action to avoid entrapment of the PSO in a local optimum (Chuanwen & Bompard, 2005; Xiang, Liao, & Wong, 2007). This technique allows the ACPSO algorithm to cluster arbitrary data better than previous algorithms. Results of the experimental trials conducted on a variety of data sets taken from several real-life situations demonstrate that ACPSO is superior to the K-means, PSO, NM-PSO, CPSO, K-PSO and K-NM-PSO algorithms (Kao et al., 2008).

2. Methods

2.1. Particle swarm optimization

The robust and efficient PSO evolutionary computation learning algorithm was developed by Kennedy and Eberhart (1995). The original PSO is a population-based optimization technique, where a population is called a swarm. A swarm consists of n particles moving around in a D-dimensional search space.

The position of the ith particle can be represented by $x_i = (x_{i1}, x_{i2}, \ldots, x_{iD})$. The velocity of the ith particle can be written as $v_i = (v_{i1}, v_{i2}, \ldots, v_{iD})$. The positions and velocities of the particles are confined within $[X_{min}, X_{max}]^D$ and $[V_{min}, V_{max}]^D$, respectively. Each particle coexists and evolves simultaneously based on knowledge shared with its neighboring particles. It makes use of its own memory and knowledge gained by the swarm as a whole to find the best solution.

The best previously encountered position of the ith particle is denoted its individual best position $p_i = (p_{i1}, p_{i2}, \ldots, p_{iD})$, a value that is called $pbest_i$. The best value of all individual $pbest_i$ values is denoted the global best position $g = (g_1, g_2, \ldots, g_D)$ and called $gbest$. The PSO process is initialized with a population of random particles, and the algorithm then executes a search for optimal solutions by continuously updating generations. In each generation, the position and velocity of the ith particle are updated by $pbest_i$ and the $gbest$ of the swarm population. The update equations can be formulated as:

$v^{new}_{id} = w \cdot v^{old}_{id} + c_1 \cdot r_1 \cdot (pbest_{id} - x^{old}_{id}) + c_2 \cdot r_2 \cdot (gbest_d - x^{old}_{id})$   (1)

$x^{new}_{id} = x^{old}_{id} + v^{new}_{id}$   (2)

where $r_1$ and $r_2$ are random numbers between (0, 1), and $c_1$ and $c_2$ are acceleration constants that control how far a particle moves in a single generation. The velocities $v^{new}_{id}$ and $v^{old}_{id}$ denote the velocities of the new and old particle, respectively. $x^{old}_{id}$ is the current particle position, and $x^{new}_{id}$ is the new, updated particle position. The inertia weight $w$ controls the impact of the previous velocity of a particle on its current one (Shi & Eberhart, 1998). It is defined in Eq. (3):

$w = 0.5 + rand/2.0$   (3)

In Eq. (3), rand is a randomly generated number between zero and one. The pseudo-code of the PSO process is shown below.

PSO pseudo-code
01: begin
02:   Randomly initialize particle swarm
03:   while (number of iterations, or the stopping criterion, is not met)
04:     Evaluate fitness of particle swarm
05:     for n = 1 to number of particles
06:       Find pbest
07:       Find gbest
08:       for d = 1 to number of dimensions of particle
09:         Update the position of particles by Eqs. (1) and (2)
10:       next d
11:     next n
12:     Update the inertia weight value with Eq. (3)
13:   next generation until stopping criterion
14: end
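To make Eqs. (1)–(3) concrete, the following minimal Python sketch implements the loop above. It is our own illustrative code, not the authors' implementation; the swarm size, iteration count, acceleration constants c1 = c2 = 2.0, and the sphere fitness function used in the example are assumed values for demonstration.

import numpy as np

def pso(fitness, dim, n_particles=30, n_iters=100, x_bounds=(-5.0, 5.0)):
    # Minimal PSO following Eqs. (1)-(3); all parameter values are assumptions.
    rng = np.random.default_rng()
    x = rng.uniform(x_bounds[0], x_bounds[1], size=(n_particles, dim))  # positions
    v = np.zeros((n_particles, dim))                                    # velocities
    pbest = x.copy()                                                    # individual best positions
    pbest_fit = np.array([fitness(p) for p in x])
    gbest = pbest[pbest_fit.argmin()].copy()                            # global best position
    c1 = c2 = 2.0                                                       # acceleration constants (assumed)
    for _ in range(n_iters):
        w = 0.5 + rng.random() / 2.0                                    # inertia weight, Eq. (3)
        r1, r2 = rng.random((2, n_particles, dim))                      # random numbers in (0, 1)
        v = w * v + c1 * r1 * (pbest - x) + c2 * r2 * (gbest - x)       # velocity update, Eq. (1)
        x = x + v                                                       # position update, Eq. (2)
        fit = np.array([fitness(p) for p in x])
        better = fit < pbest_fit                                        # refresh pbest and gbest
        pbest[better], pbest_fit[better] = x[better], fit[better]
        gbest = pbest[pbest_fit.argmin()].copy()
    return gbest, pbest_fit.min()

best_x, best_f = pso(lambda p: float(np.sum(p ** 2)), dim=3)  # example: sphere function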

2.1.1. PSO clustering algorithm

Over the past several years, PSO has been proven to be both effective and fast for solving optimization problems. PSO shows promising performance on nonlinear function optimization and has thus received much attention (Liu, Qin, Shi, & Lu, 2007). It has been successfully applied in many research and application areas (AlRashidi & El-Hawary, 2009; He & Wang, 2007; Liu et al., 2009; Omran et al., 2005). In data clustering research, the clustering problem can be viewed as an optimization problem of locating the optimal centroids of each cluster rather than some non-optimal partitioning.

The PSO clustering algorithm, analogous to many clustering algorithms and partitioning methods, is employed to minimize intra-cluster distances as well as maximize distances between clusters by uncovering a proper set of cluster centroids that fulfills the given objectives. What separates the PSO algorithm from most other cluster partitioning methods is its capability to perform a global search. Most other partitioning methods perform only a local search, a situation under which the obtained solution is usually close to the solution obtained in the previous step. Take the K-means clustering algorithm for example (see the sketch below). This method initializes a search with a set of preliminary cluster centroids from randomly generated seeds, and then iteratively updates the positions of the cluster centroids at each step. This cluster-refining procedure suggests that the K-means algorithm only probes proximal regions around the randomly generated initial solution.
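For reference, here is a minimal sketch of the K-means refinement loop just described, under its usual formulation (random seeds, nearest-centroid assignment, mean update); the function name and parameters are ours, chosen for illustration.

import numpy as np

def kmeans(data, k, n_iters=100, seed=0):
    # Plain K-means: random seeds, then alternate nearest-centroid
    # assignment and mean updates, a purely local refinement of the seeds.
    rng = np.random.default_rng(seed)
    centroids = data[rng.choice(len(data), size=k, replace=False)].copy()
    for _ in range(n_iters):
        dists = np.linalg.norm(data[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)            # assign each point to its nearest centroid
        for j in range(k):
            members = data[labels == j]
            if len(members) > 0:                 # guard against empty clusters
                centroids[j] = members.mean(axis=0)
    return centroids, labels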

The partitioning performed by the PSO clustering algorithm is achieved by the integration of two procedures: a global search and a localized search, in which the refining process takes place. In the particle velocity updating Eq. (1) of the PSO algorithm, $v_i$ represents the initial velocity for particle i, $(r_1, r_2)$ are two random numbers generated at each iteration from a uniform distribution in the range (0, 1), and $w$ is the inertia weight factor necessary for diversifying the searching behavior of the particle swarm by altering the momentum. Particles can thus avoid entrapment in a local optimum. Since searches are performed concurrently by the swarm, the algorithm evaluates a wide variety of solutions by particles exploring the problem space. The global search step is performed in the initial iterations. A particle's velocity is gradually reduced after several iterations, and the particle's area of exploration shrinks as the particle approaches the optimal solution. Thus, the searching procedure gradually moves from the global search stage to the local refining stage. With different selections of parameters applied to the PSO algorithm, the transition time from the global search stage to the local refining stage can be controlled. By delaying the transition from the global search stage to the local refining stage, the possibility of finding the global optimal solution is increased.

In the PSO clustering algorithm, the problem space is modeled as data vectors in a multi-dimensional space. A single particle in the swarm represents one possible solution for clustering the data collection. Therefore, a swarm contains a group of candidate solutions for clustering the data collection. Each particle is represented by a matrix $x_i = (C_1, C_2, \ldots, C_j, \ldots, C_k)$, where $C_j$ specifies the vector of the jth cluster centroid and k is the number of clusters. The particle then updates the positions of the cluster centroids in each iteration according to knowledge acquired from its own experience and that of the particles in its neighborhood. To evaluate the performance of each solution, the fitness value is defined as the average distance of data points to the cluster centroid.
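As a concrete illustration of this encoding and fitness, the sketch below scores one particle, stored as a k-by-d matrix of centroids, by the average distance of each data point to its nearest centroid; the names are ours, not the paper's.

import numpy as np

def clustering_fitness(centroids, data):
    # centroids: (k, d) matrix x_i = (C_1, ..., C_k); data: (n, d) data set.
    # Distance of every point to every centroid, shape (n, k).
    dists = np.linalg.norm(data[:, None, :] - centroids[None, :, :], axis=2)
    # Average distance of data points to their assigned (nearest) centroid.
    return float(dists.min(axis=1).mean())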

2.2. Chaos theory

In the field of engineering, it is well recognized that chaos theory can be applied as a very useful technique in practical applications. A chaotic system can be described as a phenomenon in which a small change in the initial condition leads to a nonlinear change in future behavior; besides that, the system exhibits distinct behaviors under different phases, i.e. stable fixed points, periodic oscillations, bifurcations, and ergodicity (dos Santos Coelho & Herrera, 2007). Chaos (Lu, Zhang, & Ma, 2006) is also a common nonlinear phenomenon with much complexity and is similar to randomness. Chaos is typically highly sensitive to the initial values and thus provides great diversity based on the ergodic property of the chaos phase, which transits every state without repetition within certain ranges. It is generated through a deterministic iteration formula. Due to these characteristics, chaos theory can be applied in optimization.

One of the simplest maps, the logistic map, was brought to the attention of scientists by May (1976). It appears in the nonlinear dynamics of biological populations evidencing chaotic behavior. The logistic map can be described by the following equation:

$X_{(n+1)} = a \cdot X_{(n)} \cdot (1 - X_{(n)})$   (4)

In this equation, $X_{(n)}$ is the nth chaotic number, where n denotes the iteration number. Obviously, $X_{(n)} \in (0, 1)$ under the conditions that the initial $X_{(0)} \in (0, 1)$ and that $X_{(0)} \notin \{0.0, 0.25, 0.5, 0.75, 1.0\}$. In our experiments a = 4 has been used. As shown in Fig. 1a, the chaotic sequence value X is bounded within (0, 1) when a equals 4. Fig. 1b shows the chaotic X value of a logistic map for 100 iterations, where $X_{(0)} = 0.0001$.
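A minimal Python sketch of Eq. (4), reproducing the setting reported above (a = 4, X(0) = 0.0001, 100 iterations):

def logistic_map(x0=0.0001, a=4.0, n_iters=100):
    # Iterates X(n+1) = a * X(n) * (1 - X(n)), Eq. (4).
    # For a = 4 and X(0) in (0, 1) excluding {0.0, 0.25, 0.5, 0.75, 1.0},
    # the sequence stays bounded within (0, 1) and behaves chaotically.
    xs = [x0]
    for _ in range(n_iters):
        xs.append(a * xs[-1] * (1.0 - xs[-1]))
    return xs

print(logistic_map()[:5])  # first few chaotic values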

2.3. Chaotic particle swarm optimization (CPSO)

In PSO, the main advantage of chaotic optimization is the maintenance of population diversity in the problem of interest. According to studies conducted by Clerc and Kennedy (2002) and Trelea (2003), the parameters $w$, $c_1$, $c_2$, $r_1$ and $r_2$ are generally the key factors affecting the typical PSO convergence. Therefore, this paper incorporates chaotic mapping with ergodic, irregular, and stochastic properties into PSO to improve the global convergence. The use of chaotic sequences in PSO can facilitate the escape from local minima. The literature is rich in chaotic time-series sequences, such as the logistic map, tent map, Lozi map, Ikeda map, Hénon map, and others (Caponetto, Fortuna, Fazzino, & Xibilia, 2003; Coelho & Mariani, 2009; May, 1976; Singh & Sinha, 2009). In this study, we combine the logistic map (May, 1976) with particle swarm optimization in a process we named chaotic particle swarm optimization (CPSO).

In CPSO, sequences generated by the logistic map (May, 1976) substitute the random parameters $r_1$ and $r_2$ of PSO. The parameters $r_1$ and $r_2$ are modified by the logistic map based on the following equation:

$Cr^{(t+1)} = k \cdot Cr^{(t)} \cdot (1 - Cr^{(t)})$   (5)

In Eq. (5), $Cr^{(0)}$ is generated randomly for each independent run, with $Cr^{(0)}$ not being equal to {0, 0.25, 0.5, 0.75, 1} and k equal to 4. The driving parameter k of the logistic map controls the behavior of $Cr^{(t)}$ (as t goes to infinity). The behavior of the logistic map for various values of the parameter k is shown in Fig. 1a. For low values of k (k < 3), Cr eventually converges to a single number. When k = 3, Cr oscillates between two values. This characteristic change in behavior is called bifurcation. For k > 3, Cr goes through further bifurcations, eventually resulting in chaotic behavior. In fact, the bifurcation diagram is itself a fractal (Kuo, 2005). The velocity update equation for CPSO can be formulated as:

$v^{new}_{id} = w \cdot v^{old}_{id} + c_1 \cdot Cr \cdot (pbest_{id} - x^{old}_{id}) + c_2 \cdot (1 - Cr) \cdot (gbest_d - x^{old}_{id})$   (6)

In Eq. (6), Cr is a value produced by the logistic map of Eq. (5), bounded between 0.0 and 1.0. The pseudo-code of CPSO is shown below.

CPSO pseudo-code
01: begin
02:   Randomly initialize particle swarm
03:   Randomly generate Cr(0)
04:   while (number of iterations, or the stopping criterion, is not met)
05:     Evaluate fitness of particle swarm
06:     for n = 1 to number of particles
07:       Find pbest
08:       Find gbest
09:       for d = 1 to number of dimensions of particle
10:         Update the chaotic Cr value with Eq. (5)
11:         Update the position of particles with Eqs. (6) and (2)
12:       next d
13:     next n
14:     Update the inertia weight value with Eq. (3)
15:   next generation until stopping criterion
16: end
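The only change relative to the PSO sketch in Section 2.1 is the velocity rule: a single chaotic value Cr, advanced by Eq. (5), replaces the random pair (r1, r2) as in Eq. (6). A minimal sketch of one such update step, with our own naming:

import numpy as np

def cpso_step(v, x, pbest, gbest, cr, w, c1=2.0, c2=2.0, k=4.0):
    # Advance the chaotic sequence, Eq. (5).
    cr = k * cr * (1.0 - cr)
    # Velocity update with Cr and (1 - Cr) in place of r1 and r2, Eq. (6).
    v = w * v + c1 * cr * (pbest - x) + c2 * (1.0 - cr) * (gbest - x)
    return v, x + v, cr   # new velocity, new position (Eq. (2)), new Cr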

2.4. Accelerated chaotic particle swarm optimization (ACPSO)

In order to increase the particle distribution, we used a chaotic map to allow the PSO to find optimal solutions. We also increased the speed of convergence of PSO by incorporating an acceleration strategy in a process we call ACPSO. The acceleration strategy is similar to the K-means algorithm, but it does not perform a complete K-means pass. The strategy focuses on the part of the calculation where particles cluster around the arithmetic average of the center: it calculates the arithmetic mean of each cluster, and part of the cluster centers contained in the original particle are then replaced by these means. We thereby reduce the sum of the intra-cluster distances to find optimal solutions and expedite the rate of convergence of CPSO.
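The paper gives no pseudo-code for this acceleration step, so the following is only a plausible sketch of the strategy as described above: points are assigned to a particle's centroids, the arithmetic means of the resulting clusters are computed, and part of the original centroids are replaced by those means. The replacement count n_replace is a hypothetical parameter of our own.

import numpy as np

def accelerate(centroids, data, n_replace=1, rng=None):
    # K-means-like acceleration (not a complete K-means pass): replace
    # part of a particle's centroids with the arithmetic mean of the
    # points currently assigned to them, shrinking intra-cluster distance.
    if rng is None:
        rng = np.random.default_rng()
    dists = np.linalg.norm(data[:, None, :] - centroids[None, :, :], axis=2)
    labels = dists.argmin(axis=1)                       # nearest-centroid assignment
    new_centroids = centroids.copy()
    for j in rng.permutation(len(centroids))[:n_replace]:
        members = data[labels == j]
        if len(members) > 0:                            # skip empty clusters
            new_centroids[j] = members.mean(axis=0)
    return new_centroids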


5. Conclusions

In this paper, we employed the ACPSO algorithm to cluster data vectors for six data sets. ACPSO uses minimum intra-cluster distances as a metric, and searches for robust data cluster centers in an N-dimensional Euclidean space. Under the same metric, PSO, NM-PSO, K-PSO, and K-NM-PSO need more iterations to arrive at a global optimum than ACPSO. The K-means algorithm has a tendency to get stuck in a local optimum, depending on the choice of the initial cluster centers. Although the method introduced utilizes neither K-means nor local search, the results obtained are better than the results of other hybrid algorithms in the literature. The experimental results indicate that ACPSO reaches a minimal error rate faster than the other methods, and thus reduces the computational cost. The ACPSO algorithm developed in this paper can be applied when the number of clusters is known a priori and the clusters are clearly defined.

Acknowledgement

This work was partly supported by the National Science Council in Taiwan under Grants 2221-E-214-050-MY3, NSC96-2622-E-214-004-CC3, and NSC97-2622-E-151-008-CC2.

References

Alatas, B., Akin, E., & Ozer, A. B. (2009). Chaos embedded particle swarm optimization algorithms. Chaos, Solitons & Fractals, 40, 1715–1734.

AlRashidi, M. R., & El-Hawary, M. E. (2009). A survey of particle swarm optimization applications in electric power systems. IEEE Transactions on Evolutionary Computation, 13, 913–918.

Anderberg, M. R. (1973). Cluster analysis for applications. New York: Academic Press.

Bandyopadhyay, S., & Maulik, U. (2002). An evolutionary technique based on K-Means algorithm for optimal clustering in $R^N$. Information Science, 146, 221–237.

Caponetto, R., Fortuna, L., Fazzino, S., & Xibilia, M. G. (2003). Chaotic sequences to improve the performance of evolutionary algorithms. IEEE Transactions on Evolutionary Computation, 7, 289–304.

Chuanwen, J., & Bompard, E. (2005). A self-adaptive chaotic particle swarm algorithm for short term hydroelectric system scheduling in deregulated environment. Energy Conversion and Management, 46, 2689–2696.

Clerc, M., & Kennedy, J. (2002). The particle swarm-explosion, stability, and convergence in a multidimensional complex space. IEEE Transactions on Evolutionary Computation, 6, 58–73.

Coelho, L. d. S., & Mariani, V. C. (2009). A novel chaotic particle swarm optimization approach using Hénon map and implicit filtering local search for economic load dispatch. Chaos, Solitons & Fractals, 39, 510–518.

dos Santos Coelho, L., & Herrera, B. M. (2007). Fuzzy identification based on a chaotic particle swarm optimization approach applied to a nonlinear yo-yo motion system. IEEE Transactions on Industrial Electronics, 54, 3234–3245.

Han, J., Kamber, M., & Tung, A. K. H. (2001). Spatial clustering methods in data mining: A survey. London: Taylor & Francis.

He, Q., & Wang, L. (2007). An effective co-evolutionary particle swarm optimization for constrained engineering design problems. Engineering Applications of Artificial Intelligence, 20, 89–99.

Jain, A. K., Murty, M. N., & Flynn, P. J. (1999). Data clustering: A review. ACM Computing Surveys, 31, 264–323.

Kao, Y.-T., Zahara, E., & Kao, I. W. (2008). A hybridized approach to data clustering. Expert Systems with Applications, 34, 1754–1762.

Kennedy, J., & Eberhart, R. (1995). Particle swarm optimization. In IEEE international joint conference on neural network (Vol. 4, pp. 1942–1948).

Kuo, D. (2005). Chaos and its computing paradigm. IEEE Potentials, 24, 13–15.

Liu, J., Li, Z., Hu, X., & Chen, Y. (2009). Biclustering of microarray data with MOSPO based on crowding distance. BMC Bioinformatics, 10, S9.

Liu, Y., Qin, Z., Shi, Z., & Lu, J. (2007). Center particle swarm optimization. Neurocomputing, 70, 672–679.

Lu, H., Zhang, H. M., & Ma, L. H. (2006). A new optimization algorithm based on chaos. Journal of Zhejiang University Science A, 7, 539–542.

Maimon, O. Z., & Rokach, L. (2005). Data mining and knowledge discovery handbook. New York: Springer.

May, R. M. (1976). Simple mathematical models with very complicated dynamics. Nature, 261, 459–467.

Murthy, C. A., & Chowdhury, N. (1996). In search of optimal clusters using genetic algorithms. Pattern Recognition Letters, 17, 825–832.

Nelder, J. A., & Mead, R. (1965). A simplex method for function minimization. The Computer Journal, 7, 308–313.

Omran, M., Engelbrecht, A. P., & Salman, A. (2005). Particle swarm optimization method for image clustering. International Journal on Pattern Recognition and Artificial Intelligence, 19, 297–322.

Paterlini, S., & Krink, T. (2006). Differential evolution and particle swarm optimisation in partitional clustering. Computational Statistics and Data Analysis, 50, 1220–1247.

Schuster, H. G., & Just, W. (2005). Deterministic chaos: An introduction. Weinheim: Wiley-VCH Verlag GmbH.

Shi, Y., & Eberhart, R. (1998). A modified particle swarm optimizer. In The 1998 IEEE international conference on evolutionary computation proceedings (pp. 69–73).

Singh, N., & Sinha, A. (2009). Chaos based multiple image encryption using multiple canonical transforms. Optics & Laser Technology, 42, 724–731.

Trelea, I. C. (2003). The particle swarm optimization algorithm: Convergence analysis and parameter selection. Information Processing Letters, 85, 317–325.

Xiang, T., Liao, X., & Wong, K. (2007). An improved particle swarm optimization algorithm combined with piecewise linear chaotic map. Applied Mathematics and Computation, 190, 1637–1645.
