
ELSEVIER Information Processing Letters 52 (1994) 259-263


A fuzzy clustering algorithm for graph bisection

Jin-Tai Yan *, Pei-Yung Hsiao

Department of Computer and Information Science, National Chiao Tung University, 1001 Ta Hsueh Road, Hsinchu, Taiwan, ROC

Communicated by D. Gries; received 27 August 1993; revised 14 June 1994

Abstract

A fuzzy clustering algorithm based on global connection information is proposed to solve the graph bisection problem.

Keywords: Algorithms; Graph bisection; Fuzzy membership; Fuzzy clustering

1. Introduction

Let $G = (V, E)$ be an undirected connected edge-weighted graph. In general, a partition of $G$ is a partition of its vertex set $V$. Hence, if the ends of an edge $e$ in $E$ belong to two different subsets of the partition, $e$ is cut by the partition $(V_1, V_2)$ of $V$. The cut of a partition $(V_1, V_2)$ for graph $G$ is defined as the sum of the weights of all the edges cut by the partition

$$\mathrm{cut}(V_1, V_2) = \sum_{i \in V_1} \sum_{j \in V_2} c_{ij},$$

where $c_{ij}$ is the weight of the edge $\{i, j\}$ in $G$. Therefore, a min-cut partition for graph $G$ is a partition $(V_1, V_2)$ of $V$ with minimum cut. However, a min-cut partition generally yields an unbalanced partition, and an unbalanced partition is inefficient in many applications. Therefore, balanced-partition graph bisection is formulated as follows: A partition $(V_1, V_2)$ of $V$ is said to be a graph bisection (GB) if $|V_1| = |V_2|$ when $|V|$ is even or $|V_1| = |V_2| - 1$ when $|V|$ is odd.

* Corresponding author.
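To make the two definitions above concrete, the following minimal sketch computes the cut of a partition and checks the size constraint of a bisection. The edge-dictionary representation, the function names `cut_weight` and `is_bisection`, and the six-vertex example graph are assumptions made for illustration; they do not come from the paper.

```python
def cut_weight(edges, part_one):
    """Sum the weights of edges with exactly one endpoint in part_one."""
    side = set(part_one)
    return sum(w for (i, j), w in edges.items() if (i in side) != (j in side))


def is_bisection(vertices, part_one):
    """Check the size constraint: the parts differ by 0 (even |V|) or 1 (odd |V|).

    Assumes part_one is a subset of vertices; which part is called V1 is ignored.
    """
    n = len(vertices)
    k = len(set(part_one))
    return abs((n - k) - k) == n % 2


# Hypothetical example graph: two heavy triangles joined by one light edge.
EDGES = {
    (0, 1): 3, (1, 2): 3, (0, 2): 3,
    (3, 4): 3, (4, 5): 3, (3, 5): 3,
    (2, 3): 1,
}
VERTICES = range(6)

balanced_cut = cut_weight(EDGES, {0, 1, 2})   # only edge {2, 3} is cut
unbalanced_cut = cut_weight(EDGES, {0})       # edges {0, 1} and {0, 2} are cut
```

In this example the balanced split along the light edge has a smaller cut than the split that isolates a single vertex, but that is a property of the chosen weights, not a general rule.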

Due to the size constraint on the partition, GB is NP-complete [4]. Many heuristic approaches have been suggested for GB. In 1970, Kernighan and Lin [5] proposed a two-way "group-migration" improvement algorithm with a constraint on the subset size. They randomly started with two subsets and iteratively applied pairwise swapping on all pairs of nodes. Subsequently, Fiduccia and Mattheyses [3] reduced the time complexity to $O(P)$ with respect to the number of pins $P$. Saab and Rao [7,8] also proposed heuristic algorithms to solve GB. Generally speaking, the Kernighan-Lin based algorithm [1] is quite efficient, but it does not focus on the global connection information of the given graph. Therefore, it is difficult for it to obtain an optimal or near-optimal graph bisection.

In this paper, we propose a fuzzy clustering algorithm based on global connection information to solve GB. For an undirected connected edge-weighted graph, we introduce two groups of fuzzy memberships on the vertex set and define the clustering distance between any pair of vertices in the graph according to global connection. Based on fuzzy c-means clustering [2,6], two-way fuzzy graph clustering generates two groups of converged fuzzy memberships for the vertex set. Finally, according to the grades of the memberships, the vertices in the graph can be separated into two even subsets with minimum cut.

0020-0190/94/$07.00 © 1994 Elsevier Science B.V. All rights reserved

2. Fuzzy membership on vertices

Given is an undirected edge-weighted graph $G = (V, E)$, where $V = \{x_1, x_2, \ldots, x_n\}$ and $E = \{y_1, y_2, \ldots, y_m\}$. Let $\mathbb{R}^+$ be the set of nonnegative reals and $M_{2n}$ the set of real $2 \times n$ matrices. First, fuzzy memberships and fuzzy functions for the vertex set $V$ are introduced. Every fuzzy function $u_i : V \to [0,1]$ assigns grades of fuzzy memberships onto the vertices in $V$. Function $u_i$ is called the $i$th fuzzy set in $V$. There are infinitely many fuzzy sets associated with $V$. Every fuzzy set in $V$ represents a possible fuzzy clustering. Hence, for two-way partitioning, two fuzzy sets in $V$ will be applied to partition the vertex set $V$.

In order to partition $V$ by means of fuzzy sets, we need some clustering constraints between the two fuzzy sets. For example, for each $x_k$ in $V$, the sum of the fuzzy memberships in the two fuzzy sets is restricted to be 1. Formally, a two-way fuzzy clustering for two-way partitioning can be represented by a fuzzy matrix $U$ in $M_{2n}$ whose entries satisfy the following clustering constraints:

(1) Row $i$ of $U$, say $U_i = (u_{i1}, u_{i2}, \ldots, u_{in})$, exhibits fuzzy set $i$ of $V$.

(2) Column $j$ of $U$, say $U^j = (u_{1j}, u_{2j})$, exhibits the values of the 2 fuzzy sets of the $j$th datum in $V$.

(3) $u_{ik}$ shall be interpreted as $u_i(x_k)$, the value of fuzzy set $i$ for the $k$th datum.

(4) The sum of the membership values for each $x_k$ is 1 ($u_{1k} + u_{2k} = 1$, for all $k$).

(5) No fuzzy set is empty (row sum $\sum_k u_{ik} > 0$, for all $i$).

(6) No fuzzy set is all of $V$ (row sum $\sum_k u_{ik} < n$, for all $i$).

3. Clustering distance

Because graph partitioning is primarily a min-cut operation, a pair of connected vertices joined by an edge of larger weight should be placed in the same cluster so that the cut of the partition is reduced. Hence, for graph partitioning, the larger the weight of the edge, the smaller its clustering distance. A related clustering graph can be generated by modifying the edge weights of the original edge-weighted graph. For $G = (V, E)$, the related clustering graph $G^* = (V^*, E^*)$ is an undirected edge-weighted graph, where $V^* = V$, $E^* = E$, and the edge weight $c^*_{ij}$ of the edge $\{i, j\}$ is defined by

$$c^*_{ij} = 1 / c_{ij}.$$

Since there is no geometrical distance between any pair of vertices in a graph, it is critical for fuzzy clustering on a graph structure to estimate the clustering distance of any pair of vertices. Simply speaking, for an undirected edge-weighted graph, the distance of any pair of vertices in the graph is the distance of the shortest path. Furthermore, the clustering distance of any pair of vertices in the related clustering graph can be computed by running a shortest-path algorithm. The clustering distance will indicate the clustered degree of the pair of vertices in the same cluster. For any pair of vertices $i$ and $j$ in $G^*$, the clustering distance $d^*_{ij}$ between vertex $i$ and $j$ can be further obtained as

$$d^*_{ij} = \begin{cases} c^*_{ij} & \text{if } \{i, j\} \text{ is an edge in the graph}, \\ \mathrm{Short\_path}(i, j) & \text{if } \{i, j\} \text{ is not an edge in the graph}, \end{cases}$$

where $\mathrm{Short\_path}(s, t)$ is the sum of the weights on the shortest path from vertex $s$ to vertex $t$. Clearly, the clustering distance of all pairs of vertices in the graph must be obtained for fuzzy clustering. Hence, all clustering distances can be computed by running an all-pairs shortest-path algorithm.

4. Optimality of fuzzy clustering

Based on fuzzy c-means clustering, two-way fuzzy graph clustering can be transformed into a mathematical optimization problem for the mapped objective function. Using the fuzzy memberships of the vertex set and the clustering distance between any pair of vertices, the objective function for two-way fuzzy graph clustering can be formulated as follows: Let $U$ in $M_{2n}$ be a fuzzy graph partition of $V$, and let $v = (v_1, v_2)$ be the cluster centers. Objective function $J_i : M_{2n} \times V \to \mathbb{R}^+$ is defined as

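$$J_i(U, v_i) = \sum_{k=1}^{n} (u_{ik})^2 (d_{ik})^2.$$

Further, objective function $J : M_{2n} \times V^2 \to \mathbb{R}^+$ is defined as

$$J(U, v) = \sum_{i=1}^{2} J_i(U, v_i) = \sum_{k=1}^{n} \sum_{i=1}^{2} (u_{ik})^2 (d_{ik})^2,$$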

where $U$ in $M_{2n}$ is a fuzzy graph clustering of $V$, $v = (v_1, v_2)$ in $V^2$ are the cluster centers, and $d_{ik} = \| x_k - v_i \|$ is the clustering distance between $x_k$ and $v_i$. Since the cluster centers are themselves vertices of the graph, $d_{ik}$ is the clustering distance of Section 3 between the vertices $x_k$ and $v_i$. Note the several parameters in the definition of the objective function. The squared clustering distance is weighted by the second power of the membership of datum $x_k$ in cluster $i$. Thus, function $J$ is a squared error criterion, and its minimization produces a fuzzy clustering matrix $U$ that is optimal in a generalized least-squared error sense.
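The clustering distance of Section 3 and the objective function $J$ can be sketched together as follows. This is a minimal illustration under stated assumptions: the graph is given as a dictionary of edge weights on vertices `0..n-1`, Floyd-Warshall is used as the all-pairs shortest-path algorithm (the paper does not name one), and the names `clustering_distances` and `objective` are hypothetical.

```python
import math


def clustering_distances(n, edges):
    """Return the n x n matrix of clustering distances d* on the graph G*.

    edges maps a vertex pair (i, j) to its positive weight c_ij.
    """
    d = [[math.inf] * n for _ in range(n)]
    for i in range(n):
        d[i][i] = 0.0
    for (i, j), c in edges.items():
        d[i][j] = d[j][i] = 1.0 / c          # edge weight of G*: c*_ij = 1 / c_ij
    # Floyd-Warshall (an assumed choice of all-pairs shortest-path algorithm).
    for m in range(n):
        for i in range(n):
            for j in range(n):
                if d[i][m] + d[m][j] < d[i][j]:
                    d[i][j] = d[i][m] + d[m][j]
    # Following the case definition of d*, adjacent vertices keep c*_ij
    # even when a shorter multi-edge path exists.
    for (i, j), c in edges.items():
        d[i][j] = d[j][i] = 1.0 / c
    return d


def objective(U, centers, d):
    """Evaluate J(U, v): squared memberships times squared distances to the centers."""
    n = len(d)
    return sum(U[i][k] ** 2 * d[k][centers[i]] ** 2
               for i in range(2) for k in range(n))


# Hypothetical example: a path 0 - 1 - 2 with weights c_01 = 2 and c_12 = 1.
PATH_EDGES = {(0, 1): 2, (1, 2): 1}
D = clustering_distances(3, PATH_EDGES)
U = [[1.0, 1.0, 0.0],     # fuzzy set 1: vertices 0 and 1
     [0.0, 0.0, 1.0]]     # fuzzy set 2: vertex 2
J = objective(U, (0, 2), D)
```

Here the only nonzero term of `J` comes from vertex 1, which belongs to cluster 1 at distance $1/2$ from center 0.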

Since two-way fuzzy graph clustering can be transformed into a mathematical optimization problem for the mapped objective function, two-way fuzzy graph clustering can be stated as an approach that attempts to find a solution for the following mathematical program:

$$\text{Minimize } J(U, v) = \sum_{k=1}^{n} \sum_{i=1}^{2} (u_{ik})^2 (d_{ik})^2$$

$$\text{subject to } u_{1k} + u_{2k} = 1, \quad u_{ik} \ge 0, \quad 1 \le i \le 2, \quad 1 \le k \le n,$$

$x_i \in V$, $1 \le i \le n$, are vertices in the graph, $v_j \in V$, $1 \le j \le 2$, are unknown cluster centers,

$U = \{u_{ik}\}$ is a $2 \times n$ matrix, where $u_{ik}$ is referred to as the grade of membership of $x_k$ in row $i$ of matrix $U$.

Objective function $J$ is a nonlinear multi-variable function, and it is difficult for two-way fuzzy graph clustering to obtain an optimal matrix $U$. For minimizing $J$, iterative optimization on $U$ and clustering center $v$ can be applied to approximate the minima of the function. In the following lemmas, we discuss necessary conditions on $U$ and $v$ for the mapped objective function.

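Lemma 4.1. Consider the following problem:

$$\text{Minimize } J(U, v) = \sum_{k=1}^{n} \sum_{i=1}^{2} (u_{ik})^2 (d_{ik})^2$$

$$\text{subject to } u_{1k} + u_{2k} = 1, \quad 1 \le k \le n, \qquad u_{ik} \ge 0, \quad 1 \le i \le 2, \quad 1 \le k \le n,$$

where $v$ is fixed. Then $U = \{u_{ik}\}$ is a global minimum of the problem if for $1 \le k \le n$, if $x_k \ne v_1$ and $x_k \ne v_2$ then

$$u_{ik} = \frac{d_{1k}^2 \, d_{2k}^2}{d_{ik}^2 \, (d_{1k}^2 + d_{2k}^2)} \quad (\text{for } 1 \le i \le 2),$$

else

$$u_{ik} = \begin{cases} 1 & \text{if } x_k = v_i, \\ 0 & \text{if } x_k \ne v_i \end{cases} \quad (\text{for } 1 \le i \le 2).$$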
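The membership assignment stated in Lemma 4.1 can be sketched as follows, assuming a precomputed symmetric distance matrix `d` and two distinct center vertices. The function name `update_memberships` and the unit-weight path used as example data are assumptions for illustration only.

```python
def update_memberships(d, centers):
    """Return the 2 x n membership matrix of Lemma 4.1 for fixed, distinct centers."""
    n = len(d)
    U = [[0.0] * n, [0.0] * n]
    for k in range(n):
        if k in centers:
            # A vertex that is itself a center belongs fully to its own cluster.
            for i in range(2):
                U[i][k] = 1.0 if k == centers[i] else 0.0
        else:
            s = (d[k][centers[0]] ** 2, d[k][centers[1]] ** 2)
            for i in range(2):
                U[i][k] = s[0] * s[1] / (s[i] * (s[0] + s[1]))
    return U


# Hypothetical distances: a path 0 - 1 - 2 - 3 with unit clustering distances.
DIST = [[0, 1, 2, 3],
        [1, 0, 1, 2],
        [2, 1, 0, 1],
        [3, 2, 1, 0]]
memberships = update_memberships(DIST, (0, 3))
```

For vertex 1 the squared distances to the two centers are 1 and 4, so its grades are $4/5$ and $1/5$; each column of the result sums to 1, as constraint (4) of Section 2 requires.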

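Proof. By the definition of fuzzy membership, the columns in matrix $U$ are independent. Therefore,

$$\min\{J(U, v)\} = \min\left\{ \sum_{k=1}^{n} \sum_{i=1}^{2} (u_{ik})^2 (d_{ik})^2 \right\} = \sum_{k=1}^{n} \min\left\{ \sum_{i=1}^{2} (u_{ik})^2 (d_{ik})^2 \right\}.$$

As mentioned in the previous definitions, the restricted condition for each column in $U$ is $\sum_{i=1}^{2} u_{ik} = 1$. Further, the minimum function can be formulated as a function $F$ and solved by the Lagrange Multiplier method,

$$F(\lambda, u_{1k}, u_{2k}) = \sum_{i=1}^{2} (u_{ik})^2 (d_{ik})^2 - \lambda \left( \sum_{i=1}^{2} u_{ik} - 1 \right).$$

The first-order sufficient and necessary conditions for optimality are

$$\frac{\partial F}{\partial \lambda} = -\left( \sum_{i=1}^{2} u_{ik} - 1 \right) = 0, \qquad (1)$$

$$\frac{\partial F}{\partial u_{ik}} = 2 (u_{ik}) (d_{ik})^2 - \lambda = 0. \qquad (2)$$

By (2), we obtain

$$u_{ik} = \frac{\lambda}{2 (d_{ik})^2}. \qquad (3)$$

Substitute (3) into (1):

$$\frac{\lambda}{2} \left( \frac{1}{d_{1k}^2} + \frac{1}{d_{2k}^2} \right) = 1.$$

Therefore,

$$\frac{\lambda}{2} = \frac{d_{1k}^2 \, d_{2k}^2}{d_{1k}^2 + d_{2k}^2}. \qquad (4)$$

Substituting (4) into (3), we obtain

$$u_{ik} = \frac{1}{(d_{ik})^2} \cdot \frac{d_{1k}^2 \, d_{2k}^2}{d_{1k}^2 + d_{2k}^2} = \frac{d_{1k}^2 \, d_{2k}^2}{d_{ik}^2 \, (d_{1k}^2 + d_{2k}^2)}.$$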

The fuzzy membership assignment can be further classified into two different cases. If $x_k$ corresponds to $v_i$, the fuzzy membership of $x_k$ on cluster $i$ is 1 and that on the other cluster is 0. Thus, $u_{ik}$ is assigned as follows: for $1 \le k \le n$, if $x_k \ne v_1$ and $x_k \ne v_2$ then

$$u_{ik} = \frac{d_{1k}^2 \, d_{2k}^2}{d_{ik}^2 \, (d_{1k}^2 + d_{2k}^2)} \quad (\text{for } 1 \le i \le 2),$$

else

$$u_{ik} = \begin{cases} 1 & \text{if } x_k = v_i, \\ 0 & \text{if } x_k \ne v_i \end{cases} \quad (\text{for } 1 \le i \le 2). \qquad \square$$

Lemma 4.2. Consider the following problem:

$$\text{Minimize } J(U, v) = \sum_{k=1}^{n} \sum_{i=1}^{2} (u_{ik})^2 (d_{ik})^2$$

$$\text{subject to } u_{1k} + u_{2k} = 1, \quad 1 \le k \le n, \qquad u_{ik} \ge 0, \quad 1 \le i \le 2, \quad 1 \le k \le n,$$

where $U$ is fixed. Then $v = (v_1, v_2)$ is a global minimum of the problem if $v_i$ is in $V$ such that $J_i(U, v_i)$ is the least.

Proof. Due to $U$ being fixed, all rows in matrix $U$ are independent. Therefore,

$$\min\{J(U, v)\} = \min\{J_1(U, v_1) + J_2(U, v_2)\} = \min\{J_1(U, v_1)\} + \min\{J_2(U, v_2)\}.$$

Furthermore, the minimization of $J(U, v)$ will depend on the minimization of $\min\{J_1(U, v_1)\} + \min\{J_2(U, v_2)\}$. Thus, the center of cluster $i$ for $1 \le i \le 2$ can be assigned by $v_i$ such that $J_i(U, v_i)$ is the least. $\square$
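The center assignment of Lemma 4.2 amounts to an exhaustive search over the vertices for each row of $U$. A minimal sketch follows, assuming a precomputed distance matrix `d`; the function name `best_center` and the six-vertex path used as example data are hypothetical.

```python
def best_center(u_row, d):
    """Return the vertex v that minimizes J_i(U, v) for one row of U."""
    n = len(d)

    def row_cost(v):
        # J_i(U, v) = sum over k of u_ik^2 * d(x_k, v)^2
        return sum(u_row[k] ** 2 * d[k][v] ** 2 for k in range(n))

    return min(range(n), key=row_cost)


# Hypothetical distances: a path on six vertices with d(i, j) = |i - j|.
DIST = [[abs(i - j) for j in range(6)] for i in range(6)]
U = [[1.0, 1.0, 1.0, 0.0, 0.0, 0.0],
     [0.0, 0.0, 0.0, 1.0, 1.0, 1.0]]
centers = (best_center(U[0], DIST), best_center(U[1], DIST))
```

With these crisp memberships each center is the middle vertex of its own half of the path. Each search costs $O(n^2)$ operations with this direct evaluation; the two searches are independent, so nothing in this sketch prevents both from returning the same vertex on other inputs.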

5. Fuzzy clustering and graph bisection

According to Lemmas 4.1 and 4.2, two-way fuzzy graph clustering, via iterative optimization of $J(U, v)$ on $U$ and $v$, produces a feasible fuzzy graph partition of $V = \{x_1, x_2, \ldots, x_n\}$. The basic steps of the algorithm are as follows:

Algorithm Fuzzy Graph Clustering

1. Determine the clustering distance $d^*_{ij}$ between $x_i$ and $x_j$, $1 \le i, j \le n$.

2. Initialize an arbitrary partition and establish a fuzzy matrix $U$.

3. Calculate the centers $v = (v_1, v_2)$ using $U$ as follows:

(1) Determine $v_1$ such that $J_1(U, v_1)$ is the least,

(2) Determine $v_2$ such that $J_2(U, v_2)$ is the least.

4. Calculate a new fuzzy matrix $U'$ using $v = (v_1, v_2)$ as follows:

for $1 \le k \le n$, if $x_k \ne v_1$ and $x_k \ne v_2$ then

$$u'_{ik} := \frac{d_{1k}^2 \, d_{2k}^2}{d_{ik}^2 \, (d_{1k}^2 + d_{2k}^2)} \quad (\text{for } 1 \le i \le 2),$$

else

$$u'_{ik} := \begin{cases} 1 & \text{if } x_k = v_i, \\ 0 & \text{if } x_k \ne v_i \end{cases} \quad (\text{for } 1 \le i \le 2).$$

5. If $|u'_{ik} - u_{ik}| < \varepsilon$, for $1 \le i \le 2$, $1 \le k \le n$, then stop; otherwise, $U := U'$, and repeat at step 3.

After $U$ converges, two groups of fuzzy memberships can be generated for all the vertices in the graph. According to the grades of any one group of fuzzy memberships, a vertex ordering will be constructed by sorting the selected group of fuzzy membership, and all the vertices will be separated into two even subsets with minimum cut for graph bisection.
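The five steps and the final sorting step can be put together in a short end-to-end sketch. It is an illustration under several assumptions that the paper does not fix: the initial partition is encoded as a crisp 0/1 matrix, Floyd-Warshall computes the shortest paths, $v_2$ is chosen among vertices other than $v_1$ so that the two centers stay distinct, the loop is capped at `max_iter` rounds, and the $\lfloor n/2 \rfloor$ vertices with the highest grade in the first fuzzy set form $V_1$. All function and parameter names are hypothetical.

```python
import math


def clustering_distances(n, edges):
    """Step 1: clustering distances d* on G*, whose edge weights are 1 / c_ij."""
    d = [[math.inf] * n for _ in range(n)]
    for i in range(n):
        d[i][i] = 0.0
    for (i, j), c in edges.items():
        d[i][j] = d[j][i] = 1.0 / c
    for m in range(n):                      # Floyd-Warshall (assumed choice)
        for i in range(n):
            for j in range(n):
                if d[i][m] + d[m][j] < d[i][j]:
                    d[i][j] = d[i][m] + d[m][j]
    for (i, j), c in edges.items():         # adjacent vertices keep c*_ij
        d[i][j] = d[j][i] = 1.0 / c
    return d


def choose_centers(U, d):
    """Step 3: pick v_1 and v_2 minimizing J_1 and J_2 (v_2 != v_1 is an assumption)."""
    n = len(d)

    def cost(i, v):
        return sum(U[i][k] ** 2 * d[k][v] ** 2 for k in range(n))

    v1 = min(range(n), key=lambda v: cost(0, v))
    v2 = min((v for v in range(n) if v != v1), key=lambda v: cost(1, v))
    return v1, v2


def update_memberships(d, centers):
    """Step 4: membership assignment of Lemma 4.1 for fixed centers."""
    n = len(d)
    U = [[0.0] * n, [0.0] * n]
    for k in range(n):
        if k in centers:
            for i in range(2):
                U[i][k] = 1.0 if k == centers[i] else 0.0
        else:
            s = (d[k][centers[0]] ** 2, d[k][centers[1]] ** 2)
            for i in range(2):
                U[i][k] = s[0] * s[1] / (s[i] * (s[0] + s[1]))
    return U


def fuzzy_bisection(n, edges, initial_part, eps=1e-6, max_iter=100):
    """Run steps 1 to 5, then split the vertices by sorted membership grade."""
    d = clustering_distances(n, edges)
    first = set(initial_part)
    # Step 2: crisp 0/1 matrix built from the arbitrary initial partition.
    U = [[1.0 if k in first else 0.0 for k in range(n)],
         [0.0 if k in first else 1.0 for k in range(n)]]
    for _ in range(max_iter):
        new_U = update_memberships(d, choose_centers(U, d))
        converged = all(abs(new_U[i][k] - U[i][k]) < eps
                        for i in range(2) for k in range(n))
        U = new_U
        if converged:                        # Step 5: stopping test
            break
    order = sorted(range(n), key=lambda k: U[0][k], reverse=True)
    half = n // 2
    return set(order[:half]), set(order[half:]), U


# Hypothetical example: two heavy triangles joined by one light edge,
# started from a deliberately poor partition.
EDGES = {
    (0, 1): 3, (1, 2): 3, (0, 2): 3,
    (3, 4): 3, (4, 5): 3, (3, 5): 3,
    (2, 3): 1,
}
part_one, part_two, U_final = fuzzy_bisection(6, EDGES, initial_part={0, 1, 3})
```

On this small example the iteration settles on the endpoints of the light edge as the two centers and separates the two triangles. The sketch gives no guarantee that the resulting cut is minimum on other graphs.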

Acknowledgement

We thank David Gries and the anonymous referees for many helpful suggestions.

References

[1] T. Bui, C. Heigham, C. Jones and T. Leighton, Improving the performance of the Kernighan-Lin and simulated annealing graph bisection algorithms, in Proc. ACM/IEEE 26th Design Automation Conf. (1989) 775-778.

[2] R.L. Cannon, J.V. Dave and J.C. Bezdek, Efficient implementation of the fuzzy c-means clustering algorithms, IEEE Trans. Pattern Analysis Machine Intelligence 8 (1986) 248-255.

[3] C.M. Fiduccia and R.M. Mattheyses, A linear-time heuristic for improving network partitions, in Proc. ACM/IEEE 19th Design Automation Conf. (1982) 175-181.

[4] M.R. Garey and D.S. Johnson, Computers and Intractability (Freeman, San Francisco, CA, 1979).

[5] B.W. Kernighan and S. Lin, An efficient heuristic procedure for partitioning graphs, Bell Systems Tech. J. 49 (1970) 291-307.

[6] T. Kim, J.C. Bezdek and R.J. Hathaway, Optimality tests for fixed points of the fuzzy c-means algorithm, Pattern Recognition 21 (1988) 651-663.

[7] Y.G. Saab and V.B. Rao, Fast effective heuristics for the graph bisectioning problem, IEEE Trans. Computer-Aided Design 9 (1990) 91-98.

[8] Y.G. Saab and V.B. Rao, On the graph bisection problem, IEEE Trans. Circuits Systems I Fund. Theory Appl. 39 (1992) 760-762.
