Apply robust segmentation to the service industry using kernel induced fuzzy clustering techniques

Chih-Hsuan Wang *

Ming Chuan University, Department of Business Administration, Taiwan

Article info

Keywords: Robust classification, Robust segmentation, Kernel induced fuzzy clustering

Abstract

Understanding customers’ characteristics and desires is critical for modern CRM (customer relationship management). The easiest way for a company to achieve this goal is to target its customers and then serve them by providing a variety of personalized and satisfactory goods or services. In order to deliver the right products or services and allocate resources to specific targeted groups, many CRM researchers and practitioners have proposed a variety of approaches to effective customer segmentation. Unfortunately, most existing approaches are vulnerable to outliers in practice, and hence segmentation results may be unsatisfactory or seriously biased. In this study, a hybrid approach that incorporates kernel induced fuzzy clustering techniques is proposed to overcome these difficulties. Two real datasets, the WINE and the RFM, are used to validate the proposed approach. Experimental results show that the proposed approach can not only fulfill robust classification, but also achieve robust segmentation when applied to the noisy dataset.

1. Introduction

Today, the mass marketing approach cannot satisfy customers’ needs and diverse preferences, and most companies need to contact, serve, and manage their customers by providing a variety of attractive, personalized, and satisfactory goods or services. Market segmentation assumes that groups of customers with similar needs and purchasing patterns are likely to demonstrate a homogeneous response to marketing programs that target specific customer groups (Tsai & Chiu, 2004). With proper market segmentation, companies can deliver the right products or services to a targeted customer group and hence improve the efficiency of their marketing strategies. In order to understand their customers more clearly, companies may integrate an abundance of data collected from multiple channels. Typical sources include web browsing, purchasing patterns, complaints, demographics, and psychographic behavior.

According to the so-called “20–80” rule, a dramatic business improvement is often achieved by identifying the 20% of core customers and maximizing the attention applied to them, since they account for 80% of the company’s profit. Therefore, satisfying existing customers’ needs and building close relationships with them is imperative in modern electronic commerce. Owing to the rapid development of data warehousing and data-mining techniques, it is less costly to “up-sell” or “cross-sell” to existing customers. However, acquiring new customers is still difficult and expensive. From this perspective, companies need to understand their customers by analyzing customer information, to differentiate between various groups, to identify the most or the least valuable customers, and to increase customer loyalty by providing customized products and services (Ha, 2007).

One of the critical and challenging issues for successful market segmentation is the selection of the segmentation variables (Tsai & Chiu, 2004). In general, segmentation variables can be roughly classified into customer related variables (e.g., demographics, lifestyles) and product specific variables (e.g., purchasing behavior, transaction records). Despite the various types of segmentation variables, marketers continue to use RFM (recency, frequency, and monetary) models since they are easy to implement and easy for decision makers to understand (McCarty & Hastak, 2007). Specifically, “recency” denotes the length of time since the last purchase, “frequency” means the number of purchases within a certain period, and “monetary” represents the amount of money spent during a certain period. There are a variety of ways of applying the RFM model to customer segmentation, including K-means or fuzzy C means (FCM), artificial neural networks (ANN), and decision trees (DT). Unfortunately, most of the above-mentioned methods still have the following flaws:

* Current address: Department of Industrial Engineering & Management, National Chiao Tung University, Taiwan. Tel.: +886 3 5714261.


Expert Systems with Applications


The adverse effect of outliers is usually omitted or rarely investigated.

Segmentation results are very vulnerable to outliers or noisy data.

The determination of the number of clusters is ambiguous or inconsistent.

This motivates us to develop a hybrid approach that is capable of quickly detecting outliers and segmenting customers more effectively. The remainder of this paper is organized as follows. Section 2 discusses related work on outlier detection and robust segmentation. Section 3 reviews possibilistic clustering and probabilistic clustering techniques. Experimental results collected from two real datasets are illustrated in Section 4, and conclusions are drawn in Section 5.

2. Related works

In contrast to traditional data-mining tasks that aim to find a general pattern for the majority of input data, novelty detection attempts to find the rare class whose behavior is exceptional when compared to the rest of the input data (He, Xu, Huang, & Deng, 2004). Novelty detection, also called outlier detection, is the identification of “novel” or “unknown” events that an expert system is not aware of during training or testing. It is fundamental to a classification or identification system since outliers may indicate abnormal running conditions and lead to significant performance degradation. By contrast, clustering is an unsupervised process of dividing patterns into groups so that objects within a cluster show relatively high intra-similarity whereas objects in different clusters have very low inter-similarity (Jain, Murty, & Flynn, 1999).

Traditional clustering techniques can handle well-separated groups, but they do not deal well with overlapping or diverse clusters. Thus, support vector clustering and kernel based fuzzy clustering have been employed for segmenting complex customer profiles (Huang, Tzeng, & Ong, 2007; Wang, 2009). However, for most proposed schemes, performance evaluation with respect to outliers is scarce. In real implementations, outliers or noisy data samples often lead to biased clustering results, and hence managerial insights are difficult to obtain. Hence, in a noisy environment, developing a robust clustering approach becomes imperative for successful market segmentation.

2.1. Outlier detection

An outlier is an observation that appears to deviate markedly from the other members of the sample in which it occurs, or one that appears to be inconsistent with the remainder of the dataset (Barnett & Lewis, 1994). Barnett and Lewis also note that outliers may be considered as noisy points lying outside a set of defined clusters, or may be defined as points that lie outside of the set of clusters but are also different from the noise. An outlier may also denote an anomalous object or an intruder inside the system with malicious intentions. Detecting fraudulent usage of credit cards or mobile phones (fraud detection) and discovering potential criminal activities in electronic commerce (intrusion detection) are two typical applications. In addition, for loan application evaluation or public health benefit payments, an outlier identification system helps detect anomalies or abuse of social resources before any approval or payment.

In recent years, outlier detection has attracted much attention from both the statistics community and the data-mining research community, and many techniques have been proposed for this task. Three fundamental approaches are reviewed by Hodge and Austin (2004):

Determine outliers without any prior knowledge of the data: this is analogous to unsupervised clustering. This approach processes the data as a static distribution and flags the most remote points in the dataset as potential outliers.

Model both normality and abnormality: this is analogous to supervised classification and requires pre-labeled data, tagged as normal or abnormal. However, supervised classification is limited to known classes, and new examples derived from a previously unseen region may be classified incorrectly.

Model only normality: this is analogous to the semi-supervised paradigm, as only the normal class is taught and the system needs to learn to recognize abnormality. This technique is usually called novelty detection since it aims to define the boundary of normality instead of estimating the density of the dataset.

In addition, state-of-the-art reviews of statistical approaches and neural network approaches are presented by Markou and Singh (2003a, 2003b). To the best of our knowledge, most proposed schemes are based on supervised or parametric approaches; that is, they rely on labeled training data or a known data distribution. However, in real applications, either labeled training data or the underlying distribution may be unknown or difficult to obtain in advance. Therefore, an unsupervised and “distribution-free” RPCM is proposed to identify outliers in this study. Further details are given in Section 3.1.

2.2. Robust segmentation

Customer segmentation has become an important research issue in the field of electronic commerce because the identification of valuable segments gives market researchers the basis for effective targeting and prediction of potential customers (Kuo, Ho, & Hu, 2002). In particular, a popular data-mining technique called “clustering” is widely used for market or customer segmentation. Many clustering algorithms have been proposed to deal with different problems, including partitioning clustering, hierarchical clustering, neural network based clustering, mixture model based clustering, and kernel based clustering (Xu & Wunsch, 2005). Among these schemes, clustering techniques involving K-means or fuzzy C means are relatively popular due to their short computation time and ease of implementation. Fuzzy clustering has been adopted to group users of the on-line music industry and internet portals (Ozer, 2001, 2005). Traditional clustering techniques can handle well-separated groups, but they do not deal well with overlapping or diverse clusters. Hence, support vector clustering and kernel based fuzzy clustering have been employed for segmenting complex customer profiles (Huang et al., 2007).

Recently, soft computing based methods including self-organized feature maps (SOM) and adaptive resonance theory (ART) have been applied to many problems (Lee, Suh, Kim, & Lee, 2004; Shin & Sohn, 2004; Vellido, Lisboa, & Meehan, 1999). A SOM based approach is presented to segment the on-line shopping market (Vellido et al., 1999). A two-stage method that combines SOM with K-means clustering is introduced by Kuo et al. (2002) and Lee et al. (2004); in particular, SOM is used to determine the number of clusters and K-means is employed to find the final solution. Shin and Sohn (2004) use three clustering algorithms, K-means, FCM, and SOM, to segment Korean stock trading customers and conclude that FCM is the most robust approach. In addition, a laddering technique with an ART2 network is used to acquire customer requirements (Chen, Khoo, & Yan, 2002).


However, clustering methods need to be robust against outliers or noise if they are to be useful in practice (Davé & Krishnapuram, 1997; Davé & Sen, 2002; Lin & Chen, 2004). Robustness means that the performance of an algorithm should not deteriorate drastically due to noise or outliers. In fact, outliers, also commonly referred to as novelty instances, exist in many real databases and hence lead to unsatisfactory segmentation results. This paper attempts to bridge the research gap between outlier detection and robust segmentation by incorporating two robust clustering methods. In particular, the robust possibilistic clustering method (RPCM) is proposed to detect outliers and the robust fuzzy clustering method (RFCM) is used to segment customers.

3. Proposed techniques

Robust clustering techniques, namely RPCM (see Section 3.1) and RFCM (see Section 3.2), are proposed to detect outliers and to segment customers, respectively, for the purpose of target marketing. In addition, the number of clusters is determined by examining the significant eigenvalues of the affinity matrix (see Section 3.3).

3.1. Outlier detection using RPCM

The main idea originates from possibilistic C means (PCM) proposed by Krishnapuram and Keller (1993). PCM can be reformulated to find one single cluster instead of searching for multiple clusters. In contrast to fuzzy C means (FCM) (Bezdek, 1981), the membership of each data instance can be interpreted as a degree of “typicalness” rather than of “belongness”, because the constraint that couples memberships across clusters is removed. Given an input dataset of $n$ instances in $p$ dimensions, $X = \{x_1, x_2, \ldots, x_n\} \subset \mathbb{R}^p$, the objective function, the common centroid, and the typicalness update are given below.

$$J_{PCM} = \sum_{j=1}^{n} u_j^m \|x_j - a\|^2 + \eta \sum_{j=1}^{n} (1 - u_j)^m, \tag{1}$$

$$a = \frac{\sum_{j=1}^{n} u_j^m x_j}{\sum_{j=1}^{n} u_j^m}, \tag{2}$$

$$u_j = \left[ 1 + \left( \|x_j - a\|^2 / \eta \right)^{1/(m-1)} \right]^{-1}. \tag{3}$$

Here, $m > 1$ is known as the fuzzifier and $\eta$ is a regularization parameter. In the objective function, the first term requires the distance from the input data $x_j$ to the common centroid $a$ to be as small as possible, whereas the second term forces the typicalness $u_j$ to be as large as possible to avoid the trivial solution. Through iterative optimization, the common centroid and the typicalness values are easily obtained. Intuitively, data points with low typicalness are considered potential outliers.
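The alternating updates of Eqs. (2) and (3) can be sketched in a few lines. This is only an illustration under stated assumptions: the function names, the choice of the sample mean as the starting centroid, the stopping tolerance, and the toy data are not taken from the paper.

```python
import numpy as np


def pcm_typicalness(X, a, eta, m=2.0):
    """Eq. (3): typicalness of each row of X with respect to the centroid a."""
    d2 = np.sum((X - a) ** 2, axis=1)  # squared Euclidean distances
    return 1.0 / (1.0 + (d2 / eta) ** (1.0 / (m - 1.0)))


def pcm_single_cluster(X, eta, m=2.0, n_iter=100, tol=1e-8):
    """Alternate Eq. (3) and Eq. (2) until the common centroid stops moving."""
    X = np.asarray(X, dtype=float)
    a = X.mean(axis=0)  # assumed starting point; the text does not specify one
    for _ in range(n_iter):
        w = pcm_typicalness(X, a, eta, m) ** m
        a_new = w @ X / w.sum()  # Eq. (2): typicalness-weighted mean
        converged = np.linalg.norm(a_new - a) < tol
        a = a_new
        if converged:
            break
    return a, pcm_typicalness(X, a, eta, m)


# Toy data (illustrative only): five points near the origin and one remote point.
X_toy = np.array([[0, 0], [0.1, 0], [0, 0.1], [-0.1, 0], [0, -0.1], [5, 5]])
a_toy, u_toy = pcm_single_cluster(X_toy, eta=1.0)
```

On this toy data the remote point receives the lowest typicalness, which matches the reading of low typicalness as an outlier indicator.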

To enhance robustness against noise or outliers, the robust possibilistic clustering method (RPCM) is further proposed. Instead of using the Euclidean distance between the data instance and the common centroid, RPCM uses a kernelized distance to reconstruct the objective function, which makes the algorithm insensitive to noisy data. The mathematical forms are as follows:

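$$J_{RPCM} = \sum_{j=1}^{n} u_j^m \|\phi(x_j) - \phi(a)\|^2 + \eta \sum_{j=1}^{n} (1 - u_j)^m, \tag{4}$$

$$\|\phi(x_j) - \phi(a)\|^2 = K(x_j, x_j) + K(a, a) - 2K(x_j, a), \tag{5}$$

where $\phi$ is a nonlinear mapping from the input space to the feature space, $K(x_j, a) = \exp(-\|x_j - a\|^2 / \beta)$ represents the Gaussian kernel, and $\beta$ denotes the kernel width. Since $K(x, x) = 1$ for the Gaussian kernel, the distance in Eq. (5) reduces to $2(1 - K(x_j, a))$. Finally, the typicalness function and the common centroid are iteratively obtained as:

$$u_j = \left[ 1 + \left( 2(1 - K(x_j, a)) / \eta \right)^{1/(m-1)} \right]^{-1}, \tag{6}$$

$$a = \frac{\sum_{j=1}^{n} K(x_j, a)\, u_j^m x_j}{\sum_{j=1}^{n} K(x_j, a)\, u_j^m}. \tag{7}$$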

The proposed RPCM has two advantages. First, RPCM computes the outlier possibility of each instance in a “continuous” manner; in other words, the possibilistic membership of RPCM can be regarded as a measure of the possibility that an instance is an outlier, with low typicalness indicating a likely outlier. Second, RPCM is easy and fast to implement since it does not need to solve a quadratic optimization problem or perform statistical testing.
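A minimal sketch of the RPCM iteration in Eqs. (6) and (7) follows. The function names, the mean initialization, the tolerance, and the toy parameter values ($\eta = 1$, $\beta = 1$) are illustrative assumptions and not the settings used in this study.

```python
import numpy as np


def gaussian_kernel(X, a, beta):
    """K(x_j, a) = exp(-||x_j - a||^2 / beta) for every row x_j of X."""
    return np.exp(-np.sum((X - a) ** 2, axis=1) / beta)


def rpcm_typicalness(X, a, eta, beta, m=2.0):
    """Eq. (6): typicalness based on the kernel induced distance 2(1 - K)."""
    k = gaussian_kernel(X, a, beta)
    return 1.0 / (1.0 + (2.0 * (1.0 - k) / eta) ** (1.0 / (m - 1.0)))


def rpcm(X, eta, beta, m=2.0, n_iter=100, tol=1e-8):
    """Alternate Eq. (6) and Eq. (7) until the common centroid stops moving."""
    X = np.asarray(X, dtype=float)
    a = X.mean(axis=0)  # assumed starting point; the text does not specify one
    for _ in range(n_iter):
        # Eq. (7): each instance is weighted by K(x_j, a) * u_j^m
        w = gaussian_kernel(X, a, beta) * rpcm_typicalness(X, a, eta, beta, m) ** m
        a_new = w @ X / w.sum()
        converged = np.linalg.norm(a_new - a) < tol
        a = a_new
        if converged:
            break
    return a, rpcm_typicalness(X, a, eta, beta, m)


# Toy data (illustrative only): five points near the origin and one remote point.
X_toy = np.array([[0, 0], [0.1, 0], [0, 0.1], [-0.1, 0], [0, -0.1], [5, 5]])
a_toy, u_toy = rpcm(X_toy, eta=1.0, beta=1.0)
```

Because the kernel value of a remote instance is close to zero, its weight in Eq. (7) almost vanishes, so the centroid is determined by the bulk of the data. Note that with the Gaussian kernel the typicalness in Eq. (6) is bounded below by $[1 + (2/\eta)^{1/(m-1)}]^{-1}$, so outliers are the instances closest to that floor.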

3.2. Robust segmentation using RFCM

FCM (fuzzy C means) can be regarded as a soft extension of hard K-means. FCM assumes that the number of clusters $c$ is known a priori, and partitions a dataset of $n$ instances in $p$ dimensions, $X = \{x_1, x_2, \ldots, x_n\} \subset \mathbb{R}^p$, into $c$ fuzzy subsets by minimizing an objective function. The objective function, which is based on the Euclidean distance between the input data $x_j$ and the cluster centroid $c_i$, is as follows:

$$J_{FCM} = \sum_{i=1}^{c} \sum_{j=1}^{n} (u_{ij})^m \|x_j - c_i\|^2, \tag{8}$$

subject to the probability constraints $\sum_{i=1}^{c} u_{ij} = 1$, $1 \le j \le n$, and $0 < \sum_{j=1}^{n} u_{ij} < n$, $1 \le i \le c$. The membership function $u_{ij}$ and the cluster center $c_i$ are updated through alternating optimization using Eqs. (9) and (10):

$$u_{ij} = \frac{\|x_j - c_i\|^{-2/(m-1)}}{\sum_{k=1}^{c} \|x_j - c_k\|^{-2/(m-1)}}, \tag{9}$$

$$c_i = \frac{\sum_{j=1}^{n} (u_{ij})^m x_j}{\sum_{j=1}^{n} (u_{ij})^m}. \tag{10}$$

Here, $m > 1$ is known as the fuzzifier, and $m = 2$ is usually adopted. FCM is not robust to noise or outliers because it assigns relatively high membership values to outliers across the $c$ clusters. Hence, a robust fuzzy clustering method (RFCM) using a kernelized distance is proposed to segment customers effectively. The objective function of RFCM and its kernel induced distance measure between the input data $x_j$ and the cluster center $c_i$ are shown below:

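$$J_{RFCM} = \sum_{i=1}^{c} \sum_{j=1}^{n} (u_{ij})^m \|\phi(x_j) - \phi(c_i)\|^2 = 2 \sum_{i=1}^{c} \sum_{j=1}^{n} (u_{ij})^m (1 - K(x_j, c_i)), \tag{11}$$

$$\|\phi(x_j) - \phi(c_i)\|^2 = K(x_j, x_j) + K(c_i, c_i) - 2K(x_j, c_i). \tag{12}$$

By iteratively minimizing the objective function under the probability constraints $\sum_{i=1}^{c} u_{ij} = 1$, $1 \le j \le n$, and $0 < \sum_{j=1}^{n} u_{ij} < n$, $1 \le i \le c$, the membership function and the cluster center are obtained as:

$$u_{ij} = \frac{(1 - K(x_j, c_i))^{-1/(m-1)}}{\sum_{k=1}^{c} (1 - K(x_j, c_k))^{-1/(m-1)}}, \tag{13}$$

$$c_i = \frac{\sum_{j=1}^{n} K(x_j, c_i)\, (u_{ij})^m x_j}{\sum_{j=1}^{n} K(x_j, c_i)\, (u_{ij})^m}. \tag{14}$$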

The estimation of cluster centers is weighted by the kernel function, and hence the effect of outliers is significantly decreased.
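The RFCM updates in Eqs. (13) and (14) can be sketched as follows. The function names, the explicit `init_centers` argument, the small numerical guard, and the toy data are assumptions made for illustration; the paper does not describe its initialization or stopping rule.

```python
import numpy as np


def rfcm_memberships(X, centers, beta, m=2.0):
    """Eq. (13), together with the Gaussian kernel values it is built from."""
    d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)  # shape (n, c)
    K = np.exp(-d2 / beta)
    dist = np.maximum(1.0 - K, 1e-12)  # guard for a point lying exactly on a center
    inv = dist ** (-1.0 / (m - 1.0))
    return inv / inv.sum(axis=1, keepdims=True), K  # each row of U sums to 1


def rfcm(X, init_centers, beta, m=2.0, n_iter=200, tol=1e-8):
    """Alternate Eq. (13) and Eq. (14) from the given (assumed) initial centers."""
    X = np.asarray(X, dtype=float)
    centers = np.asarray(init_centers, dtype=float)
    for _ in range(n_iter):
        U, K = rfcm_memberships(X, centers, beta, m)
        W = K * U ** m  # kernel weighted memberships
        new_centers = (W.T @ X) / W.sum(axis=0)[:, None]  # Eq. (14)
        converged = np.linalg.norm(new_centers - centers) < tol
        centers = new_centers
        if converged:
            break
    U, _ = rfcm_memberships(X, centers, beta, m)
    return centers, U


# Toy data (illustrative only): two tight groups and one remote point.
X_toy = np.array([[0, 0], [0.1, 0], [0, 0.1], [0.1, 0.1],
                  [3, 3], [3.1, 3], [3, 3.1], [3.1, 3.1],
                  [10, 10]], dtype=float)
centers_toy, U_toy = rfcm(X_toy, init_centers=X_toy[[0, 4]], beta=1.0)
```

In this toy run the remote point still receives a membership of about 0.5 in each cluster, because the memberships must sum to one, but its kernel weight in Eq. (14) is practically zero, so it does not pull either center. A center initialized far from all data would make the denominator of Eq. (14) vanish, which this sketch does not handle.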


3.3. Cluster validity consideration

In general, determining the number of clusters in advance is very challenging, especially when the dataset includes diverse clusters or outliers. Most existing methods treat this problem as one of “cluster validity” and test various numbers of clusters within a specific range. Because different “cluster validity” indices are used, the optimal number of clusters is often inconsistent among approaches. Examining the largest eigenvalues of the affinity matrix is a good way to roughly estimate the number of clusters. If the dataset consists of clearly separated clusters, there should be a significant drop between the dominant and non-dominant eigenvalues of the affinity matrix. For more complex datasets, an alternative approach that relies on the structure of both eigenvalues and eigenvectors is suggested by Girolami (2002), who considers the dominant terms of the following:

$$\sum_{i=1}^{n} \lambda_i \left\{ \mathbf{1}_n^T u_i \right\}^2, \tag{15}$$

where $\mathbf{1}_n$ denotes an $n$-dimensional vector with all components equal to $1/n$, and $\lambda_i$ and $u_i$ are the associated eigenvalues and eigenvectors of the affinity matrix $A$. In simple terms, if there are $N$ distinct clusters embedded in the dataset, $N$ dominant terms contribute to the summation $\sum_{i=1}^{n} \lambda_i \{\mathbf{1}_n^T u_i\}^2$. In this study, the number of customer groups is estimated by an eigen-decomposition that uses both the eigenvalues and eigenvectors of the kernel affinity matrix.
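The terms of Eq. (15) can be computed directly from a Gaussian kernel affinity matrix, as sketched below. The function names and the rule for counting "dominant" terms (those above 1% of the total) are hypothetical choices for illustration; the paper identifies the dominant terms by inspection (see Fig. 1).

```python
import numpy as np


def girolami_terms(X, beta):
    """Terms lambda_i * (1_n^T u_i)^2 of Eq. (15), sorted in descending order."""
    X = np.asarray(X, dtype=float)
    n = len(X)
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=2)
    A = np.exp(-d2 / beta)         # Gaussian kernel affinity matrix
    lam, vecs = np.linalg.eigh(A)  # A is symmetric; columns of vecs are eigenvectors
    ones_n = np.full(n, 1.0 / n)   # the vector 1_n with all components 1/n
    terms = lam * (ones_n @ vecs) ** 2
    return np.sort(terms)[::-1]


def count_dominant_terms(terms, share=0.01):
    """Hypothetical rule: count the terms exceeding `share` of the total sum."""
    return int(np.sum(terms > share * terms.sum()))


# Toy data (illustrative only): two well separated groups of unequal size,
# so that the two dominant eigenvalues are clearly distinct.
X_toy = np.array([[0, 0], [0.1, 0], [0, 0.1], [0.1, 0.1], [0.05, 0.05],
                  [3, 3], [3.1, 3], [3, 3.1]], dtype=float)
terms_toy = girolami_terms(X_toy, beta=1.0)
n_clusters_toy = count_dominant_terms(terms_toy)
```

The kernel width $\beta$ strongly influences how many terms stand out, so in practice the count should be checked over a range of widths rather than read from a single setting.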

4. Experimental results

In this study, two real datasets are used to validate the proposed method: the first is the WINE dataset downloaded from http://archive.ics.uci.edu/ml/datasets/Wine and the other is the RFM dataset provided by a Taiwan Toyota automobile dealer. In addition, two evaluation metrics are used to test the performance of the proposed approach: “misclassification error” is used for the WINE dataset, and total “within-variance” (see Eqs. (17) and (18)) is used for the RFM dataset.

4.1. WINE dataset

The WINE dataset consists of 13 features and three classes. It was obtained by chemical analysis of wines produced from three different cultivars in Italy. Specifically, it contains 178 samples, with 59 in class 1, 71 in class 2, and 48 in class 3. The feature variances span a wide range, which indicates that outliers are likely to exist within the dataset. In this study, these potential outliers are intentionally kept to test the robustness of the proposed RFCM. A comparison between FCM and RFCM on this classification problem is shown in Table 1. The total error count for FCM is 13 (7.3%), whereas for RFCM it is only 7 (3.9%). Therefore, RFCM handles the noisy dataset better than FCM, since RFCM significantly reduces the effect of outliers.
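For reference, the error percentages in Table 1 follow from dividing the error counts by the 178 samples. The sketch below also shows one common way to count misclassifications for a clustering result, by mapping each cluster to its majority class; this mapping rule and the function names are assumptions, since the paper does not state how clusters were matched to classes.

```python
from collections import Counter


def misclassification_counts(true_labels, cluster_labels):
    """Map each cluster to its majority class, then count errors per true class."""
    majority = {}
    for k in set(cluster_labels):
        members = [t for t, c in zip(true_labels, cluster_labels) if c == k]
        majority[k] = Counter(members).most_common(1)[0][0]
    errors = Counter()
    for t, c in zip(true_labels, cluster_labels):
        if majority[c] != t:
            errors[t] += 1
    return dict(errors), sum(errors.values())


def error_rate_percent(n_errors, n_samples):
    """Error rate in percent, rounded to one decimal as in Table 1."""
    return round(100.0 * n_errors / n_samples, 1)


# Toy labels (illustrative only): one class A sample falls into the class B cluster.
per_class_toy, total_toy = misclassification_counts(
    ["A", "A", "A", "B", "B", "B"], [0, 0, 1, 1, 1, 1]
)

# Reported counts from Table 1 over the 178 WINE samples.
fcm_error = error_rate_percent(13, 178)
rfcm_error = error_rate_percent(7, 178)
```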

4.2. RFM dataset

A Taiwan Toyota automobile retailer provided a motor-maintenance dataset covering 162 distribution centers from January 2006 to December 2006. Three features, R (recency), F (frequency), and M (monetary), are used to segment customers and are standardized as follows (see Eq. (16)).

$$X_S = (X - X_{min}) / (X_{max} - X_{min}), \tag{16}$$

Table 1
Misclassification counts for the WINE dataset.

        Type A   Type B   Type C   Total   Error (%)
FCM     1        12       0        13      7.3
RFCM    1        6        0        7       3.9


where $X_S$, $X_{max}$, and $X_{min}$ denote the standardized, maximal, and minimal values of the corresponding feature $X$, respectively. Then, the standardized dataset is used as the input to the kernel eigen-decomposition to specify the number of clusters in advance. The result suggests that the optimal number of segments is 4 (see Fig. 1).
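The standardization of Eq. (16) is applied to each feature separately. A minimal sketch is given below; the function name, the handling of constant features, and the toy RFM values are illustrative assumptions.

```python
import numpy as np


def min_max_standardize(X):
    """Eq. (16) applied column by column, mapping each feature to [0, 1]."""
    X = np.asarray(X, dtype=float)
    x_min = X.min(axis=0)
    x_max = X.max(axis=0)
    # Assumption: a constant feature is mapped to 0 instead of dividing by zero.
    span = np.where(x_max > x_min, x_max - x_min, 1.0)
    return (X - x_min) / span


# Toy RFM rows (illustrative only): recency, frequency, monetary.
rfm_toy = np.array([[10, 2, 100],
                    [30, 4, 300],
                    [20, 6, 200]])
rfm_std = min_max_standardize(rfm_toy)
```

Since the Gaussian kernel depends on Euclidean distances, bringing the three features to a common range keeps the feature with the largest raw scale from dominating the kernel values.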

Secondly, 20 outliers are identified via RPCM and removed prior to clustering. Using the “RFM” features, RFCM is adopted for customer segmentation, and the resulting marketing insights are shown in Table 2. Group 2 and group 3 are the so-called gold segments because they visited the company “recently” and purchase “regularly”. By contrast, the other groups need to be enhanced to increase their purchasing frequency (group 4) or monetary value (group 1). More importantly, higher “recency” combined with lower “frequency” or lower “monetary” usually indicates a higher possibility of customer defection in the future. Hence, companies need to spend more effort on increasing customers’ satisfaction or loyalty, since this is much easier than acquiring new customers from their business competitors.

Furthermore, to evaluate the performance of various schemes, the objective of clustering can be simply described as partitioning a set of objects into groups such that the data within the same cluster are as homogeneous as possible and the data in different clusters are as heterogeneous as possible. Hence, the total “within-variance”, which describes how compactly the clusters are constructed, is suggested as a performance metric (Lee et al., 2004; Shin & Sohn, 2004; Vellido et al., 1999).

Specifically, the total “within-variance” with respect to the common centroid (COVA, see Eq. (17)) and with respect to the individual centroids (INVA, see Eq. (18)) are used for “outlier detection” and “robust segmentation”, respectively. In determining the common centroid for outlier detection, RPCM is more robust than PCM, as shown by its lower COVA (see Table 3). Similarly, in terms of lower INVA, RFCM outperforms FCM significantly when applied to the noisy dataset, but the difference between FCM and RFCM is not obvious when the outliers are removed.

$$COVA = \sum_{j=1}^{n} \|x_j - a\|^2, \tag{17}$$

$$INVA = \sum_{i=1}^{c} \sum_{j=1}^{n} \|x_j - c_i\|^2, \tag{18}$$

where $x_j$ represents an instance composed of RFM features, and $a$ and $c_i$ represent the common centroid of the whole dataset and the centroid of the $i$th segment, respectively.
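Both metrics are straightforward to compute once the centroids are known. In the sketch below, INVA is evaluated under the assumption that the inner sum of Eq. (18) runs only over the instances assigned to segment $i$, which the printed formula does not make explicit; the function names, the hard `labels` argument, and the toy data are likewise illustrative.

```python
import numpy as np


def cova(X, a):
    """Eq. (17): total squared distance of all instances to the common centroid a."""
    X = np.asarray(X, dtype=float)
    return float(np.sum((X - np.asarray(a, dtype=float)) ** 2))


def inva(X, centers, labels):
    """Eq. (18), assuming each instance is compared only with its own segment centroid."""
    X = np.asarray(X, dtype=float)
    centers = np.asarray(centers, dtype=float)
    labels = np.asarray(labels)
    return float(np.sum((X - centers[labels]) ** 2))


# Toy data (illustrative only): two pairs of points on a line.
X_toy = np.array([[0, 0], [2, 0], [10, 0], [12, 0]])
cova_toy = cova(X_toy, a=[6, 0])
inva_toy = inva(X_toy, centers=[[1, 0], [11, 0]], labels=[0, 0, 1, 1])
```

For fuzzy memberships, the hard labels can be taken as the cluster with the largest membership for each instance.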

5. Conclusions

Even though much work has been done in the area of customer segmentation, the evaluation of robust performance with respect to outliers has not so far received the attention that it deserves. In this paper, a hybrid approach that incorporates kernel induced fuzzy clustering techniques, namely RPCM and RFCM, is presented to detect outliers efficiently and to segment customers more effectively. Based on the typicalness degree of RPCM, the outlier possibility of each instance within the whole dataset is easily obtained without the need for labeled data samples in advance. Similarly, with the aid of the kernelized belongness membership, RFCM is better able to achieve robust segmentation when applied to the noisy dataset. Two real datasets, the WINE and the RFM, are used to validate the proposed approach. More importantly, the suggested method is promising for other business areas, such as financial fraud detection (Dorronsoro, Cinel, Sánchez, & Cruz, 1997), computer intrusion detection (Chen, Hsu, & Shen, 2005), and telecommunication churn management (Hadden, Tiwari, Roy, & Ruta, 2005).

Acknowledgements

The author would like to thank the Taiwan Toyota motor dealer for providing the RFM dataset of its downstream distribution centers. This research is financially supported by the National Science Council of Taiwan under Contract 97-2410-H-130-025.

References

Barnett, V., & Lewis, T. (1994). Outliers in statistical data. New York: John Wiley & Sons.

Bezdek, J. C. (1981). Pattern recognition with fuzzy objective function algorithms. New York: Plenum Press.

Chen, W. H., Hsu, S. H., & Shen, H. P. (2005). Application of SVM and ANN for intrusion detection. Computers & Operations Research, 32, 2617–2634.

Chen, C. H., Khoo, C. L., & Yan, W. (2002). A strategy for acquiring customer requirement patterns using laddering technique and ART2 neural network. Advanced Engineering Informatics, 16, 229–240.

Davé, R. N., & Krishnapuram, R. (1997). Robust clustering models: A unified view. IEEE Transactions on Fuzzy Systems, 5(2), 270–293.

Davé, R. N., & Sen, S. (2002). Robust fuzzy clustering of relational data. IEEE Transactions on Fuzzy Systems, 10(6), 713–726.

Dorronsoro, J. R., Cinel, F., Sánchez, C., & Cruz, C. S. (1997). Neural fraud detection in credit card operations. IEEE Transactions on Neural Networks, 8(4), 827–834.

Girolami, M. (2002). Mercer kernel based clustering in the feature space. IEEE Transactions on Neural Networks, 13(3), 780–784.

Ha, S. H. (2007). Applying knowledge engineering techniques to customer analysis in the service industry. Advanced Engineering Informatics, 21, 293–301.

Hadden, J., Tiwari, A., Roy, R., & Ruta, D. (2005). Customer assisted customer churn management: State-of-the-art and future trends. Computers & Operations Research, 34, 2902–2917.

He, Z., Xu, X., Huang, J. Z., & Deng, S. (2004). Mining class outliers: Concepts, algorithms and applications in CRM. Expert Systems with Applications, 27, 681–697.

Hodge, V. J., & Austin, J. (2004). Survey of outlier detection methodologies. Artificial Intelligence Review, 22, 85–126.

Huang, J. H., Tzeng, G. H., & Ong, C. S. (2007). Marketing segmentation using support vector clustering. Expert Systems with Applications, 32, 313–317.

Jain, A. K., Murty, M. N., & Flynn, P. J. (1999). Data clustering: A review. ACM Computing Survey, 3(3), 264–323.

Krishnapuram, R., & Keller, J. M. (1993). A possibilistic approach to clustering. IEEE Transactions on Fuzzy Systems, 1(1), 98–110.

Kuo, R. J., Ho, L. M., & Hu, C. M. (2002). Integration of self-organized feature map and K-means algorithm for market segmentation. Computers & Operations Research, 29, 1475–1493.

Lee, S. C., Suh, Y. H., Kim, J. K., & Lee, K. J. (2004). A cross-national market segmentation of online game industry using SOM. Expert Systems with Applications, 27, 559–570.

Lin, C. C., & Chen, A. P. (2004). Fuzzy discriminant analysis with outlier detection by genetic algorithm. Computers & Operations Research, 31, 877–888.

Markou, M., & Singh, S. (2003a). Novelty detection: A review, Part 1: Statistical approaches. Signal Processing, 83(12), 2481–2497.

Markou, M., & Singh, S. (2003b). Novelty detection: A review, Part 2: Neural network based approaches. Signal Processing, 83(12), 2499–2521.

Table 2
Marketing insights of four customer segments.

          Counts   Symbol    Recency   Frequency   Monetary   Strategy
Group 1   40       Diamond   High      Middle      Low        Enhancing
Group 2   28       Star      Low       High        High       Retention
Group 3   26       Circle    Low       Middle      Middle     Retention
Group 4   48       Cross     High      Low         Middle     Enhancing

Table 3
Performance evaluation for RPCM and RFCM.

                   Common within variance (COVA)   Individual within variance (INVA)
With outliers      PCM   2.09 × 10^16              FCM   1.76 × 10^16
With outliers      RPCM  1.06 × 10^16              RFCM  1.47 × 10^15
Without outliers   PCM   8.4 × 10^14               FCM   3.99 × 10^14
Without outliers   RPCM  7.6 × 10^14               RFCM  3.79 × 10^14


McCarty, J. A., & Hastak, M. (2007). Segmentation approaches in data-mining: A comparison of RFM, CHAID and logistic regression. Journal of Business Research, 20, 656–662.

Ozer, M. (2001). User segmentation of online music services using fuzzy clustering. Omega, 29, 193–206.

Ozer, M. (2005). Fuzzy c-means clustering and internet portals: A case study. European Journal of Operational Research, 164, 696–714.

Shin, H. W., & Sohn, S. Y. (2004). Segmentation of stock trading customers according to potential value. Expert Systems with Applications, 27, 27–33.

Tsai, C. Y., & Chiu, C. C. (2004). A purchase-based market segmentation methodology. Expert Systems with Applications, 27, 265–276.

Vellido, A., Lisboa, P. J. G., & Meehan, K. (1999). Segmentation of the on-line shopping market using neural networks. Expert Systems with Applications, 17, 303–314.

Wang, C. H. (2009). Outlier identification and market segmentation using kernel based clustering techniques. Expert Systems with Applications, 36, 3744–3750.

Xu, R., & Wunsch, D. (2005). Survey of clustering algorithms. IEEE Transactions on