(1) Master's Thesis, Institute of Computer Science and Information Engineering, National University of Kaohsiung. A Study on Mining Fuzzy Concept Drift from Quantitative Databases (數量型資料庫中模糊概念轉移探勘之研究). Student: Yan-Kang Li. Advisors: Dr. Tzung-Pei Hong and Dr. Min-Thai Wu. July 2015.

(2) 致謝 (Acknowledgments). Although my master's studies lasted only two short years, they taught me ideas and attitudes that will serve me for a lifetime. First, I sincerely thank my two advisors, Dr. Tzung-Pei Hong and Dr. Min-Thai Wu; their patient guidance and encouragement allowed me to grow greatly as a researcher and to complete this thesis, and I express my heartfelt gratitude here. I also thank my oral defense committee members, Professor 林威成 and Professor 藍國誠, for taking time out of their busy schedules to attend the defense and for their many constructive and forward-looking suggestions, which made this thesis richer and more complete. In addition, I thank my companions in the Artificial Intelligence Laboratory, 菊添, 正男, 偉銘, 宥傑, 齡儀, 信守, 李昱, 睿淇, 育瑒, 志峰, 元慶 and 昆毅, as well as 譯升 and 耀皚 of the Intelligent Embedded Systems Laboratory; your company made my graduate life more colorful. I also deeply thank my good friends 柯淵, 士誠 and 婷伃, who stood by me and encouraged me through everything and always helped me when I ran into difficulties. I thank my family: without you I would not be who I am today, and your quiet support and care allowed me to complete my master's degree without worry. Finally, I thank 孟儒; although it was only one short year, your care and companionship gave my life a wonderful moment. Yan-Kang Li, August 2015.

(3) A Study on Mining Fuzzy Concept Drift from Quantitative Databases. Advisor: Dr. Tzung-Pei Hong, Institute of Computer Science and Information Engineering, National University of Kaohsiung. Co-Advisor: Dr. Min-Thai Wu, Institute of Computer Science and Information Engineering, National University of Kaohsiung. Student: Yan-Kang Li, Institute of Computer Science and Information Engineering, National University of Kaohsiung. 摘要 (Abstract). In recent years, concept drift has become a popular research topic in data mining because of its wide range of practical applications. Concept drift concerns how data change over time or across locations, so it can be used to discover customers' purchasing behavior from databases collected at different times or places. In the past, many studies focused on concept drift in traditional transaction databases; however, the purchase quantities recorded in transaction databases usually provide more useful information about concept drift. In this thesis we therefore adopt fuzzy theory to handle concept drift in quantitative databases and propose two approaches. The first approach finds the concept drift of membership functions: in fuzzy theory, the quantities purchased at different times or places have their own suitable membership functions, which may change with different needs. The second approach uses fixed membership functions to find the concept drift of association rules and compares two association rules through their fuzzy values to identify important concept drift.

(4) Two kinds of concept drift, emerging patterns and unexpected changes, are used to discover whether customers' shopping behavior has changed. Finally, experiments under different parameter settings are conducted to show the performance of the proposed approaches. The experimental results show that the proposed fuzzy concept-drift mining effectively discovers customers' shopping behavior. Keywords: concept drift, fuzzy theory, fuzzy data mining, membership functions, fuzzy association rules.

(5) A Study on Mining Fuzzy Concept Drift from Quantitative Databases. Advisor: Dr. Tzung-Pei Hong, Institute of Computer Science and Information Engineering, National University of Kaohsiung. Co-Advisor: Dr. Min-Thai Wu, Institute of Computer Science and Information Engineering, National University of Kaohsiung. Student: Yan-Kang Li, Institute of Computer Science and Information Engineering, National University of Kaohsiung. ABSTRACT. In recent years, concept drift has become a popular research topic in data mining because of its wide range of practical applications. Concept drift refers to a significant change of concepts over time or across locations. It can thus be used to derive the purchasing behavior of customers from databases collected at different times or locations. In the past, most research focused on concept drift in traditional transaction databases. However,

(6) quantities of purchased items usually exist in databases and can provide more information about concept drift. Therefore, in this thesis we adopt fuzzy sets to handle concept drift in quantitative databases. Two approaches are proposed. The first approach is designed to find the concept drift of membership functions. Since the linguistic interpretation of fuzzy terms may vary with time and location, appropriate membership functions for representing the quantities of purchased items may also need to evolve. The second approach finds the fuzzy concept drift of association rules with fixed membership functions. Fuzzy association rules with fuzzy values are compared to find important concept drift. Two kinds of concept drift, emerging patterns and unexpected changes, are found to represent the behavior change of customers. Finally, experiments under different parameter settings are conducted to show the performance of the proposed approaches. The results show that the proposed fuzzy concept-drift mining can effectively find customer shopping behavior. Keywords: concept drift, fuzzy set, fuzzy data mining, membership functions, fuzzy association rules

(7) Content 致謝................................................................................................................................. i 摘要................................................................................................................................ii ABSTRACT.................................................................................................................. iv Content .......................................................................................................................... vi List of Figures ............................................................................................................ viii List of Tables ................................................................................................................. ix CHAPTER 1 INTRODUCTION ................................................................................... 1 1.1. Background and Motivation........................................................................ 1. 1.2. Contribution ................................................................................................ 4. 1.3. Thesis Organization .................................................................................... 6. CHAPTER 2. REVIEW OF RELATED WORK ......................................................... 7. 2.1. Concept Drift............................................................................................... 7. 2.2. Apriori Algorithm ..................................................................................... 11. 2.3. Fuzzy Data Mining.................................................................................... 12. 2.4. Fuzzy C-means .......................................................................................... 13. 2.5. Fuzzy Membership Functions ................................................................... 15. CHAPTER 3 3.1. CONCEPT DRIFT FOR FUZZY MEMBERSHIP FUNCTIONS ..... 19. Definitions and Review Fuzzy Membership Functions ............................ 19 3.1.1. Fuzzy Membership Functions by Fuzzy C-means ....................... 19. 3.1.2. Concept-drift Patterns for Fuzzy Membership Functions............ 21. 3.2. The Proposed CDMF Mining Algorithm .................................................. 25. 3.3. Experimental Results ................................................................................ 28. CHAPTER 4 4.1. CONCEPT DRIFT FOR FUZZY ASSOCIATION RULES .............. 35. Definitions and Review Fuzzy Association Rules .................................... 35 vi.

(8) 4.1.1. Fuzzy Membership Functions by Fuzzy C-means ....................... 35. 4.1.2. Generating Fuzzy Association Rules by Fuzzy Apriori ............... 36. 4.1.3. Concept-drift Patterns for Fuzzy Association Rules .................... 39. 4.2. The Proposed CDFAR Mining Algorithm ................................................. 45. 4.3. Experimental Results ................................................................................ 48. CHAPTER 5. CONCLUSION AND FUTURE WORK ........................................... 52. REFERENCES ............................................................................................................ 54. vii.

(9) List of Figures Figure 2.1: Membership functions ............................................................................... 16 Figure 2.2: 2-tuple linguistic for membership functions ............................................. 16 Figure 3.1: Membership functions of an item Ij........................................................... 20 Figure 3.2: Membership functions set that apple was purchased in a year. ................. 21 Figure 3.3: Membership functions for the purchasing amount of apples in this year. . 23 Figure 3.4: The number of concept-drift item by the algorithms along with different linguistic term threshold in database ......................................................... 29 Figure 3.5: The execution efficiency of the four different time and location with different linguistic term threshold in database. ......................................... 30 Figure 3.6: The number of concept-drift item by the algorithms along with different membership functions threshold in database. ........................................... 31 Figure 3.7: The execution efficiency of the four different time and location with different membership functions threshold in database.............................. 32 Figure 3.8: The number of concept-drift item by the algorithms along with different support threshold in database. ................................................................... 33 Figure 3.9: The execution efficiency of the four different time and location with different support threshold in database. .................................................... 34 Figure 4.1: Membership functions set that apple. ........................................................ 36 viii.

(10) List of Tables Table 2.1: The set of three quantitative transaction data for this example ................... 18 Table 2.2: The fuzzy sets converted for transactions ................................................... 18 Table 4.1: An example of a transaction database. ........................................................ 36 Table 4.2: Table 4.1 after converting the fuzzy database ............................................. 37 Table 4.3: The first case of an emerging pattern for two fuzzy association rules ........ 43 Table 4.4: The second case of an emerging pattern for two fuzzy association rules ... 43 Table 4.5: The third case of an emerging pattern for two fuzzy association rules ....... 44 Table 4.6: The first case of the unexpected change for two fuzzy association rules.... 45 Table 4.7: The second case of the unexpected change for two fuzzy association rules ..................................................................................................................... 45 Table 4.8: The number of fuzzy concept-drift patterns at thresholds minimum support value as 4%. ................................................................................................. 49 Table 4.9: The number of fuzzy concept-drift patterns at different minimum support thresholds value as 3%. ............................................................................... 51. ix.

(11) CHAPTER 1 INTRODUCTION. 1.1 Background and Motivation. Data processing and data storage are now more convenient than before because of the booming development of information technologies. They have a significant impact on daily life as well as on business. For example, if policy-makers can obtain information and knowledge from databases effectively and quickly, they are more likely to make good decisions than those who cannot. The number of database types and the size of databases are, however, growing constantly, and getting useful and valuable information from large databases for decision-making becomes quite difficult. Research dealing with information storage and knowledge mining is thus an important and challenging task. Data mining techniques have been applied to various practical applications to find useful rules or patterns [1-4]. These applications include supermarket promotions [1], biological data applications [2], multimedia data applications [3], and mobile data applications [4], among others [5-7]. Association-rule mining is one of the important issues in data mining since the relationships among items in a set of transactions can be effectively analyzed in this representation. For example, assume there is a frequent

(12) product combination "{milk, bread}" in a transaction database, which means that most customers in the store usually buy milk and bread together. To address this problem, Agrawal et al. first presented a well-known mining approach, called the Apriori algorithm, to find association rules from databases [8]. Han et al. then proposed the Frequent-Pattern-tree (FP-tree) structure for efficiently mining association rules without generating candidate itemsets [9]. Other improved techniques have been proposed based on these two approaches, and some are still in progress [10-13]. Fuzzy set theory has recently been used more and more frequently in intelligent systems because of its simplicity and similarity to human reasoning [14-16]. When quantitative databases are processed, it is natural and informative to use fuzzy sets to represent quantities as linguistic terms. Several fuzzy data mining algorithms for inducing rules from a given set of data have thus been designed and used with good results in specific domains. Hong et al. proposed a fuzzy mining algorithm to mine fuzzy rules from quantitative transaction data [17]. Zheng et al. also proposed a novel optimized fuzzy association-rule mining method to mine association rules from quantitative data [18]. Wang et al. then proposed a data mining algorithm for extracting fuzzy knowledge from transactions stored as quantitative values [19]. Fuzzy data mining uses fuzzy membership functions to derive linguistic terms from quantitative data [18, 20]. As a result, fuzzy membership functions are a crucial

(13) factor that affects the quality of the results. Some heuristic approaches, such as the self-organizing feature map, ant colony systems, and simulated annealing, have been proposed to derive appropriate membership functions from data instead of just using static or manually defined membership functions [21-23]. Yang et al. proposed generating fuzzy membership functions with unsupervised learning using a self-organizing feature map [21]. Hong et al. also proposed an ant colony system algorithm to extract membership functions in fuzzy data mining [22]. Liu et al. then presented a learning algorithm for membership functions based on simulated annealing [23]. With the fierce business competition of these days, understanding and adapting to the evolution of customer behavior has become an important aspect of enterprise survival in a continuously changing environment. Good companies have to know what is changing and how it has changed in order to provide the right products and services to satisfy varying market needs. For this reason, concept drift has become a popular research topic in data mining [24-26]. Concept drift refers to a significant change of concepts over time or across locations. It can thus be used to derive the purchasing behavior of customers from a database at different times or locations. Liu et al. defined three kinds of customer behavior changes related to concept drift [27]. The first one is called emerging patterns, which means the number of products purchased by customers may gradually increase or decrease.

(14) For example, rt: Bread, Toast → Milk (support = 0.2) and rt+k: Bread, Toast → Milk (support = 0.5). The second kind is called unexpected changes, in which customers' purchasing behavior moves from product A to product B; for example, rt: Bread, Toast → Milk and rt+k: Bread, Toast → Apple. The last kind is called added/perished rules, in which an entirely new rule appears or an old rule disappears; for example, rt: Bread, Toast → Milk and rt+k: Potato, Hamburger → Coffee. In the past, most research focused on concept drift in traditional transaction databases. However, quantities of purchased items usually exist in databases and can provide more information about concept drift. Therefore, in this thesis we adopt fuzzy sets to handle concept drift in quantitative databases. 1.2 Contribution. The contribution of this thesis is to find different types of fuzzy concept-drift patterns, such as fuzzy membership-function concept-drift patterns and fuzzy association-rule concept-drift patterns, which may provide managers with more information for making appropriate decisions in varying environments than crisp concept-drift patterns. Two new research issues are presented in this thesis. The first issue is to find the concept drift of membership functions. Since the linguistic interpretation of fuzzy terms

(15) may vary according to different times and locations, appropriate membership functions for representing the quantities of purchased items may need to evolve as well. We thus propose a mining algorithm to find concept drift of membership functions (abbreviated as CDMF). The quantities of each item in a quantitative database are first divided into several linguistic terms; for example, three linguistic groups may be given as low, medium, and high. Then a modified fuzzy c-means (FCM) approach is designed to generate a set of related membership functions for each item in the two databases at different times or at different places. Finally, a concept-drift judging algorithm that compares the membership functions under some criteria is proposed to decide the concept-drift membership functions in the two databases. The second issue discusses the fuzzy concept drift of association rules with fixed membership functions. We propose an algorithm to mine fuzzy association-rule concept-drift patterns (abbreviated as CDFAR) from two quantitative databases at different times and locations. Fuzzy association rules with fuzzy values are first mined by the fuzzy mining algorithm in [8] with predefined membership functions. A concept-drift judging algorithm then compares the fuzzy rules to find the two kinds of fuzzy concept-drift patterns, emerging patterns and unexpected changes, under different criteria.

(16) 1.3 Thesis Organization. The rest of this thesis is organized as follows. The background and related works, including concept drift, the Apriori algorithm, fuzzy data mining, fuzzy C-means, and fuzzy membership functions, are reviewed in Chapter 2. The proposed mining algorithm for fuzzy concept-drift patterns of membership functions in varying quantitative databases is presented in Chapter 3. Chapter 4 then describes the mining algorithm proposed for finding the two kinds of fuzzy concept-drift patterns of association rules, together with its experimental evaluation. An approach that considers the concept drift of fuzzy association rules and membership functions at the same time is also discussed there. Finally, the conclusion and future work are given in Chapter 5.

(17) CHAPTER 2 REVIEW OF RELATED WORK. In this chapter, some related studies on concept drift, the Apriori algorithm, fuzzy data mining, fuzzy C-means, and fuzzy membership functions are briefly reviewed. 2.1 Concept Drift. In recent years, the field of concept drift has become popular. Tsymbal described concept drift as the phenomenon in which patterns change over time in unexpected ways [28]. For example, assume that at time t there is an association rule "if buying milk, then buying bread" and that at time t + k another rule "if buying milk, then buying apple" is mined. The latter rule differs from the former in its consequent part as time passes. This change is a type of concept-drift pattern. Based on concept-drift patterns, traditional data mining methods have been used in various research areas [29-32]. When concept drift occurs, a classification model built from an old dataset is no longer suitable for predicting newly arriving data. In real life, users may be very interested in concept-drift rules. For example, doctors would like to know the main causes of disease variation, since such rules would enable them to diagnose patients more correctly and quickly. Lee et

(18) al. proposed a decision-tree-based method for mining concept-drift rules [33]. Cheng et al. then proposed a method that utilizes group information at different time points for consensus sequence mining. The aim is to find changes in the consensus sequence at different times so as to understand changes in group preference and move closer to the authentic idea of the group; that study classifies the changes of the group consensus sequence into five types (emerging patterns, emerging ambiguous pairs, order-change sequences, addition/removal of items, and significant changes) [34]. The continued growth of email usage is naturally followed by an increase in unsolicited emails, so-called spam, which motivates research in the spam-filtering area. In the context of spam filtering systems, the evolving nature of spam makes the related models obsolete. Hayat et al. proposed an adaptive spam filtering system based on a language model, which can detect concept drift by computing the deviation of the email content distribution [35]. Concept drift has also been applied to data classification and data streams [36-42]. Concept drift has been a very important concept in the realm of data streams. Streaming data may consist of multiple drifting concepts, each having its own underlying data distribution. Concept drift occurs when a set of examples has legitimate class labels at one time and has different legitimate labels at another time. Padmalatha et al. provided a comprehensive overview of existing concept-evolution and concept-drifting techniques

(19) along different dimensions, providing a lucid view of ensemble behavior when dealing with concept drift. Song et al. defined three types of concept-drift patterns in association rule mining [24]: emerging patterns, unexpected changes, and added/perished patterns. The different types of concept-drift patterns indicate different meanings of concept drift for association rules. An evaluation function was designed to calculate the degree of concept drift between two rules; if the degree is bigger than a predefined threshold, the corresponding concept-drift patterns and related concept-drift rules are generated immediately. Assume there are two rules rit: A→B with sup(A→B) = a and rit+k: C→D with sup(C→D) = b, where rit is the i-th rule of rule set RSt at time t, rit+k is a rule of rule set RSt+k at time t+k, and A, B, C, D are itemsets. The definitions of the three patterns are given below [24]. Definition 1. (Emerging Patterns) If a rule rit+k is an emerging pattern, then the following two conditions should be satisfied: (1) the conditional and the consequent parts of rules rit and rit+k are the same, that is, A = C and B = D; (2) the supports of rules rit and rit+k are different, that is, sup(A→B) ≠ sup(C→D). Example 1. rit: Bread = high → Milk = Large (support = 0.2), rit+k: Bread = high → Milk = Large (support = 0.5). In this case, rit+k is an emerging pattern with respect to rit if we set the minimum threshold at 0.2: the two rules have the same structure

(20) and the difference between the supports of the two rules is 0.3. The emerging pattern will be generated immediately. Definition 2. (Unexpected Change) If a rule rit+k is an unexpected change, then the following two conditions should be satisfied: (1) the conditional parts of rules rit and rit+k are the same, that is, A = C; (2) the consequent parts of rules rit and rit+k are different, that is, B ≠ D. Example 2. rit: Bread = high → Milk = Large, rit+k: Bread = high → Milk = Low. In this case, rule rit+k is an unexpected consequent change with respect to rit since the conditional parts of rit and rit+k are the same, but the consequent parts of the two rules are quite different. Definition 3. (Added/Perished Rules) If rit+k is an added rule, it means that the conditional part C and the consequent part D of rit+k are different from those of any rit in RSt. If rit is a perished rule, it means that the conditional part A and the consequent part B of rit are different from those of any rit+k in RSt+k. Example 3. rit: Bread = high → Milk = Large, rit+k: Vegetable = high → Apple = High. In this case, rit+k is an added rule with respect to rit since the conditional and consequent parts of rit and rit+k are different.
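To make these definitions concrete, here is a minimal sketch in Python of how a single pair of crisp rules might be classified; the rule encoding as (antecedent, consequent, support) tuples, the threshold name min_diff, and the simplified pairwise handling of added/perished rules (Definition 3 actually compares a rule against the whole opposite rule set) are assumptions of this sketch rather than part of [24].

```python
def classify_drift(rule_t, rule_tk, min_diff=0.2):
    """Classify drift between two rules, each given as (antecedent, consequent, support),
    following the spirit of Definitions 1-3 (emerging / unexpected change / added-perished)."""
    ant_t, cons_t, sup_t = rule_t
    ant_tk, cons_tk, sup_tk = rule_tk
    if ant_t == ant_tk and cons_t == cons_tk:
        # Definition 1: same rule structure, significantly different support
        return "emerging pattern" if abs(sup_t - sup_tk) >= min_diff else "no significant drift"
    if ant_t == ant_tk:
        # Definition 2: same conditional part, different consequent part
        return "unexpected change"
    # Simplified stand-in for Definition 3 (normally checked against the whole rule set)
    return "added/perished rule"

r_t  = (frozenset({"Bread", "Toast"}), "Milk", 0.2)
r_tk = (frozenset({"Bread", "Toast"}), "Milk", 0.5)
print(classify_drift(r_t, r_tk))   # emerging pattern (support difference 0.3 >= 0.2)
```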

(21) 2.2 Apriori Algorithm. The goal of data mining is to discover important associations among items, such that the presence of some items in a transaction implies the presence of some other items. To achieve this purpose, Agrawal and his co-workers proposed several mining algorithms based on the concept of large itemsets to find association rules in transaction data [8, 43-45]. The process of the Apriori algorithm is as follows. INPUT: D: a transaction database; α: the minimum support threshold. OUTPUT: The set L of large itemsets. STEP 1: Calculate the number (count) of each item in the transaction data. Assume the total number of transactions is n. If one item appears more than once in a transaction, count its occurrence only once. Set the support of each item as count/n. STEP 2: Check whether the support of each item is larger than or equal to the predefined minimum support value α. If an item satisfies the condition, put it in the set of large 1-itemsets (L1). STEP 3: If L1 is null, then exit the algorithm; otherwise, do the next step. STEP 4: Set r = 1, where r is the number of items in the large itemsets currently being processed. STEP 5: Generate the candidate set Cr+1 by joining Lr. STEP 6: Calculate the number (counts) of each candidate (r+1)-itemset s in Cr+1; set its support (supports) as counts/n. STEP 7: Check whether the support of each candidate (r+1)-itemset s is larger than or equal to the predefined minimum support value α. If the itemset satisfies the condition, put it in the set of large (r+1)-itemsets (Lr+1). STEP 8: If Lr+1 is null, then exit the algorithm; otherwise, set r = r + 1 and repeat Steps 5 to 7. Return L.
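As an illustration of the level-wise procedure above, the following is a minimal Python sketch; the transaction contents and the minimum support value are illustrative only and are not taken from the thesis.

```python
from itertools import combinations

def apriori(transactions, alpha):
    """Level-wise large-itemset mining following STEPs 1-8 (support = count / n)."""
    n = len(transactions)
    items = {i for t in transactions for i in t}
    # L1: single items whose support reaches the minimum support alpha
    current = [frozenset([i]) for i in items
               if sum(i in t for t in transactions) / n >= alpha]
    large = list(current)
    r = 1
    while current:
        # join step: candidates of size r+1 whose every r-subset is large
        candidates = {a | b for a, b in combinations(current, 2) if len(a | b) == r + 1}
        candidates = [c for c in candidates
                      if all(frozenset(s) in set(current) for s in combinations(c, r))]
        # count step: keep candidates whose support reaches alpha
        current = [c for c in candidates
                   if sum(c <= t for t in transactions) / n >= alpha]
        large.extend(current)
        r += 1
    return large

# Example with set-valued transactions; items and threshold are illustrative.
db = [{"milk", "bread"}, {"milk", "bread", "toast"}, {"bread", "toast"}, {"milk"}]
print(apriori(db, alpha=0.5))
```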

(22) 2.3 Fuzzy Data Mining. In data mining, the patterns with a high frequency of occurrence are found as association rules, and these association rules can be used to analyze and describe purchase behavior. However, since traditional data mining methods do not take quantitative information into consideration, some valuable rules may be lost. To solve this problem, Kuok et al. developed a new issue, fuzzy data mining [46], which applies fuzzy set theory to traditional data mining. The main reason is that fuzzy set theory has been widely applied to various applications due to its simplicity and similarity to human reasoning. According to the key steps of that approach, the quantitative values in transactions are first converted into linguistic terms through

(23) membership functions, and then the count of a fuzzy itemset in a transaction can be calculated as the product of the fuzzy values of all fuzzy terms of the itemset in that transaction. Finally, the fuzzy association rules that satisfy the user-specified minimum fuzzy confidence threshold are derived from the set of fuzzy frequent itemsets with high fuzzy frequency. Different from the calculation function in Kuok et al.'s study, Hong et al. proposed a fuzzy mining algorithm to find fuzzy association rules by deriving fuzzy values from quantitative data [47]. Hong et al. applied the fuzzy minimum operator in fuzzy set theory to evaluate the counts of fuzzy itemsets in a set of transactions, and they also proposed an Apriori-based mining algorithm to efficiently find fuzzy association rules. In addition, Hong et al. investigated the trade-off problem between the number of fuzzy rules and computation time. Besides, Hong et al. also proposed a fuzzy weighted data mining approach based on the support-confidence framework to extract weighted association rules with linguistic terms from quantitative transactions [48]. Because of the success of fuzzy mining, many extended approaches have been proposed [49, 50]. 2.4 Fuzzy C-means. Fuzzy C-means (FCM) is a popular clustering method that uses fuzzy theory and allows one piece of data to belong to two or more clusters [51]. FCM is frequently

(24) used in pattern recognition. FCM is based on the minimization of the following objective function:

J(U, c_1, c_2, \ldots, c_c) = \sum_{j=1}^{c} J_j = \sum_{j=1}^{c} \sum_{i=1}^{n} (u_{ij})^m \| x_i - c_j \|^2,   (2-1)

where m is a number greater than 1, u_{ij} is the fuzzy membership value of x_i in cluster j, x_i is the i-th d-dimensional measured data point, c_j is the d-dimensional center of cluster j, and ||*|| is a norm expressing the similarity between a measured data point and a center; the Euclidean distance is commonly used. The iteration stops when \max_{ij} |u_{ij}^{(k)} - u_{ij}^{(k-1)}| < \beta, where \beta is a termination criterion between 0 and 1 and k is the iteration step. This procedure converges to a local minimum or a saddle point of J_m. The process of the fuzzy C-means algorithm is as follows. INPUT: U: the degree of membership of each individual; c: the number of clusters; m: the fuzziness index; i, used to identify the individuals; j, used to identify the clusters; and β: the termination criterion. OUTPUT: U, the degree of membership of each individual. STEP 1: Initialize the matrix U = [u_{ij}] as U^{(0)}. STEP 2: At step k, calculate the center vectors C^{(k)} = [c_j] with U^{(k)}. C is the set of centers of the membership functions, which are used to calculate the degree of membership of each individual.

(25) c_j = \frac{\sum_{i=1}^{n} (u_{ij})^m x_i}{\sum_{i=1}^{n} (u_{ij})^m}   (2-2)

STEP 3: Update U^{(k)} to U^{(k+1)} by

u_{ij} = \frac{1}{\sum_{k=1}^{c} \left( \frac{\| x_i - c_j \|}{\| x_i - c_k \|} \right)^{\frac{2}{m-1}}}   (2-3)

STEP 4: If \max_{ij} |u_{ij}^{(k)} - u_{ij}^{(k-1)}| < \beta, then STOP; otherwise return to STEP 2. Return the matrix U.
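The following is a minimal numerical sketch, in Python with NumPy, of the update steps in Equations (2-2) and (2-3) for one-dimensional item quantities; the function name, the sample quantities, and the parameter defaults are assumptions made for illustration, not part of the thesis.

```python
import numpy as np

def fuzzy_c_means(x, c=3, m=2.0, beta=1e-4, max_iter=100, seed=0):
    """Minimal 1-D fuzzy C-means sketch following Eqs. (2-2) and (2-3).

    x: 1-D array of item quantities, c: number of clusters (linguistic terms),
    m: fuzziness index (> 1), beta: termination criterion on the change of U.
    """
    rng = np.random.default_rng(seed)
    n = len(x)
    u = rng.random((n, c))
    u /= u.sum(axis=1, keepdims=True)              # each row of U sums to 1

    for _ in range(max_iter):
        um = u ** m
        centers = (um * x[:, None]).sum(axis=0) / um.sum(axis=0)     # Eq. (2-2)

        dist = np.abs(x[:, None] - centers[None, :]) + 1e-12         # |x_i - c_j|
        inv = dist ** (-2.0 / (m - 1.0))
        u_new = inv / inv.sum(axis=1, keepdims=True)                 # Eq. (2-3)

        if np.max(np.abs(u_new - u)) < beta:                         # termination test (STEP 4)
            u = u_new
            break
        u = u_new

    order = np.argsort(centers)                    # report clusters from low to high quantity
    return centers[order], u[:, order]

# Example: cluster apple purchase quantities into three linguistic terms (low/medium/high).
quantities = np.array([1, 2, 2, 3, 5, 6, 6, 7, 9, 10, 11], dtype=float)
centers, memberships = fuzzy_c_means(quantities, c=3)
print(centers)
```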

(26) 2.5 Fuzzy Membership Functions

In this part, membership functions in fuzzy set theory are introduced, and the way to find the scalar cardinality values of items according to their corresponding membership functions is described. In fuzzy set theory, a membership function can be represented by a graph that defines how each point in the input space is mapped to a membership value between 0 and 1. Currently, there are two common methods for encoding the membership functions of items. In the first, the fuzzy regions are encoded as Parodi and Bonelli did [52]: each fuzzy region Rjk is stored as an isosceles-triangle membership function, similar to Figure 2.1, with a (c, w) pair, where c indicates the center abscissa of the membership function and w represents half of its spread.

Figure 2.1: Membership functions (isosceles triangles Rj1, ..., Rjk, ..., Rjl, each with a center cjk and a half-spread wjk on the quantity axis)

The second encoding approach uses the 2-tuple linguistic representation model [53]. Take the set of membership functions MFj of item Ij as an example. They are encoded as a substring cj1LRj1…cjkLRjk…cj|Ij|LRj|Ij|, where cjk and LRjk are the center abscissa and the lateral displacement of the k-th membership function of item Ij. The scheme is shown in Figure 2.2.

Figure 2.2: The 2-tuple linguistic representation of membership functions (each function has a center cjk, a predefined half-spread wjk, and a lateral displacement LRjk)

Note that the half-spreads of the membership functions are predefined in the second encoding method. Assume there are m items; the entire set of membership functions for all

(27) items are encoded by concatenating the substrings of MF1, MF2, …, MFj, …, MFm. For example, the membership functions of an item A are shown in Figure 2.3; it has three fuzzy regions: Low, Middle, and High.

Figure 2.3: The membership functions of the three items A, B, and C for this example (item A: Low, Middle, and High centered at 2.0, 5.0, and 7.0; item B: centered at 3.0, 6.0, and 9.0; item C: centered at 1.0, 3.0, and 5.0)

According to the membership functions shown in Figure 2.3, different quantities of items in a transaction database can be represented by different membership degrees in different regions. Table 2.1 shows a transaction database. Item A appears in two transactions, Trans 2 and Trans 3, and its quantity values in these transactions are 6 and 3, respectively. The membership functions of item A in Figure 2.3 include three regions: Low, Middle, and High.

(28) Table 2.1: The set of three quantitative transaction data for this example

ID | Expanded Items
Trans 1 | (A, 0)(B, 9)(C, 2)
Trans 2 | (A, 6)(B, 3)(C, 0)
Trans 3 | (A, 3)(B, 0)(C, 5)

According to the membership functions of item A in Figure 2.3, its two quantity values (6 and 3) have membership degrees of 0 and 0.5 in region A.Low, 0.5 and 0.5 in region A.Middle, and 0.5 and 0 in region A.High, respectively. The quantity value of item A in Trans 3 is thus converted into the fuzzy set (0.5/A.Low + 0.5/A.Middle). All other transactions that include item A, as well as the other items, are processed similarly. The corresponding fuzzy sets of the three transactions are shown in Table 2.2.

Table 2.2: The fuzzy sets converted from the transactions

TID | Fuzzy set
1 | 0.5/A.Low, 0.5/B.High, 1/C.Low, 1/C.Middle
2 | 1/A.Middle, 1/A.High, 0.5/C.Low, 0.5/C.Low
3 | 0.5/A.Low, 0.5/A.Middle, 1/B.Low, 0.5/C.Middle, 0.5/C.High
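A small sketch of the conversion step illustrated by Tables 2.1 and 2.2, assuming isosceles-triangle membership functions encoded as (center, half-spread) pairs as in Figure 2.1; the centers loosely follow Figure 2.3, while the half-spreads, dictionary layout, and function names are assumptions of this sketch rather than values fixed by the thesis.

```python
def triangular(q, c, w):
    """Degree of quantity q in an isosceles-triangle membership function (center c, half-spread w)."""
    if w <= 0:
        return 0.0
    return max(0.0, 1.0 - abs(q - c) / w)

# Illustrative (center, half-spread) encodings for the Low/Middle/High regions of each item;
# the centers roughly follow Figure 2.3, and the half-spreads are assumptions of this sketch.
MFS = {
    "A": {"Low": (2.0, 2.0), "Middle": (5.0, 2.0), "High": (7.0, 2.0)},
    "B": {"Low": (3.0, 3.0), "Middle": (6.0, 3.0), "High": (9.0, 3.0)},
    "C": {"Low": (1.0, 2.0), "Middle": (3.0, 2.0), "High": (5.0, 2.0)},
}

def to_fuzzy_set(transaction):
    """Convert {item: quantity} into {(item, region): degree}, dropping zero degrees."""
    out = {}
    for item, qty in transaction.items():
        if qty <= 0:
            continue                      # item not purchased in this transaction
        for region, (c, w) in MFS[item].items():
            degree = triangular(qty, c, w)
            if degree > 0:
                out[(item, region)] = degree
    return out

print(to_fuzzy_set({"A": 6, "B": 3, "C": 0}))   # e.g. a transaction like Trans 2 of Table 2.1
```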

(29) CHAPTER 3 CONCEPT DRIFT FOR FUZZY MEMBERSHIP FUNCTIONS

3.1 Definitions and Review of Fuzzy Membership Functions

In this chapter, we present the concept-drift patterns for fuzzy membership functions (CDMF).

3.1.1 Fuzzy Membership Functions by Fuzzy C-means

In this part, we propose a simple method to generate a set of membership functions by FCM. Each membership function is designed as a triangle and encoded as a pair (c, w), where the peak of the triangle is located at c and the distance between c and the left vertex is w. If we need to generate n membership functions for each item, the proposed algorithm obtains n cluster centers by using FCM as described in Section 2.4. Each center is then used as the location c of the peak of a triangle (membership function), and the half-spread w is calculated as the distance between the peak of this triangle and that of the previous one; for the first triangle, w is the distance between its peak and 0. Figure 3.1 shows an example. In Figure 3.1, the set of membership functions MFj of the item Ij is represented

(30) as a substring cj1wj1…cj|Ij|wj|Ij|, where |Ij| is the number of membership functions of item Ij.

Figure 3.1: Membership functions of an item Ij (triangles MF1, ..., MFj with centers cj1, ..., cjl and half-spreads wj1, ..., wjl on the quantity axis)

Membership functions play the role of converting item quantities into terms close to human semantics. Figure 3.2 shows the set of membership functions for the amount of apples purchased in a year. It consists of three membership functions representing low, medium, and high purchase amounts. If we buy five apples, the low fuzzy value is 0.4, the medium fuzzy value is 0.6, and the high fuzzy value is 0.

(31) Figure 3.2: The set of membership functions for the amount of apples purchased in a year (Low, Medium, and High centered at 3.0, 6.0, and 9.0)

Additionally, for this example, we know the status of the concept from the membership functions. A purchase amount of less than three belongs to the low linguistic term group with a membership value reaching 1, a purchase amount of six belongs to the medium linguistic term group with a membership value reaching 1, and a purchase amount greater than nine belongs to the high linguistic term group with a membership value reaching 1. We can regard these values as the representative values of the linguistic terms and observe their changes at different times.

3.1.2 Concept-drift Patterns for Fuzzy Membership Functions

In this part, we study three different concept drifts of fuzzy membership functions. The first concept drift is the change of the representative value of the

(32) linguistic term group (the center of the membership function). The second concept drift is the change of the linguistic term range. The third concept drift is the change of the fuzzy support of the linguistic terms. The variant degree of each kind of concept drift is described below.

(A) The change of the representative value of a linguistic term

Figure 3.2 shows the membership functions of the purchase amount of apples over the last year, and Figure 3.3 shows the membership functions of the purchase amount of apples for this year. In the low linguistic term group, the representative value, which is the center of the membership function, is reduced from three to two. In the high linguistic term group, the representative value increases from nine to ten. This means the concepts of the low and the high linguistic term groups have already changed, while the concept of the medium linguistic term group retains its original status.

(33) Figure 3.3: Membership functions for the purchase amount of apples in this year (Low, Medium, and High centered at 2.0, 6.0, and 10.0)

Formula (3-1) shows the variant degree of the representative value of a linguistic term:

cdLT = \frac{|C_{nm}^{t} - C_{nm}^{t+k}|}{\left(\sum_{i=1}^{N} C_{ni}^{t}\right)/N},   (3-1)

where D^t and D^{t+k} are the transaction databases at different times or different places, C_{nm}^{t} is the representative value of a linguistic term in database D^t, n identifies a commodity item, m identifies a linguistic term, and N is the number of all linguistic terms.

(B) The change of the linguistic term range

The range of a membership function represents the influence of a linguistic term. For example, although the representative value of the medium linguistic term has not changed between Figure 3.2 and Figure 3.3, the range of its membership function for this year is larger than that of the previous year. The influence (scope) of the medium

(34) linguistic term group is bigger than before. Formula (3-2) shows the variant degree of the range of a membership function:

cdMF = \frac{|D_{nm}^{t} - D_{nm}^{t+k}|}{(C_{nN}^{t} - C_{n1}^{t})/(N-1)},   (3-2)

where D^t and D^{t+k} are the transaction databases, D_{nm} is the linguistic term range, C_{nm}^{t} is the representative value of a linguistic term in database D^t, n identifies a commodity item, m identifies a linguistic term, and N is the number of all linguistic terms.

(C) The change of fuzzy support for the linguistic term

A change in fuzzy support means that the size of the group associated with this linguistic term has changed. We can derive value rules from this type of concept change; for example, more people buy expensive mobile phones this year than last year. Formula (3-3) shows the variant degree of the fuzzy support of a membership function:

cdSup = \frac{|sup_{nm}^{t} - sup_{nm}^{t+k}|}{sup_{nm}^{t}},   (3-3)

where sup_{nm} is the fuzzy support of a specific membership function (linguistic term) of an item. The proposed algorithm compares each membership function's degree of change with a predefined threshold; if the degree is larger than the threshold, the related concept-drift pattern is generated immediately. The detailed algorithm is

(35) described below.

3.2 The Proposed CDMF Mining Algorithm

In this part, the proposed approach, which combines concept drift and the FCM algorithm, is described as follows. INPUT: D^t, D^{t+k}: the two databases; I: the number of items; S: the set of concept-drift rules; M: the number of linguistic terms; α: the linguistic term threshold; β: the membership function threshold; and γ: the support threshold. OUTPUT: The concept-drift rules for fuzzy membership functions. STEP 1: For the two databases, generate the fuzzy membership functions of each item via the following sub-steps. (a) Set i = 1, where i is used to keep the identity number of the current item in the database, and apply FCM (Section 2.4) to the quantities of item i. (b) The center points of the M clusters are set as the centers of the fuzzy membership functions of the M linguistic terms. (c) Output the fuzzy membership functions of each linguistic term; an example is shown in Figure 3.1. (d) Set i = i + 1. (e) If i ≤ I, go to Step (a).

(36) STEP 2: Find the concept-drift rules from the fuzzy membership functions of the I items between D^t and D^{t+k}. STEP 3: Set the initial concept-drift rule set S = ∅. STEP 4: Set n = 1, where n is used to keep the identity number of the current item in the database. STEP 5: Calculate the degree of change of each linguistic term's representative value and check the concept-drift rules between the two databases D^t and D^{t+k} by the following sub-steps. (a) Set m = 1, where m is used to keep the identity number of the current linguistic term. (b) Calculate the degree of change of the representative value of each linguistic term, where C_{nm}^{t} is the representative value of the m-th linguistic term of the n-th item in database D^t:

cdLT = \frac{|C_{nm}^{t} - C_{nm}^{t+k}|}{\left(\sum_{i=1}^{N} C_{ni}^{t}\right)/N}   (3-1)

(c) Check the concept-drift rules: if α ≤ cdLT, then put the concept-drift rule "the value of C_{nm} is changed from C_{nm}^{t} to C_{nm}^{t+k}" in S. (d) Set m = m + 1. STEP 6: Calculate the degree of change of each linguistic term's range and check the concept-drift rules between the two databases D^t and D^{t+k} by the following sub-steps.

(37) (a) Set m = 1, where m is used to keep the identity number of the current linguistic term. (b) Calculate the change of the linguistic term range of each linguistic term, where D_{nm}^{t} is the range of the m-th linguistic term of the n-th item in database D^t:

cdMF = \frac{|D_{nm}^{t} - D_{nm}^{t+k}|}{(C_{nN}^{t} - C_{n1}^{t})/(N-1)}   (3-2)

(c) Check the concept-drift rules: if β ≤ cdMF, then put the concept-drift rule "the linguistic term range value D_{nm} is changed from D_{nm}^{t} to D_{nm}^{t+k}" in S. (d) Set m = m + 1. STEP 7: Calculate the degree of change of each support value and check the concept-drift rules between the two databases D^t and D^{t+k} by the following sub-steps. (a) Set m = 1, where m is used to keep the identity number of the current linguistic term. (b) Calculate the support change of each item, where sup_{nm}^{t} is the fuzzy support of the m-th linguistic term of the n-th item in database D^t:

cdSup = \frac{|sup_{nm}^{t} - sup_{nm}^{t+k}|}{sup_{nm}^{t}}   (3-3)

(c) Check the concept-drift rules: if γ ≤ cdSup, then put the concept-drift rule "the support value sup_{nm} is changed from sup_{nm}^{t} to sup_{nm}^{t+k}" in S. (d) Set m = m + 1. STEP 8: Set n = n + 1. STEP 9: If the item set and its items have not all been processed, go to STEP 5. STEP 10: Output the rule set S.
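A compact sketch of how STEPs 5 to 7 might be realized for a single item, using Equations (3-1) to (3-3); the example support values are invented for illustration, and the linguistic term range D_nm is interpreted here as the distance between a center and the previous one, which is an assumption of this sketch.

```python
def concept_drift_rules(centers_t, centers_tk, sup_t, sup_tk, alpha, beta, gamma):
    """Compare one item's membership functions in D^t and D^{t+k} (Eqs. (3-1)-(3-3)).

    centers_t / centers_tk: sorted centers of the N linguistic terms in each database,
    sup_t / sup_tk: fuzzy supports of the linguistic terms, alpha/beta/gamma: thresholds.
    """
    N = len(centers_t)
    avg_center = sum(centers_t) / N
    rules = []
    for m in range(N):
        # (3-1) change of the representative value of the m-th linguistic term
        cd_lt = abs(centers_t[m] - centers_tk[m]) / avg_center
        if cd_lt >= alpha:
            rules.append(("representative value", m, centers_t[m], centers_tk[m]))

        # (3-2) change of the linguistic term range (assumed: distance to the previous center)
        d_t = centers_t[m] - (centers_t[m - 1] if m > 0 else 0.0)
        d_tk = centers_tk[m] - (centers_tk[m - 1] if m > 0 else 0.0)
        cd_mf = abs(d_t - d_tk) / ((centers_t[-1] - centers_t[0]) / (N - 1))
        if cd_mf >= beta:
            rules.append(("term range", m, d_t, d_tk))

        # (3-3) change of the fuzzy support of the m-th linguistic term
        cd_sup = abs(sup_t[m] - sup_tk[m]) / sup_t[m]
        if cd_sup >= gamma:
            rules.append(("support", m, sup_t[m], sup_tk[m]))
    return rules

# Example: the apple membership functions of Figures 3.2 and 3.3, with illustrative supports.
print(concept_drift_rules([3.0, 6.0, 9.0], [2.0, 6.0, 10.0],
                          [0.30, 0.45, 0.25], [0.28, 0.46, 0.26],
                          alpha=0.15, beta=0.3, gamma=0.1))
```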

(38) 3.3 Experimental Results

In this part, experimental results show the performance of the proposed concept drift for fuzzy membership functions (CDMF) algorithm. We used a single computer with a 3rd generation Intel Core i5-3230M 2.60 GHz processor (4 cores, 4 threads) and 12 GB of DDR3-1600 MHz random-access memory. The operating system was Microsoft Windows 8.1 Pro, and the programming language was C# 5.0 on the .NET Framework 4.5.1. A simulation dataset containing 1559 items and 21,556 transactions was used in the experiments. In the dataset, the number of purchased items in each transaction was first randomly generated, and the purchased items and their quantities in each transaction were then generated. We selected 21,566 transactions from the simulated dataset and divided them into two datasets serving as the databases Dt and Dt+k, each with 10,733 transactions. The initial cluster size C was set at 3, the fuzziness index m was set at 2, and the linguistic term threshold, the membership function threshold, and the support threshold each vary from 1 to 0.1. Figures

(39) 3.4, 3.6, and 3.8 show the results of the proposed approach. Experiments were first conducted on the databases to evaluate the number of linguistic-term concept-drift items under different thresholds.

Figure 3.4: The number of concept-drift items found by the algorithm under different linguistic term thresholds (series: two databases at different locations; the first half versus the second half of a year; two random months; a random month versus the whole year)

Figure 3.4 shows the proposed algorithm performed on different pairs of databases: two databases at different locations, the databases of the first half and the second half of a year, two databases of random months, and the database of a random month compared with the whole year. From the experimental results, we can see that the influence of a different time on customer behavior was bigger than that of a different location. We also observed that short-term databases may contain more special rules, so when we compared these databases, more concept drifts could be found. In contrast, since

(40) long-term databases tended to be stable, fewer concept drifts occurred. As a result, we consider the comparison between short-term databases preferable. Figure 3.5 shows the execution efficiency of the proposed algorithm for different linguistic term thresholds varying from 1 to 0.1.

Figure 3.5: The execution time of the four different time and location comparisons under different linguistic term thresholds

Figure 3.5 shows the execution efficiency under different linguistic term thresholds. The number of generated concept-drift patterns, and hence the execution time of the proposed algorithm, grows as the threshold becomes less restrictive. Experiments were next conducted on the databases to evaluate the number of membership-function concept-drift items under different thresholds. Figure 3.6

(41) shows the effect of different threshold values on the number of identified rules.

Figure 3.6: The number of concept-drift items found by the algorithm under different membership function thresholds (same four database comparisons as in Figure 3.4)

Experiments were also made on the databases to evaluate the efficiency of the algorithm. Figure 3.7 shows the execution efficiency for different membership function thresholds varying from 1 to 0.1.

(42) Figure 3.7: The execution time of the four different time and location comparisons under different membership function thresholds

Finally, experiments were conducted on the databases to evaluate the number of support concept-drift items under different thresholds. Figure 3.8 shows the effect of different threshold values on the number of identified rules.

(43) Figure 3.8: The number of concept-drift items found by the algorithm under different support thresholds (same four database comparisons as in Figure 3.4)

Experiments were at last made on the databases to evaluate the efficiency of the algorithm. Figure 3.9 shows the execution efficiency for different support thresholds varying from 1 to 0.1.

(44) Figure 3.9: The execution time of the four different time and location comparisons under different support thresholds

Some concept-drift patterns remain even at higher threshold values; however, these patterns mostly represent different types of concept drift. The proposed algorithm should therefore be set with a suitable threshold value to attain a reasonable number of patterns as well as a good representation.

(45) CHAPTER 4 CONCEPT DRIFT FOR FUZZY ASSOCIATION RULES

4.1 Definitions and Review of Fuzzy Association Rules

In this chapter, we present the mining of fuzzy association-rule concept-drift patterns (CDFAR).

4.1.1 Fuzzy Membership Functions by Fuzzy C-means

In this part, our previous method is used to generate the set of membership functions by fuzzy C-means. Each membership function is designed as a triangle and encoded as a pair (c, w), where the peak of the triangle is located at c and the distance between c and the left vertex is w. Membership functions play the role of converting item quantities into terms close to human semantics. Figure 4.1 shows such a set of membership functions; it consists of three membership functions representing low, medium, and high purchase amounts.

(46) Figure 4.1: The set of membership functions for apple (Low, Medium, and High centered at 3.0, 7.0, and 11.0)

We combine the data of the two different databases and generate the fuzzy membership functions from the combined data. The fuzzy membership functions generated by fuzzy C-means are therefore fixed and identical when applied to the two databases.

4.1.2 Generating Fuzzy Association Rules by Fuzzy Apriori

In this part, the fuzzy membership functions produced by the method in Section 4.1.1 are used, together with the transaction data, as the input to the fuzzy Apriori algorithm. Table 4.1 shows a transaction database. For the fuzzy association rules, the membership functions are applied to turn the quantitative information into linguistic terms.

Table 4.1: An example of a transaction database

ID | Expanded Items
1 | (A, 3)(C, 6)(E, 9)
2 | (B, 4)(C, 7)(D, 10)
3 | (B, 2)(C, 5)(E, 8)
4 | (C, 1)(E, 14)

The membership functions are shown in Figure 4.1. After conversion through the fuzzy

(47) membership functions, we can get the fuzzy values of the different linguistic terms of each item, so the original transaction database can be converted into a database with fuzzy linguistic terms. An example is shown in Table 4.2.

Table 4.2: Table 4.1 after conversion into a fuzzy database

TID | Fuzzy set
1 | 0.5/A.Low, 0.5/C.Low, 1/C.Middle, 0.2/E.Middle, 0.8/E.High
2 | 0.8/B.Low, 0.2/B.Middle, 1/C.Middle, 0.2/D.Middle, 0.8/D.High
3 | 0.2/B.Low, 0.8/C.Low, 1/C.Middle, 0.5/E.Middle, 0.5/E.High
4 | 1/C.Low, 1/E.High

In the next step, fuzzy frequent itemsets are generated from the fuzzy linguistic terms, and the fuzzy association rules are then obtained by the fuzzy Apriori algorithm [50]. The process of the fuzzy Apriori algorithm is as follows. INPUT: A quantitative database consisting of n transactions; a set of membership functions; α: the minimum support threshold. OUTPUT: The set of fuzzy association rules. STEP 1: Transform the quantitative value vij of each item Ij in the i-th transaction into a fuzzy set fij represented as (fij1/Rj1 + fij2/Rj2 + … + fijh/Rjh) using the given membership functions, where h is the number of fuzzy regions (linguistic terms) of Ij, Rjl is the l-th fuzzy region of Ij, 1 ≤ l ≤ h, and fijl is vij's fuzzy membership value in region Rjl.

(48) STEP 2: Calculate the scalar cardinality of each fuzzy region (linguistic term) Rjl in the transaction data as

count_{jl} = \sum_{i=1}^{n} f_{ijl}.   (4-1)

STEP 3: Check whether the value count_{jl} of the fuzzy region Rjl is larger than or equal to the minimum count n × α. If the count of a fuzzy region Rjl is equal to or greater than the minimum count, put the fuzzy region in the set of frequent fuzzy regions (L1). That is,

L1 = \{R_{jl} \mid count_{jl} \geq n \times \alpha, 1 \leq j \leq m\}.   (4-2)

STEP 4: If L1 is null, then exit the algorithm; otherwise, do the next step. STEP 5: Set r = 1, where r is used to represent the number of items kept in the current large itemsets. STEP 6: Generate the candidate set Cr+1 from Lr. Restated, the algorithm joins Lr and Lr under the condition that r − 1 items in the two itemsets are the same and the other one is different. Store in Cr+1 the itemsets all of whose r-sub-itemsets are in Lr. STEP 7: Perform the following sub-steps for each newly formed (r+1)-itemset s with items (s1, s2, …, sr+1) in Cr+1. (A) For each transaction datum D, calculate its fuzzy value on s as f_s = f_{s1} ∧ f_{s2} ∧ … ∧ f_{s_{r+1}}, where f_{sj} is the membership value of D in region sj. If the

(49) minimum operator is used for the intersection, then

f_s = \min_{j=1}^{r+1} f_{s_j}.   (4-3)

(B) Calculate the count of s in the transactions as

count_s = \sum_{i=1}^{n} f_s.   (4-4)

(C) If count_s is larger than or equal to the minimum count n × α, put s in Lr+1. STEP 8: If Lr+1 is null, then do the next step; otherwise, set r = r + 1 and repeat Steps 6 to 7. STEP 9: Construct the fuzzy association rules for each large q-itemset s with items (s1, s2, …, sq), q ≥ 2, using the following sub-steps. (A) Form all possible association rules as follows:

s_1 ∧ … ∧ s_{k-1} ∧ s_{k+1} ∧ … ∧ s_q → s_k, k = 1 to q.   (4-5)

(B) Calculate the confidence value of each fuzzy association rule as

\frac{\sum_{i=1}^{n} f_s}{\sum_{i=1}^{n} (f_{s_1} ∧ … ∧ f_{s_{k-1}} ∧ f_{s_{k+1}} ∧ … ∧ f_{s_q})}.   (4-6)

STEP 10: Output the rules with confidence values larger than or equal to the confidence threshold λ.
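To make the level-wise fuzzy procedure concrete, the following sketch counts fuzzy 1-itemsets and 2-itemsets with the minimum operator of Equations (4-3) and (4-4); the fuzzy transactions mirror Table 4.2, while the threshold value and all names are assumptions of this sketch rather than part of the thesis.

```python
from itertools import combinations

# Illustrative fuzzy transactions (region -> membership degree), in the spirit of Table 4.2.
transactions = [
    {"A.Low": 0.5, "C.Low": 0.5, "C.Middle": 1.0, "E.Middle": 0.2, "E.High": 0.8},
    {"B.Low": 0.8, "B.Middle": 0.2, "C.Middle": 1.0, "D.Middle": 0.2, "D.High": 0.8},
    {"B.Low": 0.2, "C.Low": 0.8, "C.Middle": 1.0, "E.Middle": 0.5, "E.High": 0.5},
    {"C.Low": 1.0, "E.High": 1.0},
]
n = len(transactions)
alpha = 0.3                      # minimum support threshold (illustrative)
min_count = n * alpha            # minimum fuzzy count, as in STEP 3

# Large fuzzy 1-itemsets: scalar cardinality of each region, Eq. (4-1).
counts1 = {}
for t in transactions:
    for region, f in t.items():
        counts1[region] = counts1.get(region, 0.0) + f
L1 = {r for r, c in counts1.items() if c >= min_count}

# Candidate 2-itemsets joined from L1 (regions of different items only),
# counted with the minimum operator, Eqs. (4-3) and (4-4).
counts2 = {}
for a, b in combinations(sorted(L1), 2):
    if a.split(".")[0] == b.split(".")[0]:
        continue                 # skip two regions of the same item
    counts2[(a, b)] = sum(min(t.get(a, 0.0), t.get(b, 0.0)) for t in transactions)
L2 = {s for s, c in counts2.items() if c >= min_count}

print(sorted(L1))
print(sorted(L2))
```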

(50) 4.1.3 Concept-drift Patterns for Fuzzy Association Rules

In this part, we generalize the original concept-drift patterns in [24] to quantitative transactions. The following concept-drift patterns of fuzzy association rules are considered. The first one is the fuzzy emerging pattern, in which both the conditional and the consequent parts of two fuzzy association rules from two different databases are the same, but the fuzzy support values of the conditional or consequent parts are different. The second one is the unexpected change for fuzzy association rules; it considers two rules in different databases with similar conditional parts but quite different consequent parts. The last one also considers an unexpected change for fuzzy association rules, but with similar consequent parts and quite different conditional parts. They are described below. The added and perished concept-drift patterns are not considered in this thesis.

(A) The fuzzy emerging change

In fuzzy emerging patterns, both the conditional and the consequent parts of two fuzzy association rules from two different databases are the same, but the fuzzy support values of the conditional or consequent parts are different. There are three kinds of fuzzy support change for fuzzy emerging patterns. The first case is that the fuzzy support values of the conditional terms of the two fuzzy association rules are similar but the fuzzy support values of the consequent terms are different. The second case is that the fuzzy support values of the conditional terms are different but the fuzzy support values of the consequent terms are similar. The third case is that the fuzzy support values of the conditional terms and the consequent terms are both different. Two rules

(51) with similar support values in both the consequent and conditional parts are not considered, since they do not change significantly and are thus not emerging patterns. In order to calculate fuzzy concept-drift patterns, the following formula, modified from [24], is adopted to estimate the similarity ps of the premise (conditional) parts of two fuzzy association rules:

ps_{ij} = \begin{cases} \dfrac{\ell_{ij} \times \sum_{k \in A_{ij}} x_{ijk}}{|A_{ij}|}, & \text{if } |A_{ij}| \neq 0, \\ 0, & \text{if } |A_{ij}| = 0. \end{cases}   (4-7)

The notation in this formula is briefly explained as follows: ps_{ij} is the degree of premise similarity between the two rules r_i^t and r_j^{t+k}, with 0 ≤ ps_{ij} ≤ 1; \ell_{ij} is the degree of attribute match of the premise parts of the two rules r_i^t and r_j^{t+k}; |A_{ij}| is the number of attributes common to both conditional parts of r_i^t and r_j^{t+k}; and x_{ijk} is the degree of attribute value (linguistic term) match of the k-th matching attribute in A_{ij}. In the above formula, \ell_{ij} can be defined by the following formula [24]:

\ell_{ij} = \frac{|A_{ij}|}{\max(|X_i^t|, |X_j^{t+k}|)},   (4-8)

where |X_i^t| and |X_j^{t+k}| are the numbers of attributes in the premise parts of r_i^t and r_j^{t+k}, respectively. x_{ijk} represents the degree of attribute value (linguistic term) match

(52) of the k-th matching attribute in A_{ij}. It can be defined as follows by considering a fuzzy match:

x_{ijk} = 1 - \left( \frac{interval\_distance_{ij}}{n_k - 1} \right)^{\alpha},   (4-9)

where n_k is the number of membership functions of the k-th attribute, interval\_distance_{ij} is the number of intervals between the linguistic values of the same attribute in the two rules r_i^t and r_j^{t+k}, and α is a parameter controlling the effect of different linguistic values. For example, if an attribute has only three linguistic terms, high, middle, and low, then the value of interval_distance between high and middle is 1, and between high and low it is 2. After the premise similarity of fuzzy rules is defined, the similarity cs of the consequent parts of two fuzzy association rules is designed as follows:

cs_{ij} = c_{ij} \times \left( 1 - \left( \frac{interval\_distance_{ij}}{n_k - 1} \right)^{\alpha} \right),   (4-10)

where c_{ij} = 1 if the consequent attributes (not including values) of the two rules r_i^t and r_j^{t+k} are the same, and c_{ij} = 0 otherwise. If both the ps_{ij} and cs_{ij} values are equal to or larger than a predefined threshold value T, then the supports of the two rules are checked according to the three cases mentioned above.
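A small sketch of how the premise and consequent similarities of Equations (4-7) to (4-10) might be computed; encoding linguistic terms as integer indices and the function names are assumptions of this sketch, not the thesis's implementation.

```python
def premise_similarity(prem_t, prem_tk, n_terms, alpha=1.0):
    """Premise similarity ps of Eqs. (4-7)-(4-9) for two rules' conditional parts.

    prem_t / prem_tk: dict attribute -> linguistic-term index (0 = low, 1 = middle, ...),
    n_terms: dict attribute -> number of membership functions of that attribute.
    """
    common = set(prem_t) & set(prem_tk)              # attributes in both premises, A_ij
    if not common:
        return 0.0
    l_ij = len(common) / max(len(prem_t), len(prem_tk))                     # Eq. (4-8)
    x_sum = 0.0
    for attr in common:
        interval_distance = abs(prem_t[attr] - prem_tk[attr])
        x_sum += 1.0 - (interval_distance / (n_terms[attr] - 1)) ** alpha   # Eq. (4-9)
    return l_ij * x_sum / len(common)                                       # Eq. (4-7)

def consequent_similarity(cons_t, cons_tk, n_terms, alpha=1.0):
    """Consequent similarity cs of Eq. (4-10) for single-attribute consequents."""
    (attr_t, term_t), = cons_t.items()
    (attr_tk, term_tk), = cons_tk.items()
    if attr_t != attr_tk:
        return 0.0                                   # c_ij = 0 when the attributes differ
    interval_distance = abs(term_t - term_tk)
    return 1.0 - (interval_distance / (n_terms[attr_t] - 1)) ** alpha

# Example: Apple.Low, Milk.High -> Bread.Middle   versus   Apple.Low, Milk.High -> Bread.Low
n_terms = {"Apple": 3, "Milk": 3, "Bread": 3}
ps = premise_similarity({"Apple": 0, "Milk": 2}, {"Apple": 0, "Milk": 2}, n_terms)
cs = consequent_similarity({"Bread": 1}, {"Bread": 0}, n_terms)
print(ps, cs)   # 1.0 and 0.5, matching the discussion of the unexpected-change example
```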

Below is an example to show the above concepts. Assume the membership functions for the purchased amount of apples are the same as those in Figure 4.1. Also assume the two fuzzy association rules shown in Table 4.3 come from two different databases.

Table 4.3: The first case of an emerging pattern for two fuzzy association rules

Database    Fuzzy Association Rules
Dt          (Apple.Low, 0.5), (Milk.High, 0.6) → (Bread.High, 0.7)
Dt+k        (Apple.Low, 0.5), (Milk.High, 0.6) → (Bread.High, 0.5)

In Table 4.3, both the premise similarity and the consequent similarity of the two association rules are very high (the two parts are actually identical), so their fuzzy support change is then judged. The fuzzy support values of the premise parts of the two rules are the same, but the values of the consequent parts differ by 0.2. If 0.2 is larger than the threshold, this is the first case of an emerging pattern.

The two fuzzy association rules in Table 4.4 are another example. Both the premise similarity and the consequent similarity of the two association rules are very high (actually identical), so their fuzzy support change is judged. The fuzzy support values of the premise parts of the two rules are different, but the values of the consequent parts are the same. It is the second case of an emerging pattern.

Table 4.4: The second case of an emerging pattern for two fuzzy association rules

Database    Fuzzy Association Rules
Dt          (Apple.Low, 0.5), (Milk.High, 0.9) → (Bread.High, 0.7)
Dt+k        (Apple.Low, 0.5), (Milk.High, 0.7) → (Bread.High, 0.7)

At last, Table 4.5 shows an example of the third case. Both the premise similarity and the consequent similarity of the two association rules are the same, while the fuzzy support values of both the premise and the consequent terms are different. It is thus the third case of an emerging pattern.

Table 4.5: The third case of an emerging pattern for two fuzzy association rules

Database    Fuzzy Association Rules
Dt          (Apple.Low, 0.4), (Milk.High, 1.0) → (Bread.High, 0.7)
Dt+k        (Apple.Low, 0.4), (Milk.High, 0.6) → (Bread.High, 0.9)
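As a rough illustration of how the three cases can be told apart once two rules have been judged similar, the following Python sketch compares the fuzzy support values of the premise and consequent parts. The support-difference threshold, the helper name and the assumption that the two premises contain the same items are made only for illustration.

    def emerging_case(rule_a, rule_b, diff_threshold=0.1):
        """Classify the emerging-pattern case for two already-similar fuzzy rules.
        Each rule is (premise_supports, consequent_support), where premise_supports
        maps an item to the fuzzy support of its linguistic term in the rule; the
        two premises are assumed to contain the same items."""
        premise_changed = any(
            abs(rule_a[0][item] - rule_b[0][item]) > diff_threshold
            for item in rule_a[0]
        )
        consequent_changed = abs(rule_a[1] - rule_b[1]) > diff_threshold

        if not premise_changed and consequent_changed:
            return "case 1: consequent support changed"
        if premise_changed and not consequent_changed:
            return "case 2: premise support changed"
        if premise_changed and consequent_changed:
            return "case 3: both supports changed"
        return "no emerging pattern"

    # The two rules of Table 4.3
    r_t  = ({"Apple": 0.5, "Milk": 0.6}, 0.7)
    r_tk = ({"Apple": 0.5, "Milk": 0.6}, 0.5)
    print(emerging_case(r_t, r_tk))   # case 1: consequent support changed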

(B) The fuzzy unexpected change

There are two kinds of fuzzy concept-drift patterns for unexpected change. In the first one, the premise similarity of the two association rules is very high, but the consequent similarity of the two rules is not. The second one is the contrary; that is, the consequent similarity of the two association rules is very high, but the premise similarity of the two rules is not. Both can be judged by a threshold in a way similar to that for emerging patterns.

Below is an example of the first case. Table 4.6 shows two fuzzy association rules. The premise similarity of the two rules is high, which is 1. If there are only three membership functions for Bread, the consequent similarity of the two rules is 0.5. If the threshold is set at 0.6, the consequent parts are regarded as quite different, and it is then the first case of the unexpected change.

Table 4.6: The first case of the unexpected change for two fuzzy association rules

Database    Fuzzy Association Rules
Dt          (Apple.Low, 0.4), (Milk.High, 0.9) → (Bread.Middle, 0.4)
Dt+k        (Apple.Low, 0.4), (Milk.High, 0.9) → (Bread.Low, 0.2)

Finally, an example is given for the concept drift with different premise terms but similar consequent terms. Table 4.7 shows an example of this case. Here, the consequent similarity of the two rules is 1, but the premise similarity of the two rules is 0.5. If the threshold is set at 0.6, the premise parts are regarded as quite different, and it is then the second case of the unexpected change.

Table 4.7: The second case of the unexpected change for two fuzzy association rules

Database    Fuzzy Association Rules
Dt          (Apple.Middle, 0.6), (Milk.High, 0.9) → (Bread.High, 0.7)
Dt+k        (Apple.High, 0.8), (Milk.Low, 0.3) → (Bread.High, 0.7)
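A minimal Python sketch of this check is shown below. It takes the premise and consequent similarity values (computed with Formulas (4-7) and (4-10)) as inputs, and the 0.6 threshold follows the example above; the function name and return strings are illustrative assumptions.

    def unexpected_change(premise_sim, consequent_sim, threshold=0.6):
        """Classify the unexpected-change case from the two similarity values."""
        if premise_sim >= threshold and consequent_sim < threshold:
            return "case 1: similar premises, different consequents"
        if consequent_sim >= threshold and premise_sim < threshold:
            return "case 2: similar consequents, different premises"
        return None   # not an unexpected change

    # Table 4.6: premise similarity 1, consequent similarity 0.5
    print(unexpected_change(1.0, 0.5))   # case 1
    # Table 4.7: premise similarity 0.5, consequent similarity 1
    print(unexpected_change(0.5, 1.0))   # case 2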

4.2 The Proposed CDFAR Mining Algorithm

In this part, the proposed CDFAR approach, which combines concept drift, the fuzzy C-means algorithm and fuzzy data mining, is described as follows.

INPUT: Two quantitative transaction databases, Dt with n quantitative transactions and m items at time t, and Dt+k with w quantitative transactions and m items at time t+k; a minimum support threshold α; a minimum confidence threshold λ; a set S of concept-drift rules; a conditional (premise) similarity threshold cd; a consequent similarity threshold cs; a set of membership functions.

OUTPUT: The fuzzy concept-drift patterns.

STEP 1: Generate the fuzzy membership functions of each item in the two databases via the following sub-steps (the fuzzy C-means procedure was reviewed in Chapter 2).
(a) Set i = 1, where i is used to keep the identity number of the item currently processed.
(b) Set the center points of the N clusters found by fuzzy C-means for the i-th item as the centers of the fuzzy membership functions of its M linguistic terms.
(c) Output the fuzzy membership functions of each linguistic term; an example is shown in Figure 4.1.
(d) Set i = i + 1.
(e) If i ≤ m, go to Sub-step (b).

STEP 2: Generate the fuzzy association rules of the two databases via the following sub-steps.
(a) Transform the quantitative values of the items into fuzzy sets by the membership functions generated by fuzzy C-means.
(b) If an itemset satisfies the minimum support condition, put it in the set R of large itemsets (the fuzzy Apriori procedure is described in Section 4.1.2).
(c) Output the large itemsets and derive the fuzzy association rules from them.

STEP 3: Find the concept-drift rules from the fuzzy association rules derived from Dt and Dt+k by the following steps.

STEP 4: Initialize the set of concept-drift rules S = φ.

STEP 5: Set r = 1, where r is used to keep the identity number of the rule pair currently processed.

STEP 6: Calculate the emerging change of the fuzzy association rules and check the concept-drift rules between the two databases Dt and Dt+k by the following sub-steps.
(a) Set j = 1, where j is used to keep the identity number of the conditional term currently processed.
(b) Calculate the premise similarity ps of the conditional terms of the rule pair by Formula (4-7).
(c) Set j = j + 1.
(d) Check whether the rule pair forms a concept-drift rule (an emerging pattern) and, if so, put it into S.
(e) Calculate the consequent similarity cs of the consequent terms of the rule pair by Formula (4-10).
(f) Check whether the rule pair forms a concept-drift rule (an emerging pattern) and, if so, put it into S.

STEP 7: Calculate the unexpected change of the fuzzy association rules and check the concept-drift rules between the two databases Dt and Dt+k by the following sub-steps.
(a) Set j = 1, where j is used to keep the identity number of the conditional term currently processed.
(b) Calculate the premise similarity ps of the conditional terms of the rule pair by Formula (4-7).
(c) Set j = j + 1.
(d) Check whether the rule pair forms a concept-drift rule (an unexpected change) and, if so, put it into S.
(e) Calculate the consequent similarity cs of the consequent terms of the rule pair by Formula (4-10).
(f) Check whether the rule pair forms a concept-drift rule (an unexpected change) and, if so, put it into S.

STEP 8: Set r = r + 1.

STEP 9: If there are still unprocessed rule pairs, go to STEP 6.

STEP 10: Output the concept-drift rule set S.
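The overall flow of STEPs 3 to 10 can be summarized by the Python sketch below. It assumes that the fuzzy association rules of the two databases have already been mined in STEPs 1 and 2, and that similarity functions for Formulas (4-7) and (4-10) are supplied; the rule representation, the thresholds and all names are illustrative assumptions rather than the thesis implementation.

    def cdfar(rules_t, rules_tk, premise_sim, consequent_sim,
              sim_threshold=0.6, support_diff=0.1):
        """Sketch of STEPs 3-10: pair every rule of D^t with every rule of D^{t+k}
        and collect emerging patterns and unexpected changes.

        Each rule is a dict: {'premise': {item: (term, support)},
                              'consequent': (item, term, support)}.
        premise_sim / consequent_sim implement Formulas (4-7) and (4-10)."""
        drift = []                                        # STEP 4: S = empty set
        for r1 in rules_t:                                # STEPs 5, 8, 9: loop over rule pairs
            for r2 in rules_tk:
                ps = premise_sim(r1, r2)
                cs = consequent_sim(r1, r2)
                if ps >= sim_threshold and cs >= sim_threshold:   # STEP 6: emerging change
                    premise_moved = any(
                        abs(sup - r2['premise'][item][1]) > support_diff
                        for item, (_, sup) in r1['premise'].items()
                        if item in r2['premise'])
                    consequent_moved = (
                        abs(r1['consequent'][2] - r2['consequent'][2]) > support_diff)
                    if premise_moved or consequent_moved:
                        drift.append(('emerging', r1, r2))
                elif ps >= sim_threshold or cs >= sim_threshold:  # STEP 7: unexpected change
                    drift.append(('unexpected', r1, r2))
        return drift                                      # STEP 10: output S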

4.3 Experimental Results

In this part, the results of the experiments are reported to show the performance of the proposed fuzzy association-rule concept-drift pattern mining (CDFAR) algorithm. In the experiments, the fuzzy membership functions generated by fuzzy C-means were fixed and the same for the two databases. The experiments were implemented on a computer with an Intel Core i5-3230M 2.60 GHz processor (4 threads) and 12 GB of RAM. The operating system was Microsoft Windows 8.1 Pro, and the programming language was C# 5.0 on the .NET Framework 4.5.1. A simulated dataset containing 60 items and 10,000 transactions was used in the experiments. In the dataset, the number of purchased items in each transaction was first randomly generated, and the purchased items and their quantities in each transaction were then generated. The 10,000 transactions were divided into two datasets serving as the databases Dt and Dt+k, each with 5,000 transactions. The minimum support threshold value α was set at 0.04 (4%).
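A possible way to generate such a simulated quantitative dataset and split it into the two databases is sketched below in Python; the ranges of transaction length and item quantity are assumptions made for illustration, since the thesis does not specify them.

    import random

    def generate_transactions(n_transactions=10000, n_items=60,
                              max_items_per_tx=10, max_quantity=10, seed=1):
        """Each transaction maps an item id to a purchased quantity."""
        random.seed(seed)
        transactions = []
        for _ in range(n_transactions):
            size = random.randint(1, max_items_per_tx)      # number of purchased items
            items = random.sample(range(n_items), size)     # which items are purchased
            transactions.append({i: random.randint(1, max_quantity) for i in items})
        return transactions

    data = generate_transactions()
    d_t, d_tk = data[:5000], data[5000:]   # the two databases D^t and D^{t+k}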

Firstly, the results of the proposed approach are shown in Table 4.8.

Table 4.8: The number of fuzzy concept-drift patterns with the minimum support threshold set at 4%

Database pair                              Emerging Patterns    Unexpected Changes
Different locations                        37                   50
First half vs. second half of a year       20                   40
Two random months                          2                    7
A random month vs. the whole year          7                    13

In Table 4.8, the proposed CDFAR algorithm was performed on different pairs of databases: two databases from different locations, the databases of the first and the second half of a year, the databases of two random months, and the database of a random month against that of the whole year. From the experimental results, more concept-drift patterns are found between different locations than between different times, which indicates that, for this data, the change of location influences customer behavior more than the change of time. We also observed that the two databases of random months yield fewer matching rules, but the rules that are found are quite special. As a result, the concept-drift rules can represent different meanings (customer behaviors) at different times or in different places.
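The comparisons in Table 4.8 can be reproduced in outline by running CDFAR on each database pair and counting the two kinds of patterns; the following Python sketch assumes helper functions like the `cdfar` and rule-mining routines sketched earlier, so it is only an illustration of the experimental protocol.

    from collections import Counter

    def count_patterns(pairs, mine_rules, cdfar):
        """pairs: {name: (db_a, db_b)}; mine_rules turns a database into fuzzy rules;
        cdfar returns a list of (kind, rule_a, rule_b) tuples as sketched earlier."""
        table = {}
        for name, (db_a, db_b) in pairs.items():
            drift = cdfar(mine_rules(db_a), mine_rules(db_b))
            kinds = Counter(kind for kind, _, _ in drift)
            table[name] = (kinds["emerging"], kinds["unexpected"])
        return table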

Then we compared the experimental results of the proposed CDFAR algorithm under a different threshold, as shown in Table 4.9.

Table 4.9: The number of fuzzy concept-drift patterns with the minimum support threshold set at 3%

Database pair                              Emerging Patterns    Unexpected Changes
Different locations                        26                   42
First half vs. second half of a year       14                   30
Two random months                          1                    2
A random month vs. the whole year          4                    10

In Table 4.9, we discuss the effect of a different threshold value on the number of fuzzy concept-drift patterns. The minimum support threshold value α was set at 0.03 (3%), and the results are shown in Table 4.9. Evidently, only a few fuzzy concept-drift patterns for fuzzy association rules are derived under this threshold setting, but these concept-drift patterns are the most representative ones of each kind. Thus, a suitable threshold value should be set for the proposed CDFAR algorithm in order to obtain a reasonable number of patterns that are also representative of special meanings.
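To examine the sensitivity to the minimum support threshold discussed above, the pattern counts can be collected over a range of threshold values. The sketch below assumes helper functions `mine_rules(db, min_support)` and `cdfar(rules_a, rules_b)` like those sketched earlier, so it is an outline under those assumptions rather than the thesis implementation.

    def threshold_sweep(db_a, db_b, mine_rules, cdfar, thresholds=(0.03, 0.04, 0.05)):
        """Count emerging patterns and unexpected changes for each minimum support value."""
        results = {}
        for min_sup in thresholds:
            drift = cdfar(mine_rules(db_a, min_sup), mine_rules(db_b, min_sup))
            emerging = sum(1 for kind, _, _ in drift if kind == "emerging")
            unexpected = sum(1 for kind, _, _ in drift if kind == "unexpected")
            results[min_sup] = (emerging, unexpected)
        return results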

CHAPTER 5 CONCLUSION AND FUTURE WORK

In the first part of this thesis, we have proposed a new research issue, named fuzzy concept-drift pattern mining. In addition, the CDMF approach is developed to find concept-drift patterns for fuzzy membership functions in two different training databases. To the best of our knowledge, this research is the second work on mining concept-drift patterns for fuzzy membership functions. In particular, the proposed method helps understand the quantities of commodities purchased by customers at different times or in different places. The experimental results show that the proposed CDMF approach can find useful concept-drift rules and provide valuable information under various parameter settings.

In the second part of this thesis, we have also introduced another new issue, named fuzzy association-rule concept-drift mining, which considers not only quantities but also linguistic terms in fuzzy theory. In addition, a fuzzy association-rule mining approach (CDFAR) is designed to find fuzzy concept-drift patterns. The previous methods for fuzzy association rules cannot obtain the information about the change of customers' behavior; however, this information is very valuable for businesses.

From the experimental results, it can be observed that the proposed CDFAR approach can effectively find fuzzy concept-drift patterns for association rules.

In the future, we would like to apply the proposed algorithms to other practical applications, such as observing the changes of customers' behavior from year to year, the differences in customers' favorite products in each season, and so on. In addition, how to design more effective ways to decrease the computing time and to find more concept-drift patterns is another interesting topic.
