From Figure 6, it can be seen that the proposed mining algorithm generates more rules
than FDM does. The taxonomic structures gather purchased items into general classes and can
thus generate additional large itemsets. A large number of rules may, however, be hard for
users to interpret. The support and confidence thresholds in the proposed algorithm can
therefore be set higher than those used for FDM. For example, when the minimum support
value is set at 1000, the number of rules can be reduced to below 250 (from Figure 4).
7. Discussion and Conclusions
In this paper, we have proposed a fuzzy multiple-level data-mining algorithm that can
process transaction data with quantitative values and discover interesting patterns among
them. The rules thus mined exhibit quantitative regularity on multiple levels and can be used
to provide suggestions to appropriate supervisors. Compared to conventional crisp-set mining
methods for quantitative data, our approach produces smoother mining results owing to its
fuzzy membership characteristics. The mined rules are expressed in linguistic terms, which are
more natural and understandable to human beings. The proposed mining algorithm can also be
reduced to a conventional non-fuzzy mining algorithm by assigning a single membership
function whose value is always 1 for quantities larger than zero.
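To illustrate this degeneration, a minimal sketch in Python (the linguistic regions and breakpoints are hypothetical, not taken from the paper): quantities are fuzzified with triangular membership functions, and substituting a single function that returns 1 for every positive quantity recovers conventional crisp mining.

```python
def triangular(x, a, b, c):
    """Triangular membership function rising from a to a peak at b
    and falling back to zero at c."""
    if x <= a or x >= c:
        return 0.0
    if x <= b:
        return (x - a) / (b - a)
    return (c - x) / (c - b)

def fuzzify(quantity):
    """Hypothetical fuzzy regions for a purchased quantity."""
    return {
        "Low": triangular(quantity, 0, 1, 6),
        "Middle": triangular(quantity, 1, 6, 11),
        "High": triangular(quantity, 6, 11, 16),
    }

def crisp(quantity):
    """Degenerate case: a single membership function that is always 1
    for quantities larger than zero, i.e. conventional non-fuzzy mining."""
    return {"Bought": 1.0 if quantity > 0 else 0.0}
```

Running the mining procedure with `crisp` in place of `fuzzify` counts each purchased item exactly once per transaction, as a boolean algorithm would.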
When compared to fuzzy mining methods that take all fuzzy regions into
consideration, our method achieves better time complexity, since only the most important
fuzzy term is used for each item. If all fuzzy terms were considered, the number of possible
combinations to search would be large. A trade-off therefore exists between rule completeness
and time complexity.
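The single-term strategy can be sketched as follows; the term names and pre-summed membership counts below are illustrative only:

```python
def best_term(fuzzy_counts):
    """Given a map from a linguistic term to its membership summed over
    all transactions, keep only the term with the maximum count."""
    term = max(fuzzy_counts, key=fuzzy_counts.get)
    return term, fuzzy_counts[term]

# Illustrative summed memberships for one item's three terms.
counts = {"milk.Low": 2.1, "milk.Middle": 5.4, "milk.High": 1.8}
print(best_term(counts))  # ('milk.Middle', 5.4)
```

Only the winning term then enters candidate generation, so each item contributes one symbol rather than one per fuzzy region.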
Our proposed algorithm does not find association rules among items on the same path in
the given hierarchy trees. With a slight modification, however, it can find such rules and thus
analyze interesting behavior among these items. For example, a certain brand of milk may be
found to owe its popularity to its chocolate flavor through a rule stating that if that brand of
milk has high sales figures, then its chocolate-flavored milk also has high sales figures.
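Such a modification would need a test for whether two items lie on the same path of a hierarchy tree; a sketch using an illustrative child-to-parent taxonomy:

```python
# Illustrative taxonomy, stored as child -> parent links.
parent = {
    "chocolate milk": "milk",
    "plain milk": "milk",
    "milk": "food",
}

def ancestors(item):
    """Return the set of ancestors of an item in the taxonomy."""
    result = set()
    while item in parent:
        item = parent[item]
        result.add(item)
    return result

def on_same_path(a, b):
    """True if one item is an ancestor of the other, i.e. the two
    items lie on the same path of the hierarchy tree."""
    return a in ancestors(b) or b in ancestors(a)

print(on_same_path("milk", "chocolate milk"))        # True
print(on_same_path("plain milk", "chocolate milk"))  # False
```

The original algorithm would prune item pairs for which `on_same_path` is true; the modification instead keeps them so cross-level rules along one branch can be reported.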
Our algorithm can also easily be modified to let users specify the items or terms they are
interested in, so that the mined rules satisfy the users' requirements. For example, users
can require that only items with high sales quantities be mined, which would greatly reduce
the mining time.
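One way such an item constraint might be realized is to filter candidate itemsets against the user-specified items before support counting (the helper and data are illustrative, not part of the original algorithm):

```python
def restrict(itemsets, interesting):
    """Keep only the itemsets composed entirely of user-specified
    items, shrinking the search space before support counting."""
    return [s for s in itemsets if set(s) <= interesting]

interesting = {"milk", "bread"}
candidates = [("milk",), ("milk", "bread"), ("milk", "beer")]
print(restrict(candidates, interesting))  # [('milk',), ('milk', 'bread')]
```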
Although the proposed method works well in mining quantitative transaction data, it is
just a beginning; much work remains to be done in this field. In the future, we will first
extend the proposed algorithm to resolve the two problems stated above. Our method also
assumes that the membership functions are known in advance. In [15-17], we proposed some
fuzzy learning methods that automatically derive membership functions. We will therefore
attempt to adjust the membership functions dynamically in the proposed mining algorithm to
avoid the bottleneck of membership-function acquisition. We will also attempt to design
specific data-mining models for various problem domains.
Acknowledgment
The authors would like to thank the anonymous referees for their very constructive
comments. This research was supported by the National Science Council of the Republic of
China under contract NSC89-2213-E-214-003.
References
[1] R. Agrawal, T. Imielinski and A. Swami, “Mining associations between sets of items in
massive databases,” The 1993 ACM SIGMOD Conference on Management of Data,
Washington DC, USA, 1993, pp. 207-216.
[2] R. Agrawal, T. Imielinski and A. Swami, “Database mining: a performance perspective,”
IEEE Transactions on Knowledge and Data Engineering, Vol. 5, No. 6, 1993, pp.
914-925.
[3] A. F. Blishun, “Fuzzy learning models in expert systems,” Fuzzy Sets and Systems, Vol. 22,
No. 1, 1987, pp. 57-70.
[4] L. M. de Campos and S. Moral, “Learning rules for a fuzzy inference model,” Fuzzy Sets
and Systems, Vol. 59, No. 2, 1993, pp. 247-257.
[5] R. L. P. Chang and T. Pavlidis, “Fuzzy decision tree algorithms,” IEEE Transactions on
Systems, Man and Cybernetics, Vol. 7, No. 1, 1977, pp. 28-35.
[6] M. S. Chen, J. Han and P. S. Yu, “Data mining: an overview from a database perspective,”
IEEE Transactions on Knowledge and Data Engineering, Vol. 8, No. 6, 1996, pp.
866-883.
[7] C. Clair, C. Liu and N. Pissinou, “Attribute weighting: a method of applying domain
knowledge in the decision tree process,” The Seventh International Conference on
Information and Knowledge Management, Bethesda, Maryland, USA, 1998, pp. 259-266.
[8] M. Delgado and A. Gonzalez, “An inductive learning procedure to identify fuzzy
systems,” Fuzzy Sets and Systems, Vol. 55, No. 2, 1993, pp. 121-132.
[9] A. Famili, W. M. Shen, R. Weber and E. Simoudis, “Data preprocessing and intelligent
data analysis,” Intelligent Data Analysis, Vol. 1, No. 1, 1997, pp. 3-23.
[10] W. J. Frawley, G. Piatetsky-Shapiro and C. J. Matheus, “Knowledge discovery in
databases: an overview,” The AAAI Workshop on Knowledge Discovery in Databases,
Anaheim, CA, 1991, pp. 1-27.
[11] T. Fukuda, Y. Morimoto, S. Morishita and T. Tokuyama, “Mining optimized association
rules for numeric attributes,” The Fifteenth ACM SIGACT-SIGMOD-SIGART
Symposium on Principles of Database Systems, Montreal, Canada, 1996, pp. 182-191.
[12] A. Gonzalez, “A learning methodology in uncertain and imprecise environments,”
International Journal of Intelligent Systems, Vol. 10, No. 3, 1995, pp. 357-371.
[13] J. Han and Y. Fu, “Discovery of multiple-level association rules from large databases,”
The International Conference on Very Large Databases, Zurich, Switzerland, 1995, pp.
420-431.
[14] T. P. Hong, C. S. Kuo and S. C. Chi, “A data mining algorithm for transaction data
with quantitative values,” Intelligent Data Analysis, Vol. 3, No. 5, 1999, pp. 363-376.
[15] T. P. Hong and J. B. Chen, “Finding relevant attributes and membership functions,”
Fuzzy Sets and Systems, Vol. 103, No. 3, 1999, pp. 389-404.
[16] T. P. Hong and J. B. Chen, “Processing individual fuzzy attributes for fuzzy rule
induction,” Fuzzy Sets and Systems, Vol. 112, No. 1, 2000, pp. 127-140.
[17] T. P. Hong and C. Y. Lee, “Induction of fuzzy rules and membership functions from
training examples,” Fuzzy Sets and Systems, Vol. 84, No. 1, 1996, pp. 33-47.
[18] T. P. Hong, C. S. Kuo and S. C. Chi, “Trade-off between time complexity and number
of rules for fuzzy mining from quantitative data,” International Journal of Uncertainty,
Fuzziness, and Knowledge-based Systems, Vol. 9, No. 5, 2001, pp. 587-604.
[19] A. Kandel, Fuzzy Expert Systems, CRC Press, Boca Raton, 1992, pp. 8-19.
[20] R. S. Michalski, I. Bratko and M. Kubat, Machine Learning and Data Mining: Methods
and Applications, John Wiley & Sons Ltd, England, 1998.
[21] J. R. Quinlan, “Decision tree as probabilistic classifier,” The Fourth International
Machine Learning Workshop, Morgan Kaufmann, San Mateo, CA, 1987, pp. 31-37.
[22] J. R. Quinlan, C4.5: Programs for Machine Learning, Morgan Kaufmann, San Mateo,
CA, 1993.
[23] R. Rastogi and K. Shim, “Mining optimized association rules with categorical and
numeric attributes,” The 14th IEEE International Conference on Data Engineering,
Orlando, 1998, pp. 503-512.
[24] R. Rastogi and K. Shim, “Mining optimized support rules for numeric attributes,” The
15th IEEE International Conference on Data Engineering, Sydney, Australia, 1999, pp.
206-215.
[25] J. Rives, “FID3: fuzzy induction decision tree,” The First International Symposium on
Uncertainty, Modeling and Analysis, University of Maryland, College Park, Maryland,
1990, pp. 457-462.
[26] R. Srikant, Q. Vu and R. Agrawal, “Mining association rules with item constraints,” The
Third International Conference on Knowledge Discovery in Databases and Data Mining,
Newport Beach, California, August 1997, pp. 67-73.
[27] R. Srikant and R. Agrawal, “Mining quantitative association rules in large relational
tables,” The 1996 ACM SIGMOD International Conference on Management of Data,
Montreal, Canada, June 1996, pp. 1-12.
[28] C. H. Wang, T. P. Hong and S. S. Tseng, “Inductive learning from fuzzy examples,” The
Fifth IEEE International Conference on Fuzzy Systems, New Orleans, 1996, pp. 13-18.
[29] C. H. Wang, J. F. Liu, T. P. Hong and S. S. Tseng, “A fuzzy inductive learning strategy
for modular rules,” Fuzzy Sets and Systems, Vol. 103, No. 1, 1999, pp. 91-105.
[30] R. Weber, “Fuzzy-ID3: a class of methods for automatic knowledge acquisition,” The
Second International Conference on Fuzzy Logic and Neural Networks, Iizuka, Japan,
1992, pp. 265-268.
[31] Y. Yuan and M. J. Shaw, “Induction of fuzzy decision trees,” Fuzzy Sets and Systems, Vol.
69, No. 2, 1995, pp. 125-139.
[32] L. A. Zadeh, “Fuzzy sets,” Information and Control, Vol. 8, No. 3, 1965, pp. 338-353.