
6 Conclusions and Future Work

6.2 Future Work

There are still many issues worth further investigation. First, although SCRIPT can efficiently and accurately amend the original classifier when the concepts of instances drift, the resulting decision tree is not guaranteed to be identical to one built from scratch. It would be interesting to find an efficient way to obtain an identical decision tree. In addition, when a two-way drift occurs, we need not amend the original sub-trees using the incoming instances; we can simply switch the classification rules, as sketched below. Therefore, further analysis of drifting conditions and a more efficient and accurate correction mechanism are part of our future work.
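To make the switching idea concrete, the following minimal Python sketch swaps two cached sub-trees once a two-way drift between two branch values has been detected. It is our illustration only, not part of SCRIPT; the Node structure and the swap_on_two_way_drift helper are hypothetical names.

class Node:
    def __init__(self, attribute=None, children=None, label=None):
        self.attribute = attribute      # splitting attribute at an internal node
        self.children = children or {}  # attribute value -> child sub-tree
        self.label = label              # class label at a leaf

def swap_on_two_way_drift(node, value_a, value_b):
    # In a two-way drift the concepts under the two branches have exchanged,
    # so the already-built sub-trees can simply trade places instead of
    # being amended with incoming instances.
    node.children[value_a], node.children[value_b] = \
        node.children[value_b], node.children[value_a]

root = Node(attribute="income",
            children={"low": Node(label="reject"), "high": Node(label="accept")})
swap_on_two_way_drift(root, "low", "high")
print(root.children["low"].label)   # prints "accept": the rules were switched

The point of the sketch is only that a detected two-way drift can be answered in constant time by re-linking sub-trees, whereas amending each sub-tree costs time proportional to the number of incoming instances.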

Secondly, CDR-Tree considers the case in which there are only two data blocks in a data stream. If more than two blocks must be analyzed, CDR-Trees become much larger and more complicated, as the sketch below illustrates. Therefore, another future focus is to extend CDR-Tree so that it can process multi-block concept-drifting problems more efficiently. Besides, although our extraction method can efficiently extract the classification model from a CDR-Tree, and the extracted model reaches accuracy comparable to a decision tree built from scratch, it would be interesting to find a way by which the extracted decision tree is identical to the one built from scratch.
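The small Python sketch below illustrates why more than two blocks inflate the tree. It assumes, purely for illustration, that a node in a merged tree must branch on one value per block for an attribute; it is not the CDR-Tree construction itself, only a back-of-the-envelope argument for the claim above.

from itertools import product

domain = ["low", "mid", "high"]   # values a single attribute can take
for k in (2, 3, 4):               # number of data blocks merged into one tree
    branches = len(list(product(domain, repeat=k)))
    print(k, branches)            # 2 -> 9, 3 -> 27, 4 -> 81 candidate branches

Each additional block multiplies the number of candidate branch tuples by the domain size, which is why a naive multi-block extension quickly becomes much larger and more complicated.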

Finally, OMMD is only applicable to ordered multi-valued and multi-labeled datasets. However, there is also non-ordered multi-valued and multi-labeled data, such as the input data of MMC (multi-valued and multi-labeled classifier) [10] and MMDT (multi-valued and multi-labeled decision tree) [8]. Since many real-life datasets are non-ordered multi-valued and multi-labeled, we intend to design a non-ordered multi-valued and multi-labeled discretization algorithm in the future; a small example contrasting the two kinds of data follows.
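To fix intuitions, the toy Python records below contrast the two kinds of data named above and apply a simple equal-width binning to an ordered multi-valued attribute. The field names and the equal_width_discretize helper are our own illustrative assumptions, not OMMD.

# Non-ordered multi-valued attribute (a set) with multiple labels per record,
# as in the input of MMC [10] and MMDT [8].
record_unordered = {"hobbies": {"chess", "hiking"}, "labels": ["sports", "games"]}

# Ordered multi-valued attribute: a sequence whose positions carry meaning.
record_ordered = {"scores": [61.0, 74.5, 88.0], "labels": ["improving", "pass"]}

def equal_width_discretize(values, low, high, bins):
    # Map each numeric value to a bin index; because the input is a sequence,
    # the order of the discretized values mirrors the order of the raw ones --
    # exactly the property a set-valued (non-ordered) attribute lacks.
    width = (high - low) / bins
    return [min(int((v - low) / width), bins - 1) for v in values]

print(equal_width_discretize(record_ordered["scores"], 0.0, 100.0, 4))  # [2, 2, 3]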

Bibliography

[1] R. Agrawal, S. Ghosh, T. Imielinski, B. Iyer and A. Swami, “An interval classifier for database mining applications,” in: Proceedings of the 18th International Conference on Very Large Databases, pp. 560-573, 1992.

[2] R. Agrawal, T. Imielinski and A. Swami, “Database mining: a performance perspective,” IEEE Transactions on Knowledge and Data Engineering, vol. 5, no. 6, pp. 914-925, 1993.

[3] S. D. Bay, “Multivariate discretization of continuous variables for set mining,” in: Proceedings of the Sixth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 315-319, 2000.

[4] F. Berzal, J. C. Cubero, N. Marín and D. Sánchez, “Building multi-way decision trees with numerical attributes,” Information Sciences, vol. 165, no. 1-2, pp. 73-90, 2004.

[5] A. Blum, “Empirical support for winnow and weighted-majority algorithms: results on a calendar scheduling domain,” Machine Learning, vol. 26, pp. 5-23, 1997.

[6] L. Breiman, J. H. Friedman, R. A. Olshen, and C. J. Stone, Classification and Regression Trees, Wadsworth, 1984.

[7] S. Chao and Y. Li, “Multivariate interdependent discretization for continuous attribute,” in: Proceedings of the Third International Conference on Information Technology and Applications, vol. 1, pp. 167-172, 2005.

[8] S. Chou and C. L. Hsu, “MMDT: a multi-valued and multi-labeled decision tree classifier for data mining,” Expert Systems with Applications, vol. 28, no. 4, pp. 799-812, 2005.

[9] N. V. Chawla, L. O. Hall, K. W. Bowyer, T. E. Moore and W. P. Kegelmeyer, “Distributed pasting of small votes,” Multiple Classifier Systems, pp. 52-61, 2002.

[10] Y. L. Chen, C. L. Hsu and S. C. Chou, “Constructing a multi-valued and multi-labeled decision tree,” Expert Systems with Applications, vol. 25, no. 2, pp. 199-209, 2003.

[11] J. Y. Ching, A. K. C. Wong and K. C. C. Chan, “Class-dependent discretization for inductive learning from continuous and mixed mode data,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 17, no. 7, pp. 641-651, 1995.

[12] D. Chiu, A. Wong and B. Cheung, “Information discovery through hierarchical maximum entropy discretization and synthesis,” Knowledge Discovery in Databases, G. Piatetsky-Shapiro and W. J. Frawley (Eds.), MIT/AAAI Press, pp. 125-140, 1991.

[13] K. J. Cios and L. A. Kurgan, “CLIP4: hybrid inductive machine learning algorithm that generates inequality rules,” Information Sciences, vol. 163, no. 1-3, pp. 37-83, 2004.

[14] K. J. Cios and L. A. Kurgan, “Hybrid inductive machine learning: an overview of CLIP algorithms,” New Learning Paradigms in Soft Computing, L. C. Jain and J. Kacprzyk (Eds.), Physica-Verlag (Springer), pp. 276-322, 2001.

[15] P. Clark and T. Niblett, “The CN2 induction algorithm,” Machine Learning, vol. 3, no. 4, pp. 261-283, 1989.

[16] W. Cohen, “Learning rules that classify e-mail,” in: Proceedings of the AAAI Spring Symposium on Machine Learning in Information Access, Menlo Park, CA, AAAI Press, Technical Report SS96-05, pp. 18-25, 1996.

[17] P. Cunningham and N. Nowlan, “A case-based approach to spam filtering that can track concept drift,” in: Proceedings of the ICCBR Workshop on Long-Lived CBR Systems, 2003.

[18] P. Domingos and G. Hulten, “Mining high-speed data streams,” in: Proceedings of the 6th International Conference on Knowledge Discovery and Data Mining, pp. 71-80, 2000.

[19] J. Dougherty, R. Kohavi and M. Sahami, “Supervised and unsupervised discretization of continuous features,” in: Proceedings of the 12th International Conference on Machine Learning, pp. 194-202, 1995.

[20] T. Elomaa, J. Kujala and J. Rousu, “Practical approximation of optimal multivariate discretization,” in: Proceedings of the 16th International Symposium on Foundations of Intelligent Systems, pp. 612-621, 2006.

[21] H. Fan and K. Ramamohanarao, “Fast discovery and the generalization of strong jumping emerging patterns for building compact and accurate classifiers,” IEEE Transactions on Knowledge and Data Engineering, vol. 18, no. 6, pp. 721-737, 2006.

[22] W. Fan, “Systematic data selection to mine concept-drifting data streams,” in: Proceedings of the 10th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 128-137, 2004.

[23] U. M. Fayyad and K. B. Irani, “Multi-interval discretization of continuous-valued attributes for classification learning,” in: Proceedings of the 13th International Joint Conference on Artificial Intelligence, pp. 1022-1027, 1993.

[24] U. M. Fayyad and K. B. Irani, “On the handling of continuous-valued attributes in decision tree generation,” Machine Learning, vol. 8, pp. 87-102, 1992.

[25] S. Ferrandiz and M. Boullé, “Multivariate discretization by recursive supervised bipartition of graph,” in: Proceedings of the 4th International Conference on Machine Learning and Data Mining in Pattern Recognition, pp. 253-264, 2005.

[26] A. A. Freitas, “Understanding the crucial differences between classification and discovery of association rules,” SIGKDD Explorations, vol. 2, no. 1, pp. 65-69, 2000.

[27] J. Fürnkranz and G. Widmer, “Incremental reduced error pruning,” in: Proceedings of the 11th International Conference on Machine Learning, pp. 70-77, 1994.

[28] J. Gehrke, R. Ramakrishnan and V. Ganti, “RainForest: a framework for fast decision tree construction of large datasets,” Data Mining and Knowledge Discovery, vol. 4, no. 2/3, pp. 127-162, 2000.

[29] D. Gómez, J. Montero and J. Yáñez, “A coloring fuzzy graph approach for image classification,” Information Sciences, vol. 176, no. 24, pp. 3645-3657, 2006.

[30] J. Han and M. Kamber, Data Mining: Concepts and Techniques, Morgan Kaufmann Publisher, 2001.

[31] M. B. Harries, C. Sammut and K. Horn, “Extracting hidden context,” Machine Learning, vol. 32, no. 2, pp. 101-126, 1998.

[32] G. Hulten, L. Spencer and P. Domingos, “Mining time-changing data streams,” in: Proceedings of the 7th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 97-106, 2001.

[33] R. Jin and G. Agrawal, “Efficient decision tree construction on streaming data,” in: Proceedings of the 9th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 571-576, 2003.

[34] N. Japkowicz and S. Stephen, “The class imbalance problem: a systematic study,” Intelligent Data Analysis, vol. 6, no. 5, pp. 429-450, 2002.

[35] K. A. Kaufman and R. S. Michalski, “Learning from inconsistent and noisy data: the AQ18 approach,” in: Proceedings of the 11th International Symposium on Methodologies for Intelligent Systems, 1999.

[36] R. Kerber, “ChiMerge: discretization of numeric attributes,” in: Proceedings of the 9th International Conference on Artificial Intelligence, pp. 123-128, 1992.

[37] D. Kifer, S. Ben-David and J. Gehrke, “Detecting change in data streams,” in: Proceedings of the 30th International Conference on Very Large Databases, pp. 180-191, Toronto, Canada, 2004.

[38] R. Klinkenberg, “Learning drifting concepts: example selection vs. example weighting,” Intelligent Data Analysis, vol. 8, no. 3, pp. 281-300, 2004.

[39] R. Klinkenberg and I. Renz, “Adaptive information filtering: learning in the presence of concept drifts,” in: Proceedings of the International Conference on Machine Learning, pp. 33-40, Menlo Park, California, 1998.

[40] J. Z. Kolter and M. A. Maloof, “Dynamic weighted majority: a new ensemble method for tracking concept drift,” in: Proceedings of the 3rd International IEEE Conference on Data Mining, pp. 123-130, Melbourne, FL, 2003.

[41] I. Koychev, “Gradual forgetting for adaptation to concept drift,” in: Proceedings of the ECAI 2000 Workshop on Spatio-Temporal Reasoning, Berlin, Germany, 2000.

[42] A. Kuh, T. Petsche and R. L. Rivest, “Learning time-varying concepts,” in: Advances in Neural Information Processing Systems 3, San Francisco, CA: Morgan Kaufmann, pp. 183-189, 1991.

[43] L. Kurgan and K. J. Cios, “Fast class-attribute interdependence maximization (CAIM) discretization algorithm,” in: Proceedings of the International Conference on Machine Learning and Applications, pp. 30-36, 2003.

[44] L. Kurgan and K. J. Cios, “CAIM discretization algorithm,” IEEE Transactions on Knowledge and Data Engineering, vol. 16, no. 2, pp. 145-153, 2004.

[45] T. Lane and C. E. Brodley, “Approaches to online learning and concept drift for user identification in computer security,” in: Proceedings of the 4th International Conference on Knowledge Discovery and Data Mining, pp. 259-263, New York, 1998.

[46] M. Lazarescu and S. Venkatesh, “Using multiple windows to track concept drift,” Intelligent Data Analysis, vol. 8, no. 1, pp. 29-59, 2004.

[47] C. I. Lee, C. J. Tsai, T. Q. Wu and W. P. Yang, “A multi-relational classifier for imbalanced database,” Expert Systems with Applications, accepted, to appear in vol. 36, no. 3.

[48] C. I. Lee, C. J. Tsai and C. W. Ku, “An evolutionary and attribute-oriented ensemble classifier,” in: Proceedings of International Conference on Computational Science and its Applications, pp. 1210-1218, 2006.

[49] H. Liu, F. Hussain, C. L. Tan and M. Dash, “Discretization: an enabling technique,” Data Mining and Knowledge Discovery, vol. 6, no. 4, pp. 393-423, 2002.

[50] H. Liu and R. Setiono, “Feature selection via discretization,” IEEE Transactions on Knowledge and Data Engineering, vol. 9, no. 4, pp. 642-645, 1997.

[51] M. A. Maloof and R. S. Michalski, “Incremental learning with partial instance memory,” Artificial Intelligence, vol. 154, no. 1-2, pp. 95-126, 2004.

[52] M. A. Maloof, “Incremental rule learning with partial instance memory for changing concepts,” in: Proceedings of the International Joint Conference on Neural Networks, pp. 2764-2769, Los Alamitos, CA, IEEE Press, 2003.

[53] M. A. Maloof and R. S. Michalski, “Selecting examples for partial memory learning,” Machine Learning, vol. 41, no. 1, pp. 27-52, 2000.

[54] M. Mehta, R. Agrawal and J. Rissanen, “SLIQ: a fast scalable classifier for data mining,” in: Proceedings of the 5th International Conference on Extending Database Technology, pp. 18-32, 1996.

[55] M. Mehta, J. Rissanen and R. Agrawal, “MDL-based decision tree pruning,” in: Proceedings of the First International Conference on Knowledge Discovery and Data Mining, pp. 216-221, 1995.

[56] S. Mehta, S. Parthasarathy and H. Yang, “Correlation preserving discretization,” in: Proceedings of the 4th IEEE International Conference on Data Mining, pp. 479-482, 2004.

[57] T. Menzies, “Data mining for very busy people,” in: Proceedings of the International IEEE Conference on Data Mining, pp. 22-29, 2003.

[58] R. S. Michalski, I. Mozetic, J. Hong and N. Lavrac, “The multipurpose incremental learning system AQ15 and its testing application to three medical domains,” in: Proceedings of the 5th National Conference on Artificial Intelligence, pp. 1041-1045, 1986.

[59] P. M. Murphy and D. W. Aha, “UCI Repository of Machine Learning Databases,” Irvine, CA: University of California, Department of Information and Computer Science, 1992.

[60] A. Paterson and T. B. Niblett, ACLS Manual, Edinburgh: Intelligent Terminals, Ltd, 1987.

[61] B. Pfahringer, “Compression-based discretization of continuous attributes,” in: Proceedings of the 12th International Conference on Machine Learning, pp. 456-463, 1995.

[62] J. R. Quinlan, C4.5: Programs for Machine Learning, Morgan Kaufmann Publisher, San Mateo, CA, 1993.

[63] J. R. Quinlan, “Induction of decision trees,” Machine Learning, vol. 1, no. 1, pp. 81-106, 1986.

[64] R. Rastogi and K. Shim, “PUBLIC: a decision tree classifier that integrates building and pruning,” in: Proceedings of the 24th International Conference on Very Large Databases, pp. 404-415, 1998.

[65] J. C. Shafer, R. Agrawal and M. Mehta, “SPRINT: a scalable parallel classifier for data mining,” in: Proceedings of the 22nd International Conference on Very Large Databases, pp. 544-555, 1996.

[66] J. C. Schlimmer and D. H. Fisher, “A case study of incremental concept induction,” in: Proceedings of the 5th National Conference on Artificial Intelligence, pp. 496-501, Philadelphia, PA, 1986.

[67] J. C. Schlimmer and R. H. Granger, “Beyond incremental processing: tracking concept drift,” in: Proceedings of the 5th National Conference on Artificial Intelligence, pp. 502-507, Philadelphia, PA, 1986.

[68] W. Street and Y. Kim, “A streaming ensemble algorithm for large-scale classification,” in: Proceedings of the 7th International Conference on Knowledge Discovery and Data Mining, pp. 377-382, New York, 2001.

[69] C. T. Su and J. H. Hsu, “An extended chi2 algorithm for discretization of real value attributes,” IEEE Transactions on Knowledge and Data Engineering, vol. 17, no. 3, pp. 437-441, 2005.

[70] F. Tay and L. Shen, “A modified chi2 algorithm for discretization,” IEEE Transactions on Knowledge and Data Engineering, vol. 14, no. 3, pp. 666-670, 2002.

[71] P. E. Utgoff, “Incremental induction of decision trees,” Machine Learning, vol. 4, no. 2, pp. 161-186, 1989.

[72] P. E. Utgoff, N. C. Berkman and J. A. Clouse, “Decision tree induction based on efficient tree restructuring,” Machine Learning, vol. 29, no. 1, pp. 5-44, 1997.

[73] H. Wang, W. Fan, P. S. Yu and J. Han, “Mining concept-drifting data streams using ensemble classifiers,” in: Proceedings of the 9th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 226-235, Washington, DC, 2003.

[74] L. Wang, H. Zhao, G. Dong and J. Li, “On the complexity of finding emerging patterns,” Theoretical Computer Science, vol. 335, no. 1, pp. 15-27, 2006.

[75] G. Widmer and M. Kubat, “Learning in the presence of concept drift and hidden contexts,” Machine Learning, vol. 23, no. 1, pp. 69-101, 1996.

[76] A. K. C. Wong and D. K. Y. Chiu, “Synthesizing statistical knowledge from incomplete mixed-mode data,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 9, pp. 796-805, 1987.

[77] Q. X. Wu, D. A. Bell, T. M. McGinnity, G. Prasad, G. Qi and X. Huang, “Improvement of decision accuracy using discretization of continuous attributes,” in: Proceedings of the Third International Conference on Fuzzy Systems and Knowledge Discovery, Lecture Notes in Computer Science 4223, pp. 674-683, 2006.

[78] S. Zadrożny and J. Kacprzyk, “Computing with words for text processing: an approach to the text categorization,” Information Sciences, vol. 176, no. 4, pp. 415-437, 2006.