Chapter 7 Conclusions and Future Work
7.2 Future Work
The study of mining indirect associations from streaming data is still in its infancy, and many research issues merit further investigation. First, we will continue to improve the efficiency of the proposed algorithms, seeking more effective ways to reduce memory usage. We will also extend the proposed algorithms to other window models, such as the time-fading model and the sliding-window model. Recently, the design of adaptive data stream mining methods that can operate under constrained resources has emerged as an important and challenging research issue for the data mining community. In the future, we will study how to incorporate adaptive techniques such as load shedding into our approach, especially when resources such as CPU power or memory are very limited, without sacrificing too much of the quality of the discovered indirect associations.