Discussions and Conclusions - Updating generalized association rules with evolving taxonomies

In this paper, we have investigated the problem of updating generalized association rules under evolving taxonomies. We presented two algorithms, Diff_ET and Diff_ET2, for updating generalized frequent itemsets. Empirical evaluation showed that both algorithms are effective and exhibit good linear scale-up characteristics.

Before concluding, we would like to point out that our algorithms can be applied to some extensions of the problem considered in this paper. First, although we have assumed, for simplicity, that the item taxonomy is arranged as a hierarchy tree, the proposed algorithms can also handle a taxonomy organized as a directed acyclic graph or lattice. Second, the proposed algorithms can be applied to other types of data sources that are not in transactional format. As noted in [9], the process of discovering knowledge from databases or data warehouses usually involves some preliminary steps, including relevant data extraction, cleansing, and transformation, to prepare the data for the appropriate mining algorithms or tools. In this context, our algorithms can be applied to update previously discovered patterns once the relevant data have been prepared in transactional format.
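To make the first extension concrete, the following minimal sketch shows how a transaction can be extended with the generalizations of its items when the taxonomy is a directed acyclic graph, so that an item may have more than one parent. It is an illustration only and is not the implementation of Diff_ET or Diff_ET2: the taxonomy contents, the item names, and the functions `ancestors` and `extend_transaction` are all hypothetical.

```python
from typing import Dict, Iterable, Set

# Hypothetical taxonomy given as child -> direct parents.
# "laptop" has two parents, so the structure is a DAG rather than a tree.
TAXONOMY: Dict[str, Set[str]] = {
    "laptop": {"PC", "portable device"},
    "desktop": {"PC"},
    "PDA": {"portable device"},
    "PC": {"computer"},
    "portable device": {"computer"},
}


def ancestors(item: str, parents: Dict[str, Set[str]]) -> Set[str]:
    """Return every generalization of `item` reachable through parent links."""
    seen: Set[str] = set()
    stack = list(parents.get(item, ()))
    while stack:
        node = stack.pop()
        if node not in seen:  # a node may be reached along several paths in a DAG
            seen.add(node)
            stack.extend(parents.get(node, ()))
    return seen


def extend_transaction(transaction: Iterable[str],
                       parents: Dict[str, Set[str]]) -> Set[str]:
    """Add the ancestors of each purchased item to the transaction."""
    items = set(transaction)
    extended = set(items)
    for item in items:
        extended |= ancestors(item, parents)
    return extended


if __name__ == "__main__":
    print(sorted(extend_transaction(["laptop", "PDA"], TAXONOMY)))
```

Because each ancestor is recorded in a set, a generalized item such as "computer" is counted once per transaction even when it is reachable along several paths, which is the only point at which the tree assumption matters in this sketch.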

Although our work in this study has advanced the research into efficient

incorporates incremental database updates and fuzzy taxonomic structure. Another important direction is embedding the frequent pattern maintenance scheme into a data mining platform. An example is the on-line discovery of multi-dimensional association rules from databases or data warehouses [15]. The realization of such systems depends heavily on an auxiliary repository that captures, to some extent, the statistics of the patterns to be mined. Previous work along this avenue includes that of J. Han on utilizing the OLAP cube to create an OLAP-like mining environment [10][15], the iceberg cube [8], the OLAM cube [16], and materialized data mining views [7]. Efficiently maintaining this auxiliary repository with respect to data source updates and/or taxonomic structure (or, more generally, schema) evolution then becomes another important research issue.

Acknowledgement

We would like to acknowledge many constructive comments from the anonymous referees. This work was supported by the National Science Council of ROC under grant NSC 94-2213-E-390-006.

References

[1] R. Agrawal, T. Imielinski, and A. Swami, “Mining association rules between sets of items in large databases,” Proc. 1993 ACM-SIGMOD Intl. Conf. Management of Data, 1993, pp. 207-216.

[2] R. Agrawal and R. Srikant, “Fast algorithms for mining association rules,” Proc. 20th Int. Conf. Very Large Data Bases, 1994, pp. 487-499.

[3] S. Brin, R. Motwani, J.D. Ullman, and S. Tsur, “Dynamic itemset counting and implication rules for market basket data,” SIGMOD Record, Vol. 26, 1997, pp. 255-264.

[4] D.W. Cheung, J. Han, V.T. Ng, and C.Y. Wong, “Maintenance of discovered association rules in large databases: An incremental update technique,” Proc. 1996 Int. Conf. Data Engineering, 1996, pp. 106-114.

[5] D.W. Cheung, V.T. Ng, and B.W. Tam, “Maintenance of discovered knowledge: A case in multi-level association rules,” Proc. 1996 Int. Conf. Knowledge Discovery and Data Mining, 1996, pp. 307-310.

[6] D.W. Cheung, S.D. Lee, and B. Kao, “A general incremental technique for maintaining discovered association rules,” Proc. DASFAA'97, 1997, pp. 185-194.

[7] B. Czejdo, M. Morzy, M. Wojciechowski, and M. Zakrzewicz, “Materialized views in data mining,” Proc. 13th Intl. Workshop on Database and Expert Systems Applications, 2002, pp. 827-831.

[8] M. Fang, N. Shivakumar, H. Garcia-Molina, R. Motwani, and J.D. Ullman, “Computing iceberg queries efficiently,” Proc. 24th Intl. Conf. Very Large Data Bases, 1998, pp. 299-310.

[9] U. Fayyad, G. Piatetsky-Shapiro, and P. Smyth, “The KDD process for extracting useful knowledge from volumes of data,” Communications of the ACM, Vol. 39, No. 11, 1996, pp. 27-34.

[10] J. Han, “OLAP mining: An integration of OLAP with data mining,” Proc. IFIP Conf. Data Semantics, 1997, pp. 1-11.

[11] J. Han, Y. Cai, and N. Cercone, “Knowledge discovery in databases: An attribute-oriented approach,” Proc. 18th Intl. Conf. Very Large Data Bases, 1992, pp. 547-559.

Discovery in Databases (KDD’94), 1994, pp. 157-168.

[13] J. Han and Y. Fu, “Discovery of multiple-level association rules from large databases,” Proc. 21st Int. Conf. Very Large Data Bases, 1995, pp. 420-431.

[14] T.P. Hong, C.Y. Wang, and Y.H. Tao, “Incremental data mining based on two support thresholds,” Proc. 4th Int. Conf. Knowledge-Based Intelligent Engineering Systems and Allied Technologies, 2000, pp. 436-439.

[15] M. Kamber, J. Han, and J.Y. Chiang, “Metarule-guided mining of multidimensional association rules using data cubes,” Proc. 3rd Intl. Conf. Knowledge Discovery and Data Mining (KDD'97), 1997, pp. 207-210.

[16] W.Y. Lin, J.H. Su, and M.C. Tseng, “OMARS: The framework of an online multi-dimensional association rules mining system,” ICEB 2nd Intl. Conf. Electronic Business, Taipei, Taiwan, 2002.

[17] K.K. Ng and W. Lam, “Updating of association rules dynamically,” Proc. 1999 Intl. Symp. Database Applications in Non-Traditional Environments, 2000, pp. 84-91.

[18] J.S. Park, M.S. Chen, and P.S. Yu, “An effective hash-based algorithm for mining association rules,” Proc. 1995 ACM SIGMOD Intl. Conf. Management of Data, San Jose, CA, USA, 1995, pp. 175-186.

[19] A. Savasere, E. Omiecinski, and S. Navathe, “An efficient algorithm for mining association rules in large databases,” Proc. 21st Intl. Conf. Very Large Data Bases, 1995, pp. 432-444.

[20] N.L. Sarda and N.V. Srinivas, “An adaptive algorithm for incremental mining of association rules,” Proc. 9th Intl. Workshop on Database and Expert Systems Applications, 1998, pp. 240-245.

[21] R. Srikant and R. Agrawal, “Mining generalized association rules,” Proc. 21st Intl. Conf. Very Large Data Bases, 1995, pp. 407-419.

[22] S. Thomas, S. Bodagala, K. Alsabti, and S. Ranka, “An efficient algorithm for the incremental updation of association rules in large databases,” Proc. 3rd Intl. Conf. Knowledge Discovery and Data Mining, 1997.

[23] M.C. Tseng and W.Y. Lin, “Maintenance of generalized association rules with multiple minimum supports,” Intelligent Data Analysis, Vol. 8, 2004, pp. 417-436.

[24] M.J. Zaki, “Scalable algorithms for association mining,” IEEE Transactions on Knowledge and Data Engineering, Vol. 12, No. 2, 2000, pp. 372-39.
