Conclusion and Future Research Directions - A Study on Efficiently Mining Top-k Expanded Query Term Sets in Social Tagging Systems

7-1 Conclusion

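This thesis proposes an algorithm for mining the top-k expanded query term sets from social tagging resources. Given the query tags supplied by a user, the method finds the objects that contain the query terms and analyzes the tags attached to those objects. By storing them in a tree structure, the UT-tree, the method avoids generating unnecessary expanded query term sets while mining the top-k sets by utility. Finally, by incorporating upper and lower bounds on the utility value, the UT-tree can be built dynamically, so mining no longer has to wait until all data objects have been inserted into the tree.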

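In addition, this thesis uses two methods for selecting representative tags, which filter each object's tag set and keep the tags most strongly associated with the query terms. The experimental results confirm that mining expanded query term sets with representative tags performs clearly better than mining without them. This thesis also evaluates the efficiency of the proposed UT-growth and Dynamic UT-growth algorithms against a method from related work; in the experimental data, Dynamic UT-growth runs more efficiently than the other two algorithms. Furthermore, by adjusting different parameters to simulate dense and sparse data distributions, we observe the effect on the size of the expanded query term sets; the results are consistent with the set sizes that each data distribution is able to produce.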

7-2 Future Research Directions

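Among the tag resources used in this thesis, only Delicious provides the number of times users saved each link; for all the others, the utility scores of the data objects were assigned randomly, which makes the computed scores of the expanded tag sets less realistic. Future work could provide an objective scoring model for tag resources that lack object utility scores.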

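Because the results of the overlap-rate experiment were not very good, a checking mechanism could be added so that the expanded query term sets cover non-overlapping objects. For example, the objects covered by a candidate could be compared with those covered by the expanded query term sets already selected, and the candidate discarded if the number of shared objects reaches a given threshold. Although this approach can effectively reduce the overlap rate, the extra comparisons cost time and lower efficiency, so this part deserves further study.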
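The overlap check suggested above could look roughly like the following. This is only a minimal sketch under stated assumptions, not the thesis's implementation: the function name `select_low_overlap`, the parameter `max_shared`, and the sample data are all hypothetical, and the candidates are assumed to arrive already ranked by utility, best first.

```python
from typing import List, Set, Tuple


def select_low_overlap(
    candidates: List[Tuple[str, Set[int]]],
    k: int,
    max_shared: int,
) -> List[str]:
    """Greedily keep up to k candidates, skipping any candidate that
    shares max_shared or more objects with one already selected.

    candidates: (expanded query term set, ids of the objects it covers),
                assumed sorted by utility in descending order.
    """
    selected: List[Tuple[str, Set[int]]] = []
    for name, covered in candidates:
        if len(selected) == k:
            break
        # Compare against every set kept so far; this pairwise
        # intersection is the extra comparison cost noted in the text.
        if all(len(covered & kept) < max_shared for _, kept in selected):
            selected.append((name, covered))
    return [name for name, _ in selected]


# Hypothetical ranked candidates with the object ids each one covers.
ranked = [
    ("python tutorial", {1, 2, 3, 4}),
    ("python guide", {2, 3, 4, 5}),
    ("python video", {6, 7}),
    ("python book", {1, 8, 9}),
]
print(select_low_overlap(ranked, k=3, max_shared=2))
# → ['python tutorial', 'python video', 'python book']
```

Here "python guide" is dropped because it shares three objects with "python tutorial". Each candidate is compared with up to k kept sets, which illustrates why the check lowers the overlap rate at the price of additional running time.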
