
Chapter 7: Conclusions and Possible Future Research Directions

7.2 Possible Future Research Directions

National Chengchi University (國立政治大學)


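This study selected MSRICC and MSRIICC for analysis from among three co-clustering algorithms: the Euclidean Coclustering Algorithm (MSRICC), the Information Theoretic Coclustering algorithm (ITCC), and the Minimum Squared Residue Clustering algorithm (MSRIICC). The clustering results still leave room for improvement. Future work could combine co-clustering algorithms with other methods, or analyze and test the data with methods other than PCA, in order to cluster posts and word strings more accurately.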

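In addition, after clustering, the posts in this study were first evaluated by cohesion rate and discrimination rate and then analyzed subjectively to interpret the clustering results. Future work should look for a method that analyzes the clustering results objectively, and then extend that method to uncover the structure of the posts within each cluster. As for keywords, a term's importance in a post is not determined by how many times it appears. We therefore also hope to add methods that judge the weight of the terms in posts, assign a weight to every term, and present the higher-weighted keywords to users. Besides letting users quickly grasp the structure of a cluster's posts and the content they represent, this would let them quickly locate relevant post clusters through keywords.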
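The term weighting suggested above could take many forms, and this study does not name one. As a minimal sketch, assuming TF-IDF as the weighting scheme and treating each cluster as a single document, the following Python ranks the keywords of each cluster. The function name `tfidf_keywords`, the input format, and the toy data are hypothetical and are not taken from this study.

```python
import math
from collections import Counter


def tfidf_keywords(clusters, top_k=3):
    """Rank terms per cluster by TF-IDF, treating each cluster as one document.

    clusters: dict mapping a cluster label to a list of segmented posts,
              where each post is a list of terms (hypothetical input format).
    Returns a dict mapping each label to its top_k (term, weight) pairs.
    """
    # Term frequency: count every term over all posts in a cluster.
    term_counts = {
        label: Counter(term for post in posts for term in post)
        for label, posts in clusters.items()
    }
    n_clusters = len(term_counts)
    # Document frequency: number of clusters in which a term appears.
    doc_freq = Counter(term for counts in term_counts.values() for term in counts)

    keywords = {}
    for label, counts in term_counts.items():
        total = sum(counts.values())
        weights = {
            term: (count / total) * math.log(n_clusters / doc_freq[term])
            for term, count in counts.items()
        }
        # Sort by descending weight, then alphabetically for stable output.
        ranked = sorted(weights.items(), key=lambda kv: (-kv[1], kv[0]))
        keywords[label] = ranked[:top_k]
    return keywords


# Toy data (assumed): two clusters of already-segmented posts.
example_clusters = {
    "A": [["music", "concert", "ticket"], ["music", "concert"], ["music"]],
    "B": [["music", "playlist", "app"], ["music", "app"], ["music"]],
}

for label, ranked in tfidf_keywords(example_clusters).items():
    print(label, [term for term, _ in ranked])
```

In this toy input, `music` is the most frequent term in both clusters yet receives a weight of 0 because it appears in every cluster, which illustrates why raw occurrence counts alone do not identify the keywords that distinguish a cluster.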



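Source: 結合中文斷詞系統與雙分群演算法於音樂相關臉書粉絲團之分析：以KKBOX為例 (Combining a Chinese Word Segmentation System and Co-clustering Algorithms to Analyze Music-Related Facebook Fan Pages: The Case of KKBOX), NCCU Academic Hub (政大學術集成), pp. 110-117.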
