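6.1 Experimental Conclusions
Limits on computing resources and time prevented us from optimizing the results. The Chinese dataset in particular faced more problems than the English dataset with ambiguous words, domain-specific vocabulary, and unseen words, which led to unsatisfactory MSE results. Even so, for both Chinese and English reviews, the Local LDA+LRR model outperformed the baseline Bootstrap+LRR model on aspect-rating correlation. More importantly, using Local LDA for aspect segmentation does not require keywords to be specified manually in advance, which makes the approach applicable to a wider range of settings.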
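The difference between the two evaluation views mentioned here, MSE and aspect-rating correlation, can be illustrated with a minimal sketch. It assumes Pearson correlation, which this section does not specify, and the ratings below are made-up values used only for illustration, not experimental results.

```python
import math


def mean_squared_error(predicted, actual):
    """Average squared difference between predicted and ground-truth ratings."""
    if len(predicted) != len(actual) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((p - a) ** 2 for p, a in zip(predicted, actual)) / len(predicted)


def pearson_correlation(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("inputs must have equal length of at least 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical predicted and ground-truth ratings for four aspects of one review.
predicted = [3.5, 4.0, 2.5, 4.5]
actual = [4.0, 5.0, 2.0, 5.0]

mse = mean_squared_error(predicted, actual)
rho = pearson_correlation(predicted, actual)
print(f"MSE = {mse:.4f}, correlation = {rho:.4f}")
```

In this toy case the predictions rank the aspects in the right order, so the correlation is high, while the absolute rating values are still off, so the MSE is not small. This is the kind of gap between the two measures that the conclusion describes.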
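6.2 Research Contributions
This study attempts to address the LARA problem for Chinese reviews with a two-stage model that combines Local LDA and LRR. Given the overall rating and the review content, the model infers the latent aspect weights and the latent aspect ratings, giving users deeper decision-support information about the aspects hidden in reviews. In our experiments, the model outperformed the baseline model on both the Chinese and the English review datasets.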
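The relationship between the two inferred quantities and the observed overall rating can be sketched as follows. This is a simplified illustration of the rating-regression idea, in which the expected overall rating is a weighted sum of the aspect ratings. The function name, aspect names, and numbers are hypothetical, and the actual LRR model infers the weights and ratings probabilistically instead of taking them as inputs.

```python
def expected_overall_rating(aspect_weights, aspect_ratings):
    """Combine aspect ratings into an overall rating using aspect weights.

    Assumption: weights are non-negative and sum to 1, so the result stays
    on the same scale as the aspect ratings.
    """
    if set(aspect_weights) != set(aspect_ratings):
        raise ValueError("weights and ratings must cover the same aspects")
    if any(w < 0 for w in aspect_weights.values()):
        raise ValueError("aspect weights must be non-negative")
    if abs(sum(aspect_weights.values()) - 1.0) > 1e-9:
        raise ValueError("aspect weights must sum to 1")
    return sum(aspect_weights[a] * aspect_ratings[a] for a in aspect_weights)


# Hypothetical aspects for one review: how much the reviewer cares about each
# aspect (weights) and how the reviewer rates each aspect (ratings).
weights = {"value": 0.5, "service": 0.25, "location": 0.25}
ratings = {"value": 4.0, "service": 2.0, "location": 5.0}

overall = expected_overall_rating(weights, ratings)
print(overall)
```

LARA works in the opposite direction: only the overall rating and the review text are observed, and the weights and aspect ratings are the latent quantities to be recovered.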
In addition, we surveyed prior research on opinion mining and sentiment analysis, so that researchers interested in this area can explore related topics in greater depth.
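6.3 Future Research Directions
With LARA now applied to Chinese reviews, future research can proceed in three directions. The first is to tune the parameters of the Local LDA and LRR models. Parameter optimization was not a goal of this study, but increasing the number of unique tokens, computing resources and time permitting, could improve the experimental results. The second is to apply the models from this study in greater depth to areas such as personalized product recommendation and customized information retrieval. The third is to incorporate additional variables, such as time and reviewer gender, for further analysis.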