
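This study investigates how social data can be applied to improve book search results. Following the experimental design and analysis of results presented earlier, this chapter summarizes the findings and offers suggestions for future research.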

Section 1  Conclusions

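This study ran experiments on social data about books using the probabilistic model of information retrieval. In the first stage, topical queries were run against different indexes to compare indexes that include social data with indexes that do not. In the second stage, the relevance scores of two kinds of social data, social tags and reviews, were each used to adjust the first-stage result scores, and the results were re-sorted to produce a new ranking. The conclusions are as follows: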

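(1) In book search with the probabilistic model, using social data yields better retrieval performance than the traditional bibliographic data currently used by libraries.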

(2) Social review data yield the best results in probabilistic-model retrieval. Adding any weight adjustment lowers the resulting score instead.

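(3) In probabilistic-model retrieval, social tag data show no clear difference from traditional bibliographic data. However, after weighting by the number of times each tag was applied, retrieval performance improves by 270%, clearly higher than before weighting and second only to the social review index.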

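(4) Filtering out social tags that were applied only once improves retrieval performance by 61% relative to the unfiltered social tag index. However, this performance is still below that of the index weighted by tag count.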

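(5) Using the all-fields index, adding the social review relevance score, and then re-ranking the retrieval results gives the best retrieval results in this study, improving the nDCG score by 3.1%.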

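(6) Among all index fields, social reviews give the best retrieval performance. Using social reviews to re-rank other retrieval results improves book search performance by up to 25% in nDCG score, but the overall effect is not as good as that of social tags.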
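The tag filtering, score adjustment, and nDCG comparison described in the conclusions above can be illustrated with a minimal sketch. Everything specific in it is an assumption made for illustration and is not taken from the study: the function names, the linear combination with a weight `alpha`, the `min_count` threshold parameter, the toy scores, and the choice of a common log2-discounted nDCG formulation. The study's actual scoring formula and nDCG variant may differ.

```python
import math


def filter_singleton_tags(tag_counts, min_count=2):
    """Keep only tags applied at least min_count times (hypothetical helper)."""
    return {tag: n for tag, n in tag_counts.items() if n >= min_count}


def rerank(first_stage, social_scores, alpha=0.5):
    """Add alpha * social relevance to each first-stage score, then re-sort.

    The linear combination and alpha are assumptions, not the study's formula.
    """
    combined = {
        doc: score + alpha * social_scores.get(doc, 0.0)
        for doc, score in first_stage.items()
    }
    return sorted(combined, key=combined.get, reverse=True)


def dcg(gains):
    """Discounted cumulative gain with a log2(rank + 1) discount, ranks from 1."""
    return sum(g / math.log2(i + 2) for i, g in enumerate(gains))


def ndcg(ranking, relevance, k=10):
    """DCG of the top-k ranking divided by the DCG of the ideal ordering."""
    gains = [relevance.get(doc, 0) for doc in ranking[:k]]
    ideal = sorted(relevance.values(), reverse=True)[:k]
    ideal_dcg = dcg(ideal)
    return dcg(gains) / ideal_dcg if ideal_dcg > 0 else 0.0


# Toy data (invented): first-stage scores, review relevance, graded judgments.
first_stage = {"book_a": 2.0, "book_b": 1.8, "book_c": 1.0}
review_scores = {"book_b": 0.9, "book_c": 0.2}
relevance = {"book_a": 1, "book_b": 3, "book_c": 0}

baseline = sorted(first_stage, key=first_stage.get, reverse=True)
reranked = rerank(first_stage, review_scores)
print(ndcg(baseline, relevance), ndcg(reranked, relevance))
```

In this toy case the review score moves the most relevant book to the top, so the re-ranked nDCG is higher than the baseline. Whether re-ranking helps on real data depends on the weight and on the quality of the social scores, which is what the experiments summarized above measure.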

Section 2  Suggestions for Future Work

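In this study, a probabilistic-model search engine was used to run general searches over book data in order to test how social data affect search results, and social data were used to adjust the relevance scores of the results for re-ranking so as to obtain better search results. In the real world, however, using social data often requires considering the relationships between its authors and other users, as well as the effect of publication time on the data. Future research is planned to examine in depth how relationships among users and the publication time of social data affect search, working toward a more complete social search.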

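In addition, the scope of this study was set at 2.8 million book records, covering books from different fields, specialties, and types. The vocabulary is large and heterogeneous, which makes processing quite difficult. A suggested follow-up is to focus on a specific professional domain, process the terms of a subset of books and their social data, and build a domain-specific lexicon, which might yield excellent performance for book search in that domain.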

