Procedia Engineering 15 (2011) 3174 – 3178 1877-7058 © 2011 Published by Elsevier Ltd. doi:10.1016/j.proeng.2011.08.596
Advanced in Control Engineering and Information Science
A Cloud-Based Recommender System - A Case Study of Delicacy Recommendation
Chi-Hua Chen a,*, Ya-Ching Yang a, Pei-Liang Shih b, Fang-Yi Lee a, Chi-Chun Lo a
a Institute of Information Management, National Chiao-Tung University, Taiwan, R.O.C.
b Degree Program of Computer Science, College of Computer Science, National Chiao-Tung University, Taiwan, R.O.C.
Abstract
Delicacy recommendation services are an emerging trend. In this paper, we propose an effective decision support system (DSS), the Cloud-Based Recommender System (CBRS), which provides introductions to and commentaries on delicacies and restaurants together with relevant recommendations. CBRS uses a web content retrieval agent (WCRA) and multiple document summarization (MDS) technology to generate summaries of commentaries. Finally, CBRS combines MDS with cloud computing to provide delicacy recommendation services.
© 2011 Published by Elsevier Ltd. Selection and/or peer-review under responsibility of [CEIS 2011]
Keywords: Delicacy Recommendation; Web Content Corpus; Automatic Text Summarization; Cloud-Based Recommender System.
1. Introduction
The rise of the quality of life index, together with economic growth, has increased the demand for delicacies, and delicacy recommendation services are an emerging trend. In this paper, we propose an effective decision support system (DSS), the Cloud-Based Recommender System (CBRS), a three-tier system composed of the users, a multimedia application server (MAS), and a database server (DS). It provides introductions to and commentaries on delicacies and restaurants together with relevant recommendations. CBRS uses a web content retrieval agent (WCRA) and multiple document summarization (MDS) technology to generate summaries of commentaries [2]. Finally, CBRS combines MDS with cloud computing to provide delicacy recommendation services.
* Corresponding author. Tel.: +886-975292259 E-mail address: chihua0826@gmail.com
weight of each sentence by words and phrases. The three main features are centrality, sentence length, and position [2, 4-7].
(3) Classifier: The score of each sentence is computed from the weights of its features [3].
(4) Reranker: Because the Classifier ranks sentences only by their computed scores, the top-ranked sentences are often highly similar to one another, especially in multi-document summarization. MEAD therefore provides a Reranker mechanism that recalculates the syntactic similarity between sentences and applies a threshold to filter out highly similar sentences, which reduces the redundancy ratio. Finally, the summary is produced by extracting sentences from the original documents according to the compression ratio [4].
(5) Summarization: Summarization retrieves and recombines words and phrases from the original document according to the sentence order produced by the Reranker.
(6) Evaluation: CBRS measures the performance of the text summarization system, including the quality of the output results and users’ satisfaction.
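The Reranker step in item (4) can be illustrated with a minimal Python sketch. It is not MEAD's actual implementation: the bag-of-words cosine similarity, the 0.7 threshold, and the names `cosine_similarity` and `rerank` are assumptions chosen only for illustration.

```python
import math
from collections import Counter


def cosine_similarity(a, b):
    """Cosine similarity between two sentences using bag-of-words counts."""
    va, vb = Counter(a.lower().split()), Counter(b.lower().split())
    dot = sum(va[w] * vb[w] for w in va)
    norm = math.sqrt(sum(c * c for c in va.values())) * math.sqrt(
        sum(c * c for c in vb.values())
    )
    return dot / norm if norm else 0.0


def rerank(scored_sentences, threshold=0.7, max_sentences=3):
    """Greedily keep high-scoring sentences that are not too similar
    to any sentence already kept.

    scored_sentences: list of (score, sentence) pairs from the Classifier.
    threshold: hypothetical similarity cutoff (an assumption, not MEAD's value).
    """
    kept = []
    ranked = sorted(scored_sentences, key=lambda p: p[0], reverse=True)
    for _, sentence in ranked:
        if all(cosine_similarity(sentence, k) < threshold for k in kept):
            kept.append(sentence)
        if len(kept) == max_sentences:
            break
    return kept


# Made-up candidate sentences with made-up Classifier scores.
candidates = [
    (0.9, "the beef noodle soup is rich and tasty"),
    (0.8, "the beef noodle soup is rich and very tasty"),
    (0.5, "service was slow on weekends"),
]
summary = rerank(candidates)
```

In this toy input the second sentence nearly duplicates the first, so it is filtered out and the lower-scoring but distinct third sentence is kept instead.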
3. System Design Principles
In this paper, the design of the CBRS provides three functions: WCRA, MDS, and a cloud computing algorithm. WCRA searches web contents using the Google blog search engine and the ipeen.com search engine, finds comments about delicacies in web pages, and stores the crawled and parsed results in the web content corpus. CBRS uses MDS technology to provide introductions to and commentaries on delicacies and restaurants with relevant recommendations [2]. Finally, the cloud computing algorithm supports MDS so that delicacy comments can be summarized rapidly. Users then query relevant information through the system interface. The overall system process is shown in Figure 1.
3.1. Web Content Retrieval Agent
The WCRA provides three functions: fuzzy search, an HTML crawler, and an HTML parser. They are described as follows.
(1) Fuzzy search: Fuzzy search provides fuzzy computation and judgment. It establishes the keyword corpus and uses the terms in the corpus to search for articles via the Google blog search engine and the ipeen.com search engine.
(2) HTML crawler: The HTML crawler creates a copy of all the web pages visited through fuzzy search for later processing. In this paper, CBRS takes the results of Google blog search and ipeen.com search and follows the related page links to retrieve their HTML contents.
(3) HTML parser: The HTML parser analyzes the HTML tags obtained by the HTML crawler to extract the key information. It then removes special characters (such as single quotes and double quotes) to help avoid hacking attacks. Finally, we establish the web content corpus, from which multiple documents are summarized to provide relevant recommendations.
Fig. 1. System architecture of the CBRS
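The HTML parser function, item (3) above, can be sketched with Python's standard `html.parser` module. The paper does not say how its parser is implemented, so the class name `CommentTextParser`, the helper `parse_comment_page`, and the choice to skip `script` and `style` content are assumptions for illustration.

```python
from html.parser import HTMLParser


class CommentTextParser(HTMLParser):
    """Collects visible text from an HTML page, skipping script/style."""

    SKIPPED_TAGS = {"script", "style"}

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED_TAGS:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED_TAGS and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep only non-empty text found outside skipped elements.
        if self._skip_depth == 0 and data.strip():
            self.chunks.append(data.strip())


def parse_comment_page(html):
    """Return the page text with single and double quotes removed."""
    parser = CommentTextParser()
    parser.feed(html)
    parser.close()
    text = " ".join(parser.chunks)
    return text.replace("'", "").replace('"', "")


# Made-up page content standing in for a crawled comment page.
page = (
    "<html><body><p>The hot pot is 'great'</p>"
    "<script>alert(\"x\")</script></body></html>"
)
clean_text = parse_comment_page(page)
```

The cleaned text is what would be stored in the web content corpus for later summarization.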
3.2. Multiple Document Summarization
CBRS uses MDS technology to summarize the various delicacy comments automatically in real time and to reduce the amount of information effectively, so that users can quickly browse other consumers’ points of view and past experiences.
We follow the procedures of MEAD to design the MDS modules and combine them with the cloud computing algorithm in CBRS. The relevant good comments in the web content corpus are fed into the MDS modules, which are (1) Preprocess, (2) Feature Selection, (3) Classifier, (4) Reranker, and (5) Summary, to generate the text summary automatically.
(1) Preprocess
In the first step, Preprocess converts the format of the original HTML documents from a web content entry. We then assign document IDs and sentence IDs sequentially so that the weight of the sentences in each document can be calculated for the summary.
(2) Feature Selection
After that, CBRS uses two features, (i) thematic terms and (ii) comment terms, to calculate the weight of each sentence.
1). Thematic terms: There are $n$ terms in the sentence $s$. If the $i$-th word $w_i$ is a thematic term, the score $a_i$ is set to 1. Otherwise the score $a_i$ is set to 0.
\[
Score_{Thematic}(s) = 1 + \sum_{i=1}^{n} a_i, \quad \text{where } a_i = \begin{cases} 1, & \text{word } w_i \text{ in } s \text{ and } w_i \in \text{thematic terms} \\ 0, & \text{otherwise} \end{cases} \tag{1}
\]
2). Comment terms: There are $n$ terms in the sentence $s$. If the $i$-th word $w_i$ is a comment term, the score $b_i$ is set to 1. Otherwise the score $b_i$ is set to 0.
\[
Score_{Comment}(s) = 1 + \sum_{i=1}^{n} b_i, \quad \text{where } b_i = \begin{cases} 1, & \text{word } w_i \text{ in } s \text{ and } w_i \in \text{comment terms} \\ 0, & \text{otherwise} \end{cases} \tag{2}
\]
commentaries for users.
3.3. Cloud Computing Algorithm
There are a large number of web contents distributed across different web sites. We analyze these web contents to infer and summarize the delicacy comments. To improve performance, we use the Hadoop platform to build cloud computing and parallel processing environments [1]. We can then implement a MapReduce program to analyze and compute the scores of the sentences in each comment document for the user’s delicacy decision support [1].
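The sentence scores referred to here, Eqs. (1) and (2) of Section 3.2 applied after Preprocess has assigned document and sentence IDs, can be sketched in Python as follows. The term sets, the whitespace tokenizer, the period-based sentence splitter, and the function names are assumptions for illustration; the paper does not specify them.

```python
def preprocess(documents):
    """Assign sequential document IDs and sentence IDs.

    Splitting sentences on '.' is a simplifying assumption.
    Returns a list of (doc_id, sentence_id, sentence) tuples.
    """
    records = []
    for doc_id, text in enumerate(documents, start=1):
        sentences = [s.strip() for s in text.split(".") if s.strip()]
        for sent_id, sentence in enumerate(sentences, start=1):
            records.append((doc_id, sent_id, sentence))
    return records


def term_score(sentence, terms):
    """1 + the number of words in the sentence that belong to `terms`.

    This is the form shared by Eq. (1) and Eq. (2): each matching
    word position contributes 1 to the sum.
    """
    words = sentence.lower().split()
    return 1 + sum(1 for w in words if w in terms)


# Hypothetical term sets, not taken from the paper.
THEMATIC_TERMS = {"hotpot", "beef", "broth"}
COMMENT_TERMS = {"delicious", "fresh", "salty"}

# Made-up comment documents.
docs = [
    "The beef broth is delicious. Parking is hard.",
    "Fresh beef and fresh vegetables.",
]

# One row per sentence: (doc_id, sentence_id, thematic score, comment score).
scores = [
    (doc_id, sent_id,
     term_score(sentence, THEMATIC_TERMS),
     term_score(sentence, COMMENT_TERMS))
    for doc_id, sent_id, sentence in preprocess(docs)
]
```

Because of the leading 1 in both equations, a sentence with no matching terms still scores 1 rather than 0 on each feature.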
Each Mapper uses Eqs. (1), (2), and (3) to compute the score $Score(s_j)$ of sentence $s_j = \{w_{1j}, w_{2j}, \ldots, w_{n_j j}\}$ in document $d_i$, as shown in Figure 2. There are $y$ sentences in total in $x$ documents. After computing the score of each sentence, we sort the sentences by score and select the top $n$ sentences to generate the delicacy comment summarization.
Fig. 2. Cloud Computing Algorithm
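The map-then-select flow of Figure 2 can be imitated on a single machine with a short Python sketch. This is not Hadoop code: `mapper`, `reducer`, and the term sets are hypothetical, a thread pool stands in for the cluster, and the product used as the combined score is an assumption, because Eq. (3) does not appear in the text above.

```python
import heapq
from concurrent.futures import ThreadPoolExecutor

THEMATIC_TERMS = {"beef", "broth"}      # hypothetical term set
COMMENT_TERMS = {"delicious", "fresh"}  # hypothetical term set


def term_score(words, terms):
    """Form shared by Eqs. (1) and (2): 1 + count of matching words."""
    return 1 + sum(1 for w in words if w in terms)


def mapper(document):
    """Map step: emit (score, sentence) for every sentence of one document."""
    pairs = []
    for sentence in (s.strip() for s in document.split(".")):
        if not sentence:
            continue
        words = sentence.lower().split()
        # Assumed combination of Eqs. (1) and (2); the paper's Eq. (3) may differ.
        score = term_score(words, THEMATIC_TERMS) * term_score(words, COMMENT_TERMS)
        pairs.append((score, sentence))
    return pairs


def reducer(mapped, top_n):
    """Reduce step: merge all mapper outputs and keep the top_n sentences."""
    all_pairs = [pair for pairs in mapped for pair in pairs]
    best = heapq.nlargest(top_n, all_pairs, key=lambda p: p[0])
    return [sentence for _, sentence in best]


# Made-up comment documents; each one is handled by its own mapper call.
documents = [
    "The beef broth is delicious. Parking is hard.",
    "Fresh beef every day. The place is small.",
]
with ThreadPoolExecutor() as pool:
    mapped = list(pool.map(mapper, documents))
summary = reducer(mapped, top_n=2)
```

Each document is scored independently, so the map step parallelizes naturally; only the final sort and top-n selection needs to see all sentences together.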
4. System Usage
Mobile users can use various terminal devices, including personal computers (PCs), notebooks, tablet PCs, personal digital assistants (PDAs), and smart phones, to access CBRS and obtain the delicacy comment summarization and delicacy recommendation services. The user can acquire (i) introductions to delicacies and restaurants and (ii) the relevant comments on each delicacy. For example, a mobile user enters and submits the requirement “千葉火鍋” (a hot pot restaurant name) into CBRS, as shown in Figure 3. CBRS then retrieves the relevant comments from web contents via WCRA and provides the delicacy comment summaries via MDS based on cloud computing to help mobile users make their delicacy decisions, as shown in Figure 4.
Fig. 3. Query Message
Fig. 4. Delicacy Summary
5. Conclusions
This paper focuses on providing restaurant information and comments for delicacy recommendation services and describes the CBRS, an integrated service platform that provides relevant delicacy recommendations. CBRS provides the WCRA and MDS technology to generate summaries of commentaries. Finally, CBRS combines MDS with cloud computing to provide delicacy recommendation services.
References
[1] B.Y. Lin, C.H. Chen, H.C. Chang, C.C. Lo, “A Network Behavior Analysis System for Cloud Computing Service”, Information-An International Interdisciplinary Journal, Vol. 14, No. 3, pp. 931-938, 2011.
[2] C.C. Lo, C.H. Chen, D.Y. Cheng, H.Y. Kung, “Ubiquitous Healthcare Service System with Context-awareness Capability: Design and Implementation”, Expert Systems with Applications, Vol. 38, No. 4, pp. 4416-4436, 2011.
[3] K. Kaikhah, “Text Summarization Using Neural Networks”, WSEAS Transactions on Systems, Vol. 3, No. 2, pp. 960-963, 2004.
[4] D.R. Radev, T. Allison, S. Blair-Goldensohn, J. Blitzer, A. Çelebi, S. Dimitrov, E. Drabek, A. Hakim, W. Lam, D. Liu, J. Otterbacher, H. Qi, H. Saggion, S. Teufel, A. Winkel, Z. Zhang, “MEAD - a platform for multidocument multilingual text summarization”, In Proceedings of the 4th International Conference on Language Resources and Evaluation (LREC 2004), Lisbon, Portugal, 2004.
[5] D.R. Radev, H. Jing, M. Budzikowska, “Centroid-based summarization of multiple documents: sentence extraction, utility-based evaluation, and user studies”, In Proceedings of the ANLP/NAACL 2000 Workshop on Automatic Summarization, 2000.
[6] D.R. Radev, A. Winkel, M. Topper, “Multi Document Centroid-based Text Summarization”, In ACL 2002, Philadelphia, PA, 2002.
[7] J.Y. Yeh, H.R. Ke, W.P. Yang, I.H. Meng, “Text Summarization Using a Trainable Summarizer and Latent Semantic Analysis”, Information Processing and Management, Vol. 41, No. 1, pp. 75-95, 2005.