A Cloud-Based Recommender System - A Case Study of Delicacy Recommendation


Procedia Engineering 00 (2011) 000–000 www.elsevier.com/locate/procedia

Advanced in Control Engineering and Information Science


Chi-Hua Chen^{a,*}, Ya-Ching Yang^{a}, Pei-Liang Shih^{b}, Fang-Yi Lee^{a}, Chi-Chun Lo^{a}

^{a} Institute of Information Management, National Chiao-Tung University, Taiwan, R.O.C.

^{b} Degree Program of Computer Science, College of Computer Science, National Chiao-Tung University, Taiwan, R.O.C.

Abstract

Delicacy recommendation services are the trend of the future. In this paper, we propose an effective decision support system (DSS), the Cloud-Based Recommender System (CBRS), which provides introductions to and commentaries on delicacies and restaurants together with relevant recommendations. CBRS uses a web content retrieval agent (WCRA) and multiple document summarization (MDS) technology to generate summaries of commentaries. Finally, CBRS combines MDS with cloud computing to provide delicacy recommendation services.

Keywords: Delicacy Recommendation; Web Content Corpus; Automatic Text Summarization; Cloud-Based Recommender System.

1. Introduction

A rising quality of life index, together with economic growth, has increased the demand for delicacies. Delicacy recommendation services are the trend of the future. In this paper, we propose an effective decision support system (DSS), the Cloud-Based Recommender System (CBRS), a three-tier system composed of the users, a multimedia application server (MAS), and a database server (DS), which provides introductions to and commentaries on delicacies and restaurants together with relevant recommendations. CBRS uses a web content retrieval agent (WCRA) and multiple document summarization (MDS) technology to generate summaries of commentaries [2]. Finally, CBRS combines MDS with cloud computing to provide delicacy recommendation services.

* Corresponding author. Tel.: +886-975292259 E-mail address: chihua0826@gmail.com


weight of each sentence by words and phrases. The main three features are centrality, sentence length, and position [2, 4-7].

(3) Classifier: The score of every sentence is computed from the weights of its features [3].

(4) Reranker: Because the Classifier ranks sentences only by their computed scores, the top-ranked sentences are often highly similar to one another, especially in multi-document summarization. MEAD therefore provides a Reranker mechanism that re-evaluates the sentences by their syntactic similarity and applies a threshold to filter out highly similar sentences, which reduces the redundancy ratio. Finally, the summary is produced by extracting sentences from the original documents according to the compression ratio [4].

(5) Summarization: Summarization retrieves and recombines the words and phrases of the original documents, following the sentence order produced by the Reranker.

(6) Evaluation: CBRS is used to measure the performance of the text summarization system, including the effectiveness of the output results as well as users' satisfaction.
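The redundancy filtering described for the Reranker in step (4) can be sketched as a greedy pass over the score-sorted sentences. This is only an illustration under stated assumptions: Jaccard word overlap stands in for the similarity measure (the text does not specify how MEAD computes it), and the function names, the 0.7 threshold, and the `max_sentences` parameter are hypothetical.

```python
def jaccard(words_a, words_b):
    """Word-overlap similarity in [0, 1]; an assumed stand-in for MEAD's measure."""
    set_a, set_b = set(words_a), set(words_b)
    if not set_a and not set_b:
        return 1.0
    return len(set_a & set_b) / len(set_a | set_b)


def rerank(scored_sentences, threshold=0.7, max_sentences=None):
    """Keep high-scoring sentences, dropping any that is too similar to one already kept.

    scored_sentences: list of (score, words) pairs, where words is a list of tokens.
    threshold: similarity at or above which a sentence counts as redundant (assumed value).
    max_sentences: optional cap on the summary length (a stand-in for the compression ratio).
    """
    kept = []
    # Visit sentences from highest to lowest score.
    for score, words in sorted(scored_sentences, key=lambda pair: -pair[0]):
        if all(jaccard(words, kept_words) < threshold for _, kept_words in kept):
            kept.append((score, words))
            if max_sentences is not None and len(kept) == max_sentences:
                break
    return kept
```

With this sketch, a sentence that shares most of its words with a higher-scoring sentence is skipped, so the summary spends its limited length on distinct comments.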

3. System Design Principles

In this paper, CBRS is designed around three functions: the WCRA, MDS, and a cloud computing algorithm. The WCRA searches web contents using the Google blog search engine and the ipeen.com search engine, finds comments about delicacies in the web pages, and stores the crawled and parsed results in the web content corpus. CBRS uses MDS technology to provide introductions to and commentaries on delicacies and restaurants together with relevant recommendations [2]. Finally, the cloud computing algorithm supports MDS so that delicacy comments can be summarized rapidly. Users then query the relevant information through the system interface. The overall system process is shown in Figure 1.

3.1. Web Content Retrieval Agent

The WCRA provides three functions: fuzzy search, an HTML crawler, and an HTML parser. These functions are described as follows.

(1) Fuzzy search: Fuzzy search provides fuzzy computation and judgment. It establishes a keyword corpus and uses the terms in the corpus to search for articles via the Google blog search engine and the ipeen.com search engine.

(2) HTML crawler: The HTML crawler creates a copy of all the visited web pages for later processing by the fuzzy search. In this paper, CBRS takes the results of Google blog search and ipeen.com search across various web contents and tracks the related page links to retrieve their HTML contents.


Fig. 1. System architecture of the CBRS

(3) HTML parser: The HTML parser analyzes the HTML tags in the pages fetched by the HTML crawler to extract the key information. It then removes special characters (such as single quotes and double quotes) to avoid hacking attacks. Finally, the web content corpus is established so that summaries can be generated from multiple documents to provide relevant recommendations.
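The HTML parser step can be illustrated with Python's standard `html.parser` module. This is a minimal sketch, not the parser used in CBRS: the class and function names are hypothetical, and skipping `script` and `style` content is an added assumption about what counts as non-comment text.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collects visible text from a page, ignoring script and style content."""

    SKIPPED_TAGS = {"script", "style"}  # assumed to carry no comment text

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED_TAGS:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED_TAGS and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._skip_depth == 0 and data.strip():
            self.chunks.append(data.strip())


def parse_comment_html(html):
    """Extract text from crawled HTML and strip single and double quotes."""
    extractor = TextExtractor()
    extractor.feed(html)
    extractor.close()
    text = " ".join(extractor.chunks)
    # Remove the special characters named in the text before storing in the corpus.
    return text.replace("'", "").replace('"', "")
```

For example, passing a crawled page to `parse_comment_html` returns one plain-text string that could then be stored as an entry in the web content corpus.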

3.2. Multiple Document Summarization

CBRS uses MDS technology to summarize the various delicacy comments automatically in real time and to reduce the amount of information effectively, so that users can quickly browse consumers' points of view and past experiences.

We follow the procedures of MEAD to design the MDS modules and combine them with the cloud computing algorithm in CBRS. The relevant good comments in the web content corpus are input into the MDS modules, which are (1) Preprocess, (2) Feature Selection, (3) Classifier, (4) Reranker, and (5) Summary, to generate the text summary automatically.

(1) Preprocess

In the first step, the preprocess module converts the format of the original HTML documents from a web content entry. We then assign document IDs and sentence IDs sequentially so that the weight of the sentences in each document can be calculated for the summary.

(2) Feature Selection

After that, CBRS uses two features, (i) thematic terms and (ii) comment terms, to calculate the weight of each sentence.

1). Thematic terms: There are n terms in the sentence s. If the i-th word $w_i$ is a thematic term, the score $a_i$ is set to 1; otherwise, $a_i$ is set to 0.

$$Score_{Thematic}(s) = 1 + \sum_{i=1}^{n} a_i, \quad \text{where } a_i = \begin{cases} 1, & \text{word } w_i \text{ in } s \text{ and } w_i \in \text{thematic terms} \\ 0, & \text{otherwise} \end{cases} \tag{1}$$

2). Comment terms: There are n terms in the sentence s. If the i-th word $w_i$ is a comment term, the score $b_i$ is set to 1; otherwise, $b_i$ is set to 0.

$$Score_{Comment}(s) = 1 + \sum_{i=1}^{n} b_i, \quad \text{where } b_i = \begin{cases} 1, & \text{word } w_i \text{ in } s \text{ and } w_i \in \text{comment terms} \\ 0, & \text{otherwise} \end{cases} \tag{2}$$
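Eq. (1) and Eq. (2) have the same form, so both can be sketched with one small function: the score is 1 plus the number of words in the sentence that belong to a given term set. The function name and the example term sets below are hypothetical, and the sentence is assumed to be already split into words.

```python
def term_score(sentence_words, term_set):
    """Return 1 + the count of words in the sentence that appear in term_set.

    With a thematic term set this follows Eq. (1); with a comment term set, Eq. (2).
    """
    return 1 + sum(1 for word in sentence_words if word in term_set)


# Hypothetical term sets, for illustration only.
thematic_terms = {"hotpot", "broth"}
comment_terms = {"delicious", "fresh"}

sentence = ["the", "hotpot", "broth", "was", "delicious"]
thematic = term_score(sentence, thematic_terms)  # two thematic words plus 1
comment = term_score(sentence, comment_terms)    # one comment word plus 1
```

Because of the leading 1, a sentence with no matching words still receives a score of 1 rather than 0.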


commentaries for users.

3.3. Cloud Computing Algorithm

There are a large number of web contents distributed across different web sites. We analyze these web contents to infer and summarize the delicacy comments. To improve performance, we use the Hadoop platform to build cloud computing and parallel processing environments [1]. We can then implement a MapReduce program to analyze and compute the scores of the sentences in each comment document for the user's delicacy decision support [1].

Each Mapper uses Eq. (1), (2), and (3) to compute the score $Score(s_j)$ of sentence $s_j = \{w_{1j}, w_{2j}, \ldots, w_{n_j j}\}$ in document $d_i$, as shown in Figure 2. There are a total of y sentences in x documents. After computing the score of each sentence, we sort the sentences by score and select the top n sentences to generate the delicacy comment summary.

Fig. 2. Cloud Computing Algorithm
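The map, sort, and select flow described above can be sketched in plain Python, with a thread pool standing in for the Hadoop Mappers. This is an illustration only, not a Hadoop job. Eq. (3) is not reproduced in this text, so combining the two feature scores by multiplication is an assumption, and the term sets, function names, and record layout are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical term sets, for illustration only.
THEMATIC_TERMS = {"hotpot", "broth"}
COMMENT_TERMS = {"delicious", "fresh"}


def term_score(words, terms):
    """Eq. (1)/(2): 1 plus the number of words that belong to the term set."""
    return 1 + sum(1 for word in words if word in terms)


def mapper(record):
    """Score one sentence; record is (doc_id, sentence_id, words)."""
    doc_id, sentence_id, words = record
    # ASSUMPTION: Eq. (3) is not shown in the text; a product is used here as a placeholder.
    score = term_score(words, THEMATIC_TERMS) * term_score(words, COMMENT_TERMS)
    return (score, doc_id, sentence_id)


def reducer(scored, top_n):
    """Sort by descending score (ties broken by document and sentence ID) and keep top_n."""
    return sorted(scored, key=lambda item: (-item[0], item[1], item[2]))[:top_n]


def summarize(records, top_n, workers=4):
    """Run the mappers in parallel, then select the top_n sentences."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        scored = list(pool.map(mapper, records))
    return reducer(scored, top_n)
```

Each record carries the document ID and sentence ID assigned during preprocessing, so the selected sentences can be traced back to their source comments.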

4. System Usage

Mobile users can use various terminal devices, including personal computers (PCs), notebooks, tablet PCs, personal digital assistants (PDAs), and smart phones, to access CBRS and obtain the delicacy comment summarization and delicacy recommendation services. The user can acquire (i) introductions to the delicacies and restaurants and (ii) the relevant comments on each delicacy. For example, a mobile user inputs and submits the query “千葉火鍋” to CBRS, as shown in Figure 3. CBRS then retrieves the relevant comments from web contents via the WCRA and provides the delicacy comment summaries via MDS based on cloud computing to help mobile users make their delicacy decisions, as shown in Figure 4.

Fig. 3. Query Message

Fig. 4. Delicacy Summary

5. Conclusions

This paper focuses on providing restaurant information and comments for delicacy recommendation services and describes the CBRS, an integrated service platform, which provides relevant recommendations for finding delicacy recommendation services. CBRS provides the WCRA and MDS technology to generate summaries of commentaries. Finally, CBRS combines MDS with cloud computing to provide delicacy recommendation services.

References

[1] B.Y. Lin, C.H. Chen, H.C. Chang, C.C. Lo, “A Network Behavior Analysis System for Cloud Computing Service”, Information-An International Interdisciplinary Journal, Vol. 14, No. 3, pp. 931-938, 2011.

[2] C.C. Lo, C.H. Chen, D.Y. Cheng, H.Y. Kung, “Ubiquitous Healthcare Service System with Context-awareness Capability: Design and Implementation”, Expert Systems with Applications, Vol. 38, No. 4, pp. 4416-4436, 2011.

[3] K. Kaikhah, “Text Summarization Using Neural Networks”, WSEAS Transactions on Systems, Vol. 3, No. 2, pp. 960-963, 2004.

[4] D.R. Radev, T. Allison, S. Blair-goldensohn, J. Blitzer, A. Çelebi, S. Dimitrov, E. Drabek, A. Hakim, W. Lam, D. Liu, J. Otterbacher, H. Qi, H. Saggion, S. Teufel, A. Winkel, Z. Zhang, “MEAD - a platform for multidocument multilingual text summarization”, In Proceedings of the 4th International Conference on Language Resources and Evaluation (LREC 2004), Lisbon, Portugal, 2004.

[5] D.R. Radev, H. Jing, M. Budzikowska, “Centroid-based summarization of multiple documents: sentence extraction, utility-based evaluation, and user studies”, In Proceedings of the ANLP/NAACL 2000 Workshop on Automatic Summarization, 2000.

[6] D.R. Radev, A. Winkel, M. Topper, “Multi Document Centroid-based Text Summarization”, In ACL 2002, Philadelphia, PA, 2002.

[7] J.Y. Yeh, H.R. Ke, W.P. Yang, I.H. Meng, “Text Summarization Using a Trainable Summarizer and Latent Semantic Analysis”, Information Processing and Management, Vol. 41, No. 1, pp. 75-95, 2005.