Exploring latent browsing graph for question answering recommendation


DOI 10.1007/s11280-011-0146-0


Meng-Fen Chiang · Wen-Chih Peng · Philip S. Yu

Received: 15 April 2011 / Revised: 24 September 2011 / Accepted: 10 October 2011 / Published online: 25 October 2011 © Springer Science+Business Media, LLC 2011

Abstract In this paper, we develop a framework of Question Answering Pages (referred to as QA pages) recommendation. Our proposed framework consists of two modules: the off-line module to determine the importance of QA pages and the on-line module for on-line QA page recommendation. In the off-line module, we claim that the importance of QA pages could be discovered from user click streams. If the QA pages are of higher importance, many users will click and spend their time on these QA pages. Moreover, the relevant relationships among QA pages are captured by the browsing behavior on these QA pages. As such, we exploit user click streams to model the browsing behavior among QA pages as QA browsing graph structures. The importance of QA pages is derived from our proposed QA browsing graph structures. However, we observe that the QA browsing graph is sparse and that most of the QA pages do not link to other QA pages. This is referred to as a sparsity problem. To overcome this problem, we utilize the latent browsing relations among QA pages to build a QA Latent Browsing Graph. In light of the QA latent browsing graph, the importance score of QA pages (referred to as Latent Browsing Rank) and the relevance score of QA pages (referred to as Latent Browsing Recommendation Rank) are proposed. These scores demonstrate the use of a QA latent browsing graph not only to determine the importance of QA pages but also to recommend QA pages. We conducted extensive empirical experiments on Yahoo! Asia Knowledge Plus to evaluate our proposed framework.

M.-F. Chiang · W.-C. Peng (✉)
National Chiao Tung University, Hsinchu, Taiwan
e-mail: [email protected]

M.-F. Chiang
e-mail: [email protected]

P. S. Yu
University of Illinois at Chicago, Chicago, IL 60607, USA
e-mail: [email protected]


Keywords browsing graph · question answering · recommendation

1 Introduction

Question and Answering (QA) forums, such as Baidu and Yahoo! Answers, are effective tools in searching for information and knowledge on the Internet. In general, a QA forum features a portal in which users form several communities and contribute their questions and answers. The contributed information and knowledge is a rich repository to fulfill the information needs of users by providing an on-line QA search engine or QA recommendation system.

Most QA forums, such as Yahoo! Asia Knowledge Plus, provide QA search or QA recommendation services based on a textual model that exploits textual similarity. By issuing keywords, users get a ranked list of QA pages that contain words similar to the keywords issued. From a ranked list of QA search results, users can either sort through the QA pages of interest or issue other keywords if the current search results do not satisfy their requirements. Without loss of generality, most QA portals provide a ranked list of relevant QA pages when a QA page (referred to as a query QA page Q) is browsed. For example, for each browsed QA page in Yahoo! Asia Knowledge Plus, a ranked list of relevant QA pages is also displayed in the current QA page to serve users who want to see more relevant QA pages after browsing the current one. The existing solutions for relevant QA recommendations in Yahoo! Asia Knowledge Plus use a textual model to retrieve those QA pages that contain the same keywords as the query QA page Q. However, one disadvantage of textual models is that, even though QA pages contain the same keywords as the query QA page, their topics may be irrelevant to the topic of the query QA page. Moreover, textual models are language-dependent; a textual model that is designed for a particular language may fail for another language. To overcome these disadvantages, we explore the actual browsing behavior of users to determine how important and relevant QA pages are.

Another characteristic of QA forums is the variation in quality. When users contribute information to QA forums, the quality of QA pages may vary. The accurate evaluation of the quality of QA pages is essential under these circumstances. A number of QA forums enable users to rate QA pages to obtain a high quality of user-generated content. The rating information provided by users is then used to filter out QA pages of a lower quality. Several studies have elaborated on designing a quality-aware search on QA forums [1,3,4,7,20] with user-generated content (for example, rating information). For example, the authors in [3,4] proposed a semi-supervised approach for retrieving relevant and high quality content in QA forums. The authors in [3,4] modeled user reputations and used less manual supervision to retrieve relevant and high quality answers by integrating content quality and user reputation information into their ranking process. The authors in [20] introduced a quality-aware QA framework that considered both answer relevance and quality in selecting the answers to be returned. Several methods have been developed for estimating content quality by considering the expertise of a user in both asking and answering questions. However, most prior studies on estimating answer quality are limited in analyzing language-dependent user-generated content. Moreover,


user-generated content, such as rating information, may be manipulated and biased and thus cannot accurately represent the users' reactions to the content of QA pages. As indicated in [5], over 30% of the best answer selections in Yahoo! Answers are affected by the users who provide the answers.

We claim that determining the importance of QA pages and the relevance of QA pages for QA recommendation by investigating and modeling the user browsing logs is a more reliable method because user-generated content is unreliable and textual models are language-dependent. Without loss of generality, the user browsing logs record when and which Web pages are clicked on, and by whom; that is, they record the time-stamp, user identity, and URL information. Because user browsing logs contain not only QA pages but also a number of other Web pages, Web pages that are not QA pages are referred to as non-QA pages. Several types of relations among QA pages can be identified from the user browsing logs. In this paper, the user browsing logs are modeled as a QA browsing graph in which each node represents one QA page and the edges represent the browsing behaviors of users. Then, we adopt state-of-the-art ranking approaches to compute both the importance and relevance scores for QA pages. For example, we adopt the BrowseRank algorithm on our QA browsing graph to determine the importance of QA pages. However, a naive QA browsing graph that only considers explicit relations among Web pages is too sparse to link the QA pages. According to our observation from Yahoo! Answers, approximately 54% of QA pages are isolated in a QA browsing graph. As illustrated in Figure 1, the link distribution reveals the notable sparsity problem that is faced by naive modeling of the user browsing logs. Consequently, the naive adoption of BrowseRank fails to determine the importance scores for isolated QA pages.

To address the sparsity problem, we explore the latent browsing relations among QA pages to build a QA Latent Browsing Graph. In particular, we adopt a time-dependent Markov property to determine whether two QA pages are context-dependent or not. A QA page is context-dependent on, or latently related to, a previously visited QA page if their time difference is not larger than a given time constraint. The time constraint is used to explore the latent browsing behavior of QA pages. However, the setting of this time constraint is an important issue. To derive a time constraint, we propose a behavioral coherence evaluation approach to evaluate the quality of a time constraint for each QA page. Once a QA latent browsing graph is built for QA pages from user behavior logs, we determine the staying time distributions and compute the latent browsing rank of the given QA pages. By using the QA latent browsing graph, we propose a Latent Browsing Rank (abbreviated as LBR) of QA pages not only to determine the importance of QA pages but also to recommend QA pages with a higher Latent Browsing Recommendation Rank (abbreviated as LBRR). We develop a framework that consists of the off-line module and the on-line module. The QA latent browsing graph is first built and the LBR values of QA pages are determined in the off-line module. A recommendation list of QA pages that are ranked by LBRR is generated in the on-line module. We conducted extensive experiments to demonstrate the effectiveness of latent browsing relations on a real data set, that is, Yahoo! Asia Knowledge Plus. The experimental results indicate that our framework is able to determine high quality QA pages and recommend relevant QA pages, which demonstrates the advantage of exploring the QA latent browsing graph.

Figure 1 Degree distribution of the browsing graph from Yahoo! Asia Knowledge Plus (July 15, 2009–July 17, 2009). Axes: degree, percentage (%).

In summary, the main contributions of this study are as follows:

• We propose a QA latent browsing graph to capture the user browsing behavior in QA forums.

• We utilize the QA latent browsing graph and staying time information to determine the importance score of QA pages (i.e., LBR).

• We exploit Random Walk with Restart in the QA latent browsing graph to derive the relevance score of QA pages (i.e., LBRR).

• We conducted experiments on a large-scale real data set (i.e., Yahoo! Asia Knowledge Plus) to evaluate our proposed QA latent browsing graph and algorithms.

The remainder of this paper is organized as follows. Section 2 provides an overview of the related work. The background information of this study is described in Section 3. Section 4 introduces our observations and presents the QA latent browsing graph in modeling the user-perceived relevances. In Section 5, we analyze latent browsing relations. Section 6 describes two algorithms to determine LBR and LBRR of QA pages. Section 7 presents the performance studies. Section 8 concludes this paper.

2 Related work

We briefly review previous studies on link-based ranking algorithms in this section. Then, we describe the state-of-the-art research on Question Answering Systems.

2.1 Link-based ranking algorithm

Numerous studies have focused on exploring the linkage relationships among data items for ranking [8,13]. A typical example is to model Web data as a link graph, where the nodes represent Web pages and the edges represent the hyper-links. Then, a link-based ranking algorithm (e.g., PageRank) is used to determine the importance of pages. In principle, PageRank considers Web pages as more important if they are pointed to by more links from more important pages. Specifically, PageRank simulates the random walk of a “Web surfer” on the graph, and the importance score is defined as the stationary probability of the discrete-time Markov process.

Several algorithms were developed to improve the performance of PageRank [9,11]. For example, [9] discussed several possible alternatives to enhance the basic model of PageRank, such as storage issues, convergence properties, and updating problems. In contrast to the static network in traditional PageRank, the authors in [11] proposed BrowseRank that explores the dynamic hyper-link transitions to model user behavior data. The basic idea in BrowseRank is to formulate a browsing graph based on the browsed hyper-links, in which each vertex represents a Web page and the edges represent the browsed hyper-link transitions between Web pages. Furthermore, a staying time distribution is determined for each Web page from the observed staying time information. The importance score was affected not only by the underlying linkage structure but also by the staying time distribution of all Web pages. In principle, similar to PageRank, BrowseRank considers Web pages as more important if they are linked by more links from more important pages. The higher the ratios of time that “Web surfers” spend on a particular page to the time they spend on all of the pages, the more likely it is that the page is important.
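The interplay between link structure and staying time described above can be illustrated with a minimal sketch. It is not the BrowseRank algorithm of [11]: it assumes a damped discrete-time random walk over transition frequencies and then weights each page's stationary probability by its mean observed staying time. The function name `importance_scores`, the damping factor, the uniform handling of pages without out-links, and the default staying time of 1.0 are all assumptions made for illustration.

```python
from statistics import mean


def importance_scores(out_links, staying_times, damping=0.85, iters=200):
    """Illustrative staying-time-weighted importance (not the published BrowseRank).

    out_links:     {page: {target_page: transition_frequency}}
    staying_times: {page: [observed staying times]}
    """
    pages = sorted(
        set(out_links)
        | {t for targets in out_links.values() for t in targets}
        | set(staying_times)
    )
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iters):
        # Assumed reset step: with probability (1 - damping) jump to a uniform page.
        nxt = {p: (1.0 - damping) / n for p in pages}
        for p in pages:
            targets = out_links.get(p, {})
            total = sum(targets.values())
            if total == 0:
                # Assumption: a page without out-links spreads its mass uniformly.
                for q in pages:
                    nxt[q] += damping * rank[p] / n
            else:
                for q, freq in targets.items():
                    nxt[q] += damping * rank[p] * freq / total
        rank = nxt
    # Weight the visit probability by the mean staying time, then normalize.
    weighted = {p: rank[p] * mean(staying_times.get(p) or [1.0]) for p in pages}
    z = sum(weighted.values())
    return {p: w / z for p, w in weighted.items()}


if __name__ == "__main__":
    links = {"a": {"b": 1}, "b": {"a": 1}}
    stay = {"a": [1.0], "b": [3.0]}
    print(importance_scores(links, stay))
```

In this toy input both pages are visited equally often, so the page on which users stay three times longer receives three times the score.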

Another variation of PageRank is the computation of the reachability for a query node. For example, Random Walk with Restart analyzes the reachability of a particular query node to the remaining destination nodes. The basic idea of Random Walk with Restart is as follows: the importance information is propagated in two ways: (1) jump back to the query node with probability c and (2) propagate to the adjacent neighbors with probability (1 − c). Consequently, given a query node, the more paths that connect a destination node to the query node within a few hops, the more likely it is that the destination node is relevant. Random Walk with Restart was proven to be successful in several applications, such as in content-based image retrieval [6], cross modal correlation discovery [15,19], and a movie recommender system [12].
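The two propagation rules can be sketched as a simple iteration. This is a generic Random Walk with Restart under stated assumptions, not the exact procedure used later in the paper: the function name `random_walk_with_restart`, the default c = 0.15, the fixed iteration count, and the rule that a node without out-links returns all of its mass to the query node are illustrative choices.

```python
def random_walk_with_restart(adj, query, c=0.15, iters=100):
    """Relevance of every node to `query`.

    adj: {node: {neighbor: edge_weight}} describing a directed, weighted graph.
    c:   probability of jumping back to the query node at each step.
    """
    nodes = sorted(set(adj) | {v for nbrs in adj.values() for v in nbrs} | {query})
    score = {v: 1.0 if v == query else 0.0 for v in nodes}
    for _ in range(iters):
        nxt = {v: 0.0 for v in nodes}
        for u in nodes:
            mass = score[u]
            nbrs = adj.get(u, {})
            total = sum(nbrs.values())
            # (1) jump back to the query node with probability c
            nxt[query] += c * mass
            if total == 0:
                # Assumption: a node with no out-links also restarts at the query.
                nxt[query] += (1.0 - c) * mass
            else:
                # (2) propagate to adjacent neighbors with probability (1 - c)
                for v, w in nbrs.items():
                    nxt[v] += (1.0 - c) * mass * w / total
        score = nxt
    return score


if __name__ == "__main__":
    graph = {"Q": {"A": 1, "B": 1}, "A": {"Q": 1}, "B": {"Q": 1}, "C": {"Q": 1}}
    for node, value in random_walk_with_restart(graph, "Q").items():
        print(node, round(value, 4))
```

Node C links to Q but cannot be reached from it, so its relevance stays at zero, while A and B receive equal scores.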

2.2 A quality-aware question answering system

A considerable amount of research effort has been dedicated to user preference mining [10] and Web content filtering [2,14,16]. Question and Answering (QA) forums, such as Baidu and Yahoo! Answers, are essential among various Web contents in searching for information and knowledge on the Internet. Although question-answering (QA) systems are a valuable repository for user-generated content, the distribution of content quality exhibits a high variance. Several algorithms for content quality estimation in QA systems were developed to enhance further applications [1,3,4,7,20].

In [7], a stochastic model was built from manually labeled data to predict the quality of a question and answer pair (QA) by determining the correlations among non-textual answer features and answer quality. The set of non-textual answer features includes answer length, the number of answers of the respondent, the current questions and best answers, and answer rating. In [1], multiple features, such as textual relevant features, user interaction features and content usage statistics, were used to estimate the quality of the QA content. The authors in [3,4] recently modeled the user reputations and used less manual supervision to retrieve relevant and high quality answers by integrating content quality and user reputation information into the ranking process. The authors in [20] introduced a quality-aware QA framework that considered both the answer relevance and the quality in selecting the answers to be returned. Several methods were developed to estimate the content quality by considering the expertise of a user in both asking and answering questions. However, most of the prior studies on estimating answer quality are confined to user-generated content. To the best of our knowledge, there is no prior work on computing the importance scores of QA pages or determining relevance scores between QA pages from user behavior data. The large amount of daily user behavior data contains valuable information, which motivates the development of the model and the algorithms to compute the importance scores for QA pages and to determine the relevance scores among QA pages.

Figure 2 An example of non-QA page and QA page.

3 Preliminaries

In the domain of Question and Answering (QA) forums, two types of Web pages are considered: QA pages and non-QA pages. Figure 2 shows an example of a non-QA page (left) and an example of a QA page (right) from Yahoo! Answers.¹ In Figure 2, the non-QA page is a search result page that contains a list of QA pages after the keyword “Golden Gate Bridge” is issued. The right side of Figure 2 is an example of a QA page reached when a user clicks on a search result on the left side of Figure 2. As illustrated in Figure 2, a QA page contains three types of content: (1) question content: the content of a posted question, (2) answer content: the content of a set of answers, and (3) recommendation list: a ranked list of hyper-links that correspond to the QA pages returned by the current QA recommendation service.


Table 1 A snippet of a user’s click stream.

Date                    April 23                          April 24
Visiting time           0   10  20  25  30  35  40  50    0   10  20  25  35  100 110 120
Visiting page (non-QA)  P0  -   -   P1  -   P2  -   -     P3  -   P4  -   -   P2  -   -
Visiting page (QA)      -   A   B   -   C   -   D   E     -   F   -   G   H   -   A   D

The user click behaviors are logged in the QA forums (e.g., Yahoo! Answers). An example of user click streams is shown in Table 1.² As illustrated in Table 1, five distinct non-QA pages (P0 to P4) and eight QA pages (A to H) were visited by a user. The click times for these Web pages were recorded. For example, at time stamp 10, QA page A was clicked on from non-QA page P0. The non-QA page P0 provided a hyper-link for QA page A, and this user clicked the hyper-link to visit QA page A. Thus, click logs in QA forums record the click behavior of users in detail.

Recommendation of QA pages We develop a framework of QA page recommendation in which, if a user issues a query QA page Q, we recommend a list of QA pages that are relevant to the query Q, where the QA pages in the recommendation list are ranked by their relevance score. Most prior studies recommend QA pages based on keyword matching methods, and those QA pages that contain the issued keyword are ranked by their ratings, as provided by users. Our framework evaluates the relevance degree of QA pages from user click streams without matching keywords or human ratings for QA pages. The overview of our framework is illustrated in Figure 3, where our proposed framework consists of the off-line module and the on-line module. The task in the off-line module is to model a QA latent browsing graph from a given set of click streams. Once the observations of staying time information are collected from each QA page, we utilize BrowseRank on the QA latent browsing graph to derive the importance scores of QA pages. Then, we further derive the relevance degree of the QA pages in the QA latent browsing graph for a given QA page because the QA latent browsing graph contains the direct relevance information between QA pages based on the time-constraint Markov property. In the on-line module, given a QA page, a list of QA pages is derived by exploring Random Walk with Restart in the QA latent browsing graph.

4 Graph structures to model user browsing behavior

Given a set of click streams, we propose two graph structures to capture the user browsing relationships among QA pages. In Section 4.1, the QA browsing graph model is presented. To include more links via latent relationships among QA pages, the QA latent browsing graph is developed in Section 4.2.

² The snippet of user click streams is from real logs of Yahoo! Answers after removing privacy-sensitive information.


Figure 3 An overview of our proposed QA recommendation.

4.1 QA browsing graph

Since BrowseRank in [11] uses one Browsing Graph to model user browsing behavior, we borrow the concept of Browsing Graph to generate a QA browsing graph. Explicitly, in the QA browsing graph, each node represents one QA page and the edges between QA pages indicate the corresponding browsing relationship. Note that from the user click streams, we define a QA event as follows:

Definition 1 (QA event) A QA event e is a four-tuple: (u, x, y, t), where u is a user ID, y is the QA page, x is the referred page (either non-QA or QA page) that provides a link to QA page y, and t is the time-stamp when QA page y is visited.

A sequence of QA events is obtained by ordering the QA events of a user in an increasing order of time stamps. The users may search for satisfactory answers to a question by visiting QA pages until the QA pages satisfy the need of the user. Consequently, a sequence of chronological QA events that are triggered by a user is derived. Each sequence of QA events from a user represents the relevance of QA pages because the user surfs these QA pages for their questions. Given a sequence of QA events from users, a QA browsing graph is built via the following four steps:

Step 1—session segmentation The sequences of QA events from the collective users were segmented into a set of QA sessions with a given time constraint c because nearby QA events are likely to contain similar QA content. Our segmentation was similar to the time rule of BrowseRank [11]. The definition of a QA session is as follows:

Definition 2 (QA session) Given a timing constraint c, a QA session, s, from a user is a sequence of QA events that are ordered by their time stamp, $s = (e_1, \ldots, e_r)$, where the time difference between each consecutive QA event is no greater than c.

Given the user click streams in Table 1 and the time constraint c = 60 s, Table 2 illustrates the result of session segmentation.
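The segmentation rule of Definition 2 can be sketched as follows. The `QAEvent` tuple mirrors Definition 1; the function name `segment_sessions`, the user ID "u1", and the encoding of April 24 as an offset of 86,400 s are assumptions introduced only to make the Table 1 example runnable.

```python
from collections import namedtuple

# A QA event (u, x, y, t) as in Definition 1.
QAEvent = namedtuple("QAEvent", ["u", "x", "y", "t"])


def segment_sessions(events, c=60):
    """Split each user's QA events into sessions whose consecutive gaps are <= c."""
    sessions = []
    open_session = {}  # user ID -> the session currently being extended
    for e in sorted(events, key=lambda ev: (ev.u, ev.t)):
        current = open_session.get(e.u)
        if current is None or e.t - current[-1].t > c:
            current = []
            sessions.append(current)
            open_session[e.u] = current
        current.append(e)
    return sessions


DAY = 86400  # assumption: the second day of Table 1 is offset by one day in seconds
EVENTS = [
    QAEvent("u1", "P0", "A", 10), QAEvent("u1", "A", "B", 20),
    QAEvent("u1", "P1", "C", 30), QAEvent("u1", "P2", "D", 40),
    QAEvent("u1", "P2", "E", 50),
    QAEvent("u1", "P3", "F", DAY + 10), QAEvent("u1", "P4", "G", DAY + 25),
    QAEvent("u1", "P4", "H", DAY + 35),
    QAEvent("u1", "P2", "A", DAY + 110), QAEvent("u1", "P2", "D", DAY + 120),
]

if __name__ == "__main__":
    for session in segment_sessions(EVENTS, c=60):
        print([e.y for e in session])
```

With c = 60, the gap between H and the second visit to A (75 s) opens a third session, which matches the three sessions listed in Table 2.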

Step 2—browsed hyper-link relations We extract the browsed hyper-link relations among the QA pages once the user sessions are determined. A browsed hyper-link relation between QA pages $(q_i, q_j)$ indicates the transition process from QA page $q_i$ to QA page $q_j$ through the hyper-links in $q_i$. Specifically, the browsed hyper-link transition in the QA pages is defined as follows:

Definition 3 (Browsed hyper-link relations) Given a pair of QA pages $(q_i, q_j)$, a browsed hyper-link relation $r = (q_i, q_j)$ from QA page $q_i$ to QA page $q_j$ occurs when a user reaches $q_j$ through hyper-links in the QA recommendation list of $q_i$.

The browsed hyper-link relations are explicitly recorded in the QA events. Specifically, given a QA event $e_j = (u_j, x_j, y_j, t_j)$, if the referred page $x_j$ is a QA page, $r = (x_j, y_j)$ demonstrates a browsed hyper-link relation. For example, we observe a browsed hyper-link transition from QA event $e_{12}$ in QA session $s_1$ in Table 2. We consider each pair of QA pages $(q_i, q_j)$ that are involved in a hyper-link transition as evidence of context-dependency between $q_i$ and $q_j$ because they are suggested as relevant by the on-line user who triggered this event. Given a browsed hyper-link transition $r = (q_i, q_j)$, a weighted and directed edge from $q_i$ to $q_j$ is created. Each edge r is associated with a transition frequency, defined as the number of hyper-link transitions from $q_i$ to $q_j$ from the collective users. For example, in Figure 4, a browsed hyper-link relation (A → B) for $e_{12}$ is built with its transition frequency, 1, because the browsing log of one user is available.
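Extracting these edges amounts to counting the QA events whose referred page is itself a QA page. The sketch below uses the sessions of Table 2 as (x, y, t) triples; the helper `is_qa_page`, which treats labels starting with "P" as non-QA pages, is a hypothetical convention that only fits this toy example.

```python
from collections import Counter

# Sessions from Table 2 as (x, y, t) triples; the user ID is omitted for brevity.
SESSIONS = [
    [("P0", "A", 10), ("A", "B", 20), ("P1", "C", 30), ("P2", "D", 40), ("P2", "E", 50)],
    [("P3", "F", 10), ("P4", "G", 25), ("P4", "H", 35)],
    [("P2", "A", 110), ("P2", "D", 120)],
]


def is_qa_page(page):
    # Hypothetical convention for this example: non-QA pages are labelled P0, P1, ...
    return not page.startswith("P")


def hyperlink_edges(sessions):
    """Transition frequency of every browsed hyper-link relation (x -> y)."""
    freq = Counter()
    for session in sessions:
        for x, y, _t in session:
            if is_qa_page(x):  # the referred page is a QA page
                freq[(x, y)] += 1
    return freq


if __name__ == "__main__":
    print(dict(hyperlink_edges(SESSIONS)))
```

Only event e12 has a QA page as its referred page, so the single edge A → B with frequency 1 is produced.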

Step 3—global resetting relations The relation among sessions is presented in this step. In BrowseRank, the transition from the end page of a session to the initial page of another session is called a global resetting relation. Given two QA sessions $s_i = (e_{i1}, \ldots, e_{im})$ and $s_j = (e_{j1}, \ldots, e_{jn})$, a global resetting relation indicates a browsing relation from the end QA event of a QA session to the initial QA event in a consecutive QA session. For example, the QA page $y_{im}$ in the QA event $e_{im}$ has a global resetting relation to $y_{i1}$ and $y_{j1}$ in the sessions $s_i$ and $s_j$, respectively, with probabilities in proportion to their frequencies of being an initial QA page.

Table 2 Session segmentation.

Session ID  Event  x   y  t
S1          e11    P0  A  10
S1          e12    A   B  20
S1          e13    P1  C  30
S1          e14    P2  D  40
S1          e15    P2  E  50
S2          e21    P3  F  10
S2          e22    P4  G  25
S2          e23    P4  H  35
S3          e31    P2  A  110
S3          e32    P2  D  120

Figure 4 An example of QA browsing graph.

Relation type                 From  To  Weight
Browsed hyper-link relation   A     B   1
Global resetting relation     E     S   1
Global resetting relation     H     S   1
Global resetting relation     D     S   1
Global resetting relation     S     A   2
Global resetting relation     S     F   1

We follow the technique used in BrowseRank to model global resetting relations. In BrowseRank, two types of QA pages are involved in a global resetting relation, the end QA pages and initial QA pages. An end QA page refers to the last visited QA page in a QA session. An initial QA page refers to the first QA page in a session. Once the set of end QA pages $Q_{end}$ and the set of initial QA pages $Q_{init}$ are identified, BrowseRank introduces a pseudo vertex, S, to connect the end pages and initial pages. Specifically, a weighted and directed edge $r_{init} = (S, q_{init})$ is created for each initial QA page $q_{init} \in Q_{init}$ and a weighted and directed edge $r_{end} = (q_{end}, S)$ is created for each end QA page $q_{end} \in Q_{end}$. Each edge $r_{init}$ ($r_{end}$) is associated with a transition frequency, defined as the frequency with which $q_{init}$ ($q_{end}$) initiates (ends) a session. As illustrated in Table 2, the set of initial QA pages is $Q_{init} = \{A, F\}$ and the set of end QA pages is $Q_{end} = \{E, H, D\}$. Accordingly, we obtained five global resetting relations as illustrated in Figure 4. Among the global resetting relations, S → A occurred twice and the other global resetting relations occurred once.

A global resetting transition refers to the behavior in which users drop current sessions and restart from the initial QA pages of available QA sessions. A QA page $q_{init}$ is more likely to be a restart point if $q_{init}$ initiates the QA sessions more frequently. Similarly, $q_{end}$ is more likely to be a drop point if users frequently end sessions after visiting $q_{end}$. In our example, the pseudo vertex S was used to form a primitive graph (connected graph).
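The construction of the edges through the pseudo vertex can be sketched directly from the session boundaries. The function name `global_resetting_edges` and the reduction of each session of Table 2 to its ordered list of QA pages are simplifications made for illustration.

```python
from collections import Counter

# QA pages of the three sessions in Table 2, in visiting order.
SESSIONS = [
    ["A", "B", "C", "D", "E"],
    ["F", "G", "H"],
    ["A", "D"],
]


def global_resetting_edges(sessions, pseudo="S"):
    """Edges S -> initial page and end page -> S, weighted by session counts."""
    edges = Counter()
    for pages in sessions:
        if not pages:
            continue
        edges[(pseudo, pages[0])] += 1   # the session starts at pages[0]
        edges[(pages[-1], pseudo)] += 1  # the session ends at pages[-1]
    return edges


if __name__ == "__main__":
    for (src, dst), weight in sorted(global_resetting_edges(SESSIONS).items()):
        print(f"{src} -> {dst}: {weight}")
```

For Table 2 this yields the five relations described above, with S → A counted twice because page A initiates both S1 and S3.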

Step 4—staying time extraction For simplicity, given a QA session $s = (e_1, \ldots, e_r)$ from a user u, the time difference between two consecutive QA events $e_i = (u, x_i, y_i, t_i)$ and $e_{i+1} = (u, x_{i+1}, y_{i+1}, t_{i+1})$ is the staying time for the QA page $y_i$ in $e_i$. The staying time for the end QA page was randomly determined based on the derived staying times in the corresponding QA session because we could not derive the staying time by subtracting the visiting times of two immediate QA events for the end QA pages.
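A minimal sketch of this extraction step is given below. The text only states that the staying time of the end QA page is randomly determined from the staying times derived in the same session; drawing it uniformly from those values, skipping sessions with a single event, and the function name `staying_times` are assumptions.

```python
import random
from collections import defaultdict

# Sessions from Table 2 as (y, t) pairs: the visited QA page and its time stamp.
SESSIONS = [
    [("A", 10), ("B", 20), ("C", 30), ("D", 40), ("E", 50)],
    [("F", 10), ("G", 25), ("H", 35)],
    [("A", 110), ("D", 120)],
]


def staying_times(sessions, rng=None):
    """Collect the observed staying times of every QA page."""
    rng = rng or random.Random(0)
    observed = defaultdict(list)
    for session in sessions:
        # Staying time of y_i is t_{i+1} - t_i.
        gaps = [nxt[1] - cur[1] for cur, nxt in zip(session, session[1:])]
        for (page, _t), gap in zip(session, gaps):
            observed[page].append(gap)
        if gaps:
            # Assumption: sample the end page's staying time uniformly from
            # the gaps of the same session; single-event sessions are skipped.
            observed[session[-1][0]].append(rng.choice(gaps))
    return dict(observed)


if __name__ == "__main__":
    print(staying_times(SESSIONS))
```

Page A appears in two sessions and therefore collects two observations, which later feed its staying time distribution.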

A QA browsing graph $G = (V \cup \{S\}, E, W)$ is built via the above four steps, where S represents a pseudo vertex, $v \in V$ represents a QA page, and each edge $e \in E$ represents one of the following relations: (i) the browsed hyper-link relations; and (ii) the global resetting relations involving the pseudo vertex. Each QA page $v \in V$ is associated with a set of observations of staying time. The transition matrix of the QA browsing graph is denoted as W, where each entry $w(i, j)$ refers to the transition frequency from $v_i$ to $v_j$. Given Table 2, the QA browsing graph shown in Figure 4 contains nine vertices, where S represents the pseudo vertex and the remaining vertices indicate the QA pages. As illustrated in Figure 4, the QA browsing graph contains six directed edges, in which one solid edge represents the browsed hyper-link relation and the five dashed edges represent the global resetting relations. The QA browsing graph comprises three isolated QA pages (i.e., C, E and G). The importance of QA pages may be determined by using a QA browsing graph. Given a query QA page, Random Walk with Restart is performed to retrieve the relevant QA pages. The relevant QA pages are derived by traversing the QA browsing graph, and their corresponding relevance scores are determined during the traversal. However, if most nodes are isolated or have few links, most QA pages may not obtain their importance and relevance scores. Figure 1 illustrates the degree distribution of QA pages in a QA browsing graph (ignoring the direction of edges). As illustrated in Figure 1, approximately 40.5% of QA pages have a degree of zero. Figure 5 illustrates the in-link and out-link distributions of QA pages in a QA browsing graph. As observed in Figure 5, most of the QA pages in the QA browsing graph have zero in-links (out-links). In particular, 64.2% (49.1%) of QA pages have zero in-links (out-links). Consequently, the link relationships must be enhanced, which requires the design of a latent QA browsing graph.

4.2 QA latent browsing graph

We propose a QA latent browsing graph by improving the links from the latent user browsing behavior. The latent user browsing behavior consists of three relations: (1) local resetting relations, (2) multiple-click relations, and (3) time-constrained relations. More links are added if QA pages contain these relations. These three relations are presented as follows:

(a) In-link distribution of the QA browsing graph (QA-BG). Axes: number of in-links, percentage (%).
(b) Out-link distribution of the QA browsing graph (QA-BG). Axes: number of out-links, percentage (%).
Figure 5 Link distributions of QA pages from Yahoo! Asia Knowledge Plus (July 15, 2009–July 17,

4.2.1 Local resetting relations

In contrast to global resetting relations, which model the transition behavior among QA pages from sessions to sessions, the local resetting relations model the transition behavior for QA pages within a session. We occasionally observe a fragment of a non-QA page sequence between two QA pages. For example, $(e_i, p_1, \ldots, p_k, e_{i+1})$, where $e_i$ and $e_{i+1}$ represent QA events and $p_j$ represents non-QA events for each $1 \le j \le k$. An intuitive approach to relate the corresponding QA pages in $e_i$ and $e_{i+1}$ is to construct a path from $y_i$ to $y_{i+1}$ as follows: $y_i \to \bar{y}_1 \to \cdots \to \bar{y}_k \to y_{i+1}$. However, users do not always visit the next page through a hyper-link transition from the current page. Instead of configuring the complex relations among the fragment, we simplify their relations for QA pages and connect two corresponding QA pages in $e_i$ and $e_{i+1}$ by generating a direct path from $y_i$ to $y_{i+1}$. We refer to such a relation as a local resetting relation. The definition of a local resetting relation is as follows:

Definition 4 (Local resetting relations) Given a pair of consecutive QA events $(e_i, e_{i+1})$ in a QA session s, a local resetting relation is observed between $e_i$ and $e_{i+1}$ if a continuous sequence of non-QA events between $e_i$ and $e_{i+1}$ is found.

The concept of local resetting relations between two QA pages $(y_i, y_{i+1})$ is to describe a jump behavior (i.e., starting from the QA page in $e_i$ and ending at the QA page in $e_{i+1}$, with a minimum of one non-QA page between $e_i$ and $e_{i+1}$). More links are added among QA pages with local resetting relations. Figure 6a illustrates examples of local resetting relations (dashed black links). For example, a user may examine the recommended QA pages listed in B, decide to move on to the non-QA page P1, discover C in P1, and then decide to visit C. This series of decisions implies that the QA page B is the prior context of C. Furthermore, C is regarded as more relevant or significant than those QA pages that are listed in the recommendation block of B.
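Definition 4 can be sketched on the raw click stream, where non-QA visits are still present. The visit sequence below is the April 23 portion of Table 1 as read from Table 2; the helper `is_qa_page` (labels starting with "P" are non-QA pages) and the function name `local_resetting_edges` are hypothetical.

```python
def is_qa_page(page):
    # Hypothetical convention for this example: non-QA pages are labelled P0, P1, ...
    return not page.startswith("P")


def local_resetting_edges(visits):
    """Link consecutive QA pages that are separated by at least one non-QA page.

    visits: page labels of one session in chronological order, including
            the non-QA pages.
    """
    edges = []
    last_qa = None
    seen_non_qa = False
    for page in visits:
        if is_qa_page(page):
            if last_qa is not None and seen_non_qa:
                edges.append((last_qa, page))
            last_qa = page
            seen_non_qa = False
        else:
            seen_non_qa = True
    return edges


if __name__ == "__main__":
    april_23 = ["P0", "A", "B", "P1", "C", "P2", "D", "E"]
    print(local_resetting_edges(april_23))
```

The transition A → B is not reported here because no non-QA page lies between the two visits; it is already covered by the browsed hyper-link relation.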

(a) Local resetting links (b) Multiple-click links (c) Time-constrained links


4.2.2 Multiple-click relations

A multiple-click relation refers to the behavior in which a user visits more than one QA page from a non-QA page. The sequence of QA pages visited through hyper-link transitions from a non-QA page is referred to as a multiple-click group. The current QA page prior to the non-QA page is considered as the context related to each QA page in the multiple-click group. Formally, the multiple-click relation can be defined as follows:

Definition 5 (Multiple-click relations) Given a segment of a QA session $s = (e_i, e_{i+1}, \ldots, e_{i+r})$, the QA page $y_i$ in the QA event $e_i$ has a multiple-click relation to each subsequent QA page in $e_j$, if (1) the referred page $x_i$ in $e_i$ is different from each $x_j$ in the subsequent QA event $e_j$ and (2) each consecutive QA event $e_j$ has the same referred page $x_j$, where $i + 1 \le j \le i + r$.

An example of a multiple-click relation is illustrated by the dashed black links in Figure 6b. A user first examines the QA page C, moves on to the non-QA page P2, discovers two QA pages {D, E}, and visits each of them sequentially. The set of QA pages {D, E} forms a multiple-click group. The current QA page prior to {D, E} is C, which is considered as the context of both D and E. Consequently, there are two multiple-click relations, C → D and C → E.
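
The two conditions of Definition 5 can be sketched as follows. The representation of a QA event as a `(referred page, QA page)` tuple, the function name, and the `min_group` parameter (which enforces "more than one QA page") are assumptions made for this illustration.

```python
from typing import List, Tuple

QAEvent = Tuple[str, str]  # (referred page x, QA page y): hypothetical layout


def multiple_click_links(events: List[QAEvent], min_group: int = 2):
    """Link y_i to every QA page in the multiple-click group that follows it."""
    links = []
    for i, (x_i, y_i) in enumerate(events):
        if i + 1 >= len(events):
            break
        shared_ref = events[i + 1][0]
        if shared_ref == x_i:      # condition (1) is violated
            continue
        group = []
        for x_j, y_j in events[i + 1:]:
            if x_j != shared_ref:  # condition (2) no longer holds
                break
            group.append(y_j)
        if len(group) >= min_group:
            links.extend((y_i, y_j) for y_j in group)
    return links


# The example from the text: C is reached from B; D and E are both reached from P2.
events = [("B", "C"), ("P2", "D"), ("P2", "E")]
print(multiple_click_links(events))
```

No relation D → E is produced, because D and E share the referred page P2 and therefore fail condition (1).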

4.2.3 Time-constrained relations

We explore time-constrained relations for QA pages to improve the QA browsing graph. In contrast to the previously mentioned latent relations, time-constrained relations may link QA pages that are not consecutive, and they enrich the neighboring context of each QA page in a session by imposing a time constraint on each pair. The browsing graph is modeled based on the Markov assumption, which states that the page a user will visit next depends only on the current page and is independent of the pages that the user visited previously. The Markov assumption is not realistic because the QA page that a user will visit next may also depend on the QA pages that the user visited previously. Table 3 illustrates an example of a QA session. We observed that the first three QA pages were highly relevant to the last QA page event, although the first two QA pages and the last QA page were not repeatedly visited by the user.

To relax the Markov assumption, the time-constrained relation is considered, where an imposed time constraint determines a flexible number of previously visited pages for the QA page that a user will visit next. Specifically, a neighboring time constraint maxspan specifies the maximal allowed time difference between a pair of QA pages in a session. The time-constrained relation for a pair of QA pages is defined as follows:

Table 3 An example of time-constrained relation among QA pages in a user QA session.

Time-stamp           Question subject
2009/07/21 21:08:39  About upload in facebook
2009/07/21 21:09:25  How to upload photos to facebook
2009/07/21 21:12:29  How to upload photos to facebook
2009/07/21 21:12:57  Ask for help facebook experts


Definition 6 (Time-constrained relations) Given a QA session $s = (e_1, \ldots, e_r)$ and a neighboring timing constraint maxspan, a pair of QA pages $y_i$ and $y_j$ has a time-constrained relation if their corresponding QA events $e_i$ and $e_j$ are discontinuously triggered by a user in a session and if the triggered time difference between $e_i$ and $e_j$ is less than maxspan.

A QA browsing graph including time-constrained relations is illustrated in Figure 6c by dashed black links, with the assumption that maxspan is 60 s. We observe that the amount of context (previously visited QA pages) that a QA page depends on is extended and restricted by maxspan. The maxspan affects the fraction of prior context that belongs to a QA page. A QA page located at the head of a session has a higher probability of being part of the prior context of a later QA page in the session. In contrast, a QA page located at the end of a session has an optimal prior context. The fraction of prior context is represented by the ratio of incoming and outgoing time-constrained relations. As illustrated in Figure 6c, the QA page E, which is located in the last position of the session, has the most incoming time-constrained relations, whereas A has the most outgoing time-constrained relations.
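
A minimal sketch of Definition 6 is given below. The time offsets are derived from the time-stamps in Table 3 (seconds elapsed since the first event); the page labels `q1` to `q4`, the function name, and the choice of maxspan = 300 s are assumptions made only for illustration.

```python
from typing import List, Tuple


def time_constrained_links(events: List[Tuple[int, str]], maxspan: int):
    """events: (time in seconds, QA page), sorted by time.
    Returns pairs of non-consecutive QA pages whose time difference
    is less than maxspan."""
    links = []
    for i in range(len(events)):
        t_i, y_i = events[i]
        for j in range(i + 2, len(events)):  # skip the consecutive event i + 1
            t_j, y_j = events[j]
            if t_j - t_i >= maxspan:
                break  # later events are even further away in time
            links.append((y_i, y_j))
    return links


# Offsets for 21:08:39, 21:09:25, 21:12:29 and 21:12:57 (Table 3).
events = [(0, "q1"), (46, "q2"), (230, "q3"), (258, "q4")]
print(time_constrained_links(events, maxspan=300))
```

With a tighter constraint such as 60 s, no non-consecutive pair in this session qualifies, which illustrates how maxspan bounds the prior context of each QA page.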

Both the local resetting relations and the multiple-click relations that do not satisfy the timing constraint must be eliminated from the QA browsing graph. With the above relations, a QA latent browsing graph is a weighted and directed graph $G^t = (V^t \cup S, E^t, W^t)$, where $S$ represents the pseudo vertex, each vertex $v \in V^t$ represents a QA page, each $e \in E^t$ involving $S$ represents a global resetting relation, and each $e \in E^t$ connecting two QA pages represents a mixture of the following relations: (i) browsed hyper-link transitions, (ii) local resetting relations (under the time constraint maxspan), (iii) multiple-click relations (under the time constraint maxspan), and (iv) time-constrained relations. The adjacency matrix of $G^t$ is represented by $W^t$, where each entry $w^t(i, j)$ denotes the transition frequency from vertex $y_i$ to $y_j$. Formally, the weight of the edge $e^t(i, j)$ is defined as follows:

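$$
w^t(i, j) =
\begin{cases}
\gamma, & \text{if } y_i = S \text{ and } y_j \in V^t \text{ and the QA page } y_j \text{ appears } \gamma \text{ times as the first QA event in a QA session} \\
\delta, & \text{if } y_i \in V^t \text{ and } y_j = S \text{ and the QA page } y_i \text{ appears } \delta \text{ times as the last QA event in a QA session} \\
c, & \text{if } y_i \in V^t \text{ and } y_j \in V^t \text{ and the transition from } y_i \text{ to } y_j \text{ appears } c \text{ times} \\
0, & \text{otherwise.}
\end{cases}
\tag{1}
$$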
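
The edge weights in (1) can be accumulated with simple counters, as sketched below. We assume here that each session is given as the ordered list of its QA pages, that consecutive QA pages count as one transition, and that latent relations extracted elsewhere are passed in as `extra_links`; these names and the pseudo-vertex label `"<S>"` are hypothetical.

```python
from collections import Counter

S = "<S>"  # label for the pseudo vertex (assumption)


def edge_weights(sessions, extra_links=()):
    """Count the transition frequencies w(i, j) of equation (1)."""
    w = Counter()
    for pages in sessions:
        if not pages:
            continue
        w[(S, pages[0])] += 1   # first QA event of the session
        w[(pages[-1], S)] += 1  # last QA event of the session
        for a, b in zip(pages, pages[1:]):
            w[(a, b)] += 1      # observed transition a -> b
    for a, b in extra_links:
        w[(a, b)] += 1          # latent relation a -> b
    return w


sessions = [["A", "B", "C"], ["A", "C"]]
w = edge_weights(sessions, extra_links=[("A", "C")])
print(w[(S, "A")], w[("C", S)], w[("A", "C")])
```

Pairs that never occur are simply absent from the counter, which corresponds to the "0, otherwise" case.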

5 Analysis of timing constraint

The timing constraint determines the amount of latent relations and the density of the latent browsing graph, both of which increase with maxspan. To determine the value of maxspan, we address the following issues: (1) identify the elementary features that influence the quality of the relations; (2) examine the effect of these elementary features on the relation quality as maxspan increases; and (3) determine the value of maxspan.


5.1 Indicators of the quality of relations

To address the first issue, we examine two aspects of qualities as follows: (i) the relevance degree of a relation, and (ii) the behavioral coherence among a set of relations.

Relevance degree The relevance degree of a relation refers to how relevant the pair of QA pages in the relation are to each other. If the time difference between the pair of QA pages is short, they are more likely to be relevant to each other. Table 4 presents an example that supports this observation: QA pages involved in short relations tend to be more relevant. Let the QA page on the left-hand side of a relation be the prefix and the QA page on the right-hand side be the suffix. For example, the time difference between QA pages A and B is approximately 20 s, and they are considered highly relevant because both pages are related to the topic of “free on-line movies.” In contrast, the time difference between QA pages A and D is approximately 900 s, and they are considered less relevant because A is about “free on-line movies” and D is about “free on-line music.” Furthermore, if a relation, such as (A, B), is supported by several users, the pages are likely to be relevant. Consequently, we present a metric to evaluate the relevance degree of a relation as follows:

$$
rel(i, j)^t = \frac{w_{i,j}}{\sum_{k \in N(i)} w_{i,k}}, \tag{2}
$$

where $w_{i,j}$ represents the transition frequency of the relation $(x_i, x_j)$ and $N(i)$ represents the neighbors of $x_i$ given a pre-specified maxspan $t$.
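
As a small worked example of (2), the sketch below computes the relevance degree from a dictionary of transition frequencies. The frequencies are read off Table 4 under the assumption that maxspan is large enough to include all four observations; the dictionary layout and function name are hypothetical.

```python
def relevance_degree(w, i, j):
    """rel(i, j) = w[i, j] divided by the total outgoing frequency of i."""
    total = sum(count for (src, _), count in w.items() if src == i)
    return w.get((i, j), 0) / total if total else 0.0


# A -> B is observed for u1 and u2, A -> C for u2, and A -> D for u3 (Table 4).
w = {("A", "B"): 2, ("A", "C"): 1, ("A", "D"): 1}
print(relevance_degree(w, "A", "B"))
print(relevance_degree(w, "A", "D"))
```

The relation (A, B), which is supported by two users, receives a higher relevance degree than (A, D), in line with the observation above.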

Behavioral coherence The behavioral coherence among a set of relations refers to the similarity of the QA pages that are visited after a particular QA page. Specifically, given a QA page $x_i$ and a maxspan $t$, if the sets of QA pages that are visited after $x_i$ overlap markedly, then $t$ may sufficiently identify the relevant subsequent QA pages for $x_i$. As illustrated in Table 4, after $u_1$ visited QA page A, $u_1$ subsequently visited B. Similarly, after $u_2$ visited QA page A, $u_2$ subsequently visited B and C. Given a maxspan of 86 s, we observed the highest number of similar suffix sets for the QA page A because the suffix set of $u_1$ for A within 86 s was {B} and that of $u_2$ for A within 86 s was also {B}. In contrast, $u_1$ and $u_2$ shared less coherence in their browsing behavior when maxspan was longer than 86 s. In this case, we discovered that the time difference determined a relative level of behavioral coherence, which consequently indicated the relevance degree between a particular QA page and the QA pages visited after it.

Table 4 Examples of short relations and long relations.

User ID  Prefix             Topic                Suffix             Topic                t (s)
u1       1509072906787 (A)  Free on-line movies  1008061704665 (B)  Free on-line movies  23
u2       1509072906787 (A)  Free on-line movies  1008061704665 (B)  Free on-line movies  86
u2       1509072906787 (A)  Free on-line movies  1008060702361 (C)  Free on-line movies  284
u3       1509072906787 (A)  Free on-line movies  1508072509613 (D)  On-line mp3          919

In summary, the key observations in terms of quality of relations are as follows:

• The relevance degree of a relation increases as the transition frequency increases and the time difference decreases.

• The behavioral coherence that is shared among a set of relations may be used to measure the reliability of a timing constraint for a particular QA page.

5.2 Reliability testing

We consider the relevance degree and behavioral coherence to automatically determine the value of the timing constraint, maxspan, as follows:

Given a QA page $p$ and a timing constraint $t$, for each QA event $e$ of visiting the QA page $p$, we collect the QA pages visited after $p$ within time window $t$ into a suffix set, denoted as $sf_e$. If the QA page is visited $n$ times, a set of $n$ suffix sets, denoted as $Suffix(p) = \{sf_{e_1}, \ldots, sf_{e_n}\}$, is obtained. We compute a similarity score $sim(sf_{e_i}, sf_{e_j})$ for each pair of suffix sets $sf_{e_i}$ and $sf_{e_j}$. The Jaccard coefficient is used to evaluate the similarity of a pair of suffix sets. We then derive an average similarity score from the set of suffix sets. The average similarity score represents the reliability of a timing constraint $t$ for prefix $p$ and may be formally defined as follows:

$$
R_p^t = \frac{1}{|Suffix(p)|^2} \sum_{i, j \in Suffix(p),\, i \neq j} sim\left(sf_{e_i}, sf_{e_j}\right). \tag{3}
$$

We derive a reliability score for each time occurrence at which a subsequent QA page was visited after $p$ to determine the optimal timing constraint. Considering the collection of user behaviors in Table 4, four time occurrences are observed after the prefix QA page A (i.e., t = 23, t = 86, t = 284, and t = 919). We calculate the reliability score at each time occurrence for the corresponding suffix sets and then select the time occurrence with the highest reliability score as the timing constraint $t$ (Table 5). In this example, t = 86 is chosen as the value of maxspan. A different prefix QA page may yield a different timing constraint. This is reasonable because users may behave differently on various QA pages. Some prefix QA pages are followed by diverse content, which may require a longer maxspan for the suffix sets to achieve coherence. On the other hand, other prefix QA pages are consistently followed by similar QA pages, and they tend to have a shorter maxspan.
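
The selection procedure can be illustrated with the observations of Table 4. The sketch below is written under several assumptions that are not stated in the text: each user contributes one visit of the prefix page A, a suffix page belongs to the window when its time difference is at most $t$, the Jaccard coefficient of two empty sets is taken as 0, and the sum runs over ordered pairs with $i \neq j$ normalized by the squared number of suffix sets. The resulting scores are illustrative only.

```python
from itertools import permutations


def jaccard(a, b):
    union = a | b
    # Assumption: two empty suffix sets are treated as having similarity 0.
    return len(a & b) / len(union) if union else 0.0


def reliability(visits, t):
    """visits: one list of (suffix page, time difference) per visit of the prefix."""
    suffix_sets = [{page for page, dt in v if dt <= t} for v in visits]
    n = len(suffix_sets)
    total = sum(jaccard(a, b) for a, b in permutations(suffix_sets, 2))
    return total / n ** 2


# Visits of prefix page A by u1, u2 and u3 (Table 4).
visits = [[("B", 23)], [("B", 86), ("C", 284)], [("D", 919)]]
candidates = [23, 86, 284, 919]
scores = {t: reliability(visits, t) for t in candidates}
best = max(candidates, key=scores.get)
print(scores, best)
```

Under these assumptions the candidate t = 86 obtains the highest score, because it is the only window in which two suffix sets coincide exactly.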

Table 5 Examples of short relations and long relations.

User ID  Prefix  t = 23  t = 86  t = 284  t = 919
u1       A       B
u2       A               B       C
u3       A                                D
R^t


6 Algorithms of latent browsing rank and QA recommendation

Given the QA browsing graph structures, the continuous-time Markov model is used to derive the importance scores for the QA pages in Section 6.1. Then, in Section 6.2, given a query QA page Q, we adopt Random Walk with Restart on the latent browsing graph to recommend relevant QA pages.

6.1 Design of latent browsing rank

We propose Latent Browsing Rank (abbreviated as LBR) for computing the importance scores of QA pages. Similar to BrowseRank [11], LBR relies on the continuous-time Markov model. The details of BrowseRank can be found in [11].

The concept of BrowseRank is to build a continuous-time, time-homogeneous Markov process that simulates a random walk on a browsing graph and to use the stationary probability distribution of the process as a measure of page importance. To efficiently estimate this stationary probability distribution, BrowseRank leverages the correspondence between a continuous-time, time-homogeneous Markov process and a Q-process. Computing the page importance therefore amounts to deriving the stationary probability distribution of a Q-process. According to [18], deriving the stationary probability distribution of a Q-process may be reduced to deriving the stationary probability distribution of the embedded Markov chain (EMC), which is a discrete-time Markov process. Let the stationary probability distribution of the Q-process be $r$, where $r_i$ represents the importance of page $x_i$, and let the stationary probability distribution of the EMC be $\tilde{r}$. We then derive $r$ by the following equation:

$$
r_i = \frac{\tilde{r}_i / q_{ii}}{\sum_{j=1}^{n} \tilde{r}_j / q_{jj}}, \tag{4}
$$

where $q_{ii}$ represents the parameters in the Q-process and $\tilde{r}_i$ represents the stationary probability of page $x_i$ in the EMC. Consequently, two tasks are required to determine page importance: (1) estimating $q_{ii}$, and (2) deriving the stationary probability distribution $\tilde{r}$ of the EMC.
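
Equation (4) amounts to reweighting the EMC distribution by $1/q_{ii}$ and renormalizing. A minimal sketch is given below; the function name and the two-page toy values are assumptions made for illustration.

```python
def q_process_stationary(r_tilde, q):
    """Combine the EMC distribution r_tilde with the parameters q_ii < 0
    according to equation (4)."""
    ratios = [rt / qi for rt, qi in zip(r_tilde, q)]
    z = sum(ratios)  # negative, because every q_ii is negative
    return [x / z for x in ratios]


# Two pages that are equally likely in the EMC; the second page has the
# smaller |q_ii|, i.e. users tend to stay longer on it.
r = q_process_stationary([0.5, 0.5], [-1.0, -0.5])
print(r)
```

Because every $q_{ii}$ is negative, the signs cancel in the ratio and the result is again a probability distribution; the page with the smaller $|q_{ii}|$ receives the larger share.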

To determine the LBR of QA pages in the QA latent browsing graph $G^t = (V^t \cup S, E^t, W^t)$, where $W^t$ denotes the transition frequency matrix, we column-normalize $W^t$ to derive $M^t$. Then, for each QA page $y_i$, the parameter $q_{ii}$, which determines the underlying staying time distribution of $y_i$, is estimated from the collected staying time information of $y_i$ by the following equation:

min qii μi+ 1 qi −1 2 σ2 i − 1 q2 i 2 , (5)

where $q_{ii} < 0$, $\mu_i$ represents the average staying time, and $\sigma_i^2$ represents the variance of the staying time.

The ranking vector is defined as $\tilde{r}$ and is initialized with all elements equal to $1/n$. Given the damping factor $\alpha$, i.e., the probability of a random surfer remaining in a session instead of resetting to other sessions, the transition probability matrix of the EMC, $T_{EMC}$, is estimated as follows:

$$
T_{EMC}(i, j) =
\begin{cases}
\alpha \dfrac{w_{i,j}}{\sum_k w_{i,k}} + (1 - \alpha)\gamma_j, & \text{if } \sum_k w_{i,k} \neq 0 \\
\gamma_j, & \text{if } \sum_k w_{i,k} = 0 \\
0, & \text{if } i = j
\end{cases}, \tag{6}
$$

where $w_{i,j}$ denotes the number of transitions from node $i$ to node $j$ and $\gamma_j$ denotes the global resetting probability from the restart node to node $j$. Once the EMC transition probability matrix $T_{EMC}$ is derived, the stationary probability distribution $\tilde{r}$ is calculated for QA pages by a power iteration algorithm (line 4). The stationary probability distribution $\tilde{r}$ reflects the importance score of QA pages given the underlying relation structure among QA pages. To incorporate the determined staying time distributions into $\tilde{r}$, the stationary probability distribution of the Q-process for QA pages, $r$, is computed by (4), where each entry $r_i$ indicates the importance score of a QA page given the relation structure and staying time distribution (line 5).

Algorithm 1 Latent browsing rank.

Input:
  $G^t$: a QA latent browsing graph,
  $\alpha$: a damping factor.
Output:
  $r$: the vector of importance scores for QA pages.
1 Estimate $q_{ii}$ for each QA page
2 Estimate the transition probability matrix of the EMC, $T_{EMC}$
3 Compute the stationary probability distribution of the EMC for QA pages, $\tilde{r}$, by the power iteration algorithm
4 Compute the stationary probability distribution of the Q-process for QA pages, $r$
5 return $r$
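
Steps 2 and 3 of Algorithm 1 can be sketched in a few lines. The code applies the three cases of (6) literally and then runs a power iteration; renormalizing the vector after every step is an assumption added here, since zeroing the diagonal means the rows need not sum to one. The toy frequencies, the uniform resetting probabilities, and the value of $\alpha$ are illustrative choices only.

```python
def emc_matrix(w, gamma, alpha):
    """Build T_EMC from transition frequencies w[i][j] following (6)."""
    n = len(w)
    T = [[0.0] * n for _ in range(n)]
    for i in range(n):
        out = sum(w[i])
        for j in range(n):
            if i == j:
                continue  # third case: no self transition
            if out:
                T[i][j] = alpha * w[i][j] / out + (1 - alpha) * gamma[j]
            else:
                T[i][j] = gamma[j]  # no outgoing transitions: reset
    return T


def power_iteration(T, tol=1e-12, max_iter=1000):
    """Approximate the stationary distribution of T, starting from 1/n."""
    n = len(T)
    r = [1.0 / n] * n
    for _ in range(max_iter):
        nxt = [sum(r[i] * T[i][j] for i in range(n)) for j in range(n)]
        z = sum(nxt)
        nxt = [x / z for x in nxt]  # renormalization (assumption)
        if sum(abs(a - b) for a, b in zip(nxt, r)) < tol:
            return nxt
        r = nxt
    return r


w = [[0, 2, 1], [1, 0, 0], [0, 0, 0]]  # page 2 has no outgoing transitions
gamma = [1 / 3, 1 / 3, 1 / 3]
T = emc_matrix(w, gamma, alpha=0.85)
r_tilde = power_iteration(T)
print(r_tilde)
```

The resulting vector plays the role of $\tilde{r}$ and would then be combined with the estimated $q_{ii}$ values through (4).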

6.2 On-line QA recommendation module

The on-line QA recommendation module is presented in this section. Our recommendation module (abbreviated as LBRR) is based on the QA latent browsing graph. Specifically, given a query QA page Q, LBRR aims to compute the top-k QA pages that are most relevant to Q by exploiting Random Walk with Restart to retrieve the relevant QA pages in the QA browsing graph structures.

Given a QA latent browsing graph $G^t$, the column-normalized transition probability matrix of $G^t$ is defined as $M^t$ (line 2). The restart vector $v_Q$ and the relevance vector $u_Q$ are initialized with all elements set to zero, except for the entry corresponding to the query QA page Q, which is set to one (line 3–4). Then, the relevance score of each QA page with respect to Q is obtained by the Random Walk with Restart model, which iteratively computes the following equation until the ranking vector $u_Q$ converges (line 5–7):

$$
u_Q^{k+1} = (1 - \alpha) M^t u_Q^k + \alpha v_Q, \tag{7}
$$

where $k$ is the number of iterations, $u_Q^k$ represents the relevance score of each QA page in the $k$th iteration, and $\alpha$ is the restart factor, which represents the probability of a random surfer restarting from the current QA page to the query QA page Q instead of following the outgoing relations to its neighboring QA pages in $M^t$ (line 6). Consequently, a QA page $y_i$ in $G^t$ is regarded as more relevant to Q if the probability of a random surfer reaching the QA page $y_i$ from Q is higher. After the relevance vector $u_Q^{k+1}$ has converged, $u_Q^{k+1}$ is returned, where each entry $u_Q(i)^{k+1}$ represents the relevance score of $y_i$ to the query QA page Q. Finally, a list of QA pages with the top-k highest relevance scores is returned. If the relevance scores of the returned QA pages are the same, their rank may be determined by the LBR score. The algorithm for computing the relevance scores of QA pages for a query QA page Q is summarized in Algorithm 2.

Algorithm 2 Latent browsing recommendation.

Input:
  $G^t$: a QA latent browsing graph,
  $\alpha$: a restart probability,
  Q: a query QA page.
Output:
  $u_Q$: the vector of relevance scores for query QA page Q.
1 Compute $M^t$
2 Initialize $v_Q$
3 Initialize $u_Q = v_Q$
4 while $u_Q^{k+1}$ has not converged do
5   Update $u_Q^k$ by $u_Q^{k+1} = (1 - \alpha) M^t u_Q^k + \alpha v_Q$
6 end
7 return $u_Q^{k+1}$: the vector of relevance scores for query QA page Q
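
Algorithm 2 can be sketched in plain Python as follows. We assume that `w[i][j]` stores the transition frequency from page `i` to page `j`, so that $M^t$ is obtained by dividing each frequency by the total outgoing frequency of its source page (each non-empty column of $M^t$ then sums to one). The function name, the toy graph, and the default $\alpha = 0.15$ are illustrative assumptions.

```python
def rwr(w, query, alpha=0.15, tol=1e-12, max_iter=10_000):
    """Random Walk with Restart: u <- (1 - alpha) * M u + alpha * v."""
    n = len(w)
    out = [sum(row) for row in w]
    # M[j][i]: probability of moving from page i to page j.
    M = [[w[i][j] / out[i] if out[i] else 0.0 for i in range(n)]
         for j in range(n)]
    v = [0.0] * n
    v[query] = 1.0  # restart vector: all mass on the query page
    u = v[:]        # relevance vector, initialized to v
    for _ in range(max_iter):
        nxt = [(1 - alpha) * sum(M[j][i] * u[i] for i in range(n)) + alpha * v[j]
               for j in range(n)]
        if sum(abs(a - b) for a, b in zip(nxt, u)) < tol:
            return nxt
        u = nxt
    return u


# Page 0 is the query; it leads to page 1 three times and to page 2 once.
w = [[0, 3, 1], [1, 0, 1], [1, 1, 0]]
u = rwr(w, query=0)
top_k = sorted(range(len(u)), key=lambda i: u[i], reverse=True)
print(u, top_k)
```

In this toy graph page 1 is ranked above page 2, because the random surfer reaches it from the query page more often.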

7 Performance evaluation

In this section, we first compare the effectiveness of our proposed graph models with BrowseRank in terms of the importance rank of QA pages (Latent Browsing Rank) and the relevance rank of QA pages (Latent Browsing Recommendation Rank). Then, we investigate the robustness of our proposed graph models by varying the value of maxspan.

7.1 Datasets and experimental settings

We conduct experiments with real datasets to evaluate the performance of the proposed graph model. The dataset was collected over three days during July 2009 from the commercial Question and Answering forums of Yahoo! Asia Knowledge Plus (AKP). The dataset contains approximately 43 million click events and 5.8 million clicked QA pages. The dataset is modeled as three types of graphs: the browsing graph (BG), the QA browsing graph (QA-BG), and the QA latent browsing graph (L-QA-BG).

Table 6 Description of graph models.

Graph model      maxspan (s)  Number of vertices  Number of edges  Graph density
BG               None         3,375,998           2,708,259        4.75 × 10^−7
QA-BG            None         3,162,988           2,675,099        5.35 × 10^−7
L-QA-BG(t300)    300          5,434,668           27,298,162       1.85 × 10^−6
L-QA-BG(t600)    600          5,496,791           39,235,500       2.60 × 10^−6
L-QA-BG(t1200)   1,200        5,519,518           47,315,599       3.11 × 10^−6

As summarized in Table 6, BG contains both QA pages and non-QA pages, whereas QA-BG contains only QA pages. L-QA-BG contains only QA pages and is built with an imposed time threshold maxspan, ranging from 300 to 1200 s. We now compare the density of the graphs, where, given a graph $G = (V, E)$, the graph density is computed as follows:

$$
D(G) = \frac{2|E|}{|V|\,(|V| - 1)}. \tag{8}
$$
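
As a quick check of (8), the density can be recomputed from the vertex and edge counts reported in Table 6. The function name is an assumption; the counts are taken from the table.

```python
def graph_density(num_vertices: int, num_edges: int) -> float:
    """Graph density D(G) = 2|E| / (|V| (|V| - 1)) of equation (8)."""
    return 2 * num_edges / (num_vertices * (num_vertices - 1))


# Vertex and edge counts of BG as reported in Table 6.
density_bg = graph_density(3_375_998, 2_708_259)
print(f"{density_bg:.2e}")
```

The value agrees with the density reported for BG once rounded to three significant digits.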

7.2 Evaluation metrics

Both the importance ranking quality and relevance ranking quality are measured in terms of three aspects as follows: (i) the amount of incoming transitions, (ii) the accumulated staying time, and (iii) the number of multiple-click groups. Each of these aspects is defined as follows:

Let the ranking list of QA pages generated by Latent Browsing Rank on the QA latent browsing graph L-QA-BG(t600) be TOP(L-QA-BG,t600), and let the ranking lists of QA pages generated by BrowseRank on BG and QA-BG be TOP(BG) and TOP(QA-BG), respectively. An indicator of popularity is proposed to measure the distribution of the incoming hyper-link transitions over the rank. Specifically, we define the popularity of a QA page $q$, $IN_q$, as the amount of incoming transitions of $q$ in the QA browsing graph QA-BG. Considering the ranking position, we define the average number of incoming transitions for a QA page $q$ at rank $K$ as follows:

$$
\alpha_q@K = \frac{\sum_{p=1}^{K} IN_{q@p}}{K}. \tag{9}
$$

Accordingly, the popularity measured by mean average incoming transitions of a given ranking list is defined as follows:

$$
\beta@K = \frac{\sum_{p=1}^{K} \alpha_q@p}{K}. \tag{10}
$$

In principle, given a ranking list with popular highly-ranked QA pages, the ranking list achieves a higher score in terms of popularity at a higher rank in comparison with a ranking list with unpopular highly-ranked QA pages.
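
The popularity indicator can be sketched as follows, under the reading that $\beta@K$ averages the running averages $\alpha_q@p$ over the positions $p = 1, \ldots, K$. The function names and the toy incoming-transition counts are assumptions made for illustration.

```python
def avg_incoming_at(in_counts, K):
    """Average number of incoming transitions over the top-K positions."""
    return sum(in_counts[:K]) / K


def popularity_at(in_counts, K):
    """Mean of the running averages at positions 1..K."""
    return sum(avg_incoming_at(in_counts, p) for p in range(1, K + 1)) / K


# in_counts[p - 1] is the number of incoming transitions of the page at rank p.
popular_first = [10, 4, 1]
popular_last = [1, 4, 10]
print(popularity_at(popular_first, 3), popularity_at(popular_last, 3))
```

Both lists contain the same pages, but the list that places the popular page first obtains the higher score, which is the behavior described above.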

An indicator of information quality, $\delta@K$, is proposed to measure the distribution of staying time over the rank. Specifically, we define the information quality of a QA page $q$ as the sum of the observed staying times for $q$, denoted as $\gamma_q$. The longer the total amount of time that users spend on a QA page $q$, the more informative the QA page is considered. Given a ranking list, the information quality, defined as the total amount of time that users spent on the top-K QA pages of the list, is measured as follows:

$$
\delta@K = \sum_{p=1}^{K} \gamma_{q@p}. \tag{11}
$$

In principle, given a ranking list with highly-ranked QA pages that are associated with longer accumulated staying time, the ranking list achieves a higher score in terms of information quality at a higher rank in comparison with a ranking list whose highly-ranked QA pages are associated with less staying time.
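
The information quality indicator is a plain sum over the top-K positions, as the following sketch shows. The dictionary of staying-time observations (in seconds) and the page identifiers are hypothetical.

```python
def information_quality_at(ranking, staying_times, K):
    """Total staying time accumulated by the top-K pages of a ranking list."""
    return sum(sum(staying_times.get(q, [])) for q in ranking[:K])


# Observed staying times in seconds for each QA page (toy values).
staying_times = {"a": [30, 45, 60], "b": [500], "c": [5, 5]}
print(information_quality_at(["a", "b", "c"], staying_times, 2))
print(information_quality_at(["c", "a", "b"], staying_times, 2))
```

Summing the observations means that a page visited by many users for moderate durations can accumulate as much staying time as a page with a single long visit.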

An indicator of reference value is proposed to evaluate the effectiveness of a ranking list. Specifically, the value of a QA page $q$ is measured by how $q$ is associated with other QA pages. To measure the reference value of QA pages, we identify multiple-click groups in the collected user behavior dataset and accumulate the occurrences in multiple-click groups for each QA page $q$. A multiple-click group is a set of QA pages with the same referrer. We keep only multiple-click groups that contain at least two QA pages. The reference value of a given QA page $q$ is measured by the number of multiple-click groups that $q$ participates in. The more multiple-click groups $q$ participates in, the more QA pages $q$ is associated with, and the higher the reference value of $q$. An example is illustrated in Figure 7, in which three multiple-click groups are represented by dotted boxes. The QA page marked by a rectangle is of high reference value because it participates in three multiple-click groups, whereas the QA page marked by a circle is of less reference value because it has no associations. In this case, the QA page represented by the rectangle is more valuable than the QA page represented by the circle because it frequently interacts with other QA pages (Figure 8).

The metric notations with their corresponding descriptions are summarized in Table 7. Let the number of multiple-click groups that a QA page $q$ participates in be $\eta_q$. Considering the ranking position, the average count of multiple-click groups of a given QA page $q$ at rank $K$ is defined as:

$$
\theta_q@K = \frac{\sum_{p=1}^{K} \eta_{q@p}}{K}. \tag{12}
$$

Figure 7 An example of multiple-click groups (sessions 1 to n along a timeline).

Figure 8 Recommendation quality in terms of mean average number of important QA pages (x-axis: Top-K; y-axis: mean average number; series: Rcmd(t600,RWR) and Rcmd(t0,RWR)).

Accordingly, given a ranking list of QA pages, the reference value defined by the mean average count of multiple-click groups at rank $K$ is defined as:

$$
\iota@K = \frac{\sum_{p=1}^{K} \theta_q@p}{K}. \tag{13}
$$

Similarly, if the QA pages that more frequently participate in multiple-click groups are ranked higher in a given ranking list, the ranking list achieves a higher score in terms of reference value when $K$ is small.
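
The reference value can be sketched in two steps: counting, for every QA page, the multiple-click groups it participates in, and then averaging these counts over the ranking as in (12) and (13). The group contents, page identifiers, and function names below are hypothetical.

```python
from collections import Counter


def group_counts(groups):
    """Number of multiple-click groups (with at least two pages) per QA page."""
    eta = Counter()
    for group in groups:
        members = set(group)
        if len(members) >= 2:
            eta.update(members)
    return eta


def reference_value_at(ranking, eta, K):
    counts = [eta[q] for q in ranking[:K]]
    running_avg = [sum(counts[:p]) / p for p in range(1, K + 1)]  # as in (12)
    return sum(running_avg) / K                                    # as in (13)


groups = [{"a", "b"}, {"a", "c"}, {"a", "d"}, {"e"}]
eta = group_counts(groups)
value = reference_value_at(["a", "b", "e"], eta, K=3)
print(eta["a"], eta["e"], value)
```

Page `a` participates in three groups and therefore dominates the score, whereas page `e` only occurs in a singleton group, which is discarded.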

7.3 Importance ranking quality

In the first set of experiments, we compare the ranking quality of the different graph models summarized in Table 6.

7.3.1 Results and discussions

Figure 9 illustrates the comparison of the ranking results in terms of popularity (β@K), information quality (δ@K) and reference value (ι@K) at varying K. From this figure, we have the following observations.

First, the QA latent browsing graph, L-QA-BG(t600), tends to rank the QA pages with many incoming transitions higher. As Figure 9a demonstrates, TOP(L-QA-BG,t600) exhibits a significant increase in the popularity score after K = 18, whereas the popularity score in TOP(QA-BG) and TOP(BG) is low and stable over K. The QA page ranked at K = 18 in TOP(L-QA-BG,t600), denoted as q@18, is an effectively summarized document that is illustrated with excellent pictures and concise text. It was visited by over ten thousand users within three days and accumulated a large amount of incoming transitions from other QA pages in the domain-specific browsing graph QA-BG. The QA page q@18 was ranked at 83,642 in TOP(QA-BG). This difference arises mainly because BrowseRank is highly sensitive to the number of observations of staying time information. A small number of observations can bias the estimate of the underlying staying time of a page. Consequently, pages with a markedly high standard deviation that were visited by only a few people (fewer than ten) are in TOP(BG). In contrast, popular pages such as q@18 have a relatively low average staying time and standard deviation and are regarded as less important by BrowseRank. Popular pages of higher quality may be highlighted with the help of the transitions that were derived from the implicit information of user-perceived relevance among QA pages. A number of high-quality QA pages in the top-10 of L-QA-BG(t600) are isolated in QA-BG and thus would not be expected to appear in TOP(QA-BG). However, the implicit transitions in the QA latent browsing graph may discover these pages.

Table 7 Metric notations used in experiments.

Notation       Description
$\alpha_q@K$   Average number of incoming transitions for a QA page q at rank K
$\beta@K$      Mean average incoming transitions of a given ranking list
$\gamma_q$     Sum of a set of observations of staying time for a QA page q
$\delta@K$     Total amount of time that users spent on a top-K ranking list
$\eta_{q@p}$   Number of multiple-click groups that a QA page q participates in
$\theta_q@K$   Average count of multiple-click groups of a given QA page q at rank K

Figure 9 Comparison of ranking quality: (a) Popularity, (b) Information Quality, (c) Reference Value (x-axis: Top-K; series: TOP(L-QA-BG,t600), TOP(QA-BG), and TOP(BG)).

Second, Latent Browsing Rank may rank QA pages with excellent information quality highly, as illustrated in Figure 9b. Most QA pages in TOP(QA-BG) have a significantly long average staying time or a large standard deviation; however, they were visited by fewer than ten people. Unlike BrowseRank, which is sensitive to the staying time distribution, Latent Browsing Rank emphasizes the importance of latent relevance transitions. Consequently, the QA pages with a higher accumulated staying time may be ranked higher by Latent Browsing Rank.

Third, Latent Browsing Rank tends to rank the frequently associated QA pages higher. As Figure 9c demonstrates, the QA pages in TOP(L-QA-BG,t600) participated in more multiple-click groups. The QA pages in TOP(L-QA-BG,t600) are of higher reference value because they are frequently co-visited with other QA pages.

7.3.2 Top-10 QA pages

Table 8 illustrates the top-10 QA pages that were produced by BrowseRank under QA-BG and the top-10 QA pages that were produced by Latent Browsing Rank under L-QA-BG(t600). Among the top-1000 QA pages that were produced under QA-BG and L-QA-BG(t600), 0.14% were the same. The first column in Table 8 gives the rank from 1 to 10. The second and third columns summarize the main idea of the QA page at the corresponding rank.

Table 8 Top 10 QA pages produced by two different graph models.

Rank  TOP (BG)                                                                                                         TOP (L-QA-BG,t600)
1     [Family] channels to appeal for sudden surges of water consumption                                               [Movie] the best Chinese Films you have ever seen
2     [Social and Human] what’s the civil culture?                                                                     [Drama] TV channel and schedule for nice Taiwanese Cinema
3     [Investment] a platform in domestic stock for bidding “stop loss limit order” via intelligent investment tool    [Movie] the Websites about the release of Chinese Films
4     [Mind] healthy diet for effective defecation                                                                     [Movie] Taiwanese Films during early stage of Taiwan
5     [Literature] a translation for classical Chinese                                                                 [Movie] where can I buy the Chinese Film “Wolf”
6     [Social and Human] comments on navy ship volunteer for military service                                          [Movie] the first Taiwanese films after World War II
7     [Education] where can I buy “magic follows”                                                                      [Movie] TV channel for Chinese Films
8     [Health Care] dental clinic near MRT Dingxi Station                                                              [Movie] opinions about Taiwanese Films
9     [Hardware] related issues in installing driver for Bluetooth devices                                             [Movie] the Websites to search for movies
10    [Party Politics] the cause of death of Ching-feng Yin?                                                           [Movie] the major Chinese Films in recent ten years

7.4 Recommendation quality

We select 2000 QA pages of high visiting frequency as the query collection Q to investigate the recommendation quality. For each query q ∈ Q, Random Walk with Restart was performed on QA-BG, BG, and L-QA-BG(t600) to derive the recommendation lists. The recommendation lists that were derived from QA-BG, BG, and L-QA-BG(t600) are denoted as Rcmd(QA-BG), Rcmd(BG), and Rcmd(L-QA-BG,t600), respectively. The recommended QA pages for a query q were filtered by a relevance threshold; that is, only the QA pages that are relevant to the query q to a certain extent are recommended. After filtering, Rcmd(BG) contains 1,472 recommendation lists, Rcmd(QA-BG) contains 1,476 recommendation lists, and Rcmd(L-QA-BG,t600) contains 1,546 recommendation lists. In addition, a well-known textual relevance model, BM25 [17], is implemented for comparing the recommendation quality. In BM25, the textual content of each QA page is collected and modeled as a set of n-grams. In essence, the title of each QA page q ∈ Q is regarded as the query, and the list of QA pages returned by the textual relevance model is regarded as the relevance search result for the QA page q. The collection of relevance search lists of each query q ∈ Q derived by BM25 is denoted as Rcmd(Text). The recommendation quality of Rcmd(QA-BG), Rcmd(BG), Rcmd(L-QA-BG,t600), and Rcmd(Text) is shown in Figure 10, where the popularity, information quality, and reference value of the different approaches are presented.

There are several observations from Figure 10. First, the recommendation results derived from all graph models are better than the relevance search results derived from the textual relevance model. The main reason is that the keyword-matching technique used in BM25 may fail if QA pages contain the same keyword as the query QA page but their contents are totally irrelevant to the topic of the query QA page. On the other hand, BrowseRank and our proposed framework rely on analyzing and modeling user browsing behavior, which avoids the keyword-matching problem. Second, Figure 10a indicates that the QA latent browsing graph tends to recommend the QA pages with a high volume of incoming transitions. This is because the QA latent browsing graph emphasizes linkage information to correct the bias that results from the small number of samples in the staying time distributions. Third, the QA latent browsing graph tends to recommend informative QA pages, on which people are likely to spend a large amount of browsing time. As illustrated in Figure 10b, the accumulated staying time increases dramatically in Rcmd(L-QA-BG,t600) over K. Fourth, the QA latent browsing graph tends to recommend QA pages of higher reference value. As illustrated in Figure 10c, the number of multiple-click groups in which the QA pages in Rcmd(L-QA-BG,t600) participate is markedly higher in comparison with Rcmd(QA-BG) and Rcmd(BG) over K.

Figure 10 Comparison of recommendation quality: (a) Popularity, (b) Information Quality, (c) Reference Value (x-axis: Top-K; series: Rcmd(L-QA-BG,t600), Rcmd(QA-BG), Rcmd(BG), and Rcmd(Text)).

From the perspective of page importance, we regard a recommended QA page in Rcmd(L-QA-BG,t600), which is produced by the QA latent browsing graph, as important if it falls within the top 10,000 of TOP(L-QA-BG,t600), and a recommended QA page in Rcmd(QA-BG) as important if it falls within the top 100,000 of TOP(BG). As illustrated in Figure 8, at each rank K, Rcmd(L-QA-BG,t600) returns a higher number of important QA pages than Rcmd(QA-BG). Overall, given Q, Rcmd(L-QA-BG,t600) contained 117,196 important QA pages and Rcmd(QA-BG) contained 2,489 important QA pages.

We also compare Rcmd(QA-BG) and Rcmd(L-QA-BG,t600) in terms of precision, recall, and normalized Discounted Cumulative Gain (nDCG) to evaluate the potential effectiveness of the recommendations. The dataset is partitioned into training and testing sets according to the time-stamps of the records: the training set contains 80% of the records and the remaining 20% are used as the testing set. Let the query collection Q be the collection of QA pages associated with the records in the testing data. For each QA page q ∈ Q, we collect the list of QA pages clicked after q within a time interval t as the ground truth GTq. Let the top-K QA pages returned for the query q be Sq; the precision for the query collection Q is defined as follows:

\[ \mathrm{Precision@}K = \frac{|GT_q \cap S_q|}{|S_q|}, \quad \forall q \in Q. \]


Figure 11 Comparison of recommendation quality in testing set: (a) Precision, (b) Recall, (c) NDCG, each plotted against Top-K (1 to 10) for Rcmd(L-QA-BG,t600) and Rcmd(QA-BG).

The recall for the query collection Q is defined as follows:

\[ \mathrm{Recall@}K = \frac{|GT_q \cap S_q|}{|GT_q|}, \quad \forall q \in Q. \tag{15} \]

To highlight the ranking quality of the recommendation results Sq, we use the discounted cumulative gain, defined as follows:

\[ \mathrm{DCG@}K = \sum_{p=1}^{K} \frac{2^{rel(p)} - 1}{\log(1 + p)}, \quad \forall q \in Q \tag{16} \]

where rel(p) = 1 if the QA page ranked at position p falls within GTq; otherwise, rel(p) = 0.

Accordingly, the normalized discounted cumulative gain for a query q is computed as:

\[ \mathrm{nDCG@}K = \frac{\mathrm{DCG@}K}{\mathrm{IDCG@}K}, \quad \forall q \in Q \tag{17} \]

where IDCG@K is the DCG@K of the ideal ordering at K.
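The precision, recall, DCG, and nDCG definitions above can be sketched for a single query as follows. This is only an illustrative sketch: the function names are hypothetical, the natural logarithm is assumed because the base is not stated (the base cancels in nDCG), and the ideal ordering is assumed to place min(K, |GTq|) relevant pages at the top of the list.

```python
import math


def precision_at_k(ranked, ground_truth, k):
    """|GT_q ∩ S_q| / |S_q|, where S_q is the top-k of `ranked`."""
    top = ranked[:k]
    if not top:
        return 0.0
    return len(set(top) & set(ground_truth)) / len(top)


def recall_at_k(ranked, ground_truth, k):
    """|GT_q ∩ S_q| / |GT_q|, where S_q is the top-k of `ranked`."""
    gt = set(ground_truth)
    if not gt:
        return 0.0
    return len(set(ranked[:k]) & gt) / len(gt)


def dcg_at_k(ranked, ground_truth, k):
    """Sum over positions p = 1..k of (2^rel(p) - 1) / log(1 + p)."""
    gt = set(ground_truth)
    total = 0.0
    for p, page in enumerate(ranked[:k], start=1):
        rel = 1 if page in gt else 0  # binary relevance, as in the text
        total += (2 ** rel - 1) / math.log(1 + p)
    return total


def ndcg_at_k(ranked, ground_truth, k):
    """DCG@k divided by the DCG@k of an ideal ordering (assumed form)."""
    ideal_hits = min(k, len(set(ground_truth)))
    idcg = sum(1.0 / math.log(1 + p) for p in range(1, ideal_hits + 1))
    if idcg == 0.0:
        return 0.0
    return dcg_at_k(ranked, ground_truth, k) / idcg


# Hypothetical example: four recommended pages, three pages clicked later.
ranked_pages = ["a", "b", "c", "d"]
clicked_later = {"a", "c", "x"}
p4 = precision_at_k(ranked_pages, clicked_later, 4)
r4 = recall_at_k(ranked_pages, clicked_later, 4)
n4 = ndcg_at_k(ranked_pages, clicked_later, 4)
```

Each function scores one query q; how the per-query values are aggregated over Q (for example, by averaging) is not spelled out in the definitions and would be a further assumption.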

As shown in Figure 11, the recommendation results derived from QA latent browsing graph have higher precision, recall and nDCG than those derived from QA

Figure 12 Comparison of ranking quality with varying value of maxspan: (a) Popularity, (b) Information Quality, (c) Reference Value, each plotted against Top-K for TOP(L-QA-BG,t300), TOP(L-QA-BG,t600), and TOP(L-QA-BG,t1200).