
influence spreads through texts or media, and a heterogeneous network can be constructed based on this relation. Based on the heterogeneous network, an influence matrix is proposed to predict indirect relations. However, there is still no unified framework for incorporating supplementary information, such as time and textual information, into social influence analysis in a straightforward way.

2.2 Recommender Algorithm

The main purpose of our task is to model social influence and use it to predict social leaders. We consider the core of a recommendation system to be quite similar to our prediction task. Specifically, a recommendation algorithm attempts to recommend new items to a user based on the items that the user liked before. In other words, the items are treated as important clues for understanding a user's behavior.

Therefore, recommendation-based algorithms are also feasible for finding social leaders, as long as we make use of the coauthor relations.

Recommender systems have become immensely prevalent in recent years and are applied in a variety of social applications. A recommender system produces a list of recommendations in one of several ways, typically through collaborative or content-based filtering. We consider that recommender systems and influencer detection share several similar qualities. For example, a recommender system recommends items using user content and item content, and influencer detection does the same.

2.2.1 Collaborative Filtering

Collaborative filtering (CF) methods collect and analyze a large amount of information on users' behavior and preferences, and predict what a user will like based on similar users.

This method is based on the assumption that people's past behavior or preferences carry over to the future. Many algorithms are used to measure the similarity between users or items in a recommender system, for instance k-nearest neighbors. The advantage of CF methods is that we do not need to know the content of users or items. For example, when recommending music to a user, we can produce a recommendation list even if we do not understand the metadata of the music.

However, this method cannot work correctly for a new user, because no past behavior is available for comparison with other users.
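The user-based variant of CF described above can be sketched in a few lines. This is only an illustrative sketch, not the method used in this work: the function name `predict_rating`, the toy rating matrix, the choice of cosine similarity, and the convention that 0 means "no rating" are all assumptions made for the example.

```python
import numpy as np

def predict_rating(ratings, user, item, k=2):
    """Predict ratings[user, item] from the k most similar users who rated the item.

    ratings: 2-D array, rows = users, columns = items, 0 = no rating (assumed convention).
    Returns None when no personalised prediction is possible.
    """
    # Cosine similarity between the target user and every user.
    norms = np.linalg.norm(ratings, axis=1)
    sims = ratings @ ratings[user] / (norms * norms[user] + 1e-12)

    # Most similar users first; keep only other users who rated the item.
    order = np.argsort(sims)[::-1]
    neighbours = [u for u in order if u != user and ratings[u, item] > 0][:k]
    if not neighbours:
        return None

    weights = sims[neighbours]
    if weights.sum() <= 0:
        return None  # e.g. a new user with no history: every similarity is zero

    # Similarity-weighted average of the neighbours' ratings.
    return float(weights @ ratings[neighbours, item] / weights.sum())

# Hypothetical toy data: 4 users x 3 items.
ratings = np.array([
    [5.0, 4.0, 0.0],  # user 0 has not rated item 2
    [5.0, 4.0, 5.0],  # user 1 has the same taste as user 0
    [1.0, 0.0, 1.0],  # user 2
    [0.0, 0.0, 0.0],  # user 3 is a new user with no ratings
])

known_user = predict_rating(ratings, user=0, item=2, k=1)
new_user = predict_rating(ratings, user=3, item=2, k=1)
```

No item metadata is used anywhere in the sketch, which matches the advantage noted above, and the new user receives no prediction because every similarity is zero.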

2.2.2 Content-based Filtering

Content-based filtering is based on the descriptions of items and the profiles of users. The main idea of content-based filtering is to recommend items that are similar to the items the user liked in the past. In this method, we first extract features to represent an item, for example as a vector in a vector space. Then, we create a user profile from the rated items. With this information, the system can recommend the most similar items.
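A minimal sketch of this pipeline is given below, under stated assumptions: the item feature vectors are made up, the user profile is taken to be the mean feature vector of the liked items, and cosine similarity is used to rank candidates. The function name `recommend` is hypothetical.

```python
import numpy as np

def recommend(item_features, liked, top_n=1):
    """Return the indices of the top_n unseen items most similar to the user profile."""
    # User profile: mean feature vector of the items the user liked (an assumed choice).
    profile = item_features[liked].mean(axis=0)

    # Cosine similarity between the profile and every item.
    norms = np.linalg.norm(item_features, axis=1) * np.linalg.norm(profile)
    scores = item_features @ profile / (norms + 1e-12)

    # Never recommend items the user has already liked.
    scores[liked] = -np.inf
    return [int(i) for i in np.argsort(scores)[::-1][:top_n]]

# Hypothetical item vectors in a 3-dimensional feature space.
item_features = np.array([
    [1.0, 0.0, 0.0],  # item 0 (liked by the user)
    [1.0, 1.0, 0.0],  # item 1
    [0.0, 0.0, 1.0],  # item 2
    [0.9, 0.1, 0.0],  # item 3, almost identical to item 0
])

top_items = recommend(item_features, liked=[0], top_n=2)
```

Only the user's own profile enters the score, so the ranking cannot reflect what other users thought of an item.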

Because this method relies only on these profiles, it does not take other people's experience into account, so the system cannot make any judgment about the quality, style, or viewpoint of an item. It also lacks any personality assessment.

Because of these disadvantages, hybrid algorithms have been introduced to address the problems of both CF and content-based filtering.

2.2.3 Hybrid Algorithm

Most recent studies adopt a hybrid approach that combines CF and content-based filtering. The purpose of CF is to filter information or patterns involving collaboration among multiple people and data sources [8]. Factorization Machines (FM) is one of the state-of-the-art recommendation algorithms [5]. The algorithm has emerged as a popular technique in recommender systems because it can not only simulate CF but also incorporate auxiliary information into the model.

In FM, a user-item matrix records the ratings given by users to items. Because the data are sparse, most ratings are missing. To overcome this problem, FM generates an interaction matrix to represent the unknown ratings. For example, suppose user A listened to music X but did not listen to music Y. The interaction matrix can simulate the ratings between all users and all items, so FM can estimate whether user A is interested in music Y. Based on this framework, we can use a user-item matrix to represent a social network. We regard the ratings as the relations between users and items. For example, if user A likes or shares a post from user B, we record this relation in the user-item matrix. In other words, user B affects user A through the post. We consider the values simulated by the interaction matrix to be the latent influence. We describe the details in Chapter 3.
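To make the idea of simulated ratings concrete, the sketch below evaluates the standard second-order FM model, y(x) = w0 + sum_i w_i x_i + sum_{i<j} <v_i, v_j> x_i x_j, on a one-hot encoded (user, item) pair. It is an illustration only: the parameters are random rather than learned, and the helper names `fm_predict` and `one_hot_pair` as well as the feature layout (user block followed by item block) are assumptions of this example.

```python
import numpy as np

def fm_predict(x, w0, w, V):
    """Second-order FM prediction for one feature vector x.

    w0: global bias, w: per-feature weights (n,), V: latent factors (n, k).
    """
    linear = w @ x
    # Pairwise term via the O(k*n) identity:
    # sum_{i<j} <V_i, V_j> x_i x_j = 0.5 * sum_f [(sum_i V_if x_i)^2 - sum_i V_if^2 x_i^2]
    xv = V.T @ x
    pairwise = 0.5 * float(np.sum(xv ** 2 - (V ** 2).T @ (x ** 2)))
    return float(w0 + linear + pairwise)

def one_hot_pair(user, item, n_users, n_items):
    """Encode a (user, item) pair as [one-hot user | one-hot item]."""
    x = np.zeros(n_users + n_items)
    x[user] = 1.0
    x[n_users + item] = 1.0
    return x

# Hypothetical, untrained parameters; a real model would fit them to observed ratings.
rng = np.random.default_rng(0)
n_users, n_items, k = 3, 4, 2
w0 = 0.1
w = rng.normal(size=n_users + n_items)
V = rng.normal(size=(n_users + n_items, k))

# Score for a pair that may never have been observed (user 0, item 2).
x = one_hot_pair(0, 2, n_users, n_items)
score = fm_predict(x, w0, w, V)
```

With this encoding the prediction reduces to the bias terms plus the inner product of the user's and the item's latent vectors, which is why a score is available for every user-item pair, including unobserved ones. Auxiliary information can be appended to `x` as extra feature columns without changing `fm_predict`.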


Chapter 3 Methodology

Section 3.1 describes how to use the CF technique to calculate latent social influence. In Section 3.2, we show the formulation of the social influence calculation with FM.

3.1 Collaborative Latent Social Influence

CF is a common technique adopted by recommendation systems. In this work, we attempt to model the latent social influence of people in a certain research community with this technique, which filters information or patterns involving collaboration among people.

Figure 3.1 gives an illustrative example to introduce the core idea of the proposed framework for modeling the latent social influence. Figure 3.1(a) depicts the relationships between the authors and their papers. These relationships can be transformed into the coauthor matrix in Figure 3.1(b), in which each element $x_{a_i,p_j}$ equals 1 if $a_i$ is an author of paper $p_j$, and 0 otherwise. We then define an influence transformation function $F(\cdot)$ to build the influence matrix, as shown in Figure 3.1(c); this is the key step in transforming the relationships in Figure 3.1(a) into the input of a standard CF algorithm. The transformation function $F(\cdot)$ can be designed in various ways; in this paper, $F(\cdot)$ is defined as



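$$F(\cdot) = \begin{cases} 1, & \text{if } a_i \text{ is the author of } p_j, \\ \ \vdots \end{cases}$$

[Figure 3.1: The Proposed Framework for Modeling Latent Social Influence. Panels include the Coauthor Matrix, the Influence Matrix, and the Latent Influence Matrix.]

[Figure 3.2: An Example Input for FM. Column groups: Author, Paper, Text information associated with author.]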

In Figure 3.1(d), each number in blue can be interpreted as an estimated latent social influence; the numbers in the green box are the sums of the influence scores of each author over all papers. As the figure shows, although author 2 has written only 2 papers, his/her social influence score (i.e., 4) is larger than that of author 1 (i.e., 3.4), who has written the most papers among the 4 authors. Even though author 2 is not an author of papers 3, 4, and 5, we consider that author 2 should still have (latent) influence on these three papers, and that this influence can be modeled with the patterns of collaboration among the authors.
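The flow from coauthor matrix to per-author influence score can be sketched as follows. Several parts of this sketch are assumptions and do not reproduce Figure 3.1: the author-paper pairs are made up, the coauthor matrix is used directly as the input, and a truncated SVD is swapped in as a simple stand-in for the CF step (this work uses FM, formulated in Section 3.2). The function names are hypothetical.

```python
import numpy as np

def coauthor_matrix(pairs, n_authors, n_papers):
    """Binary matrix with X[a, p] = 1 if author a wrote paper p, else 0."""
    X = np.zeros((n_authors, n_papers))
    for a, p in pairs:
        X[a, p] = 1.0
    return X

def low_rank_fill(X, rank=1):
    """Stand-in CF step (assumption): rank-limited SVD reconstruction of X."""
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    return (U[:, :rank] * s[:rank]) @ Vt[:rank]

def influence_scores(latent):
    """Sum each author's latent influence over all papers (row sums)."""
    return latent.sum(axis=1)

# Hypothetical (author, paper) pairs, zero-indexed.
pairs = [(0, 0), (0, 1), (0, 2), (1, 0), (1, 1), (2, 2), (2, 3)]
X = coauthor_matrix(pairs, n_authors=3, n_papers=4)

latent = low_rank_fill(X, rank=1)
scores = influence_scores(latent)
```

In this toy setup author 1 did not write paper 2, yet `latent[1, 2]` is positive because author 1 collaborates with an author of that paper. This mirrors the observation above that an author can have latent influence on papers he or she did not write.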
