
where x and y represent two different items, and d is the dimension of the vectors.

The closer the distance between a user and a playlist, the higher the priority with which that playlist is recommended.
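To make this ranking step concrete, the following is a minimal sketch that ranks playlists by the distance between a user vector and each playlist vector; the Euclidean metric, the function name, and the toy data are assumptions for illustration, and any distance measure defined as above could be substituted.

import numpy as np

def rank_playlists(user_vec, playlist_vecs, top_n=20):
    # Rank playlists by distance to the user embedding: closer means higher priority.
    dists = np.linalg.norm(playlist_vecs - user_vec, axis=1)  # one distance per playlist
    return np.argsort(dists)[:top_n]                          # ascending: closest first

# toy usage: five playlists embedded in a d = 4 dimensional space
rng = np.random.default_rng(0)
user = rng.normal(size=4)
playlists = rng.normal(size=(5, 4))
print(rank_playlists(user, playlists, top_n=3))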

4.1.3 Evaluation Metrics

To assess the effectiveness of our recommender system, we employed several metrics to evaluate the recommendation performance of the algorithms: precision (hit rate), mean average precision at n (MAP@n), and recall.

We denote the precision at n as P(n), which, for a given positive integer n, is the fraction of the top n recommended playlists that match the user's preferences. It can be computed with the following formula:

\[ P(n) = \frac{|R(n) \cap Pref(u)|}{n} \]  (4.2)

where R(n) is the set of the top n recommended playlists and Pref(u) is the set of the user's preferred playlists. Furthermore, the average precision for a user, AP(u), is defined as:

\[ AP(u) = \frac{1}{|Pref(u)|} \sum_{k=1}^{n} P(k) \cdot rel(k) \]  (4.3)

where rel(k) is 1 if the playlist at rank k belongs to Pref(u) and 0 otherwise. The mean of the average precision scores over the top n results (MAP@n) can then be computed by:

\[ MAP@n = \frac{1}{|U|} \sum_{u \in U} AP(u) \]  (4.4)

where |U| is the total number of users. The higher the MAP@n score, the better the recommendation result.

The other performance metric is recall. Recall is the fraction of the user's preferred playlists that appear among the recommendations, out of the total number of preferred playlists. The formula is defined as:

\[ Recall = \frac{|R(n) \cap Pref(u)|}{|Pref(u)|} \]  (4.5)

A high recall means that most of the playlists the user loves are recommended.
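For reference, the following is a minimal sketch of how these metrics could be computed; the function names, the ranked-list inputs, and the toy data are assumptions for illustration, not the evaluation code used in our experiments.

def precision_at_n(recommended, preferred, n):
    # P(n) = |R(n) ∩ Pref(u)| / n
    return len(set(recommended[:n]) & set(preferred)) / n

def average_precision(recommended, preferred, n):
    # AP(u): accumulate P(k) at every rank k that hits a preferred playlist,
    # then divide by |Pref(u)|, following equation (4.3).
    preferred = set(preferred)
    hits, score = 0, 0.0
    for k, item in enumerate(recommended[:n], start=1):
        if item in preferred:
            hits += 1
            score += hits / k          # P(k) at a hit position
    return score / max(len(preferred), 1)

def map_at_n(recommended_by_user, preferred_by_user, n):
    # MAP@n: mean of AP(u) over all users u in U, following equation (4.4).
    users = list(recommended_by_user)
    return sum(average_precision(recommended_by_user[u], preferred_by_user[u], n)
               for u in users) / len(users)

def recall_at_n(recommended, preferred, n):
    # Recall = |R(n) ∩ Pref(u)| / |Pref(u)|
    return len(set(recommended[:n]) & set(preferred)) / max(len(preferred), 1)

# toy usage for a single user
recs = {"userA": ["p1", "p2", "p3", "p4"]}
prefs = {"userA": ["p2", "p4", "p9"]}
print(precision_at_n(recs["userA"], prefs["userA"], n=4))  # 0.5
print(map_at_n(recs, prefs, n=4))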

In this section, we compare the recommendation results of the DeepWalk, LINE, and HPE methods, and we discuss the effect of each method's parameter settings separately.

4.2.1 DeepWalk

We created a bipartite graph to describe the relationships between the nodes in this user-song-playlist network. We then input the edge lists, such as user-song and playlist-song edges, as our training data. In addition, we consider only one hidden layer in our model. Before training the model, we have to initialize several parameters.

The following are the parameters we used in our experiments.

Parameters  Definition
window size (wsize)  The number of neighbors around the center node. See Fig. 3.3
walking steps (wstep)  The number of steps walked from the starting node. See Fig. 3.3
walking times (wtime)  The number of random walks started from each node.
dimension (dim)  The number of neurons in the hidden layer.

Table 4.3: DeepWalk parameter definitions

In order to evaluate how changes to this method's parameterization affect its performance, we fixed the window size and the walking steps to wsize=5 and wstep=6, values chosen heuristically. We then varied the number of neurons in the embedding layer and the number of walks started per node to determine their impact on the recommendation task.
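As an illustration of this pipeline, the following is a minimal sketch that builds the bipartite edge list, runs uniform random walks, and trains a skip-gram model with the fixed settings above (wstep=6, wsize=5); the toy edge list and the use of gensim are assumptions for illustration, not the exact implementation used in our experiments.

import random
from collections import defaultdict
from gensim.models import Word2Vec  # assumes gensim >= 4.0 is available

def build_graph(edges):
    # Adjacency lists for the undirected user-song / playlist-song bipartite graph.
    adj = defaultdict(list)
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)
    return adj

def random_walks(adj, wstep=6, wtime=10):
    # Start wtime uniform random walks of length wstep from every node.
    walks = []
    for start in adj:
        for _ in range(wtime):
            walk = [start]
            for _ in range(wstep):
                walk.append(random.choice(adj[walk[-1]]))
            walks.append(walk)
    return walks

# hypothetical edge list: (user, song) and (playlist, song) pairs
edges = [("u1", "s1"), ("u1", "s2"), ("p1", "s1"), ("p1", "s3"), ("u2", "s3")]
walks = random_walks(build_graph(edges), wstep=6, wtime=10)

# skip-gram with window size wsize=5; the dimension dim is varied per experiment
model = Word2Vec(walks, vector_size=128, window=5, sg=1, min_count=1, workers=4)
user_vec = model.wv["u1"]  # node embedding used for the recommendation task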

Walk Times (k)  Dim=64  Dim=128  Dim=256  Dim=512
1699 (0.1)  0.131653  0.233496  0.309483  0.302355
935 (0.3)  0.127857  0.223235  0.295636  0.290624
610 (0.5)  0.117862  0.206414  0.27606  0.27765
407 (0.7)  0.111141  0.18107  0.250692  0.264103
267 (0.9)  0.10195  0.154225  0.214481  0.24156

Table 4.4: Precision at 20 of DeepWalk with different parameter settings

Table 4.4 shows the effects of these parameters on the model. From this table, we can observe two interesting trends. The first is that as the walking times increase, we obtain higher hit rates at every dimension. As for selecting the number of walking times, we derive it from the user listening logs: the walking times is taken from the top k fraction of the users' lowest song-listening counts, where k is 0.1, 0.3, 0.5, 0.7, and 0.9, as shown in Table 4.4.

The second trend is that the performance of the recommendation task rises as the number of neurons in the hidden layer increases. The number of neurons can be seen as the number of latent features for each node. In our bipartite graph of the user-song-playlist network, we try to encode the song information into the user and playlist vectors in the same dimensional space. In our experiments, we set the highest dimension to 512, which approximates the number of essential features mentioned in the Music Genome Project.

The Music Genome Project1 uses over 450 attributes, which capture the essence of music at its most fundamental level, to describe each song. Therefore, setting an appropriate dimension helps us encode sufficient information and obtain better performance on our task.

4.2.2 LINE

To embed the local structure information effectively, we used the LINE model to preserve both first-order and second-order proximity between the vertices. This model not only preserves the network structure information, but also retains the edge weights, which can be assigned different numerical values. For example, the edge weight can explicitly quantify a user's interest in a song.
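As a small illustration of this idea, the sketch below turns a hypothetical listening log into weighted user-song edges (weight = play count) and draws training edges proportionally to their weight, in the spirit of LINE's edge sampling; the log and variable names are assumptions, not our actual data pipeline.

import random
from collections import Counter

# hypothetical listening log: one (user, song) pair per play event
plays = [("u1", "s1"), ("u1", "s1"), ("u1", "s2"),
         ("u2", "s2"), ("u2", "s2"), ("u2", "s3")]

# edge weight = play count, so the weight explicitly quantifies the user's interest
edge_weights = Counter(plays)
edges = list(edge_weights)
weights = [edge_weights[e] for e in edges]

# edge sampling: draw training edges with probability proportional to their weight
batch = random.choices(edges, weights=weights, k=3)
print(batch)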

In this model, we investigated the performance with respect to the embedding dimension and the number of sample times.

Sample Times  Dim=64  Dim=128  Dim=256  Dim=512
1,000  0.119317  0.129924  0.136724  0.137504
3,000  0.136117  0.150536  0.164196  0.173081
5,000  0.131234  0.147328  0.166585  0.178067
10,000  0.119171  0.142782  0.167774  0.178445
30,000  0.111514  0.145627  0.165697  0.165595

Table 4.5: Precision at 20 of LINE with different parameter settings

Table 4.5 reports the performance of the LINE model. We can see that performance drops when the number of sample times becomes too large, but the higher precision at larger dimensions is consistent with the results of the DeepWalk method.

However, the precision rate of LINE cannot compete with that of DeepWalk on our dataset, as can be seen in Figure 4.2.

1https://en.wikipedia.org/wiki/Music_Genome_Project

Figure 4.2: Comparison between DeepWalk, LINE and HPE on precision rate

The reason local structure learning does not work well in our experiments is that we do not have explicit user-playlist subscription information in our dataset.

Since the user and playlist vertices have no direct connections, we could not take advantage of first-order proximity learning in LINE.

Therefore, we could only take second-order proximity into consideration and try to capture the nodes' similar local structures. Another issue is that the degrees of the user vertices are on a different scale from those of the playlist vertices: each user might have hundreds of user-song edges, whereas each playlist node has only tens of playlist-song edges. Compared to a user node's embedding, a playlist node's embedding might therefore be tied more precisely to specific genres.

Although DeepWalk's precision rate outperforms LINE's, the latter requires less training time while still performing well on the recommendation task. To accelerate the training process and improve the effectiveness of stochastic gradient descent, LINE uses an edge sampling method to reduce the learning time.

4.2.3 HPE

In this method, we set the number of walk steps to 6; the related parameter settings are described in Table 4.6, which also shows the performance of the playlist recommendation task.

Sample Times  Dim=64  Dim=128  Dim=256  Dim=512
30,000  0.206507  0.284222  0.317316  0.294788
10,000  0.177745  0.23117  0.265098  0.270714
5,000  0.152062  0.186453  0.214954  0.222994
3,000  0.139876  0.163291  0.182021  0.194649
1,000  0.13059  0.149434  0.165154  0.174612

Table 4.6: Precision at 20 of HPE with different parameters

Besides, we compared the HPE model with the other network embedding methods on our dataset. As shown in Fig. 4.2, the experimental results indicate that the HPE model learns a better network representation of our bipartite graph. We can also see some common properties across these models: as the number of neurons in the hidden layer increases, more latent features can be captured, and sufficient samples make the vertex representations more robust.

The HPE model adopts weighted-edge random walks and uses edge sampling to speed up convergence. Taking the edge weights into account helps us encode user interests more accurately, especially since we do not have explicit user-song preference information. Sampling multiple random walks from each vertex also helps us obtain more indirect information from the network: on these walks, the user nodes can reach related playlist nodes through their common songs, and sampling from these weighted edges helps us encode similar relations between the nodes. Table 4.7 shows that the HPE model outperforms the DeepWalk and LINE models.

Methods Dim=64 Dim=128 Dim=256 Dim=512

DeepWalk 0.28 0.36 0.43 0.45

LINE 0.33 0.33 0.34 0.35

HPE 0.37 0.44 0.48 0.48

Table 4.7: MAP at 20 with different embedding methods

Overall, the HPE method benefits from these two processes, edge sampling and weighted random walks, which provide a better and more efficient way to embed our bipartite graph into more robust representations.
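To illustrate the weighted-edge random walk described above, here is a minimal sketch in which each step chooses the next node with probability proportional to the connecting edge weight (walk steps = 6); the neighborhood dictionaries are hypothetical and this is not the actual HPE implementation.

import random

def weighted_walk(start, neighbors, weights, steps=6):
    # At each step, draw the next node proportionally to the edge weight
    # (e.g. listening count), so heavily played songs are visited more often.
    walk = [start]
    for _ in range(steps):
        node = walk[-1]
        walk.append(random.choices(neighbors[node], weights=weights[node], k=1)[0])
    return walk

# hypothetical weighted user-song-playlist neighborhood; "u1" played "s1" 30 times
neighbors = {"u1": ["s1", "s2"], "s1": ["u1", "p1"], "s2": ["u1", "p1"], "p1": ["s1", "s2"]}
weights   = {"u1": [30, 2],      "s1": [30, 1],      "s2": [2, 1],       "p1": [1, 1]}
print(weighted_walk("u1", neighbors, weights, steps=6))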

4.3 Case Study

In this section, we discuss some case studies and compare the different playlist recommendation results in each case. To help us intuitively check the recommendation outcomes, we show the user's listening statistics and other related information.

After training the embedding model, a playlist's representation should be similar to that of a user whose tastes match the playlist. We would also like to project different users with alike listening behaviors into a similar vector space. The results below demonstrate these embedding outcomes. We first briefly introduce UserA (u16011847) and UserB (u96013022), and then analyze their recommendation outcomes. Table 4.8 below shows their listening statistics.

Statistic  Number of Times

Table 4.8: User listening statistics

In Figure 4.3, we can see that UserA has a strong preference for Korean songs. Based on his listening behavior, we can recommend a playlist of a similar genre to him, such as the playlist in Table 4.10, which is mainly composed of Korean songs.


Figure 4.3: UserA listening behavior

Labels Artist

B SUPER JUNIOR DONGHAE & EUNHYUK

C EXO

N My Love From the Star

O 主君的太陽 電視原聲帶

P Girls’ Generation (少女時代)

Q 2NE1

R KYUHYUN

S miss A

T AOA

Table 4.9: UserA label reference

Song Artist Genre

Miniskirt AOA Korean

NoNoNo Apink Korean

Pinocchio - ROY KIM 皮諾丘 電視原聲帶 Korean

Up & Down EXID Korean

BANG BANG BANG BIGBANG Korean

CALL ME BABY EXO Korean

Devil SUPER JUNIOR Korean

Growing pains SUPER JUNIOR DONGHAE & EUNHYUK Korean

12:30 BEAST Korean

最佳的幸運 沒關係 是愛情啊 電視原聲帶 Volume 1 Korean

Table 4.10: The recommended playlist

Figure 4.4: UserB listening behavior

Labels Artist

A G-DRAGON

B SUPER JUNIOR DONGHAE & EUNHYUK

C SUPER JUNIOR

P My Love From the Star

Q Colbie Caillat

R Min Chae

S 主君的太陽 電視原聲帶

Table 4.11: UserB label reference

On the other hand, we can also discover users with similar preferences, such as UserB, in the high-dimensional space. As can be seen in Figure 4.4, UserB and UserA have alike preferences for Korean songs and similar artists, so the playlist in Table 4.10 should be recommended to UserB, too. As we expected, this playlist is recommended to both UserA and UserB, which shows that the embedding method indeed encodes similar listening behaviors into similar representations.
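As an illustration of discovering users with similar tastes in the embedding space, the sketch below ranks users by cosine similarity to a target user's vector; the two-dimensional toy embeddings and the function name are assumptions for illustration only.

import numpy as np

def most_similar_users(target, user_vecs, k=5):
    # Rank all other users by cosine similarity to the target user's embedding.
    names = list(user_vecs)
    mat = np.stack([user_vecs[n] for n in names])
    t = user_vecs[target]
    sims = mat @ t / (np.linalg.norm(mat, axis=1) * np.linalg.norm(t) + 1e-12)
    order = np.argsort(-sims)
    return [(names[i], float(sims[i])) for i in order if names[i] != target][:k]

# hypothetical embeddings: UserA and UserB should end up close to each other
user_vecs = {"u16011847": np.array([0.9, 0.1]),
             "u96013022": np.array([0.8, 0.2]),
             "u_other":   np.array([-0.5, 0.7])}
print(most_similar_users("u16011847", user_vecs))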

Figure 4.4 also shows that UserB has an additional taste for English songs, such as Lily Allen, Meghan Trainor, Calvin Harris, and Maroon 5, but the English-song play counts account for only a small part of his listening records. Therefore, the Korean-style playlist is more likely to be recommended to this user. However, if we use DeepWalk, which does not consider the weighted edges in the training model, the user's taste cannot be captured as precisely. The experimental results show that DeepWalk's performance is worse than HPE's, as can be seen in Table 4.12 (b). In general, the embedding methods can compress a user's preferences into the representation and recommend related playlists to the user.


In this paper, we use embedding methods to encode the user-song-playlist network and apply the learned node vectors to our playlist recommendation task. Our network is constructed from multiple types of nodes, such as users, songs, and playlists, and we propose to use a bipartite graph to describe such a heterogeneous graph. To sufficiently utilize the social information in the network, we adopt embedding methods, a deep learning technique, to encode the network structure information into a vector space. Several embedding methods have been proposed to learn latent representations for classification tasks, and the basic concepts behind these methods are grounded in statistical models. State-of-the-art methods such as DeepWalk and LINE are mostly applied to homogeneous graphs. In order to apply the DeepWalk, LINE, and HPE methods to our heterogeneous graph, we reconstruct the heterogeneous network by decomposing the user, playlist, and song nodes into separate sets, and then employ these methods on the resulting bipartite graph. Experimental results show that the HPE method outperforms the others. Considering weighted edges and using random walks as the sampling method on our bipartite graph helps us learn informative representations. Moreover, by walking through the bipartite graph, both direct and indirect relations are explored along the walks, which may yield improvements to our recommendation task. Overall, the embedding method provides a personalization service to users and gives more exposure to content providers (i.e., music playlist creators).

In the future, we could investigate the impacts of multiple types of edges, and also study how to adjust the weight of each edge at every step of the walk. In addition, if we had the user-playlist subscription information, we could preserve first-order proximity and enhance the vector representations. We could then encode more personalized information into the vectors and derive more diverse recommendation results.


