Chapter 5 Experiments
5.3 The influence of the transition matrix
A folksonomy can be modeled as a ternary relation among users, tags, and items. Every pairwise co-occurrence (user-item, item-tag, and user-tag) is projected from this ternary relation onto an undirected, weighted edge in a social graph.
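The projection described above can be sketched as follows. This is a minimal illustration that assumes the edge weight is simply the co-occurrence count; the triples and the function name `project` are hypothetical, and the thesis may weight user-item edges by ratings instead.

```python
from collections import Counter

# Hypothetical (user, tag, item) assignments; not taken from the thesis data.
triples = [
    ("u1", "scifi", "dune"),
    ("u1", "classic", "dune"),
    ("u2", "scifi", "dune"),
    ("u2", "scifi", "neuromancer"),
]

def project(triples):
    """Count pairwise co-occurrences for each of the three projections.

    Assumption: an edge weight is the number of triples in which the
    two endpoints appear together.
    """
    user_tag, item_tag, user_item = Counter(), Counter(), Counter()
    for user, tag, item in triples:
        user_tag[(user, tag)] += 1
        item_tag[(item, tag)] += 1
        user_item[(user, item)] += 1
    return user_tag, item_tag, user_item

user_tag, item_tag, user_item = project(triples)
print(user_tag[("u2", "scifi")])    # 2
print(item_tag[("dune", "scifi")])  # 2
print(user_item[("u1", "dune")])    # 2
```

Each counter corresponds to one two-dimensional co-occurrence matrix; because the edges are undirected, a single count serves both directions before any weighting is applied.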
To reduce the influence of frequently occurring elements in a two-dimensional matrix, TF-IDF weighting is applied to each matrix. Clements et al. [5] normalize the matrices 𝐔𝐓, 𝐈𝐓 and 𝐑, and then combine these sub-matrices and their transposes into the transition matrix 𝐀. In our model, we instead compute 𝐓𝐔, 𝐓𝐈 and 𝐑𝐓 separately rather than taking the transposes. Note that, by our definition, 𝐑𝐓 is not the transpose of 𝐑.
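The difference between reusing a transpose and computing the reverse block separately can be illustrated with a small sketch. The weighting below is an assumed, smoothed TF-IDF-style variant followed by row normalization; the exact formula of the thesis is defined elsewhere, and the matrix values and the names `weight_and_normalize`, `TU_sym` and `TU_asym` are hypothetical.

```python
import math

def weight_and_normalize(m):
    """Apply an assumed TF-IDF-style column weight, then make each row sum to 1."""
    n_rows, n_cols = len(m), len(m[0])
    # Number of rows with a nonzero entry in each column.
    df = [sum(1 for row in m if row[j] > 0) for j in range(n_cols)]
    # Smoothed log weight (an assumption): frequent columns are damped.
    idf = [math.log(1 + n_rows / d) if d else 0.0 for d in df]
    weighted = [[row[j] * idf[j] for j in range(n_cols)] for row in m]
    out = []
    for row in weighted:
        total = sum(row)
        out.append([x / total if total else 0.0 for x in row])
    return out

def transpose(m):
    return [list(col) for col in zip(*m)]

# Hypothetical user-by-tag counts (3 users, 2 tags).
UT = [[2, 0], [1, 1], [0, 3]]

UT_w = weight_and_normalize(UT)                 # user -> tag block
TU_sym = transpose(UT_w)                        # symmetric variant: reuse the transpose
TU_asym = weight_and_normalize(transpose(UT))   # asymmetric variant: weight tag -> user separately

print(TU_sym[0])
print(TU_asym[0])
```

In the separately computed block, each row is weighted and normalized from the tag's point of view, so its entries generally differ from those obtained by transposing the user-to-tag block.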
We compare SFR_3_DBSCAN with the asymmetric transition matrix against the same model with a symmetric one, denoted ASYM and SYM respectively; the results are shown in Table 5-4 and Figure 5-10. ASYM achieves a 31.4% improvement over SYM in terms of NDCG, a 52.76% improvement in terms of precision, and a 35.67% improvement in terms of recall. Because of our definition of relevance, few tags are relevant to an item for the target user, so the difference between the results of SYM and ASYM diminishes as the number of reference items grows in Figure 5-10.
        ASYM      SYM
NDCG    0.72029   0.54814
P@10    0.49273   0.26204
P@20    0.33497   0.21104
P@30    0.24714   0.17799
P@40    0.19511   0.15288
Table 5-4 Results of ASYM and SYM in terms of NDCG, precision and recall. The model that uses our modified transition matrix outperforms the other.
Figure 5-10 Results of ASYM and SYM in terms of precision and recall. ASYM outperforms SYM.
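As a quick check on the reported figures, the relative improvement of ASYM over SYM can be recomputed from the values in Table 5-4. The sketch below assumes that "improvement" means (ASYM - SYM) / SYM; it reproduces the 31.4% NDCG figure, while the way the aggregate precision and recall figures were obtained is not stated in this section and is therefore not reproduced.

```python
# (ASYM, SYM) pairs copied from Table 5-4.
results = {
    "NDCG": (0.72029, 0.54814),
    "P@10": (0.49273, 0.26204),
    "P@20": (0.33497, 0.21104),
    "P@30": (0.24714, 0.17799),
    "P@40": (0.19511, 0.15288),
}

def relative_improvement(asym, sym):
    """Assumed definition: gain of ASYM relative to the SYM baseline."""
    return (asym - sym) / sym

improvement = {k: relative_improvement(a, s) for k, (a, s) in results.items()}
for metric, gain in improvement.items():
    print(f"{metric}: {gain:.1%}")
```

The per-cutoff gains shrink from P@10 to P@40, which is consistent with the observation that the gap between ASYM and SYM narrows as more items are considered.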
Under this normalization, the value of an element in a matrix generally differs from the value of the corresponding element in its transpose. The influence of a frequently occurring element is reduced by a logarithmic factor that depends on the sparsity of the elements in the same column. Thus, our modified normalization represents the local linking relation around each element more precisely.
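To see why an element and its counterpart in the transpose receive different values, consider a textbook form of the weighting. This form is an assumption used only for illustration (the exact formula of the thesis is defined elsewhere): let M be an n-by-p co-occurrence matrix with entries m_ij, let w denote the weights computed from M, and let w' denote the weights computed from its transpose.

```latex
\[
w_{ij} = m_{ij}\,\log\frac{n}{\mathrm{df}_j},
\qquad
\mathrm{df}_j = \bigl|\{\,k : m_{kj} > 0\,\}\bigr|
\]
\[
w'_{ji} = m_{ij}\,\log\frac{p}{\mathrm{df}'_i},
\qquad
\mathrm{df}'_i = \bigl|\{\,k : m_{ik} > 0\,\}\bigr|
\]
```

The first weight depends on how many rows share column j of M, whereas the second depends on how many columns share row i of M. The two logarithmic factors coincide only in special cases, so in general w_ij and w'_ji differ even though both start from the same count m_ij.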
Chapter 6 Conclusions and Future Work
We have proposed a novel learning model, Supervised FolkRank (SFR), for link recommendation in social tagging networks. By using an approximation of NDCG as the objective function, we take the rank position of each item into account rather than splitting the items into two sets. Moreover, to make our model reliable, we define the relevance of an item so that items the target user has never tagged are pruned. The search space of both relevant and irrelevant items is thereby reduced to the items that the target user has tagged before.
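For reference, the exact (non-smoothed) NDCG that such an objective approximates can be computed as in the sketch below. It assumes the common exponential gain and log2 discount; the smooth approximation actually optimized in the thesis is not shown here, and the rating list is hypothetical.

```python
import math

def dcg(relevances):
    """Discounted cumulative gain with the common (2^rel - 1) / log2(pos + 1) form."""
    return sum((2 ** rel - 1) / math.log2(pos + 2)
               for pos, rel in enumerate(relevances))

def ndcg(relevances):
    """DCG of the given ranking divided by the DCG of the ideal ranking."""
    ideal = dcg(sorted(relevances, reverse=True))
    return dcg(relevances) / ideal if ideal > 0 else 0.0

# Hypothetical ratings in ranked order; 0 marks an irrelevant item,
# which contributes no gain at any position.
ranked_ratings = [3, 0, 5, 0, 4]
print(round(ndcg(ranked_ratings), 4))
```

Because the gain depends on the position of every item, swapping two relevant items changes the score, which a two-set (relevant versus irrelevant) objective would not capture.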
The transition matrix in our model is similar to that of most random walk-based methods for social tagging networks [5, 6, 11, 13]. However, our transition matrix is not symmetric. We argue that an asymmetric transition matrix better reflects real conditions even though the edges in the social graph are undirected. Supervised FolkRank comes in two variants. The basic variant uses the random walk with restart model; the second introduces a self-transition probability that combines the PageRank-like model with the Lazy Random Walk model. In the training phase, following our definition of relevance, we compute the NDCG-based objective function with the rating of a relevant item as its relevance score. Thus, irrelevant items are not counted and do not affect the predictive result.
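One plausible way to combine restart and self-transition in a single update is sketched below. The update rule, the parameter names `alpha` (restart probability) and `beta` (self-transition probability), and the toy matrix are all assumptions for illustration; the exact formulation used by Supervised FolkRank is given in the model chapter, not here.

```python
def walk_with_restart(A, restart, alpha=0.15, beta=0.1, iters=200, tol=1e-12):
    """Power iteration for a random walk with restart and self-transition.

    A[i][j] is the probability of moving from node i to node j (rows sum to 1).
    With probability alpha the walker jumps back to the restart distribution;
    otherwise it stays put with probability beta (lazy step) or follows A.
    This combination is an assumed form, not necessarily the thesis's.
    """
    n = len(A)
    p = list(restart)
    for _ in range(iters):
        moved = [sum(p[i] * A[i][j] for i in range(n)) for j in range(n)]
        nxt = [alpha * restart[j]
               + (1 - alpha) * (beta * p[j] + (1 - beta) * moved[j])
               for j in range(n)]
        if max(abs(a - b) for a, b in zip(nxt, p)) < tol:
            return nxt
        p = nxt
    return p

# Hypothetical asymmetric, row-stochastic transition matrix on 3 nodes.
A = [[0.0, 0.5, 0.5],
     [1.0, 0.0, 0.0],
     [0.3, 0.7, 0.0]]
restart = [1.0, 0.0, 0.0]   # restart at the target user's node

scores = walk_with_restart(A, restart)
print([round(s, 4) for s in scores])
```

Setting `beta` to 0 recovers a plain random walk with restart, while setting `alpha` to 0 recovers a lazy random walk, so the two variants mentioned above correspond to different parameter settings of this one sketch.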
By optimizing the parameters of our model, we analyze which aspects of a user's linking behavior most affect the result. Because users' linking behaviors diverge, we argue that prediction with a PageRank-based model that has only one parameter vector may not fit real datasets. Thus, by clustering, we find a representative for each cluster by computing the cluster mean. Each cluster represents one kind of user behavior. When recommending, we use the parameter vector of the cluster that is most similar to the target user, which enhances the prediction.
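The selection step can be sketched as follows, assuming that cluster labels are already available (for example from DBSCAN), that the representative is the per-cluster mean of learned parameter vectors, and that similarity is measured by Euclidean distance. The vectors, labels, and function names are hypothetical; the similarity measure used in the thesis is not stated in this section.

```python
import math

def cluster_means(vectors, labels):
    """Mean parameter vector per cluster; label -1 is skipped as noise
    (the convention used by scikit-learn's DBSCAN)."""
    groups = {}
    for vec, label in zip(vectors, labels):
        if label == -1:
            continue
        groups.setdefault(label, []).append(vec)
    return {label: [sum(col) / len(members) for col in zip(*members)]
            for label, members in groups.items()}

def nearest_cluster(target, means):
    """Label of the cluster whose mean is closest to the target vector
    (Euclidean distance is an assumption)."""
    return min(means, key=lambda label: math.dist(target, means[label]))

# Hypothetical learned parameter vectors and their cluster labels.
vectors = [[0.1, 0.9], [0.2, 0.8], [0.9, 0.1], [0.8, 0.2], [5.0, 5.0]]
labels = [0, 0, 1, 1, -1]

means = cluster_means(vectors, labels)
chosen = nearest_cluster([0.7, 0.3], means)
print(chosen, means[chosen])
```

The parameter vector of the chosen cluster would then be used in place of a single global parameter vector when ranking items for the target user.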
Experiments on LibraryThing demonstrate the good performance of Supervised FolkRank. Compared with both supervised methods (e.g., modified Supervised Random Walks) and unsupervised methods, Supervised FolkRank (SFR) performs better.
The list-wise learning method we use can obtain a more precise distribution than pair-wise methods such as Supervised Random Walks. In addition, owing to the learning techniques, SFR can make reliable predictions without requiring network feature discovery and extraction.
Supervised FolkRank is a robust model that can be applied to problems that require ranking nodes in a social tagging graph, such as keyword search and recommendation.
Bibliography
[1] Al-Maskari, A., Sanderson, M. and Clough, P. The Relationship between IR effectiveness measures and user satisfaction. In SIGIR ’07: Proceedings of the 30th annual international ACM SIGIR conference on Research and development in information retrieval, pp. 23-27, 2007.
[2] Backstrom, L. and Leskovec, J. Supervised Random Walks: Predicting and recommending links in social networks. In WSDM ’11: Proceedings of the 4th International Conference on Web Search and Web Data Mining, pp. 635-644, 2011.
[3] Brin, S. and Page, L. The anatomy of a large-scale hypertextual web search engine. In Computer Networks and ISDN Systems, 30(1-7):107-117, April 1998.
[4] Broyden, C. G. The convergence of a class of double-rank minimization algorithms. In Journal of the Institute of Mathematics and Its Applications 6, pp. 76-90.
[5] Clements, M., Vries, A. P., and Reinders, M. I. J. The influence of personalization on tag query length in social media search. In Information Processing and Management, pp. 403-412, 2010.
[6] Clements, M., Vries, A. P., and Reinders, M. I. J. The Task-Dependent Effect of Tags and Ratings on Social Media Access. In ACM Transactions on Information Systems, Vol. 28, No. 4, Article 21, Nov. 2010.
[7] Fletcher, R. A New Approach to variable metric algorithms. In Computer Journal 13(3), pp. 317-322.
[8] Fletcher, R. Practical methods of optimization (2nd ed.), New York: John Wiley & Sons.
[9] Goldfarb, D. A family of variable metric updates derived by variational means. In Mathematics of Computation 24(109), pp. 23-26.
[10] Herlocker, J. L., Konstan, J. A., Borchers, A., and Riedl, J. An algorithmic framework for performing collaborative filtering. In SIGIR ’99: Proceedings of the 22nd annual international ACM SIGIR conference on Research and development in information retrieval, pp. 230-247, 1999.
[11] Hotho, A., Jäschke, R., Schmitz, C. and Stumme, G. Information retrieval in folksonomies: search and ranking. In ESWC ’06: Proceedings of the 3rd
collaborative recommendation. In SIGIR ’09: Proceedings of the 32nd annual international ACM SIGIR conference on Research and development in information retrieval, pp. 19-23, 2009.
[14] Liu, D. and Nocedal, J. On the limited memory BFGS method for large scale optimization. In Mathematical Programming, 45:503-528, 1989.
[15] Liu, T.Y. Learning to Rank for Information Retrieval. Springer-Verlag Berlin Heidelberg.
[16] Marinho, L. B., Nanopoulos, A., Schmidt-Thieme, L., Jäschke, R., Hotho, A., Stumme, G. and Symeonidis, P. Social tagging recommendation systems. In Recommender Systems Handbook, pp. 615-632, 2011.
[17] Milicevic, A. K., Nanopoulos, A. and Ivanovic, M. Social tagging in recommender systems: A survey of the state-of-art and possible extensions. In Artificial Intelligence Review, Volume 33 Issue 3, March 2010, pp. 187-209, 2010.
[18] Parra, D. and Brusilovsky, P. Collaborative filtering for social tagging systems: An experiment with CiteULike. In RecSys ’09: Proceedings of the 2009 ACM conference on Recommender Systems, pp. 237-240, 2009.
[19] Parra-Santander, D. and Brusilovsky, P. Improving collaborative filtering in social tagging systems for the recommendation of scientific articles. In IEEE/WIC/ACM International Conference on Web Intelligence and Intelligent Agent Technology, vol. 1, pp. 136-142, 2010.
[20] Qin, T., Liu, T.Y. and Li, H. A general approximation framework for direct optimization of information retrieval measures. In Information Retrieval 13(4), pp. 375-397, 2009.
[21] Rae, A., Sigurbjörnsson, B. and Zwol, R. Improving Tag Recommendation Using Social Networks. In RIAO ’10: 9th Recherche d'Information Assistée par Ordinateur Conference Adaptivity, Personalization and Fusion of Heterogeneous Information, 2010.
[22] Salton, G. and Buckley, C. Term-weighting approaches in automatic text retrieval. In Information Processing and Management, 24(5), pp. 513-523, 1988.
[23] Shanno, D. F. Conditioning of quasi-Newton methods for function minimization. In Mathematics of Computation 24(111), pp. 647-656.
[24] Taylor, M., Guiver, J., et al. Softrank: optimising non-smooth rank metrics. In WSDM ’08: Proceedings of the 1st International Conference on Web Search and Web Data Mining, pp. 77-86, 2008.
[25] Valizadegan, H., Jin, R., Zhang, R. and Mao, J. Learning to rank by optimizing NDCG measure. In Neural Information Processing Systems, 2010.
[26] Wu, X., Zhang, L. and Yu, Y. Exploring social annotations for the semantic web. In WWW ’06: Proceedings of the 15th international conference on World Wide Web, pp. 417-426, 2006.
[27] Yan, L., Dodier, R., Mozer, M. and Wolniewicz, R. Optimizing classifier performance via an approximation to the Wilcoxon-Mann-Whitney statistic. In ICML ’03: The 20th International Conference on Machine Learning, pp. 848-855, 2003.