Chapter 4. Cross-Domain Media Learning for Event Discovery
4.4.3 Evaluation of the Proposed Approach
In what follows, Section 4.4.3.1 shows the effect of cross-domain media mining, Section 4.4.3.2 shows the effect of spanning-graph construction, and Section 4.4.3.3 shows the effect of data normalization.
4.4.3.1 Effect of Cross-Domain Media Mining
Figure 4.8 compares the results of using the Instagram dataset alone with the results of using both the taxi dataset and the Instagram dataset, where the Instagram-only approach was implemented based on that in [50]. As can be seen, combining the two datasets achieved better performance for finding events in the nine months. Based on the results, we found that some events can be detected only by using the Instagram dataset. Even so, the time information for these events might not be accurate enough. The reason is that people might not upload their media data for an event immediately, but a few days after the event. Some of them may also later remove photos, tag other people, or add comments. In contrast, the information provided by the taxi dataset may be closer to real time (as summarized in Table 4.1). It is thus beneficial to add the taxi dataset for good performance.

Figure 4.8: Comparison of the results of using the Instagram dataset alone and the results of using both the taxi dataset and the Instagram dataset. (Plot: P@10, on a scale from 0 to 0.8, for each month from March to November; series: Instagram-only and Taxis+Instagram.)
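Figure 4.8 reports P@10. As a reading aid, the following minimal sketch computes precision at n under the standard information-retrieval definition (the fraction of the top-n ranked items that are correct). The function name, the argument names, and the event identifiers are illustrative assumptions; the exact evaluation protocol of this work is not restated here and may differ in detail.

```python
from typing import Hashable, Sequence, Set


def precision_at_n(ranked: Sequence[Hashable], relevant: Set[Hashable], n: int) -> float:
    """Fraction of the top-n ranked items that belong to the relevant set."""
    if n <= 0:
        raise ValueError("n must be positive")
    top = ranked[:n]
    hits = sum(1 for item in top if item in relevant)
    # Divide by n (not len(top)) so that a ranking shorter than n is penalised.
    return hits / n


# Hypothetical ranked event candidates and ground-truth events.
ranked_events = ["e3", "e7", "e1", "e9", "e4"]
true_events = {"e3", "e1", "e4"}
print(precision_at_n(ranked_events, true_events, 2))  # 0.5
print(precision_at_n(ranked_events, true_events, 5))  # 0.6
```

Dividing by n rather than by the number of returned items is one common convention; a protocol that divides by the number of items actually returned would give different values for short rankings.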
4.4.3.2 Effect of Spanning-Graph Construction
This section considers the use of k-nearest neighbors (kNN) for graph construction in our work; kNN is a classical algorithm that usually works well in practice. For graph construction, the kNN algorithm connects each vertex to its k nearest vertices. In contrast, for spanning-graph construction, a given vertex connects only to the nearest vertex in each of the eight regions defined with respect to itself (cf. Section 4.3.4). Figure 4.9 compares the results of using the kNN algorithm for graph construction with the results of using spanning graphs, where k was set to 8 because the spanning graphs used in our work consider only eight regions. (The construction of spanning graphs can easily be extended to consider more than eight regions.) Based on the results, we clearly see that the use of spanning graphs achieved better performance. We found that kNN may connect a vertex to many vertices that lie in similar directions from it, and this situation usually misleads the information transmission along the edges.

Figure 4.9: Comparison of the results of using the k-nearest neighbors (kNN) algorithm for graph construction and the results of using spanning graphs (cf. Section 4.3.4) for our work. (Plot: P@10, on a scale from 0 to 0.8, for each month from March to November; series: kNN and Ours.)
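The difference between the two constructions can be illustrated with a small sketch. It assumes 2-D points, Euclidean distance, and eight 45-degree sectors around each vertex; the actual region definition is the one given in Section 4.3.4, and the brute-force O(n^2) loops and all function names here are illustrative assumptions rather than the implementation used in this work.

```python
import math
from typing import Dict, List, Set, Tuple

Point = Tuple[float, float]


def knn_edges(points: List[Point], k: int) -> Dict[int, Set[int]]:
    """Connect each vertex to its k nearest vertices."""
    edges: Dict[int, Set[int]] = {}
    for i, p in enumerate(points):
        others = [j for j in range(len(points)) if j != i]
        others.sort(key=lambda j: math.dist(p, points[j]))
        edges[i] = set(others[:k])
    return edges


def sector(p: Point, q: Point) -> int:
    """Index (0..7) of the assumed 45-degree sector around p that contains q."""
    angle = math.atan2(q[1] - p[1], q[0] - p[0]) % (2 * math.pi)
    # min() guards against a rounding result of exactly 2*pi.
    return min(int(angle // (math.pi / 4)), 7)


def eight_region_edges(points: List[Point]) -> Dict[int, Set[int]]:
    """Connect each vertex to the nearest vertex in each of its eight sectors."""
    edges: Dict[int, Set[int]] = {}
    for i, p in enumerate(points):
        best: Dict[int, int] = {}  # sector index -> nearest vertex found so far
        for j, q in enumerate(points):
            if j == i or q == p:  # skip the vertex itself and coincident points
                continue
            s = sector(p, q)
            if s not in best or math.dist(p, q) < math.dist(p, points[best[s]]):
                best[s] = j
        edges[i] = set(best.values())
    return edges


# Vertex 0 has a tight cluster to its east and two isolated vertices elsewhere.
points = [(0.0, 0.0), (1.0, 0.1), (1.1, 0.2), (1.2, 0.1), (0.5, 3.0), (-3.0, 0.5)]
print(sorted(knn_edges(points, 3)[0]))       # [1, 2, 3]
print(sorted(eight_region_edges(points)[0]))  # [1, 4, 5]
```

In this toy example, kNN spends all three edges of vertex 0 on the eastern cluster, whereas the eight-region construction keeps one edge toward the cluster and still reaches the vertices in the other directions, which is the behavior discussed above.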
4.4.3.3 Effect of Data Normalization
Finally, this section evaluates the effect of data normalization (cf. Sections 4.3.1 to 4.3.3). Specifically, we evaluate the effect of flow normalization and variability-score calculation, but not that of buzz-score calculation, which is essential for us to extract critical textual information. We consider the correctness, i.e., P@n, for the top-10 events, top-20 events, top-50 events, and all events (i.e., 212 events), where the top-10 events refer to the 10 events that have the largest data volumes among all the events, and so forth. Table 4.4 compares the results of no normalization (i.e., no flow normalization and no variability-score calculation) with the results of using data normalization. As expected, normalization plays an important role in combining datasets with different units (e.g., the “times” of boarding and alighting from taxis) for high performance. We found that data from the taxi dataset dominated the ranking results if no normalization was involved. This is because the data volume of the taxi dataset is much larger than that of the textual data from the Instagram dataset in our work.

Table 4.4: Comparison of the results of no data normalization and the results of using data normalization. Note that “Ours w/o Normalization” used buzz-score calculation but neither flow normalization nor variability-score calculation.

Ours w/o Normalization
                    P@1      P@2      P@5      P@10
Pre-defined 10      0.1814   0.1925   0.2144   0.2177
Pre-defined 20      0.1739   0.1911   0.2080   0.2280
Pre-defined 50      0.2004   0.2391   0.2529   0.2643
All pre-defined     0.2123   0.2385   0.2955   0.3149

Ours
                    P@1      P@2      P@5      P@10
Pre-defined 10      0.4292   0.5452   0.5727   0.6047
Pre-defined 20      0.4385   0.5561   0.5733   0.6126
Pre-defined 50      0.5348   0.5728   0.6517   0.6166
All pre-defined     0.5054   0.5682   0.6419   0.6323
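The dominance effect can be reproduced with a toy example. The sketch below uses min-max scaling purely as a stand-in for normalization; it is not the flow normalization or the variability-score calculation of Sections 4.3.1 to 4.3.3, and the region names, the numbers, and the additive fusion are all hypothetical.

```python
from typing import Dict, List


def min_max(scores: Dict[str, float]) -> Dict[str, float]:
    """Rescale one source's scores to [0, 1]; a constant source maps to 0."""
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        return {k: 0.0 for k in scores}
    return {k: (v - lo) / (hi - lo) for k, v in scores.items()}


def rank(a: Dict[str, float], b: Dict[str, float]) -> List[str]:
    """Rank keys by the sum of the two sources' scores, highest first."""
    return sorted(a, key=lambda k: a[k] + b[k], reverse=True)


# Hypothetical signals: taxi counts are in the thousands, text signals in the tens.
taxi = {"r1": 5000.0, "r2": 4800.0, "r3": 1000.0}
insta = {"r1": 5.0, "r2": 40.0, "r3": 2.0}

raw_ranking = rank(taxi, insta)                       # taxi volume decides everything
scaled_ranking = rank(min_max(taxi), min_max(insta))  # both sources contribute
print(raw_ranking)     # ['r1', 'r2', 'r3']
print(scaled_ranking)  # ['r2', 'r1', 'r3']
```

Without scaling, the ranking equals the taxi-only ranking; after scaling, the strong textual signal for r2 is enough to move it ahead of r1.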
Chapter 5
Conclusion and Future Work
In recent years, social media have changed the world and our lives. More and more people like to share their daily lives through social media. This dissertation has presented three techniques for an effective social-media mining system, aiming to make our lives better. Given a photo, image location identification provides us with geographical information, image annotation provides us with people’s descriptions or comments, and event discovery provides us with events of interest near the places associated with the photo. This chapter summarizes our research results and lists some directions for future research.
5.1 Conclusion
For image location identification, we have presented an approach that unifies the visual features, geo-tags, and check-in data of images. Moreover, we have introduced a location-aware graph-based regrouping approach on clusters of images, which might benefit existing clustering-based techniques. Furthermore, we have integrated sparse coding into our system and developed a graph-based dictionary selection approach for sparse coding. Finally, experimental results have shown that our technique can be applied to daily-life large-scale image datasets to retrieve image locations with reasonable quality.
For image annotation, we have presented a graph-based layer multi-label SSL method that can effectively unify visual and textual information for multi-label learning. Our framework does not require pre-processing of the given large-scale dataset, yet it is capable of performing large-scale multi-label propagation. In addition, we have presented a tag refinement technique that can simultaneously suppress noisy tags and emphasize the other tags. Experimental results have shown that our algorithm can operate on a large-scale image dataset while effectively inferring the image labels.
For event discovery, we have presented a two-stage framework, consisting of data normalization and graph-based data fusion, that unifies a flow-based dataset and a check-in-based dataset to produce ranked events. Data normalization enables the fusion of the two datasets, and data fusion combines data from the two datasets with graph-based approaches. Based on a taxi dataset and an Instagram dataset, the experiments have shown the effectiveness of the proposed approach. Further, the framework is capable of combining other datasets for high performance.
5.2 Future Work
Two directions for future research are listed as follows.
1. Combining Visual and Textual Information for Event Discovery:
Most approaches for event discovery have been based on textual information. Recently, many people have come to prefer sharing photos to sharing textual data (e.g., comments) on social media websites. Generally, photos can provide people with a different kind of information from textual data. It is therefore desirable to combine visual and textual information for event discovery. Fortunately, there have been many studies on image content analysis and its applications. One idea is to transform the information in photos into textual data, and then add it to the existing textual data. By doing so, we can apply conventional approaches for event discovery. However, it is challenging to extract textual data from photos with high performance. Moreover, it could be difficult to find proper weights for different kinds and amounts of information for data fusion.
2. Combining Check-in and Flow Information for Traffic Route Recommendation:
Traffic congestion is an important problem. It not only wastes time and energy resources, but may also endanger our lives. For example, ambulances or fire engines may get stuck in traffic jams. Discovering the relation between events and traffic flows could help address the problem. Specifically, check-in data can be used to find events, and flow data can be used to find regions of traffic congestion. Given an event, the goal would be to predict the regions of traffic congestion from historical data and, if possible, to recommend routes that avoid creating congestion. Further, it is desirable to update the information efficiently based on users’ feedback.
Bibliography
[1] Flickr.
http://www.flickr.com/.
[2] Foursquare.
https://foursquare.com/.
[3] Google maps.
https://maps.google.com/.
[4] Instagram API endpoints.
http://instagram.com/developer/endpoints/media/.
[5] Yahoo! travel.
http://travel.yahoo.com/.
[6] Z. Al Bawab, G. H. Mills, and J.-F. Crespo. Finding trending local topics in search queries for personalization of a recommendation system. In Proceedings of ACM International Conference on Knowledge Discovery and Data Mining, pages 397–405, August 2012.
[7] Y. Avrithis, Y. Kalantidis, G. Tolias, and E. Spyrou. Retrieving landmark and non-landmark images from community photo collection. In Proceedings of ACM International Conference on Multimedia, pages 153–162, October 2010.
[8] H. Becker, D. Iter, M. Naaman, and L. Gravano. Identifying content for planned events across social media sites. In Proceedings of ACM International Conference on Web Search and Data Mining, pages 533–542, February 2012.
[9] A. Z. Broder. On the resemblance and containment of documents. In Proceedings of IEEE Compression and Complexity of Sequences, pages 21–29, June 1997.
[10] X. Chang, Y. Yang, A. G. Hauptmann, E. P. Xing, and Y.-L. Yu. Semantic concept discovery for large-scale zero-shot event detection. In Proceedings of IJCAI International Joint Conference on Artificial Intelligence, pages 2234–2240, July 2015.
[11] D. M. Chen, G. Baatz, K. Köser, S. S. Tsai, R. Vedantham, T. Pylvänäinen, K. Roimela, X. Chen, J. Bach, M. Pollefeys, B. Girod, and R. Grzeszczuk. City-scale landmark identification on mobile devices. In Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, pages 737–744, June 2011.
[12] A. Coates and A. Y. Ng. The importance of encoding versus training with sparse coding and vector quantization. In Proceedings of IMLS International Conference on Machine Learning, pages 921–928, June–July 2011.
[13] T. H. Cormen, C. E. Leiserson, R. L. Rivest, and C. Stein. Introduction to Algorithms. The MIT Press, 2009.
[14] J. Dean and S. Ghemawat. MapReduce: simplified data processing on large clusters. ACM Communications of the ACM, 51(1):107–113, January 2008.
[15] B. Efron, T. Hastie, I. Johnstone, and R. Tibshirani. Least angle regression. IMS The Annals of Statistics, 32(2):407–499, April 2004.
[16] Facebook. Form 10-K (annual report)–filed 02/01/13 for the period ending 12/31/12, 2013.
[17] L. Feng, J. Wu, S. Liu, and H. Zhang. Global correlation descriptor: a novel image representation for image retrieval. Elsevier Journal of Visual Communication and Image Representation, 33(1):104–114, November 2015.
[18] T. Fujisaka, R. Lee, and K. Sumiya. Discovery of user behavior patterns from geo-tagged micro-blogs. In Proceedings of ACM International Conference on Ubiquitous Information Management and Communication, pages 246–255, January 2010.
[19] J. He, M. Li, H.-J. Zhang, H. Tong, and C. Zhang. Manifold-ranking based image retrieval. In Proceedings of ACM International Conference on Multimedia, pages 9–16, October 2004.
[20] K.-H. Ho, H.-C. Ou, Y.-W. Chang, and H.-F. Tsao. Coupling-aware length-ratio-matching routing for capacitor arrays in analog integrated circuits. IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, 34(2):161–172, February 2015.
[21] S.-L. Huang, C.-A. Wu, K.-F. Tang, C.-H. Hsu, and C.-Y. R. Huang. A robust ECO engine by resource-constraint-aware technology mapping and incremental routing optimization. In Proceedings of IEEE/ACM Asia and South Pacific Design Automation Conference, pages 382–387, January 2011.
[22] Y.-G. Jiang, J. Wang, and S.-F. Chang. Lost in binarization: query-adaptive ranking for similar image search with compact codes. In Proceedings of ACM International Conference on Multimedia Retrieval, pages 1–8, April 2011.
[23] U. Kang, C. E. Tsourakakis, and C. Faloutsos. PEGASUS: mining peta-scale graphs. Springer Knowledge and Information Systems, 27(2):303–325, May 2011.
[24] G. Karakostas. A better approximation ratio for the vertex cover problem. ACM Transactions on Algorithms, 5(4):41:1–41:8, October 2009.
[25] L. Kennedy and M. Naaman. Generating diverse and representative image search results for landmarks. In Proceedings of ACM International Conference on World Wide Web, pages 297–306, April 2008.
[26] Y.-H. Kuo, K.-T. Chen, C.-H. Chiang, and W. H. Hsu. Query expansion for hash-based image object retrieval. In Proceedings of ACM International Conference on Multimedia, pages 65–74, October 2009.
[27] Y.-H. Kuo, Y.-Y. Chen, B.-C. Chen, W.-Y. Lee, C.-C. Wu, C.-H. Lin, Y.-L. Hou, W.-F. Cheng, Y.-C. Tsai, C.-Y. Hung, L.-C. Hsieh, and W. H. Hsu. Discovering the city by mining diverse and multimodal data streams. In Proceedings of ACM International Conference on Multimedia, pages 201–204, November 2014.
[28] Y.-H. Kuo, W.-Y. Lee, W. H. Hsu, and W.-H. Cheng. Augmenting mobile city-view image retrieval with context-rich user-contributed photos. In Proceedings of ACM International Conference on Multimedia, pages 687–690, November–December 2011.
[29] H. Lee, A. Battle, R. Raina, and A. Y. Ng. Efficient sparse coding algorithms. In Proceedings of NIPS Foundation Advances in Neural Information Processing Systems, pages 801–808, December 2006.
[30] H. Liu, T. Mei, J. Luo, H. Li, and S. Li. Finding perfect rendezvous on the go: accurate mobile visual localization and its applications to routing. In Proceedings of ACM International Conference on Multimedia, pages 9–18, October–November 2012.
[31] J. Liu, Z. Huang, L. Chen, H. T. Shen, and Z. Yan. Discovering areas of interest with geo-tagged images and check-ins. In Proceedings of ACM International Conference on Multimedia, pages 589–598, October–November 2012.
[32] W. Liu, J. He, and S.-F. Chang. Large graph construction for scalable semi-supervised learning. In Proceedings of IMLS International Conference on Machine Learning, pages 679–686, June 2010.
[33] Y. Liu, T. Mei, X. Wu, and X.-S. Hua. Multigraph-based query-independent learning for video search. IEEE Transactions on Circuits and Systems for Video Technology, 19(12):1841–1850, December 2009.
[34] J. Long, H. Zhou, and S. O. Memik. An O(nlogn) edge-based algorithm for obstacle-avoiding rectilinear Steiner tree construction. In Proceedings of ACM International Symposium on Physical Design, pages 126–133, April 2008.
[35] J. Mairal, F. Bach, J. Ponce, and G. Sapiro. Online dictionary learning for sparse coding. In Proceedings of IMLS International Conference on Machine Learning, pages 689–696, June 2009.
[36] P. Meladianos, G. Nikolentzos, F. Rousseau, Y. Stavrakas, and M. Vazirgiannis. Degeneracy-based real-time sub-event detection in Twitter stream. In Proceedings of AAAI International Conference on Web and Social Media, pages 248–257, May 2015.
[37] J.-Y. Pan, H.-J. Yang, C. Faloutsos, and P. Duygulu. Automatic multimedia cross-modal correlation discovery. In Proceedings of ACM International Conference on Knowledge Discovery and Data Mining, pages 653–658, August 2004.
[38] B. Quanz, J. Huan, and M. Mishra. Knowledge transfer with low-quality data: a feature extraction issue. IEEE Transactions on Knowledge and Data Engineering, 24(10):1789–1802, October 2012.
[39] D. Rao and D. Yarowsky. Ranking and semi-supervised classification on large scale graphs using Map-Reduce. In Proceedings of ACM Workshop on Graph-Based Methods for Natural Language Processing, pages 58–65, August 2009.
[40] M. Rischka and S. Conrad. Image landmark recognition with hierarchical k-means tree. In Proceedings of GI Database Systems for Business, Technology, and Web, pages 455–464, March 2015.
[41] T. Sakaki, M. Okazaki, and Y. Matsuo. Earthquake shakes Twitter users: real-time event detection by social sensors. In Proceedings of ACM International Conference on World Wide Web, pages 851–860, April 2010.
[42] J. Sankaranarayanan, H. Samet, B. E. Teitler, M. D. Lieberman, and J. Sperling. TwitterStand: news in tweets. In Proceedings of ACM International Conference on Advances in Geographic Information Systems, pages 42–51, November 2009.
[43] B. Shaw, J. Shea, S. Sinha, and A. Hogue. Learning to rank for spatiotemporal search. In Proceedings of ACM International Conference on Web Search and Data Mining, pages 717–726, February 2013.
[44] S. Theodoridis and K. Koutroumbas. Pattern Recognition. Academic Press, 2008.
[45] H. Tong, J. He, M. Li, W.-Y. Ma, H.-J. Zhang, and C. Zhang. Manifold-ranking-based keyword propagation for image retrieval. EURASIP Journal on Advances in Signal Processing, 2006(1):1–10, January 2006.
[46] H. Tong, J. He, M. Li, C. Zhang, and W.-Y. Ma. Graph based multi-modality learning. In Proceedings of ACM International Conference on Multimedia, pages 862–871, November 2005.
[47] V. V. Vazirani. Approximation Algorithms. Springer-Verlag, 2001.
[48] M. Wang, X.-S. Hua, X. Yuan, Y. Song, and L.-R. Dai. Optimizing multi-graph learning: towards a unified video annotation scheme. In Proceedings of ACM International Conference on Multimedia, pages 862–871, September 2007.
[49] J. Weng, Y. Yao, E. Leonardi, and B.-S. Lee. Event detection in Twitter. Technical report, HP Laboratories, 2011.
[50] C.-C. Wu, T. Mei, W. H. Hsu, and Y. Rui. Learning to personalize trending image search suggestion. In Proceedings of ACM International Conference on Research and Development in Information Retrieval, pages 727–736, July 2014.
[51] F. Wu, Y. Han, Q. Tian, and Y. Zhuang. Multi-label boosting for image annotation by structural grouping sparsity. In Proceedings of ACM International Conference on Multimedia, pages 15–24, October 2010.
[52] Y. Yan, Y. Yang, H. Shen, D. Meng, G. Liu, A. Hauptmann, and N. Sebe. Complex event detection via event oriented dictionary learning. In Proceedings of AAAI Conference on Artificial Intelligence, pages 3841–3847, January 2015.
[53] Y.-H. Yang, P.-T. Wu, C.-W. Lee, K.-H. Lin, W. H. Hsu, and H. Chen. ContextSeer: context search and recommendation at query time for shared consumer photos. In Proceedings of ACM International Conference on Multimedia, pages 199–208, October 2008.
[54] A. C.-C. Yao. On constructing minimum spanning trees in k-dimensional spaces and related problems. SIAM Journal on Computing, 11(4):721–736, November 1982.
[55] P. A. Zandbergen and S. J. Barbeau. Positional accuracy of assisted GPS data from high-sensitivity GPS-enabled mobile phones. RIN Journal of Navigation, 64(3):381–399, July 2011.
[56] W. Zhang, G. Qi, G. Pan, H. Lu, S. Li, and Z. Wu. City-scale social event detection and evaluation with taxi traces. ACM Transactions on Intelligent Systems and Technology, 6(3):40:1–40:20, May 2015.
[57] D. Zhou, O. Bousquet, T. N. Lal, J. Weston, and B. Schölkopf. Learning with local and global consistency. In Proceedings of NIPS Foundation Advances in Neural Information Processing Systems, pages 321–328, December 2003.
[58] H. Zhou, N. Shenoy, and W. Nicholls. Efficient minimum spanning tree construction without Delaunay triangulation. Elsevier Information Processing Letters, 81(5):271–276, February 2002.
[59] G. Zhu, S. Yan, and Y. Ma. Image tag refinement towards low-rank, content-tag prior and error sparsity. In Proceedings of ACM International Conference on Multimedia, pages 461–470, October 2010.
[60] X. Zhu. Semi-supervised learning literature survey. Doctoral Dissertation, Carnegie Mellon University, 2006.
[61] X. Zhu. Semi-supervised learning literature survey. Technical report, University of Wisconsin Madison, 2008.
[62] X. Zhu, Z. Ghahramani, and J. Lafferty. Semi-supervised learning using Gaussian fields and harmonic functions. In Proceedings of IMLS International Conference on Machine Learning, pages 912–919, August 2003.