(1)

Winston H. Hsu (徐宏民)

National Taiwan University, Taipei

From Media Retrieval to Data Analytics

Research Highlights

Office: R512, CSIE Building

Communication and Multimedia Lab (通訊與多媒體實驗室) http://www.csie.ntu.edu.tw/~winston

November 11, 2014

(2)

@NTU, November 2014 – Winston Hsu

Dr. Winston Hsu (徐宏民) – Short Bio

§  Associate Professor in NTU CSIE and GINM, since Feb. 1, 2007

–  Affiliated with the Communication and Multimedia Lab (CMLab, 台大通訊與多媒體實驗室)

§  PhD from Columbia University, New York, 2007

§  4 years at CyberLink (訊連科技) during its startup period

–  Founding Engineer, Project Leader, and R&D Manager

§  Honors & Awards

–  2600+ Google Scholar citations

–  Editorial Board for IEEE Multimedia Mag., Organizing Committee for ACM Multimedia 2010/2013, IEEE/ACM Senior Member, MSR Visiting Researcher

–  Awards: Ta-You Wu Memorial Award from the National Science Council (國科會吳大猷先生紀念獎, 2011), FIRST PRIZE in ACM Multimedia Grand Challenge 2011, FIRST PLACE in MSR-Bing Image Retrieval Challenge 2013, Microsoft Research Award in Multimedia Search 2009/2012, Outstanding IT Talent Award at Information Month 2013 (資訊月傑出資訊人才獎), NTU College of Electrical Engineering and Computer Science Academic Contribution Award (台灣大學電資學院學術貢獻獎, top 3%), etc.

(3)

Ongoing Research Projects (Selected) –

More Details and Demos at http://www.csie.ntu.edu.tw/~winston/

§  facial/clothing attribute detection/search

§  web-scale indexing & feature learning

§  large-scale photo/video recognition

§  web-scale facial image retrieval

§  mobile visual recognition

§  multimodal deep neural network

§  social media mining

§  big data analytics and visualization

§  first-person/wearable cameras

§  consumer photo retrieval

(4)

Exponentially Growing Photos (Billions)

§  Why?

–  Mobile phones!

–  Sharing in social media: for organization and (social) communication [Ames, et al., CHI’07]

§  Evidenced by the numbers

–  880 billion photographs will be taken in 2015 [Yahoo!]

–  500 million: average number of tweets sent per day

–  350 million: average number of Facebook photos uploaded per day

–  250 billion: total number of photos uploaded to Facebook

–  20 billion: total number of photos shared on Instagram

Source: http://expandedramblings.com; photo from NBC News (St. Peter's Square, Vatican)

(5)

(1/4) National Security –

Why? Strong Social and Industrial Needs

§  Searching for suspects across thousands of video cameras or millions of community-contributed photos

§  For example, “Boston Marathon Bombing 2013”

[Lei & Hsu, ACMMM’11, SIGIR’12; Lin, TCSVT’13, ICCV’13]

(6)

(2/4) Identifying Geographic Information and Objects –

Why? Strong Social and Industrial Needs

§  Identifying locations from photos and searching for suspicious objects in satellite images

§  For example,

–  IM2GPS [Hays & Efros, CVPR’08] – over million/billion-scale geo-referenced photos

–  “Malaysia Airlines Flight 370” over high-resolution satellite images

[Kuo & Hsu, ACMMM’09, ’10, CVPR’11, IEEE TMM’12]

(7)

(3/4) Augmented Reality –

Why? Strong Social and Industrial Needs

§  Recognizing and matching the real world in real time to help those in need, e.g., the elderly, surgeons, and tourists

§  Sample projects:

–  Wearable devices (e.g., Google Glass)

–  MIT SixthSense project (Pranav Mistry and Pattie Maes)

http://www.healthcare.philips.com/main/about/future-of-healthcare/

[Wu’13 IEEE MM, ACMMM Su’13, Wu’12, Cheng’10]

(8)

(4/4) Culture & Commerce –

Why? Strong Social and Industrial Needs

[Hou & Hsu, 2014; Cheng, MM’11; Chen, MM’12, TMM’13]

(9)

Photos and Videos are Essential, Ever-Growing, and Challenging Big Data!!

(10)

Our Research Paradigm –

Mobile and Cloud-based Media Computing

•   Goal – exploiting mobile- and cloud-based capabilities to improve user experience, visual analytics, and search

Billion-scale photos, videos, and (noisy) metadata

• Users (Social)

• Contents

• Services

• Context

• Interactions

• Camera

• Sensors

(11)

Automatic Voice-based Facial Image Annotation and Retrieval

§  Novel and intuitive scenario – voice tagging when taking photos

–  Propagating proper names to each candidate face; learning from the uncertainty

–  Accepted for ACM Multimedia 2014 (and ECCV 2014)

[Figure: faces tagged with the spoken names Jennifer, Angelina, and Brad]

(12)

Large-Scale Attribute-based People Search – Search by Impression

§  Search by impression – searching people-related photos by graphically describing the search intentions

§  FIRST PRIZE in ACM Multimedia Grand Challenge 2011

[Lei & Hsu, ACM MM 2011; SIGIR 2012]

(13)

Scalable Face Image/Video Retrieval

(Sub-second Response on a Million-Scale Dataset)

[Figure label: Age-invariant]

[Chen & Hsu, ACMMM’11, IEEE TMM’13, TCSVT’14, ECCV’14]

(14)

Me-link: Link Me to the Media –

Fusing Audio and Visual Cues for Robust and Efficient Mobile Media Interaction

http://vimeo.com/82499464

[Yeh & Hsu, WWW 2014]

(15)

Our Work in Deep Learning (DNN) for Video Recognition –

Solving Scarce Training Data Problem

[Figure: three bar charts. Panel 1 (axis 0–1): WeddingReception, Graduation, MusicPerformance, NonMusicPerformance, Birthday. Panel 2 (axis 0–0.6): Bird, PlayGround, Dog, Baseball, Soccer. Panel 3 (axis 0–0.4): Swimming, IceSkating, Biking, Dog, Beach.]

[Su & Hsu, 2014]

(16)

Mobile Visual Recognition –

Preserving Accuracy + Reducing Model Size

§  Limited storage, memory, and power in mobile devices

§  Reducing model size by approximating the high-dimensional kernel

–  Efficient computation – linear projection

–  Small model – sparse projection matrix

[Figure: projection matrix with dimensions labeled "# of feature dim." and "# of classifiers"]

Su et al., Scalable Mobile Visual Classification by Kernel Preserving Projection Over High-Dimensional Features. IEEE Transactions on Multimedia, 2014.
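The two sub-points on this slide (linear projection for cheap computation, a sparse projection matrix for a small model) can be illustrated with a minimal sketch. It uses a generic sparse random projection as a stand-in and is not the kernel-preserving projection of Su et al. All sizes (`d_in`, `d_out`, `n_classifiers`), the `density` value, and the helper names are assumptions made for illustration.

```python
import numpy as np


def sparse_projection(d_in, d_out, density=0.1, seed=0):
    """Return a sparse random projection matrix of shape (d_out, d_in).

    Each entry is nonzero with probability `density`; nonzero entries are
    +/- 1/sqrt(density * d_out), so squared norms are preserved in expectation.
    """
    rng = np.random.default_rng(seed)
    mask = rng.random((d_out, d_in)) < density
    signs = rng.choice([-1.0, 1.0], size=(d_out, d_in))
    return mask * signs / np.sqrt(density * d_out)


def stored_params(*arrays):
    """Count nonzero entries, a rough proxy for on-device model size."""
    return sum(int(np.count_nonzero(a)) for a in arrays)


# Hypothetical sizes, chosen only for illustration.
d_in, d_out, n_classifiers = 4096, 128, 100
rng = np.random.default_rng(1)

P = sparse_projection(d_in, d_out)
X = rng.normal(size=(50, d_in))   # 50 stand-in high-dimensional features
Z = X @ P.T                       # linear projection: one matrix product

# Squared norms should be roughly preserved by the projection.
mean_ratio = float(np.mean(np.sum(Z**2, axis=1) / np.sum(X**2, axis=1)))

# Model size: one dense weight row per classifier in the original space,
# versus a shared sparse P plus small per-classifier weights.
W_full = rng.normal(size=(n_classifiers, d_in))
W_small = rng.normal(size=(n_classifiers, d_out))
full_size = stored_params(W_full)
small_size = stored_params(P, W_small)
print(full_size, small_size, round(mean_ratio, 2))
```

Because `P` is shared by all classifiers, the saving in this sketch grows with the number of classifiers, which matches the two dimensions labeled on the slide's figure.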

(17)

Our Projects on Very Large Visual Collections and Logs

§  Music service visualization and mining (~1B logs, 0.2M active users)

§  Microsoft Bing search logs (21M users, 41M queries)

–  Improving image search: FIRST PLACE in MSR-Bing Image Retrieval Challenge 2013

–  Personalized image search [Wu et al., SIGIR’14]

§  Social media mining for the city (million-scale geo-referenced photos, tweets, check-ins, bikes, etc.)

[Figure: search log of user 1125, with panels (A) and (B) comparing "Personalized Suggestion with Trending-Aware Image Selection" and "General Suggestion". Example queries: “world news”, “tennis”, “atp ranking”. Labels: Baseball, Tennis, Global Issue; Alex Rodriguez, Pablo Sandoval, Clayton Kershaw, Masahiro Tanaka, Ukraine, South Sudan, Rafael Nadal, Roger Federer. Search count plotted over 1/19, 1/21, 1/23. News snippet: “Yankees Land Masahiro Tanaka” - ESPN.]

[Figure: (a) Large-Scale Community-Contributed Photos, (b) People Attribute & Group Type Detection, (c) Spatio-Temporal Activity Mining, (d) Personalized Mobile Recommendation (Travel Planning, Hotel Reservation, Dining Guide). Example places: Statue of Liberty, Central Park, Hilton Hotel.]

(18)

Example – App for Flower Recognition in the Apple App Store

§  Collaborating with the Taiwanese environmental protection NGO The Society of Wilderness (荒野保護協會) and sponsored by Taiwan Mobile

§  The first flower recognition app – k-NN on the server side

§  Recognizing 50+ flowers now; scalable to hundreds and more

§  Very challenging feedback → motivating new research directions (context, fine-grained recognition, web crowdsourcing, mobile hashing, etc.)

* PR event and news on 04/09/2013
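As a rough illustration of the server-side k-NN step mentioned on this slide, here is a minimal sketch of majority-vote nearest-neighbour classification. The 2-D "descriptors", the flower names, and the function `knn_predict` are assumptions for illustration; the app's actual features and server code are not described in the slides.

```python
import numpy as np
from collections import Counter


def knn_predict(query, features, labels, k=3):
    """Label `query` by majority vote among its k nearest neighbours (Euclidean)."""
    dists = np.linalg.norm(features - query, axis=1)
    nearest = np.argsort(dists)[:k]
    votes = Counter(labels[i] for i in nearest)
    return votes.most_common(1)[0][0]


# Toy "server-side" gallery: 2-D stand-ins for image descriptors (hypothetical).
gallery = np.array([[0.0, 0.1], [0.2, 0.0], [0.1, 0.2],
                    [5.0, 5.1], [5.2, 4.9], [4.9, 5.0]])
names = ["lily", "lily", "lily", "orchid", "orchid", "orchid"]

prediction = knn_predict(np.array([4.8, 5.2]), gallery, names, k=3)
print(prediction)
```

One reason k-NN suits the "scalable to hundreds and more" goal is that a new flower class can be added by appending labeled examples to the gallery, with no retraining step.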

[Wu & Hsu, IEEE MM’13]

(19)

Discovering the City by Mining Diverse and Multimodal Data Streams – IBM Grand Challenge: New York City 360

§  Exploring and Integrating Multiple Contents and Sources for NYC Life

§  ACM Multimedia 2014 Grand Challenge Multimodal Award

[Figure: Diverse Data Collection: 5.3M geotagged tweets, 1.6M check-in photos, 12.4K attractions & restaurants, 6.5B subway logs]

[Figure: Affinity Matrix (scale 0 to 1) showing the similarity of top-ranked POIs across Text, Image, Venue, and Traffic Data (Subway)]
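The affinity matrix in the figure can be pictured with a small sketch: one similarity matrix per source (text, image, venue, traffic), combined into a single POI-by-POI matrix with values in [0, 1]. Cosine similarity, equal-weight averaging, the random descriptors, and the helper names below are all assumptions for illustration; the slides do not say how the matrix was actually computed.

```python
import numpy as np


def cosine_affinity(features):
    """Pairwise cosine similarity between rows, clipped to [0, 1]."""
    norms = np.linalg.norm(features, axis=1, keepdims=True)
    unit = features / np.maximum(norms, 1e-12)
    return np.clip(unit @ unit.T, 0.0, 1.0)


def fuse_affinities(affinities, weights=None):
    """Weighted average of per-modality affinity matrices."""
    stack = np.stack(affinities)
    if weights is None:
        weights = np.ones(len(affinities))
    weights = np.asarray(weights, dtype=float)
    return np.tensordot(weights / weights.sum(), stack, axes=1)


rng = np.random.default_rng(0)
n_pois = 5
# Hypothetical non-negative descriptors for each POI, one array per source.
modalities = {
    "text": rng.random((n_pois, 16)),
    "image": rng.random((n_pois, 32)),
    "venue": rng.random((n_pois, 8)),
    "traffic": rng.random((n_pois, 24)),
}
fused = fuse_affinities([cosine_affinity(f) for f in modalities.values()])
print(np.round(fused, 2))
```

Passing `weights` lets one source count more than another, for example when one modality is known to be noisier.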

(20)

[Figure: overview diagram. Headings: Complementary and diverse media data; Rich and informative attributes / facets; State-of-the-art and novel methods; Promising and emerging utilities / applications. Labels: Events, Landmark, Clothing, Transportation, Food, People, Human Attributes, Traffic Pattern, Spatiotemporal Awareness, Trend & Event Detection, Content-Based Visual Analysis, Sentiment Analysis; Posts, Images, Videos, Likes, Contextual Information, Meta-Data. Example photos: Brooklyn Bridge Park, Manhattan Skyline Sunset, Lincoln Center, Mercedes-Benz Fashion Week, Food.]

(21)

Recent Student Awards (selected)

Working on Essential and Emerging Problems

§  ACM Multimedia 2014 Grand Challenge Multimodal Award

§  FIRST PLACE in MSR-Bing Image Retrieval Challenge 2013

§  First Prize for ACM Multimedia Grand Challenge 2011

§  ACM Multimedia 2013 Grand Challenge Multimodal Award

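§  陳殷盈: ACM Multimedia 2012 Doctoral Symposium Best Paper Award

§  郭盈希: Microsoft Research Asia Fellowship 2012

§  朱冠宇: Third Place, Chinese Institute of Electrical Engineering Youth Paper Award (中國電機工程學會青年論文獎), 2013

§  PhD students 陳冠婷 (2013), 陳殷盈 (2012), and 林彥良 (2012): grants for doctoral students to conduct research abroad (補助博士生赴國外研究, 千里馬)

§  陳柏村: Master's Thesis Award, Taiwanese Association for Artificial Intelligence (中華民國人工智慧學會), 2012

§  Second Place, Campus Cloud Application category, Chunghwa Telecom 2011 Telecom Innovation Application Contest (中華電信電信創新應用大賽)

§  鄭安容: Second Place, Chinese Institute of Electrical Engineering Youth Paper Award, 2011

§  李文瑜: Google Fellowship for Women at SIGIR 2011, a top international conference

§  陳殷盈: Google Fellowship for Women at WWW 2011, a top international conference

§  郭盈希: Second Place, Chinese Institute of Electrical Engineering Youth Paper Award, 2010

§  Students: First Place, Flora Expo Application category, Chunghwa Telecom 2010 Telecom Oscars (中華電信電信奧斯卡)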

(22)

Thanks and Comments!

Winston H. Hsu (徐宏民)

National Taiwan University, Taipei

Office: R512, CSIE Building

Communication and Multimedia Lab (通訊與多媒體實驗室) http://www.csie.ntu.edu.tw/~winston
