(1)

Machine Learning for Modern Artificial Intelligence

Hsuan-Tien Lin National Taiwan University

December 19, 2019

invited talk for Wistron NeWeb Corporation

(2)

ML for (Modern) AI

Outline

ML for (Modern) AI

ML Research for Modern AI

ML for Future AI

(3)

ML for (Modern) AI

From Intelligence to Artificial Intelligence

intelligence: thinking and acting smartly

• humanly

• rationally

artificial intelligence: computers thinking and acting smartly

• humanly

• rationally

humanly ≈ smartly ≈ rationally

—are humans rational? :-)

(4)

ML for (Modern) AI

Humanly versus Rationally

What if your self-driving car decides one death is better than two—and that one is you? (The Washington Post http://wpo.st/ZK-51)

You’re humming along in your self-driving car, chatting on your iPhone 37 while the machine navigates on its own. Then a swarm of people appears in the street, right in the path of the oncoming vehicle.

Car Acting Humanly: to save my (and passengers') life, stay on track

Car Acting Rationally: avoid the crowd and crash the owner for minimum total loss

which is smarter?

—depending on where I am, maybe? :-)

(5)

ML for (Modern) AI

(Traditional) Artificial Intelligence

Thinking Humanly

• cognitive modeling

—now closer to Psychology than AI

Acting Humanly

• dialog systems

• humanoid robots

• computer vision

Thinking Rationally

• formal logic—now closer to Theoreticians than AI practitioners

Acting Rationally

• recommendation systems

• cleaning robots

• cross-device ad placement

acting humanly or rationally:

more academia/industry attention nowadays

(6)

ML for (Modern) AI

Traditional vs. Modern [My] Definition of AI

Traditional Definition

humanly ≈ intelligently ≈ rationally

My Definition

intelligently ≈ easily

is your smart phone ‘smart’? :-)

modern artificial intelligence

= application intelligence

(7)

ML for (Modern) AI

Examples of Application Intelligence

Siri

By Bernard Goldbach [CC BY 2.0]

Amazon Recommendations

By Kelly Sims [CC BY 2.0]

iRobot

By Yuan-Chou Lo [CC BY-NC-ND 2.0]

Vivino

from nordic.businessinsider.com

(8)

ML for (Modern) AI

AI Milestones

[Timeline of AI history (heat vs. time): logic inference from the 1956 beginning to the 1st AI winter (1980); expert systems until the 2nd AI winter (1993); machine learning + deep learning since the 2012 revolution]

• first AI winter: AI cannot solve ‘combinatorial explosion’ problems

• second AI winter: expert system failed to scale

reason for the winters: expectation mismatch

(9)

ML for (Modern) AI

What’s Different Now?

More Data

• cheaper storage

• Internet companies

Faster Computation

• cloud computing

• GPU computing

Better Algorithms

• decades of research

• e.g. deep learning

Healthier Mindset

• reasonable wishes

• key breakthroughs

data-enabled AI: mainstream nowadays

(10)

ML for (Modern) AI

Machine Learning and AI

[Diagram: Machine Learning supporting Acting Humanly, Acting Rationally, and Easy-to-Use applications]

machine learning: core behind modern (data-driven) AI

(11)

ML for (Modern) AI

ML Connects Big Data and AI

From Big Data to Artificial Intelligence

big data (ingredient) → ML (tools/steps) → artificial intelligence (dish)

(Photos Licensed under CC BY 2.0 from Andrea Goh on Flickr)

“cooking” needs many possible tools & procedures

(12)

ML for (Modern) AI

Bigger Data Towards Better AI

best route by shortest path

best route by current traffic

best route by predicted travel time

big data can make the machine look smarter

(13)

ML for (Modern) AI

ML for Modern AI

big data → ML → AI (via learned method/model)

human learning/analysis → domain knowledge (Human I) → AI (via expert system)

• human: sometimes a faster learner on initial (smaller) data

• industry: black plum is as sweet as white

often important to leverage human learning, especially in the beginning

(14)

ML for (Modern) AI

Application: Tropical Cyclone Intensity Estimation

meteorologists can ‘feel’ & estimate TC intensity from image

TC images → ML (CNN + polar rotation invariance) → intensity estimation

human learning/analysis → domain knowledge (Human I) → current weather system

better than current system & ‘trial-ready’

(Chen et al., KDD 2018) (Chen et al., Weather & Forecasting 2019)

(15)

ML Research for Modern AI

Outline

ML for (Modern) AI

ML Research for Modern AI

ML for Future AI

(16)

ML Research for Modern AI

Cost-Sensitive Multiclass Classification

(17)

ML Research for Modern AI

What is the Status of the Patient?

?

H7N9-infected cold-infected healthy

• a classification problem

—grouping ‘patients’ into different ‘status’

are all mis-prediction costs equal?

(18)

ML Research for Modern AI

Patient Status Prediction

error measure = society cost

actual \ predicted    H7N9    cold    healthy
H7N9                     0    1000     100000
cold                   100       0       3000
healthy                100      30          0

• H7N9 mis-predicted as healthy: very high cost

• cold mis-predicted as healthy: high cost

• cold correctly predicted as cold: no cost

human doctors consider costs of decision;

how about computer-aided diagnosis?
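To make the error measure concrete, here is a minimal sketch (Python/NumPy assumed; the class order and helper name are illustrative) that scores a classifier by its average society cost instead of its plain error rate:

```python
import numpy as np

# society-cost matrix from the table above
# class order (illustrative): 0 = H7N9, 1 = cold, 2 = healthy; rows = actual, columns = predicted
COST = np.array([
    [     0,  1000, 100000],   # actual H7N9
    [   100,     0,   3000],   # actual cold
    [   100,    30,      0],   # actual healthy
])

def average_cost(y_true, y_pred, cost=COST):
    """Average society cost of the predictions (replaces the 0/1 error rate)."""
    return cost[y_true, y_pred].mean()

# toy check: one H7N9 patient mis-predicted as healthy dominates the total cost
y_true = np.array([0, 1, 2, 2])
y_pred = np.array([2, 1, 2, 2])
print(average_cost(y_true, y_pred))   # (100000 + 0 + 0 + 0) / 4 = 25000.0
```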

(19)

ML Research for Modern AI

Our Works

                  binary                           multiclass
regular           well-studied                     well-studied
cost-sensitive    known (Zadrozny et al., 2003)    ongoing (our works, among others)

selected works of ours

• cost-sensitive SVM (Tu and Lin, ICML 2010)

• cost-sensitive one-versus-one (Lin, ACML 2014)

• cost-sensitive deep learning (Chung et al., IJCAI 2016)

why are people not using those cool ML works for their AI? :-)

(20)

ML Research for Modern AI

Issue 1: Where Do Costs Come From?

A Real Medical Application: Classifying Bacteria

• by human doctors: different treatments ⇐⇒ serious costs

• cost matrix averaged from two doctors:

Ab Ecoli HI KP LM Nm Psa Spn Sa GBS

Ab 0 1 10 7 9 9 5 8 9 1

Ecoli 3 0 10 8 10 10 5 10 10 2

HI 10 10 0 3 2 2 10 1 2 10

KP 7 7 3 0 4 4 6 3 3 8

LM 8 8 2 4 0 5 8 2 1 8

Nm 3 10 9 8 6 0 8 3 6 7

Psa 7 8 10 9 9 7 0 8 9 5

Spn 6 10 7 7 4 4 9 0 4 7

Sa 7 10 6 5 1 3 9 2 0 7

GBS 2 5 10 9 8 6 5 6 8 0

issue 2: is cost-sensitive classification really useful?

(21)

ML Research for Modern AI

Cost-Sensitive vs. Traditional on Bacteria Data

[Figure: test cost of the regular OVO-SVM vs. cost-sensitive variants (csOSRSVM, csOVOSVM, csFTSVM) with the RBF kernel on the bacteria (SERS) data — the cost-sensitive algorithms achieve lower cost than the regular algorithm]

(Jan et al., BIBM 2011)

cost-sensitive better than traditional;

but why are people still not using those cool ML works for their AI? :-)

(22)

ML Research for Modern AI

Issue 3: Error Rate of Cost-Sensitive Classifiers

The Problem

[Figure: classifiers plotted in the error (%) vs. cost plane]

• cost-sensitive classifier: low cost but high error rate

• traditional classifier: low error rate but high cost

• how can we get the blue classifiers: low error rate and low cost?

cost-and-error-sensitive:

more suitable for real-world medical needs
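One simple way to think about reaching the blue region, shown as a sketch below, is to blend the application cost matrix with the plain 0/1 error matrix and train or select models against the blend; this is only an illustration of the cost-and-error idea, not the actual construction of Jan et al. (KDD 2012).

```python
import numpy as np

def blended_matrix(cost, alpha=0.5):
    """Blend an application cost matrix with the 0/1 error matrix.

    alpha = 1 recovers pure cost-sensitivity, alpha = 0 recovers the plain
    error rate; intermediate values target 'low cost and low error'.
    The cost matrix is rescaled to [0, 1] so the two terms are comparable.
    (Illustrative only; the paper's actual mechanism may differ.)
    """
    zero_one = 1.0 - np.eye(cost.shape[0])   # 0 on the diagonal, 1 off-diagonal
    scaled = cost / cost.max()               # bring costs into [0, 1]
    return alpha * scaled + (1.0 - alpha) * zero_one

# example with the H7N9/cold/healthy matrix from the earlier slide
COST = np.array([[0, 1000, 100000], [100, 0, 3000], [100, 30, 0]], dtype=float)
print(blended_matrix(COST, alpha=0.5))
```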

(23)

ML Research for Modern AI

Improved Classifier for Both Cost and Error

(Jan et al., KDD 2012)

[Figure: cost (left) and error (right) of the improved classifier across benchmark data sets: iris, wine, glass, vehicle, vowel, segment, dna, satimage, usps, zoo, splice, ecoli, soybean]

now, are people using those cool ML works for their AI? :-)

(24)

ML Research for Modern AI

Lessons Learned from

Research on Cost-Sensitive Multiclass Classification

? H7N9-infected cold-infected healthy

1 more realistic (generic) in academia

≠ more realistic (feasible) in application, e.g. the 'cost' of inputting a cost matrix? :-)

2 cross-domain collaboration important

e.g. getting the 'cost matrix' from domain experts

3 not easy to win human trust

—humans are somewhat multi-objective

(25)

ML Research for Modern AI

Label Space Coding for

Multilabel Classification

(26)

ML Research for Modern AI

What Tags?

?: {machine learning, data structure, data mining, object oriented programming, artificial intelligence, compiler, architecture, chemistry, textbook, children book, ... etc.}

a multilabel classification problem:

tagging input to multiple categories

(27)

ML Research for Modern AI

Binary Relevance: Multilabel Classification via Yes/No

Binary Classification: {yes, no}

multilabel w/ L classes: L Y/N questions — machine learning (Y), data structure (N), data mining (Y), OOP (N), AI (Y), compiler (N), architecture (N), chemistry (N), textbook (Y), children book (N), etc.

Binary Relevance approach:

transformation to multiple isolated binary classification

• disadvantages:

isolation — hidden relations not exploited (e.g. ML and DM highly correlated, ML subset of AI, textbook & children book disjoint)

unbalanced — few yes, many no

Binary Relevance: simple (& good) benchmark with known disadvantages
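A minimal Binary Relevance sketch (scikit-learn assumed as the base learner; class and variable names are illustrative): one independent yes/no classifier per tag, which is exactly where the isolation and imbalance disadvantages come from.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

class BinaryRelevance:
    """One independent binary classifier per label; label relations are ignored."""

    def __init__(self, base=LogisticRegression):
        self.base = base
        self.models = []

    def fit(self, X, Y):            # Y: (n_samples, L) matrix of 0/1 tags
        self.models = [self.base(max_iter=1000).fit(X, Y[:, j])
                       for j in range(Y.shape[1])]
        return self

    def predict(self, X):           # stack the L independent yes/no answers
        return np.column_stack([m.predict(X) for m in self.models])

# toy usage: 6 examples, 4 features, 3 labels
X = np.random.RandomState(0).randn(6, 4)
Y = np.array([[1, 0, 1], [0, 1, 0], [1, 1, 0], [0, 0, 1], [1, 0, 0], [0, 1, 1]])
print(BinaryRelevance().fit(X, Y).predict(X))
```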

(28)

ML Research for Modern AI

From Label-set to Coding View

label set apple orange strawberry binary code

{o} 0 (N) 1 (Y) 0 (N) [0, 1, 0]

{a, o} 1 (Y) 1 (Y) 0 (N) [1, 1, 0]

{a, s} 1 (Y) 0 (N) 1 (Y) [1, 0, 1]

{o} 0 (N) 1 (Y) 0 (N) [0, 1, 0]

{} 0 (N) 0 (N) 0 (N) [0, 0, 0]

label set ∈ 2^{1,2,...,L} (i.e. a subset of {1, 2, ..., L}) ⇔ length-L binary code
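The coding view is just a set-to-vector conversion; a tiny sketch (hypothetical three-label universe) of the mapping in the table:

```python
LABELS = ["apple", "orange", "strawberry"]   # the label universe, in a fixed order

def encode(label_set):
    """Label set -> length-L binary code, e.g. {'apple', 'orange'} -> [1, 1, 0]."""
    return [1 if lab in label_set else 0 for lab in LABELS]

def decode(code):
    """Length-L binary code -> label set."""
    return {lab for lab, bit in zip(LABELS, code) if bit == 1}

print(encode({"apple", "orange"}))   # [1, 1, 0]
print(decode([0, 1, 0]))             # {'orange'}
```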

(29)

ML Research for Modern AI

A NeurIPS 2009 Approach: Compressive Sensing

General Compressive Sensing

sparse (many 0) binary vectors y ∈ {0, 1}^L can be robustly compressed by projecting onto M ≪ L basis vectors {p_1, p_2, ..., p_M}

Comp. Sensing for Multilabel Classification (Hsu et al., NeurIPS 2009)

1 compress: encode original data by compressive sensing

2 learn: get regression function from compressed data

3 decode: decode regression predictions to sparse vector by compressive sensing

Compressive Sensing:

seemingly strong competitor backed by related theoretical analysis
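A rough sketch of the three steps, using a random Gaussian projection for compression and orthogonal matching pursuit for the sparse decoding (scikit-learn assumed; this shows only the flavor of the approach, not the exact setup of Hsu et al., NeurIPS 2009):

```python
import numpy as np
from sklearn.linear_model import OrthogonalMatchingPursuit, Ridge

rng = np.random.RandomState(0)
N, d, L, M, K = 200, 10, 40, 12, 3            # examples, features, labels, code length, sparsity

X = rng.randn(N, d)
Y = (rng.rand(N, L) < 0.05).astype(float)     # sparse 0/1 label matrix
P = rng.randn(M, L) / np.sqrt(M)              # random projection with M << L

Z = Y @ P.T                                   # 1) compress: length-L labels -> length-M codes
reg = Ridge(alpha=1.0).fit(X, Z)              # 2) learn: regression from features to codes

def cs_decode(z_hat, n_nonzero=K):
    """3) decode: recover a sparse label vector from a predicted code via OMP."""
    omp = OrthogonalMatchingPursuit(n_nonzero_coefs=n_nonzero).fit(P, z_hat)
    return (omp.coef_ > 0.5).astype(int)

Y_pred = np.array([cs_decode(z) for z in reg.predict(X[:5])])
print(Y_pred.shape)                           # (5, 40)
```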

(30)

ML Research for Modern AI

Our Proposed Approach:

Compressive Sensing ⇒ PCA

Principal Label Space Transformation (PLST),

i.e. PCA for Multilabel Classification (Tai and Lin, NC Journal 2012)

1 compress: encode original data by PCA

2 learn: get regression function from compressed data

3 decode: decode regression predictions to label vector by reverse PCA + quantization

does PLST perform better than CS?
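A minimal PLST-style sketch of the same three steps with PCA (a simplification of Tai & Lin, 2012; shapes and names are illustrative):

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.linear_model import Ridge

class SimplePLST:
    """PCA-compress the label space, regress to the codes, decode back and round."""

    def __init__(self, n_components=5):
        self.pca = PCA(n_components=n_components)
        self.reg = Ridge(alpha=1.0)

    def fit(self, X, Y):
        Z = self.pca.fit_transform(Y)    # 1) compress: project labels onto principal directions
        self.reg.fit(X, Z)               # 2) learn: multi-output regression in the code space
        return self

    def predict(self, X):
        Y_hat = self.pca.inverse_transform(self.reg.predict(X))   # 3) decode: reverse PCA
        return (Y_hat > 0.5).astype(int)                           #    + quantization to 0/1

# toy usage (random data, shapes only; the benefit shows on large label spaces)
rng = np.random.RandomState(0)
X = rng.randn(100, 8)
Y = (rng.rand(100, 20) < 0.15).astype(float)
print(SimplePLST(n_components=5).fit(X, Y).predict(X[:3]))
```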

(31)

ML Research for Modern AI

Hamming Loss Comparison: PLST vs. CS

[Figures: Hamming loss vs. compressed dimension on mediamill with Linear Regression (left) and Decision Tree (right), comparing Full-BR (no reduction), CS, and PLST]

PLST better than CS: faster, better performance

• similar findings across data sets and regression algorithms

Why? CS creates harder-to-learn regression tasks
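For reference, the Hamming loss compared in these plots is simply the fraction of label bits predicted incorrectly:

```python
import numpy as np

def hamming_loss(Y_true, Y_pred):
    """Per-bit disagreement between true and predicted label matrices."""
    return float(np.mean(Y_true != Y_pred))

print(hamming_loss(np.array([[1, 0, 1]]), np.array([[1, 1, 1]])))   # 1/3 ≈ 0.333
```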

(32)

ML Research for Modern AI

Our Works Continued from PLST

1 Compression Coding (Tai & Lin, NC Journal 2012, with 249 citations)

— condense for efficiency: better (than CS) approach PLST

— key tool: PCA from Statistics/Signal Processing

2 Learnable-Compression Coding (Chen & Lin, NeurIPS 2012, with 186 citations)

— condense learnably for better efficiency: better (than PLST) approach CPLST

— key tool: Ridge Regression from Statistics (+ PCA)

3 Cost-Sensitive Coding (Huang & Lin, ECML Journal Track 2017)

— condense cost-sensitively towards application needs: better (than CPLST) approach CLEMS

— key tool: Multidimensional Scaling from Statistics

cannot thank statisticians enough for those tools!

(33)

ML Research for Modern AI

Lessons Learned from

Label Space Coding for Multilabel Classification

?: {machine learning, data structure, data mining, object oriented programming, artificial intelligence, compiler, architecture, chemistry, textbook, children book, ... etc.}

1 Is Statistics the same as ML? Is Statistics the same as AI?

does it really matter?

Modern AI should embrace every useful tool from other fields.

2 good tools not necessarily the most sophisticated tools, e.g. PCA possibly more useful than CS

3 more-cited paper ≠ more-useful AI solution

—citation count not the only impact measure

(34)

ML Research for Modern AI

Active Learning by Learning

(35)

ML Research for Modern AI

Active Learning: Learning by ‘Asking’

labeling is expensive:

active learning — 'question asking'

— query y_n of a chosen x_n

[Diagram of the learning flow: unknown target function f : X → Y, labeled training examples (x, +1) / (x, -1), learning algorithm A, final hypothesis g ≈ f]

active: improve hypothesis with fewer labels (hopefully) by asking questions strategically

(36)

ML Research for Modern AI

Pool-Based Active Learning Problem

Given

• labeled pool D_l = {(feature x_n, label y_n (e.g. IsApple?))} for n = 1, ..., N

• unlabeled pool D_u = {x̃_s} for s = 1, ..., S

Goal

design an algorithm that iteratively

1 strategically queries some x̃_s to get the associated ỹ_s

2 moves (x̃_s, ỹ_s) from D_u to D_l

3 learns classifier g(t) from D_l

and improves the test accuracy of g(t) w.r.t. #queries

how to query strategically?
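A bare-bones pool-based loop, with uncertainty sampling standing in for the query strategy (scikit-learn assumed; `y_u_oracle` plays the human labeler, and the initial labeled pool is assumed to contain both classes). The strategies on the next slide simply replace the `pick` rule.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def uncertainty_pick(model, X_pool):
    """Pick the pool example whose predicted probability is closest to 0.5."""
    p = model.predict_proba(X_pool)[:, 1]
    return int(np.argmin(np.abs(p - 0.5)))

def active_learning(X_l, y_l, X_u, y_u_oracle, X_test, y_test, budget=20):
    X_l, y_l, X_u, y_u = list(X_l), list(y_l), list(X_u), list(y_u_oracle)
    curve = []
    for _ in range(budget):
        g = LogisticRegression(max_iter=1000).fit(X_l, y_l)   # learn g(t) from D_l
        s = uncertainty_pick(g, np.array(X_u))                 # strategically choose x~_s
        X_l.append(X_u.pop(s)); y_l.append(y_u.pop(s))         # query y~_s, move it to D_l
        curve.append(g.score(X_test, y_test))                  # test accuracy vs. #queries
    return curve
```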

(37)

ML Research for Modern AI

How to Query Strategically?

Strategy 1

ask the most confused question

Strategy 2

ask the most frequent question

Strategy 3

ask the most 'debateful' question

choosing one single strategy is non-trivial:

[Figures: accuracy vs. % of unlabelled data queried for RAND, UNCERTAIN, PSDS, and QUIRE on three data sets — a different strategy wins on each]

application intelligence: how to choose strategy smartly?

(38)

ML Research for Modern AI

Idea: Trial-and-Reward Like Human

when do humans trial-and-reward?

gambling

K strategies:

A_1, A_2, ..., A_K

try one strategy

"goodness" of strategy as reward

K bandit machines:

B_1, B_2, ..., B_K

try one bandit machine

"luckiness" of machine as reward

intelligent choice of strategy

⟹ intelligent choice of bandit machine

(39)

ML Research for Modern AI

Active Learning by Learning

(Hsu and Lin, AAAI 2015)

K strategies:

A1, A2, · · · , AK

try one strategy

“goodness” of strategy as reward

Given: K existing active learning strategies; for t = 1, 2, . . . , T

1 let some bandit model decide strategy A_k to try

2 query the x̃_s suggested by A_k, and compute g(t)

3 evaluate goodness of g(t) as reward of the trial to update the model

proposed Active Learning by Learning (ALBL):

motivated but unrigorous reward design
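A rough sketch of the trial-and-reward loop with an EXP3-style bandit over the K strategies (the actual ALBL uses EXP4.P with an importance-weighted accuracy reward; the callables here are placeholders the caller supplies):

```python
import numpy as np

def albl_loop(strategies, query_and_retrain, evaluate, T=50, eta=0.1, rng=None):
    """strategies: list of K active learning strategies.
    query_and_retrain(strategy): performs one query with that strategy and refits g(t).
    evaluate(): returns a goodness score of the current g(t) in [0, 1], used as the reward.
    """
    rng = rng or np.random.RandomState(0)
    K = len(strategies)
    w = np.ones(K)                                  # one weight per strategy
    for _ in range(T):
        p = (1 - eta) * w / w.sum() + eta / K       # mix in uniform exploration
        k = rng.choice(K, p=p)                      # 1) bandit picks strategy A_k
        query_and_retrain(strategies[k])            # 2) query the x~ suggested by A_k, refit g(t)
        reward = evaluate()                         # 3) goodness of g(t) as reward
        w[k] *= np.exp(eta * reward / (K * p[k]))   #    importance-weighted EXP3 update
    return w / w.sum()                              # final preference over strategies
```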

(40)

ML Research for Modern AI

Comparison with Single Strategies

[Figures: accuracy vs. % of unlabelled data for ALBL against RAND, UNCERTAIN, PSDS, and QUIRE — UNCERTAIN is best on vehicle, PSDS is best on sonar, QUIRE is best on diabetes, and ALBL tracks the best strategy on each]

no single best strategy for every data set

—choosing is needed

• proposed ALBL consistently matches the best

—similar findings across other data sets

'application intelligence' outcome:

open-source tool released

(https://github.com/ntucllab/libact)

(41)

ML Research for Modern AI

Lessons Learned from

Research on Active Learning by Learning

by DFID - UK Department for International Development;

licensed under CC BY-SA 2.0 via Wikimedia Commons

1 scalability bottleneck of 'application intelligence':

choice of methods/models/parameters/. . .

2 think outside of the math box:

'unrigorous' usage may be good enough

3 important to be brave yet patient

idea: 2012

paper: (Hsu and Lin, AAAI 2015); software: (Yang et al., 2017)

(42)

ML Research for Modern AI

Tropical Cyclone Intensity Estimation

(43)

ML Research for Modern AI

Experienced Meteorologists Can ‘Feel’ and Estimate Tropical Cyclone Intensity from Image

Can ML do the same/better?

• lack of ML-ready datasets

• lack of a model that properly utilizes domain knowledge

issues addressed in our latest works

(Chen et al., KDD 2018) (Chen et al., Weather & Forecasting 2019)

(44)

ML Research for Modern AI

Recall: Flow behind Our Proposed Model

TC images → ML (CNN + polar rotation invariance) → intensity estimation

human learning/analysis → domain knowledge (HI) → current weather system

is the proposed CNN-TC better than the current weather system?
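One simple way to encode the 'rotation invariance' domain knowledge is rotation augmentation of the TC images before CNN training; the sketch below is only an illustration of the idea and not necessarily the mechanism used in the papers.

```python
import numpy as np
from scipy.ndimage import rotate

def rotation_augment(images, intensities, angles=(0, 90, 180, 270)):
    """Add rotated copies of each TC image with the same intensity label,
    so a CNN cannot rely on the (physically meaningless) absolute orientation."""
    aug_x, aug_y = [], []
    for img, y in zip(images, intensities):
        for a in angles:
            aug_x.append(rotate(img, a, reshape=False, order=1))
            aug_y.append(y)                       # intensity is unchanged by rotation
    return np.stack(aug_x), np.array(aug_y)

# toy usage: 4 fake single-channel 64x64 satellite frames with illustrative intensities
X = np.random.rand(4, 64, 64)
y = np.array([35.0, 64.0, 90.0, 120.0])
X_aug, y_aug = rotation_augment(X, y)
print(X_aug.shape, y_aug.shape)                   # (16, 64, 64) (16,)
```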

(45)

ML Research for Modern AI

Results

Method     RMS Error
ADT        11.75
AMSU       14.40
SATCON      9.66
CNN-TC      9.03

CNN-TC much better than the current weather system (SATCON)

why are people not using this cool ML model? :-)

(46)

ML Research for Modern AI

Lessons Learned from

Research on Tropical Cyclone Intensity Estimation

1 again, cross-domain collaboration important, e.g. even from 'organizing data' to be ML-ready

2 not easy to claim production-ready

—can ML be used for an 'unseenly-strong' TC?

3 a good AI system requires both human and machine learning

—still an ‘art’ to blend the two

(47)

ML for Future AI

Outline

ML for (Modern) AI

ML Research for Modern AI

ML for Future AI

(48)

ML for Future AI

AI: Now and Next

2010–2015 AI becomes promising, e.g.

• initial success of deep learning on ImageNet

• mature tools for SVM (LIBSVM) and others

2016–2020 AI becomes competitive, e.g.

• super-human performance of AlphaGo and others

• all big technology companies become AI-first

2021–

AI becomes necessary

• "You'll not be replaced by AI, but by humans who know how to use AI"

(Sun, Chief AI Scientist of Appier, 2018)

(49)

ML for Future AI

Needs of ML for Future AI

more creative: win human respect

e.g. Appier's 2018 work on designing matching clothes (Shih et al., AAAI 2018)

more explainable: win human trust

e.g. my students' work on automatic bridge bidding (Yeh et al., IEEE ToG 2018)

more interactive: win human heart

e.g. my student's work (w/ DeepQ) on efficient disease diagnosis

(Peng et al., NeurIPS 2018)

(50)

ML for Future AI

Summary

• ML for (Modern) AI:

tools + human knowledge ⇒ easy-to-use application

• ML Research for Modern AI:

need to be more open-minded

—in methodology, in collaboration, in KPI

• ML for Future AI:

crucial to be ‘human-centric’

Thank you! Questions?
