Machine Learning for Modern Artificial Intelligence
Hsuan-Tien Lin 林軒田
Dept. of Computer Science and Information Engineering, National Taiwan University (國立臺灣大學資訊工程學系)
December 17, 2020
Keynote talk for International Computer Symposium 2020 &
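Ministry of Education AI Technology and Application Talent Cultivation Program Results Presentation (教育部人工智慧技術及應用人才培育計畫成果發表會)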
About Me
Professor
National Taiwan University
Chief Data Science Consultant (former Chief Data Scientist)
Appier Inc.
Co-author of Learning from Data
Instructor
NTU-Coursera MOOCs ML Foundations/Techniques
ML for (Modern) AI
Outline
ML for (Modern) AI
ML Research for Modern AI
ML for Future AI
From Intelligence to Artificial Intelligence
intelligence: thinking and acting smartly
• humanly
• rationally
artificial intelligence: computers thinking and acting smartly
• humanly
• rationally
humanly ≈ smartly ≈ rationally
—are humans rational? :-)
Humanly versus Rationally
What if your self-driving car decides one death is better than two—and that one is you? (The Washington Post http://wpo.st/ZK-51)
You’re humming along in your self-driving car, chatting on your iPhone 37 while the machine navigates on its own. Then a swarm of people appears in the street, right in the path of the oncoming vehicle.
Car Acting Humanly: to save my (and passengers’) life, stay on track
Car Acting Rationally: avoid the crowd and crash the owner for minimum total loss
which is smarter?
—depending on where I am, maybe? :-)
(Traditional) Artificial Intelligence
Thinking Humanly
• cognitive modeling
—now closer to Psychology than AI
Acting Humanly
• dialog systems
• humanoid robots
• computer vision
Thinking Rationally
• formal logic—now closer to Theoreticians than AI practitioners
Acting Rationally
• recommendation systems
• cleaning robots
• cross-device ad placement
acting humanly or rationally:
more academia/industry attention nowadays
Traditional vs. Modern [My] Definition of AI
Traditional Definition
humanly ≈ intelligently ≈ rationally
My Definition
intelligently ≈ easily
is your smart phone ‘smart’? :-)
modern artificial intelligence
= application intelligence
Examples of Application Intelligence
Siri
By Bernard Goldbach [CC BY 2.0]
Amazon Recommendations
By Kelly Sims [CC BY 2.0]
iRobot
By Yuan-Chou Lo [CC BY-NC-ND 2.0]
Vivino
From nordic.businessinsider.com
AI Milestones
[Figure: AI ‘heat’ over time, with milestones begin (1956), 1st winter (1980), 2nd winter (1993), and revolution (2012); approaches labeled logic inference, expert system, machine learning + deep learning]
AI history
• first AI winter: AI cannot solve ‘combinatorial explosion’ problems
• second AI winter: expert systems failed to scale
reason for winters: expectation mismatch
What’s Different Now?
More Data
• cheaper storage
• Internet companies
Faster Computation
• cloud computing
• GPU computing
Better Algorithms
• decades of research
• e.g. deep learning
Healthier Mindset
• reasonable wishes
• key breakthroughs
data-enabled AI: mainstream nowadays
Bigger Data Towards Easier-to-use AI
By deepanker70 on https://pixabay.com/
past
best route by shortest path
present
best route by current traffic
future
best route by predicted travel time
big data can make machines look smarter
Machine Learning Connects Big Data and AI
From Big Data to Artificial Intelligence
big data (ingredient) → ML (tools/steps) → artificial intelligence (dish)
Photos Licensed under CC BY 2.0 from Andrea Goh on Flickr
“cooking” needs many possible tools & procedures
ML for Modern AI
[Diagram: big data → ML → AI; big data → human learning/analysis → domain knowledge (HumanI), which feeds ML (method, model) and expert system]
• humans are sometimes faster learners on initial (smaller) data
• industry: black plum is as sweet as white
often important to leverage human learning, especially in the beginning
Application: Tropical Cyclone Intensity Estimation
meteorologists can ‘feel’ & estimate TC intensity from image
[Diagram: TC images → ML → intensity estimation; TC images → human learning/analysis → domain knowledge (HumanI), which feeds ML (CNN, polar rotation invariance) and the current weather system]
better than current system & ‘trial-ready’
(Chen et al., KDD ’18; Chen et al., Weather & Forecasting ’19)
ML Research for Modern AI
Outline
ML for (Modern) AI
ML Research for Modern AI
ML for Future AI
Cost-Sensitive Multiclass Classification
What is the Status of the Patient?
?
By DataBase Center for Life Science;
licensed under CC BY 4.0 via Wikimedia Commons
COVID19 cold healthy
Pictures Licensed under CC BY-SA 3.0 from 1RadicalOne on Wikimedia Commons
• a classification problem
—grouping ‘patients’ into different ‘status’
are all mis-prediction costs equal?
Patient Status Prediction
error measure = society cost
actual \ predicted   COVID19    cold   healthy
COVID19                    0    1000    100000
cold                     100       0      3000
healthy                  100      30         0
• COVID19 mis-predicted as healthy: very high cost
• cold mis-predicted as healthy: high cost
• cold correctly predicted as cold: no cost
human doctors consider costs of decision;
how about computer-aided diagnosis?
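One common way for a computer-aided diagnosis to take such costs into account is to predict the class with the smallest expected cost instead of the most probable class. The sketch below uses the cost matrix from the slide (rows: actual, columns: predicted); the probability values and the function names `expected_costs` and `min_expected_cost_class` are hypothetical and only serve as illustration.

```python
# Cost matrix from the slide: COST[actual][predicted].
CLASSES = ["COVID19", "cold", "healthy"]
COST = {
    "COVID19": {"COVID19": 0, "cold": 1000, "healthy": 100000},
    "cold": {"COVID19": 100, "cold": 0, "healthy": 3000},
    "healthy": {"COVID19": 100, "cold": 30, "healthy": 0},
}


def expected_costs(probs):
    """Expected cost of each possible prediction, given P(actual class)."""
    return {
        pred: sum(probs[actual] * COST[actual][pred] for actual in CLASSES)
        for pred in CLASSES
    }


def min_expected_cost_class(probs):
    """Predict the class whose expected cost is smallest."""
    costs = expected_costs(probs)
    return min(CLASSES, key=lambda pred: costs[pred])


# Hypothetical patient: most likely healthy, small chance of COVID19.
probs = {"COVID19": 0.05, "cold": 0.15, "healthy": 0.80}
most_probable = max(CLASSES, key=lambda c: probs[c])
cost_aware = min_expected_cost_class(probs)
print(most_probable, cost_aware)
```

With these made-up probabilities the most probable class is healthy, but its expected cost (about 5450) is far above that of predicting cold (about 74) or COVID19 (about 95), so the cost-aware prediction differs from the most probable one.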
Our Works
                 binary                          multiclass
regular          well-studied                    well-studied
cost-sensitive   known (Zadrozny et al., 2003)   ongoing (our works, among others)
selected works of ours
• cost-sensitive SVM (Tu and Lin, ICML 2010)
• cost-sensitive one-versus-one (Lin, ACML 2014)
• cost-sensitive deep learning (Chung et al., IJCAI 2016)
why are people not using those cool ML works for their AI? :-)
Issue 1: Where Do Costs Come From?
A Real Medical Application: Classifying Bacteria
• by human doctors: different treatments ⟺ serious costs
• cost matrix averaged from two doctors:
         Ab Ecoli    HI    KP    LM    Nm   Psa   Spn    Sa   GBS
Ab        0     1    10     7     9     9     5     8     9     1
Ecoli     3     0    10     8    10    10     5    10    10     2
HI       10    10     0     3     2     2    10     1     2    10
KP        7     7     3     0     4     4     6     3     3     8
LM        8     8     2     4     0     5     8     2     1     8
Nm        3    10     9     8     6     0     8     3     6     7
Psa       7     8    10     9     9     7     0     8     9     5
Spn       6    10     7     7     4     4     9     0     4     7
Sa        7    10     6     5     1     3     9     2     0     7
GBS       2     5    10     9     8     6     5     6     8     0
Issue 2: Is Cost-Sensitive Classification Really Useful?
Cost-Sensitive vs. Traditional on Bacteria Data
[Figure: cost (axis from 0 to 1.6) of the algorithms OVOSVM, csOSRSVM, csOVOSVM, and csFTSVM with RBF kernel, from the embedded slide “Are cost-sensitive algorithms great?” (Jan et al., Academia Sinica, Cost-Sensitive Classification on SERS, October 31, 2011)]
cost-sensitive algorithms perform better than the regular algorithm
(Jan et al., BIBM 2011)
cost-sensitive better than traditional;
but why are people still not
using those cool ML works for their AI? :-)
Issue 3: Error Rate of Cost-Sensitive Classifiers
The Problem
[Figure: classifiers plotted by Error (%) and Cost]
• cost-sensitive classifier: low cost but high error rate
• traditional classifier: low error rate but high cost
• how can we get the blue classifiers?: low error rate and low cost
cost-and-error-sensitive:
more suitable for real-world medical needs
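To make the trade-off between the two measures concrete, the sketch below computes both the error rate and the average cost of a set of predictions. It reuses the three-class cost matrix from the earlier patient-status slide rather than the bacteria data, and the two prediction lists are invented purely for illustration.

```python
# Cost matrix from the earlier patient-status slide: COST[actual][predicted].
COST = {
    "COVID19": {"COVID19": 0, "cold": 1000, "healthy": 100000},
    "cold": {"COVID19": 100, "cold": 0, "healthy": 3000},
    "healthy": {"COVID19": 100, "cold": 30, "healthy": 0},
}


def error_rate(actual, predicted):
    """Fraction of examples whose predicted class differs from the actual one."""
    return sum(a != p for a, p in zip(actual, predicted)) / len(actual)


def average_cost(actual, predicted, cost=COST):
    """Mean misclassification cost over all examples."""
    return sum(cost[a][p] for a, p in zip(actual, predicted)) / len(actual)


# Hypothetical outputs of two classifiers on four patients.
actual = ["COVID19", "cold", "healthy", "healthy"]
traditional = ["healthy", "cold", "healthy", "healthy"]
cost_sensitive = ["COVID19", "COVID19", "COVID19", "healthy"]

for name, predicted in [("traditional", traditional),
                        ("cost-sensitive", cost_sensitive)]:
    print(name, error_rate(actual, predicted), average_cost(actual, predicted))
```

In this toy case the traditional predictions have the lower error rate (0.25 versus 0.5) but the higher average cost (25000 versus 50), which is the tension a cost-and-error-sensitive classifier tries to resolve.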
Improved Classifier for Both Cost and Error
(Jan et al., KDD 2012)
[Table: per-data-set comparison of Cost and of Error on iris, wine, glass, vehicle, vowel, segment, dna, satimage, usps, zoo, splice, ecoli, and soybean; in the Cost column, iris, wine, glass, vehicle, satimage, splice, ecoli, and soybean are marked ≈]
now, are people using those cool ML works for their AI? :-)
Lessons Learned from
Research on Cost-Sensitive Multiclass Classification
? H7N9-infected cold-infected healthy
See Page 16 of the Slides for Sources of the Pictures
1 more realistic (generic) in academia ≠ more realistic (feasible) in application
e.g. the ‘cost’ of inputting a cost matrix? :-)
2 cross-domain collaboration important
e.g. getting the ‘cost matrix’ from domain experts
3 not easy to win human trust
—humans are somewhat multi-objective
Active Learning by Learning
Active Learning: Learning by ‘Asking’
labeling is expensive:
active learning ‘question asking’
—query $y_n$ of chosen $\mathbf{x}_n$
unknown target function f : X → Y
labeled training examples $(\mathbf{x}_n, y_n)$ with $y_n \in \{+1, -1\}$ → learning algorithm A → final hypothesis g ≈ f
active: improve hypothesis with fewer labels (hopefully) by asking questions strategically
Pool-Based Active Learning Problem
Given
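• labeled pool $\mathcal{D}_l = \{(\text{feature } \mathbf{x}_n, \text{label } y_n \text{ (e.g. IsApple?)})\}_{n=1}^{N}$
• unlabeled pool $\mathcal{D}_u = \{\tilde{\mathbf{x}}_s\}_{s=1}^{S}$
Goal
design an algorithm that iteratively
1 strategically queries some $\tilde{\mathbf{x}}_s$ to get the associated $\tilde{y}_s$
2 moves $(\tilde{\mathbf{x}}_s, \tilde{y}_s)$ from $\mathcal{D}_u$ to $\mathcal{D}_l$
3 learns classifier $g^{(t)}$ from $\mathcal{D}_l$
and improves the test accuracy of $g^{(t)}$ w.r.t. #queries
how to query strategically?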
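The three steps above can be sketched as a small loop. This is a toy illustration under strong assumptions, not the setup from the talk: features are one-dimensional, the classifier is a single threshold, the `oracle` function is a hypothetical stand-in for the human labeler, and queries are chosen by closeness to the current threshold.

```python
def fit(labeled):
    """Learn g: a threshold halfway between the largest -1 and smallest +1 point.

    Assumes 1-D features where every -1 example lies below every +1 example.
    """
    largest_negative = max(x for x, y in labeled if y == -1)
    smallest_positive = min(x for x, y in labeled if y == +1)
    return (largest_negative + smallest_positive) / 2


def oracle(x):
    """Hypothetical labeler standing in for the human who answers queries."""
    return +1 if x >= 0.62 else -1


def most_uncertain(threshold, unlabeled):
    """Pick the unlabeled point closest to the current decision threshold."""
    return min(unlabeled, key=lambda x: abs(x - threshold))


def active_learning(labeled, unlabeled, n_queries):
    labeled, unlabeled = list(labeled), list(unlabeled)  # work on copies
    g = fit(labeled)
    for _ in range(n_queries):
        x = most_uncertain(g, unlabeled)  # 1: strategically choose a query
        y = oracle(x)                     #    and obtain its label
        unlabeled.remove(x)               # 2: move (x, y) from D_u to D_l
        labeled.append((x, y))
        g = fit(labeled)                  # 3: learn the classifier from D_l
    return g, labeled, unlabeled


D_l = [(0.0, -1), (1.0, +1)]              # initial labeled pool
D_u = [i / 20 for i in range(1, 20)]      # unlabeled pool
g, D_l_final, D_u_final = active_learning(D_l, D_u, n_queries=4)
print(g, len(D_l_final), len(D_u_final))
```

In this toy setting, querying near the threshold behaves like a binary search, so a handful of labels is enough to move the learned threshold much closer to the oracle's boundary than the initial guess of 0.5.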
How to Query Strategically?
Strategy 1
ask most confused question
Strategy 2
ask most frequent question
Strategy 3
ask most debateful question
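One plausible reading of the three strategies above is uncertainty sampling, density-based sampling, and query by committee. The sketch below implements a simple version of each on one-dimensional data; this mapping, the function names, the `bandwidth` parameter, and the toy candidates are all assumptions made for illustration.

```python
import math
from collections import Counter


def most_confused(candidates, prob_positive):
    """Uncertainty sampling: pick the x whose P(y=+1 | x) is closest to 0.5."""
    return min(candidates, key=lambda x: abs(prob_positive(x) - 0.5))


def most_frequent(candidates, bandwidth=0.1):
    """Density-style pick: the x with the most candidates within `bandwidth`."""
    def neighbours(x):
        return sum(abs(x - other) <= bandwidth for other in candidates)
    return max(candidates, key=neighbours)


def vote_entropy(votes):
    """Entropy of the committee's votes; zero when all members agree."""
    total = len(votes)
    return -sum((c / total) * math.log(c / total)
                for c in Counter(votes).values())


def most_debated(candidates, committee):
    """Query by committee: pick the x on which the classifiers disagree most."""
    return max(candidates,
               key=lambda x: vote_entropy([g(x) for g in committee]))


# Hypothetical candidates, probability model, and two-member committee.
candidates = [0.1, 0.45, 0.8, 0.85, 0.9]
committee = [lambda x, t=t: +1 if x >= t else -1 for t in (0.87, 0.95)]

picks = (
    most_confused(candidates, prob_positive=lambda x: x),
    most_frequent(candidates, bandwidth=0.07),
    most_debated(candidates, committee),
)
print(picks)
```

On this toy pool the three rules select three different points, which mirrors why committing to one single strategy in advance is hard.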
• choosing one single strategy is non-trivial:
[Figures: three plots of Accuracy versus % of unlabelled data, each comparing RAND, UNCERTAIN, PSDS, and QUIRE]
application intelligence: how to choose strategy smartly?
Idea: Trial-and-Reward Like Human
when do humans trial-and-reward?
gambling
K strategies:
A1, A2, · · · , AK
try one strategy
“goodness” of strategy as reward
K bandit machines:
B1, B2, · · · , BK
try one bandit machine
“luckiness” of machine as reward
intelligent choice of strategy
⟹ intelligent choice of bandit machine
Active Learning by Learning
(Hsu and Lin, AAAI 2015)
K strategies:
A1, A2, · · · , AK
try one strategy
“goodness” of strategy as reward
Given: K existing active learning strategies
for t = 1, 2, . . . , T
1 let some bandit model decide strategy Ak to try
2 query the $\tilde{\mathbf{x}}_s$ suggested by Ak, and compute $g^{(t)}$
3 evaluate goodness of $g^{(t)}$ as reward of trial to update model
proposed Active Learning by Learning (ALBL):
motivated but unrigorous reward design
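The loop above can be sketched with a standard adversarial bandit. The code below uses EXP3 as a stand-in bandit model and fixed, hypothetical per-strategy rewards in place of actually querying and evaluating a classifier; it is not claimed to be the bandit algorithm or the reward design used in ALBL.

```python
import math
import random


def arm_probabilities(weights, gamma):
    """EXP3 mixing: mostly follow the weights, explore uniformly at rate gamma."""
    total = sum(weights)
    k = len(weights)
    return [(1 - gamma) * w / total + gamma / k for w in weights]


def choose_strategies(rewards, rounds, gamma=0.1, seed=0):
    """Run EXP3 over K strategies; rewards[k] in [0, 1] is strategy k's payoff."""
    rng = random.Random(seed)
    k = len(rewards)
    weights = [1.0] * k
    for _ in range(rounds):
        probs = arm_probabilities(weights, gamma)
        arm = rng.choices(range(k), weights=probs)[0]  # 1: bandit picks a strategy
        reward = rewards[arm]       # 2-3: stand-in for querying and scoring g(t)
        estimate = reward / probs[arm]                 # importance-weighted reward
        weights[arm] *= math.exp(gamma * estimate / k)
        top = max(weights)
        weights = [w / top for w in weights]           # rescale to avoid overflow
    return arm_probabilities(weights, gamma)


# Hypothetical "goodness" of three strategies; the second one is best.
HYPOTHETICAL_REWARDS = [0.2, 0.8, 0.5]
final_probs = choose_strategies(HYPOTHETICAL_REWARDS, rounds=2000)
print([round(p, 3) for p in final_probs])
```

The bandit never needs to know in advance which strategy is best; it shifts probability toward whichever strategy has been paying off, while the `gamma` term keeps a small amount of exploration of the others.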
Comparison with Single Strategies
[Figures: Accuracy versus % of unlabelled data for ALBL, RAND, UNCERTAIN, PSDS, and QUIRE on three data sets: vehicle (UNCERTAIN best), sonar (PSDS best), and diabetes (QUIRE best)]
• no single best strategy for every data set
—choosing needed
• proposed ALBL consistently matches the best
—similar findings across other data sets
‘application intelligence’ outcome:
open-source tool released
(https://github.com/ntucllab/libact)
Have We Made Active Learning More Realistic? (1/2)
Yes!
open-source tool libact developed (Yang et al., 2017) https://github.com/ntucllab/libact
• including uncertainty, QUIRE, PSDS, ..., and ALBL
• received >500 stars and continuous issues
“libact is a Python package designed to make active learning easier for real-world users”
Have We Made Active Learning More Realistic? (2/2)
No!
• single most-raised issue: hard to install on Windows/Mac
—because several strategies require some C packages
• performance in a recent industry project:
• uncertainty sampling often suffices
• ALBL dragged down by bad strategy
“libact is a Python package designed to make active learning easier for real-world users”
Lessons Learned from
Research on Active Learning by Learning
by DFID - UK Department for International Development;
licensed under CC BY-SA 2.0 via Wikimedia Commons
1 scalability bottleneck of ‘application intelligence’:
choice of methods/models/parameters/...
2 think outside of the math box:
‘unrigorous’ usage may be good enough
3 important to be brave yet patient
• idea: 2012
• paper: (Hsu and Lin, AAAI 2015); software: (Yang et al., 2017)
4 easy-to-use in design ≠ easy-to-use in reality
Tropical Cyclone Intensity Estimation
Experienced Meteorologists Can ‘Feel’ and Estimate Tropical Cyclone Intensity from Image
Can ML do the same/better?
• lack of ML-ready datasets
• lack of model that properly utilizes domain knowledge
issues addressed in our latest works
(Chen et al., KDD ’18; Chen et al., Weather & Forecasting ’19)
Recall: Flow behind Our Proposed Model
[Diagram: TC images → ML → intensity estimation; TC images → human learning/analysis → domain knowledge (HI), which feeds ML (CNN, polar rotation invariance) and the current weather system]
is proposed CNN-TC better than current weather system?
Results
            RMS Error
ADT             11.75
AMSU            14.40
SATCON           9.66
CNN-TC           9.03
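For reference, the RMS (root-mean-square) error reported above is the square root of the mean squared difference between estimated and true intensities. The sketch below shows the computation; the function name `rms_error` and the sample values are hypothetical and unrelated to the numbers in the table.

```python
import math


def rms_error(estimates, truths):
    """Root-mean-square error between paired estimates and true values."""
    if len(estimates) != len(truths) or not estimates:
        raise ValueError("need two non-empty sequences of equal length")
    squared = [(e - t) ** 2 for e, t in zip(estimates, truths)]
    return math.sqrt(sum(squared) / len(squared))


# Hypothetical intensity estimates and ground-truth values.
example = rms_error([50, 65, 90], [53, 61, 90])
print(round(example, 3))
```

Because the errors are squared before averaging, a few large misses raise the RMS error more than many small ones.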
CNN-TC much better than current weather system (SATCON)
why are people not using this cool ML model? :-)
Lessons Learned from
Research on Tropical Cyclone Intensity Estimation
1 again, cross-domain collaboration important
e.g. even from ‘organizing data’ to be ML-ready
2 not easy to claim production ready
—can ML be used for ‘unseenly-strong TC’?
3 good AI system requires both human and machine learning
—still an ‘art’ to blend the two
Outline
ML for (Modern) AI
ML Research for Modern AI
ML for Future AI
ML for Future AI
AI: Now and Next
2010–2015: AI | AI becomes promising, e.g.
• initial success of deep learning on ImageNet
• mature tools for SVM (LIBSVM) and others
2016–2020: AI + AI becomes competitive, e.g.
• super-human performance of AlphaGo and others
• all big technology companies become AI-first
2021–: AI × AI becomes necessary
• “You’ll not be replaced by AI, but by humans who know how to use AI”
(Sun, Chief AI Scientist of Appier, 2018)
Needs of ML for Future AI
more creative: win human respect
e.g. Appier’s 2018 work on designing matching clothes
(Shih et al., AAAI 2018)
more explainable: win human trust
e.g. my students’ work on automatic bridge bidding
(Yeh et al., IEEE ToG 2018)
more interactive: win human heart
e.g. my student’s work (w/ DeepQ) on efficient disease diagnosis
(Peng et al., NeurIPS 2018)
Summary
• ML for (Modern) AI:
tools + human knowledge ⇒ easy-to-use application
• ML Research for Modern AI:
need to be more open-minded
—in methodology, in collaboration, in KPI
• ML for Future AI:
crucial to be ‘human-centric’
Thank you! Questions?