Active Learning by Learning
Hsuan-Tien Lin (林軒田) htlin@csie.ntu.edu.tw
Department of Computer Science
& Information Engineering
National Taiwan University ( 國立台灣大學資訊工程系)
2015 IR Workshop, IIS Sinica, Taiwan
joint work with Wei-Ning Hsu, presented at AAAI 2015
About Me
Hsuan-Tien Lin
• Associate Professor, Dept. of CSIE, National Taiwan University
• Leader of the Computational Learning Laboratory
• Co-author of the textbook “Learning from Data: A Short Course” (often an ML best seller on Amazon)
• Instructor of the NTU-Coursera Mandarin-teaching ML massive open online courses:
  • “Machine Learning Foundations”: www.coursera.org/course/ntumlone
  • “Machine Learning Techniques”: www.coursera.org/course/ntumltwo
Hsuan-Tien Lin (NTU CSIE) Active Learning by Learning 1/18
Active Learning
Apple Recognition Problem
Note: Slide Taken from my “ML Techniques” MOOC
• need an apple classifier: is this a picture of an apple?
• gather photos under the CC BY 2.0 license on Flickr (thanks to the authors below!) and label them as apple/other for learning
(APAL stands for Apple and Pear Australia Ltd)
Photo credits: Dan Foy (https://flic.kr/p/jNQ55), APAL (https://flic.kr/p/jzP1VB), adrianbartel (https://flic.kr/p/bdy2hZ), ANdrzej cH. (https://flic.kr/p/51DKA8), Stuart Webster (https://flic.kr/p/9C3Ybd), nachans (https://flic.kr/p/9XD7Ag), APAL (https://flic.kr/p/jzRe4u), Jo Jakeman (https://flic.kr/p/7jwtGp), APAL (https://flic.kr/p/jzPYNr), APAL (https://flic.kr/p/jzScif)
Photo credits: Mr. Roboto. (https://flic.kr/p/i5BN85), Richard North (https://flic.kr/p/bHhPkB), Richard North (https://flic.kr/p/d8tGou), Emilian Robert Vicol (https://flic.kr/p/bpmGXW), Nathaniel McQueen (https://flic.kr/p/pZv1Mf), Crystal (https://flic.kr/p/kaPYp), jfh686 (https://flic.kr/p/6vjRFH), skyseeker (https://flic.kr/p/2MynV), Janet Hudson (https://flic.kr/p/7QDBbm), Rennett Stowe (https://flic.kr/p/agmnrk)
Active Learning
Batch (Traditional) Machine Learning
Note: Flow Taken from my “ML Foundations” MOOC
unknown target function f : X → Y
↓
training examples D : (x1, y1), · · · , (xN, yN)
↓
learning algorithm A (searching hypothesis set H)
↓
final hypothesis g ≈ f

batch supervised classification: learn from fully labeled data

Active Learning
Active Learning: Learning by ‘Asking’
but labeling is expensive

Protocol ⇔ Learning Philosophy
• batch: ‘duck feeding’
• active: ‘question asking’ (iteratively), querying the label yn of a chosen xn

unknown target function f : X → Y
↓
labeled training examples
↓
learning algorithm A (searching hypothesis set H)
↓
final hypothesis g ≈ f

active: improve the hypothesis with fewer labels (hopefully) by asking questions strategically
Active Learning
Pool-Based Active Learning Problem
Given
• labeled pool Dl = {(feature xn, label yn (e.g. IsApple?))} for n = 1, · · · , N
• unlabeled pool Du = {x̃s} for s = 1, · · · , S

Goal
design an algorithm that iteratively
1. strategically queries some x̃s to get its associated ỹs
2. moves (x̃s, ỹs) from Du to Dl
3. learns classifier g(t) from Dl
and improves the test accuracy of g(t) w.r.t. #queries

how to query strategically?
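The iterative protocol above can be sketched as a short loop. This is an illustrative sketch only, not the paper's implementation: the query strategy (random here), the majority-vote "learner", and the oracle dictionary are all placeholder assumptions.

```python
import random

def pool_based_active_learning(D_l, D_u, query_strategy, learn, T):
    """Generic pool-based active learning loop (illustrative sketch).

    D_l: list of labeled (x, y) pairs; D_u: dict mapping each unlabeled
    x to its hidden label, standing in for a costly labeling oracle.
    """
    classifiers = []
    for t in range(T):
        if not D_u:
            break
        x_s = query_strategy(D_l, sorted(D_u))   # 1. strategically pick x~_s
        y_s = D_u.pop(x_s)                       # 2. query y~_s, move it to D_l
        D_l.append((x_s, y_s))
        classifiers.append(learn(D_l))           # 3. learn g^(t) from D_l
    return D_l, classifiers

# Toy run on 1-D data: random querying plus a majority-vote "learner"
# that predicts whichever label is more common in the labeled pool.
def majority(D_l):
    votes = sum(y for _, y in D_l)
    return (lambda x: 1) if votes >= 0 else (lambda x: -1)

rng = random.Random(0)
labeled = [(0.9, 1), (0.0, -1)]
unlabeled = {x / 10: (1 if x >= 5 else -1) for x in range(1, 9)}
D_l, gs = pool_based_active_learning(
    labeled, unlabeled, lambda dl, du: rng.choice(du), majority, T=5)
```

After five iterations the labeled pool has grown by five examples and one classifier per iteration has been trained.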
Active Learning
How to Query Strategically?
by DFID - UK Department for International Development;
licensed under CC BY-SA 2.0 via Wikimedia Commons
Strategy 1: ask the most confused question
Strategy 2: ask the most frequent question
Strategy 3: ask the most helpful question

do you use a fixed strategy in practice?
Active Learning
Choice of Strategy
Strategy 1 (uncertainty): ask the most confused question
Strategy 2 (representative): ask the most frequent question
Strategy 3 (expected-error reduction): ask the most helpful question

• choosing one single strategy is non-trivial:
[figure: query-accuracy curves of RAND, UNCERTAIN, PSDS and QUIRE on three data sets; x-axis: % of unlabelled data queried, y-axis: accuracy; the winning strategy differs across data sets]
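Strategy 1 (uncertainty) is the easiest to make concrete: query the instance the current classifier is least sure about. A minimal sketch, assuming a binary classifier that exposes a hypothetical predict_proba-style scorer (not from the talk):

```python
import math

def uncertainty_query(predict_proba, unlabeled_pool):
    """Pick the unlabeled instance closest to the decision boundary.

    predict_proba(x) -> estimated probability that x is positive
    (an assumed interface); uncertainty peaks where it is near 0.5.
    """
    return min(unlabeled_pool, key=lambda x: abs(predict_proba(x) - 0.5))

# Toy example: a logistic scorer on 1-D inputs with boundary at 0.
proba = lambda x: 1.0 / (1.0 + math.exp(-4.0 * x))
pool = [-2.0, -0.1, 0.8, 3.0]
chosen = uncertainty_query(proba, pool)  # -0.1 sits nearest the boundary
```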
• human-designed strategies are heuristic and confine the machine’s ability

can we free the machine by letting it learn to choose among the strategies?

Active Learning
Our Contributions
a philosophical and algorithmic study of active learning, which ...
• allows the machine to make intelligent choices of strategies, just like my cute daughter
• studies a sound feedback scheme to tell the machine about the goodness of its choices, just like what I do
• results in promising active learning performance, just like (hopefully) the bright future of my daughter

will describe the key philosophical ideas behind our proposed approach
Online Choice of Strategy
Idea: Trial-and-Reward Like Human
K strategies: A1, A2, · · · , AK
try one strategy; take the “goodness” of the strategy as reward

two issues: how to try, and how to reward
Online Choice of Strategy
Reduction to Bandit
when do humans trial-and-reward? gambling

K strategies: A1, A2, · · · , AK: try one strategy, take the “goodness” of the strategy as reward
K bandit machines: B1, B2, · · · , BK: try one bandit machine, take the “luckiness” of the machine as reward

will take one well-known probabilistic bandit learner (EXP4.P)
intelligent choice of strategy =⇒ intelligent choice of bandit machine
Online Choice of Strategy
Active Learning by Learning
K strategies: A1, A2, · · · , AK
try one strategy; take the “goodness” of the strategy as reward

Given: K existing active learning strategies
for t = 1, 2, . . . , T
1. let EXP4.P decide strategy Ak to try
2. query the x̃s suggested by Ak, and compute g(t)
3. evaluate the goodness of g(t) as the reward of the trial to update EXP4.P

only remaining problem: what reward?
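The for-loop above can be mimicked with a simplified exponential-weights bandit. The sketch below uses an EXP3-flavored update rather than the full EXP4.P of the paper, and the strategies and reward function are toy placeholders:

```python
import math
import random

def bandit_choice_of_strategy(strategies, reward_of, T, eta=0.3, seed=0):
    """Choose among K strategies online via exponential weights.

    strategies: callables, each proposing a query when invoked;
    reward_of: maps the resulting query to a reward in [0, 1].
    Simplified EXP3-style stand-in for the EXP4.P learner in ALBL.
    """
    rng = random.Random(seed)
    K = len(strategies)
    w = [1.0] * K
    choices = []
    for t in range(T):
        total = sum(w)
        p = [wk / total for wk in w]
        k = rng.choices(range(K), weights=p)[0]  # sample strategy A_k
        query = strategies[k]()                   # A_k suggests a query
        r = reward_of(query)                      # goodness of the trial
        w[k] *= math.exp(eta * r / p[k])          # importance-weighted update
        choices.append(k)
    return choices, w

# Toy run: strategy 0 always earns reward, strategy 1 never does,
# so the bandit shifts its weight toward strategy 0.
choices, w = bandit_choice_of_strategy(
    [lambda: "hard", lambda: "easy"],
    lambda q: 1.0 if q == "hard" else 0.0,
    T=50)
```

Dividing the reward by the sampling probability p[k] is what keeps the update unbiased even though only the chosen arm is observed, the same idea the next slides apply to the reward itself.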
Design of Reward
Ideal Reward
ideal reward after updating classifier g(t) by the query (xnt, ynt): accuracy

(1/M) ∑_{m=1}^{M} ⟦ym = g(t)(xm)⟧ on test set {(xm, ym)}_{m=1}^{M}

• test accuracy as reward: area under the query-accuracy curve ≡ cumulative reward

[figure: query-accuracy curves of RAND, UNCERTAIN, PSDS and QUIRE; the area under each curve is the cumulative reward]

• test accuracy infeasible in practice: labeling is expensive, remember?

difficulty: approximate test accuracy on the fly
Design of Reward
Training Accuracy as Reward
test accuracy (1/M) ∑_{m=1}^{M} ⟦ym = g(t)(xm)⟧ infeasible; naïve replacement: accuracy

(1/t) ∑_{τ=1}^{t} ⟦ynτ = g(t)(xnτ)⟧ on labeled pool {(xnτ, ynτ)}_{τ=1}^{t}

• training accuracy as reward: training accuracy ≈ test accuracy?
• not necessarily! for an active learning strategy that asks the easiest questions:
  • training accuracy high: xnτ easy to label
  • test accuracy low: not enough information about harder instances

training accuracy: too biased to approximate test accuracy
Design of Reward
Weighted Training Accuracy as Reward
training accuracy (1/t) ∑_{τ=1}^{t} ⟦ynτ = g(t)(xnτ)⟧ biased; want an unbiased estimator

• non-uniform sampling theorem: if (xnτ, ynτ) is sampled with probability pτ > 0 from the data set {(xn, yn)}_{n=1}^{N} in iteration τ, the weighted training accuracy

(1/t) ∑_{τ=1}^{t} (1/pτ) ⟦ynτ = g(xnτ)⟧ ≈ (1/N) ∑_{n=1}^{N} ⟦yn = g(xn)⟧ in expectation

• with a probabilistic query like EXP4.P: weighted training accuracy ≈ test accuracy

weighted training accuracy: unbiased approximation of test accuracy on the fly
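The estimator on this slide is straightforward to compute once each query records its sampling probability. A minimal sketch, following the slide's formula (the data and the classifier are toy placeholders):

```python
def weighted_training_accuracy(queries, g):
    """Importance-weighted accuracy over the queried examples.

    queries: list of (x, y, p) triples, where p > 0 is the probability
    with which that example was drawn at its iteration; g is the
    current classifier. Each correct prediction counts 1/p, following
    the slide's non-uniform sampling identity.
    """
    t = len(queries)
    return sum((1.0 / p) * (1 if g(x) == y else 0)
               for x, y, p in queries) / t

# Toy check: a classifier that always predicts +1.
g = lambda x: 1
qs = [(0.2, 1, 0.5), (0.7, -1, 0.25), (0.9, 1, 0.5)]
wacc = weighted_training_accuracy(qs, g)  # (1/0.5 + 0 + 1/0.5) / 3
```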
Design of Reward
Human-Designed Criterion as Reward
(Baram et al., 2004) COMB approach: bandit + balancedness of g(t) on unlabeled data as reward
• why? a human-designed criterion that matches the classifier to a domain assumption
• but many active learning applications are on unbalanced data! The assumption may be unrealistic.

existing strategies: active learning by acting;
COMB: active learning by acting;
ours: active learning by learning
Experiments
Comparison with Single Strategies
[figure: query-accuracy curves of ALBL, RAND, UNCERTAIN, PSDS and QUIRE on three data sets; the best single strategy differs: UNCERTAIN on vehicle, PSDS on sonar, QUIRE on diabetes]
• no single best strategy for every data set: choosing/blending is needed
• ALBL consistently matches the best
(similar findings across other data sets)

ALBL: effective in making intelligent choices
Experiments
Comparison with Other Adaptive Blending Algorithms
[figure: query-accuracy curves of ALBL, COMB and ALBL-Train; ALBL ≈ COMB on diabetes, ALBL > COMB on sonar]
• ALBL > ALBL-Train generally: the importance-weighted mechanism is needed to correct the biased training accuracy
• ALBL consistently comparable to or better than COMB: learning performance is more useful than a human-designed criterion

ALBL: effective in utilizing performance
Conclusion
Conclusion
Active Learning by Learning
• based on bandit learning + an unbiased performance estimator as reward
• effective in making intelligent choices: comparable or superior to the best of the existing strategies
• effective in utilizing learning performance: superior to human-criterion-based blending

New Directions
• open-source tool being developed
• extending to more sophisticated active learning problems
Thank you! Questions?