Basics of Machine Learning
(Introduction to Machine Learning)
Hsuan-Tien Lin (林軒田) htlin@csie.ntu.edu.tw
Department of Computer Science & Information Engineering
National Taiwan University
The Learning Problem
More about Me
Associate Professor, Dept. CSIE, National Taiwan University
• graduated from ck49th314, 1997
• co-leader of KDDCup world champion teams at NTU: 2010–2013
• Secretary General, Taiwanese Association for Artificial Intelligence
• co-author of bestseller ML textbook “Learning from Data”
• instructor of Mandarin-teaching MOOC of Machine Learning on NTU-Coursera: 2013.11–
  https://www.coursera.org/course/ntumlone
The Learning Problem What is Machine Learning
From Learning to Machine Learning
learning: acquiring skill with experience accumulated from observations
observations → learning → skill
machine learning: acquiring skill with experience accumulated/computed from data
data → ML → skill
What is skill?
A More Concrete Definition
skill ⇔ improve some performance measure (e.g. prediction accuracy)
machine learning: improving some performance measure with experience computed from data
data → ML → improved performance measure
An Application in Computational Finance
stock data → ML → more investment gain
Why use machine learning?
Yet Another Application: Tree Recognition
• ‘define’ trees and hand-program: difficult
• learn from data (observations) and recognize: a 3-year-old can do so
• ‘ML-based tree recognition system’ can be easier to build than hand-programmed system
ML: an alternative route to build complicated systems
The Machine Learning Route
ML: an alternative route to build complicated systems
Some Use Scenarios
• when humans cannot program the system manually: navigating on Mars
• when humans cannot ‘define the solution’ easily: speech/visual recognition
• when needing rapid decisions that humans cannot make: high-frequency trading
• when needing to be user-oriented on a massive scale: consumer-targeted marketing
Give a computer a fish, you feed it for a day; teach it how to fish, you feed it for a lifetime. :-)
Key Essence of Machine Learning
machine learning: improving some performance measure with experience computed from data
data → ML → improved performance measure
1. exists some ‘underlying pattern’ to be learned, so ‘performance measure’ can be improved
2. but no programmable (easy) definition, so ‘ML’ is needed
3. somehow there is data about the pattern, so ML has some ‘inputs’ to learn from
key essence: help decide whether to use ML
The Learning Problem Applications of Machine Learning
Daily Needs: Food, Clothing, Housing, Transportation
data → ML → skill
1. Food (Sadilek et al., 2013)
   • data: Twitter data (words + location)
   • skill: tell the food-poisoning likelihood of a restaurant properly
2. Clothing (Abu-Mostafa, 2012)
   • data: sales figures + client surveys
   • skill: give good fashion recommendations to clients
3. Housing (Tsanas and Xifara, 2012)
   • data: characteristics of buildings and their energy load
   • skill: predict energy load of other buildings closely
4. Transportation (Stallkamp et al., 2012)
   • data: some traffic sign images and meanings
   • skill: recognize traffic signs accurately
ML is everywhere!
Education
data → ML → skill
• data: students’ records on quizzes on a Math tutoring system
• skill: predict whether a student can give a correct answer to another quiz question
A Possible ML Solution
answer correctly ≈ ⟦recent strength of student > difficulty of question⟧
• give ML 9 million records from 3000 students
• ML determines (reverse-engineers) strength and difficulty automatically
key part of the world-champion system from National Taiwan Univ. in KDDCup 2010
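The rule "answer correctly ≈ ⟦strength of student > difficulty of question⟧" can be sketched in a few lines of Python. This is only a toy illustration, not the KDDCup 2010 system: the records, the names `predict` and `fit_strength_difficulty`, and the simple mistake-driven update are all assumptions made for the example.

```python
from collections import defaultdict

def predict(strength, difficulty, student, question):
    """Predict 'answers correctly' when strength exceeds difficulty."""
    return strength[student] > difficulty[question]

def fit_strength_difficulty(records, epochs=20, step=0.1):
    """Estimate strength and difficulty from (student, question, correct) records.

    Assumed update rule (not from the slides): whenever a prediction is wrong,
    nudge the student's strength and the question's difficulty toward the
    observed outcome.
    """
    strength = defaultdict(float)    # every student starts at 0.0
    difficulty = defaultdict(float)  # every question starts at 0.0
    for _ in range(epochs):
        for student, question, correct in records:
            if predict(strength, difficulty, student, question) != correct:
                direction = 1.0 if correct else -1.0
                strength[student] += step * direction
                difficulty[question] -= step * direction
    return strength, difficulty

# Hypothetical toy records: (student, question, answered correctly?)
records = [
    ("s1", "easy", True),
    ("s1", "hard", True),
    ("s2", "easy", True),
    ("s2", "hard", False),
]
strength, difficulty = fit_strength_difficulty(records)
```

After fitting, `predict(strength, difficulty, "s2", "hard")` reproduces the recorded outcome for that pair; a real system would evaluate on quiz questions the student has not yet answered.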
Entertainment: Recommender System (1/2)
data → ML → skill
• data: how many users have rated some movies
• skill: predict how a user would rate an unrated movie
A Hot Problem
• competition held by Netflix in 2006
  • 100,480,507 ratings that 480,189 users gave to 17,770 movies
  • 10% improvement = 1 million dollar prize
• similar competition (movies → songs) held by Yahoo! in KDDCup 2011
  • 252,800,275 ratings that 1,000,990 users gave to 624,961 songs
How can machines learn our preferences?
Entertainment: Recommender System (2/2)
[Figure: Match movie and viewer factors. Movie factors (comedy content, action content, blockbuster?, Tom Cruise in it?) are matched against viewer factors (likes comedy?, likes action?, prefers blockbusters?, likes Tom Cruise?); add contributions from each factor to get the predicted rating.]
A Possible ML Solution
• pattern: rating ← viewer/movie factors
• learning: known rating → learned factors → unknown rating prediction
key part of the world-champion (again!) system from National Taiwan Univ. in KDDCup 2011
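The flow "known rating → learned factors → unknown rating prediction" can be sketched as a small factor model: each viewer and each movie gets a short vector of factors, and the predicted rating adds up the per-factor contributions (a dot product). The sketch below is an illustration under stated assumptions, not the KDDCup 2011 system: the toy ratings, the function names, the number of factors, and the plain stochastic gradient descent on squared error are all choices made for the example.

```python
import random

def dot(a, b):
    """Add the contributions from each factor."""
    return sum(x * y for x, y in zip(a, b))

def squared_error(ratings, viewer, movie):
    """Total squared error of the factor model on (viewer, movie, rating) triples."""
    return sum((r - dot(viewer[u], movie[m])) ** 2 for u, m, r in ratings)

def learn_factors(ratings, n_factors=2, epochs=500, step=0.02, seed=0):
    """Learn viewer/movie factors from known ratings with SGD (assumed method)."""
    rng = random.Random(seed)
    viewer = {u: [rng.uniform(0.1, 0.5) for _ in range(n_factors)] for u, _, _ in ratings}
    movie = {m: [rng.uniform(0.1, 0.5) for _ in range(n_factors)] for _, m, _ in ratings}
    initial = squared_error(ratings, viewer, movie)
    for _ in range(epochs):
        for u, m, r in ratings:
            err = r - dot(viewer[u], movie[m])
            for k in range(n_factors):
                vu, mk = viewer[u][k], movie[m][k]
                viewer[u][k] += step * err * mk  # move viewer factor to reduce error
                movie[m][k] += step * err * vu   # move movie factor to reduce error
    return viewer, movie, initial

# Hypothetical known ratings on a 1-5 scale.
known_ratings = [
    ("ann", "action_movie", 5), ("ann", "comedy_movie", 1), ("ann", "blockbuster", 4),
    ("bob", "action_movie", 4), ("bob", "comedy_movie", 2),
    ("cat", "action_movie", 1), ("cat", "comedy_movie", 5), ("cat", "blockbuster", 4),
]
viewer, movie, initial_error = learn_factors(known_ratings)
final_error = squared_error(known_ratings, viewer, movie)

# Prediction for a pair that has no known rating.
predicted = dot(viewer["bob"], movie["blockbuster"])
```

Here `predicted` is the model's guess for a rating that was never observed, which is the ‘skill’ the slide describes.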
The Learning Problem Components of Machine Learning
Components of Learning: Metaphor Using Credit Approval
Applicant Information
age                23 years
gender             female
annual salary      NTD 1,000,000
year in residence  1 year
year in job        0.5 year
current debt       200,000
unknown pattern to be learned: ‘approve credit card good for bank?’
Formalize the Learning Problem
Basic Notations
• input: x ∈ X (customer application)
• output: y ∈ Y (good/bad after approving credit card)
• unknown pattern to be learned ⇔ target function: f : X → Y (ideal credit approval formula)
• data ⇔ training examples: D = {(x_1, y_1), (x_2, y_2), ···, (x_N, y_N)} (historical records in bank)
• hypothesis ⇔ skill with hopefully good performance: g : X → Y (‘learned’ formula to be used)
{(x_n, y_n)} from f → ML → g
Learning Flow for Credit Approval
unknown target function f : X → Y (ideal credit approval formula)
→ training examples D : (x_1, y_1), ···, (x_N, y_N) (historical records in bank)
→ learning algorithm A
→ final hypothesis g ≈ f (‘learned’ formula to be used)
• target f unknown (i.e. no programmable definition)
• hypothesis g hopefully ≈ f but possibly different from f (perfection ‘impossible’ when f unknown)
What does g look like?
The Learning Model
training examples D : (x_1, y_1), ···, (x_N, y_N) (historical records in bank)
→ learning algorithm A ← hypothesis set H (set of candidate formulas)
→ final hypothesis g ≈ f (‘learned’ formula to be used)
• assume g ∈ H = {h_k}, i.e. approving if
  • h_1: annual salary > NTD 800,000
  • h_2: debt > NTD 100,000 (really?)
  • h_3: year in job ≤ 2 (really?)
• hypothesis set H:
  • can contain good or bad hypotheses
  • up to A to pick the ‘best’ one as g
learning model = A and H
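The pairing "learning model = A and H" can be made concrete with the three candidate rules h_1, h_2, h_3 from the slide. The sketch below is a minimal illustration: the four bank records are hypothetical, and "pick the hypothesis with the lowest training error" is just one simple choice for the algorithm A, assumed here for the example.

```python
# Hypothesis set H: the three candidate approval rules from the slide.
def h1(applicant):
    return applicant["annual_salary"] > 800_000

def h2(applicant):
    return applicant["debt"] > 100_000

def h3(applicant):
    return applicant["years_in_job"] <= 2

hypothesis_set = {"h1": h1, "h2": h2, "h3": h3}

def training_error(h, examples):
    """Fraction of training examples (x, y) on which h(x) disagrees with y."""
    return sum(h(x) != y for x, y in examples) / len(examples)

def learning_algorithm(hypotheses, examples):
    """A (assumed): return the name and rule with the lowest training error."""
    best = min(hypotheses, key=lambda name: training_error(hypotheses[name], examples))
    return best, hypotheses[best]

# Hypothetical historical records D: (application, was approval good for the bank?)
D = [
    ({"annual_salary": 1_000_000, "debt": 200_000, "years_in_job": 0.5}, True),
    ({"annual_salary": 600_000, "debt": 150_000, "years_in_job": 1}, False),
    ({"annual_salary": 900_000, "debt": 50_000, "years_in_job": 5}, True),
    ({"annual_salary": 300_000, "debt": 0, "years_in_job": 3}, False),
]

name, g = learning_algorithm(hypothesis_set, D)
```

On these made-up records the salary rule fits every example while the other two are wrong half the time, so A returns it as g. With different records, A could just as well pick another member of H.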
Practical Definition of Machine Learning
unknown target function f : X → Y (ideal credit approval formula)
→ training examples D : (x_1, y_1), ···, (x_N, y_N) (historical records in bank)
→ learning algorithm A ← hypothesis set H (set of candidate formulas)
→ final hypothesis g ≈ f (‘learned’ formula to be used)
machine learning: use data to compute hypothesis g that approximates target f
The Learning Problem Machine Learning and Other Fields
Machine Learning and Data Mining
Machine Learning
use data to compute hypothesis g that approximates target f
Data Mining
use (huge) data to find property that is interesting
• if ‘interesting property’ same as ‘hypothesis that approximates target’: ML = DM (usually what KDDCup does)
• if ‘interesting property’ related to ‘hypothesis that approximates target’: DM can help ML, and vice versa (often, but not always)
• traditional DM also focuses on efficient computation in large databases
difficult to distinguish ML and DM in reality
Machine Learning and Artificial Intelligence
Machine Learning
use data to compute hypothesis g that approximates target f
Artificial Intelligence
compute something that shows intelligent behavior
• g ≈ f is something that shows intelligent behavior: ML can realize AI, among other routes
• e.g. chess playing
  • traditional AI: game tree
  • ML for AI: ‘learning from board data’
ML is one possible route to realize AI
Machine Learning and Statistics
Machine Learning
use data to compute hypothesis g that approximates target f
Statistics
use data to make inference about an unknown process
• g is an inference outcome; f is something unknown: statistics can be used to achieve ML
• traditional statistics also focuses on provable results with math assumptions, and cares less about computation
statistics: many useful tools for ML
A Learning Puzzle
[Figure: example patterns labeled y_n = −1, example patterns labeled y_n = +1, and a new pattern x with g(x) = ?]
let’s test your ‘human learning’ with 6 examples :-)
Two Controversial Answers
whatever you say about g(x), ...
[Figure: the same examples labeled y_n = −1 and y_n = +1, and the new pattern x with g(x) = ?]
truth f(x) = +1 because . . .
• symmetry ⇔ +1
• (black or white count = 3) or (black count = 4 and middle-top black) ⇔ +1
truth f(x) = −1 because . . .
• left-top black ⇔ −1
• middle column contains at most 1 black and right-top white ⇔ −1
all valid reasons, your adversarial teacher can always call you ‘didn’t learn’. :-(
No Free Lunch Theorem
Without any assumptions on the learning problem at hand, all learning algorithms perform the same.
No algorithm is better for all learning problems
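A small enumeration can make the statement above tangible. The sketch below is an illustration rather than a proof, and rests on an assumption made here: with no knowledge of the target, every labeling of the inputs outside the training data is treated as equally possible. Under that assumption, any fixed set of predictions on the unseen inputs is right half the time on average, whatever algorithm produced it. The function name is hypothetical.

```python
from itertools import product

def average_off_training_accuracy(predictions):
    """Average accuracy of fixed predictions over all possible targets.

    Every labeling in {-1, +1} of the unseen inputs is counted once, i.e.
    nothing is assumed about the target function.
    """
    targets = list(product([-1, +1], repeat=len(predictions)))
    total_correct = sum(
        sum(p == t for p, t in zip(predictions, target)) for target in targets
    )
    return total_correct / (len(targets) * len(predictions))

# Predictions that two different (hypothetical) algorithms make on 3 unseen inputs.
always_plus = [+1, +1, +1]
mixed = [-1, +1, -1]

accuracy_a = average_off_training_accuracy(always_plus)
accuracy_b = average_off_training_accuracy(mixed)
```

Both averages come out to 0.5: without assumptions about the target, neither algorithm can be preferred on the inputs it has not seen.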
Gender Classification Problem
[Figure: a picture marked ‘?’ to be classified as Male or Female]
Gender Classification: Lesson 1
[Figure: the picture marked ‘?’ together with example pictures labeled Male or Female]
Gender Classification: Lesson 2
[Figure: example pictures labeled Male or Female]
Gender Classification: Lesson 3
[Figure: example pictures labeled Male or Female]
Gender Classification: Lesson 4
[Figure: a picture marked ‘?’ together with example pictures labeled Male or Female]
Nearest Neighbors
Intuition
• memorize everything
• predict with the closest case
Algorithm
• Training: memorize all examples (picture, label)
• Prediction:
  • find K nearest neighbors
  • let them vote!
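The two steps above (memorize, then let the K nearest neighbors vote) fit in a few lines. This is a minimal sketch under stated assumptions: each picture is replaced by a hypothetical two-number feature vector, distance is plain Euclidean distance, and the function name `knn_predict` is made up for the example.

```python
import math
from collections import Counter

def knn_predict(examples, query, k=3):
    """Predict a label for `query` by majority vote of its k nearest examples.

    'Training' is just keeping `examples`, a list of (features, label) pairs.
    """
    neighbors = sorted(examples, key=lambda ex: math.dist(ex[0], query))[:k]
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0]

# Hypothetical memorized examples: (feature vector standing in for a picture, label)
examples = [
    ((1.0, 1.0), "Male"), ((1.2, 0.8), "Male"), ((0.9, 1.1), "Male"),
    ((3.0, 3.2), "Female"), ((3.1, 2.9), "Female"), ((2.8, 3.0), "Female"),
]

label_a = knn_predict(examples, (1.1, 1.0))
label_b = knn_predict(examples, (2.9, 3.1))
```

With real pictures, the choice of features and of the distance measure largely determines how well the vote works; K = 3 is an arbitrary choice here.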
Apple Recognition Problem
• Is this a picture of an apple?
• We want to teach a class of 6-year-olds.
• Gather photos from NY Apple Asso. and Google Image.
Our Fruit Class Begins
Teacher: How would you describe an apple? Michael?
Michael: I think apples are circular.
(Class): Apples are circular.
Our Fruit Class Continues
Teacher: Being circular is a good feature for the apples. However, if you only say circular, you could make several mistakes. What else can we say for an apple? Tina?
Tina: It looks like apples are red.
(Class): Apples are somewhat circular and somewhat red.
Our Fruit Class Continues
Teacher: Yes. Many apples are red. However, you could still make mistakes based on circular and red. Do you have any other suggestions, Joey?
Joey: Apples could also be green.
(Class): Apples are somewhat circular and somewhat red and possibly green.
Our Fruit Class Continues
Teacher: Yes. It seems that apples might be circular, red, green. But you may confuse them with tomatoes or peaches, right? Any more suggestions, Jessica?
Jessica: Apples have stems at the top.
(Class): Apples are somewhat circular, somewhat red, possibly green, and may have stems at the top.
Adaptive Boosting
ML and Life
• combine simple rules to approximate complex function (many heads are better than one)
• emphasize incorrect data for valuable information (again you can learn by correcting mistakes)
AdaBoost Algorithm
• Input: examples {(picture x_n, label y_n)}_{n=1}^{N}.
• For t = 1, 2, ···, T,
  • learn a simple rule h_t from emphasized examples
  • get the confidence w_t of such rule
  • emphasize the examples that do not agree with h_t.
• Output: weighted vote of the rules Σ_{t=1}^{T} w_t h_t(x)
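The AdaBoost loop can be sketched on a toy problem. The slide does not spell out the formulas, so the sketch assumes the usual AdaBoost choices: confidence w_t = ½ ln((1 − ε_t)/ε_t) for weighted error ε_t, and example weights multiplied by exp(−w_t · y_n · h_t(x_n)) and then renormalized. The "simple rules" are assumed to be threshold rules on a single number (decision stumps) instead of rules on pictures, and the ten data points are made up.

```python
import math

def stump_predict(stump, x):
    """Simple rule: predict `sign` below the threshold and `-sign` at or above it."""
    threshold, sign = stump
    return sign if x < threshold else -sign

def learn_stump(xs, ys, weights):
    """Return the stump with the smallest weighted error, and that error."""
    values = sorted(set(xs))
    thresholds = [values[0] - 0.5] + [(a + b) / 2 for a, b in zip(values, values[1:])]
    best, best_error = None, float("inf")
    for threshold in thresholds:
        for sign in (+1, -1):
            error = sum(w for x, y, w in zip(xs, ys, weights)
                        if stump_predict((threshold, sign), x) != y)
            if error < best_error:
                best, best_error = (threshold, sign), error
    return best, best_error

def adaboost(xs, ys, rounds):
    """Return a list of (confidence w_t, rule h_t) pairs."""
    n = len(xs)
    weights = [1.0 / n] * n  # start with equal emphasis on every example
    ensemble = []
    for _ in range(rounds):
        stump, error = learn_stump(xs, ys, weights)
        error = min(max(error, 1e-10), 1 - 1e-10)  # guard against log(0)
        confidence = 0.5 * math.log((1 - error) / error)
        ensemble.append((confidence, stump))
        # Emphasize the examples that the new rule gets wrong.
        weights = [w * math.exp(-confidence * y * stump_predict(stump, x))
                   for x, y, w in zip(xs, ys, weights)]
        total = sum(weights)
        weights = [w / total for w in weights]
    return ensemble

def vote(ensemble, x):
    """Weighted vote of the rules: sign of the sum of w_t * h_t(x)."""
    score = sum(c * stump_predict(s, x) for c, s in ensemble)
    return 1 if score > 0 else -1

# Hypothetical 1-D data that no single stump classifies perfectly.
xs = list(range(10))
ys = [1, 1, 1, -1, -1, -1, 1, 1, 1, -1]
ensemble = adaboost(xs, ys, rounds=3)
predictions = [vote(ensemble, x) for x in xs]
```

On this toy data the best single stump still misclassifies some points, while the weighted vote of three stumps reproduces every training label, which is the "many heads are better than one" idea in miniature.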