Modern Machine Learning Models Random Forest

A Complicated Data Set

(figure panels: $g_t$ ($N' = N/2$); $G$ with first $t$ trees)

‘easy yet robust’ nonlinear model

Modern Machine Learning Models Adaptive (or Gradient) Boosting

Apple Recognition Problem

is this a picture of an apple?

say, we want to teach a class of 6-year-olds

gather photos under the CC-BY-2.0 license on Flickr (thanks to the authors below!)

(APAL stands for Apple and Pear Australia Ltd)

Photo credits, in order: Dan Foy, APAL, adrianbartel, ANdrzej cH., Stuart Webster

https://flic.kr/p/jNQ55

https://flic.kr/p/jzP1VB

https://flic.kr/p/bdy2hZ

https://flic.kr/p/51DKA8

https://flic.kr/p/9C3Ybd

Photo credits, in order: nachans, APAL, Jo Jakeman, APAL, APAL

https://flic.kr/p/9XD7Ag

https://flic.kr/p/jzRe4u

https://flic.kr/p/7jwtGp

https://flic.kr/p/jzPYNr

https://flic.kr/p/jzScif

Photo credits, in order: Mr. Roboto., Richard North, Richard North, Emilian Robert Vicol, Nathaniel Mc-Queen

https://flic.kr/p/i5BN85

https://flic.kr/p/bHhPkB

https://flic.kr/p/d8tGou

https://flic.kr/p/bpmGXW

https://flic.kr/p/pZv1Mf

Photo credits, in order: Crystal, jfh686, skyseeker, Janet Hudson, Rennett Stowe

https://flic.kr/p/kaPYp

https://flic.kr/p/6vjRFH

https://flic.kr/p/2MynV

https://flic.kr/p/7QDBbm

https://flic.kr/p/agmnrk

Our Fruit Class Begins

Teacher: Please look at the pictures of apples and non-apples below. Based on those pictures, how would you describe an apple? Michael?

Michael: I think apples are circular.

(Class): Apples are circular.

Our Fruit Class Continues

Teacher: Being circular is a good feature for the apples. However, if you only say circular, you could make several mistakes. What else can we say for an apple? Tina?

Tina: It looks like apples are red.

(Class): Apples are somewhat circular and somewhat red.

Our Fruit Class Continues More

Teacher: Yes. Many apples are red. However, you could still make mistakes based on circular and red. Do you have any other suggestions, Joey?

Joey: Apples could also be green.

(Class): Apples are somewhat circular and somewhat red and possibly green.

Our Fruit Class Ends

Teacher: Yes. It seems that apples might be circular, red, green. But you may confuse them with tomatoes or peaches, right? Any more suggestions, Jessica?

Jessica: Apples have stems at the top.

(Class): Apples are somewhat circular, somewhat red, possibly green, and may have stems at the top.

Motivation

students: simple hypotheses $g_t$ (like vertical/horizontal lines)

(Class): sophisticated hypothesis $G$ (like black curve)

Teacher: a tactical learning algorithm that directs the students to focus on key examples

next: demo of such an algorithm

A Simple Data Set

‘Teacher’-like algorithm works!
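The slides do not spell out the ‘Teacher’-like algorithm, so the following is only a minimal sketch of standard AdaBoost with axis-parallel decision stumps (the vertical/horizontal lines). The names `best_stump`, `adaboost`, and `predict`, the weight-rescaling form of the update, and the toy data in the usage note are all assumptions made for illustration.

```python
import math

def stump_predict(x, j, thr, s):
    """Axis-parallel stump: returns s if x[j] > thr, otherwise -s."""
    return s if x[j] > thr else -s

def best_stump(X, y, u):
    """'Student': the stump with the smallest u-weighted error on (X, y)."""
    best = None
    for j in range(len(X[0])):
        values = sorted(set(x[j] for x in X))
        # one threshold below all points (constant stump) plus all midpoints
        thresholds = [values[0] - 1.0] + [(a + b) / 2 for a, b in zip(values, values[1:])]
        for thr in thresholds:
            for s in (+1, -1):
                err = sum(un for x, yn, un in zip(X, y, u)
                          if stump_predict(x, j, thr, s) != yn)
                if best is None or err < best[0]:
                    best = (err, j, thr, s)
    return best

def adaboost(X, y, T=10):
    """'Teacher': re-weights examples so the next stump focuses on mistakes."""
    u = [1.0 / len(X)] * len(X)
    ensemble = []
    for _ in range(T):
        err, j, thr, s = best_stump(X, y, u)
        eps = err / sum(u)                      # weighted error rate of this stump
        if eps == 0:                            # a perfect stump: nothing left to boost
            ensemble.append((1.0, j, thr, s))
            break
        scale = math.sqrt((1 - eps) / eps)
        for i, (x, yn) in enumerate(zip(X, y)):
            wrong = stump_predict(x, j, thr, s) != yn
            u[i] = u[i] * scale if wrong else u[i] / scale
        ensemble.append((math.log(scale), j, thr, s))   # vote weight alpha_t
    return ensemble

def predict(ensemble, x):
    """'Class': weighted vote G(x) over all stumps."""
    score = sum(a * stump_predict(x, j, thr, s) for a, j, thr, s in ensemble)
    return 1 if score > 0 else -1
```

On a toy set such as `X = [[1.0, 1.0], [2.0, 2.0], [3.0, 3.0]]` with `y = [1, -1, 1]`, no single stump separates the classes, but the weighted vote after enough rounds does.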

Putting Everything Together

Gradient Boosted Decision Tree (GBDT)

$s_1 = s_2 = \ldots = s_N = 0$

for $t = 1, 2, \ldots, T$

1. obtain $g_t$ by $\mathcal{A}(\{(\mathbf{x}_n, y_n - s_n)\})$, where $\mathcal{A}$ is a (squared-error) regression algorithm (such as ‘weak’ C&RT?)

2. compute $\alpha_t = \mathrm{OneVarLinearRegression}(\{(g_t(\mathbf{x}_n), y_n - s_n)\})$

3. update $s_n \leftarrow s_n + \alpha_t g_t(\mathbf{x}_n)$

return $G(\mathbf{x}) = \sum_{t=1}^{T} \alpha_t g_t(\mathbf{x})$

GBDT: ‘regression sibling’ of AdaBoost + decision tree, very popular in practice
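The three steps of the loop can be sketched in plain Python. This is an illustration under stated assumptions, not a reference implementation: the ‘weak’ C&RT is replaced by a one-split regression stump (`fit_stump`), OneVarLinearRegression is taken to be least squares through the origin, and the names `gbdt_fit` and `gbdt_predict` are made up here.

```python
def fit_stump(X, r):
    """Least-squares regression stump (stand-in for a 'weak' C&RT)."""
    n = len(X)
    mean_r = sum(r) / n
    # (squared error, feature, threshold, left value, right value)
    best = (sum((v - mean_r) ** 2 for v in r), None, None, mean_r, mean_r)
    for j in range(len(X[0])):
        values = sorted(set(x[j] for x in X))
        for lo, hi in zip(values, values[1:]):
            thr = (lo + hi) / 2
            left = [r[i] for i in range(n) if X[i][j] <= thr]
            right = [r[i] for i in range(n) if X[i][j] > thr]
            ml, mr = sum(left) / len(left), sum(right) / len(right)
            err = sum((v - ml) ** 2 for v in left) + sum((v - mr) ** 2 for v in right)
            if err < best[0]:
                best = (err, j, thr, ml, mr)
    _, j, thr, ml, mr = best
    if j is None:                       # no useful split: constant prediction
        return lambda x: mean_r
    return lambda x: ml if x[j] <= thr else mr

def gbdt_fit(X, y, T=20):
    s = [0.0] * len(X)                  # s_1 = ... = s_N = 0
    ensemble = []
    for _ in range(T):
        residual = [yn - sn for yn, sn in zip(y, s)]
        g = fit_stump(X, residual)      # step 1: regress on the residuals
        preds = [g(x) for x in X]
        denom = sum(p * p for p in preds)
        # step 2: one-variable least squares (no intercept, an assumption)
        alpha = sum(p * rn for p, rn in zip(preds, residual)) / denom if denom > 0 else 0.0
        s = [sn + alpha * p for sn, p in zip(s, preds)]   # step 3: update s_n
        ensemble.append((alpha, g))
    return ensemble

def gbdt_predict(ensemble, x):
    """G(x) = sum over t of alpha_t * g_t(x)."""
    return sum(alpha * g(x) for alpha, g in ensemble)
```

Each round fits only what the current scores $s_n$ still get wrong, so the training squared error never increases from one round to the next.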

Modern Machine Learning Models Deep Learning

Physical Interpretation of Neural Network

(figure: feed-forward network with inputs $x_0 = 1, x_1, x_2, \ldots, x_d$, bias nodes $+1$, tanh neurons, and weights $w_{ij}^{(1)}$, $w_{jk}^{(2)}$, $w_{kq}^{(3)}$; one neuron is marked with score $s_3^{(2)}$ and output $x_3^{(2)}$ after tanh)

each layer: pattern feature extracted from data, remember? :-)

how many neurons? how many layers? more generally, what structure?

• subjectively, your design!

• objectively, validation, maybe?

structural decisions: key issue for applying NNet
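A minimal sketch of the forward pass that the network figure depicts, assuming every layer (including the last) applies tanh and that row 0 of each weight matrix belongs to the constant $+1$ node. The function name `forward` and the weight layout `W[i][j]` are choices made for this illustration.

```python
import math

def forward(x, weights):
    """Return the outputs of every layer for input x (given without x_0 = 1).

    weights[l][i][j] is the weight from unit i of the previous layer
    (i = 0 is the constant +1 node) to unit j of the current layer.
    """
    layers = [list(x)]
    for W in weights:
        prev = [1.0] + layers[-1]                       # prepend the +1 node
        scores = [sum(W[i][j] * prev[i] for i in range(len(prev)))
                  for j in range(len(W[0]))]            # s_j = sum_i w_ij x_i
        layers.append([math.tanh(s) for s in scores])   # x_j = tanh(s_j)
    return layers
```

The structural decisions on the slide correspond to the shapes of the matrices in `weights`: the number of matrices is the number of layers, and the number of columns in each is the number of neurons in that layer.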

Shallow versus Deep Neural Networks

shallow: few (hidden) layers; deep: many layers

Shallow NNet

• more efficient to train (✓)

• simpler structural decisions (✓)

• theoretically powerful enough (✓)

Deep NNet

• challenging to train (×)

• sophisticated structural decisions (×)

• ‘arbitrarily’ powerful (✓)

• more ‘meaningful’? (see next slide)

deep NNet (deep learning) gaining attention in recent years

Meaningfulness of Deep Learning

(figure: features $\phi_1, \ldots, \phi_6$ connect through positive and negative weights to outputs $z_1$ (is it a ‘1’?) and $z_5$ (is it a ‘5’?))

‘less burden’ for each layer: simple to complex features

natural for difficult learning task with raw features, like vision

deep NNet: currently popular in vision/speech/...

Challenges and Key Techniques for Deep Learning

difficult structural decisions:

• subjective with domain knowledge: like convolutional NNet for images

high model complexity:

• no big worries if big enough data

• regularization towards noise-tolerant: like dropout (tolerant when network corrupted) and denoising (tolerant when input corrupted)

hard optimization problem:

• careful initialization to avoid bad local minimum: called pre-training

huge computational complexity (worsens with big data):

• novel hardware/architecture: like mini-batch with GPU

IMHO, careful regularization and initialization are key techniques
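The two noise-tolerant regularizers named above can be illustrated with a short sketch. Assumptions: dropout is written in its common ‘inverted’ form (survivors are rescaled), denoising corruption is shown as masking noise that zeroes input coordinates, and the function names `dropout` and `masking_noise` are made up for this example.

```python
import random

def dropout(activations, drop_prob, rng):
    """Corrupt the network: zero each unit with probability drop_prob.

    Surviving units are divided by (1 - drop_prob) so the expected
    activation stays the same (the 'inverted dropout' convention).
    """
    if not 0.0 <= drop_prob < 1.0:
        raise ValueError("drop_prob must be in [0, 1)")
    keep = 1.0 - drop_prob
    return [a / keep if rng.random() < keep else 0.0 for a in activations]

def masking_noise(x, corrupt_prob, rng):
    """Corrupt the input: zero each coordinate with probability corrupt_prob.

    A denoising model is then trained to recover the clean x from this copy.
    """
    return [0.0 if rng.random() < corrupt_prob else xi for xi in x]
```

For example, `dropout(hidden, 0.5, random.Random(0))` would be applied to hidden-layer outputs during training only, while `masking_noise` would be applied to the training inputs.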

A Two-Step Deep Learning Framework

Simple Deep Learning

1. for $\ell = 1, \ldots, L$, pre-train $\{w_{ij}^{(\ell)}\}$ assuming $w^{(1)}, \ldots, w^{(\ell-1)}$ fixed

(figure: panels (a) (b) (c) (d))

2. train with backprop on pre-trained NNet to fine-tune all $\{w_{ij}^{(\ell)}\}$

different deep learning models deal with the steps somewhat differently
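The control flow of the two steps can be sketched as below. Since the slides leave the pre-training and fine-tuning methods open, both are passed in as caller-supplied functions; `pretrain_layer`, `fine_tune`, `tanh_layer`, and the weight layout are hypothetical names and conventions for this illustration only.

```python
import math

def tanh_layer(W, x):
    """One tanh layer; W[i][j] with row i = 0 reserved for the +1 node."""
    prev = [1.0] + list(x)
    return [math.tanh(sum(W[i][j] * prev[i] for i in range(len(prev))))
            for j in range(len(W[0]))]

def two_step_deep_learning(X, layer_sizes, pretrain_layer, fine_tune):
    """layer_sizes = [d_0, d_1, ..., d_L]; returns the fine-tuned weights."""
    weights = []
    reps = [list(x) for x in X]            # representation fed to layer 1
    # Step 1: pre-train layer by layer, earlier layers held fixed.
    for d_in, d_out in zip(layer_sizes, layer_sizes[1:]):
        W = pretrain_layer(reps, d_in, d_out)
        weights.append(W)
        # Push the data through the just-trained (now fixed) layer.
        reps = [tanh_layer(W, r) for r in reps]
    # Step 2: fine-tune all layers together, e.g. with backprop.
    return fine_tune(weights, X)
```

Each call to `pretrain_layer` sees only the output of the layers trained so far, which is what ‘assuming the earlier weights fixed’ means in step 1.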

Mini-Summary

Modern Machine Learning Models

• Support Vector Machine: large-margin boundary ranging from linear to non-linear