Introduction to Adaptive Boosting


Hsuan-Tien Lin

National Taiwan University

Machine Learning, Fall 2008

Hsuan-Tien Lin AdaBoost


Introduction to Adaptive Boosting Intuition

Adaptive Boosting (AdaBoost)

Apple Recognition Problem

Is this a picture of an apple?

We want to teach a class of 6-year-olds.

Gather photos from the NY Apple Association and Google Images.


Our Fruit Class Begins

Teacher: How would you describe an apple? Michael?

Michael: I think apples are circular.

(Class): Apples are circular.


Our Fruit Class Continues

Teacher: Being circular is a good feature of apples. However, if you only say circular, you could make several mistakes. What else can we say about an apple? Tina?

Tina: It looks like apples are red.

(Class): Apples are somewhat circular and somewhat red.


Teacher: Yes. Many apples are red. However, you could still make mistakes based on circular and red. Do you have any other suggestions, Joey?

Joey: Apples could also be green.

(Class): Apples are somewhat circular and somewhat red and possibly green.


Teacher: Yes. It seems that apples might be circular, red, or green. But you may confuse them with tomatoes or peaches, right? Any more suggestions, Jessica?

Jessica: Apples have stems at the top.

(Class): Apples are somewhat circular, somewhat red, possibly green, and may have stems at the top.


Put Intuition to Practice

Intuition

Combine simple rules to approximate a complex function.

Emphasize incorrectly classified data to focus on valuable information.

AdaBoost Algorithm (Freund and Schapire 1997)

Input: training examples $Z = \{(x_n, y_n)\}_{n=1}^{N}$.

For $t = 1, 2, \cdots, T$,

Learn a simple rule $h_t$ from emphasized training examples.

Get the confidence $\alpha_t$ of such rule.

Emphasize the training examples that do not agree with $h_t$.

Output: combined function $H(x) = \mathrm{sign}\left(\sum_{t=1}^{T} \alpha_t h_t(x)\right)$
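The combined function $H(x)$ is a weighted vote of the simple rules. As a rough illustration only, the sketch below votes over the four rules the class came up with. The feature names, the example photos, and the confidence values are all made up for this sketch; in AdaBoost the confidences $\alpha_t$ are computed from the data rather than chosen by hand.

```python
def sign(value):
    """Return +1 for non-negative scores and -1 otherwise (tie-breaking is an assumption)."""
    return 1 if value >= 0 else -1

# Hypothetical simple rules h_t(x): each votes +1 (apple) or -1 (not an apple).
def is_circular(x):
    return 1 if x["circular"] else -1

def is_red(x):
    return 1 if x["red"] else -1

def is_green(x):
    return 1 if x["green"] else -1

def has_stem(x):
    return 1 if x["stem"] else -1

# Made-up confidences alpha_t, for illustration only.
rules = [(0.8, is_circular), (0.5, is_red), (0.3, is_green), (0.6, has_stem)]

def combined(x, rules):
    """H(x) = sign(sum over t of alpha_t * h_t(x))."""
    return sign(sum(alpha * h(x) for alpha, h in rules))

# Hypothetical photo descriptions.
red_apple = {"circular": True, "red": True, "green": False, "stem": True}
banana = {"circular": False, "red": False, "green": False, "stem": True}

print(combined(red_apple, rules))  # score 0.8 + 0.5 - 0.3 + 0.6 = 1.6
print(combined(banana, rules))     # score -0.8 - 0.5 - 0.3 + 0.6 = -1.0
```

No single rule decides the outcome: the banana has a stem, but the other three rules outvote it.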


Some More Details

AdaBoost Algorithm

Input: training examples $Z = \{(x_n, y_n)\}_{n=1}^{N}$.

For $t = 1, 2, \cdots, T$,

Learn a simple rule $h_t$ from emphasized training examples.

How? Choose an $h_t \in \mathcal{H}$ with minimum emphasized error.

Get the confidence $\alpha_t$ of such rule.

How? An $h_t$ with lower error should get higher $\alpha_t$.

Emphasize the training examples that do not agree with $h_t$.

How? Maintain an emphasis value $u_n$ per example.

Output: combined function $H(x) = \mathrm{sign}\left(\sum_{t=1}^{T} \alpha_t h_t(x)\right)$

Let's see some demos.
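To make the three "How?" answers concrete, here is a minimal sketch of a single round on five toy examples. It assumes the confidence and emphasis formulas given in the final version of the algorithm later in these slides; the helper names (`confidence`, `weighted_error`, `re_emphasize`) and the toy labels are invented for this sketch.

```python
import math

def confidence(error):
    """alpha = 0.5 * ln((1 - error) / error), defined for 0 < error < 1."""
    return 0.5 * math.log((1.0 - error) / error)

def weighted_error(u, y, predictions):
    """Emphasized error: the share of total emphasis on misclassified examples."""
    wrong = sum(u_n for u_n, y_n, p_n in zip(u, y, predictions) if y_n != p_n)
    return wrong / sum(u)

def re_emphasize(u, y, predictions, alpha):
    """Scale emphasis up where y_n * h(x_n) = -1 and down where it is +1."""
    return [u_n * math.exp(-alpha * y_n * p_n)
            for u_n, y_n, p_n in zip(u, y, predictions)]

# Toy data: five examples, and a rule that gets the third one wrong.
y = [1, 1, 1, -1, -1]
predictions = [1, 1, -1, -1, -1]
u = [0.2] * 5                       # uniform emphasis to start

error = weighted_error(u, y, predictions)          # about 0.2
alpha = confidence(error)                          # about ln(2)
new_u = re_emphasize(u, y, predictions, alpha)
new_error = weighted_error(new_u, y, predictions)  # about 0.5

print(error, alpha, new_u, new_error)
```

After re-emphasis, the same rule has an emphasized error of about one half, so it looks no better than random guessing on the new emphasis values. The next round is therefore pushed toward a different rule.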


The Final Version

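Input: $Z = \{(x_n, y_n)\}_{n=1}^{N}$. Set $u_n = \frac{1}{N}$ for all $n$.

For $t = 1, 2, \cdots, T$,

Learn a simple rule $h_t$ such that $h_t$ solves

$$\min_{h} \sum_{n=1}^{N} u_n \cdot I[y_n \neq h(x_n)].$$

Compute the error $\epsilon_t = \sum_{n=1}^{N} \frac{u_n}{\sum_{m=1}^{N} u_m} \cdot I[y_n \neq h_t(x_n)]$ and the confidence

$$\alpha_t = \frac{1}{2} \ln \frac{1 - \epsilon_t}{\epsilon_t}.$$

Emphasize the training examples that do not agree with $h_t$: $u_n = u_n \cdot \exp\left(-\alpha_t y_n h_t(x_n)\right)$.

Output: combined function $H(x) = \mathrm{sign}\left(\sum_{t=1}^{T} \alpha_t h_t(x)\right)$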
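The steps above can be sketched in plain Python. The slides do not say which simple rules to use, so this sketch assumes decision stumps (a threshold on one feature) found by exhaustive search. The function names, the toy data, and the clipping of the error away from 0 and 1 are additions for this sketch, not part of the algorithm as stated.

```python
import math

def stump_predict(stump, x):
    """Decision stump: vote `direction` if x[feature] > threshold, else -direction."""
    feature, threshold, direction = stump
    return direction if x[feature] > threshold else -direction

def best_stump(X, y, u):
    """Exhaustively pick the stump with minimum emphasized error."""
    best, best_err = None, float("inf")
    for feature in range(len(X[0])):
        values = sorted(set(x[feature] for x in X))
        # Candidate thresholds: below the minimum, and midway between neighbors.
        thresholds = [values[0] - 1.0]
        thresholds += [(a + b) / 2.0 for a, b in zip(values, values[1:])]
        for threshold in thresholds:
            for direction in (1, -1):
                stump = (feature, threshold, direction)
                err = sum(u_n for x_n, y_n, u_n in zip(X, y, u)
                          if stump_predict(stump, x_n) != y_n)
                if err < best_err:
                    best, best_err = stump, err
    return best

def adaboost(X, y, T):
    """Return a list of (alpha_t, h_t) pairs after T rounds."""
    N = len(X)
    u = [1.0 / N] * N                      # uniform emphasis to start
    ensemble = []
    for _ in range(T):
        h = best_stump(X, y, u)
        preds = [stump_predict(h, x_n) for x_n in X]
        eps = sum(u_n for u_n, y_n, p_n in zip(u, y, preds) if y_n != p_n) / sum(u)
        # Added safeguard (not in the slides): avoid log(0) for a perfect stump.
        eps = min(max(eps, 1e-10), 1.0 - 1e-10)
        alpha = 0.5 * math.log((1.0 - eps) / eps)
        u = [u_n * math.exp(-alpha * y_n * p_n)
             for u_n, y_n, p_n in zip(u, y, preds)]
        ensemble.append((alpha, h))
    return ensemble

def predict(ensemble, x):
    """H(x) = sign(sum over t of alpha_t * h_t(x)); ties go to +1 by assumption."""
    score = sum(alpha * stump_predict(h, x) for alpha, h in ensemble)
    return 1 if score >= 0 else -1

# Toy 1-D data that no single stump classifies perfectly.
X = [[0.0], [1.0], [2.0], [3.0], [4.0], [5.0]]
y = [1, 1, -1, -1, 1, 1]

ensemble = adaboost(X, y, T=3)
print([predict(ensemble, x) for x in X])
```

On this toy set every individual stump misclassifies at least two of the six points, yet the weighted vote of three stumps fits all of them, which is the "combine simple rules" intuition at work.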
