
Machine Learning Techniques (機器學習技巧)

Lecture 13: RBF Networks

Hsuan-Tien Lin (林軒田) htlin@csie.ntu.edu.tw

Department of Computer Science & Information Engineering

National Taiwan University (國立台灣大學資訊工程系)

Agenda

Lecture 13: RBF Networks

1. Full RBF Model

2. Prototype Extraction

3. RBF Network

4. Connection to Other Views

RBF Networks Full RBF Model

Disclaimer

Many parts of this lecture borrow from Prof. Yaser S. Abu-Mostafa's slides with permission.

Learning From Data

Yaser S. Abu-Mostafa

California Institute of Technology

Lecture 16: Radial Basis Functions

Basic RBF model

Each $(\mathbf{x}_n, y_n) \in \mathcal{D}$ influences $h(\mathbf{x})$ based on $\underbrace{\|\mathbf{x} - \mathbf{x}_n\|}_{\text{radial}}$

Standard form:

$$h(\mathbf{x}) = \sum_{n=1}^{N} w_n \underbrace{\exp\left(-\gamma \|\mathbf{x} - \mathbf{x}_n\|^2\right)}_{\text{basis function}}$$

Learning From Data - Lecture 16, 3/20
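The standard form above can be evaluated directly. The following is a minimal sketch in plain Python, not part of the original slides; the function names `sq_dist` and `rbf_hypothesis`, the toy points, and the hand-picked weights are all assumptions made for illustration.

```python
import math

def sq_dist(a, b):
    """Squared Euclidean distance ||a - b||^2 between two equal-length sequences."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))

def rbf_hypothesis(x, points, weights, gamma):
    """h(x) = sum_n w_n * exp(-gamma * ||x - x_n||^2)."""
    return sum(w * math.exp(-gamma * sq_dist(x, xn))
               for w, xn in zip(weights, points))

# Hypothetical toy data: two 1-D points with hand-picked (not learned) weights.
points = [(0.0,), (1.0,)]
weights = [1.0, -1.0]
gamma = 1.0

h_at_zero = rbf_hypothesis((0.0,), points, weights, gamma)  # 1 - exp(-1)
h_at_mid = rbf_hypothesis((0.5,), points, weights, gamma)   # contributions cancel
```

Each data point contributes according to its distance from the query only, which is what makes the basis function radial. At the midpoint the two opposite-signed contributions cancel.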

The learning algorithm

Finding $w_1, \cdots, w_N$:

$$h(\mathbf{x}) = \sum_{n=1}^{N} w_n \exp\left(-\gamma \|\mathbf{x} - \mathbf{x}_n\|^2\right)$$

based on $\mathcal{D} = (\mathbf{x}_1, y_1), \cdots, (\mathbf{x}_N, y_N)$

$E_{\text{in}} = 0$: $h(\mathbf{x}_n) = y_n$ for $n = 1, \cdots, N$:

$$\sum_{m=1}^{N} w_m \exp\left(-\gamma \|\mathbf{x}_n - \mathbf{x}_m\|^2\right) = y_n$$

Learning From Data - Lecture 16, 4/20

The solution

$$\sum_{m=1}^{N} w_m \exp\left(-\gamma \|\mathbf{x}_n - \mathbf{x}_m\|^2\right) = y_n$$

$N$ equations in $N$ unknowns

$$\underbrace{\begin{bmatrix}
\exp(-\gamma \|\mathbf{x}_1 - \mathbf{x}_1\|^2) & \cdots & \exp(-\gamma \|\mathbf{x}_1 - \mathbf{x}_N\|^2) \\
\exp(-\gamma \|\mathbf{x}_2 - \mathbf{x}_1\|^2) & \cdots & \exp(-\gamma \|\mathbf{x}_2 - \mathbf{x}_N\|^2) \\
\vdots & \vdots & \vdots \\
\exp(-\gamma \|\mathbf{x}_N - \mathbf{x}_1\|^2) & \cdots & \exp(-\gamma \|\mathbf{x}_N - \mathbf{x}_N\|^2)
\end{bmatrix}}_{\Phi}
\underbrace{\begin{bmatrix} w_1 \\ w_2 \\ \vdots \\ w_N \end{bmatrix}}_{\mathbf{w}}
=
\underbrace{\begin{bmatrix} y_1 \\ y_2 \\ \vdots \\ y_N \end{bmatrix}}_{\mathbf{y}}$$

If $\Phi$ is invertible, $\mathbf{w} = \Phi^{-1} \mathbf{y}$ ("exact interpolation")

Learning From Data - Lecture 16, 5/20
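To make the exact-interpolation claim concrete, here is a small sketch that builds $\Phi$ for a toy data set, solves $\Phi \mathbf{w} = \mathbf{y}$, and checks that $h(\mathbf{x}_n) = y_n$ on every training point. It is an illustration only: the data, the value of `gamma`, and the helper names are assumptions, and Gaussian elimination is simply one convenient way to apply $\Phi^{-1}$ without an external library.

```python
import math

def sq_dist(a, b):
    """Squared Euclidean distance ||a - b||^2."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))

def build_phi(points, gamma):
    """N x N matrix with Phi[n][m] = exp(-gamma * ||x_n - x_m||^2)."""
    return [[math.exp(-gamma * sq_dist(xn, xm)) for xm in points]
            for xn in points]

def solve(A, b):
    """Solve A w = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]  # augmented matrix [A | b]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        if abs(M[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular or nearly singular")
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    w = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        tail = sum(M[i][j] * w[j] for j in range(i + 1, n))
        w[i] = (M[i][n] - tail) / M[i][i]
    return w

# Hypothetical toy data: three distinct 1-D inputs with arbitrary targets.
points = [(0.0,), (1.0,), (3.0,)]
targets = [1.0, -1.0, 2.0]
gamma = 0.5

weights = solve(build_phi(points, gamma), targets)

def h(x):
    """Fitted hypothesis h(x) = sum_n w_n exp(-gamma ||x - x_n||^2)."""
    return sum(w * math.exp(-gamma * sq_dist(x, xn))
               for w, xn in zip(weights, points))

# Interpolation residuals |h(x_n) - y_n|; all should be near zero.
residuals = [abs(h(xn) - yn) for xn, yn in zip(points, targets)]
```

The residuals vanish up to rounding error, which is the $E_{\text{in}} = 0$ condition from the previous slide.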

RBF for classification

$$h(\mathbf{x}) = \operatorname{sign}\left(\sum_{n=1}^{N} w_n \exp\left(-\gamma \|\mathbf{x} - \mathbf{x}_n\|^2\right)\right)$$

Learning: linear regression for classification

$$s = \sum_{n=1}^{N} w_n \exp\left(-\gamma \|\mathbf{x} - \mathbf{x}_n\|^2\right)$$

Minimize $(s - y)^2$ on $\mathcal{D}$, with $y = \pm 1$

$$h(\mathbf{x}) = \operatorname{sign}(s)$$

Learning From Data - Lecture 16, 7/20
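The relationship between the signal $s$, the squared error, and the final $\pm 1$ decision can be sketched as follows. This is an illustrative example, not the lecture's code: the weights are hand-picked rather than fitted, the error is averaged over the data set (a normalization chosen here), and $\operatorname{sign}(0)$ is mapped to $+1$ as an arbitrary tie-break.

```python
import math

def sq_dist(a, b):
    """Squared Euclidean distance ||a - b||^2."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))

def signal(x, points, weights, gamma):
    """s = sum_n w_n * exp(-gamma * ||x - x_n||^2)."""
    return sum(w * math.exp(-gamma * sq_dist(x, xn))
               for w, xn in zip(weights, points))

def classify(x, points, weights, gamma):
    """h(x) = sign(s); a signal of exactly 0 is mapped to +1 here."""
    return 1 if signal(x, points, weights, gamma) >= 0 else -1

def squared_error(points, labels, weights, gamma):
    """Mean of (s - y)^2 over the data set (averaging is a choice made here)."""
    errors = [(signal(x, points, weights, gamma) - y) ** 2
              for x, y in zip(points, labels)]
    return sum(errors) / len(errors)

# Hypothetical toy data with labels y = +1 / -1 and hand-picked weights.
points = [(0.0,), (2.0,)]
labels = [1, -1]
weights = [1.0, -1.0]
gamma = 1.0

predictions = [classify(x, points, weights, gamma) for x in points]
err = squared_error(points, labels, weights, gamma)
```

Here the signal is close to, but not exactly, $\pm 1$ on the two training points, so the squared error is small but nonzero while both points are still classified correctly.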

Relationship to nearest-neighbor method

Adopt the $y$ value of a nearby point: [figure not preserved]

Similar effect by a basis function: [figure not preserved]

Learning From Data - Lecture 16, 8/20
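For comparison, the nearest-neighbor rule mentioned above (adopt the $y$ value of a nearby point) can be sketched in a few lines. The function name and the toy data are assumptions for illustration; the slides do not give code.

```python
def sq_dist(a, b):
    """Squared Euclidean distance ||a - b||^2."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))

def nearest_neighbor_predict(x, points, labels):
    """Return the y value of the training point closest to x."""
    nearest = min(range(len(points)), key=lambda n: sq_dist(x, points[n]))
    return labels[nearest]

# Hypothetical 2-D training set.
points = [(0.0, 0.0), (1.0, 1.0), (4.0, 0.0)]
labels = [1, -1, 1]

pred_a = nearest_neighbor_predict((0.2, 0.1), points, labels)  # closest to (0, 0)
pred_b = nearest_neighbor_predict((1.2, 0.9), points, labels)  # closest to (1, 1)
```

The nearest-neighbor rule switches abruptly from one point's label to another's, whereas a basis function lets each point's influence fade gradually with distance.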


Fun Time

RBF Networks Prototype Extraction

RBF with $K$ centers

$N$ parameters $w_1, \cdots, w_N$ based on $N$ data points

Use $K \ll N$ centers: $\boldsymbol{\mu}_1, \cdots, \boldsymbol{\mu}_K$ instead of $\mathbf{x}_1, \cdots, \mathbf{x}_N$

$$h(\mathbf{x}) = \sum_{k=1}^{K} w_k \exp\left(-\gamma \|\mathbf{x} - \boldsymbol{\mu}_k\|^2\right)$$

1. How to choose the centers $\boldsymbol{\mu}_k$

2. How to choose the weights $w_k$

Learning From Data - Lecture 16, 9/20

Choosing the centers

Minimize the distance between $\mathbf{x}_n$ and the closest center $\boldsymbol{\mu}_k$: $K$-means clustering

Split $\mathbf{x}_1, \cdots, \mathbf{x}_N$ into clusters $S_1, \cdots, S_K$

Minimize

$$\sum_{k=1}^{K} \sum_{\mathbf{x}_n \in S_k} \|\mathbf{x}_n - \boldsymbol{\mu}_k\|^2$$

Unsupervised learning

NP-hard

Learning From Data - Lecture 16, 10/20
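The $K$-means objective above is easy to evaluate for any given split, even though minimizing it exactly is NP-hard. The sketch below compares a sensible and a poor clustering of the same points. The representation of clusters as a list of indices (`assignment[n]` is the cluster of point `n`) and the toy data are assumptions made for this example.

```python
def sq_dist(a, b):
    """Squared Euclidean distance ||a - b||^2."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))

def kmeans_objective(points, assignment, centers):
    """Sum over all points of ||x_n - mu_k||^2, where k = assignment[n]."""
    return sum(sq_dist(x, centers[k]) for x, k in zip(points, assignment))

# Hypothetical 1-D data with two obvious groups, and two fixed centers.
points = [(0.0,), (1.0,), (10.0,), (11.0,)]
centers = [(0.5,), (10.5,)]

good = kmeans_objective(points, [0, 0, 1, 1], centers)  # each point with its near center
bad = kmeans_objective(points, [0, 1, 0, 1], centers)   # groups mixed across centers
```

Only the inputs $\mathbf{x}_n$ appear in the objective, never the labels $y_n$, which is why the slide calls this unsupervised learning.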

An iterative algorithm

Lloyd's algorithm: Iteratively minimize

$$\sum_{k=1}^{K} \sum_{\mathbf{x}_n \in S_k} \|\mathbf{x}_n - \boldsymbol{\mu}_k\|^2 \quad \text{w.r.t. } \boldsymbol{\mu}_k, S_k$$

$$\boldsymbol{\mu}_k \leftarrow \frac{1}{|S_k|} \sum_{\mathbf{x}_n \in S_k} \mathbf{x}_n$$

$$S_k \leftarrow \{\mathbf{x}_n : \|\mathbf{x}_n - \boldsymbol{\mu}_k\| \le \text{all } \|\mathbf{x}_n - \boldsymbol{\mu}_\ell\|\}$$

Convergence $\longrightarrow$ local minimum

Learning From Data - Lecture 16, 11/20
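The two alternating updates can be sketched as a short loop. This is a minimal illustration under stated assumptions, not a definitive implementation: initial centers are supplied by the caller, ties go to the lowest-indexed center, an empty cluster keeps its previous center, and the loop stops when the assignment no longer changes or after `max_iters` rounds. None of these details are specified on the slide.

```python
def sq_dist(a, b):
    """Squared Euclidean distance ||a - b||^2."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))

def lloyd(points, initial_centers, max_iters=100):
    """Alternate the cluster-assignment and center-update steps."""
    centers = [tuple(c) for c in initial_centers]
    assignment = None
    for _ in range(max_iters):
        # S_k update: put each point in the cluster of its closest center.
        new_assignment = [
            min(range(len(centers)), key=lambda k: sq_dist(x, centers[k]))
            for x in points
        ]
        if new_assignment == assignment:
            break  # no change, so the centers would not move either
        assignment = new_assignment
        # mu_k update: each center becomes the mean of its cluster.
        for k in range(len(centers)):
            members = [x for x, a in zip(points, assignment) if a == k]
            if members:  # an empty cluster keeps its old center (a choice made here)
                dim = len(members[0])
                centers[k] = tuple(
                    sum(m[d] for m in members) / len(members) for d in range(dim)
                )
    return centers, assignment

# Hypothetical 2-D data: two well-separated pairs of points.
points = [(0.0, 0.0), (0.0, 1.0), (10.0, 0.0), (10.0, 1.0)]
initial = [(0.0, 0.0), (10.0, 0.0)]

centers, assignment = lloyd(points, initial)
```

Neither step can increase the objective, so the procedure settles, but only at a local minimum that depends on the initial centers.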

Lloyd's algorithm in action

1. Get the data points

2. Only the inputs!

3. Initialize the centers

4. Iterate

5. These are your $\boldsymbol{\mu}_k$'s

Learning From Data - Lecture 16, 12/20


Fun Time

RBF Networks RBF Network

Choosing the weights

$$\sum_{k=1}^{K} w_k \exp\left(-\gamma \|\mathbf{x}_n - \boldsymbol{\mu}_k\|^2\right) \approx y_n$$

$N$ equations in $K < N$ unknowns

$$\underbrace{\begin{bmatrix}
\exp(-\gamma \|\mathbf{x}_1 - \boldsymbol{\mu}_1\|^2) & \cdots & \exp(-\gamma \|\mathbf{x}_1 - \boldsymbol{\mu}_K\|^2) \\
\exp(-\gamma \|\mathbf{x}_2 - \boldsymbol{\mu}_1\|^2) & \cdots & \exp(-\gamma \|\mathbf{x}_2 - \boldsymbol{\mu}_K\|^2) \\
\vdots & \vdots & \vdots \\
\exp(-\gamma \|\mathbf{x}_N - \boldsymbol{\mu}_1\|^2) & \cdots & \exp(-\gamma \|\mathbf{x}_N - \boldsymbol{\mu}_K\|^2)
\end{bmatrix}}_{\Phi}
\underbrace{\begin{bmatrix} w_1 \\ w_2 \\ \vdots \\ w_K \end{bmatrix}}_{\mathbf{w}}
\approx
\underbrace{\begin{bmatrix} y_1 \\ y_2 \\ \vdots \\ y_N \end{bmatrix}}_{\mathbf{y}}$$

If $\Phi^{T} \Phi$ is invertible, $\mathbf{w} = (\Phi^{T} \Phi)^{-1} \Phi^{T} \mathbf{y}$ (pseudo-inverse)

Learning From Data - Lecture 16, 14/20
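The pseudo-inverse solution can be sketched by solving the normal equations $(\Phi^{T}\Phi)\mathbf{w} = \Phi^{T}\mathbf{y}$ rather than forming the inverse explicitly. The example below is illustrative only: the data, the two fixed centers (which in practice would come from clustering), the value of `gamma`, and the helper names are all assumptions.

```python
import math

def sq_dist(a, b):
    """Squared Euclidean distance ||a - b||^2."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))

def build_phi(points, centers, gamma):
    """N x K matrix with Phi[n][k] = exp(-gamma * ||x_n - mu_k||^2)."""
    return [[math.exp(-gamma * sq_dist(x, mu)) for mu in centers]
            for x in points]

def solve(A, b):
    """Solve A w = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        if abs(M[pivot][col]) < 1e-12:
            raise ValueError("Phi^T Phi is singular or nearly singular")
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    w = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(M[i][j] * w[j] for j in range(i + 1, n))
        w[i] = (M[i][n] - tail) / M[i][i]
    return w

def fit_weights(points, targets, centers, gamma):
    """Least-squares weights from the normal equations (Phi^T Phi) w = Phi^T y."""
    phi = build_phi(points, centers, gamma)
    K = len(centers)
    gram = [[sum(row[i] * row[j] for row in phi) for j in range(K)]
            for i in range(K)]
    rhs = [sum(row[i] * y for row, y in zip(phi, targets)) for i in range(K)]
    return solve(gram, rhs)

# Hypothetical data: N = 6 points in two groups, K = 2 fixed centers.
points = [(0.0,), (0.5,), (1.0,), (4.0,), (4.5,), (5.0,)]
targets = [1.0, 1.2, 0.9, -1.0, -1.1, -0.8]
centers = [(0.5,), (4.5,)]
gamma = 1.0

weights = fit_weights(points, targets, centers, gamma)
```

With $K < N$ the fit is no longer exact: the weights minimize the squared error over the $N$ points instead of interpolating them.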
