Machine Learning Techniques (機器學習技巧)
Lecture 13: RBF Networks
Hsuan-Tien Lin (林軒田), htlin@csie.ntu.edu.tw
Department of Computer Science & Information Engineering
National Taiwan University (國立台灣大學資訊工程系)
Agenda
Lecture 13: RBF Networks
• Full RBF Model
• Prototype Extraction
• RBF Network
• Connection to Other Views
RBF Networks Full RBF Model
Disclaimer
Many parts of this lecture borrow from Prof. Yaser S. Abu-Mostafa's slides, with permission.
Learning From Data
Yaser S. Abu-Mostafa
California Institute of Technology
Lecture 16: Radial Basis Functions
Basic RBF model

Each $(\mathbf{x}_n, y_n) \in \mathcal{D}$ influences $h(\mathbf{x})$ based on $\underbrace{\|\mathbf{x} - \mathbf{x}_n\|}_{\text{radial}}$.

Standard form:
\[
h(\mathbf{x}) = \sum_{n=1}^{N} w_n \underbrace{\exp\left(-\gamma \|\mathbf{x} - \mathbf{x}_n\|^2\right)}_{\text{basis function}}
\]
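The standard form of the basic RBF model can be evaluated directly. The following is a minimal sketch, assuming NumPy; the function name `rbf_hypothesis` and the tiny data set are made up for illustration and are not from the slides.

```python
import numpy as np

def rbf_hypothesis(x, points, weights, gamma):
    """Evaluate h(x) = sum_n w_n * exp(-gamma * ||x - x_n||^2)."""
    x = np.asarray(x, dtype=float)
    points = np.asarray(points, dtype=float)
    weights = np.asarray(weights, dtype=float)
    sq_dist = np.sum((points - x) ** 2, axis=1)   # ||x - x_n||^2 for every n
    return float(weights @ np.exp(-gamma * sq_dist))

# Illustrative data (hypothetical): two points on the real line.
X = np.array([[0.0], [1.0]])
w = np.array([2.0, -1.0])
h_at_first_point = rbf_hypothesis([0.0], X, w, gamma=1.0)
```

Each data point contributes most where `x` is close to it, and the contribution decays with the squared distance at a rate set by `gamma`.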
The learning algorithm

Finding $w_1, \cdots, w_N$:
\[
h(\mathbf{x}) = \sum_{n=1}^{N} w_n \exp\left(-\gamma \|\mathbf{x} - \mathbf{x}_n\|^2\right)
\]
based on $\mathcal{D} = (\mathbf{x}_1, y_1), \cdots, (\mathbf{x}_N, y_N)$

$E_{\text{in}} = 0$: $h(\mathbf{x}_n) = y_n$ for $n = 1, \cdots, N$:
\[
\sum_{m=1}^{N} w_m \exp\left(-\gamma \|\mathbf{x}_n - \mathbf{x}_m\|^2\right) = y_n
\]
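As a small worked instance of the $E_{\text{in}} = 0$ conditions, take $N = 2$ with $\mathbf{x}_1 \neq \mathbf{x}_2$ and $\gamma > 0$. The shorthand $\kappa$ below is introduced only for this illustration and does not appear in the slides.

```latex
\[
\kappa = \exp\left(-\gamma \|\mathbf{x}_1 - \mathbf{x}_2\|^2\right), \qquad 0 < \kappa < 1
\]
\[
\begin{aligned}
w_1 + \kappa\, w_2 &= y_1 \\
\kappa\, w_1 + w_2 &= y_2
\end{aligned}
\qquad\Longrightarrow\qquad
w_1 = \frac{y_1 - \kappa\, y_2}{1 - \kappa^2}, \quad
w_2 = \frac{y_2 - \kappa\, y_1}{1 - \kappa^2}
\]
```

Substituting back gives $w_1 + \kappa w_2 = y_1$ and $\kappa w_1 + w_2 = y_2$, so both training points are fitted exactly.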
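The solution

\[
\sum_{m=1}^{N} w_m \exp\left(-\gamma \|\mathbf{x}_n - \mathbf{x}_m\|^2\right) = y_n
\]
$N$ equations in $N$ unknowns:
\[
\underbrace{\begin{bmatrix}
\exp(-\gamma \|\mathbf{x}_1 - \mathbf{x}_1\|^2) & \cdots & \exp(-\gamma \|\mathbf{x}_1 - \mathbf{x}_N\|^2) \\
\exp(-\gamma \|\mathbf{x}_2 - \mathbf{x}_1\|^2) & \cdots & \exp(-\gamma \|\mathbf{x}_2 - \mathbf{x}_N\|^2) \\
\vdots & \ddots & \vdots \\
\exp(-\gamma \|\mathbf{x}_N - \mathbf{x}_1\|^2) & \cdots & \exp(-\gamma \|\mathbf{x}_N - \mathbf{x}_N\|^2)
\end{bmatrix}}_{\Phi}
\underbrace{\begin{bmatrix} w_1 \\ w_2 \\ \vdots \\ w_N \end{bmatrix}}_{\mathbf{w}}
=
\underbrace{\begin{bmatrix} y_1 \\ y_2 \\ \vdots \\ y_N \end{bmatrix}}_{\mathbf{y}}
\]
If $\Phi$ is invertible, $\mathbf{w} = \Phi^{-1} \mathbf{y}$ ("exact interpolation").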
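The exact-interpolation solution can be sketched numerically as follows, assuming NumPy. The helper name `rbf_matrix`, the four training points, and the value of `gamma` are illustrative assumptions, not part of the lecture.

```python
import numpy as np

def rbf_matrix(X, centers, gamma):
    """Phi[n, m] = exp(-gamma * ||X[n] - centers[m]||^2)."""
    diff = X[:, None, :] - centers[None, :, :]
    return np.exp(-gamma * np.sum(diff ** 2, axis=2))

# Hypothetical training set: N = 4 distinct points in the plane.
X = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
y = np.array([1.0, -1.0, -1.0, 1.0])
gamma = 1.0

Phi = rbf_matrix(X, X, gamma)     # N x N, symmetric, ones on the diagonal
w = np.linalg.solve(Phi, y)       # solves Phi w = y without forming the inverse
train_predictions = Phi @ w       # reproduces y up to rounding error
```

Here `np.linalg.solve` is used in place of an explicit inverse; it raises `LinAlgError` when `Phi` is singular, which corresponds to the "if Φ is invertible" condition on the slide.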
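RBF for classification

\[
h(\mathbf{x}) = \mathrm{sign}\left(\sum_{n=1}^{N} w_n \exp\left(-\gamma \|\mathbf{x} - \mathbf{x}_n\|^2\right)\right)
\]

Learning: $\sim$ linear regression for classification
\[
s = \sum_{n=1}^{N} w_n \exp\left(-\gamma \|\mathbf{x} - \mathbf{x}_n\|^2\right)
\]
Minimize $(s - y)^2$ on $\mathcal{D}$, where $y = \pm 1$
\[
h(\mathbf{x}) = \mathrm{sign}(s)
\]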
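A minimal sketch of this recipe, assuming NumPy: fit the signal $s$ to the $\pm 1$ labels by least squares, then take the sign. The function names, the 1-D data, and the choice to map $\mathrm{sign}(0)$ to $+1$ are assumptions made for illustration.

```python
import numpy as np

def rbf_features(X, centers, gamma):
    """Matrix of exp(-gamma * ||x - center||^2) for every (x, center) pair."""
    diff = X[:, None, :] - centers[None, :, :]
    return np.exp(-gamma * np.sum(diff ** 2, axis=2))

def fit_rbf_classifier(X, y, gamma):
    """Minimize the squared error (s - y)^2 summed over the data set."""
    Phi = rbf_features(X, X, gamma)
    w, *_ = np.linalg.lstsq(Phi, y, rcond=None)
    return w

def predict(X_new, X_train, w, gamma):
    s = rbf_features(X_new, X_train, gamma) @ w
    return np.where(s >= 0, 1, -1)   # sign(s); sign(0) is mapped to +1 here

# Hypothetical data: negatives on the left, positives on the right.
X = np.array([[-2.0], [-1.0], [1.0], [2.0]])
y = np.array([-1.0, -1.0, 1.0, 1.0])
w = fit_rbf_classifier(X, y, gamma=1.0)
labels = predict(np.array([[-1.5], [1.5]]), X, w, gamma=1.0)
```

With one basis function per training point, the least-squares fit reaches zero squared error on this data, so the classifier agrees with every training label.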
Relationship to nearest-neighbor method

Adopt the $y$ value of a nearby point; a basis function has a similar effect.
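The comparison can be sketched in code, assuming NumPy. Note one simplifying assumption that is not in the slide: the RBF predictor below uses the labels themselves as weights ($w_n = y_n$) instead of learned weights, so that every training point casts a distance-weighted vote. All names and data are hypothetical.

```python
import numpy as np

def nearest_neighbor_predict(x, X_train, y_train):
    """Adopt the y value of the training point closest to x."""
    sq_dist = np.sum((X_train - x) ** 2, axis=1)
    return y_train[np.argmin(sq_dist)]

def rbf_vote_predict(x, X_train, y_train, gamma):
    """Sign of sum_n y_n * exp(-gamma * ||x - x_n||^2): nearer points count more."""
    sq_dist = np.sum((X_train - x) ** 2, axis=1)
    s = np.sum(y_train * np.exp(-gamma * sq_dist))
    return 1.0 if s >= 0 else -1.0

# Hypothetical 1-D data.
X_train = np.array([[0.0], [1.0], [3.0]])
y_train = np.array([1.0, -1.0, 1.0])
x = np.array([0.9])

nn_label = nearest_neighbor_predict(x, X_train, y_train)
rbf_label = rbf_vote_predict(x, X_train, y_train, gamma=10.0)
```

With a large `gamma`, the closest point dominates the sum, so the RBF vote behaves much like the nearest-neighbor rule while remaining a smooth function of `x`.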
RBF Networks Prototype Extraction
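RBF with $K$ centers

$N$ parameters $w_1, \cdots, w_N$ based on $N$ data points

Use $K \ll N$ centers: $\boldsymbol{\mu}_1, \cdots, \boldsymbol{\mu}_K$ instead of $\mathbf{x}_1, \cdots, \mathbf{x}_N$
\[
h(\mathbf{x}) = \sum_{k=1}^{K} w_k \exp\left(-\gamma \|\mathbf{x} - \boldsymbol{\mu}_k\|^2\right)
\]

1. How to choose the centers $\boldsymbol{\mu}_k$
2. How to choose the weights $w_k$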
Choosing the centers

Minimize the distance between $\mathbf{x}_n$ and the closest center $\boldsymbol{\mu}_k$: $K$-means clustering

Split $\mathbf{x}_1, \cdots, \mathbf{x}_N$ into clusters $S_1, \cdots, S_K$

Minimize
\[
\sum_{k=1}^{K} \sum_{\mathbf{x}_n \in S_k} \|\mathbf{x}_n - \boldsymbol{\mu}_k\|^2
\]

Unsupervised learning
NP-hard
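To make the $K$-means objective concrete, the sketch below (assuming NumPy) computes it for two candidate splits of a toy data set. The function name `kmeans_objective`, the encoding of clusters as an index array, and the data are illustrative assumptions.

```python
import numpy as np

def kmeans_objective(X, centers, assignment):
    """Sum over k and x_n in S_k of ||x_n - mu_k||^2.

    assignment[n] = k means that x_n belongs to cluster S_k.
    """
    diffs = X - centers[assignment]   # x_n minus the center of its own cluster
    return float(np.sum(diffs ** 2))

# Hypothetical data: two tight pairs of points, far apart.
X = np.array([[0.0, 0.0], [0.0, 2.0], [10.0, 0.0], [10.0, 2.0]])
centers = np.array([[0.0, 1.0], [10.0, 1.0]])

good_split = np.array([0, 0, 1, 1])   # each pair shares a cluster
bad_split = np.array([0, 1, 0, 1])    # pairs are mixed across clusters
good_cost = kmeans_objective(X, centers, good_split)
bad_cost = kmeans_objective(X, centers, bad_split)
```

The objective depends on both the centers and the split, which is why minimizing it jointly is hard: the split is a discrete choice.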
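An iterative algorithm

Lloyd's algorithm: Iteratively minimize
\[
\sum_{k=1}^{K} \sum_{\mathbf{x}_n \in S_k} \|\mathbf{x}_n - \boldsymbol{\mu}_k\|^2 \quad \text{w.r.t. } \boldsymbol{\mu}_k, S_k
\]

\[
\boldsymbol{\mu}_k \leftarrow \frac{1}{|S_k|} \sum_{\mathbf{x}_n \in S_k} \mathbf{x}_n
\]
\[
S_k \leftarrow \left\{ \mathbf{x}_n : \|\mathbf{x}_n - \boldsymbol{\mu}_k\| \le \text{all } \|\mathbf{x}_n - \boldsymbol{\mu}_\ell\| \right\}
\]

Convergence $\longrightarrow$ local minimum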
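The two alternating updates can be sketched as below, assuming NumPy. Several details are assumptions rather than part of the slide: the centers are initialized as randomly chosen data points, ties go to the lowest cluster index, an empty cluster keeps its previous center, and the loop stops once the centers no longer move.

```python
import numpy as np

def lloyd(X, K, n_iter=100, seed=0):
    """Alternate the S_k and mu_k updates until the centers stop changing."""
    rng = np.random.default_rng(seed)
    # Assumption: start from K distinct data points picked at random.
    centers = X[rng.choice(len(X), size=K, replace=False)].copy()
    for _ in range(n_iter):
        # S_k update: assign each x_n to its closest center.
        sq_dist = np.sum((X[:, None, :] - centers[None, :, :]) ** 2, axis=2)
        assignment = np.argmin(sq_dist, axis=1)
        # mu_k update: mean of the points currently in S_k.
        new_centers = centers.copy()
        for k in range(K):
            members = X[assignment == k]
            if len(members) > 0:          # an empty cluster keeps its old center
                new_centers[k] = members.mean(axis=0)
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return centers, assignment

# Hypothetical data: two well-separated groups of three points each.
X = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0],
              [10.0, 10.0], [10.0, 11.0], [11.0, 10.0]])
centers, assignment = lloyd(X, K=2)
```

Neither update can increase the objective, which is why the procedure converges; the result is only a local minimum and in general depends on the initialization.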
Lloyd's algorithm in action

1. Get the data points
2. Only the inputs!
3. Initialize the centers
4. Iterate
5. These are your $\boldsymbol{\mu}_k$'s
RBF Networks RBF Network
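Choosing the weights

\[
\sum_{k=1}^{K} w_k \exp\left(-\gamma \|\mathbf{x}_n - \boldsymbol{\mu}_k\|^2\right) \approx y_n
\]
$N$ equations in $K < N$ unknowns:
\[
\underbrace{\begin{bmatrix}
\exp(-\gamma \|\mathbf{x}_1 - \boldsymbol{\mu}_1\|^2) & \cdots & \exp(-\gamma \|\mathbf{x}_1 - \boldsymbol{\mu}_K\|^2) \\
\exp(-\gamma \|\mathbf{x}_2 - \boldsymbol{\mu}_1\|^2) & \cdots & \exp(-\gamma \|\mathbf{x}_2 - \boldsymbol{\mu}_K\|^2) \\
\vdots & \ddots & \vdots \\
\exp(-\gamma \|\mathbf{x}_N - \boldsymbol{\mu}_1\|^2) & \cdots & \exp(-\gamma \|\mathbf{x}_N - \boldsymbol{\mu}_K\|^2)
\end{bmatrix}}_{\Phi}
\underbrace{\begin{bmatrix} w_1 \\ w_2 \\ \vdots \\ w_K \end{bmatrix}}_{\mathbf{w}}
\approx
\underbrace{\begin{bmatrix} y_1 \\ y_2 \\ \vdots \\ y_N \end{bmatrix}}_{\mathbf{y}}
\]
If $\Phi^{T}\Phi$ is invertible, $\mathbf{w} = (\Phi^{T}\Phi)^{-1}\Phi^{T}\mathbf{y}$ ("pseudo-inverse").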
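The pseudo-inverse step can be sketched as follows, assuming NumPy. The data, the value of `gamma`, and the two hand-picked centers are hypothetical; in the lecture's pipeline the centers would instead come from $K$-means.

```python
import numpy as np

def rbf_design_matrix(X, centers, gamma):
    """Phi[n, k] = exp(-gamma * ||x_n - mu_k||^2), shape (N, K)."""
    diff = X[:, None, :] - centers[None, :, :]
    return np.exp(-gamma * np.sum(diff ** 2, axis=2))

# Hypothetical 1-D regression data: N = 6 points, K = 2 centers.
X = np.array([[0.0], [0.5], [1.0], [3.0], [3.5], [4.0]])
y = np.array([1.0, 1.2, 0.9, -1.0, -1.1, -0.8])
centers = np.array([[0.5], [3.5]])   # assumption: picked by hand, not by K-means
gamma = 1.0

Phi = rbf_design_matrix(X, centers, gamma)            # N x K
w_normal = np.linalg.solve(Phi.T @ Phi, Phi.T @ y)    # (Phi^T Phi)^{-1} Phi^T y
w_lstsq, *_ = np.linalg.lstsq(Phi, y, rcond=None)     # same least-squares minimizer
```

Both lines compute the same least-squares weights here. `np.linalg.lstsq` avoids forming $\Phi^{T}\Phi$ explicitly, which tends to be more robust when the columns of $\Phi$ are nearly dependent.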