Radial Basis Function Network RBF Network Learning
Fewer Centers as Regularization
recall:
$$g_{\text{SVM}}(x) = \operatorname{sign}\left( \sum_{\text{SV}} \alpha_m y_m \exp\left( -\gamma \|x - x_m\|^2 \right) + b \right)$$
only '$\ll N$' SVs needed in the 'network'
• next: $M \ll N$ instead of $M = N$
• effect: regularization by constraining the number of centers and voting weights
• physical meaning of centers $\mu_m$: prototypes
remaining question: how to extract prototypes?
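A minimal sketch of what "fewer centers" looks like in code, assuming a Gaussian RBF network that votes over $M$ prototypes instead of all $N$ examples. The names `gaussian_rbf`, `rbf_network`, `centers`, and `betas` (the voting weights) are illustrative choices, not taken from the slides, and the prototype values are made up for the example.

```python
import math

def gaussian_rbf(x, mu, gamma):
    """Gaussian RBF: exp(-gamma * ||x - mu||^2)."""
    sq_dist = sum((xi - mi) ** 2 for xi, mi in zip(x, mu))
    return math.exp(-gamma * sq_dist)

def rbf_network(x, centers, betas, gamma, b=0.0):
    """Sign of the weighted vote of M centers (M = len(centers))."""
    score = b + sum(beta * gaussian_rbf(x, mu, gamma)
                    for beta, mu in zip(betas, centers))
    return 1 if score >= 0 else -1

# Hypothetical prototypes: M = 2 centers, one voting +1 and one voting -1.
centers = [(0.0, 0.0), (2.0, 2.0)]
betas = [1.0, -1.0]

print(rbf_network((0.1, -0.1), centers, betas, gamma=1.0))  # near the first center
print(rbf_network((1.9, 2.2), centers, betas, gamma=1.0))   # near the second center
```

With only two centers there are only two voting weights to fit, which is the constraint the slide calls regularization.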
Fun Time
If $x_1 = x_2$, what happens in the $Z$ matrix of the full Gaussian RBF network?
1. the first two rows of the matrix are the same
2. the first two columns of the matrix are different
3. the matrix is invertible
4. the sub-matrix at the intersection of the first two rows and the first two columns contains a constant of 0
Reference Answer: 1
It is easy to see that the first two rows must be the same; so must the first two columns. The two identical rows make the matrix singular; the sub-matrix in 4 contains a constant of $1 = \exp(-0)$ instead of 0.
Hsuan-Tien Lin (NTU CSIE) Machine Learning Techniques 12/24
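The Fun Time answer can be checked numerically. The sketch below assumes $Z_{nm} = \exp(-\gamma \|x_n - x_m\|^2)$ with every example used as a center; the helper names `gaussian_z_matrix` and `det3`, the sample points, and the value of `gamma` are illustrative assumptions.

```python
import math

def gaussian_z_matrix(points, gamma):
    """Z[n][m] = exp(-gamma * ||x_n - x_m||^2), every point acting as a center."""
    def sq_dist(a, b):
        return sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    return [[math.exp(-gamma * sq_dist(xn, xm)) for xm in points]
            for xn in points]

def det3(z):
    """Determinant of a 3x3 matrix by cofactor expansion along the first row."""
    (a, b, c), (d, e, f), (g, h, i) = z
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

points = [(1.0, 2.0), (1.0, 2.0), (3.0, 0.0)]  # x_1 = x_2 on purpose
Z = gaussian_z_matrix(points, gamma=0.5)

print(Z[0] == Z[1])          # are the first two rows identical?
print(abs(det3(Z)) < 1e-12)  # is the matrix (numerically) singular?
print(Z[0][:2], Z[1][:2])    # top-left 2x2 sub-matrix
```

The top-left sub-matrix holds the value 1 in every entry, matching the remark about choice 4.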
Radial Basis Function Network k-Means Algorithm
Good Prototypes: Clustering Problem
if $x_1 \approx x_2$,
⟹ no need for both $\text{RBF}(x, x_1)$ and $\text{RBF}(x, x_2)$ in RBFNet,
⟹ cluster $x_1$ and $x_2$ by one prototype $\mu \approx x_1 \approx x_2$
• clustering with prototype:
  • partition $\{x_n\}$ to disjoint sets $S_1, S_2, \cdots, S_M$
  • choose $\mu_m$ for each $S_m$
  hope: $x_1, x_2$ both $\in S_m \Leftrightarrow \mu_m \approx x_1 \approx x_2$
• cluster error with squared error measure:
$$E_{\text{in}}(S_1, \cdots, S_M; \mu_1, \cdots, \mu_M) = \frac{1}{N} \sum_{n=1}^{N} \sum_{m=1}^{M} [\![ x_n \in S_m ]\!] \, \|x_n - \mu_m\|^2$$
goal: with $S_1, \cdots, S_M$ being a partition of $\{x_n\}$,
$$\min_{\{S_1, \cdots, S_M;\ \mu_1, \cdots, \mu_M\}} E_{\text{in}}(S_1, \cdots, S_M; \mu_1, \cdots, \mu_M)$$
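The cluster error with squared error measure can be evaluated directly from a partition and its prototypes. In this sketch the partition is encoded as a list `assignment`, where `assignment[n] = m` means $x_n \in S_m$, so the indicator picks exactly one term per example. The function names and the toy data are assumptions made for illustration.

```python
def squared_distance(x, mu):
    """||x - mu||^2 for two equal-length tuples."""
    return sum((xi - mi) ** 2 for xi, mi in zip(x, mu))

def cluster_error(points, assignment, prototypes):
    """E_in = (1/N) * sum_n ||x_n - mu_{assignment[n]}||^2.

    assignment[n] = m encodes x_n in S_m; since the sets are disjoint,
    the indicator [[x_n in S_m]] is 1 for exactly one m per example.
    """
    total = sum(squared_distance(x, prototypes[m])
                for x, m in zip(points, assignment))
    return total / len(points)

# Toy data: two tight groups of two points each.
points = [(0.0, 0.0), (0.0, 2.0), (10.0, 0.0), (10.0, 2.0)]
prototypes = [(0.0, 1.0), (10.0, 1.0)]

good = cluster_error(points, [0, 0, 1, 1], prototypes)
bad = cluster_error(points, [0, 1, 0, 1], prototypes)
print(good, bad)  # → 1.0 51.0
```

The partition that groups nearby points yields the smaller error, which is what the minimization over partitions and prototypes is after.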
Partition Optimization
with $S_1, \cdots, S_M$ being a partition of $\{x_n\}$,
{S
1,··· ,S
minM;µ
1,··· ,µ
M} N
X
n=1 M
X