Radial Basis Function Network RBF Network Learning

Fewer Centers as Regularization

recall:

g_{\text{SVM}}(x) = \operatorname{sign}\left( \sum_{\text{SV}} \alpha_m y_m \exp\left(-\gamma \|x - x_m\|^2\right) + b \right)

—only '≪ N' SVs needed in 'network'

next: M ≪ N instead of M = N

effect: regularization by constraining number of centers and voting weights

physical meaning of centers μ_m: prototypes

remaining question: how to extract prototypes?
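
A minimal numerical sketch of the idea above (my own illustration, not from the slides): a Gaussian RBF network whose hypothesis uses only M ≪ N prototype centers and M voting weights β. The function name `rbf_net_predict`, the value of `gamma`, and the arbitrary choice of the first three points as prototypes are all assumptions for illustration.

```python
import numpy as np

def rbf_net_predict(X_query, centers, beta, b, gamma=1.0):
    """Hypothetical RBF-network hypothesis with M << N centers:
    h(x) = sign( sum_m beta_m * exp(-gamma * ||x - mu_m||^2) + b )."""
    # squared distances between each query point and each center: shape (Q, M)
    d2 = ((X_query[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    z = np.exp(-gamma * d2)          # Gaussian RBF features
    return np.sign(z @ beta + b)     # voting with only M weights

# toy usage: N = 100 points but only M = 3 centers (the regularized network)
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 2))
centers = X[:3]                      # placeholder prototypes; choosing them
beta = rng.normal(size=3)            #   well is exactly the remaining question
b = 0.0
print(rbf_net_predict(X[:5], centers, beta, b))
```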

Radial Basis Function Network RBF Network Learning

Fun Time

If x_1 = x_2, what happens in the Z matrix of the full Gaussian RBF network?

1. the first two rows of the matrix are the same
2. the first two columns of the matrix are different
3. the matrix is invertible
4. the sub-matrix at the intersection of the first two rows and the first two columns contains a constant of 0

Reference Answer: 1

It is easy to see that the first two rows must be the same; so must the first two columns. The two identical rows make the matrix singular; the sub-matrix in choice 4 contains a constant of 1 = exp(−0) instead of 0.
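
To see the answer concretely, here is a small numpy check (my own illustration, not part of the quiz): build the full Gaussian RBF matrix with z_nm = exp(−γ‖x_n − x_m‖²) for data in which x_1 = x_2, then confirm that the first two rows and columns coincide and that Z is singular. The value of `gamma` is an arbitrary assumption.

```python
import numpy as np

gamma = 1.0
X = np.array([[0.0, 0.0],   # x_1
              [0.0, 0.0],   # x_2 = x_1 (duplicate point)
              [1.0, 2.0],
              [3.0, 1.0]])

# full Gaussian RBF matrix: z_nm = exp(-gamma * ||x_n - x_m||^2)
d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=2)
Z = np.exp(-gamma * d2)

print(np.allclose(Z[0], Z[1]))            # True: first two rows are the same
print(np.allclose(Z[:, 0], Z[:, 1]))      # True: first two columns are the same
print(np.linalg.matrix_rank(Z) < len(X))  # True: Z is singular (not invertible)
print(Z[:2, :2])                          # 2x2 sub-matrix is all 1 = exp(-0)
```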

Radial Basis Function Network k-Means Algorithm

Good Prototypes: Clustering Problem

if x_1 ≈ x_2,
=⇒ no need for both RBF(x, x_1) & RBF(x, x_2) in RBFNet
=⇒ cluster x_1 and x_2 by one prototype μ ≈ x_1 ≈ x_2

clustering with prototype:
• partition {x_n} into disjoint sets S_1, S_2, · · · , S_M
• choose μ_m for each S_m
—hope: x_1, x_2 both ∈ S_m =⇒ μ_m ≈ x_1 ≈ x_2

cluster error with squared error measure:

E_{\text{in}}(S_1, \cdots, S_M; \mu_1, \cdots, \mu_M) = \frac{1}{N} \sum_{n=1}^{N} \sum_{m=1}^{M} [\![ x_n \in S_m ]\!]\, \|x_n - \mu_m\|^2

goal: with S_1, · · · , S_M being a partition of {x_n},

\min_{\{S_1, \cdots, S_M;\; \mu_1, \cdots, \mu_M\}} E_{\text{in}}(S_1, \cdots, S_M; \mu_1, \cdots, \mu_M)
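
A short sketch (not from the slides) of evaluating the cluster error above for a given hard partition and prototypes. The encoding is an assumption: an index array `assign` with assign[n] = m standing for the indicator ⟦x_n ∈ S_m⟧.

```python
import numpy as np

def cluster_error(X, assign, mu):
    """E_in(S_1..S_M; mu_1..mu_M) = (1/N) sum_n sum_m [[x_n in S_m]] ||x_n - mu_m||^2,
    where assign[n] = m encodes the hard-partition indicator [[x_n in S_m]]."""
    diffs = X - mu[assign]           # each point minus its own cluster's prototype
    return (diffs ** 2).sum() / len(X)

# toy example with N = 6 points and M = 2 clusters
X = np.array([[0.0, 0.0], [0.1, 0.0], [0.0, 0.2],
              [5.0, 5.0], [5.2, 4.9], [4.8, 5.1]])
assign = np.array([0, 0, 0, 1, 1, 1])                 # S_1: first three, S_2: last three
mu = np.array([X[:3].mean(axis=0), X[3:].mean(axis=0)])
print(cluster_error(X, assign, mu))                   # small: points sit near their prototypes
```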

Radial Basis Function Network k-Means Algorithm

Partition Optimization

with S_1, · · · , S_M being a partition of {x_n},

\min_{\{S_1, \cdots, S_M;\; \mu_1, \cdots, \mu_M\}} \sum_{n=1}^{N} \sum_{m=1}^{M} [\![ x_n \in S_m ]\!]\, \|x_n - \mu_m\|^2

hard to optimize: joint combinatorial-numerical optimization

two sets of variables: will optimize alternatingly

if μ_1, · · · , μ_M fixed, for each x_n
• ⟦x_n ∈ S_m⟧: choose one and only one subset
• ‖x_n − μ_m‖²: distance to each prototype

optimal chosen subset S_m = the one with minimum ‖x_n − μ_m‖², i.e., the closest prototype μ_m
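
The partition half of the alternating optimization described above, sketched in numpy (illustrative, using the same `assign`/`mu` encoding as the earlier cluster-error sketch): with μ_1, · · · , μ_M fixed, each x_n is put into the subset whose prototype is closest, which minimizes its term ⟦x_n ∈ S_m⟧‖x_n − μ_m‖² in the objective.

```python
import numpy as np

def optimal_partition(X, mu):
    """For fixed prototypes mu, choose one and only one subset per point:
    assign x_n to the S_m whose mu_m gives minimum ||x_n - mu_m||^2."""
    d2 = ((X[:, None, :] - mu[None, :, :]) ** 2).sum(axis=2)  # (N, M) squared distances
    return d2.argmin(axis=1)          # index of the closest prototype for each point

# continuing the toy example: perturbed prototypes still recover the partition
X = np.array([[0.0, 0.0], [0.1, 0.0], [0.0, 0.2],
              [5.0, 5.0], [5.2, 4.9], [4.8, 5.1]])
mu = np.array([[0.5, 0.5], [4.5, 4.5]])
print(optimal_partition(X, mu))       # [0 0 0 1 1 1]
```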
