(1)

Implementation of the MLP Kernel

Cheng-Yuan Liou* and Wei-Chen Cheng

Department of Computer Science and Information Engineering

National Taiwan University, Republic of China

*cyliou@csie.ntu.edu.tw

25 Nov. 2008, 17:40-19:00

Auckland

(2)
(3)

ICONIP 2008 Liou, C.-Y. 3

Related works

Contribution                          People               Year
ICONIP, weight design, upper bound    Liou, Yu             1994
ICNN (Perth), AIR                     Liou, Yu             1995
ICS, SIR                              Liou, Chen, Huang    2000
ICONIP                                Liou, Cheng          2007
Support vector machine                Boser                1992

[Figure: an MLP with layer widths n_1, ..., n_m and layer representation sets Y^1, ..., Y^m.]

(4)

Liou and Yu, 1995, ICNN, Perth

|Y^0| = |X| = 14, |Y^1| = 6, |Y^2| = 4, |Y^3| = 2

Each layer is a many-to-one mapping (distinct patterns x_p, x_q can share a single representation y^(p,1)), so |Y^m| << |Y^(m-1)|, until the last layer reaches |Y^L| = C = 2, the number of classes.

(5)


[Figure: the layer outputs y^(p,m-2) → y^(p,m-1) → y^(p,m) in an MLP with layer widths n_1, ..., n_m.]

• W_m by design, and guaranteed (Liou and Yu, 1994, ICONIP, Seoul)

• W_m by training, SIR (Liou, Chen, Huang, 2000, ICS; Liou, Cheng, 2007, ICONIP)

The mth layer operates on the (m-1)th layer's representations, Y^m = { y^(p,m) }, and the last layer reaches |Y^L| = C.

(6)

This is the reason why the training is done layer after layer, each layer independently.

[Liou and Yu, 1995, ICNN, Perth]

(7)


ICNN, 1995, Perth, AIR tree

[Figure: a network mapping x to y, with an input layer, a first hidden layer (L1, L2), a second hidden layer (L3), and an output layer, each with a bias unit. The binary codes of the hidden units, (000) through (111), are traced as an AIR tree that splits the patterns level by level: L1 → L2 → L3.]

(8)

Conclusions of AIR, 1995

• BP cannot correct the latent error neurons by adjusting their succeeding layers.

• The AIR tree can trace the errors in a latent layer near the front input layer.

• The front layers must send right signals to their succeeding layers.

• The front layers must be trained layer after layer in order to get right signals.

• Split the function of supervised BP into categorization and calibration.

• Reduced number of representations: |Y^m| << |Y^(m-1)|

(9)


AIR, 1995

• Supervised BP

• Identified the function of the MLP:

– Classification = Categorization + Calibration

– Categorization resolves the differences of classes; calibration attaches the class labels.

• The front layers categorize the input x; the output layer calibrates it to y.

(10)

Weight design

ICONIP, 1994, Seoul

• Weight design for each layer

• Number of neurons (E. B. Baum, 1988)

– Upper bound for the first hidden layer: ⌈P/D⌉ neurons for P patterns in D dimensions

– for hidden layers

– |Y^L| = C, the number of classes, guaranteed
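Baum's bound can be checked numerically; a minimal sketch (the function name and the example numbers are illustrative, not from the slides):

```python
import math

def baum_upper_bound(P: int, D: int) -> int:
    """Upper bound on the number of first-hidden-layer neurons needed to
    separate P patterns in general position in D dimensions
    (E. B. Baum, 1988): ceil(P / D)."""
    return math.ceil(P / D)

# e.g. 14 patterns in a 2-dimensional input space
print(baum_upper_bound(14, 2))  # 7
```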

(11)


Continuous border

1st hidden layer

divide and conquer

(12)

Devise training for layers: SIR, ICS 2000

• Categorization sector

• Uses the differences between classes implicitly

E_rep^(p,q,m) = -(1/2) ||y^(p,m) - y^(q,m)||²   (inter-class pairs)

E_att^(p,q,m) = (1/2) ||y^(p,m) - y^(q,m)||²   (intra-class pairs)
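The two energies can be written as a short NumPy sketch (function names are illustrative):

```python
import numpy as np

def e_att(y_p, y_q):
    """Attracting energy for an intra-class pair: (1/2)||y_p - y_q||^2."""
    d = np.asarray(y_p, dtype=float) - np.asarray(y_q, dtype=float)
    return 0.5 * float(d @ d)

def e_rep(y_p, y_q):
    """Repelling energy for an inter-class pair: -(1/2)||y_p - y_q||^2."""
    return -e_att(y_p, y_q)
```

Minimizing E_att pulls same-class outputs together; minimizing E_rep (which is negative) pushes different-class outputs apart.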

(13)


SIR kernel

[Figure: the input patterns x_1, ..., x_5 in the input space are mapped by W_1 to the first-layer outputs y^(1,1), ..., y^(5,1) in the output space; x_p = y^(p,0).]

U_1 = { (x_1,x_2), (x_1,x_3), (x_2,x_3) } : intra-class pattern pairs of class 1

U_2 = { (x_4,x_5) } : intra-class pattern pairs of class 2

V_{1,2} = { (x_1,x_4), (x_1,x_5), (x_2,x_4), (x_2,x_5), (x_3,x_4), (x_3,x_5) } : inter-class pattern pairs

(14)

SIR kernel

[Figure: the same mapping; the closest inter-class pair is highlighted.]

(x_3, x_4) = argmin over (x_i, x_j) ∈ V_{1,2} of ||y^(i,1) - y^(j,1)|| : the closest inter-class pair

U_1 = { (x_1,x_2), (x_1,x_3), (x_2,x_3) }, U_2 = { (x_4,x_5) }

(15)


SIR kernel

[Figure: the same mapping; both selected pairs are highlighted.]

(x_3, x_4) = argmin over (x_i, x_j) ∈ V_{1,2} of ||y^(i,1) - y^(j,1)|| : the closest inter-class pair

(x_1, x_2) = argmax over (x_p, x_q) ∈ U_1 or U_2 of ||y^(p,1) - y^(q,1)|| : the farthest intra-class pair

U_1 = { (x_1,x_2), (x_1,x_3), (x_2,x_3) }, U_2 = { (x_4,x_5) }

V_{1,2} = { (x_1,x_4), (x_1,x_5), (x_2,x_4), (x_2,x_5), (x_3,x_4), (x_3,x_5) }
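The pair selection above, an argmin over the inter-class pairs and an argmax over the intra-class pairs, can be sketched as follows (the helper function is hypothetical; distances are taken between first-layer outputs):

```python
import numpy as np

def select_pairs(y, U, V):
    """y: dict mapping pattern index -> first-layer output y^(p,1).
    U: intra-class pairs (U_1 and U_2 together); V: inter-class pairs.
    Returns the closest inter-class pair and the farthest intra-class pair."""
    dist = lambda pair: np.linalg.norm(np.asarray(y[pair[0]], dtype=float)
                                       - np.asarray(y[pair[1]], dtype=float))
    closest_inter = min(V, key=dist)   # argmin over V_{1,2}
    farthest_intra = max(U, key=dist)  # argmax over U_1 or U_2
    return closest_inter, farthest_intra
```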

(16)

SIR kernel

[Figure: the same mapping.] The repelling energy is applied to the closest inter-class pair:

E_rep(x_3, x_4) = -(1/2) ||y^(3,1) - y^(4,1)||²

U_1 = { (x_1,x_2), (x_1,x_3), (x_2,x_3) }, U_2 = { (x_4,x_5) }

(17)


SIR kernel

[Figure: the input data x_p = y^(p,0) mapped by W_1 to y^(p,1).]

Update W_1 by the following two equations:

W_1 ← W_1 - η_1 ∂E_att^(p,q,1)/∂W_1, for the intra-class pair (x_p, x_q)

W_1 ← W_1 - η_2 ∂E_rep^(r,s,1)/∂W_1, for the inter-class pair (x_r, x_s)

In this work, we set η_1 = 0.01 and η_2 = 0.1. This means that the force of repelling is stronger than that of attracting.
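With η_1 = 0.01 and η_2 = 0.1, one update step can be sketched as below. The layer is simplified to a linear map y = W x so the two gradient terms are explicit; the slides use an MLP layer, so this is an illustration, not the authors' implementation:

```python
import numpy as np

ETA_ATT, ETA_REP = 0.01, 0.1  # repelling rate > attracting rate, as on the slide

def sir_update(W, x_p, x_q, x_r, x_s):
    """One SIR update of W_1, assuming a linear layer y^(i,1) = W @ x_i.
    (x_p, x_q): farthest intra-class pair -> attract (minimize E_att).
    (x_r, x_s): closest inter-class pair  -> repel  (minimize E_rep)."""
    # dE_att/dW = (W(x_p - x_q))(x_p - x_q)^T  for E_att = (1/2)||W x_p - W x_q||^2
    grad_att = np.outer(W @ (x_p - x_q), x_p - x_q)
    # dE_rep/dW = -(W(x_r - x_s))(x_r - x_s)^T for E_rep = -(1/2)||W x_r - W x_s||^2
    grad_rep = -np.outer(W @ (x_r - x_s), x_r - x_s)
    return W - ETA_ATT * grad_att - ETA_REP * grad_rep
```

After one step the intra-class pair's outputs move closer while the inter-class pair's outputs move apart, and the larger η_2 makes the repelling move dominate.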

(18)

Two-Class Problem

• The border of the data is x_2 = (1/10)(x_1³ + x_1).

• All input values are in the range [-1, 1].

(19)


Two-Class Problem

• The number of neurons of the SIR kernel is five, n_m = 5.

• The supervised BP uses two hidden layers, each consisting of five neurons, n_MLP1 = n_MLP2 = 5.

• SVM kernel: K(u, v) = (uᵀv + 1)³

[Figure: results of supervised BP, the SIR kernel, and the SVM.]
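The polynomial kernel on this slide is a one-liner; a minimal sketch (function name is illustrative):

```python
import numpy as np

def svm_kernel(u, v):
    """SVM kernel used on the slide: K(u, v) = (u^T v + 1)^3."""
    return (float(np.dot(u, v)) + 1.0) ** 3
```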

(20)

Three-Class Problem

(21)


Three-Class Problem

• SOM is used for analyzing the output of each layer; y is the output of each layer.

• The class color of each input pattern is plotted on its winner neuron.

(22)

Three-Class Problem

(23)


Three-class problem, n_m = 5

(24)

Real World Data

• Patterns in the whole dataset are divided into 5 partitions.

• The testing accuracy is the average of the 5-fold cross-validation.

• The SVM uses a Gaussian kernel.

• The parameters C and γ are given in the table.
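The 5-fold protocol above can be sketched as follows (the splitting helper is hypothetical; each classifier's five test accuracies are then averaged):

```python
import numpy as np

def five_fold_splits(n, seed=0):
    """Divide n pattern indices into 5 partitions and yield
    (train_indices, test_indices) for each of the 5 folds."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(n), 5)
    for k in range(5):
        test = folds[k]
        train = np.concatenate([folds[j] for j in range(5) if j != k])
        yield train, test

# reported testing accuracy = mean of the 5 per-fold test accuracies, e.g.
# np.mean([evaluate(train, test) for train, test in five_fold_splits(n)])
```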

(25)


Real World Data

Dataset                   Supervised BP (n_MLP1, n_MLP2)   SVM (C, γ)   SIR kernel (n_m, L_max)   (n_1c, n_2c)
iris                      10, 20                           50, 0.05     (5, 3)                    (11, 5)
Wisconsin Breast Cancer   10, 30                           50, 0.05     (5, 1)                    (30, 7)
Parkinsons                10, 30                           50, 0.05     (5, 1)                    (20, 5)

(26)

Real World Data

                          Testing Accuracy                      Training Accuracy
Dataset                   BP        SVM       SIR kernel        BP        SVM       SIR kernel
iris                      96.00%    94.66%    97.33%            97.50%    99.67%    100%
Wisconsin Breast Cancer   96.42%    95.57%    96.00%            97.53%    98.89%    100%
Parkinsons                92.82%    88.20%    91.28%            99.87%    98.33%    100%

(27)


Summary

• Class to point, guaranteed: |Y^L| = C

• Widely separated class points

• Weights by design or by training

• Class labels are not used.

• The SIR kernel can be used in an SVM.

• Hairy network techniques can be used in the calibration sector.

• Suitable for multiple-class problems.

(28)

Thank You

Implementation of the MLP Kernel

Cheng-Yuan Liou* and Wei-Chen Cheng

Department of Computer Science and Information Engineering

National Taiwan University, Republic of China

*cyliou@csie.ntu.edu.tw
