Machine Learning Techniques
(機器學習技法)

Lecture 13: Deep Learning

Hsuan-Tien Lin (林軒田), htlin@csie.ntu.edu.tw
Department of Computer Science & Information Engineering
National Taiwan University
(國立台灣大學資訊工程系)
Deep Learning

Roadmap
1 Embedding Numerous Features: Kernel Models
2 Combining Predictive Features: Aggregation Models
3 Distilling Implicit Features: Extraction Models

Lecture 12: Neural Network
automatic pattern feature extraction from layers of neurons, with backprop for GD/SGD

Lecture 13: Deep Learning
• Deep Neural Network
• Autoencoder
• Denoising Autoencoder
• Principal Component Analysis
Deep Learning / Deep Neural Network

Physical Interpretation of NNet Revisited

[figure: a multi-layer NNet with inputs x_0 = 1, x_1, x_2, ..., x_d, a +1 bias unit per layer, tanh neurons, and weights w_ij^(1), w_jk^(2), w_kq^(3); one hidden neuron shown computing x_3^(2) = tanh(s_3^(2))]

• each layer: a pattern feature extracted from data, remember? :-)
• how many neurons? how many layers? more generally, what structure?
  • subjectively, your design!
  • objectively, validation, maybe?

structural decisions: the key issue for applying NNet
Deep Learning / Deep Neural Network

Shallow versus Deep Neural Networks

shallow: few (hidden) layers; deep: many layers

Shallow NNet
• more efficient to train (✓)
• simpler structural decisions (✓)
• theoretically powerful enough (✓)

Deep NNet
• challenging to train (×)
• sophisticated structural decisions (×)
• 'arbitrarily' powerful (✓)
• more 'meaningful'? (see next slide)

deep NNet (deep learning): gaining attention in recent years
Deep Learning / Deep Neural Network

Meaningfulness of Deep Learning

[figure: a small deep NNet for deciding "is it a '1'?" versus "is it a '5'?"; raw pixel inputs feed simple features φ_1, ..., φ_6, which combine through positive and negative weights into higher-level features z_1 and z_5]

• 'less burden' for each layer: simple to complex features
• natural for a difficult learning task with raw features, like vision

deep NNet: currently popular in vision/speech/...
Deep Learning / Deep Neural Network

Challenges and Key Techniques for Deep Learning

• difficult structural decisions:
  • subjective with domain knowledge: like convolutional NNet for images
• high model complexity:
  • no big worries if big enough data
  • regularization towards noise tolerance, like:
    • dropout (tolerant when network corrupted; a rough sketch follows this slide)
    • denoising (tolerant when input corrupted)
• hard optimization problem:
  • careful initialization to avoid bad local minima: called pre-training
• huge computational complexity (worsened with big data):
  • novel hardware/architecture: like mini-batch with GPU

IMHO, careful regularization and initialization are the key techniques
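As a rough illustration of the dropout idea above (training while the network is randomly 'corrupted'), here is a minimal sketch in Python/NumPy of the commonly used 'inverted dropout' on one hidden layer; the keep probability p = 0.5 is an illustrative choice, not a value from the lecture.

import numpy as np

def dropout_forward(h, p=0.5, rng=None):
    # randomly zero hidden activations with probability 1 - p during training,
    # scaling the survivors by 1/p so the expected activation stays unchanged;
    # at test time the layer is used without any mask
    rng = rng if rng is not None else np.random.default_rng(0)
    mask = (rng.random(h.shape) < p) / p
    return h * mask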
Deep Learning / Deep Neural Network

A Two-Step Deep Learning Framework

Simple Deep Learning
1 for ℓ = 1, ..., L, pre-train {w_ij^(ℓ)} assuming w*^(1), ..., w*^(ℓ-1) fixed
  [figure: pre-training proceeds layer by layer, illustrated in stages (a)-(d)]
2 train with backprop on the pre-trained NNet to fine-tune all {w_ij^(ℓ)}

will focus on the simplest pre-training technique, along with regularization
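The two steps can be written as a short skeleton. This is a minimal sketch in Python/NumPy, not the lecture's notation: `pretrain_layer` and `fine_tune` are assumed callables (a basic autoencoder for the former is sketched later), and `layer_dims` lists the hidden-layer sizes d^(1), ..., d^(L).

import numpy as np

def forward(Z, W):
    # one NNet layer: x^(l) = tanh([1, x^(l-1)] w^(l))
    Z1 = np.hstack([np.ones((Z.shape[0], 1)), Z])
    return np.tanh(Z1 @ W)

def two_step_deep_learning(X, y, layer_dims, pretrain_layer, fine_tune):
    weights, Z = [], X                   # Z holds x^(l-1), starting from the raw inputs
    for d_l in layer_dims:               # step 1: greedy layer-wise pre-training
        W = pretrain_layer(Z, d_l)       # learn w^(l) with earlier weights fixed
        weights.append(W)
        Z = forward(Z, W)                # representation fed to the next layer
    return fine_tune(X, y, weights)      # step 2: backprop fine-tunes all w^(l)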
Deep Learning / Deep Neural Network

Fun Time

For a deep NNet for written character recognition from raw pixels, which type of features are more likely extracted after the first hidden layer?
1 pixels
2 strokes
3 parts
4 digits

Reference Answer: 2
Simple strokes are likely the 'next-level' features that can be extracted from raw pixels.
Deep Learning / Autoencoder

Information-Preserving Encoding

• weights: a feature transform, i.e. encoding
• good weights: information-preserving encoding
  (next layer keeps the same info. with a different representation)
• information-preserving: can decode accurately after encoding

[figure: the '1' versus '5' NNet again, with low-level features φ_1, ..., φ_6, outputs z_1 and z_5, and positive/negative weights]

idea: pre-train weights towards information-preserving encoding
Deep Learning / Autoencoder

Information-Preserving Neural Network

[figure: a d-d̃-d NNet with inputs x_0 = 1, x_1, x_2, x_3, ..., x_d, tanh hidden neurons, and outputs ≈ x_1, ≈ x_2, ≈ x_3, ..., ≈ x_d; encoding weights w_ij^(1) and decoding weights w_ji^(2)]

• autoencoder: a d-d̃-d NNet with goal g_i(x) ≈ x_i, that is, learning to approximate the identity function
• w_ij^(1): encoding weights; w_ji^(2): decoding weights

why approximate the identity function?
Deep Learning / Autoencoder

Usefulness of Approximating Identity Function

if g(x) ≈ x using some hidden structures on the observed data x_n:

• for supervised learning:
  • hidden structure (essence) of x can be used as a reasonable transform Φ(x)
  (learning an 'informative' representation of data)
• for unsupervised learning:
  • density estimation: larger (structure match) when g(x) ≈ x
  • outlier detection: those x where g(x) ≉ x (a small sketch follows this slide)
  (learning a 'typical' representation of data)

autoencoder: representation-learning through approximating the identity function
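A small sketch of the outlier-detection use above: flag points whose reconstruction g(x) stays far from x. Here `reconstruct` stands for a trained autoencoder's g(·), and the threshold is a hypothetical choice; neither comes from the lecture.

import numpy as np

def find_outliers(X, reconstruct, threshold=1.0):
    # squared reconstruction error per example; a large error means a poor structure match
    errors = np.sum((reconstruct(X) - X) ** 2, axis=1)
    return np.where(errors > threshold)[0]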
Deep Learning / Autoencoder

Basic Autoencoder

basic autoencoder: a d-d̃-d NNet with error function Σ_{i=1}^{d} (g_i(x) − x_i)²

• backprop easily applies; shallow and easy to train
• usually d̃ < d: a compressed representation
• data: {(x_1, y_1 = x_1), (x_2, y_2 = x_2), ..., (x_N, y_N = x_N)}
  (often categorized as an unsupervised learning technique)
• sometimes constrain w_ij^(1) = w_ji^(2) as regularization
  (more sophisticated gradient calculation)

basic autoencoder in basic deep learning: {w_ij^(1)} taken as shallowly pre-trained weights
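A minimal sketch of the basic autoencoder in Python/NumPy, trained by stochastic gradient descent on the squared reconstruction error. It uses a tanh hidden layer and, for simplicity, a linear output layer (the lecture's figure shows tanh outputs as well); the learning rate, number of epochs, and initialization range are illustrative choices.

import numpy as np

def train_basic_autoencoder(X, d_tilde, eta=0.1, epochs=50, seed=0):
    rng = np.random.default_rng(seed)
    N, d = X.shape
    W1 = rng.uniform(-0.1, 0.1, size=(d + 1, d_tilde))    # encoding weights (with bias row)
    W2 = rng.uniform(-0.1, 0.1, size=(d_tilde + 1, d))    # decoding weights (with bias row)
    for _ in range(epochs):
        for n in rng.permutation(N):                       # SGD over examples
            x = X[n]
            x1 = np.append(1.0, x)                         # [1, x]
            h = np.tanh(x1 @ W1)                           # hidden (encoded) representation
            h1 = np.append(1.0, h)
            g = h1 @ W2                                    # reconstruction g(x) ≈ x
            err = g - x                                    # gradient of squared error (factor 2 absorbed into eta)
            grad_W2 = np.outer(h1, err)
            delta_h = (W2[1:] @ err) * (1.0 - h ** 2)      # backprop through tanh
            grad_W1 = np.outer(x1, delta_h)
            W2 -= eta * grad_W2
            W1 -= eta * grad_W1
    return W1, W2                                          # W1: shallowly pre-trained weights for one layer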
Deep Learning / Autoencoder

Pre-Training with Autoencoders

Deep Learning with Autoencoders
1 for ℓ = 1, ..., L, pre-train {w_ij^(ℓ)} assuming w*^(1), ..., w*^(ℓ-1) fixed,
  by training a basic autoencoder on {x_n^(ℓ-1)} with d̃ = d^(ℓ)
2 train with backprop on the pre-trained NNet to fine-tune all {w_ij^(ℓ)}

many successful pre-training techniques take 'fancier' autoencoders with different architectures and regularization schemes
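For concreteness, this is how the basic autoencoder plugs into step 1 of the framework sketch given earlier; `train_basic_autoencoder` and `two_step_deep_learning` are the hypothetical helpers sketched above, and `backprop_fine_tune`, `X`, `y`, and the layer sizes are assumed to be provided.

def pretrain_layer(Z, d_l):
    W1, _ = train_basic_autoencoder(Z, d_tilde=d_l)   # keep only the encoding weights
    return W1

weights = two_step_deep_learning(X, y, layer_dims=[128, 64, 32],
                                 pretrain_layer=pretrain_layer,
                                 fine_tune=backprop_fine_tune)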
Deep Learning / Autoencoder

Fun Time

Suppose training a d-d̃-d autoencoder with backprop takes approximately c · d · d̃ seconds. Then, what is the total number of seconds needed for pre-training a d-d^(1)-d^(2)-d^(3)-1 deep NNet?
1 c (d + d^(1) + d^(2) + d^(3) + 1)
2 c (d · d^(1) · d^(2) · d^(3) · 1)
3 c (d d^(1) + d^(1) d^(2) + d^(2) d^(3) + d^(3))
4 c (d d^(1) · d^(1) d^(2) · d^(2) d^(3) · d^(3))

Reference Answer: 3
Each c · d^(ℓ-1) · d^(ℓ) represents the time for pre-training with one autoencoder to determine one layer of the weights.
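A quick worked check of the answer with hypothetical layer sizes (not from the lecture):

c, d, d1, d2, d3 = 1.0, 256, 128, 64, 32
total = c * (d * d1 + d1 * d2 + d2 * d3 + d3 * 1)   # one autoencoder per layer; the final layer has 1 output
print(total)                                         # 43040.0 seconds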
Deep Learning / Denoising Autoencoder

Regularization in Deep Learning

[figure: the multi-layer NNet with weights w_ij^(1), w_jk^(2), w_kq^(3) again]

watch out for overfitting, remember? :-)

high model complexity: regularization needed
• structural decisions/constraints
• weight decay or weight elimination regularizers (a one-step sketch follows this slide)
• early stopping

next: another regularization technique
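For concreteness, a minimal sketch of how a weight-decay regularizer changes one gradient-descent update; the augmented-error form, eta, lam, and N are illustrative assumptions, not values from the lecture.

import numpy as np

def weight_decay_step(W, grad_Ein, eta=0.1, lam=0.01, N=1000):
    # one step on the augmented error E_aug(W) = E_in(W) + (lam / N) * sum(W ** 2)
    return W - eta * (grad_Ein + (2.0 * lam / N) * W)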