Linear Support Vector Machine / Support Vector Machine

Solving a Particular Standard Problem

\min_{b,w} \frac{1}{2} w^T w
subject to y_n (w^T x_n + b) \ge 1 for all n

X = \begin{bmatrix} 0 & 0 \\ 2 & 2 \\ 2 & 0 \\ 3 & 0 \end{bmatrix}, \quad
y = \begin{bmatrix} -1 \\ -1 \\ +1 \\ +1 \end{bmatrix}

The four constraints:
(i)   -b \ge 1
(ii)  -2w_1 - 2w_2 - b \ge 1
(iii)  2w_1 + 0w_2 + b \ge 1
(iv)   3w_1 + 0w_2 + b \ge 1

• (i) & (iii) ⇒ w_1 \ge +1;  (ii) & (iii) ⇒ w_2 \le -1;  hence \frac{1}{2} w^T w \ge 1
• (w_1 = 1, w_2 = -1, b = -1) is at the lower bound and satisfies (i)-(iv)

g_SVM(x) = sign(x_1 - x_2 - 1): SVM? :-)
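As a sanity check, here is a minimal NumPy sketch (an illustration, not part of the original slides) verifying the hand-derived solution: all four constraints hold, the first three with equality, and sign(w^T x_n + b) reproduces every label.

```python
import numpy as np

# toy data from the slide
X = np.array([[0, 0], [2, 2], [2, 0], [3, 0]], dtype=float)
y = np.array([-1, -1, +1, +1], dtype=float)

# hand-derived optimum: w = (1, -1), b = -1
w, b = np.array([1.0, -1.0]), -1.0

margins = y * (X @ w + b)                 # y_n (w^T x_n + b) for each n
print(margins)                            # [1. 1. 1. 2.] -> all constraints satisfied
assert np.all(margins >= 1 - 1e-12)
assert np.all(np.sign(X @ w + b) == y)    # g_SVM(x) = sign(x1 - x2 - 1) matches y
```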
Support Vector Machine (SVM)

optimal solution: (w_1 = 1, w_2 = -1, b = -1)
margin(b, w) = \frac{1}{\|w\|} = \frac{1}{\sqrt{2}} \approx 0.707
boundary: x_1 - x_2 - 1 = 0

• examples on the boundary: 'locate' the fattest hyperplane; other examples: not needed
• call the boundary examples support vectors (candidates)

support vector machine (SVM): learn the fattest hyperplane (with the help of support vectors)
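To make the support-vector idea concrete, a small NumPy sketch (mine, not from the slides) that computes the margin 1/\|w\| and flags the examples sitting exactly on the fat boundary, i.e. those whose constraints are active:

```python
import numpy as np

X = np.array([[0, 0], [2, 2], [2, 0], [3, 0]], dtype=float)
y = np.array([-1, -1, +1, +1], dtype=float)
w, b = np.array([1.0, -1.0]), -1.0

print(1.0 / np.linalg.norm(w))            # margin = 1/sqrt(2) ~ 0.707

# support vector candidates: examples with y_n (w^T x_n + b) exactly 1
on_boundary = np.isclose(y * (X @ w + b), 1.0)
print(X[on_boundary])                     # (0,0), (2,2), (2,0); (3,0) is not needed
```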
Solving General SVM

\min_{b,w} \frac{1}{2} w^T w
subject to y_n (w^T x_n + b) \ge 1 for all n

• not easy manually, of course :-)
• gradient descent? not easy with constraints
• luckily:
  • (convex) quadratic objective function of (b, w)
  • linear constraints of (b, w)
  — quadratic programming

quadratic programming (QP): 'easy' optimization problem
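Why is the problem 'easy'? The objective is a quadratic whose Hessian in u = (b, w) is positive semi-definite, so it is convex and any local minimum is global. A quick NumPy check of that claim (illustrative; the matrix Q below is exactly the one defined on the next slide):

```python
import numpy as np

d = 2                                     # input dimension of the toy problem
# Hessian of (1/2) w^T w as a function of u = (b, w): zero row/column for b
Q = np.block([[np.zeros((1, 1)), np.zeros((1, d))],
              [np.zeros((d, 1)), np.eye(d)]])

print(np.linalg.eigvalsh(Q))              # [0. 1. 1.] -> all >= 0, so Q is PSD
```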
Quadratic Programming

SVM problem: optimal (b, w) = ?
\min_{b,w} \frac{1}{2} w^T w
subject to y_n (w^T x_n + b) \ge 1, for n = 1, 2, ..., N

general QP problem: optimal u ← QP(Q, p, A, c)
\min_u \frac{1}{2} u^T Q u + p^T u
subject to a_m^T u \ge c_m, for m = 1, 2, ..., M

objective function: u = \begin{bmatrix} b \\ w \end{bmatrix};
Q = \begin{bmatrix} 0 & 0_d^T \\ 0_d & I_d \end{bmatrix};  p = 0_{d+1}
constraints: a_n^T = y_n [1, x_n^T];  c_n = 1;  M = N

SVM with general QP solver: easy if you've read the manual :-)
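Following the recipe above, a minimal end-to-end sketch (mine, not from the course) that assembles Q, p, A, c for the toy data and solves the QP. No particular solver is prescribed by the slides, so this uses scipy.optimize.minimize with SLSQP as a stand-in for a general QP routine:

```python
import numpy as np
from scipy.optimize import minimize

X = np.array([[0, 0], [2, 2], [2, 0], [3, 0]], dtype=float)
y = np.array([-1, -1, +1, +1], dtype=float)
N, d = X.shape

# QP pieces exactly as on the slide, with u = [b; w]
Q = np.block([[np.zeros((1, 1)), np.zeros((1, d))],
              [np.zeros((d, 1)), np.eye(d)]])
p = np.zeros(d + 1)
A = y[:, None] * np.hstack([np.ones((N, 1)), X])   # row n is a_n^T = y_n [1, x_n^T]
c = np.ones(N)

# stand-in QP solver: min (1/2) u^T Q u + p^T u  s.t.  A u >= c
res = minimize(lambda u: 0.5 * u @ Q @ u + p @ u,
               x0=np.zeros(d + 1),
               constraints={'type': 'ineq', 'fun': lambda u: A @ u - c},
               method='SLSQP')
b, w = res.x[0], res.x[1:]
print(np.round(res.x, 6))                 # expect roughly [-1.  1. -1.], i.e. (b, w1, w2)
```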
SVM with QP Solver

Linear Hard-Margin SVM Algorithm
1. Q = \begin{bmatrix} 0 & 0_d^T \\ 0_d & I_d \end{bmatrix};  p = 0_{d+1};  a_n^T = y_n [1, x_n^T];  c_n = 1
2. \begin{bmatrix} b \\ w \end{bmatrix} ← QP(Q, p, A, c)
3. return b & w as g_SVM

• hard-margin: nothing violates the 'fat boundary'
• linear: works on x_n directly

want non-linear? z_n = Φ(x_n) — remember? :-)
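The last line is the hook for what follows: the algorithm is unchanged if the inputs are transformed first. A brief sketch (the quadratic feature map below is a made-up illustration, not one from the slides) of feeding z_n = Φ(x_n) into the same routine:

```python
import numpy as np

def phi(X):
    """Hypothetical 2nd-order transform: (x1, x2) -> (x1, x2, x1^2, x1*x2, x2^2)."""
    x1, x2 = X[:, 0], X[:, 1]
    return np.column_stack([x1, x2, x1 ** 2, x1 * x2, x2 ** 2])

Z = phi(np.array([[0, 0], [2, 2], [2, 0], [3, 0]], dtype=float))
# run the same Linear Hard-Margin SVM on (Z, y): the hyperplane is linear
# in z-space, hence a non-linear boundary back in x-space
print(Z.shape)                            # (4, 5): d grew from 2 to 5, algorithm untouched
```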
Fun Time

Consider two negative examples with x_1 = (0, 0) and x_2 = (2, 2), and two positive examples with x_3 = (2, 0) and x_4 = (3, 0), as shown on page 17 of the slides. Define u, Q, p, c_n as those listed on page 20 of the slides. What are the a_n^T that need to be fed into the QP solver?

1. a_1^T = [-1, 0, 0], a_2^T = [-1, 2, 2], a_3^T = [-1, 2, 0], a_4^T = [-1, 3, 0]
2. a_1^T = [1, 0, 0], a_2^T = [1, -2, -2], a_3^T = [-1, 2, 0], a_4^T = [-1, 3, 0]
3. a_1^T = [1, 0, 0], a_2^T = [1, 2, 2], a_3^T = [1, 2, 0], a_4^T = [1, 3, 0]
4. a_1^T = [-1, 0, 0], a_2^T = [-1, -2, -2], a_3^T = [1, 2, 0], a_4^T = [1, 3, 0]

Reference Answer: 4
We need a_n^T = y_n [1, x_n^T].
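A quick check of the reference answer (illustrative), applying a_n^T = y_n [1, x_n^T] to all four examples at once:

```python
import numpy as np

X = np.array([[0, 0], [2, 2], [2, 0], [3, 0]], dtype=float)
y = np.array([-1, -1, +1, +1], dtype=float)

A = y[:, None] * np.hstack([np.ones((4, 1)), X])   # rows are a_n^T
print(A)   # [[-1. -0. -0.] [-1. -2. -2.] [ 1.  2.  0.] [ 1.  3.  0.]] -> choice 4
```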
Linear Support Vector Machine / Reasons behind Large-Margin Hyperplane

Why Large-Margin Hyperplane?

\min_{b,w} \frac{1}{2} w^T w
subject to y_n (w^T z_n + b) \ge 1 for all n

                  minimize    constraint
regularization    E_in        w^T w \le C
SVM               w^T w       E_in = 0 [and more]

SVM (large-margin hyperplane): 'weight-decay regularization' within E_in = 0
Large-Margin Restricts Dichotomies

consider the 'large-margin algorithm' A_ρ:
either returns g with margin(g) \ge ρ (if such g exists), or 0 otherwise

• A_0: like PLA ⇒ shatters 'general' 3 inputs
• A_1.126: more strict than SVM ⇒ cannot shatter any 3 inputs

fewer dichotomies ⇒ smaller 'VC dim.' ⇒ better generalization
VC Dimension of Large-Margin Algorithm

fewer dichotomies ⇒ smaller 'VC dim.'
consider d_VC(A_ρ) [data-dependent, needs more than the VC theory]
— instead of d_VC(H) [data-independent, covered by the VC theory]

d_VC(A_ρ) when X = unit circle in R^2:
• ρ = 0: just perceptrons (d_VC = 3)
• ρ > \frac{\sqrt{3}}{2}: cannot shatter any 3 inputs (d_VC < 3)
  — some pair of inputs must be of distance \le \sqrt{3}

generally, when X is the radius-R hyperball:
d_VC(A_ρ) \le \min\left(\frac{R^2}{ρ^2}, d\right) + 1 \le d + 1 = d_VC(perceptrons)
Benefits of Large-Margin Hyperplanes

            large-margin hyperplanes   hyperplanes   hyperplanes + feature transform Φ
#           even fewer                 not many      many
boundary    simple                     simple        sophisticated

• not many: good, for d_VC and generalization
• sophisticated: good, for possibly better E_in

a new possibility: non-linear SVM