
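Linear Support Vector Machine: Standard Large-Margin Problem

Distance to Hyperplane: Preliminary

$$\max_{\mathbf{w}} \ \mathrm{margin}(\mathbf{w})$$

$$\text{subject to every } y_n \mathbf{w}^T \mathbf{x}_n > 0, \qquad \mathrm{margin}(\mathbf{w}) = \min_{n=1,\ldots,N} \mathrm{distance}(\mathbf{x}_n, \mathbf{w})$$

'shorten' x and w

distance needs $w_0$ and $(w_1, \ldots, w_d)$ treated differently (to be derived):

$$b = w_0; \qquad \mathbf{w} = \begin{bmatrix} w_1 \\ \vdots \\ w_d \end{bmatrix}; \qquad \mathbf{x} = \begin{bmatrix} x_1 \\ \vdots \\ x_d \end{bmatrix} \quad (\text{the padded coordinate } x_0 = 1 \text{ is dropped})$$

for this part: $h(\mathbf{x}) = \mathrm{sign}(\mathbf{w}^T \mathbf{x} + b)$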
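The 'shortening' step can be checked numerically. The sketch below is only an illustration: the weight values, the four sample points, and the helper names `h_augmented` and `h_shortened` are assumptions, not part of the slides. It shows that keeping $b = w_0$ apart from $(w_1, \ldots, w_d)$ gives the same hypothesis as the padded form with $x_0 = 1$.

```python
def sign(v):
    """Return +1 or -1; mapping 0 to +1 is a convention assumed here."""
    return 1 if v >= 0 else -1


def h_augmented(w_aug, x):
    """Padded form: w_aug = (w_0, w_1, ..., w_d) and x gets x_0 = 1 in front."""
    x_aug = [1.0] + list(x)
    return sign(sum(wi * xi for wi, xi in zip(w_aug, x_aug)))


def h_shortened(w, b, x):
    """Shortened form: h(x) = sign(w^T x + b) with b = w_0 kept separately."""
    return sign(sum(wi * xi for wi, xi in zip(w, x)) + b)


# Hypothetical weights and points, chosen only for illustration.
w_aug = [-1.0, 1.0, -1.0]
b, w = w_aug[0], w_aug[1:]
points = [(0.0, 0.0), (2.0, 2.0), (2.0, 0.0), (3.0, 0.0)]

predictions = [h_shortened(w, b, x) for x in points]
agree = all(h_augmented(w_aug, x) == h_shortened(w, b, x) for x in points)
print(predictions, agree)
```

Both forms predict the same labels; the split only matters later, because the distance formula normalizes by $\|\mathbf{w}\|$ and leaves $b$ out of the norm.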

Hsuan-Tien Lin (NTU CSIE) Machine Learning Techniques 11/28

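Distance to Hyperplane

want: $\mathrm{distance}(\mathbf{x}, b, \mathbf{w})$, with hyperplane $\mathbf{w}^T \mathbf{x}' + b = 0$

consider $\mathbf{x}'$, $\mathbf{x}''$ on hyperplane

(1) $\mathbf{w}^T \mathbf{x}' = -b$, $\mathbf{w}^T \mathbf{x}'' = -b$

(2) $\mathbf{w} \perp$ hyperplane: $\mathbf{w}^T \underbrace{(\mathbf{x}'' - \mathbf{x}')}_{\text{vector on hyperplane}} = 0$

(3) distance = project $(\mathbf{x} - \mathbf{x}')$ to $\perp$ hyperplane

[Figure: point x off the hyperplane, points x' and x'' on it, normal vector w, and dist(x, h)]

$$\mathrm{distance}(\mathbf{x}, b, \mathbf{w}) = \left| \frac{\mathbf{w}^T}{\|\mathbf{w}\|} (\mathbf{x} - \mathbf{x}') \right| \overset{(1)}{=} \frac{1}{\|\mathbf{w}\|} \left| \mathbf{w}^T \mathbf{x} + b \right|$$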
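The three steps above can be sketched in code. This is a minimal check under assumed values: the hyperplane $(b, \mathbf{w}) = (-5, (3, 4))$, the points, and the function names are all hypothetical. It verifies that projecting $(\mathbf{x} - \mathbf{x}')$ onto the unit normal agrees with the closed form, whichever on-plane point is used.

```python
import math


def dot(u, v):
    """Inner product of two equal-length sequences."""
    return sum(ui * vi for ui, vi in zip(u, v))


def distance(x, b, w):
    """Closed form: |w^T x + b| / ||w||."""
    return abs(dot(w, x) + b) / math.sqrt(dot(w, w))


def distance_by_projection(x, x_on_plane, w):
    """Length of the projection of (x - x') onto the unit normal w / ||w||."""
    diff = [xi - pi for xi, pi in zip(x, x_on_plane)]
    return abs(dot(w, diff)) / math.sqrt(dot(w, w))


# Hypothetical hyperplane 3*x1 + 4*x2 - 5 = 0 and two points on it.
w, b = [3.0, 4.0], -5.0
x_p = [3.0, -1.0]    # step (1): w^T x'  = 9 - 4  = 5 = -b
x_pp = [-1.0, 2.0]   # step (1): w^T x'' = -3 + 8 = 5 = -b
x = [4.0, 7.0]

# Step (2): w is orthogonal to any vector lying on the hyperplane.
on_plane_vector = [q - p for q, p in zip(x_pp, x_p)]
orthogonal = dot(w, on_plane_vector) == 0.0

# Step (3): both routes give the same distance.
d_closed = distance(x, b, w)
d_proj = distance_by_projection(x, x_p, w)
print(orthogonal, d_closed, d_proj)
```

Swapping `x_p` for `x_pp` in the projection leaves the result unchanged, which is why the final formula does not mention $\mathbf{x}'$ at all.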

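Distance to Separating Hyperplane

$$\mathrm{distance}(\mathbf{x}, b, \mathbf{w}) = \frac{1}{\|\mathbf{w}\|} \left| \mathbf{w}^T \mathbf{x} + b \right|$$

separating hyperplane: for every $n$

$$y_n (\mathbf{w}^T \mathbf{x}_n + b) > 0$$

distance to separating hyperplane:

$$\mathrm{distance}(\mathbf{x}_n, b, \mathbf{w}) = \frac{1}{\|\mathbf{w}\|} \, y_n (\mathbf{w}^T \mathbf{x}_n + b)$$

$$\max_{b, \mathbf{w}} \ \mathrm{margin}(b, \mathbf{w})$$

$$\text{subject to every } y_n (\mathbf{w}^T \mathbf{x}_n + b) > 0, \qquad \mathrm{margin}(b, \mathbf{w}) = \min_{n=1,\ldots,N} \frac{1}{\|\mathbf{w}\|} \, y_n (\mathbf{w}^T \mathbf{x}_n + b)$$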
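A small sketch of the margin definition above, assuming a hypothetical four-point data set and two hand-picked separating hyperplanes (none of these values come from the slides). For a separating $(b, \mathbf{w})$ the absolute value can be replaced by the label $y_n$, and the margin is the smallest of the resulting distances.

```python
import math


def dot(u, v):
    """Inner product of two equal-length sequences."""
    return sum(ui * vi for ui, vi in zip(u, v))


def margin(X, y, b, w):
    """min over n of y_n (w^T x_n + b) / ||w||.

    The value is positive exactly when (b, w) separates the data.
    """
    norm = math.sqrt(dot(w, w))
    return min(yn * (dot(w, xn) + b) / norm for xn, yn in zip(X, y))


# Hypothetical toy data, for illustration only.
X = [(0.0, 0.0), (2.0, 2.0), (2.0, 0.0), (3.0, 0.0)]
y = [-1, -1, 1, 1]

m_a = margin(X, y, -1.0, [1.0, -1.0])  # candidate hyperplane A
m_b = margin(X, y, -1.0, [1.0, -2.0])  # candidate hyperplane B
print(m_a, m_b)
```

Both candidates separate the data, but A keeps every point farther away, so the large-margin problem prefers A over B.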


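Margin of Special Separating Hyperplane

$$\max_{b, \mathbf{w}} \ \mathrm{margin}(b, \mathbf{w})$$

$$\text{subject to every } y_n (\mathbf{w}^T \mathbf{x}_n + b) > 0, \qquad \mathrm{margin}(b, \mathbf{w}) = \min_{n=1,\ldots,N} \frac{1}{\|\mathbf{w}\|} \, y_n (\mathbf{w}^T \mathbf{x}_n + b)$$

$\mathbf{w}^T \mathbf{x} + b = 0$ same as $3\mathbf{w}^T \mathbf{x} + 3b = 0$: scaling does not matter

special scaling: only consider separating $(b, \mathbf{w})$ such that

$$\min_{n=1,\ldots,N} y_n (\mathbf{w}^T \mathbf{x}_n + b) = 1 \ \Longrightarrow \ \mathrm{margin}(b, \mathbf{w}) = \frac{1}{\|\mathbf{w}\|}$$

$$\max_{b, \mathbf{w}} \ \frac{1}{\|\mathbf{w}\|}$$

$$\text{subject to every } y_n (\mathbf{w}^T \mathbf{x}_n + b) > 0, \qquad \min_{n=1,\ldots,N} y_n (\mathbf{w}^T \mathbf{x}_n + b) = 1$$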
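The special scaling can be illustrated as follows. This is a sketch under assumptions: the toy data, the starting hyperplane, and the helper name `normalize` are hypothetical. It rescales a separating $(b, \mathbf{w})$ so that the smallest $y_n(\mathbf{w}^T \mathbf{x}_n + b)$ equals 1, then checks that the margin is unchanged and equals $1 / \|\mathbf{w}\|$.

```python
import math


def dot(u, v):
    """Inner product of two equal-length sequences."""
    return sum(ui * vi for ui, vi in zip(u, v))


def scores(X, y, b, w):
    """The values y_n (w^T x_n + b), one per example."""
    return [yn * (dot(w, xn) + b) for xn, yn in zip(X, y)]


def normalize(X, y, b, w):
    """Rescale a separating (b, w) so that min_n y_n (w^T x_n + b) = 1."""
    s = min(scores(X, y, b, w))
    if s <= 0:
        raise ValueError("(b, w) does not separate the data")
    return b / s, [wi / s for wi in w]


# Hypothetical toy data and a hyperplane written at 3x scale.
X = [(0.0, 0.0), (2.0, 2.0), (2.0, 0.0), (3.0, 0.0)]
y = [-1, -1, 1, 1]
b0, w0 = -3.0, [3.0, -3.0]

b1, w1 = normalize(X, y, b0, w0)
margin_before = min(scores(X, y, b0, w0)) / math.sqrt(dot(w0, w0))
margin_after = 1.0 / math.sqrt(dot(w1, w1))
print(b1, w1, margin_before, margin_after)
```

Because the rescaled pair describes the same hyperplane, restricting attention to normalized pairs loses no candidate hyperplanes; it only removes the redundant scale.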

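Standard Large-Margin Hyperplane Problem

$$\max_{b, \mathbf{w}} \ \frac{1}{\|\mathbf{w}\|} \quad \text{subject to} \quad \min_{n=1,\ldots,N} y_n (\mathbf{w}^T \mathbf{x}_n + b) = 1$$

necessary constraints: $y_n (\mathbf{w}^T \mathbf{x}_n + b) \ge 1$ for all $n$

original constraint: $\min_{n=1,\ldots,N} y_n (\mathbf{w}^T \mathbf{x}_n + b) = 1$

want: optimal $(b, \mathbf{w})$ here (inside the original constraint set, which is a subset of the relaxed one)

if optimal $(b, \mathbf{w})$ outside, e.g. $y_n (\mathbf{w}^T \mathbf{x}_n + b) > 1.126$ for all $n$, then can scale $(b, \mathbf{w})$ to "more optimal" $\left( \frac{b}{1.126}, \frac{\mathbf{w}}{1.126} \right)$, which still satisfies every constraint but has smaller $\|\mathbf{w}\|$ (contradiction!)

final change: max $\Longrightarrow$ min, remove $\sqrt{\cdot}$ (since $\|\mathbf{w}\| = \sqrt{\mathbf{w}^T \mathbf{w}}$), add $\frac{1}{2}$

$$\min_{b, \mathbf{w}} \ \frac{1}{2} \mathbf{w}^T \mathbf{w} \quad \text{subject to} \quad y_n (\mathbf{w}^T \mathbf{x}_n + b) \ge 1 \ \text{for all } n$$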
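The final problem can be explored with a small sketch. It does not solve the quadratic program; it only checks the constraints and compares the objective $\frac{1}{2} \mathbf{w}^T \mathbf{w}$ for three hand-picked candidates on hypothetical toy data (the data, the candidates, and the helper names are assumptions). Candidate B has every $y_n(\mathbf{w}^T \mathbf{x}_n + b) \ge 2$, so it plays the role of the "outside" solution in the scaling argument: dividing it by 2 gives candidate A, which is still feasible and has a smaller objective.

```python
def dot(u, v):
    """Inner product of two equal-length sequences."""
    return sum(ui * vi for ui, vi in zip(u, v))


def is_feasible(X, y, b, w):
    """Check y_n (w^T x_n + b) >= 1 for all n."""
    return all(yn * (dot(w, xn) + b) >= 1 for xn, yn in zip(X, y))


def objective(w):
    """The quantity to minimize: (1/2) w^T w."""
    return 0.5 * dot(w, w)


# Hypothetical toy data, for illustration only.
X = [(0.0, 0.0), (2.0, 2.0), (2.0, 0.0), (3.0, 0.0)]
y = [-1, -1, 1, 1]

# Three scalings of the same hyperplane, written as (b, w).
candidates = {
    "A": (-1.0, [1.0, -1.0]),   # tightest constraint holds with equality
    "B": (-2.0, [2.0, -2.0]),   # every constraint has slack: can be scaled down
    "C": (-0.5, [0.5, -0.5]),   # scaled too far: violates the constraints
}

feasible = {
    name: objective(w)
    for name, (b, w) in candidates.items()
    if is_feasible(X, y, b, w)
}
best = min(feasible, key=feasible.get)
print(feasible, best)
```

Among these candidates only A and B are feasible, and A wins because shrinking $\mathbf{w}$ lowers the objective until some constraint becomes tight. Finding the true optimum over all $(b, \mathbf{w})$ would need a quadratic programming solver, which this sketch leaves out.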
