
Feasibility of Learning: Connection to Learning

Connection to Learning

bin ↔ learning
• unknown orange prob. µ ↔ fixed hypothesis h(x) =? target f(x)
• marble ∈ bin ↔ x ∈ X
• orange marble ↔ h is wrong: h(x) ≠ f(x)
• green marble ↔ h is right: h(x) = f(x)
• size-N sample from bin of i.i.d. marbles ↔ check h on D = {(x_n, y_n)} with i.i.d. x_n, where y_n = f(x_n)

if large N & i.i.d. x_n, can probably infer
unknown ⟦h(x) ≠ f(x)⟧ probability
by known ⟦h(x_n) ≠ y_n⟧ fraction

[Figure: input space X drawn as a bin of points; • marks x with h(x) ≠ f(x), • marks x with h(x) = f(x)]
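
To make the correspondence concrete, here is a minimal simulation sketch (an illustration added here, not part of the original slides; the value of µ and the sample sizes are arbitrary assumptions). It draws a size-N i.i.d. sample from a bin with orange probability µ and reports how close the sample fraction ν comes to the unknown µ, which is exactly the kind of inference the slide claims for a fixed h.

```python
# Minimal sketch of the bin analogy: estimate the unknown orange
# probability mu by the orange fraction nu in a size-N i.i.d. sample.
# mu and the sample sizes are arbitrary choices for illustration.
import random

random.seed(0)

def sample_fraction(mu: float, N: int) -> float:
    """Draw N i.i.d. marbles (orange with prob. mu) and return the orange fraction nu."""
    return sum(random.random() < mu for _ in range(N)) / N

mu = 0.4  # 'unknown' in the analogy; fixed here only so we can compare against nu
for N in (10, 100, 10_000):
    nu = sample_fraction(mu, N)
    print(f"N={N:6d}  nu={nu:.4f}  |nu - mu|={abs(nu - mu):.4f}")
```

Larger N tends to pull ν closer to µ, matching the slide's claim that a large i.i.d. sample lets us probably infer the unknown quantity from the known fraction.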


Added Components

• unknown target function f: X → Y (ideal credit approval formula)
• training examples D: (x_1, y_1), ..., (x_N, y_N) (historical records in bank)
• learning algorithm A
• final hypothesis g ≈ f ('learned' formula to be used)
• hypothesis set H (set of candidate formulas)
• unknown P on X, generating x_1, x_2, ..., x_N

h ≈ f? for any fixed h, can probably infer
unknown E_out(h) = E_{x∼P} ⟦h(x) ≠ f(x)⟧
by known E_in(h) = (1/N) Σ_{n=1}^{N} ⟦h(x_n) ≠ y_n⟧
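
A small sketch of these two quantities for one fixed h follows (again an illustration, not from the original slides). The target f, the hypothesis h, and the distribution P below are hypothetical stand-ins, chosen only so that both the known E_in(h) and the normally unknown E_out(h) can be computed and compared.

```python
# Minimal sketch of E_in vs. E_out for one fixed h.
# f, h, and P below are hypothetical stand-ins (not from the lecture),
# chosen only so that both quantities can be computed and compared.
import random

random.seed(1)

def f(x):            # 'unknown' target: assumed here for illustration
    return 1 if x > 0.5 else -1

def h(x):            # one fixed hypothesis, chosen before looking at any data
    return 1 if x > 0.6 else -1

draw_x = random.random   # 'unknown' P: uniform on X = [0, 1) in this toy

# known quantity: in-sample error on D = {(x_n, y_n)} with y_n = f(x_n), x_n i.i.d. from P
N = 200
D = [(x, f(x)) for x in (draw_x() for _ in range(N))]
E_in = sum(h(x) != y for x, y in D) / N

# normally unknown: E_out(h) = E_{x~P} [[h(x) != f(x)]], approximated here with a
# large fresh sample (possible only because this toy f and P are fully known)
M = 100_000
E_out = sum(h(x) != f(x) for x in (draw_x() for _ in range(M))) / M

print(f"E_in(h)  = {E_in:.4f}")
print(f"E_out(h) ≈ {E_out:.4f}   (exact value 0.1 for this toy f, h, P)")
```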

The Formal Guarantee

for any fixed h, in 'big' data (N large),
in-sample error E_in(h) is probably close to
out-of-sample error E_out(h) (within ε):

    P[ |E_in(h) − E_out(h)| > ε ] ≤ 2 exp(−2ε²N)

same as the 'bin' analogy . . .
• valid for all N and ε
• does not depend on E_out(h), no need to 'know' E_out(h)
  (f and P can stay unknown)
• 'E_in(h) = E_out(h)' is probably approximately correct (PAC)

=⇒ if 'E_in(h) ≈ E_out(h)' and 'E_in(h) small'
=⇒ E_out(h) small
=⇒ h ≈ f with respect to P
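
The bound is easy to evaluate numerically. The sketch below (an illustration added here, not part of the original slides) computes 2 exp(−2ε²N) for a few sample sizes and also inverts the bound to find how large N must be for the bound to drop below a chosen δ, using N ≥ ln(2/δ)/(2ε²), which follows from setting 2 exp(−2ε²N) ≤ δ. The particular ε and δ values are arbitrary.

```python
# Minimal sketch of the Hoeffding guarantee for one fixed h:
#   P[ |E_in(h) - E_out(h)| > eps ] <= 2 * exp(-2 * eps^2 * N)
# The eps, delta, and N values below are arbitrary illustrations.
from math import ceil, exp, log

def hoeffding_bound(eps: float, N: int) -> float:
    """Upper bound on the probability of the 'bad event' for a single fixed h."""
    return 2.0 * exp(-2.0 * eps * eps * N)

def samples_needed(eps: float, delta: float) -> int:
    """Smallest N with 2*exp(-2*eps^2*N) <= delta, i.e. N >= ln(2/delta) / (2*eps^2)."""
    return ceil(log(2.0 / delta) / (2.0 * eps * eps))

for N in (100, 1_000, 10_000):
    print(f"N={N:6d}  bound at eps=0.1: {hoeffding_bound(0.1, N):.3g}")

# e.g. to make the bad event have probability at most delta=0.05 at eps=0.05:
print("N needed for eps=0.05, delta=0.05:", samples_needed(0.05, 0.05))
```

Note that the bound shrinks exponentially in N and never requires knowing E_out(h), f, or P, exactly as the slide states.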


Verification of One h

for any fixed h, when data large enough, E_in(h) ≈ E_out(h)

Can we claim 'good learning' (g ≈ f)?
