- Eigenvectors (for a square m × m matrix S): Sv = λv, where λ is an eigenvalue and v the corresponding (right) eigenvector.
- How many eigenvalues are there at most? Sv = λv ⇔ (S − λI)v = 0, which only has a non-zero solution v if det(S − λI) = 0. This is an m-th order equation in λ which can have at most m distinct solutions (roots of the characteristic polynomial) – these can be complex even though S is real.
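A quick numerical illustration (a NumPy sketch, not part of the original slides; the rotation matrix is my own example) of real matrices having complex eigenvalues:

```python
import numpy as np

# A 2x2 real matrix whose characteristic polynomial has complex roots:
# a 90-degree rotation. det(S - lambda*I) = lambda^2 + 1 = 0.
S = np.array([[0.0, -1.0],
              [1.0, 0.0]])

eigenvalues = np.linalg.eigvals(S)

# At most m = 2 distinct eigenvalues, and here they are complex (+i, -i)
# even though every entry of S is real.
assert np.allclose(sorted(eigenvalues, key=lambda z: z.imag), [-1j, 1j])
```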
Example: let

  S = [3 0 0; 0 2 0; 0 0 0]

S has eigenvalues 3, 2, 0 with corresponding eigenvectors

  v1 = (1 0 0)ᵀ, v2 = (0 1 0)ᵀ, v3 = (0 0 1)ᵀ

On each eigenvector, S acts as a multiple of the identity matrix: but as a different multiple on each.

Any vector (say x = (2 4 6)ᵀ) can be viewed as a combination of the eigenvectors: x = 2v1 + 4v2 + 6v3.
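These claims can be verified directly; a minimal NumPy sketch using the same S, eigenvectors, and x:

```python
import numpy as np

# The diagonal matrix from the example, with eigenvalues 3, 2, 0.
S = np.diag([3.0, 2.0, 0.0])
v1, v2, v3 = np.eye(3)  # its eigenvectors: the standard basis vectors

# S acts on each eigenvector as multiplication by the matching eigenvalue.
assert np.allclose(S @ v1, 3 * v1)
assert np.allclose(S @ v2, 2 * v2)
assert np.allclose(S @ v3, 0 * v3)

# Any vector, e.g. x = (2, 4, 6), is a combination of the eigenvectors.
x = np.array([2.0, 4.0, 6.0])
assert np.allclose(x, 2 * v1 + 4 * v2 + 6 * v3)
```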
- Thus a matrix-vector multiplication such as Sx (S, x as in the previous slide) can be rewritten in terms of the eigenvalues/vectors:

    Sx = S(2v1 + 4v2 + 6v3) = 2Sv1 + 4Sv2 + 6Sv3 = 2λ1v1 + 4λ2v2 + 6λ3v3

- Even though x is an arbitrary vector, the action of S on x is determined by the eigenvalues/vectors.
- Suggestion: the effect of "small" eigenvalues is small.
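The same computation in NumPy, reusing S and x from the example (a sketch, not from the slides):

```python
import numpy as np

S = np.diag([3.0, 2.0, 0.0])   # eigenvalues 3, 2, 0
v1, v2, v3 = np.eye(3)         # corresponding eigenvectors
x = np.array([2.0, 4.0, 6.0])  # x = 2*v1 + 4*v2 + 6*v3

# Direct product vs. the eigen-expansion 2*l1*v1 + 4*l2*v2 + 6*l3*v3.
direct = S @ x
via_eigen = 2 * 3 * v1 + 4 * 2 * v2 + 6 * 0 * v3

assert np.allclose(direct, via_eigen)
# The zero eigenvalue wipes out the v3 component entirely,
# illustrating why the effect of "small" eigenvalues is small.
```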
- For symmetric matrices, eigenvectors for distinct eigenvalues are orthogonal:

    Sv1 = λ1v1, Sv2 = λ2v2 and λ1 ≠ λ2 ⇒ v1 · v2 = 0

- All eigenvalues of a real symmetric matrix are real:

    |S − λI| = 0 and S = Sᵀ ⇒ λ ∈ ℝ (even allowing complex λ)

- All eigenvalues of a positive semidefinite matrix are non-negative:

    wᵀSw ≥ 0 for all w ∈ ℝⁿ, then Sv = λv ⇒ λ ≥ 0
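A numerical illustration of all three properties (the random PSD matrix B·Bᵀ is my own construction, not from the slides):

```python
import numpy as np

rng = np.random.default_rng(0)
B = rng.standard_normal((4, 4))
S = B @ B.T          # B*B^T is symmetric and positive semidefinite

# eigh is NumPy's eigensolver for symmetric matrices: it returns
# real eigenvalues and orthonormal eigenvectors.
vals, vecs = np.linalg.eigh(S)

assert np.all(np.isreal(vals))                 # real eigenvalues
assert np.all(vals >= -1e-10)                  # non-negative (PSD)
assert np.allclose(vecs.T @ vecs, np.eye(4))   # orthogonal eigenvectors
```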
Example:

- Let S = [2 1; 1 2]. Real, symmetric.
- Then |S − λI| = |2−λ 1; 1 2−λ| = (2 − λ)² − 1 = 0.
- The eigenvalues are 1 and 3 (nonnegative, real). Plug in these values and solve for the eigenvectors.
- The eigenvectors are orthogonal (and real): (1 −1)ᵀ and (1 1)ᵀ.
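Checking this example with NumPy's symmetric eigensolver (a sketch, not part of the original slides):

```python
import numpy as np

S = np.array([[2.0, 1.0],
              [1.0, 2.0]])

# eigh returns eigenvalues in ascending order with orthonormal
# eigenvectors as the columns of vecs.
vals, vecs = np.linalg.eigh(S)

assert np.allclose(vals, [1.0, 3.0])            # eigenvalues 1 and 3
assert np.isclose(vecs[:, 0] @ vecs[:, 1], 0.0)  # orthogonal eigenvectors
```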
- Let S be a square matrix with m linearly independent eigenvectors (a "non-defective" matrix).
- Theorem: there exists an eigen decomposition S = UΛU⁻¹ (cf. the matrix diagonalization theorem), unique for distinct eigenvalues.
- The columns of U are the eigenvectors of S.
- The diagonal elements of the diagonal matrix Λ are the eigenvalues of S.
Let U have the eigenvectors as columns:

  U = [v1 ... vn]

Then SU can be written

  SU = S[v1 ... vn] = [λ1v1 ... λnvn] = [v1 ... vn] diag(λ1, ..., λn) = UΛ

Thus SU = UΛ, or U⁻¹SU = Λ, and S = UΛU⁻¹.
Recall S = [2 1; 1 2]; λ1 = 1, λ2 = 3.

The eigenvectors (1 −1)ᵀ and (1 1)ᵀ form

  U = [1 1; −1 1]

Inverting, we have

  U⁻¹ = [1/2 −1/2; 1/2 1/2]

Then, S = UΛU⁻¹ = [1 1; −1 1] [1 0; 0 3] [1/2 −1/2; 1/2 1/2]

Recall that UU⁻¹ = I.
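The decomposition above can be verified numerically (a NumPy sketch, not from the slides):

```python
import numpy as np

S = np.array([[2.0, 1.0],
              [1.0, 2.0]])
U = np.array([[1.0, 1.0],
              [-1.0, 1.0]])   # eigenvectors (1,-1) and (1,1) as columns
Lam = np.diag([1.0, 3.0])     # matching eigenvalues

# SU = U*Lambda, hence S = U * Lambda * U^{-1}.
assert np.allclose(S @ U, U @ Lam)
assert np.allclose(S, U @ Lam @ np.linalg.inv(U))
```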
Let's divide U (and multiply U⁻¹) by √2:

  S = [1/√2 1/√2; −1/√2 1/√2] [1 0; 0 3] [1/√2 −1/√2; 1/√2 1/√2]

Then, S = QΛQᵀ, where Q⁻¹ = Qᵀ.

Why? Stay tuned …
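Verifying that the rescaled Q is orthogonal and still reconstructs S (a sketch, not part of the original slides):

```python
import numpy as np

# Dividing U's columns by their length sqrt(2) normalizes them,
# which makes Q orthogonal: Q^{-1} = Q^T.
q = 1.0 / np.sqrt(2.0)
Q = np.array([[q, q],
              [-q, q]])
Lam = np.diag([1.0, 3.0])
S = np.array([[2.0, 1.0],
              [1.0, 2.0]])

assert np.allclose(np.linalg.inv(Q), Q.T)   # Q^{-1} = Q^T
assert np.allclose(S, Q @ Lam @ Q.T)        # S = Q Lambda Q^T
```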
- If S is a symmetric matrix:
- Theorem: there exists a (unique) eigen decomposition S = QΛQᵀ
- where Q is orthogonal: Q⁻¹ = Qᵀ
- Columns of Q are the normalized eigenvectors of S
- Columns are orthogonal
- (everything is real)
- Examine the symmetric eigen decomposition, if any, for each of the following matrices:

    [0 1; −1 0]   [0 1; 1 0]   [1 2; 2 −3]   [2 2; 2 4]
For an m × n matrix A of rank r there exists a factorization (Singular Value Decomposition = SVD) as follows:

  A = UΣVᵀ    (U is m × m, Σ is m × n, V is n × n)

- The columns of U are orthogonal eigenvectors of AAᵀ.
- The columns of V are orthogonal eigenvectors of AᵀA.
- The eigenvalues λ1 … λr of AAᵀ are the (non-zero) eigenvalues of AᵀA.
- σi = √λi, and Σ = diag(σ1 … σr): the singular values.
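The relation σi = √λi can be confirmed numerically (the random A here is my own example, not from the slides):

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((3, 2))   # m=3, n=2, rank 2 (almost surely)

U, sigma, Vt = np.linalg.svd(A)

# Singular values are the square roots of the eigenvalues of A^T A
# (equivalently, of the non-zero eigenvalues of A A^T).
eigvals = np.linalg.eigvalsh(A.T @ A)[::-1]   # descending order
assert np.allclose(sigma, np.sqrt(eigvals))
```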
[Figure: illustration of SVD dimensions and sparseness]
Let

  A = [1 −1; 0 1; 1 0]

Thus m = 3, n = 2. Its SVD is

  A = [0 2/√6 1/√3; 1/√2 −1/√6 1/√3; 1/√2 1/√6 −1/√3] [1 0; 0 √3; 0 0] [1/√2 1/√2; 1/√2 −1/√2]

Typically, the singular values are arranged in decreasing order.
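NumPy reproduces this factorization, although (as just noted) it returns the singular values in decreasing order, √3 before 1 (a sketch, not from the slides):

```python
import numpy as np

A = np.array([[1.0, -1.0],
              [0.0, 1.0],
              [1.0, 0.0]])

U, sigma, Vt = np.linalg.svd(A)
# NumPy orders the singular values decreasingly: sqrt(3), then 1.
assert np.allclose(sigma, [np.sqrt(3.0), 1.0])

# Reassembling U * Sigma * V^T recovers A (Sigma padded to 3x2).
Sigma = np.zeros((3, 2))
Sigma[:2, :2] = np.diag(sigma)
assert np.allclose(U @ Sigma @ Vt, A)
```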
- SVD can be used to compute optimal low-rank approximations.
- Approximation problem: find the matrix Ak of rank k that minimizes the Frobenius-norm error:

    Ak = argmin_{X: rank(X)=k} ‖A − X‖F

- Ak and X are both m × n matrices.
- Typically, we want k << r.
- Solution via SVD: set the smallest r − k singular values to zero:

    Ak = U diag(σ1, ..., σk, 0, ..., 0) Vᵀ

- In column notation, this is a sum of k rank-1 matrices:

    Ak = sum_{i=1}^{k} σi ui viᵀ
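Both forms of Ak, the zeroed-out singular values and the sum of rank-1 terms, agree; a sketch with a random matrix of my own choosing:

```python
import numpy as np

rng = np.random.default_rng(2)
A = rng.standard_normal((5, 4))
k = 2

U, sigma, Vt = np.linalg.svd(A, full_matrices=False)

# Zero out all but the k largest singular values ...
sigma_k = np.concatenate([sigma[:k], np.zeros(len(sigma) - k)])
A_k = U @ np.diag(sigma_k) @ Vt

# ... which is the same as the sum of the first k rank-1 terms.
A_k_sum = sum(sigma[i] * np.outer(U[:, i], Vt[i, :]) for i in range(k))

assert np.allclose(A_k, A_k_sum)
assert np.linalg.matrix_rank(A_k) == k
```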
- How good (bad) is this approximation?
- It's the best possible, measured by the Frobenius norm of the error:

    min_{X: rank(X)=k} ‖A − X‖F = ‖A − Ak‖F = sqrt( sum_{i=k+1}^{r} σi² )

  where the σi are ordered such that σi ≥ σi+1.
- This suggests why the Frobenius error drops as k is increased.
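A numerical check of this error formula (random A of my own choosing, not from the slides): the Frobenius error of the rank-k truncation equals the root-sum-square of the dropped singular values.

```python
import numpy as np

rng = np.random.default_rng(3)
A = rng.standard_normal((5, 4))
U, sigma, Vt = np.linalg.svd(A, full_matrices=False)

for k in range(1, 4):
    A_k = U[:, :k] @ np.diag(sigma[:k]) @ Vt[:k, :]
    err = np.linalg.norm(A - A_k, "fro")
    # Error = sqrt(sigma_{k+1}^2 + ... + sigma_r^2), so it shrinks
    # monotonically as k grows.
    assert np.isclose(err, np.sqrt(np.sum(sigma[k:] ** 2)))
```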
- Whereas the term-doc matrix A may have m = 50,000, n = 10 million (and rank close to 50,000),
- we can construct an approximation A100 with rank 100.
- Of all rank-100 matrices, it would have the lowest Frobenius error.
- Great … but why would we?
- Answer: Latent Semantic Indexing

C. Eckart, G. Young, The approximation of a matrix by another of lower rank. Psychometrika, 1, 211-218, 1936.