Introduction and overview - An Introduction to Computational Mathematics

about 500,000. The column dimension n is the length of the video list, which is about 20,000. Each rating ranges from 1 star to 5 stars. The matrix is certainly incomplete. We want to fill in the missing entries under the assumption that the completed matrix has low rank. The singular value decomposition of A is

$$A = \sum_{i} \sigma_i u_i v_i^T.$$

Each class $u_i v_i^T$ represents a certain type of video, for instance drama or action. In a class $u_i v_i^T$, the row $v_i^T$ lists the videos of this type (say drama), and $u_i$ lists the members who rate these videos highly. The completed matrix can be used to make recommendations to the members.

2.2 Introduction and overview

There are four kinds of linear problems we encounter in applications:

• solving large linear systems: Ax = b

• solving least squares problems: $\min_x \|Ax - b\|^2$

• solving eigenvalue problems: Ax = λx

• solving singular value decomposition problem.

In solving linear systems, there are two classes of methods:

• Direct methods: these solve the equation directly and are usually used for small systems. Basically, the solution process is a factorization of the matrix A, such as the LU factorization.

• Iterative methods: the basic idea is to decompose A = M − N, where M is the major part and is easy to invert, while N is a minor part. One then performs the iteration $M x_{n+1} - N x_n = b$ to get an approximate solution. Usually preconditioning is needed, which means that we replace Ax = b by PAx = Pb so that such a major-minor decomposition is easy to obtain.
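The two classes of methods can be compared on a small example. The following is a minimal sketch, not a production solver: it assumes the simplest splitting, M = diag(A) (the Jacobi iteration), and a strictly diagonally dominant test matrix chosen here for illustration so that the iteration converges. The function name `jacobi`, the tolerance, and the test data are assumptions, not part of the notes.

```python
import numpy as np

def jacobi(A, b, tol=1e-10, max_iter=500):
    """Splitting iteration M x_{n+1} = N x_n + b with M = diag(A), N = M - A."""
    d = np.diag(A)                 # diagonal of A: the "major" part M, trivial to invert
    N = np.diag(d) - A             # the "minor" part, so that A = M - N
    x = np.zeros_like(b, dtype=float)
    for k in range(max_iter):
        x_new = (N @ x + b) / d    # solve M x_new = N x + b
        if np.linalg.norm(x_new - x) < tol:
            return x_new, k + 1
        x = x_new
    return x, max_iter

# Strictly diagonally dominant matrix (assumed test data).
A = np.array([[4.0, 1.0, 0.0],
              [1.0, 5.0, 2.0],
              [0.0, 2.0, 6.0]])
b = np.array([1.0, 2.0, 3.0])

x_direct = np.linalg.solve(A, b)   # direct method: LU factorization inside LAPACK
x_iter, n_steps = jacobi(A, b)     # iterative method
print(np.linalg.norm(x_direct - x_iter), n_steps)
```

For a matrix that is not diagonally dominant, this particular splitting may fail to converge, which is one motivation for the preconditioning mentioned above.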

In solving eigenvalue problems, I shall discuss the power method and the QR algorithm. For least squares problems, I shall discuss a weighted iterative method.

2.3 *Matrix Algebra

Spectral Decomposition

We assume $A$ is an $n \times n$ matrix acting on $\mathbb{C}^n$.

Theorem 2.2 (Cayley-Hamilton). Let $p_A(\lambda) := \det(\lambda I - A)$ be the characteristic polynomial of $A$. Then $p_A(A) = 0$.
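The theorem can be checked numerically on a small matrix. This is only a sanity check under floating-point arithmetic, not a proof. It assumes NumPy's `np.poly`, which returns the coefficients of the characteristic polynomial of a square matrix (highest degree first). The helper name `char_poly_at_matrix` and the test matrix are illustrative.

```python
import numpy as np

def char_poly_at_matrix(A):
    """Evaluate p_A(A), where p_A(lambda) = det(lambda I - A), by Horner's rule."""
    coeffs = np.poly(A)                       # coefficients of p_A, leading one first
    n = A.shape[0]
    result = np.zeros((n, n), dtype=coeffs.dtype)
    for c in coeffs:
        result = result @ A + c * np.eye(n)   # Horner step with matrix argument
    return result

A = np.array([[2.0, 1.0, 0.0],
              [0.0, 3.0, 1.0],
              [1.0, 0.0, 1.0]])

residual = char_poly_at_matrix(A)
print(np.max(np.abs(residual)))               # close to zero, up to rounding error
```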

Theorem 2.3. There exists a minimal polynomial $p_m$ which is a factor of $p_A$ and satisfies $p_m(A) = 0$.

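Theorem 2.4 (Fundamental Theorem of Algebra). Any polynomial $p(\lambda)$ over $\mathbb{C}$ of degree $m$ can be factorized as

$$p(\lambda) = a \prod_{i=1}^{m} (\lambda - \lambda_i)$$

for some constant $a \neq 0$ and $\lambda_1, \dots, \lambda_m \in \mathbb{C}$. This factorization is unique up to the order of the factors.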

Definition 2.2. Let $A : \mathbb{C}^n \to \mathbb{C}^n$. A subspace $V \subset \mathbb{C}^n$ is called an invariant subspace of the linear map $A$ if $AV \subset V$.

Definition 2.3. Let $A : \mathbb{C}^n \to \mathbb{C}^n$. A nonzero vector $v$ is called an eigenvector of $A$ if there exists a scalar $\lambda$ such that

Av = λv.

Definition 2.4. For a matrix $A$, the set of all its eigenvalues $\sigma(A) := \{\lambda_1, \dots, \lambda_n\}$ is called the spectrum of $A$.

Definition 2.5. A vector space $V$ is said to be the direct sum of its two subspaces $V_1$ and $V_2$ if for any $v \in V$ there exist unique vectors $v_i \in V_i$, $i = 1, 2$, such that $v = v_1 + v_2$. We denote this by $V = V_1 \oplus V_2$.

Remark 2.1. We also use the notation $V = V_1 + V_2$ for the property: any $v \in V$ can be written as $v = v_1 + v_2$ for some $v_i \in V_i$, $i = 1, 2$. Notice that $V = V_1 \oplus V_2$ if and only if $V = V_1 + V_2$ and $V_1 \cap V_2 = \{0\}$.

Lemma 2.1. Suppose $p$ and $q$ are two polynomials over $\mathbb{C}$ that are relatively prime (i.e. they have no common roots). Then there exist two other polynomials $a$ and $b$ such that

ap + bq = 1.

Lemma 2.2. Suppose $p$ and $q$ are two polynomials over $\mathbb{C}$ that are relatively prime (i.e. they have no common roots). Let $N_p := \mathrm{Ker}(p(A))$, $N_q := \mathrm{Ker}(q(A))$ and $N_{pq} := \mathrm{Ker}(p(A)q(A))$. Then

$$N_{pq} = N_p \oplus N_q.$$

Proof. From ap + bq = 1 we get

a(A)p(A) + b(A)q(A) = I.

For any $v \in N_{pq}$, applying the above operator identity to $v$, we get

$$v = a(A)p(A)v + b(A)q(A)v =: v_2 + v_1.$$

We claim that $v_1 \in N_p$, whereas $v_2 \in N_q$. This is because

$$p(A)v_1 = p(A)b(A)q(A)v = b(A)p(A)q(A)v = 0.$$

A similar argument shows $v_2 \in N_q$. To see that this is a direct sum, suppose $v \in N_p \cap N_q$. Then

$$v = a(A)p(A)v + b(A)q(A)v = 0.$$

Hence $N_p \cap N_q = \{0\}$.
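The splitting used in the proof can be illustrated on a concrete case. This sketch assumes the hypothetical choice $p(\lambda) = \lambda - 1$, $q(\lambda) = \lambda - 2$, for which $a = 1$, $b = -1$ satisfy $ap + bq = 1$. It also assumes a diagonalizable test matrix with eigenvalues 1, 1, 2, so that $p(A)q(A) = 0$ and $N_{pq}$ is the whole space. None of these choices come from the notes.

```python
import numpy as np

# Test matrix with eigenvalues 1, 1, 2 (assumed data); S is invertible (det = 2).
S = np.array([[1.0, 1.0, 0.0],
              [0.0, 1.0, 1.0],
              [1.0, 0.0, 1.0]])
A = S @ np.diag([1.0, 1.0, 2.0]) @ np.linalg.inv(S)
I = np.eye(3)

pA = A - I          # p(A) with p(lambda) = lambda - 1
qA = A - 2.0 * I    # q(A) with q(lambda) = lambda - 2

# Bezout identity: a*p + b*q = 1 with a = 1, b = -1, hence a(A)p(A) + b(A)q(A) = I.
v = np.array([1.0, 2.0, 3.0])
v2 = pA @ v         # a(A) p(A) v, claimed to lie in N_q
v1 = -(qA @ v)      # b(A) q(A) v, claimed to lie in N_p

print(np.linalg.norm(v1 + v2 - v))   # v = v1 + v2
print(np.linalg.norm(pA @ v1))       # p(A) v1 = 0
print(np.linalg.norm(qA @ v2))       # q(A) v2 = 0
```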

Corollary 2.1. Suppose a polynomial $p$ is factorized as $p = p_1 \cdots p_s$, where $p_1, \dots, p_s$ are pairwise relatively prime (no common roots). Let $N_{p_i} := \mathrm{Ker}\, p_i(A)$. Then

$$N_p = N_{p_1} \oplus \cdots \oplus N_{p_s}.$$

Theorem 2.5 (Spectral Decomposition). Let $p_m$ be the minimal polynomial of $A$. Suppose $p_m$ can be factorized as

Jordan matrix

A matrix $J$ is called a Jordan normal form of a matrix $A$ if we can find a matrix $V$ such that

Here, $\lambda_{k_i}$ are the eigenvalues of $A$, the vectors $v_k^j \in \mathbb{C}^n$ are called the generalized eigenvectors of $A$, and the matrices $J_k$ are called Jordan blocks of size $k$ of $A$. The matrix $V_k = [v_k^1, \dots, v_k^k]$ is an $n \times k$ matrix. We can restrict $A$ to $V_k$, $k = k_1, \dots, k_s$, as

$$A V_k = A[v_k^1, \dots, v_k^k] = [v_k^1, \dots, v_k^k] J_k, \quad k = k_1, \dots, k_s.$$

For each generalized vector,

It is easy to check that

N²_k =

Theorem 2.6. Any matrix A over C is similar to a Jordan normal form. The structure of this Jordan normal form is unique.

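Example. Suppose $A$ is a $2 \times 2$ matrix with a double eigenvalue $\lambda$. Let $N_1 = \mathrm{Ker}(A - \lambda I)$ and $N_2 = \mathrm{Ker}(A - \lambda I)^2$. We assume $\dim N_1 = 1$. Then $N_1 \subset N_2 = \mathbb{C}^2$. Let us choose any $v_2 \in N_2 \setminus N_1$ and define $v_1 = (A - \lambda I)v_2$. Then $(A - \lambda I)v_1 = (A - \lambda I)^2 v_2 = 0$. Thus, under the basis $[v_1, v_2]$, the matrix $A$ is transformed to the Jordan block $J_2(\lambda)$.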
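The construction in this example can be carried out numerically. The sketch below assumes a specific defective matrix (chosen for illustration) with double eigenvalue $\lambda = 2$ and $\dim N_1 = 1$; the choice $v_2 = e_1$ is likewise an assumption, valid here because $(A - 2I)e_1 \neq 0$.

```python
import numpy as np

A = np.array([[3.0, 1.0],
              [-1.0, 1.0]])      # trace 4, determinant 4: double eigenvalue 2
lam = 2.0
B = A - lam * np.eye(2)          # B has rank 1 and B @ B = 0

v2 = np.array([1.0, 0.0])        # B @ v2 != 0, so v2 lies in N2 but not in N1
v1 = B @ v2                      # then B @ v1 = B^2 v2 = 0, so v1 is an eigenvector

V = np.column_stack([v1, v2])
J = np.linalg.solve(V, A @ V)    # V^{-1} A V
print(J)                         # the Jordan block [[2, 1], [0, 2]]
```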

Orthogonality, Self-adjoint operators

There are some other decompositions, available mainly when the underlying space $\mathbb{R}^n$ or $\mathbb{C}^n$ is endowed with an inner product structure.

Below V and W are vector spaces.

1. Orthogonal projection: given a subspace $W \subset V$, there is an orthogonal projection $P : V \to W$ such that (i) $Pw = w$ for all $w \in W$, (ii) $(I - P)v \perp W$ for all $v \in V$.

2. For any subspace $W \subset V$, there is a subspace $W^\perp$ such that (i) $V = W \oplus W^\perp$, (ii) $W \cap W^\perp = \{0\}$, (iii) $W \perp W^\perp$.

3. Self-adjoint operator: for $A = (a_{ij})$ we define $A^* = (\bar{a}_{ji})$. A matrix $A$ is called self-adjoint if $A^* = A$.

4. Alternatively, $A^*$ is defined by

$$\langle v, A^* w\rangle = \langle Av, w\rangle,$$

and $A$ is self-adjoint if $\langle Av, w\rangle = \langle v, Aw\rangle$ for all $v, w$.

5. A matrix $U$ is unitary if $U^*U = UU^* = I$. This is equivalent to saying that the columns of $U = [u_1, \dots, u_n]$ form an orthonormal set $\{u_i\}_{i=1}^n$.
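Items 1 and 4 above can be checked numerically. This is a minimal sketch under stated assumptions: the subspace $W$ is taken to be the column span of a random $5 \times 2$ matrix, the projection is built from a QR factorization (one common construction, not necessarily the one intended in the notes), and the inner product is $\langle x, y\rangle = \sum_i x_i \bar{y}_i$. The helper name `inner` is hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Orthogonal projection onto W = column span of B.
B = rng.standard_normal((5, 2))
Q, _ = np.linalg.qr(B)             # columns of Q: orthonormal basis of W
P = Q @ Q.T                        # orthogonal projection onto W

w = B @ np.array([2.0, -1.0])      # a vector in W
v = rng.standard_normal(5)         # an arbitrary vector
print(np.linalg.norm(P @ w - w))           # (i)  P w = w
print(np.linalg.norm(B.T @ (v - P @ v)))   # (ii) (I - P) v is orthogonal to W

# Adjoint identity <x, A* y> = <A x, y> for a complex matrix.
def inner(x, y):
    """<x, y> = sum_i x_i * conj(y_i): linear in x, conjugate-linear in y."""
    return np.vdot(y, x)

C = rng.standard_normal((3, 3)) + 1j * rng.standard_normal((3, 3))
C_star = C.conj().T                # A* = conjugate transpose
x = rng.standard_normal(3) + 1j * rng.standard_normal(3)
y = rng.standard_normal(3) + 1j * rng.standard_normal(3)
print(abs(inner(x, C_star @ y) - inner(C @ x, y)))
```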

Theorem 2.7. If A is self adjoint, then A is diagonalizable by a unitary matrix U and all eigenvalues are real.

Proof. 1. Suppose $\mu$ is an eigenvalue of $A$. By the spectral decomposition theorem, we can find the maximal invariant subspace $W$ corresponding to $\mu I - A$. Let $J = \mu I - A$. We claim that $J = 0$ on $W$.

2. Since $A$ is self-adjoint and $\mu$ is real (see step 4), so is $J$.

3. Let the minimal polynomial of $J$ on $W$ be $p_m(\lambda) = \lambda^m$. If $m > 1$, then there exist independent vectors $v_1$ and $v_2$ such that

$$J v_1 = 0, \quad J v_2 = v_1.$$

Then we have

$$\langle v_1, v_1\rangle = \langle J v_2, v_1\rangle = \langle v_2, J v_1\rangle = 0.$$

This contradicts $v_1 \neq 0$. Hence $m = 1$, which also means $J = 0$ on $W$.

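4. The eigenvalues are real. Suppose $\lambda$, $v$ are an eigenvalue/eigenvector pair. Then

$$\lambda\langle v, v\rangle = \langle \lambda v, v\rangle = \langle Av, v\rangle = \langle v, Av\rangle = \langle v, \lambda v\rangle = \bar{\lambda}\langle v, v\rangle.$$

Since $\langle v, v\rangle > 0$, we get $\lambda = \bar{\lambda}$.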

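5. The eigenspaces corresponding to two distinct eigenvalues $\lambda \neq \mu$ are orthogonal to each other. Suppose

$$Av = \lambda v, \quad Aw = \mu w, \quad \lambda \neq \mu.$$

Then

$$\lambda\langle v, w\rangle = \langle Av, w\rangle = \langle v, Aw\rangle = \mu\langle v, w\rangle,$$

where the last equality uses that $\mu$ is real. Hence, we get $\langle v, w\rangle = 0$.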
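The statement of Theorem 2.7 can be observed numerically. The sketch below assumes a random Hermitian test matrix and uses NumPy's `np.linalg.eigh`, a routine for Hermitian matrices that returns real eigenvalues and orthonormal eigenvectors. It illustrates the theorem and is not the constructive argument of the proof.

```python
import numpy as np

rng = np.random.default_rng(1)
Z = rng.standard_normal((4, 4)) + 1j * rng.standard_normal((4, 4))
A = Z + Z.conj().T                    # self-adjoint by construction: A* = A

eigvals, U = np.linalg.eigh(A)        # real eigenvalues, eigenvectors in columns of U

print(np.linalg.norm(U.conj().T @ U - np.eye(4)))              # U is unitary
print(np.linalg.norm(U.conj().T @ A @ U - np.diag(eigvals)))   # U* A U is diagonal

# A general-purpose eigenvalue routine also returns (numerically) real eigenvalues.
general = np.linalg.eigvals(A)
print(np.max(np.abs(general.imag)))
```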

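The Rayleigh quotient method is a constructive method for finding the eigenvalues of a self-adjoint operator. First,

$$\lambda_1 = \max_{v \neq 0} \frac{\langle Av, v\rangle}{\langle v, v\rangle}.$$

Let $V_1$ be the corresponding eigenspace. Then

$$\lambda_2 = \max_{v \perp V_1,\ v \neq 0} \frac{\langle Av, v\rangle}{\langle v, v\rangle}.$$

This process can be continued inductively to find all eigenvalues and eigenvectors.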
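The first two steps of this process can be sketched numerically for a real symmetric matrix. The assumptions here are the test matrix, the use of `np.linalg.eigh` to obtain the maximizer $v_1$, and a complete QR factorization to obtain an orthonormal basis of the complement of $V_1$. Restricting the quotient to that complement reduces the second maximization to a smaller symmetric eigenvalue problem. This is an illustration, not an efficient eigenvalue algorithm.

```python
import numpy as np

def rayleigh(A, v):
    """Rayleigh quotient <Av, v> / <v, v> for a real symmetric A."""
    return (v @ A @ v) / (v @ v)

A = np.array([[2.0, 1.0, 0.0],
              [1.0, 3.0, 1.0],
              [0.0, 1.0, 4.0]])

eigvals, U = np.linalg.eigh(A)         # eigenvalues in ascending order
lam1, v1 = eigvals[-1], U[:, -1]       # maximum of the Rayleigh quotient and a maximizer

# Orthonormal basis of the complement of span{v1}: columns 2..n of a complete QR of v1.
Q, _ = np.linalg.qr(v1.reshape(-1, 1), mode="complete")
Q_perp = Q[:, 1:]

# Maximizing over v = Q_perp @ c is the same problem for Q_perp^T A Q_perp.
lam2 = np.linalg.eigvalsh(Q_perp.T @ A @ Q_perp)[-1]

# Random sampling never exceeds lam1 (up to rounding).
rng = np.random.default_rng(0)
samples = rng.standard_normal((1000, 3))
max_sampled = max(rayleigh(A, v) for v in samples)
print(lam1, lam2, max_sampled)
```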

Singular Value Decomposition

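Theorem 2.8. Let $A : \mathbb{R}^n \to \mathbb{R}^m$ (or $\mathbb{C}^n \to \mathbb{C}^m$). Then there exist orthonormal bases $V = [v_1, \dots, v_n]$ of $\mathbb{R}^n$ and $U = [u_1, \dots, u_m]$ of $\mathbb{R}^m$, and positive numbers

$$\sigma_1 \ge \dots \ge \sigma_p > 0, \quad p \le \min(m, n),$$

such that

$$A v_i = \sigma_i u_i, \quad i = 1, \dots, p, \qquad A v_i = 0 \quad \text{for } p < i \le n.$$

Or, in matrix form,

$$AV = U\Sigma,$$

where $V$ is an $n \times n$ unitary matrix, $U$ is an $m \times m$ unitary matrix, and $\Sigma$ is the $m \times n$ diagonal matrix

$$\Sigma = \begin{cases} (\mathrm{diag}(\sigma_1, \dots, \sigma_p),\ 0) & \text{if } m \le n, \\ (\mathrm{diag}(\sigma_1, \dots, \sigma_p),\ 0)^T & \text{if } m > n. \end{cases}$$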

Proof. 1. The matrix $A^*A$ is self-adjoint, so all its eigenvalues are real. They are also non-negative, because if $\lambda$ and $v$ are an eigenvalue/eigenvector pair, then from the Rayleigh quotient

$$\lambda\langle v, v\rangle = \langle A^*Av, v\rangle = \langle Av, Av\rangle \ge 0.$$

2. From the spectral decomposition of the self-adjoint matrix $A^*A$, we can find a unitary matrix $V = [v_1, \dots, v_n]$ and $\Lambda = \mathrm{diag}(\lambda_1, \dots, \lambda_p, 0, \dots, 0)$ such that $A^*AV = V\Lambda$. Here $\lambda_1 \ge \dots \ge \lambda_p > 0$, and the remaining eigenvalues are 0. The corresponding eigenspace, spanned by $v_{p+1}, \dots, v_n$, is the kernel $N(A^*A)$.
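The relations in Theorem 2.8 and in the first steps of the proof can be checked on a small rank-deficient matrix, where $p < \min(m, n)$. The sketch assumes NumPy's `np.linalg.svd`, which returns $U$, the singular values, and $V^*$. The test matrix and the threshold `1e-10` used to count positive singular values are illustrative choices.

```python
import numpy as np

A = np.array([[1.0, 2.0],
              [2.0, 4.0],
              [3.0, 6.0]])                 # m = 3, n = 2, rank 1

U, s, Vh = np.linalg.svd(A)                # U: 3x3, s: min(m, n) values, Vh = V^*: 2x2
V = Vh.conj().T
p = int(np.sum(s > 1e-10))                 # number of strictly positive singular values

Sigma = np.zeros(A.shape)                  # m x n "diagonal" matrix
Sigma[:len(s), :len(s)] = np.diag(s)

print(p)                                   # rank of A
print(np.linalg.norm(A @ V - U @ Sigma))   # A V = U Sigma
print(np.linalg.norm(A @ V[:, p:]))        # A v_i = 0 for i > p

# Eigenvalues of A^* A are the squares of the singular values.
eig_AtA = np.linalg.eigvalsh(A.T @ A)[::-1]
print(eig_AtA, s**2)
```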


1. $A^*$ has the following representation:

$$A^* u_i = \frac{1}{\sigma_i} A^*(A v_i) = \sigma_i v_i.$$

2. The domain and range of A can be decomposed into

$$\mathbb{R}^n = [v_1, \dots, v_p] \oplus [v_{p+1}, \dots, v_n] = [v_1, \dots, v_p] \oplus N(A),$$

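4. The least-squares solution for $Ax = b$ is the minimizer of $\frac{1}{2}\|Ax - b\|_2^2$. With the singular value decomposition, we can represent

$$b = \sum_{i=1}^{p} \langle b, u_i\rangle u_i + b^\perp,$$

where $b^\perp$ is orthogonal to $u_1, \dots, u_p$. The least squares solution is

$$x^* = \sum_{i=1}^{p} \frac{\langle b, u_i\rangle}{\sigma_i} v_i,$$

which minimizes $\|Ax - b\|^2$ with minimal value

$$\|Ax^* - b\|^2 = \|b^\perp\|^2.$$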
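The least-squares item above can be sketched with NumPy. The data `A` and `b` are assumed test values. The solution is assembled from the SVD by dividing the components of $b$ along $u_1, \dots, u_p$ by the singular values, and it is compared against `np.linalg.lstsq` as an independent reference.

```python
import numpy as np

A = np.array([[1.0, 0.0],
              [1.0, 1.0],
              [1.0, 2.0]])
b = np.array([1.0, 0.0, 2.0])

U, s, Vh = np.linalg.svd(A)
p = int(np.sum(s > 1e-12))             # number of positive singular values

coeffs = U[:, :p].T @ b                # components of b along u_1, ..., u_p
b_perp = b - U[:, :p] @ coeffs         # part of b orthogonal to the range of A
x_star = Vh[:p].T @ (coeffs / s[:p])   # sum of (coefficient / sigma_i) * v_i

x_ref = np.linalg.lstsq(A, b, rcond=None)[0]
print(x_star, x_ref)
print(np.linalg.norm(A @ x_star - b)**2, np.linalg.norm(b_perp)**2)
```

When $A$ has a nontrivial kernel the minimizer is not unique; the vector built this way has no component in $N(A)$.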

Norm in vector space

In analysis, we need to measure how close two vectors are, which underlies the concept of convergence. A natural way to do this is to define a norm for vectors.

Definition 2.6. Let $V$ be a vector space. A mapping $\|\cdot\| : V \to \mathbb{R}$ is called a norm if

(i) $\|x\| \ge 0$, and $\|x\| = 0$ if and only if $x = 0$;

(ii) $\|\lambda x\| = |\lambda|\,\|x\|$ for any $\lambda \in \mathbb{R}$ and any $x \in V$;

(iii) $\|x + y\| \le \|x\| + \|y\|$ for any $x, y \in V$.

A vector space endowed with a norm $\|\cdot\|$ is called a normed vector space.
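The three axioms can be spot-checked numerically for some familiar norms. The choice of the 1-norm, 2-norm, and infinity-norm below is an assumption made for illustration (they are the orders accepted by `np.linalg.norm` for vectors). A check on random vectors is of course not a proof.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.standard_normal(6)
y = rng.standard_normal(6)
lam = -2.5

checks = {}
for p in (1, 2, np.inf):
    nx = np.linalg.norm(x, p)
    ny = np.linalg.norm(y, p)
    checks[p] = (
        nx >= 0 and np.linalg.norm(np.zeros(6), p) == 0.0,          # (i) positivity
        bool(np.isclose(np.linalg.norm(lam * x, p), abs(lam) * nx)),  # (ii) homogeneity
        bool(np.linalg.norm(x + y, p) <= nx + ny + 1e-12),            # (iii) triangle inequality
    )

for p, result in checks.items():
    print(p, result)
```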

In Rⁿ, we define the norms
