Tsung-Ming Huang
Department of Mathematics, National Taiwan Normal University
October 3, 2011
Outline
1 Elementary matrices
2 LR-factorization
3 Gaussian elimination
4 Cholesky factorization
5 Error estimation for linear systems
Elementary matrices
Let A ∈ C^{n×n} be a nonsingular matrix. We want to solve the linear system Ax = b by
(a) Direct methods (finite number of steps);
(b) Iterative methods (convergence; see Chapter 4).
\[
A = \begin{pmatrix} 1 & 1 & 0 & 3 \\ 2 & 1 & -1 & 1 \\ 3 & -1 & -1 & 2 \\ -1 & 2 & 3 & -1 \end{pmatrix},
\qquad
L_1 = \begin{pmatrix} 1 & 0 & 0 & 0 \\ -2 & 1 & 0 & 0 \\ -3 & 0 & 1 & 0 \\ 1 & 0 & 0 & 1 \end{pmatrix},
\]
\[
A_1 := L_1 A = \begin{pmatrix} 1 & 1 & 0 & 3 \\ 0 & -1 & -1 & -5 \\ 0 & -4 & -1 & -7 \\ 0 & 3 & 3 & 2 \end{pmatrix},
\qquad
L_2 = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & -4 & 1 & 0 \\ 0 & 3 & 0 & 1 \end{pmatrix},
\]
\[
A_2 := L_2 A_1 = \begin{pmatrix} 1 & 1 & 0 & 3 \\ 0 & -1 & -1 & -5 \\ 0 & 0 & 3 & 13 \\ 0 & 0 & 0 & -13 \end{pmatrix} = L_2 L_1 A.
\]
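A quick NumPy check of these two elimination steps (a sketch added for illustration, not part of the original slides):

import numpy as np

A = np.array([[1., 1., 0., 3.],
              [2., 1., -1., 1.],
              [3., -1., -1., 2.],
              [-1., 2., 3., -1.]])
L1 = np.array([[1., 0., 0., 0.],
               [-2., 1., 0., 0.],
               [-3., 0., 1., 0.],
               [1., 0., 0., 1.]])
L2 = np.array([[1., 0., 0., 0.],
               [0., 1., 0., 0.],
               [0., -4., 1., 0.],
               [0., 3., 0., 1.]])

A2 = L2 @ L1 @ A
print(A2)  # upper triangular: rows [1 1 0 3], [0 -1 -1 -5], [0 0 3 13], [0 0 0 -13]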
We have
\[
A = L_1^{-1} L_2^{-1} A_2 = LR,
\]
where L and R are lower and upper triangular, respectively.
Question
How to compute L_1^{-1} and L_2^{-1}?
\[
L_1 = \begin{pmatrix} 1 & 0 & 0 & 0 \\ -2 & 1 & 0 & 0 \\ -3 & 0 & 1 & 0 \\ 1 & 0 & 0 & 1 \end{pmatrix}
= I - \begin{pmatrix} 0 \\ 2 \\ 3 \\ -1 \end{pmatrix} \begin{pmatrix} 1 & 0 & 0 & 0 \end{pmatrix},
\qquad
L_2 = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & -4 & 1 & 0 \\ 0 & 3 & 0 & 1 \end{pmatrix}
= I - \begin{pmatrix} 0 \\ 0 \\ 4 \\ -3 \end{pmatrix} \begin{pmatrix} 0 & 1 & 0 & 0 \end{pmatrix}.
\]
Definition 1
A matrix of the form
\[
I - \alpha x y^* \qquad (\alpha \in F,\; x, y \in F^n)
\]
is called an elementary matrix.
The eigenvalues of I − αxy^* are {1, 1, . . . , 1, 1 − αy^*x}. A direct computation gives
\[
(I - \alpha x y^*)(I - \beta x y^*) = I - (\alpha + \beta - \alpha\beta y^* x)\, x y^*.
\]
If αy^*x − 1 ≠ 0 and we let β = α/(αy^*x − 1), then α + β − αβy^*x = 0. We have
\[
(I - \alpha x y^*)^{-1} = I - \beta x y^*,
\]
where 1/α + 1/β = y^*x.
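The inverse formula is easy to verify numerically; the following sketch assumes real x, y (so y^*x = yᵀx) and uses illustrative values:

import numpy as np

rng = np.random.default_rng(0)
n = 5
x, y = rng.standard_normal(n), rng.standard_normal(n)
alpha = 0.7
beta = alpha / (alpha * (y @ x) - 1)      # requires alpha * y^T x != 1

E = np.eye(n) - alpha * np.outer(x, y)    # elementary matrix I - alpha x y^T
Einv = np.eye(n) - beta * np.outer(x, y)  # claimed inverse I - beta x y^T
print(np.allclose(E @ Einv, np.eye(n)))   # True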
Example 1
Let x ∈ F^n with x^*x = 1, let H = {z : z^*x = 0} and Q = I − 2xx^* (so Q = Q^*, Q^{-1} = Q).
Then Q reflects each vector with respect to the hyperplane H. Let y = αx + w with w ∈ H. Then we have
\[
Qy = \alpha Qx + Qw = -\alpha x + w - 2(x^* w)x = -\alpha x + w.
\]
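A small sketch of this reflection property (the values are illustrative):

import numpy as np

rng = np.random.default_rng(1)
x = rng.standard_normal(4)
x /= np.linalg.norm(x)               # enforce x^T x = 1
Q = np.eye(4) - 2 * np.outer(x, x)   # reflector Q = I - 2 x x^T

w = rng.standard_normal(4)
w -= (x @ w) * x                     # project w into H, so x^T w = 0
y = 2.5 * x + w
print(np.allclose(Q @ y, -2.5 * x + w))  # True: Q y = -alpha x + w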
Let y = e_i be the i-th column of the unit matrix and x = l_i = [0, · · · , 0, l_{i+1,i}, · · · , l_{n,i}]^T. Then
\[
I + l_i e_i^T =
\begin{pmatrix}
1 & & & & \\
& \ddots & & & \\
& & 1 & & \\
& & l_{i+1,i} & \ddots & \\
& & \vdots & & \\
& & l_{n,i} & & 1
\end{pmatrix}. \tag{1}
\]
Since e_i^T l_i = 0, we have
\[
(I + l_i e_i^T)^{-1} = I - l_i e_i^T.
\]
From the equality
\[
(I + l_1 e_1^T)(I + l_2 e_2^T) = I + l_1 e_1^T + l_2 e_2^T + l_1 (e_1^T l_2) e_2^T = I + l_1 e_1^T + l_2 e_2^T
\]
it follows that
\[
(I + l_1 e_1^T) \cdots (I + l_i e_i^T) \cdots (I + l_{n-1} e_{n-1}^T)
= I + l_1 e_1^T + l_2 e_2^T + \cdots + l_{n-1} e_{n-1}^T
=
\begin{pmatrix}
1 & & & 0 \\
l_{21} & \ddots & & \\
\vdots & \ddots & \ddots & \\
l_{n1} & \cdots & l_{n,n-1} & 1
\end{pmatrix}. \tag{2}
\]
Theorem 2
A lower triangular matrix with “1” on the diagonal can be written as the product of n − 1 elementary matrices of the form (1).
Remark: (I + l_1 e_1^T + · · · + l_{n-1} e_{n-1}^T)^{-1} = (I − l_{n-1} e_{n-1}^T) · · · (I − l_1 e_1^T), which cannot be simplified as in (2).
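A numerical check of (2) (a sketch; the construction of the vectors l_i is illustrative):

import numpy as np

n = 4
rng = np.random.default_rng(2)
L = np.eye(n)
ls = []
for i in range(n - 1):
    l = np.zeros(n)
    l[i + 1:] = rng.standard_normal(n - 1 - i)   # l_i has zeros in entries 1..i
    ls.append(l)
    L[i + 1:, i] = l[i + 1:]                     # I + sum_i l_i e_i^T

P = np.eye(n)
for i, l in enumerate(ls):
    E = np.eye(n)
    E[:, i] += l                                 # factor I + l_i e_i^T
    P = P @ E
print(np.allclose(P, L))  # True: the product just overlays the columns, as in (2)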
LR-factorization
Definition 3
Given A ∈ C^{n×n}, a lower triangular matrix L with “1” on the diagonal and an upper triangular matrix R. If A = LR, then the product LR is called an LR-factorization (or LR-decomposition) of A.
Basic problem
Given b ≠ 0, b ∈ F^n, find a vector l_1 = [0, l_{21}, · · · , l_{n1}]^T and c ∈ F such that
\[
(I - l_1 e_1^T)\, b = c\, e_1.
\]
Solution: componentwise, the equation requires
\[
b_1 = c, \qquad b_i - l_{i1} b_1 = 0, \quad i = 2, \ldots, n.
\]
If b_1 = 0, there is no solution (since b ≠ 0); if b_1 ≠ 0, then c = b_1 and l_{i1} = b_i / b_1 for i = 2, . . . , n.
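The basic problem as a tiny sketch in code (the vector b is an arbitrary example):

import numpy as np

b = np.array([2., 4., -6., 8.])
assert b[0] != 0, "no solution when b_1 = 0"
c = b[0]
l1 = np.zeros_like(b)
l1[1:] = b[1:] / b[0]                        # l_{i1} = b_i / b_1
E = np.eye(4) - np.outer(l1, np.eye(4)[0])   # I - l_1 e_1^T
print(E @ b)  # [2, 0, 0, 0] = c * e_1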
Construction of the LR-factorization:
Let A = A^{(0)} = [a_1^{(0)}, · · · , a_n^{(0)}]. Apply the basic problem to a_1^{(0)}: if a_{11}^{(0)} ≠ 0, then there exists L_1 = I − l_1 e_1^T such that (I − l_1 e_1^T) a_1^{(0)} = a_{11}^{(0)} e_1. Thus
\[
A^{(1)} = L_1 A^{(0)} = [L_1 a_1^{(0)}, \cdots, L_1 a_n^{(0)}] =
\begin{pmatrix}
a_{11}^{(0)} & a_{12}^{(0)} & \cdots & a_{1n}^{(0)} \\
0 & a_{22}^{(1)} & & a_{2n}^{(1)} \\
\vdots & \vdots & & \vdots \\
0 & a_{n2}^{(1)} & \cdots & a_{nn}^{(1)}
\end{pmatrix}.
\]
The k-th step:
\[
A^{(k)} = L_k A^{(k-1)} = L_k L_{k-1} \cdots L_1 A^{(0)} \tag{3}
\]
\[
=
\begin{pmatrix}
a_{11}^{(0)} & \cdots & & & \cdots & a_{1n}^{(0)} \\
0 & \ddots & & & & \vdots \\
\vdots & \ddots & a_{kk}^{(k-1)} & \cdots & \cdots & a_{kn}^{(k-1)} \\
\vdots & & 0 & a_{k+1,k+1}^{(k)} & \cdots & a_{k+1,n}^{(k)} \\
\vdots & & \vdots & \vdots & & \vdots \\
0 & \cdots & 0 & a_{n,k+1}^{(k)} & \cdots & a_{nn}^{(k)}
\end{pmatrix}.
\]
If a_{kk}^{(k-1)} ≠ 0 for k = 1, . . . , n − 1, then the method is executable and
\[
A^{(n-1)} = L_{n-1} \cdots L_1 A^{(0)} = R
\]
is an upper triangular matrix. Thus, A = LR.
Explicit representation of L:
\[
L_k = I - l_k e_k^T, \qquad L_k^{-1} = I + l_k e_k^T,
\]
\[
L = L_1^{-1} \cdots L_{n-1}^{-1} = (I + l_1 e_1^T) \cdots (I + l_{n-1} e_{n-1}^T)
= I + l_1 e_1^T + \cdots + l_{n-1} e_{n-1}^T \quad \text{(by (2))}.
\]
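Collecting the steps, a minimal LR-factorization routine might look as follows (a sketch that assumes all pivots a_{kk}^{(k-1)} are nonzero; the name lr_factor is illustrative):

import numpy as np

def lr_factor(A):
    """Return unit lower triangular L and upper triangular R with A = L R."""
    n = A.shape[0]
    R = A.astype(float).copy()
    L = np.eye(n)
    for k in range(n - 1):
        if R[k, k] == 0:
            raise ZeroDivisionError("zero pivot: LR-factorization fails")
        L[k + 1:, k] = R[k + 1:, k] / R[k, k]            # multipliers l_{ik}
        R[k + 1:, k:] -= np.outer(L[k + 1:, k], R[k, k:])  # eliminate column k
    return L, R

A = np.array([[1., 1., 0., 3.], [2., 1., -1., 1.],
              [3., -1., -1., 2.], [-1., 2., 3., -1.]])
L, R = lr_factor(A)
print(np.allclose(L @ R, A))  # True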
Theorem 4
Let A be nonsingular. Then A has an LR-factorization (A = LR) if and only if κ_i := det(A_i) ≠ 0 for i = 1, . . . , n − 1, where A_i is the i-th leading principal submatrix of A, i.e.,
\[
A_i = \begin{pmatrix} a_{11} & \cdots & a_{1i} \\ \vdots & \ddots & \vdots \\ a_{i1} & \cdots & a_{ii} \end{pmatrix}.
\]
Proof: (Necessity “⇒”): Since A = LR, we have
\[
\begin{pmatrix} a_{11} & \cdots & a_{1i} \\ \vdots & & \vdots \\ a_{i1} & \cdots & a_{ii} \end{pmatrix}
=
\begin{pmatrix} l_{11} & & 0 \\ \vdots & \ddots & \\ l_{i1} & \cdots & l_{ii} \end{pmatrix}
\begin{pmatrix} r_{11} & \cdots & r_{1i} \\ & \ddots & \vdots \\ 0 & & r_{ii} \end{pmatrix}.
\]
From det(A) ≠ 0 it follows that det(L) ≠ 0 and det(R) ≠ 0. Thus l_{jj} ≠ 0 and r_{jj} ≠ 0 for j = 1, . . . , n. Hence κ_i = l_{11} · · · l_{ii} r_{11} · · · r_{ii} ≠ 0.
(Sufficiency “⇐”): From (3) we have
\[
A^{(0)} = (L_1^{-1} \cdots L_i^{-1}) A^{(i)}.
\]
Consider the (i + 1)-th leading principal determinant. From (3) we have
\[
\begin{pmatrix} a_{11} & \cdots & a_{1,i+1} \\ \vdots & & \vdots \\ a_{i+1,1} & \cdots & a_{i+1,i+1} \end{pmatrix}
=
\begin{pmatrix} 1 & & & 0 \\ l_{21} & \ddots & & \\ \vdots & \ddots & \ddots & \\ l_{i+1,1} & \cdots & l_{i+1,i} & 1 \end{pmatrix}
\begin{pmatrix} a_{11}^{(0)} & a_{12}^{(0)} & \cdots & a_{1,i+1}^{(0)} \\ & a_{22}^{(1)} & \cdots & a_{2,i+1}^{(1)} \\ & & \ddots & \vdots \\ & & & a_{i+1,i+1}^{(i)} \end{pmatrix}.
\]
Thus κ_{i+1} = 1 · a_{11}^{(0)} a_{22}^{(1)} · · · a_{i+1,i+1}^{(i)} ≠ 0, which implies a_{i+1,i+1}^{(i)} ≠ 0.
Therefore, the LR-factorization of A exists.
Theorem 5
If a nonsingular matrix A has an LR-factorization A = LR with l_{11} = · · · = l_{nn} = 1, then the factorization is unique.
Proof: Let A = L_1 R_1 = L_2 R_2. Then L_2^{-1} L_1 = R_2 R_1^{-1} is both unit lower triangular and upper triangular, hence equals I.
Corollary 6
A nonsingular matrix A has a factorization A = LDR, where D is diagonal and L and R^T are unit lower triangular (with ones on the diagonal), if and only if κ_i ≠ 0 for i = 1, . . . , n − 1.
Theorem 7
Let A be a nonsingular matrix. Then there exists a permutation P such that P A has an LR-factorization.
Proof: By construction! Consider (3): there is a permutation P_k, which interchanges the k-th row with a row of index larger than k, such that a_{kk}^{(k-1)} ≠ 0 (in P_k A^{(k-1)}). This procedure is executable for k = 1, . . . , n − 1, so we have
\[
L_{n-1} P_{n-1} \cdots L_k P_k \cdots L_1 P_1 A^{(0)} = R. \tag{4}
\]
Let P be a permutation which affects only the elements k + 1, . . . , n. It holds that
\[
P (I - l_k e_k^T) P^{-1} = I - (P l_k) e_k^T = I - \tilde{l}_k e_k^T = \tilde{L}_k \qquad (e_k^T P^{-1} = e_k^T),
\]
where \tilde{L}_k is lower triangular. Hence we have P L_k = \tilde{L}_k P.
Now move all the P_k in (4) to the right:
\[
L_{n-1} \tilde{L}_{n-2} \cdots \tilde{L}_1 P_{n-1} \cdots P_1 A^{(0)} = R.
\]
Then we have P A = LR with L^{-1} = L_{n-1} \tilde{L}_{n-2} \cdots \tilde{L}_1 and P = P_{n-1} \cdots P_1.
Gaussian elimination
Given a linear system
Ax = b
with A nonsingular. We first assume that A has an LR-factorization, i.e., A = LR. Thus
LRx = b.
We then (i) solve Ly = b; (ii) solve Rx = y. These imply that LRx = Ly = b. From (4), we have
\[
L_{n-1} \cdots L_2 L_1 (A \mid b) = (R \mid L^{-1} b).
\]
Algorithm: Gaussian elimination without permutation
1: for k = 1, . . . , n − 1 do
2:   if a_{kk} = 0 then
3:     Stop.
4:   else
5:     ω_j := a_{kj} (j = k + 1, . . . , n);
6:   end if
7:   for i = k + 1, . . . , n do
8:     η := a_{ik}/a_{kk}; a_{ik} := η; b_i := b_i − η b_k;
9:     for j = k + 1, . . . , n do
10:      a_{ij} := a_{ij} − η ω_j;
11:    end for
12:  end for
13: end for
14: x_n = b_n/a_{nn};
15: for i = n − 1, n − 2, . . . , 1 do
16:   x_i = (b_i − Σ_{j=i+1}^{n} a_{ij} x_j)/a_{ii};
17: end for
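A direct transcription of the algorithm to runnable Python (a sketch; the inner j-loop is vectorized, and copies of A and b are overwritten as in the pseudocode):

import numpy as np

def gauss_solve(A, b):
    A, b = A.astype(float).copy(), b.astype(float).copy()
    n = len(b)
    for k in range(n - 1):
        if A[k, k] == 0:
            raise ZeroDivisionError("zero pivot encountered")
        for i in range(k + 1, n):
            eta = A[i, k] / A[k, k]
            A[i, k] = eta                        # store the multiplier
            b[i] -= eta * b[k]
            A[i, k + 1:] -= eta * A[k, k + 1:]   # row update
    x = np.empty(n)
    x[-1] = b[-1] / A[-1, -1]
    for i in range(n - 2, -1, -1):               # back substitution
        x[i] = (b[i] - A[i, i + 1:] @ x[i + 1:]) / A[i, i]
    return x

A = np.array([[1., 1., 0., 3.], [2., 1., -1., 1.],
              [3., -1., -1., 2.], [-1., 2., 3., -1.]])
b = np.array([4., 1., -3., 4.])
print(np.allclose(A @ gauss_solve(A, b), b))  # True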
Cost of computation (a flop is a floating point operation):
(i) LR-factorization: 2n³/3 flops;
(ii) computation of y: n² flops;
(iii) computation of x: n² flops.
For A^{-1}: 8n³/3 ≈ 2n³/3 + 2kn² flops, with k = n linear systems.
Pivoting: (a) partial pivoting; (b) complete pivoting.
From (3), we have
\[
A^{(k-1)} =
\begin{pmatrix}
a_{11}^{(0)} & \cdots & & & \cdots & a_{1n}^{(0)} \\
0 & \ddots & & & & \vdots \\
\vdots & & a_{k-1,k-1}^{(k-2)} & \cdots & \cdots & a_{k-1,n}^{(k-2)} \\
\vdots & & 0 & a_{kk}^{(k-1)} & \cdots & a_{kn}^{(k-1)} \\
\vdots & & \vdots & \vdots & & \vdots \\
0 & \cdots & 0 & a_{nk}^{(k-1)} & \cdots & a_{nn}^{(k-1)}
\end{pmatrix}.
\]
Partial pivoting
Find p ∈ {k, . . . , n} such that
\[
|a_{pk}| = \max_{k \le i \le n} |a_{ik}| \quad (r_k := p); \qquad
\text{swap } a_{kj} \text{ with } a_{pj} \text{ for } j = k, \ldots, n, \text{ and } b_k \text{ with } b_p. \tag{5}
\]
Replacing the stopping step in Line 3 of the Gaussian elimination algorithm by (5), we obtain a factorization of A with partial pivoting, i.e., P A = LR (by Theorem 7, with |l_{ij}| ≤ 1 for i, j = 1, . . . , n).
For solving the linear system Ax = b, we use
\[
P A x = P b \;\Rightarrow\; L(Rx) = P b \equiv \tilde{b}.
\]
It needs an extra n(n − 1)/2 comparisons.
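A sketch of the resulting factorization P A = LR (function name illustrative; note that the computed multipliers indeed satisfy |l_{ij}| ≤ 1):

import numpy as np

def lr_partial_pivot(A):
    n = A.shape[0]
    R = A.astype(float).copy()
    L = np.eye(n)
    perm = np.arange(n)
    for k in range(n - 1):
        p = k + np.argmax(np.abs(R[k:, k]))         # pivot row, as in (5)
        R[[k, p], k:] = R[[p, k], k:]
        L[[k, p], :k] = L[[p, k], :k]               # swap already-computed multipliers
        perm[[k, p]] = perm[[p, k]]
        L[k + 1:, k] = R[k + 1:, k] / R[k, k]
        R[k + 1:, k:] -= np.outer(L[k + 1:, k], R[k, k:])
    P = np.eye(n)[perm]
    return P, L, R

A = np.array([[1e-4, 1.], [1., 1.]])
P, L, R = lr_partial_pivot(A)
print(np.allclose(P @ A, L @ R), np.abs(L).max() <= 1)  # True True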
Complete pivoting
Find p, q ∈ {k, . . . , n} such that
\[
|a_{pq}| = \max_{k \le i, j \le n} |a_{ij}| \quad (r_k := p,\; c_k := q); \qquad
\text{swap } a_{kj} \text{ and } b_k \text{ with } a_{pj} \text{ and } b_p \text{ (} j = k, \ldots, n \text{)}, \text{ swap } a_{ik} \text{ with } a_{iq} \text{ (} i = 1, \ldots, n \text{)}. \tag{6}
\]
Replacing the stopping step in Line 3 of the Gaussian elimination algorithm by (6), we also obtain a factorization of A with complete pivoting, i.e., P A Π = LR (by Theorem 7, with |l_{ij}| ≤ 1 for i, j = 1, . . . , n).
For solving the linear system Ax = b, we use
\[
P A \Pi (\Pi^T x) = P b \;\Rightarrow\; L R \tilde{x} = \tilde{b} \;\Rightarrow\; x = \Pi \tilde{x}.
\]
It needs about n³/3 comparisons.
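The complete-pivoting variant differs only in the pivot search and the extra column bookkeeping (again a sketch with illustrative names):

import numpy as np

def lr_complete_pivot(A):
    n = A.shape[0]
    R = A.astype(float).copy()
    L = np.eye(n)
    rows, cols = np.arange(n), np.arange(n)
    for k in range(n - 1):
        i, j = divmod(int(np.argmax(np.abs(R[k:, k:]))), n - k)  # max of trailing block
        p, q = k + i, k + j
        R[[k, p], :] = R[[p, k], :]; L[[k, p], :k] = L[[p, k], :k]
        R[:, [k, q]] = R[:, [q, k]]                               # column interchange
        rows[[k, p]] = rows[[p, k]]; cols[[k, q]] = cols[[q, k]]
        L[k + 1:, k] = R[k + 1:, k] / R[k, k]
        R[k + 1:, k:] -= np.outer(L[k + 1:, k], R[k, k:])
    P, Pi = np.eye(n)[rows], np.eye(n)[:, cols]
    return P, Pi, L, R

A = np.random.default_rng(3).standard_normal((4, 4))
P, Pi, L, R = lr_complete_pivot(A)
print(np.allclose(P @ A @ Pi, L @ R))  # True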
Let
\[
A = \begin{pmatrix} 10^{-4} & 1 \\ 1 & 1 \end{pmatrix}
\]
in three-decimal-digit floating point arithmetic. Then κ(A) = ‖A‖_∞ ‖A^{-1}‖_∞ ≈ 4, so A is well-conditioned.
Without pivoting:
\[
L = \begin{pmatrix} 1 & 0 \\ fl(1/10^{-4}) & 1 \end{pmatrix}, \quad fl(1/10^{-4}) = 10^4, \qquad
R = \begin{pmatrix} 10^{-4} & 1 \\ 0 & fl(1 - 10^4 \cdot 1) \end{pmatrix}, \quad fl(1 - 10^4 \cdot 1) = -10^4.
\]
\[
LR = \begin{pmatrix} 1 & 0 \\ 10^4 & 1 \end{pmatrix}
\begin{pmatrix} 10^{-4} & 1 \\ 0 & -10^4 \end{pmatrix}
= \begin{pmatrix} 10^{-4} & 1 \\ 1 & 0 \end{pmatrix} \ne A.
\]
Here a_{22} is entirely “lost” from the computation; the process is numerically unstable.
Let Ax = \begin{pmatrix} 1 \\ 2 \end{pmatrix}. Then x ≈ \begin{pmatrix} 1 \\ 1 \end{pmatrix}. But Ly = \begin{pmatrix} 1 \\ 2 \end{pmatrix} gives y_1 = 1 and y_2 = fl(2 − 10^4 · 1) = −10^4, and R\hat{x} = y gives
\[
\hat{x}_2 = fl\big((-10^4)/(-10^4)\big) = 1, \qquad \hat{x}_1 = fl\big((1 - 1)/10^{-4}\big) = 0.
\]
We have an erroneous solution with cond(L), cond(R) ≈ 10^8.
Partial pivoting:
\[
L = \begin{pmatrix} 1 & 0 \\ fl(10^{-4}/1) & 1 \end{pmatrix} = \begin{pmatrix} 1 & 0 \\ 10^{-4} & 1 \end{pmatrix}, \qquad
R = \begin{pmatrix} 1 & 1 \\ 0 & fl(1 - 10^{-4}) \end{pmatrix} = \begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix}.
\]
L and R are both well-conditioned.
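The whole example can be replayed by simulating three-significant-digit rounding; the helper fl below (an implementation choice, not from the slides) rounds via decimal formatting:

def fl(x):
    """Round x to 3 significant decimal digits."""
    return 0.0 if x == 0 else float(f"{x:.2e}")

def solve2x2(a11, a12, a21, a22, b1, b2):
    """Gaussian elimination on a 2x2 system in simulated 3-digit arithmetic."""
    m = fl(a21 / a11)
    a22 = fl(a22 - fl(m * a12))
    b2 = fl(b2 - fl(m * b1))
    x2 = fl(b2 / a22)
    x1 = fl(fl(b1 - fl(a12 * x2)) / a11)
    return x1, x2

# Without pivoting: pivot 1e-4 gives x = (0.0, 1.0), wrong.
print(solve2x2(1e-4, 1, 1, 1, 1, 2))
# With the rows exchanged (partial pivoting): x = (1.0, 1.0), correct.
print(solve2x2(1, 1, 1e-4, 1, 2, 1))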
LDR- and LL^T-factorizations
Algorithm 2 [Crout's factorization or compact method]
For k = 1, . . . , n:
  for p = 1, 2, . . . , k − 1: r_p := d_p a_{pk}; ω_p := a_{kp} d_p;
  d_k := a_{kk} − Σ_{p=1}^{k-1} a_{kp} r_p; if d_k = 0, then stop; else
  for i = k + 1, . . . , n:
    a_{ik} := (a_{ik} − Σ_{p=1}^{k-1} a_{ip} r_p)/d_k;
    a_{ki} := (a_{ki} − Σ_{p=1}^{k-1} ω_p a_{pi})/d_k.
Cost: n³/3 flops.
With partial pivoting: see Wilkinson, The Algebraic Eigenvalue Problem, pp. 225ff.
Advantage: one can use double precision for the inner products.
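A runnable sketch of Crout's compact method (assumes all d_k ≠ 0; the storage follows the algorithm, overwriting A with the strict parts of the unit-triangular L and R):

import numpy as np

def crout_ldr(A):
    A = A.astype(float).copy()
    n = A.shape[0]
    d = np.zeros(n)
    for k in range(n):
        r = d[:k] * A[:k, k]                 # r_p = d_p a_pk
        w = A[k, :k] * d[:k]                 # w_p = a_kp d_p
        d[k] = A[k, k] - A[k, :k] @ r
        if d[k] == 0:
            raise ZeroDivisionError("zero pivot")
        A[k + 1:, k] = (A[k + 1:, k] - A[k + 1:, :k] @ r) / d[k]   # column of L
        A[k, k + 1:] = (A[k, k + 1:] - w @ A[:k, k + 1:]) / d[k]   # row of R
    L = np.tril(A, -1) + np.eye(n)
    R = np.triu(A, 1) + np.eye(n)
    return L, d, R

M = np.array([[4., 2., 1.], [2., 5., 2.], [1., 2., 6.]])
L, d, R = crout_ldr(M)
print(np.allclose(L @ np.diag(d) @ R, M))  # True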
Theorem 8
If A is nonsingular, real, and symmetric, then A has a unique LDL^T factorization, where D is diagonal and L is a unit lower triangular matrix (with ones on the diagonal).
Proof: A = LDR and A = A^T = R^T D L^T. By uniqueness (Corollary 6), L = R^T.
Theorem 9
If A is symmetric and positive definite, then there exists a lower triangular G ∈ R^{n×n} with positive diagonal elements such that A = GG^T.
Proof:
A is symmetric positive definite
⇔ x^T A x > 0 for every nonzero vector x ∈ R^n
⇔ κ_i > 0 for i = 1, . . . , n
⇔ all eigenvalues of A are positive.
From Corollary 6 and Theorem 8 we have A = LDL^T. From L^{-1} A L^{-T} = D it follows that
\[
d_k = (e_k^T L^{-1}) A (L^{-T} e_k) > 0.
\]
Thus G = L \, \mathrm{diag}(d_1^{1/2}, \cdots, d_n^{1/2}) is real, and then A = GG^T.
Derive an algorithm for computing the Cholesky factorization A = GG^T: let A ≡ [a_{ij}] and
\[
G = \begin{pmatrix}
g_{11} & 0 & \cdots & 0 \\
g_{21} & g_{22} & \ddots & \vdots \\
\vdots & \vdots & \ddots & 0 \\
g_{n1} & g_{n2} & \cdots & g_{nn}
\end{pmatrix}.
\]
Assume the first k − 1 columns of G have been determined after k − 1 steps. By componentwise comparison in
\[
[a_{ij}] =
\begin{pmatrix}
g_{11} & 0 & \cdots & 0 \\
g_{21} & g_{22} & \ddots & \vdots \\
\vdots & \vdots & \ddots & 0 \\
g_{n1} & g_{n2} & \cdots & g_{nn}
\end{pmatrix}
\begin{pmatrix}
g_{11} & g_{21} & \cdots & g_{n1} \\
0 & g_{22} & \cdots & g_{n2} \\
\vdots & \ddots & \ddots & \vdots \\
0 & \cdots & 0 & g_{nn}
\end{pmatrix},
\]
one has
\[
a_{kk} = \sum_{j=1}^{k} g_{kj}^2,
\]
which gives
\[
g_{kk}^2 = a_{kk} - \sum_{j=1}^{k-1} g_{kj}^2.
\]
Moreover,
\[
a_{ik} = \sum_{j=1}^{k} g_{ij} g_{kj}, \qquad i = k + 1, \ldots, n,
\]
hence the k-th column of G can be computed by
\[
g_{ik} = \Big( a_{ik} - \sum_{j=1}^{k-1} g_{ij} g_{kj} \Big) \Big/ g_{kk}, \qquad i = k + 1, \ldots, n.
\]
Cholesky Factorization
Input: n × n symmetric positive definite matrix A.
Output: Cholesky factorization A = GG^T.
1: Initialize G = 0.
2: for k = 1, . . . , n do
3:   G(k, k) = \sqrt{ A(k, k) − Σ_{j=1}^{k-1} G(k, j) G(k, j) }
4:   for i = k + 1, . . . , n do
5:     G(i, k) = ( A(i, k) − Σ_{j=1}^{k-1} G(i, j) G(k, j) ) / G(k, k)
6:   end for
7: end for
In addition to the n square root operations, approximately
\[
\sum_{k=1}^{n} \big[ 2k - 2 + (2k - 1)(n - k) \big] = \tfrac{1}{3} n^3 + \tfrac{1}{2} n^2 - \tfrac{5}{6} n
\]
floating-point operations are required by the algorithm.
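As runnable code (a sketch; assumes A is symmetric positive definite, so the square roots stay real):

import numpy as np

def cholesky(A):
    n = A.shape[0]
    G = np.zeros_like(A, dtype=float)
    for k in range(n):
        G[k, k] = np.sqrt(A[k, k] - G[k, :k] @ G[k, :k])       # line 3
        for i in range(k + 1, n):
            G[i, k] = (A[i, k] - G[i, :k] @ G[k, :k]) / G[k, k]  # line 5
    return G

A = np.array([[4., 2., 1.], [2., 5., 2.], [1., 2., 6.]])
G = cholesky(A)
print(np.allclose(G @ G.T, A))  # True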
For solving symmetric, indefinite systems: see Golub and Van Loan, Matrix Computations, pp. 159–168. There P A P^T = LDL^T, where D is a block-diagonal matrix with 1 × 1 or 2 × 2 blocks, P is a permutation, and L is lower triangular with ones on the diagonal.
Error estimation for linear systems
Consider the linear system
\[
Ax = b, \tag{7}
\]
and the perturbed linear system
\[
(A + \delta A)(x + \delta x) = b + \delta b, \tag{8}
\]
where δA and δb are errors of measurement or round-off in the factorization.
Definition 10
Let ‖·‖ be an operator norm and A be nonsingular. Then κ ≡ κ(A) = ‖A‖ ‖A^{-1}‖ is a condition number of A corresponding to ‖·‖.
Theorem 11 (Forward error bound)
Let x be the solution of (7) and x + δx the solution of the perturbed linear system (8). If ‖δA‖ ‖A^{-1}‖ < 1, then
\[
\frac{\|\delta x\|}{\|x\|} \le \frac{\kappa}{1 - \kappa \frac{\|\delta A\|}{\|A\|}} \left( \frac{\|\delta A\|}{\|A\|} + \frac{\|\delta b\|}{\|b\|} \right). \tag{11}
\]
Proof: From (8) we have
\[
(A + \delta A)\,\delta x + A x + (\delta A) x = b + \delta b.
\]
Thus,
\[
\delta x = -(A + \delta A)^{-1} \big[ (\delta A) x - \delta b \big]. \tag{9}
\]
Here, Corollary 2.7 implies that (A + δA)^{-1} exists. Now,
\[
\|(A + \delta A)^{-1}\| = \|(I + A^{-1} \delta A)^{-1} A^{-1}\| \le \frac{\|A^{-1}\|}{1 - \|A^{-1}\| \|\delta A\|}.
\]
On the other hand, b = Ax implies ‖b‖ ≤ ‖A‖ ‖x‖, so
\[
\frac{1}{\|x\|} \le \frac{\|A\|}{\|b\|}. \tag{10}
\]
From (9) it follows that
\[
\|\delta x\| \le \frac{\|A^{-1}\|}{1 - \|A^{-1}\| \|\delta A\|} \big( \|\delta A\| \|x\| + \|\delta b\| \big).
\]
Using (10), the inequality (11) is proved.
Remark 1
If κ(A) is large, then A (for the linear system Ax = b) is called ill-conditioned; otherwise it is called well-conditioned.
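A small numerical illustration of Theorem 11 (the matrix and perturbation sizes are illustrative):

import numpy as np

rng = np.random.default_rng(4)
A = np.array([[1., 1.], [1., 1.0001]])   # kappa ~ 4e4: ill-conditioned
b = np.array([2., 2.0001])
x = np.linalg.solve(A, b)

dA = 1e-8 * rng.standard_normal((2, 2))
db = 1e-8 * rng.standard_normal(2)
dx = np.linalg.solve(A + dA, b + db) - x

kappa = np.linalg.norm(A, np.inf) * np.linalg.norm(np.linalg.inv(A), np.inf)
rel = np.linalg.norm(dA, np.inf) / np.linalg.norm(A, np.inf)
bound = kappa / (1 - kappa * rel) * (rel + np.linalg.norm(db, np.inf) / np.linalg.norm(b, np.inf))
# observed relative error is below the bound of Theorem 11
print(np.linalg.norm(dx, np.inf) / np.linalg.norm(x, np.inf), "<=", bound)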
Error analysis for Gaussian algorithm
A computer is characterized by four integers:
(a) the machine base β;
(b) the precision t;
(c) the underflow limit L;
(d) the overflow limit U.
Define the set of floating point numbers
\[
F = \{ f = \pm 0.d_1 d_2 \cdots d_t \times \beta^e \mid 0 \le d_i < \beta,\; d_1 \ne 0,\; L \le e \le U \} \cup \{0\}.
\]
Let G = {x ∈ R | m ≤ |x| ≤ M} ∪ {0}, where m = β^{L-1} and M = β^U (1 − β^{-t}) are the minimal and maximal numbers of F \ {0} in absolute value, respectively.
We define an operator fl : G → F by
\[
fl(x) = \text{the nearest } c \in F \text{ to } x \text{ (rounding arithmetic)}.
\]
One can show that fl satisfies
\[
fl(x) = x(1 + \varepsilon), \qquad |\varepsilon| \le \mathrm{eps},
\]
where eps = ½ β^{1-t} (if β = 2, then eps = 2^{-t}). It follows that
\[
fl(a \circ b) = (a \circ b)(1 + \varepsilon) \quad \text{or} \quad fl(a \circ b) = (a \circ b)/(1 + \varepsilon),
\]
where |ε| ≤ eps and ◦ ∈ {+, −, ×, /}.
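A quick check of this rounding model for IEEE double precision, where β = 2 and t = 53, so eps = 2^{-53}; exact rational arithmetic avoids any further rounding:

from fractions import Fraction

eps = Fraction(2) ** -53
a, b = 0.1, 0.2
exact = Fraction(a) + Fraction(b)            # exact sum of the two stored doubles
err = abs(Fraction(a + b) - exact) / exact   # relative rounding error of fl(a + b)
print(err <= eps)  # True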
Given x, y ∈ R^n, the following algorithm computes x^T y and stores the result in s:
s = 0;
for k = 1, . . . , n: s = s + x_k y_k.
Theorem 12
If n 2^{-t} ≤ 0.01, then
\[
fl\Big( \sum_{k=1}^{n} x_k y_k \Big) = \sum_{k=1}^{n} x_k y_k \big[ 1 + 1.01 (n + 2 - k)\, \theta_k\, 2^{-t} \big], \qquad |\theta_k| \le 1.
\]
(Proof of Theorem 12: see the Appendix.)
Let L and R be the exact LR-factorization of A (A = LR), and let ˜L, ˜R be the LR-factors of A computed by the Gaussian algorithm (without pivoting). There are two possibilities:
(i) Forward error analysis: estimate |L − ˜L| and |R − ˜R|.
(ii) Backward error analysis: let ˜L ˜R be the exact LR-factorization of a perturbed matrix ˜A = A + E; then E is to be estimated, i.e., |E| ≤ ?
Theorem 13
The computed LR-factors ˜L and ˜R of A obtained by Gaussian elimination with partial pivoting satisfy
\[
\tilde{L}\tilde{R} = A + E, \tag{2.6}
\]
where
\[
\|E\|_\infty \le n^2 \rho \|A\|_\infty 2^{-t} \tag{2.7}
\]
with
\[
\rho = \max_{i,j,k} |a_{ij}^{(k)}| \,\big/\, \|A\|_\infty. \tag{2.8}
\]
Applying Theorem 12 to the linear systems ˜Ly = b and ˜Rx = y, respectively, the computed solution x satisfies
\[
(\tilde{L} + \delta\tilde{L})(\tilde{R} + \delta\tilde{R})\, x = b,
\]
i.e.,
\[
\big( \tilde{L}\tilde{R} + (\delta\tilde{L})\tilde{R} + \tilde{L}(\delta\tilde{R}) + (\delta\tilde{L})(\delta\tilde{R}) \big) x = b. \tag{2.9}
\]
Since ˜L ˜R = A + E, substituting this equation into (2.9) we get
\[
\big[ A + E + (\delta\tilde{L})\tilde{R} + \tilde{L}(\delta\tilde{R}) + (\delta\tilde{L})(\delta\tilde{R}) \big] x = b.
\]
The entries of ˜L and ˜R satisfy
\[
|\tilde{l}_{ij}| \le 1 \quad \text{and} \quad |\tilde{r}_{ij}| \le \rho \|A\|_\infty.
\]
Therefore, we get
\[
\|\tilde{L}\|_\infty \le n, \qquad \|\tilde{R}\|_\infty \le n \rho \|A\|_\infty,
\]
\[
\|\delta\tilde{L}\|_\infty \le \frac{n(n+1)}{2}\, 1.01 \cdot 2^{-t}, \qquad
\|\delta\tilde{R}\|_\infty \le \frac{n(n+1)}{2}\, 1.01\, \rho \|A\|_\infty\, 2^{-t}. \tag{2.10}
\]
In practical implementation we usually have n² 2^{-t} ≪ 1, so it holds that
\[
\|\delta\tilde{L}\|_\infty \|\delta\tilde{R}\|_\infty \le n^2 \rho \|A\|_\infty 2^{-t}.
\]
Let
\[
\delta A = E + (\delta\tilde{L})\tilde{R} + \tilde{L}(\delta\tilde{R}) + (\delta\tilde{L})(\delta\tilde{R}). \tag{2.11}
\]
Then, from (2.7) and (2.10) we get
\[
\|\delta A\|_\infty \le \|E\|_\infty + \|\delta\tilde{L}\|_\infty \|\tilde{R}\|_\infty + \|\tilde{L}\|_\infty \|\delta\tilde{R}\|_\infty + \|\delta\tilde{L}\|_\infty \|\delta\tilde{R}\|_\infty
\le 1.01 (n^3 + 3n^2)\, \rho \|A\|_\infty 2^{-t}. \tag{2.12}
\]
Theorem 14
For a linear system Ax = b, the solution x computed by Gaussian elimination with partial pivoting is the exact solution of the equation (A + δA)x = b, where δA satisfies (2.12).
Remark: The quantity ρ defined by (2.8) is called a growth factor. The growth factor measures how large the entries become during the process of elimination. In practice, ρ is usually of order 10 for partial pivot selection. But it can be as large as ρ = 2^{n-1} for
\[
A = \begin{pmatrix}
1 & 0 & \cdots & \cdots & 0 & 1 \\
-1 & 1 & 0 & \cdots & 0 & 1 \\
\vdots & -1 & \ddots & \ddots & \vdots & \vdots \\
\vdots & \vdots & \ddots & \ddots & 0 & 1 \\
-1 & -1 & \cdots & -1 & 1 & 1 \\
-1 & -1 & \cdots & -1 & -1 & 1
\end{pmatrix}.
\]
Better estimates hold for special types of matrices. For example, for upper Hessenberg matrices, that is, matrices of the form
\[
A = \begin{pmatrix}
\times & \cdots & \cdots & \times \\
\times & \ddots & & \vdots \\
& \ddots & \ddots & \vdots \\
0 & & \times & \times
\end{pmatrix},
\]
the bound ρ ≤ n − 1 can be shown. (Hessenberg matrices arise in eigenvalue problems.)
For tridiagonal matrices
\[
A = \begin{pmatrix}
\alpha_1 & \beta_2 & & 0 \\
\gamma_2 & \ddots & \ddots & \\
& \ddots & \ddots & \beta_n \\
0 & & \gamma_n & \alpha_n
\end{pmatrix},
\]
it can even be shown that ρ ≤ 2 holds for partial pivot selection. Hence, Gaussian elimination is quite numerically stable in this case.
For complete pivot selection, Wilkinson (1965) has shown that
\[
|a_{ij}^{(k)}| \le f(k) \max_{i,j} |a_{ij}|
\]
with the function
\[
f(k) := k^{1/2} \big( 2^{1} \cdot 3^{1/2} \cdot 4^{1/3} \cdots k^{1/(k-1)} \big)^{1/2}.
\]
This function grows relatively slowly with k:

k      10    20    50    100
f(k)   19    67    530   3300
Even this estimate is too pessimistic in practice. Up until now, no matrix has been found which fails to satisfy
\[
|a_{ij}^{(k)}| \le (k + 1) \max_{i,j} |a_{ij}|, \qquad k = 1, 2, \ldots, n - 1,
\]
when complete pivot selection is used. This indicates that Gaussian elimination with complete pivot selection is usually a stable process.
Despite this, partial pivot selection is preferred in practice, for the most part, because:
(i) Complete pivot selection is more costly than partial pivot selection. (To compute A^{(i)}, the maximum from among (n − i + 1)² elements must be determined instead of n − i + 1 elements as in partial pivot selection.)
(ii) Special structures in a matrix, e.g. the band structure of a tridiagonal matrix, are destroyed by complete pivot selection.
Iterative Improvement:
Suppose that the linear system Ax = b has been solved via the LR-factorization P A = LR, and we want to improve the accuracy of the computed solution x. We compute
\[
r = b - Ax, \quad Ly = Pr, \quad Rz = y, \quad x_{\mathrm{new}} = x + z. \tag{2.13}
\]
Then, in exact arithmetic, we have
\[
A x_{\mathrm{new}} = A(x + z) = (b - r) + Az = b.
\]
This amounts to solving Az = r by using P A = LR.
Unfortunately, r = fl(b − Ax) renders an x_new that is no more accurate than x. It is necessary to compute the residual b − Ax with extended-precision floating point arithmetic.
Algorithm 4
1: Compute P A = LR (t-digit);
2: Repeat:
3:   r := b − Ax (2t-digit);
4:   Solve Ly = Pr for y (t-digit);
5:   Solve Rz = y for z (t-digit);
6:   Update x := x + z (t-digit).
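One way to realize Algorithm 4 (a mixed-precision sketch: float32 stands in for t-digit arithmetic and float64 for 2t-digit; all names are illustrative):

import numpy as np

def lu_factor32(A):
    """LU with partial pivoting in float32; multipliers stored in place."""
    A = A.astype(np.float32).copy()
    n = A.shape[0]
    piv = np.arange(n)
    for k in range(n - 1):
        p = k + np.argmax(np.abs(A[k:, k]))
        A[[k, p]] = A[[p, k]]; piv[[k, p]] = piv[[p, k]]
        A[k + 1:, k] /= A[k, k]
        A[k + 1:, k + 1:] -= np.outer(A[k + 1:, k], A[k, k + 1:])
    return piv, A

def lu_solve32(piv, LU, b):
    n = LU.shape[0]
    y = b[piv].astype(np.float32)
    for i in range(1, n):                       # forward substitution, L y = P b
        y[i] -= LU[i, :i] @ y[:i]
    x = y.copy()
    for i in range(n - 1, -1, -1):              # back substitution, R x = y
        x[i] = (x[i] - LU[i, i + 1:] @ x[i + 1:]) / LU[i, i]
    return x

rng = np.random.default_rng(6)
A = rng.standard_normal((30, 30)); b = rng.standard_normal(30)
piv, LU = lu_factor32(A)                        # factor once, in low precision
x = lu_solve32(piv, LU, b.astype(np.float32)).astype(np.float64)
for _ in range(4):                              # refinement sweeps
    r = b - A @ x                               # residual in higher precision
    x = x + lu_solve32(piv, LU, r.astype(np.float32))
print(np.linalg.norm(b - A @ x) / np.linalg.norm(b))  # drops toward double-precision level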
From Theorem 14 we have (A + δA)z = r, i.e.,
\[
A(I + F)z = r \quad \text{with } F = A^{-1}\delta A. \tag{2.14}
\]
Theorem 15
Let {x_k} be the sequence constructed by Algorithm 4 for solving Ax = b, and let x^* = A^{-1}b be the exact solution. Assume that F_k in (2.14) satisfies ‖F_k‖ ≤ σ < 1/2 for all k. Then {x_k} → x^*.
(Proof of Theorem 15: see the Appendix.)
Corollary 16
If
\[
1.01 (n^3 + 3n^2)\, \rho\, 2^{-t}\, \|A\| \|A^{-1}\| < 1/2,
\]
then Algorithm 4 converges.
Proof: From (2.14) and (2.12) it follows that
\[
\|F_k\| \le 1.01 (n^3 + 3n^2)\, \rho\, 2^{-t} \kappa(A) < 1/2.
\]
Appendix
Proof of Theorem 12: Let s_p = fl(Σ_{k=1}^{p} x_k y_k) denote the partial sums in the dot-product algorithm preceding Theorem 12. Then
\[
s_1 = x_1 y_1 (1 + \delta_1) \quad \text{with } |\delta_1| \le \mathrm{eps},
\]
and for p = 2, . . . , n,
\[
s_p = fl\big[ s_{p-1} + fl(x_p y_p) \big] = \big[ s_{p-1} + x_p y_p (1 + \delta_p) \big](1 + \varepsilon_p) \quad \text{with } |\delta_p|, |\varepsilon_p| \le \mathrm{eps}.
\]
Therefore
\[
fl(x^T y) = s_n = \sum_{k=1}^{n} x_k y_k (1 + \gamma_k),
\]
where 1 + γ_k = (1 + δ_k) \prod_{j=k}^{n} (1 + ε_j) with ε_1 ≡ 0. Thus
\[
fl\Big( \sum_{k=1}^{n} x_k y_k \Big) = \sum_{k=1}^{n} x_k y_k \big[ 1 + 1.01 (n + 2 - k)\, \theta_k\, 2^{-t} \big].
\]
The result follows immediately from the following useful lemma.
Lemma 7.1
If 1 + α = \prod_{k=1}^{n} (1 + α_k), where |α_k| ≤ 2^{-t} and n 2^{-t} ≤ 0.01, then
\[
\prod_{k=1}^{n} (1 + \alpha_k) = 1 + 1.01\, n\, \theta\, 2^{-t} \quad \text{with } |\theta| \le 1.
\]
Proof: From the assumption it is easily seen that
\[
(1 - 2^{-t})^n \le \prod_{k=1}^{n} (1 + \alpha_k) \le (1 + 2^{-t})^n.
\]
Expanding (1 − x)^n by Taylor's theorem for 0 < x < 1, we get
\[
(1 - x)^n = 1 - nx + \frac{n(n-1)}{2} (1 - \theta x)^{n-2} x^2 \ge 1 - nx.
\]
Hence
\[
(1 - 2^{-t})^n \ge 1 - n 2^{-t}.
\]
Now we estimate the upper bound of (1 + 2^{-t})^n. Since
\[
e^x = 1 + x + \frac{x^2}{2!} + \frac{x^3}{3!} + \cdots = 1 + x + \frac{x}{2}\, x \Big( 1 + \frac{x}{3} + \frac{2x^2}{4!} + \cdots \Big),
\]
for 0 ≤ x ≤ 0.01 we have
\[
1 + x \le e^x \le 1 + x + 0.01 \cdot \tfrac{1}{2}\, x\, e^x \le 1 + 1.01 x
\]
(here the last inequality uses e^{0.01} < 2). Taking x = 2^{-t}, the left inequality gives
\[
(1 + 2^{-t})^n \le e^{2^{-t} n};
\]
taking x = 2^{-t} n, the right inequality gives
\[
e^{2^{-t} n} \le 1 + 1.01\, n\, 2^{-t}.
\]
Combining the two, we obtain
\[
(1 + 2^{-t})^n \le 1 + 1.01\, n\, 2^{-t}.
\]
Proof of Theorem 15: From (2.14) and r_k = b − A x_k we have
\[
A(I + F_k) z_k = b - A x_k.
\]
Since A is nonsingular, (I + F_k) z_k = x^* − x_k. From x_{k+1} = x_k + z_k we have (I + F_k)(x_{k+1} − x_k) = x^* − x_k, i.e.,
\[
(I + F_k) x_{k+1} = F_k x_k + x^*. \tag{2.15}
\]
Subtracting both sides of (2.15) from (I + F_k) x^*, we get
\[
(I + F_k)(x_{k+1} - x^*) = F_k (x_k - x^*).
\]
Then x_{k+1} − x^* = (I + F_k)^{-1} F_k (x_k − x^*). Hence
\[
\|x_{k+1} - x^*\| \le \frac{\|F_k\|}{1 - \|F_k\|}\, \|x_k - x^*\| \le \frac{\sigma}{1 - \sigma}\, \|x_k - x^*\|.
\]
Let τ = σ/(1 − σ). Then
\[
\|x_k - x^*\| \le \tau^{k-1} \|x_1 - x^*\|.
\]
Since σ < 1/2, we have τ < 1. This implies the convergence of Algorithm 4.