
(1) 國立交通大學 應用數學系 碩士論文 (National Chiao Tung University, Department of Applied Mathematics, Master's Thesis). 計算火焰片模型的數值方法 (Numerical Methods for Computing Flame Sheet Model). 研究生 (Student): 林宗澤, Tzong-Tzer Lin. 指導老師 (Advisor): 葉立明 教授, Professor Li-Ming Yeh. 中華民國九十三年六月 (June 2004).

(2) 計算火焰片模型的數值方法 / Numerical Methods for Computing Flame Sheet Model. 研究生 (Student): 林宗澤, Tzong-Tzer Lin. 指導教授 (Advisor): 葉立明, Li-Ming Yeh. 國立交通大學 應用數學系 碩士論文. A Thesis Submitted to the Department of Applied Mathematics, College of Science, National Chiao Tung University, in Partial Fulfillment of the Requirements for the Degree of Master in Applied Mathematics. June 2004, Hsinchu, Taiwan, Republic of China. 中華民國九十三年六月.

(3) 摘要 (Abstract in Chinese)

計算火焰片模型的數值方法 (Numerical Methods for Computing Flame Sheet Model). 研究生: 林宗澤. 指導教授: 葉立明. 國立交通大學應用數學系.

The governing equations of the flame sheet model form a highly nonlinear and strongly coupled system. To solve these equations numerically, we use the damped Newton's method combined with the one-way multigrid method and Krylov subspace methods. The main purpose of this thesis is to study Krylov subspace methods that are effective for this model, for example the biconjugate gradient stabilized method, the generalized minimum residual method, and the transpose-free QMR method. A C program has been written for the flame sheet model, and the numerical results are presented and discussed.

i.

(4) Numerical Methods for Computing Flame Sheet Model

Student: Tzong-Tzer Lin. Advisor: Dr. Li-Ming Yeh. Department of Applied Mathematics, National Chiao Tung University.

Abstract

The differential equations of the flame sheet model form a highly nonlinear and strongly coupled system. To solve these equations numerically, we use the damped Newton's method combined with the one-way multigrid method and Krylov subspace methods. The purpose of this thesis is to survey Krylov subspace methods that are effective for this model, for example the biconjugate gradient stabilized method (BICGSTAB), the generalized minimum residual method (GMRES), and the transpose-free QMR method (TFQMR). A code for the flame sheet model is developed in the C language, and numerical results are presented and discussed in this work.

ii.

(5) 誌謝 (Acknowledgement) [written in Chinese; the text is not recoverable from this extraction]

iii.

(6) Contents

Abstract (in Chinese) i
Abstract (in English) ii
Acknowledgement iii
Contents iv
List of Figures v

1 Introduction 1
1.1 The Governing Equations for Flame Sheet Model 2
1.2 Discretized Form for Flame Sheet Model 3

2 Krylov Subspace Methods 14
2.1 Projection Methods 14
2.2 Arnoldi's Method 18
2.3 FOM, IOM, and DIOM 21
2.3.1 FOM 21
2.3.2 IOM 22
2.3.3 DIOM 23

3 Biconjugate Gradient Stabilized Method 25
3.1 Conjugate Gradient Method 25
3.2 Lanczos Biorthogonalization 30
3.2.1 Two-Sided Lanczos 30
3.2.2 BCG 32
3.3 Transpose-Free Variants 34
3.3.1 CGS 34
3.3.2 BICGSTAB 37

4 Transpose-Free QMR Method 40
4.1 GMRES, QGMRES, and DQGMRES 40
4.1.1 GMRES 40
4.1.2 QGMRES 41
4.1.3 DQGMRES 42
4.2 Quasi-Minimal Residual 46
4.3 TFQMR 48

5 Algorithm and Numerical Results 53
5.1 General Solution Algorithm 53
5.2 Numerical Results and Discussion 54

References 60

iv

(7) List of Figures

Figure 1: Physical configuration for diffusion flame model (not to scale) 10
Figure 2: Contour plot of the temperature 58
Figure 3: Residual norm of Newton step during the time stepping phase 59

v

(8) 1. Introduction

The primitive-variable formulation for a laminar diffusion flame in three-dimensional rectangular coordinates can be transformed into the vorticity-velocity formulation in two-dimensional cylindrical coordinates. A detailed derivation of the vorticity-velocity formulation can be found in [4, 5]. A flame sheet model [2] is usually used to initialize multidimensional diffusion flame computations. The governing equations for the flame sheet model can be derived from the equations of the finite-rate diffusion flame model in the vorticity-velocity formulation. The flame sheet model is based on the assumptions that the chemical reaction in a laminar diffusion flame is a one-step irreversible reaction and that the conversion of reactants into stable products is infinitely fast. Fuel and oxidizer are separated by the reaction zone, where they react in stoichiometric proportion. With these assumptions, no fuel appears on the oxidizer side and vice versa. Because the differential equations of the flame sheet model form a highly nonlinear and strongly coupled system, they are extremely difficult to analyze mathematically. Newton's method is a standard method for solving highly nonlinear and strongly coupled systems of partial differential equations. Combining it with the multigrid method and Krylov subspace methods, we can solve the flame sheet model numerically. The purpose of this thesis is to survey efficient Krylov subspace methods and to apply them to the flame sheet model. We shall concentrate on three Krylov subspace methods, namely the biconjugate gradient stabilized method (BICGSTAB), the generalized minimum residual method (GMRES), and the transpose-free QMR method (TFQMR). This work is organized as follows. In the first chapter, we discretize the governing equations of the flame sheet model using a centered difference scheme; the convective terms are evaluated with an upwind difference scheme to preserve monotonicity. In the second chapter, we introduce projection methods and use Arnoldi's method to construct the Krylov subspace. In the third and fourth chapters, we introduce the BICGSTAB, GMRES, and TFQMR methods at the mathematical level. In the last chapter, we present some numerical results and give a discussion of these results.

1.

(9) 1.1. The Governing Equations for Flame Sheet Model

The flame sheet equations consist of the total mass and momentum conservation equations, which constitute the flow field problem and are written in the vorticity-velocity formulation of the steady state, coupled with a conserved scalar equation. The governing equations for the flame sheet model with axial symmetry can be stated in the following form:

$$\frac{\partial^2 V_r}{\partial r^2} + \frac{\partial^2 V_r}{\partial z^2} = \frac{\partial \omega}{\partial z} - \frac{1}{r}\frac{\partial V_r}{\partial r} + \frac{V_r}{r^2} - \frac{\partial}{\partial r}\left(\frac{V\cdot\nabla\rho}{\rho}\right),$$

$$\frac{\partial^2 V_z}{\partial r^2} + \frac{\partial^2 V_z}{\partial z^2} = -\frac{\partial \omega}{\partial r} - \frac{1}{r}\frac{\partial V_r}{\partial z} - \frac{\partial}{\partial z}\left(\frac{V\cdot\nabla\rho}{\rho}\right),$$

$$\rho V_r\frac{\partial \omega}{\partial r} + \rho V_z\frac{\partial \omega}{\partial z} - \frac{\rho V_r\,\omega}{r} = \frac{\partial}{\partial r}\!\left(\frac{1}{r}\frac{\partial (r\mu\omega)}{\partial r}\right) + \frac{\partial^2 (\mu\omega)}{\partial z^2} + \nabla\rho\cdot\nabla\!\left(\frac{V^2}{2}\right) - \nabla\rho\cdot g + 2\left[\nabla(\operatorname{div} V)\cdot\nabla\mu - \nabla V_r\cdot\nabla\frac{\partial \mu}{\partial r} - \nabla V_z\cdot\nabla\frac{\partial \mu}{\partial z}\right],$$

$$\rho V_r\frac{\partial S}{\partial r} + \rho V_z\frac{\partial S}{\partial z} = \frac{1}{r}\frac{\partial}{\partial r}\!\left(r\rho D\frac{\partial S}{\partial r}\right) + \frac{\partial}{\partial z}\!\left(\rho D\frac{\partial S}{\partial z}\right),$$

where $V = (V_r, V_z)$ is the velocity vector with radial component $V_r$ and axial component $V_z$, $\omega = \frac{\partial V_r}{\partial z} - \frac{\partial V_z}{\partial r}$ is the vorticity, and $S$ is the conserved scalar. In addition, the scalar $\rho$ is the mass density of the mixture, measured in g/cm$^3$, given by the equation of state

$$\rho = \frac{PW}{RT},$$

where $P$ is the pressure, $R$ is the universal gas constant, $T$ is the temperature, and $W$ is the mean molecular weight of the mixture. The scalar $\mu$ is the viscosity coefficient of the mixture, measured in g/(cm$\cdot$s), given by

$$\mu = \mu_0\left(\frac{T}{T_0}\right)^{r},$$

where $r = 0.7$, $T_0 = 298\,\mathrm{K}$, and $\mu_0 = 1.85\times 10^{-4}$. The scalar $D$ is a diffusion coefficient, measured in cm$^2$/s, given by

$$\rho D = \frac{\mu}{Pr},$$

where $Pr = 0.75$. In the source terms of the vorticity equation, for a scalar $\beta$ the vector $\nabla\beta = \left(\frac{\partial \beta}{\partial z}, -\frac{\partial \beta}{\partial r}\right)$ denotes the curl of $\beta$.

2.
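The algebraic property relations above map directly onto code. The following C sketch (illustrative only, not the thesis program) evaluates ρ, µ, and ρD from the state variables exactly as defined here, assuming CGS units and taking R = 8.314 × 10^7 erg/(mol·K) for the universal gas constant; the function names and the sample values in main are hypothetical.

#include <math.h>
#include <stdio.h>

/* Mixture properties for the flame sheet model (CGS units).
   Constants follow the text: exponent 0.7, T0 = 298 K, mu0 = 1.85e-4 g/(cm*s), Pr = 0.75. */
static const double MU0   = 1.85e-4;  /* g/(cm*s) */
static const double T0    = 298.0;    /* K */
static const double PR    = 0.75;     /* Prandtl number */
static const double R_GAS = 8.314e7;  /* erg/(mol*K) */

/* rho = P*W/(R*T): mass density in g/cm^3 (P in dyn/cm^2, W in g/mol). */
double density(double P, double W, double T)   { return P * W / (R_GAS * T); }

/* mu = mu0*(T/T0)^0.7: viscosity coefficient in g/(cm*s). */
double viscosity(double T)                     { return MU0 * pow(T / T0, 0.7); }

/* rho*D = mu/Pr, so D = mu/(Pr*rho): diffusion coefficient in cm^2/s. */
double diffusivity(double T, double rho)       { return viscosity(T) / (PR * rho); }

int main(void)
{
    double T = 1500.0, P = 1.0132e6, W = 28.9;   /* sample values for illustration only */
    double rho = density(P, W, T);
    printf("rho = %.3e g/cm^3, mu = %.3e g/(cm*s), D = %.3e cm^2/s\n",
           rho, viscosity(T), diffusivity(T, rho));
    return 0;
}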

(10) 1.2. Discretized Form for Flame Sheet Model. We use the finite difference approximations to discrete the governing equations for flame sheet model on the grid points in the computational domain, shown in Figure 1, which cover from r = 0 to 7.5cm in the radial direction and from z = 0 to 30cm in the axial direction. In the diffusion and source terms, we use standard centered differences. In the axial, we use a monotonicity preserving upwind scheme to discrete the convective terms. Next we write down the discretized form for each equation.. Radial Velocity : ∂ ∂r. µ. ∂Vr r ∂r (i). ¶. ∂ Vr ∂ω ∂ 2 Vr +r − +r 2 −r ∂r r ∂z ∂z (iii) (iv) (ii). µ. V · ∇ρ ρ (v). ¶ = 0.. (i) For i = 3 ∼ (n − 1) and j = 2 ∼ (m − 1)   ¯ ¯ ¶ ¯¯ µ ¯ ¯ ∂Vr ¯ 1 ∂Vr ¯ ∂  r ∂Vr ¯¯ −r r ¯ ¯ ≈ ∂r ¯ ∂r ¯ ri+1/2 − ri−1/2 ∂r ¯ ∂r i−1/2,j i+1/2,j i,j ¶ µ ri + ri−1 (Vr )i,j − (Vr )i−1,j ri+1 + ri (Vr )i+1,j − (Vr )i,j 2 . − ≈ ri − ri−1 2 ri+1 − ri 2 ri+1 − ri−1 For i = 2 and j = 2 ∼ (m − 1)  ¯  ¯ ¶ ¯¯ µ ¯ ∂Vr ¯¯  ∂Vr ¯ 1 ∂Vr ¯ ∂  where r1 = 0 −r r r ¯ ¯ ¯ ≈ ∂r ¯ ∂r ¯ r2+1/2 − r1 ∂r ¯ ∂r 1,j 2+1/2,j 2,j ¶ µ (Vr )3,j − (Vr )2,j r2 + r3 (Vr )3,j − (Vr )2,j 2 . = ≈ r3 − r2 r3 − r2 2 r2 + r3 (ii) For i = 2 ∼ (n − 1) and j = 2 ∼ (m − 1)   ¯ ¯ ¯ ¯ ¯ ¯ 2 ∂Vr ¯ 1 ∂ Vr ¯   ∂Vr ¯¯ − r 2 ¯ ≈ ri ¯ ∂z ¯ ∂z ¯ zj+1/2 − zj−1/2 ∂z ¯ i,j−1/2 i,j+1/2 i,j ¶ µ (Vr )i,j − (Vr )i,j−1 (Vr )i,j+1 − (Vr )i,j 2 . − ≈ ri zj − zj−1 zj+1 − zj zj+1 − zj−1 (iii) For i = 2 ∼ (n − 1) and j = 2 ∼ (m − 1) ¯ ωi,j+1 − ωi,j−1 ∂ω ¯¯ . r ¯ ≈ ri zj+1 − zj−1 ∂z ¯ i,j. 3.
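To make the half-point averaging in term (i) concrete, the following C sketch evaluates the centered-difference approximation of ∂/∂r(r ∂Vr/∂r) at one interior node of a nonuniform radial grid; the one-dimensional, zero-based array layout and the function name are illustrative assumptions, not part of the thesis code.

/* Centered difference of (d/dr)(r dVr/dr) at an interior node i of a nonuniform
   radial grid r[], with Vr[i] the radial velocity at r[i].  The fluxes at the half
   points r_{i+1/2} and r_{i-1/2} use the averaged radii (r_{i+1}+r_i)/2 and
   (r_i+r_{i-1})/2, and the outer difference divides by (r_{i+1}-r_{i-1})/2. */
double div_r_grad_vr(const double *r, const double *Vr, int i)
{
    double flux_plus  = 0.5 * (r[i+1] + r[i])   * (Vr[i+1] - Vr[i])   / (r[i+1] - r[i]);
    double flux_minus = 0.5 * (r[i]   + r[i-1]) * (Vr[i]   - Vr[i-1]) / (r[i]   - r[i-1]);
    return 2.0 * (flux_plus - flux_minus) / (r[i+1] - r[i-1]);
}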

(11) (iv) For i = 2 ∼ (n − 1) and j = 2 ∼ (m − 1) ¯ (Vr )i,j Vr ¯¯ . ¯ ≈ ri r¯ i,j. (v) For i = 2 ∼ (n − 1) and j = 2 ∼ (m − 1) ¶¯ µ ¶¯ µ ∂ Vr ∂ρ ¯¯ ∂ V · ∇ρ ¯¯ r ¯ ¯ =r ¯ ∂r ρ ∂r ¯ ρ ∂r i,j. ¶¯ µ ∂ Vz ∂ρ ¯¯ +r ¯ ∂r ρ ∂z ¯ i,j i,j   ¯ ¯ ¯ ¯ Vr ∂ρ ¯ ri   Vr ∂ρ ¯¯ − ≈ ¯ ρ ∂r ¯ ρ ∂r ¯ ri+1/2 − ri−1/2 i−1/2,j i+1/2,j   ¯ ¯ ¯ ¯ Vz ∂ρ ¯ ri   Vz ∂ρ ¯¯ − + ¯ ρ ∂z ¯ ρ ∂z ¯ ri+1 − ri−1 i−1,j i+1,j ¶ ·µ (Vr )i+1,j (Vr )i,j ρi+1,j − ρi,j ri + ≈ ri+1 − ri ρi+1,j ρi,j ri+1 − ri−1 ¸ ¶ µ (Vr )i−1,j (Vr )i,j ρi,j − ρi−1,j + − ri − ri−1 ρi−1,j ρi,j · (Vz )i+1,j ρi+1,j+1 − ρi+1,j−1 ri + zj+1 − zj−1 ρi+1,j ri+1 − ri−1 ¸ (Vz )i−1,j ρi−1,j+1 − ρi−1,j−1 . − zj+1 − zj−1 ρi−1,j. Axial Velocity : ∂ ∂ 2 Vz ∂ 2 Vz ∂ω 1 ∂Vr + + + + ∂z r ∂z ∂r ∂z 2 ∂r2 (iv) (iii) (ii) (i). µ. V · ∇ρ ρ (v). ¶ = 0.. (i) For i = 2 ∼ (n − 1) and j = 2 ∼ (m − 1)   ¯ ¯ ¯ ¯ ¯ ¯ 2 ∂Vz ¯ 1 ∂ Vz ¯   ∂Vz ¯¯ − ¯ ¯ ≈ 2 ∂r ¯ ∂r ¯ ri+1/2 − ri−1/2 ∂r ¯ i−1/2,j i+1/2,j i,j ¶ µ (Vz )i,j − (Vz )i−1,j (Vz )i+1,j − (Vz )i,j 2 . − ≈ ri − ri−1 ri+1 − ri ri+1 − ri−1. 4.

(12) (ii) For i = 2 ∼ (n − 1) and j = 2 ∼ (m − 1)   ¯ ¯ ¯ ¯ ¯ ¯ 2 ∂Vz ¯ 1 ∂ Vz ¯   ∂Vz ¯¯ − ¯ ¯ ≈ 2 ∂z ¯ ∂z ¯ zj+1/2 − zj−1/2 ∂z ¯ i,j−1/2 i,j+1/2 i,j ¶ µ (Vz )i,j − (Vz )i,j−1 (Vz )i,j+1 − (Vz )i,j 2 . − ≈ zj − zj−1 zj+1 − zj zj+1 − zj−1 (iii) For i = 2 ∼ (n − 1) and j = 2 ∼ (m − 1) ¯ ωi+1,j − ωi−1,j ∂ω ¯¯ . ¯ ≈ ri+1 − ri−1 ∂r ¯ i,j. (iv) For i = 2 ∼ (n − 1) and j = 2 ∼ (m − 1) ¯ (Vz )i,j+1 − (Vz )i,j−1 1 ∂Vz ¯¯ . ¯ ≈ ri (zj+1 − zj−1 ) r ∂z ¯ i,j. (v) For i = 2 ∼ (n − 1) and j = 2 ∼ (m − 1) ¶¯ µ ¶¯ µ ¶¯ µ ∂ Vz ∂ρ ¯¯ ∂ Vr ∂ρ ¯¯ ∂ V · ∇ρ ¯¯ ¯ ¯ + ¯ = ¯ ∂z ρ ∂z ¯ ∂z ρ ∂r ¯ ρ ∂z i,j i,j i,j   ¯ ¯ ¯ ¯ Vr ∂ρ ¯ 1   Vr ∂ρ ¯¯ − ≈ ¯ ρ ∂r ¯ ρ ∂r ¯ zj+1 − zj−1 i,j−1 i,j+1   ¯ ¯ ¯ ¯ ∂ρ V ∂ρ V 1 ¯ z   z ¯¯ − + ¯ ¯ ρ ∂z ¯ ρ ∂z zj+1/2 − zj−1/2 i,j−1/2 i,j+1/2 ¸ · (Vr )i,j+1 ρi+1,j+1 − ρi−1,j+1 (Vr )i,j−1 ρi+1,j−1 − ρi−1,j−1 1 − ≈ ri+1 − ri−1 ρi,j−1 ri+1 − ri−1 ρi,j+1 zj+1 − zj−1 ¶ ·µ (Vr )i,j+1 (Vr )i,j ρi,j+1 − ρi,j 1 + + zj+1 − zj ρi,j+1 ρi,j zj+1 − zj−1 ¸ ¶ µ (Vr )i,j−1 (Vr )i,j ρi,j − ρi,j−1 . + − zj − zj−1 ρi,j−1 ρi,j. Vorticity : ¶ µ ¶ µ V2 ∂ω ∂ω ∂ 2 µω µω ∂µω ∂ + ∇ρ · ∇ − ρV ω − r + rρV − rρV − +r r r z r 2 ∂z ∂r r ∂z 2 ∂r ∂r (v) (iv) (iii) (ii) (i) ¶ µ ¢ ¡ ∂µ ∂µ = 0. + ∇Vz · ∇ r∇ρ · g − 2r ∇(div(V )) · ∇µ + 2r ∇Vr · ∇ ∂z ∂r (viii) (vii) (vi) 5.

(13) (i) For i = 3 ∼ (n − 1) and j = 2 ∼ (m − 1)   ¯ ¯ ¶ ¯¯ µ ¯ ¯ ∂µω ¯ 1 ∂µω ¯ ∂  r ∂µω ¯¯ −r r ¯ ¯ ≈ ∂r ¯ ∂r ¯ ri+1/2 − ri−1/2 ∂r ¯ ∂r i−1/2,j i+1/2,j i,j µ ri+1 + ri µi+1,j ωi+1,j − µi,j ωi,j 2 ≈ ri+1 − ri 2 ri+1 − ri−1 ¶ ri + ri−1 µi,j ωi,j − µi−1,j ωi−1,j . − ri − ri−1 2 For i = 2 and j = 2 ∼ (m − 1)  ¯  ¯ ¶ ¯¯ µ ¯ ∂µω ¯¯  ∂µω ¯ 1 ∂µω ¯ ∂  where r1 = 0 − r r ≈ r ¯ ¯ ¯ ∂r ¯ ∂r ¯ r2+1/2 − r1 ∂r ¯ ∂r 1,j 2+1/2,j 2,j ¶ µ µ3,j ω3,j − µ2,j ω2,j r3 + r2 µ3,j ω3,j − µ2,j ω2,j 2 . = ≈ r3 − r2 r3 − r2 2 r3 + r2 (ii) For i = 2 ∼ (n − 1) and j = 2 ∼ (m − 1)   ¯ ¯ ¯ ¯ ¯ ¯ 2 ∂µω ¯ 1 ∂ µω ¯   ∂µω ¯¯ − r ¯ ¯ ≈ ri 2 ∂z ¯ ∂z ¯ zj+1/2 − zj−1/2 ∂z ¯ i,j−1/2 i,j+1/2 i,j ¶ µ µi,j ωi,j − µi,j−1 ωi,j−1 µi,j+1 ωi,j+1 − µi,j ωi,j 2 . − ≈ ri zj − zj−1 zj+1 − zj zj+1 − zj−1 (iii) For i = 2 ∼ (n − 1) and j = 2 ∼ (m − 1) ¯ µi,j ωi,j µω ¯¯ . ¯ ≈ ri r ¯ i,j. (iv) For i = 2 ∼ (n − 1) and j = 2 ∼ (m − 1) ¶ ¯¯ µ ∂ω ∂ω ¯ − ρVr ω ¯ + rρVz rρVr ¯ ∂z ∂r i,j ¾ · ½ ωi,j − ωi−1,j (Vr )i,j + (Vr )i−1,j ,0 ≈ ri ρi,j max r − ri−1 2 ¸ ¾ i ½ ωi+1,j − ωi,j (Vr )i+1,j + (Vr )i,j ,0 − max − r − ri 2 ¾ i+1 · ½ ωi,j − ωi,j−1 (Vz )i,j + (Vz )i,j−1 ,0 + ri ρi,j max zj − zj−1 2 ¸ ¾ ½ ωi,j+1 − ωi,j (Vz )i,j+1 + (Vz )i,j − ρi,j (Vr )i,j ωi,j . ,0 − max − zj+1 − zj 2 6.
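Term (iv) above is evaluated with the monotonicity-preserving upwind formula written in terms of max{·, 0}; the same construction is used for the convective term of the conserved scalar equation below. The following C sketch shows that construction for a generic quantity φ convected by a velocity v along one grid direction; the one-dimensional, zero-based layout and the names are illustrative only.

static double max0(double x) { return x > 0.0 ? x : 0.0; }

/* First-order upwind approximation of v * dphi/dx at node i on a nonuniform grid x[].
   The face velocities are averages of neighbouring nodal values, and each one-sided
   difference is active only when the flow comes from that side, mirroring the
   max{., 0} form of term (iv). */
double upwind_convection(const double *x, const double *phi, const double *v, int i)
{
    double v_left  = 0.5 * (v[i]   + v[i-1]);   /* velocity at the left face  */
    double v_right = 0.5 * (v[i+1] + v[i]);     /* velocity at the right face */
    return max0(v_left)   * (phi[i]   - phi[i-1]) / (x[i]   - x[i-1])
         - max0(-v_right) * (phi[i+1] - phi[i])   / (x[i+1] - x[i]);
}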

(14) (v) For i = 2 ∼ (n − 1) and j = 2 ∼ (m − 1) ¯ ¶¯ µ ¶¯ µ ∂Vz ¯¯ ∂Vr ∂ρ ∂Vz ¯¯ ∂Vr ∂ρ V 2 ¯¯ + Vz Vr + Vz Vr r∇ρ · ∇ ¯ = r ¯ ¯ −r ∂z ¯ ∂z ∂r ∂r ¯ ∂r ∂z 2 ¯ i,j i,j i,j ¸ · (Vz )i+1,j − (Vz )i−1,j (Vr )i+1,j − (Vr )i−1,j ρi,j+1 − ρi,j−1 + (Vz )i,j (Vr )i,j ≈ ri ri+1 − ri−1 ri+1 − ri−1 zj+1 − zj−1 ¸ · (Vz )i,j+1 − (Vz )i,j−1 (Vr )i,j+1 − (Vr )i,j−1 ρi+1,j − ρi−1,j . + (Vz )i,j (Vr )i,j − ri zj+1 − zj−1 zj+1 − zj−1 ri+1 − ri−1 (vi) For i = 2 ∼ (n − 1) and j = 2 ∼ (m − 1) ¯ ¯ ¯ ¯ ρi+1,j − ρi−1,j ∂ρ ¯ ¯ . r∇ρ · g ¯ = −rgz ¯ ≈ (980.65)ri ¯ ri+1 − ri−1 ∂r ¯ i,j. i,j. (vii) For i = 2 ∼ (n − 1) and j = 2 ∼ (m − 1) ¯ ¢ ¯¯ ¡ 2r ∇(div(V )) · ∇µ ¯ ¯ i,j À ¯¯ ¶À ¿ µ ¶ ¿ µ ∂µ ∂µ ¯ ∂ ∂ 1 ∂ ∂ ∂ 1 ∂ , · (rVr ) + Vz (rVr ) + Vz , − = 2r ¯ ∂r ∂z ¯ ∂z ∂r r ∂r ∂z ∂z r ∂r i,j ¶ ¯¯ µ 2 ¶ ¯¯ µ 2 2 2 ∂ Vr 1 ∂Vr Vr ¯ ∂µ ∂ Vz ∂ Vz 1 ∂Vr ¯ ∂µ ∂ Vr − 2 ¯ + + + + = 2r ¯ − 2r 2 r ¯ r ∂r ∂r2 ∂z ∂r∂z r ∂z ¯ ∂z ∂r ∂z∂r i,j i,j · µi+1,j − µi−1,j (Vr )i+1,j+1 − (Vr )i+1,j−1 − (Vr )i−1,j+1 + (Vr )i−1,j−1 ≈ 2ri (ri+1 − ri−1 )(zj+1 − zj−1 ) ri+1 − ri−1 ¸ ¶ µ 1 (Vr )i,j+1 − (Vr )i,j−1 (Vz )i,j − (Vz )i,j−1 (Vz )i,j+1 − (Vz )i,j 2 + − + zj+1 − zj−1 ri zj − zj−1 zj+1 − zj zj+1 − zj−1 · µi,j+1 − µi,j−1 (Vz )i+1,j+1 − (Vz )i+1,j−1 − (Vz )i−1,j+1 + (Vz )i−1,j−1 − 2ri (ri+1 − ri−1 )(zj+1 − zj−1 ) zj+1 − zj−1 ¸ ¶ µ (Vr )i,j 1 (Vr )i+1,j − (Vr )i−1,j (Vr )i,j − (Vr )i−1,j (Vr )i+1,j − (Vr )i,j 2 . − + − + ri2 ri+1 − ri−1 ri ri − ri−1 ri+1 − ri ri+1 − ri−1. 7.

(15) (viii) For i = 2 ∼ (n − 1) and j = 2 ∼ (m − 1) ¶¯ µ ∂µ ¯¯ ∂µ + ∇Vz · ∇ 2r ∇Vr · ∇ ¯ ∂z ¯ ∂r i,j À¶ ¯¯ À ¿ 2 À ¿ À ¿ 2 µ¿ 2 ∂2µ ∂ µ ∂Vz ∂Vz ∂ µ ∂ µ ∂Vr ∂Vr ¯ ,− · , ,− 2 + · , = 2r ¯ ¯ ∂r∂z ∂z 2 ∂r ∂z ∂r ∂z∂r ∂r ∂z i,j ¯ ¶¸ µ · 2 ¯ ∂ 2 µ ∂Vr ∂Vz ∂ µ ∂Vz ∂ 2 µ ∂Vr ¯ − + − = 2r ¯ 2 2 ¯ ∂z ∂z∂r ∂r ∂r ∂z ∂z ∂r i,j ¶ µ · µi,j − µi,j−1 (Vz )i+1,j − (Vz )i−1,j µi,j+1 − µi,j 2 − ≈ 2ri ri+1 − ri−1 zj − zj−1 zj+1 − zj zj+1 − zj−1 ¶ µ µi,j − µi−1,j (Vr )i,j+1 − (Vr )i,j−1 µi+1,j − µi,j 2 − − zj+1 − zj−1 ri − ri−1 ri+1 − ri ri+1 − ri−1 ¶¸ µ (Vz )i,j+1 − (Vz )i,j−1 µi+1,j+1 − µi+1,j−1 − µi−1,j+1 + µi−1,j−1 (Vr )i+1,j − (Vr )i−1,j . − + zj+1 − zj−1 ri+1 − ri−1 (ri+1 − ri−1 )(zj+1 − zj−1 ). Conserved Scalar : µ. ∂S ∂S + ρVz ρVr ∂z ∂r (i). ¶. 1 ∂ − r ∂r. µ. ∂S rρD ∂r (ii). ¶. ∂ − ∂z. µ. ∂S ρD ∂z (iii). ¶. (i) For i = 2 ∼ (n − 1) and j = 2 ∼ (m − 1) ¶¯ µ ∂S ¯¯ ∂S + ρVz ρVr ¯ ∂z ¯ ∂r i,j ¾ · ½ Si,j − Si−1,j (Vr )i,j + (Vr )i−1,j ,0 ≈ ρi,j max r − ri−1 2 ¸ ¾ i ½ Si+1,j − Si,j (Vr )i+1,j + (Vr )i,j ,0 − max − r − ri 2 ¾ i+1 · ½ Si,j − Si,j−1 (Vz )i,j + (Vz )i,j−1 ,0 + ρi,j max zj − zj−1 2 ¸ ¾ ½ Si,j+1 − Si,j (Vz )i,j+1 + (Vz )i,j . ,0 − max − zj+1 − zj 2. 8. = 0..

(16) (ii) For i = 3 ∼ (n − 1) and j = 2 ∼ (m − 1) ¶¯ µ ¶¯ µ ∂S ¯¯ 1 1 ∂ ∂S ¯¯ 1 ∂ rµ rρD ¯ ¯ = ∂r ¯ Pr r ∂r ∂r ¯ r ∂r i,j i,j   ¯ ¯ ¯ ¯ ∂S ¯ 1 1 1  rµ ∂S ¯¯ − rµ ¯ ≈ ∂r ¯ ∂r ¯ Pr ri ri+1/2 − ri−1/2 i−1/2,j i+1/2,j ¶ µ ri µi,j + ri−1 µi−1,j Si,j − Si−1,j ri+1 µi+1,j + ri µi,j Si+1,j − Si,j 2 1 1 . − ≈ ri − ri−1 2 ri+1 − ri 2 Pr ri ri+1 − ri−1 For i = 2 and j = 2 ∼ (m − 1) ¶¯ µ ¶¯ µ ∂S ¯¯ 1 1 ∂ ∂S ¯¯ 1 ∂ rµ rρD ¯ ¯ = ∂r ¯ Pr r ∂r ∂r ¯ r ∂r 2,j 2,j  ¯ ¯  ¯ ¯ ¯ ¯ ∂S ∂S ∂S 1 1 1 ¯ ¯ rµ ¯¯ where − rµ ¯  ≈ ¯ =0 ∂r ¯ ∂r ¯ ∂r ¯ Pr r2 r2+1/2 − r1 1,j 1,j 2+1/2,j ¶ µ ¶ µ 1 1 µ3,j + µ2,j S3,j − S2,j r3 + r2 µ3,j + µ2,j S3,j − S2,j 2 1 1 . = ≈ r3 − r2 2 Pr r2 r3 − r2 2 2 Pr r2 r3 + r2 (iii) For i = 2 ∼ (n − 1) and j = 2 ∼ (m − 1) ¶ ¯¯ µ ¶ ¯¯ µ ∂S ¯ 1 ∂ ∂S ¯ ∂ µ ρD ¯ ¯ = ∂z ¯ Pr ∂z ∂z ¯ ∂z i,j i,j   ¯ ¯ ¯ ¯ ∂S ¯ 1 1  µ ∂S ¯¯ −µ ¯ ≈ ∂z ¯ ∂z ¯ Pr zj+1/2 − zj−1/2 i,j+1/2 i,j+1/2 ¶ µ µi,j + µi,j−1 Si,j − Si,j−1 µi,j+1 + µi,j Si,j+1 − Si,j 2 1 . − ≈ zj − zj−1 2 zj+1 − zj 2 Pr zj+1 − zj−1. 9.

(17) Figure 1: Physical configuration for diffusion flame model (not to scale). [Schematic of the axisymmetric computational domain: the axial coordinate z runs from the inlet to z = L and the radial coordinate r from the axis of symmetry to the outer boundary Rmax; the inner fuel jet (radius RI) is surrounded by a coflowing air annulus (outer radius RO), with air outside.]

10.

(18) Outer Boundary (r = Rmax ) : ∂Vz = 0, ∂r (ii). ∂Vr = 0, ∂r (i). ∂Vr , ∂z (iii). ω=. S = 0. (iv). (i) For i = n and j = 1 ∼ m (Vr )n,j − (Vr )n−1,j ∂Vr . ≈ rn − rn−1 ∂r (ii) For i = n and j = 1 ∼ m (Vz )n,j − (Vz )n−1,j ∂Vz . ≈ rn − rn−1 ∂r (iii) For i = n and j = 1 or m ωn,j − ωn−1,j ∂ω . ≈ rn − rn−1 ∂r For i = n and j = 2 ∼ (m − 1) ω−. (Vr )n,j+1 − (Vr )n,j−1 ∂Vr . ≈ ωn,j − zj+1 − zj−1 ∂z. (iv) For i = n and j = 1 ∼ m S ≈ Sn,j .. Axis of Symmetry (r = 0) : Vr = 0, (i). ∂Vz = 0, ∂r (ii). ω = 0, (iii). (i) For i = 1 and j = 2 ∼ (m − 1) Vr ≈ (Vr )1,j .. 11. ∂S = 0. ∂r (iv).

(19) (ii) For i = 1 and j = 2 ∼ (m − 1) We use the result (ii) to deal with axial velocity equation. ¶ µ (Vz )2,j − (Vz )1,j ∂ Vz ∂ρ ∂ 2 Vz ∂ 2 Vz ∂ω ≈ 2 + + + (r2 − r1 )2 ∂z ρ ∂z ∂r ∂z 2 ∂r2 ¸ · ω2,j (Vz )1,j − (Vz )1,j−1 (Vz )1,j+1 − (Vz )1,j 2 + − + r2 − r1 zj − zj−1 zj+1 − zj zj+1 − zj−1 ¸ ¶ µ ¶ ·µ (Vz )1,j−1 (Vz )1,j ρ1,j − ρ1,j−1 (Vz )1,j+1 (Vz )1,j ρ1,j+1 − ρ1,j 1 . + − + + zj − zj−1 ρ1,j−1 ρ1,j zj+1 − zj ρ1,j+1 ρ1,j zj+1 − zj−1 (iii) For i = 1 and j = 2 ∼ (m − 1) ω ≈ ω1,j . (iv) For i = 1 and j = 2 ∼ (m − 1) ω2,j − ω1,j ∂S . ≈ r2 − r1 ∂r. Inlet Boundary (z = 0) : Vr = 0,. Vz = Vz0 (r),. (i). (ii). ω=. ∂Vr ∂Vz , − ∂r ∂z (iii). S = S 0 (r) . (iv). (i) For i = 1 ∼ (n − 1) and j = 1 Vr ≈ (Vr )i,1 . (ii) For i = 1 ∼ (n − 1) and j = 1 Vz − Vz0 (r) ≈ (Vz )i,1 − Vz0 (ri ). (iii) For i = 1 and j = 1 ω ≈ ωi,1 . For i = 2 ∼ (n − 1) and j = 1 ω−. (Vz )i+1,1 − (Vz )i−1,1 (Vr )i,2 ωi,1 + ωi,2 ∂Vr ∂Vz . + − ≈ + ri+1 − ri−1 z2 − z1 2 ∂r ∂z. 12.

(20) (iv) For i = 1 ∼ (n − 1) and j = 1 S − S 0 (r) ≈ Si,1 − S 0 (ri ).. Outlet Boundary (z = L) : ∂ω = 0, ∂z (iii). ∂Vz = 0, ∂z (ii). Vr = 0, (i). ∂S = 0. ∂z (iv). (i) For i = 1 ∼ (n − 1) and j = m Vr ≈ (Vr )i,m . (ii) For i = 1 ∼ (n − 1) and j = m (Vz )i,m − (Vz )i,m−1 ∂Vz . ≈ zm − zm−1 ∂z (iii) For i = 1 and j = m ω ≈ ωi,m . For i = 2 ∼ (n − 1) and j = m ωi,m − ωi,m−1 ∂ω . ≈ zm − zm−1 ∂z (iv) For i = 1 ∼ (n − 1) and j = m Si,m − Si,m−1 ∂S . ≈ zm − zm−1 ∂z. 13.

(21) 2. Krylov Subspace Methods

A Krylov subspace method is a method for which the search subspace Km(A, r0) of R^n is the Krylov subspace

Km(A, r0) = span{r0, Ar0, A^2 r0, . . . , A^{m-1} r0},

where r0 = b - Ax0 is the initial residual and x0 is the initial guess. The Krylov subspace has the useful property that enlarging it only requires matrix-vector products with the basis vectors, which are cheap to compute. In other words, given a basis for Km, we can cheaply compute a basis for Km+1. In this chapter we first introduce general projection methods, namely orthogonal and oblique projections, and then show two optimality results: an orthogonal projection minimizes the A-norm of the error when A is symmetric positive definite, while an oblique projection onto K orthogonal to L = AK minimizes the 2-norm of the residual when A is an arbitrary square matrix. In the second section, we introduce Arnoldi's method to construct the Krylov subspace; this method is an orthogonal projection method onto Km for general non-Hermitian matrices. In the last section, we introduce the full orthogonalization method and its variants, the incomplete orthogonalization method and its direct version, and apply them to linear systems.

2.1 Projection Methods

Consider the linear system

Ax = b, (2.1)

where A is an n x n real (possibly sparse) matrix. We use a projection method to extract an approximation to the solution of the linear system. The idea is to restrict the next step of an iterative method to a small subspace but to pick the "best" step in that subspace. To reach this goal, we look for methods which find the next iterate as cheaply as possible while still minimizing some measure of the error or residual at each step. Let K and L be two m-dimensional subspaces of R^n. A projection technique onto the subspace K and orthogonal to L finds an approximate solution x~ to (2.1) by

14.

(22) imposing two conditions. One is to find x˜ belong to K and the other is making residual vector must be orthogonal to L. Now suppose we have current guess x0 and let initial residual vector r0 = b − Ax0 . The projection method is repressed as Find. x˜ ∈ x0 + K,. such that r˜ = b − A˜ x ⊥ L.. or equivalently Find. δ ∈ K,. such that r˜ = b − A(x0 + δ) = r0 − Aδ ⊥ L.. (2.2). Let V = [v1 · · · vm ] and W = [w1 · · · wm ] are n × m matrix whose column-vectors form a basis for K and L, respectively. Then x˜ = x0 + δ = x0 + V y for some y ∈ Rm . Thus, we can transform (2.2) into matrix representation as following W T AV y = W T r0. (2.3). If we can find W T AV is nonsingular, then (2.3) has unique solution. For this purpose, the following proposition describes two ideal cases. Proposition 1. Let A, L, and K satisfy either one of the two following conditions, i. A is positive definite and L = K,or ii. A is nonsingular and L = AK. Then the matrix B = W T AV is nonsingular for any bases V and W of K and L, respectively. Proof. To prove first case. Let V and W be any basis of K and L, respectively. Since L and K are the same, we let W = V G, where G is a m × m nonsingular matrix. Then B = W T AV = GT V T AV. Because V T AV is positive definite, then we shows that B is nonsingular. Consider the second case. Since L = AK, we let W = AV G, where G is a m × m nonsingular matrix. Then B = W T AV = GT (AV )T AV. Since A is nonsingular, the n × m matrix AV is full rank. Then we have (AV )T AV is nonsingular. So we have a result.. ¥ 15.

(23) Suppose Proposition 1. hold, then the approximate solution x˜ can be repressed as x˜ = x0 + V y = x0 + V (W T AV )−1 W T r0 . A question is how much the quality of the approximate solution obtained from a general projection method? In order to answer this problem, two optimal results will be established. Proposition 2. Assume that A is symmetric positive definite and L = K. Then a vector x˜ is the result of an (orthogonal) projection method onto K with the starting vector x0 if and only if it minimizes the A-norm of the error over x0 + K, that is, E(˜ x) = min E(x), x∈x0 +K. where E(x) ≡ (A(x∗ − x), x∗ − x)1/2 ≡ kx∗ − xkA . ˜ where δ˜ ∈ K. Then Proof. Let x˜ = x0 + δ, min kx∗ − xkA = min kx∗ − (x0 + δ)kA. x∈x0 +K. δ∈K. = min kd0 − δkA (where d0 = x∗ − x0 ) δ∈K. ˜ A = kd0 − δk Therefore, we have d0 − δ˜ ⊥A K. It would be better to say that δ˜ is the A-orthogonal projection of d0 onto K.. ¥. Proposition 3. Let A be an arbitrary square matrix and assume that L = AK. Then a vector x˜ is the result of an (oblique) projection method onto K orthogonally to L with the starting vector x0 if and only if it minimizes the 2-norm of the residual vector b − Ax over x ∈ x0 + K, that is, R(˜ x) = min R(x), x∈x0 +K. where R(x) ≡ kb − Axk2 . 16.

(24) ˜ where δ˜ ∈ K. Then Proof. Let x˜ = x0 + δ, min kb − Axk2 = min kb − A(x0 + δ)k2. x∈x0 +K. δ∈K. = min kr0 − Aδk2 (where r0 = b − Ax0 ) Aδ∈AK. ˜ 2 = kr0 − Aδk Therefore, we have r0 − Aδ˜ ⊥ AK = L. It would be better to say that Aδ˜ is the orthogonal projection of r0 onto AK.. ¥. By Proposition 2., the result of the projection process can be interpreted as orthogonal projector acts on the initial error. The same is true of the Proposition 3. can be interpreted as oblique projector acts on the initial residual. The following properties will state conclusions from the above properties. Proposition 4. Let x˜ be the approximate solution obtained from an orthogonal projection process onto K, and let d˜ = x∗ − x˜ be the associated error vector. Then, d˜ = (I − PA )d0 , where PA denotes the projector onto the subspace K, which is orthogonal with respect to the A-inner product. Proof. By the result of Proposition 2.. ¥. Proposition 5. Let x˜ be the approximate solution obtained from a projection process onto K orthogonally to L = AK, and let r˜ = b − A˜ x be the associated residual. Then, r˜ = (I − P )r0 , where P denotes the orthogonal projector onto the subspace AK. Proof. By the result of Proposition 3.. ¥. 17.

(25) 2.2. Arnoldi’s Method. We know that Schur factorization reduces a dense matrix A into upper triangular matrix U by applying unitary matrix V . In natural thought, the unitary matrix V is made up of Householder reflectors, and let V = V1 V2 · · · Vm . Then we have V ∗ AV = U . In the first step, V1∗ multiplied on the left of A and V1 multiplied on the right of A. Note that V1∗ will change all rows of A and V1 will change all columns of A. The entries are changed at each step and we write in boldface as following diagrams:      . × × × × ×. × × × × ×. × × × × × A. × × × × ×. × × × × ×. . .      −→     . × 0 0 0 0. × × × × ×. × × × × × × × × × × V1∗ A. × × × × ×. . .      −→     . × × × × ×. × × × × × × × × × × × × × × × V1∗ AV1. × × × × ×.    .  . By the same way to apply to other Householder reflectors, we can find this idea had to fail. Fortunately, Arnoldi’s method suggest a good idea for us to reduce A to Hessenberg form. At the first step, we select Householder reflector V2∗ that leaves the first row unchanged. In other words, we let V˜ = V2 V3 · · · Vm and omit the reflector V1 . We show the first step as following diagrams:      . × × × × ×. × × × × ×. × × × × × A. × × × × ×. × × × × ×. . .      −→     . × × 0 0 0. × × × × ×. × × × × × × × × × × V2∗ A. × × × × ×. . .      −→     . × × 0 0 0. × × × × × × × × × × × × × × × V2∗ AV2. × × × × ×.    .  . Repeating this process by the same way on V2∗ AV2 by V3 , . . . , Vm , we have Hessenberg form matrix V˜ ∗ AV˜ = (Vm∗ . . . V3∗ V2∗ )A(V2 V3 . . . Vm ) = Hm . Arnoldi’s method [1] is an orthogonal projection method onto Km for general nonHermitian matrices. This basic idea is reducing a dense matrix into Hessenberg form by above way. For a start, we use standard Gram-Schmidt method to construct matrix V . The Arnoldi algorithm is stated as follow: 18.

(26) ALGORITHM 1. Arnoldi 1. Choose a vector v1 of norm 1 2. For j = 1, 2, . . . , m Do: 3. hij = (Avj , vi ) for i = 1, 2, . . . , j P 4. wj = Avj − ji=1 hij vi 5. hj+1,j = kwj k2 6. If hj+1,j = 0 then stop 7. vj+1 = wj /hj+1,j 8. EndDo. Proposition 6. Assume that Algorithm 1. does not stop before the m-th step. Then the vectors v1 , v2 , . . . , vm form an orthonormal basis of the Krylov subspace Km = span{v1 , Av1 , . . . , Am−1 v1 }. Proof. Follow from the fact of steps 4, 5, and 7 of Algorithm 1.. ¥. ¯ m, Proposition 7. Denote by Vm , the n × m matrix with column vectors v1 , . . . , vm , by H the (m + 1) × m Hessenberg matrix whose nonzero entries hij are defined by Algorithm ¯ m by deleting its last row. Then the following 1., and by Hm the matrix obtained from H relations hold: ¯ m, AVm = Vm Hm + wm eTm = Vm+1 H VmT AVm = Hm , where wm = hm+1,m vm+1 . Proof. From the fact of Algorithm 1, we have Avj =. j+1 X. hij vi ,. j = 1, 2, . . . , m.. i=1. Then we have all relations.. ¥. By Algorithm 1, we know that Arnoldi’s method uses standard Gram-Schmidt orthonormalization to make Vm . In practice, the calculations of Gram-Schmidt formulas turn out to be numerically unstable. We can gain simple modification by using the modified GramSchmidt algorithm instead of the standard Gram-Schmidt algorithm. Let A = [a1 , . . . , an ] 19.

(27) with columns {aj } and Pj be an n × n orthogonal projector , projects the vector orthogonally onto the space orthogonal to hv1 , . . . , vj−1 i, of rank n − (j − 1) such that v1 =. P1 a1 , kP1 a1 k. v2 =. P2 a2 , kP2 a2 k. ··· ,. vj =. Pj aj , kPj aj k. with P1 = In . It is not difficult to see that Pj = P⊥vj−1 · · · P⊥v2 P⊥v1 . Then the standard Gram-Schmidt algorithm just to do a single orthogonal projection, vj = Pj aj . However, the modified Gram-Schmidt algorithm uses successive orthogonal projections, vj = P⊥vj−1 · · · P⊥v2 P⊥v1 aj Therefore, we know that Gram-Schmidt and modified Gram-Schmidt are equivalent in mathematics. The modified Gram-Schmidt’s version of Arnoldi is as follow: ALGORITHM 2. Arnoldi-Modified Gram-Schmidt 1. Choose a vector v1 of norm 1 2. For j = 1, 2, . . . , m Do: 3. wj = Avj 4. For i = 1, . . . , j Do: 5. hij = (wj , vi ) 6. wj = wj − hij vi 7. EndDo 8. hj+1,j = kwj k2 . If hj+1,j = 0 then stop 9. vj+1 = wj /hj+1,j 10. EndDo. 20.
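As a concrete counterpart to Algorithm 2, the following C sketch carries out the Arnoldi process with modified Gram-Schmidt for a dense matrix; the row-major storage of A, the column-major storage of V and H, and the function names are assumptions made for the illustration, and no attempt is made at efficiency or reorthogonalization.

#include <math.h>
#include <string.h>

/* Dense matrix-vector product y = A*x (A row-major, n x n). */
static void matvec(int n, const double *A, const double *x, double *y)
{
    for (int i = 0; i < n; i++) {
        double s = 0.0;
        for (int j = 0; j < n; j++) s += A[i*n + j] * x[j];
        y[i] = s;
    }
}

static double dot(int n, const double *x, const double *y)
{
    double s = 0.0;
    for (int i = 0; i < n; i++) s += x[i] * y[i];
    return s;
}

/* Arnoldi process with modified Gram-Schmidt (Algorithm 2).
   On entry the first column of V (length n) holds v1 with ||v1||_2 = 1.  On exit V
   holds v1,...,v_{m+1} column by column (n x (m+1), column-major) and H holds the
   (m+1) x m Hessenberg matrix, H[i + j*(m+1)] = h_{i+1,j+1}.  w is a length-n work
   vector.  Returns the number of completed steps (< m if h_{j+1,j} = 0 occurs). */
int arnoldi_mgs(int n, const double *A, int m, double *V, double *H, double *w)
{
    memset(H, 0, (size_t)(m + 1) * m * sizeof(double));
    for (int j = 0; j < m; j++) {
        matvec(n, A, V + j*n, w);                      /* w = A v_j        */
        for (int i = 0; i <= j; i++) {                 /* orthogonalize    */
            double hij = dot(n, w, V + i*n);
            H[i + j*(m+1)] = hij;
            for (int k = 0; k < n; k++) w[k] -= hij * V[i*n + k];
        }
        double h = sqrt(dot(n, w, w));
        H[(j+1) + j*(m+1)] = h;
        if (h == 0.0) return j + 1;                    /* breakdown: Km is invariant */
        for (int k = 0; k < n; k++) V[(j+1)*n + k] = w[k] / h;
    }
    return m;
}

With such a routine, line 12 of the FOM and GMRES algorithms below only needs the small matrix H and the basis V.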

(28) 2.3. FOM, IOM, and DIOM. Let us return to our main subject, to solve the linear system Ax = b, we want to find approximate solution x˜ cheaply. By the Arnoldi’s method, we can apply this method to three types and obtain our goal in this section. The first type is directly applying algorithm of Arnoldi, called full orthogonalization method or FOM. It should be said with some emphasis that we use modified Gram-Schmidt process instead of Gram-Schmidt process from now on. The second type is applying incomplete orthogonalization process in FOM, called incomplete orthogonalization method or IOM. And the last type is improving IOM by using LU factorization, called direct incomplete orthogonalization or DIOM.. 2.3.1. FOM. Now, given an initial guess x0 to the linear system Ax = b, we consider an orthogonal projection method, L = K = Km (A, r0 ), Km (A, r0 ) = span{r0 , Ar0 , A2 r0 , . . . , Am−1 r0 }, where r0 = b − Ax0 . Taking v1 = r0 /β and β = kr0 k2 in Arnoldi’s method, then we get VmT AVm = Hm and by (2.2), we have VmT (r0 − AVm ym ) = 0. Then −1 −1 ym = Hm (VmT r0 ) = Hm (βe1 ).. (2.4). The approximate solution is given by xm = x0 + Vm ym .. (2.5). The process as above is called the full orthogonalization method (FOM). Let me stress again that we use modified Gram-Schmidt method to make Vm and the algorithm is stated as follow: 21.

(29) ALGORITHM 3. FOM 1. r0 = b − Ax0 , β = kr0 k2 , and v1 = r0 /β 2. Set m × m matrix Hm = 0 3. For j = 1, 2, . . . , m Do: 4. wj = Avj 5. For i = 1, . . . , j Do: 6. hi,j = (wj , vi ) 7. wj = wj − hi,j vi 8. EndDo 9. hj+1,j = kwj k2 . If hj+1,j = 0 then set m = j and Goto 12 10. vj+1 = wj /hj+1,j 11. EndDo −1 (βe1 ) and xm = x0 + Vm ym 12. ym = Hm. 2.3.2. IOM. Sometimes we have a situation that our calculations may be dictated by computer’s memory limitations. By FOM algorithm, the memory cost increases at least as O(mn). As m increases, the largest value of m that can be used. We have two remedies to solve our problem. One remedy is to restart the FOM algorithm for reaching “small” m. And the other remedy is to truncate the Arnoldi algorithm, not using full orthogonalization. It is to say that we use incomplete orthogonalization process to make vj and gain a banded Hessenberg matrix Hm with bandwidth k + 1, the number k maybe dictated by computer’s memory limitations. The incomplete orthogonalization process with (2.4) and (2.5), called incomplete orthogonalization method (IOM), is performed as follow. ALGORITHM 4. IOM 1. r0 = b − Ax0 , β = kr0 k2 , and v1 = r0 /β 2. Set m × m matrix Hm = 0 3. For j = 1, 2, . . . , m Do: 4. w = Avj 5. For i = max{1, j − k + 1}, . . . , j Do: 6. hi,j = (w, vi ) 7. w = w − hi,j vi 8. EndDo 9. hj+1,j = kwk2 . If hj+1,j = 0 then set m = j and Goto 12 10. vj+1 = w/hj+1,j 11. EndDo −1 (βe1 ) and xm = x0 + Vm ym 12. ym = Hm 22.
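Once an Arnoldi routine (such as the sketch given after Algorithm 2 above) has produced Vm and Hm, line 12 of FOM reduces to solving the small m x m system Hm ym = βe1 and forming xm = x0 + Vm ym. A minimal C sketch of this step is given below; it exploits the Hessenberg structure (a single subdiagonal) with Gaussian elimination and no pivoting, so it is an illustration under that assumption rather than a robust solver, and the storage conventions match the Arnoldi sketch above.

/* Solve the m x m upper Hessenberg system Hm y = beta*e1 and form x = x0 + Vm y
   (line 12 of FOM).  H is stored column-major with leading dimension m+1, as produced
   by the Arnoldi sketch; V is n x m column-major.  T (m*m) and y (m) are work arrays. */
void fom_update(int n, int m, const double *H, const double *V, double beta,
                const double *x0, double *x, double *T, double *y)
{
    for (int j = 0; j < m; j++)                    /* copy Hm and set rhs = beta*e1 */
        for (int i = 0; i < m; i++)
            T[i + j*m] = H[i + j*(m+1)];
    for (int i = 0; i < m; i++) y[i] = 0.0;
    y[0] = beta;

    for (int j = 0; j < m - 1; j++) {              /* eliminate the subdiagonal      */
        double l = T[(j+1) + j*m] / T[j + j*m];
        for (int k = j; k < m; k++) T[(j+1) + k*m] -= l * T[j + k*m];
        y[j+1] -= l * y[j];
    }
    for (int i = m - 1; i >= 0; i--) {             /* back substitution              */
        double s = y[i];
        for (int k = i + 1; k < m; k++) s -= T[i + k*m] * y[k];
        y[i] = s / T[i + i*m];
    }
    for (int k = 0; k < n; k++) {                  /* x = x0 + Vm y                  */
        double s = x0[k];
        for (int j = 0; j < m; j++) s += V[j*n + k] * y[j];
        x[k] = s;
    }
}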

(30) 2.3.3. DIOM. We know that FOM and IOM algorithms use the same relation (2.4) to find approximate solution xm . These methods must to compute the inverse of Hm , then we need to solve mth linear system. To avoid computing the inverse problem, we can modify the algorithm of IOM. The direct incomplete orthogonalization method (DIOM) is derived from the structure of the LU factorization, Hm = Lm Um , of the Hessenberg matrix Hm , which obtained from the IOM. We assume that no pivoting is used, then matrix Lm , entries {lij }, is unit lower bidiagonal and Um , entries {uij }, is banded upper triangular, with k diagonals. Thus the approximate solution is given by −1 −1 xm = x0 + Vm Um Lm (βe1 ).. (2.6). −1 Pm = [p1 , . . . , pm ] = Vm Um. (2.7). zm = [z1 , . . . , zm ]T = L−1 m (βe1 ).. (2.8). Defining the matrix. and the vector. By (2.7) and the structure of Um , we have m X. uim pi = vm ,. i=m−k+1. which show the vector pm to be updated from the previous pi ’s and vm , then # " m−1 X 1 uim pi . vm − pm = umm i=m−k+1 And by (2.8), because of the structure of Lm , we have ¸ · zm−1 zm = ζm where ζm = −lm,m−1 ζm−1 with z1 = [β] and ζ1 = β. Then (2.6) can be rewritten as ¸ · zm−1 xm = x0 + [Pm−1 , pm ] ζm = x0 + Pm−1 zm−1 + ζm pm = xm−1 + ζm pm . 23.

(31) The algorithm of DIOM is stated as follow: ALGORITHM 5. DIOM 1. Choose x0 and r0 = b − Ax0 , β = kr0 k2 , and v1 = r0 /β 2. For m = 1, 2, . . . , until convergence Do: 3. w = Avm 4. For i = max{1, m − k + 1}, . . . , m Do: 5. hi,m = (w, vi ) 6. w = w − hi,m vi 7. EndDo 8. hm+1,m = kwk2 and vm+1 = w/hm+1,m 9. Update the LU factorization of Hm . If umm = 0 then Stop. 10. ζm = { if m β, else −lm,m−1 ¢ ζm−1 } ¡ = 1 then Pm−1 −1 11. pm = umm vm − i=m−k+1 uim pi (for i ≤ 0 set uim pi = 0) 12. xm = xm−1 + ζm pm 13. EndDo. 24.

(32) 3. Biconjugate Gradient Stabilized Method

The conjugate gradient (CG) and biconjugate gradient (BCG) algorithms are derived in this chapter. These algorithms generate optimal approximations from the Krylov subspace. When A is symmetric positive definite, CG can be derived from the symmetric Lanczos process. For non-symmetric matrices, we use the two-sided Lanczos algorithm (Lanczos biorthogonalization) to construct biorthogonal bases for the Krylov subspaces corresponding to A and A^T; BCG can then be derived from the Lanczos biorthogonalization procedure. To avoid using A^T in BCG, the conjugate gradient squared method (CGS) is derived from BCG. CGS often converges faster than BCG for the same computational cost, but since the residual polynomials in CGS are squared, rounding errors tend to be more damaging than in BCG. Finally, we modify the CGS algorithm so that its convergence behavior is smoother, which gives the biconjugate gradient stabilized method (BICGSTAB).

3.1 Conjugate Gradient Method

The conjugate gradient method [8] is one of the best known iterative techniques for solving sparse symmetric positive definite linear systems. The conjugate gradient algorithm can be derived from DIOM for the case when A is symmetric positive definite. First, we derive the analogue of FOM for symmetric A, called the Lanczos method. Given an initial guess x0, we can find the approximate solution xm by rewriting (2.4) and (2.5) as

xm = x0 + Vm ym,   ym = Tm^{-1}(βe1),

where Tm is the tridiagonal matrix

$$T_m = \begin{pmatrix} \alpha_1 & \beta_2 & & & \\ \beta_2 & \alpha_2 & \beta_3 & & \\ & \ddots & \ddots & \ddots & \\ & & \beta_{m-1} & \alpha_{m-1} & \beta_m \\ & & & \beta_m & \alpha_m \end{pmatrix}. \qquad (3.1)$$

25.

(33) Then the Lanczos method for linear system can be stated as follow: ALGORITHM 6. Lanczos method for linear systems 1. r0 = b − Ax0 , β = kr0 k2 , and v1 = r0 /β 2. For j = 1, 2, . . . , m Do: 3. wj = Avj − βj vj−1 (If j = 1 then set β1 v0 = 0) 4. αj = (wj , vj ) 5. wj = wj − αj vj 6. βj+1 = kwj k2 . If βj+1 = 0 then set m = j and Goto 9 7. vj+1 = wj /βj+1 8. EndDo 9. Tm = tridiag (βi , αi , βi+1 ), and Vm = [v1 , . . . , vm ]. 10. ym = Tm−1 (βe1 ) and xm = x0 + Vm ym. We should notice that the steps 3 until 5 of algorithm of Lanczos method can correspond with the steps 4 until 8 of algorithm of FOM as A is symmetric positive definite. Now, we will follow the same steps as for DIOM to construct direct version of the Lanczos method. Let Tm = Lm Um be LU factorization of Tm , where Lm is unit lower bidiagonal and Um is upper bidiagonal. To take a simple example,    η1 β2 1  λ2 1   η 2 β3       .. .. . . .. .. Tm =  × . .       ηm−1 βm λm−1 1 ηm λm 1 Then an approximate solution xm is −1 −1 xm = x0 + Vm Tm−1 (βe1 ) = x0 + Vm Um Lm (βe1 ).. Defining the matrix −1 Pm = [p1 , . . . , pm ] = Vm Um. and the vector zm = [z1 , . . . , zm ]T = L−1 m βe1 . Because of the structure of Um , we have −1 pm = ηm [vm − βm pm−1 ].. 26.     .  .

(34) Note that βm is a scalar computed from the Lanczos algorithm, while ηm results from the m-th Gaussian elimination step on the tridiagonal matrix, that is, βm , ηm−1 ηm = αm − λm βm .. λm =. By the same way, because of the structure of Lm , ¸ · zm−1 , zm = ζm in which ζm = −λm ζm−1 . Therefore, we can update xm as xm = x0 + Pm zm = xm−1 + ζm pm . Then we have the direct version of the Lanczos algorithm, called D-Lanczos, as follow: ALGORITHM 7. D-Lanczos 1. r0 = b − Ax0 , ζ1 = β = kr0 k2 , and v1 = r0 /β 2. set λ1 = β1 = 0 and p0 = 0 3. For m = 1, 2, . . . , until convergence Do: 4. w = Avm − βm vm−1 and αm = (w, vm ) βm and ζm = −λm ζm−1 5. If m > 1 then λm = ηm−1 6. ηm = αm − λm βm −1 (vm − βm pm−1 ) 7. pm = ηm 8. xm = xm−1 + ζm pm 9. If xm has converged then stop 10. w = w − α m vm 11. βm+1 = kwk2 and vm+1 = w/βm+1 12. EndDo. Proposition 8. Let rm = b − Axm , m = 0, 1, . . ., be the residual vectors produced by the Lanczos and the D-Lanczos algorithms and pm , m = 0, 1, . . ., the auxiliary vectors produced by D-Lanczos. Then, 1. Each residual vector rm is such that rm = σm vm+1 where σm is a certain scalar. As a result, the residual vectors are orthogonal to each other. 2. The auxiliary vectors pi form an A-conjugate set, i.e., (Api , pj ) = 0, for i 6= j. 27.

(35) Proof. We prove the first part, by Proposition 7. rm = b − Axm = b − A(x0 + Vm ym ) = r0 − (Vm Tm + βm+1 vm+1 eTm )ym = βv1 − Vm βe1 − βm+1 eTm ym vm+1 = −βm+1 eTm ym vm+1 . For the second part, we will claim that PmT APm is a diagonal matrix. −T T −1 PmT APm = Um Vm AVm Um −T −1 Tm U m = Um −T = Um Lm . −T Lm is a lower triangular and A is symmetric, then PmT APm must be a Note that Um. diagonal matrix.. ¥. Let us now return to CG algorithm. For the proposition given above, we can derive CG by imposing two conditions, one is rj ’s orthogonality and the other is pj ’s A-conjugacy. In D-Lanczos, we can find xj+1 = xj + αj pj . Then the residual rj+1 can be repressed as rj+1 = rj − αj Apj .. (3.2). Since the rj ’s are to be orthogonal each other, we have (rj − αj Apj , rj ) = 0. Therefore, αj =. (rj , rj ) . (Apj , rj ). (3.3). Also by DIOM, we have that the next search direction pj+1 is a linear combination of rj+1 and pj , it is pj+1 = rj+1 + βj pj . 28.

(36) Thus, (Apj , rj ) = (Apj , pj − βj−1 pj−1 ) = (Apj , pj ). Then (3.3) becomes (rj , rj ) . (Apj , pj ). αj = Because Apj orthogonal to pj+1 , then βj = −. (rj+1 , Apj ) . (pj , Apj ). From (3.2) we have Apj = −. 1 (rj+1 − rj ), αj. then βj =. (rj+1 , rj+1 ) 1 (rj+1 , (rj+1 − rj )) . = (rj , rj ) (Apj , pj ) αj. By Proposition 8. and above manners, we can rewrite D-Lanczos algorithm to conjugate gradient (CG) algorithm. We write down CG algorithm as follow: ALGORITHM 8. CG 1. r0 = b − Ax0 , and p0 = r0 . 2. For j = 0, 1, . . . , until convergence Do: 3. αj = (rj , rj )/(Apj , pj ) 4. xj+1 = xj + αj pj 5. rj+1 = rj − αj Apj 6. βj = (rj+1 , rj+1 )/(rj , rj ) 7. pj+1 = rj+1 + βj pj 8. EndDo. 29.
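For reference, a self-contained C sketch of Algorithm 8 for a dense symmetric positive definite matrix follows; the row-major storage, the stopping test on ||rj||_2, and the small test system in main are illustrative assumptions and not taken from the thesis.

#include <math.h>
#include <stdio.h>

static void matvec(int n, const double *A, const double *x, double *y)
{
    for (int i = 0; i < n; i++) {
        double s = 0.0;
        for (int j = 0; j < n; j++) s += A[i*n + j] * x[j];
        y[i] = s;
    }
}

static double dot(int n, const double *x, const double *y)
{
    double s = 0.0;
    for (int i = 0; i < n; i++) s += x[i] * y[i];
    return s;
}

/* Conjugate gradient (Algorithm 8) for a dense symmetric positive definite A (row-major).
   x holds the initial guess on entry and the approximate solution on exit.
   r, p, Ap are caller-supplied work vectors of length n.  Returns the iteration count. */
int cg(int n, const double *A, const double *b, double *x,
       double *r, double *p, double *Ap, double tol, int maxit)
{
    matvec(n, A, x, Ap);
    for (int i = 0; i < n; i++) r[i] = b[i] - Ap[i];   /* r0 = b - A x0 */
    for (int i = 0; i < n; i++) p[i] = r[i];           /* p0 = r0       */
    double rho = dot(n, r, r);
    int j = 0;
    while (j < maxit && sqrt(rho) > tol) {
        matvec(n, A, p, Ap);
        double alpha = rho / dot(n, Ap, p);            /* (r_j,r_j)/(Ap_j,p_j)        */
        for (int i = 0; i < n; i++) x[i] += alpha * p[i];
        for (int i = 0; i < n; i++) r[i] -= alpha * Ap[i];
        double rho_new = dot(n, r, r);
        double beta = rho_new / rho;                   /* (r_{j+1},r_{j+1})/(r_j,r_j) */
        for (int i = 0; i < n; i++) p[i] = r[i] + beta * p[i];
        rho = rho_new;
        j++;
    }
    return j;
}

int main(void)
{
    /* small symmetric positive definite test system, for illustration only */
    double A[9] = { 4, 1, 0,  1, 3, 1,  0, 1, 2 };
    double b[3] = { 1, 2, 3 }, x[3] = { 0, 0, 0 };
    double r[3], p[3], Ap[3];
    int it = cg(3, A, b, x, r, p, Ap, 1e-12, 100);
    printf("CG converged in %d iterations: x = (%g, %g, %g)\n", it, x[0], x[1], x[2]);
    return 0;
}

On this small test system the iteration terminates in at most three steps up to rounding, since in exact arithmetic the Krylov subspace then spans the whole space.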

(37) 3.2 3.2.1. Lanczos Biorthogonalization Two-Sided Lanczos. We know if the matrix A is symmetric matrix, then the Gram-Schmidt procedure for constructing an orthonormal basis for Krylov subspace of A reduces to three term recurrences. However, the matrix A is the case that A is non-symmetric in general. Fortunately, we can extend the symmetric Lanczos algorithm to the non-symmetric version, having three term recurrences. The Lanczos biorthogonalization use a pair of three term recurrences, one involving A and the other involving AT , to construct biorthogonal bases for the Krylov spaces corresponding to A and AT , respectively. Judging from the above, we have Km (A, v1 ) = span{v1 , Av1 , . . . , Am−1 v1 } Km (AT , w1 ) = span{w1 , AT w1 , . . . , (AT )m−1 w1 }, in which (vi , wj ) = 0 for i 6= j. For the most part, we take (vi , wi ) = 1. Then we show the algorithm as follow: ALGORITHM 9. The Lanczos Biorthogonalization Procedure 1. Choose two vectors v1 , w1 such that (v1 , w1 ) = 1. 2. β1 = δ1 = 0, w0 = v0 = 0 3. For j = 1, 2, . . . , m Do: 4. αj = (Avj , wj ) 5. vˆj+1 = Avj − αj vj − βj vj−1 6. wˆj+1 = AT wj − αj wj − δj wj−1 7. δj+1 = |(ˆ vj+1 , wˆj+1 )|1/2 . If δj+1 = 0 then stop 8. βj+1 = (ˆ vj+1 , wˆj+1 )/δj+1 9. wj+1 = wˆj+1 /βj+1 10. vj+1 = vˆj+1 /δj+1 11. EndDo Note that the Algorithm 9 selects a manner to ensure that (vj+1 , wj+1 ) = 1. In that case, it is a canonical choice to find two scalars βj+1 , δj+1 such that satisfy the equality as following form δj+1 βj+1 = (ˆ vj+1 , wˆj+1 ).. 30.

(38) If the algorithm does not break down, then we have Tm the tridiagonal matrix as below:   α1 β2   δ2 α2 β3     ... ... ... Tm =  .    δm−1 αm−1 βm  δm αm The following proposition will interpret this result. Proposition 9. If the algorithm does not break down before step m, then the vector vi , i = 1, . . . , m, and wj , j = 1, . . . , m, form a biorthogonal system, i.e., (vj , wi ) = δi,j. 1 ≤ i, j ≤ m.. Moreover, {vi }i=1,2,...,m is a basis of Km (A, v1 ) and {wi }i=1,2,...,m is a basis of Km (AT , w1 ) and the following relations hold, AVm = Vm Tm + δm+1 vm+1 eTm , AT Wm = Wm TmT + βm+1 wm+1 eTm , WmT AVm = Tm . Proof. We need prove here only the biorthogonality of the vectors vi , wi with induction. Since the proof of the above relations is similar to Proposition 7. Assume now that the vectors v1 , . . . , vj and w1 , . . . , wj are biorthogonal. Claim that (vj+1 , wi ) = 0 for i ≤ j as follow: When i = j, by steps 5 and 9 of Algorithm 9., then −1 [Avj − αj vj − βj vj−1 ], wj ) (vj+1 , wj ) = (δj+1 −1 [(Avj , wj ) − αj (vj , wj ) − βj (vj−1 , wj )] = δj+1 −1 [αj − αj × 1 − 0] = 0. = δj+1. When i < j, by steps 5, 6, 9, and 10 of Algorithm 9., then −1 [(Avj , wi ) − αj (vj , wi ) − βj (vj−1 , wi )] (vj+1 , wi ) = δj+1 −1 [(vj , AT wi ) − 0 − βj (vj−1 , wi )] = δj+1 −1 [(vj , βi+1 wi+1 + αi wi + δi wi−1 ) − βj (vj−1 , wi )]. = δj+1. 31.

(39) For i < j − 1, (vj+1 , wi ) = 0 by inductive hypothesis. For i = j − 1, then −1 [(vj , βj wj + αj−1 wj−1 + δj−1 wj−2 ) − βj (vj−1 , wj−1 )] (vj+1 , wj−1 ) = δj+1 −1 [βj (vj , wj ) − βj (vj−1 , wj−1 )] = 0. = δj+1. ¥ 3.2.2. BCG. The Biconjugate Gradient method (BCG) [6] is a oblique projection process onto Km = span{v1 , Av1 , · · · , Am−1 v1 } orthogonally to Lm = span{w1 , AT w1 , · · · , (AT )m−1 w1 }, where v1 = r0 /kr0 k2 and takes w1 to satisfy (v1 , w1 ) = 1. Proceeding in the same way as for the CG algorithm from the symmetric Lanczos algorithm, we can derive BCG algorithm from Algorithm 9., the two-sided Lanczos algorithm. First, we use the LU factorization of Tm = Lm Um and define two matrices −1 Pm = [p1 , . . . , pm ] = Vm Um ,. Pm∗ = [p∗1 , . . . , p∗m ] = Wm L−T m . By Proposition 9., we have T −1 −1 −1 (Pm∗ )T APm = L−1 m Wm AVm Um = Lm Tm Um = I.. Thus, the column vectors p∗i of Pm∗ and those pi of Pm are A-conjugate. Also, it is known that pj+1 and p∗j+1 can be expressed as pj+1 = rj+1 + βj pj .. (3.4). ∗ p∗j+1 = rj+1 + βj p∗j .. (3.5). From (3.5) and A-conjugacy, ∗ (Apj , rj+1 + βj p∗j ) = 0.. 32.

(40) Then by (3.2), we have βj = −. ∗ ∗ ) (rj+1 , rj+1 ) (Apj , rj+1 . = ∗ ∗ (rj , rj ) (Apj , pj ). (3.6). And like the Conjugate Gradient algorithm, we find the residual rj and rj∗ are in the same direction as for vj+1 and wj+1 , respectively. For these information and by (3.5), we have (rj − αj Apj , rj∗ ) = 0, it imply. (rj , rj∗ ) . αj = (Apj , p∗j ). Putting these relations together by above, we have the BCG algorithm as follow: ALGORITHM 10. BCG 1. r0 = b − Ax0 , and choose r0∗ such that (r0 , r0∗ ) = 1. 2. p0 = r0 , p∗0 = r0∗ 3. For j = 0, 1, . . . , until convergence Do: 4. αj = (rj , rj∗ )/(Apj , p∗j ) 5. xj+1 = xj + αj pj 6. rj+1 = rj − αj Apj ∗ = rj∗ − αj AT p∗j 7. rj+1 ∗ )/(rj , rj∗ ) 8. βj = (rj+1 , rj+1 9. pj+1 = rj+1 + βj pj ∗ + βj p∗j 10. p∗j+1 = rj+1 11. EndDo. 33. (3.7).

(41) 3.3 3.3.1. Transpose-Free Variants CGS. The BCG algorithm require multiplication by both A and AT at each step. One thing, however, is certain: the vector p∗j+1 or wj generated with AT do not contribute to the solution directly, this mean extra work. For the reason given above, the Conjugate Gradient Squared (CGS) [10] was mainly to avoid using AT in the BCG and to obtain faster convergence for the same computational cost. This idea is the residual vector rj and the conjugate direction φj , in the BCG algorithm, can be expressed as rj = φj (A)r0 ,. pj = πj (A)r0 ,. where φj and πj are certain polynomials of degree j with φ0 (A) = 1 and π0 (A) = 1. The same observation applies to the vectors rj∗ and p∗j , defined as rj∗ = φj (AT )r0∗ ,. p∗j = πj (AT )r0∗ .. By (3.6) and (3.7), the scalar αj and βj are given by αj =. (φ2j (A)r0 , r0∗ ) (φj (A)r0 , φj (AT )r0∗ ) = (Aπj2 (A)r0 , r0∗ ) (Aπj (A)r0 , πj (AT )r0∗ ). βj =. (φ2j+1 (A)r0 , r0∗ ) (φj+1 (A)r0 , φj+1 (AT )r0∗ ) . = (φ2j (A)r0 , r0∗ ) (φj (A)r0 , φj (AT )r0∗ ). This indicates that the coefficients can be computed if we know r0∗ and φ2j (A)r0 and πj2 (A)r0 . From (3.4) and (3.5), it can be seen that the polynomials φj+1 (t) and πj+1 (t) satisfy the recurrences φj+1 (t) = φj (t) − αj tπj (t). (3.8). πj+1 (t) = φj+1 (t) + βj πj (t),. (3.9). and squaring both sides gives φ2j+1 (t) = φ2j (t) − 2αj tφj (t)πj (t) + αj2 t2 πj2 (t). (3.10). 2 (t) = φ2j+1 (t) + 2βj φj+1 (t)πj (t) + βj2 πj2 (t). πj+1. (3.11). 34.

(42) Multiplying φj (t) by the recurrence for πj (t) gives φj (t)πj (t) = φ2j (t) + βj−1 φj (t)πj−1 (t),. (3.12). and multiplying the recurrence for φj+1 (t) by πj (t) gives φj+1 (t)πj (t) = φj (t)πj (t) − αj tπj2 (t) = φ2j (t) + βj−1 φj (t)πj−1 (t) − αj tπj2 (t).. (3.13). Defining three new vectors as rj = φ2j (A)r0 , pj = πj2 (A)r0 , qj = φj+1 (A), then (3.10) and (3.11) and (3.13) translate into rj+1 = rj − αj A(2rj + 2βj−1 qj−1 − αj Apj ),. (use (3.11)). pj+1 = rj+1 + 2βj qj + βj2 pj ,. (3.14) (3.15). qj = rj + βj−1 qj−1 − αj Apj .. (3.16). It is convenient to define two auxiliary vectors uj = rj + βj−1 qj−1 , dj = 2rj + 2βj−1 qj−1 − αj Apj = uj + qj . (use (3.16)) Utilizing these auxiliary vectors, we rewrite (3.14) and (3.15) and (3.16) as rj+1 = rj − αj A(uj + qj ) pj+1 = uj+1 + βj (qj + βj pj ) qj = uj − αj Apj . The first point to notice is the residual of CGS is different from the residual of BCG. In general, one should expect the result of CGS to converge twice as fast as BCG. Therefore, the CGS algorithm is given as below. 35.

(43) ALGORITHM 11. CGS 1. r0 = b − Ax0 , and choose r0∗ such that (r0 , r0∗ ) = 1. 2. p0 = u0 = r0 3. For j = 0, 1, . . . , until convergence Do: 4. αj = (rj , r0∗ )/(Apj , r0∗ ) 5. qj = uj − αj Apj 6. xj+1 = xj + αj (uj + qj ) 7. rj+1 = rj − αj A(uj + qj ) 8. βj = (rj+1 , r0∗ )/(rj , r0∗ ) 9. uj+1 = rj+1 + βj qj 10. pj+1 = uj+1 + βj (qj + βj pj ) 11. EndDo. 36.
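A C sketch of Algorithm 11 for a dense matrix is given below. One deviation from the listing should be noted: the shadow vector is simply taken as r0* = r0 without normalizing (r0, r0*) to 1, which is harmless because only ratios of inner products enter the recurrences; the storage conventions and names are again illustrative.

#include <math.h>

static void matvec(int n, const double *A, const double *x, double *y)
{
    for (int i = 0; i < n; i++) {
        double s = 0.0;
        for (int j = 0; j < n; j++) s += A[i*n + j] * x[j];
        y[i] = s;
    }
}

static double dot(int n, const double *x, const double *y)
{
    double s = 0.0;
    for (int i = 0; i < n; i++) s += x[i] * y[i];
    return s;
}

/* Conjugate gradient squared (Algorithm 11) for a dense nonsymmetric A (row-major).
   x holds the initial guess on entry and the approximation on exit; the six work
   vectors r, rs, p, u, q, w have length n.  Returns the iteration count. */
int cgs(int n, const double *A, const double *b, double *x,
        double *r, double *rs, double *p, double *u, double *q, double *w,
        double tol, int maxit)
{
    matvec(n, A, x, w);
    for (int i = 0; i < n; i++) r[i] = b[i] - w[i];
    for (int i = 0; i < n; i++) { rs[i] = r[i]; p[i] = r[i]; u[i] = r[i]; }
    double rho = dot(n, r, rs);
    for (int j = 0; j < maxit; j++) {
        if (sqrt(dot(n, r, r)) <= tol) return j;
        matvec(n, A, p, w);                            /* w = A p_j               */
        double alpha = rho / dot(n, w, rs);            /* (r_j,r0*)/(Ap_j,r0*)    */
        for (int i = 0; i < n; i++) q[i] = u[i] - alpha * w[i];
        for (int i = 0; i < n; i++) u[i] += q[i];      /* u_j + q_j (reuse u)     */
        matvec(n, A, u, w);                            /* w = A(u_j + q_j)        */
        for (int i = 0; i < n; i++) x[i] += alpha * u[i];
        for (int i = 0; i < n; i++) r[i] -= alpha * w[i];
        double rho_new = dot(n, r, rs);
        double beta = rho_new / rho;                   /* (r_{j+1},r0*)/(r_j,r0*) */
        for (int i = 0; i < n; i++) u[i] = r[i] + beta * q[i];          /* u_{j+1} */
        for (int i = 0; i < n; i++) p[i] = u[i] + beta * (q[i] + beta * p[i]);
        rho = rho_new;
    }
    return maxit;
}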

(44) 3.3.2. BICGSTAB. Note that the polynomials in the CGS are square, then the CGS residual usually increases by approximately the square of the increase of the BCG. But one difficulty in CGS is that rounding errors maybe lost or overflow and convergence curve maybe oscillate. To avoid the large oscillations in the CGS, one might try to produce iterates whose residual vectors are of the form rj0 = ψj (A)φj (A)r0 , where φj (t) is again the BCG polynomial and ψj (t) is a new polynomial which at each step is chosen to try and keep the smoothing convergence behavior. Specifically, we define ψj (t) by the form ψj+1 (t) = (1 − ωj t)ψj (t). (3.17). in which the scalar ωj can be chosen at each step to minimize krj+1 k2 . This manner leads to the Biconjugate Gradient Stabilized (BICGSTAB) [12], the derivation is similar to CGS. First, leaving the discussion of the scalar ωj aside for a moment, by using (3.8) and the residual polynomial ψj+1 (t)φj+1 (t) = (1 − ωj t)ψj (t)φj+1 (t) = (1 − ωj t)(ψj (t)φj (t) − αj tψj (t)πj (t)),. (3.18). which show that we can compute if we know the products ψj (t)πj (t). For this, by using (3.9) and (3.17), we have ψj (t)πj (t) = ψj (t)(φj (t) + βj−1 πj−1 (t)) = ψj (t)φj (t) + βj−1 (1 − ωj−1 t)ψj−1 (t)πj−1 (t).. (3.19). In the BICGSTAB scheme, we require two recurrences rj = φj (A)ψj (A)r0 , pj = ψj (A)πj (A)r0 . According to the above formulas, it follow from (3.18) and (3.19) that rj+1 = (I − ωj A)(rj − αj Apj ) pj+1 = rj+1 + βj (I − ωj A)pj . 37. (3.20).

(45) Finally, we need to express the coefficients αj , βj , and ωj in terms of the new vectors. Let ρj = (φj (A)r0 , φj (AT )r0∗ ), ρ˜j = (φj (A)r0 , ψj (AT )r0∗ ). From BCG, we have φj (A)r0 orthogonal to all vectors (AT )k r0∗ , with k < j. In addition, (j). (j). φj (A) and ψj (A) are polynomials of degree j. In particular, let η1 and γ1 be the leading coefficients for the polynomials ψj (A) and φj (A), respectively. Then ! Ã (j) (j) η η1 ρ˜j = φj (A)r0 , (j) φj (AT )r0 = 1(j) ρj . γ1 γ1 According to (3.8) and (3.17), we have (j+1). γ1. (j). (j+1). = −αj γ1 ,. η1. (j). = −ωj η1 .. As a result, we now compute βj : (φj+1 (A)r0 , ψj+1 (AT )r0∗ ) αj × ωj (φj (A)r0 , ψj (AT )r0∗ ) ∗ (ψj+1 (A)φj+1 (A)r0 , r0 ) αj × = ωj (ψj (A)φj (A)r0 , r0∗ ) ∗ (rj+1 , r0 ) αj × . = ωj (rj , r0∗ ). βj =. To compute αj by the same way, the polynomials in the right sides of the inner products in both the numerator and denominator can be replaced by their leading terms. And we also know that the leading coefficients for φj (AT )r0∗ and πj (AT )r0∗ are identical. Therefore, (φj (A)r0 , φj (AT )r0∗ ) (Aπj (A)r0 , πj (AT )r0∗ ) (φj (A)r0 , ψj (AT )r0∗ ) = (Aπj (A)r0 , ψj (AT )r0∗ ) (ψj (A)φj (A)r0 , r0∗ ) = (Aψj (A)πj (A)r0 , r0∗ ) (rj , r0∗ ) . = (Apj , r0∗ ). αj =. Lastly, let us now return to find the scalar ωj to be minimize the residual rj+1 and by (3.20), we have min krj+1 k2 = min k(I − ωj A)(rj − αj Apj )k2. ωj ∈R. ωj ∈R. = min k(I − ωj A)sj k2 ωj ∈R. 38.

(46) in which sj = rj − αj Apj . By Minimal Residual iteration, the optimal value for ωj is given by ωj =. (Asj , sj ) . (Asj , Asj ). Finally, we rewrite the residual rj+1 in the following form rj+1 = rj − αj Apj − ωj Asj = rj − A(αj pj + ωj sj ). Then an approximate solution xj+1 can be repressed as xj+1 = xj + αj pj + ωj sj . Thus, we have the BICGSTAB algorithm as follow : ALGORITHM 12. BCGSTAB 1. r0 = b − Ax0 , and choose r0∗ such that (r0 , r0∗ ) = 1. 2. p0 = r0 3. For j = 0, 1, . . . , until convergence Do: 4. αj = (rj , r0∗ )/(Apj , r0∗ ) 5. sj = rj − αj Apj 6. wj = (Asj , sj )/(Asj , Asj ) 7. xj+1 = xj + αj pj + ωj sj 8. rj+1 = sj − ωj Asj (r ,r0∗ ) α × ωjj 9. βj = (rj+1 ∗ ,r j 0) 10. pj+1 = rj+1 + βj (pj − ωj Apj ) 11. EndDo. 39.
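The corresponding C sketch of Algorithm 12 follows; as in the CGS sketch, the shadow vector is taken as r0* = r0, and the dense row-major storage, the stopping test, and the names are illustrative assumptions.

#include <math.h>

static void matvec(int n, const double *A, const double *x, double *y)
{
    for (int i = 0; i < n; i++) {
        double s = 0.0;
        for (int j = 0; j < n; j++) s += A[i*n + j] * x[j];
        y[i] = s;
    }
}

static double dot(int n, const double *x, const double *y)
{
    double s = 0.0;
    for (int i = 0; i < n; i++) s += x[i] * y[i];
    return s;
}

/* BICGSTAB (Algorithm 12) for a dense nonsymmetric A (row-major).  x holds the
   initial guess on entry and the approximation on exit; r, rs, p, s, Ap, As are
   length-n work vectors.  Returns the iteration count. */
int bicgstab(int n, const double *A, const double *b, double *x,
             double *r, double *rs, double *p, double *s, double *Ap, double *As,
             double tol, int maxit)
{
    matvec(n, A, x, Ap);
    for (int i = 0; i < n; i++) r[i] = b[i] - Ap[i];
    for (int i = 0; i < n; i++) { rs[i] = r[i]; p[i] = r[i]; }
    double rho = dot(n, r, rs);
    for (int j = 0; j < maxit; j++) {
        if (sqrt(dot(n, r, r)) <= tol) return j;
        matvec(n, A, p, Ap);
        double alpha = rho / dot(n, Ap, rs);             /* (r_j,r0*)/(Ap_j,r0*)     */
        for (int i = 0; i < n; i++) s[i] = r[i] - alpha * Ap[i];
        matvec(n, A, s, As);
        double omega = dot(n, As, s) / dot(n, As, As);   /* minimizes ||(I-wA)s||_2  */
        for (int i = 0; i < n; i++) x[i] += alpha * p[i] + omega * s[i];
        for (int i = 0; i < n; i++) r[i] = s[i] - omega * As[i];
        double rho_new = dot(n, r, rs);
        double beta = (rho_new / rho) * (alpha / omega); /* step 9 of Algorithm 12   */
        for (int i = 0; i < n; i++) p[i] = r[i] + beta * (p[i] - omega * Ap[i]);
        rho = rho_new;
    }
    return maxit;
}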

(47) 4. Transpose-Free QMR Method

In this chapter the generalized minimal residual (GMRES), quasi-minimal residual (QMR), and transpose-free QMR (TFQMR) algorithms are derived. In line 12 of FOM, the vector ym is obtained by solving a small linear system with the matrix Hm; here we treat this step differently, as a least squares problem. In the first section, we derive GMRES and its variations by relying on Arnoldi's method. Starting from the GMRES algorithm, we use the same techniques as for IOM and DIOM to construct Quasi-GMRES (QGMRES) and its direct version, called DQGMRES. In the second section, we introduce the QMR method; the QMR algorithm finds an approximate solution xm much as the GMRES algorithm does, except for the way the matrix Tm or Hm is constructed. In the last section, we introduce the TFQMR method, which is derived from the CGS algorithm. We shall have more to say about TFQMR later on.

4.1 GMRES, QGMRES, and DQGMRES

In this section, we develop GMRES, QGMRES, and DQGMRES along the same lines as Section 2.3. The one point to notice is that DQGMRES is derived by using a QR factorization rather than an LU factorization.

4.1.1 GMRES

GMRES [11] is an oblique projection method based on taking Lm = AKm, in which Km is the m-th Krylov subspace with v1 = r0/||r0||2. We construct a basis for Km by using Arnoldi's method; the resulting (oblique) projection then satisfies Proposition 3. If the approximate solution is xm = x0 + Vm ym, then

b - Axm = b - A(x0 + Vm ym) = r0 - AVm ym = βv1 - Vm+1 H̄m ym = Vm+1(βe1 - H̄m ym).

40.

(48) So the next guess xm should satisfy min kb − Axk2 = kb − Axm k2. x∈x0 +Km. ¯ m ym )k2 = kVm+1 (βe1 − H ¯ m ym k2 , = kβe1 − H and the claim is that solving the least squares problem is inexpensive to compute. By the above relations, this gives the following algorithm. ALGORITHM 13. GMRES 1. r0 = b − Ax0 , β = kr0 k2 , and v1 = r0 /β ¯m = 0 2. Set (m + 1) × m matrix H 3. For j = 1, 2, . . . , m Do: 4. wj = Avj 5. For i = 1, . . . , j Do: 6. hij = (wj , vi ) 7. wj = wj − hij vi 8. EndDo 9. hj+1,j = kwj k2 . If hj+1,j = 0 then set m = j and Goto 12 10. vj+1 = wj /hj+1,j 11. EndDo ¯ m yk2 and xm = x0 + Vm ym . 12. Compute ym the minimizer of kβe1 − H. 4.1.2. QGMRES. The same observation applies to QGMRES, we can use the same technique to derive an incomplete version of the GMRES algorithm, Quasi-GMRES (QGMRES). The algorithm of QGMRES is follow: ALGORITHM 14. QGMRES 1. r0 = b − Ax0 , β = kr0 k2 , and v1 = r0 /β ¯m = 0 2. Set (m + 1) × m matrix H 3. For j = 1, 2, . . . , m Do: 4. w = Avj 5. For i = max{1, j − k + 1}, . . . , j Do: 6. hij = (w, vi ) 7. w = w − hij vi 8. EndDo 9. hj+1,j = kwk2 and vj+1 = w/hj+1,j 10. EndDo ¯ m yk2 and xm = x0 + Vm ym . 11. Compute ym the minimizer of kβe1 − H 41.

(49) 4.1.3. DQGMRES. But before we come on to introduce DQGMRES, let us pause here to look briefly at transforming the Hessenberg matrix into upper triangular by using Givens rotation. In the first, we define the rotation matrices  1 ..  .   1   ci si  Ωi =  −si ci   1   ..  ..        ← row i   ← row i + 1     1. with c2i + s2i = 1. If m = 4, the  h11  h21  ¯4 =  H   ¯ 4 by Then premultiply H. sample example, we would   h12 h13 h14  h22 h23 h24     h32 h33 h34   , g¯0 =    h43 h44 h54. have  β 0   0   = βe1 . 0  0. . . c1 s1  −s1 c1  1 Ω1 =    1.      1. with s1 = p. h21 h211. +. h221. , c1 = p. h11 h211. + h221. ¯ 4(1) and right-hand side to obtain the matrix H    (1) ¯ H4 =   . (1). h11. (1). (1). h12 h13 (1) (1) h22 h23 h32 h33 h43. (1). h14 (1) h24 h34 h44 h54. 42. . .      , g¯1 =     . c1 β −s1 β 0 0 0.    .  .

(50) ¯4 Be continued until the 4-th rotation is applied by the same way, which transform the H into one involving the matrix and right-hand side,  (4) (4) (4) (4)   h11 h12 h13 h14 γ1  (4)  (4) (4)  h h h  24  23 22  γ2  (4) (4) (4)  ¯ , g¯4 =  H4 =  h33 h34   γ3    γ4 (4)  h44  γ5 0.    .  . By the above example, in general the scalars ci and si of the ith rotation Ωi are defined as (i−1). hi+1,i. si = q. (i−1) 2 ). (hii. + h2i+1,i. hii. , ci = q. (i−1) 2 ). (hii. .. (4.1). + h2i+1,i. Define Qm the product of matrices Ωi , Qm = Ωm Ωm−1 . . . Ω1 and (m) ¯m = H ¯m ¯ m, R = Qm H. g¯m = Qm (βe1 ) = (γ1 , . . . , γm+1 )T . Then ¯ m yk2 = min k¯ ¯ m yk2 . min kβe1 − H gm − R y. y. ¯m Proposition 10. Let Ωi , i = 1, . . . , m be the rotation matrices used to transform H ¯ m , g¯m = (γ1 , . . . , γm+1 )T the resulting matrix and into an upper triangular form and R ¯ m by right-hand side. Denote by Rm the m × m upper triangular matrix obtained from R deleting its last row and by gm the m-dimensional vector obtained from g¯m by deleting its last component. Then, 1. The rank of AVm is equal to the rank of Rm . In particular, if rmm = 0 then A must be singular. ¯ m yk2 is given by 2. The vector ym which minimizes kβe1 − H −1 gm . ym = Rm. 43.
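Putting the pieces of this section together, the following C sketch implements a dense, unrestarted GMRES in which the small least squares problem of line 12 of Algorithm 13 is solved by applying the Givens rotations (4.1) progressively and then back-substituting, as in Proposition 10; the storage conventions, the absence of restarting and preconditioning, and the names are assumptions made only for this illustration.

#include <math.h>
#include <string.h>

static void matvec(int n, const double *A, const double *x, double *y)
{
    for (int i = 0; i < n; i++) {
        double s = 0.0;
        for (int j = 0; j < n; j++) s += A[i*n + j] * x[j];
        y[i] = s;
    }
}

static double dot(int n, const double *x, const double *y)
{
    double s = 0.0;
    for (int i = 0; i < n; i++) s += x[i] * y[i];
    return s;
}

/* GMRES (Algorithm 13) for a dense A (row-major), without restarting.
   Work storage: V is n*(m+1) column-major, H is (m+1)*m column-major, g has length
   m+1, cs and sn have length m (rotation cosines and sines), y has length m, and w
   has length n.  x holds the initial guess on entry and the approximation on exit.
   Returns the number of Arnoldi steps taken. */
int gmres(int n, const double *A, const double *b, double *x, int m, double tol,
          double *V, double *H, double *g, double *cs, double *sn, double *y, double *w)
{
    matvec(n, A, x, w);
    for (int i = 0; i < n; i++) w[i] = b[i] - w[i];        /* r0 = b - A x0 */
    double beta = sqrt(dot(n, w, w));
    if (beta == 0.0) return 0;
    for (int i = 0; i < n; i++) V[i] = w[i] / beta;        /* v1 = r0/beta  */
    memset(H, 0, (size_t)(m + 1) * m * sizeof(double));
    memset(g, 0, (size_t)(m + 1) * sizeof(double));
    g[0] = beta;

    int k = 0;
    while (k < m) {
        matvec(n, A, V + k*n, w);                          /* Arnoldi step (MGS)       */
        for (int i = 0; i <= k; i++) {
            double hik = dot(n, w, V + i*n);
            H[i + k*(m+1)] = hik;
            for (int t = 0; t < n; t++) w[t] -= hik * V[i*n + t];
        }
        double h = sqrt(dot(n, w, w));
        H[(k+1) + k*(m+1)] = h;
        if (h != 0.0)
            for (int t = 0; t < n; t++) V[(k+1)*n + t] = w[t] / h;

        for (int i = 0; i < k; i++) {                      /* apply previous rotations */
            double t1 = H[i + k*(m+1)], t2 = H[(i+1) + k*(m+1)];
            H[i + k*(m+1)]     =  cs[i] * t1 + sn[i] * t2;
            H[(i+1) + k*(m+1)] = -sn[i] * t1 + cs[i] * t2;
        }
        double hkk = H[k + k*(m+1)], hk1 = H[(k+1) + k*(m+1)];
        double d = sqrt(hkk*hkk + hk1*hk1);                /* new rotation, cf. (4.1)  */
        cs[k] = hkk / d;
        sn[k] = hk1 / d;
        H[k + k*(m+1)] = d;
        H[(k+1) + k*(m+1)] = 0.0;
        g[k+1] = -sn[k] * g[k];
        g[k]   =  cs[k] * g[k];
        k++;
        if (fabs(g[k]) <= tol || h == 0.0) break;          /* |g_{k+1}| = residual norm */
    }
    for (int i = k - 1; i >= 0; i--) {                     /* back substitution R y = g */
        double s = g[i];
        for (int j = i + 1; j < k; j++) s -= H[i + j*(m+1)] * y[j];
        y[i] = s / H[i + i*(m+1)];
    }
    for (int t = 0; t < n; t++)                            /* x = x0 + V_k y            */
        for (int j = 0; j < k; j++) x[t] += V[j*n + t] * y[j];
    return k;
}

The quantity |g_{k+1}| obtained from the rotated right-hand side gives the residual norm of the current iterate without forming it explicitly, which is what the stopping test above uses.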
