Chapter 3 Introduction to Conjugate Gradient (CG) Algorithm

3.1.2 Property of the Projection Method

We will show that the orthogonal projection solution minimizes the error between the desired solution and the approximate solution, as in [9]. Let P be the orthogonal projector onto a subspace K, x the desired vector, and y an arbitrary vector in the subspace K. Because of the orthogonality between x - Px and Px - y, we have

||x - y||_2^2 = ||x - Px||_2^2 + ||Px - y||_2^2 >= ||x - Px||_2^2   (3.8)

so y = Px minimizes the 2-norm error over K. If A is a symmetric and positive definite matrix, we can derive the similar result that the A-orthogonal projection minimizes the A-norm error between x and y. For the minimizing y, we have

(A(x - y), q) = 0, for all q in K   (3.9)

Because Ax = b, Equation (3.9) can be rewritten as

(b - Ay, q) = 0, for all q in K   (3.10)

This is called the Galerkin condition, which defines an orthogonal projection [9].

Let A be an arbitrary matrix, and L = AK. The oblique projection onto K and orthogonal to L minimizes the 2-norm of the residual vector r = b - Ay. The derivation is similar to that of the orthogonal projection. Then we have

(b - Ay, w) = 0, for all w in L   (3.11)

This is called the Petrov-Galerkin condition, which defines an oblique projection [9].
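The Pythagorean split in Equation (3.8) can be checked numerically. The following sketch (plain Python; K is taken as the coordinate plane spanned by e1 and e2 in R^3, a choice made here only for illustration) verifies that the projection Px minimizes the 2-norm error over K:

```python
# Verify ||x - y||^2 = ||x - Px||^2 + ||Px - y||^2 for y in K,
# where K = span{e1, e2} in R^3 and P zeroes the third coordinate.

def norm_sq(v):
    return sum(c * c for c in v)

def sub(u, v):
    return [a - b for a, b in zip(u, v)]

x = [3.0, -1.0, 2.0]
Px = [x[0], x[1], 0.0]          # orthogonal projection of x onto K

for y12 in [(-2.0, 5.0), (0.0, 0.0), (3.0, -1.0), (7.5, 2.25)]:
    y = [y12[0], y12[1], 0.0]   # arbitrary vectors in K
    lhs = norm_sq(sub(x, y))
    rhs = norm_sq(sub(x, Px)) + norm_sq(sub(Px, y))
    assert abs(lhs - rhs) < 1e-12          # Pythagorean identity (3.8)
    assert lhs >= norm_sq(sub(x, Px))      # Px minimizes the 2-norm error
```

The same check with the A-inner product (u, v)_A = (Au, v) would illustrate the Galerkin condition (3.9) for a symmetric positive definite A.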

3.2 Krylov Subspace

The Krylov subspace is a subspace of the form [8], [9], [10]

K_m(A, r) = span{ r, Ar, A^2 r, ..., A^(m-1) r }   (3.12)

By this definition, K_m(A, r) is the subspace of all vectors that can be written as p(A)r, where p is a polynomial of degree not exceeding m - 1, and r = b - Ax^ is the residual of the system. We will show that iterative methods are located in the Krylov subspace. To solve Ax = b, we may first solve a simplified approximate system T x_0 = b, where T is an approximation to A, and take its solution x_0 as the initial approximation. We may correct the approximation x_0 with a correction δ_0, so

A (x_0 + δ_0) = b   (3.13)

This can be seen as solving the residual system

A δ_0 = b - A x_0   (3.14)

We may solve Equation (3.14) by a simplified approximate system

T δ_0 = b - A x_0 = r_0   (3.15)

By setting T = I, the correction becomes δ_0 = r_0 and the updated solution is

x_1 = x_0 + r_0   (3.16)

Multiplying Equation (3.16) by -A and adding b, we have

r_1 = b - A x_1 = (I - A) r_0   (3.17)

Correcting the approximate solution with the same process with respect to x_1 and repeating, after m corrections the residual becomes

r_m = (I - A)^m r_0 = p_m(A) r_0   (3.18)

Equation (3.18) shows that the residual of the system depends on how well the polynomial p_m damps the initial error. By Equation (3.14), the i-th approximate solution x_i can be expressed as

x_i = x_0 + r_0 + r_1 + ... + r_(i-1), which lies in x_0 + span{ r_0, A r_0, ..., A^(i-1) r_0 } = x_0 + K_i(A, r_0)   (3.22)
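The correction scheme above can be checked numerically. The sketch below (plain Python; the 2x2 symmetric positive definite system is chosen only for illustration) runs the T = I correction and confirms that the residual after m steps equals (I - A)^m r_0, as in Equation (3.18):

```python
# Richardson-type iteration x_{k+1} = x_k + r_k with r_k = b - A x_k.
# For a small system, check r_m = (I - A)^m r0 term by term.

def matvec(A, v):
    return [sum(A[i][j] * v[j] for j in range(len(v))) for i in range(len(A))]

def sub(u, v):
    return [a - b for a, b in zip(u, v)]

A = [[0.9, 0.2],
     [0.2, 0.7]]          # eigenvalues inside (0, 2), so I - A is contractive
b = [1.0, 1.0]

x = [0.0, 0.0]
r = sub(b, matvec(A, x))  # r0
r_poly = r[:]             # will hold (I - A)^m r0
for _ in range(20):
    x = [xi + ri for xi, ri in zip(x, r)]       # x_{k+1} = x_k + r_k
    r = sub(b, matvec(A, x))                    # recompute residual
    r_poly = sub(r_poly, matvec(A, r_poly))     # apply (I - A) once more

for rc, rp in zip(r, r_poly):
    assert abs(rc - rp) < 1e-12                 # r_m == (I - A)^m r0
assert max(abs(c) for c in r) < 1e-6            # the polynomial damps the error
```

The convergence here relies on the eigenvalues of I - A having magnitude below one; Krylov subspace methods replace the fixed polynomial (1 - t)^m by a better one chosen from the subspace.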

The aforementioned discussion shows that iterative methods are located in the Krylov subspace. Combined with different choices of the subspace L (see Section 3.1), different projection methods can be obtained, such as orthogonal or oblique projection, and different kinds of iterative techniques have been derived. They have different convergence rates. One should choose the best iterative method on a case-by-case basis. Usually, the characteristic of A plays an important role in choosing the appropriate iterative method. Choosing an appropriate method can bring significant improvement in the convergence rate and the complexity.

3.3 Krylov Subspace Methods

There are many kinds of Krylov subspace methods, and we focus on the predecessors of the CG method and the CG method itself. We will show the evolution from the basic projection: the Arnoldi's method, and then derive other simplified methods: the symmetric Lanczos algorithm and the CG algorithm.

3.3.1 Arnoldi's Algorithm

Arnoldi's algorithm is a basic orthogonal projection method. This scheme was first introduced in 1951 by Arnoldi. It builds an orthogonal basis of the Krylov subspace and finds an approximate solution on the Krylov subspace by orthogonal projection. The basic Arnoldi's algorithm can be found in [9].

Algorithm 3.1 Arnoldi's Algorithm
Choose a vector p_1 with ||p_1||_2 = 1
for j = 1 ~ m
  h_ij = (A p_j, p_i), i = 1, ..., j
  w_j = A p_j - sum_{i=1..j} h_ij p_i
  h_{j+1,j} = ||w_j||_2; if h_{j+1,j} = 0, stop
  p_{j+1} = w_j / h_{j+1,j}
end
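The Gram-Schmidt process above can be sketched compactly (plain Python; the 3x3 matrix and starting vector are arbitrary choices for illustration), checking that the generated basis is orthonormal:

```python
# Arnoldi iteration: build an orthonormal basis p_1..p_{m+1} of the Krylov
# subspace by Gram-Schmidt, collecting the Hessenberg coefficients h_ij.

def matvec(A, v):
    return [sum(A[i][j] * v[j] for j in range(len(v))) for i in range(len(A))]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def arnoldi(A, p1, m):
    P, H = [p1], {}
    for j in range(m):
        w = matvec(A, P[j])
        for i in range(j + 1):                 # orthogonalize against p_1..p_{j+1}
            H[(i, j)] = dot(w, P[i])
            w = [wc - H[(i, j)] * pc for wc, pc in zip(w, P[i])]
        H[(j + 1, j)] = dot(w, w) ** 0.5
        if H[(j + 1, j)] < 1e-14:              # breakdown: Krylov space exhausted
            break
        P.append([wc / H[(j + 1, j)] for wc in w])
    return P, H

A = [[4.0, 1.0, 0.0],
     [1.0, 3.0, 1.0],
     [0.0, 1.0, 2.0]]
P, H = arnoldi(A, [1.0, 0.0, 0.0], 2)

for i in range(len(P)):
    for j in range(len(P)):
        want = 1.0 if i == j else 0.0
        assert abs(dot(P[i], P[j]) - want) < 1e-10   # orthonormal basis
```

The dictionary H holds the Hessenberg coefficients h_ij that form the matrix H_m of the matrix-form relation derived next.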

The above process builds an orthogonal basis by a Gram-Schmidt process. The above algorithm can be rewritten in the matrix form as

A P_m = P_m H_m + w_m e_m^T   (3.24)

where P_m = [p_1, p_2, ..., p_m] collects the orthogonal basis of the Krylov subspace and H_m = {h_ij} is a Hessenberg matrix, obtained by deleting the last row of the extended (m+1) x m Hessenberg matrix of recurrence coefficients. The above process produces an orthogonal basis of the Krylov subspace. By Equations (3.4) and (3.5), orthogonal projection means that the subspace L is the same as K. We have

P_m^T (b - A x_m) = 0   (3.26)
x_m = x_0 + P_m y_m   (3.27)

Combining Equations (3.26) and (3.27), we have the equation for the orthogonal projection onto the Krylov subspace as

y_m = H_m^-1 ( ||r_0||_2 e_1 )   (3.28)
x_m = x_0 + P_m H_m^-1 ( ||r_0||_2 e_1 )   (3.29)

3.3.2 Krylov Subspace Methods Based on Arnoldi's Algorithm

A method called the full orthogonalization method (FOM) searches the orthogonal basis of the Krylov subspace by the Arnoldi's algorithm and finds the approximate solution by Equation (3.28). There are some modified methods that have lower complexity than the FOM method. Restarted FOM restarts the Arnoldi's algorithm periodically. The incomplete orthogonalization method (IOM) truncates the orthogonalization of the original Arnoldi's algorithm: each new basis vector is made orthogonal only to several bases that have already been found.

Algorithm 3.2 IOM Algorithm
Run Algorithm 3.1, but orthogonalize each new vector only against the t most recent basis vectors, i.e., compute h_ij only for i = max(1, j - t + 1), ..., j

The direct incomplete orthogonalization method (DIOM) derived from IOM is a progressive method for computing the approximate solution. Based on the above algorithm, the Hessenberg matrix H_m in Equation (3.24) will be a band matrix with upper bandwidth equal to t - 1 and lower bandwidth equal to 1. Take the LU factorization of this matrix,

H_m = L_m U_m   (3.30)

Because H_m is a Hessenberg band matrix with total bandwidth equal to t + 1, its LU factorization has the form in which the lower triangular matrix L_m is a unit band lower triangular matrix with lower bandwidth equal to 1, and the upper triangular matrix U_m has upper bandwidth equal to t - 1.

Then Equation (3.28) can be written as

x_m = x_0 + P_m U_m^-1 L_m^-1 ( ||r_0||_2 e_1 )   (3.31)

The above equation can be rewritten as

x_m = x_0 + G_m c_m, where G_m = P_m U_m^-1 and c_m = L_m^-1 ( ||r_0||_2 e_1 )   (3.32)

By the definition of c_m, the first m - 1 entries of c_m are those of c_{m-1}, and only the last entry ζ_m is new. By Equation (3.32), we have the iterative equation as

x_m = x_{m-1} + ζ_m g_m   (3.36)

where g_m is the last column of G_m. In the FOM and IOM algorithms, we require the whole orthogonal basis to solve for the approximate solution. By Equation (3.36), we have a progressive method to solve for x_m, which solves the projection problem iteratively. Finally we have the DIOM algorithm, which is mathematically identical to the IOM algorithm, but in a progressive version.

Algorithm 3.3 DIOM Algorithm
Choose a vector p_1 with ||p_1||_2 = 1

3.3.3 Symmetric Lanczos Algorithm

The symmetric Lanczos algorithm is a simplified Arnoldi's method for the case in which the matrix is symmetric. When solving Ax = b under the assumption that A is a symmetric matrix, the Hessenberg matrix H_m in Equation (3.24) is also symmetric, hence it is a tridiagonal matrix. We can reduce the computational complexity by this characteristic. A three-term recurrence equation can be found based on the Arnoldi's algorithm. The Hessenberg matrix H_m in Equation (3.24) should have the structure as follows: a symmetric tridiagonal matrix with diagonal entries a_1, ..., a_m and off-diagonal entries b_2, ..., b_m,

H_m = tridiag( b_j, a_j, b_{j+1} )   (3.38)

Then the Arnoldi's algorithm can be simplified to the Lanczos algorithm as in [8].

Algorithm 3.4 Lanczos method
Choose a vector p_1 with ||p_1||_2 = 1; set b_1 = 0, p_0 = 0
for j = 1 ~ m
  w_j = A p_j - b_j p_{j-1}
  a_j = (w_j, p_j)
  w_j = w_j - a_j p_j
  b_{j+1} = ||w_j||_2; if b_{j+1} = 0, stop
  p_{j+1} = w_j / b_{j+1}
end

Then we can find the orthogonal basis of the Krylov subspace by the Lanczos algorithm, and find the approximate solution by Equation (3.28), if A is symmetric. This process requires fewer computations than the Arnoldi's method.
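The three-term recurrence of Algorithm 3.4 can be transcribed directly (plain Python; the symmetric test matrix is an arbitrary illustration). Each new basis vector is orthogonalized only against the previous two, yet the whole basis comes out orthonormal:

```python
# Lanczos three-term recurrence for a symmetric matrix A.

def matvec(A, v):
    return [sum(A[i][j] * v[j] for j in range(len(v))) for i in range(len(A))]

def dot(u, v):
    return sum(x * y for x, y in zip(u, v))

A = [[4.0, 1.0, 0.0],
     [1.0, 3.0, 1.0],
     [0.0, 1.0, 2.0]]          # symmetric

p_prev = [0.0, 0.0, 0.0]       # p_0 = 0
p = [1.0, 0.0, 0.0]            # p_1, unit norm
b = 0.0                        # b_1 = 0
P, a_coef, b_coef = [p], [], []
for _ in range(2):
    w = [wc - b * pc for wc, pc in zip(matvec(A, p), p_prev)]  # w = A p_j - b_j p_{j-1}
    a = dot(w, p)                                              # a_j
    w = [wc - a * pc for wc, pc in zip(w, p)]
    b = dot(w, w) ** 0.5                                       # b_{j+1}
    p_prev, p = p, [wc / b for wc in w]
    P.append(p)
    a_coef.append(a)
    b_coef.append(b)

for i in range(3):
    for j in range(3):
        want = 1.0 if i == j else 0.0
        assert abs(dot(P[i], P[j]) - want) < 1e-10   # orthonormal by 3-term recurrence
assert abs(dot(matvec(A, P[0]), P[2])) < 1e-10        # H_m is tridiagonal: h_31 = 0
```

The collected a_coef and b_coef are exactly the diagonal and off-diagonal entries of the tridiagonal H_m in Equation (3.38).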

3.3.4 Conjugate Gradient Method

Like the FOM algorithm, under the assumption that A is symmetric, we can build an orthogonal basis based on the Lanczos algorithm. Then we can use Equation (3.28) to find the orthogonal projection onto the Krylov subspace, which is the desired approximate solution.

An algorithm similar to the DIOM algorithm can be derived. It is called the D-Lanczos algorithm. Because the Hessenberg matrix H_m is a tridiagonal matrix, the LU factorization in Equation (3.30) can be written with a unit lower bidiagonal factor L_m and an upper bidiagonal factor U_m. By Equation (3.38), Equation (3.32) can be simplified: the m-th column of the relation G_m U_m = P_m reads

h_mm g_m + o_m g_{m-1} = p_m   (3.40)

where h_mm and o_m are the two nonzero entries of the m-th column of U_m. Equation (3.40) can be rewritten as

g_m = (1 / h_mm) ( p_m - o_m g_{m-1} )   (3.41)

Then we have the D-Lanczos algorithm by replacing the equation for computing g_m in the DIOM algorithm (Algorithm 3.3) with Equation (3.41). The approximate solution is iteratively found by Equation (3.36), where g_m is called the search direction vector. The CG method can be derived from the D-Lanczos algorithm by two properties. The first is that the residual vectors are orthogonal to each other, and the second is that the search direction vectors g_m are A-conjugate, i.e., (A g_i, g_j) = 0 for all i ≠ j. The update coefficients of the D-Lanczos recurrences can be found from the aforementioned two properties, yielding the coefficients α_j and β_j below. Finally, we have the CG algorithm, which is one of the best known iterative techniques for solving symmetric positive definite (S.P.D.) systems.

Algorithm 3.5 Conjugate gradient method
r_0 = b - A x_0, g_0 = r_0
for j = 0 ~ convergence
  α_j = (r_j, r_j) / (A g_j, g_j)
  x_{j+1} = x_j + α_j g_j
  r_{j+1} = r_j - α_j A g_j
  β_j = (r_{j+1}, r_{j+1}) / (r_j, r_j)
  g_{j+1} = r_{j+1} + β_j g_j
end
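A direct transcription of Algorithm 3.5 (plain Python; the small S.P.D. test system is an arbitrary illustration):

```python
# Conjugate gradient method for A x = b with A symmetric positive definite.

def matvec(A, v):
    return [sum(A[i][j] * v[j] for j in range(len(v))) for i in range(len(A))]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def cg(A, b, x0, tol=1e-12, max_iter=100):
    x = x0[:]
    r = [bc - ac for bc, ac in zip(b, matvec(A, x))]   # r0 = b - A x0
    g = r[:]                                           # g0 = r0
    rr = dot(r, r)
    for _ in range(max_iter):
        if rr < tol * tol:
            break
        Ag = matvec(A, g)
        alpha = rr / dot(Ag, g)                        # (r_j, r_j) / (A g_j, g_j)
        x = [xc + alpha * gc for xc, gc in zip(x, g)]
        r = [rc - alpha * ac for rc, ac in zip(r, Ag)]
        rr_new = dot(r, r)
        beta = rr_new / rr                             # (r_{j+1}, r_{j+1}) / (r_j, r_j)
        g = [rc + beta * gc for rc, gc in zip(r, g)]
        rr = rr_new
    return x

A = [[4.0, 1.0, 0.0],
     [1.0, 3.0, 1.0],
     [0.0, 1.0, 2.0]]
b = [1.0, 2.0, 3.0]
x = cg(A, b, [0.0, 0.0, 0.0])
res = [bc - ac for bc, ac in zip(b, matvec(A, x))]
assert max(abs(c) for c in res) < 1e-9    # converges in at most n steps (exact arithmetic)
```

Note that each iteration needs only one matrix-vector product and a few vector operations; this is what makes CG attractive for the sparse banded systems of Chapter 4.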

3.5 Summary

In this chapter, we first introduce the concept of projection and derive the CG algorithm from basic projection theory. The CG algorithm is one of the best known iterative techniques for solving a symmetric positive definite (S.P.D.) system. We will use the PCG algorithm for solving the matrix inverse problem in the MMSE equalizer in the next chapter.

Chapter 4

Proposed Low-Complexity Frequency Domain Equalizer

The frequency domain equalizer schemes are introduced briefly in Chapter 2. In this chapter, the band channel approximation based on the previous analysis is shown in the first place. By this approximation, some techniques have been proposed to reduce the complexity of different equalizers, as introduced in Section 4.2. In addition, an MMSE equalizer based on the CG method with optimal preconditioning is proposed. We then compare the complexity of this scheme with some other methods. Finally, performance simulations are shown in Section 4.5.

4.1 Band Channel Approximation

The magnitude of the frequency domain channel matrix is shown in Figure 4.1. The channel model is the Jakes model and the normalized Doppler spread equals 0.1. It is shown that the most significant coefficients are those on the central band and the edges of the matrix, which is similar to the analysis of the channel in Chapter 2. In order to reduce the computational complexity, the smaller coefficients are ignored and only the significant coefficients are dealt with. Although there are some losses in the BER performance, the computational complexity of mobile OFDM systems can be reduced greatly.

The frequency domain channel can be approximated as in Figure 4.2, [5], [6]. We can take account of only the coefficients in the shaded region and ignore the other coefficients. Then a frequency domain channel matrix with bandwidth Q, as shown in Figure 4.2, is processed. A time-domain technique discussed in [6] can enhance this approximation.

Figure 4.1: Amplitude of frequency domain channel matrix in Jakes model

Figure 4.2: Structure of approximate frequency domain channel
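The banding step above can be sketched as follows (plain Python; the synthetic "channel" with cyclically decaying off-diagonal magnitudes is an illustration, not the Jakes-model matrix of Figure 4.1):

```python
# Band channel approximation: keep only entries within cyclic distance Q
# of the main diagonal (this includes the matrix corners, as in Figure 4.2)
# and check how much of the matrix energy the band retains.

N, Q = 16, 2

def cyclic_dist(i, j, n):
    d = abs(i - j)
    return min(d, n - d)

# Synthetic channel-like matrix: magnitude decays with cyclic distance.
H = [[1.0 / (1.0 + 3.0 * cyclic_dist(i, j, N)) for j in range(N)] for i in range(N)]

H_band = [[H[i][j] if cyclic_dist(i, j, N) <= Q else 0.0 for j in range(N)]
          for i in range(N)]

energy = sum(H[i][j] ** 2 for i in range(N) for j in range(N))
band_energy = sum(H_band[i][j] ** 2 for i in range(N) for j in range(N))
ratio = band_energy / energy
assert ratio > 0.9   # the band keeps most of the energy for this decay rate
```

The corners of H_band are nonzero because the band is measured in cyclic distance, matching the edge coefficients visible in Figure 4.1.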

4.2 Existing Low-Complexity Frequency Domain Equalizers  

Two important low-complexity frequency domain equalizers will be introduced, which are proposed in [4] and [6]. The main idea behind both of them is the band channel approximation. We adopt the mobile OFDM signal model introduced in Chapter 2, Equation (2.17), and ignore the superscript (i), giving

y = F r = F H x + F η = F H F^H s + F η = A s + z   (4.1)

A linear minimum mean square error (LMMSE) equalizer can be used to equalize the received signal. The weight computations are based on

W_mmse = arg min_W E{ || W^H y - s ||^2 }   (4.2)

It can be easily derived that the optimum weights in the above equation are

W_mmse = ( A A^H + (1/SNR) R_zz )^-1 A   (4.3)

where A = F H F^H is the equivalent channel in the frequency domain as shown in Equation (4.1), and R_zz is the autocorrelation matrix of the noise. The equalized signal can be written as d = W_mmse^H y, and the receiver then makes decisions based on this equalized signal. In Equation (4.3), an N x N matrix inversion is required. It requires O(N^3) computations, which is too expensive to be realized for a large N. One should apply a low-complexity algorithm to solve this problem.

Based on the idea that the ICI comes mainly from neighboring subcarriers, Xiaodong Cai and Georgios B. Giannakis proposed a low-complexity LMMSE equalizer in [4]. Assuming that s_i is the desired signal to be solved, it suffices to take only 2Q+1 rows of the matrix A for computing the LMMSE weight vector. It means we are only concerned with the ICI coming from the 2Q neighboring subcarriers, and ignore the ICI produced by the subcarriers outside the 2Q neighborhood, as shown in Figure 4.3. Because the significant parts of the ICI come from the neighboring subcarriers, this assumption is meaningful.

Therefore the equation for computing the LMMSE weights from Equations (4.3) and (4.4) can be written as

w_i = ( A_i A_i^H + (1/SNR) R_z,i )^-1 a_i   (4.5)

where A_i is the part of the original channel matrix A formed by the 2Q+1 rows around subcarrier i, a_i is the corresponding column of A_i, and R_z,i is the corresponding part of the autocorrelation matrix. This technique can be seen as partitioning a large system into several small systems, which can be easily solved. Note that for the last 2Q rows of the matrix, the band wraps around cyclically, following the structure in Figure 4.2.

Figure 4.3: Banded LMMSE equalizer (proposed by Xiaodong Cai and Georgios B. Giannakis)

Another similar approach is proposed by Philip Schniter in [6]. We call this scheme the Partial MMSE equalizer for simplicity. This method applies the band channel approximation as described in Section 4.1. Assume that we want to retrieve the signal s_i. We define A_i' as the (2Q+1) x (4Q+1) submatrix of A around subcarrier i, to which the band channel approximation is applied. The computation of the LMMSE weights is similar to Equation (4.5), as follows

w_i = ( A_i' A_i'^H + (1/SNR) R_z,i )^-1 a_i'   (4.8)

Because the inversion in Equation (4.8) has a fixed size that does not grow with N, this scheme requires O(N) computations to solve the LMMSE problem.

Figure 4.4: Partial MMSE equalizer (proposed by Philip Schniter)

4.3 Proposed Preconditioned Conjugate Gradient (PCG) MMSE Equalizer

In this section, a low-complexity LMMSE equalizer using the preconditioned conjugate gradient algorithm for solving the matrix inversion problem is proposed. It will be shown that the complexity of this method is O(N), with performance similar to, but fewer computations than, the Partial MMSE equalizer.

4.3.1 Preconditioned Conjugate Gradient (PCG) Algorithm

One of the serious defects of iterative methods is the lack of robustness. CG works regularly if the system is well conditioned. Because CG is a projection technique onto the Krylov subspace K_m, which is a subspace of R^n, it will converge in at most n iterations. The convergence rate of CG is related to the condition number κ, which is defined as follows

κ = λ_max / λ_min   (4.9)

where λ_max and λ_min are the maximum and minimum eigenvalues of the matrix A. If the condition number is large, the CG algorithm will converge slowly. This characteristic limits the application of the CG algorithm. However, if some statistical characteristic of A is known, it can be utilized to achieve a faster convergence rate, and then the system will be more robust. This is the idea of preconditioning, which is a technique to improve the condition number of the system.

Assume that M is the precondition matrix. Then the basic precondition method is to solve the system M^-1 A x = M^-1 b instead of Ax = b. Therefore, the convergence rate depends on the condition number of the preconditioned system M^-1 A. If M is chosen appropriately, the condition number of M^-1 A can be smaller than that of the original matrix A. For this reason, solving the system M^-1 A x = M^-1 b will converge quickly.

There are some criteria for choosing the precondition matrix M, which are introduced in [8], [9], and [10]:
1. M is a good approximation to A in some sense.
2. The cost of the construction of M is not prohibitive.
3. The system Mx = b is much easier to solve than the original system.

We may choose an appropriate precondition matrix M according to the criteria above. However, it is not necessary to solve the problem M^-1 A x = M^-1 b explicitly; it only requires modifying the original CG algorithm into a preconditioned version. We will derive the PCG algorithm based on the CG algorithm introduced in Section 3.3. Some parts of this derivation can be found in [14].

Assuming M is a symmetric positive-definite matrix, the Cholesky factor of M is S; that is, M = S S^T. The matrix M^-1 A has the same eigenvalues as S^-1 A S^-T. The system Ax = b can be transformed to S^-1 A S^-T x' = S^-1 b, with x' = S^T x. The matrix S^-1 A S^-T is also a symmetric positive-definite matrix, so we can apply the CG method to solve the transformed system, as follows

r_0' = S^-1 b - S^-1 A S^-T x_0', g_0' = r_0'
for j = 0 ~ convergence
  (CG iterations of Algorithm 3.5 applied to the transformed system)
end

By expressing the iteration in terms of the original variables x_j = S^-T x_j', r_j = S r_j', and g_j = S^-T g_j', we obtain the PCG algorithm [9], [15].

Algorithm 4.1 Preconditioned Conjugate Gradient Method
r_0 = b - A x_0, z_0 = M^-1 r_0, g_0 = z_0
for j = 0 ~ convergence
  α_j = (r_j, z_j) / (A g_j, g_j)
  x_{j+1} = x_j + α_j g_j
  r_{j+1} = r_j - α_j A g_j
  z_{j+1} = M^-1 r_{j+1}
  β_j = (r_{j+1}, z_{j+1}) / (r_j, z_j)
  g_{j+1} = z_{j+1} + β_j g_j
end
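Algorithm 4.1 can be sketched as follows (plain Python; a simple diagonal Jacobi preconditioner stands in for the band preconditioner used later, purely for illustration):

```python
# Preconditioned CG: identical to CG except that the preconditioned residual
# z_j = M^{-1} r_j replaces r_j in the inner products and direction updates.

def matvec(A, v):
    return [sum(A[i][j] * v[j] for j in range(len(v))) for i in range(len(A))]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def pcg(A, b, x0, apply_Minv, tol=1e-12, max_iter=100):
    x = x0[:]
    r = [bc - ac for bc, ac in zip(b, matvec(A, x))]
    z = apply_Minv(r)                       # z0 = M^{-1} r0
    g = z[:]
    rz = dot(r, z)
    it = 0
    while dot(r, r) > tol * tol and it < max_iter:
        Ag = matvec(A, g)
        alpha = rz / dot(Ag, g)
        x = [xc + alpha * gc for xc, gc in zip(x, g)]
        r = [rc - alpha * ac for rc, ac in zip(r, Ag)]
        z = apply_Minv(r)                   # z_{j+1} = M^{-1} r_{j+1}
        rz_new = dot(r, z)
        beta = rz_new / rz
        g = [zc + beta * gc for zc, gc in zip(z, g)]
        rz = rz_new
        it += 1
    return x, it

A = [[10.0, 1.0, 0.0],
     [1.0, 8.0, 1.0],
     [0.0, 1.0, 6.0]]        # S.P.D. and diagonally dominant
b = [1.0, 2.0, 3.0]
jacobi = lambda r: [rc / A[i][i] for i, rc in enumerate(r)]   # M = diag(A)
x, iters = pcg(A, b, [0.0, 0.0, 0.0], jacobi)
res = [bc - ac for bc, ac in zip(b, matvec(A, x))]
assert max(abs(c) for c in res) < 1e-9
```

Only the application of M^-1 is ever needed, never M^-1 A explicitly, which is why criterion 3 above asks only that Mx = b be cheap to solve.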

4.3.2 PCG LMMSE Equalizer

The LMMSE equalizer for the mobile OFDM system was introduced in Section 4.2; the weights are calculated by Equation (4.3). The equalized signal can be written as

d = A^H ( A A^H + (1/SNR) R_zz )^-1 y

The equation above can be rewritten as the linear system

( A A^H + (1/SNR) R_zz ) e = y,  d = A^H e

Observing the equation above, an N x N matrix inversion is required. By the band channel approximation described in Section 4.1, the matrix to be inverted is a sparse symmetric positive definite matrix. The CG algorithm introduced in Chapter 3 is one of the best known iterative techniques for solving a symmetric positive definite problem, but it may suffer from a slow convergence rate. The PCG algorithm can be used to avoid this problem.

By observing the amplitude of the matrix A shown in Figure 4.1, it can be found that the most significant coefficients are those on the central band and the edges of the matrix. The matrix ( A A^H + (1/SNR) R_zz ), which is the matrix required to be inverted, still has a characteristic similar to that of the matrix A. It means that it is also a diagonally dominant system. Following the three criteria described in the previous section, we can choose some diagonals of the central band of the matrix ( A A^H + (1/SNR) R_zz ) as the precondition matrix M, which is shown in Figure 4.5. The three preconditioning criteria described in the previous section should be checked before preconditioning. First, because the matrix has its most significant values on its diagonals, this choice of the precondition matrix is a good approximation to the original matrix. Second, we can obtain the precondition matrix directly from the original matrix, so there is no extra cost in constructing the precondition matrix. Third, the system Mz = r can be easily solved by the band LDL factorization [7], [15] and the forward and backward substitutions, which have lower complexity than the inversion of a general matrix.

With the precondition matrix chosen above, the LMMSE system can be solved iteratively by the PCG method described in Section 4.3.1. The condition numbers of the original system and the preconditioned system are shown in Figures 4.6 and 4.7. It can be seen that the preconditioned system has a much smaller condition number than the original system, so the preconditioned system converges faster than the original system. The convergence rate and complexity analysis of this method are shown in Section 4.4 and Section 4.5. By these analyses, it is shown that this approach has lower complexity than the methods introduced in Section 4.2 but still has BER performance similar to those methods.
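The claim that Mz = r is cheap for a band matrix can be illustrated with the tridiagonal case (a plain-Python Thomas-algorithm sketch; the tridiagonal M here is a stand-in for the band precondition matrix, and the factorization-based band LDL solve of [7], [15] generalizes this to larger bandwidths):

```python
# Solving M z = r in O(N) for a tridiagonal, diagonally dominant M:
# one forward elimination sweep and one back substitution sweep.

def thomas_solve(lower, diag, upper, r):
    n = len(diag)
    c, d = [0.0] * n, [0.0] * n
    c[0] = upper[0] / diag[0]
    d[0] = r[0] / diag[0]
    for i in range(1, n):                       # forward elimination
        denom = diag[i] - lower[i] * c[i - 1]
        c[i] = (upper[i] / denom) if i < n - 1 else 0.0
        d[i] = (r[i] - lower[i] * d[i - 1]) / denom
    z = [0.0] * n
    z[-1] = d[-1]
    for i in range(n - 2, -1, -1):              # back substitution
        z[i] = d[i] - c[i] * z[i + 1]
    return z

# M with diagonal 4 and off-diagonals 1 (diagonally dominant, S.P.D.).
n = 6
lower = [0.0] + [1.0] * (n - 1)
diag = [4.0] * n
upper = [1.0] * (n - 1) + [0.0]
r = [1.0] * n
z = thomas_solve(lower, diag, upper, r)

# Check M z = r row by row.
for i in range(n):
    mz = diag[i] * z[i]
    if i > 0:
        mz += lower[i] * z[i - 1]
    if i < n - 1:
        mz += upper[i] * z[i + 1]
    assert abs(mz - r[i]) < 1e-12
```

Each PCG iteration in Algorithm 4.1 performs exactly one such solve, so the preconditioning overhead stays linear in N for a fixed bandwidth P.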

Figure 4.5: Structure of precondition matrix

Figure 4.6: Condition number of original system (condition number versus number of channel realizations)

Figure 4.7: Condition number of preconditioned system (condition number versus number of channel realizations)

4.3.3 Optimum Precondition Matrix

The central band of the matrix ( A A^H + (1/SNR) R_zz ) is chosen as the precondition matrix in the previous section. Because we must solve the problem Mz = r once per iteration in Algorithm 4.1, it leads to some overhead. Because the complexity of solving a band system grows quadratically with the bandwidth of the matrix, how large a bandwidth to choose is a trade-off problem. Choosing a larger bandwidth for the precondition matrix lets the system converge faster, but it requires more computations per iteration, so a subset of the central band may be chosen as the precondition matrix. By computer simulations and complexity analysis, we can obtain the optimum bandwidth of the precondition matrix that achieves the lowest complexity. We will discuss this in the following section.

4.4 Complexity Analysis

The LMMSE equalizer proposed by Xiaodong Cai and Georgios B. Giannakis requires O(N^2) flops, and the LMMSE method proposed by Philip Schniter, which is called the Partial MMSE method, requires O(N) flops. The PCG LMMSE method proposed in this thesis also requires computations that grow linearly with N. The complexity of the two O(N) methods, Partial MMSE and PCG MMSE, will be analyzed here.

A flop here is defined as a complex multiplication; N is the FFT size and P is the bandwidth of the precondition matrix. Table 4.1 shows the complexity of the PCG MMSE equalizer, and Table 4.2 shows the complexity of the Partial MMSE equalizer.

Note that the bandwidth of the precondition matrix affects not only the complexity per iteration but also the convergence rate. It is a trade-off how large a bandwidth should be chosen for achieving the lowest complexity. The optimum bandwidth of the precondition matrix can be obtained by simulations. We will discuss the actual computation required by these two methods, Partial MMSE and PCG MMSE, in the next section.

Table 4.1: Complexity analysis of PCG MMSE equalizer

Operation                     Complexity
AA'                           (2Q^2 + 2Q + 1) N flops
inv(AA' + δCxx) * r           (4Q + 6 + P^2/2 + 4P) x (number of iterations) N flops
A' * inv(AA' + δCxx) * r      (2Q + 1) N flops
Total                         { (4Q + 6 + P^2/2 + 4P) x (number of iterations) + 2Q^2 + 4Q + 2 } N flops

N: FFT size
Q: the bandwidth of the approximate channel
P: the bandwidth of the precondition matrix

Table 4.2: Complexity analysis of Partial MMSE equalizer

Q: the bandwidth of the approximate channel
N: FFT size

4.5 Computer Simulations

In this section, computer simulations are conducted to evaluate the performance of the OFDM system using the PCG LMMSE equalizer. Throughout the simulations, we only deal with discrete-time signal processing in the baseband; hence pulse-shaping and matched-filtering are removed from consideration for simplicity. Also, channel estimation and timing synchronization are assumed to be perfect. In the simulations, the relationship between SNR and E_s is used, where E_s is the symbol energy, T_s is the symbol duration, and B is the system
