Conjugate Gradient Method - Krylov Subspace Methods

Chapter 3 Introduction to Conjugate Gradient (CG) Algorithm

3.3 Krylov Subspace Methods

3.3.4 Conjugate Gradient Method


As with the FOM algorithm, under the assumption that A is symmetric we can build an orthogonal projection onto the Krylov subspace, which gives the desired approximate solution. In this case the Hessenberg matrix $H_m$ is a tridiagonal matrix.

By Equation (3.38), Equation (3.32) can be simplified to

(3.40)

Equation (3.39) can be rewritten as

$$\eta_m g_m + o_m g_{m-1} = p_m$$

$$g_m = \frac{1}{\eta_m}\left(p_m - o_m g_{m-1}\right) \qquad (3.41)$$

We then obtain the D-Lanczos algorithm by replacing the corresponding update equation in the DIOM algorithm (Algorithm 3.3) with Equation (3.40). Because the approximate solution is found iteratively along $g_m$, the vector $g_m$ is called the search direction vector. The CG method can be derived from the D-Lanczos algorithm by using two properties. The first is that the residual vectors are orthogonal to each other, and the second is that the search direction vectors $g_m$ are A-conjugate, i.e., $(Ag_i, g_j) = 0, \forall i \neq j$.

The coefficients $\xi_{m-1}$ and $\eta_m$ can be found from these two properties.

Finally, we have the CG algorithm, which is one of the best known iterative techniques for solving symmetric positive definite (SPD) systems.

Algorithm 3.5 Conjugate gradient method

$r_0 = b - Ax_0$, $g_0 = r_0$
for j = 0, 1, ... until convergence
  $\alpha_j = (r_j, r_j)/(Ag_j, g_j) = r_j^{T}r_j / g_j^{T}Ag_j$
  $x_{j+1} = x_j + \alpha_j g_j$
  $r_{j+1} = r_j - \alpha_j Ag_j$
  $\beta_j = (r_{j+1}, r_{j+1})/(r_j, r_j) = r_{j+1}^{T}r_{j+1} / r_j^{T}r_j$
  $g_{j+1} = r_{j+1} + \beta_j g_j$
end
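The steps of Algorithm 3.5 can be sketched in a few lines of NumPy. This is a minimal illustration for a real SPD matrix, not the implementation used in this work; the function name `conjugate_gradient`, the stopping tolerance `tol`, and the iteration cap `max_iter` are assumptions introduced here.

```python
import numpy as np

def conjugate_gradient(A, b, x0=None, tol=1e-10, max_iter=None):
    """Solve A x = b for a real symmetric positive definite A (sketch of Algorithm 3.5)."""
    n = b.shape[0]
    x = np.zeros(n) if x0 is None else np.array(x0, dtype=float)
    r = b - A @ x              # r_0 = b - A x_0
    g = r.copy()               # g_0 = r_0
    rr = r @ r                 # (r_j, r_j)
    max_iter = 10 * n if max_iter is None else max_iter
    for _ in range(max_iter):
        if np.sqrt(rr) <= tol:         # assumed stopping rule: small residual norm
            break
        Ag = A @ g
        alpha = rr / (g @ Ag)          # alpha_j = (r_j, r_j) / (A g_j, g_j)
        x = x + alpha * g              # x_{j+1} = x_j + alpha_j g_j
        r = r - alpha * Ag             # r_{j+1} = r_j - alpha_j A g_j
        rr_new = r @ r
        beta = rr_new / rr             # beta_j = (r_{j+1}, r_{j+1}) / (r_j, r_j)
        g = r + beta * g               # g_{j+1} = r_{j+1} + beta_j g_j
        rr = rr_new
    return x

# Small 2x2 SPD example
A = np.array([[4.0, 1.0], [1.0, 3.0]])
b = np.array([1.0, 2.0])
x = conjugate_gradient(A, b)
```

For an n×n SPD system the iteration terminates in at most n steps in exact arithmetic, which is why the 2×2 example needs only two updates.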

3.5 Summary

In this chapter, we first introduce the concept of projection and derive the CG algorithm from basic projection theory. The CG algorithm is one of the best known iterative techniques for solving a symmetric positive definite (SPD) system. We will use the PCG algorithm for solving the matrix inverse problem in the MMSE equalizer in the next chapter.

Chapter 4 Proposed Low-Complexity Frequency Domain Equalizer

The frequency domain equalizer schemes are introduced briefly in Chapter 2. In this chapter, the band channel approximation based on the previous analysis is shown first. Based on this approximation, some techniques have been proposed to reduce the complexity of different equalizers, as introduced in Section 4.2. In addition, an MMSE equalizer based on the CG method with optimal preconditioning is proposed. We then compare the complexity of this scheme with some other methods. Finally, performance simulations are shown in Section 4.5.

4.1 Band Channel Approximation

The magnitude of the frequency domain channel matrix is shown in Figure 4.1. The channel model is the Jakes model and the normalized Doppler spread equals 0.1. It is shown that the most significant coefficients are those on the central band and the edges of the matrix, which is similar to the analysis of the channel in Chapter 2. In order to reduce the computation complexity, the smaller coefficients are ignored and only the significant coefficients are dealt with. Although there are some losses in the BER performance, the computation complexity of mobile OFDM systems can be reduced greatly.

The frequency domain channel can be approximated as in Figure 4.2, [5], [6]. We take account only of the coefficients in the shaded region and ignore the other coefficients. Then a frequency domain channel matrix with bandwidth Q as shown in Figure 4.2 is processed. A time-domain technique discussed in [6] can enhance this approximation.

Figure 4.1: Amplitude of frequency domain channel matrix in Jakes model

Figure 4.2: Structure of approximate frequency domain channel
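The band approximation can be illustrated with a short NumPy sketch. It assumes that the "central band and the edges of the matrix" correspond to entries whose circular index distance is at most Q, which is one reading of the shaded region in Figure 4.2; the function name `band_approx` is introduced here for illustration only.

```python
import numpy as np

def band_approx(H, Q):
    """Keep entries of a square matrix within circular distance Q of the diagonal.

    Assumption: the retained region is the central band of half-width Q plus the
    wrap-around corners, i.e. min(|i - j|, N - |i - j|) <= Q.
    """
    N = H.shape[0]
    idx = np.arange(N)
    diff = np.abs(idx[:, None] - idx[None, :])
    circ = np.minimum(diff, N - diff)      # circular distance between row and column index
    return np.where(circ <= Q, H, 0)

# Example: a 6x6 all-ones matrix with Q = 1 keeps 2Q+1 = 3 entries per row
H = np.ones((6, 6))
H_band = band_approx(H, 1)
```

Each row keeps 2Q+1 coefficients, so the number of coefficients to process grows as (2Q+1)N instead of N².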

4.2 Existing Low-Complexity Frequency Domain Equalizers

Two important low-complexity frequency domain equalizers, proposed in [4] and [6], will be introduced. The main idea behind both of them is again the band channel approximation. We adopt the mobile OFDM signal model introduced in Chapter 2, Equation (2.17), and ignore the superscript $(i)$, giving

$$y = Fr = FHx + F\eta = FHF^{H}s + F\eta = As + z \qquad (4.1)$$

A linear minimum mean square error (LMMSE) equalizer can be used to equalize the received signal. The weight computations are based on

$$W_{mmse} = \arg\min_{W} E\{\|W^{H}y - s\|^{2}\} \qquad (4.2)$$

It can be easily derived that the optimum weights in the above equation are

$$W_{mmse} = \left(AA^{H} + \frac{1}{SNR}R_{zz}\right)^{-1}A \qquad (4.3)$$

where $A = FHF^{H}$ is the equivalent channel in the frequency domain as shown in Equation (4.1), and $R_{zz}$ is the autocorrelation matrix of the noise. The equalized signal can be written as $d = W_{mmse}^{H}y$, and the receiver then makes its decisions based on this equalized signal. In Equation (4.3), an N×N matrix inversion is required. It requires $O(N^{3})$ computations, which is too expensive to be realized for a large N.

One should apply a low-complexity algorithm to solve this problem.

Based on the idea that the ICI comes only from the neighboring subcarriers, Xiaodong Cai and Georgios B. Giannakis proposed a low-complexity LMMSE equalizer in [4]. Assuming that $s_i$ is the desired signal to be solved, only $2Q+1$ rows of the matrix are taken for computing the LMMSE weight vector. This means we are only concerned with the ICI coming from the 2Q neighboring subcarriers, and ignore the ICI produced by the subcarriers outside the 2Q neighborhood, as shown in Figure 4.3. Because the significant parts of the ICI come from the neighboring subcarriers, this assumption is meaningful.
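For reference, the full LMMSE solution of Equation (4.3), which the low-complexity schemes approximate, can be sketched directly. This is an illustration only: it assumes the equalized signal is formed as $d = W_{mmse}^{H}y$ (consistent with the cost in Equation (4.2)), uses a dense solve with $O(N^3)$ cost, and defaults to white noise ($R_{zz} = I$). The function name `lmmse_equalize` and the test channel are hypothetical.

```python
import numpy as np

def lmmse_equalize(A, y, snr, R_zz=None):
    """Return d = A^H (A A^H + R_zz / SNR)^{-1} y using a dense O(N^3) solve."""
    N = A.shape[0]
    if R_zz is None:
        R_zz = np.eye(N)                   # assumption: white noise
    B = A @ A.conj().T + R_zz / snr        # matrix to be inverted in Equation (4.3)
    e = np.linalg.solve(B, y)              # solve B e = y instead of forming B^{-1}
    return A.conj().T @ e

# Hypothetical noiseless example: at very high SNR the equalizer recovers s
rng = np.random.default_rng(0)
N = 8
A = 2 * np.eye(N) + 0.1 * (rng.standard_normal((N, N)) + 1j * rng.standard_normal((N, N)))
s = (rng.choice([-1, 1], N) + 1j * rng.choice([-1, 1], N)) / np.sqrt(2)   # QPSK symbols
y = A @ s
d = lmmse_equalize(A, y, snr=1e12)
```

Solving the linear system rather than forming the inverse explicitly is the same reformulation that later makes an iterative solver such as CG applicable.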

Therefore, the LMMSE weights can be computed from Equations (4.3) and (4.4) using only a part of the original matrix and the corresponding part $R_{z_i}$ of the noise autocorrelation function. This technique can be seen as partitioning a large system into several small systems, which can be easily solved.

Figure 4.3: LMMSE equalizer (proposed by Xiaodong Cai and Georgios B. Giannakis)

Another similar approach is proposed by Philip Schniter in [6]. We call this scheme the Partial MMSE equalizer for simplicity. This method applies the band channel approximation as described in Section 4.1. To retrieve one signal, a $(2Q+1) \times (4Q+1)$ part of the channel matrix is applied, and the computation of the LMMSE weights is similar to Equation (4.5), as given in Equation (4.8). Because the weights for each subcarrier are computed from a small system, the computations required to solve the LMMSE problem grow linearly with N (see Section 4.4).

Figure 4.4: Partial MMSE equalizer (proposed by Philip Schniter)

4.3 Proposed Preconditioned Conjugate Gradient (PCG) MMSE Equalizer

In this section, a low-complexity LMMSE equalizer that uses the preconditioned conjugate gradient algorithm to solve the matrix inversion problem is proposed. It will be shown that the complexity of this method is $O(N)$ and that it has similar performance to, but requires fewer computations than, the Partial MMSE equalizer.

4.3.1 Preconditioned Conjugate Gradient (PCG) Algorithm

One of the serious defects of iterative methods is the lack of robustness. CG works well if the system is well conditioned. Because CG is a projection technique onto the Krylov subspace $K_m$, which is a subspace of the n-dimensional space, it will converge in at most n iterations. The convergence rate of CG is related to the condition number $\kappa$, which is defined as follows

$$\kappa = \frac{\lambda_{max}}{\lambda_{min}} \qquad (4.9)$$

where $\lambda_{max}$ and $\lambda_{min}$ are the maximum and minimum eigenvalues of the matrix A. If the condition number is large, the CG algorithm will converge slowly. This characteristic limits the application of the CG algorithm. However, if some statistical characteristic of A is known, it can be utilized to achieve a faster convergence rate, and the system will then be more robust. This is the idea of preconditioning, which is a technique to improve the condition number of the system.

Assume that M is the precondition matrix. Then the basic preconditioning method is to solve the system $M^{-1}Ax = M^{-1}b$ instead of $Ax = b$. Therefore, the convergence rate depends on the condition number of the preconditioned system $M^{-1}A$. If M is chosen appropriately, the condition number of $M^{-1}A$ can be smaller than that of the original matrix A. For this reason, solving the system $M^{-1}Ax = M^{-1}b$ will converge quickly.

There are some criteria for choosing the precondition matrix M, which are introduced in [8], [9], and [10].

1. M is a good approximation to A in some sense
2. The cost of the construction of M is not prohibitive
3. The system $Mx = b$ is much easier to solve than the original system

We may choose an appropriate precondition matrix M according to the criteria above.

However, it is not necessary to solve the problem $M^{-1}Ax = M^{-1}b$ explicitly; it only requires modifying the original CG algorithm into a preconditioned version. We will derive the PCG algorithm based on the CG algorithm introduced in Section 3.3. Some parts of this derivation can be found in [14].

Assuming M is a symmetric positive-definite matrix, let S be the Cholesky factor of M, that is, $M = SS^{T}$. The matrix $M^{-1}A$ has the same eigenvalues as $S^{-1}AS^{-T}$. The system $Ax = b$ can be transformed to $S^{-1}AS^{-T}x' = S^{-1}b$, with $x' = S^{T}x$. The matrix $S^{-1}AS^{-T}$ is also a symmetric positive-definite matrix, so we can apply the CG method to the transformed system, as follows

$r_0' = S^{-1}b - S^{-1}AS^{-T}x_0'$, $g_0' = r_0'$
for j = 0, 1, ... until convergence, apply the CG steps of Algorithm 3.5 to the transformed system

Rewriting this iteration in terms of the original variables gives the PCG algorithm [9], [15].

Algorithm 4.1 Preconditioned Conjugate Gradient Method

$r_0 = b - Ax_0$, $g_0 = r_0$
for j = 0, 1, ... until convergence
  $z_j = M^{-1}r_j$
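A minimal sketch of a preconditioned CG iteration in its common textbook form follows (initial direction $g_0 = z_0 = M^{-1}r_0$), which may differ in detail from Algorithm 4.1. The function name `pcg`, the callback `solve_M` (which returns $M^{-1}r$ by solving $Mz = r$), the tolerance, and the test matrix are assumptions made for illustration. `np.vdot` is used so that the same code also covers Hermitian positive definite systems.

```python
import numpy as np

def pcg(A, b, solve_M, tol=1e-10, max_iter=None):
    """Preconditioned CG for Hermitian positive definite A; returns (x, iterations)."""
    n = b.shape[0]
    x = np.zeros(n, dtype=np.result_type(A.dtype, b.dtype, np.float64))
    r = b - A @ x
    z = solve_M(r)                      # z_0 = M^{-1} r_0
    g = z.copy()                        # g_0 = z_0
    rz = np.vdot(r, z)                  # (r_j, z_j)
    max_iter = 10 * n if max_iter is None else max_iter
    iters = 0
    for _ in range(max_iter):
        if np.linalg.norm(r) <= tol:    # assumed stopping rule
            break
        Ag = A @ g
        alpha = rz / np.vdot(g, Ag)     # alpha_j = (r_j, z_j) / (A g_j, g_j)
        x = x + alpha * g
        r = r - alpha * Ag
        z = solve_M(r)                  # one preconditioner solve per iteration
        rz_new = np.vdot(r, z)
        beta = rz_new / rz              # beta_j = (r_{j+1}, z_{j+1}) / (r_j, z_j)
        g = z + beta * g
        rz = rz_new
        iters += 1
    return x, iters

# Hypothetical test system: badly scaled diagonal plus weak off-diagonals
N = 50
A = np.diag(np.linspace(1.0, 100.0, N)) + 0.2 * (np.eye(N, k=1) + np.eye(N, k=-1))
b = np.ones(N)
x_pcg, it_pcg = pcg(A, b, lambda r: r / np.diag(A))   # diagonal (Jacobi) preconditioner
x_cg, it_cg = pcg(A, b, lambda r: r)                  # M = I reduces to plain CG
```

Passing the preconditioner as a solve callback keeps the iteration independent of how $Mz = r$ is solved, for example by a band factorization.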

4.3.2 PCG LMMSE Equalizer

The LMMSE equalizer for the mobile OFDM system was introduced in Section 4.2, and the weights are calculated by Equation (4.3). The equalized signal can be written as

$$d = A^{H}\left(AA^{H} + \frac{1}{SNR}R_{zz}\right)^{-1}y$$

The equation above can be rewritten as $d = A^{H}e$ with

$$\left(AA^{H} + \frac{1}{SNR}R_{zz}\right)e = y$$

so that computing d reduces to a linear system problem. Observing the equation above, an N×N matrix inversion is required. By the band channel approximation as described in Section 4.1, this matrix is a sparse symmetric positive definite matrix. The CG algorithm introduced in Chapter 3 is one of the best known iterative techniques for solving symmetric positive definite problems, but it may suffer from a low convergence rate. The PCG algorithm can be used to avoid this problem.

By observing the amplitude of the matrix A, which is shown in Figure 4.1, it can be found that the most significant coefficients are those on the central band and the edges of the matrix. The matrix $\left(AA^{H} + \frac{1}{SNR}R_{zz}\right)$, which is the matrix required to be inverted, still has similar characteristics to the matrix A. This means that it is also a diagonally dominant system. By applying the three criteria described in the previous section, we can choose some diagonals of the central band of the matrix $\left(AA^{H} + \frac{1}{SNR}R_{zz}\right)$ as the precondition matrix M, which is shown in Figure 4.5. The three preconditioning criteria described in the previous section should be checked before preconditioning. First, because the matrix has its most significant values on its diagonals, this choice of the precondition matrix is similar to the original matrix A. Second, we can obtain the precondition matrix directly from the matrix A, so there is no extra cost in constructing the precondition matrix. Third, the system $Mx = b$ can be easily solved by the band LDL factorization [7], [15] and the forward and backward substitutions, which have lower complexity than the inversion of a general matrix.

With the precondition matrix chosen above, the linear system above can be solved iteratively by the PCG method described in Section 4.3.1. The condition numbers of the original system and the preconditioned system are shown in Figures 4.6-4.7. It can be seen that the preconditioned system has a much smaller condition number than the original system, so the preconditioned system will converge faster than the original system. The convergence rate and complexity analysis of this method are shown in Section 4.4 and Section 4.5. These analyses show that this approach has lower complexity than the method introduced in Section 4.2 but still has similar BER performance to that method.
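The effect of taking the central band as the precondition matrix can be checked numerically. The sketch below is an illustration on a synthetic, diagonally dominant band matrix, not on the simulated channel: the matrix `B`, the helper names `central_band` and `preconditioned_cond`, and the sizes are all assumptions. It compares the condition number of `B` with that of $M^{-1}B$ for band half-widths P = 0, 1, 2.

```python
import numpy as np

def central_band(B, P):
    """Keep the diagonals of B with |i - j| <= P (P = 0 gives a diagonal matrix)."""
    N = B.shape[0]
    idx = np.arange(N)
    return np.where(np.abs(idx[:, None] - idx[None, :]) <= P, B, 0.0)

def preconditioned_cond(B, M):
    """Ratio of largest to smallest eigenvalue of M^{-1} B (real and positive for SPD B, M)."""
    lam = np.linalg.eigvals(np.linalg.solve(M, B)).real
    return lam.max() / lam.min()

# Synthetic SPD band matrix standing in for the matrix to be inverted (hypothetical values)
N = 32
B = (np.diag(np.linspace(1.0, 100.0, N))
     + 0.2 * (np.eye(N, k=1) + np.eye(N, k=-1))
     + 0.1 * (np.eye(N, k=2) + np.eye(N, k=-2)))

cond_original = np.linalg.cond(B)
cond_by_P = {P: preconditioned_cond(B, central_band(B, P)) for P in (0, 1, 2)}
```

In this example a wider band makes M a closer approximation of B, and with P equal to the full bandwidth of B the preconditioned matrix becomes the identity, at the price of a more expensive solve with M in every iteration.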

Figure 4.5: Structure of precondition matrix

Figure 4.6: Condition number of original system (axes: condition number, number of channel realizations)

Figure 4.7: Condition number of preconditioned system (axes: condition number, number of channel realizations)

4.3.3 Optimum Precondition Matrix

The central band of the matrix $\left(AA^{H} + \frac{1}{SNR}R_{zz}\right)$ is chosen as the precondition matrix in the previous section. Because we must solve the problem $Mx = b$ once per iteration in Algorithm 4.1, this leads to some overhead. Because the complexity of solving a band system grows quadratically with the bandwidth of the matrix (see Table 4.1), there is a trade-off in how much bandwidth we should choose. Choosing a larger bandwidth for the precondition matrix lets the system converge faster but requires more computations per iteration, so a subset of the central band matrix may be chosen as the precondition matrix. By computer simulations and complexity analysis, we can obtain the optimum bandwidth of the precondition matrix, which achieves the lowest complexity. We will discuss this in the following section.

4.4 Complexity Analysis

The LMMSE method proposed by Xiaodong Cai and Georgios B. Giannakis requires $O(N^{2})$ flops, and the LMMSE method proposed by Philip Schniter, which is called the Partial LMMSE method, requires $O(N)$ flops. The PCG LMMSE method proposed in this work also requires computations that grow linearly with N. The complexity of the two methods, Partial LMMSE and PCG LMMSE, will be analyzed here.

A flop here is defined as a complex multiplication, N is the FFT size, and P is the bandwidth of the precondition matrix. Table 4.1 shows the complexity of the PCG MMSE equalizer, and Table 4.2 shows the complexity of the Partial MMSE equalizer.

Note that the bandwidth of the precondition matrix affects not only the complexity per iteration but also the convergence rate. There is a trade-off in how much bandwidth of the original matrix should be chosen to achieve the lowest complexity. The optimum bandwidth of the precondition matrix can be obtained by simulations. We will discuss the actual computation required by these two methods, Partial MMSE and PCG MMSE, in the next section.

Table 4.1: Complexity analysis of PCG MMSE equalizer

Operation | Complexity
AA' | $(2Q^{2} + 2Q + 1)N$ flops
inv(AA'+δC_xx)*r | $(4Q + 6 + \frac{1}{2}P^{2} + 4P)N \times$ (number of iterations) flops
H'*inv(AA'+δC_xx)*r | $(2Q + 1)N$ flops
Total | $\{(4Q + 6 + \frac{1}{2}P^{2} + 4P) \times (\text{number of iterations}) + 2Q^{2} + 4Q + 2\}N$ flops

N: FFT size
Q: The bandwidth of the approximation channel
P: The bandwidth of the precondition matrix

Table 4.2: Complexity analysis of Partial MMSE equalizer

Q: The bandwidth of the approximation channel
N: FFT size

4.5 Computer Simulations

In this section, computer simulations are conducted to evaluate the performance of the OFDM system using the PCG LMMSE equalizer. Throughout the simulations, we only deal with discrete-time signal processing in the baseband; hence pulse shaping and matched filtering are removed from consideration for simplicity. Also, channel estimation and timing synchronization are assumed to be perfect. In the simulations, the relationship between SNR and $E_b/N_0$ is expressed in terms of the symbol energy, the symbol duration $T_s$, the system bandwidth B, and the modulation order M. The system transmit bit power is normalized to one, so the noise power $\sigma^{2}$ corresponding to a specific $E_b/N_0$ can be generated by

$$\sigma^{2} = \frac{N_0}{E_b} \qquad (4.14)$$

Table 4.3 lists all parameters used in our simulations. The configuration we consider here is an OFDM system with a bandwidth of 1.5 MHz and 64 subcarriers. The QAM constellation used in the simulations is QPSK. The channel model is the Jakes model [12], [17], [18] and the normalized Doppler spread equals 0.1.

Table 4.3: Parameters of Computer Simulations

Parameter | Value
Transmit/Receive antennas | SISO
Carrier frequency | 5.2 GHz
Bandwidth | 1.5 MHz
Number of carriers, FFT size | 64
OFDM symbol duration | 42 μs
Guard interval | 5.25 μs
Modulation order | QPSK
Velocity | 250 km/hour
Maximum Doppler frequency | 1.2 kHz
Normalized Doppler frequency | 0.05
Channel model | Jakes Model [17], [18]
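The Doppler entries of Table 4.3 can be cross-checked from the velocity and carrier frequency. The sketch below assumes the usual relation $f_d = v f_c / c$ and assumes that the normalized Doppler frequency is $f_d$ multiplied by the 42 μs OFDM symbol duration; the function name `max_doppler_hz` is introduced here for illustration.

```python
SPEED_OF_LIGHT = 299_792_458.0  # m/s

def max_doppler_hz(velocity_kmh, carrier_hz):
    """Maximum Doppler frequency f_d = v * f_c / c, with v converted from km/h to m/s."""
    return (velocity_kmh / 3.6) * carrier_hz / SPEED_OF_LIGHT

f_d = max_doppler_hz(250.0, 5.2e9)       # about 1.2 kHz
symbol_duration = 42e-6                  # OFDM symbol duration from Table 4.3
normalized_fd = f_d * symbol_duration    # about 0.05 (assumed normalization)
doppler_spread = 2 * normalized_fd       # about 0.1, two-sided spread
```

Under these assumptions the normalized Doppler frequency of 0.05 in the table and the normalized Doppler spread of 0.1 quoted in the text describe the same setting, the spread being twice the maximum Doppler frequency.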

Figure 4.8: Jakes model simulator

The BER performances of the CG MMSE equalizer with different numbers of iterations are shown in Figure 4.9. It can be seen that the conventional CG algorithm suffers from a slow convergence rate, and this problem can be solved by the PCG algorithm. The convergence rate of the proposed equalizer with different precondition matrix bandwidths is shown in Figures 4.10-4.12. It is shown that the convergence rate improves as the bandwidth of the precondition matrix increases, but a larger bandwidth of the precondition matrix results in more computations per iteration. There is thus a trade-off in choosing the bandwidth of the precondition matrix. We define the complexity to be the number of multiplications per iteration multiplied by the number of iterations. In Table 4.4, we show the complexity for different bandwidths of the precondition matrix and the number of iterations required for convergence. From the simulation results, we can determine the optimum bandwidth of the precondition matrix. The optimum bandwidth of the precondition matrix here equals three. By the complexity analysis above, the PCG MMSE only requires 30% of the computations of the Partial MMSE.

Figure 4.9: BER performance (BER versus Eb/No) obtained by using CG based MMSE equalizer. The performance of different numbers of iterations (5, 10, 20, and 30) is shown. It can be seen that it requires about 30 iterations to converge.

Figure 4.10: BER performance (BER versus Eb/No) obtained by using PCG based MMSE equalizer (BW of precondition matrix in PCG MMSE is zero, i.e., P is diagonal). The performance of different numbers of iterations (2, 3, and 4) is shown. It can be seen that it requires about 4 iterations to converge.

Figure 4.11: BER performance (BER versus Eb/No) obtained by using PCG based MMSE equalizer (BW of precondition matrix in PCG MMSE is one). The performance of different numbers of iterations (2, 3, and 4) is shown. It can be seen that it requires about 3 iterations to converge.

Figure 4.12: BER performance (BER versus Eb/No) obtained by using PCG based MMSE equalizer (BW of precondition matrix in PCG MMSE is two). The performance of different numbers of iterations (2 and 3) is shown. It can be seen that it requires about 2 iterations to converge.

Table 4.4: Convergence rate for different precondition bandwidths

Precondition matrix bandwidth | Complexity per iteration, $(4Q + 6 + \frac{1}{2}P^{2} + 4P)N$ | Number of iterations
P=0 | (4Q+6)N | 4
P=1 | (4Q+10)N | 3
P=2 | (4Q+16)N | 2
P=3 | (4Q+22)N | 2
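The trade-off in Table 4.4 can be tabulated as per-iteration cost multiplied by the number of iterations. The sketch below uses the per-iteration formula from the table header and the iteration counts from the table; the values Q = 2 and N = 64 passed in the example are hypothetical inputs, and the function name `total_pcg_cost` is introduced here. Note that the table entries for odd P appear to round the $P^2/2$ term down, whereas the code evaluates the formula without rounding.

```python
# Number of iterations to converge for each precondition bandwidth P (from Table 4.4)
ITERATIONS = {0: 4, 1: 3, 2: 2, 3: 2}

def per_iteration_cost(Q, P, N):
    """Per-iteration complexity (4Q + 6 + P^2/2 + 4P) N, in complex multiplications."""
    return (4 * Q + 6 + P ** 2 / 2 + 4 * P) * N

def total_pcg_cost(Q, N):
    """Iterative-solve cost for each P: per-iteration cost times iterations."""
    return {P: per_iteration_cost(Q, P, N) * it for P, it in ITERATIONS.items()}

totals = total_pcg_cost(Q=2, N=64)   # hypothetical example values
for P, flops in sorted(totals.items()):
    print(f"P={P}: {flops:.0f} flops")
```

Running it with other values of Q shows how the balance between fewer iterations and a more expensive preconditioner solve shifts with the channel bandwidth.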

Figure 4.13 shows the BER performance of different schemes. The conventional one-tap equalizer scheme has poor performance due to the influence of ICI. The Partial MMSE and PCG MMSE have similar performance. A BER bound of the MMSE equalizer is also shown in this figure. The gap between the MMSE BER bound and the PCG MMSE is due to the channel approximation errors. This gap can be reduced by applying a more complicated method such as MMSE-SIC or MMSE-PIC [3], [4]. It shows that with the channel approximation the MMSE-PIC equalizer has better performance than the MMSE equalizer. The MMSE-PIC equalizer can even have better BER performance than the MMSE equalizer BER bound in the low SNR region.

Figure 4.13: BER performance of different schemes

Figure 4.14 shows the BER performances under different vehicle speeds. The BER performance of the OFDM system degrades with an increasing vehicle speed because the ICI is more significant in high mobility environments. It will thus require more bandwidth of the approximated channel or a more complicated method to mitigate the ICI.

Figure 4.15 shows the BER performance with channel estimation errors. The channel estimation errors are defined as AWGN with variance $\sigma_e^{2}$ that disturbs the estimated channel taps, by the definition in [26], $\hat{h} = h + e$, where $\hat{h}$ is the estimated channel impulse response and $e = [e_1, e_2, \ldots, e_{L-1}]$ represents the error vector. It is assumed that e is independent of h and is modeled as independent zero-mean complex-valued Gaussian noise. It can be seen in Figure 4.15 that the PCG MMSE has similar performance to the Partial MMSE equalizer even if the channel estimation errors are considered. Because the Partial MMSE equalizer only takes parts of the equations, it may be more sensitive to disturbances of the channel.

Figure 4.14: BER performance (BER versus Eb/No) under different vehicle speeds (490 km/hr, 370 km/hr, and 250 km/hr).

Figure 4.15: BER performance (BER versus Eb/No) with channel estimation errors (channel estimation error variance = 1e-2). Curves: Partial MMSE and PCG MMSE, each with channel estimation errors and with a perfect channel.

4.7 Summary

In this chapter, we first introduce the band approximation of the mobile channel. This approximation is based on the concept that we are only concerned with the significant channel coefficients and ignore the trivial parts. With this approximation, the number of coefficients to be processed is reduced, so the computations for the equalizers can also be reduced. Although this approximation is useful, there is still an error floor due to the approximation errors. Furthermore, we introduce and compare several different low-complexity equalizers in Section 4.2, which are important techniques in this subject. By complexity analysis, it is shown that our scheme can achieve lower computation complexity while still having similar BER performance to the Partial MMSE equalizer.

We propose a PCG based MMSE equalizer for the OFDM system over time-varying channels. Compared with conventional one-tap equalizers, this scheme can achieve better performance in mobile environments. In Chapter 2, the concept of the OFDM system is introduced and the reason why the OFDM system can be used efficiently in time-invariant channels is given. Besides, the challenges to OFDM system

From the document: Interference Cancellation Techniques for Time-Varying OFDM Systems (pages 48-0)