Pilot-Channel Aided Adaptable Interference Cancellation Scheme

Not only the data-channel symbols but also the pilot-channel symbols suffer interference from other users' signals, especially when the cross-correlation between the users' codes is high. In the proposed scheme, the pilot-channel signals are cancelled from the received signal before it enters the next detection unit, while the respread traffic-channel signals are subtracted from the received signal to obtain more accurate channel parameters.

4.3.1 The Proposed Scheme

The structure of the pilot-channel aided adaptable IC is shown in Fig. 4-9. The proposed scheme consists of four main blocks that improve the system performance: channel estimation plus pilot signal regenerator (Block 1), SIC with two different ordering methods (Block 2), refined successive channel estimation (Block 3), and PPIC (Block 4).

Each block can be switched on or off depending on the system loading and the environmental conditions. Block 1 in Fig. 4-9 is equivalent to Part I in either Fig. 3-8 or Fig. 3-9, and the block diagram of the pilot signal regenerator is shown in Fig. 3-5. The input signal to Block 2 in Fig. 4-9 is $\hat{r}(t)$ in (3-6). As shown in Chapter 3, SIC with different power ordering methods differs in computational complexity as well as in error performance. In Block 2, one of the methods can be chosen based on the system loading and/or the performance requirement. Each unit of Block 2 in Fig. 4-9 is equivalent to one iteration of Part II in Fig. 3-8 or Fig. 3-9. The cancellation process is repeated serially until the desired U-th iteration has been performed; whether all K users are detected in the SIC units therefore depends on the choice of U. The detailed structures of the Data Respread unit and the RAKE bank of Part II in Fig. 3-8 or Fig. 3-9 are shown in Fig. 3-4 and Fig. 2-22, respectively. The output signal $\hat{r}_{SIC;u}(t)$ at the u-th iteration is given in (3-13).
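The difference between ranking the users once and re-ranking them in every iteration can be illustrated with a few lines of Python. The sketch below is only a toy model under stated assumptions: a synchronous, single-path, real-valued channel without a pilot channel, in which the correlator output magnitude serves as the power estimate. The names `correlate` and `sic` are hypothetical helpers, and this is not the receiver of Fig. 3-8 or Fig. 3-9.

```python
def correlate(signal, code):
    """Despread: normalized correlation of a chip sequence with one user's code."""
    return sum(s * c for s, c in zip(signal, code)) / len(code)


def sic(received, codes, iterations, rerank=True):
    """Toy successive interference cancellation over `iterations` users.

    rerank=False ranks the users once on the received signal;
    rerank=True re-ranks the remaining users after every cancellation.
    Returns ({user: bit} in detection order, residual signal).
    """
    residual = list(received)
    remaining = list(range(len(codes)))
    # One-time power ranking, computed before any cancellation.
    fixed_order = sorted(remaining, key=lambda k: -abs(correlate(residual, codes[k])))
    decisions = {}
    for u in range(iterations):
        if rerank:
            user = max(remaining, key=lambda k: abs(correlate(residual, codes[k])))
        else:
            user = fixed_order[u]
        y = correlate(residual, codes[user])
        decisions[user] = 1 if y >= 0 else -1
        # Respread with estimated amplitude |y| and the decided bit, which equals y * code.
        residual = [s - y * c for s, c in zip(residual, codes[user])]
        remaining.remove(user)
    return decisions, residual


# Hypothetical example: three users, orthogonal length-4 codes, amplitudes 3, 2, 1.
codes = [[1, 1, 1, 1], [1, -1, 1, -1], [1, 1, -1, -1]]
received = [2, 6, 0, 4]  # 3*c0 - 2*c1 + 1*c2
decisions, residual = sic(received, codes, iterations=2)
print(decisions, residual)
```

Re-ranking costs one extra bank of correlations per iteration, which is the complexity versus performance trade-off that the choice of ordering method controls.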

When the first unit of Block 2 has been performed, the regenerated signal $C_{data;\langle 1\rangle}(t)$, defined in (3-14) with $k=\langle 1\rangle$, is subtracted from the received signal, and the result enters unit 1 of Block 3:

$$r_1(t) = r(t) - C_{data;\langle 1\rangle}(t)$$

The advanced channel estimation for the chosen user is performed on $r_1(t)$ in Block 3, as shown in Fig. 4-8. The function of each unit in Block 3 is equivalent to that of PiIC #b in Fig. 4-2(b). Because the strongest signal in the data channel has been cancelled, the interference term in (3-1) becomes smaller, and more accurate channel parameters can be expected. The output signal $C''_{pilot;\langle 1\rangle}(t)$ of unit 1 in Block 3 is the pilot signal regenerated with the new channel parameters; it can be obtained from (3-7) with $\hat{\alpha}^{(n)}_{av;J,f}$ replaced by $\alpha''^{(n)}_{J,f}$, as shown in Fig. 4-8. $C''_{pilot;\langle 1\rangle}(t)$ is then subtracted from the Block 3 input signal $r_1(t)$.

After unit 2 of Block 2 has completed, its output must also be removed, i.e. the signal entering unit 2 of Block 3 is

$$r_2(t) = r_1(t) - C_{data;\langle 2\rangle}(t) - C''_{pilot;\langle 1\rangle}(t)$$
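The recursion for the Block 3 inputs, r_1(t) = r(t) - C_data;<1>(t) and r_u(t) = r_{u-1}(t) - C_data;<u>(t) - C''_pilot;<u-1>(t), can be written as a short loop. The following Python sketch works on sampled signals stored as lists; the function name `block3_inputs` is hypothetical, and the refined pilot signals are passed in precomputed, whereas in the scheme each one is produced by the preceding Block 3 unit.

```python
def block3_inputs(received, data_signals, refined_pilots):
    """Return [r_1, r_2, ...], the sampled inputs of successive Block 3 units.

    data_signals[u]   : respread data signal of the (u+1)-th detected user
    refined_pilots[u] : pilot signal regenerated by Block 3 unit u+1
                        (assumed to be available here; in the scheme it is
                        computed from r_{u+1} itself)
    """
    inputs = []
    current = list(received)
    for u, data in enumerate(data_signals):
        # Remove the data signal regenerated by the current Block 2 unit.
        current = [x - d for x, d in zip(current, data)]
        if u > 0:
            # Remove the refined pilot produced by the previous Block 3 unit.
            current = [x - p for x, p in zip(current, refined_pilots[u - 1])]
        inputs.append(current)
    return inputs


# Hypothetical two-sample example.
r1, r2 = block3_inputs([10, 10], [[1, 1], [2, 0]], [[0, 1]])
print(r1, r2)
```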

The process is repeated until the desired U-th iteration has been performed. From the (U+1)-th to the K-th stage, a RAKE bank with an input_enable signal is used in Block 2, as shown in Fig. 4-10. In this way, redundant computation is avoided and the processing delay is shortened. This is equivalent to performing conventional detection on the remaining users, except that the detector input signal suffers less interference than the input signal in (3-6). At the input of the (U+1)-th to the K-th stage in Block 3, the signal becomes

$$r_{U+1}(t) = r_U(t) - \sum_{k=U+1}^{K} C_{data;\langle k\rangle}(t) - C''_{pilot;\langle U\rangle}(t)$$

Finally, the new channel estimates $\alpha''^{(n)}_{k,f}$, $1 \le f \le F$, and the tentative decisions $\hat{b}_{SIC;k}[n]$ of all users are fed into the partial PIC (PPIC).

The PIC receiver is often called multistage cancellation [78], and it processes the signals of all K users at the same time. The multistage PIC scheme is shown in Fig. 4-6. The PIC has a shorter latency than SIC, at the expense of approximately K times more hardware. Since the estimate of the MAI in PIC may not be reliable in the early stages, the well-known partial PIC [22] uses partial coefficients (often chosen from experience) to cancel only part of the estimated MAI, which limits the errors introduced when the interference estimate is poor. Fig. 4-7 shows the block diagram of one stage of the partial PIC for the uplink dedicated channel. The tentative decisions made from the correlator outputs of all users are respread (as shown in Fig. 3-4) and removed from the received signal, except for the data of the desired user. The remaining signal is passed through the correlator again to obtain a new output with less MAI. The new decision is then made from the combination of the old output weighted by $1-p_s$ and the new output weighted by $p_s$, where s is the stage index. The iterative operation of the PPIC is based on the likelihood concept. However, the update of the likelihood information becomes unreliable when the system is heavily loaded.
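The weighted combination of old and new correlator outputs in one partial PIC stage can be sketched as follows. This Python example is a minimal illustration under stated assumptions: a synchronous, single-path, real-valued channel with known user amplitudes, where `correlate`, `sign` and `ppic_stage` are hypothetical helper names. It is not the structure of Fig. 4-7, which also involves multipath combining and channel estimates.

```python
def correlate(signal, code):
    """Normalized correlation of a chip sequence with one user's code."""
    return sum(s * c for s, c in zip(signal, code)) / len(code)


def sign(x):
    return 1 if x >= 0 else -1


def ppic_stage(received, codes, amplitudes, prev_outputs, p):
    """One partial PIC stage: new_k = (1 - p) * old_k + p * fresh_k."""
    decisions = [sign(y) for y in prev_outputs]
    new_outputs = []
    for k, code in enumerate(codes):
        cleaned = list(received)
        for j, other in enumerate(codes):
            if j == k:
                continue  # keep the desired user's own signal
            # Respread user j's tentative decision and remove it.
            cleaned = [x - amplitudes[j] * decisions[j] * c
                       for x, c in zip(cleaned, other)]
        fresh = correlate(cleaned, code)
        new_outputs.append((1 - p) * prev_outputs[k] + p * fresh)
    return new_outputs


# Hypothetical example: two users with cross-correlation 0.5, bits +1 and -1.
codes = [[1, 1, 1, 1], [1, 1, 1, -1]]
received = [0, 0, 0, 2]  # 1*c0 - 1*c1
initial = [correlate(received, c) for c in codes]
stage1 = ppic_stage(received, codes, [1, 1], initial, p=0.6)
print(initial, stage1)
```

With p = 0.6, the coefficient used at the first stage in the simulations, the outputs move from the interference-limited values of 0.5 and -0.5 towards the interference-free values of 1 and -1 without fully trusting the first cancellation.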

A linear version of this PPIC is presented in [20]. In [90], the authors proposed an adaptive multistage PIC that adaptively determines the cancellation weight of each user by minimizing the mean-square error between the received signal and its estimate with the least-mean-square (LMS) algorithm. The adaptive multistage PIC is suitable for systems with short scrambling codes and can achieve better performance than the PPIC. However, its computational complexity is O(K·SF) per stage, where K is the number of users and SF is the spreading gain.
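The idea of adapting one cancellation weight per user with LMS can be illustrated with a generic chip-rate update. The Python sketch below is a textbook LMS loop under simplifying assumptions (synchronous, single-path, real-valued signals, known codes and tentative decisions); the function name `lms_weights` and the step size are hypothetical, and the sketch is not claimed to reproduce the exact algorithm of [90].

```python
def lms_weights(received, codes, decisions, step, weights=None):
    """Run one symbol of chip-rate LMS and return the updated user weights.

    The estimate of each received chip is sum_k w_k * b_k * c_k[i]; the
    weights move along the error gradient after every chip.
    """
    n_users = len(codes)
    w = list(weights) if weights is not None else [0.0] * n_users
    for i, sample in enumerate(received):
        # Regenerated chip of every user from its tentative decision.
        chips = [decisions[k] * codes[k][i] for k in range(n_users)]
        estimate = sum(wk * ck for wk, ck in zip(w, chips))
        error = sample - estimate
        w = [wk + step * error * ck for wk, ck in zip(w, chips)]
    return w


# Hypothetical example: two users with amplitudes 2 and 1, bits +1 and -1.
codes = [[1, 1, 1, 1], [1, -1, 1, -1]]
received = [1, 3, 1, 3]  # 2*c0 - 1*c1
w = lms_weights(received, codes, [1, -1], step=0.25)
for _ in range(4):  # reuse the same symbol to show convergence
    w = lms_weights(received, codes, [1, -1], step=0.25, weights=w)
print(w)
```

Each chip costs one multiply-accumulate per user for the estimate and one for the update, which is consistent with the O(K·SF) per-stage complexity quoted above.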

In the proposed scheme, the RAKE bank at the initial stage in Fig. 4-6 is omitted, and a better initial decision is used instead, i.e. $\hat{b}^{0}_{PIC,k}[n] = \hat{b}_{SIC;k}[n]$. The other input of the PIC is the received signal from which all pilot signals, respread with the new channel parameters, have been removed:

$$r''_{U+1}(t) = r(t) - \sum_{k=1}^{K} C''_{pilot;k}(t) \qquad (4\text{-}11)$$

Compared with $\hat{r}(t)$ in (3-6), $r''_{U+1}(t)$ is formed from pilot signals $C''_{pilot;k}(t)$ that are respread with new channel estimates suffering less interference from the traffic channel, so $C''_{pilot;k}(t)$ can be expected to approximate the real $C_{pilot;k}(t)$ better than $\hat{C}_{pilot;k}(t)$ in (3-7) does. For user J with F paths combined, the tentative decision at the output of the first stage of the PPIC is given by (4-12) and (4-13), where $p_1$ is the partial cancellation coefficient. At the second stage, the PPIC performs the same calculation as in (4-12) and (4-13), except that the first-stage decisions $\hat{b}^{1}_{PIC,k}[n]$ are used in place of the initial ones. The stages continue until the desired stage is reached (often 2 to 4 stages).

Although Fig. 4-9 shows K sets of units, the hardware of the blocks used in the adaptable scheme can range from a single set that processes one user at a time to K sets that process all users at the same time, each at a different bit index n.

4.3.2 Simulation Results and Discussion

In this section, we compare the performance of the proposed detector with that of the PIC and SIC described above. The data channel and the pilot channel are assumed to have equal transmit power. The detectors are simulated under the multipath fading channel Case 3 in Table 3-3, with the total power of all paths normalized to unity. A 4-finger RAKE receiver is used for path combining, and the path delays are assumed to be perfectly known. The simulation parameters follow those in Table 3-2, except that SF=32, βc=1, PDR=1, G=1 and SNR=13 dB. We adopt the PPIC in [4] with a partial coefficient of 0.6 at the first stage in Block 4.
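Normalizing the total path power to unity, as assumed in the simulation setup, amounts to dividing each linear path power by their sum. The Python sketch below shows this step; the relative powers in `profile_db` are hypothetical placeholder values chosen only for illustration and are not taken from Table 3-3, and `normalize_path_powers` is a hypothetical helper name.

```python
def normalize_path_powers(powers_db):
    """Convert relative path powers in dB to linear powers that sum to 1."""
    linear = [10 ** (p / 10) for p in powers_db]
    total = sum(linear)
    return [p / total for p in linear]


# Hypothetical 4-path profile (one path per RAKE finger); values are assumptions.
profile_db = [0.0, -3.0, -6.0, -9.0]
powers = normalize_path_powers(profile_db)
amplitudes = [p ** 0.5 for p in powers]  # mean path amplitudes
print(powers, sum(powers))
```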

Fig. 4-11 shows the average BER versus the iteration number of Block 2 in Fig. 4-9 for different combinations of the other blocks. The capital letter B in the figure stands for Block.

There are 20 users in the system. When all blocks are used, a single-stage PIC in Block 4 outperforms the traditional PPIC. This means that, although Block 2 and Block 3 are additional structures compared with the PPIC, they reduce the number of stages required in the succeeding Block 4. The figure also shows that B1+B2+B3+B4 converges quickly: a 2-stage PIC in Block 4 achieves almost the same BER as a 4-stage PIC.

Fig. 4-12 and Fig. 4-13 show the average BER versus the number of users. The two simulations differ in that Fig. 4-12 uses one stage of PPIC in Block 4 and an iteration number in Block 2 and Block 3 equal to half the number of users, whereas Fig. 4-13 reports the minimum BER obtained when Block 4 is limited to at most three stages. Comparing the two figures shows little difference between them. Note that in Fig. 4-12 several schemes perform better than the three-stage pure PPIC and the pure SIC, while taking only about half the processing time of SIC and about 2/3 of the computational complexity of PPIC. Fig. 4-12 and Fig. 4-13 also show that, for the same BER, more users can be served when the proposed adaptable scheme is used. For example, the system can support up to 24 users at a BER of $2\times10^{-2}$. It also means that energy can be saved if the blocks are chosen properly to achieve the desired BER.

Fig. 4-14 shows the best performance (lowest BER) that each scheme can achieve versus the signal-to-noise ratio, with Block 4 limited to at most three stages. For the same BER, the proposed schemes require a smaller SNR, which means that the adaptable scheme performs well even in a noisier environment. The figure also shows that the proposed schemes outperform the pure PPIC and SIC over the simulated SNR range.

4.4 Summary

In this chapter, we proposed two advanced IC techniques for pilot-channel aided systems. First, a pilot-channel aided pipelined interference cancellation scheme was introduced to cope with multiuser and multipath interference as well as the long delay inherent in SIC. It was shown to perform better than the RAKE receiver, SIC and PPIC under the same SNR and user capacity, while maintaining the same throughput as SIC with a slightly longer latency owing to the pipelined format.

Second, we proposed an adaptable interference cancellation scheme for multiuser detection and channel estimation. Owing to its flexibility, the scheme can be used in a changing environment, with its computational complexity adjusted accordingly. Simulation results show that it outperforms the traditional SIC and PPIC with reasonable hardware, and that its processing delay can be shorter than that of SIC.

Table 4-1 Implementation issues with the proposed pipelined SIC

Reordering frequency: K/(GTb)
Throughput (per unit): 1/3
Latency (units): 2+3K+2
Hardware required for minimum delay: as shown in Fig. 4-1
Computational complexity C(.): K*C(CE) + K*C(FM) + K(K-1)/2*C(RAKE) + K*C(PiIC) + 2K*C(UdIC)
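The complexity expression in Table 4-1 is easy to evaluate for a given number of users once per-unit costs are assigned. The Python sketch below does this; the function name and the unit costs in the example are hypothetical placeholders, since the table gives the costs C(.) only symbolically.

```python
def pipelined_sic_complexity(K, c_ce, c_fm, c_rake, c_piic, c_udic):
    """Evaluate K*C(CE) + K*C(FM) + K(K-1)/2*C(RAKE) + K*C(PiIC) + 2K*C(UdIC)."""
    rake_units = K * (K - 1) // 2  # K(K-1) is always even, so this is exact
    return K * c_ce + K * c_fm + rake_units * c_rake + K * c_piic + 2 * K * c_udic


# Hypothetical unit costs of 1 for every block, K = 4 users.
print(pipelined_sic_complexity(4, 1, 1, 1, 1, 1))
```

The RAKE term grows quadratically with K, so it dominates the total for large user numbers unless C(RAKE) is much smaller than the other unit costs.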

Fig. 4-1 The proposed pipelined interference cancellation scheme

Fig. 4-2 PiIC (a) #ua block and (b) #ub block in Fig. 4-1, 1 ≤ u ≤ K

Fig. 4-3 UdIC (a) #ua block and (b) #ub block in Fig. 4-1, 1 ≤ u ≤ K

[Figure: BER (log scale) versus user number; curves for the proposed scheme, the RAKE receiver, SIC and 3-stage PPIC at SNR=10 and SNR=15]

Fig. 4-4 Average BER versus user number with different schemes

[Figure: BER (log scale) versus SNR; curves for the proposed scheme, the RAKE receiver, SIC and 3-stage PIC with 6, 10 and 14 users]

Fig. 4-5 Average BER versus SNR with different schemes and user numbers

Fig. 4-6 Multistage PIC scheme

Fig. 4-8 Detailed structure of a unit in Block 3 in Fig. 4-9

Fig. 4-9 The proposed adaptable IC scheme

Fig. 4-10 Structure of the RAKE bank with input enables

[Figure: BER (log scale) versus iteration number of SIC; curves for B2 alone, B1+B2+B3, B1+B2+B3 with one-, two- and four-stage B4, and B2 with two-stage B4, under one-time power ranking or power ranking all the time]

Fig. 4-11 Average BER versus iteration of SIC in Block 2 in Fig. 4-9, 20 users, 10 dB

[Figure: BER (log scale) versus user number; curves for B1+B2+B4, B1+B2+B3+B4 and B2+B4 under one-time power ranking or power ranking all the time, the conventional receiver, B1 only, pure SIC with power reordering all the time, and pure 3-stage PIC]

Fig. 4-12 Average BER versus user number; the iteration number in Block 2 equals the user number divided by 2, and Block 4 is limited to one stage

[Figure: BER (log scale) versus user number; curves for B1+B2+B4, B1+B2+B3+B4 and B2+B4 under one-time power ranking or power ranking all the time, the conventional receiver, B1 only, pure SIC with power reordering all the time, and pure 3-stage PIC]

Fig. 4-13 Minimum average BER versus user number; the PIC in Block 4 can have up to 3 stages, 10 dB

[Figure: BER (log scale) versus SNR; curves for B1+B2+B4, B1+B2+B3+B4 and B2+B4 under one-time power ranking or power ranking all the time, the conventional receiver, B1 only, pure SIC with power reordering all the time, and pure 2-stage PIC]

Fig. 4-14 Average BER versus SNR, 20 users

Chapter 5 Pilot-Channel Aided Iterative Interference Cancellation in Turbo-Coded Systems

5.1 Overview

Both channel decoding and MUD are important techniques in CDMA systems. In wireless communications, channel coding protects data passing through a fading channel by adding redundant bits to the transmitted message. With MUD at the receiver, error control coding, such as convolutional codes and Turbo codes, can be used to cope with the residual interference from all sources, including AWGN, in CDMA systems. In Chapter 3, we presented a pilot-channel aided SIC for asynchronous fading environments whose performance varies with the ordering method. In this chapter, we consider MUD in WCDMA systems with Turbo coding.

Turbo codes, first introduced by Berrou, Glavieux, and Thitimajshima in 1993 [11], are a class of error-correcting codes generated by the parallel concatenation of two or more recursive systematic convolutional (RSC) codes applied to differently interleaved versions of the same information sequence. Turbo codes are suitable for data communications because they exhibit very good error performance from low to high SNR, at the cost of a large decoding delay. Their performance can approach the Shannon limit by means of the iterative maximum a posteriori (MAP) probability algorithm [13], i.e. by iteratively passing probabilistic estimates between two decoders operating on long codewords. It is known that most of the improvement in SNR is achieved in the first few iterations. Detecting early convergence or non-convergence of the iterative decoder is therefore essential to save power and processing delay. Recently, techniques known as early-stopping criteria have been introduced to determine whether the iterative decoding process can be terminated once the CB is correctly decoded [24], [29], [30], [32], [37], [44], [47], [62], [83], [85], [86], [93], [94].
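As an illustration of what an early-stopping criterion looks like, the Python sketch below implements the widely used hard-decision-aided rule: decoding stops as soon as the hard decisions of two consecutive iterations agree. It is a generic example and not the criterion proposed in Section 5.2; the decoder is abstracted as a sequence of per-iteration LLR lists, the convention that a non-negative LLR maps to bit 1 is an assumption, and the function names are hypothetical.

```python
def hard_decisions(llrs):
    """Map log-likelihood ratios to bits (assumed convention: LLR >= 0 means 1)."""
    return [1 if value >= 0 else 0 for value in llrs]


def run_with_early_stop(llr_per_iteration, max_iterations=8):
    """Consume decoder iterations until two consecutive hard decisions agree.

    Returns (bits, iterations_used); bits is None if no iteration was run.
    """
    previous = None
    used = 0
    for llrs in llr_per_iteration:
        used += 1
        current = hard_decisions(llrs)
        if current == previous or used >= max_iterations:
            return current, used
        previous = current
    return previous, used


# Hypothetical LLR outputs of four successive decoder iterations.
iterations = [[0.2, -0.1, 0.3], [1.0, 0.5, 0.8], [2.0, 1.5, 1.9], [3.0, 3.0, 3.0]]
bits, used = run_with_early_stop(iterations)
print(bits, used)
```

In this example the decisions stabilize at the third iteration, so the fourth one is never evaluated, which is where the savings in power and delay come from.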

A class of receivers known as iterative (turbo) MUD combines MUD and channel decoding through an iterative process and achieves excellent error performance [3], [4], [14], [28], [50], [51], [64], [84], [91]. The channel code in Turbo MUD is not necessarily a Turbo code; the term "turbo" describes the iterative exchange of information between the MUD and the channel decoder. Based on the observation that multiuser CDMA signals combined with forward error control (FEC) coding can be viewed as the serial concatenation of two coding systems, the iterative MUD performs two operations: (1) a posteriori probability (APP) estimation of the coded symbols and (2) parallel single-user soft FEC decoding. Soft information is passed back and forth between these two soft-input soft-output (SISO) decoders through interleaving. Although near-single-user performance has been shown to be achievable, many iterative MUD receivers require complex implementations with significant computational complexity. The generalized iterative MUD tends to remove or factor out all MAI from other users and/or other noise according to different criteria.

However, imperfect parameter estimation undermines the effectiveness of sophisticated operations aimed at a low error rate, especially at low SNR. Cancelling the estimated MAI at low SNR can cause a large performance degradation. A practical iterative MUD that improves performance from low to high SNR in current communication systems is therefore needed.

The SIC II in Chapter 3 tracks the variation of the received data from one group of bits to the next and removes MAI only from users whose estimated data are more reliable than those of the desired user. These properties are especially suitable for signal detection at low SINR. In Section 5.2, the SIC II front end with PDR=1.0 and G=1, followed by the MAP algorithm for turbo decoding, is employed. (If the required SNR is higher, for example when punctured codes are used, the optimal PDR and G increase; in this case, SIC III is chosen.) The variance estimate used by the MAP algorithm in turbo decoding is obtained from the pilot-channel signal. This scheme is shown to be superior, in both BER and computational complexity, to the one with PPIC at the front end. In addition, using the ordering information obtained from the SIC front end, we propose a new stopping criterion that requires low complexity and little data buffering.

In Section 5.3, a novel iterative IC is proposed that uses the turbo-coded SIC presented in Section 5.2 as part of the first outer iteration. Using the information obtained from the stopping criterion, the correctly decoded CBs are hard-decided, re-encoded, respread and then removed from the signal passed to the next outer iteration. In this way, a large amount of redundant computation and processing delay is saved compared with the generalized Turbo-coded iterative MUD. In addition, from the second outer iteration onward, the channel estimates obtained from the pilot-channel signal are refined by removing the estimated traffic-channel signal. Complexity analysis and BER computer simulations show that this scheme is superior and practical for current communication systems.

5.2 Stopping Criterion for Turbo Decoding with
