Design for Receiver Complexity Reduction

Thus far, we have described the operating principles of the transform (a transmitter component) and the block turbo DFE (a receiver component). We now present a particular design that enables a reduced-complexity receiver implementation. We first consider how the turbo DFE can operate in the frequency domain. We then present a design of the SFI suited to this mode of receiver operation. Finally, we discuss the computational complexity.

9.2.1 Turbo DFE in Frequency Domain

By the circulant nature of the channel matrix in (9.2), we may decompose it as

\[ H = (W \otimes I)^H \cdot \Lambda \cdot (W \otimes I), \qquad \Lambda = [\Lambda_{ij}], \]

where $W$ is the DFT matrix and each $\Lambda_{ij}$ is a $K \times K$ diagonal matrix of the frequency response of the channel from transmit antenna $j$ to receive antenna $i$. By permutation, the "super-matrix" $\Lambda$ can be reorganized into

\[ \Lambda = Q^T \left[ \bigoplus_{k=1}^{K} \Lambda(k) \right] Q \]

where $Q$ is a permutation matrix, $\Lambda(k)$ is the MIMO channel response at subcarrier $k$, and recall that $\bigoplus_{k=1}^{K} \Lambda(k)$ denotes the block diagonal matrix with diagonal entries $\Lambda(1), \Lambda(2), \ldots, \Lambda(K)$.
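As a minimal numerical sketch of the diagonalization above, consider a single transmit-receive antenna pair, for which the circulant channel block reduces to a circular convolution. The 3-tap channel, the block length K = 8, and the data symbols below are hypothetical values chosen only for illustration; the DFT is computed naively to keep the example self-contained.

```python
import cmath

def dft(v):
    """Naive unnormalized DFT of a list of complex numbers."""
    K = len(v)
    return [sum(v[n] * cmath.exp(-2j * cmath.pi * k * n / K) for n in range(K))
            for k in range(K)]

def idft(V):
    """Naive inverse DFT (with the 1/K normalization)."""
    K = len(V)
    return [sum(V[k] * cmath.exp(2j * cmath.pi * k * n / K) for k in range(K)) / K
            for n in range(K)]

def circular_convolve(h, x):
    """Product of the circulant matrix whose first column is h with x."""
    K = len(x)
    return [sum(h[m] * x[(n - m) % K] for m in range(K)) for n in range(K)]

# Hypothetical 3-tap channel, zero-padded to block length K = 8.
h = [1.0, 0.5 - 0.2j, 0.1j, 0, 0, 0, 0, 0]
x = [1, -1, 1, 1, -1, -1, 1, -1]

# Frequency response: the diagonal entries of one block Lambda_ij.
lam = dft(h)

y_time = circular_convolve(h, x)                      # time-domain channel
y_freq = idft([l * X for l, X in zip(lam, dft(x))])   # per-subcarrier scaling

max_err = max(abs(a - b) for a, b in zip(y_time, y_freq))
print(max_err)  # should be at the level of floating-point round-off
```

The two outputs agree up to round-off, which is the scalar counterpart of the statement that the channel acts independently on each subcarrier after the DFT.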

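The FFF and FBF can be likewise decomposed as

\[ F = (W \otimes I)^H \cdot Q^T \left[ \bigoplus_{k=1}^{K} F(k) \right] Q \cdot (W \otimes I), \qquad B = (W \otimes I)^H \cdot Q^T \left[ \bigoplus_{k=1}^{K} B(k) \right] Q \cdot (W \otimes I), \]

where for unshaped turbo DFE we have, for the $k$th subcarrier,

\[ F(k) = \mu \Lambda^H(k), \qquad B(k) = I - \mu \Lambda^H(k) \Lambda(k). \tag{9.17} \]

Similarly, for the shaping filter we have

\[ C(k) = \mu \left( \Lambda^H(k) \Lambda(k) + \alpha I \right)^{-1} \tag{9.18} \]

for the $k$th subcarrier. These equations show that both the shaped and the unshaped DFE can be performed in the frequency domain, independently over the subcarriers. The complexity can be lower than performing equalization in the time domain.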
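The per-subcarrier computation in (9.17) can be sketched as follows. This is an illustration only: the 2 x 2 channel responses and the value of µ are hypothetical placeholders, and the helper names (`herm`, `matmul`, `unshaped_dfe_filters`) are ours. The shaping filter of (9.18) is not covered here.

```python
def herm(A):
    """Conjugate transpose of a matrix stored as a list of rows."""
    return [[complex(A[i][j]).conjugate() for i in range(len(A))]
            for j in range(len(A[0]))]

def matmul(A, B):
    """Plain matrix product of two list-of-rows matrices."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def unshaped_dfe_filters(lam_k, mu):
    """Return (F(k), B(k)) of (9.17) for one subcarrier."""
    LH = herm(lam_k)
    F = [[mu * v for v in row] for row in LH]     # F(k) = mu * Lambda^H(k)
    G = matmul(F, lam_k)                          # mu * Lambda^H(k) Lambda(k)
    N = len(G)
    Bk = [[(1.0 if i == j else 0.0) - G[i][j] for j in range(N)]
          for i in range(N)]                      # B(k) = I - G
    return F, Bk

# Hypothetical 2x2 MIMO responses at K = 2 subcarriers; mu is a placeholder.
channel = [
    [[1.0 + 0.0j, 0.5j], [0.2 + 0.0j, 1.0 - 0.3j]],
    [[0.8 + 0.1j, -0.4 + 0.0j], [0.3j, 0.9 + 0.0j]],
]
mu = 0.5

# Each subcarrier is processed independently of the others.
filters = [unshaped_dfe_filters(lam_k, mu) for lam_k in channel]
```

Because each subcarrier involves only small matrices, the loop over k could be run in any order or in parallel.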

9.2.2 The Equalizer-Decoder Loop

Thus far, we have omitted the details of the equalizer-decoder loop. To this subject we now turn.

Let x = (W ⊗ I) · x be the frequency spectrum of x, and let $\hat{x}$ and $\bar{x}$ be the DFE output and the FBF input in the frequency domain, respectively. Since the transmitted signal is spread by the space-frequency transform $S$, we must apply the inverse transform $S^{-1}$ to the DFE output before feeding it to the channel decoder. In addition, we also need to apply the transform $S$ to the decoder output to obtain the FBF input for the next iteration. The structure of the equalizer-decoder loop is illustrated in Fig. 9.1, where we have assumed the use of a soft-output decoder.

Mathematically, the decoder input is given by

\[ \hat{X} = \left( T^H \otimes I \right) \cdot P^T \hat{x} \tag{9.19} \]

and the FBF input is obtained from the decoder output $\bar{X}$ by

\[ \bar{x} = P \, (T \otimes I) \, \bar{X} \tag{9.20} \]

where $P^T$ corresponds to deinterleaving, $T^H \otimes I$ to inverse orthogonal transform, $T \otimes I$ to orthogonal transform, and $P$ to interleaving.
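The two mappings (9.19) and (9.20) are inverses of each other, which can be checked with a small sketch. We assume here a normalized Hadamard matrix for T, K = 4 subcarriers, N = 2 antennas, vectors indexed as l·N + n, and an arbitrary interleaver pattern; none of these choices is prescribed by the text.

```python
def hadamard(K):
    """Normalized K x K Hadamard matrix (K a power of two), so T T^H = I."""
    T = [[1.0]]
    while len(T) < K:
        T = [row + row for row in T] + [row + [-v for v in row] for row in T]
    s = K ** -0.5
    return [[s * v for v in row] for row in T]

def kron_apply(T, X, N):
    """Compute (T kron I_N) X for X indexed as X[l * N + n]."""
    K = len(T)
    return [sum(T[k][l] * X[l * N + n] for l in range(K))
            for k in range(K) for n in range(N)]

def transpose(T):
    return [list(col) for col in zip(*T)]

def interleave(perm, v):
    """Apply P: output position perm[i] receives input sample i."""
    out = [0.0] * len(v)
    for i, p in enumerate(perm):
        out[p] = v[i]
    return out

def deinterleave(perm, v):
    """Apply P^T, the inverse of interleave."""
    return [v[p] for p in perm]

K, N = 4, 2
T = hadamard(K)                     # real-valued, so T^H is the transpose
perm = [3, 0, 6, 1, 7, 4, 2, 5]     # hypothetical interleaver pattern
X_bar = [1.0, -1.0, 1.0, 1.0, -1.0, -1.0, 1.0, -1.0]  # decoder output

# (9.20): transform, then interleave, to obtain the FBF input.
x_bar = interleave(perm, kron_apply(T, X_bar, N))

# (9.19) applied to x_hat = x_bar: deinterleave, then inverse transform.
X_hat = kron_apply(transpose(T), deinterleave(perm, x_bar), N)

round_trip_err = max(abs(a - b) for a, b in zip(X_hat, X_bar))
```

In an actual receiver the input to (9.19) is the DFE output rather than the FBF input; the round trip is used here only to confirm that the two operations undo each other.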

Figure 9.1: The equalizer-decoder loop. (Blocks: FFF, FBF, inverse SFT, soft-output decoder, SFT.)

Figure 9.2: The illustration of separable space-frequency interleaving. (Axes: frequency and space; operations: frequency interleaving and space interleaving.)

9.2.3 Design of Space-Frequency Interleaving

We note that the complexity of the equalizer-decoder loop can be reduced by moving the interleaving and deinterleaving functions outside the loop. This can be achieved by a proper design of the SFI method. Specifically, we employ a "separable SFI" in which the permutations in the spatial and the frequency domains are separable. Then in the receiver, the SFI and the inverse SFI can be replaced by equivalent operations on the received signal and the estimated channel response outside the equalizer-decoder loop.

The SFI is separable in the spatial and the frequency dimensions if the permutation matrix $P$ can be decomposed into the product of a spatial permutation matrix $\Theta$ and a frequency permutation matrix $\Phi$, that is, $P = \Theta\Phi$. As illustrated in Fig. 9.2, in frequency interleaving, signal samples at the same subcarrier are moved to another subcarrier as a group, and in spatial interleaving, signal samples at the same subcarrier are permuted in a pseudo-random manner to different antennas.

Disregard the additive noise. Then the received signal after DFT and inverse SFI is given by

\[ P^T r = P^T \Lambda P \, (T \otimes I) \, X \tag{9.21} \]

where $P^T \Lambda P$ is the space-frequency interleaved channel frequency response. With $P = \Theta\Phi$, we have

\[ P^T \Lambda P = \Phi^T \Theta^T \Lambda \Theta \Phi \triangleq \Phi^T \Lambda' \Phi \triangleq \Lambda''. \tag{9.22} \]

Note that $\Lambda'$ and $\Lambda''$ have a similar structure as $\Lambda$, because the two pairs of interleaving and deinterleaving operations amount to mere re-ordering of the spatial and the frequency indexes.

Therefore, the frequency domain turbo DFE can be made to operate on $\Lambda''$ in exactly the same way as on $\Lambda$ without any modification. In other words, the inverse SFI and SFI functions can be omitted in the equalizer-decoder loop.
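The structure-preserving property can be checked numerically with a small sketch. We assume a subcarrier-major ordering (index k·N + n), in which the channel is block diagonal with one N x N block per subcarrier, equal numbers of transmit and receive antennas, and arbitrary permutation patterns; the sizes and channel values are hypothetical.

```python
K, N = 3, 2   # hypothetical sizes: 3 subcarriers, 2 antennas (M = N assumed)

# Block-diagonal channel: entry (k*N + a, k*N + b) holds Lambda(k)[a][b].
blocks = [[[1 + 1j, 2], [3, 4 - 1j]],
          [[5, 6j], [7, 8]],
          [[9, 1], [2j, 3]]]
size = K * N
Lam = [[0j] * size for _ in range(size)]
for k in range(K):
    for a in range(N):
        for b in range(N):
            Lam[k * N + a][k * N + b] = complex(blocks[k][a][b])

freq_perm = [2, 0, 1]                  # Phi: subcarrier k moves as a group
space_perm = [[1, 0], [0, 1], [1, 0]]  # Theta: antenna permutation per subcarrier

def phi(i):
    """Frequency permutation: keeps the antenna index, moves the subcarrier."""
    k, n = divmod(i, N)
    return freq_perm[k] * N + n

def theta(i):
    """Spatial permutation: keeps the subcarrier, permutes the antennas."""
    k, n = divmod(i, N)
    return k * N + space_perm[k][n]

# P = Theta * Phi maps position i to p[i].
p = [theta(phi(i)) for i in range(size)]

# Entry (i, j) of P^T Lam P equals Lam[p[i]][p[j]].
Lam_perm = [[Lam[p[i]][p[j]] for j in range(size)] for i in range(size)]

# Entries outside the N x N diagonal blocks of the permuted matrix.
off_block = [Lam_perm[i][j] for i in range(size) for j in range(size)
             if i // N != j // N]
still_block_diagonal = all(v == 0 for v in off_block)
```

The permuted matrix remains block diagonal with the same block size, so a per-subcarrier equalizer written for the original channel can be applied to it unchanged.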

9.2.4 Computational Complexity

We now consider the computational complexity of the proposed system. We consider only the receiver, because it is much more complex than the transmitter.

To start, we examine the complexity of equalization. Recall that the equalization process is divided into three stages: MMSE block linear equalization, shaped DFE, and basic DFE.

Assume that the channel response is known. Assume also that, in each stage, we calculate the filter coefficients first (the setup phase) and then use the results in equalization (the processing phase). We use the number of complex multiplications to measure the complexity. In MMSE block linear equalization, the setup phase needs approximately $(2MN^2 + N^3)K$ computations and the processing phase $M^2NK$ computations. In the shaped DFE stage, the setup phase requires a similar amount of computation for the FFF coefficients and $MN^2K$ computations for the FBF coefficients. In the processing phase, the FFF output only needs to be calculated once per signal block, which costs $M^2NK$ computations. Each iteration then needs $N^3K$ computations for FBF filtering. In the basic DFE stage, the FFF needs no computation for setup, and the setup of FBF takes $MN^2K$ computations. Again, the FFF output only needs to be calculated once per signal block, and it takes $MN^2K$ computations. The complexity per iteration in the processing phase is the same as the shaped DFE stage.
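The counts above can be tallied with a short sketch. Two points are our assumptions rather than statements of the text: the "similar amount" for the shaped-DFE FFF setup is taken to equal the MMSE setup count, and the shaped and basic DFE stages are given the same number of iterations. The function name and the example parameters are hypothetical.

```python
def equalizer_complex_mults(M, N, K, iterations):
    """Tally the complex-multiplication counts quoted in the text."""
    mmse_setup = (2 * M * N**2 + N**3) * K
    mmse_proc = M**2 * N * K

    # Assumption: the FFF setup of the shaped DFE costs the same as the
    # MMSE setup; the FBF setup adds M N^2 K.
    shaped_setup = mmse_setup + M * N**2 * K
    shaped_proc = M**2 * N * K + iterations * N**3 * K

    # Basic DFE: no FFF setup, M N^2 K for the FBF setup.
    basic_setup = M * N**2 * K
    basic_proc = M * N**2 * K + iterations * N**3 * K

    counts = {
        "mmse_setup": mmse_setup, "mmse_proc": mmse_proc,
        "shaped_setup": shaped_setup, "shaped_proc": shaped_proc,
        "basic_setup": basic_setup, "basic_proc": basic_proc,
    }
    counts["total"] = sum(counts.values())
    return counts

# Hypothetical example: 2x2 MIMO, 64 subcarriers, 3 iterations per DFE stage.
counts = equalizer_complex_mults(M=2, N=2, K=64, iterations=3)
print(counts["total"])
```

All terms scale linearly in K, which reflects the independent per-subcarrier processing of the frequency domain design.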

Next, we examine the complexity associated with the transform. In the receiver, an inverse orthogonal transform and an orthogonal transform are needed for each turbo DFE iteration. Different transforms have different computational complexities. If the Hadamard or fast Hadamard transform (FHT) is employed, then there is no complex multiplication but only some complex additions ($2NK\log_2 K$ for FHT). If the DFT is used (which applies to MIMO block single-carrier transmission with cyclic prefixing, or CPBSC), then the amount is $2NK\log_2 K$ computations for IDFT and DFT.
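To illustrate why the FHT needs no multiplications, the following sketch implements an unnormalized fast Hadamard transform with butterflies and counts the additions and subtractions it performs. A length-K transform uses K log2(K) of them; applying it to N antenna streams for both the forward and the inverse transform gives the 2NK log2(K) figure quoted above. The function name and test input are ours.

```python
def fht(v):
    """Unnormalized fast Hadamard transform; returns (result, additions).

    len(v) must be a power of two. Only additions and subtractions are used.
    """
    out = list(v)
    K = len(out)
    additions = 0
    h = 1
    while h < K:
        for start in range(0, K, 2 * h):
            for i in range(start, start + h):
                a, b = out[i], out[i + h]
                out[i], out[i + h] = a + b, a - b   # one butterfly
                additions += 2
        h *= 2
    return out, additions

# A unit impulse transforms to the all-ones vector (first Hadamard column).
spectrum, n_add = fht([1, 0, 0, 0, 0, 0, 0, 0])
print(spectrum, n_add)
```

For K = 8 the counter reports 24 operations, which equals K log2(K).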