The maximum a posteriori probability algorithm

The maximum a posteriori probability (MAP) algorithm is a soft-output decoding algorithm. Because it was developed by Bahl, Cocke, Jelinek, and Raviv in 1974 [16], it is also called the BCJR algorithm. We first make some assumptions about the data transmission. First, the code rate $R$ is $1/n$: each input bit $u_t = (u_t^{(0)})$ generates an output symbol $v_t = (v_t^{(0)}, \ldots, v_t^{(n-1)})$. Second, binary phase shift keying (BPSK) modulation maps each binary symbol to one signal of the modulation signal set. The coded bit $v_t^{(j)}$ is mapped to the modulated signal $y_t^{(j)}$ according to (2.4) for $j = 0, \ldots, n-1$.

$$ y_t^{(j)} = (-1)^{v_t^{(j)}} = \begin{cases} +1 & \text{if } v_t^{(j)} = 0 \\ -1 & \text{if } v_t^{(j)} = 1 \end{cases} \tag{2.4} $$

Third, the channel is an additive white Gaussian noise (AWGN) channel. When the data $r_t = (r_t^{(0)}, \ldots, r_t^{(n-1)})$ are received from the channel, each $r_t^{(j)}$ can be viewed as the sum of the modulated signal $y_t^{(j)}$ and zero-mean white Gaussian noise $n_t^{(j)}$:

$$ r_t^{(j)} = y_t^{(j)} + n_t^{(j)}. \tag{2.5} $$

The variance of $n_t^{(j)}$ is $\sigma^2$, which is determined by the symbol signal-to-noise ratio (SNR).
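The channel model of (2.4) and (2.5) can be sketched in a few lines of Python. This is only an illustration: the function name `bpsk_awgn` and its parameters are hypothetical, and the code assumes unit-energy BPSK symbols so that $\sigma^2 = N_0/(2E_s) = 1/(2 E_s/N_0)$.

```python
import numpy as np

def bpsk_awgn(coded_bits, es_n0_db, rng=None):
    """Map coded bits to BPSK per (2.4) and add Gaussian noise per (2.5).

    Assumption: unit symbol energy, so sigma^2 = 1 / (2 * Es/N0).
    """
    rng = np.random.default_rng() if rng is None else rng
    v = np.asarray(coded_bits)
    y = 1.0 - 2.0 * v                      # (-1)**v: bit 0 -> +1, bit 1 -> -1
    es_n0 = 10.0 ** (es_n0_db / 10.0)      # dB to linear scale
    sigma2 = 1.0 / (2.0 * es_n0)           # noise variance per real dimension
    noise = rng.normal(0.0, np.sqrt(sigma2), size=y.shape)
    return y, y + noise, sigma2

# Example: four coded bits at Es/N0 = 0 dB
y, r, sigma2 = bpsk_awgn([0, 1, 1, 0], 0.0, np.random.default_rng(0))
```

If the bit SNR is given instead, the symbol SNR in dB can be obtained by adding $10\log_{10} R$, since $E_s = R E_b$.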

The symbol SNR is usually denoted by $E_s/N_0$. We can also use the bit SNR, $E_b/N_0$, to calculate $\sigma^2$ because $E_s = R E_b$. The transition probabilities of a length-$N$ sequence are defined by

$$ \Pr\{r_0, \ldots, r_{N-1} \mid y_0, \ldots, y_{N-1}\} \triangleq \prod_{t=0}^{N-1} \Pr\{r_t^{(0)}, \ldots, r_t^{(n-1)} \mid y_t^{(0)}, \ldots, y_t^{(n-1)}\}, \tag{2.6} $$

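where

$$ \Pr\{r_t^{(0)}, \ldots, r_t^{(n-1)} \mid y_t^{(0)}, \ldots, y_t^{(n-1)}\} = \prod_{j=0}^{n-1} \Pr\{r_t^{(j)} \mid y_t^{(j)}\} \tag{2.7} $$

$$ \Pr\{r_t^{(j)} \mid y_t^{(j)}\} = \frac{1}{\sqrt{2\pi}\,\sigma} \exp\left(-\frac{(r_t^{(j)} - y_t^{(j)})^2}{2\sigma^2}\right). \tag{2.8} $$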

Given the received data sequence from channel, the MAP algorithm can generate the a posteriori probability (APP) of each transmitted symbol as

$$ \Pr\{u_t \mid r_0, \ldots, r_{N-1}\}. \tag{2.9} $$

The APP is further used to compute the log-likelihood ratio (LLR)

$$ L(u_t) \triangleq \ln \frac{\Pr\{u_t = 0 \mid r_0, \ldots, r_{N-1}\}}{\Pr\{u_t = 1 \mid r_0, \ldots, r_{N-1}\}} \tag{2.10} $$

and then make the hard decision.
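As a small illustration of the LLR in (2.10) and the hard decision, the following sketch converts an a posteriori probability into an LLR and decides the bit from its sign. The function names are hypothetical, and the tie rule (deciding 0 when the LLR is exactly zero) is an assumption of this sketch.

```python
import math

def llr_from_app(p0):
    """L(u_t) = ln(Pr{u_t=0 | r} / Pr{u_t=1 | r}), where p0 = Pr{u_t=0 | r}.

    Requires 0 < p0 < 1.
    """
    return math.log(p0) - math.log1p(-p0)   # log1p(-p0) = ln(1 - p0)

def hard_decision(llr):
    """Decide bit 0 for a non-negative LLR, bit 1 otherwise (tie rule assumed)."""
    return 0 if llr >= 0 else 1

# A bit with Pr{u_t = 0 | r} = 0.9 has a positive LLR and is decided as 0.
decision = hard_decision(llr_from_app(0.9))
```

The magnitude of the LLR indicates the reliability of the decision, which is what makes the output "soft".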

The LLR can be rewritten as (2.12) by using the definition of conditional probability.

$$ L(u_t) = \ln \frac{\Pr\{u_t = 0; r_0, \ldots, r_{N-1}\} / \Pr\{r_0, \ldots, r_{N-1}\}}{\Pr\{u_t = 1; r_0, \ldots, r_{N-1}\} / \Pr\{r_0, \ldots, r_{N-1}\}} = \ln \frac{\Pr\{u_t = 0; r_0, \ldots, r_{N-1}\}}{\Pr\{u_t = 1; r_0, \ldots, r_{N-1}\}} \tag{2.12} $$

The events $u_t = 0$ and $u_t = 1$ correspond to their respective sets of state transitions in the trellis diagram, so we have the equivalence

$$ \Pr\{u_t\} = \sum_{(S_t, S_{t+1})} \Pr\{u_t; S_t, S_{t+1}\}. \tag{2.13} $$

Note that $S_t$ is the state at time $t$ of the trellis diagram, and $(S_t, S_{t+1})$ represents the state transition from $S_t$ to $S_{t+1}$. With (2.13), the LLR calculation is modified to

$$ L(u_t) = \ln \frac{\sum_{(S_t, S_{t+1})} \Pr\{u_t = 0; S_t, S_{t+1}; r_0, \ldots, r_{N-1}\}}{\sum_{(S_t, S_{t+1})} \Pr\{u_t = 1; S_t, S_{t+1}; r_0, \ldots, r_{N-1}\}} = \ln \frac{\sum_{(u_t = 0; S_t, S_{t+1})} \Pr\{S_t, S_{t+1}; r_0, \ldots, r_{N-1}\}}{\sum_{(u_t = 1; S_t, S_{t+1})} \Pr\{S_t, S_{t+1}; r_0, \ldots, r_{N-1}\}}. \tag{2.14} $$

The joint probability $\Pr\{S_t, S_{t+1}; r_0, \ldots, r_{N-1}\}$ is involved in the LLR calculation. If there is no transition from $S_t$ to $S_{t+1}$, this probability is zero. Otherwise, it can be decomposed as (2.15) with Bayes' rule.

$$ \Pr\{S_t, S_{t+1}; r_0, \ldots, r_{N-1}\} = \Pr\{S_t; r_0, \ldots, r_{t-1}\} \times \Pr\{S_{t+1}; r_t \mid S_t; r_0, \ldots, r_{t-1}\} \times \Pr\{r_{t+1}, \ldots, r_{N-1} \mid S_t, S_{t+1}; r_0, \ldots, r_{t-1}, r_t\} \tag{2.15} $$

We can simplify the two conditional probabilities in (2.15) by removing the redundant conditions. Since $S_t$ is given, the transition to $S_{t+1}$ with $r_t$ is independent of the previous data $(r_0, \ldots, r_{t-1})$. Similarly, the condition $S_{t+1}$ is sufficient for the last conditional probability.

$$ \Pr\{S_{t+1}; r_t \mid S_t; r_0, \ldots, r_{t-1}\} = \Pr\{S_{t+1}; r_t \mid S_t\} \tag{2.16} $$

$$ \Pr\{r_{t+1}, \ldots, r_{N-1} \mid S_t, S_{t+1}; r_0, \ldots, r_{t-1}, r_t\} = \Pr\{r_{t+1}, \ldots, r_{N-1} \mid S_{t+1}\} \tag{2.17} $$

Then the factorization of $\Pr\{S_t, S_{t+1}; r_0, \ldots, r_{N-1}\}$ becomes

$$ \Pr\{S_t; r_0, \ldots, r_{t-1}\} \times \Pr\{S_{t+1}; r_t \mid S_t\} \times \Pr\{r_{t+1}, \ldots, r_{N-1} \mid S_{t+1}\}. \tag{2.18} $$

Now we define three functions:

$$ \alpha(S_t) = \ln \Pr\{S_t; r_0, \ldots, r_{t-1}\} \tag{2.19} $$

$$ \gamma(S_t, S_{t+1}) = \ln \Pr\{S_{t+1}; r_t \mid S_t\} \tag{2.20} $$

$$ \beta(S_t) = \ln \Pr\{r_t, \ldots, r_{N-1} \mid S_t\}, \tag{2.21} $$

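where $\alpha(S_t)$ is called the forward metric, $\gamma(S_t, S_{t+1})$ the branch metric, and $\beta(S_t)$ the backward metric. Thus (2.18) can be rewritten as

$$ \Pr\{S_t, S_{t+1}; r_0, \ldots, r_{N-1}\} = \exp\left(\alpha(S_t) + \gamma(S_t, S_{t+1}) + \beta(S_{t+1})\right). \tag{2.22} $$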

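By substituting (2.22) for the joint probability in (2.14), the LLR becomes

$$ L(u_t) = \ln \frac{\sum_{(u_t = 0; S_t, S_{t+1})} \exp\left(\alpha(S_t) + \gamma(S_t, S_{t+1}) + \beta(S_{t+1})\right)}{\sum_{(u_t = 1; S_t, S_{t+1})} \exp\left(\alpha(S_t) + \gamma(S_t, S_{t+1}) + \beta(S_{t+1})\right)}. \tag{2.23} $$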

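On the other hand, the definition in (2.19) can be expanded as

$$ \exp(\alpha(S_t)) = \Pr\{S_t; r_0, \ldots, r_{t-1}\} = \sum_{S_{t-1}} \Pr\{S_{t-1}; r_0, \ldots, r_{t-2}\} \Pr\{S_t; r_{t-1} \mid S_{t-1}\} = \sum_{S_{t-1}} \exp\left(\alpha(S_{t-1}) + \gamma(S_{t-1}, S_t)\right). \tag{2.24} $$

Then we take the natural logarithm of both sides of (2.24):

$$ \alpha(S_t) = \ln \sum_{S_{t-1}} \exp\left(\alpha(S_{t-1}) + \gamma(S_{t-1}, S_t)\right). \tag{2.25} $$

The calculation of all $\alpha(S_t)$ with $0 \le t \le N$ is a forward recursion. Such a recursive method needs an appropriate initial condition. If the encoder starts from $S^{(0)}$, the condition is

$$ \alpha(S_0) = \begin{cases} 0 & \text{if } S_0 = S^{(0)} \\ -\infty & \text{otherwise.} \end{cases} \tag{2.26} $$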

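We can make a similar deduction for the backward metric $\beta(S_t)$. The first step is

$$ \exp(\beta(S_t)) = \Pr\{r_t, \ldots, r_{N-1} \mid S_t\} = \sum_{S_{t+1}} \Pr\{S_{t+1}; r_t \mid S_t\} \Pr\{r_{t+1}, \ldots, r_{N-1} \mid S_{t+1}\} = \sum_{S_{t+1}} \exp\left(\gamma(S_t, S_{t+1}) + \beta(S_{t+1})\right). \tag{2.27} $$

After taking the natural logarithm, (2.27) becomes

$$ \beta(S_t) = \ln \sum_{S_{t+1}} \exp\left(\beta(S_{t+1}) + \gamma(S_t, S_{t+1})\right). \tag{2.28} $$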

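Furthermore, the branch metric in (2.20) can be written as

$$ \gamma(S_t, S_{t+1}) = \ln\left(\Pr\{S_{t+1} \mid S_t\} \Pr\{r_t \mid S_t, S_{t+1}\}\right) \tag{2.29} $$

$$ \phantom{\gamma(S_t, S_{t+1})} = \ln\left(\Pr\{u_t\} \Pr\{r_t \mid y'_t\}\right), \tag{2.30} $$

where $u_t$ and $y'_t$ are the corresponding information bit and modulated output on the state transition $(S_t, S_{t+1})$. To find $\Pr\{u_t\}$, we need the a priori information represented by (2.31). For simplicity, we use the modulated signal $u'_t = +1$ to replace $u_t = 0$ and $u'_t = -1$ to replace $u_t = 1$.

$$ L_a(u'_t) \triangleq \ln \frac{\Pr\{u'_t = +1\}}{\Pr\{u'_t = -1\}} \tag{2.31} $$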

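We use $L_a(u'_t)$ to calculate the a priori probability

$$ \Pr\{u'_t = \pm 1\} = \frac{e^{\pm L_a(u'_t)}}{1 + e^{\pm L_a(u'_t)}} = \frac{e^{-L_a(u'_t)/2}}{1 + e^{-L_a(u'_t)}} \, e^{u'_t L_a(u'_t)/2} = A_t \, e^{u'_t L_a(u'_t)/2}, \tag{2.32} $$

where $A_t$ does not depend on the sign of $u'_t$. Similarly, the channel transition probability can be written as

$$ \Pr\{r_t \mid y'_t\} = B_t \exp\left(\frac{L_c}{2} \sum_{j=0}^{n-1} r_t^{(j)} y_t'^{(j)}\right), \tag{2.33} $$

where $B_t$ is a constant. In addition, the channel reliability value $L_c$ is $2/\sigma^2$, which equals $4E_s/N_0$ for the AWGN channel [21]. Equations (2.32) and (2.33) change the branch metric to

$$ \gamma(S_t, S_{t+1}) = \ln A_t + \ln B_t + \frac{1}{2} u'_t L_a(u'_t) + \frac{L_c}{2} \sum_{j=0}^{n-1} r_t^{(j)} y_t'^{(j)}. \tag{2.34} $$

Figure 2.6: Forward metric calculation and backward metric calculation. (a) Recursive $\alpha(S_t)$ computation from the states $S_{t-1}$. (b) Recursive $\beta(S_t)$ computation from the states $S_{t+1}$.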

Consequently, the MAP algorithm needs the forward metrics, backward metrics, and branch metrics to obtain all the LLRs. It first initializes $\alpha(S_0)$ and $\beta(S_N)$. After receiving the codeword symbol $r_t$, the decoder can derive $\gamma(S_t, S_{t+1})$ for each branch in the trellis diagram. The decoder then uses these branch metrics to calculate $\alpha(S_t)$ and $\beta(S_t)$ recursively. The respective computations of forward metrics and backward metrics are described graphically in Fig. 2.6. Here each state at time $t$ has two incoming branches from different states, $S'_{t-1}$ and $S''_{t-1}$, and two outgoing branches to different states, $S'_{t+1}$ and $S''_{t+1}$. Once $\alpha(S_t)$ and $\beta(S_{t+1})$ are available, the LLR $L(u_t)$ and the decision $\hat{u}_t$ can be determined.
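The complete procedure (branch metrics, forward recursion, backward recursion, LLR) can be sketched as follows. This is a minimal illustration, not the decoder of this work: the function names, the table layout (`next_state[s, u]`, `out_bits[s, u, j]`), and the 4-state rate-1/2 feedforward example code with generators (7, 5) are all assumptions chosen for the example. The constants $\ln A_t$ and $\ln B_t$ are dropped because they are common to all branches at time $t$ and cancel in the LLR, and the encoder is assumed to start in state 0 and to end in the known state `end_state`.

```python
import numpy as np

def build_trellis_75():
    """Trellis tables of a 4-state (7,5) feedforward code (example assumption)."""
    next_state = np.zeros((4, 2), dtype=int)
    out_bits = np.zeros((4, 2, 2), dtype=int)
    for s in range(4):
        s1, s2 = (s >> 1) & 1, s & 1            # s1 = u_{t-1}, s2 = u_{t-2}
        for u in (0, 1):
            next_state[s, u] = (u << 1) | s1
            out_bits[s, u] = [u ^ s1 ^ s2, u ^ s2]
    return next_state, out_bits

def log_map_decode(r, next_state, out_bits, sigma2, la=None, end_state=0):
    """Return L(u_t) for t = 0..N-1 using log-domain forward/backward metrics."""
    r = np.asarray(r, dtype=float)               # shape (N, n)
    N, S = r.shape[0], next_state.shape[0]
    la = np.zeros(N) if la is None else np.asarray(la, dtype=float)
    lc = 2.0 / sigma2                            # channel reliability value
    y = 1.0 - 2.0 * out_bits                     # modulated branch outputs
    u_mod = np.array([1.0, -1.0])                # u' = +1 for u = 0, -1 for u = 1
    # Branch metrics gamma[t, s, u] without the constants ln A_t and ln B_t
    gamma = (0.5 * la[:, None, None] * u_mod[None, None, :]
             + 0.5 * lc * np.einsum('tj,suj->tsu', r, y))

    alpha = np.full((N + 1, S), -np.inf)
    alpha[0, 0] = 0.0                            # encoder starts in state 0
    for t in range(N):
        for s in range(S):
            for u in (0, 1):
                s_next = next_state[s, u]
                alpha[t + 1, s_next] = np.logaddexp(
                    alpha[t + 1, s_next], alpha[t, s] + gamma[t, s, u])

    beta = np.full((N + 1, S), -np.inf)
    beta[N, end_state] = 0.0                     # known final state
    for t in range(N - 1, -1, -1):
        for s in range(S):
            for u in (0, 1):
                beta[t, s] = np.logaddexp(
                    beta[t, s], gamma[t, s, u] + beta[t + 1, next_state[s, u]])

    llr = np.empty(N)
    for t in range(N):
        acc = [-np.inf, -np.inf]                 # log-sums for u_t = 0 and u_t = 1
        for s in range(S):
            for u in (0, 1):
                m = alpha[t, s] + gamma[t, s, u] + beta[t + 1, next_state[s, u]]
                acc[u] = np.logaddexp(acc[u], m)
        llr[t] = acc[0] - acc[1]
    return llr
```

With noiseless BPSK symbols of a terminated sequence as input, the sign of each returned LLR reproduces the transmitted bit (a negative LLR means bit 1). For termination bits, only the $u_t = 0$ branches survive, so their LLR is $+\infty$ in this sketch.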

The MAP algorithm is often realized as the Log-MAP algorithm or approximated by the Max-Log-MAP algorithm in order to reduce implementation complexity [22]. We use the Jacobian function [23]

$$ \ln(e^{x_1} + e^{x_2}) \triangleq \max{}^{*}(x_1, x_2) = \max(x_1, x_2) + \ln\left(1 + e^{-|x_1 - x_2|}\right) \tag{2.36} $$

and its extension

$$ \ln(e^{x_1} + e^{x_2} + e^{x_3} + \cdots + e^{x_q}) = \max{}^{*}(x_1, x_2, x_3, \ldots, x_q) = \max{}^{*}(\cdots \max{}^{*}(\max{}^{*}(x_1, x_2), x_3) \cdots, x_q) \tag{2.37} $$

to replace the original computations in (2.23), (2.25), and (2.28). In a practical design, the value of $\ln(1 + e^{-|x_1 - x_2|})$ can be found via a lookup table. If the logarithmic term is very small, it can be omitted, and ordinary max operations then replace the $\max^{*}$ operations:

$$ \max{}^{*}(x_1, x_2) \approx \max(x_1, x_2). \tag{2.38} $$
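The Jacobian function and its nested extension are easy to check numerically. The sketch below uses hypothetical names (`max_star`, `max_star_n`) and floating-point arithmetic rather than the lookup table of a hardware design.

```python
import math
from functools import reduce

def max_star(x1, x2):
    """ln(e^x1 + e^x2) computed as max(x1, x2) plus a correction term."""
    return max(x1, x2) + math.log1p(math.exp(-abs(x1 - x2)))

def max_star_n(values):
    """Nested application of max_star over a non-empty sequence."""
    return reduce(max_star, values)

# The correction term is largest when both inputs are equal (ln 2)
# and vanishes as the inputs move apart.
exact = max_star(1.0, 2.0)          # equals ln(e^1 + e^2)
approx = max(1.0, 2.0)              # Max-Log approximation
error = exact - approx              # ln(1 + e^-1), roughly 0.313
```

Since the correction term never exceeds $\ln 2$, each two-input max operation underestimates the exact result by at most $\ln 2$.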

We express both (2.25) and (2.28) in a simpler form:

$$ \alpha(S_t) = \max_{S_{t-1}} \left[\alpha(S_{t-1}) + \gamma(S_{t-1}, S_t)\right] \tag{2.39} $$

$$ \beta(S_t) = \max_{S_{t+1}} \left[\beta(S_{t+1}) + \gamma(S_t, S_{t+1})\right]. \tag{2.40} $$

Thus, $L(u_t)$ becomes

$$ L(u_t) = \max_{(S_t, S_{t+1}): u_t = 0} \left[\alpha(S_t) + \gamma(S_t, S_{t+1}) + \beta(S_{t+1})\right] - \max_{(S_t, S_{t+1}): u_t = 1} \left[\alpha(S_t) + \gamma(S_t, S_{t+1}) + \beta(S_{t+1})\right]. \tag{2.41} $$

If the algorithm still uses $\max^{*}$ operations, it is called the Log-MAP algorithm, and its performance is equivalent to that of the MAP algorithm. The approximation in (2.38) leads to the Max-Log-MAP algorithm, which uses max operations. Because the logarithmic term in (2.36) is discarded, there is some performance degradation. However, the Max-Log-MAP algorithm contains only addition, comparison, and selection functions, so it is more suitable for circuit implementation. Moreover, its recursive metric calculations are similar to the critical add-compare-select (ACS) operation of the Viterbi algorithm [20]. As a result, a decoder using the Max-Log-MAP algorithm can adopt many techniques originally developed for Viterbi decoders.
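One step of the forward recursion (2.39) can be written as an explicit add-compare-select loop. The sketch below is illustrative only; the function name `acs_forward_step` and the table layout `next_state[s, u]` are assumptions of this example, not part of the original design.

```python
import numpy as np

def acs_forward_step(alpha_prev, gamma, next_state):
    """One Max-Log-MAP forward step: alpha(S_t) = max[alpha(S_{t-1}) + gamma].

    alpha_prev[s]    : forward metric of state s at time t-1
    gamma[s, u]      : branch metric of the branch leaving s with input u
    next_state[s, u] : state reached from s with input u
    """
    alpha = np.full(alpha_prev.shape, -np.inf)
    for s in range(next_state.shape[0]):
        for u in range(next_state.shape[1]):
            candidate = alpha_prev[s] + gamma[s, u]         # add
            s_next = next_state[s, u]
            alpha[s_next] = max(alpha[s_next], candidate)   # compare, select
    return alpha

# Two-state example: both states can be reached from either state.
next_state = np.array([[0, 1], [0, 1]])
alpha_prev = np.array([0.0, -1.0])
gamma = np.array([[1.0, 2.0], [3.0, 0.0]])
alpha = acs_forward_step(alpha_prev, gamma, next_state)     # both metrics are 2.0
```

Unlike a Viterbi decoder, no survivor path is stored here; only the selected metric is kept for the next step.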

Figure 2.7: The MAP algorithm with sliding window technique (windows $W_{j-1}$ to $W_{j+3}$ plotted against time $t_0$ to $t_5$, showing the $\alpha$, $\beta$, and LLR operations)

The optimal or suboptimal MAP algorithm encounters another implementation difficulty when the block size $N$ is large. The data $(r_0, \ldots, r_{N-1})$ are usually sent to the decoder in ascending order, so the forward metrics can be derived immediately from the initial condition $\alpha(S_0)$. Only after the whole sequence has been received can the recursive calculation of backward metrics start from its only known state $S_N$. Both $\alpha(S_t)$ and $\beta(S_{t+1})$ are necessary to calculate $L(u_t)$ for $t = 0, \ldots, N-1$. Hence, all forward metrics must be kept during this decoding procedure. The memory requirement would be considerable, and the decoder would become impractical. To reduce this hardware overhead, the sliding window technique [24, 25] exploits a dummy calculation to provide reliable metric initialization at any time. As shown in Fig. 2.7, the codeword block is divided into $\lceil N/L \rceil$ windows of length $L$, and $W_j$ stands for the $j$-th window. The dummy backward recursion $\beta_d$ is an operation similar to $\beta$. Except in the last window, the initial $\beta_d$ within each window is unknown. We set the $\beta_d$ of all $2^m$ states in the $(j+1)$-th window to be equally probable:

$$ \beta_d(S_{(j+2)L}) = \ln \frac{1}{2^m} \quad \text{for } S_{(j+2)L} \in \{S^{(0)}, S^{(1)}, \ldots, S^{(2^m - 1)}\}. \tag{2.42} $$

The $\beta_d$ in the last window is the same as $\beta(S_N)$. When the $\beta_d$ process in the $(j+1)$-th window finishes, the initial metrics $\beta(S_{(j+1)L})$ of the $j$-th window are available for the $\beta$ recursion. The exact operations from $t_0$ to $t_1$ in Fig. 2.7 can be expressed as follows:











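$$ \begin{aligned} \beta_d &: S_{(j+2)L} \to S_{(j+2)L-1} \to \cdots \to S_{(j+1)L+1} \to S_{(j+1)L} \\ \alpha &: S_{jL} \to S_{jL+1} \to \cdots \to S_{(j+1)L-1} \to S_{(j+1)L} \\ \beta &: S_{jL} \to S_{jL-1} \to \cdots \to S_{(j-1)L+1} \to S_{(j-1)L} \\ \mathrm{LLR} &: u_{jL} \to u_{jL-1} \to \cdots \to u_{(j-1)L+1} \to u_{(j-1)L} \end{aligned} \tag{2.43} $$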

During the $\beta_d$ operation of the $(j+1)$-th window, the decoder concurrently performs the following operations: the $\alpha$ recursion of the $j$-th window, and the $\beta$ recursion and $L(u_t)$ calculation of the $(j-1)$-th window. The calculation of $L(u_t)$ is possible because all $\alpha$ results of the $(j-1)$-th window have already been computed and stored in the memory. We also use the same memory to store the $\alpha$ of the $j$-th window. In the subsequent process between $t_1$ and $t_2$, the $L(u_t)$ of the $j$-th window can be derived from the $\alpha$ in the memory, the $\beta$ being computed, and the corresponding branch metrics. Instead of keeping $N \times 2^m$ $\alpha$ metrics, a decoder with the sliding window technique requires a smaller memory for only $L \times 2^m$ $\alpha$ metrics.
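The window schedule and the memory saving can be tabulated with a short sketch. The function names and the 0-based window numbering are assumptions of this example. It also assumes that no dummy recursion is scheduled for window 0 (no earlier window needs its result) and that the last window takes its initial metrics from $\beta(S_N)$.

```python
import math

def sliding_window_schedule(N, L):
    """List, per time slot, which window each operation works on.

    In slot k the dummy recursion handles window k+1, the forward
    recursion window k, and the backward recursion with LLR output
    window k-1. None means the unit is idle in that slot.
    """
    W = math.ceil(N / L)                      # number of windows
    schedule = []
    for k in range(W + 1):
        schedule.append({
            "slot": k,
            "beta_d": k + 1 if k + 1 <= W - 1 else None,
            "alpha": k if k <= W - 1 else None,
            "beta_llr": k - 1 if k >= 1 else None,
        })
    return schedule

def alpha_memory(N, L, m):
    """Number of stored alpha metrics without and with sliding windows."""
    return N * 2 ** m, L * 2 ** m

for row in sliding_window_schedule(N=10, L=4):
    print(row)
full, windowed = alpha_memory(N=4096, L=32, m=3)   # 32768 versus 256 metrics
```

The example parameters ($N = 4096$, $L = 32$, $m = 3$) are arbitrary; they only illustrate that the storage scales with the window length instead of the block length.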
