Code Construction - Systematic Code Design Combining Error Correction and Channel Estimation, and Its Maximum-Likelihood Decoding

3.2.1 Code Constraint that Maximizes the Average SNR Regardless of Channel Statistics

From the system model in (3.1), it can be derived that the average SNR conditional on the input B satisfies

$$\frac{E[\|Bh\|^2 \mid B]}{E[\|n\|^2]} = \frac{N}{L\sigma_n^2}\,\mathrm{tr}\!\left(E[hh^H]\,\frac{1}{N}B^TB\right). \tag{3.3}$$

Since neither the transmitter nor the receiver knows anything about the channel coefficients $h$, the average SNR can be as low as

$$\min_{\{h\,:\,\mathrm{tr}(E[hh^H])=\tau\}} \frac{E[\|Bh\|^2 \mid B]}{E[\|n\|^2]},$$

where $\tau$ is a certain (possibly unknown) power level of the channel coefficients $h$. We then find that this worst-case SNR is upper-bounded by a constant, i.e.,

$$\min_{\{h\,:\,\mathrm{tr}(E[hh^H])=\tau\}} \frac{E[\|Bh\|^2 \mid B]}{E[\|n\|^2]} \le \frac{E[\|B\tilde{h}\|^2 \mid B]}{E[\|n\|^2]} = \frac{N}{L\sigma_n^2}\,\tau,$$

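where the inequality holds because substituting any $h$ that satisfies $\mathrm{tr}(E[hh^H]) = \tau$ into $E[\|Bh\|^2 \mid B]/E[\|n\|^2]$ yields an upper bound on the minimum; here we take $\tilde{h}$ to be zero-mean i.i.d. with $\mathrm{tr}(E[\tilde{h}\tilde{h}^H]) = \tau$, so that $E[\tilde{h}\tilde{h}^H] = (\tau/P)\,I_P$ and the equality follows from (3.3) because every diagonal entry of $B^TB$ equals $N$. It is thus straightforward from (3.3) that this constant SNR bound is achieved even if the system is totally blind to the channel coefficients $h$ (as well as to the power level $\tau$), provided that the codeword is designed to be self-orthogonal in the sense that

$$\frac{1}{N}B^TB = I_P. \tag{3.4}$$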

Condition (3.4) has an operational meaning. It ensures that every codeword is orthogonal to the shifted versions of itself, and hence temporal diversity is implicitly realized even without any knowledge of the channel statistics. We henceforth say that codewords constrained by (3.4) maximize the average SNR attainable regardless of the statistics of $h$ [14].
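As an illustration of (3.3) and (3.4), the following Python sketch evaluates the average SNR for two short codewords with P = 2. It assumes that B is the L × P banded Toeplitz matrix whose p-th column is b delayed by p samples, with L = N + P − 1; this structure is taken from the description of shifted codewords above, not from (3.1) itself. The helper names `toeplitz_B`, `gram` and `average_snr`, and the two test covariance matrices, are hypothetical and chosen only for this example.

```python
def toeplitz_B(b, P):
    """Return the L x P matrix B (L = N + P - 1); column p is b delayed by p samples.
    The Toeplitz structure is an assumption of this sketch."""
    N = len(b)
    L = N + P - 1
    return [[b[r - p] if 0 <= r - p < N else 0 for p in range(P)] for r in range(L)]

def gram(B):
    """Return B^T B as a P x P list of lists."""
    P = len(B[0])
    return [[sum(row[i] * row[j] for row in B) for j in range(P)] for i in range(P)]

def average_snr(b, R, sigma2):
    """Average SNR tr(R B^T B) / (L sigma2), i.e. (3.3) with R = E[h h^H].
    R is taken real symmetric here to keep the example short."""
    P = len(R)
    G = gram(toeplitz_B(b, P))
    L = len(b) + P - 1
    trace = sum(R[i][j] * G[j][i] for i in range(P) for j in range(P))
    return trace / (L * sigma2)

tau, sigma2 = 1.0, 0.5
R_iid = [[tau / 2, 0.0], [0.0, tau / 2]]             # i.i.d. taps, trace tau
R_bad = [[tau / 2, -tau / 2], [-tau / 2, tau / 2]]   # correlated taps, trace tau

b_orth = [-1, -1, 1]     # N = 3 (odd): satisfies (3.4) for P = 2
b_plain = [-1, -1, -1]   # violates (3.4): the off-diagonal entry of B^T B is 2

for name, b in (("self-orthogonal", b_orth), ("not self-orthogonal", b_plain)):
    print(name, gram(toeplitz_B(b, 2)),
          average_snr(b, R_iid, sigma2), average_snr(b, R_bad, sigma2))
```

For the self-orthogonal codeword, both covariance matrices give the same SNR, N τ/(L σ_n²). For the second codeword, the correlated channel lowers the SNR, which is the worst-case behavior that (3.4) is meant to rule out.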

Unfortunately, a codeword sequence satisfying (3.4) is only guaranteed to exist for P = 2 with N odd (and trivially, for P = 1). In some other cases, such as P = 3, one can only design codes to approximately satisfy (3.4). For example,

We therefore relax (3.4) and allow some off-diagonal entries in B^TB to be either −1 or 1 whenever it is impossible to strictly satisfy (3.4). We will denote such a matrix as G.

After establishing (3.4), we find that this particular structure of $G$ is indeed observed in the codes found by simulated-annealing-based computer search. Specifically, for $4 \le N \le 18$ with $N$ even, the best computer-searched half-rate codes that minimize the sum of PEPs under complex zero-mean Gaussian distributed $h$ with $E[hh^H] = (1/2)\,I_P$ and $P = 2$ all satisfy the relation

$$B^TB = \begin{bmatrix} N & \pm 1 \\ \pm 1 & N \end{bmatrix}. \tag{3.5}$$

We have also obtained and examined the computer-searched code used in [30] for N = 22, and found as anticipated that every codeword satisfies (3.5).

We close this subsection by relating condition (3.4) to existing results in the literature. The authors of [8] suggest that, for optimal channel estimation, the training sequence $b$ can be chosen such that $B^TB$ is proportional to $I_P$, which agrees with (3.4). Moreover, condition (3.4) has also been identified in [13], where the authors remark [13, p. 1591] that a code sequence with a certain aperiodic autocorrelation property could possibly be exploited in future code design approaches. This is indeed one of the main research goals of this chapter.

[Block diagram: b → Equivalent Channel (h, ⊕) → y → Outer product demodulator → yy^H → Minimum Euclidean Distance Selector → b̂]

Figure 3.1: Equivalent system model for combined channel estimation and error protection codes.

3.2.2 Equivalent System Model for Joint Channel and Data Estimation

By noting that $P_B$ is idempotent and symmetric, and that both $\mathrm{tr}(P_B)$ and $\|\mathrm{vec}(P_B)\|^2$ equal $P$, where $\mathrm{vec}(\cdot)$ denotes the operation that transforms a matrix into a vector, the joint ML decision in (3.2) can be reformulated as

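$$\begin{aligned}
\hat{b} &= \arg\min_{b\in C}\,(y - P_B y)^H (y - P_B y) \\
&= \arg\min_{b\in C}\,-\mathrm{tr}(P_B\, yy^H) \qquad\qquad (3.6) \\
&= \arg\min_{b\in C}\,\|\mathrm{vec}(yy^H)\|^2 - \mathrm{vec}(P_B)^T\mathrm{vec}(yy^H) - \mathrm{vec}(yy^H)^H\mathrm{vec}(P_B) + \|\mathrm{vec}(P_B)\|^2 \\
&= \arg\min_{b\in C}\,\|\mathrm{vec}(yy^H) - \mathrm{vec}(P_B)\|^2.
\end{aligned}$$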

This implies that the ML decision can be obtained by finding the codeword $P_B$ whose Euclidean distance to $yy^H$ is the smallest.
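The chain of equalities in (3.6) can be checked numerically. The sketch below is a minimal illustration for P = 2 under stated assumptions: B is again taken to be the banded Toeplitz matrix of shifted codewords, the four-codeword list and the noise level are hypothetical, and the helper names are not from the original text. It computes the residual energy, the negative trace and the outer-product distance for every candidate codeword and compares the resulting decisions.

```python
import random

def toeplitz_B(b, P=2):
    """L x P Toeplitz matrix of shifted codewords (assumed structure, L = N + P - 1)."""
    N = len(b)
    return [[b[r - p] if 0 <= r - p < N else 0 for p in range(P)]
            for r in range(N + P - 1)]

def projector(b):
    """P_B = B (B^T B)^{-1} B^T for P = 2, using the closed-form 2 x 2 inverse."""
    B = toeplitz_B(b)
    g00 = sum(r[0] * r[0] for r in B)
    g01 = sum(r[0] * r[1] for r in B)
    g11 = sum(r[1] * r[1] for r in B)
    det = g00 * g11 - g01 * g01
    inv = [[g11 / det, -g01 / det], [-g01 / det, g00 / det]]
    L = len(B)
    return [[sum(B[i][p] * inv[p][q] * B[j][q] for p in range(2) for q in range(2))
             for j in range(L)] for i in range(L)]

def metrics(b, y):
    """Return the three equivalent decision metrics appearing in (3.6)."""
    PB = projector(b)
    L = len(y)
    Py = [sum(PB[i][j] * y[j] for j in range(L)) for i in range(L)]
    residual = sum(abs(y[i] - Py[i]) ** 2 for i in range(L))
    neg_trace = -sum(PB[i][j] * y[j] * y[i].conjugate()
                     for i in range(L) for j in range(L)).real
    distance = sum(abs(y[i] * y[j].conjugate() - PB[i][j]) ** 2
                   for i in range(L) for j in range(L))
    return residual, neg_trace, distance

random.seed(1)
codebook = [[-1, -1, 1, -1], [-1, 1, 1, -1], [-1, -1, -1, 1], [-1, 1, 1, 1]]
h = [complex(random.gauss(0, 1), random.gauss(0, 1)) for _ in range(2)]
B = toeplitz_B(codebook[2])                      # codeword index 2 is transmitted
y = [sum(B[i][p] * h[p] for p in range(2))
     + 0.1 * complex(random.gauss(0, 1), random.gauss(0, 1)) for i in range(len(B))]

scores = [metrics(b, y) for b in codebook]
decisions = [min(range(len(codebook)), key=lambda k, m=m: scores[k][m])
             for m in range(3)]
print(decisions)   # the three entries coincide
```

The three metrics differ only by terms that do not depend on the codeword, namely $\|y\|^2$, $\|\mathrm{vec}(yy^H)\|^2$ and $\|\mathrm{vec}(P_B)\|^2 = P$, which is why they select the same codeword.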

We therefore transform the original system in (3.1) into an equivalent system model that contains an outer product demodulator and a minimum Euclidean distance selector in the $P_B$-domain, as shown in Fig. 3.1. As the outer product demodulator can be viewed as a generalization of square-law combining, which is widely used in non-coherent detection for both slow and fast fading [28], this equivalent transformation suggests a potential application of combined channel estimation and error protection codes to non-coherent systems in which the fading is rapid enough to preclude a good estimate of the channel coefficients. How to design codes for unknown fast-fading channels is discussed further in subsequent chapters.

We can then bound the ML error probability by

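$$P_e \le \frac{1}{2^K}\sum_{i=1}^{2^K}\sum_{\substack{j=1 \\ j\ne i}}^{2^K} \Pr\Big(\|\mathrm{vec}(yy^H) - \mathrm{vec}(P_{B_j})\|^2 < \|\mathrm{vec}(yy^H) - \mathrm{vec}(P_{B_i})\|^2 \,\Big|\, b_i \text{ transmitted}\Big), \tag{3.7}$$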

where $b_i$ is the $i$th codeword of an $(N, K)$ block code, and $P_{B_i}$ denotes the equivalent $i$th codeword in the $P_B$-domain. By the self-orthogonal property, $P_B = B(B^TB)^{-1}B^T = \frac{1}{N}BB^T$. The PEP-based upper bound in (3.7) then suggests that a good self-orthogonal code design should have an adequately large pairwise Euclidean distance

$$\big\|\mathrm{vec}(B_iB_i^T) - \mathrm{vec}(B_jB_j^T)\big\|^2 \tag{3.8}$$

between all codeword pairs $B_i$ and $B_j$, where $B_i$ is the equivalent $i$th codeword in the $B$-domain. Based on this observation, we may infer under equal prior probabilities that a uniform draw of codewords satisfying $B^TB = N\cdot I_P$ may asymptotically result in a good code. This is conceptually equivalent to a uniform pick of codewords in a set of self-orthogonal binary sequences.
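The distance in (3.8) is easy to evaluate for short codewords. The following Python sketch does so for P = 2, assuming again that B is the Toeplitz matrix of shifted codewords; `toeplitz_B` and `outer_distance` are hypothetical helper names. It also shows that a codeword and its negation are at distance zero, since $(-B)(-B)^T = BB^T$, so the two cannot be told apart in this domain.

```python
def toeplitz_B(b, P=2):
    """L x P Toeplitz matrix of shifted codewords (assumed structure, L = N + P - 1)."""
    N = len(b)
    return [[b[r - p] if 0 <= r - p < N else 0 for p in range(P)]
            for r in range(N + P - 1)]

def outer_distance(bi, bj, P=2):
    """Squared Euclidean distance || vec(B_i B_i^T) - vec(B_j B_j^T) ||^2 of (3.8)."""
    Bi, Bj = toeplitz_B(bi, P), toeplitz_B(bj, P)
    L = len(Bi)
    total = 0
    for r in range(L):
        for c in range(L):
            gi = sum(Bi[r][p] * Bi[c][p] for p in range(P))   # entry (r, c) of B_i B_i^T
            gj = sum(Bj[r][p] * Bj[c][p] for p in range(P))   # entry (r, c) of B_j B_j^T
            total += (gi - gj) ** 2
    return total

b1 = [-1, -1, 1]    # both sequences satisfy B^T B = 3 * I_2
b2 = [-1, 1, 1]
print(outer_distance(b1, b2))                   # distance between two distinct codewords
print(outer_distance(b1, [-x for x in b1]))     # a codeword and its negation
```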

We recall that our initial research question is how to construct an efficiently decodable code that supports joint channel estimation and error correction. To achieve this goal for the priority-first search decoding algorithm, we need an efficient and systematic way to generate the successor paths of the top path. In particular, we would like to have a code tree that can be spanned in an on-the-fly or bit-by-bit fashion. The uniform pick principle then suggests that, considering only the self-orthogonal sequences with the same prefix $b_{(\ell-1)}$, the ratio of the number of self-orthogonal codewords satisfying $b_\ell = -1$ to the number of all self-orthogonal sequences having the same $b_\ell$ should be made equal to the corresponding ratio for self-orthogonal codewords satisfying $b_\ell = 1$, whenever possible. Mathematically, this can be expressed as

$$\frac{|C(b_1, b_2, \ldots, b_{\ell-1}, b_\ell = 1)|}{|A(b_1, b_2, \ldots, b_{\ell-1}, b_\ell = 1 \mid G)|} \approx \frac{|C(b_1, b_2, \ldots, b_{\ell-1}, b_\ell = -1)|}{|A(b_1, b_2, \ldots, b_{\ell-1}, b_\ell = -1 \mid G)|}, \tag{3.9}$$

where $C(b_{(\ell)})$ is the set of all codewords whose first $\ell$ bits equal $b_1, b_2, \ldots, b_\ell$, and $A(b_{(\ell)} \mid G)$ is the set of all binary sequences of length $N$ whose first $\ell$ bits equal $b_1, b_2, \ldots, b_\ell$ and whose $B$-representation satisfies $B^TB = G$. Accordingly, given the index $i$ of the codeword, where $0 \le i \le 2^K - 1$, and given the previous $\ell - 1$ bits $b_1, b_2, \ldots, b_{\ell-1}$, whether the next code bit $b_\ell$ is $-1$ or $+1$ can be determined conceptually by checking whether $i$ is less than or larger than

$$\sum_{\tilde{b}_1 + \tilde{b}_2\cdot 2 + \cdots + \tilde{b}_{\ell-1}\cdot 2^{\ell-2} \,<\, b_1 + b_2\cdot 2 + \cdots + b_{\ell-1}\cdot 2^{\ell-2}} |C(\tilde{b}_{(\ell-1)} \mid G)| + |C(b_{(\ell-1)}, b_\ell = -1 \mid G)|.$$

A specific code design algorithm will be given in the next subsection.

3.2.3 Exemplified Code Design Algorithm for Channels of Memory Order One

In this subsection, we provide an exemplified code design algorithm based on the uniform pick principle for channels of memory order 1, namely, P = 2. The code design algorithm for channels with higher memory order can be similarly built.

For $\theta \in \{-1, 0, +1\}$, we define

$$G_\theta \triangleq \begin{bmatrix} N & \theta \\ \theta & N \end{bmatrix}.$$

Note that when $B^TB = G_0$ ($= N\cdot I_2$) cannot be satisfied, which as noted above is the case for $N$ even, $G_{-1}$ and $G_1$ are used instead to define the relaxed self-orthogonal codewords. In this case, the uniform pick principle again suggests that half of the codewords should be uniformly drawn from the binary sequences satisfying $B^TB = G_{-1}$, and the other half selected according to $B^TB = G_1$. The proposed codeword selection process simply lists all the sequences satisfying the desired self-orthogonal property in binary-alphabetical order, starting from index zero, and uniformly picks the codewords from the ordered list at intervals of $\Delta_\theta$, with

$$\Delta_\theta = \frac{|A(b_1 = -1 \mid G_\theta)| - 1}{2^K/|\Theta| - 1} \quad \text{for } \theta \in \Theta, \tag{3.10}$$

where $\Theta = \{0\}$ for $N$ odd, and $\Theta = \{-1, 1\}$ for $N$ even. As a result, the selected codewords are those sequences with indices $\lfloor (i \bmod (2^K/|\Theta|)) \cdot \Delta_\theta \rfloor$ for $0 \le i \le 2^K - 1$. The codeword mapping algorithm is summarized by the following list:

Step 1. Input the index $i$ of the requested codeword in the $(N, K)$ block code, where $0 \le i \le 2^K - 1$.

Step 2. Set $\Theta = \{0\}$ for $N$ odd, and $\Theta = \{-1, 1\}$ for $N$ even. Also, set $\theta = ((N + 1) \bmod 2)\cdot(-1)^{\lceil (i+1)/(2^K/|\Theta|) \rceil}$. Compute $\Delta_\theta$ according to (3.10). Initialize $b_1 = -1$, $\ell = 1$ and $\rho = \lfloor (i \bmod (2^K/|\Theta|)) \cdot \Delta_\theta \rfloor$. Let the minimum sequence index $\rho_{\min} = 0$.

Step 3. Execute $\ell = \ell + 1$, and compute $\gamma_\ell = |A(b_{(\ell-1)}, b_\ell = -1 \mid G_\theta)|$.

If $\rho < \rho_{\min} + \gamma_\ell$, then choose the next code bit $b_\ell = -1$;

otherwise choose the next code bit $b_\ell = 1$, and readjust $\rho_{\min} = \rho_{\min} + \gamma_\ell$.

Step 4. If $\ell = N$, output the corresponding codeword $b$, and the algorithm stops; otherwise, go to Step 3.
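Steps 1 to 4 can be sketched in Python as follows for P = 2. This is a minimal illustration rather than an efficient implementation: the count $\gamma_\ell$ is obtained here by exhaustive enumeration, which is only practical for small N, and the off-diagonal entry of $B^TB$ is assumed to be the sum of products of adjacent code bits. The function names are hypothetical, and the sketch assumes $2^K/|\Theta| \ge 2$ so that (3.10) is well defined.

```python
from fractions import Fraction
from itertools import product
from math import floor

def correlation(b):
    """Off-diagonal entry of B^T B for P = 2: sum of products of adjacent bits (assumed)."""
    return sum(x * y for x, y in zip(b, b[1:]))

def count_A(prefix, N, theta):
    """|A(prefix | G_theta)| by exhaustive enumeration of all completions."""
    return sum(1 for tail in product((-1, 1), repeat=N - len(prefix))
               if correlation(list(prefix) + list(tail)) == theta)

def map_codeword(i, N, K):
    """Map index i (0 <= i <= 2^K - 1) to a codeword following Steps 1 to 4."""
    thetas = (0,) if N % 2 == 1 else (-1, 1)                  # Step 2: the set Theta
    per_sub = 2 ** K // len(thetas)                           # 2^K / |Theta|, assumed >= 2
    theta = ((N + 1) % 2) * (-1) ** (-(-(i + 1) // per_sub))  # exponent is a ceiling
    delta = Fraction(count_A([-1], N, theta) - 1, per_sub - 1)   # (3.10), kept exact
    b, rho, rho_min = [-1], floor((i % per_sub) * delta), 0
    while len(b) < N:                                         # Steps 3 and 4
        gamma = count_A(b + [-1], N, theta)
        if rho < rho_min + gamma:
            b.append(-1)
        else:
            b.append(1)
            rho_min += gamma
    return b

for i in range(4):
    print(i, map_codeword(i, N=4, K=2))
```

Exact rational arithmetic is used for $\Delta_\theta$ so that the floor in Step 2 is not affected by rounding. For N = 4 and K = 2, the first two indices map to sequences with adjacent-bit correlation −1 and the last two to sequences with correlation +1.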

In implementing the above algorithm, it is perhaps more convenient to calculate $\gamma_\ell$ recursively² so that the codeword mapping can be performed in an on-the-fly or bit-by-bit systematic fashion with respect to the given codeword index $i$. This recursive nature also facilitates the priority-first decoding search at the receiver, since branches of the code tree will only be spanned when necessary.

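² Initializing $b_0 = 0$, $m_0 = \theta$ and $\gamma_1 = |A(b_1 \mid G_\theta)|$, and setting $m_{\ell+1} = m_\ell - b_\ell b_{\ell+1}$ for $0 \le \ell < N$, we obtain for $P = 2$ that if $|m_{\ell-1} + b_{\ell-1}| \le N - \ell$,

$$\gamma_{\ell+1} = \gamma_\ell \cdot \frac{1}{2(N-\ell)} \cdot \begin{cases} \dfrac{(N-\ell-m_{\ell-1})^2 - 1}{N-\ell+m_{\ell-1}+1} \cdot \mathbf{1}\{|m_{\ell-1}+2| \le N-\ell-1\}, & \text{for } (b_{\ell-1}, b_\ell) = (-1, 1), \\[2ex] (N-\ell+m_{\ell-1}+1-b_{\ell-1}b_\ell+b_\ell) \cdot \mathbf{1}\{|m_{\ell-1}-b_{\ell-1}b_\ell+b_\ell| \le N-\ell-1\}, & \text{otherwise,} \end{cases}$$

where $\mathbf{1}\{\cdot\}$ is the set indicator function. If, however, $|m_{\ell-1} + b_{\ell-1}| > N - \ell$, then

$$\gamma_{\ell+1} = \begin{cases} 0, & \text{for } (b_{\ell-1}, b_\ell) \ne (-1, 1), \text{ or } (b_{\ell-1}, b_\ell) = (-1, 1) \text{ and } m_{\ell-1} \ne -N+\ell-1, \\ 1, & \text{otherwise.} \end{cases}$$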
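The counts $\gamma_\ell$ can also be cross-checked against a direct count. The sketch below is not the recursion of the footnote. It relies on an observation that is an assumption of this example: for P = 2, each remaining product of adjacent bits can be chosen freely as ±1, so the number of completions of a prefix is a binomial coefficient. The function names are hypothetical; the closed form is compared with exhaustive enumeration.

```python
from itertools import product
from math import comb

def correlation(b):
    """Sum of products of adjacent bits (off-diagonal entry of B^T B for P = 2, assumed)."""
    return sum(x * y for x, y in zip(b, b[1:]))

def gamma_bruteforce(prefix, N, theta):
    """Number of length-N sequences that start with prefix + [-1] and have correlation theta."""
    ext = list(prefix) + [-1]
    return sum(1 for tail in product((-1, 1), repeat=N - len(ext))
               if correlation(ext + list(tail)) == theta)

def gamma_closed_form(prefix, N, theta):
    """Same count via a binomial coefficient: r free +-1 products must sum to s."""
    ext = list(prefix) + [-1]
    r = N - len(ext)                 # number of adjacent products still undetermined
    s = theta - correlation(ext)     # correlation that the remaining products must supply
    if abs(s) > r or (r + s) % 2:
        return 0
    return comb(r, (r + s) // 2)     # (r + s) / 2 of the r products equal +1

N = 6
mismatches = 0
for length in range(1, N):
    for tail in product((-1, 1), repeat=length - 1):
        prefix = [-1] + list(tail)
        for theta in (-1, 0, 1):
            if gamma_bruteforce(prefix, N, theta) != gamma_closed_form(prefix, N, theta):
                mismatches += 1
print(mismatches)
```

With the prefix taken as $b_{(\ell-1)}$, the quantities r and s in the sketch correspond to $N - \ell$ and $m_{\ell-1} + b_{\ell-1}$ in the footnote, so the zero-count condition $|s| > r$ matches the case distinction made there.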

3.3 Maximum-Likelihood Metrics For Priority-First Search Decoding

In this section, we will establish two different metric functions to be used by the priority-first search algorithm. The first metric is

$$f_1(b_{(\ell)}) = g(b_{(\ell)}) + \varphi_1(b_{(\ell)}), \tag{3.11}$$

where $g(b_{(\ell)})$ is derived in Section 3.3.1, and $\varphi_1(b_{(\ell)}) = 0$ is the all-zero function (cf. Section 3.3.2). The second metric is

$$f_2(b_{(\ell)}) = g(b_{(\ell)}) + \varphi_2(b_{(\ell)}), \tag{3.12}$$

with $g$ the same as in $f_1$, and with $\varphi_2(b_{(\ell)})$ defined in Section 3.3.3. Both metrics lead to ML decoding. The difference is that $f_1$ can be computed on the fly and therefore causes much less decoding delay. Evaluating $f_2$, in contrast, requires knowledge of all received symbols, but the computational complexity of decoding with $f_2$ is up to one order of magnitude lower than that with $f_1$.

3.3.1 Recursive Maximum-Likelihood Metric g

Let subcode $C_\theta$ be the set of codewords that satisfy $B^TB = G_\theta$, where $\theta$ takes values in $\Theta$. Hence, $C = \bigcup_{\theta\in\Theta} C_\theta$, and $C_\theta \cap C_\eta = \emptyset$ whenever $\theta \ne \eta$. Since a transmitted codeword belongs to only one of the subcodes, maintaining an individual stack for priority-first codeword searching over each subcode would introduce considerable unnecessary decoding burden, especially for the subcodes to which the transmitted codeword does not belong. Hence, only one stack is maintained during the entire priority-first search, and the metric function values for different subcodes are compared and sorted in the same stack. The path to be expanded next is therefore the one whose metric function value is the smallest globally.
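The single-stack idea can be sketched generically in Python with a heap. The sketch is only an illustration of how paths from different subcode trees compete in one priority queue. It does not use the metric g of this section: `branch_cost` is a hypothetical placeholder, chosen nonnegative so that the path metric is nondecreasing along every path, and the small subcodes and observations are made up for the example.

```python
import heapq

def priority_first_search(subcodes, branch_cost, N):
    """Best-first search over several subcode trees that share ONE heap.

    subcodes    : dict mapping theta -> list of codewords (lists of +-1 of length N)
    branch_cost : hypothetical nonnegative cost of appending `bit` at position `level`
    Returns (codeword, theta, number_of_path_expansions).
    """
    # All prefixes that occur in each subcode tree.
    prefixes = {theta: {tuple(c[:l]) for c in cws for l in range(N + 1)}
                for theta, cws in subcodes.items()}
    heap = [(0.0, theta, ()) for theta in subcodes]    # one root path per tree
    heapq.heapify(heap)
    expansions = 0
    while heap:
        metric, theta, path = heapq.heappop(heap)      # globally smallest metric
        if len(path) == N:                             # first complete path is optimal
            return list(path), theta, expansions
        expansions += 1
        for bit in (-1, 1):
            child = path + (bit,)
            if child in prefixes[theta]:               # stay inside the code tree
                heapq.heappush(
                    heap, (metric + branch_cost(len(path), bit), theta, child))
    return None

received = [-0.9, 0.8, 1.1, 0.7]                       # toy real-valued observations

def cost(level, bit):
    """Toy nonnegative branch cost (not the metric of Section 3.3.1)."""
    return (received[level] - bit) ** 2

subcodes = {-1: [[-1, -1, 1, -1], [-1, 1, 1, -1]],
            1: [[-1, -1, -1, 1], [-1, 1, 1, 1]]}
best, theta, expansions = priority_first_search(subcodes, cost, 4)
print(best, theta, expansions)
```

Because every heap entry carries its subcode label, a path from the tree of one θ is expanded only when it currently has the smallest metric among all trees, which avoids spending effort on subcodes that the transmitted codeword does not belong to.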

By denoting $D_\theta = G_\theta^{-1} = (B^TB)^{-1}$, and letting the $(i,j)$th entry of $D_\theta$ be $\delta_{i,j}^{(\theta)}$, we can continue the derivation from (3.6) as follows:

where for convenience, we put bj = 0 for j > N . After adjusting indices, the derivation can be resumed as

As the maximum-likelihood decision remains unchanged by adding a constant that is independent of the codeword $b$, we add a constant to make the decision criterion nonnegative:³

ˆb = arg min

It remains to prove that the metric of XN

can be computed recursively. To that aim, we define for every path b_(`) over code tree θ g(b(`)),

³ Here, a nonnegative maximum-likelihood criterion allows the path metric $g(b_{(\ell)})$ defined later to be nondecreasing along any path in the code tree. It can then be anticipated (cf. Section 3.3.2) that letting the heuristic function be zero for all paths in the code tree suffices to yield a metric function satisfying condition (2.3) in Lemma 1.

Note that the additive constant that makes the metric function nondecreasing along any path in the code tree can also be obtained by first defining $g$ based on (3.13), and then determining its respective $\varphi$ according to (2.3). Such an approach, however, complicates the determination of the heuristic function $\varphi$ when we additionally require the metric function to be recursively computable. The alternative approach, which directly defines a recursively computable $g$ based on a nonnegative maximum-likelihood criterion, is accordingly adopted in this work.

Then, since $w_{m,n}^{(\theta)} = w_{n,m}^{(\theta)}$ for every $1 \le m, n \le N$ and $\theta \in \Theta$, we have for $1 \le \ell \le N - 1$,

A final remark in this discussion is that although the computational burden of $\alpha_\ell^{(\eta)}$ in (3.14) increases linearly with $\ell$, this burden is moderately compensated for by the fact that $\alpha_\ell^{(\eta)}$ needs to be computed only once for each $\ell$ and $\eta$, because it can be shared by all paths ending at level $\ell$ over code tree $\eta$.

3.3.2 Heuristic Function ϕ₁

We next derive the first heuristic function that validates (2.3). Taking the maximum-likelihood metric g into the sufficient condition in (2.3) yields

Hence, in addition to ϕ(˜b) = 0, the heuristic function should satisfy

It is apparent that the all-zero function is the largest one that satisfies this inequality subject to no dependence on the future route and future receptions, i.e., $\{\tilde{b}_m\}_{m\ge\ell+1}$ and $\{w_{m,n}^{(\theta)}\}_{m\ge\ell+1,\,n\ge\ell+1}$. Hence, we choose $\varphi_1(b_{(\ell)}) = 0$.

Note that ϕ₁ is trivially on-the-fly computable, and hence so is f₁. In comparison with the exhaustive-search decoding, decoding based on recursive priority-first search shows a significant decrease in computational complexity especially at medium-to-high SNRs.

3.3.3 Heuristic Function ϕ₂

If we drop the requirement that the metric $f$ must be independent of future receptions, we can further reduce the computational complexity. Upon reception of all $y_1, \ldots, y_L$, the heuristic function that satisfies (3.15) regardless of $\tilde{b}_{\ell+1}, \ldots, \tilde{b}_N$ can be increased to

$\varphi_2(b_{(\ell)}) \triangleq$

Compared to the zero heuristic function $\varphi_1$, the heuristic function in (3.16) further reduces the number of path expansions during the decoding process by up to one order of magnitude (cf. Table 3.1).
