

= ∑_{y: g(y)=m} (1 − δ)^{n−N(2|y)} · δ^{N(2|y)} · I{ d_H( x_{m,I(b|y)}, y_{I(b|y)} ) = 0 }   (4.20)

4.4.1 Capacity of the BEC

The capacity of a BEC is given by

C_BEC = 1 − δ   (4.21)

bits. The input distribution P_X(·) that achieves the capacity is the uniform distribution given by

P_X(0) = 1 − P_X(1) = 1/2,   (4.22)

which, moreover, does not depend on the erasure probability δ.

4.5 Pairwise Hamming Distance

The minimum Hamming distance is a well-known and often-used quality criterion of a codebook [12], [13]. [13, Ch. 2] discusses the maximum minimum Hamming distance for a given code C^{(M,n)}, e.g., the Plotkin bound and Levenshtein's theorem. (For discussions of the upper and lower bounds on the average error probability, see Chapter 7.) Unfortunately, a design based on the minimum Hamming distance can fail even for linear codes and even for a very symmetric channel like the BSC, whose error probability performance is completely specified by the Hamming distances between codewords and received vectors.

We therefore define a slightly more general and more concise description of a codebook:

the pairwise Hamming distance vector.

Definition 4.3 Given a codebook C^{(M,n)} with codewords x_m, 1 ≤ m ≤ M, we define the length-(M(M−1)/2) pairwise Hamming distance vector

d(C^{(M,n)}) ≜ ( d_H(x_1, x_2),
                d_H(x_1, x_3), d_H(x_2, x_3),
                d_H(x_1, x_4), d_H(x_2, x_4), d_H(x_3, x_4), . . . ,
                d_H(x_1, x_M), d_H(x_2, x_M), . . . , d_H(x_{M−1}, x_M) ).   (4.23)

The minimum Hamming distance d_min(C^{(M,n)}) is then defined as the minimum component of the pairwise Hamming distance vector d(C^{(M,n)}).
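To make the definition concrete, here is a minimal Python sketch (function names are ours, purely for illustration) that computes the pairwise Hamming distance vector (4.23) and the resulting minimum Hamming distance of a codebook given as a list of binary tuples:

```python
def hamming_distance(x, y):
    """Number of positions in which two equal-length binary tuples differ."""
    return sum(a != b for a, b in zip(x, y))

def pairwise_distance_vector(code):
    """The length M(M-1)/2 vector of (4.23): pairs (m, m') with m < m',
    ordered by increasing m' and, within each m', by increasing m."""
    M = len(code)
    return tuple(hamming_distance(code[m], code[mp])
                 for mp in range(1, M) for m in range(mp))

def minimum_distance(code):
    """d_min is the minimum component of the pairwise distance vector."""
    return min(pairwise_distance_vector(code))

# Example with a (4,4) codebook:
C = [(0,0,0,0), (0,0,0,1), (1,1,1,0), (1,1,1,1)]
print(pairwise_distance_vector(C))  # (1, 3, 4, 4, 3, 1)
print(minimum_distance(C))          # 1
```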

A Counterexample

To show that the search for an optimal (possibly nonlinear) code is neither trivial nor intuitive even in the symmetric BSC case, we would like to start with a simple example before we summarize our main results.

Assume a BSC with crossover probability ε = 0.4, M = 4, and a blocklength n = 4. Then consider the following codes:⁴

C_1^{(4,4)} =
[ 0 0 0 0 ]
[ 0 0 0 1 ]
[ 1 1 1 0 ]
[ 1 1 1 1 ]
,   C_2^{(4,4)} =
[ 0 0 0 0 ]
[ 0 0 1 1 ]
[ 1 1 0 0 ]
[ 1 1 1 1 ]
.   (5.1)

We observe that both codes are linear (i.e., any sum of two codewords is again a codeword), but the first code has minimum Hamming distance 1 while the second has minimum Hamming distance 2. It is quite common to believe that C_2^{(4,4)} shows the better performance. This intuition is based on Gallager's famous performance bound [6, Exercise 5.19]:

P_e(C^{(M,n)}) ≤ (M − 1) · [4ε(1 − ε)]^{d_min(C^{(M,n)})/2}.   (5.2)

However, the exact average error probability as given in (4.15) can actually be evaluated as P_e(C_1^{(4,4)}) ≈ 0.6112 and P_e(C_2^{(4,4)}) = 0.64. Hence, even though the minimum Hamming distance of the first codebook is smaller, its overall performance is superior to that of the second codebook!
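This evaluation is easy to reproduce by brute force over all 2^n channel output vectors. The following Python sketch (names are ours; it does not restate (4.15)) computes the exact average error probability of a maximum-likelihood decoder, assuming equally likely codewords:

```python
from itertools import product

def bsc_average_error(code, eps):
    """Exact ML average error probability on a BSC with crossover prob. eps:
    P_e = 1 - (1/M) * sum_y max_m P(y|x_m), where
    P(y|x_m) = eps^d * (1-eps)^(n-d) and d = d_H(y, x_m)."""
    M, n = len(code), len(code[0])
    p_correct = 0.0
    for y in product((0, 1), repeat=n):
        best = 0.0
        for x in code:
            d = sum(a != b for a, b in zip(x, y))
            best = max(best, eps ** d * (1 - eps) ** (n - d))
        p_correct += best      # the ML decoder picks the most likely codeword
    return 1.0 - p_correct / M

C1 = [(0,0,0,0), (0,0,0,1), (1,1,1,0), (1,1,1,1)]
C2 = [(0,0,0,0), (0,0,1,1), (1,1,0,0), (1,1,1,1)]
print(bsc_average_error(C1, 0.4))  # ~0.6112
print(bsc_average_error(C2, 0.4))  # 0.64
```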

Our goal is to find the structure of an optimal code C^{(M,n)*} that satisfies

P_e(C^{(M,n)*}) ≤ P_e(C^{(M,n)})   (5.3)

for any code C^{(M,n)}.

⁴We will see in Chapter 6 that both codes are weak flip codes. In this example, C_1^{(4,4)} = C_{1,0}^{(4,4)} and C_2^{(4,4)} = C_{2,0}^{(4,4)} according to Definition 6.5 given later.

Flip Codes, Weak Flip Codes and Hadamard Codes

We next introduce some special families of binary codes. We start with a family of codes with two codewords.

Definition 6.1 The flip code of type t for t ∈ {0, 1, . . . , ⌊n/2⌋} is a code with M = 2 codewords defined by the following codebook matrix C_t^{(2,n)}:

C_t^{(2,n)} ≜ [ x ; x̄ ] =
[ 0 · · · 0 1 · · · 1 ]
[ 1 · · · 1 0 · · · 0 ]
,   (6.1)

where the first codeword x consists of n − t zeros followed by t ones, and the second codeword x̄ is its componentwise flipped version.

Defining the column vectors

c_1^{(2)} ≜ (0 1)^T,   c_2^{(2)} ≜ (1 0)^T,   (6.2)

we see that a flip code of type t is given by a codebook matrix that consists of n − t columns c_1^{(2)} and t columns c_2^{(2)}.
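As a small illustration (with names of our choosing), the codebook of a flip code of type t can be generated as follows:

```python
def flip_code(n, t):
    """Two codewords of the flip code of type t and blocklength n."""
    assert 0 <= t <= n // 2, "the type t ranges over {0, 1, ..., floor(n/2)}"
    x = (0,) * (n - t) + (1,) * t    # n - t columns c1 = (0,1)^T, then t columns c2 = (1,0)^T
    x_bar = tuple(1 - b for b in x)  # the flipped codeword
    return [x, x_bar]

print(flip_code(4, 0))  # [(0,0,0,0), (1,1,1,1)]: type 0 is the repetition code
print(flip_code(4, 1))  # [(0,0,0,1), (1,1,1,0)]
```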

We again remind the reader that, due to the memorylessness of the BEC, codes with the same columns as C_t^{(2,n)} but in a different order are equivalent to C_t^{(2,n)}. Moreover, we would like to point out that while the flip code of type 0 corresponds to a repetition code, the general flip code of type t with t > 0 is neither a repetition code nor even linear.

We have shown in [16] that for any blocklength n and for a correct choice⁵ of t, the flip codes are optimal on any binary-input binary-output channel for arbitrary channel parameters. In particular, they are optimal for the BSC and the ZC [16].

The columns given in the set in (6.2) are called candidate columns. They are flipped versions of each other, hence the name of the code.

⁵We would like to emphasize that the optimal choice of t for many binary channels is not 0, i.e., the linear repetition code is not optimal!

The definition of a flip code with one codeword being the flipped version of the other cannot be easily extended to a situation with more than two codewords. Hence, for M > 2, we need a new approach. We give the following definition.

Definition 6.2 Given M > 2, a length-M candidate column c is called a weak flip column if its first component is 0 and its Hamming weight equals ⌊M/2⌋ or ⌈M/2⌉. The collection of all possible weak flip columns is called the weak flip candidate columns set and is denoted by C^{(M)}.

We see that a weak flip column contains an almost equal number of zeros and ones.

The restriction of the first component to be zero is based on the insight of Lemma 3.1.

For the remainder of this work, we introduce the shorthand

ℓ ≜ ⌈M/2⌉.   (6.3)

Lemma 6.3 The cardinality of the weak flip candidate columns set is

|C^{(M)}| = (2ℓ−1 choose ℓ).
We are now ready to generalize Definition 6.1.

Definition 6.4 A weak flip code is a codebook that is constructed exclusively from weak flip columns.

We often describe the weak flip code of type (t_2, t_3) by its code parameters

[t1, t2, t3] (6.7)

where t1 can be computed from the blocklength n and the type (t2, t3) as t1 = n − t2− t3.

Note that the fair weak flip code of type (t_2, t_3) is only defined provided that the blocklength satisfies n mod 3 = 0. In order to be able to provide convenient comparisons for every blocklength n, we define a generalized fair weak flip code for every n as the weak flip code of type (⌊(n+1)/3⌋, ⌊n/3⌋). If n mod 3 = 0, the generalized fair weak flip code actually is a fair weak flip code.

The following lemma follows from the respective definitions in a straightforward manner. We therefore omit its proof.

Lemma 6.7 The pairwise Hamming distance vector of a weak flip code of type (t_2, t_3) can be computed as follows:

d(C^{(3,n)}) = (t_2 + t_3, t_1 + t_3, t_1 + t_2),
d(C^{(4,n)}) = (t_2 + t_3, t_1 + t_3, t_1 + t_2, t_1 + t_2, t_1 + t_3, t_2 + t_3).
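Lemma 6.7 is easy to verify numerically. The sketch below checks the M = 3 case under one consistent labeling of the three candidate columns (assumed here for illustration; the labeling used in the text may differ):

```python
def weak_flip_code_M3(t1, t2, t3):
    """Weak flip code of type (t2, t3): t1 columns (0,0,1)^T,
    t2 columns (0,1,0)^T, and t3 columns (0,1,1)^T."""
    cols = [(0, 0, 1)] * t1 + [(0, 1, 0)] * t2 + [(0, 1, 1)] * t3
    return [tuple(col[m] for col in cols) for m in range(3)]  # rows = codewords

def d_vector(code):
    dH = lambda x, y: sum(a != b for a, b in zip(x, y))
    return (dH(code[0], code[1]), dH(code[0], code[2]), dH(code[1], code[2]))

t1, t2, t3 = 2, 1, 1
assert d_vector(weak_flip_code_M3(t1, t2, t3)) == (t2 + t3, t1 + t3, t1 + t2)
```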

A similar definition can also be given for larger M; however, one needs to be aware that the number of weak flip candidate columns increases quickly. For M = 5 or M = 6 we have ten weak flip candidate columns.

We next introduce fair weak flip codes, which, as we will see in Section 6.1, possess particularly beautiful properties.

Definition 6.8 A weak flip code is called fair if it is constructed by an equal number of all possible weak flip candidate columns in C^{(M)}. Note that by definition the blocklength of a fair weak flip code is always a multiple of (2ℓ−1 choose ℓ), ℓ ≥ 2.

Fair weak flip codes have been used by Shannon et al. [17] for the derivation of error exponents, although the codes were not named at that time. Note that the error exponent is defined in the limit as the blocklength n goes to infinity, whereas in this work we consider finite n.

Related to the weak flip codes and the fair weak flip codes are the families of Hadamard codes [13, Ch. 2].

Definition 6.9 For an even integer n, a (normalized) Hadamard matrix H_n of order n is an n × n matrix with entries +1 and −1 and with the first row and column being all +1, such that

H_n H_n^T = n I_n,   (6.13)

if such a matrix exists. Here I_n is the identity matrix of size n. If the entries +1 are replaced by 0 and the entries −1 by 1, H_n is changed into the binary Hadamard matrix A_n.

Note that a necessary (but not sufficient) condition for the existence of H_n (and the corresponding A_n) is that n equals 1, 2, or a multiple of 4 [13, Ch. 2].

Definition 6.10 The binary Hadamard matrix A_n gives rise to three families of Hadamard codes:

1. The (n, n − 1, n/2) Hadamard code H_{1,n} consists of the rows of A_n with the first column deleted. The codewords in H_{1,n} that begin with 0 form the (n/2, n − 2, n/2) Hadamard code H'_{1,n} if the initial zero is deleted.

2. The (2n, n − 1, n/2 − 1) Hadamard code H_{2,n} consists of H_{1,n} together with the complements of all its codewords.

3. The (2n, n, n/2) Hadamard code H_{3,n} consists of the rows of A_n and their complements.

Further Hadamard codes can be created by arbitrary combinations of the codebook matrices of different Hadamard codes.
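For n a power of 2, the Sylvester construction provides one concrete choice of H_n, from which the three families can be built directly. The following sketch (using numpy; names are ours) is one possible illustration, not the only construction:

```python
import numpy as np

def sylvester(n):
    """Hadamard matrix of order n (n a power of 2) with entries +1/-1."""
    H = np.array([[1]])
    while H.shape[0] < n:
        H = np.block([[H, H], [H, -H]])
    return H

def dmin(C):
    """Minimum pairwise Hamming distance between the rows of a binary matrix."""
    M = C.shape[0]
    return min(int((C[i] != C[j]).sum()) for i in range(M) for j in range(i + 1, M))

n = 8
H = sylvester(n)
assert (H @ H.T == n * np.eye(n)).all()   # orthogonality (6.13)
A = (1 - H) // 2                          # +1 -> 0, -1 -> 1: binary Hadamard matrix A_n

H1  = A[:, 1:]                            # (n, n-1, n/2) code H_{1,n}
H1p = A[A[:, 1] == 0][:, 2:]              # (n/2, n-2, n/2) code H'_{1,n}
H2  = np.vstack([H1, 1 - H1])             # (2n, n-1, n/2 - 1) code H_{2,n}
H3  = np.vstack([A, 1 - A])               # (2n, n, n/2) code H_{3,n}
assert dmin(H1) == n // 2 and dmin(H1p) == n // 2
assert dmin(H2) == n // 2 - 1 and dmin(H3) == n // 2
```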

Example 6.11 Consider a (6, 10, 6) H'_{1,12} code. Comparing its columns with the weak flip candidate columns (6.12) for M = 6, we see that it is identical to the fair weak flip code for M = 6. Since the fair weak flip code already uses up all possible weak flip candidate columns, there is only one (6, 10, 6) H'_{1,12} code up to column permutations.

Example 6.12 Consider an (8, 7, 4) Hadamard code H_{1,8}^1 as given in (6.15), and a second (8, 7, 4) Hadamard code H_{1,8}^2:

H_{1,8}^2 =
[ 0 0 0 0 0 0 0 ]
[ 0 0 1 0 1 1 1 ]
[ 1 0 0 1 1 0 1 ]
[ 0 1 1 1 1 0 0 ]
[ 1 1 1 0 0 0 1 ]
[ 1 0 1 1 0 1 0 ]
[ 0 1 0 1 0 1 1 ]
[ 1 1 0 0 1 1 0 ]
.   (6.16)

From these codes, an (8, 35, 20) Hadamard code can be constructed by simply concatenating H_{1,8}^1 five times, or by concatenating H_{1,8}^1 three times and H_{1,8}^2 two times.

Note that since the rows of H_n are orthogonal, any two rows of A_n agree in n/2 places and differ in n/2 places, i.e., they have Hamming distance n/2. Moreover, by definition the first row of a binary Hadamard matrix is the all-zero row. Hence, we see that all Hadamard codes are weak flip codes, i.e., the family of weak flip codes is a superset of the family of Hadamard codes.

On the other hand, a Hadamard code with parameters (M, n) for which a fair weak flip code exists is not necessarily equivalent to a fair weak flip code. We also would like to remark that Hadamard codes rely on the existence of Hadamard matrices, so in general it is very difficult to predict whether a Hadamard code exists for a given pair (M, n). This is in stark contrast to weak flip codes (which exist for all M and n) and fair weak flip codes (which exist for all M and all n that are a multiple of (2ℓ−1 choose ℓ)).

Example 6.13 We continue with Example 6.12 and note that the (8, 35, 20) Hadamard code constructed by five repetitions of the matrix given in (6.15) is actually not a fair weak flip code, since an (8, 35, 20) fair weak flip code must use up all possible weak flip candidate columns.

Note that two Hadamard matrices are called equivalent if one can be obtained from the other by permuting rows and columns and by multiplying rows and columns by −1. In other words, Hadamard codes can actually be constructed from weak flip candidate columns. This also follows directly from the already mentioned fact that Hadamard codes are weak flip codes.

6.1 Characteristics of Weak Flip Codes

In conventional coding theory, most results are restricted to so-called linear codes, which possess very powerful algebraic properties. For the following definitions and proofs see, e.g., [12], [13].

Definition 6.14 Let M = 2^k, where k ∈ N. The binary code C_lin^{(M,n)} is linear if its codewords span a k-dimensional subspace of {0, 1}^n.

One of the most important properties of a linear code is the following.

Proposition 6.15 Let Clin be linear and let xm ∈ Clin be given. Then the code that we obtain by adding xm to each codeword of Clin is equal to Clin.

Another property concerns the column weights.

Proposition 6.16 If an (M, n) binary code is linear, then each column of its codebook matrix has Hamming weight M/2, i.e., the code is a weak flip code.

Hence, linear codes are weak flip codes. Note, however, that linear codes only exist if M = 2^k, where k ∈ N, while weak flip codes are defined for any M. Also note that the converse of Proposition 6.16 does not hold, i.e., even if M = 2^k for some k ∈ N, a weak flip code C^{(M,n)} is not necessarily linear. It is not even the case that a fair weak flip code for M = 2^k is necessarily linear!

Now the question arises as to which of the many powerful algebraic properties of linear codes are retained in weak flip codes.

Theorem 6.17 Consider a weak flip code C^{(M,n)} and fix some codeword x_m ∈ C^{(M,n)}. If we add this codeword to all codewords in C^{(M,n)}, then the resulting code C̃^{(M,n)} ≜ {x_m ⊕ x : x ∈ C^{(M,n)}} is still a weak flip code; however, it is not necessarily the same one.

Proof: Let C^{(M,n)} be according to Definition 6.4. We have to prove that every column of C̃^{(M,n)} is again a weak flip column. Each column of C̃^{(M,n)} is either a column of C^{(M,n)} (if x_m has a 0 in this position) or its complement (if x_m has a 1 in this position); complementing a column exchanges the Hamming weights ⌊M/2⌋ and ⌈M/2⌉, so every column of C̃^{(M,n)} still has an almost balanced weight. Finally, we interchange the first codeword of C̃^{(M,n)} and the all-zero codeword in the mth row of C̃^{(M,n)} (which is always possible, see the discussion after Definition 2.7), and we see that C̃^{(M,n)} is also a weak flip code.

Theorem 6.17 describes a beautiful property of weak flip codes; however, it still represents a considerable weakening of the powerful property of linear codes given in Proposition 6.15. This can be fixed by considering the subfamily of fair weak flip codes.

Theorem 6.18 (Quasi-Linear Codes) Let C^{(M,n)} be a fair weak flip code and let x_m ∈ C^{(M,n)} be given. Then the code C̃^{(M,n)} = {x_m ⊕ x : x ∈ C^{(M,n)}} is equivalent to C^{(M,n)}.

Proof: We have already seen in Theorem 6.17 that adding a codeword will again result in a weak flip code. In the case of a fair weak flip code, however, all possible candidate columns show up again with the same frequency. It only remains to rearrange some rows and columns.
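Theorem 6.18 can be checked numerically for a small fair weak flip code; the brute-force equivalence test over column permutations below is feasible at this blocklength (names are ours, for illustration):

```python
from itertools import permutations
import numpy as np

# Fair weak flip code for M = 3, n = 3: columns (0,0,1)^T, (0,1,0)^T, (0,1,1)^T.
C = np.array([(0, 0, 0), (0, 1, 1), (1, 0, 1)])

def equivalent(A, B):
    """True if code A equals code B up to column permutation and row reordering."""
    rows_b = sorted(map(tuple, B.tolist()))
    return any(
        sorted(tuple(row[i] for i in perm) for row in A.tolist()) == rows_b
        for perm in permutations(range(A.shape[1]))
    )

for m in range(C.shape[0]):
    shifted = C ^ C[m]        # add codeword x_m to every codeword
    assert equivalent(shifted, C)
```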

If we recall Proposition 6.16 and the discussion after it, we realize that the quasi-linear fair weak flip codes considerably enlarge the set of codes having the property given in Theorem 6.18 beyond the linear codes.

The following corollary is a direct consequence of Theorem 6.18.

Corollary 6.19 The Hamming weights of the codewords of a fair weak flip code are all identical, except for the all-zero codeword x_1. In other words, if we let w_H(·) be the Hamming weight function, then

w_H(x_2) = w_H(x_3) = · · · = w_H(x_M).   (6.19)

Before we next investigate the minimum Hamming distance of the quasi-linear fair weak flip codes, we quickly recall an important bound that holds for any (M, n, d) code.

Lemma 6.20 (Plotkin Bound [13]) The minimum distance of an (M, n) binary code C^{(M,n)} always satisfies

d_min(C^{(M,n)}) ≤ { n · (M/2)/(M − 1),    M even,
                   { n · ((M + 1)/2)/M,    M odd.     (6.20)

Proof: We give a quick proof. We sum the Hamming distance over all possible pairs of distinct codewords:

M(M − 1) · d_min(C^{(M,n)}) ≤ ∑_{u ∈ C^{(M,n)}} ∑_{v ∈ C^{(M,n)}, v ≠ u} d_H(u, v)   (6.21)
                            = ∑_{j=1}^{n} 2 b_j (M − b_j)   (6.22)
                            ≤ { n · M²/2,        M even (achieved if b_j = M/2),
                              { n · (M² − 1)/2,  M odd (achieved if b_j = (M ± 1)/2).   (6.23)

Here, in (6.22) we rearrange the order of summation: instead of summing over all codewords (rows), we approach the problem column-wise and assume that the jth column of C^{(M,n)} contains b_j zeros and M − b_j ones; then this column contributes 2 b_j (M − b_j) to the sum. Dividing (6.23) by M(M − 1) yields (6.20).

Note that from the proof of Lemma 6.20 we can see that a necessary condition for a codebook to meet the Plotkin bound is that the codebook is composed of weak flip candidate columns. Furthermore, Levenshtein [13, Ch. 2] proved that the Plotkin bound can be achieved, provided that Hadamard matrices exist.

Theorem 6.21 Fix some M and a blocklength n with n mod (2ℓ−1 choose ℓ) = 0. Then a fair weak flip code C^{(M,n)} achieves the largest minimum Hamming distance among all codes of the given blocklength and satisfies

d_min(C^{(M,n)}) = n · ℓ/(2ℓ − 1).   (6.24)

Proof: For M = 2ℓ, we know that by definition the Hamming weight of each column of the codebook matrix is equal to ℓ. Hence, when changing the sum from column-wise to row-wise, where we can ignore the first row of zero weight (from the all-zero codeword x_1), we get

n · ℓ = ∑_{j=1}^{n} w_H(c_j) = ∑_{m=2}^{2ℓ} w_H(x_m)   (6.25)
      = ∑_{m=2}^{2ℓ} d_min(C^{(M,n)})   (6.26)
      = (2ℓ − 1) · d_min(C^{(M,n)}).   (6.27)

Here, (6.26) follows from Theorem 6.18 and from Corollary 6.19. For M = 2ℓ − 1, the minimum Hamming distance remains the same due to the fair construction.

It remains to show that a fair weak flip code achieves the largest minimum Hamming distance among all codes of the given blocklength. From Corollary 6.19 we know that (apart from the all-zero codeword) all codewords of a fair weak flip code have the same Hamming weight. So, if we flip an arbitrary 1 in the codebook matrix to a 0, the corresponding codeword has a decreased Hamming weight and is therefore closer to the all-zero codeword. If we flip an arbitrary 0 to a 1, the corresponding codeword moves closer to some other codeword that already has a 1 in this position. Hence, in both cases we have reduced the minimum Hamming distance. Finally, looking at the code column-wise, it can be seen that whenever we change more than one bit, we either get back to a fair weak flip code or arrive at another code that is worse.
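Theorem 6.21 can be verified numerically for the shortest fair weak flip codes, i.e., n = (2ℓ−1 choose ℓ) with each candidate column used exactly once (illustrative names):

```python
from itertools import combinations
from math import ceil, comb

def fair_weak_flip_code(M):
    """Fair weak flip code of blocklength n = |C^(M)| (each column once)."""
    cols = []
    for w in {M // 2, (M + 1) // 2}:
        for ones in combinations(range(1, M), w):   # first component stays 0
            cols.append(tuple(1 if i in ones else 0 for i in range(M)))
    return [tuple(col[m] for col in cols) for m in range(M)]

for M in (3, 4, 5, 6):
    code = fair_weak_flip_code(M)
    n, ell = len(code[0]), ceil(M / 2)
    assert n == comb(2 * ell - 1, ell)
    d_min = min(sum(a != b for a, b in zip(x, y))
                for x, y in combinations(code, 2))
    assert d_min == n * ell // (2 * ell - 1)        # (6.24)
    print(M, n, d_min)   # e.g., M = 6 gives n = 10, d_min = 6 as in Example 6.11
```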

Previous Work

7.1 SGB Bounds on the Average Error Probability

In [17], Shannon, Gallager, and Berlekamp derive upper and lower bounds on the average error probability of a given code used on a DMC. We next quickly review their results.

Definition 7.1 For 0 < s < 1 we define

μ_{α,β}(s) ≜ ln ∑_y P_{Y|X}(y|α)^{1−s} P_{Y|X}(y|β)^s.   (7.1)

Therefore, the generalized μ(s) for blocklength n between x_m and x_{m'} can be defined and expressed in terms of (7.1) by

μ(s) ≜ ln ∑_y P_{Y|X}(y|x_m)^{1−s} P_{Y|X}(y|x_{m'})^s = n ∑_α ∑_β q_{α,β}(m, m') μ_{α,β}(s),   (7.2)

and the discrepancy D^{(DMC)}(m, m') between x_m and x_{m'} is defined as

D^{(DMC)}(m, m') ≜ − min_{0≤s≤1} ∑_α ∑_β q_{α,β}(m, m') μ_{α,β}(s)   (7.3)

with q_{α,β}(m, m') given in Definition 2.9.

Note that the discrepancy is a generalization of the Hamming distance; however, it depends strongly on the channel cross-over probabilities. We use the superscript "(DMC)" to indicate the channel to which the discrepancy refers.

Definition 7.2 The minimum discrepancy D_min^{(DMC)}(C^{(M,n)}) of a codebook is the minimum value of D^{(DMC)}(m, m') over all pairs of codewords. The maximum minimum discrepancy is the maximum value of D_min^{(DMC)}(C^{(M,n)}) over all possible codebooks C^{(M,n)}: max_{C^{(M,n)}} D_min^{(DMC)}(C^{(M,n)}).
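For a DMC given as a transition matrix, the discrepancy can be evaluated numerically. The sketch below (illustrative names; a simple grid search stands in for the exact minimization over s) implements (7.1) and (7.3) and checks the result against the closed-form BSC value at s = 1/2:

```python
import math
from itertools import combinations

def mu(P, alpha, beta, s):
    """mu_{alpha,beta}(s) = ln sum_y P(y|alpha)^(1-s) * P(y|beta)^s, cf. (7.1)."""
    return math.log(sum(P[alpha][y] ** (1 - s) * P[beta][y] ** s
                        for y in range(len(P[0]))))

def discrepancy(P, xm, xmp, grid=1000):
    """D(m, m') = -min_{0<s<1} sum_{a,b} q_{a,b}(m,m') mu_{a,b}(s), cf. (7.3)."""
    n = len(xm)
    objective = lambda s: sum(mu(P, a, b, s) for a, b in zip(xm, xmp)) / n
    return -min(objective(i / grid) for i in range(1, grid))

def min_discrepancy(P, code):
    """Minimum discrepancy of a codebook, cf. Definition 7.2."""
    return min(discrepancy(P, x, xp) for x, xp in combinations(code, 2))

eps = 0.4
P_bsc = [[1 - eps, eps], [eps, 1 - eps]]         # P[x][y]
code = [(0, 0, 0), (1, 1, 1)]
# For the BSC the minimum sits at s = 1/2 (pairwise reversibility), so the
# per-position discrepancy is -ln(2*sqrt(eps*(1-eps))).
print(min_discrepancy(P_bsc, code))
print(-math.log(2 * math.sqrt(eps * (1 - eps))))
```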

Theorem 7.3 (Lower Bounds to Conditional Error Probability [17]) If x_m and x_{m'} are a pair of codewords in a code of blocklength n, then either

λ_m > (1/4) exp{ −n ( D^{(DMC)}(m, m') + √(2/n) · ln(1/P_min) ) }

or

λ_{m'} > (1/4) exp{ −n ( D^{(DMC)}(m, m') + √(2/n) · ln(1/P_min) ) },

where P_min is the smallest nonzero transition probability of the channel.

Conversely, one can also show that

λ_m ≤ (M − 1) exp{ −n · min_{m' ≠ m} D^{(DMC)}(m, m') }.

Theorem 7.4 (SGB Bounds on Average Error Probability [17]) For an arbitrary DMC, the average error probability P_e(C^{(M,n)}) of a given code C^{(M,n)} with M codewords and blocklength n is upper- and lower-bounded as follows:

(1/(4M)) · exp{ −n ( D_min^{(DMC)}(C^{(M,n)}) + √(2/n) · ln(1/P_min) ) } ≤ P_e(C^{(M,n)}) ≤ (M − 1) · exp{ −n D_min^{(DMC)}(C^{(M,n)}) },   (7.7)

where P_min denotes the smallest nonzero transition probability of the channel.

Note that these bounds are specific to a given code design (via D_min^{(DMC)}). Therefore, the upper bound is a generally valid upper bound on the optimal performance, while the lower bound only holds in general if we apply it to the optimal code or to a suboptimal code that achieves the optimal D_min^{(DMC)}.

The bounds (7.7) are tight enough to derive the error exponent of the DMC (for a fixed number M of codewords).

Theorem 7.5 ([17]) The error exponent of a DMC for a fixed number M of codewords,

E_M ≜ lim_{n→∞} −(1/n) ln P_e(C^{(M,n)*}),

equals the limit of the maximum minimum discrepancy: E_M = lim_{n→∞} max_{C^{(M,n)}} D_min^{(DMC)}(C^{(M,n)}).

Unfortunately, in general the evaluation of the error exponent is very difficult. For some cases, however, it can be done. For example, for M = 2, we have

E_2 = max_{α,β} ( − min_{0≤s≤1} μ_{α,β}(s) ).

Also for the class of so-called pairwise reversible channels, the calculation of the error exponent turns out to be uncomplicated.

Definition 7.6 A pairwise reversible channel is a DMC that satisfies μ'_{α,β}(1/2) = 0 for any inputs α, β, where μ'_{α,β}(·) denotes the derivative of μ_{α,β}(·) with respect to s.

Clearly, the BSC and BEC are pairwise reversible channels.
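This is easy to confirm numerically; in the sketch below (illustrative names, finite differences instead of an exact derivative), the derivative of μ_{0,1}(s) at s = 1/2 vanishes for both channels:

```python
import math

def mu(P, alpha, beta, s):
    return math.log(sum(P[alpha][y] ** (1 - s) * P[beta][y] ** s
                        for y in range(len(P[0]))))

def mu_prime_at_half(P, h=1e-6):
    """Central finite-difference estimate of d/ds mu_{0,1}(s) at s = 1/2."""
    return (mu(P, 0, 1, 0.5 + h) - mu(P, 0, 1, 0.5 - h)) / (2 * h)

eps, delta = 0.4, 0.3
P_bsc = [[1 - eps, eps], [eps, 1 - eps]]                 # outputs: 0, 1
P_bec = [[1 - delta, delta, 0], [0, delta, 1 - delta]]   # outputs: 0, erasure, 1
print(mu_prime_at_half(P_bsc))  # ~0: the BSC is pairwise reversible
print(mu_prime_at_half(P_bec))  # ~0: the BEC is pairwise reversible
```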

Note that it is easy to compute the pairwise discrepancy of a linear code on a pairwise reversible channel, so linear codes are quite suitable for computing (7.7).

Theorem 7.7 ([17]) For pairwise reversible channels with M > 2,

E_M = (1/(M(M − 1))) · max_{{M_x}: ∑_x M_x = M} ∑_{all input letters x} ∑_{all input letters x'} M_x M_{x'} · ( − ln ∑_y √( P_{Y|X}(y|x) P_{Y|X}(y|x') ) )   (7.11)

where M_x denotes the number of times the channel input letter x occurs in a column.

Moreover, E_M is achieved by fair weak flip codes.⁶

We would like to emphasize that while Shannon et al. proved that fair weak flip codes achieve the error exponent, they did not investigate the error performance of fair weak flip codes for finite n. As we will show later, fair weak flip codes might be strictly suboptimal for finite n (see also [18]).
