
Shannon proved in his ground-breaking work [1] that it is possible to find an information transmission scheme that transmits messages with arbitrarily small error probability as long as the transmission rate in bits per channel use is below the so-called capacity of the channel. However, he did not provide a way to find such schemes. In particular, he told us little about the design of codes apart from the fact that good codes may need to have a large blocklength.

For many practical applications, exactly this latter constraint is rather unfortunate, as we often cannot tolerate much delay (e.g., in inter-human communication or in time-critical control and communication). Moreover, the system complexity usually grows exponentially in the blocklength, so a large blocklength might not be an option and we have to restrict the codewords to some reasonable size. The question then arises as to what can theoretically be said about the performance of communication systems with such a restricted block size.

In recent years, there has been a renewed interest in the theoretical understanding of finite-length coding [2]–[5]. There are several possible ways to approach the problem of finite-length codes. In [2], the authors fix an acceptable error probability and a finite blocklength and then find bounds on the maximal achievable transmission rate. This parallels the method of Shannon, who set the acceptable error probability to zero but allowed infinite blocklength, and then found the maximum achievable transmission rate (the capacity). A typical example in [2] shows that for a blocklength of 1800 channel uses and an error probability of 10⁻⁶, one can achieve a rate of approximately 80 percent of the capacity of a binary symmetric channel of capacity 0.5 bits.
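To give a feeling for where such numbers come from, the following is a minimal sketch of the normal approximation R ≈ C − √(V/n)·Q⁻¹(ε) associated with this line of work; the crossover probability 0.11 (which gives a BSC capacity of roughly 0.5 bits) and all names in the code are our own illustrative choices, not taken from [2].

```python
from math import log2, sqrt
from statistics import NormalDist

def bsc_normal_approximation(p, n, eps):
    """Normal approximation R ~ C - sqrt(V/n) * Q^{-1}(eps) for a BSC
    with crossover probability p, blocklength n, error probability eps."""
    C = 1 + p * log2(p) + (1 - p) * log2(1 - p)    # capacity in bits/use
    V = p * (1 - p) * (log2((1 - p) / p)) ** 2     # channel dispersion
    q_inv = NormalDist().inv_cdf(1 - eps)          # inverse Q-function
    return C - sqrt(V / n) * q_inv

p = 0.11                   # crossover prob.; capacity is roughly 0.5 bits
C = 1 + p * log2(p) + (1 - p) * log2(1 - p)
R = bsc_normal_approximation(p, n=1800, eps=1e-6)
print(f"rate ~ {R:.3f} bits/use, i.e. {R / C:.0%} of capacity {C:.3f}")
```

Running this sketch indeed yields a rate of roughly 79 percent of capacity, consistent with the example quoted above.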

In another approach, one fixes the transmission rate and studies how the error probability depends on the blocklength n (i.e., one basically studies error exponents, but for relatively small n [6]). For example, [5] introduces new random-coding bounds that enable a simple numerical evaluation of the error probability at finite blocklengths.

All these results have in common that they are related to Shannon's ideas in the sense that they try to make fundamental statements about what is possible and what is not. The exact manner in which such systems have to be built is deliberately ignored.

Our approach in this thesis is different. Based on the insight that for very short blocklengths one cannot hope to transmit much information with acceptable error probability, we concentrate on codes with a small fixed number of codewords: so-called ultra-small block-codes. By this reduction of the transmission rate, our results are directly applicable even for very short blocklengths. In contrast to [2], which provides bounds on the best possible theoretical performance, we try to find a best possible design that minimizes the average error probability. Hence, we put strong emphasis on gaining insight into how to actually build an optimal system. In this respect, this thesis is more comparable to [7]. There the authors try to describe the empirical distribution of good codes (i.e., of codes that approach capacity with vanishing error probability) and show that for a large enough blocklength, the empirical distribution of certain good codes converges in the sense of divergence to a set of input distributions that maximize the input-output mutual information. Note, however, that [7] again focuses on the asymptotic regime, while our focus lies on finite blocklengths.

There are interesting applications for ultra-small block-codes, e.g., in the situation of establishing an initial connection in a wireless link: the amount of information that needs to be transmitted during the setup of the link is very limited, usually only a couple of bits, but these bits need to be transmitted in very short time (e.g., blocklengths in the range of n = 20 to n = 30) with the highest possible reliability [8]. Another important application for ultra-small block-codes lies in the area of quality of service (QoS). In many delay-sensitive wireless systems, e.g., voice over IP (VoIP) and wireless interactive and streaming video applications, it is essential to comply with certain limitations on queuing delays or buffer violation probabilities [3]–[4]. A further area where the performance of short codes is relevant is proposed in [9]: effective rateless short codes can be used to transmit limited feedback about the channel state information in a wireless link or in some other latency-constrained application. Hence, it is of significant interest to analyze (and to provide predictions for) the performance of practical finite-blocklength systems. Note that while the motivation of this work focuses on rather small values of n, our results nevertheless hold for arbitrary finite n.

The study of ultra-small block-codes is interesting not only because of the above-mentioned direct applications, but also because their analytic description is a first step toward a better fundamental understanding of optimal nonlinear coding schemes (with ML decoding) and of their performance based on the exact error probability rather than on an upper bound on the achievable error probability derived from the union bound or the mutual information density bound and its statistics [10], [11].

To simplify our analysis, we restrict ourselves for the moment to binary-input, binary-output discrete memoryless channels, which we call in their general form binary asymmetric channels (BAC). The two most important special cases of the BAC, the binary symmetric channel (BSC) and the Z-channel (ZC), are then investigated in more detail.

The other channel we focus on is the binary-input, ternary-output channel known as the binary erasure channel (BEC).
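For concreteness, the sketch below writes each of these four channel models as a row-stochastic transition matrix; the parameter names eps0, eps1, and delta are our own illustrative labels, not notation used in later chapters.

```python
import numpy as np

def bac(eps0, eps1):
    """Binary asymmetric channel: rows are inputs 0/1, columns outputs 0/1.
    eps0 = P(1|0) and eps1 = P(0|1) are the two crossover probabilities."""
    return np.array([[1 - eps0, eps0],
                     [eps1, 1 - eps1]])

def bsc(eps):
    """Binary symmetric channel: the BAC with equal crossovers."""
    return bac(eps, eps)

def zc(eps1):
    """Z-channel: a 0 is always received correctly, a 1 flips w.p. eps1."""
    return bac(0.0, eps1)

def bec(delta):
    """Binary erasure channel: binary input, ternary output 0 / erasure / 1."""
    return np.array([[1 - delta, delta, 0.0],
                     [0.0, delta, 1 - delta]])

for name, P in [("BAC", bac(0.1, 0.3)), ("BSC", bsc(0.1)),
                ("ZC", zc(0.3)), ("BEC", bec(0.2))]:
    assert np.allclose(P.sum(axis=1), 1.0)   # each row must sum to one
    print(name, P, sep="\n")
```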

Our main contributions are as follows:

• We provide first fundamental insights into the performance analysis of optimal nonlinear code design for the BAC. Note that there exists a vast literature about nonlinear codes, their properties, and good linear design (e.g., [12]). Some Hamming-distance-related topics of nonlinear codes are addressed in [13].¹

• We provide new insights into the optimal code construction for the BAC for an arbitrary finite blocklength n and for M = 2 codewords.

• We provide optimal code constructions for the ZC for an arbitrary finite blocklength n and for M = 2, 3, and 4 codewords. For the BSC, we show an achievable best code design for M = 2, 3, and we have also found the optimal linear codes for M = 4. For the ZC we also conjecture an optimal design for M = 5.

• We provide optimal code constructions for the BEC for an arbitrary finite blocklength n and for M = 2, 3 codewords. We have also found the optimal linear codes for M = 4, and we conjecture an optimal design for M = 5, 6. For certain blocklengths, an optimal code structure is conjectured for arbitrary M.

• For the ZC, the BSC, and the BEC, we derive the exact performance for comparison. Some known bounds for finite blocklengths with a fixed number of codewords are also introduced.

• We propose a new approach to the design and analysis of block-codes: instead of focusing on the codewords (i.e., the rows in the codebook matrix), we look at the codebook matrix in a column-wise manner; a toy illustration follows below.
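The following toy sketch (with a codebook of our own choosing, not one of the optimal constructions of this thesis) contrasts the usual row-wise reading of a codebook matrix with the column-wise view.

```python
import numpy as np

# A toy codebook with M = 3 codewords of blocklength n = 4:
# each ROW is a codeword, the usual way of reading the matrix.
C = np.array([[0, 0, 0, 0],
              [1, 1, 0, 0],
              [1, 0, 1, 1]])

for m, codeword in enumerate(C):        # row-wise: what each message sends
    print(f"codeword {m}: {codeword}")

for k, column in enumerate(C.T):        # column-wise: what all messages
    print(f"time step {k}: {column}")   # jointly send at channel use k
```

In the column-wise view, each column describes one channel use across all codewords, which is the perspective our analysis builds on.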

The remainder of this thesis is structured as follows: after some comments about our notation, we introduce some common definitions and our channel models in Chapter 2 and Chapter 3, followed by some more preliminaries in Chapter 4. Chapter 5 contains a very short example showing that the analysis of even such simple channel models is nontrivial and often nonintuitive. Chapter 6 then presents new code definitions that will be used for our main results. In Chapter 7, we review some important previous work. Chapters 8–11 then contain our main results: in Chapter 8 we analyze the BAC for two codewords only, and Chapter 9 takes a closer look at the ZC. In Chapter 10 and Chapter 11, we investigate the BSC and the BEC, respectively. Many of the lengthy proofs have been moved to the appendix.

As is common in coding theory, vectors (denoted by boldface Roman letters, e.g., x) are row-vectors. However, for simplicity of notation and to avoid a large number of transpose signs, we slightly misuse this notational convention in one special case: any vector c is a column-vector. This should always be clear from the context, because these vectors are used to build codebook matrices and are therefore also conceptually quite different from the transmitted codewords x or the received sequence y. Otherwise our notation follows the mainstream: we use capital letters for random quantities and small letters for realizations; sets are denoted by a calligraphic font, e.g., D; and constants are depicted by Greek letters, small Roman letters, or a special font, e.g., M.

¹Note that some of the code designs proposed in this thesis actually have interesting "linear-like" properties and can be considered as generalizations of linear codes with 2^k codewords to codes with a general number of codewords M. For more details see [14].

Definitions

2.1 Discrete Memoryless Channel

Probably the most fundamental model describing communication over a noisy channel is the so-called discrete memoryless channel (DMC). A DMC consists of

• a finite input alphabet X;

• a finite output alphabet Y; and

• a conditional probability distribution P_{Y|X}(·|x) for all x ∈ X such that

$$P_{Y_k \mid X_1, \ldots, X_k, Y_1, \ldots, Y_{k-1}}(y_k \mid x_1, \ldots, x_k, y_1, \ldots, y_{k-1}) = P_{Y \mid X}(y_k \mid x_k) \quad \forall\, k. \tag{2.1}$$

Note that a DMC is called memoryless because the current output Y_k depends only on the current input x_k. Note also that the channel is time-invariant in the sense that for a particular input x_k, the distribution of the output Y_k does not change over time.
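To make the memorylessness property (2.1) concrete, here is a minimal simulation sketch of a DMC (our own illustration, using a BSC transition matrix as the example): each output Y_k is drawn from P_{Y|X}(·|x_k), independently of all earlier inputs and outputs.

```python
import numpy as np

rng = np.random.default_rng(0)

def dmc_transmit(P, x):
    """Pass the input sequence x through a DMC with row-stochastic
    transition matrix P: each output Y_k is drawn from P_{Y|X}(.|x_k),
    independently of everything that came before (memorylessness)."""
    return np.array([rng.choice(P.shape[1], p=P[xk]) for xk in x])

P = np.array([[0.9, 0.1],      # example: a BSC with crossover 0.1
              [0.1, 0.9]])
x = np.array([0, 1, 1, 0, 1])
print(dmc_transmit(P, x))      # one realization of Y_1, ..., Y_5
```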

Definition 2.1 We say a DMC is used without feedback if

$$P(x_k \mid x_1, \ldots, x_{k-1}, y_1, \ldots, y_{k-1}) = P(x_k \mid x_1, \ldots, x_{k-1}) \quad \forall\, k, \tag{2.2}$$

i.e., X_k depends only on past inputs (by choice of the encoder), but not on past outputs.

Hence, there is no feedback link from the receiver back to the transmitter that would inform the transmitter about the last outputs.

Note that even though we assume the channel to be memoryless, we do not restrict the encoder to be memoryless! We now have the following theorem.

Theorem 2.2 If a DMC is used without feedback, then

$$P(y_1, \ldots, y_n \mid x_1, \ldots, x_n) = \prod_{k=1}^{n} P_{Y \mid X}(y_k \mid x_k) \quad \forall\, n \ge 1. \tag{2.3}$$

Proof: See, e.g., [15].
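As a quick numerical illustration of Theorem 2.2 (again our own sketch, reusing the example BSC from above), the following code evaluates P(y₁, …, yₙ | x₁, …, xₙ) with the product formula (2.3) and compares it to a Monte Carlo estimate obtained from repeated memoryless transmissions.

```python
import numpy as np

rng = np.random.default_rng(1)
P = np.array([[0.9, 0.1],          # example BSC with crossover 0.1
              [0.1, 0.9]])
x = np.array([0, 1, 1, 0, 1])
y = np.array([0, 1, 0, 0, 1])

# Product formula (2.3): P(y|x) = prod_k P_{Y|X}(y_k | x_k).
p_product = np.prod(P[x, y])

# Monte Carlo: simulate many transmissions of x and count how often
# exactly y is received. For a binary-output channel, Y_k = 1 with
# probability P[x_k, 1], independently for each channel use.
trials = 200_000
outputs = (rng.random((trials, x.size)) < P[x, 1]).astype(int)
p_mc = np.mean(np.all(outputs == y, axis=1))

print(f"product formula: {p_product:.5f}   Monte Carlo: {p_mc:.5f}")
```

Both values agree (approximately 0.0656 here), as the theorem guarantees.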
