Chih-Ming Chen, Student Member, IEEE, Ying-ping Chen, Member, IEEE, Tzu-Ching Shen, and John K. Zao, Senior Member, IEEE
Abstract— Luby Transform code (LT code) has been a popular and practical technique in the field of channel coding since its proposal. One of the key components of LT code is a degree distribution, which determines the relationship between source data and codewords. In his proposal, Luby suggested two general methods to construct feasible degree distributions. Such general designs work appropriately in typical situations but not optimally in most cases. To explore the full potential of LT code, in this work, we make the first attempt to introduce evolutionary algorithms to optimize the degree distribution in LT code. Degree distributions are encoded as real-valued vectors and evaluated by numerical simulation of LT code. For applications of different natures, two objectives are implemented to search for good degree distributions with different decoding behavior. Compared with the original design, the experimental results are quite promising and demonstrate that the degree distribution can be customized for different purposes. As an alternative to the common practice of manually adjusting the degree distribution, the work presented in this paper provides an efficient approach to use and adapt LT code for both practitioners and researchers.
I. INTRODUCTION
Digital fountain code [1] is a popular class of erasure code in the field of communication. The concept of fountain code was first introduced by Byers et al. [2] in 1998. First, source data are divided into several pieces of identical length. Each piece can be any number of bits or even several bytes long. The sender generates encoding packets, also called encoding symbols when the packet length is one bit, by a particular encoding operation. The encoding and sending procedure may repeat independently and without limit. An unbounded number of encoding packets is sent out continuously like a fountain, which is an important property of fountain code called rateless.
If a receiver is interested in receiving the data, it can receive the packet flow at any time and collect the packets in any combination. Once sufficient packets, of which the amount is usually slightly more than that of the source data, are obtained, the source data can be fully recovered. During the process, no further communication is required between sender and receiver. Encoding information can be embedded in each packet. As a result, digital fountain code is especially useful in broadcast or other situations in which back channels are unavailable. Moreover, because source data can be reconstructed no matter which packets are received, fountain code is also considered reliable in handling the problem of packet loss.
Chih-Ming Chen, Ying-ping Chen, Tzu-Ching Shen, and John K. Zao are with the Department of Computer Science, National Chiao Tung University, 1001 Ta Hsueh Road, Hsinchu, TAIWAN.
Luby Transform code (LT code) [3] proposed by Luby in 2002 is the first practical framework of fountain code.
A novel coding mechanism based on a specifically designed degree distribution is proposed in the introduction of LT code.
The performance of LT code depends entirely on the adopted degree distribution. In his proposal, Luby designed general methods to construct an appropriate degree distribution to be used in LT code, and the degree distribution was named soliton distribution. Via theoretical analysis, the feasibility of soliton distribution was proven in the literature [4]. Recently, researchers started to optimize the degree distribution in order to improve the performance of LT code [5], [6], but the obtained improvement is quite limited. In these studies, only the parameters of soliton distribution were tuned and considered as decision variables, while in the present work, we directly consider the degree distribution itself as our decision variables.
Based on LT code, an improved framework called Raptor codes [7], [8] was proposed by Shokrollahi. Shokrollahi integrated LT code with a pre-coding layer. Compared with pure LT code, the design of Raptor codes requires a degree distribution, called weakened LT, with very different behavior and properties. Several instances were given in [9] for certain particular sizes of source symbols, but there are no existing guidelines regarding how to construct suitable degree distributions for other sizes. In this regard, we demonstrate the use of optimization techniques proposed in evolutionary computation for generating degree distributions with different, desired properties.
In this paper, to the best of our knowledge, we make the first attempt to utilize evolutionary computation techniques to optimize the degree distribution for LT code and demonstrate the feasibility of customizing degree distributions for different purposes. Particularly, we adopt the covariance matrix adaptation evolution strategy (CMA-ES) [10] to directly optimize degree distributions for two goals: reducing the overhead and lowering the failure rate. The experimental results are promising and show that reduced overheads and lower failure rates can be achieved for LT code with the obtained degree distributions for a wide range of source symbol sizes.
The remainder of this paper is organized as follows. Section II describes the detailed operations of LT code, including the coding process and soliton distribution proposed by Luby.
Section III introduces the evolutionary algorithm used in this paper. Experiments and results are given in section IV. Finally, section V concludes this paper.
in LT code is to decide the degree of each vertex in the bipartite graph with a probability distribution. The connectivity can be recorded as an encoding matrix in which each column represents an encoding symbol. Originally, k source symbols can be fully decoded by Gaussian elimination if there exist k linearly independent columns. However, Gaussian elimination is prohibitively expensive because of its computational complexity of $O(k^3)$. Therefore, the belief propagation (BP) algorithm [11] is introduced to replace the expensive Gaussian elimination in the LT decoding phase. Coding overhead is traded for computing time: belief propagation is more efficient, but more encoding symbols are needed for successful decoding. Moreover, the performance of LT code is very sensitive to the degree distribution. A good degree distribution is necessary to cooperate with belief propagation. Luby suggested soliton distributions for the LT framework in his proposal of LT code. The properties of soliton distribution have been confirmed by mathematical analysis. In this section, details of coding operations and soliton distributions are described.
A. Encoding and decoding
Given the source data, we suppose that the source data can be cut into k source symbols with the same length of ℓ bits. Before every codeword is generated, a degree d is chosen at random according to the adopted degree distribution ρ(d), where 1 ≤ d ≤ k and $\sum_{d=1}^{k} \rho(d) = 1$. The degree d decides how many distinct source symbols will be chosen to compose an encoding symbol. The d source symbols, called neighbors, are chosen uniformly at random and accumulated by XOR. In the design of LT code, random numbers play an essential role during the encoding process. To inform receivers of all encoding information, the sender synchronizes a random number generator with the receivers through a specified random number seed.
At the receiver side, when K encoding symbols have arrived, where K is usually slightly larger than k, belief propagation is used to reconstruct the source data step by step. All encoding symbols are initially covered. In the first step, all encoding symbols with only one neighbor can be directly released to recover their unique neighbor. When a source symbol has been recovered but not processed, it is called a ripple and is stored in a queue. At each subsequent step, ripples are popped one by one as the processing target. A ripple is removed from all encoding symbols that have it as a neighbor. If an encoding symbol has only one remaining neighbor after the removal, the releasing action repeats and may produce new ripples to maintain a stable size of the queue. Maintaining the size of the ripple queue is important because the decoding process fails when the ripple queue is empty and some source symbols remain uncovered. In other words, more encoding symbols are required in the decoding process. Ideally, the process succeeds if all source symbols are recovered at the end of the decoding process.

Fig. 1. Example of soliton distributions (k = 30): (a) ideal soliton distribution; (b) robust soliton distribution. Horizontal axis: degree.
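As a rough illustration of the encoding and decoding steps described in this subsection, the following Python sketch generates encoding symbols and runs the ripple-based belief propagation. It is a minimal sketch under stated assumptions, not the simulator used in the experiments: source symbols are modeled as integers, and the names `lt_encode` and `bp_decode`, the `seed` argument, and the dictionary form of the degree distribution are hypothetical choices made for illustration.

```python
import random
from collections import deque

def lt_encode(source, degree_dist, num_symbols, seed=0):
    """Return num_symbols encoding symbols as (neighbor set, XOR value) pairs."""
    rng = random.Random(seed)   # a shared seed stands in for the synchronized generator
    degrees = list(degree_dist)
    weights = [degree_dist[d] for d in degrees]
    encoded = []
    for _ in range(num_symbols):
        d = rng.choices(degrees, weights=weights)[0]    # draw a degree
        neighbors = rng.sample(range(len(source)), d)   # d distinct source symbols
        value = 0
        for i in neighbors:
            value ^= source[i]                          # accumulate by XOR
        encoded.append((set(neighbors), value))
    return encoded

def bp_decode(encoded):
    """Ripple-based belief propagation; returns {source index: recovered value}."""
    symbols = [[set(nb), val] for nb, val in encoded]   # work on copies
    users = {}   # source index -> encoding symbols that have it as neighbor
    for j, (nb, _) in enumerate(symbols):
        for i in nb:
            users.setdefault(i, []).append(j)
    recovered, ripple = {}, deque()

    def release(j):
        # An encoding symbol with one remaining neighbor recovers that neighbor.
        nb, val = symbols[j]
        if len(nb) == 1:
            i = next(iter(nb))
            if i not in recovered:
                recovered[i] = val
                ripple.append(i)

    for j in range(len(symbols)):
        release(j)
    while ripple:                       # decoding stops when the ripple queue is empty
        i = ripple.popleft()
        for j in users.get(i, []):
            nb = symbols[j][0]
            if len(nb) > 1 and i in nb:
                nb.discard(i)                   # remove the ripple from the symbol
                symbols[j][1] ^= recovered[i]
                release(j)
    return recovered
```

Decoding is complete when the returned dictionary contains all k indices; otherwise the ripple queue ran empty and more encoding symbols would be needed.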
B. Soliton distribution
The behavior of LT code is completely determined by the degree distribution, ρ(d), and the number of encoding symbols received by the receiver, K. The overhead ε = K/k measures the performance of LT code, and ε depends on the given degree distribution. Based on his theoretical analysis, Luby proposed the ideal soliton distribution, whose overhead is 1, the best possible performance, in the ideal case.
Ideal soliton distribution ρ(d):

$$\rho(d) = \begin{cases} \dfrac{1}{k} & \text{for } d = 1 \\ \dfrac{1}{d(d-1)} & \text{for } d = 2, 3, \ldots, k \end{cases} \tag{1}$$

Ideal soliton distribution guarantees that all the release probabilities are identical to 1/k at each subsequent step. Hence, there is exactly one expected ripple generated at each processing step when the encoding symbol size is k. After k processing steps, the source data can be ideally recovered.
Fig. 1(a) shows an example of ideal soliton distribution for k = 30.
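Equation (1) can be checked numerically with a few lines of Python. The sketch below is only an illustration; the function name `ideal_soliton` and the dictionary representation are assumptions, not part of the original work.

```python
def ideal_soliton(k):
    """Ideal soliton distribution of equation (1) as {degree: probability}."""
    rho = {1: 1.0 / k}
    for d in range(2, k + 1):
        rho[d] = 1.0 / (d * (d - 1))
    return rho

# The terms 1/(d(d-1)) = 1/(d-1) - 1/d telescope to 1 - 1/k,
# so together with rho(1) = 1/k the probabilities sum to one.
rho = ideal_soliton(30)
total = sum(rho.values())
```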
the decoding process. Hence, in the same paper by Luby, a modified version called robust soliton distribution, µ(d), was also proposed.
Robust soliton distribution µ(d):

$$R = c \cdot \ln(k/\delta)\sqrt{k}$$

c and δ are two parameters for tuning robust soliton distribution. c controls the mean of the degree distribution. Smaller values of c increase the probability of low degrees and larger ones decrease it. δ estimates that the expected ripple size is $\ln(k/\delta)\sqrt{k}$, as described. Fig. 1(b) is an example of robust soliton distribution with c = 0.1 and δ = 0.1. Robust soliton distribution can ensure that only $K = k + O(\ln^2(k/\delta)\sqrt{k})$ encoding symbols are required to recover the source data with a success probability of at least 1 − δ.
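For readers who want to reproduce a distribution such as the one in Fig. 1(b), the following Python sketch builds a robust soliton distribution. It assumes the form commonly given for Luby's construction, namely an extra term τ(d) = R/(dk) for d below k/R, a spike τ(k/R) = R ln(R/δ)/k, and normalization of ρ(d) + τ(d). Rounding k/R to the nearest integer is an additional assumption of this sketch, as is the function name `robust_soliton`.

```python
import math

def robust_soliton(k, c=0.1, delta=0.1):
    """Robust soliton distribution as {degree: probability} (assumed standard form)."""
    R = c * math.log(k / delta) * math.sqrt(k)
    spike = min(k, max(1, int(round(k / R))))   # assumption: k/R rounded to an integer

    # Ideal soliton part, equation (1); index 0 is unused.
    rho = [0.0] * (k + 1)
    rho[1] = 1.0 / k
    for d in range(2, k + 1):
        rho[d] = 1.0 / (d * (d - 1))

    # Additional term that boosts low degrees and adds a spike at k/R.
    tau = [0.0] * (k + 1)
    for d in range(1, spike):
        tau[d] = R / (d * k)
    tau[spike] = R * math.log(R / delta) / k    # meaningful only when R > delta

    beta = sum(rho) + sum(tau)                  # normalization constant
    return {d: (rho[d] + tau[d]) / beta for d in range(1, k + 1)}

mu = robust_soliton(30, c=0.1, delta=0.1)
```

With k = 30, c = 0.1, and δ = 0.1, R is about 3.12, so the spike of this sketch falls at degree 10.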
Robust soliton distribution is not only viable but also practical. The analysis of robust soliton distribution based on probability and statistics is sound if k is infinite. However, in practice, source data cannot be divided into infinite pieces, and as a consequence, the behavior of LT code will not exactly match the mathematical analysis, especially when k is small.
Furthermore, robust soliton distribution is a general-purpose design. It provides a convenient way to construct a distribution that works well but not optimally. In this work, we try to customize the degree distribution by using optimization tools proposed in the field of evolutionary computation.
III. OPTIMIZATION METHOD
Evolution strategies (ES) are a major branch of evolutionary computation and have been developed since the early 1960s. The key idea of ES is to evolve strategic parameters as well as decision variables. ES is well known to be quite capable of dealing with continuous optimization problems. One of the simplest ES is (1+1)-ES, where only one child is produced by Gaussian mutation to compete with its parent in each generation, and another is (1, 1)-ES, which is equivalent to a random walk.
Current general versions of ES are denoted as (µ +, λ)-ES.
In CMA-ES, only a few algorithmic parameters need to be decided because CMA-ES inherits the mechanism to adapt strategic parameters during the evolutionary process. In this work, CMA-ES is utilized to optimize the degree distribution in the LT framework for a wide range of k, the size of source symbols. In the remainder of this section, the way to adopt CMA-ES to handle the optimization of degree distributions is presented in detail.
A. Decision Variables
The first step in using an evolutionary algorithm is to encode the decision variables of the optimization problem. It is not difficult in this study because a degree distribution can directly form a real-number vector. In the evaluation phase, a real-number vector of arbitrary values can be interpreted as a probability distribution, i.e., a degree distribution, with normalization. Such an operation does not change the feasibility, although the problem complexity may be slightly increased. The definition of degree distributions tells us that d ≤ k. For a specific source symbol size k, the problem dimension is obviously at most k. However, according to the LT encoding/decoding operations, we usually do not need a non-zero probability on every single degree. Observing the soliton distributions and considering the belief propagation algorithm, no degree is strictly necessary except 1, which ensures the start of belief propagation. As a result, we optimize a selected subset of degrees in the present work. We choose some degrees, called tags, according to the Fibonacci numbers smaller than half of k to form the vector v(i) of decision variables. A degree distribution used in this paper can hence be represented by the following formula.
Optimized degree distribution ω(d):

$$\omega(d) = \begin{cases} v(i) & d = \text{the } i\text{-th Fibonacci number},\ d < k/2 \\ 0 & \text{otherwise} \end{cases} \tag{4}$$

B. Objectives
We use two indicators to evaluate degree distributions for LT code in this paper. The first one is the efficiency of the LT code with the optimized degree distribution, which has been discussed in section II-B. ε denotes the expected rate of overhead to transmit data. For example, ε = 1.2 means that in addition to the size of source data, 20% extra data are needed to recover the complete source data. This objective is to obtain a degree distribution for a specific k with the smallest ε. LT code is rateless, and the coding process depends on randomness and probability. Recovery of the source data from a fixed amount of encoding symbols cannot be guaranteed. Therefore, in order to evaluate ε, we provide infinite encoding symbols, in the form of a stream of encoding symbols, to simulate the decoding process until all source data are recovered. The average number of required encoding symbols per simulation is the fitness value of a degree distribution.

Fig. 2. Evolutionary process during the optimization of overhead. Horizontal axis: function evaluations.

TABLE I
THE BEST INDIVIDUALS FOR THE OPTIMIZATION OF OVERHEAD
Degree   k = 100    k = 400    k = 700    k = 1000
1        0.091397   0.116375   0.16058    0.129707
2        0.310884   0.255701   0.148543   0.266133
3        0.367223   0.34174    0.412275   0.321489
5        0.042648   0.112072   0.119163   0.077045
8        0.053247   0.071726   0.052843   0.124503
13       0.048949   0.028076   0.024701   0.000258
21       0.011876   0.013169   0.035112   0.019594
34       0.073776   0.030397   0.017738   0.033607
55       0          0.000264   0.002094   0.01543
89       0          0.01109    0.009837   0.00095
144      0          0.01939    0.002946   0.000143
233      0          0          0.014167   0.00075
377      0          0          0          0.010391
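The overhead evaluation described for the first indicator (feeding a stream of encoding symbols to the decoder until all source data are recovered, then averaging) can be sketched in Python as follows. This is a hedged illustration rather than the simulator used in the paper: only neighbor indices are tracked because payload values do not affect decodability, and the names `symbols_needed` and `estimate_overhead` as well as the safety `limit` are assumptions. The degree distribution must give degree 1 a non-zero probability and use degrees no larger than k.

```python
import random

def symbols_needed(k, degrees, weights, rng, limit=None):
    """Count the encoding symbols consumed until all k source symbols are recovered."""
    limit = 100 * k if limit is None else limit   # hypothetical guard against endless runs
    recovered = set()
    pending = {}    # symbol id -> neighbors not yet recovered
    waiting = {}    # source index -> ids of pending symbols that contain it
    received = 0
    while len(recovered) < k and received < limit:
        received += 1
        d = rng.choices(degrees, weights=weights)[0]
        unknown = set(rng.sample(range(k), d)) - recovered
        if not unknown:
            continue                               # symbol carries no new information
        pending[received] = unknown
        for i in unknown:
            waiting.setdefault(i, set()).add(received)
        stack = [received]
        while stack:                               # propagate releases (ripples)
            j = stack.pop()
            nb = pending.get(j)
            if nb is None or len(nb) != 1:
                continue
            i = next(iter(nb))                     # released symbol recovers i
            recovered.add(i)
            for j2 in waiting.pop(i, ()):
                pending[j2].discard(i)
                if len(pending[j2]) == 1:
                    stack.append(j2)
                elif not pending[j2]:
                    del pending[j2]
    return received

def estimate_overhead(k, dist, runs=100, seed=0):
    """Average overhead (received symbols / k) over independent simulation runs."""
    rng = random.Random(seed)
    degrees = list(dist)
    weights = [dist[d] for d in degrees]
    total = sum(symbols_needed(k, degrees, weights, rng) for _ in range(runs))
    return total / (runs * k)
```

The default of 100 runs mirrors the number of simulation runs per function evaluation stated in the experiments; the result is a noisy estimate, so repeated evaluations of the same distribution will differ when the seed changes.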
The second indicator is the number of source symbols that cannot be recovered when a constant ratio of encoding symbols is received. In Raptor codes, low-density parity-check (LDPC) code [15] is introduced into LT code as a second, pre-coding layer. LDPC is a kind of forward error correction code. More information on LDPC can be found in [16], [17]. LDPC can fix errors in data without extra information as long as the error rate is lower than a certain threshold. Under such a condition, the mission of LT code is no longer to achieve full decoding. Instead, it is sufficient that most of the source symbols can be recovered with a small overhead. For this purpose, we try to minimize the number of unrecovered source symbols given a constant overhead ε.
Fig. 3. Average performance indicators compared between robust soliton distribution and optimized degree distributions for different numbers of source symbols (k). Horizontal axis: source symbol size k.
IV. EXPERIMENTS AND RESULTS
Two series of experiments are implemented for the two different objectives described in the previous section. In each experiment, tags are determined by Fibonacci numbers and the specified source symbol size k. Tags are encoded as an individual, v(i), meaning that only these degrees have non-zero probabilities. Initial values of tags are set uniformly to 1/|v|, and then CMA-ES is applied without any customization or modification. After a new individual is created, it is normalized to be a valid probability distribution and evaluated for the fitness value by simulating the LT coding process. One hundred independent runs of simulation are conducted for each function evaluation. In the first series of experiments, we minimize the expected number of encoding symbols for full decoding. In the second, the average number of source symbols that cannot be recovered for a constant ε = 1.1 is considered. We call the second indicator the failure rate. The default parameter settings given in the source code of CMA-ES are adopted in this study except for λ = 10.
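The construction of an individual described above can be illustrated with a short Python sketch. The tag selection follows the rule stated in the text (Fibonacci numbers smaller than k/2). The handling of negative entries by absolute value is an assumption of this sketch, since the text only states that the vector is normalized, and the function names are hypothetical.

```python
def fibonacci_tags(k):
    """Distinct Fibonacci numbers 1, 2, 3, 5, ... smaller than k/2."""
    tags, a, b = [], 1, 2
    while a < k / 2:
        tags.append(a)
        a, b = b, a + b
    return tags

def to_degree_distribution(v, tags):
    """Normalize a real-valued vector into a degree distribution on the tags."""
    w = [abs(x) for x in v]      # assumption: negative entries are folded by abs
    total = sum(w)
    return {d: x / total for d, x in zip(tags, w)}

tags = fibonacci_tags(100)                     # tags for k = 100
v0 = [1.0 / len(tags)] * len(tags)             # uniform initial individual, 1/|v|
omega0 = to_degree_distribution(v0, tags)
```

For k = 100 this yields the eight tags 1, 2, 3, 5, 8, 13, 21, and 34, which matches the non-zero rows of the k = 100 column in Table I.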
A. Overhead
In these experiments, we minimize the overhead ε for different values of k, and the results are shown in Table I and Figs. 2–5. Fig. 2 presents the improvement during the evolutionary process. Individuals are initially uniform distributions. As expected, overheads are quite high in the beginning, and the curves descend quickly after around 100 function evaluations. Finally, the fitness almost converges after 200 function evaluations. Fig. 3 shows the comparison of ε between robust soliton distribution and the optimized distributions. The expected overhead of robust soliton distribution is given as

$$\frac{k + O(\ln^2(k/\delta)\sqrt{k})}{k}.$$
Fig. 4. Left figures show the optimized degree distributions (probability versus degree); only tags are presented. Right figures are the histogram and accumulated curve of successful rate (versus overhead) in 1000 independent simulation runs. Panels: (a) k = 100; (b) k = 400 (AVG: 1.1465); (c) k = 700 (AVG: 1.126); (d) k = 1000 (AVG: 1.1083).
Fig. 5. For comparison with the same values of k, robust soliton distributions and the corresponding performance indicators are shown in the same manner as in Fig. 4. Note that only parts of robust soliton distributions are plotted for clarity. Panels: (a) k = 100; (b) k = 400 (AVG: 1.2805); (c) k = 700 (AVG: 1.2377); (d) k = 1000 (AVG: 1.2188).
Fig. 6. Evolutionary process during the optimization of failure rate. Horizontal axis: function evaluations; vertical axis: fail rate.
TABLE II
THE BEST INDIVIDUALS FOR THE OPTIMIZATION OF FAILURE RATE
Degree   k = 100    k = 400    k = 700    k = 1000
1        0.083997   0.102892   0.116854   0.115278
2        0.573671   0.383164   0.29678    0.333564
3        0.161178   0.237312   0.31115    0.241065
5        0.08038    0.186475   0.171342   0.184027
8        0.096245   0.030706   0.033393   0.046818
13       0.001267   0.039075   0.025977   0.022223
21       0.002963   0.015193   0.023452   0.022914
34       0.000299   0.000167   0.016096   0.020526
55       0          0.001276   0.002602   0.00643
89       0          0.000303   0.000268   0.004594
144      0          0.003436   0.002072   0.001422
233      0          0          0.000015   0.000883
377      0          0          0          0.000257
The value becomes smaller when k increases, which is why Fig. 3 shows a declining trend. The values of overhead are reduced by at least 10% for all values of k with the optimized degree distributions. Some distributions of the best individuals are given in Table I. Fig. 4 illustrates each distribution and shows the histogram of successful rate in 1000 simulation runs on the right side. Compared with similar simulation results of robust soliton distribution in Fig. 5, the improvement is quite significant.
B. Failure rate
Unlike the original LT code, we are concerned with how many source symbols can be recovered in the second set of experiments. The objective value is the average number of source symbols that cannot be recovered with a constant overhead ε. Optimization results are shown in Fig. 6. More function evaluations are needed to search for good degree distributions. The failure rates of the final results are less than $10^{-1}$ for all values of k when ε = 1.1. In other words, more than 90 percent of source symbols can be recovered if an extra 10 percent of encoding symbols are collected. Table II gives the best probability distributions found in the evolutionary process for k = 100, k = 400, k = 700, and k = 1000.

Fig. 7. The significant difference of failure rate after optimization (curves: uniform, robust soliton, and optimized; fail rate of symbols versus overhead). Similar to Fig. 4, only tags are shown in the figures.

The simulation results of a constant overhead are presented in Fig. 7. The red line denotes the behavior of uniform distribution, which is the initial value of optimization. Most of the source symbols remain covered except for those of which the degree is one, i.e., with probability 1/k. The same situation happens to robust soliton distributions because the
accomplish the optimization task. Different from the previous studies reported in the literature, the probabilities of degrees were directly encoded as an individual to optimize. Promising experimental results were obtained in both sets of experiments: