
Sparse Degrees Analysis for LT Codes Optimization

Algorithm 1 Tags Selection Function

Input: The source symbol size k, the density parameter d;

Output: The set of sparse tags;

1: procedure TSF(k, d)

2: D ← Ideal soliton distribution for size k;

3: S ← 0, E ← 1, Tags ← [];

(1, 6). The distance influence of the two pairs is the same, but the pair (1, 6), which has the complementary property, observably yields a smaller difference.

Given the above considerations, it turns out that the reallocation of the probability to degrees 3 and 5, the two degrees nearest to 4, would be the closest to the optimum. To better show how closely the adjusted distributions can approach the optimal one, we illustrate the differences of the error probabilities on a logarithmic scale in Figure 1(b). It also confirms that the error probability of the original distribution can be well approximated by the reallocated one.
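The reallocation operation under discussion can be sketched as follows; this is a minimal illustration in Python, where the dictionary representation and the even split between the two target degrees are our assumptions, not the paper's exact procedure:

```python
def reallocate(p, removed, targets):
    """Remove degree `removed` from the distribution p (degree -> probability)
    and split its probability among `targets`; the even split is an assumption."""
    q = dict(p)
    share = q.pop(removed) / len(targets)
    for t in targets:
        q[t] = q.get(t, 0.0) + share
    return q

# e.g., move the probability of degree 4 to its nearest neighbors 3 and 5:
# adjusted = reallocate(p, removed=4, targets=[3, 5])
```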

These observations suggest how to choose the tags for a sparse degree distribution.

IV. SELECTION FUNCTION FOR SPARSE TAGS

In addition to the factors observed in the experiments, we also take some intuitive properties into account. For example, a higher reallocated probability results in a larger difference of the error probability from the optimal one. Summing up all of the above, the main considerations of our degree selection strategy for sparse tags are as follows:

1) The number and value of probabilities around each degree.

2) The distance between the degrees to which probability is reallocated and the removed degree.

The first criterion comes from the positive correlation between the reallocated probability and the error probability. The second accounts for the experimental result that replacing the tag to be removed with two adjacent tags yields the best approximation of the error probability. According to these criteria, we propose the sparse tags selection function in Algorithm 1.

We consider the ideal soliton distribution since it would be the optimal degree distribution in the ideal case.
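For reference, the ideal soliton distribution on k source symbols, which the selection function takes as its input distribution, is defined (following Luby [3]) as

$$
\rho(1) = \frac{1}{k}, \qquad \rho(i) = \frac{1}{i(i-1)}, \quad i = 2, \ldots, k .
$$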

Fig. 2. Illustration of the work done by TSF(k, d).

The density parameter d acts as a bound: based on the first main consideration, degree i is not removed if its probability is larger than 1/d. On the other hand, we group the degrees with probabilities below 1/d and concentrate those probabilities on a nearby degree. The second main consideration is applied to the selection of the representative degree of each group: we accumulate the probabilities multiplied by the distance factors and take the degree at which the sum exceeds the bound 1/d. In addition, considering the complementary property of the selected degrees for distributing the probability, we reserve degrees 1 and k to ensure that there always exist degrees satisfying this property. Figure 2 illustrates the work done by the Tags Selection Function (TSF) and shows the tags selected for (k, d) = (30, 10). Tags 1 and 30 were selected to meet the complementary property. Tags 2 and 3 were selected since the probabilities of these tags in the ideal soliton distribution were above the density criterion. The remaining tags were the representatives of the grouped tags whose probabilities were below the density bound.
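Since the full listing of Algorithm 1 is truncated in this copy, the following Python sketch renders the procedure as described in the prose above; the exact form of the distance factor and the grouping rule are our assumptions:

```python
def ideal_soliton(k):
    """Ideal soliton distribution: rho(1) = 1/k, rho(i) = 1/(i*(i-1))."""
    rho = {1: 1.0 / k}
    for i in range(2, k + 1):
        rho[i] = 1.0 / (i * (i - 1))
    return rho

def tsf(k, d):
    """Sketch of the Tags Selection Function (TSF) described in the text."""
    rho = ideal_soliton(k)
    bound = 1.0 / d
    tags = {1, k}                         # reserved for the complementary property
    acc, start = 0.0, None
    for i in range(2, k):
        if rho[i] > bound:                # first criterion: keep dense degrees
            tags.add(i)
            continue
        if start is None:                 # open a new group of sparse degrees
            start = i
        acc += rho[i] * (i - start + 1)   # assumed distance factor
        if acc >= bound:                  # second criterion: group representative
            tags.add(i)
            acc, start = 0.0, None
    return sorted(tags)

# e.g., the setting of Figure 2 (the exact output may differ from the paper):
# print(tsf(30, 10))
```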

In the following, we provide some examples of sparse tags selected by Algorithm 1 for k = 100:

We can see that the sparse tags selected by TSF(100, 5) were close to the series of powers of 2 and those selected by TSF(100, 10) were close to the Fibonacci series. Such a result explains, to a certain extent, the good performance of choosing these series, powers of 2 and the Fibonacci series, as approximations to the full tags.
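The selected tag lists themselves are omitted in this copy, but the two reference series, restricted to degrees at most 100, can be generated as follows:

```python
# The two series mentioned above, truncated at k = 100.
powers_of_two = [2 ** i for i in range(7)]   # [1, 2, 4, 8, 16, 32, 64]
fib = [1, 2]
while fib[-1] + fib[-2] <= 100:
    fib.append(fib[-1] + fib[-2])            # [1, 2, 3, 5, 8, 13, 21, 34, 55, 89]
```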


TABLE I
THE ERROR PROBABILITIES OF THE BEST DEGREE DISTRIBUTIONS DISCOVERED IN EACH 30-RUN EXPERIMENT.

Tag Type    k = 100     k = 150     k = 200     k = 250     k = 300
d = 3       0.56332720  0.46689819  0.37998428  0.30450876  0.24139928
d = 5       0.56296776  0.46627322  0.37872813  0.30325441  0.24007702
d = 10      0.56293832  0.46622717  0.37871451  0.30324585  0.24008108
d = 20      0.56291529  0.46619257  0.37866506  0.30319090  0.24001890
Full Tags   0.56291403  0.46619313  0.37869051  0.30323273  0.24023868
Min.        0.56291403  0.46619257  0.37866506  0.30319090  0.24001890

Fig. 3. The evolutionary trends of fitness values in optimization. (a) Results in early stages for observing different initializations. (b) Sparse degree distributions converge faster.

V. EXPERIMENTAL RESULTS

In this section, we thoroughly examine the effects of the proposed sparse tags selection function. We set the input parameters k = {100, 150, 200, 250, 300} and d = {3, 5, 10, 20}. For each (k, d) pair, we first use the TSF function to determine the corresponding tags and then apply the CMA-ES algorithm to search for the optimal sparse degree distribution with these tags. Finally, we compare the minimal error probabilities of the sparse degree distribution and the full degree distribution. For each optimization setting, 30 independent runs were conducted due to the natural randomness of evolutionary algorithms. The minimal error probability over the 30 runs was recorded for each generation. The maximum number of function evaluations is limited to 3 × 10^4, and the optimization results are presented in Figure 3.
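As an illustration of this setup, the following sketch uses the open-source Python cma package. The objective err_prob is a hypothetical stand-in for the paper's error-probability evaluation, which is not specified in this excerpt; the normalization onto the probability simplex, the initial step size, and the example tag list are likewise our assumptions:

```python
import numpy as np
import cma  # the open-source `cma` package (pip install cma)

def err_prob(p, tags, k):
    """Hypothetical stand-in: the real objective is the LT decoding error
    probability of the distribution `p` on `tags` (e.g., via finite-length
    analysis or simulation). This placeholder merely penalizes deviation
    from the ideal soliton distribution so the sketch can run."""
    rho = {1: 1.0 / k, **{i: 1.0 / (i * (i - 1)) for i in range(2, k + 1)}}
    return float(sum((p[j] - rho[t]) ** 2 for j, t in enumerate(tags)))

def optimize_sparse(tags, k, max_evals=3 * 10**4):
    """Search for a degree distribution on `tags` with CMA-ES, mirroring the
    setup in the text: random initialization, 3 x 10^4 evaluation budget."""
    def objective(x):
        p = np.abs(x)
        p /= p.sum()                    # project onto the probability simplex
        return err_prob(p, tags, k)
    x0 = np.random.rand(len(tags))      # random initial probabilities
    es = cma.CMAEvolutionStrategy(x0, 0.3, {'maxfevals': max_evals, 'verbose': -9})
    es.optimize(objective)
    best = np.abs(es.result.xbest)
    return best / best.sum()

# e.g., dist = optimize_sparse(tags=[1, 2, 3, 5, 8, 13, 21, 34, 55, 89], k=100)
```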

Figure 3(a) shows the early phase of the optimization process. The results of all sparse degree distributions exhibit high error rates at the beginning because the initial probabilities on the sparse degrees were set to random values. In contrast, the full degree distribution was initialized as the ideal soliton distribution to avoid failure when optimizing a large number of decision variables. As the number of evaluations increases, the curves of the sparse distributions drop to the same level of error probability as the full degree distribution. Since the density parameter d of our selection function affects the size of the tag set, it is also reflected in the convergence of the curve of each sparse distribution. It can be observed that the curve with d = 3 converges first and the one with d = 20 last. Figure 3(b) shows the same experimental results over a different interval of the x-axis. We can clearly observe that, under the same optimization approach, the optimized results of the sparse distributions are better than that of the full degrees after hundreds of function evaluations.

Fast convergence is an expected advantage of adopting sparse degree distributions. Furthermore, it can be expected that the tags defined by our selection strategy approximate the full degrees in performance as closely as possible. To examine this argument, the minimal error probabilities of the different tag types are listed in Table I for comparison. The values are the minimal results found by CMA-ES in 30 runs with 3 × 10^4 function evaluations. For each column, the minimal error probability is copied to the last row (Min.). The sparse degree distribution with d = 20 leads in four of the five different k-size experiments. To give a more convenient view for comparing the results, we examine the differences between each entry and the minimal value in the same column. Since the minimal entry of each column becomes zero after subtracting the minimum, we add a base value (10^-7) when calculating the differences so that all data can be plotted in a logarithmic coordinate system. Figure 4 visualizes the values in Table I and shows the distance between each tag type and the near-optimal distribution that we have found. Although a larger value of d, which means a larger subset of degrees, causes slower convergence in optimization, more tags can form a sparse distribution with a lower error probability. Our experimental results confirm this argument and also illustrate that the selection strategy is practical.
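Concretely, the difference computation described above amounts to the following; the values are the k = 100 column of Table I:

```python
import numpy as np

# k = 100 column of Table I: d = 3, 5, 10, 20, and the full tags.
col = np.array([0.56332720, 0.56296776, 0.56293832, 0.56291529, 0.56291403])
diffs = col - col.min() + 1e-7   # base value 10^-7 keeps the zero entry plottable
# `diffs` can now be drawn on a logarithmic axis, as in Figure 4.
```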


Fig. 4. Differences between the error probabilities and the minimal one.

On the other hand, the full degree distribution is considered to have the globally minimal error probability because it is the universal set of all distributions and forms the complete search space. However, the optimization results of the full degree distributions become worse and worse as the input symbol size k increases. Since the same optimization algorithm and evaluation function are used for each degree set, the worse results of the full degree distributions can be explained by the number of decision variables being too large for CMA-ES to handle within the limited function evaluations.

VI. CONCLUSION

Using evolutionary algorithms to optimize the degree distribution for LT codes is a promising research topic. Sparse degree distributions are frequently used in place of full degrees to reduce the search space. How good a performance can be achieved by a sparse degree distribution depends on the set of its non-zero entries, i.e., its tags. However, no investigation had been done regarding how to decide appropriate tags to construct sparse degree distributions with good performance.

In this paper, the authors analyzed the influence of different degrees on the decoding rate and proposed a tag selection algorithm to choose tags for LT codes optimization. The presented experimental results clearly illustrate the practicality of the proposed tag selection algorithm.

In previous studies, researchers manually chose tags for sparse degree distributions according to their own experimental experience. Even though the chosen subsets of degrees worked well, the detailed mechanism remained unknown. This work made an effort to find guidelines for choosing appropriate tags. The proposed selection algorithm can be applied to any input size, and the level of sparseness can be conveniently controlled by adjusting the density parameter. This solution can help researchers pay more attention to the optimization algorithm rather than to the individual encoding. The paper presented a qualitative analysis of probability reallocation in a distribution.

The variations of the error probability were compared when changing the reallocated degree; a quantitative analysis is needed as the next step. If the amount of variation can be measured precisely, it will become possible to develop a local search based on such measurements to enhance optimization frameworks for LT codes. Research along this line is definitely worth pursuing, and the authors are currently taking on the challenge.

ACKNOWLEDGMENTS

The work was supported in part by the National Science Council of Taiwan under Grant NSC 99-2221-E-009-123-MY2.

REFERENCES

[1] J. W. Byers, M. Luby, M. Mitzenmacher, and A. Rege, “A digital fountain approach to reliable distribution of bulk data,” in Proceedings of the ACM SIGCOMM ’98 conference on Applications, technologies, architectures, and protocols for computer communication. ACM, 1998, pp. 56–67.

[2] J. W. Byers, M. Luby, and M. Mitzenmacher, “A digital fountain approach to asynchronous reliable multicast,” IEEE Journal on Selected Areas in Communications, vol. 20, no. 8, pp. 1528–1540, 2002.

[3] M. Luby, “LT codes,” in Proceedings of the 43rd Symposium on Foundations of Computer Science. IEEE Computer Society, 2002, pp. 271–282.

[4] E. A. Bodine and M. K. Cheng, “Characterization of Luby Transform codes with small message size for low-latency decoding,” in Proceedings of the IEEE International Conference on Communications, 2008, pp. 1195–1199.

[5] E. Hyytiä, T. Tirronen, and J. Virtamo, “Optimal degree distribution for LT codes with small message length,” in Proceedings of the 26th IEEE International Conference on Computer Communications (INFOCOM 2007), 2007, pp. 2576–2580.

[6] ——, “Optimizing the degree distribution of LT codes with an importance sampling approach,” in Proceedings of the 6th International Workshop on Rare Event Simulation (RESIM 2006), 2006, pp. 64–73.

[7] C.-M. Chen, Y.-p. Chen, T.-C. Shen, and J. K. Zao, “On the optimization of degree distributions in LT code with covariance matrix adaptation evolution strategy,” in Proceedings of the IEEE Congress on Evolutionary Computation, 2010, pp. 3531–3538.

[8] ——, “Optimizing degree distributions in LT codes by using the multiobjective evolutionary algorithm based on decomposition,” in Proceedings of the IEEE Congress on Evolutionary Computation, 2010, pp. 3635–3642.

[9] A. Talari and N. Rahnavard, “Rateless codes with optimum intermediate performance,” in Proceedings of the Global Telecommunications Conference (GLOBECOM 2009), 2009, pp. 1–6.

[10] R. Karp, M. Luby, and A. Shokrollahi, “Finite length analysis of LT codes,” in Proceedings of the IEEE International Symposium on Information Theory 2004 (ISIT 2004), 2004, p. 39.

[11] E. Maneva and A. Shokrollahi, “New model for rigorous analysis of LT-codes,” in Proceedings of the IEEE International Symposium on Information Theory (ISIT 2006), 2006, pp. 2677–2679.

[12] N. Hansen and A. Ostermeier, “Adapting arbitrary normal mutation distributions in evolution strategies: the covariance matrix adaptation,” in Proceedings of the IEEE International Conference on Evolutionary Computation, 1996, pp. 312–317.

[13] A. Auger and N. Hansen, “Performance evaluation of an advanced local search evolutionary algorithm,” in Proceedings of the 2005 IEEE Congress on Evolutionary Computation (CEC 2005), 2005, pp. 1777–1784.

[14] ——, “A restart CMA evolution strategy with increasing population size,” in Proceedings of the 2005 IEEE Congress on Evolutionary Computation (CEC 2005), 2005, pp. 1769–1776.


When and What Kind of Memetic Algorithms

Abstract—The synergy between exploration and exploitation has been a prominent issue in optimization. The rise of memetic algorithms, a category of optimization techniques which feature explicit exploration-exploitation coordination, further accentuates this issue. While memetic algorithms have achieved remarkable success in a wide range of real-world applications, the key to a successful exploration-exploitation synergy still remains obscure. Manifold empirical results and theoretical derivations have provided various perspectives on this issue from different algorithm-problem complexes. In our previous work, the concept of local search zones was proposed to provide an alternative perspective depicting the general behavior of memetic algorithms on a broad range of problems. In this work, based on the local search zone concept, we further investigate how the problem landscape and the way the algorithm explores and exploits the search space affect the performance of a memetic algorithm. The collaborative behavior of several representative archetypes of memetic algorithms, which exhibit different degrees of explorability and exploitability, is illustrated empirically and analytically on problems with different landscapes. As the empirical results are consistent with the local search zone concept and describe the behavior of various memetic algorithms on different problems, this work may reveal some essential design principles for memetic algorithms.

I. INTRODUCTION

Optimization, finding the optimal element among a set of feasible ones, is a type of problem commonly encountered in many fields. Numerous real-world and theoretical problems can be formulated as optimization problems and solved by applying or developing various optimization techniques. Among them, general meta-heuristics, population-based algorithms which explore the search space stochastically according to some common heuristics, exhibit good explorative ability and have a good chance of performing well on many real-world optimization problems, which are generally black-box problems with little a priori problem knowledge available. Some of the renowned meta-heuristics are evolutionary algorithms, particle swarm optimization, ant colony algorithms, and the like.

However, the generality that gives meta-heuristics their wide applicability also limits their efficiency. When complicated problems are encountered, without taking advantage of problem-specific information given a priori or retrieved during optimization, meta-heuristics can merely deliver mediocre performance. In an attempt to combine the good explorative ability of general meta-heuristics and the good exploitative performance of problem-specific algorithms to provide more efficient techniques for more complicated problems, techniques which employ general meta-heuristics as global search and problem-specific algorithms as local search, referred to as memetic algorithms (MAs), have thrived. A variety of successful memetic algorithms in various domains, ranging from NP-hard combinatorial problems to non-linear programming problems, have been reported [1].

Among these memetic algorithms, the synergy between global search and local search has always been one of the key design issues. The seminal study on memetic algorithms [2] and its succeeding work [3] both suggest that memetic algorithms favor infrequent starts and long running times of local search. They also proposed several renowned strategies for selecting the solution candidates on which the local search operator is applied: fitness-based selection and diversity-based selection. However, even with the aid of these guidelines, designing a memetic algorithm for a specific problem still requires considerable time, as the optimal design is not only algorithm specific but also problem dependent. Difficulties in parameterizing memetic algorithms have been practically encountered in applications and also theoretically proved on several problem classes [4], [5]. To cope with this issue, memetic algorithms have evolved from the hybridization of global search and local search to hybridization with adaptation [6]–[8]. These algorithms, adopting adaptive local searches referred to as memes, are robust and efficient at the expense of the learning cost of the memes.
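To make the global search/local search collaboration concrete, the following is a generic sketch of a memetic algorithm loop; the population model, the fitness-based choice of candidates for local search, and the minimization convention are illustrative assumptions rather than the specific algorithms discussed here:

```python
import random

def memetic_algorithm(init_pop, fitness, mutate, crossover, local_search,
                      generations=100, ls_fraction=0.2):
    """Generic MA skeleton: a meta-heuristic performs global exploration and
    a problem-specific local search refines selected candidates."""
    pop = init_pop()
    for _ in range(generations):
        # Global exploration: standard evolutionary variation operators.
        offspring = [mutate(crossover(random.choice(pop), random.choice(pop)))
                     for _ in range(len(pop))]
        # Survivor selection by fitness (minimization assumed).
        pop = sorted(pop + offspring, key=fitness)[:len(pop)]
        # Local exploitation: fitness-based selection of candidates to refine.
        n_ls = max(1, int(ls_fraction * len(pop)))
        pop[:n_ls] = [local_search(ind) for ind in pop[:n_ls]]
    return min(pop, key=fitness)
```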

These studies on different algorithm-problem complexes have provided manifold aspects of the behavior of different memetic algorithms on different problems. In our previous work [9], we proposed the concept of the local search zone to provide an alternative perspective which aims to depict the general behavior of memetic algorithms. In that work, a theoretical model depicting the exploration-exploitation synergy of the subthreshold seeker, a representative archetype of memetic algorithms, on different Quasi-Basin Classes (QBCs) was formulated to represent the general behavior of the collaboration between global search and local search in memetic computation on a broad class of objective functions.

As the theoretically and empirically verified model not only well depicts the collaborative behavior of the representative
