IEEE COMMUNICATIONS LETTERS, VOL. 7, NO. 8, AUGUST 2003 391
An Algorithm for Construction of Efficient Fix-Free Codes
Ksenija Laković, Student Member, IEEE, and John Villasenor
Abstract—We propose an algorithm for construction of fix-free codes, which involves a new codeword selection mechanism.
Starting from the shortest codewords, the proposed algorithm selects fix-free codewords of any length $L$ with respect to maximizing the number of available codewords of length $L + 1$. Due to the increased fix-free codeword availability, the proposed algorithm constructs codes of higher efficiency relative to the other algorithms in the literature.
Index Terms—Fix-free codes, Huffman codes, prefix codes, reversible variable length codes (RVLCs).
I. INTRODUCTION
A fix-free code is a code such that no codeword is a prefix or a suffix of any other codeword. In a fix-free code any finite sequence of codewords can be decoded in both directions, which can be used by a decoder to enhance robustness in the presence of transmission errors. Fix-free codes are also known as reversible variable length codes (RVLCs), and their applications have been studied extensively [1]–[6], particularly in the development of video standards H.263+ and MPEG-4. The conditions for the existence of fix-free codes were considered in [7], [8], while [9] established the bounds on the redundancy of fix-free codes.

The algorithms for construction of fix-free codes were studied in [1], [2]. While Takishima et al. [1] considered the construction of symmetric and asymmetric codes separately, Tsai and Wu [2] introduced a generic algorithm that involves selection mechanisms for symmetric and asymmetric RVLCs. In both cases it was shown that asymmetric codes exhibit significantly higher efficiency; therefore we restrict our attention to construction of these codes.
RVLC construction algorithms generally use a Huffman code as an input [1], [2]. Since the codeword length distribution of Huffman codes is the optimal prefix code distribution, RVLC construction starts by selecting the shortest RVLC codewords that have the same length as the shortest Huffman codewords. The longer RVLC codewords are selected to have the same length as the Huffman codewords, provided there exist codewords of that length simultaneously satisfying both the prefix and suffix conditions. In the case that such codewords do not exist, RVLC codeword length must be increased, which decreases the RVLC code efficiency.

Manuscript received January 29, 2003. The associate editor coordinating the review of this letter and approving it for publication was Prof. M. Fossorier.
This work was supported by the Office of Naval Research under Contract N00014-01-C-016. This paper was presented in part at the 36th Asilomar Conference for Signals, Systems and Computers, Pacific Grove, CA, November 2002.
The authors are with the Department of Electrical Engineering, University of California, Los Angeles, CA 90095 USA (e-mail: [email protected];
Digital Object Identifier 10.1109/LCOMM.2003.815660
The number of codewords of a certain length that satisfy both the prefix and suffix conditions depends on the previously assigned (shorter) RVLC codewords. This dependency was not considered in the development of the RVLC construction algorithm in [1]; therefore this algorithm generally produces relatively inefficient RVLCs. In [2] it was conjectured that the number of available (prefix and suffix condition satisfying) codewords of any length depends on the metric called minimum repetition gap (MRG) associated with the shorter RVLC codewords. The algorithm proposed in [2] performs RVLC codeword assignment on the basis of MRG, and generally produces more efficient codes than the algorithm proposed in [1]. It is possible to establish a formal relationship between the number of available RVLC codewords of any length and the structure of shorter RVLC codewords [9]. In Section II of this paper we present this relationship, in a form that can be easily utilized by an RVLC construction algorithm. We show that the number of available RVLC codewords does not depend exclusively on MRG, and use this result to propose a new RVLC construction algorithm that produces more efficient RVLCs than the algorithms in [1], [2]. The proposed algorithm is presented in Section III and compared to the other algorithms in the literature. Section IV concludes the paper.
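The defining property stated at the beginning of this section, and the resulting two-directional decodability, can be illustrated with a minimal Python sketch. The function names and the four-codeword example code are hypothetical and chosen only for illustration; the decoders assume an error-free bit string that is a concatenation of codewords.

```python
from itertools import permutations

def is_fix_free(codewords):
    """True if no codeword is a prefix or a suffix of another codeword."""
    words = list(codewords)
    if len(set(words)) != len(words):
        return False  # a repeated codeword is a prefix of its duplicate
    return not any(b.startswith(a) or b.endswith(a)
                   for a, b in permutations(words, 2))

def decode_forward(bits, codewords):
    """Parse left to right; the prefix condition makes each match unique."""
    out, pos = [], 0
    while pos < len(bits):
        match = next(w for w in codewords if bits.startswith(w, pos))
        out.append(match)
        pos += len(match)
    return out

def decode_backward(bits, codewords):
    """Parse right to left; the suffix condition makes each match unique."""
    out, end = [], len(bits)
    while end > 0:
        match = next(w for w in codewords if bits.endswith(w, 0, end))
        out.append(match)
        end -= len(match)
    return out[::-1]

code = ["00", "11", "010", "101"]          # hypothetical example code
message = ["00", "010", "11", "101"]
bits = "".join(message)
print(is_fix_free(code))                    # True
print(is_fix_free(["0", "10", "11"]))       # False: "0" is a suffix of "10"
print(decode_forward(bits, code) == decode_backward(bits, code) == message)  # True
```

The second call shows that a prefix code such as a typical Huffman code is in general not fix-free, which is why the RVLC codewords have to be selected separately.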
II. AFFIX INDICES AND FIX-FREE CODEWORD AVAILABILITIES
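Let a binary codeword of length $n$ be denoted as $\mathbf{c} = c_1 c_2 \cdots c_n$. Define the prefix set $P_L(\mathbf{c})$ as the set of length-$L$ codewords, $L > n$, which are prefixed by $\mathbf{c}$:

$$P_L(\mathbf{c}) = \{x_1 x_2 \cdots x_L : x_1 x_2 \cdots x_n = c_1 c_2 \cdots c_n\}. \tag{1}$$

Define the suffix set $S_L(\mathbf{c})$ as the set of length-$L$ codewords, $L > n$, which are suffixed by $\mathbf{c}$:

$$S_L(\mathbf{c}) = \{x_1 x_2 \cdots x_L : x_{L-n+1} x_{L-n+2} \cdots x_L = c_1 c_2 \cdots c_n\}. \tag{2}$$

The cardinality of the prefix set is obviously equal to the cardinality of the suffix set:

$$|P_L(\mathbf{c})| = |S_L(\mathbf{c})| = 2^{L-n}. \tag{3}$$

Assume that $C = \{\mathbf{c}_1, \mathbf{c}_2, \ldots, \mathbf{c}_m\}$ is a set of fix-free codewords of lengths $n_1, n_2, \ldots, n_m$. Denote with $A_L(C)$ the number of codewords of length $L$, $L > \max_i n_i$, which are neither prefixed nor suffixed by any codeword from $C$. Since the total number of length-$L$ codewords is equal to $2^L$, it holds that

$$A_L(C) = 2^L - \left| \bigcup_{i=1}^{m} \big( P_L(\mathbf{c}_i) \cup S_L(\mathbf{c}_i) \big) \right|. \tag{4}$$

Recall that no fix-free codeword is a prefix or a suffix of another codeword. This implies that all the prefix sets, as well as all the suffix sets, are disjoint, i.e.,

$$P_L(\mathbf{c}_i) \cap P_L(\mathbf{c}_j) = \emptyset, \quad S_L(\mathbf{c}_i) \cap S_L(\mathbf{c}_j) = \emptyset, \quad i \neq j. \tag{5}$$

Therefore

$$A_L(C) = 2^L - 2 \sum_{i=1}^{m} 2^{L-n_i} + \sum_{i=1}^{m} \sum_{j=1}^{m} \left| P_L(\mathbf{c}_i) \cap S_L(\mathbf{c}_j) \right|. \tag{6}$$

Now define the affix set $X_L(\mathbf{c}_i, \mathbf{c}_j)$ as the set of codewords of length $L$, $L > \max(n_i, n_j)$, prefixed by $\mathbf{c}_i$ and suffixed by $\mathbf{c}_j$:

$$X_L(\mathbf{c}_i, \mathbf{c}_j) = P_L(\mathbf{c}_i) \cap S_L(\mathbf{c}_j). \tag{7}$$

Obviously, $A_L(C)$ depends on the cardinalities of affix sets $|X_L(\mathbf{c}_i, \mathbf{c}_j)|$, $i, j = 1, \ldots, m$, i.e., on the affix index of the set $C$, defined as

$$I_L(C) = \sum_{i=1}^{m} \sum_{j=1}^{m} \left| X_L(\mathbf{c}_i, \mathbf{c}_j) \right|. \tag{8}$$

It can be easily shown that the cardinalities of affix sets $X_L(\mathbf{c}_i, \mathbf{c}_j)$ are given as follows [9]. Let $c_{i,k}$ denote the $k$th bit of $\mathbf{c}_i$. In the case that $L < n_i + n_j$,

$$|X_L(\mathbf{c}_i, \mathbf{c}_j)| = \begin{cases} 1 & \text{if } c_{i,L-n_j+1} \cdots c_{i,n_i} = c_{j,1} \cdots c_{j,n_i+n_j-L} \\ 0 & \text{if } c_{i,L-n_j+1} \cdots c_{i,n_i} \neq c_{j,1} \cdots c_{j,n_i+n_j-L} \end{cases} \tag{9}$$

while for $L \geq n_i + n_j$

$$|X_L(\mathbf{c}_i, \mathbf{c}_j)| = 2^{L-n_i-n_j}. \tag{10}$$

To illustrate the relationship between the fix-free codeword availabilities and the structure of the shorter codewords,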
consider two codeword sets and
. The affix sets for
the codewords from and are given in Table I. Using Table I, (8) and (6), it can be computed that and
for , while and for .
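The closed form (6), together with the affix-set cardinalities (9) and (10), can be checked numerically against a direct enumeration. The Python sketch below is illustrative only: the function names are hypothetical, the example codewords are not the sets of Table I, and (6) is read as $2^L$ minus twice the sum of $2^{L-n_i}$ plus the affix index (8), for $L$ no smaller than the longest codeword in the set.

```python
from itertools import product

def affix_cardinality(ci, cj, L):
    """Number of length-L words prefixed by ci and suffixed by cj."""
    overlap = len(ci) + len(cj) - L
    if overlap <= 0:
        return 2 ** (-overlap)      # free bits between prefix and suffix
    # overlapping case: the tail of ci must coincide with the head of cj
    return 1 if ci[-overlap:] == cj[:overlap] else 0

def affix_index(code, L):
    """Sum of affix-set cardinalities over all ordered codeword pairs."""
    return sum(affix_cardinality(ci, cj, L) for ci in code for cj in code)

def available_count(code, L):
    """Closed-form number of available length-L codewords."""
    return 2 ** L - 2 * sum(2 ** (L - len(c)) for c in code) + affix_index(code, L)

def available_words(code, L):
    """Direct enumeration of the available length-L codewords."""
    words = ("".join(bits) for bits in product("01", repeat=L))
    return [w for w in words
            if not any(w.startswith(c) or w.endswith(c) for c in code)]

print(available_count(["00"], 3))   # 5
print(available_count(["01"], 3))   # 4
code = ["00", "11", "010"]          # hypothetical fix-free set
print(available_count(code, 4), available_words(code, 4))  # 2 ['0110', '1001']
```

The single-codeword cases show the effect of the self-overlap term: "00" overlaps with itself at length 3 and therefore leaves one more codeword available than "01".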
TABLE I
EXAMPLES OF AFFIX SETS

Now consider the minimum repetition gaps for the codewords in and . The minimum repetition gap is used as the codeword selection criterion in Tsai’s algorithm [2], and can be interpreted as follows: The minimum repetition gap of a length- codeword , denoted as , is equal to the minimal positive integer such that . The algorithm in [2] selects the codewords with the lowest , which contributes to the increase in , with the lowest first,
because depends on as in (6). However,
this algorithm does not consider the elements , , which have a significant impact on , and consequently on the coding efficiency. Between and
Tsai’s algorithm selects , because and
, while . However,
the set is more favorable with respect to coding efficiency
[ and for while and
for ]. The algorithm proposed in the next section performs the selection of fix-free codeword sets on the basis of their affix indices, which yields the increased availability of higher length codewords and the improved coding efficiency.
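The difference between an MRG-based and an affix-index-based comparison can be reproduced on a small case. The sketch below is a hedged illustration, not the example of Table I: the two codeword sets are hypothetical, and the MRG of a length-$n$ codeword is taken to be the smallest positive shift $g$ for which the last $n - g$ bits equal the first $n - g$ bits, which is an assumed reading of the definition in [2].

```python
def mrg(c):
    """Smallest g >= 1 such that the last n-g bits of c equal its first n-g bits."""
    n = len(c)
    return next(g for g in range(1, n + 1) if c[g:] == c[:n - g])

def affix_cardinality(ci, cj, L):
    """Number of length-L words prefixed by ci and suffixed by cj."""
    overlap = len(ci) + len(cj) - L
    if overlap <= 0:
        return 2 ** (-overlap)
    return 1 if ci[-overlap:] == cj[:overlap] else 0

def available_count(code, L):
    """2^L - 2 * sum(2^(L - n_i)) + affix index, for L >= longest codeword."""
    affix_index = sum(affix_cardinality(ci, cj, L) for ci in code for cj in code)
    return 2 ** L - 2 * sum(2 ** (L - len(c)) for c in code) + affix_index

low_mrg = ["00", "11", "01"]   # contains both codewords with MRG 1
mixed = ["00", "01", "10"]     # higher MRGs, but more cross-overlaps

print([mrg(c) for c in low_mrg], available_count(low_mrg, 3))  # [1, 1, 2] 0
print([mrg(c) for c in mixed], available_count(mixed, 3))      # [1, 2, 2] 1
```

In this case the set with the lower MRGs leaves no length-3 codeword available, while the other set leaves one (the word 111), because its cross terms between different codewords are larger.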
III. AN ALGORITHM FOR CONSTRUCTION OF EFFICIENT FIX-FREE CODES
Consider the construction of a fix-free code for an $M$-ary i.i.d. information source $U$ with the alphabet $\{u_1, u_2, \ldots, u_M\}$, with the probability mass function $p(u_i)$, $i = 1, 2, \ldots, M$. The proposed RVLC construction algorithm involves the following steps (steps 1, 2.2 and 3 are the same as in the algorithm in [2], while step 2.1 involves a new codeword selection mechanism):

1) Construct a Huffman code $C_H$, which maps the source symbols $u_i$ into corresponding binary codewords of length $l_i$,

$$L_{\min} \leq l_i \leq L_{\max}, \quad i = 1, 2, \ldots, M. \tag{11}$$

Denote the bit length vector of $C_H$ by $n_H = [n_H(L_{\min}), \ldots, n_H(L_{\max})]$, where $n_H(L)$ represents the number of Huffman codewords having length $L$. Initialize the number of assigned RVLC codewords as $N = 0$, and the RVLC bit length vector as $n_R(L) = n_H(L)$, $L = L_{\min}, \ldots, L_{\max}$. Start the assignment of RVLC codewords at level $L = L_{\min}$.

2) Identify the set of available length-$L$ codewords, which are neither prefixed nor suffixed by the codewords assigned at earlier levels. The number of available codewords, denoted as $A_L$, is equal to $2^L$ at level $L = L_{\min}$. At levels $L > L_{\min}$, $A_L$ depends on the previously assigned RVLC codewords $C$, as given by (6).

2.1) If $A_L \geq n_R(L)$, then $n_R(L)$ RVLC codewords should be assigned. Assuming that $\mathcal{A}_L = \{\mathbf{a}_1, \ldots, \mathbf{a}_{A_L}\}$ denotes the set of available codewords, a candidate selection of RVLC codewords is any $n_R(L)$-element subset $D \subseteq \mathcal{A}_L$, and the set of all the candidate codeword selections can be written as

$$\mathcal{D}_L = \{ D \subseteq \mathcal{A}_L : |D| = n_R(L) \}. \tag{12}$$

Assuming that the codewords from the set $D$ are selected as RVLC codewords, the number of available codewords $A_{L'}$, at a level $L' > L$, according to (6)–(8) is

$$A_{L'}(C \cup D) = 2^{L'} - 2 \sum_{\mathbf{c} \in C \cup D} 2^{L' - n(\mathbf{c})} + I_{L'}(C \cup D), \tag{13}$$

where $n(\mathbf{c})$ is the length of the codeword $\mathbf{c}$ and $I_{L'}(\cdot)$ is the affix index defined in (8). Consequently, the highest $A_{L'}$ is achieved by selecting the set with the maximal $I_{L'}$, denoted

$$D^{*}_{L'} = \arg\max_{D \in \mathcal{D}_L} I_{L'}(C \cup D). \tag{14}$$

However, as can be inferred from (9), the selection that maximizes $I_{L'}$ does not necessarily maximize $I_{L''}$, i.e., it generally holds that $D^{*}_{L'} \neq D^{*}_{L''}$, $L' \neq L''$. Therefore, we perform the selection in the following manner: At level $L$, we assign the codewords from the set

$$D^{*}_{L+1} = \arg\max_{D \in \mathcal{D}_L} I_{L+1}(C \cup D). \tag{15}$$

This selection maximizes the number of available codewords of length $L + 1$. As the algorithm proceeds to level $L + 1$, the performed selection maximizes $A_{L+2}$, etc. Thus, at each level of the fix-free code construction algorithm, the proposed approach performs the selection of codewords that maximizes the number of available codewords at the next level, which makes this approach level-by-level optimal.

Note that it is possible to have several candidate sets $D_1, \ldots, D_k \in \mathcal{D}_L$ with $I_{L+1}(C \cup D_1) = \cdots = I_{L+1}(C \cup D_k)$, which yield the same $A_{L+1}$. In that case we choose the set that yields the highest $A_{L+2}$, by considering the candidate selections at level $L + 1$ for all the candidate sets $D_1, \ldots, D_k$, and choosing the set with the highest corresponding $I_{L+2}$.

2.2) If $A_L < n_R(L)$, all the available RVLC codewords are assigned, and the bit-length vector is adjusted as

$$n_R(L+1) \leftarrow n_R(L+1) + n_R(L) - A_L, \quad n_R(L) \leftarrow A_L. \tag{16}$$

3) The number of the assigned codewords is updated as $N \leftarrow N + n_R(L)$. If $N < M$, the codeword assignment (step 2) continues at level $L + 1$. Otherwise, the construction ends.
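The steps above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not a reference implementation: the function names and the `max_length` guard are hypothetical, the Huffman bit length vector is supplied directly as a dictionary, the candidate subsets are enumerated exhaustively, and the tie-breaking rule of step 2.1 is omitted (the first subset with the highest next-level affix index is kept).

```python
from itertools import combinations, product

def affix_cardinality(ci, cj, L):
    """Number of length-L words prefixed by ci and suffixed by cj."""
    overlap = len(ci) + len(cj) - L
    if overlap <= 0:
        return 2 ** (-overlap)
    return 1 if ci[-overlap:] == cj[:overlap] else 0

def affix_index(code, L):
    """Sum of affix-set cardinalities over all ordered codeword pairs."""
    return sum(affix_cardinality(ci, cj, L) for ci in code for cj in code)

def available_words(code, L):
    """Length-L words neither prefixed nor suffixed by any assigned codeword."""
    words = ("".join(bits) for bits in product("01", repeat=L))
    return [w for w in words
            if not any(w.startswith(c) or w.endswith(c) for c in code)]

def construct_rvlc(huffman_lengths, max_length=16):
    """huffman_lengths maps a codeword length to the number of Huffman
    codewords of that length; returns the list of assigned RVLC codewords."""
    required = dict(huffman_lengths)       # RVLC bit length vector (step 1)
    total = sum(required.values())
    code = []
    L = min(required)
    while len(code) < total:               # step 3: stop when all are assigned
        if L > max_length:                 # hypothetical guard, not in the text
            raise ValueError("no fix-free code found within max_length")
        k = required.get(L, 0)
        avail = available_words(code, L)   # step 2
        if len(avail) >= k:
            # step 2.1: k-subset with the highest affix index at level L + 1
            best = max(combinations(avail, k),
                       key=lambda D: affix_index(code + list(D), L + 1))
            code.extend(best)
        else:
            # step 2.2: assign all available words, move the deficit up a level
            code.extend(avail)
            required[L + 1] = required.get(L + 1, 0) + k - len(avail)
            required[L] = len(avail)
        L += 1
    return code

print(construct_rvlc({1: 1, 2: 1, 3: 2}))   # ['0', '11', '101', '1001']
```

Because all candidate subsets at a level have the same size and the same codeword length, maximizing the affix index is equivalent to maximizing the number of available codewords at the next level. The exhaustive enumeration is practical only for small alphabets.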
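This algorithm is of higher complexity than the RVLC construction algorithm in [2]. At each level $L$ where the number of available codewords $A_L$ exceeds the number of codewords to be assigned $n_R(L)$, i.e., $A_L > n_R(L)$, Tsai’s algorithm involves consideration of $A_L$ MRGs, while the proposed algorithm considers $\binom{A_L}{n_R(L)}$ affix indices.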
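The growth of the search space can be made concrete with a short Python sketch. It assumes, as a reading of the comparison above rather than a stated result, that an MRG-based selection evaluates one MRG per available codeword, while the affix-index-based selection evaluates one affix index per candidate subset in (12). The function name and the numbers are hypothetical.

```python
from math import comb

def evaluations_per_level(available, required):
    """Return (MRG evaluations, affix-index evaluations) at a single level."""
    return available, comb(available, required)

print(evaluations_per_level(8, 3))    # (8, 56)
print(evaluations_per_level(16, 5))   # (16, 4368)
```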
Table II compares the RVLCs for the English alphabet, constructed using the proposed algorithm, and the algorithms published in [1], [2]. Due to the improved codeword selection mechanism, the proposed algorithm yields a fix-free code of significantly higher efficiency.
TABLE II
VLCS FOR THE ENGLISH ALPHABET: HUFFMAN CODE, THE RVLCS FROM [1], [2], AND AN RVLC CONSTRUCTED USING THE PROPOSED ALGORITHM
IV. CONCLUSIONS
We have proposed an algorithm for construction of efficient fix-free codes, which involves a new codeword selection mechanism. The proposed algorithm exploits the dependency between the affix indices associated with the selected codewords, and the availability of the codewords of higher lengths. This algorithm involves, at each level, the selection of the codeword set with the highest affix index, which results in the increased number of available fix-free codewords of higher lengths, and consequently yields fix-free codes of higher efficiency relative to the other algorithms in the literature.
REFERENCES
[1] Y. Takishima, M. Wada, and H. Murakami, “Reversible variable length codes,” IEEE Trans. Commun., vol. 43, pp. 158–162, Feb.–Apr. 1995.
[2] C. W. Tsai and J. L. Wu, “On constructing the Huffman-code based reversible variable length codes,” IEEE Trans. Commun., vol. 49, pp. 1506–1509, Sept. 2001.
[3] J. Wen and J. Villasenor, “Reversible variable length codes for efficient and robust image and video coding,” in Data Compression Conf., 1998, pp. 471–480.
[4] A. H. Li, S. Kittitornkun, Y. H. Hu, D. S. Park, and J. Villasenor, “Data partitioning and reversible variable length codes for robust video communications,” in Data Compression Conf., 2000, pp. 460–469.
[5] R. Bauer and J. Hagenauer, “On variable length codes for iterative source/channel-decoding,” in Data Compression Conf., 2001, pp. 273–282.
[6] K. Lakovic and J. Villasenor, “On reversible variable length codes with turbo codes, and iterative source/channel-decoding,” in IEEE Int. Symp. on Information Theory, June 2002, p. 170.
[7] Z. Kukorelly and K. Zeger, “New binary fix-free codes with Kraft sum 3/4,” in IEEE Int. Symp. on Information Theory, June 2002, p. 178.
[8] S. Yekhanin, “Sufficient conditions for existence of fix-free codes,” in IEEE Int. Symp. on Information Theory, June 2001, p. 284.
[9] C. Ye and R. W. Yeung, “Some basic properties of fix-free codes,” IEEE Trans. Inform. Theory, vol. 47, pp. 72–87, Jan. 2001.