Problem 4.
a)
b)
If we sort the probabilities in descending order, we can see that the two letters with the lowest probabilities are a2 and a4. These become the leaves on the lowest level of the binary tree. Their parent node has a probability of 0.04 + 0.05 = 0.09. If we treat this parent node as a letter in a reduced alphabet, it becomes one of the two letters with the lowest probability, the other being a1. Continuing in this manner, we get the binary tree shown in Figure 1, and the code is
Figure 1: Huffman code for the five-letter alphabet.
c) lavg = 0.15 × 3 + 0.04 × 4 + 0.26 × 2 + 0.05 × 4 + 0.5 × 1 = 1.83 bits/symbol.
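As a check, the construction above can be sketched in Python. This is a minimal min-heap version that tracks only codeword lengths; the symbol names and probabilities come from the problem statement, everything else is illustrative:

```python
import heapq

def huffman_lengths(probs):
    """Build a Huffman tree with a min-heap; return {symbol: codeword length}."""
    # Each heap entry: (probability, tiebreak counter, {symbol: depth so far}).
    heap = [(p, i, {s: 0}) for i, (s, p) in enumerate(probs.items())]
    heapq.heapify(heap)
    counter = len(heap)
    while len(heap) > 1:
        # Merge the two least probable nodes; every symbol under them
        # moves one level deeper in the tree.
        p1, _, d1 = heapq.heappop(heap)
        p2, _, d2 = heapq.heappop(heap)
        merged = {s: l + 1 for s, l in {**d1, **d2}.items()}
        heapq.heappush(heap, (p1 + p2, counter, merged))
        counter += 1
    return heap[0][2]

probs = {'a1': 0.15, 'a2': 0.04, 'a3': 0.26, 'a4': 0.05, 'a5': 0.5}
lengths = huffman_lengths(probs)
avg = sum(probs[s] * lengths[s] for s in probs)
print(lengths)          # lengths 3, 4, 2, 4, 1 for a1..a5
print(round(avg, 2))    # 1.83
```

The first merge (a2 with a4, combined probability 0.09) and the subsequent merges reproduce the tree of Figure 1, and the weighted sum recovers the 1.83 bits/symbol computed above.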
Problem 5.
Figure 2: Huffman code for the four-letter alphabet in Problem 5.
a) The Huffman code tree is shown in Figure 2. The code is
The average length of the code is 0.1×3+0.3×2+0.25×3+0.35×1 = 2 bits/symbol.
b) Huffman code tree is shown in Figure 3. The code is
The average length of the code is obviously 2 bits/symbol.
Figure 3: Minimum variance Huffman code for the four-letter alphabet in Problem 5.
The average codeword length is the same for both codes; that is, they are equally efficient in terms of rate. However, the second code has a variance of zero in its codeword lengths. This means we would not have any buffer-control problems if we used this code in a communication system. We cannot make the same assertion about the first code.
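The rate and variance comparison can be verified numerically. A small sketch, assuming the codeword lengths read off Figures 2 and 3 (3, 2, 3, 1 for the first code, all 2 for the second):

```python
# Probabilities of a1..a4 and the codeword lengths from the two trees.
probs = [0.1, 0.3, 0.25, 0.35]
len_a = [3, 2, 3, 1]   # ordinary Huffman code (Figure 2)
len_b = [2, 2, 2, 2]   # minimum variance Huffman code (Figure 3)

def avg_and_var(p, l):
    """Probability-weighted mean and variance of the codeword lengths."""
    mean = sum(pi * li for pi, li in zip(p, l))
    var = sum(pi * (li - mean) ** 2 for pi, li in zip(p, l))
    return mean, var

print(avg_and_var(probs, len_a))   # mean 2, variance 0.7
print(avg_and_var(probs, len_b))   # mean 2, variance 0
```

Both codes achieve 2 bits/symbol, but only the second has zero length variance.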
Problem 6.
Examining the Huffman code generated in Problem 4 (not 3!) along with the associated probabilities, we have
The proportion of zeros in a long coded sequence can be obtained by computing the expected number of zeros in a codeword and then dividing that by the average length of a codeword. The expected number of zeros in a codeword is
1 × 0.15 + 0 × 0.04 + 1 × 0.26 + 1 × 0.05 + 1 × 0.50 = 0.96.
0.96/1.83 ≈ 0.52. Thus, the proportion of zeros is close to one half. If we examine Huffman codes for sources with dyadic probabilities, we find that the proportion is exactly one half. Thus, the use of a Huffman code will not lead to inefficient channel usage.
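As a numerical check, a short sketch; the codewords 0, 10, 110, 1110, 1111 are an assumption, chosen to be consistent with the codeword lengths of Problem 4 and the per-codeword zero counts used above:

```python
# Assumed codewords consistent with the lengths and zero counts above.
code  = {'a1': '110', 'a2': '1111', 'a3': '10', 'a4': '1110', 'a5': '0'}
probs = {'a1': 0.15, 'a2': 0.04, 'a3': 0.26, 'a4': 0.05, 'a5': 0.5}

# Expected zeros per codeword and expected codeword length.
exp_zeros = sum(probs[s] * code[s].count('0') for s in code)
avg_len   = sum(probs[s] * len(code[s]) for s in code)

print(round(exp_zeros, 2), round(avg_len, 2))   # 0.96 1.83
print(round(exp_zeros / avg_len, 2))            # 0.52
```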
Problem 10.
Depending on how you count the errors, five characters are received in error before the first correctly decoded character.
For the minimum variance code the situation is different:
This time, only a single character is received in error.
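The error-propagation behaviour can be illustrated with a small decoding sketch. The codeword assignments below are assumptions consistent with the two trees of Problem 5, and the transmitted sequence is made up for illustration; the exact number of mis-decoded characters depends on the sequence and on which bit is flipped:

```python
def decode(bits, code):
    """Greedily match prefix-free codewords against the bit string."""
    inv = {w: s for s, w in code.items()}
    out, buf = [], ''
    for b in bits:
        buf += b
        if buf in inv:
            out.append(inv[buf])
            buf = ''
    return out

# Assumed codeword assignments for the two codes of Problem 5.
code_a = {'a1': '100', 'a2': '11', 'a3': '101', 'a4': '0'}   # ordinary Huffman
code_b = {'a1': '00', 'a2': '01', 'a3': '10', 'a4': '11'}    # minimum variance

msg = ['a2', 'a1', 'a4', 'a3', 'a2', 'a4']
for code in (code_a, code_b):
    bits = ''.join(code[s] for s in msg)
    corrupted = ('1' if bits[0] == '0' else '0') + bits[1:]  # flip the first bit
    print(decode(corrupted, code))
```

With the fixed-length (minimum variance) code a single bit error corrupts only the symbol containing it, because codeword boundaries never move. With the variable-length code the decoder can lose codeword synchronization, so several subsequent characters may be decoded incorrectly before it resynchronizes.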
Problem 13.