6.2 Dynamic decoder and the associated issues

6.2.2 Decoding delay

Decoding delay is perhaps the most important issue in high speed decoder design.

The decoding delay of an S-IBPTC on a parallel decoder is minimized by using a proper decoding schedule. As we will see shortly, the decoding schedule still plays a pivotal role in minimizing the decoding delay of an S-IBPTC even if only one APP decoder is used. The delay associated with a B-IBPTC is similar to that of a classic TC, but a B-IBPTC provides more decoding options due to its IBP nature. As with the S-IBPTC, its decoding delay can also be reduced by a proper schedule, which will be discussed later.

We first analyze the decoding delays when only one APP decoder is used. The single-round interleaving (or de-interleaving) delay is proportional to the interleaving (block) size, but the total decoding delay is a much more complicated issue. For a decoder that uses a single ADU, the decoding delay depends mainly on three variables: the single-round interleaving delay (SRID), the single-round APP decoding delay, and the number of decoding iterations. As the single-round APP decoding delay is usually much less than the SRID, we ignore the APP decoding delay in the subsequent discussion.

For the first decoding of each incoming block, there can be zero waiting time, but for later decoding rounds (DRs) the corresponding delays depend on, among other things, the decoding schedule used. With the same block size, the decoding delay of the first received block for the classic TC is definitely shorter than that for the S-IBPTC. But if one considers a period that consists of multiple blocks (otherwise one will not have enough blocks to perform inter-block permutation) and takes the decoding schedule into account, then the average decoding delay difference can be completely eliminated. This is because the APP decoder (including the interleaver and deinterleaver) will not stay idle until all blocks within the span of a given block are received. Instead, the APP decoder will perform decoding-interleaving or deinterleaving operations for other blocks according to a predetermined decoding schedule before it can do so for the given block (and the given DR).

Figure 6.4: A comparison of exemplary decoding schedules for classic TC and S-IBPTC when decoding 7 blocks with 2 iterations (four decoding rounds). The numbers in the two rectangular grid-like tables (one for the classic TC, one for the S-IBPTC) represent the order in which the APP decoder performs decoding.

If we define the total decoding delay as the time span between the instant a decoder receives the first input sample (from the input buffer) and the moment it outputs its last decision, then it is possible that both the S-IBP and the classic approaches yield the same total decoding delay even if only one APP decoder is used. We use the following example and Fig. 6.4 to support our claim; its generalization is straightforward.

Suppose we receive a total of 7 blocks of samples (in a packet, say) and want to finish decoding in 2 iterations (4 DRs), using the schedules for the classic TC and the S-IBPTC shown in Fig. 6.4. The first block of the classic TC is decoded in the first 4 decoding rounds (the leftmost column), but that of the S-IBPTC is decoded in the first, third, sixth and tenth decoding rounds. One can easily see that a classic TC decoder outputs the first decoded block in 4 DT cycles, where DT is the number of cycles needed to perform a single-block APP decoding plus the SRID. The S-IBPTC decoder, on the other hand, needs 10 DT cycles to output its first decoded block. However, if one further examines the decoding delays associated with the remaining blocks, one finds that they are 8, 12, 16, 20, 24 and 28 DT cycles for the classic TC decoder, while those for the S-IBPTC decoder are 14, 18, 22, 25, 27 and 28 DT cycles, respectively. So in the end, both approaches reach the final decision at the same time.
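The two schedules in this example can be reproduced with a short simulation. The sketch below is an illustration, not the implementation behind Fig. 6.4: it assumes a single APP decoder that spends one DT cycle per (block, decoding round) pair, a classic TC schedule that finishes all rounds of a block before moving on, and an S-IBPTC schedule that sweeps the anti-diagonals of the block/round grid, which reproduces the order quoted above (rounds 1, 3, 6 and 10 for the first block). The function names and the span-1 dependency check are assumptions introduced here.

```python
def classic_schedule(num_blocks, num_rounds):
    """Classic TC: run every decoding round of a block before the next block."""
    return [(b, d) for b in range(1, num_blocks + 1)
                   for d in range(1, num_rounds + 1)]


def sibptc_schedule(num_blocks, num_rounds):
    """Assumed S-IBPTC order: sweep anti-diagonals (block + round = const)."""
    order = []
    for s in range(2, num_blocks + num_rounds + 1):
        for d in range(1, num_rounds + 1):
            b = s - d
            if 1 <= b <= num_blocks:
                order.append((b, d))
    return order


def output_times(order, num_rounds):
    """DT cycle at which each block finishes its last decoding round."""
    finish = {}
    for t, (b, d) in enumerate(order, start=1):
        if d == num_rounds:
            finish[b] = t
    return [finish[b] for b in sorted(finish)]


def respects_span(order, num_blocks, span=1):
    """Check the assumed rule: round d of block b runs only after round d-1
    of every existing block within `span` of b."""
    done = set()
    for b, d in order:
        if d > 1:
            lo, hi = max(1, b - span), min(num_blocks, b + span)
            if any((n, d - 1) not in done for n in range(lo, hi + 1)):
                return False
        done.add((b, d))
    return True


classic = output_times(classic_schedule(7, 4), 4)
sibptc = output_times(sibptc_schedule(7, 4), 4)
print(classic)  # [4, 8, 12, 16, 20, 24, 28]
print(sibptc)   # [10, 14, 18, 22, 25, 27, 28]
```

Both lists end at 28 DT cycles, which is the equal total decoding delay claimed in the text. The anti-diagonal order passes the span-1 check, so under that assumed dependency the decoder never has to wait for a neighbouring block.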

It can be shown that, for a decoder with $D_{\max}$ DRs and $S = 1$, both decoders result in a constant delay of $\frac{D_{\max}(D_{\max}-1)}{2}$ DT cycles between two adjacent output blocks, except for the first block and the last $D_{\max} - 1$ blocks. For an $S = 1$ S-IBPTC, the decoder requires a first-block decoding delay of $\frac{D_{\max}(D_{\max}+1)}{2}$ DT cycles while that for the classic TC is only $D_{\max}$ DT cycles. The inter-block decoding delays, i.e., decoding latency between two consecutive output blocks, for the last $D_{\max} - 1$ output blocks of the S-IBPTC decoder using a decoding schedule similar to that shown in Fig. 6.4 (e.g., the one shown in Fig. 6.5) form a monotonic decreasing arithmetic sequence

$$\left\{ \frac{D_{\max}(D_{\max}-1)}{2} - 1,\; \frac{D_{\max}(D_{\max}-1)}{2} - 3,\; \cdots,\; 0 \right\}$$

(in DT cycles). The inter-block decoding delay of a classic TC decoder remains a constant $D_{\max}$ DT cycles. On the average, both codes give the same inter-block decoding delay.
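As a sanity check, the closed-form values can be compared with the Fig. 6.4 numbers for $D_{\max} = 4$. The sketch below copies the output times quoted in the example; the variable and function names are illustrative. For these numbers, $D_{\max}(D_{\max}+1)/2 = 10$ is the S-IBPTC first-block delay, and the gaps between consecutive S-IBPTC outputs average to $D_{\max}$ once the first-block delay is counted. The value $D_{\max}(D_{\max}-1)/2 = 6$ appears as the offset between the two decoders' output times for the same block, and it shrinks to 5, 3 and 0 over the last $D_{\max} - 1$ blocks.

```python
D_MAX = 4  # decoding rounds (2 iterations), as in the Fig. 6.4 example

# Output times in DT cycles, copied from the example in the text.
classic = [4, 8, 12, 16, 20, 24, 28]
sibptc = [10, 14, 18, 22, 25, 27, 28]


def gaps(times):
    """Delay of the first output, then gaps between consecutive outputs."""
    return [times[0]] + [b - a for a, b in zip(times, times[1:])]


first_block = D_MAX * (D_MAX + 1) // 2    # 10
steady_offset = D_MAX * (D_MAX - 1) // 2  # 6

sibptc_gaps = gaps(sibptc)
classic_gaps = gaps(classic)
# Per-block offset between the S-IBPTC and classic TC output times.
offsets = [s - c for s, c in zip(sibptc, classic)]

print(sibptc_gaps)   # [10, 4, 4, 4, 3, 2, 1]
print(classic_gaps)  # [4, 4, 4, 4, 4, 4, 4]
print(offsets)       # [6, 6, 6, 6, 5, 3, 0]
print(sum(sibptc_gaps) / len(sibptc_gaps))  # 4.0
```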

Although we have assumed a stream-oriented scenario so far, our arguments are valid for the conventional block-oriented consideration as well. It is thus of paramount importance that we recapture the IBP concept from the block-oriented viewpoint before returning to the main discourse.

Consider the example illustrated in Fig. 6.4. For a classic TC with an interleaving (block) size of 7L bits, the first-block decoding delay for a 2-iteration single-ADU decoder is 28 DT cycles. But if one divides this 7L-bit block into 7 subblocks and uses a special block-oriented interleaver which performs successive intra-subblock and inter-subblock permutations on these subblocks, the corresponding (2-iteration single-ADU) decoding delays in DT cycles for these subblocks are 10, 14, 18, 22, 25, 27 and 28, respectively.
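The benefit of the subblock view can be quantified with a small calculation. This sketch assumes the single-ADU, 2-iteration setting of Fig. 6.4, takes the subblock output times from that example (including the 10 DT cycles of the first subblock), and treats the classic 7L-bit block as releasing all of its bits at 28 DT cycles. The mean-latency metric and the names used are introduced here only for illustration.

```python
from fractions import Fraction

NUM_SUBBLOCKS = 7
NUM_ROUNDS = 4  # 2 iterations

# Classic TC: one 7L-bit block, each decoding round costs 7 DT cycles,
# and no bit is released before the last round ends.
classic_release = [NUM_SUBBLOCKS * NUM_ROUNDS] * NUM_SUBBLOCKS

# Subblock-based interleaver: release times from the Fig. 6.4 example.
ibp_release = [10, 14, 18, 22, 25, 27, 28]


def mean_latency(times):
    """Average release time over equally sized subblocks, in DT cycles."""
    return Fraction(sum(times), len(times))


print(mean_latency(classic_release))  # 28
print(mean_latency(ibp_release))      # 144/7
print(max(classic_release) == max(ibp_release))  # True
```

Under these assumptions the last bit leaves both decoders at 28 DT cycles, while the subblock-based decoder releases its bits after about 20.6 DT cycles on average.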

Therefore, although both code structures result in identical total decoding delay, the IBPTC structure is able to supply partial decoded outputs much earlier. This feature, when combined with proper intra-(sub)block and inter-(sub)block interleaving rules, multiple ADUs, an optimized decoding schedule and implementation resource management, becomes very beneficial for high speed applications. More importantly, it can be shown by computer simulations that a turbo code with such an interleaver does not yield performance inferior to that of a classic TC with a block-oriented interleaver (e.g., 3GPP interleavers) of the same size.

6.2.3 Memory contention and decoding schedule for multiple
