

6.3 Multiple-round stopping tests

6.3.5 Genie stopping test

The genie ST is a hypothetical ideal test that verifies the tentative decision vector without error. The performance of this ideal test serves as the ultimate bound for reference purposes.

At first glance, we might expect the hybrid test or higher-order (larger m) tests to take more DRs, since a received block is less likely to pass both SC and CRC or a higher-order requirement. In fact, however, a correct block decision, through the IBP interleaving, helps other blocks meet the stopping condition sooner, while an incorrect one tends to have an adverse effect. Our numerical experiments indicate that the hybrid test not only gives better performance but also requires fewer average DRs.

This is another advantage of IBPTCs that is not shared by classic TCs.
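To make the multiple-round mechanism concrete, the following minimal sketch keeps a per-block counter that resets whenever a round fails the combined condition. The names `sign_check`, `crc_ok` and the class itself are illustrative placeholders under simplified assumptions, not the exact tests T1.x-T3.x defined earlier:

```python
def sign_check(llr_prev, llr_curr):
    """Sign check (SC): hard decisions of two consecutive DRs agree."""
    return all((a >= 0.0) == (b >= 0.0) for a, b in zip(llr_prev, llr_curr))

class HybridStoppingTest:
    """Stop a block after it passes both SC and a CRC check
    in m consecutive decoding rounds (hybrid, multiple-round)."""

    def __init__(self, m, crc_ok):
        self.m = m            # required number of consecutive passing rounds
        self.crc_ok = crc_ok  # callable: hard-decision bits -> bool
        self.hits = 0

    def update(self, llr_prev, llr_curr):
        """Call once per decoding round; True means the block may stop."""
        hard = [0 if x >= 0.0 else 1 for x in llr_curr]
        if sign_check(llr_prev, llr_curr) and self.crc_ok(hard):
            self.hits += 1    # one more consecutive pass
        else:
            self.hits = 0     # passes must be consecutive
        return self.hits >= self.m
```

With m = 1 this reduces to a single-round hybrid test; larger m trades a few extra DRs for a smaller probability of stopping on an erroneous tentative decision.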

6.4 Simulation results

The simulation results reported in this section are based on the following assumptions and parameters. The component code of the rate-1/3 TC, G(D) = [1, (1 + D + D^3)/(1 + D^2 + D^3)], and the CRC-8 (= "110011011") code used are the same as those specified in the 3GPP standard [1], except that the component code is tail-biting [106] encoded. The APP decoder uses the Log-MAP algorithm and the S-IBP of Table 5.1, while the interleaving length and S-IBP span are left as variables; MR = 3 MUs and N = 1000 per simulation run are assumed. Except for the genie ST, our simulations do not assume a perfect stopping test for a block.
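For concreteness, the generator "110011011" corresponds to g(x) = x^8 + x^7 + x^4 + x^3 + x + 1, and the check used by the CRC-based STs can be sketched as plain long division over GF(2). This is an illustrative bit-level implementation, not the decoder's actual circuitry:

```python
CRC8_POLY = [int(b) for b in "110011011"]  # g(x) = x^8+x^7+x^4+x^3+x+1

def _gf2_divide(reg, poly):
    """In-place long division over GF(2); returns the remainder,
    held in the last len(poly)-1 positions of reg."""
    deg = len(poly) - 1
    for i in range(len(reg) - deg):
        if reg[i]:
            for j, p in enumerate(poly):
                reg[i + j] ^= p
    return reg[-deg:]

def crc_encode(msg, poly=CRC8_POLY):
    """Append the CRC parity bits to the message bits."""
    rem = _gf2_divide(list(msg) + [0] * (len(poly) - 1), poly)
    return list(msg) + rem

def crc_check(codeword, poly=CRC8_POLY):
    """True iff the codeword (message + parity) has a zero syndrome."""
    return not any(_gf2_divide(list(codeword), poly))
```

A block passes the CRC test exactly when the hard-decision vector, including its parity field, divides evenly by g(x); any single bit error leaves a nonzero syndrome.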

The effects of various STs on the S-IBPTC VTT-APP decoder performance for the system with S = 1, L = 400, Dmax = 30 and tail-biting encoding are shown in Figs. 6.8 and 6.9. The multiple-round CRCST, SCST and HST are considered. For comparison, we include the performance curves of the decoder using the genie ST and of that with fixed 20 and 30 DRs (10 and 15 iterations), and, for reference purposes, that of the classic TC with block length L = 800 using the genie ST with Dmax = 30.

Block error rate performance improves as the number of test rounds m increases, no matter which ST is used. Fig. 6.8 shows that T1.3 outperforms T1.2 for Eb/N0 greater than 0.3 dB. The tests using the sign check alone, T2.3 and T2.5, are inferior to the other stopping tests since, as mentioned before, the class of sign-check tests checks whether the decoded bits converge but does not guarantee the correctness of the tentative decoded vectors. Incorrect stopping decisions spread false information to the neighboring blocks through interleaving and result in degraded performance. T1.3, T3.2, T3.3 and the test with fixed 30 DRs yield the best performance, and they are almost as good as the genie ST. Using T3.2 for early stopping, the S-IBPTC has a 0.4 ∼ 0.6 dB gain over the classic TC at BLER = 10−3 ∼ 10−4, although the average decoding delay per DR for both codes is about the same.

Figure 6.8: Block error rate performance of various stopping tests; no memory constraint; Dmax = 30 DRs.

Fig. 6.9 shows the average DR performance of various STs. Except for the two sign-check tests, all STs require fewer than 20 or 10 APP DRs (10 or 5 iterations) when Eb/N0 is greater than 0.2 or 0.6 dB, respectively. Considering both the block error rate and the average latency performance, we conclude that, among the STs we have examined, T3.2 is the best choice.

Figure 6.9: Average APP DR performance of various stopping tests; Dmax = 30 DRs, no memory constraint.

The numerical results presented so far assume no memory constraint. Figs. 6.10 and 6.11 reveal the impact of a finite memory size on the system that employs a T3.2-aided VTT-APP decoder and the memory management algorithm of the previous section, with block length L = 400, span S = 1 and Md = 1. Fig. 6.10 shows the block error rate performance for different memory constraints. For convenience of comparison, we also present three cases without memory constraint: one with Dmax = 200 and the other two with fixed DRs. It is reasonable to find that larger memory sizes give better performance. At higher Eb/N0 (> 0.8 dB), all performance curves converge to the same curve, since all VTT-APP decoders finish decoding after only a few DRs (see Fig. 6.11) and the memory size is no longer a problem. The fact that the cases Dmax = 100 with 100 MUs and Dmax = 30 with 100 MUs give almost identical performance indicates that increasing Dmax beyond a certain number (30 in this case) cannot improve the block error rate performance, and the memory size becomes the dominant factor. The decoder with Dmax = 200 and no memory constraint (it can be shown that 804 MUs is sufficient for this case, at least eight times larger than that required by the other decoders) clearly outperforms the other decoders when Eb/N0 < 0.6 dB, but this edge gradually diminishes after 0.6 dB.

Figure 6.10: The effect of memory constraint and management on the block error rate performance. Curves labelled with infinite memory are obtained by assuming no memory constraint; "fixed DRs" implies that no early stopping test is involved.

The average DR performance is given in Fig. 6.11. For Eb/N0 ≥ 0.5 dB, all VTT-APP decoders need 10 DRs (5 iterations) or fewer. But when Eb/N0 < 0.3 dB, the performance curves are distinctly different: if we do not impose a memory constraint, the average DR increases significantly as Eb/N0 decreases. Most of the computation effort is then wasted, and so is the memory. In other words, in the low Eb/N0 region, the ST cannot offer an early stopping decision. Imposing a memory constraint and invoking a proper memory management algorithm force early stoppings, saving computing power and memory at the cost of a small performance loss.

Finally, we find that, compared with our proposed schemes, the two decoders with fixed DRs (20 and 30) usually need much more memory and many more DRs.

Figure 6.11: Average APP DR performance for various decoding schemes and conditions. Curves labelled with infinite memory are obtained by assuming no memory constraint; "fixed DRs" means no early-stopping condition is imposed.

The effectiveness of various STs on the performance of a classic TC with L = 800 is shown in Figs. 6.12 and 6.13, where Dmax = 30 DRs and tail-biting encoding are assumed. The performance of T1.1 with CRC-24 is worse than those of T1.2 and T3.2 with CRC-8. Using CRC-8, T2.3 provides error rate performance similar to that of T3.2 but at the cost of one more DR. Both tests yield performance very close to that of the genie ST. In summary, these two figures show that (i) the proposed MRSTs can also be used in classic TC-coded systems and (ii) using a proper MRST brings the benefits of reduced CRC overhead and fewer DRs (lower decoding latency) without compromising performance. The latter conclusion implies that a multiple-round stopping test with a short CRC code is better than a single-round stopping test with a much longer CRC code. Of course, the same advantages are shared by IBPTC-coded systems as well.

Figure 6.12: Block error rate performance of a classic TC using various STs; L = 800 bits and Dmax = 30 DRs.

Figure 6.13: The effect of various STs on the average APP DR performance of a classic TC with L = 800 and Dmax = 30 DRs.

Chapter 7

Multi-stage factor graph

The multi-stage factor graph (MSFG), an extension of the factor graph [60, 42], makes the message-passing of iterative decoding explicit. A factor graph expounds a code structure, and one can run the belief propagation (BP) algorithm on it. However, the graph only shows the possible message paths; it does not indicate the schedule of the actual message-passing. The MSFG, a directed graph, describes the message-passing that reflects a decoding schedule. With the assistance of this graph, the impact of the decoding schedule on computing complexity and storage requirements can be analyzed. Moreover, our representation avoids ambiguous descriptions of cyclic or loopy message-passing events.

The multi-stage factor sub-graph (MSFSG) and the causal multi-stage sub-graph (CMSSG) shorten the representation of the lengthy MSFG without losing any message-passing information, and they can be directly converted into hardware circuitry or used to design a decoding schedule. When the decoding schedule is regular, the MSFG is a regular graph that looks like a repeated duplication of a sub-graph. The MSFSG, a sub-graph extracted from the MSFG, describes the operation procedure associated with a decoding round or iteration; it is useful for representing block-oriented codes such as the B-IBPTC, the classic TC, LDPC codes [46], etc. The CMSSG, another sub-graph extracted from the MSFG, describes the operation procedure associated with each input bit or block of a stream-oriented code such as the S-IBPTC, convolutional LDPC codes [76, 94], etc. Therefore the MSFSG and CMSSG reveal the decoding schedule, which can be directly applied by the dynamic decoder shown in Fig. 6.3 to coordinate multiple APP decoders. They also reflect the schedule of the corresponding function nodes or hardware circuitry.

Finally, we apply the CMSSG to derive a new decoding schedule for the dynamic decoder. The new schedule requires less storage space without compromising S-IBPTC performance, and it also offers a performance improvement over the pipeline decoder. The cost is more computing power, especially at low SNR.

7.1 Multi-stage factor graph

The multi-stage factor graph (MSFG) describes the message-passing of the iterative decoding process. The edges between function nodes on a factor graph are undirected, but the messages passed in the two directions differ. We therefore replace each undirected edge with two oppositely directed edges to distinguish the two messages exchanged during the iterative process. We then duplicate this directed graph, redirect edges to connect the duplicated copies, and label the nodes with a stage index. The resulting graph reflects the actual message-passing procedure of the iterative process. In short, directed edges show the message-passing, and stages mark the processing order on the MSFG.

The construction method consists of the following operations: grouping and labelling, duplication and stage stamping, edge replacement, edge redirecting, edge wiping and edge adding. Grouping and labelling provide a hierarchical graph representation that simplifies the MSFG. Duplication and stamping impose the notion of time on the graph. Edge replacement distinguishes messages by direction. Edge redirecting, wiping and adding connect the multiple layers and reschedule the message-passing. The first purpose of edge redirecting is to connect the multiple grouped graphs; the second is to redirect an edge to an earlier or later stage so as to accelerate or delay message-passing. Edge wiping removes edges, thereby deactivating the corresponding message-passing and node operations. Finally, edge adding amends edges when a node requires a message but edge redirecting has removed some edges at the initial stages. These steps are described as follows.

• This step groups function nodes and edges into grouped nodes and labels these grouped nodes.

• This step duplicates the grouped graph into multiple layers and stamps a stage index on the labelled grouped nodes.

• This step replaces each undirected edge with two oppositely directed edges.

• This step redirects the directed edges to connect the grouped graphs, adds extra directed edges to enable node processing, and wipes some directed edges to detain the processing of function nodes.
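Assuming the base (grouped) factor graph is given as labelled nodes and undirected edges, the steps above can be sketched as a graph transformation. The data structures below, and the choice of sending every stage-s message to stage s + 1, are illustrative simplifications rather than the full construction:

```python
def build_msfg(nodes, undirected_edges, num_stages):
    """Unroll an undirected (grouped) factor graph into a multi-stage
    factor graph: duplicate the node set once per stage and replace each
    undirected edge {u, v} by stage-crossing directed edges, so that a
    message produced at stage s is consumed at stage s + 1."""
    # duplication and stage stamping: each node becomes (label, stage)
    stamped = [(v, s) for s in range(num_stages) for v in nodes]
    directed = []
    for s in range(num_stages - 1):
        for u, v in undirected_edges:
            # edge replacement + redirecting: two opposite directed edges,
            # each redirected to connect consecutive stages
            directed.append(((u, s), (v, s + 1)))
            directed.append(((v, s), (u, s + 1)))
    return stamped, directed
```

Edge wiping and edge adding then amount to deleting entries from, or appending entries to, the directed edge list before the schedule is executed.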

We use an LDPC code [46] and the S-IBPTC as examples to draw the MSFGs corresponding to various schedules. For the LDPC code, we compare the conventional belief propagation (BP) algorithm [60] and the horizontal-shuffled BP algorithm [63]. Furthermore, we provide another graph to demonstrate a new BP algorithm that reduces the cycle effect but requires more storage. We also plot two MSFGs for the S-IBPTC associated with Fig. 6.2: one is in accordance with the S-IBPTC pipeline decoding, and the other is an aggressive schedule that increases the message-passing speed. Both the LDPC code and the S-IBPTC are described in the following two subsections.
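To illustrate the scheduling difference on an LDPC code, the following toy min-sum decoder supports both the flooding (conventional BP) and horizontal-shuffled schedules; the only change is whether a check node reads a frozen copy of last iteration's messages or the array being updated in place. This is a didactic sketch (dense message storage, min-sum instead of exact BP), not the algorithm of [63]:

```python
def min_sum_decode(H, llr_ch, iters, shuffled=False):
    """Min-sum BP on parity-check matrix H (rows of 0/1, row degree >= 2).
    shuffled=False: flooding; every check reads last iteration's messages.
    shuffled=True : horizontal-shuffled; each check immediately sees the
    messages refreshed by the checks processed before it."""
    m, n = len(H), len(llr_ch)
    nbr = [[j for j in range(n) if H[i][j]] for i in range(m)]
    c2v = [[0.0] * n for _ in range(m)]      # check-to-variable messages
    for _ in range(iters):
        # flooding reads a frozen copy; shuffled reads the live array
        read = c2v if shuffled else [row[:] for row in c2v]
        for i in range(m):
            # variable-to-check: total LLR minus this check's contribution
            v2c = {j: llr_ch[j]
                      + sum(read[k][j] for k in range(m) if H[k][j] and k != i)
                   for j in nbr[i]}
            for j in nbr[i]:
                others = [v2c[jp] for jp in nbr[i] if jp != j]
                sign = -1.0 if sum(o < 0 for o in others) % 2 else 1.0
                c2v[i][j] = sign * min(abs(o) for o in others)
    total = [llr_ch[j] + sum(c2v[i][j] for i in range(m) if H[i][j])
             for j in range(n)]
    return [0 if t >= 0.0 else 1 for t in total]
```

Under the shuffled schedule, checks processed later in an iteration already see the fresh messages of earlier checks, which is precisely the faster message-passing that the corresponding MSFG makes visible.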