
3.4 Performance Comparison

In this section, the performance of the iterative detection schemes introduced in this chapter is compared under two different channel models, each with two IPI conditions. The comparison under the complete channel model is shown in Figure 3.9, and that under the incoherent intensity channel model is shown in Figure 3.10. The simulation results match our expectations for the two approaches. As a hard detection scheme, PDFE sacrifices performance for simplicity: by operating only on hard decisions, it discards a great deal of information, which is why its performance is inferior to that of 2D-MAP detection. Furthermore, as the extent of IPI increases, the tendency to exhibit an error floor is much more pronounced for PDFE than for 2D-MAP detection. Under severer IPI, the detection performance of PDFE is not even acceptable. Based on the above reasoning, we adopt 2D-MAP detection. The remaining unfavorable factor is the high computational complexity of the 2D-MAP algorithm. In the next chapter, we devise an effective suite of complexity-reducing schemes for the 2D-MAP detector so that its implementation becomes feasible.

Figure 3.9 Performance comparison of PDFE and 2D-MAP (Complete Channel Model)

Figure 3.10 Performance comparison of PDFE and 2D-MAP (Incoherent Intensity Channel Model)

Two final notes about the iterative detection schemes are provided here for completeness. The first concerns the possibility of extremely severe channel conditions. The case of W=1.5 is simulated for the complete channel, as shown in Figure 3.11. Under this circumstance, simple threshold detection and PDFE show no decrease in BER at all as SNR increases, and the BER curve of 2D-MAP also has a much smaller slope than those for W=1 and W=1.25. This result suggests that a W of 1.5 or above may represent too harsh a channel environment for the current iterative detection schemes. Fortunately, the real HDS channel is usually milder: as suggested by a recently proposed channel model [6], W lies within 0.8-1.25 for the complete channel model. Thus, our simulation settings of W=1 and W=1.25 seem quite reasonable.

Figure 3.11 Performance comparison including a severer channel condition

The second note concerns our assumption of perfect channel estimation. Here, again for completeness, a performance simulation that accounts for estimation error is offered. In Figure 3.12, we compare the BER performance under different amounts of estimation error, where NEE (Normalized Estimation Error) is defined as the standard deviation of the channel estimation error divided by the output signal mean. This gives a rough idea of the accuracy required of channel estimation, which is to be addressed in future work.
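Written out as a formula (the symbol names below are ours), the definition of NEE reads:

```latex
\mathrm{NEE} = \frac{\sigma_{e}}{\mu_{\mathrm{out}}}
```

where \sigma_{e} denotes the standard deviation of the channel estimation error and \mu_{\mathrm{out}} the output signal mean.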

Figure 3.12 Performance degradation owing to channel estimation error (W=1.25, Complete Channel)

Chapter 4

Schemes to Reduce Computational Complexity of 2D-MAP

4.1 Introduction

2D-MAP detection offers superior performance under conditions of severe IPI and is thus a powerful solution for overcoming the two-dimensional crosstalk present in holographic data storage systems. However, its high computational complexity poses a concern for practicality. For demonstration, all the arithmetic operations involved in 2D-MAP detection have been unfolded and illustrated in Figure 4.1.

During every iteration, the combination of the normalized squared Euclidean distance and the sum of LLRs is calculated for each possible neighborhood pattern, normally referred to as a candidate. To give a sense of scale, the numbers of iterations and candidates are 15 and 256, respectively, for our system simulated with the complete channel model. Within this calculation, the addition-intensive part is the sum of LLRs, in which up to 7 additions are needed to sum 8 LLRs. Therefore, as expressed in Figure 4.1, a suite of three complexity-reducing schemes will be presented, each focusing on a different dimension of the algorithm.
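To make the per-candidate computation concrete, a minimal sketch of the metric described above follows. It is not the author's implementation: the function names, the channel placeholder `channel_fn`, the noise normalization, and the LLR sign convention are all assumptions, since formula (3.4) is not reproduced in this section.

```python
import numpy as np

def candidate_metric(received, candidate, channel_fn, noise_var, neighbor_llrs):
    """One term inside the min{}: the normalized squared Euclidean distance
    plus the sum of neighbor LLRs for one hypothesized neighborhood pattern."""
    # Distance between the received pixel intensities and the channel's
    # response to the hypothesized pattern (channel_fn is a placeholder).
    expected = channel_fn(candidate)
    distance = np.sum((np.asarray(received) - expected) ** 2) / noise_var
    # Sum of the 8 neighbor LLRs, signed by the hypothesized bit values
    # (assumed convention; at most 7 additions to sum 8 terms).
    llr_sum = sum(l if b == 1 else -l for b, l in zip(candidate, neighbor_llrs))
    return distance + llr_sum

def min_over_candidates(received, candidates, channel_fn, noise_var, neighbor_llrs):
    """Direct implementation: evaluate the metric for every candidate
    (256 patterns before any reduction) and keep the minimum."""
    return min(candidate_metric(received, c, channel_fn, noise_var, neighbor_llrs)
               for c in candidates)
```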

Figure 4.1 The arithmetic operations involved in 2D-MAP and three aspects of complexity reduction

4.2 The Schemes to Reduce Computational Complexity

In this section, the suite of three complexity-reducing schemes is introduced in turn: iteration reduction, candidate reduction, and addition reduction. For generality, these schemes have been developed under the complete channel model, which can be regarded as equivalent to the actual channel, without extra hypotheses or simplifications.

4.2.1 Scheme of Iteration Reduction

Since the 2D-MAP algorithm is an iterative detection scheme, the number of iterations needed for the BER to converge may differ as the channel condition varies. As can be recalled from Figure 3.5, we verified the relationship between iteration number and BER performance only for a specific SNR setting, since for other SNR values the relationship can be quite different. Even so, one trend can be predicted: the higher the SNR, the fewer iterations are needed to reach convergence. In the previous implementation with a fixed iteration number, the number we chose to preserve a design margin may therefore be larger than necessary for some channel settings. This is why the idea of an adaptive iteration number comes into play, and stopping criteria have accordingly been researched [18][19].

A question naturally arises: how do we know when the iterative detection can be stopped? BER is not a metric that can be conveniently accessed, owing to the lack of the original information, so we have to search for other metrics that can indicate the trend of the BER while being easy to obtain. In the literature, a few stopping criteria have been proposed to terminate iterative detection at convergence. Examples include the sign-change ratio (SCR) criterion and the hard-decision-aided (HDA) criterion. The SCR criterion counts the sign changes of the extrinsic information during the updating step and stops the iteration when the total number of sign changes falls below a pre-defined threshold. The HDA criterion stops the iteration when the hard decisions of a block fully agree between two consecutive iterations. However, under these stopping criteria, the iteration number is the same for all pixels in a page, or for a region within a page.
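For illustration, minimal sketches of the two criteria as just described are given below; the array-based interface and function names are our own assumptions, not the exact formulations of [18][19].

```python
import numpy as np

def scr_should_stop(prev_extrinsic, curr_extrinsic, threshold):
    """Sign-change ratio (SCR): stop once the number of sign changes in the
    extrinsic information between two updates falls below a threshold."""
    changes = np.count_nonzero(np.sign(prev_extrinsic) != np.sign(curr_extrinsic))
    return changes < threshold

def hda_should_stop(prev_hard_decisions, curr_hard_decisions):
    """Hard-decision-aided (HDA): stop once the hard decisions of the block
    fully agree between two consecutive iterations."""
    return bool(np.array_equal(prev_hard_decisions, curr_hard_decisions))
```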

In fact, across an entire page to be detected, some pixels converge faster than others. If they are not all treated in the same way, the average iteration number can be decreased further. Here, the definition of convergence for a pixel differs slightly from the one given earlier for the BER. Previously, when we said the BER had converged, we meant that the fluctuation of the BER curve had become steady over the iterations. Now, when we say a pixel has converged, we mean that the hard decision of that pixel has remained the same and has stopped alternating between 1 and 0. Since directly telling when the alternation has stopped is not easy, is there a more reliable way to decide that a pixel has converged?

Given the formula of the likelihood feedback (3.4), it can be observed from simulation that, as the iterations proceed, the magnitudes of the LLR terms grow larger and larger. This is because, as belief propagates, the decisions are gradually improved, so the LLRs move toward positive or negative infinity. Eventually the LLRs grow to such an extent that the sum of LLRs in (3.4) dominates the entire min{} operation.

Note that (3.4) has to be evaluated under both hypotheses, {X(i,j)=1} and {X(i,j)=0}. Owing to the phenomenon above, the candidates X that yield the minimum under the two hypotheses eventually become one and the same, and when they do, the pixel has very likely already converged.

We can therefore formulate the following iteration-reduction scheme in two steps:

I. For every pixel in the current iteration, keep track of the candidate that yields the minimum under each hypothesis. Name them ML_Candidate_1 and ML_Candidate_0.

II. If ML_Candidate_1 = ML_Candidate_0 in two successive iterations, the pixel is judged as "converged". Its LLR is raised to an extreme value, which its neighbors effectively see as a hard decision, and the detection of this pixel is stopped.

If a pixel still has not reached the "converged" state by the predetermined iteration limit (15, in our case), the detection of that pixel is stopped anyway.
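A minimal sketch of this two-step strategy follows. The `map_update` callable stands in for one 2D-MAP update of a pixel and, together with the magnitude chosen for the extreme LLR value, is an assumption of ours rather than part of the original design.

```python
MAX_ITERATIONS = 15   # predetermined iteration limit, as in our system
LLR_EXTREME = 1e6     # stand-in for the "extreme value" of a frozen LLR

def iteration_reduced_2dmap(pixels, map_update):
    """Per-pixel stopping: freeze a pixel once its two ML candidates
    (under X(i,j)=1 and X(i,j)=0) coincide in two successive iterations.
    map_update(pixel, iteration) -> (llr, ml_candidate_1, ml_candidate_0)."""
    llr = {p: 0.0 for p in pixels}
    converged = {p: False for p in pixels}
    same_count = {p: 0 for p in pixels}      # per-pixel counter of Figure 4.2
    for iteration in range(MAX_ITERATIONS):
        for p in pixels:
            if converged[p]:
                continue                     # detection of this pixel has stopped
            llr[p], ml_cand_1, ml_cand_0 = map_update(p, iteration)
            same_count[p] = same_count[p] + 1 if ml_cand_1 == ml_cand_0 else 0
            if same_count[p] >= 2:           # equal in two successive iterations
                converged[p] = True
                # Raise the LLR to the extreme value so that neighbors
                # effectively see a hard decision from this pixel.
                llr[p] = LLR_EXTREME if llr[p] >= 0 else -LLR_EXTREME
    return llr
```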

Thus the worst case for a pixel is running the same number of iterations as the original 2D-MAP scheme without iteration reduction. This two-step strategy is illustrated as a flowchart in Figure 4.2; a counter is kept for each pixel to record how many times its two ML candidates have been the same. With this scheme, a different iteration number is associated with each pixel. In Figure 4.3, we show the average iteration number achieved with our stopping criterion alongside cases with fixed iteration numbers. Compared with the original 2D-MAP using 15 iterations for every pixel, the proposed scheme reduces the average iteration number to nearly half, with the exact reduction depending on the operating conditions. The BER performance of the iteration-reduced 2D-MAP is then compared with the fixed-iteration cases, as shown in Figure 4.4; the performance degradation is verified to be negligible.

Figure 4.2 Flowchart of the iteration-reducing scheme

Figure 4.3 Iteration number adaptive to SNR

Figure 4.4 Performance of iteration-reduced 2D-MAP compared with fixed iteration numbers

4.2.2 Scheme of Candidate Reduction

From Figure 3.4 and the likelihood feedback formula (3.4), we know that in the bit-wise operation of the 2D-MAP algorithm, 2^8 = 256 candidates are involved in the search for the minimum. Nonetheless, among all the candidates, some neighborhood patterns are simply unlikely to appear compared with others. Put another way, some candidates would yield so large a value of the combination of normalized Euclidean distance and sum of LLRs that they would certainly be filtered out by the min{}. Thus, our aim in candidate reduction is to eliminate the candidates that are improbable to yield small values inside the min{} of formula (3.4) before we actually perform the calculation.

From Figure 2.7, we know that even though simple threshold detection does not yield satisfactory BER performance, the resulting BER is still around 10^-1 under most channel conditions, which indicates that, on average, only one error occurs in every 10 pixels when simple threshold detection is applied. This fact motivates filtering the candidates by Hamming distance [14]. The candidate-reducing scheme is simple: only the candidates within a predetermined Hamming-distance radius R of the current hard decision of the neighborhood pattern are involved in further calculations. This is illustrated in Figure 4.5, in which Q denotes the Hamming distance between a candidate and the current hard decision. The current hard decision of the neighborhood pattern is then the candidate with Q = 0, and is determined from either the received pixel values or the neighboring LLRs, depending on the stage of the iterative process.

Figure 4.5 The candidate-reducing scheme using Hamming distance

In order to select a suitable Hamming-distance radius R, the BER under several SNR conditions has been plotted against the radius, as shown in Figure 4.6. The figure demonstrates a trade-off between the number of candidates and BER performance. To strike a balance, we set the radius R to 2.

Therefore, the number of candidates is reduced from 256 to C(8,0) + C(8,1) + C(8,2) = 1 + 8 + 28 = 37, which is an 85% reduction in computational complexity.
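As an illustration, a sketch of the candidate filtering (names are ours) is given below; it enumerates only the patterns within Hamming radius R of the current hard decision and confirms the count of 37 for R = 2.

```python
from itertools import combinations

def candidates_within_radius(hard_decision, radius=2):
    """Enumerate the neighborhood patterns within Hamming distance `radius`
    (Q <= R) of the current hard decision, given as a list of 8 bits."""
    n = len(hard_decision)                   # n = 8 neighbor positions
    survivors = []
    for q in range(radius + 1):              # Q = 0, 1, ..., R
        for positions in combinations(range(n), q):
            candidate = list(hard_decision)
            for pos in positions:
                candidate[pos] ^= 1          # flip the bits that differ
            survivors.append(tuple(candidate))
    return survivors

# C(8,0) + C(8,1) + C(8,2) = 1 + 8 + 28 = 37 candidates survive for R = 2.
assert len(candidates_within_radius([0] * 8, radius=2)) == 37
```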

In Figure 4.7, the BER performance after both iteration and candidate reduction is examined. The performance degradation is small compared with the original version.

Figure 4.6 Trade-off between number of candidates and BER performance

Figure 4.7 Performance degradation after applying the iteration- and candidate-reduction schemes

4.2.3 Scheme of Addition Reduction

After the preceding candidate reduction is applied, the remaining candidates share many common terms in their sums of LLRs in formula (3.4). Therefore, once the sum of LLRs for the pattern of the current hard decision, denoted LXhd, has been calculated, the sums for the other candidates can be obtained by applying offsets to LXhd. The offsetting can conveniently be realized by traversing a tree with a depth-first strategy, so this addition-reduction scheme is named "tree-based LLR summation". For generality, a case of R=3 is examined below.

According to the bit positions in which a candidate differs from the temporary hard decision, a tree is constructed as shown in Figure 4.8.

In this tree, the first stage corresponds to the candidates with Q = 1, the second to those with Q = 2, and so on. Every step down the tree represents one addition of an extra LLR value. Therefore, if no intermediate results are stored, two additions are needed to reach a node in the second stage and three to reach one in the third stage. However, with one register keeping the intermediate result of stage 1 and another keeping that of stage 2, many redundant additions can be saved.

Figure 4.8 Tree construction in tree-based LLR summation

In addition to the two registers that store the intermediate results, we also need three flag signals to track the current position in the tree, so that we know which candidate is currently under consideration. The current candidate is generated from the three flag signals and the temporary hard decision, since the union of the three flags represents the difference pattern and thus indicates which bits of the temporary hard decision are to be inverted. For example, {FLAG_1, FLAG_2, FLAG_3} = {0, 1, 2} points to the candidate whose values at positions (0), (1), and (2) are inverted relative to the temporary hard decision. Note that 0 is a reserved value for FLAG_2, and 0 and 1 are reserved values for FLAG_3, so no inversion is associated with those values on those flags. By traversing the tree with a depth-first strategy, a short section of flag transitions is obtained, as shown in Figure 4.9. From observing these flag signals, we have found some rules inherent in their transitions:

Figure 4.9 Flag transition pattern and a pseudo-code to describe the inherent rules

<i> If a flag of a lower stage is zero, that flag is set to the value of the flag one stage higher, incremented by 1. An example is seen when {FLAG_1, FLAG_2, FLAG_3} changes from {0, 1, 0} to {0, 1, 2}. This kind of transition is named a "downward operation", since it corresponds to a downward movement in the tree.

<ii> If a flag of a lower stage has reached 7, it is reset to 0 and the flag of the next higher stage is incremented by 1. An example is seen when {FLAG_1, FLAG_2, FLAG_3} changes from {0, 1, 7} to {0, 2, 0}. This kind of transition is named an "upward operation", since it corresponds to an upward movement in the tree.

<iii> If neither of the above cases applies, FLAG_3 is incremented by 1. An example is seen when {FLAG_1, FLAG_2, FLAG_3} changes from {0, 1, 2} to {0, 1, 3}. This kind of transition is named a "sweep operation", since it corresponds to a horizontal movement in the tree.

The pseudo-code shown together with Figure 4.9 provides more details. As can be seen, a calculated result is output in each clock cycle with only one addition required. In addition, a register update accompanies every downward operation, so that the intermediate result is kept before moving down the tree.
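A software sketch of the same traversal is given below; it is a recursive rendering of the depth-first walk rather than the flag-based hardware of Figure 4.9, and the decomposition of formula (3.4) into per-bit offsets `deltas[k]` is our assumption.

```python
def tree_based_llr_sums(lx_hd, deltas, radius=3):
    """Depth-first traversal of the offset tree of Figure 4.8: each node's
    sum of LLRs is derived from its parent's with a single addition.
    lx_hd is the LLR sum of the temporary hard decision; deltas[k] is the
    change in that sum when neighbor bit k is flipped."""
    n = len(deltas)                          # n = 8 neighbor positions
    sums = {(): lx_hd}                       # Q = 0: the hard decision itself

    def descend(first_free, flipped, partial_sum, depth):
        if depth == radius:
            return
        for k in range(first_free, n):       # sweep across the current stage
            s = partial_sum + deltas[k]      # exactly one addition per node
            sums[flipped + (k,)] = s
            # Downward operation: the recursion stack plays the role of the
            # stage registers, holding partial sums before going deeper.
            descend(k + 1, flipped + (k,), s, depth + 1)

    descend(0, (), lx_hd, 0)
    return sums                              # 1 + 8 + 28 + 56 = 93 sums for R = 3
```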

With tree-based LLR summation, generating one sum of LLRs requires only one addition on average, as summarized in TABLE 4.1. As an extra note, this scheme can be further generalized to the case in which all candidates are included (R=8). On average, the number of additions for the LLRs is reduced to one-third of that of the original approach.

TABLE 4.1 Additions needed to generate a sum of LLRs for candidates of different Q

4.3 Summary of Complexity Reduction

In the last section, a suite of complexity-reducing schemes has been introduced, including:

a. Iteration reduction with a stopping criterion based on ML candidates

b. Candidate reduction based on Hamming distance

c. Addition reduction applying tree-based LLR summation

Each of them focuses on a different aspect of the 2D-MAP algorithm, as illustrated in Figure 4.1. Moreover, each scheme is in fact independent of the other two: each can be applied to the 2D-MAP algorithm on its own, while combined they yield a significant overall complexity reduction.

If the computational complexity of the direct implementation, which involves 256 candidates and 15 iterations, is taken as 100%, the effect of each complexity-reducing scheme is as shown in TABLE 4.2. The simulation environment is the complete channel with W=1.25 and SNR=20dB.

While the addition reduction causes no performance degradation, being equivalent in effect to the direct implementation, the degradation resulting from the iteration and candidate reductions is also negligible, as observed from the performance of the 2D-MAP with all three complexity-reducing schemes applied in Figure 4.10.

TABLE 4.2 Summary of reduction on computational complexity

(IR = Iteration Reduction, CR = Candidate Reduction, AR = Addition Reduction; the direct implementation involves 256 candidates and 15 iterations.)

Figure 4.10 Performance of the complexity-reduced 2D-MAP (dashed lines)

4.4 Complexity Reduction under Incoherent Intensity Channel

The preceding results have been developed under the assumption of the complete channel model for the sake of generality. From Chapter 2, we understand that, as a simplified and linearized version of the complete model, the incoherent intensity model is applicable when the target system behaves more like an incoherent optical system. In this section, we demonstrate that, under the hypothesis of the incoherent channel model, a 2D-MAP detector with even lower computational complexity can be devised.

The main difference brought by the adoption of the incoherent channel model is that the channel function H(X+X(i,j)) in equation (3.4) of the 2D-MAP algorithm can now be taken as the linear form H·(X+X(i,j)). Originally the function H was a black box to us; now we know that it amounts to an inner product between a 3×3 hypothesis and the 3×3 IPI matrix. This simple replacement makes another computational reduction possible: the multiplication reduction, shown in Figure 4.11, which aims to remove the squaring operation from the calculation of the normalized squared Euclidean distance.
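As a small illustration of this linear form (the function name and array interface are ours), the noiseless channel response under the incoherent model reduces to a single inner product:

```python
import numpy as np

def incoherent_channel_response(hypothesis, ipi_matrix):
    """Under the incoherent intensity model, H(.) is no longer a black box:
    the expected pixel intensity is the inner product of the 3x3
    hypothesized bit pattern and the 3x3 IPI matrix."""
    return float(np.sum(np.asarray(hypothesis) * np.asarray(ipi_matrix)))
```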

In the following, we first demonstrate that the three complexity-reducing schemes apply equally to the system under the incoherent intensity channel; the multiplication reduction is then presented, so that the combined effect of all four schemes can be observed.

Figure 4.11 Complexity-reducing schemes that can be applied in 2D-MAP under incoherent channel

To take the worst case into consideration, these schemes are verified under the incoherent channel with severer IPI, where W=1.8. The direct implementation uses 256 candidates and 20 iterations. When iteration reduction is applied, the average iteration number decreases to approximately half of that of the direct implementation, with the exact reduction adapting to the channel conditions, as shown in Figure 4.12.

The performance of the iteration-reduced 2D-MAP is then compared with that of the 2D-MAP with fixed iteration numbers, as shown in Figure 4.13; there is nearly no performance degradation with respect to the direct implementation. For candidate reduction, we set the Hamming-distance radius to 3, a choice concluded from the simulation in Figure 4.14. The performance of the 2D-MAP after the
