
Chapter 2 Single-Radix FFT

2.3 Re-Ordering Algorithm for Input Sequence

Decimating the signal in the time domain requires the input samples to be re-ordered. For a 9-point signal, the original order of the samples is 0, 1, 2, 3, 4, 5, 6, 7 and 8, but after decimation by the radix-3 FFT the order becomes 0, 3, 6, 1, 4, 7, 2, 5 and 8. The new order can be obtained by writing each index in ternary form, as shown in Figure 2.10, and then reversing its ternary digits. For example, index 1 is (01) in ternary, which reverses to (10), i.e., 3. The numbers represented by the reversed digits form the new sequence that is applied to the decimated FFT.

Figure 2.10: The process of re-ordering the input samples for radix-3 FFT.

Figure 2.11: 9-point decimation-in-time FFT algorithm.

For illustration, Figure 2.11 depicts the computation of the 9-point FFT. The computation is performed in two stages: first three 3-point FFT’s are computed, and then their outputs are recombined into the 9-point result.
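The two-stage structure described above can be sketched in Python (chosen here only for illustration; the thesis itself uses MATLAB). The function names `dft` and `fft9_radix3` are hypothetical, and the direct DFT is included only as a reference for checking the result.

```python
import cmath

def dft(x):
    """Direct O(N^2) DFT, used as the small transform and as a reference."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
            for k in range(n)]

def fft9_radix3(x):
    """9-point decimation-in-time FFT: three 3-point DFTs, then one recombination."""
    assert len(x) == 9
    # Stage 1: 3-point DFTs of the decimated subsequences x[r], x[r+3], x[r+6].
    sub = [dft(x[r::3]) for r in range(3)]
    # Stage 2: recombine with the twiddle factors W9^(r*k).
    return [sum(cmath.exp(-2j * cmath.pi * r * k / 9) * sub[r][k % 3]
                for r in range(3))
            for k in range(9)]
```

Comparing `fft9_radix3(x)` with `dft(x)` for any 9-point sequence should give the same values up to rounding error.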

The same bit-reversal (digit-reversal) procedure can be applied to re-order the input sequence for the radix-4 FFT or the radix-5 FFT, provided the numbers are first represented in quaternary or quinary form, respectively.
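As a minimal Python sketch of this re-ordering, the helper below reverses the base-L digits of every index. The names `digit_reverse` and `digit_reversed_order` are hypothetical, and the sketch assumes the number of points is an exact power of the radix.

```python
def digit_reverse(index, radix, num_digits):
    """Reverse the base-`radix` digits of `index`, padded to `num_digits` digits."""
    rev = 0
    for _ in range(num_digits):
        index, digit = divmod(index, radix)
        rev = rev * radix + digit
    return rev

def digit_reversed_order(n, radix):
    """Input order for an n-point single-radix FFT (n must be a power of radix)."""
    num_digits, size = 0, 1
    while size < n:
        size *= radix
        num_digits += 1
    if size != n:
        raise ValueError("n must be a power of radix")
    return [digit_reverse(i, radix, num_digits) for i in range(n)]

print(digit_reversed_order(9, 3))  # [0, 3, 6, 1, 4, 7, 2, 5, 8]
```

The same call with `radix=4` or `radix=5` gives the quaternary and quinary cases.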

2.4 Speed Ratio of Radix-L FFT to Radix-2 FFT

In section 2.2, a table compared the computational complexity of the different single-radix FFT’s. In this section we obtain the same result from the butterfly structure, by counting how many butterfly structures each algorithm uses.

The number of butterfly structures for each algorithm is given by

radix-L FFT: $(N_L/L)\,\log_L N_L$, where $N_L$ is a power of $L$,

where $N_L/L$ is the number of butterfly structures at each stage and $\log_L N_L$ is the number of stages of the algorithm. The computation time of a butterfly structure ($T_L$) is:

$$T_L = \frac{\text{total run-time of radix-}L\text{ FFT}}{\text{number of butterfly structures}} = \frac{T_L^{\mathrm{run}}}{(N_L/L)\,\log_L N_L}$$

Because the computation time of the re-ordering process is very small and can be ignored, we can use the total run-time of the radix-L FFT directly to calculate T_L.
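The butterfly count and the per-butterfly time can be sketched as follows. This is an illustrative Python sketch, not the MATLAB measurement code of the thesis; `butterfly_count` and `time_per_butterfly` are hypothetical names, and the run-time passed in would come from an actual measurement.

```python
def butterfly_count(n, radix):
    """(N_L / L) * log_L(N_L) butterflies for an n-point radix-L FFT."""
    stages, size = 0, 1
    while size < n:
        size *= radix
        stages += 1
    if size != n:
        raise ValueError("n must be a power of radix")
    return (n // radix) * stages

def time_per_butterfly(total_run_time, n, radix):
    """T_L, neglecting the re-ordering time as the text assumes."""
    return total_run_time / butterfly_count(n, radix)

print(butterfly_count(9, 3))     # 3 butterflies per stage, 2 stages
print(butterfly_count(1024, 2))  # 512 butterflies per stage, 10 stages
```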

We run the radix-L FFT in MATLAB, record the total run-time and the number of butterfly structures for each algorithm, and then obtain T_L for each algorithm as shown in Table 2.2.

T_L     T_2     T_3     T_4     T_5
μs      1.76    2.59    3.47    4.69

Table 2.2: The computation time of a butterfly structure for each algorithm.

After obtaining T_L for each algorithm, we can calculate the speed ratio. Let us assume that the numbers of sampling points N_2, N_3, N_4 and N_5 are very close in value, so that N_2 ≈ N_L. Then, according to the following equation,

$$\text{speed ratio} = \frac{(N_2/2)\,\log_2 N_2 \cdot T_2}{(N_L/L)\,\log_L N_L \cdot T_L} \approx \frac{L\,T_2}{2\,T_L}\,\log_2 L$$

we can calculate the speed ratio of the radix-L FFT to the radix-2 FFT as shown in Table 2.3.

algorithm      radix-2 FFT    radix-3 FFT    radix-4 FFT    radix-5 FFT
speed ratio    1              1.02·log₂3     1.01·log₂4     0.94·log₂5

Table 2.3: The speed ratio of radix-L FFT to radix-2 FFT.
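The coefficients in Table 2.3 can be reproduced from the values in Table 2.2. The sketch below assumes the speed ratio is the radix-2 run-time divided by the radix-L run-time for equal numbers of points, which reduces to (L·T_2)/(2·T_L)·log₂L; the function names are hypothetical.

```python
import math

# Per-butterfly times in microseconds, taken from Table 2.2.
T = {2: 1.76, 3: 2.59, 4: 3.47, 5: 4.69}

def ratio_coefficient(radix):
    """Factor that multiplies log2(L) in the speed ratio: (L * T_2) / (2 * T_L)."""
    return (radix * T[2]) / (2 * T[radix])

def speed_ratio(radix):
    """Run-time of radix-2 divided by run-time of radix-L for the same N."""
    return ratio_coefficient(radix) * math.log2(radix)

for radix in (3, 4, 5):
    print(radix, round(ratio_coefficient(radix), 2), round(speed_ratio(radix), 2))
```

The rounded coefficients are 1.02, 1.01 and 0.94, matching the table.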

The speed ratio of the radix-L FFT to the radix-2 FFT is very close to log₂L, which is larger than one for L > 2. Hence the radix-3, radix-4 and radix-5 FFT’s perform better than the radix-2 FFT.

2.5 Numbers of Sampling Points

Although the radix-L FFT’s offer a speed improvement over the radix-2 FFT, another issue needs to be considered: the numbers of sampling points to which each FFT can be applied. We discuss this issue below.

For the radix-2 FFT, the number of sampling points must be equal to 2^k, where k is a natural number. Similarly, for the radix-3 FFT the number of sampling points must be one of the series 3^k. Table 2.4 lists the numbers of sampling points below 10000 to which the radix-3 FFT can be applied. There are only 8 such numbers, fewer than the 13 available to the radix-2 FFT. The higher the radix, the fewer the applicable numbers: for the radix-4 and radix-5 FFT’s there are only 6 and 5, respectively. This is a drawback of using a higher-radix FFT.

k           1    2    3     4     5      6      7       8
3^k         3    9    27    81    243    729    2187    6561
distance    -    6    18    54    162    486    1458    4374

Table 2.4: The numbers of sampling points which can be applied by the radix-3 FFT under the value 10000.
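The counts quoted in this section can be checked with a few lines of Python. The helper name `applicable_sizes` is hypothetical; it simply lists the powers of the radix below a limit.

```python
def applicable_sizes(radix, limit):
    """Powers radix^k (k >= 1) that are below `limit`."""
    sizes = []
    n = radix
    while n < limit:
        sizes.append(n)
        n *= radix
    return sizes

for radix in (2, 3, 4, 5):
    sizes = applicable_sizes(radix, 10000)
    print(radix, len(sizes), sizes)

# Distances between consecutive radix-3 sizes, as in the last row of Table 2.4.
threes = applicable_sizes(3, 10000)
print([b - a for a, b in zip(threes, threes[1:])])
```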

Chapter 3 Mixed-Radix FFT

As discussed in the previous chapter, FFT’s other than radix-2 offer a speed improvement but have the drawback that fewer numbers of sampling points are applicable. A good way to bypass this drawback is to divide the computation stages into several groups and apply an FFT of a different radix to each group of stages. We call this the “mixed-radix FFT” algorithm.

In this chapter, we will derive the mixed-radix FFT based on the results obtained in the previous chapter. First we will discuss the case of two radixes, i.e., radix-A/B FFT, and then the case of three radixes, i.e., radix-A/B/C FFT.

3.1 Radix-A/B FFT

Radix-A/B FFT has two factors, A and B. Depending on the order in which these two factors are permuted, there are several forms of the algorithm that can be used to decimate the data sequence. For example, if the data sequence has N (= A²×B) points, there are three permutations of A and B, i.e., AAB, ABA and BAA. We only consider the permutation AA…ABB…B, since the same analysis applies to all other permutations.

3.1.1 Decimation-In-Time Algorithm

Let us consider the data sequence x(n) with N (= A^mA×B^mB) sampling points. First we decimate the data sequence by a factor of A repeatedly until it is split into A^mA data sequences, each of which has B^mB sampling points. Then we decimate each of these data sequences further by a factor of B repeatedly until each is reduced to a series of B-point FFT’s. This process is shown in Figure 3.1.

Figure 3.1: The radix-A/B FFT decimation.

In the figure, ‘mA-stage’ means the data sequence is decimated by the radix-A algorithm (mA-1) times, ‘mB-stage’ means the data sequences are decimated by the radix-B algorithm (mB-1) times, and ‘A^mA-block and B^mB-point’ means there are A^mA data sequences with B^mB sampling points for each sequence.

Figure 3.2: The 6-point FFT computed by the radix-2/3 algorithm.

For example, Figure 3.2 depicts the 6-point FFT computed by the radix-2/3 algorithm described above. The data sequence is finally decimated into two 3-point FFT’s. Besides the radix-2/3 algorithm, we can also use the radix-3/2 algorithm to compute the same 6-point FFT. In this case, the data sequence is finally decimated into three 2-point FFT’s, as shown in Figure 3.3.

Figure 3.3: The 6-point FFT computed by the radix-3/2 algorithm.
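A recursive Python sketch of the mixed-radix decimation-in-time idea is given below. It is an illustration under simple assumptions, not the thesis implementation: `mixed_radix_fft` is a hypothetical name, the list `radices` gives the radix of each stage in decimation order, and the last radix is handled by a direct small DFT.

```python
import cmath

def dft(x):
    """Direct DFT, used for the final small transforms and as a reference."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
            for k in range(n)]

def mixed_radix_fft(x, radices):
    """Decimation-in-time FFT; the product of `radices` must equal len(x)."""
    n = len(x)
    product = 1
    for r in radices:
        product *= r
    if product != n:
        raise ValueError("product of radices must equal len(x)")
    if len(radices) == 1:
        return dft(x)  # final r-point transform
    r = radices[0]
    m = n // r
    # Decimate into r subsequences of m points and transform each one.
    sub = [mixed_radix_fft(x[q::r], radices[1:]) for q in range(r)]
    # Recombine the sub-transforms with the twiddle factors W_n^(q*k).
    return [sum(cmath.exp(-2j * cmath.pi * q * k / n) * sub[q][k % m]
                for q in range(r))
            for k in range(n)]
```

With a 6-point input, `mixed_radix_fft(x, [2, 3])` ends in two 3-point transforms and `mixed_radix_fft(x, [3, 2])` ends in three 2-point transforms; both should agree with `dft(x)` up to rounding error.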

3.1.2 Re-Ordering Algorithm for Input Sequence

The re-ordering algorithm used for the radix-A/B FFT is very different from that used for the single-radix FFT. Since there are two factors in the algorithm, the previous bit-reversal procedure cannot be used directly.

Here we consider the process of the decimation-in-time algorithm. We first use radix-A to decimate the data sequence into several blocks, observe the order of the data in every block, and try to find a mathematical relation that describes the result. Then we use radix-B to decimate each block, which has B^mB sampling points. Before decimating, we re-number the data in each block beginning from 0 and record both the true index and the new number of each datum. After this, we use the bit-reversal (radix-B re-ordering) procedure to re-order the data in each block. Finally, we obtain the input order by mapping the new numbers back to the true indices through the previous record.

However, the above method is somewhat complicated. We can instead find a mathematical relation that describes the input order directly. Figure 3.4 depicts the 6-point radix-2/3 re-ordering algorithm. This algorithm can be extended to radix-A/B.

Figure 3.4: The 6-point radix-2/3 re-ordering algorithm.

In the figure, ‘0, 1, 2, 3, 4, 5’ means the original order, ‘0, 2, 4, 1, 3, 5’ means the decimated order and the part written on the right of the equal sign is the re-ordering algorithm.

‘N1-point radix-2 re-ordering’ means using bit-reversal (radix-2 re-ordering) to re-order the sequence of numbers 0…N1-1, where N1 is a power of 2. ‘(N2-point radix-3 re-ordering) × N1’ means using bit-reversal (radix-3 re-ordering) to re-order the sequence of numbers 0…N2-1, where N2 is a power of 3, and then multiplying each number of the re-ordered sequence by N1.
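One reading of this re-ordering rule that reproduces the decimated order 0, 2, 4, 1, 3, 5 is sketched below in Python. It assumes each input index is the sum of one term from the N1-point radix-A re-ordering and one term from the N2-point radix-B re-ordering scaled by N1, with the radix-B term varying fastest. The function names are hypothetical.

```python
def digit_reversed_order(n, radix):
    """Digit-reversed sequence of 0..n-1 in base `radix` (n a power of radix)."""
    num_digits, size = 0, 1
    while size < n:
        size *= radix
        num_digits += 1
    if size != n:
        raise ValueError("n must be a power of radix")
    order = []
    for i in range(n):
        rev = 0
        for _ in range(num_digits):
            i, digit = divmod(i, radix)
            rev = rev * radix + digit
        order.append(rev)
    return order

def radix_ab_input_order(n1, a, n2, b):
    """Input order for N = n1 * n2 points, n1 a power of a and n2 a power of b."""
    first = digit_reversed_order(n1, a)                      # N1-point radix-A re-ordering
    second = [n1 * v for v in digit_reversed_order(n2, b)]   # (N2-point radix-B re-ordering) x N1
    return [p + q for p in first for q in second]

print(radix_ab_input_order(2, 2, 3, 3))  # [0, 2, 4, 1, 3, 5]
```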

Note that the order of A and B affects the run-time: the radix-A/B FFT and the radix-B/A FFT take different times. Table 3.1 compares the computation speed of the radix-A/B FFT and the radix-B/A FFT under different conditions.

                 A^mA > B^mB    B^mB > A^mA
radix-A/B FFT    slow           fast
radix-B/A FFT    fast           slow

Table 3.1: A comparison of the radix-A/B FFT and the radix-B/A FFT.

3.2 Radix-A/B/C FFT

For three radixes A, B and C, there are six permutations: radix-A/B/C FFT, radix-A/C/B FFT, radix-B/A/C FFT, radix-B/C/A FFT, radix-C/A/B FFT and radix-C/B/A FFT. Here we only discuss the radix-A/B/C FFT; all other permutations of the radix order are treated in the same way.

3.2.1 Decimation-In-Time Algorithm

Let us consider the data sequence x(n) with N (= A^mA×B^mB×C^mC) sampling points. First the data sequence is decimated by a factor of A repeatedly until the data sequence is split into A^mA data sequences for which each sequence has (B^mB×C^mC) sampling points. And then each of these data sequences is decimated by the factor of B repeatedly until each sequence is split into B^mB data sequences for which each sequence has C^mC sampling points. Hence, now we have altogether (A^mA×B^mB) data sequences for which each sequence has C^mC sampling points.

Finally, each of these sequences is decimated by the factor of C repeatedly until it reaches a series of C-point FFT’s. In other words, the radix-A/B/C FFT uses the radix-A algorithm, the radix-B algorithm and the radix-C algorithm in turn to decimate the DFT computation as shown in Figure 3.5.

Figure 3.5: The Radix-A/B/C FFT decimation.

In the figure, ‘mA-stage’ means the data sequence is decimated by the radix-A algorithm (mA-1) times, ‘mB-stage’ means the data sequences are decimated by the radix-B algorithm (mB-1) times, ‘mC-stage’ means the data sequences are decimated by the radix-C algorithm (mC-1) times, ‘A^mA×B^mB-block and C^mC-point’ means there are (A^mA×B^mB) data sequences with C^mC sampling points for each sequence and ‘A^mA-block and (B^mB×C^mC)-point’ means there are A^mA data sequences with (B^mB×C^mC) data points for each sequence.

For example, Figure 3.6 depicts the 30-point FFT computed by the radix-2/3/5 algorithm. The data sequence is first decimated into two 15-point FFT’s, and then each of these 15-point FFT’s is decimated into three 5-point FFT’s.

Figure 3.6: The 30-point FFT computed by the radix-2/3/5 algorithm.

This 30-point FFT can also be computed by the other five algorithms that use 2, 3 and 5 in a different order. Although the computation structures and the orders of the input sequences differ, the results obtained are the same.

3.2.2 Re-Ordering Algorithm for Input Sequence

Similar to section 3.1.2, here again we will find a mathematical relationship to describe the input order directly. We use radix-2/3/5 as an example to re-order a 30-point sequence in Figure 3.7. The procedure can be extended to the general case of the radix-A/B/C.

Figure 3.7: The illustration of a 30-point radix-2/3/5 re-ordering algorithm.

In the figure, the first sequence is in the original order, the second sequence is in the decimated order and the part written under the equal sign is the re-ordering algorithm.

‘N1-point radix-2 re-ordering’ means using bit-reversal (radix-2 re-ordering) to re-order the sequence of numbers 0…N1-1, where N1 is a power of 2. ‘(N2-point radix-3 re-ordering) × N1’ and ‘(N3-point radix-5 re-ordering) × M’ are defined similarly to the terms discussed in section 3.1.2.
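The input order can also be obtained by simulating the decimation itself, which gives a way to check any closed-form re-ordering rule. The Python sketch below is an illustration under assumptions: `decimated_input_order` is a hypothetical name, the sequence is split by every radix except the last one (the final small FFT takes its inputs in order), and the multiplier M is taken to be N1 × N2, which the text does not state explicitly.

```python
def decimated_input_order(n, radices):
    """Order in which the n input samples enter a DIT FFT with the given stage radices."""
    def split(indices, remaining):
        # The last radix is the final small FFT, which needs no further decimation.
        if len(remaining) <= 1:
            return list(indices)
        r = remaining[0]
        out = []
        for q in range(r):
            out.extend(split(indices[q::r], remaining[1:]))
        return out
    return split(list(range(n)), list(radices))

order30 = decimated_input_order(30, [2, 3, 5])
print(order30[:10])  # [0, 6, 12, 18, 24, 2, 8, 14, 20, 26]

# Closed form for one factor each of 2, 3 and 5: a + 2*b + 6*c, with c varying fastest.
closed_form = [a + 2 * b + 6 * c for a in range(2) for b in range(3) for c in range(5)]
print(order30 == closed_form)  # True
```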

Chapter 4 Radix-2/4/3/5 FFT and Interpolation

We can now use the radix-2/3/5 FFT in place of the radix-2 FFT, but we still want to speed up the radix-2/3/5 FFT and to extend the set of applicable numbers of sampling points. To do this, we replace the radix-2 algorithm in the radix-2/3/5 FFT with the radix-2/4 algorithm and thus obtain the radix-2/4/3/5 FFT. Furthermore, we add interpolation to extend the applicable numbers of sampling points to any number.

4.1 Radix-2/4/3/5 FFT Algorithm

For a data sequence x(n) with N (= 2^m2’×3^m3×5^m5) sampling points, we replace the radix-2 algorithm with the radix-2/4 algorithm; in other words, we consider the data sequence x(n) to have N (= 2^m2×4^m4×3^m3×5^m5) sampling points. Since 4^m4 = 2^(2·m4), we have m2’ = m2 + 2·m4, so m2’ is larger than (m2 + m4) whenever m4 > 0 and the number of computation stages is reduced.
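The reduction in the number of stages can be illustrated with a short Python sketch. It assumes that the factors of 2 are paired into as many radix-4 stages as possible, leaving at most one radix-2 stage; the function name `stage_counts_2435` and the dictionary keys are hypothetical.

```python
def stage_counts_2435(n):
    """Split n = 2^m2' * 3^m3 * 5^m5 into radix-2/4/3/5 stage counts."""
    counts = {}
    for p in (2, 3, 5):
        counts[p] = 0
        while n % p == 0:
            n //= p
            counts[p] += 1
    if n != 1:
        raise ValueError("n must have no prime factors other than 2, 3 and 5")
    m2_prime = counts[2]
    # Pair the factors of 2 into radix-4 stages: m2' = m2 + 2 * m4 with m2 in {0, 1}.
    m4, m2 = divmod(m2_prime, 2)
    return {"m2_prime": m2_prime, "m2": m2, "m4": m4, "m3": counts[3], "m5": counts[5]}

c = stage_counts_2435(1440)  # 1440 = 2^5 * 3^2 * 5
print(c["m2_prime"] + c["m3"] + c["m5"])      # stages with radix-2/3/5
print(c["m2"] + c["m4"] + c["m3"] + c["m5"])  # stages with radix-2/4/3/5
```

For N = 1440 the stage count drops from 8 to 6.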

The computation of the radix-2/4/3/5 FFT is shown in Figure 4.1.

Figure 4.1: The radix-2/4/3/5 FFT decimation.

As for the re-ordering algorithm, it is the same as that of Figure 3.8, except that the N1-point radix-2 re-ordering must be replaced with the N1-point radix-2/4 re-ordering as shown in Figure 4.2.

Figure 4.2: The 30-point radix-2/4/3/5 re-ordering algorithm.

In the figure, the ‘radix-2/4 re-ordering’ is the radix-A/B re-ordering algorithm discussed in section 3.1.2.

4.2 Interpolation Algorithm

The purpose of this algorithm is to re-sample the sampled data so that the radix-2/4/3/5 FFT can be applied to the resulting number of sampling points. For example, if a signal is represented by a data sequence x(n) with N (= any number) sampling points, interpolation re-samples x(n) to obtain a similar sequence x’(n), which expresses the same signal, with N’ (= 2^m2×4^m4×3^m3×5^m5) sampling points. In order to shorten the run-time of interpolation and reduce the difference between x(n) and x’(n), we use linear interpolation, which is shown in Figure 4.3. In the figure, (a) shows a data sequence x(n) with 6 sampling points. We approximate the actual curve between each pair of adjacent points of x(n) by a straight line, and then represent the interpolated signal by a new sequence x’(n) with 10 sampling points placed at equal time intervals, as shown in (c).
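A minimal Python sketch of such a linear re-sampling is given below. It assumes that the first and last samples of x(n) and x’(n) coincide in time, which the text does not state; `resample_linear` is a hypothetical name.

```python
def resample_linear(x, new_len):
    """Re-sample x to new_len equally spaced points by linear interpolation."""
    n = len(x)
    if n < 2 or new_len < 2:
        raise ValueError("need at least two input and two output points")
    out = []
    for j in range(new_len):
        # Position of the new sample on the original index axis (assumed endpoint-aligned).
        pos = j * (n - 1) / (new_len - 1)
        i = min(int(pos), n - 2)   # left neighbour, clamped so that i + 1 is valid
        frac = pos - i
        out.append(x[i] * (1 - frac) + x[i + 1] * frac)
    return out

x = [0.0, 1.0, 4.0, 9.0, 16.0, 25.0]  # 6 samples, as in the example of Figure 4.3 (a)
x_new = resample_linear(x, 10)        # 10 samples at equal intervals
print(len(x_new), x_new[0], x_new[-1])
```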

Figure 4.3: The illustration of linear interpolation.

The following Figure 4.4 depicts the flow chart of an N-point FFT operated by the new algorithm.

Figure 4.4: The flow chart of N-point FFT operated by the new algorithm.

4.3 Simulation Results

To illustrate the advantage of the radix-2/4/3/5 FFT in terms of the applicable number of sampling points, we perform the following experiment. Within the range from 1 to 10000, we choose 50 values of Ns evenly. The numbers of sampling points (Ns) and the data sequences are obtained from the following procedure:

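for i = 1:50
    Ns(i) = 111 + (i-1)*200;    % numbers of sampling points: 111, 311, ..., 9911
    for j = 1:Ns(i)
        x(i,j) = j;             % data sequence i is the ramp 1, 2, ..., Ns(i)
    end
end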

We plot the number of sampling points to which each FFT algorithm can actually be applied against the required number of sampling points, for the single-radix FFT’s in Figure 4.5 and for the mixed-radix FFT’s in Figure 4.6.

Figure 4.5: The comparison of applicable number (Ni) of sampling points which can be applied by the single-radix FFT’s.

In the figure, ‘DFT’ means using the discrete Fourier transform, which can be applied to any number of sampling points; thus ‘DFT’ is a straight line. The radix-2 FFT has more applicable numbers of sampling points than any other single-radix FFT, as discussed in section 2.5, since its step-level is the smallest. Even so, the steps of every single-radix FFT are large (compare the distances in Table 2.4). Hence, it is not a good idea to use the single-radix FFT’s.

Figure 4.6: The comparison of applicable number (Ni) of sampling points which can be applied by the mixed-radix FFT’s.

Figure 4.6 shows similar plots for the mixed-radix FFT’s. In the figure, ‘radix-2,3 FFT’ covers two algorithms, i.e., radix-2/3 FFT and radix-3/2 FFT, since they have the same number of re-sampled points. The same holds for ‘radix-2,5 FFT’ and ‘radix-2,3,5 FFT’.

From the figure, we can see that ‘radix-2,3 FFT’, ‘radix-2,5 FFT’ and ‘radix-2,3,5 FFT’ all follow the straight line of ‘DFT’, and that the step-levels of ‘radix-2,3 FFT’ and ‘radix-2,5 FFT’ grow as Ns becomes larger. The ‘radix-2,3,5 FFT’ has the smallest step-level and approaches the ‘DFT’ line.

Figure 4.7: The comparison of ΔN (Ni - Ns) for different mixed-radix FFT’s.

Figure 4.7 shows the ΔN (Ni - Ns) plots, derived from the data of Figure 4.6, for each mixed-radix FFT. ‘radix-2,5 FFT’ has the maximum amplitude and ‘radix-2,3,5 FFT’ has the minimum amplitude. To demonstrate this property further, we plot the data in the form of 1 - |ΔN/Ns| versus the number of sampling points, as shown in Figure 4.8.

Since interpolation modifies the original signal and effectively introduces noise into it, it is interesting to investigate how the interpolation process affects the S/N of the signal. The 1 - |ΔN/Ns| figure indirectly reflects the S/N of the processed signal.

In the figure, it can be seen that the radix-2,3,5 FFT has a better S/N than the other two algorithms. However, the S/N’s of the other two algorithms are not too bad either, since their values of 1 - |ΔN/Ns| are all larger than 0.9.
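The quantities Ni, ΔN and 1 - |ΔN/Ns| can be computed for the 50 values of Ns with the Python sketch below. The text does not state how Ni is chosen; the sketch assumes Ni is the applicable number closest to Ns (ties going to the smaller one), which is consistent with the absolute value in 1 - |ΔN/Ns|. The function names are hypothetical.

```python
def is_applicable(n, radices):
    """True if n has no prime factors other than the given radices."""
    for p in radices:
        while n % p == 0:
            n //= p
    return n == 1

def nearest_applicable(ns, radices):
    """Applicable number of points closest to ns (assumed rule for choosing Ni)."""
    offset = 0
    while True:
        for candidate in (ns - offset, ns + offset):
            if candidate >= 1 and is_applicable(candidate, radices):
                return candidate
        offset += 1

def closeness(ns, radices):
    """1 - |dN / Ns| with dN = Ni - Ns."""
    ni = nearest_applicable(ns, radices)
    return 1 - abs((ni - ns) / ns)

Ns = [111 + 200 * i for i in range(50)]  # same values as the MATLAB procedure
for name, radices in (("radix-2,3", (2, 3)), ("radix-2,5", (2, 5)), ("radix-2,3,5", (2, 3, 5))):
    print(name, min(closeness(ns, radices) for ns in Ns))
```

Under this assumption the minimum of 1 - |ΔN/Ns| over the 50 values stays above 0.9 for all three algorithms, in line with the statement above.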

Next, we compare the run-times of different FFT algorithms. First we compare the run-times of the radix-2 FFT and the radix-2/4 FFT, as shown in Figure 4.9, and then the run-times of the radix-2,3,5 FFT and the radix-2/4,3,5 FFT, as shown in Figure 4.10. The two algorithms in each pair have the same numbers of re-sampled points.

Figure 4.9: The comparison of run-times of radix-2 FFT and radix-2/4 FFT.

The radix-2/4 FFT is faster than the radix-2 FFT, with a run-time almost half of that of the radix-2 FFT. This result is consistent with Table 2.3. In this experiment, only the first stage of the radix-2/4 FFT is decimated by the factor 2, and all other stages are decimated by the factor 4. So the radix-2/4 FFT is almost the same as the radix-4 FFT, except that the number of sampling points to which the radix-2/4 FFT can be applied is a power of 2, not necessarily a power of 4.

Figure 4.10: The comparison of run-times of radix-2,3,5 FFT and radix-2/4,3,5 FFT.

In this experiment, the radix-2/4,3,5 FFT was obtained by replacing the radix-2 algorithm of the radix-2,3,5 FFT with the radix-2/4 algorithm. From the figure, we can see that the radix-2/4,3,5 FFT is faster than the radix-2,3,5 FFT, but there are some positions where the two curves overlap. This happens when the number of re-sampled points is (2×3^m3×5^m5), for which the radix-2/4,3,5 FFT is identical to the radix-2,3,5 FFT.

The run-times of the radix-2,3 FFT, radix-2,5 FFT and radix-2,3,5 FFT, which have similar numbers of re-sampled points, are compared in Figure 4.11. The radix-2,3 FFT and radix-2,5 FFT have similar computation speed, but the radix-2,3,5 FFT is generally faster than both.

Figure 4.11: The comparison of run-times of radix-2,3 FFT, radix-2,5 FFT and radix-2,3,5 FFT.

The run-times of the radix-2,3 FFT, radix-2,5 FFT and radix-2/4,3,5 FFT are compared in Figure 4.12. The radix-2/4,3,5 FFT is much faster than the other two algorithms.

Figure 4.12: The comparison of run-times of radix-2,3 FFT, radix-2,5 FFT and radix-2/4,3,5 FFT.

In conclusion, based on the results of Figures 4.6, 4.7, 4.8 and 4.12, the radix-2/4,3,5 FFT combined with interpolation is the best algorithm for computing the FFT, since it satisfies the requirements on both computation speed and the applicable numbers of sampling points.

4.4 Examples

In this section we further use two examples applied with the above developed FFT
