
2. DISCUSSION AND RESULTS

2.1 VARIABLE-LENGTH FFT PROCESSOR

2.1.1 Variable-length Data Address Generator

In the design of an in-place memory-based FFT processor, the data address generator is determined by the order of the butterfly operations. A conventional processing order and control scheme for the radix-2 FFT was proposed by Cohen (1976) [1], and the algorithm was later extended and generalized in [2], [3], [4], [5]. However, analyzing the sub-segment of the signal flow graph (SFG) that corresponds to a shorter-length FFT shows that Cohen’s scheme is not suitable for a variable-length FFT.

Taking a 16-point radix-2 DIF FFT as an example, the direct-order scheme processes butterflies from top to bottom and from the leftmost stage to the rightmost stage, as marked by the numbers on the right-hand sides of the ellipses in Fig. 2.1. In contrast, the main idea of Cohen’s processing order is to group butterflies associated with the same twiddle factor together, which reduces the signal switching frequency of the coefficient circuits. This results in the decimation-in-butterfly (DIB) order marked by the numbers on the left-hand sides of the ellipses in Fig. 2.1.

[Fig. 2.1: signal flow graph of the 16-point radix-2 DIF FFT with the 8-point and 4-point sub-graphs marked; the stages are labelled 0th to 3rd, and each butterfly carries its processing-order number (1 to 8) for both schemes.]

Fig. 2.1 DIF butterfly processing sequence for fixed-length and variable-length memory based FFT processors

Table 2.1 Data addresses needed for butterfly PE in direct processing order.

          BF 0      BF 1      BF 2      BF 3      BF 4      BF 5      BF 6      BF 7
Stage 1   <0, 8>    <1, 9>    <2, 10>   <3, 11>   <4, 12>   <5, 13>   <6, 14>   <7, 15>
Stage 2   <0, 4>    <1, 5>    <2, 6>    <3, 7>    <8, 12>   <9, 13>   <10, 14>  <11, 15>
Stage 3   <0, 2>    <1, 3>    <4, 6>    <5, 7>    <8, 10>   <9, 11>   <12, 14>  <13, 15>
Stage 4   <0, 1>    <2, 3>    <4, 5>    <6, 7>    <8, 9>    <10, 11>  <12, 13>  <14, 15>

Table 2.2 Data address pairs for butterfly PE in Cohen’s scheme.

          BF 0      BF 1      BF 2      BF 3      BF 4      BF 5      BF 6      BF 7
Stage 1   <0, 8>    <1, 9>    <2, 10>   <3, 11>   <4, 12>   <5, 13>   <6, 14>   <7, 15>
Stage 2   <0, 4>    <8, 12>   <1, 5>    <9, 13>   <2, 6>    <10, 14>  <3, 7>    <11, 15>
Stage 3   <0, 2>    <4, 6>    <8, 10>   <12, 14>  <1, 3>    <5, 7>    <9, 11>   <13, 15>
Stage 4   <0, 1>    <2, 3>    <4, 5>    <6, 7>    <8, 9>    <10, 11>  <12, 13>  <14, 15>

In Fig. 2.1, when the sub-SFG of a shorter-length FFT is isolated from the longer SFG, Cohen’s butterfly order does not match the butterfly order of the shorter FFT and is therefore incompatible with the variable-length FFT design concept. The direct processing order, by contrast, suits varied FFT lengths. The architecture of Cohen’s data address generator therefore has to be modified to handle FFTs of different lengths.

In the example above, the data addresses needed by the butterfly PE in the direct processing order and in Cohen’s processing order are listed in Table 2.1 and Table 2.2, respectively. In the tables, <s, t> denotes the data address pair for both the input and the output data of the radix-2 butterfly PE, and s and t are indices into a one-dimensional memory array. The address translation and mapping from the one-dimensional index to the multi-bank memory system are considered later.

The data address pair <s, t> needed for the i-th butterfly of the k-th stage in Cohen’s scheme is given by (2.1), where the operator ROTATE_n(X, m) circularly rotates X right by m bits within an n-bit word.

$$
\begin{aligned}
s &= \mathrm{ROTATE}_n(2i,\ k), \\
t &= \mathrm{ROTATE}_n(2i+1,\ k), \\
n &= \log_2 N, \qquad i = 0 \sim \frac{N}{2}-1, \qquad k = 0 \sim n-1
\end{aligned}
\tag{2.1}
$$
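The rotation in (2.1) can be checked against Table 2.2 with a short Python sketch. The function names `rotate_right` and `cohen_pairs` are illustrative and do not come from the source. The sketch assumes that Stage j of Table 2.2 uses a right rotation by j bits; a rotation by n = 4 bits is the identity, the same as a rotation by 0.

```python
def rotate_right(x: int, m: int, n: int) -> int:
    """Circularly rotate the n-bit word x right by m bits."""
    m %= n
    mask = (1 << n) - 1
    return ((x >> m) | (x << (n - m))) & mask


def cohen_pairs(num_points: int, stage: int) -> list[tuple[int, int]]:
    """Address pairs <s, t> of one stage, following the form of (2.1)."""
    n = num_points.bit_length() - 1  # n = log2(N) for a power-of-two N
    return [
        (rotate_right(2 * i, stage, n), rotate_right(2 * i + 1, stage, n))
        for i in range(num_points // 2)
    ]


# Assumption: Stage j of Table 2.2 corresponds to a rotation by j bits.
for stage in range(1, 5):
    print(f"Stage {stage}:", cohen_pairs(16, stage))
```

Under that assumption the printed rows agree with Table 2.2; for example, Stage 2 begins with <0, 4>, <8, 12>, <1, 5>.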

To realize equation (2.1), Cohen proposed the efficient address generator architecture shown in Fig. 2.2. The main idea is to append 0 and 1 to the MSB of the butterfly counter content and then use barrel shifters to realize the rotation.

[Fig. 2.2: the butterfly counter content, extended with a 0 bit and a 1 bit, feeds two circular right shifters that output s and t.]

Fig. 2.2 Data address generator for radix-2 FFT in Cohen’s scheme

Cohen’s DIB-ordered addressing scheme can be modified into a direct-ordered addressing scheme that suits the variable-length FFT design. The data address pair <s, t> is then described by equation (2.2), which is composed of the contents of the butterfly counter and the stage counter.

Chang [6] proposed a variable-length data address generator modified from Cohen’s fixed-length data address generator. Chang’s design includes an extra barrel shifter that rotates the content of the butterfly counter circularly left before the bit-appending operations; the result is then rotated circularly right after the bits are appended. This design not only converts Cohen’s scheme to the direct butterfly operation order, but also adapts to varying FFT lengths. The block diagram is shown in Fig. 2.3.
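As an illustration of why the direct order suits variable lengths, the following Python sketch generates direct-order address pairs from a butterfly index and a stage index. It is derived from the pattern of Table 2.1 and is not a transcription of equation (2.2); the function name `direct_pairs` is hypothetical, and the shorter FFT is assumed to occupy the lowest addresses.

```python
def direct_pairs(num_points: int, stage: int) -> list[tuple[int, int]]:
    """Address pairs <s, t> of one radix-2 DIF stage, top to bottom."""
    span = num_points >> stage  # distance between the two butterfly inputs
    pairs = []
    for i in range(num_points // 2):
        group, offset = divmod(i, span)
        s = group * 2 * span + offset
        pairs.append((s, s + span))
    return pairs


# The first four butterflies of stage 2 of the 16-point FFT are exactly
# stage 1 of an 8-point FFT (assumed to sit at addresses 0 to 7).
print(direct_pairs(16, 2)[:4])
print(direct_pairs(8, 1))
```

Both lines print the same four pairs, so the 8-point sequence is a contiguous prefix of the 16-point sequence. In Table 2.2 the same stage begins <0, 4>, <8, 12>, which interleaves butterflies from outside the 8-point sub-graph.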

[Fig. 2.3: block diagram with a butterfly counter, a stage counter, a circular left barrel shifter, appended 0 and 1 bits, and Barrel_Shifter_0 and Barrel_Shifter_1 (circular right) producing s and t.]

Fig. 2.3 Chang’s variable-length data address generator.

To achieve high-performance variable-length FFT operations and data accesses, we propose the following data address generator. The design supports seven FFT lengths (64, 256, 512, 1024, 2048, 4096, and 8192 points), which cover all the FFT lengths required by 802.11a, 802.16a, DAB, DVB-T, VDSL, and ADSL. Furthermore, the proposed data address generator significantly improves on the address generators described above by adopting the radix-2^2 DIF FFT algorithm, by supporting variable-length FFT operations, and by replacing the original area-consuming barrel-shifter-based designs with simpler multiplexer-based addressing functions.
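The counter and address widths implied by the longest supported length can be worked out with a small Python sketch. It assumes that each radix-2^2 butterfly consumes four data words, so one pass needs N/4 butterflies. The helper `is_power_of_four` is hypothetical and only flags which lengths have an even log2 N.

```python
SUPPORTED_LENGTHS = [64, 256, 512, 1024, 2048, 4096, 8192]  # from the text

longest = max(SUPPORTED_LENGTHS)
address_bits = longest.bit_length() - 1           # log2(8192) = 13
butterfly_bits = (longest // 4).bit_length() - 1  # log2(8192 / 4) = 11


def is_power_of_four(length: int) -> bool:
    """True when log2(length) is even, i.e. the length is a power of four."""
    return (length.bit_length() - 1) % 2 == 0


print("data address bits:", address_bits)
print("butterfly counter bits:", butterfly_bits)
for length in SUPPORTED_LENGTHS:
    print(length, "power of four" if is_power_of_four(length) else "odd log2 N")
```

The two widths agree with the figure labels Data address [12:0] and Butterfly counter [10:0]. The lengths 512, 2048, and 8192 have an odd log2 N, so they cannot be covered by radix-2^2 passes alone; how the design handles the leftover stage is not derived here.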

The four addresses required by the radix-2^2 butterfly PE correspond to four different memory banks.

The addresses are denoted <s, t, u, v> and are calculated by equation (2.3), where N is the longest FFT length supported, k is the stage counter content, and i is the butterfly counter content.

[Fig. 2.4: block diagram; recoverable labels: Butterfly counter [10:0], SIB MUX.]

Fig. 2.4 Block diagram of the proposed variable-length data address generator

The hardware block diagram of the variable-length data address generator is shown in Fig. 2.4. In the figure, the “carry-in controller” adds the carry-out signal of the butterfly counter either to the LSB of the stage counter or to the bit immediately to its left, so that the step of the stage counter alternates between one and two; the “comparator” compares the stage counter content with the maximum stage count for each FFT length and resets all counters when they are equal; the input signal “mode select” controls the butterfly counter step and the maximum stage count to vary the FFT length; and the “SIB MUX array” denotes the shift-insertion-bypass multiplexer array. The SIB MUX array greatly simplifies the address generator and is detailed below with Fig. 2.5 and Fig. 2.6.

[Fig. 2.5: module MUX_n with inputs labelled “Insert symbol 0”, “Insert symbol 1”, “Bypass”, and “Shift 1”, selected by the control signal MUX_con_n.]

Fig. 2.5 Block diagram of MUX_n module

[Fig. 2.6: modules MUX_0 to MUX_12, controlled by MUX_con_0 to MUX_con_12, take Butterfly counter [10:0] and Symbol [1:0] as inputs and produce Data address [12:0].]

Fig. 2.6 The architecture of Shift-insert-bypass MUX array

In our design, we define several functions that replace the area-consuming barrel shifters with much simpler multiplexers. The functions are the required left-shift operation, the symbol-bit insertion operations, and the bypass of the remaining bits, which together realize the variable-length data address generation algorithm. The detailed architecture of the shift-insertion-bypass multiplexer array is shown in Fig. 2.6. In the figure, the input signal “symbol” to each MUX_n module can be 00, 01, 10, or 11; the block diagram of MUX_n is shown in Fig. 2.5, and the functions of the MUX_n output are explained in Table 2.3. Timing and area comparisons between the SIB-MUX array approach and the barrel shifter approach to the data address generator are shown in Table 2.4; the results were synthesized with the TSMC 0.25 µm standard cell library using Synopsys Design Analyzer.

Table 2.3 Output functions of the MUX_n.

Output                  Function
Insert symbol 0 (I0)    Select symbol bit 0 as the n-th bit of data address.
Insert symbol 1 (I1)    Select symbol bit 1 as the n-th bit of data address.
Bypass (BP)             Select the n-th bit of butterfly counter as the n-th bit of data address.
Shift 2 (S2)            Select the (n-2)-th bit of butterfly counter as the n-th bit of data address.
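The four output functions of Table 2.3 can be modelled bit by bit in Python. This is a behavioural sketch rather than the synthesized design: `sib_mux_array` and `insertion_controls` are illustrative names, and the control pattern (bypass below the insertion point, shift by two above it) is an assumption about how the MUX_con_n signals might be driven.

```python
COUNTER_BITS = 11  # Butterfly counter [10:0]
ADDRESS_BITS = 13  # Data address [12:0]


def sib_mux_array(counter: int, symbol: int, controls: list[str]) -> int:
    """Assemble a data address from one MUX selection per address bit."""
    address = 0
    for n, select in enumerate(controls):
        if select == "I0":    # insert symbol bit 0
            bit = symbol & 1
        elif select == "I1":  # insert symbol bit 1
            bit = (symbol >> 1) & 1
        elif select == "BP":  # bypass: counter bit n
            bit = (counter >> n) & 1
        elif select == "S2":  # shift 2: counter bit n-2
            bit = (counter >> (n - 2)) & 1
        else:
            raise ValueError(f"unknown MUX function: {select}")
        address |= bit << n
    return address


def insertion_controls(position: int) -> list[str]:
    """Hypothetical pattern: symbol lands on bits position+1 and position."""
    return (["BP"] * position + ["I0", "I1"]
            + ["S2"] * (ADDRESS_BITS - position - 2))


# All-ones counter with symbol 00 inserted at bits 5:4.
address = sib_mux_array(0b11111111111, 0b00, insertion_controls(4))
print(f"{address:013b}")  # 1111111001111
```

Each address bit needs only a four-way selection, so no barrel shifter is required: the insertion position is set entirely by the control pattern.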

Table 2.4 Comparison of DAG units.

                     SIB-MUX array    Barrel shifter (Fig. 2.3)
No. of cells         143              229
Total gate counts    169              352
Path delay           5.72 ns          7.14 ns
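The relative savings implied by Table 2.4 follow from simple arithmetic, sketched below in Python. The helper `reduction_percent` is illustrative, and the percentages are derived only from the table values.

```python
def reduction_percent(sib_mux: float, barrel: float) -> float:
    """Relative reduction of the SIB-MUX figure versus the barrel shifter."""
    return 100.0 * (barrel - sib_mux) / barrel


table_2_4 = {
    "No. of cells": (143, 229),
    "Total gate counts": (169, 352),
    "Path delay (ns)": (5.72, 7.14),
}
for metric, (sib_mux, barrel) in table_2_4.items():
    print(f"{metric}: {reduction_percent(sib_mux, barrel):.1f}% lower")
```

This gives roughly 37.6% fewer cells, 52.0% fewer gates, and a 19.9% shorter path delay for the SIB-MUX array.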
