Chapter 4 Probability-Based Static Scaling Optimization for Fixed Wordlength FFT
4.2 Number Scaling
4.2.1 Related works
It takes n+1 bits to preserve the exact result of an n-bit fixed-point addition or subtraction. Hence, one solution for avoiding overflow in a butterfly operation is to make the output wordlength one bit larger than the input wordlength [48]. However, increasing the wordlength has a number of drawbacks in FFT hardware implementation. First, a larger data storage unit (memory block or register file) is required, which increases both chip area and power consumption. Second, a longer wordlength lengthens the critical-path delay of the arithmetic logic, which is undesirable for high-throughput FFT designs. Most importantly, the wordlength is fixed in a memory-based FFT architecture, meaning that it cannot vary from stage to stage. Consequently, many number scaling approaches have been proposed to prevent a wordlength increase at the cost of a minor accuracy loss. They can be roughly divided into two categories: static scaling approaches and dynamic ones.
Oppenheim et al. [49] proposed a static scaling procedure that is widely adopted in today’s FFT hardware implementations. Since the maximum magnitude of the result increases by no more than a factor of 2 per butterfly stage, incorporating an attenuation of 1/2 at both inputs of a radix-2 butterfly unit (that is, increasing the integer part by one bit and decreasing the fractional part by one bit in a fixed-length word) completely eliminates output overflows. However, this approach degrades the output SQNR because the fractional part becomes shorter stage by stage, which causes larger truncation errors. The scaling method can be improved with only a slight modification: instead of performing number scaling at the input, incorporating an attenuation of 1/2 at the output of each stage, as shown in Figure 16, achieves a better overall SQNR.
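The butterfly of Figure 16 can be sketched in a few lines of Python. This is only an illustrative model, not the hardware described in this dissertation: the function names, the use of floating-point complex numbers, and the choice to keep the twiddle product in full precision before truncation are all assumptions made for the example.

```python
import math


def truncate(x, frac_bits):
    """Truncate a real value toward minus infinity onto a grid of 2**-frac_bits."""
    step = 2.0 ** -frac_bits
    return math.floor(x / step) * step


def butterfly_half_at_output(a, b, w, frac_bits):
    """Radix-2 butterfly with the 1/2 attenuation applied at the outputs.

    a, b      : complex inputs X_{m-1}[p] and X_{m-1}[q]
    w         : complex twiddle factor W_N^r
    frac_bits : number of fractional bits kept in the outputs (hypothetical parameter)
    """
    t = w * b                      # twiddle product, kept in full precision here
    p = (a + t) / 2                # X_m[p], scaled by 1/2 after the addition
    q = (a - t) / 2                # X_m[q], scaled by 1/2 after the subtraction

    def quantize(z):
        return complex(truncate(z.real, frac_bits), truncate(z.imag, frac_bits))

    return quantize(p), quantize(q)
```

Because the attenuation comes after the addition, the full-precision inputs take part in the sum and only the result is truncated. If both input magnitudes are at most 1 and |w| = 1, both output magnitudes before truncation are also at most 1.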
In [50], Ramakrishnan et al. concentrated on FFT designs for OFDM receivers. The authors exploited the fact that the input samples of OFDM follow a normal distribution to predict the possible output value range at each stage and then determined the scaling strategy accordingly. They suggested increasing the integer part by one bit every two stages, instead of every stage, for FFT designs used in OFDM. However, the input can vary from application to application, and is mostly assumed to be uniformly distributed in a typical FFT analysis [48]. Furthermore, our experimental results show that the approach presented in [50] works well only if the standard deviation of the normal distribution is within a specific range.
Therefore, instead of adopting the methods proposed in [49] and [50] directly, most designers try to find the optimized output number format for each stage through simulation if a better SQNR is expected. Typically, there are two options for determining the number format of a radix-2 butterfly stage: keeping it the same as at the previous stage, or moving one bit from the fractional part to the integer part.
However, when the number of stages (k) is big due to a large FFT size, it is virtually impossible to evaluate all feasible configurations (2^k) and then pick the best one through simulation. Consequently, designers usually empirically select a limited set of "better" candidate configurations, and choose the best one among them still through extensive time-consuming simulation.
Figure 16 A radix-2 butterfly unit with scaling by 1/2 at the output. (Inputs Xm-1[p] and Xm-1[q]; twiddle factor WN^r applied to Xm-1[q]; outputs Xm[p] and Xm[q], each scaled by 1/2.)
On the other hand, a dynamic scaling approach improves the output SQNR by means of a shared exponent. The BFP algorithm [1], which is one of the dynamic scaling methods, employs an intermediate buffer to store a block of output data, detects the maximum value, and then determines the exponent for that block of data. Though this method does achieve a better result than common static scaling approaches, the extra data buffer implies a notable increase in area. In addition, buffer access and exponent detection operations lengthen the processing latency and consume more power. Therefore, static scaling approaches are still much more commonly preferred for typical FFT hardware implementations.
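The shared-exponent idea can be illustrated with a minimal sketch. It is not the algorithm of [1]: the function names, the use of plain integers for the buffered block, and the "smallest right shift that makes every value fit" rule are assumptions chosen only to show the buffer, detect, and scale steps described above.

```python
def block_exponent(block, wordlength):
    """Return the smallest right shift that makes every buffered value fit
    into a signed two's-complement word of `wordlength` bits."""
    lo = -(1 << (wordlength - 1))
    hi = (1 << (wordlength - 1)) - 1
    shift = 0
    # The whole block must be inspected before the exponent is known,
    # which is why an intermediate buffer is needed.
    while any(not (lo <= (v >> shift) <= hi) for v in block):
        shift += 1
    return shift


def bfp_scale(block, wordlength):
    """Scale a block by its shared exponent; returns (mantissas, exponent)."""
    shift = block_exponent(block, wordlength)
    return [v >> shift for v in block], shift
```

In this sketch a block whose values already fit is left untouched (exponent 0), so no fractional bits are given up unless the data actually require it. That is the source of the SQNR advantage, and the buffering loop is the source of the extra area and latency.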
In this dissertation, we propose a fast probability-based static scaling optimization technique that provides a better output SQNR than existing static approaches and requires no simulation at all. It is as area-efficient as other static methods since none of them requires a dynamic scaling unit, yet it achieves roughly the same level of output quality as dynamic scaling approaches. For every butterfly stage, the proposed method precisely estimates the accuracy loss of each candidate number format due to possible saturation and truncation errors via static probability-based analysis and then picks the best one. Furthermore, our method works with various FFT sizes, FFT algorithms, wordlengths, and input signal distributions.
4.2.2 Motivation and problem definition
As mentioned, the approach proposed in [49] suggests increasing the bitwidth of the integer part by one at every radix-2 butterfly stage to avoid overflows. In this dissertation, the format of a fixed-point number is written as mbnf, denoting that the wordlength is m bits, the integer part is n bits, and the fractional part takes the remaining m-n bits. Note that an m-bit number can only represent 2^m different values no matter what the value of n is. Though the maximum magnitude of representable values doubles when n is increased by one, the bitwidth of the fractional part must be decreased by one at the same time since the wordlength m is fixed, which inevitably results in a precision loss. Take the radix-2 64-point FFT, which has log2 64 (i.e., 6) butterfly stages, as an example: if the input data is in 12b1f format, the final output format becomes 12b7f, in which only 5 bits are available for the fractional part.
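The trade-off between range and precision in the mbnf notation can be made concrete with a small helper. The sketch assumes two's-complement numbers whose sign bit is counted in the n-bit integer part; the function names are hypothetical.

```python
def format_info(m, n):
    """Describe the mbnf format: m-bit word, n-bit integer part (sign included).

    Returns (fractional bits, resolution, minimum value, maximum value).
    """
    frac_bits = m - n
    step = 2.0 ** -frac_bits           # weight of the least significant bit
    lo = -(2.0 ** (n - 1))             # most negative representable value
    hi = 2.0 ** (n - 1) - step         # most positive representable value
    return frac_bits, step, lo, hi


def format_after_stages(m, n, fft_size):
    """Format reached when one bit moves to the integer part at every
    radix-2 stage of an fft_size-point FFT (fft_size a power of 2)."""
    stages = fft_size.bit_length() - 1  # log2(fft_size)
    return m, n + stages
```

For the 64-point example, `format_after_stages(12, 1, 64)` gives (12, 7), and `format_info(12, 7)` reports 5 fractional bits, so the resolution coarsens from 2^-11 at the input to 2^-5 at the output.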
However, after a 12-hour simulation with about 20 million random sets of input data, the probability that an output value actually needs the seventh bit of the integer part is almost zero (i.e., the additional integer bit is almost never necessary). This fact implies that always moving a bit from the fractional part to the integer part at every butterfly stage may not be wise, since keeping more bits for the fractional part helps reduce the truncation errors and thus improves the final output SQNR.
Nevertheless, if a stage keeps its output number format the same as that of its previous stage, overflows might occur. In such cases, saturation logic is typically employed to reduce the overflow error. A saturation operation clamps an overflowed positive/negative value to the maximum/minimum value the number format can hold. For example, if the number format in use is 4b4f, then 0100 (4) + 0101 (5) = 1001 (-7), which is an overflow with an error of 9 – (-7) = 16. However, if saturation is applied, the result becomes 0111 (7) and the error is reduced to 9 – 7 = 2, which is much smaller than 16.
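The 4b4f example above can be reproduced with a short sketch that contrasts two's-complement wrap-around with saturation. The helper names are hypothetical, and values are treated as plain integers since 4b4f has no fractional bits.

```python
def wrap(value, bits):
    """Interpret `value` as a `bits`-wide two's-complement word (overflow wraps)."""
    value &= (1 << bits) - 1                       # keep only the low `bits` bits
    return value - (1 << bits) if value >= 1 << (bits - 1) else value


def saturate(value, bits):
    """Clamp `value` to the range of a `bits`-wide two's-complement word."""
    lo = -(1 << (bits - 1))
    hi = (1 << (bits - 1)) - 1
    return max(lo, min(hi, value))


exact = 4 + 5                                      # true sum, needs 5 bits
wrapped = wrap(exact, 4)                           # overflowed result without saturation
clamped = saturate(exact, 4)                       # result with saturation logic
wrap_error = exact - wrapped
saturation_error = exact - clamped
```

The same clamping applies on the negative side, where an overflowed sum is held at the minimum value of the format instead of wrapping to a positive number.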
Let us examine a few output number format configurations for the 256-point FFT design with an input number format of 12b1f. If the configuration suggested in [49] is used, the resultant SQNR is 35.39 dB by simulation. If the integer part is not increased at the output of the 8th stage and saturation is performed instead, the SQNR climbs to 37.03 dB. If the same is done at the 7th stage as well, the SQNR rises to 38.47 dB. However, if it is further done at the 2nd stage, the SQNR drops dramatically to 17.82 dB. These results indicate that the output number format of each stage must be determined carefully to achieve a better SQNR.
Conventionally, static scaling optimization methods, such as [50], rely on simulation to evaluate the performance of a configuration. However, it takes hours of simulation with only ten thousand sets of inputs just to evaluate one single configuration of the 8192-point FFT. Meanwhile, the radix-2 N-point FFT has log2 N stages, and the integer part can either be increased by one bit or not at each stage, which makes the total number of possible configurations 2^(log2 N) = N. That is, it would take years to evaluate all 8192 configurations of the 8192-point FFT. This is the prime reason that motivates us to develop a simulation-free scaling optimization technique, which is able to discover a near-optimal solution within only a few minutes.
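The size of the search space can be checked with a brief sketch. Each configuration is modeled as a tuple of per-stage choices (1 = move one bit to the integer part, 0 = keep the format and saturate). The function names and the `hours_per_config` parameter are hypothetical; the text above only states that one evaluation takes "hours".

```python
from itertools import product


def num_stages(fft_size):
    """Number of radix-2 stages, log2(fft_size), for a power-of-2 FFT size."""
    return fft_size.bit_length() - 1


def all_configurations(fft_size):
    """Lazily enumerate every per-stage choice: 2**log2(N) = N tuples in total."""
    return product((0, 1), repeat=num_stages(fft_size))


def exhaustive_search_years(fft_size, hours_per_config):
    """Rough wall-clock estimate for simulating every configuration in sequence."""
    return fft_size * hours_per_config / (24 * 365)
```

For the 8192-point FFT there are 13 stages and 8192 configurations, so even an assumed few hours per configuration adds up to years of simulation, which is why exhaustive evaluation is impractical.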
To conclude this section, the static scaling optimization problem for fixed-point FFT addressed in this dissertation is stated as follows: given the FFT size, the radix r (where r is a power of 2), a fixed wordlength, and the input probability distribution, determine the number format for the output of every stage statically (i.e., without the use of simulation) such that the overall SQNR is maximized.