
Chapter 1 Introduction

1.5 Thesis Organization

The rest of the thesis is organized as follows. Chapter 2 introduces the OFDM system used in our experiments, the difference between floating point and fixed point representations, the problem formulation, and three algorithms that are closely related to the proposed algorithm. In Chapter 3, the proposed algorithm is presented in detail.

Experimental results are shown in Chapter 4. Finally, conclusions are drawn in Chapter 5.

Chapter 2 Preliminary

2.1 The OFDM System

The OFDM system used in the case study is obtained from [13]. Figure 2.1, Figure 2.2 and Figure 2.3 depict the OFDM system blocks. Figure 2.1 depicts the transmitter architecture of the system, and Figure 2.2 depicts the receiver architecture. Figure 2.3 depicts the channel estimator block of the receiver architecture in detail.

Figure 2.1 Transmitter architecture of the OFDM system

Figure 2.2 Receiver architecture of the OFDM system

Figure 2.3 Channel estimator of the OFDM system

2.2 Floating Point and Fixed Point

There are two common numeric representations: floating point and fixed point. Floating point representation has a wider range, while fixed point representation has lower complexity in hardware implementation.

2.2.1 Floating Point

Figure 2.4 Floating point representation

The floating point representation is shown in Figure 2.4. The bit width of a floating point number comprises one sign bit, exponent bits and mantissa bits. The name floating point comes from the fact that the radix point can float, meaning that the radix point can be placed anywhere relative to the significant digits of the number.

To add or subtract two floating point numbers, they must first be brought to the same exponent by shifting the mantissa bits. Because the exponent bits have to be compared and the mantissa bits shifted, floating point arithmetic is more complicated than fixed point arithmetic in practical hardware implementation.

In Case 2.1 below, the mantissa of the second number is shifted right by two digits, and then the normal addition is performed.

Case 2.1 Two floating point addition
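The alignment step can be illustrated with a short sketch. The following Python snippet is not from the thesis; it assumes integer mantissas with base-2 exponents (value = mantissa * 2^exponent), and the shift by two digits mirrors the description of Case 2.1 above.

    def fp_add(m1, e1, m2, e2):
        # Add two floating point numbers given as (mantissa, exponent) pairs,
        # where the value is mantissa * 2**exponent (an illustrative format).
        if e1 < e2:
            m1, e1, m2, e2 = m2, e2, m1, e1   # ensure e1 >= e2
        m2 >>= (e1 - e2)    # shift the smaller number's mantissa right to align
        return m1 + m2, e1  # then perform the normal integer addition

    # The second mantissa is shifted right by two digits, as in Case 2.1;
    # the low bits discarded by the shift are the usual rounding loss.
    print(fp_add(0b1101, 5, 0b1011, 3))   # -> (15, 5)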

2.2.2 Fixed Point

Figure 2.5 Fixed point representation

The fixed point representation is shown in Figure 2.5. The radix point separates the integer part from the fraction part.

In contrast to floating point representation, the radix point of a fixed point number is already fixed: it cannot move once the bit widths of the integer part and the fraction part are determined.

Unlike floating point addition and subtraction, fixed point addition and subtraction simply operate on the aligned bits directly, with carries propagating between the fraction part and the integer part as in ordinary integer arithmetic.

Case 2.2 Two fixed point addition

Case 2.2 shows the addition of the same two numbers from Case 2.1 in fixed point representation.

Because addition in fixed point representation does not compare exponent bits or shift mantissa bits, it needs less hardware than floating point addition does.
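As a contrast to the floating point sketch above, a minimal Python sketch of fixed point addition follows. It is not from the thesis and assumes a 4-bit fraction part, with values stored as plain scaled integers.

    FRAC_BITS = 4   # assumed fraction bit width for illustration

    def to_fixed(x):
        # Quantize a real value to fixed point: a plain scaled integer.
        return int(round(x * (1 << FRAC_BITS)))

    def fixed_add(a, b):
        # Fixed point addition is ordinary integer addition: no exponent
        # comparison and no mantissa shifting, hence the cheaper hardware.
        return a + b

    s = fixed_add(to_fixed(2.75), to_fixed(1.5))
    print(s / (1 << FRAC_BITS))   # -> 4.25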

2.3 Problem Formulation

The bit width configuration of a system is a set of N bit widths, defined as a bit width vector:

W = \{w_1, w_2, \ldots, w_N\}.    (2.1)

Assume that the objective function f is defined as the sum of the hardware cost function c over every bit width:

f(W) = \sum_{k=1}^{N} c(w_k).    (2.2)

The error function p indicates the bit error rate and is constrained as below, where p_{req} is a constant giving the required error constraint:

p(W) \le p_{req}.    (2.3)

The lower bound bit width set is denoted by LB and the upper bound bit width set by UB. The lower bound and upper bound of each variable are also imposed as constraints:

w_{k\_LB} \le w_k \le w_{k\_UB}, \quad \forall k = 1, \ldots, N.    (2.4)

The complete bit width optimization problem can be stated as:

\min_{W \in \mathbb{I}^N} f(W) \quad \text{subject to} \quad p(W) \le p_{req}, \quad LB \le W \le UB.    (2.5)
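The formulation (2.1)-(2.5) can be written down directly in code. The Python sketch below is illustrative only: the unit cost per bit in f, the bound values, and P_REQ are placeholder assumptions, since no concrete cost function is fixed here.

    N = 3                 # number of variables (illustrative)
    LB = [3, 4, 2]        # lower bound bit widths (placeholder values)
    UB = [8, 8, 8]        # upper bound bit widths (placeholder values)
    P_REQ = 1e-3          # required error constraint p_req (placeholder)

    def f(W):
        # Objective (2.2): sum of the per-variable hardware costs c(w_k);
        # a unit cost per bit is assumed here for simplicity.
        return sum(W)

    def feasible(W, ber):
        # Constraints (2.3) and (2.4): the BER bound and the box bounds on W.
        return ber <= P_REQ and all(lo <= w <= hi
                                    for lo, w, hi in zip(LB, W, UB))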

2.4 Previous Works

S. Roy and P. Banerjee proposed an algorithm to determine the bit width in MATLAB [12]. Figure 2.6 shows the complete flow of the algorithm. First, the algorithm determines the integer bit width. Second, it generates the fixed point form of the MATLAB code. Third, it simulates the floating point and fixed point versions to record the error metric. Fourth, coarse optimization is executed to find a rough solution. Finally, fine optimization is executed to further minimize the bit width.

The Coarse_Optimize phase of the algorithm follows a divide-and-conquer approach to quickly move closer to the optimal set. It varies the fraction bit width to obtain a set of coarsely optimized fraction bit widths, which are all equal.

The accuracy of the output of an algorithm is less sensitive to some internal variables than to others. Hence, starting from the coarse optimal point, they apply the Fine_Optimize algorithm to the coarsely optimized variables. They do not perform finer optimization directly from the start because it would take a long time to converge. The Fine_Optimize algorithm tries to find the variable whose impact on the quantization error is the smallest and to reduce the fraction bit width of that variable as long as the error stays within the error metric (EM) constraint. It then finds the variable whose impact on the error is largest, increases its fraction bit width by one bit, and simultaneously reduces two bits in the variable with the smallest impact, thereby saving one bit overall. It performs such bit reductions iteratively until the EM constraint is exceeded.
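A minimal sketch of one such bit-trading iteration is given below. It is a reconstruction from the description above, not code from [12]; simulate_error, em_constraint, and the sensitivity list are hypothetical stand-ins.

    def fine_optimize_step(fbw, sensitivity, simulate_error, em_constraint):
        # One bit-trading iteration in the spirit of Fine_Optimize [12]:
        # give one fraction bit to the most error-sensitive variable and
        # take two bits from the least sensitive one, saving one bit overall.
        lo = min(range(len(fbw)), key=lambda i: sensitivity[i])  # smallest impact
        hi = max(range(len(fbw)), key=lambda i: sensitivity[i])  # largest impact
        trial = list(fbw)
        trial[hi] += 1
        trial[lo] -= 2
        if simulate_error(trial) <= em_constraint:
            return trial, True    # accept the one-bit overall reduction
        return fbw, False         # EM constraint exceeded: stop iterating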

However, the algorithm reduces complexity only by taking the total bit width as the objective function, whereas the actual impact of bit width on area depends on the functional unit.

K. Han et al. proposed sequential search for word length optimization [17].

The basic notion of sequential search is that each trial eliminates a portion of the region being searched. The sequential search method decides where the most promising areas are located, and continues in the most favorable region after each set of simulations.

The principles of sequential search in n dimensions can be summarized in the following four steps; a minimal sketch follows the list:

1. Select a set of feasible values for the independent variables, which satisfy the desired performance during one-variable simulation. This is a base point.

2. Evaluate the performance at the base point.

3. Choose feasible locations at which to evaluate the performance, and compare their performances.

4. If one point is better than the others, move to that point and repeat the search until the solution has been located to within the desired accuracy.
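The four steps map onto a simple search loop. The Python sketch below is an interpretation of the steps above rather than code from [17]; evaluate and neighbors are assumed interfaces, with evaluate returning a score where higher is better.

    def sequential_search(base, evaluate, neighbors, max_iter=100):
        # Steps 1-2: start from a feasible base point and evaluate it.
        best, best_score = base, evaluate(base)
        for _ in range(max_iter):
            moved = False
            for cand in neighbors(best):      # step 3: evaluate feasible locations
                score = evaluate(cand)
                if score > best_score:        # step 4: move to the better point
                    best, best_score, moved = cand, score, True
            if not moved:                     # located to within the desired accuracy
                break
        return best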

The sequential search considers only the error distortion and does not take the hardware complexity into account. Therefore, K. Han and B. L. Evans further proposed the complexity-and-distortion measure for bit width determination [16].

The complexity-and-distortion measure combines the hardware complexity measure with the error distortion measure through a weighting factor. In the objective function, both hardware complexity and error distortion are considered simultaneously. They normalize the hardware complexity and error distortion functions by multiplying them by the hardware complexity and error distortion weighting factors, respectively.

By setting the weighting factor between 0 and 1, the complexity-and-distortion method searches for optimum word lengths with a tradeoff between the pure hardware complexity measure and the pure error distortion measure.

The complexity-and-distortion measure method can reduce the number of iterations needed to search for the optimum word lengths because the error distortion sensitivity information is utilized. It finds word lengths that satisfy the required error constraint using fewer iterations than the complexity measure method. However, the resulting word lengths are not guaranteed to be optimal in terms of hardware complexity.
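A sketch of the weighted objective follows; it is a reconstruction of the description above, not code from [16], and the normalization constants are assumptions.

    ALPHA = 0.5   # weighting factor between 0 and 1 (illustrative choice)

    def cdm(W, complexity, distortion, c_norm, d_norm):
        # Complexity-and-distortion measure: both normalized terms enter one
        # weighted objective. ALPHA = 1 recovers the complexity-only measure
        # and ALPHA = 0 the distortion-only (sequential search) measure.
        return (ALPHA * complexity(W) / c_norm
                + (1.0 - ALPHA) * distortion(W) / d_norm)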

Chapter 3 Proposed Algorithm

3.1 Fixed Point Bit Width Determination

In digital systems, there are two numeric representations: floating point and fixed point. Floating point representation allocates one sign bit and fixed numbers of bits to the exponent and the mantissa. In fixed point representation, the bit width is divided into the integer part and the fraction part. When designers develop high-level algorithms, floating point formats are usually used because of their accuracy, since floating point representation can cover a very wide range of values. In hardware, however, floating point representation needs to normalize the exponents of the operands, which costs a lot of hardware. Floating point representation is therefore usually converted to fixed point representation to reduce the total hardware cost.

As mentioned above, fixed point representation is composed of the integer part and the fraction part. The number of bits assigned to the integer part is called the integer bit width (IBW), and the number of bits assigned to the fraction part is called the fraction bit width (FBW). The complete fixed point bit width is the sum of the IBW and the FBW.

Figure 3.1 Flow of bit width determination

The total bit width determination procedure is shown in Figure 3.1. First of all, the integer bit width is calculated to prevent overflow. Then, an iterative procedure is used to minimize the fraction bit width and thus reduce the total hardware cost.

The integer bit width has to be long enough to prevent overflow. By monitoring the signals of the system, the minimum and maximum values of the signals are obtained, from which the integer bit width can be derived:

\text{Integer bit width} = \lfloor \log_2 (\max(|MAX|, |MIN|)) \rfloor + 2.
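A direct Python transcription of this rule follows, assuming MAX and MIN are the monitored signal extrema; the floor makes the bit width an integer.

    import math

    def integer_bit_width(sig_min, sig_max):
        # IBW = floor(log2(max(|MAX|, |MIN|))) + 2: the floor term plus one
        # bit covers the magnitude, and the remaining bit holds the sign.
        peak = max(abs(sig_max), abs(sig_min))
        return math.floor(math.log2(peak)) + 2

    print(integer_bit_width(-3.2, 5.7))   # log2(5.7) ~ 2.51 -> 2 + 2 = 4 bits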

After assigning the integer bit width, there are three steps to determine the fraction bit width. First, a uniform fraction bit width is determined to serve as the upper bound of the algorithm. Second, the individual minimum bit width of every variable is calculated to serve as the lower bound. Finally, the bit width of each variable is fine tuned between the lower bound and the upper bound.

3.2 Upper Bound Determination

In order to accelerate the fraction bit width determination procedure, the uniform fraction bit width is calculated to be the upper bound. The uniform fraction bit width means that every variable has the same fraction bit width. A binary search approach is used to quickly obtain the uniform fraction bit width.

Upper-Bound-Determination ( UB, error_constraint ) begin
1. Set H to the highest bit width and L to the lowest bit width, M = ( H + L )/2;
2. Calculate the BER for all variables having M fraction bits;
3. while ( !(( BER < error_constraint ) and (0 ≤ M−L ≤ 1) and (0 ≤ H−M ≤ 1)) )
4.   if ( BER < error_constraint ) H ← M; else L ← M;
5.   M = ( H + L )/2; recalculate the BER with M fraction bits;
6. UB ← M;
end;

Figure 3.2 Upper bound of the algorithm

The uniform fraction bit width determination procedure is shown in Figure 3.2. The procedure repeats until the termination condition is met. We obtain a uniform bit width set, denoted by UB, which gives the upper bound of every variable; the upper bound of the jth variable is denoted by wj_UB. Because the procedure uses a binary search approach, finding the uniform fraction bit width does not take much time.

3.3 Lower Bound Determination

In order to minimize the total hardware cost, the minimum individual fraction bit width of each variable has to be determined while the other variables remain at the upper bound. The individual minimum fraction bit widths form the lower bound, and the fine tuning process will start from the lower bound.

Lower-Bound-Determination ( W, UB, LB, error_constraint ) begin
1. for ( i from 1 to N )
2.   for ( j from 1 to N )
3.     if ( j != i ) set WLB[j] to UB[j];
4.   LB[i] ← minimum_bit_width( WLB, error_constraint );
5. return LB;
end;

Figure 3.3 Lower bound of the algorithm

The lower bound determination procedure is shown in Figure 3.3. Each time, we choose one variable and use binary search to find its minimum fraction bit width while the other variables remain at the upper bound. We determine every variable in order and finally obtain a lower bound bit width set, denoted by LB. The lower bound of the jth variable is denoted by wj_LB.

Because the determination procedure also uses binary search for each minimum individual fraction bit width, the lower bound, which will be the starting point of the fine tuning process, is obtained quickly.

3.4 Fine Tuning Process

3.4.1 First Stage

There are two stages in the fine tuning process: the first stage increases the fraction bit width to meet the error constraint, and the second stage reduces the fraction bit width to lower the hardware complexity while staying under the error constraint.

In the first stage, after obtaining the lower bound and the upper bound of the fraction bit width, the bit width set in which all variables are set to their lower bound becomes the first candidate.

First of all, the candidate is simulated and its bit error rate is recorded. If the bit error rate of the candidate meets the error constraint, the candidate is the minimum bit width set and the fine tuning process terminates. If it does not, the candidate bit widths are not long enough to represent the values accurately and have to be increased.

Second, the bit width of one variable in the candidate is set to its upper bound at a time, while the other variables are left unchanged; each such bit width set is called a combination. All of these combinations are simulated, so there is one combination per variable.

Finally, if no combination meets the error constraint, the combination with the smallest ratio ( | △cost | / | △BER | ) is chosen as the new candidate. △BER denotes the difference in bit error rate between the combination and the candidate. The smallest ratio ( | △cost | / | △BER | ) means that the combination adds the smallest hardware cost for the same improvement in bit error rate.

The combination with the smallest ratio will not be simulated next time, because its variable is already at the upper bound and cannot be increased anymore. The procedure repeats from the second step, simulating the combinations until some combination meets the error constraint. The total flow is shown in Figure 3.4 below.

Figure 3.4 Flow of the first stage of fine tuning process

Figure 3.5 Simulated combinations for first stage of fine tuning process

W denotes the set of variables being tuned and WFS is initially an empty set. wj represents the bit width of the jth variable, and there are N variables in the system. The bit widths of the variables in W and WFS together form the candidate. We simulate |W| combinations, where |W| denotes the number of variables in the set. Figure 3.5 shows an example of simulating the |W| combinations the first time. We set the bit width of only one variable in W to the upper bound while the other variables keep the same bit width; every such bit width set is one combination. After simulating the |W| combinations, if no combination meets the error constraint, we choose the combination k having the smallest ratio ( | △cost | / | △BER | ). We set wk to the upper bound and move wk into WFS, which means wk cannot be increased anymore. Removing wk from W saves one simulation in the next iteration.
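The first stage can be summarized in the Python sketch below. It is a reconstruction of the description above, not the thesis's implementation; simulate_ber and cost are hypothetical interfaces.

    def fine_tune_first_stage(LB, UB, simulate_ber, cost, p_req):
        cand = list(LB)                      # start from the lower bound set
        W = set(range(len(LB)))              # variables not yet moved into W_FS
        cand_ber = simulate_ber(cand)
        while cand_ber > p_req and W:
            best_k, best_ratio = None, float("inf")
            for k in W:                      # one combination per variable in W
                comb = list(cand)
                comb[k] = UB[k]              # raise only w_k to its upper bound
                ber = simulate_ber(comb)
                if ber <= p_req:
                    return comb              # a combination meets the constraint
                ratio = abs(cost(comb) - cost(cand)) / max(abs(ber - cand_ber), 1e-12)
                if ratio < best_ratio:       # smallest |△cost| / |△BER|
                    best_k, best_ratio = k, ratio
            cand[best_k] = UB[best_k]        # that combination is the new candidate
            W.remove(best_k)                 # w_k joins W_FS: cannot grow further
            cand_ber = simulate_ber(cand)
        return cand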

3.4.2 Second Stage

After the first stage of the fine tuning process, the candidate already meets the error constraint. The second stage reduces the fraction bit width under the error constraint in order to minimize the hardware cost.

First, each variable in turn is set to half of the sum of its lower bound and its bit width in the candidate; each such bit width set is one combination. We simulate these combinations.

Second, if any combination meets the error constraint, the combination with the biggest ratio ( | △cost | / | △BER | ) is chosen, meaning that its variable saves the biggest hardware cost for the same bit error rate. For the combinations that do not meet the error constraint, the lower bound of the changed variable is updated to half of the sum of the original lower bound and the bit width of that variable in the combination. We repeat the procedure until the lower bound cannot be updated anymore.

Finally, if the bit width set of the candidate is equal to the lower bound, the bit width cannot be reduced any further. We terminate the second stage of the fine tuning process and obtain the final result.

Figure 3.6 Flow of the second stage of fine tuning process

The total flow of the second stage of the fine tuning process is shown in Figure 3.6.

Figure 3.7 Simulated combinations for second stage of fine tuning process

Subsequently, we simulate |W| combinations. If more than one combination meets the error constraint, we choose the combination k having the largest ratio ( | △cost | / | △BER | ) and set wk to (wk + wk_LB) / 2. The variables in combinations which meet the error constraint but have a smaller ratio than combination k keep the same bit width. If a combination does not meet the error constraint, we set the lower bound of its changed variable to half of the sum of the bit width of the variable and its lower bound. We then check whether any variable is equal to its lower bound; if so, we remove the variable from W and put it into WLB.

Finally, we check whether W is an empty set. If W is empty, the bit widths cannot be reduced anymore and we terminate the process. Otherwise, we repeat the procedure until every variable in W reaches its lower bound.
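A sketch of the second stage follows, under the same hypothetical simulate_ber and cost interfaces as the first-stage sketch; the floor and ceiling of the midpoint reproduce the rounding behavior in the worked example of Figure 3.10 below.

    def fine_tune_second_stage(cand, LB, simulate_ber, cost, p_req):
        cand, lb = list(cand), list(LB)
        W = {k for k in range(len(cand)) if cand[k] > lb[k]}
        while W:
            cand_ber = simulate_ber(cand)
            passing = {}
            for k in W:
                comb = list(cand)
                comb[k] = (cand[k] + lb[k]) // 2        # midpoint, rounded down
                ber = simulate_ber(comb)
                if ber <= p_req:                        # meets the error constraint
                    ratio = abs(cost(cand) - cost(comb)) / max(abs(ber - cand_ber), 1e-12)
                    passing[k] = (comb[k], ratio)
                else:                                   # raise w_k's lower bound
                    lb[k] = (cand[k] + lb[k] + 1) // 2  # midpoint, rounded up
            if passing:                                 # largest |△cost| / |△BER| wins
                k = max(passing, key=lambda i: passing[i][1])
                cand[k] = passing[k][0]
            W = {k for k in W if cand[k] > lb[k]}       # variables at their LB join W_LB
        return cand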

Figure 3.8 Example of the first step of fine tuning first stage

The first stage of the fine tuning process is shown in Figure 3.8. The figure indicates that no combination meets the error constraint, so the combination with the smallest ratio ( | △cost | / | △BER | ) is picked as the new candidate. The procedure repeats because the bit error rate of the new candidate does not meet the error constraint.

Figure 3.9 Example of the second step of fine tuning first stage

Figure 3.9 shows a combination meeting the error constraint in the first stage of the fine tuning process. After the first choice of combination, only two combinations remain, and they are simulated individually. Of these two combinations, only the first meets the error constraint. Even though the second combination has the smaller ratio ( | △cost | / | △BER | ), it does not meet the error constraint, so the first combination is chosen as the new candidate and the process moves on to the second stage of fine tuning.

Figure 3.10 Example of the first step of fine tuning second stage

The second stage of the fine tuning process is shown in Figure 3.10. The lower bounds of the three variables are 3, 4 and 2, so the bit width of the first variable in the first combination is

(3 + 6) / 2 = 4.5.

Because the bit width must be an integer, the result is rounded down, and the bit width of the first variable in the first combination is 4.

The first combination is chosen as the new candidate and the process repeats, because it is the only combination that meets the error constraint. The second combination does not meet the error constraint, so the lower bound of the second variable is updated to 5.

Figure 3.11 Example of the second step of fine tuning second stage

The end of the fine tuning process is shown in Figure 3.11. The bit widths in the combinations are calculated in the same way. The two combinations are simulated, and neither of them meets the error constraint.

Since the combinations do not meet the error constraint, the lower bound is updated. The new lower bound is equal to the candidate, which means the bit width cannot be reduced any further, so the fine tuning process terminates and the candidate becomes the final result.
