
Chapter 3 Design of BIBITS Bus Encoding

3.2 Modified Register Relabeling for BIBITS Bus Encoding Scheme

Building on the BIBITS encoding scheme, power consumption can be reduced further. This section introduces an additional technique, modified register relabeling, for that purpose. The idea comes from observing program-execution traces. When a program runs, the processor often executes the same sequence of instructions repeatedly. Such a sequence is known as a loop, and a loop contains one or more basic blocks.

Because the same instructions recur, a small number of register pairs account for most occurrences. Therefore, the distribution of register pairs is highly skewed.

According to Figure 3-8, the best assignment that the original method can achieve for a register pair made of two different registers is a Hamming distance of one. However, many bit transitions remain when the frequency of the register pair is very high. If the Hamming distance of the register pair can be reduced further, from one to zero, many of these bit transitions are eliminated. We therefore propose a modified register relabeling method to address this problem. The difference between the proposed algorithm and the original register relabeling algorithm is that ours is designed to be combined with the BIBITS bus encoding scheme, so that the best assignment is a register pair whose Hamming distance is zero after encoding. For example, in Figure 3-8, R2 and R5 are relabeled as R2 and R3, which are one Hamming distance apart. This is a best-case assignment for the original register relabeling method. Our modified register relabeling method instead assigns an inverted register pair, R26 and R5. After the BIBITS bus encoding step, this register pair becomes R5 and R5, and its Hamming distance becomes zero.

Original Register Relabeling: R2 (00010), R5 (00101) → Relabeling → R2 (00010), R3 (00011). Bit transition = 1.

Modified Register Relabeling: R2 (00010), R5 (00101) → Modified Relabeling → R26 (11010), R5 (00101) → BIBITS encoding (Y = X′) → R5 (00101), R5 (00101). Bit transition = 0.

Figure 3-8: Comparison of original register relabeling and modified register relabeling for BIBITS bus encoding
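The arithmetic behind Figure 3-8 can be checked with a short sketch. It assumes 5-bit register fields and models the BIBITS step only as the bitwise complement Y = X′ applied to the first register of the pair. The function names are illustrative and do not come from the original design.

```python
WIDTH = 5                    # assumed width of a register field (R0..R31)
MASK = (1 << WIDTH) - 1      # 0b11111

def hamming(a: int, b: int) -> int:
    """Number of bit positions in which a and b differ."""
    return bin(a ^ b).count("1")

def invert(x: int) -> int:
    """Bitwise complement within the register field, i.e. Y = X'."""
    return x ^ MASK

original = (2, 3)                            # R2, R3 from original relabeling
modified = (26, 5)                           # R26, R5 from modified relabeling
encoded = (invert(modified[0]), modified[1]) # assumed effect of the BIBITS step

print(hamming(*original))   # 1
print(encoded)              # (5, 5)
print(hamming(*encoded))    # 0
```

Under this simplified model, the inverted pair costs no bit transitions, whereas the best original assignment still costs one.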

Figure 3-9(a) shows the register-pair frequencies of a sample program. The register histogram graph (RHG) captures the usage frequency of register pairs and the relations between them. Nodes of the RHG correspond to registers and literals, and each edge weight corresponds to the frequency of the pair it connects. The relabeling algorithm iterates through the edges starting from the most frequent ones. Figure 3-9(b) shows an example of an RHG.

[Figure 3-9 content: (a) a table with columns "Reg pair" and "frequency", including the pair (r5, r2); (b) a graph with register nodes such as r5.]

Figure 3-9: (a) Register-pair frequencies of a sample program (b) An example of a register histogram graph
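As a rough illustration of how an RHG could be built, the sketch below counts how often two different registers appear back to back in a stream of register names. The text does not define exactly how a register pair is extracted from the trace, so treating adjacent entries as pairs is an assumption, and `build_rhg` is a hypothetical helper name. Literal nodes are omitted.

```python
from collections import Counter

def build_rhg(reg_stream):
    """Return edge weights {(reg_a, reg_b): frequency} for adjacent registers.

    ASSUMPTION: a "register pair" is two different registers that follow
    each other in the same bus field; the pair is stored unordered.
    """
    edges = Counter()
    for a, b in zip(reg_stream, reg_stream[1:]):
        if a != b:                          # identical registers cause no transition
            edges[tuple(sorted((a, b)))] += 1
    return edges

rhg = build_rhg(["r5", "r2", "r5", "r2", "r7"])
for pair, freq in rhg.most_common():        # heaviest edges first
    print(pair, freq)
```

Sorting the edges by weight, as `most_common()` does, gives the order in which the relabeling algorithm visits them.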

The detailed modified register relabeling algorithm for BIBITS bus encoding iterates through the edges, starting from the highest-weight edge. For each edge i:

‧If neither node of edge i has been assigned a new register name, assign an inverted register pair Rx and R(31-x), such that H.D.(BIBITS(Rx, R(31-x))) = 0. If no inverted register pair is available, assign a register pair Rx and Ry such that H.D.(BIBITS(Rx, Ry)) is minimized.

‧If one node of edge i has been assigned Rx, assign Ry to the other node such that H.D.(BIBITS(Rx, Ry)) is minimized.

‧If both nodes of edge i have been assigned new register names, process the next edge.
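The steps above can be sketched as follows. This is a minimal sketch, not the original implementation: it assumes 32 registers, at most 32 nodes in the graph, and a simplified cost model in which BIBITS can transmit a field complemented, so that H.D.(BIBITS(Rx, Ry)) is the smaller of the plain and the inverted Hamming distance. The names `bibits_hd` and `relabel` are hypothetical.

```python
from itertools import permutations

NUM_REGS = 32            # R0..R31, as in the text
MASK = NUM_REGS - 1      # complementing a 5-bit name maps Rx to R(31-x)

def hamming(a, b):
    return bin(a ^ b).count("1")

def bibits_hd(x, y):
    # ASSUMPTION: simplified BIBITS model; a field may be sent inverted,
    # so the cost is the smaller of the plain and inverted distances.
    return min(hamming(x, y), hamming(x ^ MASK, y))

def relabel(edges):
    """edges: {(node_u, node_v): frequency}. Returns {node: new register number}."""
    mapping, free = {}, set(range(NUM_REGS))
    by_weight = sorted(edges.items(), key=lambda kv: kv[1], reverse=True)
    for (u, v), _freq in by_weight:
        if u == v or (u in mapping and v in mapping):
            continue                         # nothing left to assign for this edge
        if u not in mapping and v not in mapping:
            # An inverted pair (x, 31 - x) has cost 0, so it is chosen whenever
            # one is still free; otherwise the cheapest free pair is taken.
            x, y = min(permutations(sorted(free), 2),
                       key=lambda p: (bibits_hd(*p), p))
            mapping[u], mapping[v] = x, y
            free -= {x, y}
        else:
            known, new = (u, v) if u in mapping else (v, u)
            y = min(sorted(free),
                    key=lambda r: (bibits_hd(mapping[known], r), r))
            mapping[new] = y
            free.remove(y)
    return mapping

example = relabel({("r5", "r2"): 10, ("r2", "r7"): 4, ("r5", "r7"): 1})
print(example)
```

Ties are broken by the lowest register number, which is an arbitrary choice made here to keep the result deterministic.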

3.3 Basic Block Selection Algorithm

Our selection algorithm analyzes program-execution behavior. We count the bit toggles and the executions of each basic block.

The task of determining an optimal set of basic blocks for a given program is known to be NP-complete in the size of the program. However, many heuristics find near-optimal solutions to the problem, and most are quite similar. The key idea of the encoded basic-block selection algorithm is to select the most frequently executed basic blocks for encoding. We analyze the program-execution trace with this algorithm. First we identify every basic block; then we calculate how often each basic block occurs and how many bit transitions it causes on the bus. After the trace analysis, each basic block has three parameters: the number of instructions in the basic block (its length), its execution count, and its number of bit transitions.

Each basic block has a contribution value, measured as the product of its execution count and its number of bit transitions. The contribution ratio (CR) is the contribution value divided by the length of the basic block.
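The trace analysis might look like the following sketch. It assumes that each basic block is given as a list of instruction words (integers), that the trace is a list of executed block names, and that the bit transitions of a block are the Hamming distances between consecutive words inside it. Transitions across block boundaries are ignored. These modeling choices and the name `analyze` are assumptions for illustration.

```python
from collections import Counter

def hamming(a, b):
    return bin(a ^ b).count("1")

def block_toggles(words):
    """Bit transitions between consecutive instruction words of one block."""
    return sum(hamming(a, b) for a, b in zip(words, words[1:]))

def analyze(blocks, trace):
    """Return the three per-block parameters plus contribution and CR."""
    counts = Counter(trace)
    stats = {}
    for name, words in blocks.items():
        toggles = block_toggles(words)
        contribution = counts[name] * toggles        # execution count x bit toggles
        stats[name] = {
            "length": len(words),
            "exec_count": counts[name],
            "bit_toggles": toggles,
            "contribution": contribution,
            "cr": contribution / len(words),         # contribution ratio
        }
    return stats

# Hypothetical 4-bit instruction words, chosen only to keep the numbers small.
blocks = {"BB1": [0b0001, 0b0011, 0b0000], "BB2": [0b1111, 0b0000]}
stats = analyze(blocks, ["BB1", "BB2", "BB1"])
print(stats["BB1"])
```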

A greedy algorithm driven by the contribution ratio of each basic block selects which basic blocks to encode. The selection algorithm is as follows. Suppose that there is a set BB = {1, 2, ..., n} of n basic blocks in the program. Each basic block has a contribution value normalized by its length (listed as CR in Table 3-2):

\[ \text{Contribution} = \frac{\text{ExecutionCounts} \cdot \text{BitToggles}}{\text{Length}} \qquad \text{(EQ1)} \]

Assume that the transformation table size is S. The greedy basic-block selector algorithm uses the following notation:

TT: Transformation Table
BB: Basic Block
TD(BB{i}): Transformation data of basic block i

[Algorithm listing: Greedy Basic Block Selector, over n, CR, and TT]

For example, the algorithm is illustrated in Figure 3-10.

Figure 3-10: An example of the greedy basic-block selector algorithm

Table 3-2: Computing the contribution ratio of each basic block

        Frequency   Bit Toggles   Length   Contribution   CR
BB1     3           43            4        129            32.25
BB2     2           55            4        110            27.5
BB3     2           30            3        60             20

After analyzing the trace of program execution, the set BB has three elements: Basic-Block 1, Basic-Block 2, and Basic-Block 3. The parameters of these three basic blocks are shown in Table 3-2. The selection priority is BB 1 first, then BB 2, and BB 3 last.
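The selection order can be reproduced with a small greedy sketch using the values from Table 3-2. How much of the transformation table a block occupies is not stated in the surviving text, so the sketch assumes one table entry per instruction of a selected block. The class and function names are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class BasicBlock:
    name: str
    exec_count: int     # "Frequency" column of Table 3-2
    bit_toggles: int
    length: int         # number of instructions

    @property
    def contribution(self) -> int:
        return self.exec_count * self.bit_toggles

    @property
    def cr(self) -> float:
        return self.contribution / self.length

def select_blocks(blocks, table_size):
    """Greedily pick blocks in decreasing CR order while they fit in the table.

    ASSUMPTION: a selected block consumes `length` entries of the
    transformation table, whose capacity is `table_size` (S in the text).
    """
    selected, used = [], 0
    for bb in sorted(blocks, key=lambda b: b.cr, reverse=True):
        if used + bb.length <= table_size:
            selected.append(bb.name)
            used += bb.length
    return selected

table_3_2 = [
    BasicBlock("BB1", 3, 43, 4),
    BasicBlock("BB2", 2, 55, 4),
    BasicBlock("BB3", 2, 30, 3),
]
print([(b.name, b.contribution, b.cr) for b in table_3_2])
print(select_blocks(table_3_2, table_size=8))   # ['BB1', 'BB2']
```

With a table of 8 entries, BB 1 and BB 2 fill the table and BB 3 is left unencoded. A larger table would admit all three in the priority order given above.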
