Introduction - 藉由暫存器配置與指派演算法減少程式碼大小於混合寬度指令集架構處理器

In the increasing market of embedded systems, RISC processors have been used widely.

A RISC processor usually offers higher computation power and lower hardware cost and, meanwhile, suffers from the less code density than a CISC processor because of its fixed-width instruction set. However, code size is one of the major issues in embedded systems, since the larger code size may increase the memory requirement. As a result, mixed-width RISC instruction set architectures (ISAs) have been proposed to make a good tradeoff between performance and code density (i.e. low code size) [1]. Moreover, the traffic of the memory data bus for fetching instructions and the I-Cache miss rate may also be reduced.

There are several mixed-width ISAs provided commercially, for examples, ARM’s ARM/Thumb ISA, MIPS’ MIPS/MIPS16 ISA, Andes’ AndeStar ISA, etc [2-4]. They typically have one short width instruction format (S-Format) as a frequently used subset of the longer width instruction format (L-Format). For example, MIPS is a 32-bit width instruction set, and its 16-bit width subset is called MIPS16. Mixing the short width instructions into the original program which is composed with 32-bit instructions may improve the code density.

However there are two main limitations exists due to the S-Format instructions have fewer bits for register indexing and immediate value storing in mixed-width ISAs.

1. Fewer bits to index registers:

One of the limitations is that the short width instructions have fewer bits to index registers. For example, 3-bit register field in S-Format can access eight physical registers only. If all of the operands of an instruction are assigned to the registers that can be accessed by S-Format instructions, then this instruction is able

to be encoded as an S-Format instruction to reduce code size. Otherwise, if one of its operands is out of the register indexing range of S-Format instructions, then this instruction must be encoded as an L-Format instruction definitely. Accordingly, if the compiler does not take into account these restrictions while assigning registers, the translation rate of S-Format instructions may be quite low. Therefore, the assignment of registers becomes very important for mixed-width ISAs.

2. Fewer bits to hold immediate values:

The other limitation is the short width instructions have fewer bits to store immediate values. If the immediate value is oversized for an instruction’s S-Format then the instruction can only be encoded as L-Format instruction. Although large immediate values may impact the translation rate of S-Format instructions, it varies on how complier manages constants. If a compiler uses a constant pool to hold these large immediate values, the impact of immediate values can nearly be neglected.

In addition, the different mechanisms of mode switching between L-Format and S-Format instructions make problems distinct in mixed-width ISAs. There are two types of mechanisms for switching between L-Format and S-Format instructions [5]. Some architectures use a mode switching instruction to change modes between code segments with different encoding formats, for example, ARM/Thumb. It means that all instructions in the same code segment must be encoded in the same format as shown in Figure 1-1 (a). On the other hand, there are some architectures change modes by instruction encoding so that L-Format and S-Format instructions may be interleaved freely in routines as shown in Figure 1-1 (b), i.e., L-Format and S-Format instructions may be mixed up at the instruction level granularity. For example, AndeStar ISA uses a bit (usually the MSB) in instruction field to

indicate whether the instruction is L-Format or S-Format as shown in Figure 1-2. For the former, existing compilers either rely on user guidance or perform an analysis to determine which code segments should use S-Format [6], then a mode switch instruction will be inserted between the code segments, and finally the compiler compiles code segments with different instruction width by different policies. For the latter, because no mode-switch instruction is needed, the compiler should eliminate the limitations of each individual instruction of its S-Format as far as possible to increase the number of instructions encoded in S-Format.

However, the existing techniques for this kind of ISAs are still rudimentary.

Figure 1-1 – L/S-Format instructions in program code. (a) Mode-switch by mode-switch instruction. (b) Mode-switch by instruction encoding.

Figure 1-2 – Mode-switch by instruction encoding.

BB1:

1.1 Research Motivation

So far we have introduced the mixed-width ISAs, which can increase code density if the registers are used carefully especially for those with mode-switch by instruction encoding.

Also, we know that the code size problem is one of the major issues in embedded systems.

The larger code size needs the larger memory, and thus may consume more power.

Unfortunately, the enlarging program size due to the requirement of more program functionalities in modern embedded applications is happening. For these reasons, using mixed-width ISAs is a feasible approach for code size reduction.

In order to reduce program code size for a mixed-width ISA with mode-switch by instruction encoding, registers should be allocated and assigned properly to eliminate each instruction’s limitation of translation to S-Format instructions, and, as in results, the number of instructions that can be encoded as S-Format may be increased. However, the existing techniques of compilers for mixed-width ISA with mode-switch by instruction encoding are rudimentary. Therefore, a proper register allocation and assignment algorithms should be designed for this kind of ISAs.

1.2 Research Objective

In this thesis, we proposed an algorithm for mixed-width ISA with mode-switch by instruction encoding to increase the number of instructions encoded as S-Format by allocating and assigning registers properly. The original goal of register allocator is to allocate virtual variables to registers or memory locations and optimize for generating fewest memory referenced instructions (spill codes). However, for the mixed-width ISAs, it should consider

the mapping of physical registers and virtual variables, and, meanwhile, the number of spill codes should be minimized due to the performance issue. To achieve the features mentioned, there are two main goals to accomplish:

1. Reducing code size:

To reduce code size by mixed-width ISA with mode-switch by instruction encoding, the usage of S-Format instructions is the key point. In other words, if we can encode more instructions as S-Format instructions, the code size will be reduced more. To achieve that, we propose a heuristic model in register assignment procedure. The proposed algorithm not only allocates virtual variables to registers or memory locations but also assigns registers by choosing virtual registers with the highest code size benefit to assign physical registers which are accessible by S-Format instructions.

2. Minimizing performance degradation:

Although our primary objective is to reduce code size, the number of spill codes generated by register allocator is critical, too. More spill codes lead to more performance degradation, and it also increases code size. To minimizing the number of spill codes, the proposed algorithm chooses variables with the lowest memory reference cost to spill while the required registers are more than the physical registers available.

1.3 Organization of this Thesis

The rest of this paper is organized as follows: Section 2 discusses more details of a typical mixed-width ISA and other related researches and algorithms; Section 3 gives the instruction formats and register classes defined in our algorithm and the detailed description of the proposed algorithm; Section 4 presents the experimental results and discussion follows.

Finally, Section 5 dedicates to the conclusions we draw and the future work planed.

在文檔中藉由暫存器配置與指派演算法減少程式碼大小於混合寬度指令集架構處理器 (頁 11-17)