JIT Compiler Optimizations - Design and Implementation of an Embedded Mixed-Mode Java Virtual Machine for ARM/Thumb Dual Instruction Set Processors

Chapter 2 Background

2.2 JIT Compiler Optimizations

Since JIT compilers perform compilation at run time, the constraint on compilation time is more severe than in traditional static compilers. As a result, only cost-effective optimization techniques are suitable for JIT compilation. Owing to the characteristics of Java, an optimization technique may have a different impact in a Java JIT compiler than in a traditional static C compiler. In this section, we discuss some common optimization techniques used in Java JIT compilers [15][16][17], and then discuss the different ranges of optimization.

2.2.1 Common Optimization Techniques

Constant Folding

The concept behind constant folding is to evaluate, at compile time, constant expressions, i.e. expressions whose operands are all known to be constant. After this simple transformation, each constant expression is replaced by its value, which saves the run-time computation of the expression. A simple example is shown in Figure 2-3.
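To make the transformation concrete, the following minimal Java sketch folds constants in a tiny expression tree. The names used here (ConstantFoldingSketch, Expr, Const, Var, Add, fold) are hypothetical and chosen only for illustration; they are not taken from any particular JIT compiler, and a real compiler would fold many more operators on its own intermediate representation.

```java
// Hypothetical illustration: constant folding over a tiny expression tree.
final class ConstantFoldingSketch {

    // An expression is either a constant, a variable, or an addition.
    static abstract class Expr { }

    static final class Const extends Expr {
        final int value;
        Const(int value) { this.value = value; }
    }

    static final class Var extends Expr {
        final String name;
        Var(String name) { this.name = name; }
    }

    static final class Add extends Expr {
        final Expr left;
        final Expr right;
        Add(Expr left, Expr right) { this.left = left; this.right = right; }
    }

    // Folds bottom-up: an addition whose operands are both constants
    // is replaced by a single constant holding the computed value.
    static Expr fold(Expr e) {
        if (e instanceof Add) {
            Add add = (Add) e;
            Expr l = fold(add.left);
            Expr r = fold(add.right);
            if (l instanceof Const && r instanceof Const) {
                return new Const(((Const) l).value + ((Const) r).value);
            }
            return new Add(l, r);
        }
        return e; // constants and variables are left unchanged
    }

    public static void main(String[] args) {
        Expr folded = fold(new Add(new Const(10), new Const(2)));
        System.out.println(((Const) folded).value); // prints 12
    }
}
```

An expression that contains a variable, such as y + 2, is left as an addition because its value is not known at compile time.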

// before constant folding
x = 10 + 2;

// after constant folding
x = 12;

Figure 2-3. A Constant Folding Example

Copy Propagation

Copy propagation is a transformation that replaces occurrences of a variable with the value it was copied from in an earlier copy assignment. Suppose a copy assignment has the form x = y for some variables x and y. Later uses of x can then be replaced with y, as long as no intervening instruction has changed the value of either x or y.

Figure 2-4 is an example in flowgraph form. The copy assignment is b = a, and the succeeding occurrences of b in the underlined expressions are replaced with a.

Figure 2-4. A Copy Propagation Example (a) Before Copy Propagation (b) After Copy Propagation
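The same idea can be sketched at the source level in Java. The class and method names below (CopyPropagationSketch, before, after) are hypothetical, and the code is not a reproduction of Figure 2-4; it only shows one computation before and after the uses of the copy are replaced.

```java
// Hypothetical illustration: the same computation before and after copy propagation.
final class CopyPropagationSketch {

    // Before: b is a copy of a, and later expressions still read b.
    static int before(int a, int c) {
        int b = a;      // copy assignment b = a
        int d = b + c;  // use of b
        int e = b * 2;  // use of b
        return d + e;
    }

    // After: every use of b is replaced with a, which is valid because
    // neither a nor b is reassigned between the copy and the uses.
    // The copy itself is now unused and can be removed by a later pass.
    static int after(int a, int c) {
        int d = a + c;
        int e = a * 2;
        return d + e;
    }

    public static void main(String[] args) {
        System.out.println(before(3, 4) == after(3, 4)); // prints true
    }
}
```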

Common Sub-expression Elimination (CSE)

Figure 2-5. An Example of CSE

The purpose of CSE is to reduce repeated computation by substituting an already available result for each expression that performs the same computation. Figure 2-5 gives a simple example. Two common derivatives of CSE are:

• Scalar Replacement

Array element accesses in a loop are replaced by temporary variables, when the array objects and the array indexes remain unchanged. See the example in Figure 2-6.

• Common Effective Address Generation

Successive array element accesses in a loop can be optimized by introducing a temporary that points to the first element. Other elements can then be accessed by using the temporary as the base address and the corresponding array indexes as offsets. See the example in Figure 2-6.

Figure 2-6. An Example of Scalar Replacement and Common Effective Address Generation
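As a rough source-level sketch, CSE and scalar replacement can be pictured in Java as follows. All names (CseSketch, beforeCse, afterCse, beforeScalar, afterScalar) are hypothetical, and the code is not a reproduction of Figures 2-5 or 2-6. Common effective address generation is not shown, because Java source code has no pointer arithmetic; that transformation exists only at the level of the generated machine code.

```java
// Hypothetical illustration of CSE and scalar replacement at the source level.
final class CseSketch {

    // Before CSE: the sub-expression (a + b) is computed twice.
    static int beforeCse(int a, int b, int c) {
        int x = (a + b) * c;
        int y = (a + b) - c;
        return x + y;
    }

    // After CSE: the available result t is reused for the second occurrence.
    static int afterCse(int a, int b, int c) {
        int t = a + b;
        int x = t * c;
        int y = t - c;
        return x + y;
    }

    // Before scalar replacement: arr[k] is loaded on every iteration,
    // although neither arr nor k changes inside the loop.
    static int beforeScalar(int[] arr, int k, int n) {
        int sum = 0;
        for (int i = 0; i < n; i++) {
            sum += arr[k] * i;
        }
        return sum;
    }

    // After scalar replacement: the element is loaded once into a temporary.
    // The guard keeps the original behavior when the loop body never runs,
    // so no exception is raised that the original code would not raise.
    static int afterScalar(int[] arr, int k, int n) {
        int sum = 0;
        if (n > 0) {
            int t = arr[k];
            for (int i = 0; i < n; i++) {
                sum += t * i;
            }
        }
        return sum;
    }

    public static void main(String[] args) {
        int[] arr = {5, 6, 7};
        System.out.println(beforeCse(1, 2, 3) == afterCse(1, 2, 3));              // prints true
        System.out.println(beforeScalar(arr, 1, 4) == afterScalar(arr, 1, 4));    // prints true
    }
}
```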

Exception Check Elimination

Some Java bytecode instructions have semantics that may raise exceptions. In an interpreter, such bytecode instructions are checked during interpretation to see whether an exception arises; if one does, the appropriate exception handler is invoked. In a JIT compiler, compiling these bytecode instructions likewise produces compiled code that performs the exception checks.

However, some of these checks are redundant and can be eliminated via careful analysis.

In short, exception check elimination saves unnecessary operations and also reduces code size. Null pointer check elimination and array bound check elimination are the most common such techniques used in Java JIT compilers.
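The effect of removing redundant checks can be sketched in Java by writing the checks out explicitly. In the hypothetical code below, nullCheck and boundsCheck merely simulate the checks that naively compiled code would perform before each array access, and checksPerformed counts them; none of these names belong to a real JIT compiler or JVM API.

```java
// Hypothetical illustration: explicit, simulated exception checks before and
// after redundant checks are eliminated.
final class CheckEliminationSketch {

    static int checksPerformed; // counts simulated checks, for illustration only

    static void nullCheck(Object ref) {
        checksPerformed++;
        if (ref == null) {
            throw new NullPointerException();
        }
    }

    static void boundsCheck(int index, int length) {
        checksPerformed++;
        if (index < 0 || index >= length) {
            throw new ArrayIndexOutOfBoundsException(index);
        }
    }

    // Naive code: every element access repeats a null check and a bound check.
    static int sumNaive(int[] a) {
        int sum = 0;
        nullCheck(a);                      // guards the a.length read below
        for (int i = 0; i < a.length; i++) {
            nullCheck(a);                  // redundant: a was checked and has not changed
            boundsCheck(i, a.length);      // redundant: the loop guarantees 0 <= i < a.length
            sum += a[i];
        }
        return sum;
    }

    // After elimination: a single null check remains before the loop.
    static int sumOptimized(int[] a) {
        int sum = 0;
        nullCheck(a);
        for (int i = 0; i < a.length; i++) {
            sum += a[i];
        }
        return sum;
    }

    public static void main(String[] args) {
        int[] data = {1, 2, 3, 4};
        checksPerformed = 0;
        sumNaive(data);
        System.out.println(checksPerformed); // prints 9
        checksPerformed = 0;
        sumOptimized(data);
        System.out.println(checksPerformed); // prints 1
    }
}
```

For an array of four elements, the naive version performs nine simulated checks and the optimized version performs one, while both return the same sum.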

Method Inlining

The idea of method inlining is to replace a method call with the expanded body of the called method. This optimization reduces method invocation overhead at the cost of code size expansion, and it can also expose more optimization opportunities. In object-oriented languages like Java, tiny methods such as class constructors and methods that access private variables are executed frequently. These methods spend more time on method invocation than on method body execution, so method inlining is useful under these circumstances. Moreover, because virtual method dispatch carries heavy overhead, virtual method calls may be inlined as well, although devirtualizing them requires further analysis.
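A source-level picture of inlining a tiny accessor is sketched below in Java. The names (InliningSketch, Point, getX, sumBefore, sumAfter) are hypothetical. Point is declared final here so that getX cannot be overridden, which is an assumption made to keep the example free of any devirtualization analysis.

```java
// Hypothetical illustration: a tiny accessor before and after inlining.
final class InliningSketch {

    static final class Point {
        private final int x;
        Point(int x) { this.x = x; }
        int getX() { return x; } // tiny method: the call costs more than the body
    }

    // Before: each iteration pays for a call to getX().
    static int sumBefore(Point[] points) {
        int sum = 0;
        for (Point p : points) {
            sum += p.getX();
        }
        return sum;
    }

    // After inlining (conceptually): the call is replaced by the body of
    // getX(), which is a direct read of the field.
    static int sumAfter(Point[] points) {
        int sum = 0;
        for (Point p : points) {
            sum += p.x;
        }
        return sum;
    }

    public static void main(String[] args) {
        Point[] points = {new Point(1), new Point(2), new Point(3)};
        System.out.println(sumBefore(points) == sumAfter(points)); // prints true
    }
}
```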

Strength Reduction and Machine Idioms

Strength reduction replaces an operation with a semantically equivalent one that is weaker but faster. A common case is using shift operators to multiply and divide integers by a power of 2. For example, x << 1 replaces x * 2, and x >> 2 can be used in place of x / 4 when x is known to be non-negative (for a negative x, an arithmetic right shift rounds toward negative infinity whereas Java integer division rounds toward zero, so a correction is needed). In a similar way, machine idioms are instructions or instruction sequences for a specific ISA that execute more efficiently than an equivalent sequence of instructions targeted at a more general architecture. A good example is that some architectures provide multiply-and-add instructions for faster execution.
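One caveat applies when the shift idiom is used for signed division in Java: an arithmetic right shift and integer division round differently for negative operands. The minimal sketch below, with hypothetical names (StrengthReductionSketch, mulBy2, divBy4NonNegative, divBy4), shows the plain shifts together with one well-known corrected sequence that adds a bias of 3 to negative values before shifting.

```java
// Hypothetical illustration: shifts in place of multiplication and division
// by a power of 2 on Java int values.
final class StrengthReductionSketch {

    // x << 1 equals x * 2 for every int, including on overflow (both wrap).
    static int mulBy2(int x) {
        return x << 1;
    }

    // x >> 2 equals x / 4 only when x is non-negative.
    static int divBy4NonNegative(int x) {
        return x >> 2;
    }

    // Corrected form valid for every int: (x >> 31) is -1 for negative x and
    // 0 otherwise, so ((x >> 31) >>> 30) is 3 for negative x and 0 otherwise.
    // Adding that bias makes the shift round toward zero, like x / 4.
    static int divBy4(int x) {
        return (x + ((x >> 31) >>> 30)) >> 2;
    }

    public static void main(String[] args) {
        System.out.println(-7 / 4);                 // prints -1
        System.out.println(divBy4NonNegative(-7));  // prints -2 (plain shift is wrong here)
        System.out.println(divBy4(-7));             // prints -1
    }
}
```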

2.2.2 Optimization Range

Conventionally, an optimization applied to a program is called "local" if it is performed by looking only at the statements in a basic block; otherwise, it is called "global" [18]. More specifically, "local" means that the optimization is applied within a basic block, while "global" means that it is applied within a function. Some optimization techniques can be applied at both the local and the global level. Global optimization invests more compilation time in advanced analysis, and can therefore produce compiled code of better quality.
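Since the local/global distinction rests on basic blocks, it may help to see how a linear instruction sequence is split into them. The Java sketch below uses the standard "leader" rules: the first instruction, every branch target, and every instruction that follows a branch each start a new basic block. The instruction model (Insn with a textual form and an optional branch target index) and all names are hypothetical simplifications, not the intermediate representation of any real compiler.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.TreeSet;

// Hypothetical illustration: finding basic block boundaries ("leaders").
final class BasicBlockSketch {

    // A simplified instruction: its text plus the index of its branch
    // target, or -1 if the instruction is not a branch.
    static final class Insn {
        final String text;
        final int target;
        Insn(String text, int target) { this.text = text; this.target = target; }
        boolean isBranch() { return target >= 0; }
    }

    // Returns the start index of every basic block, in ascending order.
    static List<Integer> leaders(List<Insn> code) {
        TreeSet<Integer> result = new TreeSet<>();
        if (!code.isEmpty()) {
            result.add(0);                     // rule 1: the first instruction
        }
        for (int i = 0; i < code.size(); i++) {
            Insn insn = code.get(i);
            if (insn.isBranch()) {
                result.add(insn.target);       // rule 2: the target of a branch
                if (i + 1 < code.size()) {
                    result.add(i + 1);         // rule 3: the instruction after a branch
                }
            }
        }
        return new ArrayList<>(result);
    }

    public static void main(String[] args) {
        List<Insn> code = Arrays.asList(
            new Insn("i = 0", -1),             // 0
            new Insn("if i > 5 goto 5", 5),    // 1
            new Insn("sum = sum + i", -1),     // 2
            new Insn("i = i + 1", -1),         // 3
            new Insn("goto 1", 1),             // 4
            new Insn("return sum", -1));       // 5
        System.out.println(leaders(code));     // prints [0, 1, 2, 5]
    }
}
```

In this example the four blocks start at instructions 0, 1, 2, and 5. A local optimization would look at one such block at a time, whereas a global optimization would consider all four together with the control flow between them.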

Local optimization may expand its range from a basic block to an extended basic block [19]. In contrast to single-entry, single-exit basic blocks, extended basic blocks are also single-entry but may have multiple exits, and therefore offer more opportunities for optimization. Research on high-performance architectures focuses on loop optimization. In fact, high-level loop structures can be recovered by identifying strongly connected components (SCCs) or regions in a low-level control flow graph.

Furthermore, interprocedural optimization is more aggressive because its range extends across functions, and it is therefore considered quite costly. In short, as the optimization range is enlarged from local to loop and global, or even interprocedural, the cost of analysis increases. For more detailed information, please refer to [19].
