Chapter 2 Background
2.2 JIT Compiler Optimizations
Since JIT compilers perform compilation at run time, the constraint on compilation time is more severe than in traditional static compilers. As a result, only cost-effective optimization techniques are suitable for JIT compilation. Due to the characteristics of Java, optimization techniques may have a different impact when applied in Java JIT compilers than in traditional static C compilers. In this section, we discuss some common optimization techniques used in Java JIT compilers [15][16][17], and then the different ranges of optimization.
2.2.1 Common Optimization Techniques
Constant Folding
The idea behind constant folding is to evaluate constant expressions, whose operands are all known to be constant, at compile time. After this simple transformation, each constant expression is replaced by its value, which saves the run-time computation of the expression. A simple example is shown in Figure 2-3.
// before constant folding
x = 10 + 2;
// after constant folding
x = 12;
Figure 2-3. A Constant Folding Example
Copy Propagation
Copy propagation is a transformation that replaces occurrences of a variable with the value it was copied from in an earlier copy assignment. A copy assignment has the form x = y, for some variables x and y. Later uses of x can then be replaced with y, as long as intervening instructions have not changed the value of either x or y.
Figure 2-4 is an example in flowgraph form. The copy assignment is b = a, and the succeeding occurrences of b in the underlined expressions are replaced with a.
Figure 2-4. A Copy Propagation Example (a) Before Copy Propagation (b) After Copy Propagation
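The transformation can also be sketched at the Java source level. The following is only an illustration under assumptions: the class and method names are hypothetical, and a real JIT compiler would perform the rewrite on its intermediate representation rather than on source code.

```java
// Illustrative sketch only: hand-written "before" and "after" forms of one method.
final class CopyPropagationSketch {

    // Before: b is a plain copy of a, and neither variable is reassigned afterwards.
    static int before(int a, int c) {
        int b = a;      // copy assignment b = a
        int d = b + c;  // use of b
        int e = b * d;  // another use of b
        return e;
    }

    // After: every use of b is replaced with a. The copy assignment is now
    // dead and could be removed by a later dead-code elimination pass.
    static int after(int a, int c) {
        int d = a + c;
        int e = a * d;
        return e;
    }

    public static void main(String[] args) {
        System.out.println(before(3, 4) + " " + after(3, 4));
    }
}
```

Both methods compute the same value for any inputs; the second form simply no longer needs the variable b.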
Common Sub-expression Elimination (CSE)
Figure 2-5. An Example of CSE
The purpose of CSE is to reduce repeated computation by substituting an already available result for each expression that performs the same computation. Figure 2-5 gives a simple example. Two common derivatives of CSE are:
• Scalar Replacement
Array element accesses in a loop are replaced by temporary variables, when the array objects and the array indexes remain unchanged. See the example in Figure 2-6.
• Common Effective Address Generation
Successive array element accesses in a loop can be optimized by introducing a temporary that points to the first element. Other elements can then be accessed by using the temporary as the base address and the corresponding array indexes as offsets. See the example in Figure 2-6.
Figure 2-6. An Example of Scalar Replacement and Common Effective Address
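As a rough source-level illustration, CSE and scalar replacement might look as follows in Java. The class and method names are hypothetical and the code is a sketch, not the form a JIT compiler actually works on. Common effective address generation is omitted because Java source has no pointer arithmetic in which to express it.

```java
// Illustrative sketch only: each pair shows a method before and after the rewrite.
final class CseSketch {

    // Before CSE: the sub-expression (a + b) is computed twice.
    static int cseBefore(int a, int b, int c) {
        int x = (a + b) * c;
        int y = (a + b) - c;
        return x + y;
    }

    // After CSE: (a + b) is computed once and its result is reused.
    static int cseAfter(int a, int b, int c) {
        int t = a + b;
        int x = t * c;
        int y = t - c;
        return x + y;
    }

    // Before scalar replacement: arr[k] is re-read on every iteration,
    // although neither arr nor k changes inside the loop.
    static int scalarBefore(int[] arr, int k, int n) {
        int sum = 0;
        for (int i = 0; i <= n; i++) {
            sum += arr[k] * i;
        }
        return sum;
    }

    // After scalar replacement: the element is loaded once into a temporary.
    // The guard keeps the behavior identical when the loop body never runs,
    // since the original would not touch arr[k] in that case.
    static int scalarAfter(int[] arr, int k, int n) {
        if (n < 0) {
            return 0;
        }
        int t = arr[k];
        int sum = 0;
        for (int i = 0; i <= n; i++) {
            sum += t * i;
        }
        return sum;
    }

    public static void main(String[] args) {
        int[] arr = {7, 9};
        System.out.println(cseBefore(2, 3, 4) + " " + cseAfter(2, 3, 4));
        System.out.println(scalarBefore(arr, 1, 5) + " " + scalarAfter(arr, 1, 5));
    }
}
```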
Exception Check Elimination
Java bytecode instructions have semantics that may raise exceptions. In an interpreter, such bytecode instructions are checked during interpretation to see whether an exception arises; if one does, the appropriate exception handler is invoked. In a JIT compiler, compiling these bytecode instructions likewise produces compiled code that performs the exception checks.
However, some of these checks are redundant and can be eliminated via careful analysis.
In short, exception check elimination saves unnecessary operations and also reduces code size. Null pointer check elimination and array bounds check elimination are the most common such techniques in Java JIT compilers.
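To make the redundancy concrete, the sketch below writes out by hand the checks that compiled code would otherwise perform implicitly on each array access. It is only a model under assumptions: the class and method names are hypothetical, and in real Java source the a[i] expression still carries its own implicit checks, which the JIT compiler decides whether to emit.

```java
// Illustrative sketch only: explicit checks stand in for compiler-generated ones.
final class CheckEliminationSketch {

    // Models naive compiled code: a null check and a bounds check per access.
    static int sumChecked(int[] a) {
        if (a == null) {
            throw new NullPointerException();
        }
        int n = a.length;
        int sum = 0;
        for (int i = 0; i < n; i++) {
            if (a == null) {                      // redundant: a was checked before the loop
                throw new NullPointerException();
            }
            if (i < 0 || i >= a.length) {         // redundant: the loop guarantees 0 <= i < n
                throw new ArrayIndexOutOfBoundsException(i);
            }
            sum += a[i];
        }
        return sum;
    }

    // Models the code after elimination: one null check remains before the loop,
    // and the loop bounds already prove that every index is in range.
    static int sumUnchecked(int[] a) {
        if (a == null) {
            throw new NullPointerException();
        }
        int n = a.length;
        int sum = 0;
        for (int i = 0; i < n; i++) {
            sum += a[i];
        }
        return sum;
    }

    public static void main(String[] args) {
        int[] data = {1, 2, 3};
        System.out.println(sumChecked(data) + " " + sumUnchecked(data));
    }
}
```

The two methods behave identically, including for a null argument, which is the condition that makes the elimination legal.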
Method Inlining
The idea of method inlining is to replace method calls with the bodies of the called methods. This optimization reduces method invocation overhead at the cost of code size expansion, and it can also expose more optimization opportunities. In object-oriented languages like Java, tiny methods such as class constructors and methods that access private variables are executed frequently. These methods spend more time on method invocation than on method body execution, so method inlining is useful under these circumstances. Moreover, considering the heavy overhead of devirtualization, virtual method calls may be inlined as well, although this certainly involves further analysis.
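The effect of inlining a tiny accessor can be sketched at the source level as follows. The Point class and the method names are hypothetical, chosen only for illustration; an actual JIT compiler inlines on its intermediate representation and has to confirm which method body the call resolves to.

```java
// Illustrative sketch only: the accessor call is expanded by hand.
final class InliningSketch {

    static final class Point {
        private final int x;

        Point(int x) {
            this.x = x;
        }

        int getX() {   // tiny accessor: the call costs more than the body
            return x;
        }
    }

    // Before inlining: one method invocation per loop iteration.
    static int sumBefore(Point[] points) {
        int sum = 0;
        for (Point p : points) {
            sum += p.getX();
        }
        return sum;
    }

    // After inlining: the body of getX() replaces the call. Direct access to
    // the private field compiles here because Point is nested in this class.
    static int sumAfter(Point[] points) {
        int sum = 0;
        for (Point p : points) {
            sum += p.x;
        }
        return sum;
    }

    public static void main(String[] args) {
        Point[] points = {new Point(1), new Point(2), new Point(3)};
        System.out.println(sumBefore(points) + " " + sumAfter(points));
    }
}
```

Once the call is gone, the loop body is plain field arithmetic, which is what opens the door to the further optimizations mentioned above.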
Strength Reduction and Machine Idioms
Strength reduction replaces an operation with a semantically equivalent one that is weaker but faster. A common case is using shift operators to multiply and divide integers by a power of 2. For example, x << 1 can replace x * 2, and x >> 2 can be used in place of x / 4 when x is known to be non-negative (for a negative x, Java's integer division rounds toward zero while an arithmetic shift rounds toward negative infinity). In a similar way, machine idioms are instructions or instruction sequences for a specific ISA that execute more efficiently than a similar sequence of instructions targeted at a more general architecture. A good example is that some architectures provide multiply-and-add instructions for faster execution.
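One detail is worth checking in code: in Java, integer division truncates toward zero, whereas the arithmetic right shift rounds toward negative infinity, so a bare shift matches division by a power of 2 only for non-negative operands. The sketch below uses hypothetical method names and shows a plain shift next to a biased shift that also handles negative values; it is an illustration, not a description of what any particular JIT compiler emits.

```java
// Illustrative sketch only: shift-based replacements for * 2 and / 4 on int.
final class StrengthReductionSketch {

    // Equivalent to x * 2 for every int, including on overflow wrap-around.
    static int timesTwo(int x) {
        return x << 1;
    }

    // Equivalent to x / 4 only when x >= 0.
    static int divFourNaive(int x) {
        return x >> 2;
    }

    // Equivalent to x / 4 for every int: negative values are biased by 3
    // (the divisor minus one) before shifting, so the result rounds toward zero.
    static int divFour(int x) {
        int bias = (x >> 31) >>> 30;  // 3 if x is negative, 0 otherwise
        return (x + bias) >> 2;
    }

    public static void main(String[] args) {
        System.out.println(-7 / 4);            // division truncates toward zero
        System.out.println(divFourNaive(-7));  // plain shift rounds down instead
        System.out.println(divFour(-7));       // biased shift matches the division
    }
}
```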
2.2.2 Optimization Range
Conventionally, an optimization applied to a program is called "local" if it is performed by looking only at the statements in a basic block; otherwise, it is called "global" [18]. More specifically, "local" means that the optimization is applied within a basic block, while "global" means that it is applied within a function. Some optimization techniques can be applied at both the local and global levels. Global optimization invests more compilation time in advanced analysis, and therefore generally leads to better compiled code quality.
Local optimization may expand its range from a basic block to an extended basic block [19]. In contrast to single-entry, single-exit basic blocks, extended basic blocks are also single-entry but may have multiple exits, and therefore offer more opportunities for optimization. Research on high-performance architectures focuses on loop optimization. In fact, high-level loop structures may be recovered by identifying strongly connected components (SCCs) or regions in a low-level control flow graph.
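The two notions used here, basic blocks and SCCs in a control flow graph, can be sketched in a few lines of Java. Everything in the sketch is an assumption made for illustration: the encoding of instructions as an array of branch targets, the class and method names, and the use of a simple transitive closure to find mutually reachable blocks, where a production compiler would more likely use a linear-time SCC algorithm.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeSet;

// Illustrative sketch only; the instruction and graph encodings are hypothetical.
final class ControlFlowSketch {
    static final int NO_BRANCH = -1;

    // Returns the indexes of basic-block leaders. branchTarget[i] is the index
    // instruction i may jump to, or NO_BRANCH if instruction i is not a branch.
    static List<Integer> leaders(int[] branchTarget) {
        TreeSet<Integer> leaders = new TreeSet<>();
        if (branchTarget.length > 0) {
            leaders.add(0);                      // the first instruction starts a block
        }
        for (int i = 0; i < branchTarget.length; i++) {
            if (branchTarget[i] != NO_BRANCH) {
                leaders.add(branchTarget[i]);    // a branch target starts a block
                if (i + 1 < branchTarget.length) {
                    leaders.add(i + 1);          // so does the instruction after a branch
                }
            }
        }
        return new ArrayList<>(leaders);
    }

    // reach[u][v] is true if block v is reachable from block u along one or more edges.
    static boolean[][] reachability(boolean[][] edge) {
        int n = edge.length;
        boolean[][] reach = new boolean[n][];
        for (int u = 0; u < n; u++) {
            reach[u] = edge[u].clone();
        }
        for (int k = 0; k < n; k++) {
            for (int u = 0; u < n; u++) {
                for (int v = 0; v < n; v++) {
                    if (reach[u][k] && reach[k][v]) {
                        reach[u][v] = true;
                    }
                }
            }
        }
        return reach;
    }

    // Two blocks belong to the same SCC when each can reach the other.
    static boolean sameScc(boolean[][] reach, int u, int v) {
        return u == v || (reach[u][v] && reach[v][u]);
    }

    public static void main(String[] args) {
        // 0: i = 0   1: if (i > 5) goto 5   2: sum += i   3: i++   4: goto 1   5: return
        int[] branchTarget = {NO_BRANCH, 5, NO_BRANCH, NO_BRANCH, 1, NO_BRANCH};
        System.out.println(leaders(branchTarget));

        // Resulting blocks: B0 = {0}, B1 = {1}, B2 = {2, 3, 4}, B3 = {5}
        boolean[][] edge = new boolean[4][4];
        edge[0][1] = true;
        edge[1][2] = true;
        edge[1][3] = true;
        edge[2][1] = true;
        boolean[][] reach = reachability(edge);
        System.out.println(sameScc(reach, 1, 2));  // B1 and B2 form the loop
        System.out.println(reach[0][0]);           // B0 is not part of any cycle
    }
}
```

In this small graph the only SCC with more than one block is {B1, B2}, which corresponds to the source-level loop.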
Furthermore, interprocedural optimization is more aggressive because its range extends across functions, and it is therefore considered costly. In short, as the optimization range is enlarged from local to loop and global, or even interprocedural, the cost of analysis increases. For more detailed information, please refer to [19].