JAVA TECHNOLOGIES - BACKGROUND AND RELATED WORK

CHAPTER 2 BACKGROUND AND RELATED WORK

2.1 JAVA TECHNOLOGIES

Programs written in traditional programming languages take a single form: a stand-alone executable. Java programs, by contrast, come in two flavors: a stand-alone application that runs as a separate unit, or an applet that runs inside an Internet browser. The life cycle of a traditional program is simple. A programmer writes a program, which may consist of several modules, all of which are linked at compile time. The compiler then converts the program into the machine language of the underlying platform. Java program modules, each consisting of one or more classes, are instead compiled independently into Java Virtual Machine (JVM) bytecode. At this stage the modules, called class files, can be exchanged and transferred across the network. A user loads a module into an implementation of the JVM, which may then load additional “.class” files as needed, either from the user's system or across the Internet. Only at this point are references between different modules resolved, in a dynamic linking step that a linker performs before the program starts running.
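The late loading and linking described above can be observed from ordinary Java code. The following is a minimal sketch, not part of the original work: the class name DynamicLoadingSketch and the helper loadAndInvoke are hypothetical, and only the standard reflection API (Class.forName, Class.getMethod, Method.invoke) is used.

```java
import java.lang.reflect.Method;

final class DynamicLoadingSketch {
    // Hypothetical helper: resolves a class by name at run time and
    // invokes a public static no-argument method on it.
    static Object loadAndInvoke(String className, String methodName) {
        try {
            // The JVM loads and links the class here if it has not done so yet.
            Class<?> cls = Class.forName(className);
            Method method = cls.getMethod(methodName);
            return method.invoke(null); // null receiver: the method is static
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("could not load or invoke " + className, e);
        }
    }

    public static void main(String[] args) {
        Object separator = loadAndInvoke("java.lang.System", "lineSeparator");
        System.out.println(System.lineSeparator().equals(separator));
    }
}
```

The class to load is named only by a string, so the reference to it is resolved when the program runs rather than when it is compiled.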

2.1.1 Java Bytecode Manipulation Methods

Java bytecode can be executed in one of three ways: by an interpreter, by a Just-In-Time (JIT) compiler, or by a Java processor. Each approach connects the virtual machine to the actual machine on which Java software runs.

A Java interpreter, like a translator, converts Java bytecodes into native code on the fly (at run time). The interpreter must process the same code again each time it is executed while a Java program is running. Interpretation is simple, does not require much memory, and is relatively easy to implement on any processor. However, it involves a time-consuming loop that translates every Java bytecode, and thus degrades performance significantly.
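The fetch-decode-execute loop that makes interpretation slow can be sketched as follows. This is an illustrative toy, not a real JVM: it handles only four opcodes (bipush, iadd, imul, ireturn, with their standard opcode values), and the class name TinyInterpreter and the fixed 16-slot stack are assumptions made for the example.

```java
final class TinyInterpreter {
    // Standard JVM opcode values for the four instructions handled here.
    static final int BIPUSH = 0x10, IADD = 0x60, IMUL = 0x68, IRETURN = 0xAC;

    // Executes a bytecode sequence and returns the value left by ireturn.
    static int run(byte[] code) {
        int[] stack = new int[16]; // assumed operand stack depth for the sketch
        int sp = 0;                // next free stack slot
        int pc = 0;                // index of the next byte to fetch
        while (true) {
            int opcode = code[pc++] & 0xFF;       // fetch
            switch (opcode) {                     // decode and execute
                case BIPUSH:
                    stack[sp++] = code[pc++];     // push the signed byte operand
                    break;
                case IADD: {
                    int b = stack[--sp], a = stack[--sp];
                    stack[sp++] = a + b;
                    break;
                }
                case IMUL: {
                    int b = stack[--sp], a = stack[--sp];
                    stack[sp++] = a * b;
                    break;
                }
                case IRETURN:
                    return stack[--sp];
                default:
                    throw new IllegalArgumentException(
                        "unsupported opcode 0x" + Integer.toHexString(opcode));
            }
        }
    }

    public static void main(String[] args) {
        // bipush 2; bipush 3; iadd; bipush 4; imul; ireturn
        byte[] code = {0x10, 2, 0x10, 3, 0x60, 0x10, 4, 0x68, (byte) 0xAC};
        System.out.println(run(code)); // prints 20
    }
}
```

Every instruction passes through the switch each time it executes, which is the repeated decoding cost the paragraph above refers to.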

A Java Just-In-Time (JIT) compiler, like an interpreter, translates Java bytecodes into native code, but it does not have to translate the same code over and over again because it caches the native code. This can result in a significant speedup. However, a JIT compiler sometimes takes a long time to do its job, and the generated native code expands code size and consumes more memory.
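The caching idea can be illustrated with a small model. The sketch below does not generate native code; a lambda stands in for the translated method, and the class TranslationCacheSketch, its translate step, and the method names "square" and "negate" are all hypothetical. It shows only that translation happens once per method while calls happen many times, and that the cache itself occupies memory.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.IntUnaryOperator;

final class TranslationCacheSketch {
    // Cache of already translated methods, keyed by method name.
    private final Map<String, IntUnaryOperator> cache = new HashMap<>();

    // Counts how often the (assumed expensive) translation step runs.
    int translations = 0;

    // Stand-in for bytecode-to-native translation of a single method.
    private IntUnaryOperator translate(String methodName) {
        translations++;
        switch (methodName) {
            case "square": return x -> x * x;
            case "negate": return x -> -x;
            default: throw new IllegalArgumentException("unknown method " + methodName);
        }
    }

    // Translates on first use only; later calls reuse the cached result.
    int call(String methodName, int argument) {
        return cache.computeIfAbsent(methodName, this::translate).applyAsInt(argument);
    }

    public static void main(String[] args) {
        TranslationCacheSketch jit = new TranslationCacheSketch();
        for (int i = 0; i < 1000; i++) {
            jit.call("square", i);
        }
        System.out.println(jit.translations); // prints 1
    }
}
```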

A Java processor executes Java bytecode natively, without the overhead of an interpreter or a JIT compiler. Running Java programs on a Java processor can therefore deliver high performance.

2.1.2 Java Class File Organization

Like any compiler, the Java compiler takes the source code of a program and translates it into executable code (for Java, JVM bytecode) and binary symbolic information. In a traditional system, this data is stored in an object file for later use or execution. In Java's case, it is placed into a separate “.class” file for each Java class or interface in the source code.

The Java class file is a precisely defined binary file for Java programs. Each Java class file represents a complete description of one Java class or interface. There is no way to put more than one class or interface into a single class file. The precise definition for the format of the class file ensures that any Java class file can be loaded and correctly interpreted by any Java virtual machine, no matter which system produced the class file or which system hosts the virtual machine.

The Java class file is a binary stream of 8-bit bytes. Data items are stored sequentially in the class file, with no padding between adjacent items. The lack of padding helps to keep class files compact. Items that occupy more than one byte are split into several consecutive bytes that appear in big-endian order. The class files follow a rigid five-part format as shown in Figure 2-1. Each class file begins with a magic number and version information, followed by a constant pool, a class descriptor header, fields, methods, and finally an extension area. Because of Java’s dynamically linked nature, each class file must contain a large amount of symbolic and typing information. This data informs the JVM about how to resolve internal and external class references, and also allows it to verify the security and integrity of classes.

[Figure 2-1 not reproduced. Surviving labels: File Header (signature, version); example field declaration: public static int i = 123;]

Fig. 2-1: Linear, record-based organization of a Java class file
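As a small illustration of the big-endian, unpadded layout, the following sketch reads the first eight bytes of a class file: the 4-byte magic number (0xCAFEBABE in every valid class file), then the 2-byte minor and major version numbers. The class name ClassFileHeader and its fields are hypothetical. java.nio.ByteBuffer is used because it reads multi-byte values in big-endian order by default.

```java
import java.nio.ByteBuffer;

final class ClassFileHeader {
    static final int MAGIC = 0xCAFEBABE;

    final int minor;
    final int major;

    private ClassFileHeader(int minor, int major) {
        this.minor = minor;
        this.major = major;
    }

    // Parses the magic number and version fields from the start of a class file.
    static ClassFileHeader parse(byte[] classBytes) {
        if (classBytes.length < 8) {
            throw new IllegalArgumentException("too short to be a class file");
        }
        ByteBuffer buf = ByteBuffer.wrap(classBytes); // big-endian by default
        if (buf.getInt() != MAGIC) {
            throw new IllegalArgumentException("bad magic number");
        }
        int minor = buf.getShort() & 0xFFFF; // unsigned 16-bit value
        int major = buf.getShort() & 0xFFFF; // unsigned 16-bit value
        return new ClassFileHeader(minor, major);
    }

    public static void main(String[] args) {
        byte[] header = {(byte) 0xCA, (byte) 0xFE, (byte) 0xBA, (byte) 0xBE, 0, 0, 0, 52};
        ClassFileHeader h = parse(header);
        System.out.println(h.major + "." + h.minor); // prints 52.0
    }
}
```

Because the items are stored back to back, the parser needs no alignment logic: each read simply continues where the previous one ended.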

Constant Pool and Class Descriptor

The constant pool of a class file is similar to the symbol table in a traditional object file; see Figure 2-1(a). The data within the constant pool is referenced primarily by other structures and code within the class file, and thus contains a wealth of additional information beyond the usual symbol names. The pool is treated as a one-dimensional array of slots, each containing one variable-length data item called a tag. The most common constant pool tags are strings, which are stored as Unicode characters in UTF-8 format and packed into bytes to save space.
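A string slot can be decoded with very little code. The sketch below assumes the standard layout of a string entry: a one-byte tag with value 1, a two-byte big-endian length, and then the encoded bytes. DataInputStream.readUTF reads exactly such a length-prefixed string in the modified UTF-8 encoding. The class name Utf8EntrySketch and the method readUtf8Entry are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

final class Utf8EntrySketch {
    static final int TAG_UTF8 = 1; // tag value of a string entry

    // Decodes one string entry: tag byte, 2-byte length, encoded characters.
    static String readUtf8Entry(byte[] entry) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(entry))) {
            int tag = in.readUnsignedByte();
            if (tag != TAG_UTF8) {
                throw new IllegalArgumentException("not a string entry, tag = " + tag);
            }
            return in.readUTF(); // reads the length prefix, then the characters
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        byte[] entry = {1, 0, 4, 'm', 'a', 'i', 'n'};
        System.out.println(readUtf8Entry(entry)); // prints main
    }
}
```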

The constant pool also integrates the aspects of traditional import/export and relocation tables. There are a number of special linking tags, which simply contain the indices of other pool slots. These tags (such as CLASS, METHOD, and NAMETYPE) are used to dynamically link Java classes. For example, a METHOD tag points to a CLASS tag (to specify an imported class), as well as a NAMETYPE tag (to identify a specific method in that class). Linking tags are also directly referenced by the bytecode of the class as a dynamic relocation table.
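The index-chasing described above can be modeled in a few lines. The sketch below is a simplified in-memory model, not a class file parser: the nested types Utf8, ClassRef, NameType, and MethodRef are hypothetical stand-ins for the string, CLASS, NAMETYPE, and METHOD tags, and slot 0 is left unused on the assumption that pool indices start at 1.

```java
final class LinkingTagSketch {
    static final class Utf8 {
        final String text;
        Utf8(String text) { this.text = text; }
    }
    static final class ClassRef {
        final int nameIndex; // slot of the class name string
        ClassRef(int nameIndex) { this.nameIndex = nameIndex; }
    }
    static final class NameType {
        final int nameIndex, descriptorIndex; // slots of member name and type strings
        NameType(int nameIndex, int descriptorIndex) {
            this.nameIndex = nameIndex;
            this.descriptorIndex = descriptorIndex;
        }
    }
    static final class MethodRef {
        final int classIndex, nameTypeIndex; // slots of a ClassRef and a NameType
        MethodRef(int classIndex, int nameTypeIndex) {
            this.classIndex = classIndex;
            this.nameTypeIndex = nameTypeIndex;
        }
    }

    private static String text(Object[] pool, int index) {
        return ((Utf8) pool[index]).text;
    }

    // Follows the indices of a METHOD slot down to the strings it names.
    static String describe(Object[] pool, int methodIndex) {
        MethodRef method = (MethodRef) pool[methodIndex];
        ClassRef owner = (ClassRef) pool[method.classIndex];
        NameType nameType = (NameType) pool[method.nameTypeIndex];
        return text(pool, owner.nameIndex) + "." + text(pool, nameType.nameIndex)
                + ":" + text(pool, nameType.descriptorIndex);
    }

    public static void main(String[] args) {
        Object[] pool = {
            null,                            // slot 0 unused
            new Utf8("java/io/PrintStream"), // 1
            new ClassRef(1),                 // 2
            new Utf8("println"),             // 3
            new Utf8("(I)V"),                // 4
            new NameType(3, 4),              // 5
            new MethodRef(2, 5)              // 6
        };
        System.out.println(describe(pool, 6)); // prints java/io/PrintStream.println:(I)V
    }
}
```

Bytecode that calls the method would carry only the index 6; everything else is recovered from the pool at link time.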

Following the constant pool, the class descriptor consists of several fields related to the entire class; see Figure 2-1(b). These fields include the access flags of the class (public, private, and so on), as well as constant pool indexes to the class and its superclass. An array of constant pool indexes to any interfaces implemented by the class also appears here.
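The access flags are stored as a bit mask, so decoding them is a matter of testing individual bits. The sketch below uses the flag values that the JVM specification defines for classes; the class name AccessFlagsSketch and the decode method are hypothetical, and only a subset of the flags is shown.

```java
import java.util.ArrayList;
import java.util.List;

final class AccessFlagsSketch {
    // Class-level flag bits as defined by the JVM specification (subset).
    static final int ACC_PUBLIC    = 0x0001;
    static final int ACC_FINAL     = 0x0010;
    static final int ACC_SUPER     = 0x0020;
    static final int ACC_INTERFACE = 0x0200;
    static final int ACC_ABSTRACT  = 0x0400;

    // Returns the names of the flags that are set, in ascending bit order.
    static List<String> decode(int flags) {
        List<String> names = new ArrayList<>();
        if ((flags & ACC_PUBLIC) != 0)    names.add("public");
        if ((flags & ACC_FINAL) != 0)     names.add("final");
        if ((flags & ACC_SUPER) != 0)     names.add("super");
        if ((flags & ACC_INTERFACE) != 0) names.add("interface");
        if ((flags & ACC_ABSTRACT) != 0)  names.add("abstract");
        return names;
    }

    public static void main(String[] args) {
        System.out.println(decode(0x0021)); // prints [public, super]
    }
}
```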

Fields, Methods, and Attributes

Following the class descriptor, there are two arrays that describe fields (Figure 2-1(c)) and methods (Figure 2-1(d)). Both arrays have an identical structure, but they describe different types of class members. Each variable-length entry identifies the access flags, name, and signature of the member, as well as a list of associated “attributes”.

An attribute is a basic component of the class format: a special type of record that provides additional information in a more flexible format. For instance, each method descriptor contains a nested Code attribute that fully describes the actual bytecode for that method. Similarly, a field descriptor may contain a ConstantValue attribute, which points to a constant pool entry that describes a “static” constant in a class. In addition, several attributes are optional and are related to debugging. You can include your own attributes in class files to extend the class format without breaking existing code or the Java Virtual Machine.
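Custom attributes do not break existing readers because every attribute carries its own length. The sketch below assumes the standard attribute layout: a 2-byte constant pool index for the name, a 4-byte length, and then that many bytes of data. A reader can step over any attribute it does not recognize. The class AttributeSkipSketch, the method findAttribute, and the simplified String[] stand-in for the constant pool are hypothetical.

```java
import java.nio.ByteBuffer;

final class AttributeSkipSketch {
    // Scans an attribute list (2-byte count, then the attributes) for one name.
    // Returns the payload of the wanted attribute, or null if it is absent.
    static byte[] findAttribute(ByteBuffer buf, String[] poolStrings, String wanted) {
        int count = buf.getShort() & 0xFFFF;
        for (int i = 0; i < count; i++) {
            int nameIndex = buf.getShort() & 0xFFFF;
            int length = buf.getInt(); // sketch assumes the length fits in an int
            if (wanted.equals(poolStrings[nameIndex])) {
                byte[] info = new byte[length];
                buf.get(info);
                return info;
            }
            // Unknown attribute: skip its payload without interpreting it.
            buf.position(buf.position() + length);
        }
        return null;
    }

    public static void main(String[] args) {
        String[] pool = {null, "VendorNote", "ConstantValue"};
        ByteBuffer buf = ByteBuffer.allocate(19);
        buf.putShort((short) 2);                                  // two attributes
        buf.putShort((short) 1).putInt(3).put(new byte[] {9, 9, 9}); // custom attribute
        buf.putShort((short) 2).putInt(2).put(new byte[] {0, 7});    // ConstantValue
        buf.flip();
        byte[] info = findAttribute(buf, pool, "ConstantValue");
        System.out.println(info.length); // prints 2
    }
}
```

In the usage example the reader knows nothing about "VendorNote" yet still reaches the ConstantValue data that follows it.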

The Code attribute is especially important because it contains the actual Java bytecode (along with stack and local variable information). It also holds an exception table that describes the exception handlers of the method, and it can contain nested attributes for debugging, such as LineNumberTable and LocalVariableTable. The exceptions that a method may throw are listed in a separate Exceptions attribute, which is attached to the method entry that owns the Code attribute.

At the end of the class file (Figure 2-1(e)) is a separate section for other attributes that apply to the class as a whole. The SourceFile attribute is placed here by the Java compiler, and vendors are free to put additional attributes in this section as well. For instance, the Attributes section is a good place to put class authentication or security information, or perhaps revision control system data.

Reusing object fields in files to accelerate the execution efficiency of Java processors (pages 11-16)