


hardware and the relationship between high-level programming languages and this more primitive one. Our examples are in the C programming language; Section 2.15 shows how these would change for an object-oriented language like Java.

By learning how to represent instructions, you will also discover the secret of computing: the stored-program concept. Moreover, you will exercise your “foreign language” skills by writing programs in the language of the computer and running them on the simulator that comes with this book. You will also see the impact of programming languages and compiler optimization on performance. We conclude with a look at the historical evolution of instruction sets and an overview of other computer dialects.

We reveal our first instruction set a piece at a time, giving the rationale along with the computer structures. This top-down, step-by-step tutorial weaves the components with their explanations, making the computer’s language more palatable. Figure 2.1 gives a sneak preview of the instruction set covered in this chapter.

2.2 Operations of the Computer Hardware

Every computer must be able to perform arithmetic. The RISC-V assembly language notation

add a, b, c

instructs a computer to add the two variables b and c and to put their sum in a. This notation is rigid in that each RISC-V arithmetic instruction performs only one operation and must always have exactly three variables. For example, suppose we want to place the sum of four variables b, c, d, and e into variable a. (In this section, we are being deliberately vague about what a “variable” is; in the next section, we’ll explain in detail.)

The following sequence of instructions adds the four variables:

add a, b, c // The sum of b and c is placed in a
add a, a, d // The sum of b, c, and d is now in a
add a, a, e // The sum of b, c, d, and e is now in a

Thus, it takes three instructions to sum the four variables.

The words to the right of the double slashes (//) on each line above are comments for the human reader, so the computer ignores them. Note that unlike other programming languages, each line of this language can contain at most one instruction. Another difference from C is that comments always terminate at the end of a line.
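As a rough illustration, the three-instruction sequence above can be mirrored in C by limiting each statement to a single addition with one destination and two sources. The function name sum_four is an assumption made for this sketch and does not come from the text, and the C type long merely stands in for whatever a "variable" turns out to be.

```c
/* Sketch only: each statement corresponds to one three-operand add.
   sum_four is a hypothetical name used for illustration. */
long sum_four(long b, long c, long d, long e)
{
    long a;
    a = b + c;   /* add a, b, c : a holds b + c         */
    a = a + d;   /* add a, a, d : a holds b + c + d     */
    a = a + e;   /* add a, a, e : a holds b + c + d + e */
    return a;
}
```

Calling sum_four(1, 2, 3, 4) yields 10 after exactly three additions, one per instruction in the sequence.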

The natural number of operands for an operation like addition is three: the two numbers being added together and a place to put the sum. Requiring every instruction to have exactly three operands, no more and no less, conforms to the philosophy of keeping the hardware simple: hardware for a variable number of operands is more complicated than hardware for a fixed number. This situation illustrates the first of three underlying principles of hardware design:

Design Principle 1: Simplicity favors regularity.

stored-program concept: The idea that instructions and data of many types can be stored in memory as numbers and thus be easy to change, leading to the stored-program computer.

"There must certainly be instructions for performing the fundamental arithmetic operations."
Burks, Goldstine, and von Neumann, 1947

RISC-V operands

Name: 32 registers
Example: x0-x31
Comments: Fast locations for data. In RISC-V, data must be in registers to perform arithmetic. Register x0 always equals 0.

Name: 2^61 memory words
Example: Memory[0], Memory[8], …, Memory[18,446,744,073,709,551,608]
Comments: Accessed only by data transfer instructions. RISC-V uses byte addresses, so sequential doubleword accesses differ by 8. Memory holds data structures, arrays, and spilled registers.

RISC-V assembly language

Instruction                           Example            Meaning                     Comments

Category: Arithmetic
Add                                   add x5, x6, x7     x5 = x6 + x7                Three register operands; add
Subtract                              sub x5, x6, x7     x5 = x6 - x7                Three register operands; subtract
Add immediate                         addi x5, x6, 20    x5 = x6 + 20                Used to add constants

Category: Data transfer
Load doubleword                       ld x5, 40(x6)      x5 = Memory[x6 + 40]        Doubleword from memory to register
Store doubleword                      sd x5, 40(x6)      Memory[x6 + 40] = x5        Doubleword from register to memory
Load word                             lw x5, 40(x6)      x5 = Memory[x6 + 40]        Word from memory to register
Load word, unsigned                   lwu x5, 40(x6)     x5 = Memory[x6 + 40]        Unsigned word from memory to register
Store word                            sw x5, 40(x6)      Memory[x6 + 40] = x5        Word from register to memory
Load halfword                         lh x5, 40(x6)      x5 = Memory[x6 + 40]        Halfword from memory to register
Load halfword, unsigned               lhu x5, 40(x6)     x5 = Memory[x6 + 40]        Unsigned halfword from memory to register
Store halfword                        sh x5, 40(x6)      Memory[x6 + 40] = x5        Halfword from register to memory
Load byte                             lb x5, 40(x6)      x5 = Memory[x6 + 40]        Byte from memory to register
Load byte, unsigned                   lbu x5, 40(x6)     x5 = Memory[x6 + 40]        Byte unsigned from memory to register
Store byte                            sb x5, 40(x6)      Memory[x6 + 40] = x5        Byte from register to memory
Load reserved                         lr.d x5, (x6)      x5 = Memory[x6]             Load; 1st half of atomic swap
Store conditional                     sc.d x7, x5, (x6)  Memory[x6] = x5; x7 = 0/1   Store; 2nd half of atomic swap
Load upper immediate                  lui x5, 0x12345    x5 = 0x12345000             Loads 20-bit constant shifted left 12 bits

Category: Logical
And                                   and x5, x6, x7     x5 = x6 & x7                Three reg. operands; bit-by-bit AND
Inclusive or                          or x5, x6, x8      x5 = x6 | x8                Three reg. operands; bit-by-bit OR
Exclusive or                          xor x5, x6, x9     x5 = x6 ^ x9                Three reg. operands; bit-by-bit XOR
And immediate                         andi x5, x6, 20    x5 = x6 & 20                Bit-by-bit AND reg. with constant
Inclusive or immediate                ori x5, x6, 20     x5 = x6 | 20                Bit-by-bit OR reg. with constant
Exclusive or immediate                xori x5, x6, 20    x5 = x6 ^ 20                Bit-by-bit XOR reg. with constant

Category: Shift
Shift left logical                    sll x5, x6, x7     x5 = x6 << x7               Shift left by register
Shift right logical                   srl x5, x6, x7     x5 = x6 >> x7               Shift right by register
Shift right arithmetic                sra x5, x6, x7     x5 = x6 >> x7               Arithmetic shift right by register
Shift left logical immediate          slli x5, x6, 3     x5 = x6 << 3                Shift left by immediate
Shift right arithmetic immediate      srai x5, x6, 3     x5 = x6 >> 3                Arithmetic shift right by immediate

FIGURE 2.1 RISC-V assembly language revealed in this chapter. This information is also found in Column 1 of the RISC-V Reference Data Card at the front of this book.

We can now show, in the two examples that follow, the relationship of programs written in higher-level programming languages to programs in this more primitive notation.

Compiling Two C Assignment Statements into RISC-V

EXAMPLE

This segment of a C program contains the five variables a, b, c, d, and e. Since Java evolved from C, this example and the next few work for either high-level programming language:

a = b + c;
d = a - e;

The compiler translates from C to RISC-V assembly language instructions. Show the RISC-V code produced by a compiler.

ANSWER

A RISC-V instruction operates on two source operands and places the result in one destination operand. Hence, the two simple statements above compile directly into these two RISC-V assembly language instructions:

add a, b, c
sub d, a, e
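To make the "two sources, one destination" shape concrete, the following is a minimal sketch of a toy machine that executes such instructions over named variable slots. It is a hypothetical model written for this illustration, not RISC-V itself: the names op, var, instr, run, and two_assignments are all assumptions, and variables are simply entries in an array of long.

```c
#include <stddef.h>

/* Hypothetical toy model: every instruction names exactly one
   destination and two sources, as in "add a, b, c". */
enum op  { OP_ADD, OP_SUB };
enum var { VAR_A, VAR_B, VAR_C, VAR_D, VAR_E, VAR_COUNT };

struct instr {
    enum op  op;
    enum var dst, src1, src2;
};

/* Execute n instructions in order, updating vars in place. */
void run(const struct instr *prog, size_t n, long vars[VAR_COUNT])
{
    for (size_t i = 0; i < n; i++) {
        long x = vars[prog[i].src1];
        long y = vars[prog[i].src2];
        vars[prog[i].dst] = (prog[i].op == OP_ADD) ? x + y : x - y;
    }
}

/* a = b + c; d = a - e; expressed as two three-operand instructions. */
static const struct instr two_assignments[] = {
    { OP_ADD, VAR_A, VAR_B, VAR_C },   /* add a, b, c */
    { OP_SUB, VAR_D, VAR_A, VAR_E },   /* sub d, a, e */
};
```

With b = 3, c = 4, and e = 2, running the two instructions leaves a = 7 and d = 5. Because every instruction has the same fixed shape, the loop body needs no special cases, which is the regularity the fixed operand count is meant to buy.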

Instruction                           Example            Meaning                     Comments

Category: Conditional branch
Branch if equal                       beq x5, x6, 100    if (x5 == x6) go to PC+100  PC-relative branch if registers equal
Branch if not equal                   bne x5, x6, 100    if (x5 != x6) go to PC+100  PC-relative branch if registers not equal
Branch if less than                   blt x5, x6, 100    if (x5 < x6) go to PC+100   PC-relative branch if registers less
Branch if greater or equal            bge x5, x6, 100    if (x5 >= x6) go to PC+100  PC-relative branch if registers greater or equal
Branch if less, unsigned              bltu x5, x6, 100   if (x5 < x6) go to PC+100   PC-relative branch if registers less, unsigned
Branch if greater or equal, unsigned  bgeu x5, x6, 100   if (x5 >= x6) go to PC+100  PC-relative branch if registers greater or equal, unsigned

Category: Unconditional branch
Jump and link                         jal x1, 100        x1 = PC+4; go to PC+100     PC-relative procedure call
Jump and link register                jalr x1, 100(x5)   x1 = PC+4; go to x5+100     Procedure return; indirect call

FIGURE 2.1 (Continued).
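The blt and bltu entries of Figure 2.1 show the same meaning, if (x5 < x6), and differ only in how the register bits are interpreted. A small C sketch of that difference follows, assuming 64-bit registers passed as uint64_t. The function names less_signed and less_unsigned are hypothetical, and the cast relies on the usual two's complement conversion that mainstream C compilers provide.

```c
#include <stdbool.h>
#include <stdint.h>

/* Comparison behind blt: treat both bit patterns as signed values.
   Assumes the uint64_t -> int64_t cast wraps in two's complement. */
bool less_signed(uint64_t x5, uint64_t x6)
{
    return (int64_t)x5 < (int64_t)x6;
}

/* Comparison behind bltu: treat both bit patterns as unsigned values. */
bool less_unsigned(uint64_t x5, uint64_t x6)
{
    return x5 < x6;
}
```

For the all-ones bit pattern compared against 1, less_signed reports true (the pattern reads as -1) while less_unsigned reports false (the pattern reads as the largest unsigned value), so the two branches would go different ways on identical register contents.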

Compiling a Complex C Assignment into RISC-V

EXAMPLE

A somewhat complicated statement contains the five variables f, g, h, i, and j:

f = (g + h) - (i + j);

What might a C compiler produce?

ANSWER

The compiler must break this statement into several assembly instructions, since only one operation is performed per RISC-V instruction. The first RISC-V instruction calculates the sum of g and h. We must place the result somewhere, so the compiler creates a temporary variable, called t0:

add t0, g, h // temporary variable t0 contains g + h

Although the next operation is subtract, we need to calculate the sum of i and j before we can subtract. Thus, the second instruction places the sum of i and j in another temporary variable created by the compiler, called t1:

add t1, i, j // temporary variable t1 contains i + j

Finally, the subtract instruction subtracts the second sum from the first and places the difference in the variable f, completing the compiled code:

sub f, t0, t1 // f gets t0 - t1, which is (g + h) - (i + j)
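The same decomposition can be written back in C, one operation per statement, to check that the temporaries preserve the meaning of the original expression. This is only a sketch: compute_f is a hypothetical name, and t0 and t1 are ordinary local variables standing in for the compiler's temporaries.

```c
/* Sketch: f = (g + h) - (i + j) split into single-operation steps.
   compute_f is a hypothetical name used for illustration. */
long compute_f(long g, long h, long i, long j)
{
    long t0 = g + h;     /* add t0, g, h  */
    long t1 = i + j;     /* add t1, i, j  */
    long f  = t0 - t1;   /* sub f, t0, t1 */
    return f;
}
```

For g = 5, h = 4, i = 3, and j = 2, the steps give t0 = 9, t1 = 5, and f = 4, matching (5 + 4) - (3 + 2).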

Elaboration: To increase portability, Java was originally envisioned as relying on a software interpreter. The instruction set of this interpreter is called Java bytecodes (see Section 2.15), which is quite different from the RISC-V instruction set. To get performance close to the equivalent C program, Java systems today typically compile Java bytecodes into the native instruction sets like RISC-V. Because this compilation is normally done much later than for C programs, such Java compilers are often called Just In Time (JIT) compilers. Section 2.12 shows how JITs are used later than C compilers in the start-up process, and Section 2.13 shows the performance consequences of compiling versus interpreting Java programs.


Check Yourself

For a given function, which programming language likely takes the most lines of code? Put the three representations below in order.

1. Java
2. C
3. RISC-V assembly language