The von Neumann Model
5.4 Control Instructions
Control instructions change the sequence of the instructions that are executed. If there were no control instructions, the next instruction fetched after the current instruction finishes would be the instruction located in the next sequential memory location. As you know, this is because the PC is incremented in the FETCH phase of each instruction. We will see momentarily that it is often useful to be able to break that sequence.
The LC-3 has five opcodes that enable this sequential flow to be broken: conditional branch, unconditional jump, subroutine (sometimes called function) call, TRAP, and return from interrupt. In this section, we will deal almost exclusively with the most common control instruction, the conditional branch. We will also introduce the unconditional jump and the TRAP instruction. The TRAP instruction is particularly useful because, among other things, it allows a programmer to get information into and out of the computer without fully understanding the intricacies of the input and output devices. However, most of the discussion of the TRAP instruction and all of the discussion of the subroutine call and the return from interrupt we will leave for Chapters 9 and 10.
5.4.1 Conditional Branches
The format of the conditional branch instruction (opcode = 0000) is as follows:
15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
0 0 0 0 n z p PCoffset
Bits [11], [10], and [9] correspond to the three condition codes discussed in Section 5.1.7. Recall that in the LC-3, all instructions that write values into the general purpose registers set the three condition codes (i.e., the single-bit registers N, Z, P) in accordance with whether the value written is negative, zero, or positive.
These instructions are ADD, AND, NOT, LD, LDI, LDR, and LEA.
The condition codes are used by the conditional branch instruction to determine whether to change the instruction flow; that is, whether to depart from the usual sequential execution of instructions that we get as a result of incrementing PC during the FETCH phase of each instruction.
The instruction cycle is as follows: FETCH and DECODE are the same for all instructions. The PC is incremented during FETCH. The EVALUATE ADDRESS phase is the same as that for LD and ST: the address is computed by adding the incremented PC to the 16-bit value formed by sign-extending bits [8:0] of the instruction.
During the EXECUTE phase, the processor examines the condition codes whose corresponding bits in the instruction are 1. That is, if bit [11] is 1, condition code N is examined. If bit [10] is 1, condition code Z is examined. If bit [9] is 1, condition code P is examined. If any of bits [11:9] are 0, the corresponding condition codes are not examined. If any of the condition codes that are examined are in state 1, then the PC is loaded with the address obtained in the EVALUATE ADDRESS phase. If none of the condition codes that are examined are in state 1, the PC is left unchanged. In that case, in the next instruction cycle, the next sequential instruction will be fetched.
For example, if the last value loaded into a general purpose register was 0, then the current instruction (located at x4027) shown here
15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
0 0 0 0 0 1 0 0 1 1 0 1 1 0 0 1
BR n z p x0D9
would load the PC with x4101, and the next instruction executed would be the one at x4101, rather than the instruction at x4028.
Figure 5.11 shows the data path elements that are required to execute this instruction. Note the logic required to determine whether the sequential instruction flow should be broken. In this case the answer is yes, and the PC is loaded with x4101, replacing x4028, which had been loaded during the FETCH phase of the conditional branch instruction.
If all three bits [11:9] are 1, then all three condition codes are examined. In this case, since the last result stored into a register had to be either negative, zero, or positive (there are no other choices), one of the three condition codes must be in state 1. Since all three are examined, the PC is loaded with the address obtained in the EVALUATE ADDRESS phase. We call this an unconditional branch since the instruction flow is changed unconditionally, that is, independent of the data that is being processed.
Figure 5.11 Data path relevant to the execution of BRz x0D9
For example, if the following instruction,
15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
0 0 0 0 1 1 1 1 1 0 0 0 0 1 0 1
BR n z p x185
located at x507B, is executed, the PC is loaded with x5001.
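The two examples above can be checked with a short Python sketch of the conditional branch. It is only a model of the phases described in this section, not a description of the hardware; the function names sign_extend, set_cc, and br_next_pc are our own and do not come from the LC-3 definition.

```python
def sign_extend(value, bits):
    """Interpret the low `bits` bits of value as a 2's complement integer."""
    value &= (1 << bits) - 1
    if value & (1 << (bits - 1)):
        value -= 1 << bits
    return value


def set_cc(result):
    """Return (N, Z, P) for a 16-bit value just written to a register."""
    signed = sign_extend(result, 16)
    return (int(signed < 0), int(signed == 0), int(signed > 0))


def br_next_pc(branch_addr, instr, cc):
    """Return the PC after executing the BR word `instr` found at branch_addr."""
    pc = (branch_addr + 1) & 0xFFFF                   # FETCH increments the PC
    target = (pc + sign_extend(instr, 9)) & 0xFFFF    # EVALUATE ADDRESS
    n = (instr >> 11) & 1
    z = (instr >> 10) & 1
    p = (instr >> 9) & 1
    N, Z, P = cc
    taken = (n and N) or (z and Z) or (p and P)       # EXECUTE
    return target if taken else pc


# BRz x0D9 at x4027, after the value 0 was written to a register.
print(hex(br_next_pc(0x4027, 0b0000_0100_1101_1001, set_cc(0))))   # 0x4101
# BRnzp x185 at x507B: taken whatever value was written last.
print(hex(br_next_pc(0x507B, 0b0000_1111_1000_0101, set_cc(7))))   # 0x5001
```

If the last value written had been nonzero, the first call would return x4028, the incremented PC, because the only condition code examined (Z) would be clear.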
What happens if all three bits [11:9] in the BR instruction are 0?
5.4.2 An Example
We are ready to show by means of a simple example the value of having control instructions in the instruction set.
Suppose we know that the 12 locations x3100 to x310B contain integers, and we wish to compute the sum of these 12 integers.
Figure 5.12 An algorithm for adding 12 integers
A flowchart for an algorithm to solve the problem is shown in Figure 5.12.
First, as in all algorithms, we must initialize our variables. That is, we must set up the initial values of the variables that the computer will use in executing the program that solves the problem. There are three such variables: the address of the next integer to be added (assigned to R1), the running sum (assigned to R3), and the number of integers left to be added (assigned to R2). The three variables are initialized as follows: The address of the first integer to be added is put in R1. R3, which will keep track of the running sum, is initialized to 0. R2, which will keep track of the number of integers left to be added, is initialized to 12. Then the process of adding begins.
The program repeats the process of loading into R4 one of the 12 integers, and adding it to R3. Each time we perform the ADD, we increment R1 so it will point to (i.e., contain the address of) the next number to be added and decrement R2 so we will know how many numbers still need to be added. When R2 becomes zero, the Z condition code is set, and we can detect that we are done.
The 10-instruction program shown in Figure 5.13 accomplishes the task.
The details of the program execution are as follows: The program starts with PC = x3000. The first instruction (at location x3000) loads R1 with the address x3100. (The incremented PC is x3001; the sign-extended PCoffset is x00FF.)
The instruction at x3001 clears R3. R3 will keep track of the running sum, so it must start off with the value 0. As we said previously, this is called initializing the SUM to zero.
The instructions at x3002 and x3003 set the value of R2 to 12, the number of integers to be added. R2 will keep track of how many numbers remain to be added. This will be done (by the instruction contained in x3008) by decrementing R2 after each addition takes place.
The instruction at x3004 is a conditional branch instruction. Note that bit [10] is a 1. That means that the Z condition code will be examined. If it is set, we know R2 must have just been decremented to 0. That means there are no more numbers to be added and we are done. If it is clear, we know we still have work to do and we continue.
The instruction at x3005 loads the contents of x3100 (i.e., the first integer) into R4, and the instruction at x3006 adds it to R3.
The instructions at x3007 and x3008 perform the necessary bookkeeping.
The instruction at x3007 increments R1, so R1 will point to the next location in memory containing an integer to be added (in this case, x3101). The instruction at x3008 decrements R2, which is keeping track of the number of integers still to be added, as we have already explained, and sets the N, Z, and P condition codes.
The instruction at x3009 is an unconditional branch, since bits [11:9] are all 1. It loads the PC with x3004. It also does not affect the condition codes, so the next instruction to be executed (the conditional branch at x3004) will act on the condition codes set by the instruction executed at x3008.
This is worth saying again. The conditional branch instruction at x3004 follows the instruction at x3009, which does not affect condition codes, which in turn follows the instruction at x3008. Thus, the conditional branch instruction at x3004 will be based on the condition codes set by the instruction at x3008. The instruction at x3008 sets the condition codes depending on the value produced by decrementing R2. As long as there are still integers to be added, the ADD instruction at x3008 will produce a value greater than zero and therefore clear the Z condition code. The conditional branch instruction at x3004 examines the Z condition code. As long as Z is clear, the PC will not be affected, and the next instruction cycle will start with an instruction fetch from x3005.
The conditional branch instruction causes the execution sequence to follow:
x3000, x3001, x3002, x3003, x3004, x3005, x3006, x3007, x3008, x3009, x3004, x3005, x3006, x3007, x3008, x3009, x3004, x3005, and so on until the value in R2 becomes 0. The next time the conditional branch instruction at x3004 is executed, the PC is loaded with x300A, and the program continues at x300A with its next activity.
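The walk-through above can be mirrored one instruction at a time in Python. This is a sketch written from the description in the text, not a transcription of Figure 5.13: the way x3002 and x3003 are assumed to set R2 (clear it, then add 12), the function name sum_with_counter, and the sample data are our own assumptions.

```python
def sum_with_counter(memory):
    """Mirror the 10-instruction program; return (sum, addresses executed)."""
    reg = {1: 0, 2: 0, 3: 0, 4: 0}
    cc = {"Z": 0}
    trace = []

    def write(r, value):
        # Every write to a register also sets the condition codes.
        # Only Z matters to this program, so only Z is modeled.
        reg[r] = value & 0xFFFF
        cc["Z"] = int(reg[r] == 0)

    pc = 0x3000
    while pc != 0x300A:
        trace.append(pc)
        here, pc = pc, pc + 1                  # FETCH increments the PC
        if here == 0x3000:
            write(1, 0x3100)                   # R1 <- x3100
        elif here == 0x3001:
            write(3, 0)                        # R3 <- 0
        elif here == 0x3002:
            write(2, 0)                        # R2 <- 0 (assumed)
        elif here == 0x3003:
            write(2, reg[2] + 12)              # R2 <- R2 + 12 (assumed)
        elif here == 0x3004:
            if cc["Z"]:                        # BRz x300A
                pc = 0x300A
        elif here == 0x3005:
            write(4, memory[reg[1]])           # R4 <- M[R1]
        elif here == 0x3006:
            write(3, reg[3] + reg[4])          # R3 <- R3 + R4
        elif here == 0x3007:
            write(1, reg[1] + 1)               # R1 <- R1 + 1
        elif here == 0x3008:
            write(2, reg[2] - 1)               # R2 <- R2 - 1
        elif here == 0x3009:
            pc = 0x3004                        # BRnzp x3004, codes untouched
    return reg[3], trace


# Hypothetical data: the integers 1 through 12 in x3100 to x310B.
memory = {0x3100 + i: i + 1 for i in range(12)}
total, trace = sum_with_counter(memory)
print(total)                                   # 78
print([hex(a) for a in trace[:11]])
```

The second line printed lists x3000 through x3009 followed by x3004, the start of the sequence given in the text.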
Finally, it is worth noting that we could have written a program to add these 12 integers without any control instructions. We still would have needed the LEA instruction in x3000 to initialize R1. We would not have needed the instruction at x3001 to initialize the running sum, nor the instructions at x3002 and x3003 to initialize the number of integers left to be added. We could have loaded the contents of x3100 directly into R3, and then repeatedly (by incrementing R1, loading the next integer into R4, and adding R4 to the running sum in R3) added the remaining 11 integers. After the addition of the twelfth integer, we would go on to the next task, as does the example of Figure 5.13 with the branch instruction in x3004.
Unfortunately, instead of a 10-instruction program, we would have had a 35-instruction program. Moreover, if we had wished to add 100 integers instead of 12 without any control instructions, we would have had a 299-instruction program instead of a 10-instruction program. The control instructions in the example of Figure 5.13 permit the reuse of sequences of code by breaking the sequential instruction execution flow.
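The counts 35 and 299 follow directly from the straight-line scheme just described: one LEA, one load of the first integer, and three instructions for each remaining integer. A small Python sketch of that arithmetic (the function name is our own):

```python
def straight_line_instructions(n):
    """Instructions needed to add n integers with no control instructions."""
    setup = 1        # LEA to point R1 at the first integer
    first = 1        # load the first integer directly into R3
    per_extra = 3    # increment R1, load into R4, add R4 to R3
    return setup + first + per_extra * (n - 1)


print(straight_line_instructions(12))     # 35
print(straight_line_instructions(100))    # 299
```

The count grows by three instructions per integer, whereas the loop of Figure 5.13 stays at 10 instructions.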
5.4.3 Two Methods for Loop Control
We use the term loop to describe a sequence of instructions that get executed again and again under some controlling mechanism. The example of adding 12 integers contains a loop. Each time the body of the loop executes, one more integer is added to the running total, and the counter is decremented so we can detect whether there are any more integers left to add. Each time the loop body executes is called one iteration of the loop.
There are two common methods for controlling the number of iterations of a loop. One method we just examined: the use of a counter. If we know we wish to execute a loop n times, we simply set a counter to n, then after each execution of the loop, we decrement the counter and check to see if it is zero. If it is not zero, we set the PC to the start of the loop and continue with another iteration.
A second method for controlling the number of executions of a loop is to use a sentinel. This method is particularly effective if we do not know ahead of time how many iterations we will want to perform. Each iteration is usually based on processing a value. We append to our sequence of values to be processed a value that we know ahead of time can never occur (i.e., the sentinel). For example, if we are adding a sequence of numbers, a sentinel could be a # or a *, that is, something that is not a number. Our loop test is simply a test for the occurrence of the sentinel. When we find it, we know we are done.
5.4.4 Example: Adding a Column of Numbers Using a Sentinel
Suppose in our example of Section 5.4.2, we know the values stored in locations x3100 to x310B are all positive. Then we could use any negative number as a sentinel. Let's say the sentinel stored at memory address x310C is -1. The resulting flowchart for the program is shown in Figure 5.14 and the resulting program is shown in Figure 5.15.
As before, the instruction at x3000 loads R1 with the address of the first value to be added, and the instruction at x3001 initializes R3 (which keeps track of the sum) to 0.
[Flowchart boxes: R1 <- x3100; R3 <- 0; R4 <- M[R1]; R3 <- R3 + R4; Increment R1; R4 <- M[R1]]
Figure 5.14 An algorithm showing the use of a sentinel for loop control
Address  15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
x3000    1 1 1 0 0 0 1 0 1 1 1 1 1 1 1 1    R1 <- x3100
x3001    0 1 0 1 0 1 1 0 1 1 1 0 0 0 0 0    R3 <- 0
x3002    0 1 1 0 1 0 0 0 0 1 0 0 0 0 0 0    R4 <- M[R1]
x3003    0 0 0 0 1 0 0 0 0 0 0 0 0 1 0 0    BRn x3008
x3004    0 0 0 1 0 1 1 0 1 1 0 0 0 1 0 0    R3 <- R3+R4
x3005    0 0 0 1 0 0 1 0 0 1 1 0 0 0 0 1    R1 <- R1+1
x3006    0 1 1 0 1 0 0 0 0 1 0 0 0 0 0 0    R4 <- M[R1]
x3007    0 0 0 0 1 1 1 1 1 1 1 1 1 0 1 1    BRnzp x3003
Figure 5.15 A program that implements the algorithm of Figure 5.14
At x3002, we load the contents of the next memory location into R4. If the sentinel is loaded, the N condition code is set.
The conditional branch at x3003 examines the N condition code. If it is set, the PC is loaded with x3008, and the program goes on to the next task to be done. If the N condition code is clear, R4 must contain a valid number to be added. In this case, the number is added to R3 (x3004), R1 is incremented to point to the next memory location (x3005), R4 is loaded with the contents of the next memory location (x3006), and the PC is loaded with x3003 to begin the next iteration (x3007).
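To watch the sentinel loop run, the eight words of Figure 5.15 can be fed to a very small interpreter. The Python sketch below models only the opcodes this program uses (ADD, AND, LDR, LEA, BR); the names sext and run, the stop address argument, and the sample data are our own assumptions, and nothing here should be read as a complete LC-3 simulator.

```python
def sext(value, bits):
    """Sign-extend the low `bits` bits of value to a Python integer."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value >> (bits - 1) else value


def run(memory, pc, stop):
    """Execute words from memory until the PC reaches `stop`; return registers."""
    reg = [0] * 8
    n = z = p = 0
    while pc != stop:
        ir = memory[pc]
        pc = (pc + 1) & 0xFFFF                        # FETCH increments the PC
        op, dr, sr1 = ir >> 12, (ir >> 9) & 7, (ir >> 6) & 7
        if op == 0b0000:                              # BR
            want_n, want_z, want_p = (ir >> 11) & 1, (ir >> 10) & 1, (ir >> 9) & 1
            if (want_n and n) or (want_z and z) or (want_p and p):
                pc = (pc + sext(ir, 9)) & 0xFFFF
            continue                                  # BR leaves the codes alone
        if op in (0b0001, 0b0101):                    # ADD, AND
            operand = sext(ir, 5) & 0xFFFF if ir & 0x20 else reg[ir & 7]
            result = reg[sr1] + operand if op == 0b0001 else reg[sr1] & operand
        elif op == 0b0110:                            # LDR
            result = memory[(reg[sr1] + sext(ir, 6)) & 0xFFFF]
        elif op == 0b1110:                            # LEA
            result = (pc + sext(ir, 9)) & 0xFFFF
        else:
            raise ValueError(f"opcode {op:04b} is not modeled in this sketch")
        reg[dr] = result & 0xFFFF
        signed = sext(reg[dr], 16)
        n, z, p = int(signed < 0), int(signed == 0), int(signed > 0)
    return reg


# The eight words of Figure 5.15, x3000 through x3007, written in hex.
program = [0xE2FF, 0x56E0, 0x6840, 0x0804, 0x16C4, 0x1261, 0x6840, 0x0FFB]
memory = {0x3000 + i: word for i, word in enumerate(program)}
# Hypothetical data: 1 through 12 in x3100 to x310B, then the sentinel -1.
memory.update({0x3100 + i: i + 1 for i in range(12)})
memory[0x310C] = 0xFFFF

reg = run(memory, pc=0x3000, stop=0x3008)
print(reg[3])          # 78
print(hex(reg[1]))     # 0x310c
```

When the loop exits, R1 holds the address of the sentinel and R4 holds the sentinel itself, which is why no counter register is needed.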
5.4.5 The JMP Instruction
The conditional branch instruction, for all its capability, does have one unfortunate limitation. The next instruction executed must be within the range of addresses that can be computed by adding the incremented PC to the sign-extended offset obtained from bits [8:0] of the instruction. Since bits [8:0] specify a 2's complement integer, the next instruction executed after the conditional branch can be at most +256 or -255 locations from the branch instruction itself. What if we would like to execute next an instruction that is 1,000 locations from the current instruction? We cannot fit the value 1,000 into the 9-bit field; ergo, the conditional branch instruction does not work.
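The +256 and -255 limits come from the range of a 9-bit 2's complement offset (-256 to +255) being applied to the incremented PC rather than to the address of the branch itself. A minimal Python check, under the simplifying assumption that addresses do not wrap around the ends of memory (the function name is our own):

```python
def br_offset_fits(branch_addr, target):
    """True if a BR at branch_addr can reach target with a 9-bit PCoffset."""
    offset = target - (branch_addr + 1)    # relative to the incremented PC
    return -256 <= offset <= 255


print(br_offset_fits(0x3000, 0x3000 + 256))     # True
print(br_offset_fits(0x3000, 0x3000 - 255))     # True
print(br_offset_fits(0x3000, 0x3000 + 1000))    # False
```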
The LC-3 ISA does provide an instruction JMP (opcode = 1100) that can do the job. An example follows:
15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
1 1 0 0 0 0 0 0 1 0 0 0 0 0 0 0
JMP R2
The JMP instruction loads the PC with the contents of the register specified by bits [8:6] of the instruction. If this JMP instruction is located at address x4000, R2 contains the value x6600, and the PC contains x4000, then the instruction at x4000 (the JMP instruction) will be executed, followed by the instruction located at x6600. Since registers contain 16 bits, enough to hold any address in the full address space of memory, the JMP instruction has no limitation on where the next instruction to be executed must reside.
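In the same spirit as the earlier sketches, the effect of JMP can be written in a few lines of Python. The function name jmp_next_pc and the register list are our own illustration of the rule just stated.

```python
def jmp_next_pc(instr, registers):
    """Return the PC after a JMP: the contents of the register in bits [8:6]."""
    if instr >> 12 != 0b1100:
        raise ValueError("not a JMP instruction")
    base = (instr >> 6) & 0b111
    return registers[base] & 0xFFFF


registers = [0] * 8
registers[2] = 0x6600
print(hex(jmp_next_pc(0b1100_0000_1000_0000, registers)))    # 0x6600
```

Note that neither the address of the JMP instruction nor the incremented PC takes part in the computation, which is why the target can be anywhere in memory.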
5.4.6 The TRAP Instruction
Finally, because it will be useful long before Chapter 9 to get data into and out of the computer, we introduce the TRAP instruction now. The TRAP (opcode = 1111) instruction changes the PC to a memory address that is part of the operating system so that the operating system will perform some task on behalf of the program that is executing. In operating system jargon, we say the TRAP instruction invokes an operating system SERVICE CALL. Bits [7:0] of the TRAP instruction form the trapvector, which identifies the service call that the program wishes the operating system to perform. Table A.2 contains the trapvectors for all the service calls that we will use with the LC-3 in this book.
15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
1 1 1 1 0 0 0 0 trapvector
Once the operating system is finished performing the service call, the program counter is set to the address of the instruction following the TRAP instruction, and the program continues. In this way, a program can, during its execution, request services from the operating system and continue processing after each such service is performed. The services we will require for now are
* Input a character from the keyboard (trapvector = x23).
* Output a character to the monitor (trapvector = x21).
* Halt the program (trapvector = x25).
Exactly how the LC-3 carries out the interaction between operating system and executing programs is an important topic for Chapter 9.
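Until then, it is enough to be able to form and read TRAP instructions. The Python sketch below encodes and decodes the format shown above for the three services listed; the names SERVICES, trap_word, and describe_trap are our own, and the sketch says nothing about how the operating system carries out a service.

```python
# The three service calls introduced in this section, keyed by trapvector.
SERVICES = {
    0x23: "input a character from the keyboard",
    0x21: "output a character to the monitor",
    0x25: "halt the program",
}


def trap_word(trapvector):
    """Encode a TRAP: opcode 1111, bits [11:8] zero, trapvector in bits [7:0]."""
    return 0xF000 | (trapvector & 0xFF)


def describe_trap(instr):
    """Name the service requested by a TRAP instruction word."""
    if instr >> 12 != 0b1111:
        raise ValueError("not a TRAP instruction")
    return SERVICES.get(instr & 0xFF, "service not covered in this section")


print(hex(trap_word(0x25)))       # 0xf025
print(describe_trap(0xF021))      # output a character to the monitor
```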