2.3 Memory Addressing

Little Endian byte numbering within a double word: 7 6 5 4 3 2 1 0

Big Endian byte numbering within a double word: 0 1 2 3 4 5 6 7

When operating within one computer, the byte order is often unnoticeable: only programs that access the same locations as both, say, words and bytes can notice the difference. Byte order is a problem when exchanging data among computers with different orderings, however. Little Endian ordering also fails to match the normal ordering of words when strings are compared. Strings appear “SDRAWKCAB” (backwards) in the registers.
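As an illustration of the two orderings, the following C sketch places the bytes of a 32-bit word into a byte array in each order. The function names and the 32-bit word size are assumptions made for this example, not part of any architecture discussed here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Little Endian: the least-significant byte goes to the lowest address. */
void store_little_endian(uint8_t *mem, uint32_t word) {
    assert(mem != NULL);
    for (int i = 0; i < 4; i++)
        mem[i] = (uint8_t)(word >> (8 * i));
}

/* Big Endian: the most-significant byte goes to the lowest address. */
void store_big_endian(uint8_t *mem, uint32_t word) {
    assert(mem != NULL);
    for (int i = 0; i < 4; i++)
        mem[i] = (uint8_t)(word >> (8 * (3 - i)));
}
```

Storing the word 0x41424344 (the character codes for 'A', 'B', 'C', 'D') this way puts 'A' at the lowest address under Big Endian ordering and 'D' at the lowest address under Little Endian ordering, which is why a program reading the same location as both a word and as bytes can see the difference.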

A second memory issue is that in many computers, accesses to objects larger than a byte must be aligned. An access to an object of size s bytes at byte address A is aligned if A mod s = 0. Figure 2.5 shows the addresses at which an access is aligned or misaligned.
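The alignment rule can be written directly as a predicate. This is a minimal sketch; the name is_aligned is chosen for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* An access of `size` bytes at byte address `addr` is aligned
   if addr mod size = 0. */
bool is_aligned(uint64_t addr, uint64_t size) {
    assert(size != 0);
    return addr % size == 0;
}
```

For instance, a double word (8 bytes) at address 8 is aligned, while the same access at address 4 is not; a byte access is aligned at every address.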

Why would someone design a computer with alignment restrictions? Misalignment causes hardware complications, since the memory is typically aligned on a multiple of a word or double-word boundary. A misaligned memory access may, therefore, take multiple aligned memory references. Thus, even in computers that allow misaligned access, programs with aligned accesses run faster.

Value of 3 low-order bits of byte address at which an access is aligned or misaligned:

Width of object          Aligned at offsets         Misaligned at offsets
1 byte (byte)            0, 1, 2, 3, 4, 5, 6, 7     none
2 bytes (half word)      0, 2, 4, 6                 1, 3, 5, 7
4 bytes (word)           0, 4                       1, 2, 3, 5, 6, 7
8 bytes (double word)    0                          1, 2, 3, 4, 5, 6, 7

FIGURE 2.5 Aligned and misaligned addresses of byte, half word, word, and double word objects for byte-addressed computers. For each misaligned example some objects require two memory accesses to complete. Every aligned object can always complete in one memory access, as long as the memory is as wide as the object. The figure shows the memory organized as 8 bytes wide. The byte offsets that label the columns specify the low-order three bits of the address.
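To see why some misaligned objects need two memory accesses while others do not, one can count how many 8-byte-wide memory rows an access touches. The sketch below assumes the 8-byte memory width of Figure 2.5; the names MEMORY_WIDTH and memory_accesses are illustrative.

```c
#include <assert.h>
#include <stdint.h>

enum { MEMORY_WIDTH = 8 };  /* bytes per memory row, as in Figure 2.5 */

/* Number of memory rows touched by an access of `size` bytes
   starting at byte address `addr`. */
unsigned memory_accesses(uint64_t addr, unsigned size) {
    assert(size >= 1 && size <= MEMORY_WIDTH);
    uint64_t first_row = addr / MEMORY_WIDTH;
    uint64_t last_row = (addr + size - 1) / MEMORY_WIDTH;
    return (unsigned)(last_row - first_row + 1);
}
```

Under this model a word at offset 1 is misaligned but still fits in one row, whereas a word at offset 5 or a half word at offset 7 spills into the next row and takes two accesses.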

Even if data are aligned, supporting byte, half-word, and word accesses requires an alignment network to align bytes, half words, and words in 64-bit registers. For example, in Figure 2.5 above, suppose we read a byte from an address with its three low-order bits having the value 4. We will need to shift right 3 bytes to align the byte to the proper place in a 64-bit register. Depending on the instruction, the computer may also need to sign-extend the quantity. Stores are easy: only the addressed bytes in memory may be altered. On some computers byte, half-word, and word operations do not affect the upper portion of a register. Although all the computers discussed in this book permit byte, half-word, and word accesses to memory, only the IBM 360/370, Intel 80x86, and VAX support ALU operations on register operands narrower than the full width.
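The shift and sign-extension steps of a byte load can be sketched in C as follows. The sketch assumes Big Endian byte numbering within the double word (byte 0 is the most significant), which matches the 3-byte shift for offset 4 in the example above; the function names are illustrative.

```c
#include <assert.h>
#include <stdint.h>

/* Extract the byte at `offset` (0..7) from a 64-bit double word and
   place it in the low-order byte, zero-extended. Assumes Big Endian
   numbering: offset 0 is the most-significant byte. */
uint64_t load_byte_unsigned(uint64_t dword, unsigned offset) {
    assert(offset < 8);
    unsigned shift_bytes = 7u - offset;   /* offset 4 -> shift right 3 bytes */
    return (dword >> (8u * shift_bytes)) & 0xFFu;
}

/* Same extraction, but sign-extend the byte to 64 bits. */
int64_t load_byte_signed(uint64_t dword, unsigned offset) {
    int64_t byte = (int64_t)load_byte_unsigned(dword, offset);
    return (byte ^ 0x80) - 0x80;          /* maps 128..255 to -128..-1 */
}
```

A hardware alignment network performs the equivalent selection with multiplexers rather than a general shifter, but the data movement is the same.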

Now that we have discussed alternative interpretations of memory addresses, we can discuss the ways addresses are specified by instructions, called addressing modes.

Addressing Modes

Given an address, we now know what bytes to access in memory. In this subsection we will look at addressing modes, that is, how architectures specify the address of an object they will access. Addressing modes specify constants and registers in addition to locations in memory. When a memory location is used, the actual memory address specified by the addressing mode is called the effective address.

Figure 2.6 shows all the data-addressing modes that have been used in recent computers. Immediates or literals are usually considered memory-addressing modes (even though the value they access is in the instruction stream), although registers are often separated. We have kept addressing modes that depend on the program counter, called PC-relative addressing, separate. PC-relative addressing is used primarily for specifying code addresses in control transfer instructions, discussed in section 2.9.

Figure 2.6 shows the most common names for the addressing modes, though the names differ among architectures. In this figure and throughout the book, we will use an extension of the C programming language as a hardware description notation. In this figure, only one non-C feature is used: The left arrow (←) is used for assignment. We also use the array Mem as the name for main memory and the array Regs for registers. Thus, Mem[Regs[R1]] refers to the contents of the memory location whose address is given by the contents of register 1 (R1). Later, we will introduce extensions for accessing and transferring data smaller than a word.

Addressing modes have the ability to significantly reduce instruction counts; they also add to the complexity of building a computer and may increase the average CPI (clock cycles per instruction) of computers that implement those modes. Thus, the usage of various addressing modes is quite important in helping the architect choose what to include.

Figure 2.7 shows the results of measuring addressing mode usage patterns in three programs on the VAX architecture. We use the old VAX architecture for a few measurements in this chapter because it has the richest set of addressing modes and the fewest restrictions on memory addressing. For example, Figure 2.6 shows all the modes the VAX supports. Most measurements in this chapter, however, will use the more recent register-register architectures to show how programs use instruction sets of current computers.

Addressing mode: Register
  Example instruction: Add R4,R3
  Meaning: Regs[R4]←Regs[R4] + Regs[R3]
  When used: When a value is in a register.

Addressing mode: Immediate
  Example instruction: Add R4,#3
  Meaning: Regs[R4]←Regs[R4] + 3
  When used: For constants.

Addressing mode: Displacement
  Example instruction: Add R4,100(R1)
  Meaning: Regs[R4]←Regs[R4] + Mem[100+Regs[R1]]
  When used: Accessing local variables (also simulates register indirect and direct addressing modes).

Addressing mode: Register indirect
  Example instruction: Add R4,(R1)
  Meaning: Regs[R4]←Regs[R4] + Mem[Regs[R1]]
  When used: Accessing using a pointer or a computed address.

Addressing mode: Indexed
  Example instruction: Add R3,(R1 + R2)
  Meaning: Regs[R3]←Regs[R3] + Mem[Regs[R1]+Regs[R2]]
  When used: Sometimes useful in array addressing: R1 = base of array; R2 = index amount.

Addressing mode: Memory indirect
  Example instruction: Add R1,@(R3)
  Meaning: Regs[R1]←Regs[R1] + Mem[Mem[Regs[R3]]]
  When used: If R3 is the address of a pointer p, then mode yields *p.

Addressing mode: Autoincrement
  Example instruction: Add R1,(R2)+
  Meaning: Regs[R1]←Regs[R1] + Mem[Regs[R2]]; Regs[R2]←Regs[R2] + d
  When used: Useful for stepping through arrays within a loop. R2 points to start of array; each reference increments R2 by size of an element, d. Can also act as push/pop to implement a stack.

Addressing mode: Scaled
  Example instruction: Add R1,100(R2)[R3]
  Meaning: Regs[R1]←Regs[R1] + Mem[100+Regs[R2] + Regs[R3]*d]
  When used: Used to index arrays. May be applied to any indexed addressing mode in some computers.

FIGURE 2.6 Selection of addressing modes with examples, meaning, and usage. In autoincrement/decrement and scaled addressing modes, the variable d designates the size of the data item being accessed (i.e., whether the instruction is accessing 1, 2, 4, or 8 bytes). These addressing modes are only useful when the elements being accessed are adjacent in memory. RISC computers use Displacement addressing to simulate Register Indirect with 0 for the address and simulate Direct addressing using 0 in the base register. In our measurements, we use the first name shown for each mode. The extensions to C used as hardware descriptions are defined on the next page, also on page 144, and on the back inside cover.
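The effective-address computations in Figure 2.6 can be modeled in ordinary C. The following is a sketch under stated assumptions: the register count, memory size, 8-byte item size, and all function names are chosen for illustration, and memory contents are read in the host's byte order.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

enum { NUM_REGS = 32, MEM_BYTES = 4096, ITEM = 8 };

uint64_t Regs[NUM_REGS];   /* register file */
uint8_t  Mem[MEM_BYTES];   /* byte-addressed main memory */

/* Read or write one 8-byte item at a byte address. */
uint64_t load64(uint64_t addr) {
    uint64_t value;
    assert(addr <= MEM_BYTES - ITEM);
    memcpy(&value, &Mem[addr], ITEM);
    return value;
}

void store64(uint64_t addr, uint64_t value) {
    assert(addr <= MEM_BYTES - ITEM);
    memcpy(&Mem[addr], &value, ITEM);
}

/* Displacement: 100(R1) -> 100 + Regs[R1] */
uint64_t ea_displacement(uint64_t disp, int base) { return disp + Regs[base]; }

/* Register indirect: (R1) -> Regs[R1] */
uint64_t ea_register_indirect(int r) { return Regs[r]; }

/* Indexed: (R1 + R2) -> Regs[R1] + Regs[R2] */
uint64_t ea_indexed(int base, int index) { return Regs[base] + Regs[index]; }

/* Memory indirect: @(R3) -> Mem[Regs[R3]] */
uint64_t ea_memory_indirect(int r) { return load64(Regs[r]); }

/* Autoincrement: (R2)+ -> Regs[R2], then Regs[R2] += d */
uint64_t ea_autoincrement(int r, uint64_t d) {
    uint64_t ea = Regs[r];
    Regs[r] += d;
    return ea;
}

/* Scaled: 100(R2)[R3] -> 100 + Regs[R2] + Regs[R3]*d */
uint64_t ea_scaled(uint64_t disp, int base, int index, uint64_t d) {
    return disp + Regs[base] + Regs[index] * d;
}
```

In this model, ea_displacement(0, r) gives the same address as ea_register_indirect(r), which is how a displacement mode with a zero displacement stands in for register indirect addressing.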

As Figure 2.7 shows, immediate and displacement addressing dominate addressing mode usage. Let’s look at some properties of these two heavily used modes.

Displacement Addressing Mode

The major question that arises for a displacement-style addressing mode is that of the range of displacements used. Based on the use of various displacement sizes, a decision of what sizes to support can be made. Choosing the displacement field sizes is important because they directly affect the instruction length. Figure 2.8 shows the measurements taken on the data access on a load-store architecture using our benchmark programs. We look at branch offsets in section 2.9; data accessing patterns and branches are different, and little is gained by combining them, although in practice the immediate sizes are made the same for simplicity.
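Whether a given displacement can be encoded depends on the width of the displacement field. The check below is a sketch that assumes the field holds a two's-complement signed value; the function name is illustrative.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* True if `value` can be encoded in a signed (two's-complement)
   field that is `bits` bits wide. */
bool fits_signed_field(int64_t value, unsigned bits) {
    assert(bits >= 1 && bits <= 63);
    int64_t min = -((int64_t)1 << (bits - 1));
    int64_t max = ((int64_t)1 << (bits - 1)) - 1;
    return value >= min && value <= max;
}
```

With a 16-bit field, displacements from -32768 to 32767 fit; anything outside that range must be formed some other way, typically with extra instructions.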

Immediate or Literal Addressing Mode

Immediates can be used in arithmetic operations, in comparisons (primarily for branches), and in moves where a constant is wanted in a register. The last case occurs for constants written in the code, which tend to be small, and for address constants, which tend to be large. For the use of immediates it is important to know whether they need to be supported for all operations or for only a subset.

[Bar chart for Figure 2.7: frequency of each addressing mode; x axis from 0% to 60%.]

FIGURE 2.7 Summary of use of memory addressing modes (including immediates). These major addressing modes account for all but a few percent (0% to 3%) of the memory accesses. Register modes, which are not counted, account for one-half of the operand references, while memory addressing modes (including immediate) account for the other half. Of course, the compiler affects what addressing modes are used; see section 2.11. The memory indirect mode on the VAX can use displacement, autoincrement, or autodecrement to form the initial memory address; in these programs, almost all the memory indirect references use displacement mode as the base. Displacement mode includes all displacement lengths (8, 16, and 32 bit). The PC-relative addressing modes, used almost exclusively for branches, are not included. Only the addressing modes with an average frequency of over 1% are shown. The data are from a VAX using three SPEC89 programs.

[Chart for Figure 2.8: percentage of displacements (0% to 40%) versus number of bits of displacement (0 to 15), plotted for the integer average and the floating-point average.]

FIGURE 2.8 Displacement values are widely distributed. There are both a large number of small values and a fair number of large values. The wide distribution of displacement values is due to multiple storage areas for variables and different displacements to access them (see section 2.11) as well as the overall addressing scheme the compiler uses. The x axis is log₂ of the displacement; that is, the size of a field needed to represent the magnitude of the displacement. Zero on the x axis shows the percentage of displacements of value 0. The graph does not include the sign bit, which is heavily affected by the storage layout. Most displacements are positive, but a majority of the largest displacements (14+ bits) are negative. Since these data were collected on a computer with 16-bit displacements, they cannot tell us about longer displacements. These data were taken on the Alpha architecture with full optimization (see section 2.11) for SPEC CPU2000, showing the average of integer programs (CINT2000) and the average of floating-point programs (CFP2000).

Bar chart for Figure 2.9 (x axis from 0% to 30%):

                    Integer average    Floating-point average
All instructions    21%                16%
ALU operations      25%                19%
Loads               23%                22%

FIGURE 2.9 About one-quarter of data transfers and ALU operations have an immediate operand. The bottom bars show that integer programs use immediates in about one-fifth of the instructions, while floating-point programs use immediates in about one-sixth of the instructions. For loads, the load immediate instruction loads 16 bits into either half of a 32-bit register. Load immediates are not loads in a strict sense because they do not access memory. Occasionally a pair of load immediates is used to load a 32-bit constant, but this is rare. (For ALU operations, shifts by a constant amount are included as operations with immediate operands.) These measurements are as in Figure 2.8.

The chart in Figure 2.9 shows the frequency of immediates for the general classes of integer operations in an instruction set.

Another important instruction set measurement is the range of values for immediates. Like displacement values, the size of immediate values affects instruction length. As Figure 2.10 shows, small immediate values are most heavily used. Large immediates are sometimes used, however, most likely in addressing calculations.

To give a flavor of the different perspectives of different architecture cultures, here are two addressing modes that distinguish DSPs.

Since DSPs deal with infinite, continuous streams of data, they routinely rely on circular buffers. Hence, as data is added to the buffer, a pointer is checked to see if it is pointing at the end of the buffer. If not, the pointer is incremented to the next address; if it is, the pointer is set instead to the start of the buffer. Similar issues arise when emptying a buffer.
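In software, the pointer update just described looks like the following C sketch. The function name and the use of int elements are assumptions for illustration; a DSP with a circular addressing mode performs the same check as part of the address calculation rather than with separate instructions.

```c
#include <assert.h>
#include <stddef.h>

/* Advance `ptr` through a circular buffer that occupies the elements
   from `start` up to, but not including, `end`. */
int *circular_next(int *ptr, int *start, int *end) {
    assert(start < end && ptr >= start && ptr < end);
    if (ptr == end - 1)
        return start;    /* at the last element: wrap to the start */
    return ptr + 1;      /* otherwise: step to the next element */
}
```

Calling circular_next four times on a pointer to the first element of a four-element buffer visits the second, third, and fourth elements and then returns to the first.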

FIGURE 2.10 The distribution of immediate values. The x axis shows the number of bits needed to represent the magnitude of an immediate value; 0 means the immediate field value was 0. The majority of the immediate values are positive. About 20% were negative for CINT2000 and about 30% were negative for CFP2000. These measurements were taken on an Alpha, where the maximum immediate is 16 bits, for the same programs as in Figure 2.8. A similar measurement on the VAX, which supported 32-bit immediates, showed that about 20% to 25% of immediates were longer than 16 bits.
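The x-axis quantity used in Figures 2.8 and 2.10, the number of bits needed to represent the magnitude of a value with the sign excluded, can be computed as in the sketch below. The function name is illustrative, and the exact binning used for the published measurements is an assumption.

```c
#include <assert.h>
#include <stdint.h>

/* Number of bits needed to represent the magnitude of `value`,
   not counting a sign bit. A value of 0 needs 0 bits. */
unsigned magnitude_bits(int64_t value) {
    assert(value != INT64_MIN);   /* its magnitude is not representable */
    uint64_t magnitude = value < 0 ? (uint64_t)(-value) : (uint64_t)value;
    unsigned bits = 0;
    while (magnitude != 0) {
        bits++;
        magnitude >>= 1;
    }
    return bits;
}
```

Under this definition the largest positive value that fits a 16-bit signed field, 32767, needs 15 bits of magnitude, which is why the horizontal axis for a computer with 16-bit fields runs from 0 to 15.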
