4–6 SEGMENT OVERRIDE PREFIX - THE INTEL MICROPROCESSORS

The segment override prefix, which may be added to almost any instruction in any memory-addressing mode, allows the programmer to deviate from the default segment. The segment override prefix is an additional byte that is appended to the front of an instruction to select an alternate segment register. About the only instructions that cannot be prefixed are the jump and call instructions, which must use the code segment register for address generation. The segment override is also used to select the FS and GS segments in the 80386 through the Core2 microprocessors.

For example, the MOV AX,[DI] instruction accesses data within the data segment by default. If required by a program, this can be changed by prefixing the instruction. Suppose that the data are in the extra segment instead of in the data segment. This instruction addresses the extra segment if changed to MOV AX,ES:[DI].

Table 4–21 shows some altered instructions that address memory segments other than the default. Each time an instruction is prefixed with a segment override prefix, the instruction becomes 1 byte longer. Although this is not a serious change to the length of the instruction, it does add to the instruction’s execution time. It is customary to limit the use of the segment override prefix and remain in the default segments so that shorter and more efficient software is written.
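The extra byte can be seen by comparing encodings. The fragment below is a minimal sketch, assuming MASM 6.x and 16-bit code; the machine-code bytes in the comments are the standard 8086 encodings for these operands, and the choice of registers is only illustrative.

```asm
; Sketch: the same load with and without a segment override prefix.
; Assumes MASM 6.x, 16-bit instruction mode.
        .MODEL SMALL
        .386                    ;needed only for the FS form below
        .CODE
        MOV  AX,[DI]            ;8B 05        default segment is DS
        MOV  AX,ES:[DI]         ;26 8B 05     ES override prefix (26H)
        MOV  AX,[BP]            ;8B 46 00     default segment is SS
        MOV  AX,DS:[BP]         ;3E 8B 46 00  DS override prefix (3EH)
        MOV  AX,FS:[DI]         ;64 8B 05     FS override prefix (64H)
        END
```

In each pair, the overridden form differs from the default form only by the single prefix byte placed in front of the opcode.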

4–7 ASSEMBLER DETAIL

The assembler (MASM)¹ for the microprocessor can be used in two ways: (1) with models that are unique to a particular assembler, and (2) with full-segment definitions that allow complete control over the assembly process and are universal to all assemblers. This section of the text presents both methods and explains how to organize a program’s memory space by using the assembler. It also explains the purpose and use of some of the more important directives used with this assembler. Appendix A provides additional detail about the assembler.

¹The assembler used throughout this text is the Microsoft MACRO assembler called MASM, version 6.1X.

TABLE 4–21 Instructions that include segment override prefixes.

In most cases, the inline assembler found in Visual C++ is used for developing assembly code for use in a C++ program, but there are occasions that require separate assembly modules written using the assembler. This section of the text contrasts, where possible, the inline assembler and the assembler.

Directives

Before the format of an assembly language program is discussed, some details about the directives (pseudo-operations) that control the assembly process must be learned. Some common assembly language directives appear in Table 4–22. Directives indicate how an operand or section of a program is to be processed by the assembler. Some directives generate and store information in the memory; others do not. The DB (define byte) directive stores bytes of data in the memory, whereas the BYTE PTR directive never stores data. The BYTE PTR directive indicates the size of the data referenced by a pointer or index register. Note that none of the directives function in the inline assembler program that is a part of Visual C++. If you are using the inline assembler exclusively, you can skip this part of the text. Be aware that complex sections of assembly code are still written using MASM.

Note that by default the assembler accepts only 8086/8088 instructions, unless a program is preceded by the .686 or .686P directive or one of the other microprocessor selection switches.

The .686 directive tells the assembler to use the Pentium Pro instruction set in the real mode, and the .686P directive tells the assembler to use the Pentium Pro protected mode instruction set.
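As a minimal sketch of where a processor selection switch goes, the DOS program below places .686 after the .MODEL statement, which in MASM 6.x keeps the segments in 16-bit instruction mode while still permitting 32-bit registers. The variable name VAL and the stack size are hypothetical choices made only for illustration.

```asm
; Sketch: selecting the Pentium Pro instruction set in a model-based program.
; Assumes MASM 6.x and a DOS target; VAL is a hypothetical name.
        .MODEL SMALL
        .686                    ;after .MODEL: 16-bit segments, 686 instructions
        .STACK 100H
        .DATA
VAL     DD   12345678H          ;32-bit data
        .CODE
        .STARTUP
        MOV  EAX,VAL            ;32-bit register is accepted because of .686
        ADD  EAX,1
        MOV  VAL,EAX
        .EXIT
        END
```

Without the .686 line (or another switch of .386 or higher), the assembler would reject the references to EAX, because only 8086/8088 instructions are accepted by default.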

Most modern software is written assuming that the microprocessor is a Pentium Pro or newer, so the .686 switch is often used. Windows 95 was the first major operating system to use a 32-bit architecture that conforms to the 80386. Windows XP requires a Pentium class machine (.586 switch) using at least a 233MHz microprocessor.

Storing Data in a Memory Segment. The DB (define byte), DW (define word), and DD (define doubleword) directives, first presented in Chapter 1, are most often used with MASM to define and store memory data. If a numeric coprocessor executes software in the system, the DQ (define quadword) and DT (define ten bytes) directives are also common. These directives label a memory location with a symbolic name and indicate its size.

Example 4–13 shows a memory segment that contains various forms of data definition directives. It also shows the full-segment definition with the first SEGMENT statement to indicate the start of the segment and its symbolic name. Alternatively, as in past examples in this and prior chapters, the SMALL model can be used with the .DATA statement. The last statement in this example contains the ENDS directive, which indicates the end of the segment. The name of the segment (LIST_SEG) can be anything that the programmer desires to call it. This allows a program to contain as many segments as required.

EXAMPLE 4–13

;Using the DB, DW, and DD directives
;
0000                LIST_SEG SEGMENT

0000 01 02 03       DATA1    DB    1,2,3         ;define bytes
0003 45                      DB    45H           ;hexadecimal
0004 41                      DB    'A'           ;ASCII
0005 F0                      DB    11110000B     ;binary
0006 000C 000D      DATA2    DW    12,13         ;define words
000A 0200                    DW    LIST1         ;symbolic
000C 2345                    DW    2345H         ;hexadecimal
000E 00000300       DATA3    DD    300H          ;define doubleword
0012 4007DF3B                DD    2.123         ;real
0016 544269E1                DD    3.34E+12      ;real
001A 00             LISTA    DB    ?             ;reserve 1 byte
001B 000A[          LISTB    DB    10 DUP(?)     ;reserve 10 bytes
     ]
0025 00                      ALIGN 2             ;set word boundary
0026 0100[          LISTC    DW    100H DUP(0)   ;reserve 100H words
       0000
     ]
0226 0016[          LISTD    DD    22 DUP(?)     ;reserve 22 doublewords
       ????????
     ]
027E 0064[          SIXES    DB    100 DUP(6)    ;reserve 100 bytes
       06
     ]
02E2                LIST_SEG ENDS

TABLE 4–22 Common MASM directives.

Directive   Function
.286        Selects the 80286 instruction set
.286P       Selects the 80286 protected mode instruction set
.386        Selects the 80386 instruction set
.386P       Selects the 80386 protected mode instruction set
.486        Selects the 80486 instruction set
.486P       Selects the 80486 protected mode instruction set
.586        Selects the Pentium instruction set
.586P       Selects the Pentium protected mode instruction set
.686        Selects the Pentium Pro–Core2 instruction set
.686P       Selects the Pentium Pro–Core2 protected mode instruction set
.287        Selects the 80287 math coprocessor
.387        Selects the 80387 math coprocessor
.CODE       Indicates the start of the code segment (models only)
.DATA       Indicates the start of the data segment (models only)
.EXIT       Exits to DOS (models only)
.MODEL      Selects the programming model
.STACK      Selects the start of the stack segment (models only)
.STARTUP    Indicates the starting instruction in a program (models only)
ALIGN n     Align to boundary n (n = 2 for words, n = 4 for doublewords)
ASSUME      Informs the assembler to name each segment (full segments only)
BYTE        Indicates byte-sized, as in BYTE PTR
DB          Defines byte(s) (8 bits)
DD          Defines doubleword(s) (32 bits)
DQ          Defines quadword(s) (64 bits)
DT          Defines ten byte(s) (80 bits)
DUP         Generates duplicates
DW          Defines word(s) (16 bits)
DWORD       Indicates doubleword-sized, as in DWORD PTR
END         Ends a program file
ENDM        Ends a MACRO sequence
ENDP        Ends a procedure
ENDS        Ends a segment or data structure
EQU         Equates data or a label to a label
FAR         Defines a far pointer, as in FAR PTR
MACRO       Designates the start of a MACRO sequence
NEAR        Defines a near pointer, as in NEAR PTR
OFFSET      Specifies an offset address
ORG         Sets the origin within a segment
OWORD       Indicates octalwords, as in OWORD PTR
PROC        Starts a procedure
PTR         Designates a pointer
QWORD       Indicates quadwords, as in QWORD PTR
SEGMENT     Starts a segment for full segments
STACK       Starts a stack segment for full segments
STRUC       Defines the start of a data structure
USES        Automatically pushes and pops registers
USE16       Uses 16-bit instruction mode
USE32       Uses 32-bit instruction mode
WORD        Indicates word-sized, as in WORD PTR

Example 4–13 shows various forms of data storage for bytes at DATA1. More than 1 byte can be defined on a line in binary, hexadecimal, decimal, or ASCII code. The DATA2 label shows how to store various forms of word data. Doublewords are stored at DATA3; they include floating-point, single-precision real numbers.

Memory is reserved for use in the future by using a question mark (?) as an operand for a DB, DW, or DD directive. When a ? is used in place of a numeric or ASCII value, the assembler sets aside a location and does not initialize it to any specific value. (Actually, the assembler usually stores a zero into locations specified with a ?.) The DUP (duplicate) directive creates an array, as shown in several ways in Example 4–13. A 10 DUP (?) reserves 10 locations of memory, but stores no specific value in any of the 10 locations. If a number appears within the ( ) part of the DUP statement, the assembler initializes the reserved section of memory with the data indicated. For example, the LIST2 DB 10 DUP (2) statement reserves 10 bytes of memory for array LIST2 and initializes each location with a 02H.
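The variations of ? and DUP described above can be sketched in a short data segment. This fragment assumes MASM 6.x with the SMALL model; LISTB and LIST2 come from the text, while WTAB, PAIRS, and GRID are hypothetical names added to show a word array, a repeated pattern, and a nested DUP.

```asm
; Sketch: reserving and initializing arrays with ? and DUP.
; WTAB, PAIRS, and GRID are hypothetical names used for illustration.
        .MODEL SMALL
        .DATA
LISTB   DB   10 DUP(?)          ;10 bytes reserved, no specific value
LIST2   DB   10 DUP(2)          ;10 bytes, each initialized to 02H
WTAB    DW   4 DUP(1234H)       ;4 words, each initialized to 1234H
PAIRS   DB   3 DUP(1,2)         ;pattern repeated: 01 02 01 02 01 02
GRID    DB   2 DUP(3 DUP(0))    ;nested DUP: 2 x 3 = 6 bytes of 00H
        END
```

The count in front of DUP sets the number of repetitions, and the operand list inside the parentheses is what is repeated, so PAIRS occupies 6 bytes and WTAB occupies 8 bytes.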

The ALIGN directive, used in this example, makes sure that the memory arrays are stored on word boundaries. An ALIGN 2 places data on word boundaries and an ALIGN 4 places them on doubleword boundaries. In the Pentium–Pentium 4, quadword data for double-precision floating-point numbers should use ALIGN 8. It is important that word-sized data are placed at word boundaries and doubleword-sized data are placed at doubleword boundaries. If not, the microprocessor spends additional time accessing these data types. A word stored at an odd-numbered memory location takes twice as long to access as a word stored at an even-numbered memory location. Note that the ALIGN directive cannot be used with memory models because the size of the model determines the data alignment. If all doubleword data are defined first, followed by word-sized and then byte-sized data, the ALIGN statement is not necessary to align data correctly.
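The two approaches to alignment can be compared side by side. The following is a minimal sketch using full-segment definitions, as the text requires for ALIGN; all segment and variable names here are hypothetical, and the offsets in the comments assume each segment starts at offset 0000H with the default paragraph alignment.

```asm
; Sketch: aligning data with ALIGN versus ordering data by size.
; All names are hypothetical; offsets assume each segment starts at 0000H.
MIX_SEG  SEGMENT
FLAG     DB   ?                 ;offset 0000H
         ALIGN 2                ;skips 1 byte
COUNT    DW   ?                 ;offset 0002H, word boundary
CODE1    DB   ?                 ;offset 0004H
         ALIGN 4                ;skips 3 bytes
TOTAL    DD   ?                 ;offset 0008H, doubleword boundary
MIX_SEG  ENDS

SORT_SEG SEGMENT
TOTAL2   DD   ?                 ;offset 0000H, doubleword boundary
COUNT2   DW   ?                 ;offset 0004H, word boundary
FLAG2    DB   ?                 ;offset 0006H
CODE2    DB   ?                 ;offset 0007H
SORT_SEG ENDS
         END
```

The first segment needs 12 bytes because of the padding, while the second holds the same four items in 8 bytes with every item naturally aligned and no ALIGN statements.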

ASSUME, EQU, and ORG. The equate directive (EQU) equates a numeric value, an ASCII value, or a label to another label. Equates make a program clearer and simplify debugging. Example 4–14 shows several equate statements and a few instructions that show how they function in a program.

EXAMPLE 4–14

;Using equate directive
;
= 000A              TEN      EQU   10
= 0009              NINE     EQU   9
0000 B0 0A                   MOV   AL,TEN
0002 04 09                   ADD   AL,NINE

The THIS directive always appears as THIS BYTE, THIS WORD, THIS DWORD, or THIS QWORD. In certain cases, data must be referred to as both a byte and a word. The assembler can only assign either a byte, word, or doubleword address to a label. To assign a byte label to a word, use the software listed in Example 4–15.

EXAMPLE 4–15

;Using the THIS and ORG directives
;
0000                DATA_SEG SEGMENT
0300                         ORG   300H
= 0300              DATA1    EQU   THIS BYTE
0300                DATA2    DW    ?
0302                DATA_SEG ENDS

0000                CODE_SEG SEGMENT 'CODE'
                             ASSUME CS:CODE_SEG, DS:DATA_SEG
0000 8A 1E 0300 R            MOV   BL,DATA1
0004 A1 0300 R               MOV   AX,DATA2
0007 8A 3E 0301 R            MOV   BH,DATA1+1
000B                CODE_SEG ENDS

This example also illustrates how the ORG (origin) statement changes the starting offset address of the data in the data segment to location 300H. At times, the origin of data or the code must be assigned to an absolute offset address with the ORG statement. The ASSUME statement tells the assembler what names have been chosen for the code, data, extra, and stack segments. Without the ASSUME statement, the assembler assumes nothing and automatically uses a segment override prefix on all instructions that address memory data. The ASSUME statement is only used with full-segment definitions, as described later in this section of the text.

PROC and ENDP. The PROC and ENDP directives indicate the start and end of a procedure (subroutine). These directives force structure because the procedure is clearly defined. Note that if structure is to be violated for whatever reason, use the CALLF, CALLN, RETF, and RETN instructions. Both the PROC and ENDP directives require a label to indicate the name of the procedure. The PROC directive, which indicates the start of a procedure, must also be followed with a NEAR or FAR. A NEAR procedure is one that resides in the same code segment as the program. A FAR procedure may reside at any location in the memory system. Often the call NEAR procedure is considered to be local, and the call FAR procedure is considered to be global. The term global denotes a procedure that can be used by any program; local defines a procedure that is only used by the current program. Any labels that are defined within the procedure block are also defined as either local (NEAR) or global (FAR).

Example 4–16 shows a procedure that adds BX, CX, and DX and stores the sum in register AX. Although this procedure is short and may not be particularly useful, it does illustrate how to use the PROC and ENDP directives to delineate the procedure. Note that information about the operation of the procedure should appear as a grouping of comments that show the registers changed by the procedure and the result of the procedure.

EXAMPLE 4–16

;A procedure that adds BX, CX, and DX with the
;sum stored in AX
;
0000                ADDEM    PROC  FAR           ;start of procedure
0000 03 D9                   ADD   BX,CX
0002 03 DA                   ADD   BX,DX
0004 8B C3                   MOV   AX,BX
0006 CB                      RET
0007                ADDEM    ENDP                ;end of procedure
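To show how such a procedure is used, the following minimal sketch calls a NEAR version of the same addition from a small DOS program. It assumes MASM 6.x with the SMALL model; the procedure name SUM3, the stack size, and the test values are hypothetical and were chosen only for illustration.

```asm
; Sketch: defining and calling a NEAR procedure in a model-based program.
; SUM3 and the values loaded into BX, CX, and DX are hypothetical.
        .MODEL SMALL
        .STACK 100H
        .CODE
        .STARTUP
        MOV  BX,1
        MOV  CX,2
        MOV  DX,3
        CALL SUM3               ;near call; returns with AX = 0006H
        .EXIT
;
;SUM3 adds BX, CX, and DX
;result: AX = sum; BX is also changed
;
SUM3    PROC NEAR
        ADD  BX,CX
        ADD  BX,DX
        MOV  AX,BX
        RET                     ;assembled as a near return
SUM3    ENDP
        END
```

Because the procedure is declared NEAR, the assembler generates a near CALL and a near RET; declaring it FAR, as in Example 4–16, would change both to their far forms.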

If version 6.x of the Microsoft MASM assembler program is available, the PROC directive can also specify registers that are saved automatically within the procedure. The USES statement indicates which registers are used by the procedure, so that the assembler can automatically save them before your procedure begins and restore them before the procedure ends with the RET instruction. For example, the ADDS PROC USES AX BX CX statement automatically pushes AX, BX, and CX on the stack before the procedure begins and pops them from the stack before the RET instruction executes at the end of the procedure. Example 4–17 illustrates a procedure written using MASM version 6.x that shows the USES statement. Note that the registers in the list are not separated by commas, but by spaces, and the PUSH and POP instructions are displayed in the procedure listing because it was assembled with the .LISTALL directive. The instructions prefaced with an asterisk (*) are inserted by the assembler and were not typed in the source file. The USES statement appears elsewhere in this text, so if MASM version 5.10 is in use, the code will need to be modified.

EXAMPLE 4–17

;A procedure that includes the USES directive to
;save BX, CX, and DX on the stack and restore them
;before the return instruction.

0000                ADDS     PROC  NEAR USES BX CX DX
0000 53           *          push  bx
0001 51           *          push  cx
0002 52           *          push  dx
0003 03 D8                   ADD   BX,AX
0005 03 CB                   ADD   CX,BX
0007 03 D1                   ADD   DX,CX
0009 8B C2                   MOV   AX,DX
                             RET
000B 5A           *          pop   dx
000C 59           *          pop   cx
000D 5B           *          pop   bx
000E C3           *          ret   0000h
000F                ADDS     ENDP
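For an assembler without USES, the same procedure can be written with the saves and restores typed by hand. The sketch below is one plausible hand-written equivalent of the procedure in Example 4–17, not output from any particular assembler version; it simply mirrors the push and pop sequence that the listing shows.

```asm
; Sketch: hand-written equivalent of ADDS PROC NEAR USES BX CX DX.
ADDS    PROC NEAR
        PUSH BX                 ;save registers in listed order
        PUSH CX
        PUSH DX
        ADD  BX,AX
        ADD  CX,BX
        ADD  DX,CX
        MOV  AX,DX              ;result returned in AX
        POP  DX                 ;restore in reverse order
        POP  CX
        POP  BX
        RET
ADDS    ENDP
```

The pops must appear in the reverse order of the pushes, and every exit path from the procedure needs the same restore sequence, which is the bookkeeping that USES performs automatically.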
