
Intel IA-32 Family


Academic year: 2022


(1)

Intel x86 Architecture

Computer Organization and Assembly Languages

Yung-Yu Chuang

with slides by Kip Irvine

(2)

Intel microprocessor history

(3)

Early Intel microprocessors

• Intel 8080 (1974)
– 64K addressable RAM
– 8-bit registers
– CP/M operating system

• Intel 8086/8088 (1978)
– IBM-PC used 8088 (my first computer, 1986)
– 1 MB addressable RAM
– 16-bit registers
– 16-bit data bus (8-bit for 8088)
– 5, 6, 8, 10 MHz
– 29K transistors
– separate floating-point unit (8087)
– used in low-cost microcontrollers now

(4)

The IBM-AT

• Intel 80286 (1982)

– 16 MB addressable RAM
– Protected memory
– several times faster than 8086
– introduced IDE bus architecture
– 80287 floating point unit
– Up to 20MHz
– 134K transistors

(5)

Intel IA-32 Family

• Intel386 (1985)

– 4 GB addressable RAM
– 32-bit registers
– paging (virtual memory)
– Up to 33MHz

• Intel486 (1989)
– instruction pipelining
– Integrated FPU
– 8K cache

• Pentium (1993)
– Superscalar (two parallel pipelines)

(6)

Intel P6 Family

• Pentium Pro (1995)

– advanced optimization techniques in microcode
– More pipeline stages
– On-board L2 cache

• Pentium II (1997)
– MMX (multimedia) instruction set
– Up to 450MHz

• Pentium III (1999)
– SIMD (streaming extensions) instructions (SSE)
– Up to 1+GHz

• Pentium 4 (2000)
– NetBurst micro-architecture, tuned for multimedia
– 3.8+GHz

• Pentium D (2005, Dual core)

(7)

IA32 Processors

• Totally Dominate Computer Market

• Evolutionary Design

– Starting in 1978 with 8086

– Added more features as time goes on

– Still support old features, although obsolete

• Complex Instruction Set Computer (CISC)

– Many different instructions with many different formats

• But, only small subset encountered with Linux programs

– Hard to match performance of Reduced Instruction Set Computers (RISC)


– But, Intel has done just that!

(8)

IA-32 Architecture

(9)

IA-32 architecture

• Lots of architecture improvements: pipelining, superscalar, branch prediction, hyperthreading and multi-core.

• From programmer’s point of view, IA-32 has not changed substantially except the introduction of a set of high-performance instructions

(10)

Modes of operation

• Protected mode
– native mode (Windows, Linux), full features, separate memory

• Virtual-8086 mode
– hybrid of Protected mode
– each program has its own 8086 computer

• Real-address mode
– native MS-DOS

• System management mode
– power management, system security, diagnostics

(11)

Addressable memory

• Protected mode

– 4 GB

– 32-bit address

• Real-address and Virtual-8086 modes

– 1 MB space

– 20-bit address
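The two sizes above follow directly from the address widths. A minimal sketch, assuming byte-addressable memory; the helper name `addressable_bytes` is made up for illustration:

```python
def addressable_bytes(address_bits: int) -> int:
    """Number of distinct byte addresses an address of this width can name."""
    return 2 ** address_bits

# 20-bit addresses (real-address and virtual-8086 modes)
print(addressable_bytes(20) // 2**20, "MB")
# 32-bit addresses (protected mode)
print(addressable_bytes(32) // 2**30, "GB")
```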


(12)

General-purpose registers

32-bit General-Purpose Registers: EAX, EBX, ECX, EDX, EBP, ESP, ESI, EDI

16-bit Segment Registers: CS, SS, DS, ES, FS, GS

EFLAGS

EIP

(13)

Accessing parts of registers

• Use 8-bit name, 16-bit name, or 32-bit name

• Applies to EAX, EBX, ECX, and EDX

[Figure: EAX (32 bits) contains AX (16 bits) as its lower half; AX is split into AH and AL (8 bits + 8 bits).]
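The relationship between EAX, AX, AH, and AL can be modelled with masks and shifts. This is only an illustrative sketch in Python, not how the hardware is programmed; the function name `sub_registers` is hypothetical:

```python
def sub_registers(eax: int) -> dict:
    """Return the AX, AH, and AL views of a 32-bit EAX value."""
    eax &= 0xFFFFFFFF        # keep only 32 bits
    ax = eax & 0xFFFF        # AX is the low 16 bits of EAX
    ah = (ax >> 8) & 0xFF    # AH is bits 8..15
    al = ax & 0xFF           # AL is bits 0..7
    return {"EAX": eax, "AX": ax, "AH": ah, "AL": al}

regs = sub_registers(0x12345678)
print({name: hex(value) for name, value in regs.items()})
```

The same masks apply to EBX, ECX, and EDX with their BX/BH/BL, CX/CH/CL, and DX/DH/DL names.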

(14)

Index and base registers

• Some registers have only a 16-bit name for their lower half (no 8-bit aliases). The 16-bit registers are usually used only in real-address mode.

(15)

Some specialized register uses

(1 of 2)

• General-Purpose

– EAX – accumulator (automatically used by division and multiplication)
– ECX – loop counter
– ESP – stack pointer (should never be used for arithmetic or data transfer)
– ESI, EDI – index registers (used for high-speed memory transfer instructions)
– EBP – extended frame pointer (stack)

(16)

Some specialized register uses

(2 of 2)

• Segment

– CS – code segment
– DS – data segment
– SS – stack segment
– ES, FS, GS - additional segments

• EIP – instruction pointer

• EFLAGS
– status and control flags
– each flag is a single binary bit (set or clear)

• Some other system registers such as IDTR, GDTR, LDTR etc.

(17)

Status flags

• Carry

– unsigned arithmetic out of range

• Overflow
– signed arithmetic out of range

• Sign
– result is negative

• Zero
– result is zero

• Auxiliary Carry
– carry from bit 3 to bit 4

• Parity
– sum of 1 bits is an even number
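To make the six definitions concrete, here is a rough Python model of how the flags come out after an 8-bit addition. It is a sketch under the assumption of 8-bit operands; `add8_flags` is a made-up helper, and the flag abbreviations CF, OF, SF, ZF, AF, PF are the conventional x86 names rather than something the slide spells out:

```python
def add8_flags(a: int, b: int) -> dict:
    """Add two 8-bit values and report the result plus the six status flags."""
    a &= 0xFF
    b &= 0xFF
    total = a + b
    result = total & 0xFF
    return {
        "result": result,
        # Carry: the unsigned sum does not fit in 8 bits
        "CF": int(total > 0xFF),
        # Overflow: both operands differ in sign from the result
        "OF": int(((a ^ result) & (b ^ result) & 0x80) != 0),
        # Sign: top bit of the result is set
        "SF": result >> 7,
        # Zero: result is zero
        "ZF": int(result == 0),
        # Auxiliary carry: carry out of bit 3 into bit 4
        "AF": int(((a & 0x0F) + (b & 0x0F)) > 0x0F),
        # Parity: the result has an even number of 1 bits
        "PF": int(bin(result).count("1") % 2 == 0),
    }

print(add8_flags(0x7F, 0x01))  # signed overflow, no carry
print(add8_flags(0xFF, 0x01))  # carry, zero result, no signed overflow
```

The two calls show why Carry and Overflow are separate flags: 7Fh + 01h is fine as an unsigned sum but out of range as a signed one, while FFh + 01h is the reverse.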

(18)

Floating-point, MMX, XMM registers

• Eight 80-bit floating-point data registers
– ST(0), ST(1), . . . , ST(7)
– arranged in a stack
– used for all floating-point arithmetic

• Eight 64-bit MMX registers

• Eight 128-bit XMM registers for single-instruction multiple-data (SIMD) operations

(19)

Programmer’s model


(20)

Programmer’s model


(21)

IA-32 Memory Management

(22)

Real-address mode

• 1 MB RAM maximum addressable (20-bit address)

• Application programs can access any area of memory

• Single tasking

• Supported by MS-DOS operating system


(23)

Segmented memory

Segmented memory addressing: absolute (linear) address is a combination of a 16-bit segment value added to a 16-bit offset

[Figure: 1 MB memory map marked in 10000h steps from 00000 to F0000. One segment (64K) begins at 8000:0000 (linear 80000) and ends at 8000:FFFF; the byte at 8000:0250 lies at offset 0250 within it. Addresses are written seg:ofs.]

(24)

Calculating linear addresses

• Given a segment address, multiply it by 16 (add a hexadecimal zero), and add it to the offset

• Example: convert 08F1:0100 to a linear address

Adjusted Segment value: 0 8 F 1 0
Add the offset:           0 1 0 0
Linear address:         0 9 0 1 0

• A typical program has three segments: code, data and stack. Segment registers CS, DS and SS are used to store them separately.

(25)

Example

What linear address corresponds to the segment/offset address 028F:0030?

028F0 + 0030 = 02920

Always use hexadecimal notation for addresses.
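The two conversions worked by hand above can be checked with a few lines of Python. This is a small sketch of the shift-and-add rule only; `linear_address` is a hypothetical helper name:

```python
def linear_address(segment: int, offset: int) -> int:
    """Real-address mode: shift the 16-bit segment left 4 bits, add the 16-bit offset."""
    return ((segment & 0xFFFF) << 4) + (offset & 0xFFFF)

print(f"{linear_address(0x08F1, 0x0100):05X}")  # 09010
print(f"{linear_address(0x028F, 0x0030):05X}")  # 02920
```

Shifting left by 4 bits is the same as multiplying by 16, which is why the adjusted segment value simply gains one trailing hexadecimal zero.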


(26)

Protected mode

(1 of 2)

• 4 GB addressable RAM (32-bit address)

– (00000000 to FFFFFFFFh)

• Each program assigned a memory partition which is protected from other programs

• Designed for multitasking

• Supported by Linux & MS-Windows


(27)

Protected mode

(2 of 2)

• Segment descriptor tables

• Program structure

– code, data, and stack areas
– CS, DS, SS segment descriptors
– global descriptor table (GDT)

• MASM Programs use the Microsoft flat memory model

(28)

Flat segmentation model

• All segments are mapped to the entire 32-bit physical address space; there are at least two segments, one for data and one for code

• global descriptor table (GDT)

(29)

Multi-segment model

• Each program has a local descriptor table (LDT)

– holds descriptor for each segment used by the program

[Figure: Local Descriptor Table and the RAM segments it describes. Columns are base, limit, access; the access values are not shown.]

base      limit
00026000  0010
00008000  000A
00003000  0002

The segments start at 3000, 8000, and 26000 in RAM; each limit is multiplied by 1000h.
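As a rough check on the figure, the extent of each segment can be computed from its base and limit. This sketch takes the figure's "multiplied by 1000h" note at face value (size = limit × 1000h), which is a simplification of how real descriptors encode limits; `segment_extent` and the `LDT` list are names invented for the example:

```python
# (base, limit) pairs as read from the figure's Local Descriptor Table
LDT = [(0x00026000, 0x0010), (0x00008000, 0x000A), (0x00003000, 0x0002)]

def segment_extent(base: int, limit: int, unit: int = 0x1000) -> tuple:
    """Return (first, last) linear address of a segment, assuming size = limit * unit."""
    size = limit * unit
    return base, base + size - 1

for base, limit in LDT:
    first, last = segment_extent(base, limit)
    print(f"{first:08X}-{last:08X}")
```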

(30)

Translating Addresses

• The IA-32 processor uses a one- or two-step process to convert a variable's logical address into a unique memory location.

• The first step combines a segment value with a variable’s offset to create a linear address.

• The second optional step, called page translation, converts a linear address to a physical address.

(31)

Converting Logical to Linear Address

The segment selector points to a segment descriptor, which contains the base address of a memory segment. The 32-bit offset from the logical address is added to the segment’s base address, generating a 32-bit linear address.

[Figure: a logical address consists of a selector and an offset. The selector picks a segment descriptor from the descriptor table, whose base address is held in GDTR/LDTR. The descriptor's base address plus the offset gives the linear address.]

(32)

Indexing into a Descriptor Table

Each segment selector indexes into the program's local descriptor table (LDT). Each table entry is mapped to a linear address:

[Figure: logical addresses, the Local Descriptor Table located by the LDTR register, and the linear address space (DRAM, with an unused region).]

Logical addresses (selector : offset)
SS:ESP      0018 : 0000003A
DS:offset   0010 : 000001B6
CS:IP       0008 : 00002CD3

Local Descriptor Table
(index)  base
18       001A0000
10       0002A000
08       0001A000
00       00003000
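The lookup in the figure can be sketched in Python using the figure's own numbers. This is a simplified model, assuming the selector is used directly as the table index as the figure draws it and ignoring limit and access checks; `logical_to_linear` is a hypothetical name:

```python
# Local descriptor table from the figure: selector (index) -> segment base address
LDT = {0x18: 0x001A0000, 0x10: 0x0002A000, 0x08: 0x0001A000, 0x00: 0x00003000}

def logical_to_linear(selector: int, offset: int, table: dict = LDT) -> int:
    """Look up the segment base for a selector and add the 32-bit offset."""
    base = table[selector]
    return (base + offset) & 0xFFFFFFFF

print(f"SS:ESP -> {logical_to_linear(0x18, 0x0000003A):08X}")
print(f"DS:ofs -> {logical_to_linear(0x10, 0x000001B6):08X}")
print(f"CS:IP  -> {logical_to_linear(0x08, 0x00002CD3):08X}")
```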

(33)

Paging

(1 of 2)

• Virtual memory uses disk as part of the memory, so the total size of all running programs can be larger than physical memory

• Only part of a program must be kept in memory, while the remaining parts are kept on disk.

• The memory used by the program is divided into small units called pages (4096-byte).

• As the program runs, the processor selectively unloads inactive pages from memory and loads other pages that are immediately required.

(34)

Paging

(2 of 2)

• OS maintains page directory and page tables

• Page translation: CPU converts the linear address into a physical address

• Page fault: occurs when a needed page is not in memory, and the CPU interrupts the program

• Virtual memory manager (VMM) – OS utility that manages the loading and unloading of pages

• OS copies the page into memory, program resumes execution
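The page-fault cycle described above can be imitated with a toy simulation. This is only a sketch: the class `ToyPager` is invented for illustration, and the choice to unload the oldest page first (FIFO) is an assumption, since the slides do not say which page the VMM unloads:

```python
from collections import OrderedDict

PAGE_SIZE = 4096  # bytes per page, as on the earlier paging slide

class ToyPager:
    """Tracks which pages are resident in a fixed number of physical frames."""

    def __init__(self, frames: int):
        self.frames = frames
        self.resident = OrderedDict()  # page number -> True, oldest first
        self.faults = 0

    def access(self, address: int) -> bool:
        """Touch an address; return True if the access caused a page fault."""
        page = address // PAGE_SIZE
        if page in self.resident:
            return False                       # page already in memory
        self.faults += 1                       # page fault: program is interrupted
        if len(self.resident) >= self.frames:
            self.resident.popitem(last=False)  # unload the oldest page (assumed FIFO)
        self.resident[page] = True             # copy the page in, program resumes
        return True

pager = ToyPager(frames=2)
for addr in (0, 100, 4096, 8192, 0):
    print(addr, "fault" if pager.access(addr) else "hit")
```

With only two frames, the final access to address 0 faults again because its page was unloaded to make room for page 2.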

(35)

Page Translation

A linear address is divided into a page directory field, page table field, and page frame offset. The CPU uses all three to calculate the physical address.

[Figure: the 32-bit linear address is split into Directory (10 bits), Table (10 bits), and Offset (12 bits). CR3 locates the page directory; the directory entry selects a page table; the page-table entry selects a page frame; the offset within that frame gives the physical address.]
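The 10/10/12 split can be written out with shifts and masks. The sketch below models the page directory and page tables as nested Python dictionaries, which is purely an illustrative assumption (real entries also carry present and permission bits that are ignored here); `split_linear`, `translate`, and the sample mapping are all hypothetical:

```python
def split_linear(linear: int) -> tuple:
    """Split a 32-bit linear address into (directory, table, offset) fields."""
    directory = (linear >> 22) & 0x3FF   # top 10 bits
    table = (linear >> 12) & 0x3FF       # middle 10 bits
    offset = linear & 0xFFF              # low 12 bits
    return directory, table, offset

def translate(linear: int, page_directory: dict) -> int:
    """Walk directory -> page table -> page frame, then add the offset."""
    directory, table, offset = split_linear(linear)
    page_table = page_directory[directory]   # KeyError models a missing entry
    frame_base = page_table[table]
    return frame_base + offset

# Made-up mapping: directory entry 0, table entry 1Ah -> frame at 00345000h
page_directory = {0: {0x1A: 0x00345000}}
print(f"{translate(0x0001A123, page_directory):08X}")
```

The 12-bit offset matches the 4096-byte page size: 2^12 = 4096, so the offset can name every byte within one page frame.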
