Real Arithmetic


Computer Organization and Assembly Languages
Yung-Yu Chuang

Fractional binary numbers

[Figure: bit positions b_i b_(i–1) ... b₂ b₁ b₀ . b_(–1) b_(–2) b_(–3) ... b_(–j), with weights 2ⁱ, 2^(i–1), ..., 4, 2, 1 to the left of the binary point and 1/2, 1/4, 1/8, ..., 2^(–j) to the right]

• Representation
– Bits to right of “binary point” represent fractional powers of 2
– Represents rational number: $\sum_{k=-j}^{i} b_k \cdot 2^k$
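A minimal sketch of the sum above, written in Python for illustration. The helper name `binary_to_fraction` is hypothetical; it uses exact rationals so that no rounding hides the result.

```python
from fractions import Fraction

def binary_to_fraction(bits: str) -> Fraction:
    """Evaluate a binary string such as '101.11' as the sum of b_k * 2**k."""
    int_part, _, frac_part = bits.partition(".")
    total = Fraction(0)
    # Integer bits: the rightmost one has weight 2**0.
    for k, b in enumerate(reversed(int_part)):
        total += int(b) * Fraction(2) ** k
    # Fractional bits: the first one after the point has weight 2**-1.
    for k, b in enumerate(frac_part, start=1):
        total += int(b) * Fraction(1, 2 ** k)
    return total

print(binary_to_fraction("101.11"))  # 23/4, i.e. 5 3/4
```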

Binary real numbers

• Binary real to decimal real

• Decimal real to binary real

4.5625 = 100.1001₂
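The decimal-to-binary direction can be sketched by repeatedly doubling the fractional part and peeling off the integer bit. This is an illustrative Python sketch; `to_binary` and its `max_frac_bits` cutoff are assumptions, and it handles non-negative values only.

```python
from fractions import Fraction

def to_binary(value, max_frac_bits=16):
    """Convert a non-negative decimal (given as a string or number) to a binary string."""
    value = Fraction(value)
    whole = int(value)
    frac = value - whole
    digits = []
    # Doubling shifts the next fractional bit into the integer position.
    while frac and len(digits) < max_frac_bits:
        frac *= 2
        bit = int(frac)
        digits.append(str(bit))
        frac -= bit
    return bin(whole)[2:] + ("." + "".join(digits) if digits else "")

print(to_binary("4.5625"))  # 100.1001
```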

Fractional binary numbers examples

• Value     Representation
  5 3/4     101.11₂
  2 7/8     10.111₂
  63/64     0.111111₂

• Value     Representation
  1/3       0.0101010101[01]…₂
  1/5       0.001100110011[0011]…₂
  1/10      0.0001100110011[0011]…₂
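The bracketed groups in the second table are repeating cycles. One way to find them, sketched here in Python with a hypothetical helper `binary_expansion`, is binary long division that remembers each remainder: when a remainder comes back, the bits since its first appearance repeat forever.

```python
def binary_expansion(num, den, limit=64):
    """Return (prefix_bits, repeating_bits) for num/den, assuming 0 <= num < den."""
    bits, seen = [], {}
    rem = num
    while rem and rem not in seen and len(bits) < limit:
        seen[rem] = len(bits)      # position where this remainder first occurs
        rem *= 2
        bits.append(str(rem // den))
        rem %= den
    if rem == 0 or rem not in seen:
        # Terminating expansion (or the limit was reached before a cycle was found).
        return "".join(bits), ""
    start = seen[rem]
    return "".join(bits[:start]), "".join(bits[start:])

print(binary_expansion(1, 3))    # ('', '01')
print(binary_expansion(1, 10))   # ('0', '0011')
print(binary_expansion(63, 64))  # ('111111', '')
```

Because 1/10 never terminates in binary, a stored value of 0.1 is only an approximation.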

Fixed-point numbers

[Figure: 32-bit fixed-point layout with a sign bit, an integer part and a fractional part separated by the radix point]

0 000 0000 0000 0110 0110 0000 0000 0000 = 110.011₂

• Range is only 2¹⁶ to 2⁻¹⁶. Not flexible, not adaptive to applications.
• Fast computation, just integer operations. It is often a good way to speed things up if you know the working range beforehand.
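A small Python sketch of why fixed-point is "just integer operations". It assumes the 16 fractional bits of the example above; the helper names are hypothetical.

```python
FRAC_BITS = 16            # assumed: 16 bits to the right of the radix point
ONE = 1 << FRAC_BITS      # the fixed-point encoding of 1.0

def to_fixed(x: float) -> int:
    return round(x * ONE)

def to_float(f: int) -> float:
    return f / ONE

def fixed_mul(a: int, b: int) -> int:
    # The raw product carries 32 fractional bits; shift back down to 16.
    return (a * b) >> FRAC_BITS

a = to_fixed(6.375)       # 110.011 in binary
b = to_fixed(1.5)
print(hex(a))                     # 0x66000
print(to_float(a + b))            # 7.875 (addition is a plain integer add)
print(to_float(fixed_mul(a, b)))  # 9.5625
```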

IEEE floating point

• IEEE Standard 754
– Established in 1985 as uniform standard for floating-point arithmetic
• Before that, many idiosyncratic formats
– Supported by all major CPUs
• Driven by Numerical Concerns
– Nice standards for rounding, overflow, underflow
– Hard to make go fast
• Numerical analysts predominated over hardware types in defining standard

IEEE floating point format

• IEEE defines two formats with different precisions: single and double

23.85 = 10111.110110…₂ = 1.0111110110…×2⁴
e = 127+4 = 83h

0 100 0001 1 011 1110 1100 1100 1100 1100
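The field split can be checked with Python's standard `struct` module. This is a sketch and `float32_fields` is a hypothetical helper. `struct` rounds to nearest, so the last fraction bit comes out as 1, whereas the hand conversion above truncates the repeating pattern.

```python
import struct

def float32_fields(x: float):
    """Return (word, sign, exponent, fraction) of x encoded as IEEE single."""
    (word,) = struct.unpack(">I", struct.pack(">f", x))
    sign = word >> 31               # 1 bit
    exponent = (word >> 23) & 0xFF  # 8 bits, biased by 127
    fraction = word & 0x7FFFFF      # 23 bits, implicit leading 1 not stored
    return word, sign, exponent, fraction

word, sign, exponent, fraction = float32_fields(23.85)
print(f"{word:032b}")
print(sign, hex(exponent), f"{fraction:023b}")
```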

IEEE floating point format

[Figure: special values]

IEEE double precision

Denormalized numbers

• A number smaller than 1.0×2⁻¹²⁶ can’t be represented by a single in normalized form. However, we can represent it in denormalized format.
• 1.0000..00×2⁻¹²⁶ is the least “normalized” number
• 0.1111..11×2⁻¹²⁶ is the largest “denormalized” number
• 1.001×2⁻¹²⁹ = 0.001001×2⁻¹²⁶
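These boundary values can be built directly from bit patterns. This is a Python sketch using the standard `struct` module, and `from_bits` is a hypothetical helper. An exponent field of 0 selects the denormalized form, where the leading bit is 0 and the exponent is fixed at -126.

```python
import struct

def from_bits(word: int) -> float:
    """Interpret a 32-bit pattern as an IEEE single."""
    return struct.unpack(">f", struct.pack(">I", word))[0]

smallest_normal = from_bits(0x00800000)    # exponent field 1, fraction 0
largest_denormal = from_bits(0x007FFFFF)   # exponent field 0, fraction all ones
example = from_bits(0x00120000)            # fraction 001001000..., i.e. 0.001001 x 2^-126

print(smallest_normal == 2.0 ** -126)
print(largest_denormal < smallest_normal)
print(example == 1.125 * 2.0 ** -129)      # 1.001 (binary) x 2^-129
```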

Summary of Real Number Encodings

[Figure: number line of encodings, from left to right: NaN, -∞, -Normalized, -Denorm, -0, +0, +Denorm, +Normalized, +∞, NaN]

(3.14+1e20)-1e20 = 0
3.14+(1e20-1e20) = 3.14
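The two expressions show that floating-point addition is not associative. Python floats are IEEE doubles, so the effect can be reproduced directly: 3.14 is far below the spacing between adjacent doubles near 1e20 and is lost in the first sum.

```python
a = (3.14 + 1e20) - 1e20   # 3.14 is absorbed by 1e20 before the subtraction
b = 3.14 + (1e20 - 1e20)   # the large terms cancel first, so 3.14 survives
print(a, b)                # 0.0 3.14
```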

IA-32 floating point architecture

• The original 8086 only has integers. It is possible to simulate real arithmetic using software, but it is slow.
• The 8087 floating-point processor (and 80287, 80387) was sold separately in the early days.
• Since the 80486, the FPU (floating-point unit) has been integrated into the CPU.

FPU data types

• Three floating-point types
• Four integer types

FPU registers

• Data register
• Control register
• Status register
• Tag register

Data registers

• Load: push, TOP--
• Store: pop, TOP++
• Instructions access the stack using ST(i) relative to TOP
• If TOP=0 and push, TOP wraps to R7
• If TOP=7 and pop, TOP wraps to R0
• When overwriting occurs, generate an exception
• Real values are transferred to and from memory and stored in 10-byte temporary format. When storing, convert back to integer, long, real, long real.

[Figure: eight 80-bit data registers R0 to R7 (bits 79 to 0); TOP = 010 points at R2, which is ST(0), so R3 is ST(1) and R4 is ST(2)]

Postfix expression

• (5*6)-4 → 5 6 * 4 -

Input    Stack (top first)
5        5
6        6 5
*        30
4        4 30
-        26
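The FPU register stack is used in the same way a postfix evaluator uses its stack. A minimal Python sketch of such an evaluator follows; `eval_postfix` is a hypothetical name and it supports the four basic operators only.

```python
def eval_postfix(expr: str) -> float:
    """Evaluate a space-separated postfix expression using an explicit stack."""
    ops = {
        "+": lambda a, b: a + b,
        "-": lambda a, b: a - b,
        "*": lambda a, b: a * b,
        "/": lambda a, b: a / b,
    }
    stack = []
    for tok in expr.split():
        if tok in ops:
            b = stack.pop()          # top of stack is the right operand
            a = stack.pop()
            stack.append(ops[tok](a, b))
        else:
            stack.append(float(tok))  # operand: push
    return stack.pop()

print(eval_postfix("5 6 * 4 -"))  # 26.0
```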

Special-purpose registers

• Last data pointer stores the memory address of the operand for the last non-control instruction. Last instruction pointer stores the address of the last non-control instruction. Both are 48 bits, 32 for offset, 16 for segment selector.

Control register

[Figure: control word layout; initial value 037Fh; one field is marked “for compatibility only”]

The instruction FINIT will initialize it to 037Fh.
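To see what 037Fh selects, the word can be split into fields. The bit positions in this Python sketch are an assumption taken from Intel's documented x87 control word layout (mask bits 0 to 5, precision control in bits 8 and 9, rounding control in bits 10 and 11), since the slide's figure did not survive. `decode_control_word` is a hypothetical helper.

```python
def decode_control_word(cw: int) -> dict:
    """Split an x87 control word into mask, precision and rounding fields."""
    mask_names = ["IM", "DM", "ZM", "OM", "UM", "PM"]   # bits 0..5
    masks = {name: bool((cw >> bit) & 1) for bit, name in enumerate(mask_names)}
    precision = {
        0b00: "single (24 bits)",
        0b10: "double (53 bits)",
        0b11: "extended (64 bits)",
    }.get((cw >> 8) & 0b11, "reserved")
    rounding = ["nearest even", "down", "up", "truncate"][(cw >> 10) & 0b11]
    return {"masks": masks, "precision": precision, "rounding": rounding}

print(decode_control_word(0x037F))
```

Under that layout, the FINIT default masks all six exceptions, uses extended precision and rounds to nearest even.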

Rounding

• FPU attempts to round an infinitely accurate result from a floating-point calculation
– Round to nearest even: round toward the closest one; if both are equally close, round to the even one
– Round down: round toward -∞
– Round up: round toward +∞
– Truncate: round toward zero
• Example
– suppose 3 fractional bits can be stored, and a calculated value equals +1.0111.
– rounding up by adding .0001 produces 1.100
– rounding down by subtracting .0001 produces 1.011

Rounding

method                  original value   rounded value
Round to nearest even   1.0111           1.100
Round down              1.0111           1.011
Round up                1.0111           1.100
Truncate                1.0111           1.011

method                  original value   rounded value
Round to nearest even   -1.0111          -1.100
Round down              -1.0111          -1.100
Round up                -1.0111          -1.011
Truncate                -1.0111          -1.011
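The two tables can be reproduced with exact rationals. In this Python sketch, `round_binary` and the mode names are hypothetical. It scales the value so the kept bits become an integer, applies the mode, and scales back. Python's `round` on a `Fraction` rounds ties to even, which matches the first mode.

```python
from fractions import Fraction
import math

def round_binary(value: Fraction, frac_bits: int, mode: str) -> Fraction:
    """Round value to frac_bits binary fractional bits using the given mode."""
    scaled = value * 2 ** frac_bits
    if mode == "nearest_even":
        n = round(scaled)        # ties go to the even integer
    elif mode == "down":
        n = math.floor(scaled)   # toward -infinity
    elif mode == "up":
        n = math.ceil(scaled)    # toward +infinity
    elif mode == "truncate":
        n = math.trunc(scaled)   # toward zero
    else:
        raise ValueError(f"unknown mode: {mode}")
    return Fraction(n, 2 ** frac_bits)

x = Fraction(23, 16)             # 1.0111 in binary
for mode in ("nearest_even", "down", "up", "truncate"):
    print(mode, float(round_binary(x, 3, mode)), float(round_binary(-x, 3, mode)))
```

Here 1.5 is 1.100 and 1.375 is 1.011 in binary.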

Floating-Point Exceptions

• Six types of exception conditions
– #I: Invalid operation
– #Z: Divide by zero
– #D: Denormalized operand
– #O: Numeric overflow
– #U: Numeric underflow
– #P: Inexact precision
(#I, #Z, #D: detect before execution; #O, #U, #P: detect after execution)
• Each has a corresponding mask bit
– if set when an exception occurs, the exception is handled automatically by FPU
– if clear when an exception occurs, a software exception handler is invoked

Status register


C₃-C₀: condition bits after comparisons

FPU data types

.data
bigVal REAL10 1.212342342234234243E+864
.code
fld bigVal

FPU instruction set

• Instruction mnemonics begin with letter F
• Second letter identifies data type of memory operand
– B = bcd
– I = integer
– no letter: floating point
• Examples
– FBLD load binary coded decimal
– FISTP store integer and pop stack
– FMUL multiply floating-point operands

FPU instruction set

• Fop {destination}, {source}
• Operands
– zero, one, or two
• fadd
• fadd [a]
• fadd st, st(1)
– no immediate operands
– no general-purpose registers (EAX, EBX, ...) (FSTSW is the only exception, which stores the FPU status word to AX)
– destination must be a stack register
– integers must be loaded from memory onto the stack and converted to floating-point before being used in calculations

Classic stack (0-operand)

• ST(0) as source, ST(1) as destination. Result is stored at ST(1) and ST(0) is popped, leaving the result on the top. (with 0 operand, fadd=faddp)

Memory operand (1-operand)

• ST(0) as the implied destination. The second operand is from memory.

Register operands (2-operand)

• Register: operands are FP data registers, one must be ST.
• Register pop: the same as register with a ST pop afterwards.

Example: evaluating an expression


Load

FLDPI  stores π
FLDL2T stores log₂(10)
FLDL2E stores log₂(e)
FLDLG2 stores log₁₀(2)
FLDLN2 stores ln(2)

load

.data
array REAL8 10 DUP(?)
.code
fld array                ; direct
fld [array+16]           ; direct-offset
fld REAL8 PTR[esi]       ; indirect
fld array[esi]           ; indexed
fld array[esi*8]         ; indexed, scaled
fld REAL8 PTR[ebx+esi]   ; base-index
fld array[ebx+esi]       ; base-index-displacement

Store

fst dblOne     ; 200.0
fst dblTwo     ; 200.0
fstp dblThree  ; 200.0
fstp dblFour   ; 32.0

Arithmetic instructions

FCHS ; change sign of ST
FABS ; ST=|ST|

Floating-Point add

• FADD
– adds source to destination
– No-operand version pops the FPU stack after addition
• Examples: [Figure: FADD examples]

Floating-Point subtract

• FSUB
– subtracts source from destination.
– No-operand version pops the FPU stack after subtracting
• Example:
fsub mySingle      ; ST -= mySingle
fsub array[edi*8]  ; ST -= array[edi*8]

Floating-point multiply/divide

• FMUL
– Multiplies source by destination, stores product in destination
• FDIV
– Divides destination by source, then pops the stack

Miscellaneous instructions

.data
x REAL4 2.75
five REAL4 5.2
.code
fld five   ; ST0=5.2
fld x      ; ST0=2.75, ST1=5.2
fscale     ; ST0=2.75*32=88
           ; ST1=5.2
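The factor 32 in the comment is 2⁵: FSCALE truncates ST1 toward zero and scales ST0 by that power of two. A Python sketch of the same arithmetic follows; the function name `fscale` is only a stand-in for the instruction.

```python
import math

def fscale(st0: float, st1: float) -> float:
    """Model of FSCALE: st0 * 2**trunc(st1)."""
    return math.ldexp(st0, math.trunc(st1))

print(fscale(2.75, 5.2))  # 88.0
```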

Example: compute distance

; compute D=sqrt(x^2+y^2)
fld x      ; load x
fld st(0)  ; duplicate x
fmul       ; x*x
fld y      ; load y
fld st(0)  ; duplicate y
fmul       ; y*y
fadd       ; x*x+y*y
fsqrt
fst D

Example: expression

; expression: valD = -valA + (valB * valC).
.data
valA REAL8 1.5
valB REAL8 2.5
valC REAL8 3.0
valD REAL8 ?   ; will be +6.0
.code
fld valA       ; ST(0) = valA
fchs           ; change sign of ST(0)
fld valB       ; load valB into ST(0)
fmul valC      ; ST(0) *= valC
fadd           ; ST(0) += ST(1)
fstp valD      ; store ST(0) to valD

Example: array sum

.data
N = 20
array REAL8 N DUP(1.0)
sum REAL8 0.0
.code
mov ecx, N
mov esi, OFFSET array
fldz                      ; ST0 = 0
lp: fadd REAL8 PTR [esi]  ; ST0 += *(esi)
add esi, 8                ; move to next double
loop lp
fstp sum                  ; store result

Comparisons

• The above instructions change the FPU’s status register, and the following instructions are used to transfer the condition bits to the CPU.
• SAHF copies C₀ into carry, C₂ into parity and C₃ into zero. Since the sign and overflow flags are not set, use conditional jumps for unsigned integers (ja, jae, jb, jbe, je, jz).
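The reason the unsigned jumps work can be traced in a small Python model. The condition-code table in `fcom` is an assumption based on Intel's documented FCOM behaviour (ST > src gives C₃C₂C₀ = 000, ST < src gives 001, equal gives 100, unordered gives 111). The function names simply mirror the instructions and are not a real API.

```python
import math

def fcom(st0: float, src: float):
    """Return (C3, C2, C0) as a comparison of ST(0) with src would set them."""
    if math.isnan(st0) or math.isnan(src):
        return (1, 1, 1)          # unordered
    if st0 > src:
        return (0, 0, 0)
    if st0 < src:
        return (0, 0, 1)
    return (1, 0, 0)              # equal

def sahf(c3: int, c2: int, c0: int) -> dict:
    """C0 -> carry, C2 -> parity, C3 -> zero."""
    return {"CF": c0, "PF": c2, "ZF": c3}

def ja(flags: dict) -> bool:      # jump if above: CF=0 and ZF=0
    return flags["CF"] == 0 and flags["ZF"] == 0

def jb(flags: dict) -> bool:      # jump if below: CF=1
    return flags["CF"] == 1

def je(flags: dict) -> bool:      # jump if equal: ZF=1
    return flags["ZF"] == 1

print(ja(sahf(*fcom(2.0, 1.0))))  # True
print(jb(sahf(*fcom(1.0, 2.0))))  # True
```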


Branching after FCOM

• Required steps:
1. Use the FSTSW instruction to move the FPU status word into AX.
2. Use the SAHF instruction to copy AH into the EFLAGS register.
3. Use JA, JB, etc. to do the branching.
• Pentium Pro supports two new comparison instructions that directly modify CPU’s FLAGS.
FCOMI ST(0), src  ; src=STn
FCOMIP ST(0), src
Example
fcomi ST(0), ST(1)
jnb Label1

Example: comparison

.data
x REAL8 1.0
y REAL8 2.0
.code
; if (x>y) return 1 else return 0
fld x          ; ST0 = x
fcomp y        ; compare ST0 and y
fstsw ax       ; move C bits into FLAGS
sahf
jna else_part  ; if x not above y, ...
then_part:
mov eax, 1
jmp end_if
else_part:
mov eax, 0
end_if:

Example: comparison

.data
x REAL8 1.0
y REAL8 2.0
.code
; if (x>y) return 1 else return 0
fld y          ; ST0 = y
fld x          ; ST0 = x, ST1 = y
fcomi ST(0), ST(1)
jna else_part  ; if x not above y, ...
then_part:
mov eax, 1
jmp end_if
else_part:
mov eax, 0
end_if:

Comparing for equality

• Do not compare floating-point values directly, because of limited precision. For example,
sqrt(2.0)*sqrt(2.0) != 2.0

instruction          FPU stack
fld two              ST(0): +2.0000000E+000
fsqrt                ST(0): +1.4142135E+000
fmul ST(0), ST(0)    ST(0): +2.0000000E+000
fsub two             ST(0): +4.4408921E-016

Comparing for equality

• Calculate the absolute value of the difference between two floating-point values
.data
epsilon REAL8 1.0E-12  ; difference value
val2 REAL8 0.0         ; value to compare
val3 REAL8 1.001E-13   ; considered equal to val2
.code
; if( val2 == val3 ), display "Values are equal".
fld epsilon
fld val2
fsub val3
fabs
fcomi ST(0),ST(1)
ja skip
mWrite <"Values are equal",0dh,0ah>
skip:
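The same tolerance test can be sketched in Python, whose floats are IEEE doubles. `nearly_equal` is a hypothetical helper, and the default epsilon is the value used in the listing above.

```python
import math

def nearly_equal(a: float, b: float, epsilon: float = 1.0e-12) -> bool:
    """Treat a and b as equal when |a - b| does not exceed epsilon."""
    return abs(a - b) <= epsilon

two = math.sqrt(2.0) * math.sqrt(2.0)
print(two == 2.0)                   # False: direct comparison fails
print(two - 2.0)                    # 4.440892098500626e-16
print(nearly_equal(two, 2.0))       # True
print(nearly_equal(0.0, 1.001e-13)) # True, as in the val2/val3 case
```

A fixed epsilon only suits values of known magnitude; it has to be chosen for the working range.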

Example: quadratic formula


Other instructions

• F2XM1   ; ST=2^ST(0)-1; ST in [-1,1]
• FYL2X   ; ST=ST(1)*log₂(ST(0))
• FYL2XP1 ; ST=ST(1)*log₂(ST(0)+1)
• FPTAN   ; ST(0)=1;ST(1)=tan(ST)
• FPATAN  ; ST=arctan(ST(1)/ST(0))
• FSIN    ; ST=sin(ST) in radians
• FCOS    ; ST=cos(ST) in radians
• FSINCOS ; ST(0)=cos(ST);ST(1)=sin(ST)
