### Real Arithmetic

*Computer Organization and Assembly Languages*
*Yung-Yu Chuang*

*2005/12/22*

**Announcement**

• Grade for homework #3 is online

• You are encouraged to work in pairs for your final project

• You will have to schedule a demo session with your TA for your final project

• The final project is due the week after finals week

**Binary real numbers**

• Binary real to decimal real

• Decimal real to binary real

$4.5625 = 100.1001_2$
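
The two conversions listed above can be sketched in Python. This is a minimal illustration, not part of the original slides: the function names `binary_real_to_decimal` and `decimal_to_binary_real` are hypothetical, inputs are assumed non-negative, and exact rational arithmetic (`fractions.Fraction`) is used so that no rounding hides the digit-by-digit process.

```python
from fractions import Fraction


def binary_real_to_decimal(bits: str) -> Fraction:
    """Sum the place values of a binary real such as '100.1001'."""
    int_part, _, frac_part = bits.partition(".")
    value = Fraction(int(int_part, 2) if int_part else 0)
    # The i-th digit after the binary point has weight 2^-i.
    for i, bit in enumerate(frac_part, start=1):
        if bit == "1":
            value += Fraction(1, 2 ** i)
    return value


def decimal_to_binary_real(x, max_frac_bits: int = 24) -> str:
    """Repeatedly double the fraction; each carry past 1 is the next bit."""
    x = Fraction(x)  # accepts strings such as "4.5625" exactly
    int_part = x.numerator // x.denominator
    frac = x - int_part
    digits = []
    while frac and len(digits) < max_frac_bits:
        frac *= 2
        bit = int(frac >= 1)
        digits.append(str(bit))
        frac -= bit
    return f"{int_part:b}" + ("." + "".join(digits) if digits else "")
```

For example, `decimal_to_binary_real("4.5625")` gives `"100.1001"`, and `decimal_to_binary_real("23.85", 6)` gives `"10111.110110"`. The second expansion is cut off after six fraction bits because 23.85 has no finite binary representation.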

Single-precision layout of 23.85 (sign | exponent | fraction, with the fraction truncated): 0 | 10000011 | 01111101100110011001100

**IEEE floating point format**

• IEEE defines two formats with different precisions: single and double

$23.85 \approx 10111.110110_2 = 1.0111110110_2 \times 2^{4}$

Biased exponent: $e = 127 + 4 = 131 = 83\text{h}$
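
The field values worked out above can be checked by unpacking a single-precision encoding. The sketch below uses Python's standard `struct` module; the helper name `float32_fields` is an assumption made for illustration. Note that `struct.pack` rounds to nearest, so the last fraction bit can differ from a truncated hand expansion.

```python
import struct


def float32_fields(x: float):
    """Return (sign, biased exponent, fraction) of x encoded as IEEE single."""
    # Pack as big-endian 32-bit float, then reinterpret the bytes as an integer.
    (bits,) = struct.unpack(">I", struct.pack(">f", x))
    sign = bits >> 31                # 1 bit
    exponent = (bits >> 23) & 0xFF   # 8 bits, biased by 127
    fraction = bits & 0x7FFFFF       # 23 bits, hidden leading 1 not stored
    return sign, exponent, fraction


sign, exponent, fraction = float32_fields(23.85)
print(sign, hex(exponent), f"{fraction:023b}")
```

For 23.85 the biased exponent comes out as 83h, matching the hand calculation.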

IEEE double precision

special values

**Denormalized numbers**

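• A number smaller than $1.0 \times 2^{-126}$ cannot be represented in single precision in normalized form. However, it can be represented in denormalized form.

• $1.001_2 \times 2^{-129} = 0.01001_2 \times 2^{-127} = 0.001001_2 \times 2^{-126}$. The last form, with a leading 0 and the exponent held at $-126$, is the one the denormalized encoding stores.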
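
To see the denormalized encoding of the example above, the value can be packed as a single and its fields inspected. This is a sketch using the standard `struct` and `math` modules; the helper name `float32_fields` is hypothetical.

```python
import math
import struct


def float32_fields(x: float):
    """Return (sign, biased exponent, fraction) of x encoded as IEEE single."""
    (bits,) = struct.unpack(">I", struct.pack(">f", x))
    return bits >> 31, (bits >> 23) & 0xFF, bits & 0x7FFFFF


# 1.001 (binary) = 1.125, so the example value is 1.125 * 2^-129.
value = math.ldexp(1.125, -129)
sign, exponent, fraction = float32_fields(value)

# A biased exponent of 0 marks a denormalized number: 0.fraction * 2^-126.
print(sign, exponent, f"{fraction:023b}")
```

The exponent field is 0 and the fraction begins with `001001`, that is, $0.001001_2 \times 2^{-126}$.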

**IA-32 floating point architecture**

• The original 8086 supports only integer arithmetic. Real arithmetic can be simulated in software, but it is slow.

• The 8087 floating-point processor was originally sold separately. Later, the FPU (floating-point unit) was integrated into the CPU.

**FPU data types**

• Three floating-point types

• Four integer types

**Data registers**

• Load: push, TOP--

• Store: pop, TOP++

• Instructions access the stack using ST(i), relative to TOP

• If TOP = 0 and a push occurs, TOP wraps around to R7

• If TOP = 7 and a pop occurs, an exception results

• Floating-point values are transferred to and from memory and held in the registers in the 10-byte temporary real format. When storing, they are converted back to integer, long integer, real, or long real.
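
The push/pop and ST(i) addressing rules can be modeled with a small Python class. This is a simplified sketch, not the hardware's exact behavior: the class name `FpuStack` is hypothetical, TOP is assumed to start at 0, and exception conditions are deliberately not modeled.

```python
class FpuStack:
    """Toy model of the eight FPU data registers R0..R7 with a TOP pointer."""

    def __init__(self):
        self.regs = [None] * 8   # R0..R7, None means "empty"
        self.top = 0             # assumed initial value of TOP

    def push(self, value):
        """Load: decrement TOP (wrapping 0 -> 7), then write the new ST(0)."""
        self.top = (self.top - 1) % 8
        self.regs[self.top] = value

    def pop(self):
        """Store: read ST(0), mark it empty, then increment TOP."""
        value = self.regs[self.top]
        self.regs[self.top] = None
        self.top = (self.top + 1) % 8
        return value

    def st(self, i):
        """ST(i) is the register i positions away from TOP, modulo 8."""
        return self.regs[(self.top + i) % 8]


stack = FpuStack()
stack.push(1.0)      # TOP wraps from 0 to 7, value lands in R7
stack.push(2.0)      # TOP = 6
print(stack.top, stack.st(0), stack.st(1))
```

After the two pushes, ST(0) is the most recently loaded value and ST(1) is the one loaded before it, regardless of which physical registers hold them.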

**Special-purpose registers**

• The last data pointer stores the memory address of the operand of the last non-control instruction. The last instruction pointer stores the address of the last non-control instruction. Both are 48 bits: 32 for the offset and 16 for the segment selector.

**Status register**

**Control register**

Initial value: 037Fh
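
The initial value 037Fh can be unpacked into its fields. The sketch below assumes the standard x87 control-word layout (exception masks in bits 0 to 5, precision control in bits 8 to 9, rounding control in bits 10 to 11), which the text here does not spell out; the function name `decode_control_word` is hypothetical.

```python
def decode_control_word(cw: int) -> dict:
    """Split an x87 control word into its main fields (assumed layout)."""
    return {
        "exception_masks": cw & 0x3F,            # bits 0-5: 1 = exception masked
        "precision_control": (cw >> 8) & 0b11,   # bits 8-9: 0b11 = extended precision
        "rounding_control": (cw >> 10) & 0b11,   # bits 10-11: 0b00 = round to nearest
    }


print(decode_control_word(0x037F))
```

Under that layout, 037Fh means all six exceptions are masked, precision is extended, and rounding is to nearest.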

**Instruction format**

• Mnemonics begin with ‘F’. The second letter may be ‘B’ (binary-coded decimal), ‘I’ (binary integer), or absent (real).

• Up to two operands, at least one of which is a floating-point register. Hence there are no memory-to-memory operations, and no immediate or CPU register operands.

{…}: implied operands

**Classic stack**

• ST(0) is the source and ST(1) is the destination. The result is stored in ST(1) and ST(0) is popped, leaving the result on top of the stack.

**Real memory and integer memory**

• ST(0) as the implied destination. The second operand is from memory.

**Register and register pop**

• Register: both operands are FP data registers, and one of them must be ST.

• Register pop: the same as register, followed by a pop of ST.
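
The operand forms described above (classic stack, memory, register, register pop) can be compared side by side using addition as the operation. In this sketch a Python list stands in for the register stack, with index 0 playing the role of ST(0); all function names are hypothetical and exist only for illustration.

```python
def fadd_classic(st):
    """Classic stack: ST(1) = ST(1) + ST(0), then pop ST(0)."""
    st[1] += st[0]
    st.pop(0)


def fadd_mem(st, mem_value):
    """Memory operand: ST(0) is the implied destination."""
    st[0] += mem_value


def fadd_reg(st, dst, src):
    """Register form: both operands are FP registers, one must be ST(0)."""
    if 0 not in (dst, src):
        raise ValueError("one operand must be ST(0)")
    st[dst] += st[src]


def faddp(st, dst):
    """Register pop: ST(dst) = ST(dst) + ST(0), then pop ST(0)."""
    st[dst] += st[0]
    st.pop(0)


st = [2.0, 3.0, 10.0]     # ST(0)=2, ST(1)=3, ST(2)=10
fadd_classic(st)          # -> ST(0)=5, ST(1)=10
fadd_mem(st, 1.5)         # -> ST(0)=6.5
fadd_reg(st, 1, 0)        # -> ST(1)=16.5
faddp(st, 1)              # -> ST(0)=23
print(st)
```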

**Example: evaluating an expression**

**Load**

• **FLDPI** pushes $\pi$ onto the stack

• **FLDL2T** pushes $\log_2 10$

• **FLDL2E** pushes $\log_2 e$

• **FLDLG2** pushes $\log_{10} 2$

• **FLDLN2** pushes $\ln 2$
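
For reference, the constants loaded by these instructions can be evaluated with Python's `math` module. This is only a numeric illustration: Python floats are double precision, whereas the FPU holds these constants in extended precision, and the dictionary name `FPU_CONSTANTS` is hypothetical.

```python
import math

# Mnemonic -> the mathematical constant it pushes (double-precision approximation).
FPU_CONSTANTS = {
    "FLDPI": math.pi,
    "FLDL2T": math.log2(10),
    "FLDL2E": math.log2(math.e),
    "FLDLG2": math.log10(2),
    "FLDLN2": math.log(2),
}

for mnemonic, value in FPU_CONSTANTS.items():
    print(f"{mnemonic:7s} {value:.15f}")
```

The constants come in reciprocal pairs, which is why they are convenient for changing logarithm bases: $\log_2 10 \cdot \log_{10} 2 = 1$ and $\log_2 e \cdot \ln 2 = 1$.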

**Store**

**Register**

**Addition**

**Subtraction**

**Example: array sum**

    .data
    N = 20
    array REAL8 N DUP(1.0)
    sum   REAL8 0.0

    .code
        mov   ecx, N
        mov   esi, OFFSET array
        fldz                     ; ST0 = 0
    lp: fadd  REAL8 PTR [esi]    ; ST0 += *(esi)
        add   esi, 8             ; move to next double
        loop  lp
        fstp  sum                ; store result

**Multiplication**

**FMULP**: ST(1) = ST(0) × ST(1), then pop ST(0)

**Division**

**Comparisons**

• The instructions above set the condition codes in the FPU status register. The following instructions are used to transfer them to the CPU.

• After the status word has been copied into AX, **SAHF** copies $C_0$ into the carry flag, $C_2$ into the parity flag, and $C_3$ into the zero flag. Because the sign and overflow flags are not set, use the conditional jumps for unsigned integers (**ja, jae, jb, jbe, je, jz**).
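
The path from FPU condition codes to CPU flags can be traced in a short Python model. This is a sketch under stated assumptions: it uses the standard status-word bit positions ($C_0$ at bit 8, $C_2$ at bit 10, $C_3$ at bit 14) and the usual compare outcomes (all three codes clear for "greater", $C_0$ for "less", $C_3$ for "equal", all set for unordered). The function names are hypothetical.

```python
import math

C0, C2, C3 = 1 << 8, 1 << 10, 1 << 14   # condition-code bits in the status word


def fcom_status(st0: float, src: float) -> int:
    """Condition codes produced by comparing ST(0) with src."""
    if math.isnan(st0) or math.isnan(src):
        return C3 | C2 | C0              # unordered
    if st0 > src:
        return 0
    if st0 < src:
        return C0
    return C3                            # equal


def sahf(status_word: int) -> dict:
    """Model 'fstsw ax' + 'sahf': AH is the high byte of the status word."""
    ah = (status_word >> 8) & 0xFF
    return {"CF": ah & 1, "PF": (ah >> 2) & 1, "ZF": (ah >> 6) & 1}


def ja(flags: dict) -> bool:
    """Unsigned 'jump if above': taken when CF = 0 and ZF = 0."""
    return flags["CF"] == 0 and flags["ZF"] == 0


print(ja(sahf(fcom_status(2.0, 1.0))))   # ST(0) above src
print(ja(sahf(fcom_status(1.0, 2.0))))   # ST(0) below src
```

Because $C_0$ lands in the carry flag and $C_3$ in the zero flag, the flags look exactly like the result of an unsigned integer compare, which is why the unsigned jumps apply.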

**Example: comparison**

    .data
    x REAL8 1.0
    y REAL8 2.0

    .code
    ; if (x>y) return 1 else return 0
        fld   x            ; ST0 = x
        fcomp y            ; compare ST0 with y, then pop
        fstsw ax           ; copy FPU status word into AX
        sahf               ; copy AH (C bits) into FLAGS
        jna   else_part    ; if x not above y, ...
    then_part:
        mov   eax, 1
        jmp   end_if
    else_part:
        mov   eax, 0
    end_if:

**Pentium Pro new comparison**

• The Pentium Pro supports two new comparison instructions that directly modify the CPU’s FLAGS. The formats are

**FCOMI ST(0), src**

**FCOMIP ST(0), src**

**Example: max=max(x,y)**

    .686
    .data
    x   REAL8 1.0
    y   REAL8 2.0
    max REAL8 ?

    .code
        fld    y
        fld    x               ; ST0=x, ST1=y
        fcomip st(0), st(1)    ; compare x with y, pop x
        jna    y_bigger
        fcomp  st(0)           ; pop y from stack
        fld    x               ; ST0=x
    y_bigger:
        fstp   max

**Miscellaneous instructions**

    .data
    x    REAL4 2.75
    five REAL4 5.2

    .code
        fld    five    ; ST0=5.2
        fld    x       ; ST0=2.75, ST1=5.2
        fscale         ; ST0=2.75*2^5=2.75*32=88 (ST1 truncated to 5)
                       ; ST1=5.2
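
The arithmetic in the FSCALE example can be reproduced in Python: ST(1) is truncated toward zero and ST(0) is multiplied by 2 raised to that integer. The function name `fscale` below is a hypothetical stand-in for the instruction, built on the standard `math.ldexp` and `math.trunc`.

```python
import math


def fscale(st0: float, st1: float) -> float:
    """Return st0 * 2**trunc(st1), mirroring what FSCALE leaves in ST(0)."""
    return math.ldexp(st0, math.trunc(st1))


print(fscale(2.75, 5.2))   # 2.75 * 2^5
```

With the values from the example, 5.2 truncates to 5 and the result is 2.75 × 32 = 88.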