(1)

Computer Organization & Assembly Languages

Pu-Jen Cheng 2008/09/15

Introduction

(2)

Course Administration

Instructor: Pu-Jen Cheng (CSIE R323) [email protected]

http://www.csie.ntu.edu.tw/~pjcheng

Class Hours: 2:00pm-5:00pm, Monday

Classroom: CSIE R102

TA(s): 戴瑋彥 [email protected]

Course Information:

Announce: http://www.csie.ntu.edu.tw/~pjcheng/course/asm2008/

Q&A: bbs://ptt.cc → CSIE_ASM

(3)

Textbook

Assembly Language for Intel-Based Computers, 5th Edition, by Kip Irvine, Prentice-Hall, 2006

http://www.asmirvine.com

(4)

(5)

References

Computer Systems: A Programmer's Perspective By Randal E. Bryant and David R. O'Hallaron, Prentice Hall

http://csapp.cs.cmu.edu/

The Art of Assembly Language, by Randy Hyde

http://webster.cs.ucr.edu/AoA/Windows/PDFs/0_PDFIndexWin.html

System Software: An Introduction to Systems Programming, by Leland L. Beck, Addison-Wesley

(6)

Pre-requisite

Experience writing programs in a high-level language such as C, C++, or Java

(7)

Course Grading (tentative)

Assignments (55%)

Class participation (5%)

Midterm exam (20%)

Final exam (20%)

(8)

Materials

Some materials used in this course are adapted from

- The slides prepared by Kip Irvine for the book Assembly Language for Intel-Based Computers, 5th Ed.

- The slides prepared by S. Dandamudi for the book Introduction to Assembly Language Programming, 2nd Ed.

- Introduction to Computer Systems, CMU (http://www.cs.cmu.edu/afs/cs.cmu.edu/academic/class/15213-f05/www/)

- Assembly Language & Computer Organization, NTU (http://www.csie.ntu.edu.tw/~cyy/courses/assembly/06fall/news//) (http://www.csie.ntu.edu.tw/~acpang/course/asm_2004)

(9)

What is Assembly Language

First Glance at Assembly Language

(10)

Translating Languages

English: Display the sum of A times B plus C.

C++: cout << (A * B + C);

Assembly Language:

mov eax,A
mul B
add eax,C
call WriteInt

Intel Machine Language:

A1 00000000 F7 25 00000004 03 05 00000008 E8 00500000

(11)

A Simple Example in VC++

(12)

View/Debug Windows/Disassembly

(13)

gcc -S prog.c

(14)

The Compilation System

(15)

First Glance at Assembly Language

Low-level language

- Each instruction performs a much lower-level task compared to a high-level language instruction

- Most high-level language instructions need more than one assembly instruction

One-to-one correspondence between assembly language and machine language instructions

- For most assembly language instructions, there is a machine language equivalent

Directly influenced by the instruction set and architecture of the processor (CPU)

(16)

Comparisons with High-level Languages

Advantages of Assembly Languages

- Space efficiency (e.g., software for hand-held devices)

- Time efficiency (e.g., real-time applications)

- Access to system hardware (e.g., network interfaces, device drivers, video games)

Advantages of High-level Languages

- Development

- Maintenance (Readability)

- Portability (compiler, virtual machine)

(17)

Comparisons with High-level Languages (cont.)

(18)

This Course

Why Take the Course?

[Diagram relating this course to later subjects. Labels: This Course (Assembly Language, Basic Concepts of Computer Organization); Computer Design (Computer Organization, Computer Architecture); System Software (Assembler, Linker, Loader, Compiler, Operating System, …)]

(19)

“I really don’t think that you can write a book for serious computer programmers unless you are able to discuss low-level details.”

Donald Knuth (高德納), The Art of Computer Programming

http://en.wikipedia.org/wiki/Donald_Knuth

(20)

Course Coverage

Basic Concepts

IA-32 Processor Architecture

Assembly Language Fundamentals

Data Transfers, Addressing, and Arithmetic

Procedures

Conditional Processing

Integer Arithmetic

Advanced Procedures

Strings and Arrays

Structures and Macros

High-Level Language Interface

Assembler, Linker, and Loader

What You Will Learn

Basic principles of computer architecture

IA-32 processors and memory management

Basic assembly programming skills

How a high-level language is translated to assembly

How assembly is translated to machine code

How an application program communicates with the OS

How assembly interfaces with a high-level language

(22)

Performance: Multiword Arithmetic

Longhand multiplication

- Final 128-bit result in P:A

P := 0; count := 64
A := multiplier; B := multiplicand
while (count > 0)
    if (LSB of A = 1)
    then P := P+B
         CF := carry generated by P+B
    else CF := 0
    end if
    shift right CF:P:A by one bit position
    count := count-1
end while

(23)

Example

A = 1101₂ (13)

B = 0101₂ (5)

                 After P+B           After the shift
                 CF  P     A         CF  P     A
Initial state    ?   0000  1101      --  ----  ----
Iteration 1      0   0101  1101      ?   0010  1110
Iteration 2      0   0010  1110      ?   0001  0111
Iteration 3      0   0110  0111      ?   0011  0011
Iteration 4      0   1000  0011      ?   0100  0001
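The longhand multiplication loop can be sketched in Python to check the worked example. This is only an illustration of the pseudocode, not the course's assembly version; the function name `shift_add_multiply` and the `bits` parameter (64 on the slide, 4 in the example) are assumptions made here.

```python
def shift_add_multiply(multiplier, multiplicand, bits=64):
    """Multiply two unsigned bits-wide integers, returning the 2*bits-wide P:A."""
    mask = (1 << bits) - 1
    p = 0                       # high half of the result
    a = multiplier & mask       # low half, initially the multiplier
    b = multiplicand & mask
    for _ in range(bits):
        cf = 0
        if a & 1:               # LSB of A is 1: add the multiplicand into P
            total = p + b
            cf = total >> bits  # carry generated by P+B
            p = total & mask
        # shift CF:P:A right by one bit position
        a = (a >> 1) | ((p & 1) << (bits - 1))
        p = (p >> 1) | (cf << (bits - 1))
    return (p << bits) | a


# The 4-bit example above: 1101 (13) times 0101 (5)
print(format(shift_add_multiply(0b1101, 0b0101, bits=4), "08b"))  # prints 01000001
```

The printed value 0100 0001 matches P:A after iteration 4, which is decimal 65.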

(24)

Time Comparison

[Figure: Multiplication time comparison on a 2.4-GHz Pentium 4 system. x-axis: number of calls (in millions), 0 to 100; y-axis: time (seconds), 0 to 5; curves: C version and ASM version]

(25)

Chapter 1: Basic Concepts

Virtual Machine Concept

Data Representation

Boolean Operations

(26)

Translating Languages

English: Display the sum of A times B plus C.

C++: cout << (A * B + C);

Assembly Language:

mov eax,A
mul B
add eax,C
call WriteInt

Intel Machine Language:

A1 00000000 F7 25 00000004 03 05 00000008 E8 00500000

(27)

Virtual Machines

Abstractions for computers

Level 5  High-Level Language            (machine-independent)
Level 4  Assembly Language              (machine-specific)
Level 3  Operating System
Level 2  Instruction Set Architecture
Level 1  Microarchitecture
Level 0  Digital Logic

(28)

High-Level Language

Level 5

Application-oriented languages

- C++, Java, Pascal, Visual Basic . . .

Programs compile into assembly language (Level 4)

(29)

Assembly Language

Level 4

Instruction mnemonics that have a one-to-one correspondence to machine language

Calls functions written at the operating system level (Level 3)

Programs are translated into machine language (Level 2)

(30)

Operating System

Level 3

Provides services to Level 4 programs

Translated and run at the instruction set architecture level (Level 2)

(31)

Instruction Set Architecture

Level 2

Also known as conventional machine language

Executed by Level 1 (microarchitecture) program

(32)

Microarchitecture

Level 1

Interprets conventional machine instructions (Level 2)

Executed by digital hardware (Level 0)

(33)

Digital Logic

Level 0

CPU, constructed from digital logic gates

System bus

Memory


(34)

Data Representation

Binary Numbers

- Translating between binary and decimal

Binary Addition

Integer Storage Sizes

Hexadecimal Integers

- Translating between decimal and hexadecimal

- Hexadecimal subtraction

Signed Integers

- Binary subtraction

Fractional Binary Numbers

Character Storage

Machine Words

(35)

Binary Representation

Electronic Implementation

- Easy to store with bistable elements

- Reliably transmitted on noisy and inaccurate wires

(36)

Binary Numbers

Digits are 1 and 0

- 1 = true

- 0 = false

MSB – most significant bit

LSB – least significant bit

Bit numbering:

MSB                         LSB
1 0 1 1 0 0 1 0 1 0 0 1 1 1 0 0
15                            0

(37)

Binary Numbers

Each digit (bit) is either 1 or 0

Each bit represents a power of 2:

 1   1   1   1   1   1   1   1
 2⁷  2⁶  2⁵  2⁴  2³  2²  2¹  2⁰

Every binary number is a sum of powers of 2

(38)

Translating Binary to Decimal

Weighted positional notation shows how to calculate the decimal value of each binary bit:

dec = (Dₙ₋₁ × 2ⁿ⁻¹) + (Dₙ₋₂ × 2ⁿ⁻²) + ... + (D₁ × 2¹) + (D₀ × 2⁰)

D = binary digit

binary 00001001 = decimal 9:

(1 × 2³) + (1 × 2⁰) = 9
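Weighted positional notation can be written as a short Python sketch. The function name `binary_to_decimal` is hypothetical, and the input is assumed to be a string of 0s and 1s with the most significant bit first.

```python
def binary_to_decimal(bits):
    """Sum D_k * 2**k over every digit of a binary string (MSB first)."""
    n = len(bits)
    return sum(int(d) * 2 ** (n - 1 - i) for i, d in enumerate(bits))


print(binary_to_decimal("00001001"))  # prints 9
```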

(39)

Translating Unsigned Decimal to Binary

Repeatedly divide the decimal integer by 2.

Each remainder is a binary digit in the translated value:

37 = 100101
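The repeated-division method can be sketched as follows. The function name `decimal_to_binary` is an assumption for illustration; the remainders are produced least significant bit first, so they are reversed at the end.

```python
def decimal_to_binary(value):
    """Convert a non-negative integer to a binary string by repeated division by 2."""
    digits = []
    while value > 0:
        value, remainder = divmod(value, 2)  # each remainder is one binary digit
        digits.append(str(remainder))
    return "".join(reversed(digits)) or "0"


print(decimal_to_binary(37))  # prints 100101
```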

(40)

Binary Addition

Starting with the LSB, add each pair of digits, include the carry if present.

carry:                1
              0 0 0 0 0 1 0 0    (4)
+             0 0 0 0 0 1 1 1    (7)
              0 0 0 0 1 0 1 1    (11)
bit position: 7 6 5 4 3 2 1 0

(41)

Integer Storage Sizes

Standard sizes:

byte: 8 bits
word: 16 bits
doubleword: 32 bits
quadword: 64 bits

What is the largest unsigned integer that may be stored in 20 bits?

(42)

Large Measurements

Kilobyte (KB), 2¹⁰ bytes

Megabyte (MB), 2²⁰ bytes

Gigabyte (GB), 2³⁰ bytes

Terabyte (TB), 2⁴⁰ bytes

Petabyte, 2⁵⁰ bytes

Exabyte, 2⁶⁰ bytes

Zettabyte, 2⁷⁰ bytes

Yottabyte, 2⁸⁰ bytes

Googol, 10¹⁰⁰

(43)

Hexadecimal Integers

Binary values are represented in hexadecimal.

(44)

Translating Binary to Hexadecimal

Each hexadecimal digit corresponds to 4 binary bits.

Example: Translate the binary integer 000101101010011110010100 to hexadecimal:
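One way to work this example is to split the bits into groups of four, starting from the right, and translate each group to one hexadecimal digit. The sketch below assumes a hypothetical helper named `binary_to_hex` that takes a string of bits.

```python
def binary_to_hex(bits):
    """Translate a binary string to hexadecimal, four bits per hex digit."""
    # Left-pad with zeros so the length is a multiple of 4
    padded = bits.zfill((len(bits) + 3) // 4 * 4)
    groups = [padded[i:i + 4] for i in range(0, len(padded), 4)]
    return "".join("0123456789ABCDEF"[int(group, 2)] for group in groups)


print(binary_to_hex("000101101010011110010100"))  # prints 16A794
```

The groups are 0001 0110 1010 0111 1001 0100, giving the digits 1, 6, A, 7, 9, 4.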

(45)

Converting Hexadecimal to Decimal

Multiply each digit by its corresponding power of 16:

dec = (D₃ × 16³) + (D₂ × 16²) + (D₁ × 16¹) + (D₀ × 16⁰)

Hex 1234 equals (1 × 16³) + (2 × 16²) + (3 × 16¹) + (4 × 16⁰), or decimal 4,660.

Hex 3BA4 equals (3 × 16³) + (11 × 16²) + (10 × 16¹) + (4 × 16⁰), or decimal 15,268.
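The same power-of-16 sum can be checked with a small Python sketch. The names `HEX_DIGITS` and `hex_to_decimal` are chosen here for illustration only.

```python
HEX_DIGITS = "0123456789ABCDEF"


def hex_to_decimal(text):
    """Multiply each hex digit by its power of 16 and sum the results."""
    n = len(text)
    return sum(HEX_DIGITS.index(ch) * 16 ** (n - 1 - i)
               for i, ch in enumerate(text.upper()))


print(hex_to_decimal("1234"), hex_to_decimal("3BA4"))  # prints 4660 15268
```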

(46)

Powers of 16

Used when calculating hexadecimal values up to 8 digits long:

(47)

Converting Decimal to Hexadecimal

decimal 422 = 1A6 hexadecimal
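As with binary, the conversion can be done by repeated division, this time by 16, with each remainder becoming one hexadecimal digit. A minimal sketch, assuming a hypothetical function `decimal_to_hex`:

```python
def decimal_to_hex(value):
    """Convert a non-negative integer to hexadecimal by repeated division by 16."""
    digits = []
    while value > 0:
        value, remainder = divmod(value, 16)  # remainder is the next hex digit
        digits.append("0123456789ABCDEF"[remainder])
    return "".join(reversed(digits)) or "0"


print(decimal_to_hex(422))  # prints 1A6
```

For 422 the remainders are 6, 10 (A), and 1, read in reverse order.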

(48)

Hexadecimal Addition

Divide the sum of two digits by the number base (16).

The quotient becomes the carry value, and the remainder is the sum digit.

carry:       1   1
     36  28  28  6A
+    42  45  58  4B
     78  6D  80  B5

21 / 16 = 1, rem 5

Important skill: Programmers frequently add and subtract the addresses of variables and instructions.

(49)

Hexadecimal Subtraction

When a borrow is required from the digit to the left, add 16 (decimal) to the current digit's value:

16 + 5 = 21

         −1
   C6    75
−  A2    47
   24    2E

Practice: The address of var1 is 00400020. The address of the next variable after var1 is 0040006A. How many bytes are used by var1?

(50)

Signed Integers

The highest bit indicates the sign.

1 = negative, 0 = positive

1 1 1 1 0 1 1 0   Negative (sign bit = 1)
0 0 0 0 1 0 1 0   Positive (sign bit = 0)

If the highest digit of a hexadecimal integer is > 7, the value is negative. Examples: 8A, C5, A2, 9D

(51)

Forming the Two's Complement

Take the bitwise NOT of the number, then add 1

Note that 00000001 + 11111111 = 00000000
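A minimal sketch of forming the two's complement at a fixed width. The function name `twos_complement` and the default width of 8 bits are assumptions; the mask keeps the result within the chosen width.

```python
def twos_complement(value, bits=8):
    """Invert every bit of value, add 1, and keep only the low `bits` bits."""
    mask = (1 << bits) - 1
    return ((value ^ mask) + 1) & mask


negated = twos_complement(0b00000001)
print(format(negated, "08b"))                        # prints 11111111
print(format((0b00000001 + negated) & 0xFF, "08b"))  # prints 00000000
```

The second line reproduces the note above: a value plus its two's complement is zero once the carry out of bit 7 is discarded.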

(52)

8-bit Two's Complement Integers

(53)

Binary Subtraction

When subtracting A – B, convert B to its two's complement

Add A to (–B)

    0 0 0 0 1 1 0 0          0 0 0 0 1 1 0 0
  – 0 0 0 0 0 0 1 1        + 1 1 1 1 1 1 0 1
                             0 0 0 0 1 0 0 1

Advantages of 2’s complement:

- No two 0’s

- Sign bit

- Removes the need for separate circuits for add and sub
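Subtraction by adding the two's complement can be sketched in Python for 8-bit values. The function name `subtract` and the `bits` parameter are hypothetical; the final mask discards the carry out of the top bit, as a fixed-width adder would.

```python
def subtract(a, b, bits=8):
    """Compute a - b at a fixed width by adding the two's complement of b."""
    mask = (1 << bits) - 1
    neg_b = ((b ^ mask) + 1) & mask  # bitwise NOT of b, plus 1
    return (a + neg_b) & mask        # discard the carry out of the top bit


print(format(subtract(0b00001100, 0b00000011), "08b"))  # prints 00001001
```

This is 12 − 3 = 9, computed with an addition only.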

(54)

Ranges of Signed Integers

The highest bit is reserved for the sign. This limits the range:
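With one bit reserved for the sign, an n-bit two's complement integer ranges from −2ⁿ⁻¹ to 2ⁿ⁻¹ − 1. The sketch below computes these limits for the standard storage sizes; the function name `signed_range` is an assumption.

```python
def signed_range(bits):
    """Return (lowest, highest) value of a two's complement integer of this width."""
    return -(1 << (bits - 1)), (1 << (bits - 1)) - 1


for name, bits in [("byte", 8), ("word", 16), ("doubleword", 32), ("quadword", 64)]:
    low, high = signed_range(bits)
    print(f"{name:10} {low} to {high}")
```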

(55)

Fractional Binary Numbers

bᵢ bᵢ₋₁ ··· b₂ b₁ b₀ . b₋₁ b₋₂ b₋₃ ··· b₋ⱼ

Weights: 2ⁱ, 2ⁱ⁻¹, ···, 4, 2, 1 to the left of the point; 1/2, 1/4, 1/8, ···, 2⁻ʲ to the right

Representation

- Bits to the right of the “binary point” represent fractional powers of 2

- Represents the rational number Σ bₖ × 2ᵏ, summed over k = −j to i

(56)

Examples of Fractional Binary Numbers

Value     Representation
5-3/4     101.11₂
2-7/8     10.111₂
63/64     0.111111₂

Observations

- Divide by 2 by shifting right

- Multiply by 2 by shifting left

- Numbers of the form 0.111111…₂ are just below 1.0

  1/2 + 1/4 + 1/8 + … + 1/2ⁱ + … → 1.0

  Use notation 1.0 – ε
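The three example values can be verified exactly with Python's `fractions` module. This is a sketch only; `parse_binary_fraction` is a hypothetical helper that takes an unsigned binary string with an optional binary point.

```python
from fractions import Fraction


def parse_binary_fraction(text):
    """Evaluate a binary string such as '101.11' as an exact rational number."""
    whole, _, frac = text.partition(".")
    value = Fraction(int(whole, 2)) if whole else Fraction(0)
    for k, bit in enumerate(frac, start=1):
        value += Fraction(int(bit), 2 ** k)  # the k-th bit after the point weighs 2**-k
    return value


print(parse_binary_fraction("101.11"))    # prints 23/4
print(parse_binary_fraction("10.111"))    # prints 23/8
print(parse_binary_fraction("0.111111"))  # prints 63/64
```

23/4 is 5-3/4 and 23/8 is 2-7/8, matching the table.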

(57)

Representable Numbers

Limitation

- Can only exactly represent numbers of the form x/2ᵏ

- Other numbers have repeating bit representations

Value    Representation
1/3      0.0101010101[01]…₂
1/5      0.001100110011[0011]…₂
1/10     0.0001100110011[0011]…₂

(58)

Converting Real Numbers

Binary real to decimal real

Decimal real to binary real

4.5625 = 100.1001₂
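One common way to convert a decimal real to binary is to convert the integer part as usual and then repeatedly multiply the fractional part by 2, taking the integer part of each product as the next bit. The sketch below assumes non-negative input; the name `real_to_binary` and the `max_fraction_bits` cutoff (needed because values such as 1/10 never terminate) are choices made here.

```python
from fractions import Fraction


def real_to_binary(value, max_fraction_bits=16):
    """Convert a non-negative decimal real to a binary string."""
    value = Fraction(value)       # exact arithmetic, no floating-point rounding
    whole = int(value)
    frac = value - whole
    bits = []
    while frac and len(bits) < max_fraction_bits:
        frac *= 2
        bit = int(frac)           # integer part of the product is the next bit
        bits.append(str(bit))
        frac -= bit
    return format(whole, "b") + ("." + "".join(bits) if bits else "")


print(real_to_binary("4.5625"))  # prints 100.1001
```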

(59)

True or False

If x > 0 then x + 1 > 0

If x < 0 then x * 2 < 0

If x > y then -x < -y

If x >= 0 then -x <= 0

If x < 0 then -x > 0

If x >= 0 then (( !x - 1 ) & x ) == x

If x < 0 && y > 0 then x * y < 0

If x < 0 then ((x ^ x >> 31) + 1) > 0

(60)

Character Storage

Character sets

- Standard ASCII (0 – 127)

- Extended ASCII (0 – 255)

- ANSI (0 – 255)

- Unicode (0 – 65,535)

Null-terminated String

- Array of characters followed by a null byte

Using the ASCII table

- back inside cover of book

(61)

Machine Words

Machine Has “Word Size”

- Nominal size of integer-valued data
  - Including addresses

- Most current machines use 32-bit (4-byte) words
  - Limits addresses to 4GB
  - Users can access 3GB
  - Becoming too small for memory-intensive applications

- High-end systems use 64-bit (8-byte) words
  - Potential address space ≈ 1.8 × 10¹⁹ bytes
  - x86-64 machines support 48-bit addresses: 256 Terabytes

- Machines support multiple data formats
  - Fractions or multiples of word size
  - Always integral number of bytes
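The address-space figures follow directly from the number of address bits, since each address names one byte. A quick check in Python, where `address_space_bytes`, `GB`, and `TB` are names chosen for this sketch:

```python
def address_space_bytes(address_bits):
    """Number of distinct byte addresses available with this many address bits."""
    return 1 << address_bits


GB = 1 << 30
TB = 1 << 40

print(address_space_bytes(32) // GB)     # prints 4
print(address_space_bytes(48) // TB)     # prints 256
print(f"{address_space_bytes(64):.1e}")  # prints 1.8e+19
```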

(62)

Word-Oriented Memory Organization

Addresses Specify Byte Locations

- Address of first byte in word

- Addresses of successive words differ by 4 (32-bit) or 8 (64-bit)

[Figure: bytes at addresses 0000 to 0015 grouped into 32-bit words (Addr = 0000, 0004, 0008, 0012) and 64-bit words (Addr = 0000, 0008)]

(63)

Data Representations

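Sizes of C Objects (in Bytes)

C Data Type    Typical 32-bit    Intel IA32    x86-64
unsigned       4                 4             4
int            4                 4             4
long int       4                 4             4
char           1                 1             1
short          2                 2             2
float          4                 4             4
double         8                 8             8
char *         4                 4             8
(or any other pointer)

Note: on x86-64, long int is 4 bytes only under the LLP64 data model (e.g., Windows); under LP64 (e.g., Linux) it is 8 bytes.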

(64)

Byte Ordering

How should bytes within a multi-byte word be ordered in memory?

Conventions

- Big Endian: Sun, PPC Mac
  - Least significant byte has highest address

- Little Endian: x86
  - Least significant byte has lowest address

(65)

Byte Ordering Example

Big Endian

- Least significant byte has highest address

Little Endian

- Least significant byte has lowest address

Example

- Variable x has 4-byte representation 0x01234567

- Address given by &x is 0x100

                 0x100  0x101  0x102  0x103
Big Endian       01     23     45     67
Little Endian    67     45     23     01
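The two layouts can be reproduced with Python's built-in `int.to_bytes`, which lists bytes in order of increasing address. The wrapper name `memory_bytes` is an assumption made for this sketch.

```python
def memory_bytes(value, size, byteorder):
    """Bytes of an unsigned integer as stored in memory, lowest address first."""
    return value.to_bytes(size, byteorder=byteorder)


x = 0x01234567
print(memory_bytes(x, 4, "big").hex(" "))     # prints 01 23 45 67
print(memory_bytes(x, 4, "little").hex(" "))  # prints 67 45 23 01
```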

(66)

Representing Integers

int A = 15213;
int B = -15213;
long int C = 15213;

Decimal: 15213
Binary: 0011 1011 0110 1101
Hex: 3 B 6 D

Bytes in order of increasing address:

A    IA32, x86-64    6D 3B 00 00
     Sun             00 00 3B 6D
B    IA32, x86-64    93 C4 FF FF
     Sun             FF FF C4 93
C    IA32            6D 3B 00 00
     x86-64          6D 3B 00 00 00 00 00 00
     Sun             00 00 3B 6D

B uses two’s complement representation.

(67)

Representing Strings

char S[6] = "15213";

Strings in C

- Represented by array of characters

- Each character encoded in ASCII format
  - Standard 7-bit encoding of character set
  - Character “0” has code 0x30
  - Digit i has code 0x30+i

- String should be null-terminated
  - Final character = 0

Compatibility

- Byte ordering not an issue

Linux/Alpha S    31 35 32 31 33 00
Sun S            31 35 32 31 33 00
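The byte layout of S can be reproduced by encoding the text as ASCII and appending the terminating null byte. This is a sketch in Python rather than C, and `c_string_bytes` is a hypothetical helper name.

```python
def c_string_bytes(text):
    """ASCII bytes of a C string, including the terminating null byte."""
    return text.encode("ascii") + b"\x00"


print(c_string_bytes("15213").hex(" "))  # prints 31 35 32 31 33 00
print(hex(ord("0")))                     # prints 0x30
```

Each character occupies a single byte, so the layout is the same on big-endian and little-endian machines.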

(68)

Boolean Operations

NOT

AND

OR

Operator Precedence

Truth Tables

(69)

Boolean Algebra

Based on symbolic logic, designed by George Boole

Boolean expressions created from:

- NOT, AND, OR

(70)

NOT

Inverts (reverses) a boolean value

Truth table for Boolean NOT operator:

X    ¬X
F    T
T    F

[Figure: digital gate diagram for NOT]

(71)

AND

Truth table for Boolean AND operator:

X    Y    X ∧ Y
F    F    F
F    T    F
T    F    F
T    T    T

[Figure: digital gate diagram for AND]

(72)

OR

Truth table for Boolean OR operator:

X    Y    X ∨ Y
F    F    F
F    T    T
T    F    T
T    T    T

[Figure: digital gate diagram for OR]

(73)

Operator Precedence

NOT > AND > OR

Examples showing the order of operations:

Use parentheses to avoid ambiguity

(74)

Truth Tables (1 of 3)

A Boolean function has one or more Boolean inputs, and returns a single Boolean output.

A truth table shows all the inputs and outputs of a Boolean function

Example: ¬X ∨ Y

(75)

Truth Tables (2 of 3)

Example: X ∧ ¬Y

(76)

Truth Tables (3 of 3)

Example: (Y ∧ S) ∨ (X ∧ ¬S)

[Figure: two-input multiplexer with data inputs X and Y, select input S, and output Z]
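Truth tables for the three example expressions can be generated by enumerating every combination of inputs. The sketch below uses Python's `itertools.product`; the names `truth_table` and `mux` are assumptions for illustration.

```python
from itertools import product


def truth_table(func, arity):
    """Return one row (inputs..., output) for every combination of Boolean inputs."""
    return [values + (bool(func(*values)),)
            for values in product([False, True], repeat=arity)]


def mux(x, y, s):
    """Two-input multiplexer: (Y AND S) OR (X AND NOT S)."""
    return (y and s) or (x and not s)


for row in truth_table(mux, 3):
    print(" ".join("T" if value else "F" for value in row))
```

Each printed row lists X, Y, S, and then the output Z. The output equals X when S is false and Y when S is true, which is the selection behavior of a multiplexer. The same helper handles the earlier examples, for instance `truth_table(lambda x, y: (not x) or y, 2)` for ¬X ∨ Y.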