Computer Organization &
Assembly Languages
Pu-Jen Cheng 2008/09/15
Introduction
Course Administration
Instructor: Pu-Jen Cheng (CSIE R323) pjcheng@csie.ntu.edu.tw
http://www.csie.ntu.edu.tw/~pjcheng
Class Hours: 2:00pm-5:00pm, Monday
Classroom: CSIE R102
TA(s): 戴瑋彥 b93705014@ntu.edu.tw
Course Information:
Announce: http://www.csie.ntu.edu.tw/~pjcheng/course/asm2008/
Q&A: bbs://ptt.cc → CSIE_ASM
Textbook
Assembly Language for Intel-Based Computers, 5th Edition, by Kip Irvine, Prentice-Hall, 2006
http://www.asmirvine.com
References
Computer Systems: A Programmer's Perspective By Randal E. Bryant and David R. O'Hallaron, Prentice Hall
http://csapp.cs.cmu.edu/
The Art of Assembly Language By Randy Hyde
http://webster.cs.ucr.edu/AoA/Windows/PDFs/0_PDFIndexWin.html
System Software: An Introduction to Systems Programming
By Leland L. Beck Addison-Wesley
Pre-requisite
Experience writing programs in a high-level language such as C, C++, or Java
Course Grading (tentative)
Assignments (55%)
Class participation (5%)
Midterm exam (20%)
Final exam (20%)
Materials
Some materials used in this course are adapted from
- The slides prepared by Kip Irvine for the book, Assembly Language for Intel-Based Computers, 5th Ed.
- The slides prepared by S. Dandamudi for the book, Introduction to Assembly Language Programming, 2nd Ed.
- Introduction to Computer Systems, CMU
  (http://www.cs.cmu.edu/afs/cs.cmu.edu/academic/class/15213-f05/www/)
- Assembly Language & Computer Organization, NTU
  (http://www.csie.ntu.edu.tw/~cyy/courses/assembly/06fall/news//)
  (http://www.csie.ntu.edu.tw/~acpang/course/asm_2004)
What is Assembly Language
First Glance at Assembly Language
Translating Languages
English: Display the sum of A times B plus C.
C++: cout << (A * B + C);
Assembly Language:
    mov eax,A
    mul B
    add eax,C
    call WriteInt
Intel Machine Language:
    A1 00000000
    F7 25 00000004
    03 05 00000008
    E8 00500000
A Simple Example in VC++
View/Debug Windows/Disassembly
gcc -S prog.c
The Compilation System
First Glance at Assembly Language
Low-level language
- Each instruction performs a much lower-level task compared to a high-level language instruction
- Most high-level language instructions need more than one assembly instruction
One-to-one correspondence between assembly language and machine language instructions
- For most assembly language instructions, there is a machine language equivalent
Directly influenced by the instruction set and architecture of the processor (CPU)
Comparisons with High-level Languages
Advantages of Assembly Languages
- Space-efficiency
  (e.g., software for hand-held devices)
- Time-efficiency
  (e.g., real-time applications)
- Accessibility to system hardware
  (e.g., network interfaces, device drivers, video games)
Advantages of High-level Languages
- Development
- Maintenance (Readability)
- Portability (compiler, virtual machine)
Comparisons with High-level Languages (cont.)
This Course
Why Take the Course?
[Diagram: how this course (Assembly Language) relates to Basic Concepts of Computer Organization, Computer Design, Computer Architecture, and System Software (Assembler, Linker, Loader, Compiler, Operating System, …)]
“I really don’t think that you can write a book for serious computer programmers unless you are able to discuss low-level details.”
Donald Knuth (高德納), The Art of Computer Programming
http://en.wikipedia.org/wiki/Donald_Knuth
Course Coverage
Basic Concepts
IA-32 Processor Architecture
Assembly Language Fundamentals
Data Transfers, Addressing, and Arithmetic
Procedures
Conditional Processing
Integer Arithmetic
Advanced Procedures
Strings and Arrays
Structures and Macros
High-Level Language Interface
Assembler, Linker, and Loader
Other Advanced Topics (optional)
What You Will Learn
Basic principles of computer architecture
IA-32 processors and memory management
Basic assembly programming skills
How a high-level language is translated to assembly
How assembly is translated to machine code
How an application program communicates with the OS
Interface between assembly and high-level languages
Performance: Multiword Arithmetic
Longhand multiplication
- Final 128-bit result in P:A
    P := 0; count := 64
    A := multiplier; B := multiplicand
    while (count > 0)
        if (LSB of A = 1)
            then P := P+B
                 CF := carry generated by P+B
            else CF := 0
        end if
        shift right CF:P:A by one bit position
        count := count-1
    end while
Example
A = $1101_2$ (13)
B = $0101_2$ (5)
                 After P+B          After the shift
                 CF  P     A        CF  P     A
Initial state    ?   0000  1101     --  ----  ----
Iteration 1      0   0101  1101     ?   0010  1110
Iteration 2      0   0010  1110     ?   0001  0111
Iteration 3      0   0110  0111     ?   0011  0011
Iteration 4      0   1000  0011     ?   0100  0001
Time Comparison
[Figure: multiplication time comparison on a 2.4-GHz Pentium 4 system. x-axis: number of calls (in millions), 0 to 100; y-axis: time (seconds), 0 to 5; curves: C version and ASM version]
Chapter 1: Basic Concepts
Virtual Machine Concept
Data Representation
Boolean Operations
Virtual Machines
Abstractions for computers
Level 5: High-Level Language (machine-independent)
Level 4: Assembly Language (machine-specific)
Level 3: Operating System
Level 2: Instruction Set Architecture
Level 1: Microarchitecture
Level 0: Digital Logic
High-Level Language
Level 5
Application-oriented languages
- C++, Java, Pascal, Visual Basic . . .
Programs compile into assembly language (Level 4)
Assembly Language
Level 4
Instruction mnemonics that have a one-to-one correspondence to machine language
Calls functions written at the operating system level (Level 3)
Programs are translated into machine language (Level 2)
Operating System
Level 3
Provides services to Level 4 programs
Translated and run at the instruction set architecture level (Level 2)
Instruction Set Architecture
Level 2
Also known as conventional machine language
Executed by Level 1 (microarchitecture) program
Microarchitecture
Level 1
Interprets conventional machine instructions (Level 2)
Executed by digital hardware (Level 0)
Digital Logic
Level 0
CPU, constructed from digital logic gates
System bus
Memory
Data Representation
Binary Numbers
- Translating between binary and decimal
Binary Addition
Integer Storage Sizes
Hexadecimal Integers
- Translating between decimal and hexadecimal
- Hexadecimal subtraction
Signed Integers
- Binary subtraction
Fractional Binary Numbers
Character Storage
Machine Words
Binary Representation
Electronic Implementation
- Easy to store with bistable elements
- Reliably transmitted on noisy and inaccurate wires
Binary Numbers
Digits are 1 and 0
- 1 = true
- 0 = false
MSB – most significant bit
LSB – least significant bit
Bit numbering:
  15                            0
  1 0 1 1 0 0 1 0 1 0 0 1 1 1 0 0
  MSB                         LSB
Binary Numbers
Each digit (bit) is either 1 or 0
Each bit represents a power of 2:
  bit:     1      1      1      1      1      1      1      1
  weight:  $2^7$  $2^6$  $2^5$  $2^4$  $2^3$  $2^2$  $2^1$  $2^0$
Every binary number is a sum of powers of 2
Translating Binary to Decimal
Weighted positional notation shows how to calculate the decimal value of each binary bit:
$dec = (D_{n-1} \times 2^{n-1}) + (D_{n-2} \times 2^{n-2}) + \cdots + (D_1 \times 2^1) + (D_0 \times 2^0)$
D = binary digit
binary 00001001 = decimal 9:
$(1 \times 2^3) + (1 \times 2^0) = 9$
Translating Unsigned Decimal to Binary
Repeatedly divide the decimal integer by 2.
Each remainder is a binary digit in the translated value:
37 = 100101
Binary Addition
Starting with the LSB, add each pair of digits, and include the carry if present.
  carry:                 1
                 0 0 0 0 0 1 0 0    (4)
  +              0 0 0 0 0 1 1 1    (7)
                 ---------------
                 0 0 0 0 1 0 1 1    (11)
  bit position:  7 6 5 4 3 2 1 0
Integer Storage Sizes
Standard sizes:
  byte         8 bits
  word         16 bits
  doubleword   32 bits
  quadword     64 bits
What is the largest unsigned integer that may be stored in 20 bits?
Large Measurements
Kilobyte (KB), $2^{10}$ bytes
Megabyte (MB), $2^{20}$ bytes
Gigabyte (GB), $2^{30}$ bytes
Terabyte (TB), $2^{40}$ bytes
Petabyte, $2^{50}$ bytes
Exabyte, $2^{60}$ bytes
Zettabyte, $2^{70}$ bytes
Yottabyte, $2^{80}$ bytes
Googol, $10^{100}$
Hexadecimal Integers
Binary values are represented in hexadecimal.
Translating Binary to Hexadecimal
Each hexadecimal digit corresponds to 4 binary bits.
Example: Translate the binary integer 000101101010011110010100 to hexadecimal:
  0001  0110  1010  0111  1001  0100
  1     6     A     7     9     4
  Result: 16A794
Converting Hexadecimal to Decimal
Multiply each digit by its corresponding power of 16:
$dec = (D_3 \times 16^3) + (D_2 \times 16^2) + (D_1 \times 16^1) + (D_0 \times 16^0)$
Hex 1234 equals $(1 \times 16^3) + (2 \times 16^2) + (3 \times 16^1) + (4 \times 16^0)$, or decimal 4,660.
Hex 3BA4 equals $(3 \times 16^3) + (11 \times 16^2) + (10 \times 16^1) + (4 \times 16^0)$, or decimal 15,268.
Powers of 16
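Used when calculating hexadecimal values up to 8 digits long:
  $16^0$ = 1            $16^4$ = 65,536
  $16^1$ = 16           $16^5$ = 1,048,576
  $16^2$ = 256          $16^6$ = 16,777,216
  $16^3$ = 4,096        $16^7$ = 268,435,456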
Converting Decimal to Hexadecimal
decimal 422 = 1A6 hexadecimal
Hexadecimal Addition
Divide the sum of two digits by the number base (16). The quotient becomes the carry value, and the remainder is the sum digit.
  carry:            1    1
          36   28   28   6A
        + 42   45   58   4B
          --   --   --   --
          78   6D   80   B5
  In the last column, A + B = 21, and 21 / 16 = 1, rem 5
Important skill: Programmers frequently add and subtract the addresses of variables and instructions.
Hexadecimal Subtraction
When a borrow is required from the digit to the left, add 16 (decimal) to the current digit's value:
  borrow:          −1
          C6       75
        − A2     − 47
          --       --
          24       2E
  In the second example, 16 + 5 = 21, and 21 − 7 = 14 (hex E)
Practice: The address of var1 is 00400020. The
address of the next variable after var1 is 0040006A.
How many bytes are used by var1?
Signed Integers
The highest bit indicates the sign.
1 = negative, 0 = positive
sign bit (leftmost bit):
  1 1 1 1 0 1 1 0    Negative
  0 0 0 0 1 0 1 0    Positive
If the highest digit of a hexadecimal integer is > 7, the value is negative. Examples: 8A, C5, A2, 9D
Forming the Two's Complement
Take the bitwise NOT of the number, then add 1
Note that 00000001 + 11111111 = 00000000 (the carry out of the highest bit is discarded)
8-bit Two's Complement Integers
Binary Subtraction
When subtracting A – B, convert B to its two's complement
Add A to (–B)
    0 0 0 0 1 1 0 0            0 0 0 0 1 1 0 0
  – 0 0 0 0 0 0 1 1    →     + 1 1 1 1 1 1 0 1
                               ---------------
                               0 0 0 0 1 0 0 1
Advantages of 2's complement:
- No two 0's (zero has a single representation)
- Sign bit
- Removes the need for separate circuits for add and sub
Ranges of Signed Integers
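The highest bit is reserved for the sign. This limits the range:
  Storage type        Range (low to high)
  Signed byte         −128 to +127
  Signed word         −32,768 to +32,767
  Signed doubleword   −2,147,483,648 to +2,147,483,647
  Signed quadword     −9,223,372,036,854,775,808 to +9,223,372,036,854,775,807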
Fractional Binary Numbers
$b_i \; b_{i-1} \; \cdots \; b_2 \; b_1 \; b_0 \; . \; b_{-1} \; b_{-2} \; b_{-3} \; \cdots \; b_{-j}$
Bit weights: $2^i, 2^{i-1}, \ldots, 4, 2, 1$ to the left of the binary point; $1/2, 1/4, 1/8, \ldots, 2^{-j}$ to the right
Representation
- Bits to right of “binary point” represent fractional powers of 2
- Represents rational number: $\sum_{k=-j}^{i} b_k \cdot 2^k$
Examples of Fractional Binary Numbers
  Value    Representation
  5 3/4    $101.11_2$
  2 7/8    $10.111_2$
  63/64    $0.111111_2$
Observations
- Divide by 2 by shifting right
- Multiply by 2 by shifting left
- Numbers of form $0.111111\ldots_2$ are just below 1.0
    $1/2 + 1/4 + 1/8 + \cdots + 1/2^i + \cdots \to 1.0$
    Use notation $1.0 - \varepsilon$
Representable Numbers
Limitation
- Can only exactly represent numbers of the form $x/2^k$
- Other numbers have repeating bit representations
  Value    Representation
  1/3      $0.0101010101[01]\ldots_2$
  1/5      $0.001100110011[0011]\ldots_2$
  1/10     $0.0001100110011[0011]\ldots_2$
Converting Real Numbers
Binary real to decimal real
Decimal real to binary real
$4.5625 = 100.1001_2$
True or False
If x > 0 then x + 1 > 0
If x < 0 then x * 2 < 0
If x > y then -x < -y
If x >= 0 then -x <= 0
If x < 0 then -x > 0
If x >= 0 then ((!x - 1) & x) == x
If x < 0 && y > 0 then x * y < 0
If x < 0 then ((x ^ x >> 31) + 1) > 0
Character Storage
Character sets
- Standard ASCII (0 – 127)
- Extended ASCII (0 – 255)
- ANSI (0 – 255)
- Unicode (0 – 65,535)
Null-terminated String
- Array of characters followed by a null byte
Using the ASCII table
- back inside cover of book
Machine Words
Machine Has “Word Size”
- Nominal size of integer-valued data
    Including addresses
- Most current machines use 32-bit (4-byte) words
    Limits addresses to 4GB
    Users can access 3GB
    Becoming too small for memory-intensive applications
- High-end systems use 64-bit (8-byte) words
    Potential address space ≈ $1.8 \times 10^{19}$ bytes
    x86-64 machines support 48-bit addresses: 256 Terabytes
- Machines support multiple data formats
    Fractions or multiples of word size
    Always integral number of bytes
Word-Oriented Memory Organization
Addresses Specify Byte Locations
- Address of first byte in word
- Addresses of successive words differ by 4 (32-bit) or 8 (64-bit)
[Figure: bytes at addresses 0000 to 0015 grouped into 32-bit words (at addresses 0000, 0004, 0008, 0012) and 64-bit words (at addresses 0000, 0008)]
Data Representations
Sizes of C Objects (in Bytes)
  C Data Type    Typical 32-bit    Intel IA32    x86-64
  unsigned       4                 4             4
  int            4                 4             4
  long int       4                 4             8
  char           1                 1             1
  short          2                 2             2
  float          4                 4             4
  double         8                 8             8
  char *         4                 4             8
    (or any other pointer)
Byte Ordering
How should bytes within a multi-byte word be ordered in memory?
Conventions
- Big Endian: Sun, PPC Mac
    Least significant byte has highest address
- Little Endian: x86
    Least significant byte has lowest address
Byte Ordering Example
Big Endian
- Least significant byte has highest address
Little Endian
- Least significant byte has lowest address
Example
- Variable x has 4-byte representation 0x01234567
- Address given by &x is 0x100
  Address:        0x100  0x101  0x102  0x103
  Big Endian:     01     23     45     67
  Little Endian:  67     45     23     01
Representing Integers
int A = 15213;
int B = -15213;
long int C = 15213;
Decimal: 15213
Binary: 0011 1011 0110 1101
Hex: 3 B 6 D
Bytes in memory, lowest address first:
  A (IA32, x86-64):  6D 3B 00 00
  A (Sun):           00 00 3B 6D
  B (IA32, x86-64):  93 C4 FF FF
  B (Sun):           FF FF C4 93
  C (IA32):          6D 3B 00 00
  C (x86-64):        6D 3B 00 00 00 00 00 00
  C (Sun):           00 00 3B 6D
B uses two's complement representation
Representing Strings
char S[6] = "15213";
Strings in C
- Represented by array of characters
- Each character encoded in ASCII format
    Standard 7-bit encoding of character set
    Character “0” has code 0x30
    Digit i has code 0x30+i
- String should be null-terminated
    Final character = 0
Compatibility
- Byte ordering not an issue
  Linux/Alpha S:  31 35 32 31 33 00
  Sun S:          31 35 32 31 33 00
Boolean Operations
NOT
AND
OR
Operator Precedence
Truth Tables
Boolean Algebra
Based on symbolic logic, designed by George Boole
Boolean expressions created from:
- NOT, AND, OR
NOT
Inverts (reverses) a boolean value
Truth table for Boolean NOT operator:
  X    ¬X
  F    T
  T    F
[Figure: digital gate diagram for NOT]
AND
Truth table for Boolean AND operator:
  X    Y    X ∧ Y
  F    F    F
  F    T    F
  T    F    F
  T    T    T
[Figure: digital gate diagram for AND]
OR
Truth table for Boolean OR operator:
  X    Y    X ∨ Y
  F    F    F
  F    T    T
  T    F    T
  T    T    T
[Figure: digital gate diagram for OR]
Operator Precedence
NOT > AND > OR
Examples showing the order of operations:
Use parentheses to avoid ambiguity
Truth Tables (1 of 3)
A Boolean function has one or more Boolean inputs, and returns a single Boolean output.
A truth table shows all the inputs and outputs of a Boolean function
Example: ¬X ∨ Y
Truth Tables (2 of 3)
Example: X ∧ ¬Y
Truth Tables (3 of 3)
Example: (Y ∧ S) ∨ (X ∧ ¬S)
[Figure: two-input multiplexer (mux) with inputs X and Y, select input S, and output Z]