Arithmetic Logic Unit (ALU)
Introduction to Computer
Yung-Yu Chuang
with slides by Sedgewick & Wayne (introcs.cs.princeton.edu), Nisan & Schocken (www.nand2tetris.org) and Harris & Harris (DDCA)
Let's Make an Adder Circuit
Goal. x + y = z for 4-bit integers.
We build a 4-bit adder: 9 inputs, 4 outputs.
Same idea scales to 128-bit adder.
Key computer component.
  1 1 1 0    (carries)
  2 4 8 7
+ 3 5 7 9
  -------
  6 0 6 6
Binary addition
Assuming a 4-bit system:
  0 0 0 1   (carries)       1 1 1 1   (carries)
  1 0 0 1                   1 0 1 1
+ 0 1 0 1                 + 0 1 1 1
---------                 ---------
0 1 1 1 0                 1 0 0 1 0
no overflow               overflow
Algorithm: exactly the same as in decimal addition
Overflow (MSB carry) has to be dealt with.
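The column-by-column procedure can be sketched in Python. This is only an illustration, not part of the original slides: the function name `add_bits` and the MSB-first list representation are assumptions made here.

```python
def add_bits(x, y):
    """Add two equal-length bit lists (MSB first), as in decimal long addition.

    Returns (sum_bits, overflow), where overflow is the carry out of the MSB.
    """
    assert len(x) == len(y)
    carry = 0
    result = []
    # Work from the least significant bit (rightmost) to the most significant.
    for xi, yi in zip(reversed(x), reversed(y)):
        total = xi + yi + carry
        result.append(total % 2)   # bit written in this column
        carry = total // 2         # carry into the next column
    result.reverse()
    return result, carry


# The two 4-bit examples from the slide.
print(add_bits([1, 0, 0, 1], [0, 1, 0, 1]))  # ([1, 1, 1, 0], 0): no overflow
print(add_bits([1, 0, 1, 1], [0, 1, 1, 1]))  # ([0, 0, 1, 0], 1): overflow
```

In a 4-bit system only the four sum bits are kept, so the second result is wrong unless the overflow flag is inspected.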
Representing negative numbers (4-bit system)
The codes of all positive numbers begin with a “0”.
The codes of all negative numbers begin with a “1”.

 0  0000
 1  0001    1111  -1
 2  0010    1110  -2
 3  0011    1101  -3
 4  0100    1100  -4
 5  0101    1011  -5
 6  0110    1010  -6
 7  0111    1001  -7
            1000  -8

To convert a number: leave all trailing 0’s and the first 1 (counting from the right) intact, and flip all the remaining bits.

Example: 2 - 5 = 2 + (-5) =
    0 0 1 0
  + 1 0 1 1
    -------
    1 1 0 1  = -3
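A small Python sketch of the conversion rule and of the 2 - 5 example. The helper names (`negate`, `to_int`, `add`) and the use of bit strings are assumptions for illustration only.

```python
BITS = 4  # word size assumed by the slide


def negate(code):
    """Negate a 2's-complement bit string using the slide's rule.

    Keep the trailing 0s and the rightmost 1; flip every bit to their left.
    """
    last_one = code.rfind("1")
    if last_one == -1:               # 0000 contains no 1: -0 is 0
        return code
    flipped = "".join("1" if b == "0" else "0" for b in code[:last_one])
    return flipped + code[last_one:]


def to_int(code):
    """Interpret a bit string as a signed 2's-complement integer."""
    value = int(code, 2)
    return value - (1 << len(code)) if code[0] == "1" else value


def add(a, b):
    """Add two codes and discard the carry out of the top bit."""
    total = (int(a, 2) + int(b, 2)) % (1 << BITS)
    return format(total, f"0{BITS}b")


minus_five = negate("0101")
print(minus_five)                 # 1011
result = add("0010", minus_five)
print(result, to_int(result))     # 1101 -3
```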
Let's Make an Adder Circuit
Step 1. Represent input and output in binary.
  1 1 0 0    (carries)
  0 0 1 0
+ 0 1 1 1
  -------
  1 0 0 1

  x3 x2 x1 x0
+ y3 y2 y1 y0
  -----------
  z3 z2 z1 z0
Step 2. [first attempt]
Build truth table.

cout   x3 x2 x1 x0   cin
     + y3 y2 y1 y0
       -----------
       z3 z2 z1 z0
4-Bit Adder Truth Table
Inputs: c0, x3 x2 x1 x0, y3 y2 y1 y0. Outputs: z3 z2 z1 z0.
[Table: one row per input combination, from all 0s to all 1s.]
2^(8+1) = 512 rows!
Q. Why is this a bad idea?
A. 128-bit adder: 2^(256+1) rows >> # electrons in universe!
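To make the growth concrete, the brute-force table can be enumerated in Python for small widths. This is a sketch only; `row_count` and `adder_truth_table` are names invented here, and each row drops the carry out, matching the 4-output table.

```python
from itertools import product


def row_count(n):
    """Rows in an n-bit adder truth table: 2n data inputs plus one carry-in."""
    return 2 ** (2 * n + 1)


def adder_truth_table(n):
    """Enumerate every row (c0, x, y, z) of the n-bit adder truth table."""
    rows = []
    for c0, x, y in product(range(2), range(2 ** n), range(2 ** n)):
        z = (x + y + c0) % (2 ** n)   # n-bit sum, carry out discarded
        rows.append((c0, x, y, z))
    return rows


table = adder_truth_table(4)
print(len(table), row_count(4))      # 512 512
print(len(str(row_count(128))))      # 78 (decimal digits in 2^257)
```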
1-bit half adder
We add numbers one bit at a time.
ADD: inputs x, y; outputs s (sum), c (carry)

x y | s c
0 0 | 0 0
0 1 | 1 0
1 0 | 1 0
1 1 | 0 1
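A half adder can be sketched in one line of Python. The gate-level formulas used here (s = x XOR y, c = x AND y) are the standard ones and are an assumption about what the slide's figure shows.

```python
def half_adder(x, y):
    """Return (s, c): sum bit s = x XOR y, carry bit c = x AND y."""
    return x ^ y, x & y


# Print the full truth table: x y s c
for x in (0, 1):
    for y in (0, 1):
        s, c = half_adder(x, y)
        print(x, y, s, c)
```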
1-bit full adder
ADD: inputs x, y, Cin; outputs s, Cout
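One common way to build a full adder is from two half adders and an OR gate. The sketch below assumes that construction; the slides may use a different one.

```python
def half_adder(x, y):
    """Return (sum, carry) for two input bits."""
    return x ^ y, x & y


def full_adder(x, y, c_in):
    """Return (s, c_out) for input bits x, y and carry-in c_in."""
    s1, c1 = half_adder(x, y)       # add the two operand bits
    s, c2 = half_adder(s1, c_in)    # then add the incoming carry
    return s, c1 | c2               # a carry from either stage propagates


print(full_adder(1, 1, 1))  # (1, 1): 1 + 1 + 1 = 3 = binary 11
```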
8-bit adder
Step 2. [do one bit at a time]
Build truth table for carry bit.
Build truth table for summand bit.

cout c3 c2 c1 c0 = 0
     x3 x2 x1 x0
   + y3 y2 y1 y0
     -----------
     z3 z2 z1 z0
Carry Bit              Summand Bit
xi yi ci | ci+1        xi yi ci | zi
0  0  0  |  0          0  0  0  | 0
0  0  1  |  0          0  0  1  | 1
0  1  0  |  0          0  1  0  | 1
0  1  1  |  1          0  1  1  | 0
1  0  0  |  0          1  0  0  | 1
1  0  1  |  1          1  0  1  | 0
1  1  0  |  1          1  1  0  | 0
1  1  1  |  1          1  1  1  | 1
Step 3.
Derive (simplified) Boolean expression.

Carry Bit                    Summand Bit
xi yi ci | ci+1  MAJ         xi yi ci | zi  ODD
0  0  0  |  0     0          0  0  0  | 0    0
0  0  1  |  0     0          0  0  1  | 1    1
0  1  0  |  0     0          0  1  0  | 1    1
0  1  1  |  1     1          0  1  1  | 0    0
1  0  0  |  0     0          1  0  0  | 1    1
1  0  1  |  1     1          1  0  1  | 0    0
1  1  0  |  1     1          1  1  0  | 0    0
1  1  1  |  1     1          1  1  1  | 1    1

ci+1 = MAJ(xi, yi, ci)
zi = ODD(xi, yi, ci)
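The claim that the carry bit is the majority function and the summand bit is the odd-parity function can be checked exhaustively. In this sketch the gate expressions chosen for `maj` and `odd` are one possible realization, not necessarily the one drawn on the slide.

```python
from itertools import product


def maj(a, b, c):
    """Majority: 1 when at least two of the three inputs are 1."""
    return (a & b) | (a & c) | (b & c)


def odd(a, b, c):
    """Odd parity: 1 when an odd number of the inputs are 1."""
    return a ^ b ^ c


# Compare against ordinary arithmetic on every row of the truth table.
for x, y, c in product((0, 1), repeat=3):
    total = x + y + c
    assert maj(x, y, c) == total // 2   # carry bit
    assert odd(x, y, c) == total % 2    # summand bit
    print(x, y, c, maj(x, y, c), odd(x, y, c))
```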
Step 4.
Transform Boolean expression into circuit.
Chain together 1-bit adders.
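Chaining 1-bit adders so that each carry out feeds the next carry in can be sketched as follows. The function names and the LSB-first bit lists are assumptions made for this illustration.

```python
def full_adder(x, y, c):
    """1-bit adder: summand bit is ODD, carry bit is MAJ of the inputs."""
    z = x ^ y ^ c
    c_next = (x & y) | (x & c) | (y & c)
    return z, c_next


def ripple_add(x_bits, y_bits, c0=0):
    """Add two bit lists where index i holds bit i (index 0 is the LSB).

    Returns (z_bits, cout) with z_bits in the same LSB-first order.
    """
    carry = c0
    z_bits = []
    for xi, yi in zip(x_bits, y_bits):
        zi, carry = full_adder(xi, yi, carry)   # carry ripples to next stage
        z_bits.append(zi)
    return z_bits, carry


# 0010 + 0111 = 1001 (2 + 7 = 9); bits are listed x0..x3 and y0..y3.
print(ripple_add([0, 1, 0, 0], [1, 1, 1, 0]))   # ([1, 0, 0, 1], 0)
```

Each stage must wait for the carry from the stage before it, which is why this design has no parallelism.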
Adder: Interface
Adder: Component Level View
Adder: Switch Level View
Subtractor
Subtractor circuit: z = x – y.
One approach: design like adder circuit.
Better idea: reuse adder circuit.
– 2’s complement: to negate an integer, flip bits, then add 1.
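Reusing the adder for subtraction can be sketched in Python on small integers. The function name `subtract` and the default 4-bit width are assumptions; in hardware the "add 1" is often supplied through the adder's carry-in.

```python
def subtract(x, y, bits=4):
    """Compute x - y with an adder only: x + (flipped y) + 1, modulo 2^bits."""
    mask = (1 << bits) - 1
    not_y = ~y & mask                 # flip the bits of y
    return (x + not_y + 1) & mask     # add 1, then keep only `bits` bits


print(format(subtract(0b0010, 0b0101), "04b"))   # 1101, the code for -3
print(format(subtract(0b0111, 0b0101), "04b"))   # 0010, the code for 2
```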
Shifter
4-bit Shifter
Inputs x0 x1 x2 x3; select lines s0 s1 s2 s3; outputs z0 z1 z2 z3.
Only one of the select lines will be on at a time.

     z0  z1  z2  z3
s0   x0  x1  x2  x3
s1   0   x0  x1  x2
s2   0   0   x0  x1
s3   0   0   0   x0

z0 = s0‧x0 + s1‧0 + s2‧0 + s3‧0
z1 = s0‧x1 + s1‧x0 + s2‧0 + s3‧0
z2 = s0‧x2 + s1‧x1 + s2‧x0 + s3‧0
z3 = s0‧x3 + s1‧x2 + s2‧x1 + s3‧x0
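The four equations share one pattern: z_i is the OR over k of s_k AND x_(i-k), with 0 substituted when i-k is negative. A Python sketch of that pattern follows; the function name `shift4` and the list layout are assumptions.

```python
def shift4(x, s):
    """4-bit shifter.

    x = [x0, x1, x2, x3] are the data bits and s = [s0, s1, s2, s3] is a
    one-hot select: s[k] = 1 means "shift by k positions".
    Returns [z0, z1, z2, z3].
    """
    assert sum(s) == 1, "exactly one select line must be on"
    z = []
    for i in range(4):
        bit = 0
        for k in range(4):
            src = x[i - k] if i - k >= 0 else 0   # shifted-in positions are 0
            bit |= s[k] & src                     # the term s_k . x_(i-k)
        z.append(bit)
    return z


print(shift4([1, 0, 1, 1], [1, 0, 0, 0]))   # [1, 0, 1, 1]: shift by 0
print(shift4([1, 0, 1, 1], [0, 1, 0, 0]))   # [0, 1, 0, 1]: shift by 1
print(shift4([1, 0, 1, 1], [0, 0, 0, 1]))   # [0, 0, 0, 1]: shift by 3
```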
N-bit Decoder
N-bit decoder
N address inputs, 2^N data outputs
Addressed output bit is 1; all others are 0
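A decoder can be sketched as one AND term per output line, where each address bit enters either directly or inverted. The function name `decoder` and the LSB-first address order are assumptions for this example.

```python
def decoder(address_bits):
    """N-bit decoder.

    address_bits[i] is address bit i (index 0 is the LSB).
    Returns a list of 2^N outputs; only the addressed output is 1.
    """
    n = len(address_bits)
    outputs = []
    for k in range(2 ** n):
        line = 1
        for i, bit in enumerate(address_bits):
            wanted = (k >> i) & 1
            # AND in the bit itself if line k expects a 1 here, else its inverse.
            line &= bit if wanted else 1 - bit
        outputs.append(line)
    return outputs


print(decoder([0, 1]))   # address 2 -> [0, 0, 1, 0]
```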
2-Bit Decoder Controlling 4-Bit Shifter
Ex. Put in a binary amount to shift (inputs r0, r1).
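Combining the two devices gives a shifter driven by a binary amount. In this sketch r0 is assumed to be the low-order bit of the shift amount (the slide does not say which is which), and all function names are made up for illustration.

```python
def decode2(r0, r1):
    """2-bit decoder: one-hot [s0, s1, s2, s3] for the amount r0 + 2*r1."""
    n0, n1 = 1 - r0, 1 - r1          # inverted inputs
    return [n1 & n0, n1 & r0, r1 & n0, r1 & r0]


def shift4(x, s):
    """4-bit shifter written out term by term from the shifter equations."""
    return [
        s[0] & x[0],
        (s[0] & x[1]) | (s[1] & x[0]),
        (s[0] & x[2]) | (s[1] & x[1]) | (s[2] & x[0]),
        (s[0] & x[3]) | (s[1] & x[2]) | (s[2] & x[1]) | (s[3] & x[0]),
    ]


def shift_by(x, r0, r1):
    """Shift x = [x0, x1, x2, x3] by the binary amount given by (r1 r0)."""
    return shift4(x, decode2(r0, r1))


print(shift_by([1, 1, 0, 0], 0, 1))   # shift by 2 -> [0, 0, 1, 1]
```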
Arithmetic Logic Unit
Arithmetic logic unit (ALU). Computes all operations in parallel.
Add and subtract.
Xor.
And.
Shift left or right.
Q. How to select desired answer?
1-Hot OR
1-hot OR.
All devices compute their answer; we pick one.
Exactly one select line is on.
Implies exactly one output line is relevant.
x.1 = x
x.0 = 0
x + 0 = x
[Figure labels: adder, xor, shift, decoder.]
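The three identities are all a 1-hot OR needs: AND every device's answer with its select line, then OR the results. A Python sketch with made-up device outputs follows; `one_hot_or` is a hypothetical name.

```python
def one_hot_or(answers, select):
    """Pick one answer word using a one-hot select.

    answers: list of equal-length bit lists, one per device.
    select:  one-hot list with one entry per device.
    """
    assert sum(select) == 1, "exactly one select line must be on"
    width = len(answers[0])
    out = [0] * width
    for word, sel in zip(answers, select):
        for i in range(width):
            # x.1 = x keeps the chosen word; x.0 = 0 silences the others;
            # x + 0 = x means OR-ing in the silenced words changes nothing.
            out[i] |= word[i] & sel
    return out


adder_out = [1, 0, 0, 1]     # illustrative values only
xor_out = [0, 1, 0, 1]
shifter_out = [0, 1, 0, 0]
print(one_hot_or([adder_out, xor_out, shifter_out], [0, 1, 0]))  # [0, 1, 0, 1]
```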
Bus
16-bit bus
Bundle of 16 wires
Memory transfer, register transfer
8-bit bus
Bundle of 8 wires
TOY memory address
4-bit bus
Bundle of 4 wires
TOY register address
Bitwise AND, XOR, NOT
Bitwise logical operations
Inputs x and y: n bits each
Output z: n bits
Apply logical operation to each corresponding pair of bits
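Applying a gate to each corresponding pair of bits can be sketched with a list comprehension. The helper names below are assumptions, and NOT is treated as the one-input case.

```python
def bitwise(op, x, y):
    """Apply a 2-input gate to each corresponding pair of bits."""
    assert len(x) == len(y)
    return [op(a, b) for a, b in zip(x, y)]


def bit_and(a, b):
    return a & b


def bit_xor(a, b):
    return a ^ b


def bitwise_not(x):
    """NOT takes a single n-bit input and inverts every bit."""
    return [1 - a for a in x]


x = [1, 1, 0, 0]
y = [1, 0, 1, 0]
print(bitwise(bit_and, x, y))   # [1, 0, 0, 0]
print(bitwise(bit_xor, x, y))   # [0, 1, 1, 0]
print(bitwise_not(x))           # [0, 0, 1, 1]
```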
TOY ALU
TOY ALU
Big combinational logic
16-bit bus
Add, subtract, and, xor, shift left, shift right, copy input 2
Device Interface Using Buses
16-bit words for TOY memory
Device. Processes a word at a time.
Input bus. Wires on top.
Output bus. Wires on bottom.
Control. Individual wires on side.
ALU
Arithmetic logic unit.
Add and subtract.
Xor.
And.
Shift left or right.
Arithmetic logic unit.
Computes all operations in parallel.
Uses 1-hot OR to pick each bit answer.
How to convert opcode to 1-hot OR signal?
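One plausible answer is to run the opcode through a decoder and use its outputs as the select lines. The sketch below assumes that design; the four operations and their opcode numbers are hypothetical and are not the actual TOY encoding.

```python
MASK = 0xFFFF  # 16-bit words, matching the 16-bit bus


def decode(opcode, n_lines):
    """Decoder: turn a binary opcode into a one-hot list of select lines."""
    return [1 if k == opcode else 0 for k in range(n_lines)]


def alu(opcode, x, y):
    """Toy ALU: every unit computes its answer, then a 1-hot OR picks one."""
    answers = [
        (x + y) & MASK,   # hypothetical opcode 0: add
        (x - y) & MASK,   # hypothetical opcode 1: subtract
        x & y,            # hypothetical opcode 2: and
        x ^ y,            # hypothetical opcode 3: xor
    ]
    select = decode(opcode, len(answers))
    out = 0
    for answer, sel in zip(answers, select):
        out |= answer if sel else 0   # word-level form of answer AND select
    return out


print(alu(0, 2, 5))             # 7
print(hex(alu(1, 2, 5)))        # 0xfffd, the 16-bit code for -3
print(alu(3, 0b1100, 0b1010))   # 6
```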
Hack ALU
Inputs: x (16 bits), y (16 bits)
Control bits: zx nx zy ny f no
Outputs: out (16 bits), zr, ng
out(x, y, control bits) =
x+y, x-y, y–x, 0, 1, -1,
x, y, -x, -y, !x, !y,
x+1, y+1, x-1, y-1, x&y, x|y
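The slide names the six control bits but does not spell out their effect. The sketch below assumes the usual Nand2Tetris reading (zx/zy zero an input, nx/ny negate it bitwise, f selects add versus and, no negates the output, zr and ng flag a zero or negative result); treat it as an illustration under that assumption.

```python
MASK = 0xFFFF  # 16-bit words


def hack_alu(x, y, zx, nx, zy, ny, f, no):
    """Return (out, zr, ng) for 16-bit inputs x, y and six control bits."""
    x &= MASK
    y &= MASK
    if zx:
        x = 0                  # zero the x input
    if nx:
        x = ~x & MASK          # bitwise-negate the x input
    if zy:
        y = 0                  # zero the y input
    if ny:
        y = ~y & MASK          # bitwise-negate the y input
    out = (x + y) & MASK if f else x & y
    if no:
        out = ~out & MASK      # bitwise-negate the output
    zr = int(out == 0)         # 1 when the result is zero
    ng = (out >> 15) & 1       # 1 when the sign bit is set
    return out, zr, ng


print(hack_alu(5, 3, 0, 0, 0, 0, 1, 0))   # (8, 0, 0): x+y
print(hack_alu(5, 3, 0, 1, 0, 0, 1, 1))   # (2, 0, 0): x-y
print(hack_alu(5, 3, 0, 1, 0, 1, 0, 1))   # (7, 0, 0): x|y
print(hack_alu(5, 3, 1, 0, 1, 0, 1, 0))   # (0, 1, 0): constant 0
```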
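Hack ALU
The ALU in the CPU context (a sneak preview of the Hack platform)
[Figure: the ALU, controlled by bits c1,c2,…,c6, takes D from the D register and A/M from a Mux that selects (via bit a) between A from the A register and M from RAM (the selected register); the result is out.]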
Perspective
Combinational logic
Our adder design is very basic: no parallelism
It pays to optimize adders
Our ALU is also very basic: no multiplication, no division
Where is the seat of more advanced math operations?
a typical hardware/software tradeoff.