Recap: ALU
Big combinational logic (16-bit bus)
Add/subtract, and, xor, shift left/right, copy input 2
A 3-bit control for 5 primary ALU operations
– ALU performs operations in parallel
– Control wires select which result the ALU outputs
Can we combine these 5 bits into 3 bits for 7 operations?
Yes, you can. But, you will still need 5 bits at the end.
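The "compute everything in parallel, then select" idea from the ALU recap can be sketched in Python. This is only an illustration: the operation names used as selectors and the choice of a logical (not arithmetic) right shift are assumptions, not the slide's control encoding.

```python
MASK = 0xFFFF  # 16-bit bus

def alu(op, x, y):
    """Compute every result, then pick one, like a mux on the ALU outputs."""
    results = {
        "add":   (x + y) & MASK,
        "sub":   (x - y) & MASK,
        "and":   (x & y) & MASK,
        "xor":   (x ^ y) & MASK,
        "shl":   (x << (y & 0xF)) & MASK,
        "shr":   (x & MASK) >> (y & 0xF),  # logical shift: an assumption
        "copy2": y & MASK,                 # copy input 2
    }
    return results[op]  # the control input selects which result is output

print(f"{alu('add', 0x0028, 0x0064):04X}")  # 008C
```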
Recap: ALU
Goal: select from one of n k-bit buses
Implemented by layering k n-to-1 multiplexers
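A minimal sketch of the bus selection described above, assuming buses are modeled as Python integers and that `mux1` and `bus_mux` are illustrative names. Each of the k output bits comes from its own one-bit n-to-1 multiplexer, all sharing the same select input.

```python
def mux1(select, inputs):
    """One-bit n-to-1 multiplexer: returns the selected input bit."""
    return inputs[select]

def bus_mux(select, buses, k):
    """Select one of n k-bit buses using k one-bit n-to-1 multiplexers."""
    out = 0
    for bit in range(k):
        column = [(bus >> bit) & 1 for bus in buses]  # same bit of every bus
        out |= mux1(select, column) << bit
    return out

print(f"{bus_mux(2, [0x1111, 0x2222, 0x3333, 0x4444], 16):04X}")  # 3333
```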
Recap: Multiplexer
Recap: flip flop
Stand-Alone Register
Register file implementation
Recap: memory
Recap: register file
Clock
Fundamental abstraction: regular on-off pulse.
– on: fetch phase
– off: execute phase
External analog device.
Synchronizes operations of different circuit elements.
Requirement: clock cycle longer than max switching time.
[Figure: clock waveform alternating between on and off, with the cycle time marked.]
Introduction to Computer Science • Robert Sedgewick and Kevin Wayne • Copyright © 2005 • http://www.cs.Princeton.EDU/IntroCS
TOY Machine Architecture
The TOY Machine
Combinational circuits. ALU.
Sequential circuits. Memory.
Machine architecture. Wire components together to make computer.
TOY machine.
256 16-bit words of memory.
16 16-bit registers.
1 8-bit program counter.
16 instruction types.
[Figure: fetch-execute cycle.]
Design a processor
How to build a processor
Develop instruction set architecture (ISA)
– 16-bit words, 16 TOY machine instructions
Determine major components
– ALU, memory, registers, program counter
Determine datapath requirements
– Flow of bits
Analyze how to implement each instruction
– Determine settings of control signals
Practice: 4-bit counter
Practice: stack
Build a TOY: Interface
Instruction set architecture (ISA).
16-bit words, 256 words of memory, 16 registers.
Determine set of primitive instructions.
– too narrow ⇒ cumbersome to program
– too broad ⇒ cumbersome to build hardware
16 instructions.
0: halt
1: add
2: subtract
3: and
4: xor
5: shift left
6: shift right
7: load address
8: load
9: store
A: load indirect
B: store indirect
C: branch zero
D: branch positive
E: jump register
F: jump and link
TOY Reference Card
Format 1: opcode (bits 15-12), dest d (bits 11-8), source s (bits 7-4), source t (bits 3-0)
Format 2: opcode (bits 15-12), dest d (bits 11-8), addr (bits 7-0)

#   Operation         Fmt  Pseudocode
0:  halt              1    exit(0)
1:  add               1    R[d] ← R[s] + R[t]
2:  subtract          1    R[d] ← R[s] - R[t]
3:  and               1    R[d] ← R[s] & R[t]
4:  xor               1    R[d] ← R[s] ^ R[t]
5:  shift left        1    R[d] ← R[s] << R[t]
6:  shift right       1    R[d] ← R[s] >> R[t]
7:  load addr         2    R[d] ← addr
8:  load              2    R[d] ← mem[addr]
9:  store             2    mem[addr] ← R[d]
A:  load indirect     1    R[d] ← mem[R[t]]
B:  store indirect    1    mem[R[t]] ← R[d]
C:  branch zero       2    if (R[d] == 0) pc ← addr
D:  branch positive   2    if (R[d] > 0) pc ← addr
E:  jump register     1    pc ← R[t]
F:  jump and link     2    R[d] ← pc; pc ← addr

Register 0 is always 0.
Loads from mem[FF] come from stdin.
Stores to mem[FF] go to stdout.
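The two instruction formats on the reference card can be unpacked with shifts and masks. The sketch below assumes the usual layout of four 4-bit fields, with `addr` overlapping the `s` and `t` fields; `decode` is an illustrative name.

```python
def decode(word):
    """Split a 16-bit TOY word into (op, d, s, t, addr)."""
    op   = (word >> 12) & 0xF  # bits 15-12
    d    = (word >> 8) & 0xF   # bits 11-8
    s    = (word >> 4) & 0xF   # bits 7-4  (Format 1)
    t    = word & 0xF          # bits 3-0  (Format 1)
    addr = word & 0xFF         # bits 7-0  (Format 2)
    return op, d, s, t, addr

print(decode(0x1234))  # (1, 2, 3, 4, 52)
```

A Format 1 instruction uses `s` and `t` and ignores `addr`; a Format 2 instruction does the reverse.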
Components
PC
Incrementer (+1)
Registers: W, W Data, W Addr, A Addr, A Data, B Addr, B Data
Memory: W, W Data, Addr, R Data
IR: op, d, s, t
ALU
Clock
Cond Eval: 16-bit input X (X0 : X15); outputs Y0, Y1 (=0, >0)
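The Cond Eval component takes a 16-bit value and reports whether it is zero and whether it is positive. A minimal sketch, assuming the value is a two's-complement word so that bit 15 is the sign bit; `cond_eval` is an illustrative name.

```python
def cond_eval(x):
    """Return (is_zero, is_positive) for a 16-bit two's-complement word."""
    x &= 0xFFFF
    is_zero = int(x == 0)
    is_positive = int(x != 0 and (x >> 15) == 0)  # nonzero with sign bit clear
    return is_zero, is_positive

print(cond_eval(0x0000), cond_eval(0x0028), cond_eval(0xFFFF))
# (1, 0) (0, 1) (0, 0)
```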
Datapath
[Figure: datapath connecting PC, Registers, Memory, IR, ALU, and Cond Eval.]

Opcode  Data movement
1-6     R[d] ← R[s] ALU R[t]
7       R[d] ← addr
8       R[d] ← mem[addr]
9       mem[addr] ← R[d]
A       R[d] ← mem[R[t]]
B       mem[R[t]] ← R[d]
C, D    if (R[d]?) pc ← addr
E       pc ← R[t]
F       R[d] ← pc; pc ← addr
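The data movements required of the datapath can be written out as one function per instruction step. This is a behavioral sketch, not the circuit: the name `execute`, the treatment of shift right as a logical shift, the masking of shift amounts to 4 bits, and the omission of stdin/stdout at mem[FF] are all assumptions made for brevity. Jump register uses R[t], as in the slides.

```python
MASK = 0xFFFF  # 16-bit words

ALU = {  # opcodes 1-6: R[d] <- R[s] ALU R[t]
    0x1: lambda a, b: a + b,
    0x2: lambda a, b: a - b,
    0x3: lambda a, b: a & b,
    0x4: lambda a, b: a ^ b,
    0x5: lambda a, b: a << (b & 0xF),
    0x6: lambda a, b: a >> (b & 0xF),  # logical shift: an assumption
}

def signed(x):
    """Interpret a 16-bit word as a two's-complement integer."""
    return x - 0x10000 if x & 0x8000 else x

def execute(word, pc, R, mem):
    """Apply one fetched instruction; return the next pc, or None on halt."""
    op, d = (word >> 12) & 0xF, (word >> 8) & 0xF
    s, t, addr = (word >> 4) & 0xF, word & 0xF, word & 0xFF
    if op == 0x0:
        return None
    if op in ALU:
        R[d] = ALU[op](R[s], R[t]) & MASK
    elif op == 0x7:
        R[d] = addr
    elif op == 0x8:
        R[d] = mem[addr]
    elif op == 0x9:
        mem[addr] = R[d]
    elif op == 0xA:
        R[d] = mem[R[t] & 0xFF]
    elif op == 0xB:
        mem[R[t] & 0xFF] = R[d]
    elif op == 0xC:
        pc = addr if R[d] == 0 else pc
    elif op == 0xD:
        pc = addr if signed(R[d]) > 0 else pc
    elif op == 0xE:
        pc = R[t] & 0xFF
    elif op == 0xF:
        R[d], pc = pc, addr
    R[0] = 0  # register 0 is always 0
    return pc

R, mem = [0] * 16, [0] * 256
R[3], R[4] = 0x0028, 0x0064
pc = execute(0x1234, 0x21, R, mem)  # add: R[2] <- R[3] + R[4]
print(f"R[2]={R[2]:04X} PC={pc:02X}")  # R[2]=008C PC=21
```

Here `pc` is the already incremented program counter, which is why jump and link stores it unchanged.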
Datapath
[Figure: datapath with multiplexers added. Annotations: pc+1; pc for jump and branch; address for load/store; result of ALU, or address for load address; pc for jal; addr; store data; load. Bus widths marked: 8 and 16 bits.]
Control
[Figure: datapath annotated with control signals.]
Control signals: WRITE_MEM, WRITE_IR, CLOCK_MEM, CLOCK_REG, WRITE_REG, ALU_OP, READ_REG A MUX, WRITE_REG MUX, MEM_ADDR MUX, WRITE_PC, PC_MUX, ALU MUX
A total of 17 control signals
TOY architecture
[Figure: complete TOY architecture: PC, Registers, Memory, IR, ALU, and Cond Eval, plus a Clock, a 1-bit counter, and a Control block with inputs Opcode, Execute, Fetch, Clock, =0, and >0.]
Clock
1-bit counter
Circuit that oscillates between 1 and 0.
Clock
Two-cycle design (fetch and execute).
Use the 1-bit counter to distinguish between the 2 cycles.
Use two cycles since the fetch and execute phases each access memory and alter the program counter.
Clocking Methodology
Two-cycle design.
Each control signal is in one of four epochs.
– fetch [set memory address from pc]
– fetch and clock [write instruction to IR]
– execute [set ALU inputs from registers]
– execute and clock [write result of ALU to registers]
[Figure: clock and fetch/execute waveforms, divided into four phases.]
Phase 1: fetch
Phase 2: fetch & clock
Phase 3: execute
Phase 4: execute & clock
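The four epochs can be generated from two signals: the 1-bit counter (fetch or execute) and the clock. The sketch below assumes that each phase starts with the clock low, so signals settle, and ends with the clock high, when state is written; `epochs` is an illustrative name.

```python
from itertools import islice

def epochs():
    """Yield the epoch names in order, forever."""
    counter = 0  # 1-bit counter: 0 = fetch, 1 = execute
    while True:
        phase = "execute" if counter else "fetch"
        yield phase               # clock low: inputs are set up
        yield phase + " & clock"  # clock high: results are written
        counter ^= 1              # the counter oscillates between 0 and 1

print(list(islice(epochs(), 4)))
# ['fetch', 'fetch & clock', 'execute', 'execute & clock']
```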
Example: ADD
Initial state: PC=20, IR=????, Mem[20]=1234, R[3]=0028, R[4]=0064.
Example: ADD (fetch)
State: PC=20, IR=????, Mem[20]=1234, R[3]=0028, R[4]=0064.
– PC drives the memory address with 20.
– Memory reads out the word 1234.
– The incrementer computes PC + 1 = 21.
Example: ADD (fetch and clock)
State: PC=21, IR=1234, Mem[20]=1234, R[3]=0028, R[4]=0064.
– IR fields: op=1, d=2, s=3, t=4.
Example: ADD (execute)
State: PC=21, IR=1234, Mem[20]=1234, R[3]=0028, R[4]=0064.
– IR fields s=3 and t=4 address the register file.
– The register file reads out R[3]=0028 and R[4]=0064 to the ALU.
– The ALU computes 0028 + 0064 = 008C.
Example: ADD (execute and clock)
State: PC=21, IR=1234, R[2]=008C, Mem[20]=1234, R[3]=0028, R[4]=0064.
– The ALU result 008C is written to R[2] (W Addr = d = 2).
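The ADD walk-through can be replayed as a short script, one block per epoch. This is a sketch that hard-wires the adder and uses illustrative variable names (`mem_out`, `pc_plus_1`, `alu_out`); the state values are the ones from the example, read as hexadecimal.

```python
# State from the example.
pc, ir = 0x20, None
mem = {0x20: 0x1234}
R = [0] * 16
R[3], R[4] = 0x0028, 0x0064

# fetch: pc sets the memory address; the incrementer computes pc + 1.
mem_out = mem[pc]
pc_plus_1 = (pc + 1) & 0xFF

# fetch and clock: write the instruction to IR and pc + 1 to PC.
ir, pc = mem_out, pc_plus_1

# execute: IR fields select the registers that feed the ALU.
d, s, t = (ir >> 8) & 0xF, (ir >> 4) & 0xF, ir & 0xF
alu_out = (R[s] + R[t]) & 0xFFFF  # opcode 1 selects the adder

# execute and clock: write the ALU result to R[d].
R[d] = alu_out

print(f"PC={pc:02X} IR={ir:04X} R[{d}]={R[d]:04X}")  # PC=21 IR=1234 R[2]=008C
```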
Example: Jump and link
Initial state: PC=20, Mem[20]=FF30, R[3]=0028, R[4]=0064.
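For the jump and link example, the word FF30 decodes as opcode F, d = F, addr = 30. A minimal trace, assuming (as in the two-cycle design) that PC has already been incremented by the time the execute phase reads it:

```python
pc = 0x20
mem = {0x20: 0xFF30}
R = [0] * 16

# fetch and clock: IR <- mem[pc], PC <- pc + 1
ir, pc = mem[pc], (pc + 1) & 0xFF
op, d, addr = (ir >> 12) & 0xF, (ir >> 8) & 0xF, ir & 0xFF

# execute and clock for opcode F: R[d] <- pc; pc <- addr
R[d], pc = pc, addr

print(f"op={op:X} R[{d:X}]={R[d]:04X} PC={pc:02X}")  # op=F R[F]=0021 PC=30
```

The PC is written twice in one instruction here, once in each phase, which is the case the two-cycle design is meant to handle.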
Fetch
Program counter
Read program counter when
– Fetch
– Execute for jal
Write program counter when
– Fetch and clock
– Execute and clock, depending on conditions
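The rule for writing the program counter can be expressed as a small predicate. This sketch assumes that the "conditions" are the branch tests from the reference card (opcodes C and D) and that jump register and jump and link (E and F) always write; `write_pc` and its parameter names are illustrative, not the slide's signal definitions.

```python
def write_pc(phase, clock, opcode, eq0, gt0):
    """Return True when the PC should be written in this epoch."""
    if not clock:
        return False           # state changes only on the clocked epochs
    if phase == "fetch":
        return True            # fetch and clock: pc <- pc + 1
    if opcode == 0xC:
        return bool(eq0)       # branch zero
    if opcode == 0xD:
        return bool(gt0)       # branch positive
    return opcode in (0xE, 0xF)  # jump register, jump and link

print(write_pc("fetch", 1, 0x1, 0, 0))    # True
print(write_pc("execute", 1, 0x1, 0, 0))  # False
print(write_pc("execute", 1, 0xC, 1, 0))  # True
```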
Fetch and clock
Instruction register
Execute
Control
Two approaches to implement control
Micro-programming
– Use a memory (ROM) for micro-code
– More flexible
– Easier to program
Hard-wired
– Use logic gates and wire
– More efficient
Control
Inputs: Opcode, Execute, Fetch, Clock, =0, >0 (9-bit address)
512x17 ROM
Outputs: 17 control signals
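One way to read the 512x17 ROM: a 4-bit opcode plus the five 1-bit inputs (Execute, Fetch, Clock, =0, >0) give a 9-bit address, so there are 2^9 = 512 entries, each holding one 17-bit control word. The sketch below shows only the addressing. The bit ordering is an assumption, and the ROM contents are left as zeros because the slides do not list them.

```python
def rom_address(opcode, execute, fetch, clock, eq0, gt0):
    """Pack the control inputs into a 9-bit ROM address (bit order assumed)."""
    return (opcode << 5) | (execute << 4) | (fetch << 3) | (clock << 2) | (eq0 << 1) | gt0

ROM = [0] * 512  # placeholder contents: one 17-bit control word per address

address = rom_address(0x1, 1, 0, 1, 0, 0)  # add, execute and clock
control_word = ROM[address]
print(address, len(ROM))  # 52 512
```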
Control
ALU MUX
ALU control
Execute and clock (write-back)
Writing registers and memory
Memory: W, W Data, Addr, R Data
Registers: W, W Data, W Addr, A Addr, A Data, B Addr, B Data
More examples
TOY "Classic", Back Of Envelope Design
Build a TOY-Lite: Devices
10-bit word, 4-word register, 16-word memory
Control. Circuit that determines control line sequencing.
[Figure: control unit. Labels: data bus from ALU; data bus to memory input; control lines to ALU; opcode from IR; control lines to processor registers; external clock (just ticks).]
Build a TOY-Lite: Layout
Build a TOY-Lite: Datapath
Build a TOY-Lite: Control
Real Microprocessor (MIPS R10000)
History + Future
Computer constructed by layering abstractions.
Better implementation at low levels improves everything.
Ongoing search for better abstract switch!
History.
1820s: mechanical switches.
1940s: relays, vacuum tubes.
1950s: transistor, core memory.
1960s: integrated circuit.
1970s: microprocessor.
1980s: VLSI.
1990s: integrated systems.
2000s: web computer.
Future: quantum, optical soliton, …
Ray Kurzweil (http://en.wikipedia.org/wiki/Image:PPTMooresLawai.jpg)