Data Manipulation
National Chiao Tung University Chun-Jen Tsai 03/09/2012
Computer Architecture
Central Processing Unit (CPU) contains
Arithmetic/Logic Unit (ALU)
Control Unit
Registers
Cache Memory
Bus
Main Memory
I/O devices
CPU
main memory
I/O devices
Controllers
bus
processor core
3/34
Register
A register is a special memory cell inside the CPU that can store an n-bit data
For modern CPUs, n is usually 8, 16, 32, or 64
Registers are used to store intermediate computation results
An n-bit register is composed of a group of n flip-flops
Example: a 3-bit register:
read/write control
output bits
input bits
The truth table of the flip-flop is
Don’t care 1
1
1 1 0
0 0 1
Previous value 0
0
Q R S Q
S
R
Note: is denoted as , it is called a NOR gate. is a 2-to-1 multiplexor.
0
0
0
0 0
0
Bus
A bus is a collection of wires to connect a computer’s CPU, memory, and I/O devices
Information (0’s and 1’s) are transmitted among the CPU, memory, and devices via the bus following some
handshaking rules
CPU
I/O Device
Main Memory
System Controller bus
5/34
Stored Program Concept
A program is a sequence of instructions that implements an algorithm
A program is just a special type of data and can be stored in main memory
Program memory and data memory can be shared or separate; e.g. separate memory architecture:
CPU
program memory
bus
data memory
What Do Instructions Look Like
When you ask other people to do something, you use human languages (Chinese, English, etc.)
However, human languages are too cumbersome and too ambiguous for computers to decode and execute!
Machine languages must be
Concise – easy to decode
Precise – each instruction can be executed in only one way
7/34
Machine Language
A machine instruction is an instruction coded as a bit pattern directly recognizable by the CPU
A machine language is the set of all instructions recognizable by a CPU
Each CPU has its own machine language, called the Instruction Set Architecture (ISA) of the CPU
The bit-pattern of a machine instruction can be
divided into two parts: op-code field and operand field
Parts of a Machine Instruction
Op-code field
Specifies which machine operation to execute
One per instruction
Operand field (a.k.a. addressing mode field)
Data and addresses related to this operation
Number of operands varies depending on op-code
Example: the instruction that asks CPU to add data2 and data6 and store the result in data7 can be coded in a 16-bit number as follows:
ADD data7 data2 data6
4 bits op-code 4 bits 4 bits 4 bits
9/34
Instruction Lengths
A machine language can use fixed-length coding or variable-length coding of all its instructions
Fixed-length coding: easier for the CPU to decode, usually allows fewer instructions in the language
Variable-length coding: harder for the CPU to decode, but allows a richer set of instructions
Machine Language and Architecture
Machine language reflects the architecture of the
CPU, there are two major types of design philosophy:
RISC and CISC
Reduced Instruction Set Computer (RISC)
CPU only executes a small set of simple instructions
However, CPU executes these simple instructions really fast
Usually use fixed-length coding of instructions
Complex Instruction Set Computer (CISC)
CPU executes many complex and powerful instructions
Complex instructions takes longer to execute, but an
algorithm implemented in CISC machine language requires less instructions than that of RISC’s
Usually use variable-length coding of instructions
11/34
Machine Instruction Types
The instructions of a machine language can be classified into three groups:
Data Transfer: copy data between CPU registers and/or main memory cells
Arithmetic/Logic: use existing data values (stored in registers or main memory cells) to compute a new data value
Control: direct the execution flow of the program
Example: A Simple CPU
The textbook describes a simple CPU architecture
Central Processing Unit
Control Unit
00
Program Counter
Instruction Register Registers
00 20 5A
07
0 1 2
F ALU
ADD
XOR
ROR
Main memory
35 A7
00
00 01
04
00
address cells
FF 02 C0
00
03
bus
13/34
Example: An Instruction
Op-code Operand
0011 0101 1010 0111
3 5 A 7
Actual bit pattern (16 bits)
Hexadecimal form (4 bits)
Op-code 3 means to store the contents of a register in a memory cell
This part of the operand identifies the register whose contents are to be stored
This part of the operand identifies the address of the memory cell that is to receive data
Note: The complete instruction set is listed in Appendix C of the textbook
Example: Program in Machine Codes
15/34
Program Execution
Controlled by two special-purpose registers
Program counter: address of next instruction
Instruction register: current instruction
Each machine cycle of program execution is composed of three steps:
Fetch – copies memory cells addressed by the program
counter to the instruction register and increment the program counter to the next instruction in main memory
Decode – decodes the bit pattern in the instruction register to determine the operation required and the related
operands
Execute – performs the operation specified by the instruction
Program Flow Control
Some instructions change the next instruction to be fetched based on some condition
Example: conditional jump instruction
17/34
Running a Program in Main Memory
CPU
Control Unit
A0
Program Counter
Instruction Register Registers
00 00 00
00
0 1 2
F ALU
ADD
XOR
ROR
Main memory
15 6C
50
A0 A1
A4
56
address cells
A5
16
A2 A3 6D
C0
A8
00
A9
30
A6 A7 6E
Program counter contains the address of the first instruction
Program is stored in main memory beginning at address A0
bus
Fetch Step (1/2)
19/34
Fetch Step (2/2)
Decode and Execution
To understand how execution and decode is done inside a CPU, we need a more detail architecture diagram of a CPU than the one illustrated in the textbook
(Note: next three slides are beyond the scope of this course, but is good for your understanding of CPU)
21/34
Internal Structure of a Realistic CPU
CPU
Register bank
Program Counter
⋅ ⋅ ⋅
Register 0 Register 1 Register 2
Register F
ALU
Data Out Register Address Out Register
Data In Register
Instruction Decode
and Control
Incrementer
Main memory memory cell
address
data from a cell data to a cell
512F
Instruction Register
Concept of “Data Path”
In previous chapter, we mentioned that a function can be implemented using a “circuit of gates:”
Each function (or data processing circuit) can also be referred to as a “data path”
A general purpose computer can control its data path based on the instructions it receives
a
output b
c
Circuit
23/34
Execution of an Instruction
After decoding of an “add” instruction, the data path of a CPU may become as follows:
CPU
Program Counter Register 0 Register 1 Register 2
⋅ ⋅ ⋅
Register F
+
Data Out Register Address Out Register
Data In Register Instruction
Decode and Control
Memory
+2
Computation of R1 = R2 + RF
512F
datapath
Arithmetic/Logic Operations
The Arithmetic/Logic Unit (ALU) of a CPU is the muscle that performs data manipulation
Three types of operations are supported by ALU:
Logic: AND, OR, XOR, NOT
Rotate and Shift: rotate (a.k.a. circular shift), logical shift, arithmetic shift
Arithmetic: add, subtract, multiply, divide
25/34
Rotation (Circular Shift)
Logical Shift and Arithmetic Shift
There is only one way to do an n-bit left shift
There are two ways to do n-bit right shift
Logical right shift
Arithmetic right shift
10100011 Left shift by two bits
10001100
10100011 Right shift by two bits
00101000
10100011 Right shift by two bits
11101000
27/34
I/O Subsystem of a Computer
Communicating with Devices (1/2)
Controller is an intermediary device that handles communication between the computer and a device
CPU transfers data to/from the device using addresses
Dedicated instruction I/O: CPU uses dedicated I/O
addresses to communicate with device controllers; often these addresses are called “I/O ports”
Memory-mapped I/O: CPU uses main memory addresses to communicate with device controllers
29/34
Communicating with Devices (2/2)
Direct memory access (DMA)
A controller can access main memory directly when CPU is not using the bus; this capability is called DMA
Sometimes, we design a special circuit just to move data around inside the system, we also call this circuit a DMA
Von Neumann Bottleneck
If the CPU fetches both the instructions and data through a single bus connected to a memory device, the performance of the CPU would be limited by the performance of the BUS and the memory device
Handshaking
The process of controlling the transfer of data between components connected via a bus
Memory Mapped I/O Example
I/O address
(an empty memory cell that does not really exist in main memory device)
31/34
Data Communication Terminologies
Modem (modulation-demodulation):
In communication systems, modulation is a process that
“pack” data (analog or digital) to a carrier signal (like packing goods in a truck) for transmission of data in the real world
In computer terminology, “modem” is a device that perform this operation when the carrier is a telephone line
Serial communication: transfers one bit at a time
Parallel communication: transfers multiple bits simultaneously
Multiplexing: interleaving of data so that different data can be transmitted over a single communication path
Improving CPU Architecture
Pipelining: overlap steps of the CPU operation cycles
Parallel processing: execute multiple operations simultaneously
Parallel processing can be performed within a CPU or across CPUs
time
Fetch 1 Decode 1 Execute 1
Fetch 2 Decode 2 Execute 2
Fetch 3 Decode 3 Execute 3
t0 t
1 t
2 t
3 t
4
33/34
Multiprocessor Systems
A single processing unit execute one instruction a time, which is called Single-Instruction stream Single- Data stream (SISD) architecture
If multiple processing units (multi-CPU, multi-core, or multi-ALU) are connected to the main memory, we have parallel processing architecture:
Multiple-Instructions stream Multiple-Data stream (MIMD):
different instructions are issued at the same time to operate on different data
Single-Instruction stream Multiple-Data stream (SIMD):
one instruction is issued to operate on different data
Multi-CPU, Multi-Core, Multi-ALU
Multi-CPU Architecture:
Multi-Core Architecture:
Multi-ALU Architecture:
main memory
bus
CPU
bus Core 1 Core 2
other stuff CPU 1
Core other stuff
CPU 2
Core other stuff
main memory
bus Registers
other stuff Instruction Fetch, Decode, and Control
ALU 1 ALU 2
Registers IF, ID, & Ctrl.