Lecture 10

Storage Systems

• Types of Storage Devices

(2)

I/O Systems

[Figure: the processor and its cache sit on a memory–I/O bus shared with main memory and the I/O controllers for disks, graphics, and the network; the controllers signal the processor with interrupts.]

(3)

Motivation: Who Cares About I/O?

• CPU Performance: 50% to 100% per year

• Multiprocessor supercomputers 150% per year

• I/O system performance limited by mechanical delays

– < 5% per year (IO per sec or MB per sec)

• Amdahl's Law: system speed-up limited by the slowest part!

– 10% IO & 10x CPU => 5x Performance (lose 50%)

– 10% IO & 100x CPU => 10x Performance (lose 90%)

– I/O bottleneck:

» Diminishing fraction of time in CPU
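The two Amdahl's Law figures can be checked with a few lines of arithmetic. This is a minimal sketch; the function name `overall_speedup` is made up for illustration, and it assumes the I/O share of run time is not sped up at all.

```python
def overall_speedup(io_fraction, cpu_speedup):
    """Amdahl's Law: only the CPU share of the run time gets faster."""
    cpu_fraction = 1.0 - io_fraction
    new_time = io_fraction + cpu_fraction / cpu_speedup
    return 1.0 / new_time


for cpu_speedup in (10, 100):
    s = overall_speedup(0.10, cpu_speedup)
    print(f"{cpu_speedup}x CPU -> {s:.2f}x overall")
```

The exact values are about 5.3x and 9.2x, which the slide rounds to 5x and 10x.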

(4)

Devices: Magnetic Disks

[Figure: disk geometry, labeling sector, track, cylinder, head, and platter.]

• Purpose:

– Long-term, nonvolatile storage

– Large, inexpensive, slow level in the storage hierarchy

• Reading or writing data is a three-stage process:

– Seek time (~20 ms avg, 1M cycles at 50 MHz)

» move arm over track

– Rotational latency

» wait for the sector to rotate under head

» Average = 0.5 rev / 3600 RPM = 8.3 ms

– Transfer rate

» About a sector per ms (1-10 MB/s)

(5)

Disk Time Example

• Disk Parameters:

– 512-byte sector

– Advertised average seek time is 9 ms

– Transfer rate is 4 MB/sec

– Disk spins at 7200 RPM

– Controller overhead is 1 ms

– Assume that the disk is idle

• What is the average time to read/write a sector?

– Ave seek time + ave rot time + xfer time + controller overhead

– 9 ms + 0.5 rev / 7200 RPM + 0.5 KB / (4.0 MB/sec) + 1 ms = 9 + 4.17 + 0.125 + 1 = 14.3 ms
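The same calculation can be written as a small function, which makes it easy to try other disk parameters. This is a sketch only: the function names are hypothetical, and it assumes 1 MB = 10^6 bytes and an average rotational delay of half a revolution.

```python
def rotational_latency_ms(rpm):
    """Average rotational latency: half a revolution, in milliseconds."""
    return 0.5 * 60_000.0 / rpm


def avg_sector_access_ms(seek_ms, rpm, sector_bytes, mb_per_s, controller_ms):
    """Seek + rotational latency + transfer + controller overhead, in ms."""
    transfer_ms = sector_bytes / (mb_per_s * 1_000_000) * 1000.0
    return seek_ms + rotational_latency_ms(rpm) + transfer_ms + controller_ms


total = avg_sector_access_ms(seek_ms=9, rpm=7200, sector_bytes=512,
                             mb_per_s=4.0, controller_ms=1)
print(f"average sector access time: {total:.1f} ms")
```

Seek time and rotational latency dominate; the transfer itself is only about 0.1 ms of the total.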

(6)

Other Devices

• DRAM + Battery

– Big reduction in seek time and lower latency

– Cost is not attractive

• CD-ROMs

– Cheap and high density

– For archival storage due to their write once nature

• Magnetic Tapes

– Sequential access

– Backup to disks

• Automated tape library

– Robotic tape storage

(7)

Processor Interface Issues

• Interconnections

– Busses

• Processor interface

– Interrupts

– Memory mapped I/O

• I/O Control Structures

– Polling

– Interrupts

– DMA

– I/O Controllers/Processors

(8)

Bus-Based Interconnect

• Bus: a shared communication link between subsystems

– Low cost: a single set of wires is shared multiple ways

– Versatility: new devices are easy to add, and peripherals can even be moved between computers that use a common bus

• Disadvantage

– A communication bottleneck, possibly limiting the maximum I/O throughput

• Bus speed is limited by physical factors

– the bus length

– the number of devices (and, hence, bus loading).

– these physical limits prevent arbitrary bus speedup.

(9)

Bus-Based Interconnect

• Two generic types of busses:

– I/O busses: lengthy, many types of devices connected, a wide range of data bandwidths, and follow a bus standard (sometimes called a channel)

– CPU–memory busses: high speed, matched to the memory system to maximize memory–CPU bandwidth, single device (sometimes called a backplane)

– To lower costs, low-cost (older) systems combine the two into a single bus

• Bus transaction

– Sending address & receiving or sending data

(10)

Bus Protocols

[Figure: a bus master and a bus slave connected by control lines, address lines, and data lines.]

• Bus Master: has the ability to control the bus and initiates transactions; bus arbitration is needed if there are multiple bus masters

• Bus Slave: module activated by the transaction

• Bus Communication Protocol: specification of the sequence of events and timing requirements in transferring information

(11)

Synchronous Bus Protocols

[Timing diagram: Clock, Address, Data, Read, and Wait signals.]

Pipelined/Split transaction Bus Protocol

[Timing diagram labels: addr 1, addr 2, addr 3, begin read, read complete.]

(12)

Asynchronous Handshake

Write Transaction (4 Cycle Handshake)

[Timing diagram: Address, Data, Read, Req., and Ack. signals over times t0 to t5; the master asserts the address and the data, then moves on to the next address.]

• t0: Master has obtained control and asserts address, direction, data; it then waits a specified amount of time for slaves to decode the target

• t1: Master asserts request line
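The write handshake can be traced in software as a sequence of signal changes. The sketch below is illustrative only: the slides spell out just t0 and t1, so the remaining steps follow the conventional four-phase Req/Ack pattern and are marked as assumptions in the comments. The function name and the dictionary-based bus are also hypothetical.

```python
def four_cycle_write(address, data, memory):
    """Trace one write over a four-phase Req/Ack handshake (illustrative)."""
    bus = {"addr": None, "data": None, "req": 0, "ack": 0}
    trace = []

    # t0: master drives address and data, then gives slaves time to decode.
    bus["addr"], bus["data"] = address, data
    trace.append("master asserts address and data")

    # t1: master asserts the request line.
    bus["req"] = 1
    trace.append("master asserts Req")

    # Assumed: the selected slave latches the data and asserts Ack.
    memory[bus["addr"]] = bus["data"]
    bus["ack"] = 1
    trace.append("slave stores data and asserts Ack")

    # Assumed: master sees Ack, releases Req and stops driving the lines.
    bus["req"] = 0
    bus["addr"] = bus["data"] = None
    trace.append("master releases Req")

    # Assumed: slave sees Req drop and releases Ack; the bus is idle again.
    bus["ack"] = 0
    trace.append("slave releases Ack")
    return trace, bus


memory = {}
trace, bus = four_cycle_write(0x10, 0xAB, memory)
for step in trace:
    print(step)
```

The four Req/Ack edges (Req up, Ack up, Req down, Ack down) are what give the handshake its name; no shared clock is needed because each side waits for the other's signal.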

(13)

Read Transaction

[Timing diagram: Address, Data, Read, Req, and Ack signals over times t0 to t5; the master asserts the address, then moves on to the next address.]

• t0: Master has obtained control and asserts address and direction; it then waits a specified amount of time for slaves to decode the target

• t1: Master asserts request line

4 Cycle Handshake

(14)

Bus Arbitration

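[Figure: two arbitration schemes, where BR = Bus Request, BG = Bus Grant, M = master, and A.U. = arbitration unit. Parallel (centralized) arbitration: each master has its own BR and BG lines to the arbitration unit. Serial arbitration (daisy chaining): the grant passes from master to master through BGi and BGo, and the masters share BR.]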

• Parallel arbitration: use multiple request lines; a centralized arbiter chooses from among the requesters.
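The difference between the two schemes can be sketched in code. This is a simplified model under stated assumptions: fixed priority with master 0 highest (the slides do not specify a policy), and hypothetical function names.

```python
def centralized_grant(requests):
    """Parallel arbitration: the arbiter sees every request line at once.

    `requests` is a list of booleans; index 0 has the highest priority
    (an assumed policy). Returns the granted master, or None.
    """
    for master, wants_bus in enumerate(requests):
        if wants_bus:
            return master
    return None


def daisy_chain_grant(requests):
    """Serial arbitration: the grant ripples down the chain from BGi to BGo."""
    grant = any(requests)          # shared BR asserted, so BG is raised
    for master, wants_bus in enumerate(requests):
        if grant and wants_bus:
            return master          # this master keeps the grant
        # otherwise the master forwards BGi to BGo unchanged
    return None


print(centralized_grant([False, True, True]))
print(daisy_chain_grant([False, True, True]))
```

Both functions pick the same winner here. In hardware the difference is wiring and delay: the daisy chain needs one request and one grant line, but a master far down the chain waits for the grant to ripple through every master ahead of it.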

(15)

Bus Arbitration

• Distributed arbitration

– Self-selection: use multiple request lines; the devices requesting bus access determine who will be granted access

– Collision detection: multiple simultaneous requests result in a collision; the collision is detected and a scheme for selecting among the colliding parties is used

(16)

Bus Options

• Bus width

– High performance: separate address & data lines

– Low cost: multiplex address & data lines

• Data width

– High performance: wider is faster (e.g., 32 bits)

– Low cost: narrower is cheaper (e.g., 8 bits)

• Transfer size

– High performance: multiple words have less bus overhead

– Low cost: single-word transfer is simpler

• Bus masters

– High performance: multiple (requires arbitration)

– Low cost: single master (no arbitration)

• Split transaction?

– High performance: yes, separate request and reply packets get higher bandwidth (needs multiple masters)

– Low cost: no, a continuous connection is cheaper and has lower latency

• Clocking

– High performance: synchronous

– Low cost: asynchronous

(17)

Interfacing Storage Devices to the CPU

• Connect an I/O bus to memory? Or cache?

• How does the CPU address an I/O device?

– Memory-mapped I/O

– Dedicated I/O instructions

• I/O control structures

– Polling

– Interrupts

– DMA

– I/O Processors

(18)

Dedicated I/O Instructions

[Figure: independent I/O bus, where the CPU reaches memory over a memory bus and the peripheral interfaces over a separate I/O bus.]

Separate I/O instructions (in, out): the CPU sends a signal that this address is for I/O devices

[Figure: CPU on a common memory & I/O bus.]

(19)

Memory Mapped I/O

• Single memory & I/O bus

• No separate I/O instructions

[Figure: CPU with cache ($) on one bus shared by memory and the peripheral interfaces; the address range 0 to n covers both memory and I/O.]
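A toy model can show how memory-mapped I/O lets ordinary loads and stores reach device registers. The classes below are hypothetical and assume the simplest possible layout: addresses below `io_base` go to RAM, and addresses at or above it select a device register.

```python
class Register:
    """Hypothetical one-word device register."""

    def __init__(self):
        self.value = 0

    def read(self):
        return self.value

    def write(self, value):
        self.value = value


class Bus:
    """One address space: [0, io_base) is RAM, the rest is device registers."""

    def __init__(self, io_base):
        self.io_base = io_base
        self.ram = [0] * io_base
        self.devices = {}              # offset from io_base -> Register

    def attach(self, offset, device):
        self.devices[offset] = device

    def load(self, address):
        if address < self.io_base:
            return self.ram[address]
        return self.devices[address - self.io_base].read()

    def store(self, address, value):
        if address < self.io_base:
            self.ram[address] = value
        else:
            self.devices[address - self.io_base].write(value)


bus = Bus(io_base=256)
led = Register()
bus.attach(0, led)
bus.store(10, 3)       # plain memory write
bus.store(256, 7)      # same store operation, but it lands in the device
print(bus.load(10), led.value)
```

The CPU side uses the same `load` and `store` for both cases; only the address decides whether memory or a device responds.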

(20)

Polling

[Figure: CPU, IOC, device, and memory, with a polling flowchart: "Is the data ready?" (no: ask again; yes: read data, store data), then "done?" (no: repeat).]

• A busy-wait loop is not an efficient way to use the CPU unless the device is very fast!

• But checks for I/O completion can be dispersed among computationally intensive code
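The polling flowchart maps directly onto two nested loops. The sketch below uses a made-up `Device` that reports ready on every `period`-th status check, so the cost of busy waiting can be counted.

```python
class Device:
    """Hypothetical device: ready on every `period`-th status check."""

    def __init__(self, data, period):
        self.data = list(data)
        self.period = period
        self.checks = 0

    def ready(self):
        self.checks += 1
        return self.checks % self.period == 0

    def read(self):
        return self.data.pop(0)


def poll_transfer(device, count):
    """Copy `count` words from the device to memory by polling."""
    memory = []
    wasted_checks = 0
    while len(memory) < count:            # done?
        while not device.ready():         # is the data ready?
            wasted_checks += 1            # busy wait: CPU does no useful work
        memory.append(device.read())      # read data, store data
    return memory, wasted_checks


memory, wasted = poll_transfer(Device([1, 2, 3], period=4), count=3)
print(memory, wasted)
```

With a slow device (large `period`) almost every check is wasted, which is the inefficiency the slide points out; with `period=1` no checks are wasted.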

(21)

Interrupt Driven Data Transfer

[Figure: CPU, IOC, device, and memory. The user program (add, sub, and, or, nop) is interrupted: (1) I/O interrupt, (2) save PC, (3) interrupt service addr, (4) interrupt service routine (read, store, ..., rti) in memory.]

• User program progress only halted during actual transfer

• Each xfer: 1000 bytes; 2 µsec per interrupt

(22)

Direct Memory Access

[Figure: CPU, IOC, device, memory, and DMAC, with an address map starting at 0 that covers memory and the memory-mapped I/O of the DMAC and peripherals.]

• CPU sends a starting address, direction, and length count to DMAC, then issues "start"

• DMAC provides handshake signals for Peripheral

• Time to do 1000 xfers (0.1 second):

– 1 DMA set-up sequence @ 50 µsec

– 1 interrupt @ 2 µsec

– 1 interrupt service sequence @ 48 µsec

– 0.1 ms for interrupt overhead (1/1000 of transfer time)
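The DMA overhead figure is a short sum, shown here as a sketch. The function name is hypothetical; the timings are the ones quoted on the slide.

```python
def dma_cpu_overhead_us(setup_us=50, interrupt_us=2, service_us=48):
    """CPU time spent on one DMA block: set-up, one interrupt, one service."""
    return setup_us + interrupt_us + service_us


transfer_time_s = 0.1                           # time for the 1000 transfers
overhead_s = dma_cpu_overhead_us() / 1_000_000  # microseconds to seconds
fraction = overhead_s / transfer_time_s
print(f"CPU overhead: {overhead_s * 1000:.1f} ms, "
      f"{fraction:.4f} of the transfer time")
```

The CPU is involved once per block instead of once per transfer, which is why the overhead stays at 0.1 ms regardless of how many words the DMAC moves.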

(23)

Input/Output Processors

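[Figure: the CPU, the IOP, and Mem share the main memory bus; devices D1, D2, ..., Dn attach to the IOP over the I/O bus. Steps are numbered (1) to (4) in the figure.]

• CPU issues instruction to IOP

– Instruction fields: OP, Device, Address (the target device and where the commands are)

• IOP looks in memory for commands

– Command fields: OP, Addr, Cnt, Other

• IOP interrupts the CPU when done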

(24)

Next time: I/O Performance & RAID