Storage Systems
Prof. Tei-Wei Kuo
ktw@csie.ntu.edu.tw
Embedded Systems and Wireless Networking Laboratory
Department of Computer Science and Information Engineering, National Taiwan University
Reading:
Kam-yiu Lam and Tei-Wei Kuo, “Real-Time Database Systems: Architecture and Techniques”, Kluwer Academic Publishers, 2000
C.M. Krishna and K.G. Shin, “Real-Time Systems,” McGraw-Hill, 1997.
Storage Systems
Real-Time Disk Scheduling
Flash-Memory Storage Systems
Copyright: All rights reserved, Prof. Tei-Wei Kuo, Embedded System and Wireless Networking Lab, National Taiwan University.
2010/12/14 Embedded Systems and Wireless Networking Lab. 3
Real-Time Disk Scheduling
Motivation: Disparity between CPU and disk speed.
access time = queuing time + seek time + latency delay + transfer time
Strategies to improve the performance of disk service:
First-come-first-served (FCFS) algorithm *:
Poor because it considers neither deadlines nor arm movement.
Earliest-deadline-first (EDF) algorithm:
Not optimal in minimizing the number of transaction deadlines missed, because it ignores arm movement.
Example (requests on tracks 1, 4, 380, and 390):
Request   Deadline   Track
1         20         390
2         40         1
3         60         380
4         80         4
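The cost of EDF on this example can be checked with a short sketch. It compares the total arm movement of the EDF service order with a single sweep over the same tracks. The starting head position (track 0) and the helper name `total_seek` are assumptions made for illustration; rotation and transfer times are ignored.

```python
# Requests from the example: (request id, deadline, track).
requests = [(1, 20, 390), (2, 40, 1), (3, 60, 380), (4, 80, 4)]

def total_seek(tracks, start=0):
    """Sum of arm movements (in tracks) when serving `tracks` in order."""
    distance, head = 0, start
    for track in tracks:
        distance += abs(track - head)
        head = track
    return distance

# EDF serves requests in increasing order of deadline.
edf_order = [track for _, _, track in sorted(requests, key=lambda r: r[1])]
# A single sweep serves them in increasing order of track number.
sweep_order = sorted(track for _, _, track in requests)

edf_distance = total_seek(edf_order)      # 390 + 389 + 379 + 376
sweep_distance = total_seek(sweep_order)  # 1 + 3 + 376 + 10
print(edf_order, edf_distance)
print(sweep_order, sweep_distance)
```

Under these assumptions the EDF order moves the arm almost four times as far as the sweep, which is time that later requests may not have before their deadlines.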
Real-Time Disk Scheduling
Scan (or elevator) algorithm *:
The head starts at one end of the disk and moves toward the other end, servicing requests as it reaches each track, until it gets to the other end of the disk.
At the other end, the direction of head movement is reversed and servicing continues.
Bad for requests at either end of the disk, which may wait for the head to travel to the far end and back.
C-Scan (Circular Scan) algorithm *:
Goal: Provide a more uniform wait time.
Like Scan, the head moves from one end of the disk to the other, servicing requests as it goes. However, when the head reaches the other end, it immediately returns to the beginning of the disk.
[Figure: head movement over tracks 5, 30, 70, 150, 225, 295, and 399, with queue = 295, 30, 150 followed by queue = 5, 70, 225.]
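A minimal sketch of the service order produced by Scan and C-Scan for a fixed queue. The initial head position (track 100) and the function names are assumptions; the sketch only orders the requests and does not model the extra travel to the physical ends of the disk.

```python
def scan_order(queue, head, direction=1):
    """Scan: serve requests ahead of the head, then reverse and serve the rest."""
    ahead = sorted(t for t in queue if (t - head) * direction >= 0)
    behind = sorted(t for t in queue if (t - head) * direction < 0)
    if direction > 0:
        return ahead + behind[::-1]   # sweep up, then come back down
    return ahead[::-1] + behind       # sweep down, then come back up

def cscan_order(queue, head):
    """C-Scan: serve in one direction only; after the last track, restart from the beginning."""
    ahead = sorted(t for t in queue if t >= head)
    wrapped = sorted(t for t in queue if t < head)
    return ahead + wrapped

queue = [295, 30, 150, 5, 70, 225]    # the tracks queued in the figure
up = scan_order(queue, head=100)
circular = cscan_order(queue, head=100)
print(up)
print(circular)
```

With Scan, track 5 is served last after the head reverses; with C-Scan it is served first after the return, which is what evens out the waiting time.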
Real-Time Disk Scheduling
Shortest-seek-time-first (SSTF) algorithm *:
A greedy algorithm that always selects the request with the minimum seek time from the current request queue.
Some requests may starve.
A variation of SCAN:
Classify requests into classes.
Service requests in the same class in SCAN order.
Service classes in order of their priorities.
Q: How many priority levels are enough, and how should requests be partitioned among them?
* An asterisk marks an algorithm that does not consider deadlines.
[Figure: head movement over tracks 5, 30, 70, 150, 225, 305, and 399, with queue = 305, 30, 5, 225 followed by queue = 305, 150.]
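A sketch of the greedy SSTF choice on the first queue of the figure. The initial head position (track 70) is an assumption, and seek time is approximated by the distance in tracks.

```python
def sstf_order(queue, head):
    """Repeatedly serve the pending request closest to the current head position."""
    pending = list(queue)
    order = []
    while pending:
        nearest = min(pending, key=lambda track: abs(track - head))
        pending.remove(nearest)
        order.append(nearest)
        head = nearest
    return order

order = sstf_order([305, 30, 5, 225], head=70)
print(order)
```

Track 305 is served last here. If nearer requests keep arriving (such as 150 in the second queue of the figure), a far request can keep being postponed, which is the starvation problem noted above.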
Real-Time Disk Scheduling
A weighted scheduling algorithm:
Sort requests in the waiting queue in increasing order of their deadlines.
Each request is assigned a weight $w_i$ depending on its position in the queue.
Let $d_i$ be the distance the arm has to move from its current position to serve request $i$.
Consider only $q$ requests at a time to reduce the algorithm complexity.
Service the request with the highest priority $p_i = 1/(w_i d_i)$.
Q: How to assign the weights $w_i$ to requests?
A variation of the weighted scheduling algorithm:
Motivation: Consider the deadline itself instead of the deadline order!
Service the request with the highest priority $p_i = f(d_i, l_i) = \alpha d_i + (1-\alpha) l_i$, where $l_i$ is the deadline term of request $i$. $\alpha$ is a design factor, and choosing $\alpha$ in the range 0.7 to 0.8 looks good.
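The two selection rules can be sketched as follows. Several points are assumptions, not taken from the slide: the weight is simply the position in the deadline-sorted window (1, 2, 3, ...), the deadline term is the time remaining until the deadline, and for the second rule a smaller value of the weighted sum is treated as a higher priority. The function names are hypothetical.

```python
def pick_by_order(requests, head, q=3):
    """Weighted rule: among the q earliest deadlines, maximize 1 / (w_i * d_i)."""
    window = sorted(requests, key=lambda r: r[1])[:q]
    best, best_cost = None, None
    for rank, (name, _deadline, track) in enumerate(window, start=1):
        w = rank                      # assumed weight: position in deadline order
        cost = w * abs(track - head)  # highest 1/(w*d) is the lowest w*d
        if best_cost is None or cost < best_cost:
            best, best_cost = name, cost
    return best

def pick_by_value(requests, head, now, alpha=0.8):
    """Variation: combine seek distance and time to deadline in one value."""
    def value(request):
        _name, deadline, track = request
        return alpha * abs(track - head) + (1 - alpha) * (deadline - now)
    return min(requests, key=value)[0]   # assumption: smaller value served first

# (request id, deadline, track), reusing the EDF example.
requests = [(1, 20, 390), (2, 40, 1), (3, 60, 380), (4, 80, 4)]
first_by_order = pick_by_order(requests, head=0)
first_by_value = pick_by_value(requests, head=0, now=0)
print(first_by_order, first_by_value)
```

Both rules pick request 2 first in this example: its deadline is only second earliest, but it is by far the closest to the head.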
Reading:
A. Silberschatz and P.B. Galvin, “Operating System Concepts,” 4th Ed., Addison-Wesley Publishing Company, 1994.
C.M. Krishna and K.G. Shin, “Real-Time Systems,” McGraw-Hill, 1997.
S. Chen, J.A. Stankovic, J.F. Kurose, and D.F. Towsley, “Performance Evaluation of Two New Disk Scheduling Algorithms for Real-Time Systems,” J. of Real-Time Systems, 3(3):307-336, 1991.
Real-Time Disk Scheduling
Another paper for discussion:
A.L.N. Reddy and J.C. Wyllie, “I/O Issues in a Multimedia System,” IEEE Transactions on Computers, March 1994.
Flash-Memory Storage Systems
Prof. Tei-Wei Kuo
ktw@csie.ntu.edu.tw
Embedded Systems and Wireless Networking Laboratory
Department of Computer Science and Information Engineering, National Taiwan University
Agenda
Introduction
Management Issues
Performance vs Overheads
Other Challenging Issues
Conclusion
Introduction – Why Flash Memory
Diversified Application Domains
Portable Storage Devices
Consumer Electronics
Industrial Applications
SoC and Hybrid Devices
Critical System Components
Trends in VLSI Technology
Source: www.icknowledge.com
* This slide was from the ASP-DAC’06 talk delivered by Prof. Sang L. Min of Seoul National University.
Trends in Flash Memory Development
Sources: IBM Journal of Research and Development, Nov 2008; IDC, Jun 2007
• Significant drop in the price per GB: more than 40% per year
• Exponential increase in density and capacity
Trends in Flash Memory Development
Market size: more than 15 billion US dollars in 2008
Revenue growth: more than 10% in 2008
Source: iSuppli, Apr 2008
IC Technology, NAND Flash Cost Trend, 2008
Introduction – Trends in Storage Technology
Source: “Using multilevel cell NAND flash technology in consumer applications,” Electronic Engineering Times, July 2005
Prices as of September 2006:
Samsung 2GB USB 2.0 Flash Drive: $49.99, less $25.00 rebate, final price $24.99*
T-One 2GB Microdrive (3600 RPM): $144.99
Introduction – Trends in Storage Technology
Source: “Using multilevel cell NAND flash technology in consumer applications,” Electronic Engineering Times, July 2005; Amazon.com
Prices as of March 2007:
Transcend 8GB CompactFlash Card: $84.85
Microdrive 4GB CompactFlash Type II: $116
SanDisk 4GB CompactFlash Card: $55.99
Trends – Storage Media
Source: “Using multilevel cell NAND flash technology in consumer applications,” Electronic Engineering Times, July 2005; Component Times, Nov 2007.
* Parts of this slide were from the ASP-DAC’06 talk delivered by Prof. Sang Lyul Min of Seoul National University.
Prices as of March 2008:
Transcend 32GB SSD, 2.5-inch, SATA, MLC: USD 343.30
Transcend 8GB SDHC SD card: USD 30.30
Hitachi 6GB Microdrive MD6GBBP: USD 169.95
Seagate FreeAgent Desktop 250GB USB external hard drive (ST302504FDA1E1-RK): USD 79.99
Trends – Storage Media
The first commercial flash chip was presented by Toshiba in 1985 as a 256Kbit chip, whereas 1Mb DRAM was introduced in 1986.
20Gb IBM HDD
Prices as of October 2008:
Transcend 32GB SSD, 2.5-inch, SATA, MLC: USD 92.24
Transcend 8GB SDHC SD card: USD 19.95
Hitachi 6GB Microdrive MD6GBBP: USD 149.99
Seagate FreeAgent 500GB 3.5-inch USB 2.0 hard drive (ST305004FDA1E1-RK): USD 85.99
Trends – Storage Media
Prices as of May 2009:
Transcend 2.5" SSD 64GB SATA-II: USD 144.12
Transcend 16GB SDHC SD card: USD 39.50
Hitachi 4GB Microdrive CF+ Type II travel kit (MD4GB/A HMS360604D5CF00): USD 25.00
Seagate FreeAgent Desk 1TB USB 2.0 hard drive (ST310005FDA2E1-RK): USD 119.99
Trends – Storage Media
Prices as of November 2009:
Transcend 128GB SATA-2 2.5-inch solid state drive: USD 283.99
Transcend 16GB SDHC SD card: USD 36.49
Hitachi 4GB Microdrive CF: USD 19.00
Seagate FreeAgent Desk 1.5TB USB 2.0 hard drive (ST315005FDA2E1-RK): USD 145.77
Trends – Storage Media
Prices as of November 2010:
Kingston SSDNow V Series 128GB SATA 2.5-inch solid state drive: USD 224.35
Transcend 32GB SDHC SD card: USD 50.99
Western Digital WD Elements 2TB USB 2.0 desktop external hard drive (WDBAAU0020HBK-NESN): USD 109.99
The Ultimate Limit on Mechanical Devices
Fly By Night
Boeing 747
2,000,000 Miles Per Hour
1/100” Flying Height
Source: Richard Lary, The New Storage Landscape: Forces shaping the storage economy, 2003.
Source: http://www.hitachigst.com/
* This slide was from the ASP-DAC’06 talk delivered by Prof. Sang L. Min from the Seoul National University.
A Microdrive Example
Flash Makers – NAND Flash Memory
Source: iSuppli Corp (Unit: Million Dollars), http://hugoleijtens.spaces.live.com/blog/cns!4B94B7453D4BFD9E!988.entry
For years, NAND prices have dropped by an average of 40 percent or more per year. [EE Times, 11/30/2007]
Flash Stories – Solid-State Disks
Less than 50% of the heat, ultra silence, light weight, energy efficiency
MTTF – 200M hours (HDD – 30M hours)
Source: IDC, SanDisk - Component Times, Nov 2007
Flash Stories – Flash Wars
Fab
300mm Wafer Fab by Toshiba and SanDisk at Yokkaichi, Japan
80,000 Wafers per Month (2008) 210,000 Wafers per Month
300mm Wafer Fab by IM Flash Technologies (Intel & Micron) in Utah
Joint Ventures: Sony and Qimonda; Hynix and SanDisk
Technology
32nm (Samsung, Intel), 43nm (Toshiba), 50nm (IM)
Introduction – The Characteristics of Storage Media
Reference Devices/Modules: DRAM: DDR-400. NOR FLASH: Silicon Storage Technology SST39LF010. NAND FLASH: Samsung K9K8G08U0M. Disk: Seagate ST31000333AS
Access time:
Media        Read                        Write                       Erase
DRAM         5ns (1B), 2.56us (512B)     5ns (1B), 2.56us (512B)     -
NOR FLASH    45ns (1B), 23us (512B)      14us (1B), 7.2ms (512B)     18ms (128KB)
NAND FLASH   20us (1B), 32.8us (512B)    200us (1B), 212us (512B)    1.5ms (128KB)
DISK         12.7ms (512B) (average)     14.2ms (512B) (average)     -
For 512B accesses, NAND flash is about 13X (read) and 83X (write) slower than DRAM, and the disk is about 390X (read) and 67X (write) slower than NAND flash.
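The 13X, 83X, 390X, and 67X annotations appear to be ratios of the 512B access times in the table. The sketch below recomputes them under that reading; the dictionary layout and the helper name `slowdown` are illustrative only.

```python
# Access times for a 512B transfer, in microseconds (values from the table).
read_us = {"DRAM": 2.56, "NAND": 32.8, "DISK": 12_700.0}
write_us = {"DRAM": 2.56, "NAND": 212.0, "DISK": 14_200.0}

def slowdown(times, slower, faster):
    """How many times slower one medium is than another."""
    return times[slower] / times[faster]

nand_vs_dram_read = slowdown(read_us, "NAND", "DRAM")     # about 13X
nand_vs_dram_write = slowdown(write_us, "NAND", "DRAM")   # about 83X
disk_vs_nand_read = slowdown(read_us, "DISK", "NAND")     # close to 390X
disk_vs_nand_write = slowdown(write_us, "DISK", "NAND")   # about 67X

for label, ratio in [
    ("NAND vs DRAM, read", nand_vs_dram_read),
    ("NAND vs DRAM, write", nand_vs_dram_write),
    ("Disk vs NAND, read", disk_vs_nand_read),
    ("Disk vs NAND, write", disk_vs_nand_write),
]:
    print(f"{label}: {ratio:.1f}X")
```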
Non-Volatile Memory
Macronix International Co., Component Times, Nov. 2007
Introduction – Single-Level Cell (SLC)
[Figure: NOR flash cell array, with labels for the selected cell, the current I_DS, the control gate, the drain, the source, and a cell.]
Each Word Line is connected to control gates.
Each Bit Line is connected to the drain.
Introduction – Single-Level Cell (SLC)
NAND accesses each cell through adjacent cells, while NOR allows for individual access to each cell. *
The cell size of NAND is almost half the size of a NOR cell.
* "Flash Memory Moves from Niche to Mainstream," Chip Design Magazine, April/May 2006.
[Figure: cell arrays of NOR flash and NAND flash.]
Introduction – Multi-Level Cell (MLC) vs SLC
[Figure: number of cells versus threshold voltage (V_T) for each logic state. (a) Bilevel (1 bit/cell): states 1 and 0; example: 64 MB SLC flash. (b) Multilevel (2 bits/cell): states 11, 10, 01, and 00; example: 128 MB MLC flash.]
Comparison of SLC and MLC
1-bit/Cell SLC NAND Flash
100,000 Program/Erase cycles (with ECC) [1]
10 years Data Retention [1]
2-bits/Cell MLC NAND Flash
10,000 Program/Erase cycles (with ECC) [2]
10 years Data Retention [2]
4-bits/Cell MLC NAND Flash Developers (2006)
M-Systems, Intel, Samsung, and Toshiba
[1] STMicroelectronics NAND SLC large page datasheet (NAND08GW3B2A)
[2] STMicroelectronics NAND MLC large page datasheet (NAND04GW3C2A)
* USD34.65 per GB for NOR, USD6.79 per GB for NAND (March 2007)
Introduction – Consumer Applications
Electronic Engineering Times, July 2005
Bandwidth Requirements – Video
Electronic Engineering Times, July 2005
Bandwidth Requirements – Audio
Electronic Engineering Times, July 2005
Introduction – Challenges in Flash-Memory Storage Designs
Requirements for Good Performance
Limited Cost per Unit
Strong Demands for Reliability
Increasing Access Frequencies
Tight Coupling with Other Components
Low Compatibility among Vendors
Agenda
Introduction
Management Issues
Performance vs Overheads
Other Challenging Issues
Conclusion
Management Issues – System Architectures
[Figure: layered system architecture, from top to bottom]
Applications (AP)
File-System Layer
Block Device Layer (FTL emulation)
MTD drivers
Flash Memory
Management Issues – Flash-Memory Characteristics
[Figure: flash memory divided into Block 0, Block 1, Block 2, Block 3, ...]
1 Page = 512B
1 Block = 32 pages (16KB)
Write one page at a time.
Erase one block at a time.
Management Issues – Flash-Memory Characteristics
Write-Once
A page cannot be written again unless the block it resides in is erased first!
Pages are classified into valid, invalid, and free pages.
Bulk-Erasing
Pages are erased in units of a block to recycle used but invalid pages.
Wear-Leveling
Each block can endure only a limited number of erase operations.
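The three characteristics can be captured in a toy model of one block. This is only a sketch: the class and method names are hypothetical, and the block size is the 32 pages per block given on the earlier slide.

```python
PAGES_PER_BLOCK = 32   # from the earlier slide: 1 block = 32 pages
FREE, VALID, INVALID = "free", "valid", "invalid"

class Block:
    def __init__(self):
        self.pages = [FREE] * PAGES_PER_BLOCK
        self.data = [None] * PAGES_PER_BLOCK
        self.erase_count = 0          # wear-leveling needs this counter

    def write(self, page, value):
        # Write-once: only a free page can be programmed.
        if self.pages[page] != FREE:
            raise ValueError("write-once: erase the block before rewriting a page")
        self.pages[page] = VALID
        self.data[page] = value

    def invalidate(self, page):
        # Called when a newer copy of the data has been written elsewhere.
        if self.pages[page] == VALID:
            self.pages[page] = INVALID

    def erase(self):
        # Bulk-erasing: all pages of the block become free at once.
        self.pages = [FREE] * PAGES_PER_BLOCK
        self.data = [None] * PAGES_PER_BLOCK
        self.erase_count += 1

block = Block()
block.write(0, "A")
block.invalidate(0)      # page 0 is now used but invalid
block.erase()            # the only way to make page 0 writable again
block.write(0, "A'")
print(block.pages[0], block.erase_count)
```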
Management Issues – Flash-Memory Characteristics
Example 1: Out-place Update
[Figure, before the update: live pages A, B, C, D, followed by free pages.]
Suppose that we want to update data A and B…
[Figure, after the update: the old copies of A and B become dead pages, C and D stay live, and the new A and B are written to free pages.]
Management Issues – Flash-Memory Characteristics
Example 2: Garbage Collection
L = a live page, D = a dead page, F = a free page
L D D L D D L D   (this block is to be recycled: 3 live pages and 5 dead pages)
L L D L L L F D
L F L L L L D F
F L L F L L F D
Management Issues – Flash-Memory Characteristics
Example 2: Garbage Collection
Live data are copied somewhere else.
[Figure: the three live pages of the block being recycled are copied to free pages in other blocks, which leaves all eight pages of the recycled block dead.]
Management Issues – Flash-Memory Characteristics
Example 2: Garbage Collection
The block is then erased.
Overheads:
• live data copying
• block erasing
[Figure: after the erase, all eight pages of the recycled block are free.]
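The garbage collection example can be replayed in a few lines. The page states are those of the four blocks in the example; the greedy victim choice (most dead pages) and the rule for where live pages are copied (first free page found in another block) are assumptions made for this sketch.

```python
# L = live page, D = dead page, F = free page (one list per block).
blocks = [
    list("LDDLDDLD"),
    list("LLDLLLFD"),
    list("LFLLLLDF"),
    list("FLLFLLFD"),
]

def pick_victim(blocks):
    """Greedy choice (an assumption): the block with the most dead pages."""
    return max(range(len(blocks)), key=lambda i: blocks[i].count("D"))

def recycle(blocks, victim):
    """Copy live pages out of the victim, then erase it. Returns (copies, erases)."""
    copies = 0
    for page, state in enumerate(blocks[victim]):
        if state != "L":
            continue
        # Find a free page in some other block for the live data.
        for i, other in enumerate(blocks):
            if i != victim and "F" in other:
                other[other.index("F")] = "L"
                break
        else:
            raise RuntimeError("no free page available for live data")
        blocks[victim][page] = "D"    # the old copy is now dead
        copies += 1
    blocks[victim] = ["F"] * len(blocks[victim])   # block erase
    return copies, 1

victim = pick_victim(blocks)
copies, erases = recycle(blocks, victim)
print(victim, copies, erases)
print(["".join(b) for b in blocks])
```

The overhead of reclaiming five dead pages is three page copies plus one block erase, and the copies themselves consume three free pages elsewhere.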
Management Issues – Flash-Memory Characteristics
Example 3: Wear-Leveling
L = a live page, D = a dead page, F = a free page
Block A (erase cycle count 100): L D D L D D L D
Block B (erase cycle count 10): L L D L L L F D
Block C (erase cycle count 20): L F L L L L D F
Block D (erase cycle count 15): F L L F L L F D
Wear-leveling might interfere with the decisions of the block-recycling policy.
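The conflict can be made concrete with two victim-selection rules. The sketch assumes that the erase cycle counts 100, 10, 20, and 15 belong to blocks A, B, C, and D in that order. The wear-aware rule and its `max_gap` threshold are hypothetical and only illustrate one way a wear-leveler could veto the recycler.

```python
blocks = {
    "A": {"pages": "LDDLDDLD", "erase_count": 100},
    "B": {"pages": "LLDLLLFD", "erase_count": 10},
    "C": {"pages": "LFLLLLDF", "erase_count": 20},
    "D": {"pages": "FLLFLLFD", "erase_count": 15},
}

def dead_pages(name):
    return blocks[name]["pages"].count("D")

def greedy_victim():
    """Recycling policy alone: reclaim the block with the most dead pages."""
    return max(blocks, key=dead_pages)

def wear_aware_victim(max_gap=50):
    """Hypothetical rule: skip blocks erased far more often than the least-worn block."""
    least_worn = min(b["erase_count"] for b in blocks.values())
    eligible = [n for n, b in blocks.items() if b["erase_count"] - least_worn <= max_gap]
    return max(eligible, key=dead_pages)

print(greedy_victim(), wear_aware_victim())
```

The recycler alone would pick block A, which yields the most space but is already the most worn; the wear-aware rule settles for block B, reclaiming fewer pages at a higher copying cost.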
Management Issues – Challenges
The write throughput drops significantly after garbage collection starts!
The capacity of flash-memory storage systems increases so quickly that the memory space required to manage them also grows quickly.
Reliability becomes more and more critical as the manufacturing capacity increases!
The significant increase in flash-memory access rates seriously exacerbates the Read/Program Disturb Problems!
Agenda
Introduction
Management Issues
Performance vs Overheads – FTL vs NFTL
Other Challenging Issues
Conclusion
System Architecture
[Figure: layered architecture of a flash-memory storage system, from top to bottom]
Applications (processes): issue fwrite(file, data)
File Systems (FAT, EXT2, NTFS...): issue block write (LBA, size)
Flash-Memory Storage System: FTL/NFTL Layer (Address Translation, Garbage Collection) issues Flash I/O Requests to the Device Driver
Physical Devices (Flash Memory Banks): driven by Control signals
Management Issues – Flash-Memory Characteristics
[Figure: two placements of the FTL and MTD layers]
xD, SmartMedia: the host runs the Applications, Operating System, File System (e.g. DOS FAT), FTL, and MTD; the device contains only the Flash Media.
SD, Memory Stick, Compact Flash: the host runs the Applications, Operating System, and File System (e.g. DOS FAT); the device contains the FTL, MTD, and Flash Media.
*FTL: Flash Translation Layer, MTD: Memory Technology Device
Policies – FTL
FTL adopts a page-level address translation mechanism.
The main problem of FTL is the large memory space required to store the address translation information.
[Figure: an Address Translation Table (in main memory), indexed by Logical Block Address (array index 0, 1, 2, ..., 11), whose entries are Physical Block Addresses of the form (block, page), e.g., (0,3), (0,1), (0,6), (0,4), (4,7), (1,0), (2,1), (1,2), (1,3). Flash memory pages (0,0), (0,1), ..., (1,3) each hold user data plus spare data (LBA=3; ECC=...; Status=...). An access to LBA = 3 is translated through the table.]
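A minimal sketch of page-level translation with out-place updates. The class name, the 8 pages per block, and the sequential free-page allocator are assumptions made for illustration; garbage collection is left out.

```python
PAGES_PER_BLOCK = 8   # assumed small block size for the example

class PageLevelFTL:
    def __init__(self, num_blocks):
        self.table = {}   # LBA -> (block, page): one entry per logical page
        self.flash = {}   # (block, page) -> user data plus spare-area fields
        self.free = [(b, p) for b in range(num_blocks) for p in range(PAGES_PER_BLOCK)]

    def write(self, lba, data):
        if not self.free:
            raise RuntimeError("no free page: garbage collection needed")
        old = self.table.get(lba)
        if old is not None:
            self.flash[old]["valid"] = False   # out-place update: old copy becomes dead
        where = self.free.pop(0)
        # The spare area records which LBA the page belongs to.
        self.flash[where] = {"data": data, "lba": lba, "valid": True}
        self.table[lba] = where
        return where

    def read(self, lba):
        return self.flash[self.table[lba]]["data"]   # a single table lookup

ftl = PageLevelFTL(num_blocks=2)
first = ftl.write(3, "v1")
second = ftl.write(3, "v2")      # same LBA, new physical page
print(first, second, ftl.read(3))
```

Translation is a single lookup, but the table needs one entry per page, which is the memory cost noted above.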
Policies – NFTL (Type 1)
A logical address under NFTL is divided into a virtual block address and a block offset.
e.g., LBA=1011 => virtual block address (VBA) = 1011 / 8 = 126 and block offset = 1011 % 8 = 3
[Figure: the Address Translation Table (in main memory) maps VBA=126 to a chain of blocks with Chain Block Addresses 9, 23, and 50. To write data to LBA=1011, NFTL writes to the page with block offset=3 in the first chain block; if the page has been used, it writes to the page with block offset=3 in the next block of the chain.]
Policies – NFTL (Type 2)
A logical address under NFTL is divided into a virtual block address and a block offset.
e.g., LBA=1011 => virtual block address (VBA) = 1011 / 8 = 126 and block offset = 1011 % 8 = 3
[Figure: the Address Translation Table (in main memory) maps VBA=126 to the entry (9,23): a Primary Block at address 9 and a Replacement Block at address 23. To write data to LBA=1011, NFTL writes to the page with block offset=3 in the primary block; if the page has been used, it writes to the first free page of the replacement block.]
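The Type 2 write path can be sketched as follows. The class name and the block allocator are hypothetical, and the merge that is needed when the replacement block fills up is only signalled, not implemented.

```python
PAGES_PER_BLOCK = 8

def split(lba):
    """Divide a logical address into (virtual block address, block offset)."""
    return lba // PAGES_PER_BLOCK, lba % PAGES_PER_BLOCK

class NFTL:
    def __init__(self):
        self.table = {}       # VBA -> [primary block, replacement block]
        self.next_block = 0   # hypothetical allocator for physical blocks

    def _new_block(self):
        block = {"id": self.next_block, "pages": [None] * PAGES_PER_BLOCK}
        self.next_block += 1
        return block

    def write(self, lba, data):
        vba, offset = split(lba)
        if vba not in self.table:
            self.table[vba] = [self._new_block(), None]
        entry = self.table[vba]
        primary = entry[0]
        if primary["pages"][offset] is None:
            primary["pages"][offset] = (lba, data)   # first write goes to its own offset
            return "primary", offset
        if entry[1] is None:
            entry[1] = self._new_block()
        replacement = entry[1]
        for page, content in enumerate(replacement["pages"]):
            if content is None:                      # first free page, in order
                replacement["pages"][page] = (lba, data)
                return "replacement", page
        raise RuntimeError("replacement block full: merge (garbage collection) needed")

nftl = NFTL()
print(split(1011))
print(nftl.write(1011, "v1"), nftl.write(1011, "v2"), nftl.write(1011, "v3"))
```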
Policies – NFTL
NFTL is proposed for large-scale NAND flash storage systems because it adopts block-level address translation, which keeps the address translation table small.
However, the address translation performance of read and write requests might deteriorate, due to linear searches of address translation information in primary and replacement blocks.
Policies – BL
BL adopts a block-level address translation mechanism.
The main problems of BL are low space utilization and significant write overhead.
e.g., LBA=1011 => virtual block address (VBA) = 1011 / 8 = 126 and block offset = 1011 % 8 = 3
[Figure: the Address Translation Table (in main memory) maps VBA=126 to physical block 9. To write data to LBA=1011, BL writes to the page with block offset=3 in physical block 9. If that page has been used, the new data are written to a free block (physical block 10), the other pages in use are copied from block 9 to block 10, and the table entry is changed from 9 to 10.]
Policies – FTL or NFTL
                              FTL      NFTL
Memory Space Requirements     Large    Small
Address Translation Time      Short    Long
Garbage Collection Overhead   Less     More
Space Utilization             High     Low
The Memory Space Requirements for one 1GB NAND (512B/Page, 4B/Table Entry, 32 Pages/Block):
FTL: 8,192KB (= 4*(1024*1024*1024)/512)
NFTL: 256KB (= 4*(1024*1024*1024)/(512*32))
Remark: Each page of small-block (large-block) SLC NAND can store 512B (2KB) of data, and there are 32 (64) pages per block.
Each page of MLCx2 NAND can store 2KB, and there are 128 pages per block.
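The two table sizes follow directly from the number of mapping entries. The helper name `table_bytes` is illustrative.

```python
def table_bytes(capacity_bytes, bytes_per_mapped_unit, entry_bytes=4):
    """Size of a translation table with one entry per mapped unit."""
    return entry_bytes * (capacity_bytes // bytes_per_mapped_unit)

GB = 1024 ** 3
PAGE = 512              # bytes per page
PAGES_PER_BLOCK = 32

ftl_kb = table_bytes(GB, PAGE) // 1024                     # one entry per page
nftl_kb = table_bytes(GB, PAGE * PAGES_PER_BLOCK) // 1024  # one entry per block
print(ftl_kb, nftl_kb)
```

The 32-fold difference is exactly the number of pages per block.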
Address Translation Time - NFTL
The address translation performance of read and write requests can deteriorate, due to linear searches of physical addresses.
[Figure: a Primary Block holding A, B, C, D, E followed by three free pages, and a Replacement Block holding A, B, A, B, A, A, B, B. Arrows mark the most-recent content of the write requests of LBA A=100, LBA B=101, and LBA F=105.]
1. Assume that each block contains 8 pages.
2. Let LBA A, B, C, D, and E be written 5, 5, 1, 1, and 1 times, respectively. Their data distribution could look like the figure.
3. For example, it might need to scan 9 spare areas for LBA B.
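The count of 9 spare areas can be reproduced with a forward scan. The sketch assumes that a lookup reads the spare area of the primary page and then every used page of the replacement block, because a later page may hold newer data. The names are illustrative.

```python
# Page contents from the figure (None = free page).
primary = ["A", "B", "C", "D", "E", None, None, None]
replacement = ["A", "B", "A", "B", "A", "A", "B", "B"]
offset_of = {"A": 0, "B": 1, "C": 2, "D": 3, "E": 4}   # block offset of each LBA

def lookup(lba):
    """Return (location of the most recent copy, number of spare areas scanned)."""
    scanned = 1                                   # spare area of the primary page
    offset = offset_of[lba]
    location = ("primary", offset) if primary[offset] == lba else None
    for page, owner in enumerate(replacement):    # linear scan of the replacement block
        if owner is None:
            break                                 # pages are filled in order
        scanned += 1
        if owner == lba:
            location = ("replacement", page)      # later pages hold newer data
    return location, scanned

print(lookup("B"))
print(lookup("C"))
```

Even LBA C, which was written only once, costs 9 spare-area reads here, since the scan cannot know in advance that the replacement block holds no newer copy.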
Garbage Collection Overhead - NFTL
[Figure: a Primary Block (A, B, C, D, E), a Replacement Block (A, B, A, B, A, A, B, B), and a New Primary Block (A, B, C, D, E).]
1. Copy the most-recent content to the new primary block.
2. Erase the old primary block and the replacement block.
3. Overhead is 2 block erases and 5 page writes.
Space Utilization - NFTL
[Figure: the same Primary Block, Replacement Block, and New Primary Block as in the garbage collection example.]
3 free pages are wasted.
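The merge in the last two slides can be sketched with versioned pages. Each page is written as (LBA, version), where the version numbers are made up to tell copies apart. Counting the free pages of the old primary block as the wasted ones is an interpretation of the figure.

```python
primary = [("A", 1), ("B", 1), ("C", 1), ("D", 1), ("E", 1), None, None, None]
replacement = [("A", 2), ("B", 2), ("A", 3), ("B", 3),
               ("A", 4), ("A", 5), ("B", 4), ("B", 5)]
offset_of = {"A": 0, "B": 1, "C": 2, "D": 3, "E": 4}

def merge(primary, replacement):
    """Build a new primary block from the most recent copy of every LBA."""
    latest = {}
    for page in primary + replacement:      # replacement pages are newer than primary pages
        if page is not None:
            lba, _version = page
            latest[lba] = page              # a later page overrides an earlier one
    new_primary = [None] * len(primary)
    for lba, page in latest.items():
        new_primary[offset_of[lba]] = page  # one page write per live LBA
    page_writes = len(latest)
    block_erases = 2                        # old primary block and replacement block
    wasted_free_pages = primary.count(None) # free pages erased without being used
    return new_primary, page_writes, block_erases, wasted_free_pages

new_primary, page_writes, block_erases, wasted = merge(primary, replacement)
print(new_primary)
print(page_writes, block_erases, wasted)
```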
Agenda
Introduction
Management Issues
Performance vs Overheads – An Adaptive Two-Level Mapping Mechanism
Other Challenging Issues
Conclusion
Motivation
An adaptive two-level management design of a flash translation layer, called AFTL.
Exploit the advantages of the fine-grained address mechanism and the coarse-grained address mechanism.
                              FTL      NFTL     AFTL
Memory Space Requirements     Large    Small    A little larger than NFTL
Address Translation Time      Short    Long     Much better than NFTL
Garbage Collection Overhead   Less     More     Much better than NFTL
Space Utilization             High     Low      Much better than NFTL
AFTL – Coarse-to-Fine Switching
[Figure: AFTL maintains a fine-grained hash table and a coarse-grained hash table. A coarse-grained slot (VBA, PPBA, RPBA) points to a primary block at PPBA (holding A, B, C, D, E) and a replacement block at RPBA (holding A, B, A, B, A, A, B, B). After the switch, the fine-grained slots (A, page 5 of RPBA) and (B, page 7 of RPBA) are added.]
1. AFTL doesn’t erase the two blocks immediately.
2. AFTL moves the mapping information of the replacement block to the fine-grained hash table by adding fine-grained slots.
3. The RPBA field of the corresponding mapping information is nullified.
Chin-Hsien Wu and Tei-Wei Kuo, “An Adaptive Two-Level Management for the Flash Translation Layer in Embedded Systems,” IEEE/ACM 2006 International Conference on Computer-Aided Design (ICCAD), November 5-9, 2006.
AFTL – Fine-to-Coarse Switching
The number of fine-grained slots is limited.
Some least recently used mapping information in fine-grained slots should be moved to the coarse-grained hash table.
[Figure: the fine-grained hash table with slots (A, page 5 of RPBA) and (B, page 7 of RPBA) and a slot labeled (F, PBAE), and the coarse-grained hash table with a slot (VBA, PPBA, RPBA) that points to a primary block (A, B, C, D, E) and a replacement block (A, B, A, B).]
1. Assume that this fine-grained slot is to be replaced.
2. Data stored in the page with the given (physical) address are copied to the primary or replacement block of the corresponding coarse-grained slot, as defined by NFTL.
3. If there does not exist any corresponding coarse-grained slot, a new one is created.
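The two kinds of switches can be sketched with two in-memory tables. This is not the authors' implementation: an `OrderedDict` stands in for the fine-grained hash table so that the least recently used slot is easy to find, the class and method names are hypothetical, and the data copy that accompanies a fine-to-coarse switch is only recorded, not performed.

```python
from collections import OrderedDict

class AFTLSketch:
    def __init__(self, max_fine_slots):
        self.max_fine_slots = max_fine_slots   # plays the role of a slot limit
        self.fine = OrderedDict()   # LBA -> (physical block, page); oldest entry first
        self.coarse = {}            # VBA -> [PPBA, RPBA]
        self.evicted = []           # LBAs pushed back to the coarse level

    def coarse_to_fine(self, vba, replacement_pages):
        """replacement_pages: the LBA stored in each page of the replacement block, in write order."""
        _ppba, rpba = self.coarse[vba]
        latest = {}
        for page, lba in enumerate(replacement_pages):
            if lba is not None:
                latest[lba] = page              # later pages hold newer data
        for lba, page in latest.items():
            self._add_fine_slot(lba, (rpba, page))
        self.coarse[vba][1] = None              # nullify the RPBA field

    def _add_fine_slot(self, lba, where):
        self.fine[lba] = where
        self.fine.move_to_end(lba)              # mark as most recently used
        while len(self.fine) > self.max_fine_slots:
            old_lba, _ = self.fine.popitem(last=False)   # least recently used slot
            self.evicted.append(old_lba)        # fine-to-coarse switch (data copy omitted)

    def lookup(self, lba):
        if lba in self.fine:
            self.fine.move_to_end(lba)
            return self.fine[lba]               # direct hit, no linear search
        return None                             # fall back to the coarse-grained (NFTL) path

aftl = AFTLSketch(max_fine_slots=2)
aftl.coarse[126] = [9, 23]                      # (VBA, PPBA, RPBA) as in the NFTL example
aftl.coarse_to_fine(126, list("ABABAABB"))
print(aftl.lookup("A"), aftl.lookup("B"), aftl.coarse[126])
```

After the switch, A and B resolve to pages 5 and 7 of the replacement block without any scan, and the two blocks have not been erased.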
AFTL – Fine-to-Coarse Switching
Coarse-to-fine switches would introduce fine-to-coarse switches and overhead in valid page copying, because the number of fine-grained slots is limited.
Stop any coarse-to-fine switch when some frequency bound on coarse-to-fine switches is reached.
We set a parameter in the experiments to control the frequency of switches and to explore the behavior of the proposed mechanism.
The Advantages of AFTL
Improve the address translation performance.
This is because mapping information is moved to the fine-grained hash table.
Reduce the garbage collection overhead.
The delayed recycling of any replacement block reduces the potential number of valid data copyings and blocks erased.
Improve the space utilization.
The delayed recycling of any primary block makes it likely that the free pages of the primary block are used in the future.
[Figure: the AFTL fine-grained and coarse-grained hash tables, with a coarse-grained slot (VBA, PPBA, RPBA) pointing to a primary block (A, B, C, D, E) and a replacement block (A, B, A, B, A, A, B, B), and fine-grained slots for A and B pointing to pages 5 and 7 of the replacement block.]
Agenda
Introduction
Management Issues
Performance vs Overheads – Performance Evaluation
Other Challenging Issues
Conclusion
Performance Evaluation
Performance Setup
The experiment trace was collected over a 20GB disk and has the following characteristics.
CPU: Intel Celeron 750MHz
RAM: 320 MB
OS: Windows XP
File System: NTFS
Applications: Web Applications, E-mail Clients, MP3 Player, MSN Messenger, Word, Excel, PowerPoint, Media Player, Programming, and Virtual Memory Activities
Duration: One week
Total Write / Read Requests: 13,198,805 / 2,797,996 sectors
Different LBAs: 1,669,228
Performance Evaluation
Performance Setup
The maximum number of fine-grained slots is controlled by a parameter MFS.
A parameter ST controls the frequency of switches between the two address translation mechanisms – n/ST.
ST=0 => No constraint on the number of switches.
Smaller ST => More switches.
Larger ST => Fewer switches.
Memory Space Requirements
1. MFS ranged over 2,500, 5,000, 7,500, 10,000, 12,500, and 15,000.
2. AFTL uses a little more memory space than NFTL.
[Figure: memory space (K), on a scale from 0 to 1,200, for AFTL with MFS = 2,500 to 15,000 and for NFTL.]
Address Translation Performance
[Figure: address translation time (ms), on a scale from 600,000 to 1,000,000, versus MFS = 2,500 to 15,000, for ST = 64, 32, 16, and 0, and for NFTL.]
1. Larger MFS => smaller address translation time
- More address translations go through the fine-grained address translation mechanism.
2. Smaller ST => longer address translation time
- More coarse-to-fine switches
Garbage Collection Overhead
[Figure, panel 1: average number of valid pages copied for MFS = 2,500 to 15,000 and for NFTL; series: ST=64, ST=32, ST=16, ST=0, and NFTL.]
[Figure, panel 2: number of blocks erased for MFS = 2,500 to 15,000 and for NFTL; series: ST=64, ST=32, ST=16, ST=0, and NFTL.]
AFTL outperforms NFTL.
- Coarse-to-fine switches can avoid the immediate recycling of primary and replacement blocks and the related copying of valid data.
Space Utilization
[Figure: average number of free pages left for MFS = 2,500 to 15,000 and for NFTL; series: ST=64, ST=32, ST=16, ST=0, and NFTL.]
Space utilization might be better under AFTL.
- Coarse-to-fine switches can delay the recycling of replacement blocks.
- Free pages of primary blocks might be used in the future.
Summary
AFTL is proposed to
exploit the advantages of both fine-grained and coarse-grained address translation mechanisms, and to
dynamically and adaptively switch mapping information between the two address translation mechanisms.
AFTL provides good performance in address mapping and space utilization, and keeps garbage collection overhead and memory space requirements under proper management.
[Figure: AFTL structure. A fine-grained hash table and a coarse-grained hash table; each coarse-grained slot (VBA, PPBA, RPBA) points to a primary block and a replacement block.]
Chin-Hsien Wu and Tei-Wei Kuo, 2006, “An Adaptive Two-Level Management for the Flash Translation Layer in Embedded Systems,”
IEEE/ACM 2006 International Conference on Computer-Aided Design (ICCAD), November 5-9, 2006.
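The two-level lookup summarized on this slide can be sketched in Python as follows. This is only an illustration under stated assumptions, not the authors' implementation: the names `AFTLSketch`, `CoarseSlot`, `add_fine`, and `lookup` are hypothetical, the fine-grained table is assumed to be keyed by logical page address, the coarse-grained table by virtual block address (VBA), and the search inside a replacement block is simplified to a dictionary.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Tuple


@dataclass
class CoarseSlot:
    """Hypothetical coarse-grained slot: (VBA -> PPBA, RPBA)."""
    ppba: int                      # primary physical block address
    rpba: Optional[int] = None     # replacement physical block address, if any
    # page offset -> page index in the replacement block holding the newest copy
    replaced: Dict[int, int] = field(default_factory=dict)


class AFTLSketch:
    def __init__(self, pages_per_block: int, mfs: int) -> None:
        self.pages_per_block = pages_per_block
        self.mfs = mfs                                # max number of fine-grained slots (MFS)
        self.fine: Dict[int, Tuple[int, int]] = {}    # LBA -> (block, page)
        self.coarse: Dict[int, CoarseSlot] = {}       # VBA -> slot

    def add_fine(self, lba: int, block: int, page: int) -> bool:
        """Add a fine-grained slot unless the MFS limit is reached (assumed policy)."""
        if lba not in self.fine and len(self.fine) >= self.mfs:
            return False
        self.fine[lba] = (block, page)
        return True

    def lookup(self, lba: int) -> Optional[Tuple[int, int]]:
        """Try the fine-grained table first, then fall back to the coarse-grained one."""
        if lba in self.fine:
            return self.fine[lba]
        vba, offset = divmod(lba, self.pages_per_block)
        slot = self.coarse.get(vba)
        if slot is None:
            return None
        if slot.rpba is not None and offset in slot.replaced:
            return (slot.rpba, slot.replaced[offset])
        return (slot.ppba, offset)
```

A fine-grained hit costs one table probe, while the coarse-grained path may have to search the replacement block, which is consistent with the earlier result that a larger MFS gives a smaller address translation time.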
Agenda
Introduction
Management Issues
Performance vs Overheads
Other Challenging Issues
Conclusion
Challenging Issues – Reliability
[Figure: a flash memory cell array highlighting the selected cell, with its control gate, drain, and source labeled, and the cell current I_DS.]
Each Word Line is connected to control gates.
Each Bit Line is connected to the drain.
Challenging Issues – Reliability
Read Operation
When the floating gate is not charged with electrons, there is a current I_D (100 µA) if a reading voltage is applied (the “1” state).
[Figure: read operation on a cell, with voltage labels 5V and 1V.]
Program Operation
Electrons are moved into the floating gate, and the threshold voltage is thus raised.
Challenging Issues – Reliability
Over-Erasing Problems
Fast Erasing Bits: All of the cells connected to the same bit line as a depleted cell would be read as “1”, regardless of their values.
Read/Program Disturb Problems
DC erasing of a programmed cell, DC programming of a non-programmed cell, drain disturb, etc.
Flash memory with a thin gate oxide makes disturb problems more serious!
Data Retention Problems
Electrons stored in a floating gate might be lost, and the loss of electrons will sooner or later affect the charge status of the gate!
Program Disturb Example - DC Erasing of a Programmed Cell
Electrons might be tunneled from the floating gate to the control gate through the interpoly oxide in all the programmed cells.
E.g., 12V applied to the control gate and 0V applied to both source and drain
Cell A was programmed.
Programming Cell B also erases Cell A at the same time.
Program Disturb Example – Drain Disturb
Programmed cells that share the bit line (BL) with the cell being programmed may be partially erased, since electrons are tunneled from the floating gate through the gate oxide to the drain.
Cell D was programmed.
Programming Cell B also erases Cell D at the same time.
Challenging Issues – Observations
The write throughput drops significantly after garbage collection starts!
The capacity of flash-memory storage systems increases very quickly, so the memory space requirements grow quickly as well.
Reliability becomes more and more critical when the manufacturing capacity increases!
The significant increase in flash-memory access rates seriously exacerbates the Read/Program Disturb Problems!
Wear-leveling technology is even more critical when flash memory is adopted in many system components or is expected to last for a long product lifetime!
Wear Leveling versus Product Lifetime
Settings
File system: FAT16 file system with 8KB cluster size
Flash memory: 256MB small-block flash memory with 100K erase cycles
Workload: a 16MB file is updated repeatedly at a throughput of 0.1MB/s
Suppose flash memory is managed at the block level.
The file requires 2K clusters (16MB ÷ 8KB cluster size).
The FAT size of this file is 4KB (2K clusters × 2 bytes).
The data of the 16MB file is stored in 1K blocks (16MB ÷ 16KB block size).
20% of the blocks in flash memory participate in dynamic wear leveling.
The file system updates the FAT on each cluster write, so the FAT is updated 2K times for each write of the 16MB file.
Writing the 16MB file incurs 1K block erases because of the reclaiming of invalid space.
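The derived quantities in these settings can be checked with a few lines of Python. This is a minimal sketch: the variable names are illustrative, K and M are taken as binary units (1K = 1024), and the total block count (256MB ÷ 16KB) is computed here rather than stated on the slide.

```python
KB = 1024
MB = 1024 * KB

file_size = 16 * MB          # size of the repeatedly updated file
cluster_size = 8 * KB        # FAT16 cluster size
fat_entry_size = 2           # bytes per FAT16 entry
block_size = 16 * KB         # small-block flash: 16KB per block
flash_size = 256 * MB        # total flash capacity
throughput_mb_per_s = 0.1    # update throughput

clusters = file_size // cluster_size          # 2K clusters
fat_bytes = clusters * fat_entry_size         # 4KB of FAT entries for this file
data_blocks = file_size // block_size         # 1K blocks hold the file data
total_blocks = flash_size // block_size       # 16K blocks in the whole flash
seconds_per_pass = 16 / throughput_mb_per_s   # time to rewrite the file once

print(clusters, fat_bytes, data_blocks, total_blocks, seconds_per_pass)
```

Each pass over the file therefore takes 160 seconds and triggers 2K FAT updates plus 1K block erases for the data.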
Wear Leveling versus Product Lifetime
Ways in Data Updates
In-Place-Updates: Rewriting on the Same Page
Dynamic Wear Leveling: Rewriting over Another Free Page with Erasing over Blocks with Dead Pages
Static Wear Leveling: Rewriting over Another Free Page with Erasing over Any Blocks
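Expected Lifetime

No Wear Leveling (in-place update):
\[
\frac{100\mathrm{K} \times \dfrac{16\ \text{(MB)}}{0.1\ \text{(MB/second)}}}{2\mathrm{K} \times 24 \times 60 \times 60} \approx 0.09\ \text{(days)}
\]

Dynamic Wear Leveling:
\[
\frac{100\mathrm{K} \times \bigl(16\mathrm{K}\ \text{(blocks)} \times 20\%\bigr) \times \dfrac{16\ \text{(MB)}}{0.1\ \text{(MB/second)}}}{(2\mathrm{K} + 1\mathrm{K}) \times 24 \times 60 \times 60} \approx 197.5\ \text{(days)}
\]

Static Wear Leveling:
\[
\frac{100\mathrm{K} \times \bigl(16\mathrm{K}\ \text{(blocks)} \times 100\%\bigr) \times \dfrac{16\ \text{(MB)}}{0.1\ \text{(MB/second)}}}{(2\mathrm{K} + 1\mathrm{K}) \times 24 \times 60 \times 60} \approx 987.5\ \text{(days)}
\]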
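The three lifetime estimates can be reproduced with a short Python sketch. It assumes the reading of the slide's formulas used here: lifetime = (erase cycles × number of blocks sharing the wear) ÷ (block erases per pass over the file) × (seconds per pass), converted to days. The function name `lifetime_days` and the decimal value of K are assumptions; K cancels in the block ratios, so the choice only affects the 100K erase-cycle figure.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def lifetime_days(erase_cycles, blocks_sharing_wear, erases_per_pass, seconds_per_pass):
    """Days until the blocks that absorb the erases reach their erase-cycle limit."""
    total_erases = erase_cycles * blocks_sharing_wear
    passes = total_erases / erases_per_pass
    return passes * seconds_per_pass / SECONDS_PER_DAY


K = 1000                      # assumption: decimal K for the 100K erase cycles
seconds_per_pass = 16 / 0.1   # 16MB at 0.1MB/s

# No wear leveling: the FAT is rewritten in place, so one block takes 2K erases per pass.
no_wl = lifetime_days(100 * K, 1, 2 * K, seconds_per_pass)
# Dynamic wear leveling: 2K + 1K erases per pass are spread over 20% of 16K blocks.
dynamic = lifetime_days(100 * K, 0.20 * 16 * K, 3 * K, seconds_per_pass)
# Static wear leveling: the same erases are spread over all 16K blocks.
static = lifetime_days(100 * K, 1.00 * 16 * K, 3 * K, seconds_per_pass)

print(round(no_wl, 2), round(dynamic, 1), round(static, 1))
```

Under these assumptions the results land at roughly 0.09, 197.5, and 987.7 days; the last is slightly above the 987.5 reported on the slide, which equals five times the rounded dynamic figure.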
Comparison of Different Technologies
[Figure: erase cycles versus physical block addresses (PBA) in four panels: No Wear Leveling, Dynamic Wear Leveling, Intuitive SWL, and Perfect SWL.]
Wear Leveling versus Product Lifetime
Static Wear Leveling – Block-Level Mapping
Use a counter for each block.
The garbage collector always finds the block with the least erase count.
Some heuristic approaches erase a block to maintain 2 free blocks when the garbage collector finds that the erase count of the block is over a given threshold.
Problems:
High extra block erases and live-page copying
High main-memory consumption
High computation cost
[Figure: a 16-block flash memory (blocks 0 to 15) with one erase counter per block (values 4 or 5), stepping through: update data in block 3, write new data to block 4, GC starts and finds a victim block, update block 15, GC starts. Legend: free block; dead block; block containing (some) valid data; index in the selection of a victim block; index to the selected free block.]
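The counter-based selection described on this slide can be sketched in Python as below. The helper names `pick_least_worn` and `erase` are hypothetical. The counter values are copied from the counter row of the slide's example; which blocks are free, dead, or valid is not recoverable from the figure, so treating every block as a candidate is an assumption.

```python
from typing import List, Sequence


def pick_least_worn(erase_counts: Sequence[int], candidates: Sequence[int]) -> int:
    """Return the candidate block with the smallest erase count (first one on ties)."""
    return min(candidates, key=lambda block: erase_counts[block])


def erase(erase_counts: List[int], block: int) -> None:
    """Erase a block and bump its per-block counter."""
    erase_counts[block] += 1


# One counter per block, as in the slide's 16-block example.
counts = [5, 5, 5, 4, 5, 5, 4, 4, 4, 5, 5, 5, 4, 4, 4, 5]
first = pick_least_worn(counts, range(len(counts)))
erase(counts, first)
second = pick_least_worn(counts, range(len(counts)))
```

The scan over all counters on every selection, and the need to keep one counter per block in RAM, illustrate the computation and main-memory costs listed under Problems.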
An Efficient Static Wear Leveling Mechanism
A modular design for compatibility considerations
A SWL mechanism
Block Erasing Table (BET): bit flags
SW Leveler: SWL-Procedure and SWL-Update
[Figure: system architecture. File Systems (e.g., DOS FAT) run on top of the Flash Translation Layer (such as FTL and NFTL), which contains the Allocator (address translation / block assignment), the Cleaner (garbage collection), and the SW Leveler (static wear leveling); below it are the Memory Technology Device (MTD) Layer and Flash Memory.]
The Block Erasing Table (BET)
A bit-array: each bit is for 2^k consecutive blocks.
Small k: in favor of hot-cold data separation
Large k: in favor of small RAM space
f_cnt: the number of 1’s in the BET
e_cnt: the total number of block erases done since the BET was reset
[Figure: a 16-block flash memory (blocks 0 to 15) and its BET for k=0 (16 flags) and k=2 (4 flags). As the Cleaner erases blocks, two example counter sequences are shown: (e_cnt, f_cnt) = (0,0), (1,1), (2,2), (3,2) and (0,0), (1,1), (2,2), (3,2), (4,2). Legend: a block that has been erased in the current resetting interval; an index to a block that the Cleaner wants to erase.]
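A minimal Python sketch of the BET bookkeeping described above, assuming one flag per set of 2^k consecutive blocks. The class and method names are hypothetical, and the flags are kept in a plain list of 0/1 values for readability, whereas a real implementation would pack them into a bit array to save RAM.

```python
class BlockErasingTable:
    """One flag per set of 2**k consecutive blocks, plus the e_cnt and f_cnt counters."""

    def __init__(self, num_blocks: int, k: int) -> None:
        self.k = k
        num_flags = (num_blocks + (1 << k) - 1) >> k   # ceil(num_blocks / 2**k)
        self.flags = [0] * num_flags
        self.e_cnt = 0   # block erases since the last reset
        self.f_cnt = 0   # number of flags currently set to 1

    def record_erase(self, block: int) -> None:
        """Count one block erase and set the flag of the block set it belongs to."""
        self.e_cnt += 1
        index = block >> self.k
        if self.flags[index] == 0:
            self.flags[index] = 1
            self.f_cnt += 1

    def reset(self) -> None:
        """Start a new resetting interval."""
        self.flags = [0] * len(self.flags)
        self.e_cnt = 0
        self.f_cnt = 0


bet = BlockErasingTable(num_blocks=16, k=2)
history = []
for block in (1, 9, 2, 10):   # block numbers chosen for illustration only
    bet.record_erase(block)
    history.append((bet.e_cnt, bet.f_cnt))
```

With k=2, blocks 1 and 2 share a flag, as do blocks 9 and 10, so f_cnt stops growing after the second erase while e_cnt keeps counting.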
The SW Leveler
Triggering of the SW Leveler: an unevenness level (e_cnt / f_cnt) >= T
Resetting of the BET when all flags are set
f_cnt: the number of 1’s in the BET
e_cnt: the total number of block erases since the BET was reset
T: a threshold, T=1000 in this example
[Figure: a 16-block flash memory (blocks 0 to 15) and its BET with k=2 (4 flags). Legend: a block that has been erased in the current resetting interval; an index to a block that the Cleaner wants to erase; an index in the selection of a block set; an index at which the SW Leveler triggers the Cleaner to do garbage collection.]
Example:
e_cnt=1998, f_cnt=2; then e_cnt=1999, f_cnt=2; then e_cnt=2000, f_cnt=2.
2000 / 2 = 1000 >= 1000 (e_cnt / f_cnt >= T), so the SW Leveler triggers the Cleaner.
The Cleaner is triggered to
1. Copy valid data of the selected block set to a free area,
2. Erase the blocks in the selected block set, and
3. Inform the Allocator to update the address mapping between LBA and PBA.
Afterwards, e_cnt=2004, f_cnt=3.
After a period of time, the total erase count reaches 2998 (f_cnt=3), then 2999, then 3000.
3000 / 3 = 1000 >= 1000 (e_cnt / f_cnt >= T), so the Cleaner is triggered again; afterwards, e_cnt=3004, f_cnt=4.
After a period of time, the total erase count reaches 3999 (f_cnt=4), then 4000.
4000 / 4 = 1000 >= 1000 (e_cnt / f_cnt >= T), but all flags in the BET are 1, so the BET is reset: e_cnt=0, f_cnt=0, and the block-set selection index is reset to a randomly selected block set (flag).
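The triggering and resetting rules of the SW Leveler can be sketched in Python as follows. This is an illustration under assumptions, not the authors' code: the class `SWLeveler`, its method names, and the `erase_block_set` hook standing in for the Cleaner are hypothetical, and the sketch assumes that cleaning a block set erases every block in it, so e_cnt grows by the set size.

```python
import random
from typing import Callable, List, Optional


class SWLeveler:
    def __init__(self, num_sets: int, blocks_per_set: int, threshold: int,
                 erase_block_set: Callable[[int], None],
                 rng: Optional[random.Random] = None) -> None:
        self.flags: List[int] = [0] * num_sets    # the BET, one flag per block set
        self.blocks_per_set = blocks_per_set      # 2**k
        self.threshold = threshold                # T
        self.e_cnt = 0
        self.f_cnt = 0
        self.findex = 0                           # index used to select a block set
        self.erase_block_set = erase_block_set    # hypothetical hook into the Cleaner
        self.rng = rng if rng is not None else random.Random()

    def update(self, set_index: int) -> None:
        """Record one block erase in block set `set_index` (SWL-Update)."""
        self.e_cnt += 1
        if self.flags[set_index] == 0:
            self.flags[set_index] = 1
            self.f_cnt += 1

    def procedure(self) -> None:
        """Trigger the Cleaner while e_cnt / f_cnt >= T (SWL-Procedure)."""
        while self.f_cnt > 0 and self.e_cnt / self.f_cnt >= self.threshold:
            if self.f_cnt >= len(self.flags):
                # All flags are set: reset the BET and restart at a random block set.
                self.flags = [0] * len(self.flags)
                self.e_cnt = 0
                self.f_cnt = 0
                self.findex = self.rng.randrange(len(self.flags))
                return
            while self.flags[self.findex] == 1:   # skip sets erased in this interval
                self.findex = (self.findex + 1) % len(self.flags)
            self.erase_block_set(self.findex)
            for _ in range(self.blocks_per_set):  # every block in the set was erased
                self.update(self.findex)
            self.findex = (self.findex + 1) % len(self.flags)
```

Starting from e_cnt=1999 and f_cnt=2 with four block sets of four blocks and T=1000, one more erase followed by `procedure()` selects an unflagged set and leaves e_cnt=2004 and f_cnt=3, matching the example on the slide.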
Main-Memory Requirements
MLC x2 (1 page = 2 KB, 1 block = 128 pages)
       512MB   1GB     2GB     4GB     8GB
k=0    256B    512B    1024B   2048B   4096B
k=1    128B    256B    512B    1024B   2048B
k=2    64B     128B    256B    512B    1024B
k=3    32B     64B     128B    256B    512B
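The table entries follow from the BET definition: one bit per 2^k blocks, with 256KB blocks (128 pages × 2KB). A small Python sketch, with the hypothetical helper name `bet_size_bytes`:

```python
KB = 1024
MB = 1024 * KB
GB = 1024 * MB

BLOCK_BYTES = 128 * 2 * KB   # MLC x2: 128 pages of 2KB per block


def bet_size_bytes(capacity_bytes: int, k: int, block_bytes: int = BLOCK_BYTES) -> int:
    """RAM needed for a BET with one bit per 2**k consecutive blocks."""
    blocks = capacity_bytes // block_bytes
    flags = -(-blocks // (1 << k))   # ceiling division
    return -(-flags // 8)            # 8 flags per byte


for k in range(4):
    row = [bet_size_bytes(c, k) for c in (512 * MB, 1 * GB, 2 * GB, 4 * GB, 8 * GB)]
    print(f"k={k}", row)
```

Each increment of k halves the table, which is the RAM versus hot-cold separation trade-off noted for the BET.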