The Behavior Analysis of Flash-Memory Storage Systems ∗
Po-Chun Huang † , Yuan-Hao Chang † , Tei-Wei Kuo † , Jen-Wei Hsieh ‡ , and Miller Lin §
† Department of Computer Science and Information Engineering Graduate Institute of Networking and Multimedia
National Taiwan University, Taipei 106, Taiwan, R.O.C.
{r95070, d93944006, ktw}@csie.ntu.edu.tw
‡ Department of Computer Science and Information Engineering
National Taiwan University of Science and Technology, Taipei, Taiwan, R.O.C.
§ Genesys Logic, Taipei, Taiwan, R.O.C [email protected]
Abstract
Performance and reliability are two major design concerns of flash-memory storage systems, especially for low-cost products. Although various excellent flash-memory management schemes have been proposed, little work has been done on how to evaluate the designs or implementations of flash-memory storage systems. Many existing evaluation workloads for flash-memory storage systems are still based on those designed for hard disks.
This work addresses the need for behavior analysis of flash-memory storage systems and their evaluation. In particular, a set of evaluation metrics and their corresponding access patterns are proposed. The behaviors of flash memory are also analyzed in terms of performance and reliability issues.
1 Introduction
Flash memory has become increasingly popular because of its shock resistance, low power consumption, small size, non-volatility, low cost, and excellent random access time. It has now been widely adopted in various applications and has grown beyond its original goals¹. As the capacity of flash memory and the number of bits in each flash cell increase, retaining system performance and reliability has become a very challenging issue. A well-designed flash-memory management scheme can improve the performance and reliability of flash-memory storage systems. However, it is hard to evaluate the performance and reliability of a flash-memory management scheme, because both are affected by the features of the underlying flash media and by the access patterns issued by file systems and user applications.
∗ Supported by the National Science Council of Taiwan, R.O.C., under Grant NSC-96-2752-E-002-008-PAE.
¹ There are two major types of flash memory: NOR and NAND. In this paper, we consider NAND flash memory, which is the most widely adopted flash memory in storage systems.
Even though some evaluation tools, e.g., HDBench™ for hard drives and FDBench™ for flash drives, are designed for the evaluation of storage devices, none of them was designed with flash-memory features in mind, especially regarding performance and reliability. These observations motivate this research.
In the past years, many excellent designs and implementations of flash-memory management schemes have been proposed in the literature, e.g., [2, 6, 17, 18]. In recent years, several vendors, such as Intel and Microsoft, have also started to adopt flash memory in their product designs, e.g., the flash-memory cache for hard disks (known as TurboMemory) and the fast booting mechanism in Windows Vista [1, 3], in order to improve system performance. Well-known flash-memory-based products, such as solid state disks (SSDs), are now emerging in the market to substitute for hard drives in some portable devices [5, 16]. Researchers and vendors have explored the possibility of improving the performance of NAND with an SRAM cache [10, 11, 13, 15]. Among them, OneNAND-like approaches presented simple but effective hardware architectures to replace NOR with NAND and an SRAM cache [10, 12].
With the wide popularity of flash memory in storage system designs across various applications and systems, it is of paramount importance to have a fair evaluation of different designs and implementations of flash-memory storage systems, especially of their performance and reliability. Because the performance and reliability of a flash-memory storage system are highly influenced by the underlying flash media, the management scheme design, and the access patterns generated by applications, proper evaluation of flash-memory storage systems should consider system behaviors with respect to these three factors. Unfortunately, insufficient attention has been paid to the evaluation of flash-memory storage systems, e.g., [8, 9]. Such an observation motivates the research objective of this study. In this paper, we analyze the performance and reliability issues of flash memory and their potential storage-system designs. Metrics on performance and reliability are proposed for the evaluation of flash-memory storage systems from the viewpoint of device drivers. With
11th IEEE Symposium on Object Oriented Real-Time Distributed Computing (ISORC)
the proposed metrics, a set of access patterns is proposed to evaluate the performance and reliability of flash-memory storage systems. The proposed access patterns are analyzed based on throughput, response time, maintenance overheads, endurance, data disturbance, and crash recovery. The objective is to provide a more complete behavior analysis of flash-memory storage systems so that a more objective comparison of the designs and implementations of flash-memory storage systems is possible.
The rest of this paper is organized as follows: Section 2 describes flash-memory characteristics, system architectures, and our research motivation. The proposed access patterns and their behaviors on different flash-memory management schemes are presented in Section 3. Section 4 is the conclusion.
2 Flash-Memory Characteristics, System Architecture, and Motivation
A NAND-type flash memory chip comprises one or more banks. Each bank is composed of blocks, and each block consists of a fixed number of pages. A block is the smallest unit of erase operations, while read and write operations are done on a page basis. Once a page is written, it cannot be overwritten until it is erased (the write-once property). Therefore, out-place update is adopted to write data to free space and invalidate old versions of data. Since more than one version of data might coexist in flash memory, the latest copy of data is considered live, and old versions are considered dead. Pages that store live data and dead data are called valid pages and invalid pages, respectively. When the number of free pages falls below some safe threshold, a system activity called garbage collection is carried out to reclaim invalid pages. Invalid pages are reclaimed through block erases, and valid pages in the to-be-erased block (the victim block) should be copied to free space before the block is erased. Since each flash-memory block has a limited number of erase cycles, a wear leveling strategy is needed to erase all blocks evenly so as to achieve a longer lifetime of flash memory².
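The out-place update and garbage collection described above can be illustrated with a small toy model. This is only a sketch under stated assumptions, not a management scheme from the literature: the class `TinyFlash`, the reserved spare block used as the copy target, and the greedy choice of the victim block (the one with the most invalid pages) are all hypothetical simplifications, and wear leveling is not modeled beyond counting erases.

```python
from enum import Enum


class PageState(Enum):
    FREE = 0
    VALID = 1    # holds the live copy of a logical page
    INVALID = 2  # holds a dead (superseded) copy


class TinyFlash:
    """Toy model of out-place update; one block is kept spare for GC copies."""

    def __init__(self, num_blocks=4, pages_per_block=4):
        self.ppb = pages_per_block
        self.state = [[PageState.FREE] * pages_per_block for _ in range(num_blocks)]
        self.lba_of = [[None] * pages_per_block for _ in range(num_blocks)]
        self.erase_count = [0] * num_blocks
        self.mapping = {}            # LBA -> (block, page) of the live copy
        self.spare = num_blocks - 1  # assumption: reserved copy target for GC

    def _free_page(self):
        for b, pages in enumerate(self.state):
            if b == self.spare:
                continue
            for p, s in enumerate(pages):
                if s is PageState.FREE:
                    return b, p
        return None

    def _program(self, block, page, lba):
        self.state[block][page] = PageState.VALID
        self.lba_of[block][page] = lba
        self.mapping[lba] = (block, page)

    def _garbage_collect(self):
        candidates = [b for b in range(len(self.state)) if b != self.spare]
        victim = max(candidates, key=lambda b: self.state[b].count(PageState.INVALID))
        if self.state[victim].count(PageState.INVALID) == 0:
            raise RuntimeError("no invalid pages to reclaim")
        next_page = 0
        for p, s in enumerate(self.state[victim]):
            if s is PageState.VALID:  # copy live pages before the erase
                self._program(self.spare, next_page, self.lba_of[victim][p])
                next_page += 1
        self.state[victim] = [PageState.FREE] * self.ppb  # block erase
        self.lba_of[victim] = [None] * self.ppb
        self.erase_count[victim] += 1
        self.spare = victim  # the erased block becomes the new spare

    def write(self, lba):
        old = self.mapping.pop(lba, None)
        if old is not None:  # the old version becomes dead
            self.state[old[0]][old[1]] = PageState.INVALID
        loc = self._free_page()
        if loc is None:
            self._garbage_collect()
            loc = self._free_page()
        self._program(loc[0], loc[1], lba)
```

Repeatedly calling `write` with the same LBA leaves a trail of invalid pages behind the single live copy, and the first write that finds no free page pays for a block erase, which is the source of the performance variation analyzed later.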
As shown in Figure 1, a Memory Technology Device (MTD) driver provides primitive functions, such as reads, writes, and erases, over flash media. In addition, a Flash Translation Layer driver is needed in the system for address translation and garbage collection. The Flash Translation Layer protocol (FTL) and the NAND Flash Translation Layer protocol (NFTL) are its popular implementations. This driver emulates the flash media as block devices so that file systems and user applications can access the flash media transparently. Another architecture is to adopt a flash file system that is specifically designed for flash media and manages address translation and garbage collection at the same time. Products in the market can be partitioned into two categories: one includes the MTD driver (and the Flash Translation Layer driver) in the device package (such as CompactFlash™ and USB Flash Drives (UFDs)), as shown in Figure 1(a), and
² SLC (Single-Level Cell) and MLC (Multi-Level Cell) are the two popular designs of flash memory. Each cell of SLC flash memory stores one bit of information, while each cell of MLC×n flash memory stores n bits of information.
Figure 1. Storage system architecture
the other may not include them (such as xD™), as shown in Figure 1(b).
Figure 2. Design and evaluation considerations of flash-memory storage systems
In this paper, we consider the architecture with the Flash Translation Layer driver because it provides high portability and compatibility for flash devices. Under this architecture, the performance and reliability of a flash-memory storage system are mainly controlled by the flash-memory management scheme implemented in the Flash Translation Layer driver. Note that different flash-memory management schemes favor different access patterns. As shown in Figure 2, the design of a flash-memory management scheme should consider the access patterns applied to it, and the evaluation of a flash-memory storage system should consider both the flash-memory management scheme and the access patterns applied to the storage system. In addition, the features of flash media should also be considered since they affect the design of flash-memory management schemes. However, most evaluation tools (i.e., benchmarks) for storage systems are designed for benchmarking hard drives, without adequate consideration of either the features of flash media or the behaviors of flash-memory management schemes. Such observations underline the objectives of this research: to explore the behaviors of flash-memory management schemes and the features of flash media, and to analyze evaluation metrics and access patterns that are suitable for the performance and reliability evaluation of flash-memory storage systems.
3 The Analysis and Evaluation for Flash-Memory Storage Systems
3.1 Overview
Flash memory, like hard disks, is usually used in secondary storage devices and accessed by the host as a block device through a management layer called the flash translation layer. Existing evaluation methods and tools designed for hard drives consider neither the physical characteristics nor the translation layer of flash memory, so they cannot be adopted directly to evaluate flash-based storage systems. The constraints on the read/write performance and the reliability concerns of flash memory should be further analyzed. The purpose of this section is to analyze the behaviors of flash memory in terms of performance (Section 3.2) and reliability (Section 3.3). A set of access patterns is then proposed for the evaluation of flash-memory storage systems according to the analyzed behaviors and proposed metrics.
The performance and reliability of a flash-memory storage system are the combined result of the flash-memory management scheme (which is implemented as a Flash Translation Layer driver) and its underlying flash media. To evaluate a flash-memory storage system, access patterns should be designed at the device-driver layer because device drivers can separate read operations from write operations without the interference of file systems. In contrast, a file system might use buffers to cache data, or issue several reads and writes to its underlying device drivers whenever it receives a read (or a write) request from an application; moreover, different file systems (and even different implementations of the same file system) access storage systems in different ways. Note that a read or write operation issued by the device driver consists of two fields that describe the accessed logical block addresses (LBAs): the start LBA and the number of LBAs.
3.2 Performance Analysis and Relative Metrics
3.2.1 Read and Write Performance
Throughput and Response Time. Throughput and response time are the most widely used metrics for the performance evaluation of storage systems, where throughput is the amount of data read from (or written to) the storage system per time unit, e.g., one second, and response time is the time that the storage system takes to react to a read or write from the host. The throughput evaluates the performance of a storage system at the granularity of one time unit, and the response time shows the performance of a storage system in the unit of one read or write operation. Note that read and write operations are viewed at the level of device drivers so that we can separate the evaluation of read operations from that of write operations, in terms of the throughput and response time of a storage system.
The throughput can be classified into the average throughput $TH_{avg}$, worst-case throughput $TH_{worst}$, and best-case throughput $TH_{best}$, which are defined as follows:

$$TH_{avg} = \frac{\sum_{i=1}^{T} D_i}{T} \tag{1}$$

$$TH_{best} = \max_{i=1}^{T} D_i \tag{2}$$

$$TH_{worst} = \min_{i=1}^{T} D_i, \tag{3}$$

where $D_i$ is the amount of data transmitted during the $i$-th time unit and $T$ is the number of time units during the evaluation period.
Equation 1 evaluates the average throughput of a flash-memory storage system during the evaluation period. The best-case throughput and worst-case throughput, as shown in Equations 2-3, give the largest and smallest amount of data that a flash-memory storage system can transmit in a time unit, respectively. Therefore, the best-case throughput presents the performance limit of a flash-memory storage system, and the worst-case throughput can aid the design of real-time applications. However, the worst-case throughput is usually ignored, since it does not evaluate the performance of a flash-memory storage system in the unit of one read or write operation.
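As a minimal sketch of how the three throughput metrics could be computed from a trace, the function below takes the per-time-unit data amounts and follows the reading given in the text, where the best case is the largest and the worst case is the smallest amount transferred in one time unit. The function name `throughput_metrics` and the dictionary layout are hypothetical and only serve the illustration.

```python
def throughput_metrics(data_per_unit):
    """Summarize throughput from D_1..D_T, the data moved in each time unit.

    Returns the average, best-case (largest), and worst-case (smallest)
    amount of data transferred per time unit.
    """
    if not data_per_unit:
        raise ValueError("at least one time unit is required")
    num_units = len(data_per_unit)  # T in the text
    return {
        "avg": sum(data_per_unit) / num_units,
        "best": max(data_per_unit),
        "worst": min(data_per_unit),
    }
```

For example, a trace of three one-second units that moved 4, 8, and 6 MB gives an average of 6 MB/s, a best case of 8 MB/s, and a worst case of 4 MB/s.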
Similar to the throughput, the response time can also be classified into three categories: the average response time $RT_{avg}$, worst-case response time $RT_{worst}$, and best-case response time $RT_{best}$, as shown in Equations 4-6. The average response time is the average time taken to service and respond to an access, where an access is a read or write operation. The best-case response time shows the shortest time in which an access can be answered, but it is usually discarded because it does not help the stability consideration in the design of flash-memory storage systems. The worst-case response time can be used to measure the capability of a device to support real-time application designs. In the design of a real-time application, we usually measure the worst-case response time instead of the worst-case throughput because one time unit usually contains more than one access. Note that, in order to evaluate the worst-case response time, a better approach is to access only one page per read or write operation, since accessing n consecutive pages in one request is usually much faster than accessing one page per request n times.
$$RT_{avg} = \frac{\sum_{i=1}^{N} RT_i}{N} \tag{4}$$

$$RT_{best} = \min_{i=1}^{N} RT_i \tag{5}$$

$$RT_{worst} = \max_{i=1}^{N} RT_i, \tag{6}$$

where $RT_i$ is the response time of the $i$-th access, and $N$ is the number of accesses during the evaluation period.
Without loss of generality, the evaluation of read performance is separated from that of write performance for simplicity. In each evaluation, it suffices to evaluate the average throughput, best-case throughput, average response time, and worst-case response time of a flash-memory storage system, since the worst-case throughput and best-case response time cannot provide any further performance information about a flash-memory storage system.
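The response-time metrics can be sketched in the same way. The helper `time_accesses` below is a hypothetical harness: it assumes the caller supplies a function `issue` that performs one read or write against the device under test, and it times each call with the standard `time.perf_counter` clock. Timing taken on the host in this way also includes host-side overhead, so it should be read as an approximation.

```python
import time


def response_time_metrics(samples):
    """Summarize response times RT_1..RT_N (one sample per access)."""
    if not samples:
        raise ValueError("at least one access is required")
    return {
        "avg": sum(samples) / len(samples),
        "best": min(samples),   # shortest response time
        "worst": max(samples),  # longest response time
    }


def time_accesses(issue, requests):
    """Issue each request through `issue` and record its elapsed time in seconds."""
    samples = []
    for request in requests:
        start = time.perf_counter()
        issue(request)
        samples.append(time.perf_counter() - start)
    return samples
```

Reads and writes would be timed in separate runs, in line with the separation of read and write evaluation stated above.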
Due to the write-once property and the unit difference between read/write operations and erase operations on flash media, the best-case throughput usually occurs when data of consecutive logical addresses are written to the flash memory or when data stored at consecutive physical addresses are read from the flash memory. Figure 3 shows an example of the access pattern to evaluate the best-case throughput, where the x-axis denotes the sequence of accesses and the y-axis denotes the logical block addresses (LBAs). In this figure, five sequential write operations (red dots) are issued to write data to consecutive LBAs, followed by five sequential read operations (dark blue dots) to read the previously written data back, where the number of LBAs accessed by each operation (the length of each vertical line) is the maximal number imposed by the operating system or bus driver. Because a flash-memory management scheme might adopt a cache to buffer accessed data, the total number of LBAs accessed in the test pattern should be large enough to identify the caching effects.
Figure 3. A test pattern for best-case throughput (x-axis: i-th access; y-axis: LBA; each request has the largest request packet length)
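The best-case pattern can be generated with a few lines of code. The sketch below is illustrative only: the `Request` tuple (operation, start LBA, number of LBAs) mirrors the two-field description of a device-driver operation given in Section 3.1, while the function name and its parameters are assumptions. The largest request length must be supplied by the caller, since it depends on the operating system or bus driver.

```python
from typing import List, NamedTuple


class Request(NamedTuple):
    op: str         # "read" or "write"
    start_lba: int  # first LBA accessed
    num_lbas: int   # number of consecutive LBAs accessed


def best_case_pattern(num_ops: int, max_request_lbas: int,
                      first_lba: int = 0) -> List[Request]:
    """Sequential maximal-length writes, then reads of the same ranges."""
    writes = [
        Request("write", first_lba + i * max_request_lbas, max_request_lbas)
        for i in range(num_ops)
    ]
    reads = [Request("read", w.start_lba, w.num_lbas) for w in writes]
    return writes + reads
```

Calling `best_case_pattern(5, 128)` yields five writes followed by five reads over the same consecutive LBA ranges. To expose caching effects, `num_ops` would be chosen so that the total range exceeds any cache in the device.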
As shown in Figure 4(a), the access pattern for the worst-case response time should access one page at a time, and the LBAs should be accessed randomly and as evenly as possible. This is because accessing one page at a time is the slowest way to read or write a given amount of data from/to a flash-memory storage system, and because different flash-memory management schemes might favor different specially designed patterns. The longer the evaluation runs with this access pattern, the closer the derived worst-case response time is to the true worst-case response time. However, this pattern is inefficient for evaluating the worst-case response time of write operations, because the worst case of a write operation occurs when it triggers garbage collection, which takes much longer than any other management overhead. As shown in Figure 4(b), the access pattern therefore first fills up the flash-memory storage system with sequential writes and then starts the random write process, so that garbage collection is triggered much sooner.
Figure 4. Test patterns for worst-case response time: (a) uniform random access pattern for the worst-case response time; (b) uniform random access pattern for the worst-case write response time, consisting of sub-pass 1 and sub-pass 2 (x-axis: i-th access; y-axis: LBA)
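The two sub-passes of the worst-case write pattern can be sketched as follows. Several simplifications are assumed here and are not taken from the paper: addresses are expressed in pages, so one request covers exactly one page; the fill in sub-pass 1 is also issued one page at a time; and "as evenly as possible" is approximated by shuffling the full page range in each random pass, so every page is written exactly once per pass.

```python
import random


def worst_case_write_pattern(total_pages, random_passes=1, seed=0):
    """Return (op, start_page, num_pages) tuples for the two sub-passes."""
    rng = random.Random(seed)  # fixed seed keeps the pattern reproducible
    # Sub-pass 1: fill the device sequentially, one page per request.
    pattern = [("write", page, 1) for page in range(total_pages)]
    # Sub-pass 2: single-page writes in random order, each page hit evenly.
    for _ in range(random_passes):
        order = list(range(total_pages))
        rng.shuffle(order)
        pattern.extend(("write", page, 1) for page in order)
    return pattern
```

Because the device is already full when sub-pass 2 starts, every random write supersedes a live page, so garbage collection should begin early in that sub-pass.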
When a flash-memory storage system, e.g., an SSD, is adopted in a general-purpose platform such as a laptop, it receives accesses from all of the applications running on the system. In such an access pattern, the distributions of the start LBA and the number of LBAs of a read/write operation would be close to normal distributions, whose means and variances depend on the behavior of the system under evaluation. We can therefore use normal distributions to generate the start LBA and the number of LBAs of each read/write operation to evaluate the average throughput and response time of a flash-memory storage system; Figure 5(a) shows an example.
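A minimal generator for such a pattern is sketched below, using `random.gauss` from the Python standard library. All parameter names are hypothetical, and the means and standard deviations would have to be fitted to the system under evaluation. Samples that fall outside the device are clamped to the valid range, which slightly distorts the tails of the distribution; this is a simplification of the sketch, not something specified in the text.

```python
import random


def general_purpose_pattern(num_ops, total_lbas, lba_mean, lba_sd,
                            len_mean, len_sd, read_ratio=0.5, seed=0):
    """Return (op, start_lba, num_lbas) tuples with normally distributed fields."""
    rng = random.Random(seed)
    pattern = []
    for _ in range(num_ops):
        # Request length: at least one LBA, at most the whole device.
        length = max(1, min(total_lbas, round(rng.gauss(len_mean, len_sd))))
        # Start LBA: clamped so the request stays inside the device.
        start = round(rng.gauss(lba_mean, lba_sd))
        start = max(0, min(total_lbas - length, start))
        op = "read" if rng.random() < read_ratio else "write"
        pattern.append((op, start, length))
    return pattern
```

The `read_ratio` parameter is also an assumption; for the separate read and write evaluations discussed earlier it would be set to 1.0 or 0.0.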
If a flash-memory storage system is adopted in a portable multimedia device such as a digital camera, most read and write operations access consecutive LBAs. For such applications, the multimedia device usually formats the flash-memory device with a FAT file system, because of its compatibility and simplicity, and stores files (e.g., pictures, videos, or MP3 files) on the storage system. Therefore, whether the device wants to read, write, or delete a file, the storage system receives a set of read or write operations that access a large number of consecutive LBAs (i.e., sequential accesses), accompanied by some read or write operations that access a small number of consecutive LBAs (i.e., random accesses), mainly to reach file allocation tables or file/directory information. Figure 5(b) is an example that simulates the access pattern of such applications. The access pattern can randomly generate sequential accesses mixed with some random accesses in various pre-configurable ratios.
Figure 5. Test patterns for average throughput and response time: (a) an access pattern of general-purpose applications; (b) an access pattern of multimedia applications, with accesses split between a metadata area and a data area (x-axis: i-th access; y-axis: LBA)
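The multimedia pattern can be approximated by interleaving small random accesses to a metadata area with long sequential accesses to a data area. The sketch below is one possible reading, with several assumptions that do not come from the paper: the metadata area occupies the first `metadata_lbas` addresses, each file is written as a run of maximal-length requests, the data area wraps around when it is exhausted, and the mixing ratio is expressed as a number of single-LBA metadata writes per file.

```python
import random


def multimedia_pattern(num_files, metadata_lbas, total_lbas, file_lbas,
                       max_request_lbas, metadata_ops_per_file=2, seed=0):
    """Return (op, start_lba, num_lbas) tuples that mimic file writes."""
    if not 0 < metadata_lbas < total_lbas:
        raise ValueError("metadata area must be non-empty and leave a data area")
    rng = random.Random(seed)
    pattern = []
    next_data_lba = metadata_lbas  # the data area starts after the metadata
    for _ in range(num_files):
        # Small random accesses, e.g. allocation table or directory updates.
        for _ in range(metadata_ops_per_file):
            pattern.append(("write", rng.randrange(metadata_lbas), 1))
        # Long sequential accesses that carry the file contents.
        remaining = file_lbas
        while remaining > 0:
            if next_data_lba >= total_lbas:
                next_data_lba = metadata_lbas  # wrap around the data area
            count = min(max_request_lbas, remaining, total_lbas - next_data_lba)
            pattern.append(("write", next_data_lba, count))
            next_data_lba += count
            remaining -= count
    return pattern
```

Raising `metadata_ops_per_file` relative to `file_lbas` shifts the mix toward random accesses, which is one way to realize the pre-configurable ratios mentioned above.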
Performance Stability. In terms of reads and writes, performance stability is also very important to a storage system, since better stability implies more predictable system performance. Mechanical storage devices such as hard disks have unstable performance, because their read and write performance heavily depends on the seek time and rotation time. On the other hand, electronic storage devices such as SRAM, DRAM, and flash memory have comparatively stable read and write performance by nature. However, the write-once property, the bulk erase property, and the limited erase cycles of each block make the performance of flash memory unstable, so that a management layer, i.e., the flash translation layer, must be adopted to manage address translation, garbage collection, and wear leveling, as stated in the previous sections. For example, a write operation to a flash-memory storage system may trigger live-page copies and block erases to reclaim free space, so that the time to finish the write operation varies. Suppose that an operation to write a 2KB page to an SLC flash memory [14] takes 200 µs for data programming, where the data transmission time is not considered. If this write operation triggers only one block erase, it takes at least 1.7 ms, i.e., 200 µs + 1.5 ms [14], which is 8.5 times the time to write a 2KB page without the involvement of any block erase.
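The arithmetic in this example can be checked with a short calculation. The sketch below uses only the two timings quoted in the text (200 µs to program a page and 1.5 ms to erase a block) and, like the text, ignores data transmission time and the cost of copying live pages, so it gives a lower bound; the function name is hypothetical.

```python
def min_write_time_us(block_erases=0, program_us=200, erase_us=1500):
    """Lower bound on one page write that triggers `block_erases` erases."""
    return program_us + block_erases * erase_us


plain = min_write_time_us(0)       # 200 us, no erase involved
with_erase = min_write_time_us(1)  # 200 us + 1500 us = 1700 us
slowdown = with_erase / plain      # 8.5
```

Each additional block erase adds another 1.5 ms under these assumptions, which is why the response time of writes can vary so widely.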
For time-critical applications, the performance stability of the flash-memory storage system plays a critical role in system design. In order to evaluate the performance stability of a flash-memory storage system, the standard deviation of the throughput $TH_S$ and that of the response time $RT_S$ are adopted, as shown in Equations 7-8, because they measure the variation of read/write operations in throughput and response time.
$$TH_S = \sqrt{\frac{\sum_{i=0}^{T-1} (D_i - TH_{avg})^2}{T}} \tag{7}$$
RT
S=
N −1
i=0