
This section explores the geometry of SSDs through a series of experiments. The experiments are conducted on a personal computer with an Intel Pentium 4 CPU (3.4 GHz) running Windows XP. To eliminate disturbance from the file system, we use the Windows API, i.e., ReadFile() and WriteFile(), to access the underlying storage devices. Using DeviceIoControl() with IOCTL_ATA_PASS_THROUGH as the parameter, we can send ATA commands to the storage devices directly. This allows us to impose special controls, such as DISABLE READ CACHE, DISABLE WRITE CACHE, or FLUSH WRITE CACHE, on the SSDs.
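To make the cache control concrete, the following is a minimal sketch (not the authors' tool) of how an ATA SET FEATURES command can be issued through IOCTL_ATA_PASS_THROUGH to disable the write cache. The device path, the timeout, and the feature subcodes used here are illustrative assumptions.

```c
/* Sketch: disable the SSD write cache with an ATA SET FEATURES command sent
 * through IOCTL_ATA_PASS_THROUGH. The device path and feature subcodes are
 * illustrative; 0x82 is assumed to be "disable write cache" and 0x55
 * "disable read look-ahead" per the ATA feature set. */
#include <windows.h>
#include <ntddscsi.h>
#include <stdio.h>

static BOOL ata_set_features(HANDLE dev, UCHAR feature)
{
    ATA_PASS_THROUGH_EX apt;
    DWORD returned;

    ZeroMemory(&apt, sizeof(apt));
    apt.Length             = sizeof(apt);
    apt.AtaFlags           = ATA_FLAGS_DRDY_REQUIRED;  /* non-data command */
    apt.TimeOutValue       = 10;                        /* seconds */
    apt.CurrentTaskFile[0] = feature;                   /* Features register */
    apt.CurrentTaskFile[6] = 0xEF;                      /* SET FEATURES opcode */

    return DeviceIoControl(dev, IOCTL_ATA_PASS_THROUGH,
                           &apt, sizeof(apt), &apt, sizeof(apt),
                           &returned, NULL);
}

int main(void)
{
    /* Placeholder path: the SSD under test is assumed to be PhysicalDrive1. */
    HANDLE dev = CreateFileA("\\\\.\\PhysicalDrive1",
                             GENERIC_READ | GENERIC_WRITE,
                             FILE_SHARE_READ | FILE_SHARE_WRITE,
                             NULL, OPEN_EXISTING, 0, NULL);
    if (dev == INVALID_HANDLE_VALUE)
        return 1;

    if (!ata_set_features(dev, 0x82))   /* 0x82: disable write cache */
        fprintf(stderr, "SET FEATURES failed: %lu\n", GetLastError());

    CloseHandle(dev);
    return 0;
}
```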

We evaluate the management overhead inside SSDs in terms of read/write response time. To achieve a precise measurement, the RDTSC (read time stamp counter) instruction is used to obtain a cycle count (which is incremented every clock cycle). Since the response time incurred by a garbage collection varies widely, the triggering of a garbage collection is detected based on the throughput.
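As an illustration of the timing methodology, the sketch below (our own, under the stated assumptions) wraps a raw ReadFile() call with the __rdtsc() compiler intrinsic and converts cycles to microseconds using the nominal 3.4 GHz clock of the test machine.

```c
/* Sketch: measure the response time of one raw read with RDTSC.
 * Assumes a fixed 3.4 GHz TSC rate (the Pentium 4 test machine). */
#include <windows.h>
#include <intrin.h>

#define CPU_HZ 3400000000ULL    /* 3.4 GHz */

double timed_read(HANDLE dev, void *buf, DWORD bytes, LONGLONG offset)
{
    LARGE_INTEGER pos;
    DWORD done;
    unsigned __int64 start, end;

    pos.QuadPart = offset;
    SetFilePointerEx(dev, pos, NULL, FILE_BEGIN);   /* seek on the raw device */

    start = __rdtsc();
    ReadFile(dev, buf, bytes, &done, NULL);         /* synchronous read */
    end = __rdtsc();

    return (double)(end - start) * 1e6 / (double)CPU_HZ;   /* microseconds */
}
```

A matching timed_write() wrapper around WriteFile() is assumed in the later sketches; note that on a raw device handle both the offset and the transfer size must be sector-aligned.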

To detect SSD geometry, we disable the read cache or write buffer so as to precisely assess how the FTL adopted in various SSDs operates over the underlying NAND flash memory for read/write requests. Table 1 summarizes the SSDs evaluated in our experiments. Since MLC SSDs are unstable in write performance, we focus on SLC SSDs when presenting our experimental results.

Table 1. Devices under test.

Brand       Model               Type   Size
Transcend   TS16GSSD25S-S       SLC    16 GB
Transcend   TS32GSSD25S-M       MLC    32 GB
SAMSUNG     MCBQE32G5MPP-0VA    SLC    32 GB
Mtron       MSP-SATA7525-032    SLC    32 GB
Intel       SSDSA2MH080G1GC     MLC    80 GB
OCZ         OCZSSD2-1C64G       MLC    64 GB
OCZ         OCZSSD2-1VTX60G     MLC    60 GB
4.2 Detecting Effective Page Size

4.2.1 Detection Method

When a write request is not aligned with the effective page size, one or two read-modify-write operations might be required, depending on the amount of request data. The experiment is conducted by issuing two update requests with adjacent starting addresses to the target SSD iteratively. For each iteration, the amount of updated data is incremented by 1 KB. Once the difference between the response times of the two requests exceeds a threshold, the effective page size can be detected.

Figure 4. Effective Page Size Detector.

As shown in Fig. 4, two possible cases might be encountered as the amount of updated data increases. When the amount of updated data x is smaller than the effective page size of the target SSD, as shown in Case 1, starting the update request at either 0 KB or 1 KB has no impact on the response time, since both require one read-modify-write operation. When the amount of updated data x is equal to the effective page size of the target SSD, as shown in Case 2, the request starting at 0 KB requires only one write operation. However, the request starting at 1 KB incurs two read-modify-write operations, which is time-consuming compared with a single write operation. Thus the effective page size can be detected by comparing the response times of two requests with adjacent starting addresses. Note that the write buffer must be disabled to obtain a precise measurement.
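A minimal sketch of this probe, building on the timing sketch above and assuming a hypothetical timed_write() helper (a WriteFile() wrapper timed with RDTSC) and an assumed, device-dependent threshold:

```c
/* Sketch of the effective-page-size probe: for each candidate amount x
 * (grown in 1 KB steps), time an update starting at 0 KB and one starting
 * at 1 KB; the first x where the unaligned request becomes clearly slower
 * is taken as the effective page size. THRESHOLD_US is an assumed tuning
 * parameter, and each point would normally be averaged over many runs. */
#define KB 1024
#define THRESHOLD_US 200.0      /* example threshold, device dependent */

size_t detect_page_size(HANDLE dev, char *buf, size_t max_kb)
{
    size_t x;
    for (x = 1; x <= max_kb; x++) {
        double t_aligned   = timed_write(dev, buf, (DWORD)(x * KB), 0 * KB);
        double t_unaligned = timed_write(dev, buf, (DWORD)(x * KB), 1 * KB);
        if (t_unaligned - t_aligned > THRESHOLD_US)
            return x * KB;      /* effective page size in bytes */
    }
    return 0;                   /* not detected within max_kb */
}
```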

4.2.2 Detection Results

Fig. 7(a) and 7(b) show the experimental results of effective page size detection for the Transcend TS16GSSD25S-S and the Samsung MCBQE32G5MPP-0VA. As shown in the figures, the response times of update requests starting at 0 KB and at 1 KB become clearly distinguishable when the amount of written data reaches 4 KB for the Transcend TS16GSSD25S-S and 16 KB for the Samsung MCBQE32G5MPP-0VA, respectively. We also conduct an experiment for read requests. As shown in Fig. 7(c), since read-modify-write has no impact on reads, there is no significant difference in read response times whether or not the request is aligned with the starting address of an effective page. However, for those target SSDs whose write buffer cannot be disabled, we must infer the effective page size from read operations. As shown in Fig. 7(d), a read request aligned with the starting address of an effective page has a shorter response time for the Mtron MSP-SATA7525-032 when the amount of requested data is fixed at 8 KB. This is because such a read request incurs an additional read operation if it is not aligned with an effective page.

Figure 5. Effective Block Size Detector.

4.3 Detecting Effective Block Size and Mapping Groups

Notably, even though effective blocks and mapping groups are two different things, we use these terms interchangeably here because their difference is insignificant in terms of geometry detection.

4.3.1 Detection Method

For a block-level mapping FTL, the overhead of live data copying is inevitable for a partial merge or a full merge [4]. However, when all the data in a data block are sequentially updated, a low-cost switch merge can be performed instead.

The experiment is conducted by issuing update requests to the target SSD iteratively. For each iteration, the amount of sequentially updated data is doubled. Once a switch merge is triggered by an update request, the effective block size can be detected.

As shown in Fig. 5, two possible cases might be encountered as the amount of sequentially updated data increases. When the amount of sequentially updated data x is smaller than the effective block size of the target SSD, as shown in Case 1, a partial merge is required to reclaim free space. Since a partial merge incurs live data copying, the effective throughput drops. When the amount of sequentially updated data x is equal to the effective block size of the target SSD, as shown in Case 2, a switch merge can be adopted to reclaim free space without any live data movement. Thus the best effective throughput can be achieved.

To ensure that each request is mapped to a different logical block, we separate subsequent requests with enough space, e.g., 64 MB in our experiment. As a result, log blocks are consumed quickly and a garbage collection is triggered frequently to reclaim free space under a one-to-one mapping scheme. Under a many-to-one mapping scheme, the merge operation is more complex, and the cost of live data copying for a garbage collection can thus be observed easily.
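The probe can be sketched as follows, again assuming the hypothetical timed_write() wrapper; the 64 MB stride and the doubling of the request size mirror the description above, while ROUNDS and the scanned size range are assumed parameters.

```c
/* Sketch of the effective-block-size probe: sequentially update x bytes per
 * request, keep consecutive requests 64 MB apart so they fall into different
 * logical blocks, and report the throughput for each candidate size x
 * (doubled every round). Throughput stops improving once x reaches the
 * effective block size and switch merges take over. */
#define KB 1024
#define MB (1024 * 1024)
#define STRIDE (64LL * MB)      /* spacing between consecutive requests */
#define ROUNDS 128              /* requests measured per candidate size */

static double throughput_mb_s(HANDLE dev, char *buf, size_t req_bytes)
{
    double total_us = 0.0;
    int i;
    for (i = 0; i < ROUNDS; i++)
        total_us += timed_write(dev, buf, (DWORD)req_bytes, (LONGLONG)i * STRIDE);
    return ((double)req_bytes * ROUNDS / MB) / (total_us / 1e6);   /* MB per second */
}

void scan_block_sizes(HANDLE dev, char *buf)
{
    size_t x;
    for (x = 16 * KB; x <= 8 * MB; x *= 2)      /* doubled each iteration */
        printf("%7u KB : %6.2f MB/s\n",
               (unsigned)(x / KB), throughput_mb_s(dev, buf, x));
}
```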

4.3.2 Detection Results

Fig. 7(e)-7(g) show the experimental results of effective block size detection. As shown in the figures, the throughput improvement under different request sizes is clearly distinguishable. The throughput improves dramatically as the request size increases, and it reaches its best level and becomes steady once the request size exceeds a certain amount of data, owing to the efficiency of switch merge. Therefore, we conclude that the effective block sizes of the Transcend TS16GSSD25S-S, the Samsung MCBQE32G5MPP-0VA, and the Mtron MSP-SATA7525-032 are 1 MB, 4 MB, and 4 MB, respectively.

4.4 Detecting Mapping Zones

4.4.1 Detection Method

The experiment is conducted by issuing two read requests A and B iteratively. For each iteration, the starting address of read request A is fixed, while the starting address of read request B is increased by 1 MB. Once the two requests access different zones, mapping table thrashing is incurred, and the response time of read request B becomes longer from that point on.

As shown in Fig. 6, two possible cases might be encountered as the starting address of read request B increases. When the distance between the starting addresses of the two read requests is smaller than the zone size, as shown in Case 1, read requests A and B access the same zone, so no mapping table reloading is required. When the distance between the starting addresses of the two read requests is larger than the zone size, as shown in Case 2, read requests A and B must access different zones. As a result, mapping table reloading is required. In our experiment, we repeatedly issue read requests A and B to trigger mapping table thrashing, which makes the overhead of mapping table reloading more apparent.
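A sketch of this probe, reusing the timed_read() wrapper above; REPEAT and the 512-byte request size are assumed parameters (512 bytes matches the request size reported in the results below).

```c
/* Sketch of the zone probe: read request A is fixed at address 0, read
 * request B starts 'dist' bytes away and moves 1 MB farther each iteration.
 * Interleaving A and B repeatedly makes the mapping-table reload cost show
 * up in B's average response time once dist crosses the zone size. */
#define MB (1024 * 1024)
#define SECTOR 512
#define REPEAT 32               /* A/B pairs issued per distance */

void scan_zone_size(HANDLE dev, char *buf, LONGLONG max_dist)
{
    LONGLONG dist;
    for (dist = 1 * MB; dist <= max_dist; dist += 1 * MB) {
        double b_total_us = 0.0;
        int i;
        for (i = 0; i < REPEAT; i++) {
            timed_read(dev, buf, SECTOR, 0);                  /* request A (fixed)  */
            b_total_us += timed_read(dev, buf, SECTOR, dist); /* request B (moving) */
        }
        printf("distance %5lld MB : B avg %.1f us\n",
               dist / MB, b_total_us / REPEAT);
    }
}
```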

4.4.2 Detection Results

As shown in Fig. 7(h), when the distance between read request A and read request B is shorter than 422 MB, the response time of reading 512 bytes of data from address B is clearly better. When the distance between read request A and read request B exceeds 422 MB, the response time of reading 512 bytes of data from address B increases by a steady amount due to the overhead of mapping table loading. Therefore, we conclude that the zone size of the Transcend TS16GSSD25S-S is 422 MB.

Figure 6. Zone Size Detection.

5 Conclusion

The management of flash memory in solid-state disks imposes non-uniform response times on random sector accesses. Being aware of the geometry information inside solid-state disks can help the host system software adjust data placement to match the host write pattern to the storage device characteristics. This work demonstrates a collection of black-box tests that successfully detect the geometry of flash storage devices. We believe that these techniques are beneficial not only for enhancing existing system software but also for designing new file systems.

References

[1] M.-L. Chiang, P. C. H. Lee, and R.-C. Chang. Using data clustering to improve cleaning performance for flash memory. Software: Practice and Experience, 29(3):267–290, 1999.

[2] S.-W. Lee, D.-J. Park, T.-S. Chung, D.-H. Lee, S. Park, and H.-J. Song. A log buffer-based flash translation layer using fully-associative sector translation. ACM Trans. Embed. Comput. Syst., 6(3):18, 2007.

[3] Y. Lee, J.-S. Kim, and S. Maeng. ReSSD: a software layer for resuscitating SSDs from poor small random write performance. In Proceedings of the 2010 ACM Symposium on Applied Computing, SAC '10, pages 242–243, New York, NY, USA, 2010. ACM.

[4] C. Park, W. Cheon, J. Kang, K. Roh, W. Cho, and J.-S. Kim. A reconfigurable FTL (flash translation layer) architecture for NAND flash-based applications. ACM Trans. Embed. Comput. Syst., 7(4):1–23, 2008.

[5] Samsung Electronics Company. K9MDG08U5M 4G * 8 Bit MLC NAND Flash Memory Data Sheet, 2008.

[6] J. Schindler, J. Griffin, C. Lumb, and G. Ganger. Track-aligned extents: matching access patterns to disk drive characteristics. In Conference on File and Storage Technologies, 2002.

Figure 7. Experimental Results. (a) Transcend, Page = 4 KB; (b) Samsung, Page = 16 KB; (h) Transcend, Zone Size = 422 MB. Axes: Number of Read Request vs. Response Time (ms).
