
0.6 Experimental Results

0.6.4 Macro-benchmark Results

This part of our experiment considers a real-life workload, collected from 54 sensors deployed in the Intel Berkeley Research Lab between February 28th and April 5th, 2004. The Mica2Dot sensors with weather boards collected weather data once every 31 seconds, producing a log of about 2.3 million readings. We extracted the temperature readings and adapted them for index operations.

This experiment uses the following settings for fat lists: the turnstile size is 8, the maximum level is five, the number of slots is 40, the number of keys is 30, and the pointer pool size is 7. The probability parameter p is 0.25, as suggested in [22]. The page size of µ-tree is 512 bytes.
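To make the roles of the probability parameter and the maximum level concrete, the following is a minimal sketch of the level-drawing scheme described for skip lists in [22], using p = 0.25 and a maximum level of five. Whether fat lists draw levels in exactly this way is an assumption; the function names are hypothetical and chosen only for illustration.

```python
import random

P = 0.25        # probability parameter from the settings above
MAX_LEVEL = 5   # maximum level from the settings above


def random_level(rng, p=P, max_level=MAX_LEVEL):
    """Draw a level in 1..max_level; each extra level is kept with probability p."""
    level = 1
    while level < max_level and rng.random() < p:
        level += 1
    return level


def expected_fraction_at_least(level, p=P):
    """Fraction of objects expected to reach at least the given level."""
    return p ** (level - 1)
```

Under this scheme roughly one object in four reaches level 2 and one in sixteen reaches level 3, so the upper levels hold few objects.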

First, we inserted the temperature data in the order in which it was collected. If a value that had already been inserted appeared again, we treated it as an update. After inserting all the data, we randomly drew start keys and ranges from the temperature data to issue range queries.
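The two phases of this workload can be sketched as follows. This is an illustrative driver only, not the code used in the experiment: the function names and the `max_range` parameter are hypothetical, and the readings are assumed to be available as a list of temperature values in collection order.

```python
import random


def build_insert_phase(readings):
    """Classify each reading, in collection order, as an insert or an update."""
    seen = set()
    ops = []
    for key in readings:
        if key in seen:
            ops.append(("update", key))   # value already indexed: treat as update
        else:
            seen.add(key)
            ops.append(("insert", key))   # first occurrence: a real insertion
    return ops


def build_range_queries(readings, count, max_range, rng):
    """Draw start keys from the data and pair each with a random range length."""
    keys = sorted(set(readings))
    return [(rng.choice(keys), rng.randint(1, max_range)) for _ in range(count)]
```

Because temperature values repeat often, a large share of the operations in the first phase would be updates rather than insertions.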

Figure 21 shows the results for fat lists and µ-tree. In the insertion phase, the read time of fat lists is higher than that of µ-tree, because query performance in µ-tree depends on the tree height, and the height of µ-tree grows slowly. However, the write and erase times of µ-tree are much higher than those of fat lists, because every leaf node in µ-tree occupies a page, which is much larger than an object in fat lists. As a result, µ-tree consumes more flash space than fat lists, and garbage collection in µ-tree is much more frequent. In addition, µ-tree needs to rewrite the entire path on every node update, which further increases its write time relative to fat lists.

In the range query phase, µ-tree performs worse than fat lists for both 1,000 and 10,000 queries. This is because, after finding the start key, µ-tree must read from the root again to reach each subsequent node in the range, whereas fat lists only need to read along the lowest level. As long as the overhead of reading the objects that a soft pointer refers to in fat lists is less than that of reading the whole path in µ-tree, fat lists outperform µ-tree in range queries. Consequently, the read time of fat lists in this phase is much lower than that of µ-tree.
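The argument above can be expressed as a simple read-counting model. This is a simplified sketch, not a measurement: it assumes that µ-tree performs one full root-to-leaf walk per visited node and that each soft-pointer dereference in fat lists costs a fixed number of probes. All function and parameter names are hypothetical.

```python
def mu_tree_range_reads(height, leaves_visited):
    """Page reads if every visited leaf is reached by a fresh root-to-leaf walk."""
    return height * leaves_visited


def fat_list_range_reads(search_reads, objects_visited, probes_per_pointer):
    """Object reads: one search for the start key, then a walk along the lowest level."""
    return search_reads + objects_visited * probes_per_pointer


def fat_list_wins(height, page_read_cost, probes_per_pointer, object_read_cost):
    """Per-step condition from the text: probing cost below whole-path cost."""
    return probes_per_pointer * object_read_cost < height * page_read_cost
```

For example, with a hypothetical tree height of 3 and 2 probes per soft pointer at equal read cost, each step along the lowest level of fat lists is cheaper than a path traversal in µ-tree; with a height of 1 and 4 probes it would not be.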

0.6.5 Discussion

The evaluation results in prior sections present the performance characteristics of fat lists.

Using a large maximum level can increase skip distances, but if too few objects reach the high levels, long skips are not possible and the search time spent at the high levels increases. Avoiding very large turnstiles reduces the total number of useless probes, but using very small turnstiles results in too many spare blocks, which seriously increases the garbage-collection frequency. Therefore, the settings of the maximum level and the turnstile size are important and depend on the workload.

0.7 Conclusions

Dealing with crucial limitations on computational resources is a fundamental design issue in embedded devices. Efficient data indexing not only provides fast data retrieval, but also prolongs battery life. Due to the write-once nature of flash memory, a major challenge of data indexing in flash memory is that data updates and pointer updates recursively trigger further updates. Previous studies tackle this issue using logical pointers, at the cost of large RAM-space requirements and a lengthy initialization scan. This study introduces a new pointer design, called soft pointers, and a novel index structure, called fat lists, that uses these soft pointers. A soft pointer allows de-referencing to probe a bounded number of physical locations in NOR flash. As a result, data objects can be moved around in NOR flash without invalidating a pointer, largely simplifying space management in NOR flash.

Even better, the probes made by de-referencing a soft pointer provide opportunities for forward random skips in soft lists, greatly speeding up search operations. Enlarging the index objects reduces the frequency of structural modifications and speeds up range queries. The strategies of delayed split and lazy merge not only reduce structural modifications but also improve object space utilization and thus improve garbage-collection efficiency.

This study examines the performance characteristics of fat lists using a series of experiments based on a synthesized workload and a real-life workload. Results show that fat lists, taking advantage of the asymmetry between very fast NOR flash reads and extremely slow writes and erases, achieve good performance for read-write operations. More importantly, fat lists save precious erasure cycles of flash blocks and extend the lifespan of flash memory.

Bibliography

[1] A. Hunter, “A Brief Introduction to the Design of UBIFS,” http://www.linux-mtd.infradead.org/doc/ubifs_whitepaper.pdf, 2008.

[2] B. Pfaff, “Performance Analysis of BSTs in System Software,” ACM SIGMETRICS Performance Evaluation Review, Vol. 32, Issue 1, 2004.

[3] C. H. Wu, T. W. Kuo, and L. P. Chang, “An Efficient B-Tree Layer Implementation for Flash-Memory Storage Systems,” ACM Transactions on Embedded Computing Systems, Volume 6, Issue 3, 2007.

[4] C. H. Wu, T. W. Kuo, and L. P. Chang, “The Design of Efficient Initialization and Crash Recovery for Log-Based File Systems over Flash Memory,” ACM Transactions on Storage, Volume 2, Issue 4, 2006.

[5] C. Park, W. Cheon, J. Kang, K. Roh, W. Cho, and J. Kim, “A Reconfigurable FTL (Flash Translation Layer) Architecture for NAND Flash-Based Applications,” ACM Transactions on Embedded Computing Systems, Vol. 7, Issue 4, 2008.

[6] D. Agrawal, D. Ganesan, R. Sitaraman, Y. Diao, and S. Singh, “Lazy-Adaptive Tree: An Optimized Index Structure for Flash Devices,” in Proceedings of the 35th International Conference on Very Large Data Bases, 2009.

[7] D. W. Kang, D. W. Jung, J. U. Kang, and J. S. Kim, “µ-tree: an ordered index structure for NAND flash memory,” in Proceedings of the 7th ACM/IEEE International Conference on Embedded Software, 2007.

[8] E. Gal and S. Toledo, “A Transactional Flash File System for Microcontrollers,” in Proceedings of the USENIX Technical Conference, 2005.

[9] F. Buchholz, “The Structure of the Reiser File System,” http://homes.cerias.purdue.edu/~florian/reiser/reiserfs.php, 2006.

[10] J. Katcher, “PostMark: A New Filesystem Benchmark,” Technical Report TR3022, Network Appliance, http://www.netapp.com/techlibrary/3022.html, 1997.

[11] K. S. Yim, J. H. Kim, and K. Koh, “A Fast Start-Up Technique for Flash Memory Based Computing Systems,” in Proceedings of the ACM Symposium on Applied Computing, 2005.

[12] L. P. Chang and T. W. Kuo, “Efficient Management for Large-Scale Flash-Memory Storage Systems with Resource Conservation,” ACM Transactions on Storage, Volume 1, Issue 4, 2005.

[13] O. Rodeh, “B-trees, Shadowing, and Clones,” ACM Transactions on Storage, Vol. 3, Issue 4, 2008.

[14] S. W. Lee, D. J. Park, T. S. Chung, D. H. Lee, S. W. Park, and H. J. Song, “A Log Buffer-Based Flash Translation Layer Using Fully-Associative Sector Translation,” ACM Transactions on Embedded Computing Systems, Vol. 6, Issue 3, 2007.

[15] Samsung Electronics Company, “K8C1215ETM 32M x16 MLC NOR Flash Data Sheet,” 2006.

[16] Samsung Electronics Company, “K9GAG08U0M 2G * 8 Bit MLC NAND Flash Memory Data Sheet,” 2006.

[17] S. Lin, D. Zeinalipour-Yazti, V. Kalogeraki, D. Gunopulos, and W. A. Najjar, “Efficient Indexing Data Structures for Flash-Based Sensor Devices,” ACM Transactions on Storage, Volume 2, Issue 4, 2006.

[18] S. Nath and A. Kansal, “FlashDB: Dynamic Self-Tuning Database for NAND Flash,” in Proceedings of the 6th International Conference on Information Processing in Sensor Networks, 2007.

[19] S. Lee and B. Moon, “Design of Flash-Based DBMS: an In-Page Logging Approach,” in Proceedings of the 2007 ACM SIGMOD International Conference on Management of Data, 2007.

[20] T. H. Cormen, C. E. Leiserson, and R. L. Rivest, “Introduction to Algorithms,” The MIT Press, 1990.

[21] Y. Li, B. He, Q. Luo, and K. Yi, “Tree Indexing on Flash Disks,” in Proceedings of the 2009 IEEE International Conference on Data Engineering, 2009.

[22] W. Pugh, “Skip Lists: A Probabilistic Alternative to Balanced Trees,” Communications of the ACM, Vol. 33, No. 6, 1990.
