5.3.1 Workload Characteristics
The second experiment adopted file-operation workloads from three typical Android applications: the Facebook client, the Gmail client, and the Chrome browser. We captured the file-operation traces of the three applications using the following procedure. We first cleared the storage cache of each application and started systemtap to collect file-operation traces. We then launched the applications and used them to emulate users’ daily activities, such as checking the news feed on Facebook, checking and sending e-mails, and surfing the web, until the applications had written a sufficiently large amount of file data to the flash storage. The collected traces, consisting of calls to fopen(), fclose(), fread(), fwrite(), rename(), unlink(), fdatasync(), and fsync(), were then replayed on the original ext4 file system and on the modified ext4 file system with our eager synching for performance evaluation.
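The replay step can be illustrated with a minimal sketch. It is not the replayer used in the experiments: the record layout `(op, path, arg)`, the function name `replay`, and the sample trace are all hypothetical, and the sketch uses unbuffered `os`-level calls in place of fopen()/fread()/fwrite().

```python
import os
import tempfile

# Fall back to fsync() on platforms that lack fdatasync().
_fdatasync = getattr(os, "fdatasync", os.fsync)

def replay(trace, root):
    """Replay hypothetical (op, path, arg) records under `root`.

    Returns the total number of bytes written.
    """
    fds = {}        # path -> open file descriptor
    written = 0
    for op, path, arg in trace:
        full = os.path.join(root, path)
        if op == "open":
            fds[path] = os.open(full, os.O_RDWR | os.O_CREAT, 0o644)
        elif op == "write":                       # arg = byte count
            written += os.write(fds[path], b"\0" * arg)
        elif op == "read":                        # arg = byte count
            os.read(fds[path], arg)
        elif op == "fsync":
            os.fsync(fds[path])
        elif op == "fdatasync":
            _fdatasync(fds[path])
        elif op == "close":
            os.close(fds.pop(path))
        elif op == "rename":                      # arg = new name
            os.rename(full, os.path.join(root, arg))
        elif op == "unlink":
            os.unlink(full)
        else:
            raise ValueError(f"unknown op: {op}")
    for fd in fds.values():                       # close anything left open
        os.close(fd)
    return written

# A made-up trace: a small synchronous write followed by rename and unlink.
SAMPLE_TRACE = [
    ("open", "db-journal", None),
    ("write", "db-journal", 2048),
    ("fsync", "db-journal", None),
    ("close", "db-journal", None),
    ("rename", "db-journal", "db-journal.old"),
    ("unlink", "db-journal.old", None),
]

if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as scratch:
        print(replay(SAMPLE_TRACE, scratch))
```

Running the same trace against a directory on each file system under test, while measuring elapsed time, gives a throughput figure that can be compared across the two configurations.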
Table 5.3 shows the characteristics of the file-accessing behaviors of the three applications.
The read-write ratios of Facebook and Gmail were very close, while Chrome performed file writes most of the time. The file-write sizes were small, between 2 KB and 3 KB, which is a sign of random file writing because most random writes are small. In particular, Chrome touched the largest number of different files among the three applications, and these files belonged to the cache of web pages and image files. Facebook and Gmail made many fsync() calls, which were mainly contributed by operations on SQLite database files. In contrast, Chrome produced fewer fsync() calls, and these calls came from synching the web-page cache files.

                           Write               Amount of data   Avg. block write      Sequential
                           performance         written (MB)     req. size (sectors)   ratio
SQLite benchmark†          78 (45) TPS         41.1 (38.7)      23.0 (10.1)           65.03% (0%)
File-accessing benchmark‡  1 (0.6) MB/s        34.9 (39.4)      46.4 (18.6)           99.02% (0%)
Facebook                   1.19 (0.91) MB/s    91.8 (85.0)      91.15 (45.73)         59.02% (1.23%)
Gmail                      2.08 (1.69) MB/s    292.2 (306.2)    70.99 (25.74)         17.15% (0.13%)
Chrome                     5.05 (3.80) MB/s    146.0 (149.8)    126.05 (63.02)        14.44% (3.49%)
Table 5.4: Summary of the experimental results. †The insert test. ‡The random write test.
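The workload characteristics discussed in this section (read-write ratio, average write size, number of distinct files, and number of sync calls) could be derived from a file-operation trace along the following lines. This is only a sketch: the record layout `(op, path, nbytes)` is hypothetical, and the read-write ratio is assumed here to be bytes read divided by bytes written, which may differ from the definition used for Table 5.3.

```python
def summarize(trace):
    """Summarize a trace of hypothetical (op, path, nbytes) records."""
    read_bytes = write_bytes = writes = syncs = 0
    files = set()
    for op, path, nbytes in trace:
        files.add(path)
        if op == "read":
            read_bytes += nbytes
        elif op == "write":
            write_bytes += nbytes
            writes += 1
        elif op in ("fsync", "fdatasync"):
            syncs += 1
    return {
        # Assumed definition: bytes read per byte written.
        "read_write_ratio": read_bytes / write_bytes if write_bytes else None,
        "avg_write_size": write_bytes / writes if writes else None,
        "distinct_files": len(files),
        "sync_calls": syncs,
    }

if __name__ == "__main__":
    demo = [
        ("read", "a", 4096),
        ("write", "a", 2048),
        ("write", "b", 2048),
        ("fsync", "a", 0),
    ]
    print(summarize(demo))
```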
5.3.2 Evaluation Results
Figure 5.3(a) shows that write throughput benefited significantly from eager synching under the workloads of all three applications. The improvement was the largest for the Chrome workload, achieving almost 30%. As in the micro-benchmark, the performance improvement was contributed by the reduced write randomness and the enlarged write request size.
As shown in Table 5.3, even though Chrome produced the fewest fsync() calls, it accessed the largest number of different files. Most of these files were local copies of web pages and image files. Figure 5.3(b) shows the disk write pattern under the original ext4 synching, in which the write requests to the small files increased the randomness of the write pattern. Figure 5.3(c) shows that eager synching redirected many of the synchronous write requests to the journal, and thus the write pattern appears much more sequential than that in Figure 5.3(b). As for the average block-level write request size, we found that it increased from 63 sectors to 126 sectors, almost doubling.
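To make the two block-level metrics concrete, the sketch below computes a sequential ratio and an average request size from a list of write requests. The definition of "sequential" is an assumption made for illustration (a request counts as sequential when it starts at the sector where the previous request ended); the function name and the input layout `(start_sector, num_sectors)` are likewise hypothetical and may not match how Table 5.4 was produced.

```python
def write_pattern_stats(requests):
    """Return (sequential_ratio, avg_request_size_in_sectors).

    `requests` is a list of (start_sector, num_sectors) in issue order.
    Assumed definition: a request is sequential if it begins exactly
    where the previous request ended.
    """
    if not requests:
        return 0.0, 0.0
    sequential = 0
    total_sectors = 0
    prev_end = None
    for start, length in requests:
        if prev_end is not None and start == prev_end:
            sequential += 1
        prev_end = start + length
        total_sectors += length
    return sequential / len(requests), total_sectors / len(requests)

if __name__ == "__main__":
    # Three back-to-back requests followed by one jump elsewhere.
    print(write_pattern_stats([(0, 8), (8, 8), (16, 8), (100, 8)]))  # (0.5, 8.0)
```

Under this definition, redirecting scattered synchronous writes to a contiguous journal area raises the ratio, and merging adjacent requests raises the average request size.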
Eager synching also reduced the total amount of data written by about 5% under the Gmail and Chrome workloads, while it increased the amount by 6% under the Facebook workload. Nevertheless, the slight increase did not noticeably diminish the benefit of eager synching, as flash storage is very fast at sequential writes.
Chapter 6
Conclusion
Our research proposes a fast file-syncing mechanism called eager syncing. It addresses the problem that frequent file synchronization produces a pattern of small, random writes to the underlying flash storage. Such writes are a weak point of flash memory and therefore result in poor performance.
With eager syncing, syncing a file does not directly write back its data, which are small and scattered. Instead, the data are gathered and written sequentially into a log space. This gives the I/O scheduler a better chance to sort and merge I/O requests, so that the flash storage is presented with larger and more sequential requests. The data to be synced are also chosen carefully so that no critical data are missing, which would otherwise cause inconsistency.
Eager syncing is implemented in ext4. Our experiments show that the write pattern becomes more sequential than before, which yields an evident performance gain, and that the amount of data written also decreases in some cases.
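The core idea summarized in this chapter can be illustrated with a toy model. It is not the ext4 implementation: all function names are hypothetical, blocks are represented only by their addresses, and metadata handling, checkpointing, and recovery are omitted. The model contrasts writing dirty blocks back to their scattered home locations with appending them contiguously to a log area, and then merges adjacent requests as an I/O scheduler might.

```python
def in_place_sync(dirty_blocks):
    """Write each dirty block back to its home address (scattered)."""
    return [(home, 1) for home in dirty_blocks]

def log_sync(dirty_blocks, log_start):
    """Append the same blocks contiguously to a log area at `log_start`."""
    return [(log_start + i, 1) for i in range(len(dirty_blocks))]

def merge_adjacent(requests):
    """Merge back-to-back (start, length) requests into larger ones."""
    merged = []
    for start, length in requests:
        if merged and merged[-1][0] + merged[-1][1] == start:
            last_start, last_len = merged[-1]
            merged[-1] = (last_start, last_len + length)
        else:
            merged.append((start, length))
    return merged

if __name__ == "__main__":
    dirty = [900, 12, 455, 77]                          # made-up home addresses
    print(merge_adjacent(in_place_sync(dirty)))         # four separate requests
    print(merge_adjacent(log_sync(dirty, 5000)))        # [(5000, 4)]
```

In this toy case the in-place path issues four one-block requests at unrelated addresses, whereas the log path collapses into a single four-block sequential request.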