Chapter 3 Design and Implementation
3.1 Background
3.1.2 Intelligent Write-Back Policy
Modern operating systems write back dirty pages periodically or when the number of free pages falls below a specific threshold (i.e., under memory pressure). On systems with non-volatile main memory, dirty pages are already persistent and thus need not be written back to disk periodically. Instead, they need to be written back only under memory pressure or during sync operations. Currently, Linux uses a file-by-file write-back policy, which scans the list of dirty inodes and submits the dirty pages of each inode to the IO subsystem. The rationale behind this policy is to reduce the number of out-of-date files after a power outage or system crash. Assume that 100 files are updated and each file has 10 dirty pages in memory. If the system crashes after 500 dirty pages have been written back to disk, it is better to have written all the dirty pages of 50 files than 5 dirty pages of every file.
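The arithmetic of this example can be checked with a small sketch. The function name and the round-robin alternative (one page per file per pass) are assumptions made only for illustration; they do not describe how Linux orders its write-back.

```python
def up_to_date_files(num_files, pages_per_file, pages_written, file_by_file):
    """Count files whose dirty pages have all reached the disk before a crash."""
    if file_by_file:
        # Files are flushed one after another, so every full group of
        # pages_per_file written pages completes exactly one file.
        return min(num_files, pages_written // pages_per_file)
    # Hypothetical round-robin order: one page of each file per pass.
    full_passes, remainder = divmod(pages_written, num_files)
    if full_passes >= pages_per_file:
        return num_files
    if full_passes == pages_per_file - 1:
        # Files only complete during the final pass.
        return remainder
    return 0


file_by_file = up_to_date_files(100, 10, 500, file_by_file=True)
round_robin = up_to_date_files(100, 10, 500, file_by_file=False)
print(file_by_file, round_robin)
```

With the numbers from the text, the file-by-file order leaves 50 files fully up to date, whereas the interleaved order leaves none.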
However, this policy may write back recently updated pages, which has two drawbacks. First, writing back such pages does not relieve memory pressure, since these pages will not be reclaimed by the page replacement policy. One purpose of writing back dirty pages is to reclaim them so as to maintain a reasonable number of free pages in the system. In Linux, all pages belonging to user processes and the page cache are grouped into two lists, the active list and the inactive list. The former includes pages that have been accessed recently, while the latter contains pages that have not been accessed for a period of time. The file-by-file policy may write back dirty pages in the active list, even though most of these pages have been used recently. Second, according to temporal locality, these pages are likely to be marked dirty again soon after they are written back. Thus, writing back such pages is of little use, because they may need to be written back again soon. Some UNIX systems, such as Solaris, do not have this problem: they only write back dirty pages that have not been used recently.
A problem common to the write-back policies of existing UNIX operating systems (including Linux) is that they ignore the disk locations of the dirty pages when submitting them to the IO subsystem. Although an IO subsystem can sort the requests submitted to it, there may still be a significant amount of seek and rotation delay among the dirty pages.
In this paper, we propose an intelligent write-back policy, which considers the recency as well as the disk locations of the dirty blocks to reduce IO traffic, seek time, and rotation delay. To reduce IO traffic, the proposed policy only writes back dirty pages in the inactive list.
To reduce the seek time, we divide a disk into a number of zones, each of which is a set of contiguous blocks on the disk, and write back dirty pages in a zone-by-zone manner.
The dirty page information is recorded in a set of identical data structures called zone information tables, each of which corresponds to a zone. When a page becomes dirty and inactive, we record the page in the corresponding zone information table according to the disk block number of the page.
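As a minimal sketch of this bookkeeping, assume fixed-size zones so that the zone index is the block number divided by the zone size. The constant BLOCKS_PER_ZONE, the helper names, and the use of a plain set as the table are all hypothetical; the text does not fix a zone size or a table layout at this point.

```python
from collections import defaultdict

BLOCKS_PER_ZONE = 1024  # assumed zone size; the text does not give a value


def zone_of(block_number, blocks_per_zone=BLOCKS_PER_ZONE):
    """Return the index of the zone that contains the given disk block."""
    return block_number // blocks_per_zone


# One table per zone; here a table is simply the set of dirty block numbers.
zone_tables = defaultdict(set)


def record_dirty_inactive(block_number):
    """Record a dirty, inactive page under the zone of its disk block."""
    zone_tables[zone_of(block_number)].add(block_number)


for block in (5, 6, 1030, 4096):
    record_dirty_inactive(block)

print({zone: sorted(blocks) for zone, blocks in zone_tables.items()})
```

Blocks 5 and 6 land in zone 0, block 1030 in zone 1, and block 4096 in zone 4, so pages that are close on disk end up in the same table.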
Each time the write-back procedure is invoked, the proposed policy selects a zone and writes back the dirty pages in that zone. This reduces the seek time because the disk blocks of the written-back dirty pages are close to one another. To further reduce the rotation delay, the policy selects the zone with the maximum Average Segment Length (ASL), which is defined in Equation 1. A segment is a set of contiguous dirty blocks in a zone, and there is generally no rotation delay between two contiguous blocks. Therefore, this policy tries to select a zone that contains more contiguous dirty pages, reducing both the seek time and the rotation delay of the IO traffic caused by dirty page write-back.
$$\text{Average Segment Length (ASL)} = \frac{\text{Number of Dirty Pages in the Zone}}{\text{Number of Segments in the Zone}} \tag{1}$$
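Equation 1 can be illustrated with a short sketch. It assumes that each dirty page maps to exactly one disk block, so that a segment is a run of consecutive block numbers; the function names are illustrative only.

```python
def count_segments(dirty_blocks):
    """Count runs of consecutive block numbers among the dirty blocks."""
    segments = 0
    previous = None
    for block in sorted(set(dirty_blocks)):
        # A new segment starts whenever the block does not directly
        # follow the previous dirty block.
        if previous is None or block != previous + 1:
            segments += 1
        previous = block
    return segments


def average_segment_length(dirty_blocks):
    """ASL = dirty pages / segments (0.0 for a zone without dirty pages)."""
    unique = set(dirty_blocks)
    if not unique:
        return 0.0
    return len(unique) / count_segments(unique)


# Segments: [10, 11, 12], [20, 21], [40]
asl = average_segment_length([10, 11, 12, 20, 21, 40])
print(asl)
```

Here six dirty pages form three segments, giving an ASL of 2.0.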
The zone information table, shown in Figure 3.3, records the number of dirty pages, the number of segments, a segment list containing all segments in the zone, a page list containing all dirty pages of the inactive list that fall in the zone, and the length (i.e., the Average Segment Length).
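One possible in-memory layout for such a table is sketched below. The class name, field names, and the representation of a segment as a (first block, length) pair are assumptions for illustration; the actual layout is the one in Figure 3.3.

```python
from dataclasses import dataclass, field


@dataclass
class ZoneInfo:
    """Hypothetical layout of one zone information table."""
    page_list: set = field(default_factory=set)       # dirty inactive blocks
    segment_list: list = field(default_factory=list)  # (first_block, length)

    @property
    def dirty_pages(self):
        """Number of dirty pages recorded for the zone."""
        return len(self.page_list)

    @property
    def segments(self):
        """Number of segments currently in the segment list."""
        return len(self.segment_list)

    @property
    def length(self):
        """Average Segment Length, or 0.0 when the zone has no segments."""
        return self.dirty_pages / self.segments if self.segments else 0.0

    def rebuild_segments(self):
        """Recompute the segment list from the page list."""
        self.segment_list = []
        for block in sorted(self.page_list):
            if self.segment_list:
                first, length = self.segment_list[-1]
                if block == first + length:
                    # The block extends the last segment by one.
                    self.segment_list[-1] = (first, length + 1)
                    continue
            self.segment_list.append((block, 1))


zone = ZoneInfo(page_list={100, 101, 102, 200})
zone.rebuild_segments()
print(zone.dirty_pages, zone.segments, zone.segment_list, zone.length)
```

Rebuilding the segment list from scratch keeps the sketch short; an implementation that updates segments incrementally on each insert or removal would avoid the repeated sort.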
Figure 3.3 The Zone Information Table
[Figure: zones labeled Zone 1, Zone 2, Zone 3, Zone 4, ..., Zone N]
Figure 3.4 Zones with Different Average Segment Lengths
Figure 3.4 shows an example of the zone selection. Zone 4 and zone 6 each contain 7 dirty pages, but the dirty pages of zone 6 are more contiguous (i.e., zone 6 has a larger ASL) than those of zone 4. Therefore, zone 6 is selected to be written back.
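The selection step can be sketched as follows. The block numbers are invented for illustration and are not taken from Figure 3.4; they are only chosen so that both zones hold 7 dirty pages with different degrees of contiguity. One page per block is assumed.

```python
def asl(dirty_blocks):
    """Average Segment Length of a zone, given its dirty block numbers."""
    blocks = sorted(set(dirty_blocks))
    if not blocks:
        return 0.0
    # Every gap between neighbouring dirty blocks starts a new segment.
    segments = 1 + sum(1 for a, b in zip(blocks, blocks[1:]) if b != a + 1)
    return len(blocks) / segments


def select_zone(zones):
    """Return the id of the zone with the largest ASL."""
    return max(zones, key=lambda zone_id: asl(zones[zone_id]))


zones = {
    4: [400, 402, 403, 405, 407, 408, 410],  # 7 dirty pages in 5 segments
    6: [600, 601, 602, 603, 610, 611, 612],  # 7 dirty pages in 2 segments
}
selected = select_zone(zones)
print(asl(zones[4]), asl(zones[6]), selected)
```

Under these made-up layouts zone 4 has an ASL of 1.4 and zone 6 an ASL of 3.5, so zone 6 is chosen even though both zones contain the same number of dirty pages.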
As mentioned before, this policy only writes back pages in the inactive list in order to reduce the write-back IO traffic. Therefore, only the dirty pages in the inactive list are recorded in the zone information tables. To accomplish this, we need to insert the information about a dirty page when it becomes inactive and remove it when the page becomes active.
Specifically, when a dirty page becomes inactive (i.e., moves from the active list to the inactive list), we record it in the corresponding zone information table. When the page becomes active again or clean, the recorded information is removed. This allows us to write back only inactive dirty pages.
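The insert and remove rules can be sketched as a small event-driven tracker. The class, its method names, and the fixed zone size are hypothetical; in a real kernel these hooks would sit wherever pages move between the active and inactive lists or are cleaned.

```python
from collections import defaultdict

BLOCKS_PER_ZONE = 1024  # assumed zone size


class ZoneTracker:
    """Tracks only pages that are both dirty and on the inactive list."""

    def __init__(self):
        self.tables = defaultdict(set)  # zone index -> dirty block numbers

    def page_deactivated(self, block, dirty):
        """Page moved from the active list to the inactive list."""
        if dirty:
            self.tables[block // BLOCKS_PER_ZONE].add(block)

    def page_activated(self, block):
        """Page moved back to the active list."""
        self._forget(block)

    def page_cleaned(self, block):
        """Page was written back and is no longer dirty."""
        self._forget(block)

    def _forget(self, block):
        zone = block // BLOCKS_PER_ZONE
        table = self.tables.get(zone)
        if table is not None:
            table.discard(block)
            if not table:
                del self.tables[zone]  # drop empty tables

    def tracked(self):
        """Return all recorded blocks in ascending order."""
        return sorted(b for table in self.tables.values() for b in table)


tracker = ZoneTracker()
tracker.page_deactivated(10, dirty=True)
tracker.page_deactivated(11, dirty=True)
tracker.page_deactivated(12, dirty=False)  # clean page: not recorded
tracker.page_deactivated(2048, dirty=True)
tracker.page_activated(11)                 # active again: removed
tracker.page_cleaned(2048)                 # clean again: removed
print(tracker.tracked())
```

After this sequence only block 10 remains recorded, which is the only page that is still both dirty and inactive.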