
Chapter 2 System Model and Related Work

2.2 Related work

2.2.2 Operating system scheduling approach

The operating system scheduler approach attempts to select co-scheduled tasks that use different parts of the cache, minimizing occurrences of cache contention. The scheduling decision must be made before the tasks are executed, so the scheduler must predict the memory behavior of the tasks. We briefly introduce three methods in the following.

2.2.2.1 Active-set supported task scheduling [14]

T. Sherwood et al. [20] have shown that task behavior is typically periodic and predictable, and so is cache access behavior. Hence, [14] uses this property to predict cache access behavior: future cache usage is predicted from past cache usage, and tasks that are likely to use different cache regions are co-scheduled. This reduces the possibility of co-scheduled tasks using the same cache regions. Settle et al. propose monitoring hardware that records the number of accesses to each cache set. Task scheduling decisions are made based on the recorded results. A cache set is considered a frequently accessed cache region if its access count exceeds a preset threshold. Tasks with different frequently accessed cache regions are scheduled simultaneously.

This method assumes that future memory behavior can be perfectly predicted from past memory behavior. However, tasks may change their behavior during execution, and so may their cache access patterns. The prediction policy may not be able to react to these changes instantly. Therefore, a change in task behavior may result in a false prediction and lead to an inferior scheduling decision.
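The active-set idea above can be sketched as follows. This is a minimal illustration, not the hardware mechanism of [14]: the per-set access counts, task names, and threshold are all hypothetical, and the selection simply co-schedules the pair of tasks whose frequently accessed cache sets overlap least.

```python
from collections import Counter
from itertools import combinations

# Hypothetical per-task access counts for cache sets (set index -> count),
# as a monitoring unit might record them over a scheduling interval.
access_counts = {
    "T1": Counter({0: 90, 1: 75, 2: 3}),
    "T2": Counter({0: 80, 1: 60, 3: 5}),
    "T3": Counter({4: 85, 5: 70, 2: 2}),
}

THRESHOLD = 50  # a set counts as "frequently accessed" above this (assumed value)

def active_set(counts):
    """Cache sets whose access count exceeds the threshold."""
    return {s for s, n in counts.items() if n > THRESHOLD}

def overlap(a, b):
    """Number of frequently accessed sets the two tasks share."""
    return len(active_set(access_counts[a]) & active_set(access_counts[b]))

# Co-schedule the pair whose frequently accessed regions overlap least.
best_pair = min(combinations(access_counts, 2), key=lambda p: overlap(*p))
print(best_pair)  # ('T1', 'T3'): their hot sets are disjoint
```

With these illustrative counts, T1 and T2 both hammer sets 0 and 1, so the scheduler prefers pairing T1 with T3, whose hot sets do not overlap.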

2.2.2.2 Inter-thread cache contention prediction [12]

Chandra et al. propose a method called Prob to predict the number of cache contentions in a given task mix. Tasks running on a chip multiprocessor must share the memory hierarchy, so memory accesses from co-scheduled tasks are interleaved. Figure 2.3 shows two cases of interleaved accesses. Both cases interleave accesses from task T1 with task T2. The access trace of T1 is denoted by T1R and the access trace of T2 by T2R; the uppercase letters in a trace denote memory addresses. Assuming a 4-way fully associative cache, the second access to A is a cache hit in case 1 but a cache miss in case 2. The difference is that case 2 interleaves more accesses that load new blocks into the cache; these interleaved accesses cause the data block loaded by the first access to A to be evicted.

Prob uses a probabilistic approach to predict the miss rate of a task mix. It needs the cache access traces of all tasks in the mix. All possible interleaved access traces are exhaustively listed, and the probability of an individual cache hit becoming a cache miss is computed. Then, by multiplying the number of cache hits in the access traces by the computed probability, we obtain the expected value of the overall miss rate. The prediction can be used as one of the scheduling criteria of the task scheduler to reduce cache contention.

The disadvantage of Prob is that it exhaustively evaluates all possible interleaved access traces. This evaluation becomes very expensive as the number of tasks increases.

T1R: A B A    T2R: U V V W

case 1: A U B V V A W  (second A: Hit)
case 2: A U B V V W A  (second A: Miss)

Figure 2.3 Illustration of how interleaved accesses from another task determine whether an access is a cache hit or a cache miss, assuming a 4-way fully associative cache.
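The example of Figure 2.3 and the Prob-style exhaustive enumeration can be sketched as below. The LRU cache simulator and the averaging over interleavings are illustrative simplifications of the probabilistic model, not Chandra et al.'s actual formulation; the traces are those of the figure.

```python
from itertools import combinations

def simulate(trace, ways=4):
    """Fully associative LRU cache; returns a hit/miss boolean per access."""
    cache = []  # least recently used block at index 0
    hits = []
    for addr in trace:
        if addr in cache:
            cache.remove(addr)
            hits.append(True)
        else:
            if len(cache) == ways:
                cache.pop(0)  # evict the least recently used block
            hits.append(False)
        cache.append(addr)
    return hits

# Figure 2.3: second access to A is a hit in case 1, a miss in case 2.
case1, case2 = "AUBVVAW", "AUBVVWA"
print(simulate(case1)[5], simulate(case2)[6])  # True False

# Prob-style exhaustive step: evaluate every interleaving of T1R with T2R
# and average the miss rates. The number of interleavings grows
# combinatorially, which is exactly the cost noted above.
T1R, T2R = "ABA", "UVVW"
n = len(T1R) + len(T2R)
rates = []
for slots in combinations(range(n), len(T1R)):  # positions taken by T1R
    i1, i2 = iter(T1R), iter(T2R)
    merged = [next(i1) if pos in slots else next(i2) for pos in range(n)]
    misses = simulate(merged).count(False)
    rates.append(misses / n)
print(len(rates), sum(rates) / len(rates))  # 35 interleavings; mean miss rate
```

Even for these tiny traces there are already C(7,3) = 35 interleavings to evaluate, which illustrates why the exhaustive evaluation scales poorly.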

2.2.2.3 Throughput-oriented scheduling [15]

Fedorova et al. propose a modified balance-set [21] scheduling algorithm to reduce shared L2 cache misses on a chip multithreading system. It first estimates the miss rates of all possible task mixes by adapting the StatCache [22] probabilistic model, developed by Berg and Hagersten, which predicts the miss rate of a single task from previously recorded reuse distance information [23]. Fedorova et al. propose a merging method called AVG to combine individual miss rate predictions into a miss rate prediction for the co-scheduled tasks. AVG adjusts StatCache by assuming that the numbers of cache blocks accessed by all tasks are equal; the overall miss rate for the co-scheduled tasks is then the average of the miss rates of all tasks.
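The AVG combination can be sketched as follows. The miss-rate-versus-cache-size curves below are hypothetical stand-ins for what a StatCache-style model might produce; only the equal-share-then-average step reflects the AVG idea described above.

```python
# Hypothetical miss-rate-vs-cache-size curves (size in KB -> miss rate) for
# two tasks; the constants 40 and 8 are illustrative, not measured data.
def mr_t1(size_kb):
    """A cache-hungry task: miss rate falls slowly with more cache."""
    return min(1.0, 40.0 / size_kb)

def mr_t2(size_kb):
    """A task with a small working set."""
    return min(1.0, 8.0 / size_kb)

def avg_miss_rate(curves, total_cache_kb):
    """AVG: assume each of the n tasks occupies an equal 1/n share of the
    cache, predict each task's miss rate at that share, and average."""
    share = total_cache_kb / len(curves)
    return sum(curve(share) for curve in curves) / len(curves)

# Equal shares of a 512 KB cache: each task is evaluated at 256 KB.
print(avg_miss_rate([mr_t1, mr_t2], 512))  # 0.09375
```

The equal-share assumption is precisely the weakness discussed below: a task like `mr_t1` would in practice grab more than half of a shared set-associative cache.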

After predicting the miss rates of the task mixes, tasks are divided into groups according to the estimated results. Fedorova et al. then use a mechanism that integrates balance-set [21] and StatCache [22] to schedule tasks. When a scheduling decision is made, it first generates all possible task mixes from the global dispatch queue. Second, it predicts the miss rate of every possible task mix with StatCache and AVG. Task mixes whose predicted miss rates are lower than a given threshold are considered for scheduling. The final scheduling decision is made together with other scheduling factors, such as priority and waiting time.
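The mix-generation and filtering steps can be sketched as below. The per-task predictions, threshold, and core count are hypothetical, and the final tie-break (which in the real scheduler would weigh priority and waiting time) is reduced to picking the lowest predicted miss rate.

```python
from itertools import combinations

# Hypothetical per-task miss-rate predictions (as StatCache might provide).
predicted = {"T1": 0.12, "T2": 0.02, "T3": 0.30, "T4": 0.05}
THRESHOLD = 0.10  # mixes predicted above this miss rate are rejected (assumed)
CORES = 2         # tasks co-scheduled at a time (assumed)

def mix_miss_rate(mix):
    """AVG-style combination: average the individual predictions."""
    return sum(predicted[t] for t in mix) / len(mix)

# Step 1: generate all possible task mixes from the dispatch queue.
# Step 2: keep only mixes whose predicted miss rate is under the threshold.
candidates = [m for m in combinations(predicted, CORES)
              if mix_miss_rate(m) <= THRESHOLD]

# Step 3: other factors (priority, waiting time) would choose among the
# candidates; here we simply take the lowest predicted miss rate.
print(min(candidates, key=mix_miss_rate))  # ('T2', 'T4')
```

Note that generating all mixes is itself combinatorial in the queue length, which is why the threshold filter is applied before any finer-grained decision.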

The drawback of this approach is that the AVG mechanism simply assumes that all tasks are allocated an equal fraction of the cache. However, this assumption is not always true, since tasks differ in their ability to compete for cache space, as discussed in [11]. This might result in inaccurate predictions for set-associative caches and lead to sub-optimal scheduling.

From the related work, the cache partitioning approaches focus on partitioning the cache among co-scheduled tasks. However, if all of the co-scheduled tasks frequently access the cache, these tasks may still suffer from cache contention. The operating system approaches can resolve this by selecting co-scheduled tasks that use different parts of the cache. In other words, the cache hardware only affects activities on the scale of tens to thousands of cycles, whereas the operating system controls resources and activities at a larger time scale of millions of cycles. We therefore have more opportunities to improve system behavior through the operating system. Besides, the operating system task scheduler is usually implemented in software. By using software mechanisms, it is possible to build systems that can evolve as new techniques are discovered. Furthermore, software approaches allow workload-specific tuning. These benefits form the basis of our choice of the operating system task scheduling approach.

In the next chapter, we describe the basic concepts and principles of our method in more detail.

Chapter 3 Hint-aided Cache Contention
