Real-Time Self-Suspending Processes*

Ya-Shu Chen^†, Li-Pin Chang^†, and Tei-Wei Kuo^†‡

†Department of Computer Science and Information Engineering

‡Institute of Networking and Multimedia

National Taiwan University, Taipei, Taiwan 106, ROC

{d92010, d6526009, ktw}@csie.ntu.edu.tw

Abstract. While a number of researchers have proposed excellent protocols on resource synchronization, little work is done for processes that might suspend themselves for I/O access, especially when they tend to be more tolerant to multiple priority inversions. This paper presents research results extended from the concept of priority ceilings with an objective to satisfy different priority-inversion requirements for different processes. We aim at practical considerations in which processes might voluntarily give up CPU and be willing to receive more blocking time than those in most traditional approaches. Extensions on the proposed scheduling protocols for deadlock prevention are also considered.

Keywords: Real-Time Systems, Resource Synchronization Protocol, Priority Ceiling, Deadlock Prevention.

1 Introduction

Real-time resource synchronization has been an important research topic in the past decades. How to resolve resource contention with a proper management of priority inversion is usually the main focus of this research. Among the many proposed synchronization protocols, the Priority Ceiling Protocol (PCP) [17] is one of the most well-known protocols in hard real-time task scheduling. It has been proved that, under PCP, no higher-priority task could be blocked by more than one lower-priority task. The Stack Resource Policy (SRP) [23] further extends PCP by allowing multiple units per resource and could adopt dynamic priority assignment algorithms, e.g., the Earliest Deadline First (EDF) algorithm [6], where EDF assigns the highest priority to the ready task with the closest deadline.

Although many excellent resource synchronization protocols have been proposed, most of them are either for hard real-time task scheduling with the maximum priority inversion number being one, e.g., [17, 23], or for soft real-time task scheduling without any guarantee on the maximum priority inversion number, e.g., [19, 27]. Furthermore, research on hard or soft real-time task scheduling often considers computation-intensive tasks only. When I/O operations are considered, research on real-time resource synchronization is often based on heuristics without any schedulability guarantee. Nevertheless, tasks executing over modern computer systems often consist of CPU and I/O bursts. While a task is pending on the completion of an I/O operation, the scheduler usually does a context switch to execute another task. Resource synchronization which involves concurrent I/O operations and CPU executions is a difficult problem [13]. Tasks which might suspend themselves voluntarily for I/O operations could suffer from a virtually unbounded number of priority inversions under many popular real-time resource synchronization protocols. This is because a lower-priority task might lock a resource that later blocks a suspended higher-priority task. Such a task model with I/O operations is not considered in many existing protocols. On the other hand, a limit of one priority inversion seems overly conservative for many applications. In reality, tasks could tolerate different numbers of priority inversions, depending on their natures. These observations motivate this research.

* This research was supported in part by the National Science Council under grant NSC92-2213-E-002-065.

This paper proposes configurable resource synchronization protocols that let engineers adjust the maximum number of priority inversions for each task and that take task suspension into account. A table-based approach is first proposed to adjust the maximum numbers of priority inversions for tasks without suspension. We then extend the approach to consider task suspension. The ceilings of resources become configurable to allow lower-priority tasks to grab resources while higher-priority tasks suspend themselves to wait for I/O operations. System utilization is traded against the maximum numbers of priority inversions for different tasks. A deadlock-prevention method with low run-time overheads is proposed to avoid system deadlocks. Although PCP is adopted to illustrate the configurable resource synchronization protocols, the idea could be extended to other synchronization protocols.

The rest of this paper is organized as follows: The motivation of this work is illustrated in Section 2 based on several examples, along with the process model considered in this paper and the necessary terminologies. A configurable resource synchronization protocol is proposed in Section 3. In Section 4, we present the idea of deadlock avoidance; an off-line heuristic to resolve deadlocks and its on-line manipulation are then proposed. This work is concluded in Section 5.

2 Motivation and Problem Definitions

The purpose of this section is to provide the motivation for this research. Observations over the behaviors of the well-known Priority Ceiling Protocol (PCP) [17] and Stack Resource Policy (SRP) [23], when tasks might suspend themselves, are illustrated. We then define terminologies and definitions for this paper.

2.1 Motivation

We are interested in uniprocessor scheduling with I/O considerations. Let τH, τM, and τL be three tasks scheduled by a fixed-priority scheduling algorithm, where τH and τL are the highest-priority task and the lowest-priority task, respectively. Three resources R1, R2, and R^a are shared among the tasks. R1 and R2 are semaphores, and R^a is an I/O device which could operate independently while some task is running on the CPU. Let operations on R^a be non-preemptible. Suppose that τH might access R1, R2, and R^a, τM might access R^a, and τL might access R1 and R2. We shall use the following two scheduling examples to serve as the motivation for this research. Note that tasks could be periodic or aperiodic. We call each instance of a task a job, where a task is a template of its jobs. For example, a periodic task has a corresponding job ready for each period.

Fig. 1. PCP not applied in R^a

Suppose that the task set is scheduled by PCP [17]. The ceilings of R1 and R2 are both equal to the priority of τH, where the ceiling of a resource is equal to the maximum priority of all tasks that might access the resource. Since R^a denotes an I/O device, there are two cases to be considered: (1) There is no resource locking needed for R^a such that any task could access R^a whenever it is available. (2) A semaphore is defined for the access synchronization of R^a.

Consider the first case: Figure 1 shows a schedule in which R1 and R2 are managed by PCP, and no ceiling rule is applied to R^a. Suppose that τH, τM, and τL are all ready at time 0. The blocking of τH occurs at time 5 because τH voluntarily surrenders the CPU at time 1 such that τL has a chance to execute and block τH. In theory, such blocking could happen repeatedly without a bound (over CPU and I/O devices). For example, the I/O request of τH at time 8 causes the suspension of τH again such that τL successfully locks R2 at time 11. Such a lock on R2 lets τL later block τH again at time 14. We conclude that there could be a virtually unlimited number of priority inversions¹ for a higher-priority task (over CPU and I/O devices) when PCP is directly applied without a proper synchronization rule on R^a. However, we must point out that one advantage of the absence of synchronization for I/O devices is a potentially higher system utilization.

¹ The number of priority inversions encountered by a task could be dominated by the number of self-suspensions of the task.

Fig. 2. SRP applied in R^a

Another scheduling alternative is to have a semaphore for the access synchronization on R^a. For the purpose of discussion, we adopt another (maybe more restrictive) algorithm for access synchronization. Let SRP be adopted for the scheduling of the task set, and let a semaphore be adopted for the access synchronization of R^a. For simplicity of presentation, the semaphore for R^a is referred to as R^a when there is no ambiguity. Since there is only one unit per resource, and the number of units per request is only one, we only need to define the preemption levels for resources R1, R2, and R^a when there is no resource available. When there is not a single unit of resource R1 (R2/R^a) available, the preemption level of R1 (R2/R^a) is equal to the priority of the task τH (i.e., ⌈R1⌉_0 = ⌈R2⌉_0 = ⌈R^a⌉_0 = priority(τH)). SRP requires that no task be scheduled unless its priority is higher than the maximum preemption level of resources in the system, i.e., the system preemption level. Figure 2 shows the schedule under SRP. It is observed that τH suffers no priority inversion when it requests any lock on R1, R2, or R^a. Such a strong synchronization requirement results in the delay of the executions of τM and τL until time 16. Before time 16, either the CPU or the I/O device is idle.

We must point out that I/O devices are very different from common resources for synchronization, such as semaphores, whose access requires the running of the CPU. A tradeoff does exist between the system utilization and the maximum number of priority inversions (or priority inversion time):

– Suppose that no PCP ceiling rule is adopted to manage an I/O device. Each lock request of a task to the I/O device might introduce at most one priority inversion when the task tries to lock a semaphore later (after the I/O request is satisfied). Please see Figure 1.

– Suppose that each I/O device is considered as a resource managed by SRP. The maximum number of priority inversions per task is one. Please see Figure 2.

– Suppose that each I/O device is considered as a resource managed by PCP. The maximum number of priority inversions of a task is equal to one plus the number of lock requests to I/O devices².

The above observations motivate the design of a resource synchronization protocol in which system engineers could trade the priority inversion time with the system utilization.

2.2 Process Model, Definitions, and Terminologies

This section defines the process model and terminologies for this paper. We first classify resources as active or passive as follows:

Definition 1. Passive Resources:

A resource is passive if any accessing of the resource requires the consumption of the CPU.

Good examples of passive resources include semaphores, mutex locks, event objects, and database locks. Passive resources could be accessed without any locks or with exclusive or shared locks, depending on the characteristics of the resources and application logics. A resource is active if it is not passive.

Good examples of active resources include disks, printers, network adaptors, and transceivers. A task might issue a request on an active resource and resume its execution if the request is asynchronous and granted. If the request is synchronous, then the task is suspended until the request is fulfilled. In this paper, we are interested in non-preemptible active resources, such as disks, with synchronous requests. Let an access request be serviced immediately on the corresponding active resource once it is granted and the resource is available. We do not consider I/O buffering or caching in this paper.

Tasks could be periodic or aperiodic. We call each instance of a task a job, where a task is a template of its jobs. A task τ_i is a sequence of subtasks τ_{i,j}. A subtask could be either a CPU execution or a period of time in accessing an active resource. If a subtask is a CPU execution, then it might lock or unlock any passive resources. When a subtask τ_{i,j} represents the accessing of an active resource R^a, τ_{i,j} could also be denoted as τ_{i,j}^{R^a}. The duration of a subtask τ_{i,j} is denoted as c_{i,j}, and the total execution time c_i of τ_i is the sum of the CPU executions and the periods of time in accessing active resources of all subtasks τ_{i,j}. Suppose that task τ_1 first executes some computation-intensive code (i.e., τ_{1,1}) and then accesses active resource R^a (i.e., τ_{1,2}). After another CPU execution (τ_{1,3}) and a second access to R^a (τ_{1,4}), it completes after executing some computation-intensive code (i.e., τ_{1,5}), as shown in Figure 3. The total execution time c_1 of task τ_1 is equal to c_{1,1} + c_{1,2}^{R^a} + c_{1,3} + c_{1,4}^{R^a} + c_{1,5}. Note that active resources are accessed synchronously. While τ_1 is accessing an active resource, it must suspend its CPU execution until the access completes. No new lock could be obtained by a suspended task.

² A lower-priority task could execute over the CPU when a higher-priority task suspends itself to access an I/O device. Note that when the higher-priority task resumes from the I/O access, the lower-priority task could access the I/O device and later block the higher-priority task.

Fig. 3. A task execution which involves active resource access

All (passive or active) resources must be locked before they are accessed. An active resource is released by a task when the access on the active resource by the task completes, and the corresponding lock is released. When a task accesses and locks an active resource several times in a period, the time point for the last releasing of the resource is called the dismissing point of the resource in the period. As shown in Figure 3, an active resource R^a is locked and released by task τ1 at time 3 and 5, respectively. It is locked and released again at time 7 and 9, respectively. The dismissing point of the active resource R^a for task τ1 is at time 9.
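The two definitions above (total execution time and dismissing point) can be illustrated with a minimal Python sketch. The function names are hypothetical, the lock and release times are those quoted for Figure 3, and the subtask durations are assumptions: they are read off those times as if the job started at time 0 and ran without preemption, and the duration of the last subtask is invented because the text does not give it.

```python
def dismissing_point(accesses):
    """Return the last release time among (lock_time, release_time) pairs."""
    return max(release for _lock, release in accesses)


def total_execution_time(durations):
    """Sum the durations c_{i,j} of all subtasks, CPU and active-resource alike."""
    return sum(durations)


# Accesses of R^a by tau_1 as described for Figure 3: locked at 3 and 7,
# released at 5 and 9.
ra_accesses = [(3, 5), (7, 9)]

# c_{1,1} .. c_{1,5}. Assumed values: the job starts at time 0 and is never
# preempted; the last duration (1) is hypothetical.
durations = [3, 2, 2, 2, 1]

print(dismissing_point(ra_accesses))    # 9
print(total_execution_time(durations))  # 10
```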

For the rest of this paper, we shall propose a resource synchronization protocol to trade the priority inversion time with the system utilization.

3 A Configurable Synchronization Protocol

3.1 Overview

Existing research results on resource synchronization are mainly on the minimization of priority inversion. PCP guarantees at most one priority inversion for any higher-priority task in uniprocessor fixed-priority systems. SRP later extends the idea to the management of multiple resources and dynamic priority scheduling. Although various excellent resource synchronization protocols have been proposed, little work is done to adjust the numbers of priority inversions for tasks.

In this section, we shall propose two resource synchronization protocols with an adjustment mechanism for priority inversion management. The basic protocol extends the ceiling rules of PCP so that higher-priority tasks could receive a larger number of priority inversions to trade for the schedulability of lower-priority tasks. A ceiling table is proposed to set up the maximum numbers of priority inversions for tasks. Note that the table of preemption levels in SRP is proposed for a purpose very different from that of the ceiling table proposed in this section³. In the basic protocol, only passive resources are considered, and we are interested in uniprocessor task scheduling in this paper. We then extend the basic protocol to consider active resources, such that tasks might suspend themselves for I/O operations (on active resources). Guidelines are also proposed for the setup of the ceiling table based on the priority-inversion requirements of tasks.

Table 1. An example ceiling table

Task/Resource  R1  R2  R3  R4  R5
τ1             1   1   *   0   0
τ2             0   0   *   *   1
τ3             0   1   1   *   1
τ4             0   1   1   1   1

3.2 The Basic Configurable Ceiling Protocol

The basic protocol extends the ceiling rules of PCP to have a tradeoff between the number of priority inversions of higher-priority tasks and the schedulability of lower-priority tasks. A ceiling table is first defined for the adjustment of the maximum priority inversion time for tasks. Protocol rules are then proposed for task scheduling based on the ceiling table.

3.2.1 The Ceiling Table The purpose of a ceiling table is the adjustment of the maximum priority inversion time for tasks. CT(τi, Rp) is an entry in the ceiling table that represents the way in which task τi might access resource Rp. When CT(τi, Rp) = 0, τi will not access Rp in any way. If CT(τi, Rp) = 1, then τi might lock and access Rp. If CT(τi, Rp) = ∗, then τi might also lock and access Rp, but it could tolerate priority inversion resulting from the access conflict of Rp. Guidelines for the setup of the ceiling table are in Section 3.4.

Under PCP, the ceiling of a resource is defined as the maximum priority of the tasks which might access the resource. With a ceiling table, the ceiling of each resource Rp (Ceiling(Rp)) is revised as follows: Ceiling(Rp) is the maximum priority of tasks τi with CT(τi, Rp) = 1. The lock request of a task τi on resource Rq is granted if the priority of τi is higher than the maximum ceiling of resources locked by other tasks. Otherwise, τi is blocked. Let Rp be a locked resource that owns the maximum ceiling such that τi is blocked, and let task τj currently lock Rp. We say that τi is directly blocked by τj. We must point out that the revised definition of Ceiling(Rp) does not consider the priorities of tasks τi when CT(τi, Rp) = ∗. Such a modification virtually gives up some privileges of those tasks with CT(τi, Rp) = ∗ when Rp is locked. That is, when Rp is locked, other tasks might still have a possibility to lock other resources even though their priorities are lower than the priorities of tasks with CT(τi, Rp) = ∗.

³ The table of preemption levels in SRP is used to check the availability of resources before a task begins its execution.

Consider the ceiling table in Table 1. Based on the definition of Ceiling(Rp) in PCP, Ceiling(R1) = Ceiling(R2) = Ceiling(R3) = the priority of τ1, and Ceiling(R4) = Ceiling(R5) = the priority of τ2. In contrast to PCP, the revised ceiling definitions for the ceiling table (i.e., Table 1) are as follows: Ceiling(R1) = the priority of τ1, Ceiling(R2) = the priority of τ1, Ceiling(R3) = the priority of τ3, Ceiling(R4) = the priority of τ4, Ceiling(R5) = the priority of τ2.
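The two ceiling definitions can be checked mechanically against Table 1. The following Python sketch is only an illustration: the dictionary layout, the task names `tau1`..`tau4`, and the helper `ceilings` are assumptions made for this example, not part of the protocol definition. Each ceiling is reported as the task whose priority it equals.

```python
# Table 1 from the text: rows are tasks, columns are resources R1..R5.
# "1" = may lock, "*" = tolerates inversion on the resource, "0" = no access.
TABLE_1 = {
    "tau1": ["1", "1", "*", "0", "0"],
    "tau2": ["0", "0", "*", "*", "1"],
    "tau3": ["0", "1", "1", "*", "1"],
    "tau4": ["0", "1", "1", "1", "1"],
}
RESOURCES = ["R1", "R2", "R3", "R4", "R5"]
# Tasks listed from highest to lowest priority (tau1 is the highest).
ORDER = ["tau1", "tau2", "tau3", "tau4"]


def ceilings(table, counted):
    """Map each resource to the highest-priority task whose entry is in `counted`."""
    result = {}
    for col, res in enumerate(RESOURCES):
        owners = [t for t in ORDER if table[t][col] in counted]
        result[res] = owners[0] if owners else None
    return result


# PCP: every task that might access the resource counts ("1" or "*").
pcp = ceilings(TABLE_1, {"1", "*"})
# Revised rule: only entries equal to "1" count.
revised = ceilings(TABLE_1, {"1"})

print(pcp)      # R1, R2, R3 -> tau1; R4, R5 -> tau2
print(revised)  # R1, R2 -> tau1; R3 -> tau3; R4 -> tau4; R5 -> tau2
```

The revised ceilings of R3 and R4 drop below their PCP values because the higher-priority tasks only hold ∗ entries for them.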

In the following section, we shall propose a resource synchronization protocol based on the concept of the ceiling table.

3.2.2 Resource Synchronization Protocol The purpose of this section is to propose a resource synchronization protocol called the basic configurable ceiling protocol (BCCP) based on the concept of ceiling tables. We are interested in uniprocessor scheduling with only access over passive resources in the basic protocol. Since PCP is adopted to illustrate the idea, we adopt a fixed-priority assignment policy, such as the Rate Monotonic Scheduling (RMS) algorithm [6], where RMS assigns a higher priority to a task with a smaller period. The objective is to provide a tradeoff between the numbers of priority inversions for higher-priority tasks and the responsiveness of lower-priority tasks. The ceiling table is to provide a way to adjust the maximum numbers of priority inversions for tasks.

The scheduling protocol is defined as follows: The ceilings of all resources are defined based on the given ceiling table. The ready task with the highest priority is dispatched for execution. The lock request of a task τi on resource Rp is granted if the priority of τi is higher than the maximum ceiling of resources locked by other tasks, and Rp is free. Otherwise, τi is blocked. Two conditions must be considered for the occurrence of a blocking: One possibility is that the priority of τi is no higher than the maximum ceiling of resources locked by other tasks. Let Rq be the locked resource that owns the maximum ceiling such that τi is blocked, and let τj currently lock Rq. We say that τi is directly blocked by τj. Another possibility is that Rp is locked by another task τj although the priority of τi is higher than the maximum ceiling of resources locked by other tasks. In this case, τi is also directly blocked by τj. The occurrence of such direct blocking results from the lowering of the priority ceiling of Rp (because CT(τi, Rp) = ∗).

When τi is directly blocked by τj, τj inherits the priority of τi. The priority inheritance is done transitively. When τi is no longer directly blocked by τj, the priority of τj reverts to the priority it had when the priority inheritance occurred.
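The lock-granting rule of BCCP can be sketched as a small decision function. This is a minimal illustration under stated assumptions: priorities are encoded as integers where a larger value means a higher priority, the function name `bccp_request` and its arguments are hypothetical, the ceilings are the revised ceilings of Table 1, and priority inheritance is not modelled.

```python
def bccp_request(task, resource, prio, ceiling, locked):
    """Decide a lock request under the BCCP granting rule.

    prio:    {task: int}, larger means higher priority (assumed encoding)
    ceiling: {resource: int}, ceilings derived from the ceiling table
    locked:  {resource: holder} for the currently locked resources
    Returns ("granted", None) or ("blocked", blocking_task).
    """
    others = {r: h for r, h in locked.items() if h != task}
    if others:
        top = max(others, key=lambda r: ceiling[r])
        if prio[task] <= ceiling[top]:
            # First condition: priority not above the maximum ceiling.
            return ("blocked", others[top])
    if resource in others:
        # Second condition: the requested resource itself is held.
        return ("blocked", others[resource])
    return ("granted", None)


PRIO = {"tau1": 4, "tau2": 3, "tau3": 2, "tau4": 1}
# Revised ceilings of Table 1, expressed with the integer priorities above.
CEILING = {"R1": 4, "R2": 4, "R3": 2, "R4": 1, "R5": 3}

held = {"R4": "tau4"}
print(bccp_request("tau3", "R3", PRIO, CEILING, held))  # ('granted', None)
print(bccp_request("tau2", "R4", PRIO, CEILING, held))  # ('blocked', 'tau4')

held = {"R5": "tau4"}
print(bccp_request("tau3", "R3", PRIO, CEILING, held))  # ('blocked', 'tau4')
```

In the first call τ3 obtains R3 although τ4 holds R4, which PCP would not allow because the PCP ceiling of R4 is the priority of τ2. The second call shows the price: τ2 is later blocked on R4 itself.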

3.2.3 Properties and Protocol Analysis We now show several properties of BCCP. Note that no active resources are considered for BCCP. Assume that tasks are sorted by their priorities and τ0 is the highest-priority task.

Before the properties of BCCP are proved, we revise a given ceiling table CT() into a corresponding ceiling table CT′() as follows: CT′(τi, Rp) = CT(τi, Rp) for any task τi and any resource Rp except in the following cases: (1) CT′(τi, Rp) = 1 if CT(τi, Rp) = ∗, and there exists an entry CT(τa, Rp) = 1 with i > a. This is because the ceiling of Rp would be higher than the priority of τi regardless of whether CT′(τi, Rp) is equal to 1 or ∗. (2) CT′(τi, Rp) = 1 if CT(τi, Rp) = ∗, and CT(τk, Rp) = 0 for all k > i. This is because no lower-priority task will lock Rp. In other words, τi would not be blocked by any lower-priority tasks because of Rp, regardless of whether CT′(τi, Rp) is equal to 1 or ∗. (3) CT′(τi, Rp) = 1 if CT(τi, Rp) = ∗, and τi is the task with the lowest priority in the system. This is because the setting of CT(τi, Rp) being equal to ∗ or 1 has no impact on the blocking behavior of other tasks. CT′() is called the revised ceiling table of a given ceiling table CT(). We could show the following lemma:

Lemma 1. A task is blocked by another task under BCCP with a given ceiling table CT() if and only if the former task is blocked by the latter task under BCCP with the revised ceiling table CT′().

Proof. The if-part of the lemma can be proved as follows: Suppose that a lock request of a task τi on resource Rp is not granted under CT′(). Such a blocking could only occur when the priority of τi is no higher than the maximum ceiling of resources currently locked by other tasks under CT′(), or when Rp is locked by another task τk.

Suppose that τj is the task that locks the resource with the maximum ceiling and blocks τi. Two cases are under consideration: (a) As mentioned in the previous paragraph, CT′(τi, Rp) is revised as 1 if CT(τi, Rp) = ∗, and there exists an entry CT(τa, Rp) = 1, where a < i. As a result, Ceiling(Rp) remains the same for both CT() and CT′(), and it is equal to the priority of τa. A lock request of a task τi on resource Rp will not be granted under CT() as well.

(b) CT′(τi, Rp) is revised as 1 if CT(τi, Rp) = ∗, and there does not exist a non-zero entry CT(τk, Rp) where i < k. In other words, any task which has a priority lower than τi would never lock Rp. The revision of CT′(τi, Rp) would not introduce any new blocking. If a lock request of τi on resource Rp is not granted under CT′(), then such a request will not be granted under CT().

Suppose that a lock request of a task τi on resource Rp is not granted under CT′() because Rp is locked by another task τj. It is obvious that the lock request of τi on Rp will not be granted under CT() because Rp is locked already. The only-if-part of the lemma can be proved in a similar way. □

Given a task τi, let φ(τi) denote the number of resources Rp with CT′(τi, Rp) = ∗. Note that we consider a set of n tasks, in which all tasks are reordered and renamed such that τi has a priority higher than τi+1 does, for (n − 1) ≥ i ≥ 0.

Theorem 1. No task τi could be directly blocked by lower-priority tasks more than φ(τi) + 1 times in each of its periods.


Fig. 4. The reason of transitive blocking

Proof. The correctness of this theorem follows directly from Lemma 1 and the following observation (based on CT′()): First, no direct blocking would be introduced to τi due to any access on a resource Rp if CT′(τi, Rp) = 0 because τi would not access Rp. Each resource Rp with CT′(τi, Rp) = ∗ could introduce only one direct blocking for τi because Ceiling(Rp) is lower than the priority of τi. Furthermore, when some task accesses a resource Rp with CT′(τi, Rp) = 1, the ceiling of Rp will prevent any other task with a priority lower than that of τi from directly blocking τi again. In other words, only one direct blocking would be possibly introduced to τi for all resources with CT′(τi, Rp) = 1. As a result, the maximum number of direct blockings of τi is no more than φ(τi) + 1. □

Based on Theorem 1, the maximum number of direct blockings for each task could be derived from a given ceiling table. For example, the maximum number of direct blockings for τ1 under BCCP with the ceiling table shown in Table 1 is 1 + 1 = 2. Those of τ2 and τ3 are 2 + 1 = 3 and 1 + 1 = 2, respectively.

Note that τ4 will not suffer from any direct blocking because it is the task with the lowest priority. Theorem 1 shows the maximum number of direct blocking suffered by a task in a period. The rest of this section is to derive a bound on the maximum duration of priority inversion time possibly suffered by a task in a period, where some of the priority inversion time might come from transitive blocking.
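The bound of Theorem 1 can be reproduced for Table 1 with a short Python sketch. It first builds the revised table using the three rewriting cases given before Lemma 1 and then counts the remaining ∗ entries. The names `revise` and `direct_blocking_bound`, the data layout, and the toy table at the end are assumptions made for illustration. The lowest-priority task is reported as 0 to follow the remark in the text rather than the formula φ(τi) + 1.

```python
TABLE_1 = {
    "tau1": ["1", "1", "*", "0", "0"],
    "tau2": ["0", "0", "*", "*", "1"],
    "tau3": ["0", "1", "1", "*", "1"],
    "tau4": ["0", "1", "1", "1", "1"],
}
# Tasks from highest to lowest priority.
ORDER = ["tau1", "tau2", "tau3", "tau4"]


def revise(table, order):
    """Build the revised table: a '*' becomes '1' in the three listed cases."""
    n_res = len(table[order[0]])
    revised = {t: list(row) for t, row in table.items()}
    for i, task in enumerate(order):
        for p in range(n_res):
            if table[task][p] != "*":
                continue
            # Case 1: some higher-priority task has entry 1 for this resource.
            higher_has_one = any(table[order[a]][p] == "1" for a in range(i))
            # Case 2: no lower-priority task ever accesses this resource.
            lower_all_zero = all(
                table[order[k]][p] == "0" for k in range(i + 1, len(order))
            )
            # Case 3: the task itself has the lowest priority.
            is_lowest = i == len(order) - 1
            if higher_has_one or lower_all_zero or is_lowest:
                revised[task][p] = "1"
    return revised


def direct_blocking_bound(table, order, task):
    """phi(task) + 1, or 0 for the lowest-priority task."""
    if task == order[-1]:
        return 0
    phi = revise(table, order)[task].count("*")
    return phi + 1


bounds = {t: direct_blocking_bound(TABLE_1, ORDER, t) for t in ORDER}
print(bounds)  # {'tau1': 2, 'tau2': 3, 'tau3': 2, 'tau4': 0}

# Hypothetical one-resource table in which case 1 applies to task "b".
TOY = {"a": ["1"], "b": ["*"], "c": ["1"]}
print(revise(TOY, ["a", "b", "c"])["b"])  # ['1']
```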

Priority inversion could come from direct and/or indirect blocking. Since the ceilings of resources could be lower than their corresponding PCP ceilings, indirect blocking (i.e., transitive blocking) might occur. The possibility of transitive blocking could be observed from a given ceiling table CT(): Let a symbol + denote some value equal to ∗ or 1. Suppose that CT(τi, Rp) = + and CT(τj, Rp) = + for some resource Rp, where the priority of τi is higher than that of τj. Suppose that CT(τj, Rq) = ∗ and CT(τj+1, Rq) = + for some other resource Rq, as shown in Figure 4.(A). Let tasks be sorted in a decreasing order of their priorities. (We first consider the case in which every task has a distinct priority.) That is, the priority of τj+1 is lower than that of τj. Let Rq be locked by τj+1 when τj locks Rp. The lock request of τj on Rp is successful because CT(τj, Rq) = ∗. The lock request of τi on Rp later results in a direct blocking of τi by τj (i.e., the path + → +). As a result, τj resumes its execution. When τj requests Rq, τj is directly blocked by τj+1 because Rq is already locked by τj+1 (i.e., the path ∗ → +). Such a transitive blocking τi − τj − τj+1 could occur because CT(τj, Rq) = ∗. As astute readers might notice, a transitive blocking might occur when one of the tasks in the transitive blocking locks a resource Rq such that a higher-priority task τj in the transitive blocking later requests Rq, where CT(τj, Rq) = ∗. On the other hand, when the above condition does not hold, no transitive blocking will occur. We use a counterexample, as shown in Figure 4.(B), to provide an explanation: As in the example shown in Figure 4.(A), CT(τi, Rp) = + and CT(τj, Rp) = +, and the priority of τi is higher than that of τj. Now let CT(τj, Rq) = 1 and CT(τj+1, Rq) = 1. Let Rq be locked by τj+1 when τj requests Rp. The lock request of τj is blocked because CT(τj, Rq) = 1. As a result, in contrast to the former example, τi is not blocked by τj on Rp when τi later requests Rp (i.e., the path + → + is not formed). A transitive blocking τi − τj − τj+1 does not occur.

Let the total blocking time of task τi be denoted as B_τi. Consider a set of n tasks, in which all tasks are reordered and renamed such that τi has a priority higher than τi+1 does, for (n − 1) ≥ i ≥ 0. The derivation of B_τi could be done by an iterative procedure as follows: First, the resources which might potentially introduce transitive blocking to τi must be identified. For example, as shown in Figure 4.(A), Rq should be identified when τi is considered because Rq is involved in the path + → + → ∗ → +. Given the resources that could introduce direct blocking and indirect blocking to τi, the total blocking time imposed on τi is then calculated. The total blocking time caused by direct blocking is the same as that in PCP. The total blocking time caused by indirect blocking is the sum of the respective longest durations of accesses to those resources identified in the first step.

Let the maximum duration in the locking of resource Rp of all tasks be denoted as B_Rp. The maximum blocking time of τ1 under BCCP with the ceiling table shown in Table 1 is max(B_R1, B_R2) + B_R3 + B_R4, where max(B_R1, B_R2) comes from the “1” entries, B_R3 comes from the asterisk symbols and “0” entries in the table, and B_R4 comes from the transitive blocking. Those of τ2 and τ3 are B_R5 + B_R3 + B_R4 and max(B_R2, B_R3, B_R5) + B_R4, respectively. Note that τ4 will not suffer from any blocking because it is the task with the lowest priority.
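The three expressions above can be reproduced by a short Python sketch. It is only one possible reading of the derivation, not the authors' procedure: the direct part is taken as the maximum over the “1” entries of the task plus the sum over its ∗ entries, and the transitive part follows a single step of the path + → + → ∗ → +. Longer chains would need the iteration mentioned in the text. The function names are hypothetical and the durations in `B` are invented for illustration.

```python
TABLE_1 = {
    "tau1": ["1", "1", "*", "0", "0"],
    "tau2": ["0", "0", "*", "*", "1"],
    "tau3": ["0", "1", "1", "*", "1"],
    "tau4": ["0", "1", "1", "1", "1"],
}
ORDER = ["tau1", "tau2", "tau3", "tau4"]  # highest priority first
RES = ["R1", "R2", "R3", "R4", "R5"]


def blocking_terms(table, order, task):
    """Return (ones, stars, transitive) resource sets for `task` (assumed reading)."""
    i = order.index(task)
    if i == len(order) - 1:  # the lowest-priority task is never blocked
        return set(), set(), set()
    cols = range(len(RES))
    ones = {RES[p] for p in cols if table[task][p] == "1"}
    stars = {RES[p] for p in cols if table[task][p] == "*"}
    transitive = set()
    for j in range(i + 1, len(order)):
        lower = table[order[j]]
        # Path + -> +: `task` and the lower-priority task share a resource.
        if not any(table[task][p] != "0" and lower[p] != "0" for p in cols):
            continue
        for q in cols:
            # Path * -> +: the lower task tolerates inversion on R_q and an
            # even lower-priority task may lock R_q.
            if lower[q] == "*" and any(
                table[order[k]][q] != "0" for k in range(j + 1, len(order))
            ):
                transitive.add(RES[q])
    return ones, stars, transitive - ones - stars


def blocking_time(table, order, task, durations):
    ones, stars, transitive = blocking_terms(table, order, task)
    direct = max((durations[r] for r in ones), default=0)
    return direct + sum(durations[r] for r in stars | transitive)


# Hypothetical longest locking durations B_Rp per resource.
B = {"R1": 2, "R2": 3, "R3": 1, "R4": 4, "R5": 2}
for t in ORDER:
    print(t, blocking_terms(TABLE_1, ORDER, t), blocking_time(TABLE_1, ORDER, t, B))
```

With these made-up durations the sketch yields 8, 7, 7, and 0 for τ1 through τ4, matching the structure of the expressions given in the text.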

3.3 The Extended Configurable Ceiling Protocol

3.3.1 The Ceiling Table In this section, we shall extend the basic configurable ceiling protocol (BCCP) for systems with active resources. Since a task could voluntarily suspend its CPU execution until the completion of an active resource request, the tradeoff between the priority inversion management and the system utilization becomes a critical issue. Under the extended protocol, system designers are allowed to fill in the maximum number of priority inversions for each task. Furthermore, each entry in the table denotes the maximum number of priority inversions for the corresponding task caused by any access conflicts over the corresponding resource (please see Theorem 2). The ceiling table for the extended protocol is called the extended ceiling table for the rest of this paper.

Table 2. An example ceiling table for the extended protocol.

Task/Resource  R1  R2  R3  R4  R5
τ1             3   1   3   4   0
τ2             1   0   2   3   1
τ3             0   1   1   2   1
τ4             1   1   1   1   1

During the on-line operations, the table could be used to manage the number of priority inversions for each task and to derive a proper ceiling for each resource.

The main idea is as follows: The initial value of CT(τi, Rp) denotes the maximum number of priority inversions for any access conflicts of resource Rp for task τi. When CT(τi, Rp) = N at some time t, it means that τi could tolerate N additional priority inversions over any access conflicts of resource Rp. Note that only passive resources have corresponding entries in the extended ceiling table. The idea of the extended ceiling table is to have a better management of passive resources when active resources are available in the system.

After the setting of the initial values for the extended ceiling table, the system dynamically derives the ceiling of each resource in an on-line fashion: When CT(τi, Rp) = N for some N > 1, and a direct blocking occurs for τi on Rp, CT(τi, Rp) is decremented by one. The derivation of the ceiling for resource Rp only considers the entries with CT(τi, Rp) = 1 (for any task τi). The rationale behind this rule is that when CT(τi, Rp) > 1, τi could still tolerate priority inversion resulting from the access conflict of Rp. As a result, the setting of the ceiling of Rp does not need to consider the priority of τi. It is similar to the case when CT(τi, Rp) = ∗ under BCCP. In the next subsection, we shall extend BCCP with the extended ceiling table.

The ceiling derivation of passive resources is as presented in the previous paragraph. The ceiling of an active resource is defined as the maximum priority of the tasks that ever lock the resource and have not reached the dismissing point of the resource. If there is no such task in the system, then the ceiling of the active resource is the lowest priority in the system.
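The on-line ceiling derivation for passive resources can be sketched with the values of Table 2. This is a minimal illustration, not a definitive implementation: the class `ExtendedCeilingTable`, the integer priority encoding (larger means higher), and the value 0 returned when no entry equals 1 are all assumptions of this example.

```python
class ExtendedCeilingTable:
    """Counters of tolerable priority inversions per (task, passive resource)."""

    def __init__(self, entries, prio):
        self.ct = {task: dict(row) for task, row in entries.items()}
        self.prio = prio  # {task: int}, larger means higher priority (assumed)

    def ceiling(self, res):
        """Highest priority among tasks whose current entry equals 1.

        Returns 0 (assumed to lie below every task priority) if there is none.
        """
        return max(
            (self.prio[t] for t, row in self.ct.items() if row[res] == 1),
            default=0,
        )

    def on_direct_blocking(self, task, res):
        """Consume one tolerable inversion while the entry is larger than 1."""
        if self.ct[task][res] > 1:
            self.ct[task][res] -= 1


PRIO = {"tau1": 4, "tau2": 3, "tau3": 2, "tau4": 1}
TABLE_2 = {
    "tau1": {"R1": 3, "R2": 1, "R3": 3, "R4": 4, "R5": 0},
    "tau2": {"R1": 1, "R2": 0, "R3": 2, "R4": 3, "R5": 1},
    "tau3": {"R1": 0, "R2": 1, "R3": 1, "R4": 2, "R5": 1},
    "tau4": {"R1": 1, "R2": 1, "R3": 1, "R4": 1, "R5": 1},
}

ect = ExtendedCeilingTable(TABLE_2, PRIO)
print(ect.ceiling("R1"))              # 3: only tau2 and tau4 have entry 1
ect.on_direct_blocking("tau1", "R1")  # entry of tau1 drops from 3 to 2
ect.on_direct_blocking("tau1", "R1")  # entry of tau1 drops from 2 to 1
print(ect.ceiling("R1"))              # 4: tau1 now counts for the ceiling
```

After τ1 has used up its tolerated inversions on R1, the ceiling of R1 rises to the priority of τ1, so the resource is then protected as it would be under PCP.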

3.3.2 Resource Synchronization Protocol This section extends BCCP by considering active resources and a more precise management of the number of priority inversions due to each resource.

Given a system with a collection of passive resources {..., Rp, ...}, the extended ceiling table CT (), and a collection of active resources {..., R^a_g, ...}, the system always dispatches the ready task with the highest priority. Let Ceiling(R) denote the ceiling of resource R.

The lock request of a task τi on R^a_g is granted if the priority of τi is no less than Ceiling(R^a_g), and R^a_g is currently not locked. If the lock request of task τi on R^a_g is granted, then Ceiling(R^a_g) is replaced with the priority of τi. Otherwise, the request is blocked. We say that τi is directly blocked by τj if R^a_g is currently locked by some other task τj, and the priority of τi is no less than Ceiling(R^a_g) (i.e., the priority of τj). Note that if some higher-priority task τh once locked R^a_g and has not reached its dismissing point, then Ceiling(R^a_g) is higher than the priority of τi. In that case, we say that τi is directly obstructed by τh, because the lock request of τi on R^a_g is not granted.
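As a minimal sketch of the lock rule for an active resource, the following Python fragment classifies a request as granted, directly blocked, or directly obstructed. It assumes integer priorities with a larger value meaning a higher priority and 0 as the lowest priority. Unlocking and the reset of the ceiling at the dismissing point are not modeled, and the names ActiveResource and request_active are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ActiveResource:
    ceiling: int = 0               # assumption: 0 is the lowest priority
    holder: Optional[str] = None   # task currently locking the resource


def request_active(res, task, prio):
    """Return 'granted', 'blocked', or 'obstructed' for a lock request."""
    if prio < res.ceiling:
        # A higher-priority task locked the resource earlier and has not
        # reached its dismissing point yet.
        return "obstructed"
    if res.holder is not None:
        # Priority is no less than the ceiling, but the resource is locked.
        return "blocked"
    res.holder = task
    res.ceiling = prio             # the ceiling is replaced with the priority
    return "granted"


disk = ActiveResource()
r1 = request_active(disk, "t_mid", 2)    # free resource, ceiling becomes 2
r2 = request_active(disk, "t_high", 3)   # locked by t_mid
r3 = request_active(disk, "t_low", 1)    # priority below the ceiling
```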

Let Γ and γ denote the maximum priority and the minimum priority of the tasks that are currently suspending themselves, respectively. The lock request of a task τi on a passive resource Rp is granted if Rp is not locked, the priority of τi is higher than the maximum ceiling of passive resources locked by other tasks, and one of the following two conditions holds: (1) The priority of τi is higher than Γ. (2) Ceiling(Rp) is less than γ. Otherwise, the request is not granted and is postponed. The task that directly blocks τi is determined as follows:

If the priority of τi is not higher than the maximum ceiling of passive resources locked by other tasks, then τi is directly blocked by the task that locks the resource with the maximum ceiling. If the priority of τi is higher than the maximum ceiling of passive resources locked by other tasks, but Rp is locked, then τi is directly blocked by the task that locks Rp. If the priority of τi is higher than the maximum ceiling of passive resources locked by other tasks, Rp is not locked, but the two conditions presented in the previous paragraph both fail, then we say that τi suffers from an active resource obstructing. τi is said to be directly obstructed by those tasks that currently suspend themselves for accessing active resources and cause the above two conditions to fail. An active resource obstructing occurs in order to protect a higher-priority but suspending task from an extra priority inversion when the task resumes from its suspension due to the access of the active resource. As a side effect, an active resource obstructing might result in the execution of some task with a priority lower than that of τi.
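The decision procedure for a passive resource can be sketched as follows. This is an illustration rather than a definitive implementation: priorities and ceilings are assumed to be integers with a larger value meaning a higher priority, locked_by_others is assumed to map each passive resource locked by another task to that task, and when no task is currently suspended the two conditions on Γ and γ are treated as satisfied (the text leaves this case implicit). The function name request_passive is hypothetical.

```python
def request_passive(prio, resource, ceilings, locked_by_others, suspended_prios):
    """Classify a lock request on a passive resource.

    Returns ('granted', None), ('blocked', blocker), or ('obstructed', None).
    """
    # Locked passive resource with the maximum ceiling, if any.
    top = max(locked_by_others, key=lambda r: ceilings[r], default=None)
    if top is not None and prio <= ceilings[top]:
        # Not higher than the maximum ceiling: blocked by that resource's holder.
        return ("blocked", locked_by_others[top])
    if resource in locked_by_others:
        # Higher than the maximum ceiling, but the requested resource is locked.
        return ("blocked", locked_by_others[resource])
    if suspended_prios:
        big_gamma = max(suspended_prios)     # Γ
        small_gamma = min(suspended_prios)   # γ
        if not (prio > big_gamma or ceilings[resource] < small_gamma):
            # Both conditions fail: active resource obstructing.
            return ("obstructed", None)
    return ("granted", None)


ceilings = {"R1": 3, "R2": 1}
a = request_passive(2, "R2", ceilings, {"R1": "t3"}, [])
b = request_passive(4, "R1", ceilings, {"R1": "t3"}, [])
c = request_passive(2, "R2", ceilings, {}, [5])
d = request_passive(2, "R1", ceilings, {}, [3])
```

In the last call the requester has a priority lower than the suspended task and the ceiling of R1 is not below γ, so the request is postponed even though R1 is free.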

When a task τi is directly blocked by another task τj because of a request on a passive resource, τj inherits the priority of τi. Priority inheritance is transitive. When a blocking no longer exists, the corresponding task resumes the priority it had when the priority inheritance occurred. Note that when τi is directly blocked by τj because of a request on an active resource, priority inheritance is not applied.

3.3.3 Properties and Protocol Analysis The purpose of this section is to derive the maximum number of direct and indirect blocking for each task under ECCP. We shall comment on the value assignment of entry values in the extended ceiling table in a later section.

We shall first show that each extended ceiling table CT () under ECCP has a corresponding revised extended ceiling table CT⁰() such that schedules of task executions with either extended ceiling table are the same. Let θi denote the total number of accesses on all active resources by some task τi, and µi,p denote the number of requests on a passive resource Rp by τi in each period. CT⁰(τi, Rp) = CT (τi, Rp) for any task τi and any resource Rp except the following four cases:

(1) CT⁰(τi, Rp) = min(µi,p, θi) if CT (τi, Rp) > µi,p or CT (τi, Rp) > θi. It is because the number of priority inversions suffered by τi because of the requesting of Rp is bounded by µi,p and θi.

(2) CT⁰(τi, Rp) = 1 if CT (τi, Rp) > 1, and there exists an entry CT (τa, Rp) = 1 for i > a. It is because the ceiling of Rp would be higher than the priority of τi, regardless of whether CT⁰(τi, Rp) is no less than 1.

(3) CT⁰(τi, Rp) = 1 if CT (τi, Rp) > 1, and CT (τk, Rp) = 0 for all k > i. It is because no lower-priority task will lock Rp. In other words, τi would not be blocked by any lower-priority tasks because of Rp, regardless of whether CT⁰(τi, Rp) is no less than 1.

(4) CT⁰(τi, Rp) = 1 if CT (τi, Rp) > 1, and τi is the task with the lowest priority in the system. It is because the setting of CT (τi, Rp) has no impact on the blocking behavior of other tasks.

CT⁰() is called the revised extended ceiling table of a given extended ceiling table CT () based on the above revision rules. We could show the following lemma:

Lemma 2. A task is blocked by another task under ECCP with a given extended ceiling table CT () if and only if the former task is blocked by the latter task under ECCP with the revised extended ceiling table CT⁰().

Proof. The proof is similar to that of Lemma 1. □

Theorem 2. Under ECCP, no task τi could be directly blocked by lower-priority tasks for more than

M + 1 + Σ_{CT⁰(τi, Rp) > 1} (CT⁰(τi, Rp) − 1)

times in each of its periods, in the presence of M active resources.

Proof. This theorem could be proved in a similar way to that of Theorem 1: No direct blocking would be introduced to any task τi due to any access on a resource Rp if CT⁰(τi, Rp) = 0 because τi would not access Rp. Only one direct blocking could possibly be introduced to τi for all resources with CT⁰(τi, Rp) = 1 because the ceiling of Rp will prevent any other task with a priority lower than that of τi from directly blocking τi again. For any resource Rp with CT⁰(τi, Rp) > 1, a task τi might be directly blocked once whenever τi resumes its execution after waiting for the service of some active resource R^a_g. Since CT⁰(τi, Rp) is decreased by one whenever a direct blocking occurs to τi due to access on Rp, there are no more than (CT⁰(τi, Rp) − 1) direct blockings for τi due to access on Rp. Note that when CT⁰(τi, Rp) becomes one, Ceiling(Rp) is set to the priority of τi.

An active resource could also contribute one potential direct blocking to τi, because the ceiling of the active resource is raised to the priority of τi until the dismissing point of the period is reached. As a result, no task τi could be directly blocked for more than M + 1 + Σ_{CT⁰(τi, Rp) > 1} (CT⁰(τi, Rp) − 1) times in each of its periods. □
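The bound of Theorem 2 is a direct sum over one row of the revised extended ceiling table. The short Python sketch below evaluates it for a single task; the function name direct_blocking_bound and the example row are hypothetical and are not taken from the paper's figures.

```python
def direct_blocking_bound(revised_row, num_active):
    """Evaluate M + 1 + sum of (CT0 - 1) over entries with CT0 > 1.

    revised_row maps each passive resource to the task's revised entry;
    num_active is M, the number of active resources.
    """
    extra = sum(n - 1 for n in revised_row.values() if n > 1)
    return num_active + 1 + extra


# Hypothetical row: one entry of 3 contributes 2 extra direct blockings.
bound = direct_blocking_bound({"R1": 3, "R2": 0, "R3": 1}, num_active=2)
```

With two active resources and a single entry larger than one, the sketch yields 2 + 1 + (3 − 1) = 5 direct blockings per period.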

Let the maximum duration in the locking of resource Rp over all tasks be BRp, the maximum blocking time imposed on τi by those resources with CT⁰(τi, Rp) = 1 be MB (MB could be calculated in the same way as that in PCP), and the longest execution time of any sub-job that ever uses an active resource R^a_g be BR^a_g. The derivation procedure of the total blocking time imposed on a task τi is similar to that for BCCP. That is, the total is the sum of the direct blocking time

Σ_{R^a_g accessed by τi} BR^a_g + MB + Σ_{CT⁰(τi, Rp) > 1} (CT⁰(τi, Rp) − 1) · BRp

plus the indirect blocking time. Note that the total indirect blocking time can be calculated in a similar way as that shown in Section 3.2.3.
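The direct part of the blocking time can be evaluated mechanically once the three ingredients are known. The following Python sketch does so for one task; the indirect blocking time is deliberately left out. The function name and all numeric values are hypothetical and serve only to show how the terms combine.

```python
def direct_blocking_time(revised_row, passive_lock_time, active_service_time, mb):
    """Direct blocking time of one task.

    revised_row[r]          revised entry of the task for passive resource r
    passive_lock_time[r]    maximum locking duration of r over all tasks
    active_service_time[g]  longest sub-job execution time on active resource g,
                            listed only for active resources the task accesses
    mb                      blocking time from entries equal to 1 (as in PCP)
    """
    active_part = sum(active_service_time.values())
    passive_part = sum((n - 1) * passive_lock_time[r]
                       for r, n in revised_row.items() if n > 1)
    return active_part + mb + passive_part


total = direct_blocking_time(
    revised_row={"R1": 3, "R3": 1},
    passive_lock_time={"R1": 2.0, "R3": 1.0},
    active_service_time={"disk": 4.0},
    mb=1.0,
)
```

Here the result is 4.0 + 1.0 + (3 − 1) · 2.0 = 9.0 time units.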

3.4 Remark: the Setting of the BCCP and ECCP Ceiling Tables In this section, we shall provide some guidelines for setting up the ceiling table CT () for a given system. Similar to the requirements of PCP, tasks scheduled by BCCP and ECCP must have their resource requests known in advance, i.e., which resource will be used by which task, and the number of requests on each active or passive resource in a period of each task. The information on the duration of each request could further help in deriving the priority inversion time for each task and in providing schedulability analysis.

The heuristic for ECCP to set ceiling tables is as follows: First, we run a schedulability analysis, such as the Rate Monotonic Analysis, to derive the maximum blocking time tolerable to each task. Initially, the maximum number of priority inversions for each task is set as 1 (i.e., CT (τi, Rp) = 0 if τi would not use Rp; otherwise, CT (τi, Rp) = 1). Starting from the task with the highest priority, we try to increase the maximum number of direct blockings due to the access of each resource for the task under consideration. The maximum number of direct blockings for a task τi due to access over a resource Rp, i.e., CT (τi, Rp), can be increased if the resulting amount of priority inversion (because of direct or transitive blocking) is still no more than the maximum blocking time tolerable to τi. In each iteration, we could always choose the most frequently requested resource (Rp) to increase the maximum number of direct blockings for τi. As an astute reader might notice, when an entry (e.g., CT (τi, Rp)) is marked with a value larger than one, blocking might be transitively propagated to some higher-priority tasks through the entry. When increasing the value of CT (τi, Rp), we have to make sure that all involved higher-priority tasks could tolerate the blocking time that is transitively propagated to them.

Note that the method could be simply modified to set up the ceiling table for BCCP by changing all entry values that are more than one into “*”.
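The heuristic can be sketched as a greedy loop. The sketch below is only one possible reading: the blocking-time estimate (including transitive blocking) is supplied by the caller as a function, each entry is capped at the number of requests per period so that the loop terminates, and one resource is raised as far as possible before the next most frequently requested one is tried. The names set_ceiling_table, to_bccp, and blocking_time are hypothetical.

```python
def set_ceiling_table(tasks, uses, requests, tolerance, blocking_time):
    """Greedy setting of an ECCP ceiling table.

    tasks                   task names ordered from highest to lowest priority
    uses[t]                 set of passive resources used by t
    requests[t][r]          number of requests of t on r per period
    tolerance[t]            maximum blocking time tolerable to t
    blocking_time(table, t) caller-supplied estimate (assumption of this sketch)
    """
    resources = sorted({r for t in tasks for r in uses[t]})
    table = {t: {r: (1 if r in uses[t] else 0) for r in resources}
             for t in tasks}
    for idx, t in enumerate(tasks):
        affected = tasks[: idx + 1]   # t and all higher-priority tasks
        # Most frequently requested resource first; ties broken by name.
        for r in sorted(uses[t], key=lambda res: (-requests[t][res], res)):
            while table[t][r] < requests[t][r]:
                table[t][r] += 1
                if not all(blocking_time(table, a) <= tolerance[a]
                           for a in affected):
                    table[t][r] -= 1  # undo the increase and stop
                    break
    return table


def to_bccp(table):
    """Replace every entry larger than one with '*', as described for BCCP."""
    return {t: {r: ("*" if n > 1 else n) for r, n in row.items()}
            for t, row in table.items()}


# Hypothetical two-task system with a toy blocking-time estimate.
eccp = set_ceiling_table(
    tasks=["t1", "t2"],
    uses={"t1": {"R1"}, "t2": {"R1"}},
    requests={"t1": {"R1": 3}, "t2": {"R1": 2}},
    tolerance={"t1": 5, "t2": 0},
    blocking_time=lambda table, t: (table[t]["R1"] - 1) * 2,
)
bccp = to_bccp(eccp)
```

The toy estimate ignores transitive blocking; a real analysis would have to account for it when checking the higher-priority tasks.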

4 Deadlock Prevention

The flexibility for the adjustment of the maximum priority inversion number for each task introduces potential transitive blocking and deadlocks. In this section, we shall present a simple deadlock prevention approach for BCCP or ECCP.


Task/Resource   R₁   R₂   R₃   R₄   R₅
τ1               3    0    1    0    0
τ2               1    2    0    0    0
τ3               0    0    0    2    3
τ4               1    1    1    0    1

[Resource allocation graph with task vertices τ1, τ2, τ3, and τ4; drawing not recoverable.]

Fig. 5. The ceiling table and the corresponding resource allocation graph of an example system.

A resource allocation graph, as illustrated in Figure 5, is used to reflect the dependencies among tasks. If a task might access a resource, then there exists an edge between the corresponding vertices in the graph. Note that a resource allocation graph is a bipartite graph, where vertices of the same type reside on the same side. A request edge (τi → Rp) denotes that task τi is blocked over the requesting of resource Rp. An allocation edge (Rp → τi) denotes that task τi currently locks resource Rp. As shown in Figure 5, the ceiling table settings of ECCP (as well as BCCP) could not prevent a run-time waiting cycle, such as (τ1 → R1 → τ2 → R2 → τ4 → R3 → τ1), from happening. No active resource needs to be involved in the considerations of deadlocks, because a task that suspends itself to wait for the service completion of some active resource could not issue another request on some other resource. Note that even though this paper focuses on synchronous I/O, the above discussions of deadlocks remain valid even if we have asynchronous I/O. It is because asynchronous I/O would automatically release the involved active resource once the corresponding service completes.
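A run-time waiting cycle of this kind can be detected with a depth-first search over the request and allocation edges. The Python sketch below is an illustration only; it assumes that task and resource names are distinct strings, and the function name find_wait_cycle is hypothetical. The example edges encode the waiting cycle mentioned above.

```python
def find_wait_cycle(request_edges, allocation_edges):
    """Return one directed cycle as a list of vertices, or None.

    request_edges     pairs (task, resource): the task is blocked on the resource
    allocation_edges  pairs (resource, task): the task currently locks the resource
    """
    graph = {}
    for task, res in request_edges:
        graph.setdefault(task, []).append(res)
    for res, task in allocation_edges:
        graph.setdefault(res, []).append(task)

    WHITE, GRAY, BLACK = 0, 1, 2   # unvisited, on the current path, finished
    color = {}
    path = []

    def visit(v):
        color[v] = GRAY
        path.append(v)
        for w in graph.get(v, []):
            if color.get(w, WHITE) == GRAY:
                # Back edge: the cycle runs from w along the path back to w.
                return path[path.index(w):] + [w]
            if color.get(w, WHITE) == WHITE:
                found = visit(w)
                if found:
                    return found
        color[v] = BLACK
        path.pop()
        return None

    for v in list(graph):
        if color.get(v, WHITE) == WHITE:
            cycle = visit(v)
            if cycle:
                return cycle
    return None


cycle = find_wait_cycle(
    request_edges=[("t1", "R1"), ("t2", "R2"), ("t4", "R3")],
    allocation_edges=[("R1", "t2"), ("R2", "t4"), ("R3", "t1")],
)
```

Removing any one allocation edge from the example, e.g., ("R3", "t1"), breaks the cycle and the function returns None.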

Deadlock avoidance in PCP is achieved by having a sufficiently high ceiling for a resource in a potentially formed waiting cycle such that no task could lock the last resource needed in the formation of the waiting cycle. The example shown in Figure 6 illustrates this: Let resource R1 be locked by task τ3 at time t, and let τ3 request R3. Under PCP, task τ1 has no chance to lock resource R2 successfully after time t (because the ceiling of R1 is no less than the priority of τ1) and later request to lock R1. As astute readers might point out, the ceiling of a resource under BCCP or ECCP might be lower than that of the corresponding resource under PCP such that a deadlock might be formed.

In order to prevent the occurrence of deadlocks, two naive approaches might be considered: (1) a dynamic adjustment mechanism for resource ceilings to prevent any deadlock from happening in an on-line fashion, and (2) a proper off-line setting of the ceiling tables for “critical resources” so that no deadlocks could occur. The first approach is costly and might not be suitable for BCCP and ECCP because of its impact on the manipulation of the ceiling table. We focus this section on the second approach because it only involves off-line efforts in the selection of certain resources. After the ceiling table is revised for the selected resources, BCCP and ECCP would operate as defined in the previous sections, and their properties remain.

[Two graph panels with task vertices τ1, τ2, and τ3; drawing not recoverable.]

Fig. 6. The selection for critical resources and the removing of edges in a resource allocation graph.

The idea is to pick two resources per cycle in a resource allocation graph (such as R1 and R3 in Figure 6) as critical resources so that these two resources would not be locked by two different tasks at the same time. It could be achieved by setting the ceilings of these two resources as the maximum priority of the tasks that might access these two resources. Two technical issues must be addressed:

(1) How to revise a given ceiling table for BCCP/ECCP to achieve the above objective. (2) How to select a minimum collection of resources such that there are always two selected resources appearing in each cycle in the graph. Note that we should not try to find every cycle in the graph because the number of cycles could be exponential in the number of vertices in the graph.

Theorem 3. If two distinct resources in each cycle would not be locked by two different tasks at the same time, then there is no deadlock in the system.

Proof. Since there would be no wait-for cycle of tasks, there is no deadlock. □

We shall first address the first technical issue, i.e., how to revise a given ceiling table for BCCP/ECCP: Let Rp and Rq be two resources selected in a cycle, and τi and τj be the highest-priority tasks with CT (τi, Rp) ≥ 1 and CT (τj, Rq) ≥ 1, respectively. Without loss of generality, let the priority of τi be higher than that of τj. The revising of the ceiling table could be simply done by setting both CT (τi, Rp) and CT (τi, Rq) as 1.
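The revision step for one selected pair of critical resources can be sketched as follows. The sketch assumes numeric entries only (no “*” entries), a task list ordered from highest to lowest priority, and that the case τi = τj is handled in the same way; the function name revise_for_critical_pair is hypothetical. The example uses the ceiling table of Figure 5 with R1 and R3 as the selected pair.

```python
def revise_for_critical_pair(table, tasks, rp, rq):
    """Return a copy of `table` revised for the critical resources rp and rq.

    table[t][r]  ceiling-table entry of task t for resource r (numeric)
    tasks        task names ordered from highest to lowest priority
    """
    def highest_user(res):
        # Highest-priority task whose entry for `res` is at least 1.
        return next((t for t in tasks if table[t].get(res, 0) >= 1), None)

    ti, tj = highest_user(rp), highest_user(rq)
    revised = {t: dict(row) for t, row in table.items()}
    if ti is None or tj is None:
        return revised              # one of the resources is never used
    top = ti if tasks.index(ti) <= tasks.index(tj) else tj
    revised[top][rp] = 1
    revised[top][rq] = 1
    return revised


fig5 = {
    "t1": {"R1": 3, "R2": 0, "R3": 1, "R4": 0, "R5": 0},
    "t2": {"R1": 1, "R2": 2, "R3": 0, "R4": 0, "R5": 0},
    "t3": {"R1": 0, "R2": 0, "R3": 0, "R4": 2, "R5": 3},
    "t4": {"R1": 1, "R2": 1, "R3": 1, "R4": 0, "R5": 1},
}
revised = revise_for_critical_pair(fig5, ["t1", "t2", "t3", "t4"], "R1", "R3")
```

In this example the entry of the highest-priority task for R1 drops from 3 to 1, so the ceilings of both R1 and R3 are at least the priority of that task from the start.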