Two-Version-Based Concurrency Control and Recovery in Real-Time Client/Server Databases∗†

Tei-Wei Kuo, Yuan-Ting Kao, and Chin-Fu Kuo

Department of Computer Science and Information Engineering National Taiwan University, Taipei, Taiwan 106, ROC

Department of Computer Science and Information Engineering National Chung Cheng University, Chiayi, Taiwan 621, ROC

Abstract

While there has been a significant amount of research in real-time concurrency control, little work has been done in logging and recovery for real-time databases. This paper proposes a two-version approach which considers both real-time concurrency control and recovery.

We propose a network-server-based architecture and algorithms which not only reduce the blocking time of higher-priority transactions and improve the response time of client-side read-only transactions but also provide a diskless run-time logging mechanism and an efficient and predictable recovery procedure. The performance of the algorithms was verified by a series of simulation experiments that compared them with the well-known Priority Ceiling Protocol (PCP), the Read/Write PCP, the New PCP, and the 2-version two-phase locking protocol, with very encouraging results. The schedulability of higher-priority transactions and the response time of client-side read-only transactions were both greatly improved.

KEY WORDS AND PHRASES: Real-Time Database, Concurrency Control, Recovery, Read-Only Transactions, Client/Server Databases, Logging, Write Through Procedure.

∗ Supported in part by research grants from the National Science Council under Grants NSC87-2213-E-194-018 and NSC87-2213-E-309-002.

† This paper is an extended version of a paper in the IEEE Third International High Assurance Systems Engineering Symposium. Yuan-Ting Kao graduated from the National Chung Cheng University under the thesis supervision of Tei-Wei Kuo.

1 Introduction

Real-time concurrency control has been an active research topic in the past decades. A number of researchers [5, 6, 7, 14, 16, 21, 17, 22, 18, 26, 27, 25, 30, 32] have proposed various effective mechanisms for the concurrency control of real-time data access. In particular, semantics-based concurrency control [6, 7, 16, 19, 26, 27, 25, 32] has been shown to improve the system schedulability significantly. Researchers have also explored issues in processing read-only transactions, such as those [18, 19] based on the idea of dynamic adjustment of serializability order [22].

As issues in real-time concurrency control are better understood, the demand for system reliability is increasing. Although a lot of research [8, 12, 11, 24] has been done in logging and recovery for traditional databases, little work explores logging and recovery for real-time databases [13, 31]. Few studies consider real-time concurrency control and durability together. In particular, Gupta, Haritsa, and Ramamritham [10] proposed a new real-time commit protocol which allows transactions to “optimistically” borrow uncommitted data in a controlled manner to minimize the number of deadline violations. Sivasankaran, Ramamritham, and Stankovic [31] proposed a partitioned logging and recovery algorithm for real-time disk-resident databases. The log is partitioned according to data classes, such as critical and temporal ones, to provide parallel logging and recovery. Non-volatile RAM-based devices are used to reduce the unpredictability of real-time databases. Huang and Gruenwald [13] proposed a checkpointing technique for real-time main memory databases. A database is partitioned according to data types (persistent type vs. temporal type) and update frequencies. The system checkpoints each partition independently based on its update frequency and its temporal valid interval.

Besides the scarcity of work on logging and recovery for real-time databases, the close relationship between real-time concurrency control and recovery (and logging) has been ignored in the past decade. Note that a schedule is recoverable only if no transaction $\tau$ commits before any transaction from which $\tau$ reads commits. A real-time concurrency control protocol should ensure that conflicting transactions commit in the order of their read-from relationship. This paper proposes an integrated mechanism for concurrency control and recovery in real-time databases. A two-version-based concurrency control protocol, called the Two-Version Priority Ceiling Protocol (2VPCP), is proposed to reduce the blocking time of higher-priority transactions based on the idea of dynamic serializability adjustment [17, 19, 22] without relying on local data updates for transactions [17, 19, 22]. The 2VPCP protocol is then extended to a distributed environment to process read-only transactions at client-side systems locally. The resulting system can not only significantly improve the response time of read-only transactions issued at client-side systems, but also virtually eliminate the interference of conflicting data accesses between client-side read-only transactions and server-side transactions. The extended 2VPCP protocol not only associates each client-side system with a consistent database image for local processing of read-only transactions, but also provides an efficient recovery mechanism. The performance of the algorithms was verified by a series of simulation experiments, for which we had very encouraging results. Comparisons of different recovery mechanisms are also presented to demonstrate the capability of the two-version approach.

There are two major contributions in this paper: (1) The effectiveness of the two-version approach is shown in reducing the blocking time of higher-priority transactions and in improving the response time of client-side read-only transactions. Note that the results of this paper are orthogonal to previous research in processing read-only transactions [18, 19], which considers weaker correctness criteria or access patterns of transactions. All transactions in our system are serializable. (2) A two-version (network-server-based) architecture is proposed to not only support a diskless run-time logging mechanism and an effective write-through procedure, but also provide an efficient and predictable recovery mechanism. The logging mechanism and the write-through procedure have virtually no impact on the executions of transactions in the system.

The rest of this paper is organized as follows: Section 2 extends the Read/Write Priority Ceiling Protocol (RWPCP) [30] into a two-version-based protocol called the Two-Version Priority Ceiling Protocol (2VPCP). The properties of the 2VPCP protocol are then proven. Section 3 further extends the 2VPCP protocol to a distributed environment to locally and efficiently process read-only transactions at client-side systems. The correctness of the extended protocol is proven. Section 4 proposes an efficient and predictable recovery mechanism based on the extended 2VPCP protocol. Section 5 provides experimental results which demonstrate the performance of the algorithms. Section 6 is the conclusion.

2 The 2VPCP Protocol

2.1 Overview

The Read/Write Priority Ceiling Protocol (RWPCP) [30] has shown the effectiveness of using read and write semantics in improving the performance of the Priority Ceiling Protocol (PCP) [29] in real-time concurrency control. While PCP only allows exclusive locks on data objects, RWPCP introduces a write priority ceiling $WPL_i$ and an absolute priority ceiling $APL_i$ for each data object $O_i$ to emulate shared and exclusive locks, respectively. The write priority ceiling $WPL_i$ of data object $O_i$ is equal to the highest priority of transactions which may write $O_i$. The absolute priority ceiling $APL_i$ of data object $O_i$ is equal to the highest priority of transactions which may read or write $O_i$. When data object $O_i$ is read-locked, the read/write priority ceiling $RWPL_i$ of $O_i$ is equal to $WPL_i$. When data object $O_i$ is write-locked, $RWPL_i$ is equal to $APL_i$. A transaction instance may lock a data object only if its priority is higher than the highest read/write priority ceiling of the data objects locked by other transaction instances. When a data object $O_i$ is write-locked, setting $RWPL_i$ to $APL_i$ prevents any other transaction instance from locking $O_i$ in any mode. When a data object $O_i$ is read-locked, setting $RWPL_i$ to $WPL_i$ only allows a transaction instance with a sufficiently high priority to read-lock $O_i$, which constrains the number of priority inversions for any transaction instance which may write-lock $O_i$.
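The ceiling definitions above can be sketched in a few lines of Python. This is only an illustrative sketch, not code from the paper: the function names, the dictionary-based access sets, and the convention that a smaller integer denotes a higher priority are all assumptions made here.

```python
def static_ceilings(may_read, may_write, priority):
    """Compute (WPL, APL) for every data object.

    may_read / may_write map an object name to the set of transactions that
    may read / write it. priority maps a transaction to an integer, where a
    SMALLER number means a HIGHER priority (an assumption of this sketch).
    """
    ceilings = {}
    for obj in set(may_read) | set(may_write):
        writers = may_write.get(obj, set())
        readers = may_read.get(obj, set())
        # WPL: highest priority among transactions that may write the object.
        wpl = min((priority[t] for t in writers), default=None)
        # APL: highest priority among transactions that may read or write it.
        apl = min((priority[t] for t in readers | writers), default=None)
        ceilings[obj] = (wpl, apl)
    return ceilings


def rwpcp_dynamic_ceiling(lock_mode, wpl, apl):
    """RWPL of a locked object under RWPCP: WPL if read-locked, APL if write-locked."""
    if lock_mode == "R":
        return wpl
    if lock_mode == "W":
        return apl
    raise ValueError("RWPCP only has read and write locks")


def may_lock(priority, ceilings_locked_by_others):
    """A request succeeds only if the priority is strictly higher than every ceiling."""
    return all(priority < ceiling for ceiling in ceilings_locked_by_others)
```

With no objects locked by other transaction instances, `may_lock` returns `True` for any priority, which matches the rule that a request is compared only against ceilings of objects held by others.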

Lam and Hung [17] further sharpened the RWPCP by applying dynamic adjustment of serializability order to hard real-time transactions; Lin and Son [22] had proposed the idea for optimistic real-time concurrency control. A delayed write procedure requires every transaction instance to only update data objects in its local space and to delay the updating of the database until the commitment of the transaction instance. With a delayed write procedure, a higher-priority transaction instance may preempt a lower-priority transaction instance by using the Thomas Write Rule when a write-write conflict exists. The read-write conflict between conflicting transaction instances is partially resolved by allowing a higher-priority transaction instance to read the database even though a lower-priority transaction instance has write-locked the data object. Because updates stay in local space until commitment, this preemption in a read-write conflict lets the higher-priority transaction instance precede the lower-priority transaction instance in the serializability order.

Although the new protocol introduced by Lam and Hung [17] significantly reduces the blocking time of higher-priority transaction instances under RWPCP, every transaction instance may need extra space to keep its own local copy for each of its updated data objects because of the delayed write procedure. On the other hand, only the response time of higher-priority transactions is improved, and the executions of read-only transaction instances tend to be serialized. Little work, including [17, 30], has been done in considering recovery when concurrency control protocols are proposed.

Requested \ Locked    R      W      C
R                     yes    yes    no
W                     yes    no     no
C                     no     no     no

Table 1: The compatibility matrix of locks.
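As a small illustration, Table 1 can be encoded as a lookup from (requested mode, held mode) to a compatibility flag. The names `COMPATIBLE` and `can_grant` are hypothetical, and the sketch covers only the matrix itself, not the priority-ceiling test that 2VPCP uses to enforce it.

```python
# Keys are (requested mode, mode already held by another transaction).
COMPATIBLE = {
    ("R", "R"): True,  ("R", "W"): True,  ("R", "C"): False,
    ("W", "R"): True,  ("W", "W"): False, ("W", "C"): False,
    ("C", "R"): False, ("C", "W"): False, ("C", "C"): False,
}


def can_grant(requested, held_by_others):
    """Return True if `requested` is compatible with every lock held by others."""
    return all(COMPATIBLE[(requested, held)] for held in held_by_others)
```

For instance, `can_grant("R", ["W"])` is true because a reader uses the consistent version while a writer updates the working version, whereas `can_grant("C", ["R"])` is false.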

This paper proposes a two-version approach which considers both real-time concurrency control and recovery. We propose to use the idea of two-version databases to replace a delayed write procedure to save the extra space needed by the procedure. The goal is to first propose a two-version variation of the RWPCP [30] to have the flexibility in the dynamic adjustment of transaction serializability order to favor higher-priority transactions and read-only transactions. We will then extend the protocol and the idea of two-version databases into distributed environments for efficient and local processing of read-only transactions and provide efficient and predictable failure recovery. Note that little work has been done for concurrency control in distributed real-time environments, e.g., [20, 15, 28]. Since there can be a large number of read-only transactions in many commercial database systems, how to improve the response time of read-only transactions is of paramount importance.

We assume that a transaction system consists of a fixed set of transactions. (This condition will be relaxed when local processing of read-only transactions is considered in Section 3.) Each data object has two versions: a consistent version and a working version, where the consistent version contains a data value updated by a committed transaction instance, and the working version contains a data value updated by an uncommitted transaction instance. There are three kinds of locks in the system: read, write, and certify.

Before a transaction reads (or writes) a data object, it must first read-lock (or write-lock) the data object. A read operation on a data object always reads from the consistent version of the data object. A write operation on a data object always updates the working version of the data object. Before a transaction commits, it must transform each of its write locks into a certify lock on the same data object. As soon as a transaction obtains a certify lock on a data object, it can copy its updated working version of the data object to the consistent version. There is no requirement on the order or timing of lock transformations. The transformation of a write lock into a certify lock is considered as requesting a new certify lock. If the request of a certify lock by a transaction instance is not granted, the transaction is blocked by the system until the request is granted. When a transaction terminates, it must release all of its locks. The compatibility matrix of locks is shown in Table 1. (The well-known Two-Version Two-Phase Locking scheme has the same compatibility matrix [4, 9]. Note that the Two-Version Two-Phase Locking scheme cannot bound the number of priority inversions of a real-time transaction to one and may suffer from the deadlock problem.) A certify lock is stronger than a write lock, and a write lock is stronger than a read lock. All transactions follow the two-phase locking (2PL) scheme. The details will be shown in later sections. Compared to the Read/Write Priority Ceiling Protocol (RWPCP) [30], the two-version locking mechanism gives higher-priority (and read-only) transactions better opportunities to preempt lower-priority transactions. However, this comes at the cost of extra certify locks and the copying of the updated working version of data objects to the consistent version. The number of certify locks which must be obtained by a committing transaction is the same as the number of write locks already obtained by the committing transaction. The cost of copying the updated working version of data objects to the consistent version is also proportional to the number of write locks already obtained by the committing transaction.
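The read, write, and certify operations described above can be pictured with a minimal two-version data object. The class and method names below are assumptions made for illustration, and locking is deliberately left out: the sketch assumes the caller already holds the appropriate read, write, or certify lock.

```python
class TwoVersionObject:
    """A data object with a consistent version and a working version (sketch)."""

    def __init__(self, value):
        self.consistent = value    # value written by a committed transaction
        self.working = None        # value written by an uncommitted transaction
        self.has_working = False

    def read(self):
        # Reads always come from the consistent version.
        return self.consistent

    def write(self, value):
        # Writes always go to the working version.
        self.working = value
        self.has_working = True

    def certify(self):
        # Under a certify lock, copy the working version to the consistent one.
        if self.has_working:
            self.consistent = self.working
        self.discard_working()

    def discard_working(self):
        # An aborting writer simply drops its working version.
        self.working = None
        self.has_working = False
```

A reader that runs between `write` and `certify` still sees the old consistent value, which is what lets a higher-priority reader proceed past a lower-priority writer.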

Now, we will state our notation.

Notation:

• $\tau_{i,j}$ denotes the $j$th instance of transaction $\tau_i$. $p_i$ and $c_i$ are the period and worst-case computation time of transaction $\tau_i$, respectively. If transaction $\tau_i$ is aperiodic, $p_i$ is the minimal separation time between its consecutive requests. When there is no ambiguity, we use the terms “transaction” and “transaction instance” interchangeably.

• $R_{i,j}$ denotes the $j$th request of transaction $\tau_i$. A transaction instance $\tau_{i,j}$ is initiated for each request of transaction $\tau_i$. Once transaction instance $\tau_{i,j}$ is aborted, $\tau_{i,j}$ may be restarted or terminated, as required by the selected scheduling algorithm.

• The $k$th critical section of a transaction instance $\tau_{i,j}$ is denoted as $z_{i,j,k}$ and corresponds to the code segment between the $k$th locking operation and its corresponding unlocking operation. We assume in this paper that critical sections are properly nested. In other words, if the locking operation of a semaphore is no later than the locking operation of another semaphore within a transaction instance, the corresponding unlocking operation of the former semaphore is no earlier than the corresponding unlocking operation of the latter semaphore. Note that this is one of the assumptions of PCP in handling the priority inversion problem.

• $W(O_i)$ and $C(O_i)$ denote the working version and consistent version of data object $O_i$, respectively.

2.2 The Basic 2VPCP Protocol

The Two-Version Priority Ceiling Protocol (2VPCP) is a two-version variation of the Read/Write Priority Ceiling Protocol [30]. The rationale behind the design of the 2VPCP protocol is to have flexibility in the adjustment of transaction serializability order to favor higher-priority transactions and read-only transactions. In later sections, we shall then extend the 2VPCP protocol into distributed environments for local processing of read-only transactions and efficient failure recovery.

In this section, we are interested in the context of uniprocessor priority-driven preemptive scheduling, and every transaction has a fixed priority. (This condition will be relaxed when local processing of read-only transactions is considered in Section 3.) The real-time database can be either memory-resident or disk-resident. As defined in [30], the write priority ceiling $WPL_i$ of data object $O_i$ is equal to the highest priority of transactions which may write $O_i$. The absolute priority ceiling $APL_i$ of data object $O_i$ is equal to the highest priority of transactions which may read or write $O_i$. Since 2VPCP adopts a two-data-version approach and introduces a new lock called the certify lock, the setting of the read/write priority ceiling $RWPL_i$ of each data object $O_i$ is modified as follows: $RWPL_i$ is set dynamically. When a transaction read-locks or write-locks $O_i$, $RWPL_i$ is equal to $WPL_i$. When a transaction certify-locks $O_i$, $RWPL_i$ is equal to $APL_i$. Note that any read operation on a data object always reads from the consistent version of the data object, and any write operation on a data object always writes into the working version of the data object. Before a transaction commits, it must transform each of its write locks into a certify lock on the same data object. A certify lock on a data object secures the copying of the data value from its working version into the consistent version. No lock transformation is required for a read lock.

The rationale behind the setting of priority ceilings is as follows: When data object $O_i$ is write-locked, $RWPL_i$ is set as $WPL_i$ to prevent any subsequent transaction from write-locking $O_i$ because $WPL_i$ is equal to the highest priority of transactions which may write $O_i$. Note that there is only one working version for each data object. When data object $O_i$ is certify-locked, $RWPL_i$ is set as $APL_i$ so that no other subsequent transactions can lock $O_i$ in any mode. This is to secure the copying of the data value from the working version of $O_i$ into its consistent version. When data object $O_i$ is read-locked, $RWPL_i$ is set as $WPL_i$ so that only transactions which have a priority higher than $WPL_i$ can read-lock $O_i$ afterward. This constraint is to prevent any transaction which might later write-lock $O_i$ from being blocked by more than one lower-priority transaction which read-locks $O_i$.

We now present the definition of 2VPCP:

1. A transaction instance, which has the highest priority among all ready transaction instances, is assigned the processor. If a transaction instance does not attempt to lock any data object, the transaction instance can preempt the execution of any transaction instance with a lower priority, whether or not the priorities are assigned or inherited. (Priority inheritance will be defined later.)

2. When a transaction instance $\tau_{i,j}$ attempts to read-lock, write-lock, or certify-lock a data object $O_k$, the priority of $\tau_{i,j}$ must be higher than the read/write priority ceilings of all data objects currently locked by transaction instances other than $\tau_{i,j}$; otherwise, the lock request is blocked. If the priority of $\tau_{i,j}$ is higher than the read/write priority ceilings of all data objects currently locked by transaction instances other than $\tau_{i,j}$, there are three cases to consider:

(a) If $\tau_{i,j}$ requests a read lock on $O_k$, then $\tau_{i,j}$ read-locks $O_k$, and the read/write priority ceiling $RWPL_k$ of data object $O_k$ is set as $WPL_k$.

(b) If $\tau_{i,j}$ requests a write lock on $O_k$, then $\tau_{i,j}$ write-locks $O_k$, and the read/write priority ceiling $RWPL_k$ of data object $O_k$ is set as $WPL_k$.

(c) If $\tau_{i,j}$ requests a certify lock on $O_k$, then $\tau_{i,j}$ certify-locks $O_k$, and the read/write priority ceiling $RWPL_k$ of data object $O_k$ is set as $APL_k$. Note that $\tau_{i,j}$ must have write-locked $O_k$ before it requests a certify lock on $O_k$, and both $APL_k$ and $WPL_k$ are no less than the priority of $\tau_{i,j}$.

If the priority of $\tau_{i,j}$ is no higher than the highest read/write priority ceiling among the data objects currently locked by transaction instances other than $\tau_{i,j}$, then the lock request is blocked. Let $O$ be the data object with the highest read/write priority ceiling of all data objects currently locked by transaction instances other than $\tau_{i,j}$. If $\tau_{i,j}$ is blocked because of $O$, $\tau_{i,j}$ is said to be blocked by the transaction instance that locked $O$.

3. A transaction instance $\tau_{i,j}$ uses its assigned priority, unless it locks some data objects and blocks higher-priority transaction instances. If $\tau_{i,j}$ blocks higher-priority transaction instances, it inherits the highest priority of the transaction instances blocked by $\tau_{i,j}$. When a transaction instance unlocks a data object, it resumes the priority it had at the point of obtaining the lock on the data object. When a transaction instance is aborted, all transaction instances which inherit its priority must reset their priorities according to the definition of priority inheritance. The priority inheritance is transitive. Note that the resetting of priority inheritance can be efficiently implemented by using a stack data structure because there is no transitive blocking in transaction executions (please see Lemma 2). We refer interested readers to [29] for details.

4. All transaction instances follow a 2PL scheme. That is, no transaction instance is allowed to obtain any new lock after it releases any locks.
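Rule 2 of the definition above can be sketched as a small lock table. This is a hedged illustration rather than the authors' implementation: the `LockTable` class, its method names, and the smaller-number-is-higher-priority convention are assumptions, and scheduling, priority inheritance (rule 3), and the 2PL check (rule 4) are omitted.

```python
from dataclasses import dataclass, field


@dataclass
class LockTable:
    wpl: dict  # object -> write priority ceiling (smaller number = higher priority)
    apl: dict  # object -> absolute priority ceiling
    held: dict = field(default_factory=dict)  # (txn, obj) -> "R", "W", or "C"

    def rwpl(self, obj, mode):
        # Read and write locks set RWPL to WPL; a certify lock sets it to APL.
        return self.apl[obj] if mode == "C" else self.wpl[obj]

    def request(self, txn, priority, obj, mode):
        """Grant and record the lock if rule 2 allows it; return False if blocked."""
        if mode == "C" and self.held.get((txn, obj)) != "W":
            raise ValueError("a certify lock must transform the requester's own write lock")
        # Ceilings of all data objects currently locked by OTHER transaction instances.
        ceilings = [self.rwpl(o, m) for (t, o), m in self.held.items() if t != txn]
        # The requester's priority must be strictly higher than every such ceiling.
        if any(priority >= ceiling for ceiling in ceilings):
            return False
        self.held[(txn, obj)] = mode
        return True

    def release_all(self, txn):
        # A terminating transaction releases all of its locks.
        self.held = {key: m for key, m in self.held.items() if key[0] != txn}
```

Note that the table never consults the compatibility matrix directly; as the text says next, compatibility is enforced implicitly through the ceilings.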

The lock compatibility matrix, as shown in Table 1, is implicitly verified through priority ceilings (please see Lemmas 3, 4, and 5). A transaction instance is before another transaction instance in the serializability order if any of the following conditions is satisfied: (a) the latter reads from the consistent version of any data object updated by the former; (b) the former and the latter update the same data object, and the former write-locks the data object first. Theorem 4 shows that all 2VPCP schedules are serializable. Theorem 5 also shows that the serializability order of transaction instances is the same as their begin unlock message order. The priority ceilings are used to control priority inversion in the system.

A similar approach can be found in [30]. A transaction may be aborted because of a deadline violation. However, transaction aborting incurs low overheads because executing transactions update only the working version of any data object. When a transaction commits, the transaction transforms each of its write locks into a certify lock on the same data object and copies its updated working version of the data object to the consistent version. The aborting of a transaction simply discards the working version, and any subsequent transaction can simply overwrite the working version. Since all transactions read from the consistent version of data objects, no cascading aborting is possible. Furthermore, transaction aborts cannot cause cascaded resetting of inherited priorities because there is no transitive blocking, as shown in Lemma 2 in the next section.

Example 1 A 2VPCP Schedule:

We illustrate the 2VPCP protocol by an example. Suppose that there are three transactions $\tau_1$, $\tau_2$, and $\tau_3$ in a uniprocessor environment. Let the priorities of $\tau_1$, $\tau_2$, and $\tau_3$ be 1, 2, and 3, respectively, where 1 is the highest, and 3 is the lowest. Suppose that $\tau_1$ and $\tau_2$ may read and write data object $S_1$, respectively, and $\tau_2$ and $\tau_3$ may read and write data object $S_2$, respectively. According to the definitions of ceilings, the write priority ceiling $WPL_1$ and the absolute priority ceiling $APL_1$ of $S_1$ are 2 and 1, respectively. The write priority ceiling $WPL_2$ and the absolute priority ceiling $APL_2$ of $S_2$ are 3 and 2, respectively.

At time 0, $\tau_3$ starts execution. At time 2, $\tau_3$ write-locks $S_2$ successfully, and $RWPL_2 = WPL_2 = 3$. At time 4, $\tau_2$ arrives and preempts $\tau_3$. At time 6, $\tau_2$ write-locks $S_1$ successfully because the priority of $\tau_2$ is higher than $RWPL_2 = 3$, and $RWPL_1$ is set to $WPL_1 = 2$. At time 8, $\tau_2$ read-locks $S_2$ successfully because the priority of $\tau_2$ is higher than $RWPL_2$ ($RWPL_2 = WPL_2 = 3$). Note that $\tau_3$ is behind $\tau_2$ in the serializability order although $\tau_3$ write-locks $S_2$ before $\tau_2$ read-locks $S_2$. At time 11, $\tau_1$ arrives and preempts $\tau_2$. At time 13, $\tau_1$ read-locks $S_1$ successfully because the priority of $\tau_1$ is higher than $RWPL_1$ and $RWPL_2$; $RWPL_1$ remains equal to $WPL_1 = 2$. Note that $\tau_2$ is behind $\tau_1$ in the serializability order although $\tau_2$ write-locks $S_1$ before $\tau_1$ read-locks $S_1$. $\tau_1$ then unlocks $S_1$ and commits at time 17 and 19, respectively. Right before time 21, $\tau_2$ certify-locks $S_1$ successfully and copies the working version of $S_1$ into the consistent version because the priority of $\tau_2$ is higher than $RWPL_2$ ($RWPL_2 = WPL_2 = 3$). At time 21, $\tau_2$ unlocks $S_2$. At time 23, $\tau_2$ unlocks $S_1$. At time 25, $\tau_2$ commits, and $\tau_3$ resumes its execution. Right before time 28, $\tau_3$ certify-locks $S_2$ successfully and copies the working version of $S_2$ into the consistent version. At time 30, $\tau_3$ commits.

Figure 1: A 2VPCP schedule (timelines of $\tau_1$, $\tau_2$, and $\tau_3$ over time 0 to 32, annotated with their R_lock, W_lock, C_lock, and Unlock operations on $S_1$ and $S_2$)

For comparison, let us schedule these transactions according to the Read/Write Priority Ceiling Protocol (RWPCP), where a single version per data object is considered. As shown in Figure 2, the write-lock request of $\tau_2$ on $S_1$ is rejected at time 6 because the priority of $\tau_2$ is no higher than $RWPL_2 = APL_2 = 2$. The request is rejected under the RWPCP protocol because $\tau_2$ may later read $S_2$, and the read would leave $\tau_2$ behind $\tau_3$ in the serializability order. As a result, $\tau_2$ is blocked by $\tau_3$. Note that the 2VPCP protocol lets $\tau_2$ preempt $\tau_3$, and $\tau_2$ reads from the consistent version of $S_2$. As a result, $\tau_2$ is not blocked by $\tau_3$ under the 2VPCP protocol. Figure 2 also shows the blocking of $\tau_1$ at time 13 under the RWPCP protocol.

Figure 2: An RWPCP schedule (timelines of $\tau_1$, $\tau_2$, and $\tau_3$ over time 0 to 32, annotated with their R_lock, W_lock, and Unlock operations on $S_1$ and $S_2$)

This example demonstrates one of the goals in designing the 2VPCP protocol: a higher-priority transaction instance can use the consistent version of a data object without being blocked by a lower-priority transaction instance because of read/write conflicts. The serializability order of transaction instances is no longer determined by the order of the conflicting lock requests and can be adjusted according to the priorities of the transaction instances. □
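The difference between the two protocols at time 6 of Example 1 can be checked with a short sketch. The ceilings are those stated in the example; the helper names and the smaller-number-is-higher-priority encoding are assumptions made here for illustration.

```python
# Ceilings from Example 1 (priority 1 is the highest, 3 the lowest).
WPL = {"S1": 2, "S2": 3}
APL = {"S1": 1, "S2": 2}


def ceiling_2vpcp(obj, mode):
    # 2VPCP: read and write locks raise RWPL to WPL, certify locks to APL.
    return APL[obj] if mode == "C" else WPL[obj]


def ceiling_rwpcp(obj, mode):
    # RWPCP: read locks raise RWPL to WPL, write locks to APL.
    return APL[obj] if mode == "W" else WPL[obj]


def granted(priority, locks_held_by_others, ceiling):
    """True if `priority` is strictly higher than every ceiling set by others."""
    return all(priority < ceiling(obj, mode) for obj, mode in locks_held_by_others)


# Time 6: tau_3 holds a write lock on S2, and tau_2 (priority 2) requests S1.
held_at_6 = [("S2", "W")]
print(granted(2, held_at_6, ceiling_2vpcp))  # granted under 2VPCP
print(granted(2, held_at_6, ceiling_rwpcp))  # blocked under RWPCP

# Time 13 under 2VPCP: tau_1 (priority 1) requests a read lock on S1.
held_at_13 = [("S2", "W"), ("S1", "W"), ("S2", "R")]
print(granted(1, held_at_13, ceiling_2vpcp))
```

The only change between the two ceiling functions is which lock mode raises the ceiling to $APL$; moving that step from the write lock to the short-lived certify lock is what lets $\tau_2$ proceed at time 6.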

2.3 Properties

Lemma 1 A (higher-priority) transaction instance $\tau_H$ can be blocked by another (lower-priority) transaction instance $\tau_L$ only if $\tau_L$ is executing in a critical section which later blocks $\tau_H$ (when $\tau_H$ is initiated).

Proof. According to the definitions of the 2VPCP protocol, $\tau_L$ can block $\tau_H$ only if $\tau_L$ directly blocks $\tau_H$ because of a lock request, or $\tau_L$ inherits a priority higher than the priority of $\tau_H$. In either case, $\tau_L$ must be in a critical section to block $\tau_H$. Furthermore, if $\tau_L$ is not executing in a critical section when $\tau_H$ is initiated, then $\tau_H$ can preempt $\tau_L$ because the priority of $\tau_L$ is lower than the priority of $\tau_H$. □

Definition 1 [29] Transitive blocking is said to occur if a (higher-priority) transaction instance is directly blocked by another (lower-priority) transaction instance which, in turn, is directly blocked by a third (even lower-priority) transaction instance.

Lemma 2 No transitive blocking is possible.

Proof. This lemma can be proven by contradiction. Suppose that a transitive blocking happens among three distinct transaction instances $\tau_1$, $\tau_2$, and $\tau_3$. Let $\tau_1$ directly block $\tau_2$, and $\tau_2$ directly block $\tau_3$. According to Lemma 1, $\tau_1$ and $\tau_2$ must be executing in critical sections to block $\tau_2$ and $\tau_3$, respectively. Let $\tau_1$ and $\tau_2$ be executing in critical sections $z_{1,i}$ and $z_{2,j}$, respectively, when the transitive blocking occurs. Since $\tau_1$ blocks $\tau_2$, $\tau_1$ must enter critical section $z_{1,i}$ before $\tau_2$ is initiated; otherwise, $\tau_2$ will not be directly blocked by $\tau_1$. By the definitions of the 2VPCP protocol, the read/write priority ceiling $RWPL_k$ of the data object $O_k$ locked by $\tau_1$ when $\tau_1$ enters $z_{1,i}$ should be no lower than the priority of $\tau_2$. However, when $\tau_2$ requests a lock to enter critical section $z_{2,j}$ (which blocks $\tau_3$ later), its priority should still be no higher than the read/write priority ceiling $RWPL_k$ of the data object $O_k$. In other words, $\tau_2$ will not be allowed to enter critical section $z_{2,j}$ until $\tau_1$ leaves $z_{1,i}$, and the transitive blocking should not occur. □

Theorem 1 2VPCP is deadlock-free.

Proof. Since there is no transitive blocking (please see Lemma 2), a deadlock can only happen between two transaction instances. Let two distinct transaction instances $\tau_1$ and $\tau_2$ form a deadlock, and let $\tau_1$ enter critical section $z_{1,i}$, which blocks $\tau_2$, before $\tau_2$ enters critical section $z_{2,j}$, which blocks $\tau_1$. Because critical section $z_{1,i}$ blocks $\tau_2$, the read/write priority ceiling $RWPL_k$ of the data object $O_k$ locked by $\tau_1$ when $\tau_1$ enters $z_{1,i}$ should be no lower than the priority of $\tau_2$. In other words, the lock request of $\tau_2$ to enter critical section $z_{2,j}$ should not succeed until $\tau_1$ leaves $z_{1,i}$, and no deadlock should occur. □

Theorem 2 The maximum number of priority inversions per transaction instance is one.

Proof. Let a transaction instance $\tau_H$ be blocked by two distinct lower-priority transaction instances $\tau_L$ and $\tau_{L'}$. Since there is no transitive blocking (please see Lemma 2), $\tau_L$ and $\tau_{L'}$ must be executing in critical sections $z_{L,i}$ and $z_{L',j}$ to directly block $\tau_H$, respectively. Let $\tau_L$ enter critical section $z_{L,i}$ before $\tau_{L'}$ enters critical section $z_{L',j}$. Since critical section $z_{L,i}$ blocks $\tau_H$, critical section $z_{L,i}$ should also block $\tau_{L'}$ because the priority of $\tau_H$ is higher than the priority of $\tau_{L'}$. In other words, $\tau_{L'}$ should not enter critical section $z_{L',j}$ to directly block $\tau_H$ until $\tau_L$ leaves $z_{L,i}$, and the maximum number of priority inversions per transaction instance should not be more than one. □

Note that when transactions do abort, Theorems 1 and 2 remain correct, provided that aborted transactions unlock their data objects. This is because aborted transactions will not be in a deadlock cycle or introduce any further priority inversion.

We shall first prove Lemmas 3, 4, and 5 to show that all of the 2VPCP schedules comply with the compatibility matrix shown in Table 1:

Lemma 3 When data object $O_k$ is read-locked by a transaction instance $\tau$, no transaction instance can certify-lock $O_k$ under the 2VPCP protocol.

Proof. The lemma can be proven by contradiction. Let a distinct transaction instance $\tau'$ receive a certify lock on $O_k$ while $O_k$ is read-locked by a transaction instance $\tau$ under the 2VPCP protocol. By the definitions of the 2VPCP protocol, the priority of $\tau'$ must be higher than $RWPL_k$ (i.e., $WPL_k$), where $RWPL_k$ is no less than the original priority of $\tau'$. In other words, $\tau'$ must inherit the priority of some transaction instance which is higher than $RWPL_k$ to certify-lock $O_k$. Let $z_i$ be the earliest critical section which $\tau'$ enters and which later blocks some higher-priority transaction instance $\tau''$ whose priority is higher than $RWPL_k$. Based on Lemmas 1 and 2, $\tau'$ must be executing in critical section $z_i$ before $\tau''$ is initiated.

There are two cases to consider, depending on when $\tau'$ enters critical section $z_i$. Suppose first that data object $O_k$ is read-locked by transaction instance $\tau$ before $\tau'$ enters critical section $z_i$. The 2VPCP protocol would not allow $\tau'$ to enter critical section $z_i$ because the priority of $\tau'$ is no higher than $RWPL_k$. This contradicts the assumption in the previous paragraph that $\tau'$ enters a critical section which blocks a transaction instance with a priority higher than $RWPL_k$.

Suppose instead that data object $O_k$ is read-locked by transaction instance $\tau$ after $\tau'$ enters critical section $z_i$. The priority of $\tau$ must be higher than the priority of $\tau'$; otherwise, data object $O_k$ cannot be read-locked by transaction instance $\tau$. If the priority of $\tau$ is indeed higher than the priority of $\tau'$, then $\tau'$ has no chance to regain the CPU and issue a certify lock on $O_k$ unless $\tau$ is blocked (when $\tau'$ issues a certify lock on $O_k$). Since there is no deadlock and no transitive blocking (please see Lemma 2 and Theorem 1), $\tau$ must be blocked by $\tau'$. Since $\tau'$ blocks $\tau$, $\tau'$ must be executing in a critical section which later blocks $\tau$ before $\tau$ is initiated (please see Lemma 1). This contradicts the assumption that data object $O_k$ is read-locked by $\tau$ because $\tau$ has no way to read-lock $O_k$. □

Lemma 4 When data object Ok is write-locked by a transaction instance τ, no transaction instance can write-lock or certify-lock Ok under the 2VPCP protocol.

Proof. Since both read and write locks set RWPLk as WPLk, and the sets of transactions which may issue write or certify locks are the same, this lemma can be proven in a way similar to the proof of Lemma 3. □


Lemma 5 When data object Ok is certify-locked by a transaction instance τ, no transaction instance can lock Ok in any way under the 2VPCP protocol.

Proof. This lemma can be proven in a way similar to the proof of Lemma 3 (by replacing every occurrence of “read-lock by τ”, “RWPLk”, and “certify-lock by τ′” with “certify-lock by τ”, “APLk”, and “lock by τ′”, respectively). □

Theorem 3 All 2VPCP schedules satisfy the compatibility matrix shown in Table 1.

Proof. The correctness of this theorem directly follows from Lemmas 3, 4, and 5. □
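As a rough illustration of what Lemmas 3, 4, and 5 jointly rule out, the compatibility check can be sketched as a small lookup table. Table 1 is not reproduced in this section, so the incompatible entries below are taken from the three lemmas, while the compatible entries (read/read, read/write) are an assumption based on the usual 2-version 2PL matrix. The names `COMPATIBLE` and `can_grant` are hypothetical.

```python
# Lock modes of a two-version locking scheme.
READ, WRITE, CERTIFY = "read", "write", "certify"

# COMPATIBLE[held][requested]: may `requested` be granted to another
# transaction while `held` is in place on the same data object?
# False entries follow Lemmas 3-5; True entries are an assumption
# (the standard 2-version 2PL matrix).
COMPATIBLE = {
    READ:    {READ: True,  WRITE: True,  CERTIFY: False},  # Lemma 3
    WRITE:   {READ: True,  WRITE: False, CERTIFY: False},  # Lemma 4
    CERTIFY: {READ: False, WRITE: False, CERTIFY: False},  # Lemma 5
}


def can_grant(requested, held_by_others):
    """Return True if `requested` conflicts with none of the held modes."""
    return all(COMPATIBLE[held][requested] for held in held_by_others)
```

The sketch covers only lock-mode compatibility; the priority-ceiling conditions of the 2VPCP protocol, which the proofs above rely on, are not modeled here.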

Theorem 4 All 2VPCP schedules are serializable.

Proof. Since schedules generated by the 2-version 2PL protocol are (one-copy) serializable (1SR) [4, 9], and the set of schedules generated by the 2VPCP protocol is a subset of that generated by the 2-version 2PL protocol (please see Theorem 3), all 2VPCP schedules are serializable. Note that all schedules which satisfy the 2PL scheme and the compatibility matrix in Table 1 are 2-version 2PL schedules [4, 9]. Since all 2VPCP schedules satisfy the 2PL scheme and the compatibility matrix in Table 1, all 2VPCP schedules are 2-version 2PL schedules. □

3 Read-Only Transaction Processing

3.1 Overview

[Diagram: a server computer (consistent version, work space version) linked by a network connection to client computers (consistent version)]

Figure 3: A client-server architecture for read-only transactions

The purpose of this section is to extend the 2VPCP protocol to local processing of read-only transactions, as shown in Figure 3. For the purpose of this paper, we assume that all updating transactions are submitted to the server-side system for execution. The main idea in this 2VPCP extension is to “duplicate” a consistent version of the database image at each client-side system to service read-only transactions locally at client-side systems. There are two major advantages of this approach: (1) A potentially large number of queries, i.e., read-only transactions, can be screened out of the normal operation of a real-time database system (at the server side). (2) Read-only transactions can be processed much faster and more efficiently at client-side systems without going through a potentially congested network.

As astute readers may notice, higher-priority read-only transactions at the server system are already favored by the 2VPCP protocol because higher-priority read-only transactions can read-lock and access any data objects, unless the consistent versions of the data objects are under modification (i.e., locked in a certify mode). We surmise that, in normal operation, the interval of a certify lock should not be long for server-side transactions because usually only a committing transaction tries to obtain a certify lock. The real question here is how to improve the response time of lower-priority (and even higher-priority) read-only transactions issued by users at client-side systems. The main idea is to maintain a consistent database image at each client-side system to improve the response time of users’ local lower-priority and higher-priority read-only transactions without sacrificing the serializability of the entire system. In order to achieve this goal, we must build a serializability order of all transactions executing at client-side and server-side systems.

The technical question here is how to efficiently maintain a consistent database image at each client-side system which satisfies the above condition. Our approach is to let each of the client-side systems autonomously fabricate the consistent version of the server-side two-version database at the client side, such that all executions of read-only transactions at the client side can be properly inserted into the serializability order of transaction executions at the server side. In order to maintain a consistent database image at each client-side system, each transaction (or the system) must send client systems a message similar to a redo log (τi, object name, old value, new value) for each of its write operations (to the working version of the server-side database). Each of the client-side systems then maintains its consistent database image by observing these messages. Note that the client-side consistent database images will be used for efficient failure recovery in Section 4, and no processing of redo logs is needed again during failure recovery. The old value in each redo-log message can be omitted because it is never used.

For the rest of this paper, we assume that all messages sent in the network arrive at the destination in their sending order.


3.2 The Serializability-Order Rebuilding Mechanism

Because of the existence of a two-version database and the “preemptions” of conflicting transactions (which result in the effects of dynamic adjustment of serializability order) at the server-side system, the serializability order of transaction executions at the server side cannot be simply observed by the timestamps of successful conflicting locking requests issued by the server-side transactions. We propose to observe the serializability order of server-side transactions based on the order of the beginning of the shrinking phase of the server-side transaction instances. The beginning of the shrinking phase of any server-side transaction instance can be easily observed by the appearance of the first unlock request of the transaction instance. The information is purely syntactic and can be easily observed by the system with very low overheads because all server-side transaction instances issue lock and unlock requests to the system, regardless of whether the system tries to observe the beginning of their shrinking phase.

Let begin_unlock denote the first unlocking operation of a transaction instance. We shall prove in the following theorem that the begin_unlock order of transaction instances at the server side complies with the serializability order of the transaction instances executing at the server side. This observation provides a simple and efficient mechanism (presented in the next section) to determine the serializability order of server-side transactions.
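Because the beginning of the shrinking phase is purely syntactic, observing it amounts to noticing the first unlock request of each transaction instance. A minimal sketch, assuming the lock manager calls a hook on every unlock request (the class `BeginUnlockTracker` and the method `on_unlock` are hypothetical names, not part of the protocol definition):

```python
class BeginUnlockTracker:
    """Records the order in which transaction instances start shrinking."""

    def __init__(self):
        self.order = []      # transaction ids in begin_unlock order
        self._seen = set()   # transactions that have already unlocked once

    def on_unlock(self, txn_id):
        """Call on every unlock request.

        Returns True only for the first unlock of `txn_id`, i.e. when a
        begin_unlock event should be recorded for that transaction.
        """
        if txn_id in self._seen:
            return False
        self._seen.add(txn_id)
        self.order.append(txn_id)
        return True
```

The bookkeeping is one set lookup per unlock request, which is consistent with the low-overhead observation made above.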

Theorem 5 The begin_unlock order of transaction instances (at the server side) complies with the serializability order of the transaction instances (at the server side).

Proof. This theorem can be proven by considering all of the combinations of conflicting r/w operations. To determine the serializability order of conflicting transaction instances, four cases must be considered, as shown in Figure 4, where WLi(x), RLi(x), CerLi(x), and Begin_Unlock denote the write lock, read lock, certify lock, and begin_unlock message of transaction τi, respectively. Note that the 2VPCP protocol satisfies the 2PL scheme and the compatibility matrix shown in Table 1 (please see Theorem 3 and the definitions of the 2VPCP protocol).

1. If there is a write/write conflict between two conflicting transaction instances, e.g., τ1 and τ2 in Figure 4.a, then the begin_unlock order of transaction instances must comply with the serializability order of the transaction instances. This is because write and certify locks are incompatible with each other. Every write lock of a transaction instance must precede its begin_unlock.

[Timelines for T1 and T2, with the lock events labeled in each panel:
(a) W/W conflict: T1 issues WL1(X), CerL1(X), Begin_Unlock; T2 issues WL2(X), CerL2(X), Begin_Unlock.
(b) W/R conflict, W precedes R: T1 issues WL1(X), CerL1(X), Begin_Unlock; T2 issues RL2(X), Begin_Unlock.
(c) W/R conflict, R precedes W: T1 issues WL1(X), CerL1(X), Begin_Unlock; T2 issues RL2(X), Begin_Unlock.
(d) R/W conflict: T1 issues RL1(X), Begin_Unlock; T2 issues WL2(X), CerL2(X), Begin_Unlock.]

Figure 4: Serializability order of conflicting transactions

2. If there is a write/read conflict between two conflicting transaction instances, and the read lock of a transaction instance is granted after the write lock and certify lock of another transaction instance, e.g., τ1 and τ2 in Figure 4.b, then the begin_unlock order of transaction instances must comply with the serializability order of the transaction instances. This is because read and certify locks are incompatible. The unlocking of the certify lock must be earlier than the granting of the read lock.

3. If there is a write/read conflict between two conflicting transaction instances, and the read lock of a transaction instance is granted before the certify lock, but after the write lock of another transaction instance, e.g., τ1 and τ2 in Figure 4.c, then the begin_unlock order of transaction instances must comply with the serializability order of the transaction instances. This is because read and certify locks are incompatible. The certify lock cannot be granted until the unlocking of the read lock. Because τ2 may have accessed the consistent version, τ2 precedes τ1 in the serializability order.

4. Suppose that there is a read/write conflict between two conflicting transaction instances, and the read lock of a transaction instance precedes the write lock of another transaction instance, e.g., τ1 and τ2 in Figure 4.d. Regardless of the order of the begin_unlock of τ1 and the write lock WL2(x) of τ2, the begin_unlock order of transaction instances τ1 and τ2 must comply with the serializability order of the transaction instances. This is because the transformation of the write lock into the certify lock must be blocked by the read lock if the read lock is not released. Note that τ1 may release a lock after τ2 obtains a conflicting write lock when the begin_unlock of τ1 is after the write lock WL2(x) of τ2.

□

Note that if the server-side system allows conflicting transaction instances to commit in an order different from their serializability order, then each client-side system must apply the redo logs of the committing transaction instance on the consistent database version of the system in order of their begin_unlock messages, instead of their commit order. Theorem 5 provides the general relationship between the serializability order of server-side transaction instances and their order of begin_unlock messages, regardless of whether the server-side system may or may not crash. However, we must emphasize that if the server-side system indeed allows conflicting transaction instances to commit in an order different from their serializability order, and the server may crash at any time, then some schedules of server-side transactions are not recoverable according to the definition of recoverable schedules [9].

In the next section, we shall address this recovery issue for the extended 2VPCP protocol at the server side.

3.3 The Extended 2VPCP Protocol

We now present the mechanism for extending the 2VPCP protocol to local processing of read-only transaction instances at client sides. We assume that messages sent in a network arrive at destination systems in the order of their sending times, and that the network is reliable.

We are interested in a closed environment, where the network is closed and under reasonable control. A network protocol such as TCP/IP which supports reliable message transmissions is adopted. If the network fails, then the extended 2VPCP protocol, similar to many other distributed concurrency control protocols, will not work in a distributed environment.

Server-Side Transactions:

Each server-side transaction instance τi (scheduled by the 2VPCP protocol) is required to broadcast a message to all client-side systems under the following three circumstances. Note that the server system can broadcast the messages on behalf of the server-side transaction instances.

1. Before τi writes on the working version of a data object Oj, τi (or the server-side system) must broadcast a message similar to a redo log (τi, Oj, old value, new value) to all client-side systems.

2. Before τi commits, τi (or the system) must broadcast a message similar to a commit log (τi, commit) to all client-side systems.

3. When τi first unlocks any data object, τi (or the server-side system) must broadcast a message (τi, begin_unlock) to all client-side systems to signal the beginning of the shrinking period of τi.

Note that if the server system allows conflicting transaction instances to commit in an order different from their serializability order, and the server may crash at any time, then some schedules of server-side transactions are not recoverable according to the definition of recoverable schedules [9]. In order to maintain the recoverability of the system, the server system must delay the commitment of a transaction instance (i.e., the actual releasing of certify locks and the sending of the commit log to client-side systems) until all preceding transaction instances in the serializability order (i.e., their order of begin_unlock messages) commit. However, if the server system never crashes, the above delaying requirement is not necessary. In other words, the delaying requirement is not needed until Section 4, which addresses failure recovery.
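The commit-delaying rule above can be sketched as a small sequencer that releases commit messages only in begin_unlock order. This is a simplified illustration, not the server implementation: the class `CommitSequencer`, its methods, and the `sent` list standing in for the broadcast channel are all hypothetical, aborts are not modeled, and it is assumed that each transaction's begin_unlock is recorded before it requests to commit.

```python
from collections import deque


class CommitSequencer:
    """Delays commits until all predecessors in begin_unlock order commit."""

    def __init__(self):
        self._pending = deque()  # txn ids in begin_unlock order, not yet committed
        self._ready = set()      # txns that asked to commit but are still delayed
        self.sent = []           # stand-in for messages broadcast to clients

    def begin_unlock(self, txn):
        """Record the first unlock of `txn` and broadcast (txn, begin_unlock)."""
        self._pending.append(txn)
        self.sent.append((txn, "begin_unlock"))

    def request_commit(self, txn):
        """Ask to commit `txn`; return the txns whose commits are released now."""
        self._ready.add(txn)
        released = []
        # Release commits strictly from the head of the begin_unlock queue.
        while self._pending and self._pending[0] in self._ready:
            head = self._pending.popleft()
            self._ready.remove(head)
            self.sent.append((head, "commit"))
            released.append(head)
        return released
```

For instance, if a lower-priority instance records its begin_unlock first and a higher-priority instance then tries to commit before it, `request_commit` for the higher-priority instance returns an empty list, and both commits are released, in begin_unlock order, once the lower-priority instance requests to commit.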

Client-Side System:

Let each client-side system and the server-side system share the same consistent version of the database initially. A transaction instance is said to have committed at a client-side system if all of the redo logs of the transaction instance have been applied on the consistent database version of the client-side system. Note that the write through procedure of each committing transaction instance at the server-side computer cannot be completed until all redo logs and the commit log are delivered to all client-side systems. When a reliable broadcasting network is adopted, the server-side system may simply return from any sending operations and assume that all client-side systems will eventually receive the messages sent by these operations. The details regarding logging and commitment of server-side transaction instances will be discussed in Section 4.2.

During the system operation, each client-side system keeps all redo-log messages of server-side transaction instances which have not committed at the client-side system.

When a client-side system receives a commit message, the system applies the redo logs of the committing transaction instance on the consistent database version of the system atomically in order of their begin_unlock messages. Note that the client-side systems may be busy with other work, so the copies on the client sides may temporarily differ from the copy on the server side. However, the client-side systems will eventually catch up when the local workloads drop. Such a phenomenon will not cause any problem because all 2VPCP schedules with local read-only transaction processing are serializable even though some clients might apply the logs later (please see Theorem 6).
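The client-side bookkeeping described in the last two paragraphs can be sketched as follows. This is an illustrative simplification under several assumptions: messages are modeled as tuples `(txn, "redo", object, new_value)`, `(txn, "begin_unlock")`, and `(txn, "commit")` (the old value is dropped, as noted in Section 3.1), all committed transactions are applied in begin_unlock order whether or not they conflict, and aborts are ignored. The class `ClientReplica` and its methods are hypothetical names.

```python
class ClientReplica:
    """Keeps a consistent database image from server-side log messages."""

    def __init__(self, image):
        self.image = dict(image)   # consistent database image
        self._redo = {}            # txn -> buffered list of (object, new_value)
        self._order = []           # txns in begin_unlock arrival order
        self._committed = set()    # committed at the server, not yet applied

    def receive(self, msg):
        """Process one message; messages are assumed to arrive in sending order."""
        txn, kind = msg[0], msg[1]
        if kind == "redo":
            _, _, obj, new_value = msg
            self._redo.setdefault(txn, []).append((obj, new_value))
        elif kind == "begin_unlock":
            self._order.append(txn)
            self._apply_ready()
        elif kind == "commit":
            self._committed.add(txn)
            self._apply_ready()

    def _apply_ready(self):
        # Apply buffered redo logs strictly in begin_unlock order.
        while self._order and self._order[0] in self._committed:
            txn = self._order.pop(0)
            self._committed.discard(txn)
            for obj, new_value in self._redo.pop(txn, []):
                self.image[obj] = new_value
```

In the design described in this section, the application step is carried out by the updating transaction of the client while it holds a write lock on the entire image; the sketch omits locking and scheduling and shows only the ordering logic.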

Each client-side system has a unique updating transaction τUpd which is responsible for atomically updating the consistent database image based on redo-log messages of committing transaction instances. Before transaction τUpd updates the consistent database image, it must write-lock the entire database image. Transaction τUpd may be periodic or aperiodic. The higher the priority of τUpd, the faster τUpd can update the consistent database image by processing the redo logs to reflect the database image updated by committed server-side transactions. The assignment of a high priority to τUpd can also help in reducing the recovery time because the recovery mechanism must reflect the consistent database image updated by all of the committed server-side transactions, and the mechanism depends on τUpd to process the redo logs. However, the high priority of τUpd may interfere with the executions of read-only transactions at the client side. Simulation experiments regarding the priority of τUpd will be included in Section 5.

Autonomous Read-Only Transaction Processing at Client-Side Systems:

Each client-side system should schedule all of its transaction instances including τUpd in a preemptive priority-driven fashion. Before a read-only transaction instance reads any data object, it must read-lock the entire database image. Note that the entire database can be locked or unlocked by simply locking or unlocking a global flag. (Simulation experiments will be done to justify the setting of the priority of τUpd.)
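One way to picture the global flag is as a single readers-writer lock over the whole image: read-only transactions share it, while the updating transaction needs it exclusively. The following is a minimal, non-blocking sketch under that assumption; the class `GlobalImageLock` and its methods are hypothetical, and what a transaction does when a request is refused (wait, be preempted, retry) is left to the priority-driven scheduler.

```python
class GlobalImageLock:
    """A single shared/exclusive flag guarding the entire database image."""

    def __init__(self):
        self.readers = 0      # number of read-only transactions holding the lock
        self.writer = False   # True while the updating transaction holds it

    def try_read_lock(self):
        """Grant a shared lock unless the image is being updated."""
        if self.writer:
            return False
        self.readers += 1
        return True

    def read_unlock(self):
        assert self.readers > 0, "read_unlock without a matching read lock"
        self.readers -= 1

    def try_write_lock(self):
        """Grant the exclusive lock only if nobody else holds the flag."""
        if self.writer or self.readers:
            return False
        self.writer = True
        return True

    def write_unlock(self):
        assert self.writer, "write_unlock without a matching write lock"
        self.writer = False
```

Locking at the granularity of the whole image keeps the client-side mechanism simple, at the cost of making every read-only transaction conflict with the updating transaction.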

Note that the consistent database image at the server-side system may not be the same as the consistent database image at some client-side systems. It is even possible that the τUpd transactions of different client-side systems process the redo logs of committed transaction instances at different speeds, due to different workloads at different client-side systems. However, as shown in Theorem 6, all 2VPCP schedules with local read-only transaction processing are serializable. Read-only transactions at different client-side systems may have consistent, but different views of the database.

Lemma 6 All client-side systems have the same consistent database image if they receive the same set of messages sent from the server-side system and apply them to the database image.

Proof. The correctness of this lemma follows directly from the assumption that the network delivers messages in a first-come-first-served fashion. □

Lemma 7 The database image maintained at client-side systems always satisfies the serializability order of server-side transaction instances.

Proof. The correctness of this lemma follows directly from Theorem 5 and the definitions of the database image maintenance mechanism. □

Theorem 6 All 2VPCP schedules with local read-only transaction processing are serializable.

Proof. Theorem 4 shows that all transactions at the server side are serializable. The problem is whether all client-side read-only transactions and all server-side transactions together are still serializable. Lemma 6 shows that all client-side systems have the same consistent database image if they receive the same set of messages sent from the server-side system and apply them to the database image. That is, there is no inconsistent view of the database among client-side transactions running on different client sides. Furthermore, Lemma 7 shows that the database image maintained at client-side systems always satisfies the serializability order of server-side transactions. A client-side read-only transaction is considered to occur exactly after the server-side transactions which commit at the corresponding client-side system before the read-only transaction read-locks and reads the consistent database image. We conclude that client-side read-only transactions and all server-side transactions together are serializable. □


4 Failure Recovery

4.1 Motivation

The purpose of this section is to further extend the 2VPCP protocol to the failure recovery of the server-side system. As astute readers may notice, the 2VPCP protocol can be applied to both memory-resident and disk-resident databases. For the rest of this section, we will focus our discussions on memory-resident databases. We shall also require that the 2VPCP protocol only allows conflicting server-side transactions to commit in their serializability order. The enforcement of commit order can be done easily by delaying the commitment of transactions. However, the delaying of the commitment of server-side transactions may increase the maximum number of priority inversions for a server-side transaction by one. This can be explained by the following example:

Let a higher-priority transaction instance τH be blocked by a lower-priority transaction instance τL under the 2VPCP protocol at the server side. The begin_unlock message of τL will be before that of τH because of the adoption of the 2PL scheme. Since τH has a higher priority than τL does, τH may preempt τL (after τL unlocks the data object which blocks τH) and try to commit before τL commits. In order to let conflicting transaction instances commit according to their serializability order, i.e., the begin_unlock message order, τH will be delayed to wait for τL to commit to keep the system recoverable. Because there is no transitive blocking, and the maximum number of priority inversions for the 2VPCP protocol is one, the extra number of priority inversions, due to the delaying of commitment, for τH is one.

[Diagram labels: Server Computer (Consistent version, Work Space version); Network Connection; Recovery Servers (Client Computers) with Consistent version; Consistent Copy]

Figure 5: A client-server architecture for failure recovery

The client computers adopted for local processing of read-only transactions (in Sec-
