Concurrency Control and Reliable
Commit Protocol in Distributed
Database Systems
Outline
Distributed Database Management System
(DDBMS)
Concurrency Control models (CC)
Deadlock Management in DDBMS
Recovery and Reliable Mechanisms in
DDBMS.
Distributed Database Management
System (DDBMS)
A distributed database is a collection of multiple, logically interrelated
databases distributed over a computer network.
A distributed database management system (DDBMS) is the software system that permits the management of the distributed database and makes the distribution transparent to the users.
Architectural Models for Distributed
Database Management System
Autonomy (A): distribution of control
– 0: tight integration, 1: semiautonomous system, 2:
isolation
Heterogeneity (H):
– 0: homogeneous, 1: heterogeneous
Distribution(D): data arrangement
– 2: peer-to-peer
Issues in DDBMS
Data Planning (NP-complete)
Query Optimization and Decomposition
(NP-complete)
Distributed Transaction Management
Fault Tolerance and Reliability
Transaction and Transaction
Management
The ACID properties must still be guaranteed in a
DDBMS
Transaction structures: flat, nested
Flat:
  Begin_transaction
    T1();
    T2();
    ……
  End_transaction
Nested:
  Begin_transaction
    Begin_transaction T1
      Begin_transaction T2
        T3();
        ……
      End_transaction T2
    End_transaction T1
Concurrency Control Algorithms
Pessimistic
– Two-phase locking protocol (Mutex)
– Timestamp ordering protocol
– Hybrid
Optimistic
– Locking based
Locking and Timestamp Ordering
2PL is simple and guarantees serializability, but locking may reduce the throughput of the system and may cause deadlock.
Timestamp Ordering (TO) protocols do not maintain serializability by mutual exclusion, so they cannot cause deadlock.
TO rule: Given two conflicting operations Oij and Okl belonging to Ti and Tk respectively (Tk is younger), the former operation i
Basic TO Algorithm
Transaction Ti is assigned a globally unique timestamp ts(Ti).
Transaction managers attach the timestamp to all operations issued by the transaction.
Each data item is assigned a write timestamp and a read timestamp:
– rts(x), wts(x)
For Rt(x):
– if ts(Tt) < wts(x) then reject Rt(x)
– else accept Rt(x) and set rts(x) <- max(rts(x), ts(Tt))
For Wt(x):
– if ts(Tt) < rts(x) or ts(Tt) < wts(x) then reject Wt(x)
– else accept Wt(x) and set wts(x) <- ts(Tt)
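The read and write rules can be sketched in a few lines of Python. This is only an illustrative single-site sketch, not a full scheduler: the names `DataItem`, `Rejected`, `read`, and `write` are hypothetical, timestamps are plain integers, and a write is rejected when its timestamp is older than either rts(x) or wts(x), which is the usual textbook formulation.

```python
from dataclasses import dataclass


@dataclass
class DataItem:
    """A data item x together with its read and write timestamps."""
    value: object = None
    rts: int = 0  # largest timestamp of any accepted read
    wts: int = 0  # timestamp of the last accepted write


class Rejected(Exception):
    """Raised when an operation arrives too late; the transaction must restart."""


def read(item: DataItem, ts: int):
    # A younger transaction has already written x, so this read is too late.
    if ts < item.wts:
        raise Rejected(f"R(x) with ts={ts} rejected, wts(x)={item.wts}")
    item.rts = max(item.rts, ts)
    return item.value


def write(item: DataItem, ts: int, value) -> None:
    # A younger transaction has already read or written x.
    if ts < item.rts or ts < item.wts:
        raise Rejected(
            f"W(x) with ts={ts} rejected, rts(x)={item.rts}, wts(x)={item.wts}"
        )
    item.value = value
    item.wts = ts


x = DataItem(value=10)
write(x, ts=5, value=20)  # accepted: wts(x) becomes 5
read(x, ts=8)             # accepted: rts(x) becomes 8
try:
    write(x, ts=6, value=30)  # too late: a transaction with ts=8 already read x
except Rejected as err:
    print(err)
```

The rejected transaction would be restarted with a new, larger timestamp, which is where the restart cost of basic TO comes from.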
Basic TO Algorithm (2) cont.
The basic TO algorithm is simple and
deadlock-free. The penalty of this mechanism is that a transaction may be restarted numerous times.
Take an example.
The global unique timestamp assignment is not
Conservative TO Algorithms
The previous example shows that the restart penalty may be serious; the conservative TO algorithms attempt to reduce the number of restarts.
The algorithms delay each operation until there is a
Deadlock Management
There are several ways to handle the deadlock
problem: prevention, avoidance, detection, and resolution.
Deadlock prevention is not easy to achieve
since it must have the complete serial graph
A famous deadlock avoidance approach is
Deadlock Detection Approach
It is an NP-complete problem to find the minimum-cost edge for breaking the deadlock cycle.
Local wait-for graph and global wait-for graph: we are concerned only with deadlocks that span sites.
Topologies for deadlock detection algorithms:
– Centralized
– Distributed
– Hierarchical
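To illustrate why the global wait-for graph matters, here is a minimal sketch in the centralized style: each site reports its local wait-for graph, the detector merges them, and a depth-first search looks for a cycle. The function names and the dictionary representation (transaction mapped to the set of transactions it waits for) are assumptions made for this example.

```python
from collections import defaultdict


def merge_wait_for_graphs(local_graphs):
    """Union the edges of every site's local wait-for graph into a global one."""
    global_graph = defaultdict(set)
    for graph in local_graphs:
        for waiter, holders in graph.items():
            global_graph[waiter].update(holders)
    return global_graph


def find_cycle(graph):
    """Return a list of transactions forming a cycle, or None if there is none."""
    WHITE, GREY, BLACK = 0, 1, 2
    colour = defaultdict(int)  # every node starts WHITE (unvisited)
    path = []

    def visit(node):
        colour[node] = GREY
        path.append(node)
        for nxt in graph.get(node, ()):
            if colour[nxt] == GREY:  # back edge: nxt is on the current path
                return path[path.index(nxt):]
            if colour[nxt] == WHITE:
                cycle = visit(nxt)
                if cycle:
                    return cycle
        path.pop()
        colour[node] = BLACK
        return None

    for node in list(graph):
        if colour[node] == WHITE:
            cycle = visit(node)
            if cycle:
                return cycle
    return None


# T1 waits for T2 at site 1, and T2 waits for T1 at site 2.
site1 = {"T1": {"T2"}}
site2 = {"T2": {"T1"}}
print(find_cycle(site1), find_cycle(site2))               # neither site sees a cycle
print(find_cycle(merge_wait_for_graphs([site1, site2])))  # the merged graph does
```

Neither local graph contains a cycle on its own, so the deadlock is only visible once the graphs are combined. Choosing which transaction in the cycle to abort is a separate decision that this sketch does not attempt.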
Distributed Reliability Protocols
Commit protocols:
– How is the commit command executed for distributed transactions?
– How are atomicity and durability ensured?
Termination protocols:
– If a failure occurs, how do the remaining operational sites deal
with it?
Recovery protocols:
– If a failure occurs, how does the site where the failure occurred
deal with it?
Two-Phase Commit Protocol
I covered this protocol briefly in my previous
presentation.
Global Commit Rule: all or nothing.
Phase 1: The coordinator gets the participants ready to
write the result to the physical storage.
Phase 2: Everyone writes its results into the database.
– Coordinator:
– Participants:
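The two phases and the all-or-nothing rule can be sketched as follows. This is a deliberately simplified, failure-free sketch: there is no network, no logging, and no timeout handling, and the class and state names (`Participant`, `Vote`, `"READY"`, and so on) are assumptions chosen for illustration.

```python
from enum import Enum


class Vote(Enum):
    COMMIT = "vote-commit"
    ABORT = "vote-abort"


class Participant:
    def __init__(self, name, can_commit=True):
        self.name = name
        self.can_commit = can_commit  # stands in for the local decision
        self.state = "INITIAL"

    def prepare(self):
        # Phase 1: get ready to write the result and vote.
        if self.can_commit:
            self.state = "READY"
            return Vote.COMMIT
        self.state = "ABORT"
        return Vote.ABORT

    def decide(self, commit):
        # Phase 2: apply the coordinator's global decision.
        self.state = "COMMIT" if commit else "ABORT"


def two_phase_commit(participants):
    """Coordinator: commit only if every participant voted to commit."""
    votes = [p.prepare() for p in participants]       # phase 1
    commit = all(v is Vote.COMMIT for v in votes)     # global commit rule
    for p in participants:                            # phase 2
        p.decide(commit)
    return commit


sites = [Participant("A"), Participant("B", can_commit=False), Participant("C")]
print(two_phase_commit(sites), [p.state for p in sites])
```

A single abort vote forces every site to abort. A participant that has reached `"READY"` cannot decide on its own, which is the source of blocking when the coordinator fails.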
Site Failures and Recovery
Develop non-blocking termination and
independent recovery protocols.
A proof shows that such protocols exist when a
single site fails.
However, it is not possible to design
independent recovery protocols when multiple sites fail.
Problems with 2PC
Blocking
– Ready implies that the participants wait for the
coordinator
– If the coordinator fails, the site is blocked until recovery.
– Blocking reduces availability
Independent recovery is not possible
The 3PC protocol was proposed to solve the
blocking problem. 3PC is non-blocking (not realistic in practice, but it reduces blocking).
Three Phase Commit Protocol
A proof shows that there are necessary and
sufficient conditions for designing non-blocking atomic commitment protocols:
– No state that is adjacent to both a commit and an
abort state (2PC violates this condition).
– No non-committable state that is adjacent to a
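The first condition can be checked mechanically on a participant's state-transition diagram. The sketch below is illustrative only: the state names, the `PRECOMMIT` state, and the transition sets are the usual textbook participant automata and are assumptions here, not taken from these slides.

```python
def states_adjacent_to_commit_and_abort(transitions, commit="COMMIT", abort="ABORT"):
    """Return the states that can move directly to both commit and abort."""
    return sorted(
        state
        for state, successors in transitions.items()
        if commit in successors and abort in successors
    )


# Assumed participant automaton for 2PC.
TWO_PC = {
    "INITIAL": {"READY", "ABORT"},
    "READY": {"COMMIT", "ABORT"},
    "COMMIT": set(),
    "ABORT": set(),
}

# Assumed participant automaton for 3PC: a buffer state sits before COMMIT.
THREE_PC = {
    "INITIAL": {"READY", "ABORT"},
    "READY": {"PRECOMMIT", "ABORT"},
    "PRECOMMIT": {"COMMIT"},
    "COMMIT": set(),
    "ABORT": set(),
}

print(states_adjacent_to_commit_and_abort(TWO_PC))
print(states_adjacent_to_commit_and_abort(THREE_PC))
```

Under these assumed automata, 2PC's `READY` state is adjacent to both outcomes, while the extra state in 3PC separates `READY` from `COMMIT`.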
Mobile Databases
A mobile database is an extension of a distributed database
system.
A mobile database may contain databases connected
with wire-line networks and databases built on mobile stations.
The characteristics:
– The wireless network has restricted bandwidth
– The power supplies in mobile stations have limited lifetimes
Mobile Databases cont.(2)
Based on the previous characteristics, the CC problem in mobile databases is harder than that in distributed databases.
Stations may be disconnected for long periods, so locking protocols and timestamp ordering protocols are not suitable. 2PC is not suitable either, since it reduces availability.
Different transaction models have been proposed for the mobile database environment, e.g. relaxation of the ACID properties and relaxation of serializability.
Mobile Databases cont.(3)
It is much harder to design real-time mobile
database systems.
Because of the unpredictability of the
environment, it is difficult for hard real-time transactions to meet their deadlines. Most papers discuss firm and soft real-time transactions in this environment.