CAS CS 460/660 Introduction To Database Systems Transactions and Concurrency Control
CAS CS 460/660 Introduction To Database Systems Transactions and Concurrency Control
CAS CS 460/660 Introduction To Database Systems Transactions and Concurrency Control
Transactions
and
Concurrency Control
1.1
Recall: Structure of a DBMS
Query in:
e.g. “Select min(account balance)” Data out:
Database app e.g. 2000
Query Optimization
and Execution
Relational Operators
These layers
Access Methods must consider
concurrency
Buffer Management control and
recovery
Disk Space Management
Customer accounts
stored on disk1.2
= File System vs. DBMS?
Thought Experiment 1:
You and your project partner are editing the same file.
You both save it at the same time.
Whose changes survive?
1.4
Key concept: Transaction
transaction
consistent state 1 consistent state 2
1.5
Example
1.6
Transaction - Example
BEGIN; --BEGIN TRANSACTION
UPDATE accounts SET balance = balance - 100.00 WHERE name =
'Alice';
1.7
Transaction Example (with Savepoint)
BEGIN;
UPDATE accounts SET balance = balance - 100.00
WHERE name = 'Alice';
SAVEPOINT my_savepoint;
UPDATE accounts SET balance = balance + 100.00
WHERE name = 'Bob’ ;
1.8
The ACID properties of Transactions
1.9
Atomicity of Transactions
1.10
Transaction Consistency
“Consistency” - data in DBMS is accurate in modeling real world,
follows integrity constraints
User must ensure transaction consistent by itself
If DBMS is consistent before transaction, it will be after also.
System checks ICs and if they fail, the transaction rolls back (i.e., is
aborted).
DBMS enforces some ICs, depending on the ICs declared in CREATE
TABLE statements.
Beyond this, DBMS does not understand the semantics of the data. (e.g.,
it does not understand how the interest on a bank account is computed).
1.11
Isolation (Concurrency)
1.12
Durability - Recovering From a Crash
System Crash - short-term memory (RAM) lost (disk okay)
This is the case we will handle.
Disk Crash - “stable” data lost
ouch --- need back ups; raid-techniques can help avoid this.
There are 3 phases in Aries recovery (and most others):
Analysis: Scan the log forward (from the most recent checkpoint) to identify all
Xacts that were active, and all dirty pages in the buffer pool at the time of the
crash.
Redo: Redoes all updates to dirty pages in the buffer pool, as needed, to ensure
that all logged updates are in fact carried out.
Undo: The writes of all Xacts that were active at the crash are undone (by
restoring the before value of the update, as found in the log), working backwards
in the log.
At the end --- all committed updates and only those updates are reflected in
the database.
Some care must be taken to handle the case of a crash occurring during the
recovery process!
1.13
Plan of attack (ACID properties)
1.14
Example
Consider two transactions (Xacts):
1.16
Example (Contd.)
Legal outcomes: A=1166,B=954 or A=1160,B=960
Consider a possible interleaved schedule:
1.17
Scheduling Transactions
Serial schedule: A schedule that does not interleave the actions of different
transactions.
i.e., you run the transactions serially (one at a time)
Equivalent schedules: For any database state, the effect (on the set of
objects in the database) and output of executing the first schedule is
identical to the effect of executing the second schedule.
1.18
Anomalies with Interleaved Execution
Unrepeatable Reads:
T1: R(A), R(A), W(A), C
T2: R(A), W(A), C
1.19
Conflict Serializable Schedules
1.20
Dependency Graph
1.21
Example
T1:
T1: R(A),
R(A), W(A),
W(A), R(B),
R(B), W(B)
W(B)
T2:
T2: R(A),
R(A), W(A),
W(A), R(B),
R(B), W(B)
W(B)
A
T1 T2 Dependency graph
1.22
Example
A
B
T1 T2 Dependency graph
No Cycle Here!
1.23
Another Example
T1
A, D
T2
C,D
D
T3
1.24
View Serializability – an Aside
1.25
Notes on Conflict Serializability
1.26
Locks
We use “locks” to control access to items.
Exclusive (X) locks – only one of these and no other locks, can
be held on a particular item at a time.
S X
Lock
Compatibility S –
Matrix
X – –
1.27
Two-Phase Locking (2PL)
Lock Point!
Growing Shrinking
Phase Phase
1.28
Two-Phase Locking (2PL)
1.29
Ex 1: A= 1000, B=2000, Output =?
Lock_X(A) <granted>
Read(A) Lock_S(A)
A: = A-50
Write(A)
Unlock(A) <granted>
Read(A)
Unlock(A)
Lock_S(B) <granted>
Lock_X(B)
Read(B)
<granted> Unlock(B)
PRINT(A+B)
Read(B)
B := B +50
Write(B)
Unlock(B)
Lock_X(B) <granted>
Unlock(A) <granted>
Read(A)
Lock_S(B)
Read(B)
B := B +50
Write(B)
Unlock(B) <granted>
Unlock(A)
Read(B)
Unlock(B)
PRINT(A+B)
1.32
Strict 2PL (continued)
All locks held by a transaction are released only when the transaction
completes
Like 2PL, Strict 2PL allows only schedules whose precedence
graph is acyclic, but it is actually stronger than needed for that
purpose.
1.33
Ex 3: A= 1000, B=2000, Output =?
Lock_X(A) <granted>
Read(A) Lock_S(A)
A: = A-50
Write(A)
Lock_X(B) <granted>
Read(B)
B := B +50
Write(B)
Unlock(A)
Unlock(B) <granted>
Read(A)
Lock_S(B) <granted>
Read(B)
PRINT(A+B)
Unlock(A)
Unlock(B)
Lock_X(B) <granted>
Unlock(A) <granted>
Read(A)
Lock_S(B)
Read(B)
B := B +50
Write(B)
Unlock(B) <granted>
Unlock(A)
Read(B)
Unlock(B)
PRINT(A+B)
1.36
Ex 4: Output = ?
Lock_X(A) <granted>
Lock_S(B) <granted>
Read(B)
Lock_S(A)
Read(A)
A: = A-50
Write(A)
Lock_X(B)
Deadlock prevention
Deadlock detection
1.38
Deadlock Prevention
1.39
Deadlock Detection
1.40
Deadlock Detection (Continued)
Example:
T1 T2
T4 T3
1.41
Multiple-Granularity Locks
Database
Tables
contains
Pages
Tuples
1.42
Solution: New Lock Modes, Protocol
Database
Allow Xacts to lock at each level, but with a special protocol using new
“intention” locks: Tables
Still need S and X locks, but before locking an item, Xact must have proper
intension locks on all its ancestors in the granularity hierarchy. Pages
Tuples
IS IX SIX S X
IS – Intent to get S lock(s)
at finer granularity. IS -
IX – Intent to get X lock(s) IX
- - -
at finer granularity. SIX - - - -
SIX mode: Like S & IX at S - - -
the same time. Why useful?
X
- - - - -
1.43
Multiple Granularity Lock Protocol Database
Tables
Pages
Each Xact starts from the root of the
hierarchy. Tuples
To get S or IS lock on a node, must hold IS or IX on parent
node.
What if Xact holds SIX on parent? S on parent?
To get X or IX or SIX on a node, must hold IX or SIX on
parent node.
Must release locks in bottom-up order.
1.44
Examples – 2 level hierarchy Tables
Tuples
T1 scans R, and updates a few tuples:
T1 gets an SIX lock on R, then get X lock on tuples that are updated.
T2 uses an index to read only part of R:
T2 gets an IS lock on R, and repeatedly
gets an S lock on tuples of R.
T3 reads all of R:
T3 gets an S lock on R.
OR, T3 could behave like T2; can
IS IX SIX S X
X
1.45
Isolation Levels
46
1.46
Optimistic CC (Kung-Robinson)
Locking is a conservative approach in which
conflicts are prevented. Disadvantages:
Lock management overhead.
Deadlock detection/resolution.
Lock contention for heavily used objects.
Locking is “pessimistic” because it assumes
that conflicts will happen.
If conflicts are rare, we might get better
performance by not locking, and instead
checking for conflicts at commit.
1.47
Kung-Robinson Model
modified ROOT
objects
new
1.48
Validation
1.49
Test 1 – non-overlapping
For all i and j such that Ti < Tj, check that Ti completes before
Tj begins.
Ti
Tj
R V W
R V W
1.50
Test 2 – No Write Phase Conflict
Ti
R V W
Tj
R V W
1.51
Test 3 – Overlapping Write Phases
Ti
R V W
Tj
R V W
Does Tj read dirty data? Does Ti overwrite Tj’s writes?
1.52
Applying Tests 1, 2, &3
To validate Xact T:
valid = true;
// S = set of Xacts that committed after Begin(T)
// (above defn implements Test 1)
//The following is done in critical section
< foreach Ts in S do {
if (ReadSet(T) intersects WriteSet(Ts)) OR
start (WriteSet(T) intersects WriteSet(Ts))
of
critical
then valid = false;
section }>
if valid then { install updates; // Write phase
end of critical section
Commit T }
else Restart T
1.53
Applying Tests 1 & 2: Serial Validation
To validate Xact T:
valid = true;
// S = set of Xacts that committed after Begin(T)
// (above defn implements Test 1)
//The following is done in critical section
< foreach Ts in S do {
if ReadSet(T) intersects WriteSet(Ts)
start then valid = false;
of
critical
}
section if valid then { install updates; // Write phase
Commit T } >
else Restart T
end of critical section
1.54
Comments on Serial Validation
1.55
Overheads in Optimistic CC
1.56
Snapshot Isolation (SI)
57
1.57
Snapshot Isolation (SI)
58
1.58
Benefits of SI
59
1.59
Other Techniques
Timestamp CC: Give each object a read-timestamp (RTS) and a
write-timestamp (WTS), give each Xact a timestamp (TS) when it
begins:
If action ai of Xact Ti conflicts with action aj of Xact Tj, and TS(Ti) <
TS(Tj), then ai must occur before aj. Otherwise, restart violating Xact.
Multiversion CC: Let writers make a “new” copy while readers use
an appropriate “old” copy.
Advantage is that readers don’t need to get locks
Oracle and PostgreSQL use a simple form of this.
1.60
Summary
Correctness criterion for isolation is “serializability”.
In practice, we use “conflict serializability”, which is somewhat more
restrictive but easy to enforce.
Two Phase Locking, and Strict 2PL: Locks directly implement the
notions of conflict.
The lock manager keeps track of the locks issued. Deadlocks can either be
prevented or detected.
Must be careful if objects can be added to or removed from the
database (“phantom problem”).
Index locking common, affects performance significantly.
Needed when accessing records via index.
Needed for locking logical sets of records (index locking/predicate locking).
1.61
Summary (Contd.)
1.62