In a Database Management System (DBMS), multiple users often access and modify the database at the same time. This is called concurrent access. While concurrency increases efficiency and performance, it can also lead to problems such as data inconsistency, lost updates, or incorrect results.
Concurrency Control is the process of managing simultaneous transactions in a way that ensures data consistency, integrity, and isolation. It ensures that even if multiple transactions happen at the same time, the outcome is the same as if they were executed one after another.
Why is Concurrency Control Needed?
When multiple users or processes perform operations on the database at the same time, the following issues can occur:
- Lost Update Problem – When two transactions overwrite each other’s changes.
Example:- Transaction 1 updates a salary from ₹50,000 to ₹55,000.
- Transaction 2 updates it to ₹52,000 without knowing the first change.
- The update from Transaction 1 is lost.
- Dirty Read Problem – A transaction reads uncommitted changes made by another transaction. If that change is later rolled back, the read data becomes invalid.
- Unrepeatable Read Problem – A transaction reads the same data twice but gets different results because another transaction updated the data in between.
- Phantom Read Problem – A transaction re-executes a query and finds additional records that weren’t there before because another transaction inserted them.
Objectives of Concurrency Control
- Consistency – Ensure that the database remains in a valid state.
- Isolation – Each transaction should work independently, as if it is the only one running.
- Atomicity – A transaction is either completed fully or not at all.
- Serializability – The final result of concurrent transactions should be the same as if transactions were executed in sequence.
Methods of Concurrency Control
There are two main categories of concurrency control techniques:
1. Lock-Based Protocols
Locks prevent multiple transactions from accessing the same data in a conflicting way.
- Shared Lock (S-Lock) – Allows read-only access. Multiple transactions can have shared locks on the same data.
- Exclusive Lock (X-Lock) – Allows read and write access, but only one transaction can hold an exclusive lock on a data item.
Common lock-based protocols:
- Two-Phase Locking (2PL) – Transactions acquire all required locks before releasing any lock, ensuring serializability.
- Strict Two-Phase Locking – Locks are released only after the transaction commits, preventing cascading rollbacks.
2. Timestamp-Based Protocols
Each transaction is given a unique timestamp when it starts. The DBMS uses these timestamps to decide the order of execution, ensuring older transactions get priority.
Rules:
- Read Timestamp – The largest timestamp of any transaction that successfully read a data item.
- Write Timestamp – The largest timestamp of any transaction that successfully wrote to a data item.
3. Optimistic Concurrency Control (OCC)
Assumes that conflicts are rare and allows transactions to execute without locking. At commit time, the DBMS checks for conflicts. If a conflict is found, the transaction is rolled back.
Steps in OCC:
- Read Phase – Transaction reads data without locking.
- Validation Phase – Checks if any conflict occurred.
- Write Phase – Updates the database if validation succeeds.
4. Multiversion Concurrency Control (MVCC)
Instead of locking, MVCC keeps multiple versions of data. Transactions read the version valid at their start time, avoiding conflicts with write operations.
- Used in PostgreSQL, MySQL (InnoDB), and Oracle databases.
Example of Concurrency Control in Action
Suppose two customers, A and B, are withdrawing money from the same bank account:
- Without Concurrency Control:
- Both read balance ₹10,000.
- A withdraws ₹3,000 → New balance should be ₹7,000.
- B withdraws ₹5,000 → Based on old balance ₹10,000, new balance is ₹5,000.
- Final result: ₹5,000 (₹3,000 withdrawal lost).
- With Concurrency Control (Lock):
- A gets exclusive lock → Withdraws ₹3,000 → Updates balance to ₹7,000.
- Lock is released, then B withdraws ₹5,000 from ₹7,000 → Updates balance to ₹2,000.
- Final result: ₹2,000 (Correct).
Advantages of Concurrency Control
- Maintains data consistency and integrity.
- Prevents data anomalies like lost updates.
- Ensures fair access to resources.
- Improves user experience in multi-user systems.
Disadvantages
- Locking may cause deadlocks (two transactions waiting for each other’s locks).
- Can increase transaction waiting time.
- Requires careful design for performance optimization.
Conclusion
Concurrency control is a critical component of any database system that supports multiple users or processes. Without it, simultaneous access could lead to incorrect, inconsistent, or lost data. By using techniques like locking, timestamps, MVCC, and optimistic control, databases ensure that transactions run smoothly while keeping the data accurate and reliable.