
Deadlocks & Synchronization

When multiple threads or processes share resources, two fundamental problems arise: synchronization (how do we coordinate access to shared data?) and deadlock (what happens when processes get stuck waiting for each other forever?). These topics are among the most important in operating systems — they appear in every OS course and most technical interviews, and they are the root cause of some of the hardest bugs in real-world systems.

The Critical Section Problem

A critical section is a segment of code that accesses shared resources (variables, files, databases) and must not be executed by more than one thread at a time. The critical section problem is: how do we ensure mutual exclusion?

Thread A                    Thread B
─────────                   ─────────
// Entry section            // Entry section
acquire(lock)               acquire(lock)   ← blocks until A releases
// Critical section         // Critical section
balance = balance + 100     balance = balance - 50
// Exit section             // Exit section
release(lock)               release(lock)
// Remainder section        // Remainder section

Requirements for a Correct Solution

Any solution to the critical section problem must satisfy three conditions:

Requirement      | Description
-----------------|------------------------------------------------------------------
Mutual Exclusion | At most one thread can be in its critical section at any time
Progress         | If no thread is in its critical section and some threads want to enter, the selection cannot be postponed indefinitely
Bounded Waiting  | There is a limit on the number of times other threads can enter their critical sections after a thread has requested entry

Synchronization Primitives

Mutex (Mutual Exclusion Lock)

A mutex is the simplest synchronization primitive. It has two states: locked and unlocked. Only one thread can hold the mutex at a time. If a thread tries to acquire a locked mutex, it blocks until the mutex is released.

import threading

# Shared resource
balance = 0
mutex = threading.Lock()

def deposit(amount, iterations):
    global balance
    for _ in range(iterations):
        mutex.acquire()
        try:
            # Critical section
            balance += amount
        finally:
            mutex.release()

def withdraw(amount, iterations):
    global balance
    for _ in range(iterations):
        with mutex:  # Pythonic way (context manager)
            # Critical section
            balance -= amount

# Without mutex: race condition → unpredictable balance
# With mutex: balance is always correct
threads = [
    threading.Thread(target=deposit, args=(1, 100_000)),
    threading.Thread(target=withdraw, args=(1, 100_000)),
]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(f"Final balance: {balance}")  # Always 0 with mutex

Semaphore

A semaphore is a more general synchronization primitive than a mutex. It maintains an integer counter and supports two atomic operations:

  • wait() (also called P() or down()): Decrement the counter. If the counter becomes negative, the calling thread blocks.
  • signal() (also called V() or up()): Increment the counter. If threads are waiting, wake one up.
Type               | Counter Value | Use Case
-------------------|---------------|--------------------------------------------------------------------
Binary Semaphore   | 0 or 1        | Same as mutex — mutual exclusion
Counting Semaphore | 0 to N        | Controlling access to N identical resources (e.g., connection pool)
import threading
import time

# Counting semaphore: limit concurrent access to 3
pool_semaphore = threading.Semaphore(3)

def access_resource(thread_id):
    print(f"Thread {thread_id}: waiting for resource...")
    pool_semaphore.acquire()
    try:
        print(f"Thread {thread_id}: acquired resource")
        time.sleep(2)  # Simulate work
    finally:
        print(f"Thread {thread_id}: releasing resource")
        pool_semaphore.release()

threads = []
for i in range(8):
    t = threading.Thread(target=access_resource, args=(i,))
    threads.append(t)
    t.start()
for t in threads:
    t.join()

# Binary semaphore for signaling (producer-consumer pattern)
items_available = threading.Semaphore(0)  # Start at 0
buffer = []

def producer():
    for i in range(5):
        item = f"item-{i}"
        buffer.append(item)
        print(f"Produced: {item}")
        items_available.release()  # Signal: item available
        time.sleep(0.5)

def consumer():
    for _ in range(5):
        items_available.acquire()  # Wait for item
        item = buffer.pop(0)
        print(f"Consumed: {item}")

p = threading.Thread(target=producer)
c = threading.Thread(target=consumer)
p.start(); c.start()
p.join(); c.join()

Monitor

A monitor is a higher-level synchronization construct that encapsulates shared data, the operations on that data, and the synchronization logic into a single module. Only one thread can be active inside a monitor at a time.

Monitors include condition variables for signaling:

  • wait(condition) — Release the monitor lock and sleep until signaled
  • signal(condition) — Wake up one thread waiting on this condition
  • broadcast(condition) — Wake up all threads waiting on this condition
┌────────────────────────────────────────┐
│                Monitor                 │
│                                        │
│  Shared Data:                          │
│    - buffer[]                          │
│    - count                             │
│                                        │
│  Condition Variables:                  │
│    - not_full                          │
│    - not_empty                         │
│                                        │
│  Operations:                           │
│    - insert(item)                      │
│    - remove() → item                   │
│                                        │
│  ┌──────────────────────┐              │
│  │ Entry Queue          │              │
│  │ (threads waiting     │              │
│  │  to enter monitor)   │              │
│  └──────────────────────┘              │
│                                        │
│  ┌──────────────────────┐              │
│  │ Condition Queues     │              │
│  │ not_full:  [T3]      │              │
│  │ not_empty: [T1, T5]  │              │
│  └──────────────────────┘              │
└────────────────────────────────────────┘
import threading
import time

class BoundedBuffer:
    """Monitor-style bounded buffer using Condition variables."""

    def __init__(self, capacity):
        self.buffer = []
        self.capacity = capacity
        self.lock = threading.Lock()
        self.not_full = threading.Condition(self.lock)
        self.not_empty = threading.Condition(self.lock)

    def insert(self, item):
        with self.not_full:  # Acquire lock
            while len(self.buffer) >= self.capacity:
                self.not_full.wait()  # Release lock and wait
            self.buffer.append(item)
            self.not_empty.notify()  # Signal waiting consumers

    def remove(self):
        with self.not_empty:  # Acquire lock
            while len(self.buffer) == 0:
                self.not_empty.wait()  # Release lock and wait
            item = self.buffer.pop(0)
            self.not_full.notify()  # Signal waiting producers
            return item

# Usage
buf = BoundedBuffer(5)

def producer(name):
    for i in range(10):
        item = f"{name}-{i}"
        buf.insert(item)
        print(f"{name} produced {item}")
        time.sleep(0.1)

def consumer(name):
    for _ in range(10):
        item = buf.remove()
        print(f"{name} consumed {item}")
        time.sleep(0.2)

t1 = threading.Thread(target=producer, args=("P1",))
t2 = threading.Thread(target=consumer, args=("C1",))
t1.start(); t2.start()
t1.join(); t2.join()

Deadlock

A deadlock occurs when two or more processes are waiting for each other to release resources, and none of them can proceed. The system is stuck.

Deadlock Example:

Thread A                          Thread B
─────────                         ─────────
1. acquire(Lock_X) ✓              1. acquire(Lock_Y) ✓
2. acquire(Lock_Y) ← BLOCKED      2. acquire(Lock_X) ← BLOCKED

A holds X, wants Y                B holds Y, wants X

┌──────────┐   holds    ┌──────────┐
│ Thread A │◄───────────│  Lock_X  │
└────┬─────┘            └──────────┘
     │ wants                  ▲
     ▼                        │ wants
┌──────────┐   holds    ┌─────┴────┐
│  Lock_Y  │───────────►│ Thread B │
└──────────┘            └──────────┘

Cycle: A → Lock_Y → B → Lock_X → A

Four Necessary Conditions for Deadlock

All four conditions must hold simultaneously for a deadlock to occur:

ConditionDescriptionExample
1. Mutual ExclusionAt least one resource is held in a non-sharable modeA printer can only be used by one process at a time
2. Hold and WaitA process holding at least one resource is waiting to acquire additional resources held by other processesProcess holds Lock A and waits for Lock B
3. No PreemptionResources cannot be forcibly taken from a process; they must be released voluntarilyOS cannot revoke a file lock from a process
4. Circular WaitA circular chain of processes exists where each process holds a resource needed by the nextP1 waits for P2, P2 waits for P3, P3 waits for P1

Resource Allocation Graph

A resource allocation graph visualizes the relationship between processes and resources. A cycle in the graph indicates a potential deadlock.

Resource Allocation Graph:

┌──────┐   assign   ┌──────┐
│  R1  │ ─────────► │  P1  │
└──────┘            └──┬───┘
    ▲                  │ request
    │ request          ▼
┌───┴──┐   assign   ┌──────┐
│  P2  │ ◄───────── │  R2  │
└──────┘            └──────┘

P1 holds R1, requests R2
P2 holds R2, requests R1
→ Cycle exists → DEADLOCK (with single-instance resources)

Note: With multi-instance resources, a cycle is necessary
but not sufficient for deadlock.

Deadlock Prevention

Prevent deadlock by ensuring that at least one of the four necessary conditions cannot hold.

Condition to Break | Prevention Strategy                                                                            | Trade-off
-------------------|------------------------------------------------------------------------------------------------|-----------------------------------------------
Mutual Exclusion   | Make resources sharable (e.g., read-only files)                                                | Not always possible (printers, write locks)
Hold and Wait      | Require processes to request ALL resources before starting, or release all before requesting new ones | Poor resource utilization, possible starvation
No Preemption      | If a process cannot get a resource, force it to release all held resources and retry           | Expensive, possible livelock
Circular Wait      | Impose a total ordering on resources; processes must request in increasing order               | Requires knowing all resources in advance

Breaking Circular Wait (Most Practical)

Assign a global order to all resources (R1 < R2 < R3 …) and require every thread to acquire resources in increasing order:

Resource ordering: Lock_X = 1, Lock_Y = 2

Thread A (correct):        Thread B (correct):
acquire(Lock_X)  // 1      acquire(Lock_X)  // 1
acquire(Lock_Y)  // 2      acquire(Lock_Y)  // 2
...                        ...
release(Lock_Y)            release(Lock_Y)
release(Lock_X)            release(Lock_X)

Both threads always acquire Lock_X before Lock_Y.
→ No circular wait → No deadlock!
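The same discipline can be sketched in Python. Here `LOCK_ORDER` and `acquire_in_order` are illustrative helpers (not standard-library names) that normalize any acquisition into the global order:

```python
import threading

lock_x = threading.Lock()  # rank 1
lock_y = threading.Lock()  # rank 2

# Global ordering of all locks in the system (hypothetical registry)
LOCK_ORDER = {id(lock_x): 1, id(lock_y): 2}

def acquire_in_order(*locks):
    """Sort locks by global rank, then acquire — breaks circular wait."""
    ordered = sorted(locks, key=lambda l: LOCK_ORDER[id(l)])
    for l in ordered:
        l.acquire()
    return ordered

def release_all(locks):
    for l in reversed(locks):
        l.release()

def worker(name, first, second, log):
    for _ in range(1000):
        held = acquire_in_order(first, second)  # order normalized here
        log.append(name)                        # critical section
        release_all(held)

log = []
# Thread B passes the locks in the "wrong" order; the helper fixes it.
ta = threading.Thread(target=worker, args=("A", lock_x, lock_y, log))
tb = threading.Thread(target=worker, args=("B", lock_y, lock_x, log))
ta.start(); tb.start()
ta.join(); tb.join()
print(len(log))  # 2000 -- both threads finished, no deadlock
```

Without the normalization, passing the locks in opposite orders from two threads is exactly the deadlock scenario shown earlier.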

Deadlock Avoidance: Banker’s Algorithm

Deadlock avoidance allows the four conditions to potentially hold but makes careful resource-allocation decisions to ensure the system never enters a deadlock state. The most famous avoidance algorithm is the Banker’s Algorithm, proposed by Dijkstra.

Concept

The system maintains information about:

  • Available: Resources currently available
  • Maximum: Maximum demand of each process
  • Allocation: Resources currently allocated to each process
  • Need: Remaining resources each process may need (Need = Maximum - Allocation)

Before granting a request, the system checks if the resulting state is safe — meaning there exists at least one sequence in which all processes can complete.

Worked Example

Given: 3 resource types (A, B, C) with total instances (10, 5, 7).

Process | Allocation (A B C) | Maximum (A B C) | Need (A B C)
--------|--------------------|-----------------|-------------
P0      | 0 1 0              | 7 5 3           | 7 4 3
P1      | 2 0 0              | 3 2 2           | 1 2 2
P2      | 3 0 2              | 9 0 2           | 6 0 0
P3      | 2 1 1              | 2 2 2           | 0 1 1
P4      | 0 0 2              | 4 3 3           | 4 3 1

Available = Total - Sum(Allocation) = (10,5,7) - (7,2,5) = (3, 3, 2)

Safety Check — Find a safe sequence:

Step 1: Available = (3, 3, 2)
        Can P1 run? Need (1,2,2) <= Available (3,3,2)? YES
        → Run P1. Available = (3,3,2) + Allocation_P1 (2,0,0) = (5, 3, 2)

Step 2: Available = (5, 3, 2)
        Can P3 run? Need (0,1,1) <= Available (5,3,2)? YES
        → Run P3. Available = (5,3,2) + Allocation_P3 (2,1,1) = (7, 4, 3)

Step 3: Available = (7, 4, 3)
        Can P4 run? Need (4,3,1) <= Available (7,4,3)? YES
        → Run P4. Available = (7,4,3) + Allocation_P4 (0,0,2) = (7, 4, 5)

Step 4: Available = (7, 4, 5)
        Can P0 run? Need (7,4,3) <= Available (7,4,5)? YES
        → Run P0. Available = (7,4,5) + Allocation_P0 (0,1,0) = (7, 5, 5)

Step 5: Available = (7, 5, 5)
        Can P2 run? Need (6,0,0) <= Available (7,5,5)? YES
        → Run P2. Available = (7,5,5) + Allocation_P2 (3,0,2) = (10, 5, 7)

Safe sequence: <P1, P3, P4, P0, P2>
System is in a SAFE STATE.

If a new request would leave the system in an unsafe state, the request is denied (the process must wait).
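The safety check can be sketched as a short Python function. `is_safe` is an illustrative name; the matrices reproduce the worked example above:

```python
def is_safe(available, allocation, need):
    """Banker's safety check: return a safe sequence of process
    indices, or None if the state is unsafe."""
    n = len(allocation)
    work = list(available)        # resources free right now
    finished = [False] * n
    sequence = []
    progress = True
    while progress:
        progress = False
        for i in range(n):
            if not finished[i] and all(need[i][j] <= work[j]
                                       for j in range(len(work))):
                # Pretend P_i runs to completion and returns its resources
                for j in range(len(work)):
                    work[j] += allocation[i][j]
                finished[i] = True
                sequence.append(i)
                progress = True
    return sequence if all(finished) else None

# Worked example from the text (resource types A, B, C)
allocation = [[0,1,0], [2,0,0], [3,0,2], [2,1,1], [0,0,2]]
maximum    = [[7,5,3], [3,2,2], [9,0,2], [2,2,2], [4,3,3]]
need = [[maximum[i][j] - allocation[i][j] for j in range(3)]
        for i in range(5)]
available = [3, 3, 2]

print(is_safe(available, allocation, need))  # [1, 3, 4, 0, 2]
```

The printed sequence `[1, 3, 4, 0, 2]` is exactly the safe sequence <P1, P3, P4, P0, P2> found by hand above. To evaluate a new request, the system would tentatively grant it and rerun this check on the resulting state.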


Deadlock Detection and Recovery

Instead of preventing or avoiding deadlock, let deadlocks happen and then detect and recover.

Detection

  • Single instance resources: Use a wait-for graph (simplified resource allocation graph with only processes). A cycle means deadlock.
  • Multiple instance resources: Use a detection algorithm similar to Banker’s but without the Maximum/Need matrices.
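For the single-instance case, detecting a cycle in a wait-for graph is a standard depth-first search. A minimal sketch (`has_deadlock` is an illustrative name):

```python
def has_deadlock(wait_for):
    """Detect a cycle in a wait-for graph given as
    {process: [processes it is waiting on]}."""
    WHITE, GRAY, BLACK = 0, 1, 2        # unvisited / on stack / done
    color = {p: WHITE for p in wait_for}

    def dfs(p):
        color[p] = GRAY
        for q in wait_for.get(p, []):
            if color.get(q, WHITE) == GRAY:   # back edge -> cycle
                return True
            if color.get(q, WHITE) == WHITE and dfs(q):
                return True
        color[p] = BLACK
        return False

    return any(color[p] == WHITE and dfs(p) for p in wait_for)

# P1 waits for P2, P2 waits for P3, P3 waits for P1 -> circular wait
print(has_deadlock({"P1": ["P2"], "P2": ["P3"], "P3": ["P1"]}))  # True
print(has_deadlock({"P1": ["P2"], "P2": ["P3"], "P3": []}))      # False
```

Real kernels and databases run this kind of check periodically or when a wait exceeds a threshold, since detection itself has a cost.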

Recovery

Strategy            | Description                                     | Trade-off
--------------------|-------------------------------------------------|----------------------------------------------------------------
Process termination | Kill one or all deadlocked processes            | Loss of work; choose victim carefully
Resource preemption | Take resources from one process, give to another | Must handle rollback; risk of starvation if same process always chosen

In practice, many systems (Linux, Windows) use a combination: prevent circular wait where possible, and if a deadlock is suspected, use timeouts to break it (e.g., database lock timeouts).


Classic Synchronization Problems

1. Producer-Consumer Problem

One or more producers generate data and place it in a shared buffer. One or more consumers remove data from the buffer. The buffer has a fixed capacity.

Constraints: Producers must wait if the buffer is full. Consumers must wait if the buffer is empty. Access to the buffer must be mutually exclusive.

This problem was solved above with the BoundedBuffer monitor. It can also be solved with semaphores:

Semaphore Solution:

mutex = Semaphore(1)   // mutual exclusion for buffer access
empty = Semaphore(N)   // counts empty slots (initially N)
full  = Semaphore(0)   // counts full slots (initially 0)

Producer:                 Consumer:
wait(empty)               wait(full)
wait(mutex)               wait(mutex)
// add item to buffer     // remove item from buffer
signal(mutex)             signal(mutex)
signal(full)              signal(empty)
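The pseudocode maps directly onto Python's `threading.Semaphore`; a minimal sketch with one producer and one consumer:

```python
import threading

N = 5
buffer = []
mutex = threading.Semaphore(1)   # mutual exclusion for buffer access
empty = threading.Semaphore(N)   # counts empty slots (initially N)
full  = threading.Semaphore(0)   # counts full slots (initially 0)

def producer(count):
    for i in range(count):
        empty.acquire()          # wait(empty)
        with mutex:              # wait(mutex) ... signal(mutex)
            buffer.append(i)     # add item to buffer
        full.release()           # signal(full)

def consumer(count, out):
    for _ in range(count):
        full.acquire()           # wait(full)
        with mutex:
            out.append(buffer.pop(0))  # remove item from buffer
        empty.release()          # signal(empty)

consumed = []
p = threading.Thread(target=producer, args=(20,))
c = threading.Thread(target=consumer, args=(20, consumed))
p.start(); c.start()
p.join(); c.join()
print(consumed == list(range(20)))  # True: all items, in FIFO order
```

Note the ordering: the producer waits on `empty` *before* taking `mutex`. Reversing those two waits can deadlock (the producer sleeps on `empty` while holding `mutex`, blocking the consumer that would free a slot).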

2. Readers-Writers Problem

Multiple readers can access shared data simultaneously, but a writer needs exclusive access. No reader should wait unless a writer is already using the shared data.

Readers-Writers Solution (First Readers-Writers):

mutex = Semaphore(1)      // protect read_count
rw_mutex = Semaphore(1)   // mutual exclusion for writers
read_count = 0            // number of active readers

Writer:
    wait(rw_mutex)
    // write data
    signal(rw_mutex)

Reader:
    wait(mutex)
    read_count++
    if read_count == 1:
        wait(rw_mutex)    // first reader locks out writers
    signal(mutex)
    // read data
    wait(mutex)
    read_count--
    if read_count == 0:
        signal(rw_mutex)  // last reader unlocks for writers
    signal(mutex)

Note: This solution favors readers — a continuous stream of readers can starve writers. The Second Readers-Writers problem gives priority to writers.
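The first readers-writers pseudocode translates almost line-for-line into Python semaphores. The sketch below runs one writer and three readers (thread names and the `shared` dict are illustrative):

```python
import threading

mutex = threading.Semaphore(1)      # protects read_count
rw_mutex = threading.Semaphore(1)   # mutual exclusion for writers
read_count = 0
shared = {"value": 0}
log = []

def reader(name):
    global read_count
    with mutex:
        read_count += 1
        if read_count == 1:
            rw_mutex.acquire()      # first reader locks out writers
    log.append(f"{name} read {shared['value']}")   # read data
    with mutex:
        read_count -= 1
        if read_count == 0:
            rw_mutex.release()      # last reader unlocks for writers

def writer(name):
    with rw_mutex:
        shared["value"] += 1        # write data (exclusive access)
        log.append(f"{name} wrote {shared['value']}")

threads = [threading.Thread(target=writer, args=("W1",))] + \
          [threading.Thread(target=reader, args=(f"R{i}",)) for i in range(3)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(len(log))  # 4 entries: one write plus three reads
```

All three readers may hold the "read side" concurrently, while the writer's update happens under exclusive access, matching the pseudocode's guarantees.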

3. Dining Philosophers Problem

Five philosophers sit around a table with five forks. Each philosopher needs both the left and right fork to eat. They alternate between thinking and eating.

            P0
        F4      F0
     P4            P1
        F3      F1
         P3    P2
            F2

Naive solution (DEADLOCKS):

philosopher(i):
    while true:
        think()
        pick_up(fork[i])           // left fork
        pick_up(fork[(i+1) % 5])   // right fork
        eat()
        put_down(fork[i])
        put_down(fork[(i+1) % 5])

If all 5 pick up their left fork simultaneously → deadlock!

Solutions:

Solution          | How It Works
------------------|--------------------------------------------------------------------------------------------
Resource ordering | Philosopher picks up lower-numbered fork first. Breaks circular wait.
Limit concurrency | Allow at most 4 philosophers to attempt eating simultaneously (use a semaphore initialized to 4).
Asymmetric        | Even-numbered philosophers pick up left first; odd-numbered pick up right first.
Monitor           | Philosopher checks both forks are available before picking up either.
Resource Ordering Solution:

philosopher(i):
    first = min(i, (i+1) % 5)
    second = max(i, (i+1) % 5)
    while true:
        think()
        pick_up(fork[first])    // always lower-numbered first
        pick_up(fork[second])   // then higher-numbered
        eat()
        put_down(fork[second])
        put_down(fork[first])

This breaks the circular wait condition → no deadlock.
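A minimal Python sketch of the resource-ordering solution, using one Lock per fork (the `meals` counter is added just to verify everyone eats):

```python
import threading

N = 5
forks = [threading.Lock() for _ in range(N)]
meals = [0] * N

def philosopher(i, rounds):
    left, right = i, (i + 1) % N
    # Global fork ordering: always grab the lower-numbered fork first
    first, second = min(left, right), max(left, right)
    for _ in range(rounds):
        with forks[first]:
            with forks[second]:
                meals[i] += 1   # eat()

threads = [threading.Thread(target=philosopher, args=(i, 100))
           for i in range(N)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(meals)  # [100, 100, 100, 100, 100] -- everyone ate, no deadlock
```

With the naive left-then-right order this program could hang; the `min`/`max` normalization means philosopher 4 competes for fork 0 before fork 4, so no cycle of waiters can form.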

Summary: Synchronization Primitives Comparison

Primitive          | Level                        | Key Feature                           | When to Use
-------------------|------------------------------|---------------------------------------|------------------------------------------
Mutex              | Low                          | Binary lock; only owner can unlock    | Protecting a single shared resource
Semaphore          | Low                          | Counter-based; any thread can signal  | Limiting concurrent access to N resources
Monitor            | High                         | Encapsulates data + operations + sync | Complex shared data structures
Condition Variable | Used with mutex/monitor      | Wait/signal for specific conditions   | Producer-consumer, readers-writers
Read-Write Lock    | Medium                       | Multiple readers OR one writer        | Read-heavy workloads
Atomic Operations  | Hardware                     | Lock-free, single instruction         | Simple counters and flags

Key Takeaways

  1. The critical section problem requires mutual exclusion, progress, and bounded waiting
  2. Mutexes provide simple binary locking; semaphores generalize to counting; monitors provide structured high-level synchronization
  3. Deadlock requires all four conditions: mutual exclusion, hold-and-wait, no preemption, and circular wait
  4. Prevention breaks at least one condition (resource ordering is most practical); avoidance uses the Banker’s Algorithm to stay in safe states; detection finds deadlocks after they occur
  5. The Banker’s Algorithm checks if a safe sequence exists before granting resource requests
  6. Classic problems (Producer-Consumer, Readers-Writers, Dining Philosophers) appear repeatedly in exams and interviews — know at least one solution for each
  7. In real-world systems, lock ordering and timeouts are the most common deadlock prevention strategies

Next Steps