
Threads & Synchronization

Threads are the fundamental building blocks of concurrent programs. A thread is a lightweight unit of execution within a process — all threads in a process share the same memory space, which makes communication fast but also introduces synchronization challenges.

Thread Fundamentals

What Is a Thread?

A process can contain one or more threads. Each thread has its own:

  • Program counter — tracks which instruction is executing
  • Stack — stores local variables and function call frames
  • Register set — the CPU registers assigned to the thread

But threads within the same process share:

  • Heap memory — dynamically allocated objects
  • Global variables — static/global data
  • File descriptors — open files, sockets, and other I/O resources
  • Code segment — the executable instructions

┌─────────────────── Process ───────────────────┐
│                                               │
│  Shared: Heap, Global Vars, Code, File Descs  │
│                                               │
│  ┌─────────┐   ┌─────────┐   ┌─────────┐      │
│  │ Thread 1│   │ Thread 2│   │ Thread 3│      │
│  │  Stack  │   │  Stack  │   │  Stack  │      │
│  │   PC    │   │   PC    │   │   PC    │      │
│  │  Regs   │   │  Regs   │   │  Regs   │      │
│  └─────────┘   └─────────┘   └─────────┘      │
└───────────────────────────────────────────────┘
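
The split above can be seen directly in code. In this short sketch (the `shared` list and `worker` function are illustrative names), each thread's local variable lives on its own stack, while the module-level list lives in heap memory that all threads share:

```python
import threading

shared = []  # lives on the heap -- visible to every thread

def worker(thread_id):
    local = thread_id * 10   # lives on this thread's stack -- private
    shared.append(local)     # mutating shared state is visible everywhere

threads = [threading.Thread(target=worker, args=(i,)) for i in range(3)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(sorted(shared))  # [0, 10, 20] -- all threads wrote to the same list
```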

Thread Lifecycle

A thread transitions through several states during its lifetime:

┌────────┐  start()   ┌──────────┐
│  New   │───────────►│ Runnable │◄───────────────┐
└────────┘            └────┬─────┘                │
                           │                      │
                       scheduled            signal/notify
                           │                      │
                           ▼                      │
                      ┌─────────┐  wait()    ┌────┴────┐
                      │ Running │───────────►│ Waiting │
                      └────┬────┘            └─────────┘
                           │
                       completed
                           │
                           ▼
                      ┌─────────┐
                      │  Dead   │
                      └─────────┘
  • New: Thread object is created but not yet started
  • Runnable: Thread is ready to run and waiting for CPU time
  • Running: Thread is actively executing on a CPU core
  • Waiting/Blocked: Thread is waiting for a lock, I/O, or a signal
  • Dead/Terminated: Thread has finished execution
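
These states map directly onto the `threading` API. A sketch (the `task` function is illustrative): `is_alive()` returns False before `start()` is called and again after the thread terminates, with `join()` blocking until the Dead state is reached:

```python
import threading
import time

def task():
    time.sleep(0.2)

t = threading.Thread(target=task)
states = [t.is_alive()]      # New: created but not started -> False
t.start()
states.append(t.is_alive())  # Runnable/Running -> True
t.join()                     # blocks until the thread is Dead
states.append(t.is_alive())  # Terminated -> False
print(states)  # [False, True, False]
```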

Creating and Managing Threads

import threading
import time

# Method 1: Pass a target function
def worker(name, delay):
    print(f"Thread {name} starting")
    time.sleep(delay)
    print(f"Thread {name} finished")

t1 = threading.Thread(target=worker, args=("A", 2))
t2 = threading.Thread(target=worker, args=("B", 1))
t1.start()
t2.start()

# Wait for both threads to complete
t1.join()
t2.join()
print("All threads done")

# Method 2: Subclass Thread
class MyThread(threading.Thread):
    def __init__(self, name):
        super().__init__()
        self.name = name

    def run(self):
        print(f"Thread {self.name} running")
        time.sleep(1)
        print(f"Thread {self.name} done")

threads = [MyThread(f"Worker-{i}") for i in range(3)]
for t in threads:
    t.start()
for t in threads:
    t.join()

# Daemon threads -- automatically killed when main thread exits
daemon = threading.Thread(target=worker, args=("Daemon", 10), daemon=True)
daemon.start()
# Program exits without waiting for daemon to finish

Synchronization Primitives

When multiple threads access shared data, you need synchronization to prevent data corruption. Below are the essential primitives.

Mutex (Mutual Exclusion Lock)

A mutex ensures that only one thread can enter a critical section at a time. If a thread tries to acquire a locked mutex, it blocks until the mutex is released.

import threading

counter = 0
lock = threading.Lock()

def increment(n):
    global counter
    for _ in range(n):
        lock.acquire()
        try:
            counter += 1  # Critical section
        finally:
            lock.release()

# Preferred: use the context manager
def increment_safe(n):
    global counter
    for _ in range(n):
        with lock:  # Automatically acquires and releases
            counter += 1

threads = [threading.Thread(target=increment_safe, args=(100000,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(f"Counter: {counter}")  # Always 400000
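
One caveat: a plain `Lock` is not reentrant, so a thread that already holds it will deadlock trying to acquire it again. `threading.RLock` lets the same thread re-acquire a lock it already holds. A minimal sketch (the `outer`/`inner` names are illustrative):

```python
import threading

rlock = threading.RLock()

def inner():
    with rlock:        # same thread re-acquires -- fine with RLock,
        return "done"  # would deadlock with a plain Lock

def outer():
    with rlock:        # first acquisition
        return inner()

print(outer())  # done
```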

Semaphores

A semaphore maintains a counter and allows up to N threads to access a resource concurrently. A semaphore with N = 1 (a binary semaphore) behaves much like a mutex, with one difference: a mutex has an owner, so only the thread that locked it may unlock it, whereas any thread may release a semaphore.

import threading
import time

# Allow at most 3 concurrent connections
connection_pool = threading.Semaphore(3)

def access_database(thread_id):
    print(f"Thread {thread_id} waiting for connection...")
    with connection_pool:
        print(f"Thread {thread_id} connected (slot acquired)")
        time.sleep(2)  # Simulate database work
        print(f"Thread {thread_id} disconnected (slot released)")

threads = [threading.Thread(target=access_database, args=(i,)) for i in range(8)]
for t in threads:
    t.start()
for t in threads:
    t.join()
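
`Semaphore.acquire()` also supports non-blocking and timed attempts via its `blocking` and `timeout` parameters, useful when a thread should give up rather than wait indefinitely. A small sketch:

```python
import threading

sem = threading.Semaphore(1)

sem.acquire()                     # take the only slot
r1 = sem.acquire(blocking=False)  # fail immediately instead of waiting
r2 = sem.acquire(timeout=0.1)     # wait up to 0.1 s, then give up
sem.release()                     # free the slot
r3 = sem.acquire(blocking=False)  # succeeds now
print(r1, r2, r3)  # False False True
```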

Condition Variables

A condition variable allows threads to wait for a specific condition to become true, rather than busy-waiting or polling. Threads waiting on a condition variable are woken up by a signal or broadcast from another thread.

import threading
import time
import random

queue = []
MAX_SIZE = 5
condition = threading.Condition()

def producer():
    for i in range(10):
        with condition:
            while len(queue) >= MAX_SIZE:
                print("Producer waiting -- queue full")
                condition.wait()
            item = random.randint(1, 100)
            queue.append(item)
            print(f"Produced: {item} (queue size: {len(queue)})")
            condition.notify_all()  # Wake up consumers
        time.sleep(random.uniform(0.1, 0.5))

def consumer(name):
    for _ in range(5):
        with condition:
            while len(queue) == 0:
                print(f"Consumer {name} waiting -- queue empty")
                condition.wait()
            item = queue.pop(0)
            print(f"Consumer {name} consumed: {item} (queue size: {len(queue)})")
            condition.notify_all()  # Wake up producer
        time.sleep(random.uniform(0.2, 0.6))

p = threading.Thread(target=producer)
c1 = threading.Thread(target=consumer, args=("A",))
c2 = threading.Thread(target=consumer, args=("B",))
p.start(); c1.start(); c2.start()
p.join(); c1.join(); c2.join()

Read-Write Locks

A read-write lock allows multiple concurrent readers but only one writer. This is ideal when reads are far more frequent than writes.

import threading

# Python does not have a built-in RWLock, but you can use a simple implementation
class ReadWriteLock:
    def __init__(self):
        self._read_ready = threading.Condition(threading.Lock())
        self._readers = 0

    def acquire_read(self):
        with self._read_ready:
            self._readers += 1

    def release_read(self):
        with self._read_ready:
            self._readers -= 1
            if self._readers == 0:
                self._read_ready.notify_all()

    def acquire_write(self):
        self._read_ready.acquire()
        while self._readers > 0:
            self._read_ready.wait()

    def release_write(self):
        self._read_ready.release()

# Usage
rw_lock = ReadWriteLock()
shared_data = {"value": 0}

def reader(reader_id):
    rw_lock.acquire_read()
    try:
        print(f"Reader {reader_id} reads: {shared_data['value']}")
    finally:
        rw_lock.release_read()

def writer(value):
    rw_lock.acquire_write()
    try:
        shared_data["value"] = value
        print(f"Writer set value to {value}")
    finally:
        rw_lock.release_write()

Barriers

A barrier is a synchronization point where all threads must arrive before any of them can proceed. Useful for phased computations where each phase depends on the results of the previous phase.

import threading

barrier = threading.Barrier(3)  # Wait for 3 threads

def phase_worker(thread_id):
    # Phase 1
    print(f"Thread {thread_id}: Phase 1 complete")
    barrier.wait()  # Wait for all threads to finish Phase 1
    # Phase 2 -- only starts after all threads finish Phase 1
    print(f"Thread {thread_id}: Phase 2 complete")
    barrier.wait()
    print(f"Thread {thread_id}: All phases done")

threads = [threading.Thread(target=phase_worker, args=(i,)) for i in range(3)]
for t in threads:
    t.start()
for t in threads:
    t.join()
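
`threading.Barrier` also accepts an `action` callable, run exactly once per barrier trip by one of the waiting threads just before they are all released, which is handy for per-phase bookkeeping. A sketch (the `results` list is illustrative):

```python
import threading

results = []

# The action callable runs once per trip, by one of the waiting threads,
# just before the barrier releases them all
barrier = threading.Barrier(3, action=lambda: results.append("phase done"))

def worker():
    barrier.wait()  # end of phase 1
    barrier.wait()  # end of phase 2

threads = [threading.Thread(target=worker) for _ in range(3)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(results)  # ['phase done', 'phase done'] -- once per barrier trip
```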

Thread-Safe Data Structures

Rather than adding locks around every access to a standard data structure, many languages offer data structures that handle synchronization internally.

Language | Thread-Safe Collections
---------|------------------------
Python   | queue.Queue, collections.deque (thread-safe append/popleft), multiprocessing.Queue
Java     | ConcurrentHashMap, CopyOnWriteArrayList, BlockingQueue, ConcurrentLinkedQueue
C++      | No standard concurrent containers; use std::mutex with standard containers, or third-party libraries like Intel TBB
Go       | sync.Map, channels as concurrent queues
Rust     | Arc<Mutex<T>>, crossbeam crate for lock-free structures

import queue
import threading

# Thread-safe queue (blocks automatically)
q = queue.Queue(maxsize=10)

def producer():
    for i in range(20):
        q.put(i)  # Blocks if full
        print(f"Produced: {i}")

def consumer():
    while True:
        item = q.get()  # Blocks if empty
        print(f"Consumed: {item}")
        q.task_done()

# Start consumer as daemon thread
t = threading.Thread(target=consumer, daemon=True)
t.start()
producer()
q.join()  # Wait until all items are processed

Thread Pools

Creating a new thread for every task is expensive. Thread pools maintain a set of reusable threads that pick up tasks from a work queue, reducing overhead.

from concurrent.futures import ThreadPoolExecutor, as_completed
import time

def fetch_url(url):
    """Simulate fetching a URL"""
    time.sleep(1)  # Simulate network delay
    return f"Content from {url}"

urls = [f"https://example.com/page/{i}" for i in range(10)]

# ThreadPoolExecutor manages a pool of worker threads
with ThreadPoolExecutor(max_workers=4) as executor:
    # Submit all tasks
    futures = {executor.submit(fetch_url, url): url for url in urls}
    # Process results as they complete
    for future in as_completed(futures):
        url = futures[future]
        try:
            result = future.result()
            print(f"{url}: {result}")
        except Exception as e:
            print(f"{url} generated an exception: {e}")

# Using executor.map for simpler cases
with ThreadPoolExecutor(max_workers=4) as executor:
    results = executor.map(fetch_url, urls)
    for url, result in zip(urls, results):
        print(f"{url}: {result}")

Synchronization Primitives Summary

Primitive          | Purpose                                | Use When
-------------------|----------------------------------------|--------------------------------------------
Mutex              | Exclusive access to a critical section | Protecting shared mutable state
Semaphore          | Limit concurrent access to N           | Connection pools, rate limiting
Condition Variable | Wait for a condition to become true    | Producer-consumer queues, event signaling
Read-Write Lock    | Multiple readers OR one writer         | Read-heavy workloads with infrequent writes
Barrier            | Synchronize threads at a checkpoint    | Phased parallel algorithms
Atomic Operations  | Lock-free single-variable updates      | Counters, flags, simple state
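
The atomic-operations row applies to languages with real atomic types (Java's `AtomicInteger`, C++'s `std::atomic`, Rust's atomics); Python's standard library exposes none, so the usual stand-in is a small lock-guarded wrapper. A sketch (the `AtomicCounter` class is illustrative, not a stdlib type):

```python
import threading

class AtomicCounter:
    """Lock-guarded counter mimicking an atomic integer's API."""
    def __init__(self):
        self._value = 0
        self._lock = threading.Lock()

    def increment(self):
        with self._lock:
            self._value += 1
            return self._value

    @property
    def value(self):
        with self._lock:
            return self._value

counter = AtomicCounter()

def bump():
    for _ in range(1000):
        counter.increment()

threads = [threading.Thread(target=bump) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(counter.value)  # 4000 -- no lost updates
```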

Best Practices

  1. Minimize the critical section — Hold locks for as short a time as possible to reduce contention
  2. Use high-level abstractions — Prefer thread pools and concurrent collections over raw threads and locks
  3. Avoid nested locks — Acquiring multiple locks increases deadlock risk; if unavoidable, use a consistent lock ordering
  4. Prefer immutability — Immutable data is inherently thread-safe and requires no synchronization
  5. Use RAII for locks — Always use context managers, try/finally, or lock guards to ensure locks are released
  6. Test under load — Concurrency bugs often surface only under high contention; use stress tests and thread sanitizers
  7. Document thread safety — Clearly state which methods and classes are thread-safe and which are not
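
Practice 3 can be made concrete: if every code path acquires the two locks in the same global order, no thread can ever hold one lock while waiting for the other in the opposite order, so a deadlock cycle cannot form. A minimal sketch (the lock names and `transfer` function are illustrative):

```python
import threading

lock_a = threading.Lock()
lock_b = threading.Lock()

def transfer(amount):
    # Every code path takes lock_a before lock_b; since no thread ever
    # holds lock_b while waiting for lock_a, no deadlock cycle can form
    with lock_a:
        with lock_b:
            return f"transferred {amount}"

print(transfer(50))  # transferred 50
```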

Next Steps