Processes & Threads

A process is a program in execution. When you run a program, the operating system creates a process that includes the program code, its current activity (represented by the program counter and CPU registers), a stack, a data section, and a heap. Understanding processes and threads is fundamental to everything else in operating systems.

The Process Concept

What Makes Up a Process?

A process is far more than just the executable code. It consists of:

  • Text section — The compiled program code
  • Program counter — The address of the next instruction to execute
  • CPU registers — Current values of all processor registers
  • Stack — Temporary data such as function parameters, return addresses, and local variables
  • Data section — Global and static variables
  • Heap — Dynamically allocated memory during runtime

Process Control Block (PCB)

The OS tracks each process using a data structure called the Process Control Block (PCB). The PCB is the kernel’s representation of a process and contains everything needed to manage it.

┌─────────────────────────────────────┐
│     Process Control Block (PCB)     │
├─────────────────────────────────────┤
│ Process ID (PID)      : 4821        │
│ Process State         : Running     │
│ Program Counter       : 0x7f3a      │
│ CPU Registers         : {...}       │
│ CPU Scheduling Info   :             │
│   - Priority          : 20          │
│   - Queue pointer     : 0xab12      │
│ Memory Management Info:             │
│   - Page table base   : 0x1000      │
│   - Memory limits     : 4 GB        │
│ I/O Status Info       :             │
│   - Open files        : [0,1,2]     │
│   - I/O devices       : [tty0]      │
│ Accounting Info       :             │
│   - CPU time used     : 1.24s       │
│   - Time limits       : none        │
│ Parent PID            : 4800        │
│ Child PIDs            : [4822]      │
└─────────────────────────────────────┘

Process States

Every process moves through a well-defined set of states during its lifetime. The OS scheduler manages these transitions.

        ┌───────────────┐
        │      New      │
        │  (admitted)   │
        └───────┬───────┘
                │
                v
┌──────────────────────────────────┐
│              Ready               │
│   (waiting for CPU assignment)   │
└───────┬─────────────────▲────────┘
        │                 │
        │ scheduler       │ interrupt /
        │ dispatch        │ I/O complete /
        │                 │ time quantum expired
        v                 │
┌──────────────────────────────────┐
│             Running              │
│  (instructions being executed)   │
└───────┬─────────────┬────────────┘
        │             │
        │ I/O or      │ exit
        │ event       v
        │ wait    ┌───────────────┐
        │         │  Terminated   │
        │         │    (exit)     │
        │         └───────────────┘
        v
┌──────────────────────────────────┐
│             Waiting              │
│   (waiting for I/O or event)     │
└──────────────────────────────────┘
| State      | Description                                                  |
|------------|--------------------------------------------------------------|
| New        | Process is being created                                     |
| Ready      | Process is loaded in memory and waiting for CPU time         |
| Running    | Process instructions are being executed on the CPU           |
| Waiting    | Process is waiting for an I/O operation or event to complete |
| Terminated | Process has finished execution                               |
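The transition rules can be made executable. The sketch below encodes which moves the state diagram allows and rejects anything else; the names and shape are ours, not any real scheduler's API:

```python
# Legal state transitions, taken from the diagram above
TRANSITIONS = {
    "New":        {"Ready"},
    "Ready":      {"Running"},
    "Running":    {"Ready", "Waiting", "Terminated"},
    "Waiting":    {"Ready"},
    "Terminated": set(),        # terminal state: no way out
}

def transition(state, new_state):
    """Return the new state, or raise if the diagram forbids the move."""
    if new_state not in TRANSITIONS[state]:
        raise ValueError(f"illegal transition: {state} -> {new_state}")
    return new_state

# A typical lifetime: created, scheduled, blocks on I/O, resumes, exits
s = "New"
for nxt in ["Ready", "Running", "Waiting", "Ready", "Running", "Terminated"]:
    s = transition(s, nxt)
print(s)  # Terminated
```

Note that there is no Ready-to-Waiting edge: a process can only block while it is actually running, because blocking is the result of something the process itself does (an I/O request or a blocking system call).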

Process Creation

In Unix/Linux systems, new processes are created using the fork() system call. The fork() call creates a child process that is an almost exact copy of the parent. The child can then use exec() to replace its memory image with a new program.

Parent Process (PID 100)
│
├── fork() ──────────────────────────┐
│                                    │
│  Parent continues                  │   Child Process (PID 101)
│  fork() returns child PID (101)    │   fork() returns 0
│                                    │
v                                    v
wait()                               exec("/bin/ls")
│                                    │
│                                    │   (replaces process image
│                                    │    with 'ls' program)
│                                    v
│                                    ls executes and exits
│                                    │
v                                    │
Parent resumes <─────────────────────┘
(child exited)

Process Creation Example

import os
import sys

def main():
    print(f"Parent process PID: {os.getpid()}")

    # Create a child process. In Python, os.fork() raises OSError on
    # failure rather than returning a negative value as the C call does.
    try:
        pid = os.fork()
    except OSError:
        print("Fork failed!", file=sys.stderr)
        sys.exit(1)

    if pid == 0:
        # Child process
        print(f"Child process PID: {os.getpid()}")
        print(f"Child's parent PID: {os.getppid()}")
        # Replace the child's image with a new program
        os.execlp("echo", "echo", "Hello from child!")
    else:
        # Parent process
        print(f"Parent created child with PID: {pid}")
        os.wait()                 # Wait for the child to finish
        print("Child has terminated.")

if __name__ == "__main__":
    main()

Process vs Thread

A thread is the smallest unit of execution within a process. A process can have multiple threads that share the same address space but execute independently.

┌─────────────────────────────────────────────────────┐
│                       Process                       │
│                                                     │
│  ┌───────────────────────────────────────────────┐  │
│  │               Shared Resources                │  │
│  │  Code Section | Data Section | Heap | Files   │  │
│  └───────────────────────────────────────────────┘  │
│                                                     │
│  ┌──────────┐   ┌──────────┐   ┌──────────┐         │
│  │ Thread 1 │   │ Thread 2 │   │ Thread 3 │         │
│  │          │   │          │   │          │         │
│  │ Registers│   │ Registers│   │ Registers│         │
│  │ Stack    │   │ Stack    │   │ Stack    │         │
│  │ PC       │   │ PC       │   │ PC       │         │
│  └──────────┘   └──────────┘   └──────────┘         │
└─────────────────────────────────────────────────────┘
| Feature             | Process                                                  | Thread                                                     |
|---------------------|----------------------------------------------------------|------------------------------------------------------------|
| Address space       | Own separate address space                               | Shares address space with other threads in same process    |
| Creation overhead   | Heavy (allocate memory, copy page tables)                | Light (just a new stack and register set)                  |
| Communication       | IPC required (pipes, sockets, shared memory)             | Direct access to shared memory                             |
| Context switch cost | Expensive (flush TLB, swap page tables)                  | Cheap (same address space)                                 |
| Isolation           | Full isolation; one process crash does not affect others | No isolation; one thread crash can kill the entire process |
| Resource usage      | Each process has its own resources                       | Threads share process resources                            |
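The address-space row is easy to observe directly. In the sketch below (Unix-only, since it uses os.fork), a thread's write to a shared list is visible to its parent, while a forked child's write stays in the child's private copy of memory:

```python
import os
import threading

data = []

# A thread shares the process's address space: its append is visible.
t = threading.Thread(target=lambda: data.append("from-thread"))
t.start()
t.join()

# A forked child gets a *copy* of the address space: its append is
# invisible to the parent (Unix-only; os.fork does not exist on Windows).
pid = os.fork()
if pid == 0:                 # child
    data.append("from-child")
    os._exit(0)              # exit the child without parent cleanup
os.waitpid(pid, 0)           # parent waits for the child

print(data)  # ['from-thread']; the child's change stayed in its copy
```

This is exactly why processes need the IPC mechanisms described later, while threads can simply read and write the same variables (with appropriate locking).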

User Threads vs Kernel Threads

| Feature          | User-Level Threads                                     | Kernel-Level Threads                      |
|------------------|--------------------------------------------------------|-------------------------------------------|
| Managed by       | User-space thread library                              | Operating system kernel                   |
| Kernel awareness | Kernel sees only one thread per process                | Kernel schedules each thread individually |
| Context switch   | Fast (no kernel involvement)                           | Slower (requires kernel mode switch)      |
| Blocking         | If one thread blocks, all threads in the process block | Other threads continue running            |
| Parallelism      | Cannot exploit multiple CPUs                           | Can run on different CPUs simultaneously  |
| Examples         | Green threads (early Java), GNU Portable Threads       | POSIX threads (pthreads), Windows threads |

Multi-Threading Models

  Many-to-One            One-to-One            Many-to-Many
(user threads →       (user threads →       (user threads →
 1 kernel thread)      kernel threads)       kernel threads)

  U  U  U  U           U   U   U   U         U  U  U  U  U
   \ |  | /            |   |   |   |          \ \ | / /
    \|__|/             |   |   |   |           \ \|/ /
       K               K   K   K   K           K  K  K

- Fast but no         - True parallelism    - Best of both
  true parallelism    - Thread creation is  - OS can create as
- One block stops       more expensive        many kernel
  all threads         - Used by Linux,        threads as needed
                        Windows, macOS
  • Many-to-One: Many user threads map to a single kernel thread. Simple but a single blocking call blocks all threads.
  • One-to-One: Each user thread maps to a kernel thread. Provides true parallelism but higher overhead. This is the model used by Linux (NPTL), Windows, and macOS.
  • Many-to-Many: Many user threads map to many (often fewer) kernel threads. Flexible but complex to implement.

Multi-Threading Examples

import threading
import time
from multiprocessing import Process

# === Threading (shared memory, limited by the GIL for CPU-bound work) ===

def worker(name, duration):
    print(f"Thread {name} starting")
    time.sleep(duration)  # Simulates I/O-bound work
    print(f"Thread {name} finished")

# Create and start threads
threads = []
for i in range(4):
    t = threading.Thread(target=worker, args=(f"T-{i}", 1))
    threads.append(t)
    t.start()

# Wait for all threads to complete
for t in threads:
    t.join()
print("All threads finished")

# === Multiprocessing (separate address spaces, true parallelism) ===

def cpu_intensive(n):
    """CPU-bound work that benefits from multiprocessing."""
    total = sum(i * i for i in range(n))
    print(f"Process {n}: result = {total}")

processes = []
for n in [10_000_000, 20_000_000, 30_000_000]:
    p = Process(target=cpu_intensive, args=(n,))
    processes.append(p)
    p.start()

for p in processes:
    p.join()
print("All processes finished")

# === Thread Synchronization ===

counter = 0
lock = threading.Lock()

def increment(times):
    global counter
    for _ in range(times):
        with lock:  # Acquire the lock before modifying shared data
            counter += 1

threads = [threading.Thread(target=increment, args=(100_000,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(f"Counter: {counter}")  # Always 400000 with the lock

Context Switching

A context switch occurs when the OS saves the state of one process (or thread) and loads the state of another so that the CPU can execute a different process. Context switches are triggered by:

  • A timer interrupt (time quantum expired)
  • An I/O request (process must wait)
  • A higher-priority process becoming ready
  • A system call that causes the process to block
Process A (Running)             Kernel               Process B (Ready)
         │                         │                         │
         │  Timer interrupt        │                         │
         │────────────────────────>│                         │
         │                         │                         │
         │     Save state of A     │                         │
         │     (registers, PC, SP  │                         │
         │      → PCB of A)        │                         │
         │                         │                         │
         │     Scheduler decides to run B                    │
         │                         │                         │
         │                         │   Load state of B       │
         │                         │   (PCB of B →           │
         │                         │    registers, PC, SP)   │
         │                         │────────────────────────>│
         │                         │                         │
         │  A is now Ready         │        B is Running     │
         │                         │                         │

Cost of Context Switching

Context switching is pure overhead — no useful work is done during a switch. The cost includes:

  • Direct costs: Saving and restoring registers, switching the memory address space (flushing TLB), updating kernel data structures
  • Indirect costs: Cache pollution (the new process has different data in L1/L2 cache), TLB misses, pipeline flushes

Typical context switch times are on the order of 1 to 10 microseconds on modern hardware. Thread context switches within the same process are cheaper because the address space does not change.
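As a rough, machine-dependent way to see this overhead, the sketch below ping-pongs two Python threads through threading.Event objects. Each round trip forces at least two thread switches, so the elapsed time divided by 2N gives a loose upper bound on the per-switch cost; the figure also includes Python interpreter and GIL overhead, so treat it as illustrative, not a benchmark:

```python
import threading
import time

N = 10_000
ping = threading.Event()
pong = threading.Event()

def pinger():
    for _ in range(N):
        ping.set()       # wake the other thread ...
        pong.wait()      # ... then block until it answers
        pong.clear()

def ponger():
    for _ in range(N):
        ping.wait()
        ping.clear()
        pong.set()

start = time.perf_counter()
a = threading.Thread(target=pinger)
b = threading.Thread(target=ponger)
a.start(); b.start()
a.join(); b.join()
elapsed = time.perf_counter() - start

# Each round trip costs (at least) two switches between the two threads.
print(f"~{elapsed / (2 * N) * 1e6:.1f} microseconds per switch (upper bound)")
```

Running the same experiment with two processes signalling through a pipe would show a noticeably larger per-switch cost, for the reasons in the list above.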


Inter-Process Communication (IPC)

Since processes have separate address spaces, they need special mechanisms to communicate. The OS provides several IPC methods.

Pipes

A pipe provides a unidirectional byte stream between two related processes (typically parent and child).

┌───────────┐    write end   ┌──────┐    read end   ┌───────────┐
│ Process A │───────────────>│ Pipe │──────────────>│ Process B │
│ (writer)  │                │ (buf)│               │ (reader)  │
└───────────┘                └──────┘               └───────────┘
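A minimal parent-child pipe in Python (Unix-only, since it relies on os.fork and os.pipe):

```python
import os

r, w = os.pipe()                 # two file descriptors: read end, write end

pid = os.fork()                  # Unix-only; see Process Creation above
if pid == 0:                     # child: the writer
    os.close(r)                  # close the end we do not use
    os.write(w, b"hello from the child")
    os.close(w)
    os._exit(0)

# parent: the reader
os.close(w)                      # otherwise read() would never see EOF
msg = os.read(r, 1024)
os.close(r)
os.waitpid(pid, 0)
print(msg.decode())              # hello from the child
```

Closing the unused ends matters: as long as any process holds the write end open, a reader blocked in read() will wait forever instead of seeing end-of-file.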

Shared Memory

Processes map the same region of physical memory into their address spaces. This is the fastest IPC mechanism because data does not need to be copied — but it requires synchronization.

┌──────────────┐          ┌──────────────┐
│  Process A   │          │  Process B   │
│              │          │              │
│   Virtual    │          │   Virtual    │
│   Address    │          │   Address    │
│   Space      │          │   Space      │
│      │       │          │      │       │
│      │ mapped│          │mapped│       │
│      v       │          │      v       │
└──────┼───────┘          └──────┼───────┘
       │                         │
       └────────────┬────────────┘
             ┌──────v──────┐
             │   Shared    │
             │   Memory    │
             │   Region    │
             └─────────────┘
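Python 3.8+ ships multiprocessing.shared_memory, which wraps this mechanism. In the sketch below both handles live in one process for brevity; a real second process would attach to the same segment by passing its name:

```python
from multiprocessing import shared_memory

# "Process A": create a named shared-memory segment and write into it
shm_a = shared_memory.SharedMemory(create=True, size=16)
shm_a.buf[:5] = b"hello"

# "Process B": attach to the same segment by name and read it back.
# No bytes are copied through the kernel; both buffers map the same pages.
shm_b = shared_memory.SharedMemory(name=shm_a.name)
data = bytes(shm_b.buf[:5])
print(data)                      # b'hello'

shm_b.close()                    # each attachment closes its handle
shm_a.close()
shm_a.unlink()                   # the creator frees the segment
```

Because both sides write the same physical pages, concurrent access must be coordinated with a lock or semaphore, which is the "High complexity" entry in the IPC table below.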

Message Passing

Processes send and receive messages through the kernel. This approach is simpler to program correctly (no shared state) but involves data copying and kernel overhead.
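A sketch of message passing with multiprocessing.Queue. The explicit "fork" context is an assumption made so the example runs without an if __name__ == "__main__" guard; it is Unix-only (macOS and Windows default to spawn, which requires the guard):

```python
import multiprocessing as mp

ctx = mp.get_context("fork")     # fork start method (Unix-only)

def producer(q):
    for i in range(3):
        q.put(f"msg-{i}")        # each message is copied via the kernel
    q.put(None)                  # sentinel: end of stream

q = ctx.Queue()
p = ctx.Process(target=producer, args=(q,))
p.start()

received = []
while (msg := q.get()) is not None:
    received.append(msg)
p.join()
print(received)                  # ['msg-0', 'msg-1', 'msg-2']
```

Compared with shared memory, every message here is serialized and copied twice (sender to kernel, kernel to receiver), but there is no shared state to protect with locks.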

| IPC Method        | Speed     | Complexity                   | Use Case                                      |
|-------------------|-----------|------------------------------|-----------------------------------------------|
| Pipe              | Medium    | Low                          | Parent-child communication, shell pipelines   |
| Named Pipe (FIFO) | Medium    | Low                          | Unrelated processes on same machine           |
| Shared Memory     | Very Fast | High (needs synchronization) | High-throughput data sharing                  |
| Message Queue     | Medium    | Medium                       | Structured message exchange                   |
| Socket            | Slower    | Medium                       | Network communication, cross-machine IPC      |
| Signal            | Very Fast | Low (limited data)           | Simple notifications (e.g., SIGTERM, SIGKILL) |
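The signal row of the table can be demonstrated in a few lines (Unix-only; SIGUSR1 is one of the two signals POSIX reserves for application-defined use). A signal carries no payload beyond its number, which is why the table lists it as "limited data":

```python
import os
import signal

got = []

def handler(signum, frame):
    got.append(signum)           # record which signal arrived

signal.signal(signal.SIGUSR1, handler)    # install the handler
os.kill(os.getpid(), signal.SIGUSR1)      # send the signal to ourselves

print("handler received:", got)
```

In a real system the sender would be a different process (or the kernel itself, as with SIGTERM on shutdown), and os.kill would be given that process's PID.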

Key Takeaways

  1. Processes are independent execution environments with their own address space; threads share the address space within a process
  2. The PCB stores everything the OS needs to manage a process: PID, state, registers, memory info, open files
  3. Process states cycle through New, Ready, Running, Waiting, and Terminated
  4. fork() creates a child process; exec() replaces the process image with a new program
  5. Context switching is necessary for multitasking but introduces overhead — minimize it where possible
  6. IPC mechanisms (pipes, shared memory, message passing) let processes with separate address spaces communicate
  7. Threads are cheaper to create and switch than processes, but lack isolation — one thread crash kills the entire process
  8. Modern OSes use the one-to-one threading model where each user thread maps to a kernel thread

Next Steps