Skip to content

Heaps

A heap is a complete binary tree that satisfies the heap property. It’s the foundation for priority queues and enables efficient access to the minimum or maximum element.

Heap Property

  • Min-Heap: Parent ≤ Children (smallest at root)
  • Max-Heap: Parent ≥ Children (largest at root)
Min-Heap: Max-Heap:
1 9
/ \ / \
3 2 7 8
/ \ / \
7 4 3 5
Array representation (0-indexed):
Min-Heap: [1, 3, 2, 7, 4]
Max-Heap: [9, 7, 8, 3, 5]

Array Representation

For a node at index i:

  • Parent: (i - 1) // 2
  • Left Child: 2 * i + 1
  • Right Child: 2 * i + 2
Index: 0 1 2 3 4
Array: [ 1, 3, 2, 7, 4 ]
Tree: 1 (index 0)
/ \
3 2 (indices 1, 2)
/ \
7 4 (indices 3, 4)

Time Complexity

OperationTime
InsertO(log n)
Extract Min/MaxO(log n)
Peek Min/MaxO(1)
Build HeapO(n)
HeapifyO(log n)

Basic Operations

import heapq
# Python heapq implements MIN-HEAP
# Create heap from list
nums = [5, 3, 8, 1, 2]
heapq.heapify(nums) # [1, 2, 8, 5, 3]
# Insert (push)
heapq.heappush(nums, 0)
# Extract minimum (pop)
smallest = heapq.heappop(nums) # 0
# Peek minimum (without removing)
smallest = nums[0]
# Push and pop in one operation
result = heapq.heappushpop(nums, 4) # Push 4, pop smallest
# Pop and push in one operation
result = heapq.heapreplace(nums, 6) # Pop smallest, push 6
# Get n smallest/largest
smallest_3 = heapq.nsmallest(3, nums)
largest_3 = heapq.nlargest(3, nums)
# MAX-HEAP: Negate values
max_heap = []
heapq.heappush(max_heap, -5)
heapq.heappush(max_heap, -3)
largest = -heapq.heappop(max_heap) # 5

Heap Implementation

class MinHeap:
def __init__(self):
self.heap = []
def parent(self, i):
return (i - 1) // 2
def left_child(self, i):
return 2 * i + 1
def right_child(self, i):
return 2 * i + 2
def swap(self, i, j):
self.heap[i], self.heap[j] = self.heap[j], self.heap[i]
def push(self, val):
"""Add element and maintain heap property"""
self.heap.append(val)
self._bubble_up(len(self.heap) - 1)
def pop(self):
"""Remove and return minimum element"""
if not self.heap:
raise IndexError("Pop from empty heap")
if len(self.heap) == 1:
return self.heap.pop()
min_val = self.heap[0]
self.heap[0] = self.heap.pop()
self._sink_down(0)
return min_val
def peek(self):
"""Return minimum without removing"""
if not self.heap:
raise IndexError("Peek from empty heap")
return self.heap[0]
def _bubble_up(self, idx):
"""Move element up to maintain heap property"""
while idx > 0 and self.heap[self.parent(idx)] > self.heap[idx]:
self.swap(idx, self.parent(idx))
idx = self.parent(idx)
def _sink_down(self, idx):
"""Move element down to maintain heap property"""
size = len(self.heap)
while True:
smallest = idx
left = self.left_child(idx)
right = self.right_child(idx)
if left < size and self.heap[left] < self.heap[smallest]:
smallest = left
if right < size and self.heap[right] < self.heap[smallest]:
smallest = right
if smallest == idx:
break
self.swap(idx, smallest)
idx = smallest
def heapify(self, arr):
"""Build heap from array in O(n)"""
self.heap = arr[:]
# Start from last non-leaf node
for i in range(len(arr) // 2 - 1, -1, -1):
self._sink_down(i)

Common Patterns

1. Kth Largest Element

Algorithm: Maintain min-heap of size k
nums = [3, 2, 1, 5, 6, 4], k = 2
Step 1: push 3 → heap = [3]
Step 2: push 2 → heap = [2, 3]
Step 3: push 1 → heap size > k → pop 1 → heap = [2, 3]
Step 4: push 5 → heap size > k → pop 2 → heap = [3, 5]
Step 5: push 6 → heap size > k → pop 3 → heap = [5, 6]
Step 6: push 4 → heap size > k → pop 4 → heap = [5, 6]
Answer: heap[0] = 5 (2nd largest)
import heapq
def find_kth_largest(nums: list, k: int) -> int:
"""
Find kth largest using min-heap of size k.
Time: O(n log k), Space: O(k)
Algorithm:
1. Maintain a min-heap of size k
2. For each number, push to heap
3. If heap size > k, pop smallest
4. After all elements, heap[0] is kth largest
"""
heap = []
for num in nums:
heapq.heappush(heap, num)
if len(heap) > k:
heapq.heappop(heap)
return heap[0]
# Alternative using heapq.nlargest
def find_kth_largest_v2(nums: list, k: int) -> int:
return heapq.nlargest(k, nums)[-1]

2. Top K Frequent Elements

from collections import Counter
import heapq
def top_k_frequent(nums: list, k: int) -> list:
"""
Find k most frequent elements.
Time: O(n log k), Space: O(n)
Algorithm:
1. Count frequency of each element
2. Use min-heap of size k to track top frequencies
3. Return elements in the heap
"""
count = Counter(nums)
# Use min-heap to keep top k
heap = []
for num, freq in count.items():
heapq.heappush(heap, (freq, num))
if len(heap) > k:
heapq.heappop(heap)
return [num for freq, num in heap]
# Alternative: Bucket sort O(n)
def top_k_frequent_bucket(nums: list, k: int) -> list:
count = Counter(nums)
# Buckets indexed by frequency
buckets = [[] for _ in range(len(nums) + 1)]
for num, freq in count.items():
buckets[freq].append(num)
result = []
for i in range(len(buckets) - 1, -1, -1):
for num in buckets[i]:
result.append(num)
if len(result) == k:
return result
return result

3. Merge K Sorted Lists

K-way Merge Algorithm:
lists = [[1,4,5], [1,3,4], [2,6]]
Initial heap: [(1, list0), (1, list1), (2, list2)]
Step 1: pop (1, list0) → result = [1]
push (4, list0) → heap = [(1, list1), (2, list2), (4, list0)]
Step 2: pop (1, list1) → result = [1,1]
push (3, list1) → heap = [(2, list2), (3, list1), (4, list0)]
...continue until heap is empty
Final: [1,1,2,3,4,4,5,6]
import heapq
def merge_k_lists(lists: list) -> list:
"""
Merge k sorted lists into one.
Time: O(n log k), Space: O(k)
Algorithm:
1. Push first element from each list to heap
2. Pop smallest, add to result
3. Push next element from same list
4. Repeat until heap is empty
"""
heap = []
result = []
# Initialize heap with first element from each list
for i, lst in enumerate(lists):
if lst:
heapq.heappush(heap, (lst[0], i, 0))
while heap:
val, list_idx, elem_idx = heapq.heappop(heap)
result.append(val)
# Add next element from same list
if elem_idx + 1 < len(lists[list_idx]):
next_val = lists[list_idx][elem_idx + 1]
heapq.heappush(heap, (next_val, list_idx, elem_idx + 1))
return result
# For linked lists (LeetCode version)
def merge_k_linked_lists(lists):
heap = []
for i, node in enumerate(lists):
if node:
heapq.heappush(heap, (node.val, i, node))
dummy = ListNode(0)
current = dummy
while heap:
val, idx, node = heapq.heappop(heap)
current.next = node
current = current.next
if node.next:
heapq.heappush(heap, (node.next.val, idx, node.next))
return dummy.next

4. Find Median from Data Stream

Two Heaps Strategy:
small (max-heap): stores smaller half
large (min-heap): stores larger half
Numbers: [5, 2, 3, 4, 1]
After 5: small=[-5], large=[] median=5
After 2: small=[-2], large=[5] median=(2+5)/2=3.5
After 3: small=[-3,-2], large=[5] median=3
After 4: small=[-3,-2], large=[4,5] median=(3+4)/2=3.5
After 1: small=[-3,-2,-1], large=[4,5] median=3
Invariant: |small| == |large| or |small| == |large| + 1
import heapq
class MedianFinder:
"""
Maintain running median using two heaps.
Time: O(log n) per add, O(1) for median
Strategy:
- small: max-heap for smaller half (negate values)
- large: min-heap for larger half
- Keep small same size or one larger than large
"""
def __init__(self):
self.small = [] # Max-heap (store negative)
self.large = [] # Min-heap
def add_num(self, num: int):
# Add to max-heap first
heapq.heappush(self.small, -num)
# Balance: move max of small to large
heapq.heappush(self.large, -heapq.heappop(self.small))
# Keep small same size or one larger
if len(self.large) > len(self.small):
heapq.heappush(self.small, -heapq.heappop(self.large))
def find_median(self) -> float:
if len(self.small) > len(self.large):
return -self.small[0]
return (-self.small[0] + self.large[0]) / 2.0

5. Task Scheduler

from collections import Counter, deque
import heapq
def least_interval(tasks: list, n: int) -> int:
"""
Find minimum time to complete all tasks with cooldown.
Time: O(t * 26), Space: O(26)
Algorithm:
1. Count task frequencies
2. Use max-heap to always process most frequent task
3. Track cooldown times in queue
4. Idle if no tasks available
"""
count = Counter(tasks)
max_heap = [-cnt for cnt in count.values()]
heapq.heapify(max_heap)
time = 0
cooldown = deque() # (available_time, count)
while max_heap or cooldown:
time += 1
if max_heap:
cnt = 1 + heapq.heappop(max_heap) # Decrement count
if cnt: # More of this task remaining
cooldown.append((time + n, cnt))
# Check if any task is ready
if cooldown and cooldown[0][0] == time:
heapq.heappush(max_heap, cooldown.popleft()[1])
return time
# Mathematical approach (optimal)
def least_interval_math(tasks: list, n: int) -> int:
count = Counter(tasks)
max_freq = max(count.values())
max_count = sum(1 for freq in count.values() if freq == max_freq)
# Formula: (max_freq - 1) * (n + 1) + max_count
return max(len(tasks), (max_freq - 1) * (n + 1) + max_count)

6. Heap Sort

Heap Sort Algorithm:
1. Build max-heap from array
2. Repeatedly extract max and place at end
arr = [4, 10, 3, 5, 1]
Build max-heap: [10, 5, 3, 4, 1]
Extract phase:
Swap 10↔1: [1, 5, 3, 4, | 10] → heapify → [5, 4, 3, 1, | 10]
Swap 5↔1: [1, 4, 3, | 5, 10] → heapify → [4, 1, 3, | 5, 10]
Swap 4↔3: [3, 1, | 4, 5, 10] → heapify → [3, 1, | 4, 5, 10]
Swap 3↔1: [1, | 3, 4, 5, 10]
Sorted: [1, 3, 4, 5, 10]
def heap_sort(arr: list) -> list:
"""
Sort using heap.
Time: O(n log n), Space: O(1) in-place
Algorithm:
1. Build max-heap in O(n)
2. Extract max n times in O(n log n)
"""
n = len(arr)
# Build max-heap (start from last non-leaf)
for i in range(n // 2 - 1, -1, -1):
heapify(arr, n, i)
# Extract elements one by one
for i in range(n - 1, 0, -1):
# Move current root (max) to end
arr[0], arr[i] = arr[i], arr[0]
# Heapify reduced heap
heapify(arr, i, 0)
return arr
def heapify(arr, size, root):
"""Maintain max-heap property"""
largest = root
left = 2 * root + 1
right = 2 * root + 2
if left < size and arr[left] > arr[largest]:
largest = left
if right < size and arr[right] > arr[largest]:
largest = right
if largest != root:
arr[root], arr[largest] = arr[largest], arr[root]
heapify(arr, size, largest)
# Example
arr = [4, 10, 3, 5, 1]
heap_sort(arr) # [1, 3, 4, 5, 10]

Applications

  1. Priority Scheduling - OS task scheduling, job queues
  2. Dijkstra’s Algorithm - Shortest path in weighted graphs
  3. Huffman Coding - Data compression
  4. Event-Driven Simulation - Process events by time
  5. Median Maintenance - Streaming statistics
  6. Top-K Problems - Find largest/smallest k elements

Practice Problems

Easy

ProblemPatternCompanies
Kth Largest ElementMin-heap of size kAmazon, Google
Last Stone WeightMax-heap simulationAmazon
Relative RanksMax-heapAmazon

Medium

ProblemPatternCompanies
Top K Frequent ElementsHeap + Hash MapAmazon, Meta
Kth Smallest in MatrixMin-heapGoogle, Amazon
Find K Pairs with Smallest SumsMin-heapAmazon
Task SchedulerMax-heap + CooldownMeta, Amazon

Hard

ProblemPatternCompanies
Merge K Sorted ListsK-way mergeAmazon, Google, Meta
Find Median from Data StreamTwo heapsAmazon, Google
Sliding Window MedianTwo heaps + Lazy DeleteGoogle
Trapping Rain Water IIMin-heap BFSGoogle

Key Takeaways

  1. O(1) access to min/max - Perfect for priority-based problems
  2. O(log n) insert/delete - Efficient modifications
  3. Array-based - Complete binary tree fits perfectly in array
  4. Two heaps pattern - Powerful for median and partition problems
  5. K elements - Use heap of size k for top-k problems

Next Steps