Algorithms - 16. Two Heaps (Median Maintenance)
When to use Two Heaps
Use the Two Heaps pattern when you need to efficiently maintain the median or partition elements around a threshold in a dynamic dataset. While most commonly used for finding running medians, it can be adapted for:
- Running median in a data stream (classic use case)
- Sliding window median with element removal
- Finding the median in online/streaming scenarios
- Balancing two partitions with size constraints
- Kth largest/smallest element maintenance with K close to n/2
- Any problem where you need to efficiently maintain a balanced partition of sorted data
Example problem
Stock Price Analysis: A trading system receives real-time stock prices [12, 15, 10, 8, 13, 17, 14, 16, 11]
. For risk assessment, you need to:
- Find the median price at any moment (for market equilibrium)
- Update the median as new prices arrive
- Handle thousands of price updates per second efficiently
After processing all prices:
- Small heap (max heap):
{12, 11, 10, 8}
- lower half of values - Large heap (min heap):
{13, 14, 15, 16, 17}
- upper half of values - Current median: 13 (the smallest element in the large heap)
Answer: The median after all insertions is 13, and we can get new medians in O(1) time after O(log n) insertions.
What Two Heaps Actually Does
The Two Heaps pattern maintains two balanced priority queues that partition your data around the median:
- Small Heap (Max Heap): Stores the smaller half of all numbers (max element at top)
- Large Heap (Min Heap): Stores the larger half of all numbers (min element at top)
Think of it like organizing a line of people by height for a group photo:
- Left side (Small/Max Heap): Shorter people, with the tallest of them at the boundary
- Right side (Large/Min Heap): Taller people, with the shortest of them at the boundary
- The median: Either the tallest person on the left, shortest on the right, or their average
How the Operations Work
INSERT Operation - "Add a new element and rebalance"
When you insert a number, the algorithm:
- Adds it to the appropriate heap (small or large)
- Checks if the heaps violate the ordering invariant (max of small > min of large)
- Swaps elements if needed to maintain the invariant
- Rebalances sizes to keep them within 1 element of each other
GET MEDIAN Operation - "Find the middle value(s)"
When you request the median:
- If heaps have equal size: average the two tops (even number of elements)
- If one heap is larger by 1: return that heap's top (odd number of elements)
- This is always O(1) because we maintain the invariant!
The Key Invariants
These invariants must always hold after every operation:
// Invariant 1: Ordering
// Every element in small heap ≤ every element in large heap
maxOfSmall = small.top(); // Max element from small heap
minOfLarge = large.top(); // Min element from large heap
assert(maxOfSmall <= minOfLarge);
// Invariant 2: Balance
// Heap sizes differ by at most 1
assert(Math.abs(small.size - large.size) <= 1);
The beauty is that maintaining these two invariants guarantees the median is always at the heap tops!
Single most important insight
The median always lies at the boundary between two equal (or nearly equal) partitions - and by maintaining this boundary with two heaps that keep their tops accessible in O(1), we transform the O(n log n) problem of repeatedly sorting into O(log n) insertions with O(1) median access.
Mental note: The algorithm's genius lies in recognizing that we don't need the entire sorted order - just the ability to partition around the middle. At each insertion, we only need to decide: (1) Which heap gets the new element? (2) Do the heaps need rebalancing? These simple local decisions maintain the global median property.
Decision Framework
The Two Heaps pattern involves three key decisions:
During insertion - "Where does this element belong?":
- If number ≤ max of small heap (or small is empty) → add to small
- Otherwise → add to large
- Then check if we violated the ordering invariant
After insertion - "Do we need to swap?":
- If max of small > min of large → move the max from small to large
- This maintains the ordering invariant
After potential swap - "Do we need to rebalance?":
- If small has 2+ more elements → move its max to large
- If large has 2+ more elements → move its min to small
- This maintains the size invariant
Why the Two-Step Rebalancing Matters
The algorithm cleverly separates concerns:
Step 1: Initial Placement + Order Fix
Insert 15 into [small: {10, 8}, large: {20, 25}]
→ 15 > 10, so add to small (for now)
→ small: {15, 10, 8}, large: {20, 25}
→ But wait! 15 > 20 violates ordering
→ Move 15 from small to large
→ small: {10, 8}, large: {15, 20, 25}
Step 2: Size Rebalancing
After order fix: small: {10, 8}, large: {15, 20, 25}
→ large.size (3) > small.size (2) + 1
→ Move min of large to small
→ small: {15, 10, 8}, large: {20, 25}
→ Balanced!
This two-phase approach ensures both invariants are maintained while keeping operations simple and efficient.
Code
// First, let's implement simple Min and Max heap classes
// These follow the heap property and provide O(log n) insertion/removal
class MinHeap {
constructor() {
this.heap = [];
}
// Helper: Get parent/child indices for navigation
getParentIndex(childIndex) {
return Math.floor((childIndex - 1) / 2);
}
getLeftChildIndex(parentIndex) {
return 2 * parentIndex + 1;
}
getRightChildIndex(parentIndex) {
return 2 * parentIndex + 2;
}
// Helper: Check if indices are valid
hasParent(index) {
return this.getParentIndex(index) >= 0;
}
hasLeftChild(index) {
return this.getLeftChildIndex(index) < this.heap.length;
}
hasRightChild(index) {
return this.getRightChildIndex(index) < this.heap.length;
}
// Helper: Swap two elements in the heap
swap(indexOne, indexTwo) {
const temp = this.heap[indexOne];
this.heap[indexOne] = this.heap[indexTwo];
this.heap[indexTwo] = temp;
}
// Get the minimum element (at root) without removing it
peek() {
if (this.heap.length === 0) {
return null;
}
return this.heap[0];
}
// Add a new element and maintain heap property
add(value) {
// Step 1: Add to the end
this.heap.push(value);
// Step 2: Bubble up to maintain min heap property
this.bubbleUp();
}
// Bubble up: Move the last element up until heap property is satisfied
bubbleUp() {
let currentIndex = this.heap.length - 1;
// Keep moving up while we have a parent and we're smaller than parent
while (this.hasParent(currentIndex)) {
const parentIndex = this.getParentIndex(currentIndex);
const currentValue = this.heap[currentIndex];
const parentValue = this.heap[parentIndex];
if (currentValue < parentValue) {
// We're smaller than parent, need to swap
this.swap(currentIndex, parentIndex);
currentIndex = parentIndex;
} else {
// Heap property satisfied
break;
}
}
}
// Remove and return the minimum element
extractMin() {
if (this.heap.length === 0) {
return null;
}
// Step 1: Save the min value (at root)
const minValue = this.heap[0];
// Step 2: Move last element to root
const lastElement = this.heap.pop();
if (this.heap.length > 0) {
this.heap[0] = lastElement;
// Step 3: Bubble down to restore heap property
this.bubbleDown();
}
return minValue;
}
// Bubble down: Move root element down until heap property is satisfied
bubbleDown() {
let currentIndex = 0;
// Keep going while we have at least a left child
while (this.hasLeftChild(currentIndex)) {
// Find which child is smaller
let smallerChildIndex = this.getLeftChildIndex(currentIndex);
if (this.hasRightChild(currentIndex)) {
const rightChildIndex = this.getRightChildIndex(currentIndex);
const rightChild = this.heap[rightChildIndex];
const leftChild = this.heap[smallerChildIndex];
if (rightChild < leftChild) {
smallerChildIndex = rightChildIndex;
}
}
// Check if we need to swap with the smaller child
const currentValue = this.heap[currentIndex];
const smallerChildValue = this.heap[smallerChildIndex];
if (currentValue > smallerChildValue) {
// We're larger than our smaller child, swap
this.swap(currentIndex, smallerChildIndex);
currentIndex = smallerChildIndex;
} else {
// Heap property satisfied
break;
}
}
}
size() {
return this.heap.length;
}
isEmpty() {
return this.heap.length === 0;
}
}
// MaxHeap is identical to MinHeap but with comparison operators reversed
class MaxHeap {
constructor() {
this.heap = [];
}
// All helper methods are identical to MinHeap
getParentIndex(childIndex) {
return Math.floor((childIndex - 1) / 2);
}
getLeftChildIndex(parentIndex) {
return 2 * parentIndex + 1;
}
getRightChildIndex(parentIndex) {
return 2 * parentIndex + 2;
}
hasParent(index) {
return this.getParentIndex(index) >= 0;
}
hasLeftChild(index) {
return this.getLeftChildIndex(index) < this.heap.length;
}
hasRightChild(index) {
return this.getRightChildIndex(index) < this.heap.length;
}
swap(indexOne, indexTwo) {
const temp = this.heap[indexOne];
this.heap[indexOne] = this.heap[indexTwo];
this.heap[indexTwo] = temp;
}
peek() {
if (this.heap.length === 0) {
return null;
}
return this.heap[0];
}
add(value) {
this.heap.push(value);
this.bubbleUp();
}
// Bubble up: Move element up while it's LARGER than parent (max heap)
bubbleUp() {
let currentIndex = this.heap.length - 1;
while (this.hasParent(currentIndex)) {
const parentIndex = this.getParentIndex(currentIndex);
const currentValue = this.heap[currentIndex];
const parentValue = this.heap[parentIndex];
// DIFFERENT FROM MIN HEAP: Check if larger than parent
if (currentValue > parentValue) {
this.swap(currentIndex, parentIndex);
currentIndex = parentIndex;
} else {
break;
}
}
}
extractMax() {
if (this.heap.length === 0) {
return null;
}
const maxValue = this.heap[0];
const lastElement = this.heap.pop();
if (this.heap.length > 0) {
this.heap[0] = lastElement;
this.bubbleDown();
}
return maxValue;
}
// Bubble down: Move element down while it's SMALLER than children (max heap)
bubbleDown() {
let currentIndex = 0;
while (this.hasLeftChild(currentIndex)) {
// Find which child is LARGER (for max heap)
let largerChildIndex = this.getLeftChildIndex(currentIndex);
if (this.hasRightChild(currentIndex)) {
const rightChildIndex = this.getRightChildIndex(currentIndex);
const rightChild = this.heap[rightChildIndex];
const leftChild = this.heap[largerChildIndex];
// DIFFERENT FROM MIN HEAP: Choose larger child
if (rightChild > leftChild) {
largerChildIndex = rightChildIndex;
}
}
const currentValue = this.heap[currentIndex];
const largerChildValue = this.heap[largerChildIndex];
// DIFFERENT FROM MIN HEAP: Check if smaller than larger child
if (currentValue < largerChildValue) {
this.swap(currentIndex, largerChildIndex);
currentIndex = largerChildIndex;
} else {
break;
}
}
}
size() {
return this.heap.length;
}
isEmpty() {
return this.heap.length === 0;
}
}
// Now the MedianFinder using our custom heaps
class MedianFinder {
constructor() {
// Small heap: Max heap storing the smaller half of numbers
// The maximum of small numbers is at the top (accessible in O(1))
this.smallerHalf = new MaxHeap();
// Large heap: Min heap storing the larger half of numbers
// The minimum of large numbers is at the top (accessible in O(1))
this.largerHalf = new MinHeap();
}
addNumber(num) {
// DECISION FRAMEWORK: Initial placement
// Where should this number go initially?
// Step 1: Decide which heap gets the number
const shouldAddToSmallerHalf = this.decideInitialHeap(num);
if (shouldAddToSmallerHalf) {
this.smallerHalf.add(num);
} else {
this.largerHalf.add(num);
}
// Step 2: Fix ordering invariant if violated
this.maintainOrderingInvariant();
// Step 3: Fix size invariant if violated
this.maintainSizeBalance();
}
decideInitialHeap(num) {
// DECISION FRAMEWORK: Which heap should receive the new number?
// If smaller half is empty, start there
if (this.smallerHalf.isEmpty()) {
return true;
}
// Compare with the boundary (max of smaller half)
const boundaryValue = this.smallerHalf.peek();
// Numbers <= boundary go to smaller half
return num <= boundaryValue;
}
maintainOrderingInvariant() {
// DECISION FRAMEWORK: Check ordering invariant
// Ensure: max(smallerHalf) <= min(largerHalf)
// Can't violate if either heap is empty
if (this.smallerHalf.isEmpty() || this.largerHalf.isEmpty()) {
return;
}
const maxOfSmaller = this.smallerHalf.peek();
const minOfLarger = this.largerHalf.peek();
// Check for violation
const orderingViolated = maxOfSmaller > minOfLarger;
if (orderingViolated) {
// Fix: Move the problematic element from smaller to larger
const elementToMove = this.smallerHalf.extractMax();
this.largerHalf.add(elementToMove);
}
}
maintainSizeBalance() {
// DECISION FRAMEWORK: Balance heap sizes
// Ensure: |smallerHalf.size - largerHalf.size| <= 1
const smallerSize = this.smallerHalf.size();
const largerSize = this.largerHalf.size();
const sizeDifference = smallerSize - largerSize;
if (sizeDifference > 1) {
// Smaller half has too many elements
const elementToMove = this.smallerHalf.extractMax();
this.largerHalf.add(elementToMove);
} else if (sizeDifference < -1) {
// Larger half has too many elements
const elementToMove = this.largerHalf.extractMin();
this.smallerHalf.add(elementToMove);
}
// Size difference is now at most 1
}
findMedian() {
// DECISION FRAMEWORK: How to compute median?
const smallerSize = this.smallerHalf.size();
const largerSize = this.largerHalf.size();
const totalElements = smallerSize + largerSize;
// Handle empty case
if (totalElements === 0) {
return null;
}
// Check which configuration we have
const hasEvenElements = totalElements % 2 === 0;
if (hasEvenElements) {
// Even number of elements: average the two middle values
const leftMiddle = this.smallerHalf.peek();
const rightMiddle = this.largerHalf.peek();
return (leftMiddle + rightMiddle) / 2;
} else {
// Odd number of elements: return the middle value
// The middle is in whichever heap has more elements
if (smallerSize > largerSize) {
return this.smallerHalf.peek();
} else {
return this.largerHalf.peek();
}
}
}
}
// === EXAMPLE USAGE: Stock Price Analysis ===
function analyzeStockPrices() {
const medianTracker = new MedianFinder();
// Stream of stock prices throughout the day
const stockPrices = [12, 15, 10, 8, 13, 17, 14, 16, 11];
console.log("Real-time median tracking:");
console.log("Initial: No prices yet, median = null");
for (const price of stockPrices) {
medianTracker.addNumber(price);
const currentMedian = medianTracker.findMedian();
// Show the state after each insertion
const totalPrices =
medianTracker.smallerHalf.size() + medianTracker.largerHalf.size();
console.log(`After price $${price}:`);
console.log(` Total prices: ${totalPrices}`);
console.log(` Current median: $${currentMedian}`);
}
// Show final heap configuration
console.log("\nFinal heap configuration:");
console.log(
`Smaller half (max heap) top: $${medianTracker.smallerHalf.peek()}`
);
console.log(
`Larger half (min heap) top: $${medianTracker.largerHalf.peek()}`
);
console.log(`Final median: $${medianTracker.findMedian()}`);
}
// Let's trace through the algorithm step by step
function detailedTrace() {
console.log("\nDetailed trace of first few insertions:");
const tracer = new MedianFinder();
// Insert 12
console.log("\n1. Insert 12:");
console.log(" - Smaller half empty, add to smaller half");
tracer.addNumber(12);
console.log(" - Smaller: [12], Larger: []");
console.log(" - Median: 12");
// Insert 15
console.log("\n2. Insert 15:");
console.log(" - 15 > 12 (smaller's max), add to larger half");
tracer.addNumber(15);
console.log(" - Smaller: [12], Larger: [15]");
console.log(" - Median: (12 + 15) / 2 = 13.5");
// Insert 10
console.log("\n3. Insert 10:");
console.log(" - 10 <= 12 (smaller's max), add to smaller half");
tracer.addNumber(10);
console.log(" - Smaller: [12, 10], Larger: [15]");
console.log(" - Median: 12 (smaller has more elements)");
// Insert 8
console.log("\n4. Insert 8:");
console.log(" - 8 <= 12 (smaller's max), add to smaller half");
console.log(" - Smaller now has [12, 10, 8], Larger has [15]");
console.log(" - Size imbalance! Move 12 from smaller to larger");
tracer.addNumber(8);
console.log(" - Smaller: [10, 8], Larger: [12, 15]");
console.log(" - Median: (10 + 12) / 2 = 11");
}
// Run the examples
analyzeStockPrices();
detailedTrace();
Doubt buster
Common doubt: "Why do we need TWO heaps? Can't we just use one heap or keep a sorted array and find the middle element?"
This is a natural question - finding the median seems like it should be simpler. Let's explore why the naive approaches fail and why two heaps is the optimal solution.
Why a single heap doesn't work
A single heap (either min or max) has a fundamental limitation:
Max Heap: [17, 16, 15, 14, 13, 12, 11, 10, 8]
↑
Easy to get maximum (17)
But where's the median (13)?
- It's somewhere in the middle of the heap structure
- No efficient way to access it without destroying the heap
- Finding it requires O(n) time to extract elements
The problem: Heaps only give efficient access to the extreme (min or max), not the middle. The median requires access to the middle element(s), which a single heap cannot provide efficiently.
Why a sorted array is too expensive
You might think: "Just maintain a sorted array and return array[n/2]!"
class NaiveMedianFinder {
constructor() {
this.sorted = [];
}
insert(num) {
// Find position and insert - O(n) time!
let pos = 0;
while (pos < this.sorted.length && this.sorted[pos] < num) {
pos++;
}
this.sorted.splice(pos, 0, num); // Shifting elements is O(n)
}
getMedian() {
// This is fast - O(1)
const mid = Math.floor(this.sorted.length / 2);
return this.sorted[mid];
}
}
The problem: Every insertion requires O(n) time to maintain sorted order. For n insertions, that's O(n²) total time. With thousands of updates per second, this becomes impossibly slow.
Why binary search tree seems good but isn't perfect
A balanced BST could work:
Balanced BST:
13
/ \
10 15
/ \ / \
8 11 14 16
\ \
12 17
Finding median: O(log n) with order statistics
Insertion: O(log n)
The limitations:
- Requires maintaining node counts in subtrees (added complexity)
- Rebalancing operations are complex (AVL or Red-Black tree rotations)
- The median-finding still requires traversal, not O(1)
- More complex to implement correctly than two heaps
The key realization: We only need the boundary
Here's the brilliant insight that makes two heaps optimal:
All numbers: [8, 10, 11, 12, | 13, 14, 15, 16, 17]
↑
The median
We DON'T need to know that 10 comes before 11
We DON'T need to know that 15 comes before 16
We ONLY need to know:
- Everything left of | is ≤ everything right of |
- The values exactly at the boundary (12 and 13)
- The sizes of each side
Two heaps gives us exactly this:
Small (Max Heap): Large (Min Heap):
12 ← top 13 ← top
/ \ / \
10 11 14 15
/ / \
8 16 17
We can access 12 and 13 in O(1)!
The median is always one of: small.top(), large.top(), or their average
Concrete example: Why rebalancing through heaps is elegant
Let's trace adding 9 to see the elegance:
Step 1: Initial state
Small: {12, 11, 10, 8} Large: {13, 14, 15, 16, 17}
Median: 13 (large has one more element)
Step 2: Insert 9
9 < 12 (small's max), so add to small
Small: {12, 11, 10, 9, 8} Large: {13, 14, 15, 16, 17}
Step 3: Rebalance (small has 2 more than large!)
Move max of small (12) to large
Small: {11, 10, 9, 8} Large: {12, 13, 14, 15, 16, 17}
But wait - now large has 2 more than small!
Move min of large (12) to small... wait, that's circular!
Actually, let's trace more carefully:
After moving 12: Small has 4, Large has 6
Large has 2 more, so move its min (12) back? No!
Move 12 to large first keeps ordering valid
Then if large gets too big, we move its NEW min
Actually, let me recalculate:
Correct Step 3: Size rebalancing
Small has 5, Large has 5 - balanced!
No rebalancing needed
New median: (11 + 13) / 2 = 12
The beautiful part: we never needed to sort all 10 numbers. We just maintained two heaps and the median emerged naturally!
Why Two Heaps achieves optimal complexity
| Operation | Sorted Array | Single Heap | Two Heaps | | ---------- | ------------ | ----------- | --------- | | Insert | O(n) | O(log n) | O(log n) | | Get Median | O(1) | O(n) | O(1) | | Space | O(n) | O(n) | O(n) |
For a stream of n elements:
- Sorted array: O(n²) total time (unusable for large streams)
- Single heap: O(n log n) insert but O(n²) for getting medians repeatedly
- Two heaps: O(n log n) total with O(1) median access anytime
The mathematical guarantee
The two-heap approach guarantees:
-
Ordering invariant: max(small) ≤ min(large)
- This ensures correct partitioning
-
Size invariant: |small.size - large.size| ≤ 1
- This ensures the median is always at the boundary
Together, these invariants mean:
- If we have 2k elements: median = (small.top() + large.top()) / 2
- If we have 2k+1 elements: median = larger_heap.top()
The bottom line: Two heaps is optimal because it maintains exactly the information we need (the boundary between halves) and nothing more. It's not about sorting everything - it's about maintaining a balanced partition with O(1) access to the partition boundary. That's the elegant insight that makes this pattern so powerful!