📌 Problem Overview
LeetCode 295, Find Median from Data Stream, asks us to design a data structure that efficiently supports:
addNum(num)– add a number from the data streamfindMedian()– return the median of all elements so far
The challenge is not correctness, but performance. The operations may be called up to 10⁵ times, so naive approaches often lead to Time Limit Exceeded (TLE).
❌ Why a Sorted Linked List Causes TLE
A common first attempt is to maintain a sorted doubly linked list:
- Traverse the list to find the correct position
- Insert the number while keeping the list sorted
- Walk from both ends to compute the median
⏱ Time Complexity
| Operation | Time Complexity |
|---|---|
| addNum() | O(n) |
| findMedian() | O(n) |
With up to 100,000 operations, this approach degrades to O(n²) overall, which is why it results in TLE on LeetCode.
✅ Optimal Approach: Two Heaps
The correct and accepted solution uses two heaps instead of maintaining full order.
- Max Heap (left) → stores the smaller half of numbers
- Min Heap (right) → stores the larger half of numbers
The idea is to maintain balance, not full sorting.
📐 Invariant Rules
- The size difference between heaps is at most 1
- Max heap always has the extra element when count is odd
🧠 How Median Is Calculated
- If total count is odd → median is top of max heap
- If total count is even → median is average of both heap tops
🚀 Time Complexity (Optimized)
| Operation | Time Complexity |
|---|---|
| addNum() | O(log n) |
| findMedian() | O(1) |
This guarantees excellent performance even for large input sizes.
🐍 Python Implementation (Two Heaps)
import heapq
class MedianFinder:
def __init__(self):
self.left = [] # max heap
self.right = [] # min heap
def addNum(self, num: int) -> None:
heapq.heappush(self.left, -num)
heapq.heappush(self.right, -heapq.heappop(self.left))
if len(self.right) > len(self.left):
heapq.heappush(self.left, -heapq.heappop(self.right))
def findMedian(self) -> float:
if len(self.left) > len(self.right):
return -self.left[0]
return (-self.left[0] + self.right[0]) / 2
📎 Full Solution on GitHub
You can find the complete and tested implementation here:
👉 LeetCode 295 – Find Median from Data Stream (Python)
🎯 Key Takeaways
- Maintaining a fully sorted structure is expensive for data streams
- Balancing data is more important than ordering it
- Two heaps provide the perfect trade-off between speed and simplicity
This problem is a classic example of how choosing the right data structure matters more than clever logic.
Comments
Post a Comment
Share your views on this blog😍😍