Skip to main content

Find Median from Data Stream (LeetCode 295) – Why Linked List TLEs & Two Heaps Work

📌 Problem Overview

LeetCode 295, Find Median from Data Stream, asks us to design a data structure that efficiently supports:

  • addNum(num) – add a number from the data stream
  • findMedian() – return the median of all elements so far

The challenge is not correctness, but performance. The operations may be called up to 10⁵ times, so naive approaches often lead to Time Limit Exceeded (TLE).


❌ Why a Sorted Linked List Causes TLE

A common first attempt is to maintain a sorted doubly linked list:

  • Traverse the list to find the correct position
  • Insert the number while keeping the list sorted
  • Walk from both ends to compute the median

⏱ Time Complexity

Operation Time Complexity
addNum() O(n)
findMedian() O(n)

With up to 100,000 operations, this approach degrades to O(n²) overall, which is why it results in TLE on LeetCode.


✅ Optimal Approach: Two Heaps

The correct and accepted solution uses two heaps instead of maintaining full order.

  • Max Heap (left) → stores the smaller half of numbers
  • Min Heap (right) → stores the larger half of numbers

The idea is to maintain balance, not full sorting.

📐 Invariant Rules

  • The size difference between heaps is at most 1
  • Max heap always has the extra element when count is odd

🧠 How Median Is Calculated

  • If total count is odd → median is top of max heap
  • If total count is even → median is average of both heap tops

🚀 Time Complexity (Optimized)

Operation Time Complexity
addNum() O(log n)
findMedian() O(1)

This guarantees excellent performance even for large input sizes.


🐍 Python Implementation (Two Heaps)

import heapq

class MedianFinder:
    def __init__(self):
        self.left = []   # max heap
        self.right = []  # min heap

    def addNum(self, num: int) -> None:
        heapq.heappush(self.left, -num)
        heapq.heappush(self.right, -heapq.heappop(self.left))
        if len(self.right) > len(self.left):
            heapq.heappush(self.left, -heapq.heappop(self.right))

    def findMedian(self) -> float:
        if len(self.left) > len(self.right):
            return -self.left[0]
        return (-self.left[0] + self.right[0]) / 2

📎 Full Solution on GitHub

You can find the complete and tested implementation here:

👉 LeetCode 295 – Find Median from Data Stream (Python)


🎯 Key Takeaways

  • Maintaining a fully sorted structure is expensive for data streams
  • Balancing data is more important than ordering it
  • Two heaps provide the perfect trade-off between speed and simplicity

This problem is a classic example of how choosing the right data structure matters more than clever logic.

Comments