[LeetCode][295. Find Median from Data Stream] Two Heaps with the Follow Ups

By Long Luo

This article is the solution Two Heaps with the Follow Up’s Solution of Problem 295. Find Median from Data Stream.

Intuition

We can simply use a \(\texttt{ArrayList}\) to record the number and sort the list, then we can easily get the median element of the list. However, the Time Complexity will be \(O(n^2logn)\) and the Space Complexity is \(O(n)\).

It surely will be TLE and we have to find a better solution.

Heap

We can use Two Priority Queues (Heaps) to maintain the data of the entire data stream.

The min Heap denoted as \(\textit{queueMin}\) is used to maintain the number \(\textit{num} \leq \textit{median}\). The max Heap denoted as \(\textit{queueMax}\) is used to maintain the number \(\textit{num} \gt \textit{median}\).

  • When the total number of data stream elements is Even: \(\texttt{queueMax.size()} = \texttt{queueMin.size()}\), the dynamic median is \(\frac{\texttt{queueMax.peek()} + \texttt{queueMin.peek()}}{2}\);

  • When the total number of data stream elements is Odd: \(\texttt{queueMin.size()} = \texttt{queueMin.size()} + 1\), the dynamic median is \(\texttt{queueMin.peek()}\).

When we try to add a new number \(\textit{num}\) to the Two Heaps, the cases can be as follows:

  • \(\textit{num} \leq \max \{\textit{queMin}\}\)

We need to add this number to \(\textit{queueMin}\). The new median will be less than or equal to the original median, so we may need to move the \(\texttt{queueMin.peek()}\) to \(\textit{queueMax}\).

  • \(\textit{num} \gt \max \{\textit{queMin}\}\)

We need to add this number to \(\textit{queueMax}\). The new median will be greater than or equal to the original median, so we may need to move the \(\texttt{queueMax.peek()}\) to \(\textit{queueMin}\).

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
static class MedianFinder {
PriorityQueue<Integer> queueMin;
PriorityQueue<Integer> queueMax;

public MedianFinder() {
queueMin = new PriorityQueue<>((a, b) -> b - a);
queueMax = new PriorityQueue<>(((a, b) -> a - b));
}

public void addNum(int num) {
if (queueMin.isEmpty() || num <= queueMin.peek()) {
queueMin.offer(num);

if (queueMax.size() + 1 < queueMin.size()) {
queueMax.offer(queueMin.poll());
}
} else {
queueMax.offer(num);

if (queueMin.size() < queueMax.size()) {
queueMin.offer(queueMax.poll());
}
}
}

public double findMedian() {
if (queueMin.size() > queueMax.size()) {
return queueMin.peek();
}

return (queueMin.peek() + queueMax.peek()) / 2.0;
}
}

Analysis

  • Time Complexity: \(O(nlogn)\)
  • Space Complexity: \(O(n)\)

Follow Ups

If all integer numbers from the stream are in the range [0, 100], how would you optimize your solution?

  • We can use a bucket with a length of \(101\). Each bucket stores the number of occurrences of each number, and records the total number of elements in the data stream. When searching for the median, calculate the number of the median, and then scan all buckets from front to back to get the answer.

If 99% of all integer numbers from the stream are in the range [0, 100], how would you optimize your solution?

  • As last question, we can still use buckets to store the data. Then use Two Pointers to maintain the median. For the number out of range, we can use two arrays to record the number which less than \(0\) or greater than \(100\). If the median is not in \([0, 100]\), we can perform brute force to find it.

All suggestions are welcome. If you have any query or suggestion please comment below. Please upvote👍 if you like💗 it. Thank you:-)

Explore More Leetcode Solutions. 😉😃💗