LC 295 - Find Median from Data Stream
Question
The median is the middle value in an ordered integer list. If the size of the list is even, there is no middle value, and the median is the mean of the two middle values.
- For example, for arr = [2,3,4], the median is 3.
- For example, for arr = [2,3], the median is (2 + 3) / 2 = 2.5.
Implement the MedianFinder class:
- MedianFinder() initializes the MedianFinder object.
- void addNum(int num) adds the integer num from the data stream to the data structure.
- double findMedian() returns the median of all elements so far. Answers within 10^-5 of the actual answer will be accepted.
Example 1:
Input
["MedianFinder", "addNum", "addNum", "findMedian", "addNum", "findMedian"]
[[], [1], [2], [], [3], []]
Output
[null, null, null, 1.5, null, 2.0]
Explanation
MedianFinder medianFinder = new MedianFinder();
medianFinder.addNum(1);    // arr = [1]
medianFinder.addNum(2);    // arr = [1, 2]
medianFinder.findMedian(); // return 1.5 (i.e., (1 + 2) / 2)
medianFinder.addNum(3);    // arr = [1, 2, 3]
medianFinder.findMedian(); // return 2.0
Constraints:
- -10^5 <= num <= 10^5
- There will be at least one element in the data structure before calling findMedian.
- At most 5 * 10^4 calls will be made to addNum and findMedian.
Follow up:
- If all integer numbers from the stream are in the range [0, 100], how would you optimize your solution?
- If 99% of all integer numbers from the stream are in the range [0, 100], how would you optimize your solution?
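One possible answer to the first follow-up, sketched under the assumption that every number really is in [0, 100]: replace the heaps with 101 counting buckets, so addNum is constant time and findMedian scans a fixed-size array. The class name BucketMedianFinder and the helper _kth are hypothetical names chosen for this illustration, not part of the original problem.

```python
class BucketMedianFinder:
    """Hypothetical variant for the first follow-up: every num is in [0, 100]."""

    def __init__(self):
        self.counts = [0] * 101  # counts[v] = how many times v has been added
        self.total = 0           # total number of elements added so far

    def addNum(self, num: int) -> None:
        # Assumption: 0 <= num <= 100, as stated in the follow-up.
        self.counts[num] += 1
        self.total += 1

    def _kth(self, k: int) -> int:
        # Return the k-th smallest value (0-indexed) by scanning the buckets.
        seen = 0
        for value, count in enumerate(self.counts):
            seen += count
            if seen > k:
                return value
        raise IndexError("k is out of range")

    def findMedian(self) -> float:
        mid = self.total // 2
        if self.total % 2 == 1:
            return float(self._kth(mid))
        return (self._kth(mid - 1) + self._kth(mid)) / 2


finder = BucketMedianFinder()
finder.addNum(1)
finder.addNum(2)
print(finder.findMedian())  # 1.5
finder.addNum(3)
print(finder.findMedian())  # 2.0
```

Because the bucket array has a fixed length of 101, both operations take constant time regardless of how many numbers have been added.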
Links
Question here and solution here
Solution
concept
The key idea is to keep two heaps of roughly the same size:
- the first heap holds the smaller half of the numbers (this is a max heap)
- the second heap holds the larger half of the numbers (this is a min heap)
With this setup, finding the median is easy because it depends only on the tops of the two heaps: if the total count is odd, the median is the top of whichever heap has one more element; if it is even, the median is the average of the two tops.
When we add a number, we always push it onto the max heap (the smaller half) first, and then restore two invariants:
- order check: every number in the small heap must be less than or equal to every number in the large heap, so if the top of the small heap is greater than the top of the large heap, move it over
- size check: the two heaps may differ in size by at most one, which is what makes the median computation work.
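Python's heapq module only provides a min heap, so a max heap is commonly simulated by storing negated values. The short sketch below illustrates that trick in isolation; the sample numbers are arbitrary and chosen only for this example.

```python
import heapq

small = []  # max heap simulated by storing negated values
for num in [2, 7, 4]:
    heapq.heappush(small, -num)

largest = -small[0]             # peek at the maximum without removing it
popped = -heapq.heappop(small)  # remove and return the maximum
print(largest, popped, -small[0])  # 7 7 4
```

Every value read from such a heap has to be negated again, which is why the solution multiplies by -1 whenever it touches the small heap.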
code
import heapq


class MedianFinder:
    def __init__(self):
        self.small = []  # max heap (values stored negated)
        self.large = []  # min heap

    def addNum(self, num: int) -> None:
        heapq.heappush(self.small, -1 * num)
        # check order: the top of small must not exceed the top of large
        if self.small and self.large and -1 * self.small[0] > self.large[0]:
            tmp = heapq.heappop(self.small)
            heapq.heappush(self.large, -1 * tmp)
        # check uneven size: the heaps may differ by at most one element
        if len(self.small) > len(self.large) + 1:
            tmp = heapq.heappop(self.small)
            heapq.heappush(self.large, -1 * tmp)
        if len(self.large) > len(self.small) + 1:
            tmp = heapq.heappop(self.large)
            heapq.heappush(self.small, -1 * tmp)

    def findMedian(self) -> float:
        # odd total number: the median is the top of the bigger heap
        if len(self.small) > len(self.large):
            return -1 * self.small[0]
        elif len(self.small) < len(self.large):
            return self.large[0]
        else:  # even total number: average the two tops
            return (-1 * self.small[0] + self.large[0]) / 2


# Your MedianFinder object will be instantiated and called as such:
# obj = MedianFinder()
# obj.addNum(num)
# param_2 = obj.findMedian()
Complexity
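time: $O(\log n)$ per addNum and $O(1)$ per findMedian, so $O(m \log n)$ for $m$ calls in total, where $n$ is the number of elements stored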
space: $O(n)$
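To put the insertion cost in context, a simpler baseline keeps a single sorted list. In the sketch below (the class name SortedListMedianFinder is hypothetical), bisect.insort finds the position by binary search but still shifts elements, so each addNum is $O(n)$. Such a baseline can also serve as a reference when checking the heap solution on random input.

```python
import bisect


class SortedListMedianFinder:
    """Hypothetical baseline: one sorted list, O(n) insertion per addNum."""

    def __init__(self):
        self.nums = []  # always kept in ascending order

    def addNum(self, num: int) -> None:
        # Binary search for the position, then insert (shifts later elements).
        bisect.insort(self.nums, num)

    def findMedian(self) -> float:
        n = len(self.nums)
        mid = n // 2
        if n % 2 == 1:
            return float(self.nums[mid])
        return (self.nums[mid - 1] + self.nums[mid]) / 2


baseline = SortedListMedianFinder()
baseline.addNum(1)
baseline.addNum(2)
print(baseline.findMedian())  # 1.5
baseline.addNum(3)
print(baseline.findMedian())  # 2.0
```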