LC 287 - Find the Duplicate Number
Question
Given an array of integers nums containing n + 1 integers where each integer is in the range [1, n] inclusive.
There is only one repeated number in nums, return this repeated number.
You must solve the problem without modifying the array nums and using only constant extra space.
Example 1:
1
2
Input: nums = [1,3,4,2,2]
Output: 2
Example 2:
1
2
Input: nums = [3,1,3,4,2]
Output: 3
Example 3:
1
2
Input: nums = [3,3,3,3,3]
Output: 3
Constraints:
1 <= n <= 105nums.length == n + 11 <= nums[i] <= n- All the integers in
numsappear only once except for precisely one integer which appears two or more times.
Follow up:
- How can we prove that at least one duplicate number must exist in
nums? - Can you solve the problem in linear runtime complexity?
Links
Question here and solution here
Solution
concept
The main problem here is the constraints: we cannot modify the array and only constant extra space. (this means we cannot use a hashset) This problem we need to recognise that:
- this is a linked list problem: specifically we want to find the cycle in a linked list and the start of the cycle is our answer
- in order to achieve this, we need to use Floyd’s tortoise and hare algorithm.
For the first part, we need to take note that each number in the array is a pointer that points to the next index of the array. Take note below the graph is not correct, it should be1 -> 3 -> .... Basically the node itself is the value in the array, and it points to the next index. Note that due to definition, 0 is not included in the array, this means that index 0 is never going to be part of the cycle, since no number can point to 0. This is important information for the Floyd’s algorithm.
The Floyd’s algorithm requires 2 phase:
slowandfastpointer starts at the same place (and this place cannot be part of the cycle). In this question this is guarantee because index0is not part of cycle- when the slow and fast meet in the cycle, we will stop and take note the
slowposition
- when the slow and fast meet in the cycle, we will stop and take note the
- we then have another pointer
slow2which starts at the starting point (same asfastandslowpointer initially), and haveslow2andslowtraverse again.- when
slow2andslowmeets, this is the starting point of the cycle - this because the distance from
slow2to the starting point is the same as the distance of theslowto the starting of the cycle. Ifslow2is very far away from the starting point of the cycle,slowmight need to cycle many times before meetingslow2.
- when
code
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
class Solution:
"""
Floyd's Tortoise and Hare (Cycle Detection)
rephrase the problem to a linked list problem: the value of the array is the index of the next element
time: O(n)
space: O(1)
Refer to NeetCode's solution: https://www.youtube.com/watch?v=wjYnzkAhcNk
1. Find the intersection point of the slow and fast, these two pointers will meet if cycle is detected
2. Find the entrance of the cycle:
the distance of the overlap of fast and slow pointers to the entrance of the cycle is equal to the distance of the head to the entrance of the cycle
"""
def findDuplicate(self, nums: List[int]) -> int:
# first phase: find the intersection point of the slow and fast pointers
slow, fast = 0, 0
while True:
slow = nums[slow]
fast = nums[nums[fast]]
if slow == fast:
break
# second phase: find the entrance of the cycle
# if the distance from head to the entrance is very long, then slow pointer will loop many rounds in cycle
slow_2 = 0
while True:
slow = nums[slow]
slow_2 = nums[slow_2]
if slow == slow_2:
return slow
Complexity
time: $O(n)$
space: $O(1)$