The problem comes with two variations, one is with the constraint that the duplicate element can appear twice. For the other, the duplicate element can appear two or more times. It is stated that there is only one duplicate element in the array. The given array is called nums
, the length of the array is n + 1
and the elements are in the range of [1, n]
, both inclusive.
The solution is based on the tortoise and hair algorithm. The idea is to consider the n + 1
length array as a tree like structure (linked list is easier to imagine if the count of duplicate appearance is exactly two, but for multiple appearance, it'll form a tree like structure) with node
labelled from 0
to n
where nums[node]
represents the next node for the current node, i.e. to which node the current node points to (forms an edge). As nums
can contain only one value for a particular node, a node can only point to one next node. As we have n + 1
nodes, to form a tree we would require exactly n
edges, but as there are n + 1
edges, one node must point back to form a graph. The graph will contain a directed cycle because the outdegree of each node is exactly one.
The argument, that starting from 0
will always lead into the cycle is that in case it does not, 0
must exist after the cycle in a directed path, but that leads to a contradiction as the node in the cycle from which the branch of 0
split, would have two outgoing edges, which is not possible. This can be extrapolated for any starting node, not only for zero, So starting from any position would always lead us into the cycle, and we can use Floyd's algorithm of fast and slow pointer to solve this. The duplicate element in the element at which the cycle starts because both the cyclic node and linear chain node points to the starting of the cycle.
class Solution:
def findDuplicate(self, nums: List[int]) -> int:
slow, fast = 0, 0
# The starting position needs to be computed outside of
# the while loop once else both fast and slow being zero
# will prematurely break the loop.
slow = nums[slow]
fast = nums[fast]
fast = nums[fast]
while slow != fast:
slow = nums[slow]
fast = nums[fast]
fast = nums[fast]
fast = 0
while slow != fast:
slow = nums[slow]
fast = nums[fast]
return slow