Find the Duplicate Number in Array — From HashSet to Floyd’s Cycle Detection

Problem Overview

The problem Find the Duplicate Number gives an array of n + 1 integers where each value lies in the range 1 to n. Exactly one value is repeated, but it may appear more than twice. The main challenge is to find the duplicate without modifying the array and using only constant extra space.

My First Approach: Using a HashSet

The most straightforward solution that came to my mind was using a HashSet. The idea is simple: keep inserting elements into a set, and if an element already exists, return it as the duplicate.


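from typing import List


class Solution:
    def findDuplicate(self, nums: List[int]) -> int:
        seen = set()  # values encountered so far
        for num in nums:
            if num in seen:
                return num  # second occurrence: this is the duplicate
            seen.add(num)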

This solution is correct and runs in O(n) time on average, but it uses O(n) extra space, which violates the constant-space constraint.

Key Insight from the Tutorial: Think of It as a Linked List

After watching a YouTube tutorial, I learned a crucial observation:

Each index points to nums[index]
All values lie in the range 1..n, so every value is a valid index
A duplicate means two indices point to the same value

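This structure naturally forms a cycle, similar to a linked list. Start at index 0 and keep jumping from the current index to nums[index]. No value is 0, so nothing ever points back to index 0, and the walk must eventually revisit some other index. The first index it revisits is the duplicate value, because that is the only index with more than one pointer leading into it. Once I understood this, the problem became a classic cycle detection problem.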
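To make the linked-list picture concrete, here is a minimal sketch that just follows the pointers and records the indices visited. The helper name trace_walk and the sample array are my own, chosen only for illustration.

```python
from typing import List


def trace_walk(nums: List[int], steps: int) -> List[int]:
    """Follow index -> nums[index] from index 0 and record every index visited."""
    path = [0]
    for _ in range(steps):
        path.append(nums[path[-1]])
    return path


if __name__ == "__main__":
    nums = [1, 3, 4, 2, 2]  # hypothetical sample; the duplicate is 2
    print(trace_walk(nums, 7))  # [0, 1, 3, 2, 4, 2, 4, 2]
```

In this sample the walk runs 0, 1, 3 along the "tail", then loops 2, 4, 2, 4 forever. The loop is entered at 2, which is the duplicate.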

Applying Floyd’s Tortoise and Hare Algorithm

Using Floyd’s Cycle Detection algorithm, I first detected the presence of a cycle using two pointers:

slow moves one step at a time
fast moves two steps at a time

When they meet, we are guaranteed to be inside the cycle.


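from typing import List


class Solution:
    def findDuplicate(self, nums: List[int]) -> int:
        # Phase 1 only: find a meeting point inside the cycle.
        slow = nums[0]
        fast = nums[0]

        while True:
            slow = nums[slow]        # one step
            fast = nums[nums[fast]]  # two steps
            if slow == fast:
                break
        # slow and fast now sit on the same node inside the cycle;
        # locating the duplicate itself is the next step.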

However, this meeting point is not necessarily the duplicate number.
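A small sketch shows the gap between the meeting point and the duplicate. The function name meeting_point and the sample arrays are assumptions of mine; the pointer logic mirrors the detection loop above.

```python
from typing import List


def meeting_point(nums: List[int]) -> int:
    """Return the node where slow and fast first meet (not necessarily the duplicate)."""
    slow = nums[0]
    fast = nums[0]
    while True:
        slow = nums[slow]        # one step
        fast = nums[nums[fast]]  # two steps
        if slow == fast:
            return slow


if __name__ == "__main__":
    nums = [1, 2, 3, 4, 2]  # hypothetical sample; the duplicate is 2
    print(meeting_point(nums))  # 4
```

Here the pointers meet at 4, while the repeated value is 2. The meeting point only tells us we are somewhere on the cycle 2, 3, 4.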

Finding the Entry Point of the Cycle (The Duplicate)

To find the exact duplicate, I introduced a third pointer (head) that starts where slow and fast started, at nums[0]. From the meeting point on, head and slow each move one step at a time. The point where they meet is the entry of the cycle, which is the duplicate number.


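from typing import List


class Solution:
    def findDuplicate(self, nums: List[int]) -> int:
        slow = nums[0]
        fast = nums[0]
        head = nums[0]  # stays at the start until slow and fast meet

        while True:
            slow = nums[slow]        # one step
            fast = nums[nums[fast]]  # two steps
            if slow == fast:
                # Phase 2: advance head and slow together;
                # they meet at the cycle entry, i.e. the duplicate.
                while head != slow:
                    head = nums[head]
                    slow = nums[slow]
                return head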

Why This Works

The array behaves like a linked list with a cycle
The duplicate value is the entry point of that cycle
Floyd’s algorithm guarantees finding it using constant space
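For readers who want the reasoning behind the second phase, here is a short derivation sketch. The symbols x_k, \mu, \lambda and i are my own shorthand, not part of the problem statement: x_k is the node reached after k steps from nums[0], \mu is the number of steps from nums[0] to the cycle entry, and \lambda is the cycle length.

```latex
\begin{align*}
&x_0 = \texttt{nums[0]}, \qquad x_{k+1} = \texttt{nums}[x_k] \\
&\text{slow and fast meet after } i \ge 1 \text{ steps: } x_i = x_{2i}
  \;\Rightarrow\; i \ge \mu \text{ and } i = k\lambda \text{ for some integer } k \ge 1 \\
&\text{after } \mu \text{ more single steps: } \quad
  \texttt{head} = x_{\mu}, \qquad
  \texttt{slow} = x_{i+\mu} = x_{\mu + k\lambda} = x_{\mu}
\end{align*}
```

For any j < \mu, head is still on the tail at x_j while slow is already on the cycle, so they cannot meet earlier. Their first meeting is therefore exactly at the cycle entry x_\mu.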

This approach satisfies all problem constraints and runs in:

Time Complexity: O(n)
Space Complexity: O(1)
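One way to gain confidence in the constant-space version is to cross-check it against the HashSet version on random valid inputs. The sketch below is only a sanity check under my own assumptions: the function names, the random_case generator and the seed are hypothetical, and the two solvers are standalone restatements of the approaches in this post.

```python
import random
from typing import List


def find_duplicate_set(nums: List[int]) -> int:
    """O(n) extra space reference: return the first value seen twice."""
    seen = set()
    for num in nums:
        if num in seen:
            return num
        seen.add(num)
    raise ValueError("input contains no duplicate")


def find_duplicate_floyd(nums: List[int]) -> int:
    """O(1) extra space: Floyd's cycle detection on index -> nums[index]."""
    slow = fast = head = nums[0]
    while True:
        slow = nums[slow]
        fast = nums[nums[fast]]
        if slow == fast:
            break
    while head != slow:
        head = nums[head]
        slow = nums[slow]
    return head


def random_case(rng: random.Random, max_n: int = 20) -> List[int]:
    """Build n + 1 values in 1..n where exactly one value repeats (2 or more times)."""
    n = rng.randint(1, max_n)
    dup = rng.randint(1, n)
    copies = rng.randint(2, n + 1)
    others = [v for v in range(1, n + 1) if v != dup]
    nums = [dup] * copies + rng.sample(others, n + 1 - copies)
    rng.shuffle(nums)
    return nums


if __name__ == "__main__":
    rng = random.Random(0)
    for _ in range(1000):
        case = random_case(rng)
        assert find_duplicate_floyd(case) == find_duplicate_set(case), case
    print("all cases agree")
```

The generator covers the case where the duplicate appears more than twice, which is easy to forget when testing by hand.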

Complete Solution Code

You can find the complete implementation here:

👉 GitHub – Find the Duplicate Number (LC287)

Final Thoughts

This problem was a great example of how understanding constraints can completely change the approach. Starting from a simple HashSet solution and gradually deriving an optimal cycle detection algorithm helped me build deeper intuition about pointer-based problems.

Road2Geeks
