
Find the Duplicate Number in Array — From HashSet to Floyd’s Cycle Detection

Problem Overview

The Find the Duplicate Number problem (LeetCode 287) gives an array of n + 1 integers where each value lies in the range 1 to n. Exactly one value is repeated, but it may appear more than twice. The main challenge is to find that value without modifying the array and while using only constant extra space.


My First Approach: Using a HashSet

The most straightforward solution that came to my mind was using a HashSet. The idea is simple: keep inserting elements into a set, and if an element already exists, return it as the duplicate.


from typing import List

class Solution:
    def findDuplicate(self, nums: List[int]) -> int:
        h_set = set()  # values seen so far
        for i in nums:
            if i in h_set:
                return i  # second sighting, so this is the duplicate
            h_set.add(i)

This solution is correct and runs in O(n) time, but it uses O(n) extra space, which violates the constant-space constraint.


Key Insight from the Tutorial: Think of It as a Linked List

After watching a YouTube tutorial, I learned a crucial observation:

  • Each index points to nums[index]
  • All values lie in the range 1..n, so every value is a valid index
  • A duplicate means two indices point to the same value

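Following index → nums[index] from index 0 therefore behaves like walking a linked list. The walk can never return to index 0, because no value equals 0, and it has to revisit some node eventually because there are only n + 1 of them. The first node it revisits is the one with two incoming pointers, which is the duplicate value. Once I understood this, the problem became a classic cycle detection problem.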
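To see the cycle concretely, here is a small sketch that follows index → nums[index] starting from index 0. The helper name trace_path and the sample array are my own illustration, not something from the tutorial.

```python
from typing import List


def trace_path(nums: List[int], steps: int) -> List[int]:
    """Follow index -> nums[index] from index 0 and record each node visited."""
    path = [0]
    for _ in range(steps):
        path.append(nums[path[-1]])
    return path


# Sample input (an assumption for illustration): the value 2 appears twice.
nums = [1, 3, 4, 2, 2]
print(trace_path(nums, 7))  # [0, 1, 3, 2, 4, 2, 4, 2]
```

Index 3 and index 4 both hold the value 2, so two pointers lead into node 2, and that is exactly where the walk enters the loop 2 → 4 → 2.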


Applying Floyd’s Tortoise and Hare Algorithm

Using Floyd’s Cycle Detection algorithm, I first detected the presence of a cycle using two pointers:

  • slow moves one step at a time
  • fast moves two steps at a time

When they meet, we are guaranteed to be inside the cycle.


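from typing import List

class Solution:
    def findDuplicate(self, nums: List[int]) -> int:
        # Phase 1 only: both pointers start at the same node.
        slow = nums[0]
        fast = nums[0]

        while True:
            slow = nums[slow]  # one step
            fast = nums[fast]  # two steps
            fast = nums[fast]
            if slow == fast:
                break  # the meeting point is somewhere inside the cycle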

However, this meeting point is not necessarily the duplicate number.
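A quick way to convince yourself is to run only the first phase on a small input and compare the meeting point with the known duplicate. This is a minimal sketch; the function name meeting_point and the sample array are assumptions made for the example.

```python
from typing import List


def meeting_point(nums: List[int]) -> int:
    """Phase 1 only: return the node where slow and fast first meet."""
    slow = fast = nums[0]
    while True:
        slow = nums[slow]        # one step
        fast = nums[nums[fast]]  # two steps
        if slow == fast:
            return slow


# Sample input (an assumption for illustration): the duplicate is 2.
nums = [1, 2, 3, 4, 2]
print(meeting_point(nums))  # 4
```

Here the pointers meet at node 4, which lies on the cycle 2 → 3 → 4 → 2 but is not the duplicate value 2.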


Finding the Entry Point of the Cycle (The Duplicate)

To find the exact duplicate, I introduced a third pointer (head) that starts at nums[0], the same node where slow and fast started. From the meeting point onward, head and slow both move one step at a time. The node where they meet is the entry of the cycle, which is the duplicate number.


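from typing import List

class Solution:
    def findDuplicate(self, nums: List[int]) -> int:
        slow = nums[0]
        fast = nums[0]
        head = nums[0]  # stays at the start until phase 2

        while True:
            # Phase 1: advance slow by one step and fast by two.
            slow = nums[slow]
            fast = nums[fast]
            fast = nums[fast]
            if slow == fast:
                # Phase 2: walk head and slow in lockstep to the cycle entry.
                while head != slow:
                    head = nums[head]
                    slow = nums[slow]
                return head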

Why This Works

  • The array behaves like a linked list with a cycle
  • The duplicate value is the entry point of that cycle
  • Floyd’s algorithm guarantees finding it using constant space
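The second bullet is the step that took me longest to accept, so here is a short sketch of the argument for why head and slow meet exactly at the entry. The symbols x_k, \mu and \lambda are my own notation, not taken from the tutorial.

```latex
\begin{align*}
&x_0 = \texttt{nums[0]}, \qquad x_{k+1} = \texttt{nums}[x_k] \\
&\mu = \text{number of steps from } x_0 \text{ to the cycle entry}, \qquad
 \lambda = \text{cycle length} \\[4pt]
&\text{After } k \text{ iterations of phase 1: } \texttt{slow} = x_k, \quad \texttt{fast} = x_{2k} \\
&x_k = x_{2k} \ (k \ge 1) \iff k \ge \mu \ \text{ and } \ \lambda \mid k \\[4pt]
&\text{Phase 2 starts with } \texttt{head} = x_0, \ \texttt{slow} = x_k.
 \text{ After } \mu \text{ more steps:} \\
&\texttt{head} = x_{\mu}, \qquad
 \texttt{slow} = x_{k+\mu} = x_{\mu} \quad (\text{since } \lambda \mid k)
\end{align*}
```

For fewer than \mu steps, head is still on the tail while slow is already on the cycle, so they cannot meet any earlier. Their first meeting is therefore at x_\mu, the cycle entry.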

This approach satisfies all problem constraints and runs in:

  • Time Complexity: O(n)
  • Space Complexity: O(1)
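As a sanity check, the two approaches can be compared on randomly generated inputs. This is only a sketch: the function names, the random_case generator, and the fixed seed are my own choices for the test, not part of the original solutions.

```python
import random
from typing import List


def find_duplicate_set(nums: List[int]) -> int:
    """O(n) extra space: return the first value seen twice."""
    seen = set()
    for value in nums:
        if value in seen:
            return value
        seen.add(value)
    raise ValueError("no duplicate found")


def find_duplicate_floyd(nums: List[int]) -> int:
    """O(1) extra space: Floyd's cycle detection on index -> nums[index]."""
    slow = fast = nums[0]
    while True:
        slow = nums[slow]
        fast = nums[nums[fast]]
        if slow == fast:
            break
    head = nums[0]
    while head != slow:
        head = nums[head]
        slow = nums[slow]
    return head


def random_case(n: int, rng: random.Random) -> List[int]:
    """Build n + 1 values in 1..n where exactly one value is repeated."""
    dup = rng.randint(1, n)
    copies = rng.randint(2, n + 1)
    others = [v for v in range(1, n + 1) if v != dup]
    nums = [dup] * copies + rng.sample(others, n + 1 - copies)
    rng.shuffle(nums)
    return nums


rng = random.Random(0)  # fixed seed so the run is repeatable
for _ in range(1000):
    case = random_case(rng.randint(1, 20), rng)
    assert find_duplicate_floyd(case) == find_duplicate_set(case)
print("all cases agree")
```

The generator deliberately allows the repeated value to appear more than twice, since the problem statement permits that.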

Complete Solution Code

You can find the complete implementation here:

👉 GitHub – Find the Duplicate Number (LC287)


Final Thoughts

This problem was a great example of how understanding constraints can completely change the approach. Starting from a simple HashSet solution and gradually deriving an optimal cycle detection algorithm helped me build deeper intuition about pointer-based problems.
