Group Anagrams in Python – Optimal Hashing Solution (LeetCode 49)

The Group Anagrams problem is a classic hashing question frequently asked in coding interviews. The goal is to group strings that are anagrams of each other, meaning they contain the same characters with the same frequencies but possibly in a different order.

In this post, I’ll explain an optimal Python solution using character frequency counting, which avoids sorting and builds each word's grouping key in O(k) time instead of O(k log k).


๐Ÿ” Problem Overview

You are given an array of strings. Your task is to group all strings that are anagrams of each other and return them as a list of groups.

Two strings are anagrams if:

  • They contain the same characters
  • Each character appears the same number of times
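
As a quick illustration of this definition, the check below compares character counts with `collections.Counter`. The helper name `is_anagram` is hypothetical and is not part of the solution discussed later.

```python
from collections import Counter


def is_anagram(first: str, second: str) -> bool:
    """Return True when both strings have identical character counts."""
    # Counter maps each character to how many times it appears, so equal
    # Counters mean the same characters with the same frequencies.
    return Counter(first) == Counter(second)


print(is_anagram("listen", "silent"))  # True
print(is_anagram("aab", "abb"))        # False
```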

💡 Key Intuition

Instead of sorting every string (which costs O(k log k) time for a string of length k), we represent each word using a fixed-size frequency array of 26 counters, one per letter.

Since the problem restricts the input to the 26 lowercase English letters, the frequency representation becomes a constant-size signature. Two words generate the same signature exactly when they are anagrams.
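
To make the signature idea concrete, here is a minimal sketch that builds the 26-slot count for a word and compares a few results. It assumes the input contains only lowercase letters `a` to `z`; the function name `signature` is made up for this illustration.

```python
def signature(word: str) -> tuple:
    """Build a 26-slot letter count; assumes only lowercase a-z."""
    counts = [0] * 26
    for ch in word:
        counts[ord(ch) - ord("a")] += 1  # 'a' -> slot 0, ..., 'z' -> slot 25
    return tuple(counts)  # tuples are hashable, so they can be dict keys


print(signature("eat") == signature("tea"))  # True: same letters, same counts
print(signature("eat") == signature("bat"))  # False: 'e' vs 'b'
print(signature("abca")[:3])                 # (2, 1, 1)
```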


⚙️ Approach Explained

  1. Create a helper function that converts a word into a 26-length frequency tuple.
  2. Use this tuple as a key in a dictionary.
  3. Group words that share the same frequency signature.
  4. Return all grouped values.

✅ Python Implementation (Optimal Solution)


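class Solution:
    def tuple_repr(self, string: str) -> tuple:
        """Return a 26-slot letter-count tuple; assumes lowercase a-z only."""
        base = ord("a")
        counts = [0] * 26

        for ch in string:
            counts[ord(ch) - base] += 1  # 'a' -> index 0, ..., 'z' -> index 25

        # Lists are unhashable, so convert to a tuple for use as a dict key.
        return tuple(counts)

    def groupAnagrams(self, strs):
        groups = {}  # signature tuple -> list of words with that signature
        for word in strs:
            key = self.tuple_repr(word)
            if key in groups:
                groups[key].append(word)
            else:
                groups[key] = [word]
        return list(groups.values())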
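
A usage sketch may help here. The snippet below is a compact standalone variant of the same idea, not the class above: it uses `collections.defaultdict` to drop the `if`/`else` branch, and the function name `group_anagrams` is hypothetical. It still assumes lowercase `a` to `z` input.

```python
from collections import defaultdict


def group_anagrams(strs: list[str]) -> list[list[str]]:
    """Group words by their 26-slot letter-count signature (lowercase a-z only)."""
    groups = defaultdict(list)  # a missing key starts out as an empty list
    for word in strs:
        counts = [0] * 26
        for ch in word:
            counts[ord(ch) - ord("a")] += 1
        groups[tuple(counts)].append(word)
    return list(groups.values())


print(group_anagrams(["eat", "tea", "tan", "ate", "nat", "bat"]))
# [['eat', 'tea', 'ate'], ['tan', 'nat'], ['bat']]
```

Because dictionaries preserve insertion order, the groups come out in the order their first word appears in the input.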

⏱️ Time and Space Complexity

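  • Time Complexity: O(n × k), where n is the number of strings and k is the average length of a string. Each word is scanned once, and turning its 26 counters into a key adds only constant work.
  • Space Complexity: O(n) auxiliary space for up to n dictionary keys and the grouped lists, or O(n × k) if the characters of the stored strings are counted as well.
  • The frequency array for a single word uses constant space (26 integer counters).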
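
For contrast, the sorting-based alternative that this post avoids can be sketched as follows. It keys each word by its sorted characters, which costs O(k log k) per word rather than O(k). The function name `group_by_sorted` is hypothetical.

```python
from collections import defaultdict


def group_by_sorted(strs: list[str]) -> list[list[str]]:
    """Baseline: key each word by its sorted characters, O(k log k) per word."""
    groups = defaultdict(list)
    for word in strs:
        key = "".join(sorted(word))  # "tea" -> "aet"
        groups[key].append(word)
    return list(groups.values())


print(group_by_sorted(["eat", "tea", "tan", "ate", "nat", "bat"]))
# [['eat', 'tea', 'ate'], ['tan', 'nat'], ['bat']]
```

Both approaches produce the same groups; they differ only in how the key is built. For short words the real-world gap may be small, so it is worth measuring rather than assuming.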

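📌 Why This Approach Is Optimal

  • Avoids sorting strings, so each word costs O(k) instead of O(k log k)
  • Uses constant-size (26-slot) hashing keys
  • Scales linearly with the total number of characters in the input
  • Commonly expected as the optimized answer in interviews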

🔗 Reference Implementation

You can find the complete solution on my GitHub repository:

👉 View Group Anagrams Solution on GitHub


📘 Final Thoughts

This problem highlights how choosing the right key representation can improve performance. Using a frequency-based hash key instead of sorting is an optimization technique that applies to many string problems.

If you’re preparing for interviews or revising hashing patterns, this solution is a must-remember.
