Group Anagrams in Python – Optimal Hashing Approach (LeetCode 49)
The Group Anagrams problem is a classic hashing question frequently asked in coding interviews. The goal is to group strings that are anagrams of each other, meaning they contain the same characters with the same frequencies but possibly in a different order.
In this post, I’ll explain an optimal Python solution using character frequency counting, which avoids sorting and achieves better performance.
๐ Problem Overview
You are given an array of strings. Your task is to group all strings that are anagrams of each other and return them as a list of groups.
Two strings are anagrams if:
- They contain the same characters
- Each character appears the same number of times
๐ก Key Intuition
Instead of sorting every string (which costs O(k log k) time),
we represent each word using a fixed-size frequency array of 26 characters.
Since the English alphabet has only 26 lowercase letters, the frequency representation becomes a constant-size signature. If two words generate the same signature, they are anagrams.
⚙️ Approach Explained
- Create a helper function that converts a word into a 26-length frequency tuple.
- Use this tuple as a key in a dictionary.
- Group words that share the same frequency signature.
- Return all grouped values.
✅ Python Implementation (Optimal Solution)
class Solution:
def tuple_repr(self, string: str) -> tuple:
a = ord("a")
r = [0] * 26
for i in string:
r[ord(i) - a] += 1
return tuple(r)
def groupAnagrams(self, strs):
d = {}
for i in strs:
a = self.tuple_repr(i)
if a in d:
d[a].append(i)
else:
d[a] = [i]
return list(d.values())
⏱️ Time and Space Complexity
- Time Complexity: O(n × k), where
nis the number of strings andkis the average length of a string. - Space Complexity: O(n) for storing grouped anagrams.
- The frequency array itself uses constant space (26 characters).
๐ Why This Approach Is Optimal
- Avoids sorting strings
- Uses constant-size hashing keys
- Scales efficiently for large inputs
- Preferred solution in interviews
๐ Reference Implementation
You can find the complete solution on my GitHub repository:
๐ View Group Anagrams Solution on GitHub
๐ Final Thoughts
This problem highlights how choosing the right data structure can drastically improve performance. Using a frequency-based hash instead of sorting is a powerful optimization technique that applies to many string problems.
If you’re preparing for interviews or revising hashing patterns, this solution is a must-remember.
Comments
Post a Comment
Share your views on this blog๐๐