Reducing Time Complexity with HashSet: A Practical Example
Discover how HashSets can drastically improve time complexity in real-world coding challenges.
In the world of programming, efficiency is key. When dealing with large datasets, reducing time complexity can make the difference between a responsive application and a sluggish one. HashSets, with their average constant-time lookups, offer an elegant solution to many computational problems. In this article, we’ll explore how HashSets can optimize a common problem and dive into a practical example that demonstrates their power.
Problem Overview
Imagine a scenario where you are tasked with analyzing people’s favorite companies. You’re given an array favoriteCompanies, where each sub-array represents a person’s list of favorite companies. The goal is to determine which people have lists that are not subsets of any other person’s list. (Source: "People Whose List of Favorite Companies Is Not a Subset of Another List", LeetCode.)
Problem Statement
Given a list favoriteCompanies, return the indices of people whose list of favorite companies is not a subset of any other list. The indices must be sorted in ascending order.
Example:
Input:
favoriteCompanies = [["leetcode", "google", "facebook"], ["google", "microsoft"], ["google", "facebook"], ["google"], ["amazon"]]
Output: [0, 1, 4]
Explanation:
- Person 2’s list ["google", "facebook"] is a subset of person 0’s list.
- Person 3’s list ["google"] is a subset of the lists of persons 0 and 1 (and of person 2 as well).
- No other list contains every company in the lists of persons 0, 1, and 4, so the answer is [0, 1, 4].
The Naive Approach
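The most straightforward solution checks each person’s list against every other list to see whether it is a subset. Because Contains on a list is a linear scan, a single check between two lists of size m can take O(m²) string comparisons. Here’s the implementation: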
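using System.Collections.Generic;
using System.Linq;

public class Solution
{
    // Returns true if list i is a subset of some other person's list.
    public bool IsSubset(int i, IList<IList<string>> favoriteCompanies)
    {
        for (int j = 0; j < favoriteCompanies.Count; j++)
        {
            // Contains on a list is a linear scan, so this test is O(m^2) per pair.
            if (i != j && favoriteCompanies[i].All(item => favoriteCompanies[j].Contains(item)))
                return true;
        }
        return false;
    }

    public IList<int> PeopleIndexes(IList<IList<string>> favoriteCompanies)
    {
        IList<int> res = new List<int>();
        for (int i = 0; i < favoriteCompanies.Count; i++)
        {
            // Indices are visited in order, so the result is already sorted.
            if (!IsSubset(i, favoriteCompanies))
                res.Add(i);
        }
        return res;
    }
}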
The Optimized Solution with HashSets
To improve efficiency, we can leverage HashSets, which offer average constant-time membership checks. The idea is simple:
- Convert each list of companies into a HashSet.
- Use HashSet operations like IsSupersetOf to quickly determine if one set is a superset of another.
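As a small illustration of these two steps, the sketch below builds a set from one list and asks whether it covers another. The names SupersetDemo and Covers are made up for this example; HashSet<T> and IsSupersetOf are the standard .NET APIs.

```csharp
using System.Collections.Generic;

public static class SupersetDemo
{
    // Returns true when every company in `inner` also appears in `outer`.
    public static bool Covers(IEnumerable<string> outer, IEnumerable<string> inner)
    {
        // Step 1: one pass over `outer` to build the set.
        var outerSet = new HashSet<string>(outer);

        // Step 2: IsSupersetOf looks up each element of `inner` in the set.
        return outerSet.IsSupersetOf(inner);
    }
}
```

With the example data, Covers(person 0’s list, person 2’s list) is true, while Covers(person 1’s list, person 2’s list) is false because "facebook" is missing from person 1’s list.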
Implementation
using System.Collections.Generic;

public class Solution
{
    // Returns true if set i is contained in some other person's set.
    public bool IsSubset(int i, List<HashSet<string>> hashSets)
    {
        for (int j = 0; j < hashSets.Count; j++)
        {
            // IsSupersetOf does one hash lookup per element of hashSets[i].
            if (i != j && hashSets[j].IsSupersetOf(hashSets[i]))
                return true;
        }
        return false;
    }

    public IList<int> PeopleIndexes(IList<IList<string>> favoriteCompanies)
    {
        // Preprocess: build one HashSet per person, indexed by position.
        var hashSets = new List<HashSet<string>>(favoriteCompanies.Count);
        foreach (var companies in favoriteCompanies)
            hashSets.Add(new HashSet<string>(companies));

        IList<int> res = new List<int>();
        for (int i = 0; i < favoriteCompanies.Count; i++)
        {
            if (!IsSubset(i, hashSets))
                res.Add(i);
        }
        return res;
    }
}
Key Optimizations
- Efficient Subset Checking: Instead of scanning a whole list for every element, IsSupersetOf performs one average constant-time hash lookup per element, so a check is linear in the size of the candidate subset rather than quadratic.
- Preprocessing: Converting each list to a HashSet once, upfront, means every subsequent comparison reuses the same sets.
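Conceptually, a superset check has to look at each element of the smaller set once. The sketch below is a hand-rolled stand-in that makes this cost visible by counting lookups. It is illustrative only and not the actual .NET implementation; SupersetCost and IsSupersetCounting are hypothetical names.

```csharp
using System.Collections.Generic;

public static class SupersetCost
{
    // Checks whether `outer` contains every element of `inner` and reports
    // how many hash lookups were needed to decide.
    public static bool IsSupersetCounting(HashSet<string> outer, HashSet<string> inner, out int lookups)
    {
        lookups = 0;

        // A larger set can never fit inside a smaller one.
        if (inner.Count > outer.Count) return false;

        foreach (var item in inner)
        {
            lookups++;
            if (!outer.Contains(item)) return false; // stop at the first miss
        }
        return true;
    }
}
```

For the example input, checking person 2’s set against person 0’s set succeeds after two lookups, one per company in the smaller set.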
Time Complexity
- Conversion to HashSets: O(n×m), where n is the number of lists and m is the average list size.
- Subset Checks: O(n²×m), since each of the O(n²) pairs needs up to m hash lookups.
This brings the overall complexity to O(n²×m + n×m) = O(n²×m), compared with O(n²×m²) for the naive approach, where every membership test is itself a linear scan.
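To get a feel for the gap, it helps to count element-level operations rather than whole-set comparisons: a check against a list scans up to m entries for each of m elements, while a check against a HashSet needs about one lookup per element. The block below is a rough worst-case estimate, assuming for illustration n = 100 lists of m = 500 companies each (these sizes are an assumption, not part of the problem statement above) and treating each hash lookup as one step.

```latex
\begin{align*}
\text{list-based check per pair:} \quad & m \cdot m = 500 \cdot 500 = 2.5 \times 10^{5} \\
\text{set-based check per pair:} \quad & m = 500 \\
\text{ordered pairs:} \quad & n(n-1) = 100 \cdot 99 = 9900 \\
\text{list-based total:} \quad & 9900 \cdot 2.5 \times 10^{5} \approx 2.5 \times 10^{9} \\
\text{set-based total:} \quad & 9900 \cdot 500 + n \cdot m = 4.95 \times 10^{6} + 5 \times 10^{4} = 5 \times 10^{6}
\end{align*}
```

Under these assumptions the set-based version does roughly 500 times less work, which is the factor of m that separates the two approaches.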
Practical Benefits of HashSets
- Speed: Each subset check drops from a nested scan of two lists to a single pass of hash lookups.
- Simplicity: Using HashSet operations like IsSupersetOf makes the code cleaner and easier to read.
- Scalability: The cost of a check grows linearly rather than quadratically with list size, so the optimized solution copes better with long lists.
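To show how compact the set-based version can become, here is a LINQ-style variant of the same idea. It is one possible phrasing rather than a recommended replacement, and the class name CompactSolution is hypothetical.

```csharp
using System.Collections.Generic;
using System.Linq;

public static class CompactSolution
{
    public static IList<int> PeopleIndexes(IList<IList<string>> favoriteCompanies)
    {
        // Build one set per person up front.
        var sets = favoriteCompanies.Select(list => new HashSet<string>(list)).ToList();

        // Keep index i only if no other person's set covers set i.
        // Enumerable.Range yields indices in ascending order, so no sort is needed.
        return Enumerable.Range(0, sets.Count)
            .Where(i => !sets.Where((other, j) => j != i)
                             .Any(other => other.IsSupersetOf(sets[i])))
            .ToList();
    }
}
```

The trade-off is readability: the explicit loops make the pairwise comparison more obvious, while this form keeps the whole algorithm in a single expression.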
Conclusion
HashSets are a powerful tool for optimizing subset-related problems. By converting lists into HashSets and leveraging their efficient operations, we cut the cost of each subset check from quadratic to linear in the list size. This example demonstrates the importance of choosing the right data structure for the task, a key principle for writing high-performance code.