Reducing Time Complexity with HashSet: A Practical Example

Discover how HashSets can drastically improve time complexity in real-world coding challenges.

Jan 5, 2025

In programming, efficiency matters. When dealing with large datasets, reducing time complexity can make the difference between a responsive application and a sluggish one. HashSets, with their average constant-time lookup, offer an elegant solution to many computational problems. In this article, we’ll walk through a practical example in which HashSets speed up a common subset-checking problem.

Problem Overview

Imagine a scenario where you are tasked with analyzing people’s favorite companies. You’re given an array favoriteCompanies, where each sub-array represents a person's list of favorite companies. The goal is to determine which people have lists that are not subsets of any other person’s list. This is the LeetCode problem "People Whose List of Favorite Companies Is Not a Subset of Another List".

Problem Statement

Given a list favoriteCompanies, return the indices of people whose list of favorite companies is not a subset of any other list. The indices must be sorted in ascending order.

Example:

Input:

favoriteCompanies = [["leetcode", "google", "facebook"], ["google", "microsoft"], ["google", "facebook"], ["google"], ["amazon"]]

Output: [0, 1, 4]

Explanation:

Person 2's list ["google", "facebook"] is a subset of person 0's list.
Person 3's list ["google"] is a subset of the lists of persons 0 and 1.
No other list contains the lists of persons 0, 1, or 4, so the answer is [0, 1, 4].

The Naive Approach

The most straightforward solution involves checking each person’s list against every other list to see if it’s a subset. Here’s the implementation:

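using System.Collections.Generic;
using System.Linq;

public class Solution
{
    // Returns true if person i's list is a subset of any other person's list.
    public bool IsSubset(int i, IList<IList<string>> favoriteCompanies)
    {
        for (int j = 0; j < favoriteCompanies.Count; j++)
        {
            // On a List<string> or array, Contains is a linear scan,
            // so each pair costs up to (size of list i) x (size of list j) comparisons.
            if (i != j && favoriteCompanies[i].All(item => favoriteCompanies[j].Contains(item)))
                return true;
        }
        return false;
    }

    public IList<int> PeopleIndexes(IList<IList<string>> favoriteCompanies)
    {
        IList<int> res = new List<int>();
        for (int i = 0; i < favoriteCompanies.Count; i++)
        {
            // Keep person i only if no other list contains all of their companies.
            if (!IsSubset(i, favoriteCompanies))
                res.Add(i);
        }
        // Indices are appended in increasing order, so res is already sorted.
        return res;
    }
}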

The Optimized Solution with HashSets

To improve efficiency, we can leverage HashSets, which check membership in constant time on average. The idea is simple:

Convert each list of companies into a HashSet.
Use HashSet operations like IsSupersetOf to quickly determine if one set is a superset of another.
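
As a minimal sketch of these two steps, the snippet below builds a set from one of the example lists and tests two other lists against it. The class name SetDemo and the helper IsContainedIn are made up for illustration; HashSet<T>.IsSupersetOf is the standard .NET method.

```csharp
using System;
using System.Collections.Generic;

public static class SetDemo
{
    // Hypothetical helper: true if every company in `candidate` also appears in `other`.
    public static bool IsContainedIn(IEnumerable<string> candidate, IEnumerable<string> other)
    {
        // Step 1: convert the list we search in into a HashSet.
        var otherSet = new HashSet<string>(other);
        // Step 2: ask the set whether it covers every element of the candidate.
        return otherSet.IsSupersetOf(candidate);
    }

    public static void Main()
    {
        var person0 = new[] { "leetcode", "google", "facebook" };
        var person2 = new[] { "google", "facebook" };
        var person4 = new[] { "amazon" };

        Console.WriteLine(IsContainedIn(person2, person0)); // True
        Console.WriteLine(IsContainedIn(person4, person0)); // False
    }
}
```

In a real solution the set for each person would be built once and reused, rather than rebuilt on every call as this small demo does.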

Implementation

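using System.Collections.Generic;

public class Solution
{
    // Returns true if some other person's set contains every company in person i's set.
    public bool IsSubset(int i, Dictionary<int, HashSet<string>> hashSets)
    {
        for (int j = 0; j < hashSets.Count; j++)
        {
            if (i != j && hashSets[j].IsSupersetOf(hashSets[i]))
                return true;
        }
        return false;
    }

    public IList<int> PeopleIndexes(IList<IList<string>> favoriteCompanies)
    {
        // Preprocess: build one HashSet per person, keyed by the person's index.
        var hashSets = new Dictionary<int, HashSet<string>>();
        for (int i = 0; i < favoriteCompanies.Count; i++)
        {
            hashSets[i] = new HashSet<string>(favoriteCompanies[i]);
        }

        IList<int> res = new List<int>();
        for (int i = 0; i < favoriteCompanies.Count; i++)
        {
            // Keep person i only if no other set is a superset of theirs.
            if (!IsSubset(i, hashSets))
                res.Add(i);
        }
        // Indices are appended in increasing order, so res is already sorted.
        return res;
    }
}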
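
To make the cost of a single IsSupersetOf call easier to reason about, here is a hand-written approximation of what such a check has to do. This is only a sketch: SupersetSketch and ContainsAll are hypothetical names, and the actual .NET implementation may differ in its details.

```csharp
using System.Collections.Generic;

public static class SupersetSketch
{
    // Illustrative stand-in for big.IsSupersetOf(small); not the .NET source.
    public static bool ContainsAll(HashSet<string> big, HashSet<string> small)
    {
        // A set with fewer elements can never contain a set with more, so bail out early.
        if (small.Count > big.Count) return false;

        // One hash lookup per element of `small`: O(|small|) lookups, each O(1) on average.
        foreach (var company in small)
        {
            if (!big.Contains(company)) return false;
        }
        return true;
    }
}
```

The loop shows that the check is linear in the size of the smaller set: the saving over the naive version comes from each Contains call being a hash lookup instead of a scan.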

Key Optimizations

Efficient Subset Checking: IsSupersetOf still visits every element of the candidate subset, but each membership test is a hash lookup that takes O(1) on average instead of a linear scan. One subset check therefore costs O(m) rather than O(m²).
Preprocessing: Converting each list to a HashSet upfront means every later comparison works on sets, at a one-time cost of O(n×m).

Time Complexity

Conversion to HashSets: O(n×m), where n is the number of lists and m is the average list size.
Subset Checks: O(n²×m), since each of the O(n²) pairs needs up to m hash lookups.

This brings the overall complexity to O(n×m + n²×m) = O(n²×m), down from roughly O(n²×m²) for the naive approach, where every Contains call scans a list.
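
To get a feel for the gap, here is a rough count of membership tests under assumed sizes. The values n = 100 and m = 500 are illustrative assumptions, not measurements from this article. The count treats each list scan as m comparisons and each hash lookup as one step, and it ignores constant factors and the cost of hashing the strings.

```latex
\begin{aligned}
\text{naive (list scans):} \quad & n^2 \cdot m^2 = 100^2 \cdot 500^2 = 2.5 \times 10^{9} \\
\text{HashSet:} \quad & n \cdot m + n^2 \cdot m = 5 \times 10^{4} + 5 \times 10^{6} \approx 5 \times 10^{6}
\end{aligned}
```

Under these assumptions the HashSet version does about m = 500 times less work in the worst case.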

Practical Benefits of HashSets

Speed: HashSets cut each subset check from a nested scan to a single pass of hash lookups that are constant-time on average.
Simplicity: Using HashSet operations like IsSupersetOf makes the code cleaner and easier to read.
Scalability: The saving grows with list size, so the optimized solution copes better as the lists get longer.

Conclusion

HashSets are a powerful tool for optimizing subset-related problems. By converting lists into HashSets and leveraging their efficient operations, we replaced the nested list scans with hash lookups and made each subset check considerably cheaper. This example demonstrates the importance of choosing the right data structure for the task, a key principle for writing high-performance code.