A Quick Guide to Implementing and Using Sets Efficiently

Python has had sets for a while, but they aren't used as much as lists. Nevertheless, they can be very beneficial in certain cases.

What Are Sets?

A set is a collection of unique objects.

A basic use case is removing duplication.

test_set = ['spam', 'spam', 'eggs', 'spam', 'bacon', 'eggs']
print(set(test_set))
# {'eggs', 'spam', 'bacon'}
print(list(set(l)))
#['eggs', 'spam', 'bacon']

There are two types of sets in Python:

The set Type itself is not hashable, however, the set elements must be hashable to enforce uniqueness. The frozenset Type is hashable since it's immutable by default.

In addition to this, Set Type implements many operations as Infix Operators, for example, given two sets a and b:

# Defining the two sets
first_set = {1, 5, 7, 4, 5}
second_set = {4, 5, 6, 7, 8}
# Creating the intersection of the two sets
intersect_set = first_set & second_set
print(intersect_set) # Output: {4, 5}
# Creating the union of the two sets
union_set = first_set | second_set
print(union_set) # Output: {1, 2, 3, 4, 5, 6, 7, 8}
# Creating the difference of the two sets
diff_set = first_set - second_set
print(diff_set) # Output: {1, 2, 3}

How do I create a Set?

Python provides different ways to create or instantiate set objects, the first is through Literal Syntax and the second is Set Comprehension, and this comes as no surprise since a set is a Python sequence.

Set Literals

The standard string representation of sets always uses the {…} notation.

The syntax is quite simple with the one caveat, there is no way to create an empty Set, so we must call the constructor directly.

not_an_empty_set = {1, 5, 7, 4, 5} # Set Literal
empty_sest = set() # Empty Set

The syntax {1, 2, 3} is quicker and easier to read than writing out set([1, 2, 3]). It's slower to evaluate the latter form since Python has to check the set name, build a list, and afterward pass it to the constructor.

Set Comprehension

SetComp is similar to listComp and dictComp, it’s the creation of a set based on existing iterables.

new_set = {s for s in [1, 2, 1, 0]}
print(new_set) 
# set([0, 1, 2])

Practical Consequences of Using Sets

The set Type and frozenset Type are implemented with a hash table, and this can have certain effects:

Further Reading


Also published here.