Python Set Data Type

Introduction

Lists and tuples both allow duplicates and preserve insertion order; the only major difference between them is mutability. The next collection data type to explore is designed for a completely different kind of requirement. In many real-world scenarios, the problem is not about storing values in a specific order or allowing repeated elements, but about ensuring that only unique values are kept, while the sequence in which they appear does not matter at all.

This is precisely the situation where the set data structure becomes not only useful but also the most appropriate choice, because it is specifically designed to handle collections where duplicates must be eliminated automatically and ordering is irrelevant.

Understanding the Need for Set

If you observe carefully, lists and tuples are ideal when:

  • Order is important
  • Duplicates are allowed

However, there are many situations where:

  • You do not care about the order of elements
  • You strictly want only unique values

For example, suppose you want to send a notification or SMS to a group of users, and your data source may contain duplicate phone numbers because of data entry issues or system merges. Sending multiple messages to the same number would be wasteful and undesirable, so the requirement naturally shifts toward keeping only distinct values, without worrying about order.

In such scenarios, using a list or tuple would require additional logic to remove duplicates, whereas a set inherently guarantees uniqueness, making it a more elegant and efficient solution.
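The notification scenario above can be sketched roughly as follows. The phone numbers and variable names are made up for illustration, and set() is used here to build a set from an existing list.

```python
# Hypothetical phone numbers; the repeats mimic a merged data source.
phone_numbers = ["555-0101", "555-0102", "555-0101", "555-0103", "555-0102"]

# With a list, removing duplicates needs explicit logic.
unique_from_list = []
for number in phone_numbers:
    if number not in unique_from_list:
        unique_from_list.append(number)

# With a set, duplicates are dropped while the set is being built.
unique_numbers = set(phone_numbers)

print(len(phone_numbers))     # 5
print(len(unique_from_list))  # 3
print(len(unique_numbers))    # 3
```

The list version keeps the original order of first appearance, while the set version does not promise any order.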

What is a Set?

A set in Python is a collection data type that stores multiple values as a single entity, but with two defining characteristics:

  • Duplicates are not allowed
  • Order is not preserved

This means that when you insert elements into a set, Python automatically removes duplicates and does not guarantee any specific order when displaying or iterating over the elements.

Mathematical Intuition Behind Set

The concept of sets in Python closely resembles the concept of sets in mathematics, where the order of elements is irrelevant and only the presence of elements matters.

For instance, in mathematics:

  • {1, 2, 3}
  • {2, 3, 1}
  • {3, 1, 2}

All of these are considered equal because they contain the same elements, regardless of their arrangement.

Similarly, in Python, sets follow this principle, and therefore asking questions like “what is the first element” or “what is the last element” does not make sense, because sets are inherently unordered collections.
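A small check of this equality rule, as a sketch with arbitrary variable names:

```python
a = {1, 2, 3}
b = {2, 3, 1}
c = {3, 1, 2}

# Sets compare equal when they contain the same elements.
print(a == b == c)  # True

# Lists with the same elements in a different order are not equal.
print([1, 2, 3] == [2, 3, 1])  # False
```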

Creating a Set

Sets are typically written using curly braces {}, with the elements separated by commas.

s = {10, 20, 30, 40}

To confirm the type:

print(type(s)) # <class 'set'>

Behavior of Duplicates in Set

One of the most important characteristics of a set is that it automatically removes duplicate elements, even if they are explicitly provided.

s = {10, 20, 10, "James", 20, 30, 40}
print(s)

Even though 10 and 20 appear multiple times, the set will store them only once.

One possible output (order not guaranteed) is {40, 10, 'James', 20, 30}.

Here, two key observations must be made:

  • Duplicate values are removed automatically
  • The order of elements is not preserved
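Because the printed order can vary, one way to confirm both observations is to check the length and compare against another set, as in this short sketch:

```python
s = {10, 20, 10, "James", 20, 30, 40}

# Seven values were written, but only five distinct ones are stored.
print(len(s))  # 5

# Equality ignores order, so this comparison does not depend on the printout.
print(s == {10, 20, 30, 40, "James"})  # True
```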

Order is Not Guaranteed

Unlike lists and tuples, sets do not maintain insertion order in a predictable way, and therefore the output may vary between executions or environments.

This behavior means that you should never write logic that depends on the position of elements within a set, because the concept of indexing does not apply here at all.
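When a predictable order is needed, for display or testing for example, one common approach is to build a sorted list from the set. This is a minimal sketch and assumes the elements can be compared with each other (here, all integers):

```python
s = {40, 10, 30, 20}

# sorted() returns a new list; the set itself is unchanged.
ordered = sorted(s)

print(ordered)     # [10, 20, 30, 40]
print(ordered[0])  # 10
```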

Indexing and Slicing Are Not Supported

Since sets are unordered, operations like indexing and slicing—which rely on positional access—are not applicable.

s = {10, 20, 30}

print(s[0]) # TypeError
print(s[1:3]) # TypeError (run separately; the first error stops the script)

These operations will result in errors such as TypeError: 'set' object is not subscriptable.

This clearly indicates that sets are designed for membership and uniqueness, not for positional access.
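The following sketch catches the error instead of letting the program stop, and then shows the membership test that sets are meant for. The variable name message is only for illustration.

```python
s = {10, 20, 30}

try:
    s[0]
except TypeError as error:
    message = str(error)
    print(message)  # 'set' object is not subscriptable

# Membership testing with the in operator is the supported way to query a set.
print(20 in s)  # True
print(99 in s)  # False
```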

Heterogeneous Data in Set

Just like lists and tuples, sets can also store elements of different data types, which allows flexibility in representing mixed collections.

s = {10, "Hello", True, 10.5}

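This confirms that sets are not restricted to a single data type. The one requirement is that every element must be hashable, so mutable objects such as lists cannot be stored in a set.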
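Two details are worth checking when mixing types, sketched below. Values that compare equal, such as 1, True, and 1.0, count as the same element, and unhashable values such as lists are rejected. The variable names are arbitrary.

```python
# 1, True, and 1.0 are equal to each other, so only one of them is kept.
mixed = {1, True, 1.0, "1"}
print(len(mixed))  # 2

# A list is mutable and unhashable, so it cannot be a set element.
try:
    bad = {[1, 2], 3}
except TypeError:
    print("a list cannot be stored in a set")

# A tuple of hashable values is fine.
ok = {(1, 2), 3}
print((1, 2) in ok)  # True
```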

Mutability of Set

Unlike tuples, sets are mutable, which means that once a set is created, we can modify its contents by adding or removing elements.

Adding Elements to a Set

To add elements, we use the add() method.

s = {10, 20, 30, 40}
s.add(50)

print(s)

Here, the new element 50 is added, but its position is not guaranteed because sets do not preserve order.
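Adding a value that is already present leaves the set unchanged, which follows from the uniqueness rule. A short sketch:

```python
s = {10, 20, 30, 40}

s.add(50)  # new element, the set grows
s.add(10)  # already present, nothing changes

print(len(s))  # 5
print(s == {10, 20, 30, 40, 50})  # True
```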

Removing Elements from a Set

s.remove(30)
print(s)

The remove() method deletes the element 30 from the set. If the element is not present, remove() raises a KeyError.
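The sketch below contrasts remove() with discard(), a related set method that stays silent when the element is missing:

```python
s = {10, 20, 30, 40}

s.remove(30)  # 30 is present, so it is removed

try:
    s.remove(99)  # 99 is not present
except KeyError:
    print("99 is not in the set")

s.discard(99)  # no error, even though 99 is missing

print(s == {10, 20, 40})  # True
```

Which one to use depends on whether a missing element should be treated as a mistake.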

Why “add” Instead of “append”?

This is a subtle but important design decision.

In a list, elements are added at the end, so the term append makes sense. In a set, there is no concept of “end”, and the element can appear anywhere internally. Therefore, Python uses the term add instead of append, because it reflects the absence of positional guarantees.

Important Concept: Empty Set

One of the most confusing and commonly misunderstood aspects of sets is how to create an empty set.

If you write:

s = {}

This does not create a set. Instead, it creates an empty dictionary.

print(type(s)) # <class 'dict'>

Correct Way to Create Empty Set

s = set()
print(type(s)) # <class 'set'>
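A typical pattern is to start from an empty set and fill it later, or to pass an existing collection to set(). A minimal sketch with illustrative names:

```python
empty = set()
print(len(empty))  # 0

empty.add(10)
print(empty)  # {10}

# set() also accepts an iterable and keeps only the distinct values.
from_list = set([10, 20, 10])
print(len(from_list))  # 2

# {} is a dict, so it is not equal to an empty set.
print({} == set())  # False
```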

Set is Growable and Mutable

Sets allow:

  • Adding new elements
  • Removing existing elements

This means sets are:

  • Mutable
  • Growable

However, despite being mutable, they still enforce uniqueness and do not maintain order.

Difference Between List and Set

Let us consolidate the key differences in a conceptual manner.

A list is suitable when order matters and duplicates are allowed, which makes it ideal for scenarios like user inputs, logs, or sequences where position is meaningful.

A set, on the other hand, is ideal when uniqueness is the priority and ordering is irrelevant, which makes it perfect for operations like removing duplicates, membership testing, and mathematical set operations.
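The three uses named above can be sketched as follows. The data and variable names are invented for illustration.

```python
# Removing duplicates from a list.
readings = [3, 1, 3, 2, 1]
distinct = set(readings)
print(distinct == {1, 2, 3})  # True

# Membership testing.
print(2 in distinct)  # True
print(7 in distinct)  # False

# Mathematical set operations.
evens = {2, 4, 6, 8}
small = {1, 2, 3, 4}

union = evens | small         # elements in either set
intersection = evens & small  # elements in both sets
difference = evens - small    # elements in evens but not in small

print(union == {1, 2, 3, 4, 6, 8})  # True
print(intersection == {2, 4})       # True
print(difference == {6, 8})         # True
```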