# 6.2. Sets and Relations

## 6.2.1. Set Notation

The concept of a set in the mathematical sense has wide application in computer science. The notations and techniques of set theory are commonly used when describing and implementing algorithms because the abstractions associated with sets often help to clarify and simplify algorithm design.

A *set* is a collection of distinguishable
*members* or *elements*.
The members are typically drawn from some larger population known as
the *base type*.
Each member of a set is either a *primitive element* of the
base type or is a set itself.
There is no concept of duplication in a set.
Each value from the base type is either in the set or not in the set.
For example, a set named \(\mathbf{P}\) might consist of the three
integers 7, 11, and 42.
In this case, \(\mathbf{P}\)'s members are 7, 11, and 42, and the
base type is integer.

The following table shows the symbols commonly used to express sets and their relationships.

Table 6.2.1

Here are some examples of this notation in use. First define two sets, \(\mathbf{P}\) and \(\mathbf{Q}\).

\[\mathbf{P} = \{2, 3, 5\}, \quad \mathbf{Q} = \{5, 10\}.\]

\(|\mathbf{P}| = 3\) (because \(\mathbf{P}\) has three members) and \(|\mathbf{Q}| = 2\) (because \(\mathbf{Q}\) has two members). Both of these sets are finite in size. Other sets can be infinite, for example, the set of integers.

The union of \(\mathbf{P}\) and \(\mathbf{Q}\), written \(\mathbf{P} \cup \mathbf{Q}\), is the set of elements in either \(\mathbf{P}\) or \(\mathbf{Q}\), which is {2, 3, 5, 10}. The intersection of \(\mathbf{P}\) and \(\mathbf{Q}\), written \(\mathbf{P} \cap \mathbf{Q}\), is the set of elements that appear in both \(\mathbf{P}\) and \(\mathbf{Q}\), which is {5}. The set difference of \(\mathbf{P}\) and \(\mathbf{Q}\), written \(\mathbf{P} - \mathbf{Q}\), is the set of elements that occur in \(\mathbf{P}\) but not in \(\mathbf{Q}\), which is {2, 3}. Note that \(\mathbf{P} \cup \mathbf{Q} = \mathbf{Q} \cup \mathbf{P}\) and that \(\mathbf{P} \cap \mathbf{Q} = \mathbf{Q} \cap \mathbf{P}\), but in general \(\mathbf{P} - \mathbf{Q} \neq \mathbf{Q} - \mathbf{P}\). In this example, \(\mathbf{Q} - \mathbf{P} = \{10\}\). Finally, the set {5, 3, 2} is indistinguishable from set \(\mathbf{P}\), because sets have no concept of order. Likewise, set {2, 3, 2, 5} is also indistinguishable from \(\mathbf{P}\), because sets have no concept of duplicate elements.
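The operations above map directly onto Python's built-in `set` type. The following is a minimal sketch, taking \(\mathbf{P} = \{2, 3, 5\}\) and \(\mathbf{Q} = \{5, 10\}\) (the values implied by the results quoted above); the variable names are chosen here for illustration only.

```python
# Example sets: P = {2, 3, 5} and Q = {5, 10}.
P = {2, 3, 5}
Q = {5, 10}

size_p = len(P)           # |P|
union = P | Q             # P ∪ Q
intersection = P & Q      # P ∩ Q
p_minus_q = P - Q         # P - Q
q_minus_p = Q - P         # Q - P

# Neither order nor repetition affects set equality.
same_by_order = {5, 3, 2} == P
same_by_duplicates = {2, 3, 2, 5} == P
```

Union and intersection give the same result with the operands swapped, while `P - Q` and `Q - P` differ, matching the discussion above.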

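The *set product* or *Cartesian product* of two sets, written
\(\mathbf{Q} \times \mathbf{P}\), is the set of all ordered pairs
whose first element comes from \(\mathbf{Q}\) and whose second element
comes from \(\mathbf{P}\).
For our example sets, the set product would be

\[\{(5, 2), (5, 3), (5, 5), (10, 2), (10, 3), (10, 5)\}.\]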
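One way to enumerate a set product is with `itertools.product` from the Python standard library. This is a small sketch, again assuming \(\mathbf{P} = \{2, 3, 5\}\) and \(\mathbf{Q} = \{5, 10\}\); the name `q_times_p` is illustrative.

```python
from itertools import product

P = {2, 3, 5}
Q = {5, 10}

# Q x P: every ordered pair (q, p) with q drawn from Q and p drawn from P.
q_times_p = set(product(Q, P))

# Because the pairs are ordered, P x Q is a different set.
p_times_q = set(product(P, Q))
```

The product has \(|\mathbf{Q}| \cdot |\mathbf{P}| = 6\) pairs.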

The *powerset* of a set \(\mathbf{S}\) (denoted \(2^{\mathbf{S}}\))
is the set of all possible subsets of \(\mathbf{S}\).
Consider the set \(\mathbf{S} = \{ a, b, c \}\).
The powerset of \(\mathbf{S}\) is

\[\{\emptyset, \{a\}, \{b\}, \{c\}, \{a, b\}, \{a, c\}, \{b, c\}, \{a, b, c\}\}.\]
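A powerset can be generated by collecting the subsets of every size from 0 up to \(|\mathbf{S}|\). The helper `powerset` below is a hypothetical name used for this sketch; it returns each subset as a `frozenset` because Python sets cannot contain mutable sets.

```python
from itertools import chain, combinations

def powerset(s):
    """Return the set of all subsets of s, each as a frozenset."""
    items = list(s)
    # combinations(items, k) yields every k-element subset exactly once.
    subsets = chain.from_iterable(
        combinations(items, k) for k in range(len(items) + 1)
    )
    return {frozenset(subset) for subset in subsets}

S = {"a", "b", "c"}
subsets_of_s = powerset(S)
```

A set with \(n\) members has \(2^n\) subsets, which is one motivation for the \(2^{\mathbf{S}}\) notation; here that gives 8 subsets, including the empty set and \(\mathbf{S}\) itself.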

A collection of elements with no order (like a set), but which may
contain duplicate-valued elements, is called a
*bag* [1].
To distinguish bags from sets, we will use square brackets [] around
a bag's elements.
For example, bag [3, 4, 5, 4] is distinct from bag [3, 4, 5],
while set {3, 4, 5, 4} is indistinguishable from set
{3, 4, 5}.
However, bag [3, 4, 5, 4] is indistinguishable from bag
[3, 4, 4, 5].

A *sequence* is a collection of elements that has an order and
may contain duplicate-valued elements.
A sequence is also sometimes called a *tuple* or a
*vector*.
In a sequence, there is a 0th element, a 1st element, a 2nd element, and
so on.
We will use angle brackets \(\langle\rangle\) to enclose the
elements of a sequence.
For example, \(\langle3, 4, 5, 4\rangle\) is a sequence.
Note that sequence \(\langle3, 5, 4, 4\rangle\) is distinct from
sequence \(\langle3, 4, 5, 4\rangle\), and both are distinct from
sequence \(\langle3, 4, 5\rangle\).
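The distinction between sets, bags, and sequences can be seen by comparing three Python types. This sketch assumes `collections.Counter` as a stand-in for a bag (it records how many times each value occurs) and a `tuple` as a stand-in for a sequence; neither choice comes from the text.

```python
from collections import Counter

# Sets: duplicates and order are both ignored.
sets_equal = {3, 4, 5, 4} == {3, 4, 5}

# Bags: duplicates count, order does not.
bag_a = Counter([3, 4, 5, 4])
bag_b = Counter([3, 4, 5])
bag_c = Counter([3, 4, 4, 5])

# Sequences: both duplicates and order count.
seq_a = (3, 4, 5, 4)
seq_b = (3, 5, 4, 4)
seq_c = (3, 4, 5)
zeroth = seq_a[0]  # positions are numbered from 0
```

Here `bag_a` equals `bag_c` but not `bag_b`, while all three sequences are distinct.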

[1] The object referred to here as a bag is sometimes called a
multilist.
But the term multilist also refers to a list that may contain
sublists.

### 6.2.1.1. Relations

A *relation* \(R\) over set \(\mathbf{S}\) is a set of
ordered pairs from \(\mathbf{S}\).
As an example of a relation, if \(\mathbf{S}\) is
\(\{a, b, c\}\), then

is a relation, and

is a different relation. If tuple \(\langle x, y\rangle\) is in relation \(R\), we may use the infix notation \(xRy\). We often use relations such as the less than operator (\(<\)) on the natural numbers, which includes ordered pairs such as \(\langle1, 3\rangle\) and \(\langle2, 23\rangle\), but not \(\langle3, 2\rangle\) or \(\langle2, 2\rangle\). Rather than writing the relationship in terms of ordered pairs, we typically use an infix notation for such relations, writing \(1<3\).

Define the properties of relations as follows, with \(R\) a binary relation over set \(\mathbf{S}\).

- \(R\) is *reflexive* if \(aRa\) for all \(a \in \mathbf{S}\).
- \(R\) is *irreflexive* if \(aRa\) is not true for any \(a \in \mathbf{S}\).
- \(R\) is *symmetric* if whenever \(aRb\), then \(bRa\), for all \(a, b \in \mathbf{S}\).
- \(R\) is *antisymmetric* if whenever \(aRb\) and \(bRa\), then \(a = b\), for all \(a, b \in \mathbf{S}\).
- \(R\) is *transitive* if whenever \(aRb\) and \(bRc\), then \(aRc\), for all \(a, b, c \in \mathbf{S}\).

As examples, for the natural numbers, \(<\) is irreflexive (because \(aRa\) is never true), antisymmetric (because there is no case where \(aRb\) and \(bRa\)), and transitive. Relation \(\leq\) is reflexive, antisymmetric, and transitive. Relation \(=\) is reflexive, symmetric (and antisymmetric!), and transitive. For people, the relation "is a sibling of" is symmetric and transitive. If we define a person to be a sibling of themself, then it is reflexive; if we define a person not to be a sibling of themself, then it is not reflexive.
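For a finite base set, these properties can be checked mechanically by representing a relation as a Python set of ordered pairs. The sketch below is illustrative only: the function names are invented here, and `range(6)` is used as a small finite stand-in for the natural numbers, so it demonstrates the definitions rather than proving anything about the infinite case.

```python
def is_reflexive(relation, base):
    return all((a, a) in relation for a in base)

def is_irreflexive(relation, base):
    return all((a, a) not in relation for a in base)

def is_symmetric(relation):
    return all((b, a) in relation for (a, b) in relation)

def is_antisymmetric(relation):
    # Whenever both (a, b) and (b, a) are present, a must equal b.
    return all(a == b for (a, b) in relation if (b, a) in relation)

def is_transitive(relation):
    # Whenever (a, b) and (b, d) are present, (a, d) must be too.
    return all(
        (a, d) in relation
        for (a, b) in relation
        for (c, d) in relation
        if b == c
    )

base = range(6)  # finite stand-in for the natural numbers
less_than = {(a, b) for a in base for b in base if a < b}
less_equal = {(a, b) for a in base for b in base if a <= b}
equal = {(a, b) for a in base for b in base if a == b}
```

On this base set, `less_than` passes the irreflexive, antisymmetric, and transitive checks; `less_equal` passes reflexive, antisymmetric, and transitive; and `equal` passes reflexive, symmetric, antisymmetric, and transitive.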

## 6.2.2. Equivalence Relations

\(R\) is an *equivalence relation* on set \(\mathbf{S}\)
if it is reflexive, symmetric, and transitive.
An equivalence relation can be used to partition a set into
*equivalence classes*.
If two elements \(a\) and \(b\) are equivalent to each other,
we write \(a \equiv b\).
A *partition* of a set \(\mathbf{S}\) is a collection of
subsets that are *disjoint* from each other and whose union is
\(\mathbf{S}\).
An *equivalence relation* on set \(\mathbf{S}\) partitions
the set into disjoint subsets whose elements are equivalent.
The *UNION/FIND* algorithm efficiently
maintains equivalence classes on a set.
One application for such *disjoint sets* is
computing a *minimal cost spanning tree*.
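To give a rough sense of how equivalence classes might be maintained, here is a minimal union/find sketch over the elements \(0, \ldots, n-1\). The class name `DisjointSets` and its methods are hypothetical, and the path-halving step is one common optimization, not necessarily the variant the text has in mind.

```python
class DisjointSets:
    """Minimal union/find sketch over the elements 0..n-1."""

    def __init__(self, n):
        # Each element starts as the root of its own class.
        self.parent = list(range(n))

    def find(self, x):
        """Return the representative (root) of the class containing x."""
        while self.parent[x] != x:
            # Path halving: point x at its grandparent while walking up.
            self.parent[x] = self.parent[self.parent[x]]
            x = self.parent[x]
        return x

    def union(self, a, b):
        """Merge the classes containing a and b."""
        root_a, root_b = self.find(a), self.find(b)
        if root_a != root_b:
            self.parent[root_b] = root_a

    def equivalent(self, a, b):
        """Two elements are equivalent if they share a representative."""
        return self.find(a) == self.find(b)

sets = DisjointSets(6)
sets.union(0, 1)
sets.union(1, 2)
sets.union(4, 5)
```

After these calls the classes are \(\{0, 1, 2\}\), \(\{3\}\), and \(\{4, 5\}\). Merging 0 with 1 and 1 with 2 makes 0 and 2 equivalent, which reflects transitivity.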

Example 6.2.1

For the integers, \(=\) is an equivalence relation that partitions each element into a distinct subset. In other words, for any integers \(a\), \(b\), and \(c\), three things are true.

- \(a = a\),
- if \(a = b\) then \(b = a\), and
- if \(a = b\) and \(b = c\), then \(a = c\).

Of course, for distinct integers \(a\), \(b\), and \(c\) there are never cases where \(a = b\), \(b = a\), or \(b = c\). So the requirements for symmetry and transitivity are never violated, and therefore the relation is symmetric and transitive.

Example 6.2.2

If we clarify the definition of sibling to mean that a person is a sibling of themself, then the sibling relation is an equivalence relation that partitions the set of people.

Example 6.2.3

We can use the *modulus* function
to define an equivalence relation.
For the set of integers, use the modulus function
to define a binary relation such that two numbers
\(x\) and \(y\) are in the relation if and only if
\(x \bmod m = y \bmod m\).
Thus, for \(m = 4\), \(\langle1, 5\rangle\) is in the
relation because \(1 \bmod 4 = 5 \bmod 4\).
We see that modulus used in this way defines an equivalence
relation on the integers, and this relation can be used to
partition the integers into \(m\) equivalence classes.
This relation is an equivalence relation because

- \(x \bmod m = x \bmod m\) for all \(x\);
- if \(x \bmod m = y \bmod m\), then \(y \bmod m = x \bmod m\); and
- if \(x \bmod m = y \bmod m\) and \(y \bmod m = z \bmod m\), then \(x \bmod m = z \bmod m\).
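The partition induced by this relation can be sketched by grouping a finite range of integers by remainder. The function `equivalence_classes` is a hypothetical helper. It relies on the fact that Python's `%` operator returns a result in \(0, \ldots, m-1\) for positive \(m\), even when \(x\) is negative; other languages may differ on this.

```python
from collections import defaultdict

def equivalence_classes(values, m):
    """Group values so that x and y share a class iff x % m == y % m."""
    classes = defaultdict(list)
    for x in values:
        classes[x % m].append(x)
    return dict(classes)

# A finite window onto the integers, partitioned for m = 4.
classes = equivalence_classes(range(-4, 8), 4)
```

With \(m = 4\) there are four classes, and 1 and 5 land in the same one, consistent with \(\langle1, 5\rangle\) being in the relation.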

## 6.2.3. Partial Orders

A binary relation is called a
*partial order* if it is antisymmetric and transitive.
If the relation is reflexive, it is called a
*non-strict partial order*.
If the relation is *irreflexive*, it is called a
*strict partial order*.
The set on which the partial order is defined is called a
*partially ordered set* or a *poset*.
Elements \(x\) and \(y\) of a set are *comparable* under
a given relation \(R\) if either \(xRy\) or \(yRx\).
If every pair of distinct elements in a partial order is comparable,
then the order is called a *total order* or *linear order*.

Example 6.2.4

For the integers, relations \(<\) and \(\leq\) define partial orders. Relation \(<\) is a total order because, for every pair of integers \(x\) and \(y\) such that \(x \neq y\), either \(x < y\) or \(y < x\). Likewise, \(\leq\) is a total order because, for every pair of integers \(x\) and \(y\) such that \(x \neq y\), either \(x \leq y\) or \(y \leq x\).

Example 6.2.5

For the powerset of the integers, the subset operator defines a partial order (because it is antisymmetric and transitive). For example, \(\{1, 2\}\subseteq\{1, 2, 3\}\). However, sets {1, 2} and {1, 3} are not comparable by the subset operator, because neither is a subset of the other. Therefore, the subset operator does not define a total order on the powerset of the integers.
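Comparability can be tested directly on small finite examples. The sketch below uses hypothetical helpers `comparable` and `is_total`, a short range of integers in place of all the integers, and the powerset of \(\{1, 2, 3\}\) in place of the powerset of the integers.

```python
from itertools import combinations

def comparable(x, y, related):
    """x and y are comparable if the relation holds in either direction."""
    return related(x, y) or related(y, x)

def is_total(elements, related):
    """Check that every pair of distinct elements is comparable."""
    return all(
        comparable(x, y, related)
        for x, y in combinations(elements, 2)
    )

def less_than(x, y):
    return x < y

def subset(x, y):
    return x <= y  # for frozensets, <= is the subset test

integers = list(range(-3, 4))
base = [1, 2, 3]
subsets = [
    frozenset(c)
    for k in range(len(base) + 1)
    for c in combinations(base, k)
]
```

On these inputs `is_total(integers, less_than)` holds, while `is_total(subsets, subset)` fails because pairs such as \(\{1, 2\}\) and \(\{1, 3\}\) are not comparable.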