2.1. Set Notation
2.1.1. Introduction to Sets
The concept of a set in the mathematical sense is widely used in computer science. The notations and techniques of set theory are commonly used when describing and implementing algorithms because the abstractions associated with sets often help to clarify and simplify algorithm design.
A set is a collection of distinguishable members or elements. The members are typically drawn from some larger population known as the base type. Each member of a set is either a primitive element of the base type or is a set itself. There is no concept of duplication in a set. Each value from the base type is either in the set or not in the set. For example, a set named \(\mathbf{P}\) might consist of the three integers 7, 11, and 42. In this case, \(\mathbf{P}\)’s members are 7, 11, and 42, and the base type is integer.
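As a minimal sketch of these ideas, the example set \(\mathbf{P}\) can be modeled with Python's built-in `set` type. The choice of Python and of the built-in type is an assumption made for illustration; the text itself does not prescribe an implementation.

```python
# The example set P from the text, modeled with Python's built-in set
# (an illustrative stand-in, not an implementation the text specifies).
P = {7, 11, 42}

print(7 in P)   # True: 7 is a member of P
print(8 in P)   # False: 8 is in the base type (integer) but not in P

# Adding a value that is already a member changes nothing,
# because a set has no concept of duplication.
P.add(11)
print(len(P))   # 3
```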
The following table shows the symbols commonly used to express sets and their relationships.
Table 2.1.1
Here are some examples of this notation in use. First define two sets, \(\mathbf{P}\) and \(\mathbf{Q}\).
\[\mathbf{P} = \{2, 3, 5\}, \quad \mathbf{Q} = \{5, 10\}\]
\(|\mathbf{P}| = 3\) (because \(\mathbf{P}\) has three members) and \(|\mathbf{Q}| = 2\) (because \(\mathbf{Q}\) has two members). Both of these sets are finite. Other sets can be infinite, for example, the set of integers.
The union of \(\mathbf{P}\) and \(\mathbf{Q}\), written \(\mathbf{P} \cup \mathbf{Q}\), is the set of elements in either \(\mathbf{P}\) or \(\mathbf{Q}\), which is {2, 3, 5, 10}. The intersection of \(\mathbf{P}\) and \(\mathbf{Q}\), written \(\mathbf{P} \cap \mathbf{Q}\), is the set of elements that appear in both \(\mathbf{P}\) and \(\mathbf{Q}\), which is {5}. The set difference of \(\mathbf{P}\) and \(\mathbf{Q}\), written \(\mathbf{P} - \mathbf{Q}\), is the set of elements that occur in \(\mathbf{P}\) but not in \(\mathbf{Q}\), which is {2, 3}. Note that \(\mathbf{P} \cup \mathbf{Q} = \mathbf{Q} \cup \mathbf{P}\) and that \(\mathbf{P} \cap \mathbf{Q} = \mathbf{Q} \cap \mathbf{P}\), but in general \(\mathbf{P} - \mathbf{Q} \neq \mathbf{Q} - \mathbf{P}\). In this example, \(\mathbf{Q} - \mathbf{P} = \{10\}\). Finally, the set {5, 3, 2} is indistinguishable from set \(\mathbf{P}\), because sets have no concept of order. Likewise, set {2, 3, 2, 5} is also indistinguishable from \(\mathbf{P}\), because sets have no concept of duplicate elements.
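The operations above can be checked with a short sketch, assuming \(\mathbf{P} = \{2, 3, 5\}\) and \(\mathbf{Q} = \{5, 10\}\), which are the values consistent with the results quoted in the text. Python's built-in `set` operators are used here as one possible illustration.

```python
# Example sets, inferred from the results stated in the text.
P = {2, 3, 5}
Q = {5, 10}

union = P | Q          # elements in either P or Q
intersection = P & Q   # elements in both P and Q
p_minus_q = P - Q      # elements in P but not in Q
q_minus_p = Q - P      # elements in Q but not in P

print(sorted(union))         # [2, 3, 5, 10]
print(sorted(intersection))  # [5]
print(sorted(p_minus_q))     # [2, 3]
print(sorted(q_minus_p))     # [10]

# Union and intersection are commutative; set difference generally is not.
print(P | Q == Q | P, P & Q == Q & P, P - Q == Q - P)  # True True False

# Order and duplication do not matter for sets.
print({5, 3, 2} == P)     # True
print({2, 3, 2, 5} == P)  # True
```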
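The set product or Cartesian product of two sets, written \(\mathbf{Q} \times \mathbf{P}\), is the set of ordered pairs whose first component is drawn from \(\mathbf{Q}\) and whose second component is drawn from \(\mathbf{P}\). For our example sets, the set product would be
\[\{(5, 2), (5, 3), (5, 5), (10, 2), (10, 3), (10, 5)\}\]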
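A minimal sketch of the set product \(\mathbf{Q} \times \mathbf{P}\), again assuming \(\mathbf{P} = \{2, 3, 5\}\) and \(\mathbf{Q} = \{5, 10\}\). It uses `itertools.product` from the Python standard library; the variable name `q_times_p` is chosen here for illustration only.

```python
from itertools import product

# Example sets, assumed from the results stated earlier in the text.
P = {2, 3, 5}
Q = {5, 10}

# Each ordered pair takes its first component from Q and its second from P.
q_times_p = set(product(Q, P))

print(sorted(q_times_p))
# [(5, 2), (5, 3), (5, 5), (10, 2), (10, 3), (10, 5)]

# The product has |Q| * |P| members.
print(len(q_times_p) == len(Q) * len(P))  # True
```

Because the pairs are ordered, `set(product(P, Q))` would give a different set, containing (2, 5) rather than (5, 2).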
The powerset of a set \(\mathbf{S}\) (denoted \(2^{\mathbf{S}}\)) is the set of all subsets of \(\mathbf{S}\). Consider the set \(\mathbf{S} = \{ a, b, c \}\). The powerset of \(\mathbf{S}\) is
\[\{ \emptyset, \{a\}, \{b\}, \{c\}, \{a, b\}, \{a, c\}, \{b, c\}, \{a, b, c\} \}\]
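One way to enumerate a powerset is to collect the subsets of every size from 0 up to \(|\mathbf{S}|\). The sketch below does this for \(\mathbf{S} = \{a, b, c\}\); the helper name `powerset` is hypothetical, and `frozenset` is used only because Python requires set members to be hashable.

```python
from itertools import combinations

def powerset(s):
    """Return the set of all subsets of s, each as a frozenset."""
    items = list(s)
    return {
        frozenset(combo)
        for size in range(len(items) + 1)     # subset sizes 0 .. |s|
        for combo in combinations(items, size)
    }

S = {"a", "b", "c"}
subsets = powerset(S)

# A set with n members has 2**n subsets, which motivates the 2^S notation.
print(len(subsets))                # 8
print(frozenset() in subsets)      # True: the empty set is a subset
print(frozenset(S) in subsets)     # True: S is a subset of itself
```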
A collection of elements with no order (like a set), but which may contain duplicate-valued elements, is called a bag. To distinguish bags from sets, we will use square brackets [] around a bag’s elements. For example, bag [3, 4, 5, 4] is distinct from bag [3, 4, 5], while set {3, 4, 5, 4} is indistinguishable from set {3, 4, 5}. However, bag [3, 4, 5, 4] is indistinguishable from bag [3, 4, 4, 5].
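Python has no built-in bag type, but `collections.Counter` behaves like one for the purposes of this comparison: it records how many times each value occurs and ignores order. The sketch below uses it as an assumed stand-in for a bag.

```python
from collections import Counter

# Counter tracks a count per value, so it can model a bag.
bag_a = Counter([3, 4, 5, 4])
bag_b = Counter([3, 4, 5])
bag_c = Counter([3, 4, 4, 5])

print(bag_a == bag_b)  # False: the number of 4s differs
print(bag_a == bag_c)  # True: same values and counts, order ignored

# For sets, the duplicate 4 makes no difference.
print({3, 4, 5, 4} == {3, 4, 5})  # True
```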
A sequence is a collection of elements with an order, and which may contain duplicate-valued elements. A sequence is also sometimes called a tuple or a vector. In a sequence, there is a 0th element, a 1st element, 2nd element, and so on. We will use angle brackets \(\langle\rangle\) to enclose the elements of a sequence. For example, \(\langle3, 4, 5, 4\rangle\) is a sequence. Note that sequence \(\langle3, 5, 4, 4\rangle\) is distinct from sequence \(\langle3, 4, 5, 4\rangle\), and both are distinct from sequence \(\langle3, 4, 5\rangle\).
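As an illustration, a Python `tuple` can stand in for a sequence: it is indexed from 0, preserves order, and allows duplicates. The names `seq_a`, `seq_b`, and `seq_c` are chosen here for the example.

```python
# Tuples model sequences: ordered, indexed from 0, duplicates allowed.
seq_a = (3, 4, 5, 4)
seq_b = (3, 5, 4, 4)
seq_c = (3, 4, 5)

print(seq_a[0])        # 3: the 0th element
print(seq_a[1])        # 4: the 1st element
print(seq_a == seq_b)  # False: same elements, different order
print(seq_a == seq_c)  # False: the duplicate 4 matters
```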