# 6.3. Miscellaneous Notation¶

This module collects together definitions for a number of mathematical terms and concepts, as a place for reference when needed.

**Units of measure:**
OpenDSA modules use the following notation for units of measure.
"B" will be used as an abbreviation for bytes, "b" for bits,
"KB" for kilobytes \((2^{10} = 1024\) bytes),
"MB" for megabytes \((2^{20}\) bytes)
"GB" for gigabytes \((2^{30}\) bytes)
and "ms" for milliseconds
(a millisecond is 1/1000 of a second).
Spaces are not placed between the number and the unit abbreviation
when a power of two is intended.
Thus a disk drive of size 25 gigabytes (where a gigabyte is intended
as \(2^{30}\) bytes) will be written as "25GB".
Spaces are used when a decimal value is intended.
An amount of 2000 bits would therefore be written "2 Kb" while
"2Kb" represents 2048 bits.
2000 milliseconds is written as 2000 ms.
Note that in this book large amounts of storage are nearly always
measured in powers of two and times in powers of
ten.

**Factorial function:**
The *factorial* function, written \(n!\) for \(n\) an
integer greater than 0, is the product of
the integers between 1 and \(n\), inclusive.
Thus, \(5! = 1 \cdot 2 \cdot 3 \cdot 4 \cdot 5 = 120\).
As a special case, \(0! = 1\).
The factorial function grows quickly as \(n\) becomes larger.
Because computing the factorial function directly is a time-consuming
process, it can be useful to have an equation that provides a
good approximation.
Stirling's approximation states that
\(n! \approx \sqrt{2\pi n}(\frac{n}{e})^n\),
where \(e \approx 2.71828\)
(\(e\) is the base for the system of natural logarithms) [1].
Thus we see that while \(n!\) grows
slower than \(n^n\) (because \(\sqrt{2\pi n}/e^n < 1\)),
it grows faster than \(c^n\) for any positive integer constant
\(c\).

**Permutations:**
A *permutation* of a sequence \(\mathbf{S}\)
is simply the members of \(\mathbf{S}\) arranged in some order.
For example, a permutation of the integers 1 through \(n\) would
be those values arranged in some order.
If the sequence contains \(n\) distinct members, then there are
\(n!\) different permutations for the sequence.
This is because there are \(n\) choices for the first member in
the permutation; for each choice of first member there are \(n-1\)
choices for the second member, and so on.
Sometimes one would like to obtain a *random permutation* for a
sequence, that is, one of the \(n!\) possible permutations is
selected in such a way that each permutation has equal probability of
being selected.
A simple function for generating a random permutation is as
follows.
Here, the \(n\) values of the sequence are stored in
positions 0 through \(n-1\) of array `A`

,
function `swap(A, i, j)`

exchanges elements `i`

and `j`

in array `A`

,
and `Random(n)`

returns an integer value in the range 0 to
\(n-1\).

```
// Randomly permute the values in array A
public static void permute(Object[] A) {
for (int i = A.length; i > 0; i--) // for each i
Swap.swap(A, i-1, random(i)); // swap A[i-1] with a random
} // position in the range 0 to i-1.
```

```
// Randomly permute the values in array A
static <T> void permute(T[] A) {
for (int i = A.length; i > 0; i--) // for each i
swap(A, i-1, random(i)); // swap A[i-1] with a random
} // position in the range 0 to i-1.
```

```
//Randomly permute the values in array A
void permute(int A[], int n) {
for (int i = n; i > 0; i--) // for each i
swap(A, i-1, int(Random(i))); // swap A[i-1] with a random
// position in the range 0 to i-1.
}
```

**Boolean variables:**
A *Boolean variable*
is a variable that takes on one of the two values `True`

and
`False`

.
These two values are often associated with the values 1 and 0,
respectively, although there is no reason why this needs to be the
case.
It is poor programming practice to rely on the
correspondence between 0 and False, because these are logically
distinct objects of different types.

**Logic Notation:**
We will occasionally make use of the notation of symbolic or Boolean
logic.
\(A \Rightarrow B\) means "\(A\) implies \(B\)" or
"If \(A\) then \(B\)".
\(A \Leftrightarrow B\) means "\(A\) if and only if \(B\)"
or "\(A\) is equivalent to \(B\)".
\(A \vee B\) means "\(A\) or \(B\)"
(useful both in the context of symbolic
logic or when performing a Boolean operation).
\(A \wedge B\) means "\(A\) and \(B\)".
\(\sim\!A\) and \(\overline{A}\) both mean "not \(A\)" or
the negation of \(A\) where \(A\) is a Boolean variable.

**Floor and ceiling:**
The *floor* of \(x\) (written \(\lfloor x \rfloor\))
takes real value \(x\) and returns the greatest
integer \(\leq x\).
For example, \(\lfloor 3.4 \rfloor = 3\),
as does \(\lfloor 3.0 \rfloor\),
while \(\lfloor -3.4 \rfloor = -4\) and
\(\lfloor -3.0 \rfloor = -3\).
The *ceiling* of \(x\) (written
\(\lceil x \rceil\)) takes real value \(x\) and returns the
least integer \(\geq x\).
For example, \(\lceil 3.4 \rceil = 4\), as does
\(\lceil 4.0 \rceil\),
while \(\lceil -3.4 \rceil = \lceil -3.0 \rceil = -3\).

**Modulus function:**
The *modulus* (or *mod*) function returns the remainder of
an integer division.
Sometimes written \(n \bmod m\) in mathematical expressions,
the syntax in many programming languages is `n % m`

.
From the definition of remainder, \(n \bmod m\) is the integer
\(r\) such that \(n = qm + r\) for \(q\) an integer,
and \(|r| < |m|\).
Therefore, the result of \(n \bmod m\) must be between 0 and
\(m-1\) when \(n\) and \(m\) are positive integers.
For example, \(5 \bmod 3 = 2\); \(25 \bmod 3 = 1\),
\(5 \bmod 7 = 5\), and \(5 \bmod 5 = 0\).

There is more than one way to assign values to \(q\) and \(r\), depending on how integer division is interpreted. The most common mathematical definition computes the mod function as \(n \bmod m = n - m\lfloor n/m\rfloor\). In this case, \(-3 \bmod 5 = 2\). However, Java and C++ compilers typically use the underlying processor's machine instruction for computing integer arithmetic. On many computers this is done by truncating the resulting fraction, meaning \(n \bmod m = n - m (\mathrm{trunc}(n/m))\). Under this definition, \(-3 \bmod 5 = -3\). Another language might do something different.

Unfortunately, for many applications this is not what the user wants
or expects.
For example, many *hash systems*
will perform some computation on a record's *key* value and then
take the result modulo the hash table size.
The expectation here would be that the result is a legal index into
the hash table, not a negative number.
Implementers of hash functions must either insure that the
result of the computation is always positive, or else add the hash
table size to the result of the modulo function when that result is
negative.

[1] | The symbol "\(\approx\)" means "approximately equal." |