.. _Glossary:
.. raw:: html
.. |--| unicode:: U+2013 .. en dash
.. |---| unicode:: U+2014 .. em dash, trimming surrounding whitespace
:trim:
.. This file is part of the OpenDSA eTextbook project. See
.. http://algoviz.org/OpenDSA for more details.
.. Copyright (c) 2012-2013 by the OpenDSA Project Contributors, and
.. distributed under an MIT open source license.
.. avmetadata::
:author: OpenDSA Contributors
:topic:
Glossary
========
.. glossary::
:sorted:
2-3 tree
A specialized form of the :term:`B-tree` where each internal
node has either 2 children or 3 children.
Key values are ordered to maintain the
:term:`binary search tree property`.
The 2-3 tree is always height balanced, and its insert, search,
and remove operations all have :math:`\Theta(\log n)` cost.
80/20 rule
Given a typical application where there is a collection of
records and a series of search operations for records,
the 80/20 rule is an empirical observation that
80% of the record accessess typically go to 20% of the records.
The exact values varies between data collections, and is related
to the concept of :term:`locality of reference`.
abstract data type
Abbreviated :term:`ADT`. The specification of a :term:`data type`
within some language, independent of an implementation.
The :term:`interface` for the ADT is defined in terms of a :term:`type`
and a set of operations on that type.
The behavior of each operation is determined by its inputs and
outputs.
An ADT does not specify *how* the data type is implemented.
These implementation details are hidden from the user of the ADT
and protected from outside access, a concept referred to as
:term:`encapsulation`.
accept
When a :term:`finite automata` executes on a string and
terminates in an :term:`accepting state`, it is said to accept
the string.
The finite automata is said to accept the language that consists
of all strings for which the finite automata completes execution
in an accepting state.
accepting state
Part of the definition of a :term:`finite automata` is to
designate some :term:`states ` as accepting states.
If the finite automata executes on an input string and completes
the computation in an accepting state, then the machine is said
to :term:`accept` the string.
activation record
The entity that is stored on the :term:`runtime stack` during
program execution.
It stores any active :term:`local variable` and the return
address from which a new subroutine is being called, so that
this information can be recovered when the subroutine
terminates.
acyclic graph
In :term:`graph` terminology, a graph that contains no
:term:`cycles `.
address
A location in memory.
adjacent
Two :term:`nodes ` of a :term:`tree` or two
:term:`vertices ` of a :term:`graph` are said to be
adjacent if they have an :term:`edge` connecting them.
If the edge is directed from :math:`a` to :math:`b`,
then we say that :math:`a` is adjacent to :math:`b`,
and :math:`b` is adjacent from :math:`a`.
adjacency list
An implementation for a :term:`graph` that uses an (array-based)
:term:`list` to represent the :term:`vertices ` of the
graph, and each vertex is in turn represented by a
(linked) list of the vertices that are
:term:`neighbors `.
adjacency matrix
An implementation for a :term:`graph` that uses a 2-dimensional
:term:`array` where each row and each column corresponds to a
:term:`vertex` in the :term:`graph`. A given row and column in
the matrix corresponds to an edge from the :term:`vertex`
corresponding to the row to the vertex corresponding to the
column.
ADT
Abbreviation for :term:`abstract data type`.
adversary
A fictional construct introduced for use in an
:term:`adversary argument`.
adversary argument
A type of :term:`lower bounds proof` for a problem where a
(fictional) "adversary" is assumed to control access to an
algorithm's input, and which yields information about that input
in such a way
that will drive the cost for any proposed algorithm to solve the
problem as high as possible.
So long as the adversary never gives an answer that conflicts
with any previous answer, it is permitted to do whatever
necessary to make the algorithm require as much cost as
possible.
aggregate type
A :term:`data type` whose :term:`members ` have subparts.
For example, a typical database record.
Another term for this is :term:`composite type`.
algorithm
A method or a process followed to solve a :term:`problem`.
algorithm analysis
A less formal version of the term
:term:`asymptotic algorithm analysis`, generally used as a
synonym for :term:`asymptotic analysis`.
alias
Another name for something. In programming, this usually refers
to two :term:`references ` that refer to the same
object.
allocated
allocation
Reserving memory for an object in the Heap memory.
all-pairs shortest paths problem
Given a :term:`graph` with :term:`weights ` or
distances on the :term:`edges `,
find the shortest paths between every pair of
vertices in the graph.
One approach to solving this problem is
:term:`Floyd's algorithm`, which uses the
:term:`dynamic programming` algorithmic technique.
alphabet
The characters or symbols that strings in a given language may
be composed of.
alphabet trie
A :term:`trie` data structure for storing variable-length
strings.
Level :math:`i` of the tree corresponds to the letter in
position :math:`i` of the string.
The root will have potential branches on each intial letter of
string.
Thus, all strings starting with "a" will be stored in the "a"
branch of the tree.
At the second level, such strings will be separated by branching
on the second letter.
amortized analysis
An :term:`algorithm analysis` techique that looks at the total
cost for a series of operations and amortizes this total cost
over the full series.
This is as opposed to considering every individual operation to
independently have the :term:`worst case` cost, which might lead
to an overestimate for the total cost of the series.
amortized cost
The total cost for a series of operations to be used in an
:term:`amortized analysis`.
ancestor
In a tree, for a given node :math:`A`, any node on a
:term:`path` from :math:`A` up to the root is an ancestor of
:math:`A`.
antisymmetric
In set notation, relation :math:`R` is antisymmetric if whenever
:math:`aRb` and :math:`bRa`, then :math:`a = b`, for all
:math:`a, b \in \mathbf{S}`.
approximation algorithm
An algorthm for an :term:`optimization problem` that finds a
good, but not necessarily cheapest, solution.
arm
In the context of an :term:`I/O head`, this attaches the sensor
on the I/O head to the :term:`boom`.
array
A :term:`data type` that is used to store elements in consecutive memory
locations and refers to them by an index.
array-based list
An implementation for the :term:`list` ADT that uses an :term:`array` to
store the list elements. Typical implementations fix the array
size at creation of the list, and the :term:`overhead`
is the number of array positions that are presently unused.
array-based stack
Analogous to an :term:`array-based list`, this uses an :term:`array` to
store the elements when implementing the :term:`stack` ADT.
array-based queue
Analogous to an :term:`array-based list`, this uses an :term:`array` to
store the elements when implementing the :term:`queue` ADT.
ASCII character coding
American Standard Code for Information Interchange.
A commonly used method for encoding characters using a binary code.
Standard ASCII uses an 8-bit code to represent upper and lower
case letters, digits, some punctuation, and some number of
non-printing characters (such as carrage return).
Now largely replaced by UTF-8 encoding.
assembly code
A form of :term:`intermediate code` created by a compiler that
is easy to convert into the final form that the computer can
execute.
An assembly language is typically a direct mapping of one or a
few instructions that the CPU can execute into a mnemonic form
that is relatively easy for a human to read.
asymptotic algorithm analysis
A more formal term for :term:`asymptotic analysis`.
asymptotic analysis
A method for estimating the efficiency of an algorithm or
computer program by identifying its :term:`growth rate`.
Asymptotic analysis also gives a way to
define the inherent difficulty of a :term:`problem`.
We frequently use the term :term:`algorithm analysis` to mean
the same thing.
attribute
In :term:`object-oriented programming `,
a synonym for :term:`data members `.
automata
Synonym for :term:`finite state machine`.
automatic variable
A synonym for :term:`local variable`.
When program flow enters and leaves the variable's scope,
automatic variables will be allocated and de-allocated
automatically.
average case
In :term:`algorithm analysis`, the average of the costs for all
:term:`problem instances ` of a given input
size :math:`n`. If not all problem
instances have equal probability of occurring, then average case
must be calculated using a weighted average.
average seek time
Expected (average) time to perform a :term:`seek` operation on a
:term:`disk drive`, assuming that the seek is between two
randomly selected tracks.
This is one of two metrics commonly provided by disk drive
vendors for disk drive performance, with the other being
:term:`track-to-track seek time`.
AVL Tree
A variant implementation for the :term:`BST`, which differs from
the standard BST in that it uses modified insert and remove
methods in order to keep the tree
:term:`balanced `.
Similar to a :term:`Splay Tree` in that it uses the concept of
:term:`rotations ` in the insert and remove operations.
B$^+$-tree
The most commonly implemented form of :term:`B-tree`.
A B$^+$-tree does not store data at the
:term:`internal nodes `, but
instead only stores :term:`search key` values as direction
finders for the purpose of searching through the tree.
Only the :term:`leaf nodes ` store a
:term:`reference` to the actual data records.
B$^*$-tree
A variant on the :term:`B$^+$-tree`.
The :math:`\mathrm{B}^*` tree is identical to the :math:`\mathrm{B}^+`
tree, except for the rules used to split and merge nodes.
Instead of splitting a node in half when it overflows, the
:math:`\mathrm{B}^*` tree
gives some records to its neighboring sibling, if possible.
If the sibling is also full, then these two nodes split into three.
Similarly, when a node underflows, it is combined with its two
siblings, and the total reduced to two nodes.
Thus, the nodes are always at least two thirds full.
B-tree
A method for :term:`indexing` a large collection of records.
A B-tree is a :term:`balanced tree` that typically has high
branching factor (commonly as much as 100
:term:`children ` per :term:`internal node`),
causing the tree to be very shallow.
When stored on disk, the node size is selected to be same as the
desired unit of I/O (so some multiple of the disk :term:`sector`
size).
This makes it easy to gain access to the record associated with
a given :term:`search key` stored in the tree with few
:term:`disk accesses `.
The most commonly implemented variant of the B-tree is the
:term:`B$^+$-tree`.
backing storage
In the context of a :term:`caching` system or
:term:`buffer pool`, backing storage is the relatively large but
slower source of data that needs to be cached.
For example, in a :term:`virtual memory`, the disk drive would
be the backing storage.
In the context of a web browser, the Internet might be
considered the backing storage.
backtracking
A :term:`heuristic` for brute-force search of a solution space.
It is essentially a :term:`depth-first search` of the solution
space.
This can be improved using a :term:`branch-and-bounds algorithm`.
bag
In set notation, a bag is a collection of elements with no order
(like a set), but which allows for duplicate-valued elements
(unlike a set).
balanced tree
A :term:`tree` where the :term:`subtrees ` meet some
criteria for being balanced.
Two possibilities are that the tree is
:term:`height balanced`, or that the tree has a roughly equal
number of :term:`nodes ` in each subtree.
base
Synonym for :term:`radix`.
base case
In :term:`recursion` or :term:`proof by induction`, the base case
is the termination condition.
This is a simple input or value that can be solved (or proved in
the case of induction) without resorting to a recursive call
(or the :term:`induction hypothesis`).
base class
In :term:`object-oriented programming `,
a class from which another class :term:`inherits `.
The class that inherits is called a :term:`subclass`.
base type
The :term:`data type` for the elements in a set.
For example, the set might consist of the integer values 3, 5,
and 7.
In this example, the base type is integers.
basic operation
Examples of basic operations include inserting a data
item into the data structure, deleting a :term:`data item` from the
data structure, and finding a specified :term:`data item`.
best case
In algorithm analysis, the :term:`problem instance` from among
all problem instances for a given input size :math:`n` that has
least cost. Note that the best case is **not** when :math:`n` is
small, since we are referring to the best from a class of inputs
(i.e, those inputs of size :math:`n`).
best fit
In a :term:`memory manager`, best fit is a :term:`heuristic`
for deciding which :term:`free block` to use when allocating
memory from a :term:`memory pool`.
Best fit will always allocate from the smallest
:term:`free block` that is large enough to service the memory
request.
The rationale is that this will be the method that best
preserves large blocks needed for unusually large requests.
The disadvantage is that it tends to
cause :term:`external fragmentation` in the form of small,
unuseable memory blocks.
BFS
Abbreviation for :term:`breadth-first search`.
big-Oh notation
In :term:`algorithm analysis`, a shorthand notation for
describing the :term:`upper bound` for an :term:`algorithm` or
:term:`problem`.
binary insert sort
A variation on :term:`insertion sort` where the position of the
value being inserted is located by binary search, and then put
into place. In normal usage this is not an improvement on
standard insertion sort because of the expense of moving many
items in the :term:`array`. But it is directly useful if the cost of
comparison is high compared to that of moving an element, or
is theoretically useful if we only care to count the cost of
comparisons.
binary search
A standard :term:`recursive ` algorithm for finding
the :term:`record` with a given :term:`search key` value within
a sorted list.
It runs in :math:`O(\log n)` time.
At each step, look at the middle of the current sublist, and throw
away the half of the records whose keys are either too small or
too large.
binary search tree
A binary tree that imposes the following constraint on its node
values: The :term:`search key` value for any node :math:`A` must
be greater than the (key) values for all nodes in the left
:term:`subtree` of :math:`A`, and less than the key values for
all nodes in the right subtree of :math:`A`.
Some convention must be adopted if
multiple nodes with the same key value are permitted,
typically these are required to be in the right subtree.
binary search tree property
The defining relationship between the :term:`key` values for
:term:`nodes ` in a :term:`BST`.
All nodes stored in the left subtree of a node whose key value
is :math:`K` have key values less than or equal to :math:`K`.
All nodes stored in the right subtree of a node whose key value
is :math:`K` have key values greater than :math:`K`.
binary tree
A finite set of nodes which is either empty, or else has a root
node together two binary trees, called the left and right
:term:`subtrees `, which are :term:`disjoint` from each
other and from the :term:`root`.
binary trie
A :term:`binary tree` whose structure is that of a :term:`trie`.
Generally this is an implementation for a :term:`search tree`.
This means that the :term:`search key` values are thought of a
binary digits, with the digit in the position corresponding to
this a node's :term:`level` in the tree indicating a left branch
if it is "0", or a right branch if it is "1".
Examples include the :term:`Huffman coding tree` and the
:term:`Bintree`.
binning
In :term:`hashing`, binning is a type of :term:`hash function`.
Say we are given keys in the range 0 to 999, and have a hash
table of size 10.
In this case, a possible hash function might simply divide the
key value by 100.
Thus, all keys in the range 0 to 99 would hash to slot 0, keys
100 to 199 would hash to slot 1, and so on.
In other words, this hash function "bins" the first 100 keys to
the first slot, the next 100 keys to the second slot, and so
on.
This approach tends to make the hash function dependent on the
distribution of the high-order bits of the keys.
bintree
A :term:`spatial data structure` in the form of binary
:term:`trie`, typically used to store point data in two or more
dimensions.
Similar to a :term:`PR quadtree` except that at each level, it
splits one dimension in half.
Since many leaf nodes of the PR quadtree will contain no data
points, implementation often makes use of the :term:`flyweight`
:term:`design pattern`.
Binsort
A sort that works by taking each record and placing it into a
bin based on its value. The bins are then gathered up in order
to sort the list. It is generally not practical in this form,
but it is the conceptual underpinning of the :term:`radix sort`.
bitmap
bit vector
An :term:`array` that stores a single bit at each position.
Typically these bits represent
:term:`Boolean variables ` associated with
a collection of objects, such that the :math:`i` th bit is the
Boolean value for the :math:`i` th object.
block
A unit of storage, usually referring to storage on a
:term:`disk drive` or other :term:`peripheral storage` device.
A block is the basic unit of I/O for that device.
Boolean expression
A Boolean expression is comprised of
:term:`Boolean variables `
combined using the operators AND (:math:`\cdot`), OR
(:math:`+`), and NOT (to negate Boolean variable :math:`x` we
write :math:`\overline{x}`).
Boolean variable
A variable that takes on one of the two values ``True`` and
``False``.
boom
In the context of an :term:`I/O head`, is the central structure
to which all of the I/O heads are attached.
Thus, the all move together during a :term:`seek` operation.
bounding box
A box (usually aligned to the coordinate axes of the reference
system) that contains a (potentially complex) object. In
graphics and computational geometry, complex objects might be
associated with a bounding box for use by algorithms that search
for objects in a particular location. The idea is that if the
bounding box is not within the area of interest, then neither is
the object. Checking the bounding box is cheaper than checking
the object, but it does require some time. So if enough objects
are not outside the area of interest, this approach will not
save time. But if most objects are outside of the area of
interest, then checking bounding boxes first can save a lot of
time.
branch-and-bounds algorithm
A variation on :term:`backtracking` that applies
to :term:`optimization problems `.
We traverse the :term:`solution tree` as with backtracking.
Proceeding deeper in the solution tree generally requires
additional cost.
We remember the best-cost solution found so far.
If the cost of the current branch in the tree exceeds the best
tour cost found so far, then we know to stop pursuing this
branch of the tree.
At this point we can immediately back up and take another branch.
break-even point
The point at which two costs become even when measured as the
function of some variable.
In particular, used to compare the space requirements of two
implementations.
For example, when comparing the space requirements of an
:term:`array-based list` implementation versus a
:term:`linked list` implementation, the key issue is how full
the list is compared to its capacity limit (for the array-based
list).
The point where the two representations would have the same
space cost is the break-even point.
As the list becomes more full beyond this point, the array-based
list implementation becomes more space efficent, while as the
list becomes less full below this point, the linked list
implementation becomes more space efficient.
breadth-first search
A :term:`graph` :term:`traversal` algorithm.
As the name implies, all immediate :term:`neighbors `
for a :term:`node` are :term:`visited ` before any
more-distant nodes are visited.
BFS is driven by a :term:`queue`.
A start vertex is placed on the queue.
Then, until the queue is empty, a node is taken off the
queue, visited, and and then any :term:`unvisited` neighbors are
placed onto the queue.
BST
Abbreviation for :term:`binary search tree`.
bubble sort
A simple sort that requires :math:`Theta(n^2)` time in
:term:`best `, :term:`average `,
and :term:`worst ` cases.
Even an optimized version will normally run slower than
:term:`insertion sort`, so it has little to recommend it.
bucket
In :term:`bucket hashing`, a bucket is a sequence of
:term:`slots ` in the :term:`hash table` that are grouped
together.
bucket hashing
A method of :term:`hashing` where multiple :term:`slots `
of the :term:`hash table` are grouped together to form a
:term:`bucket`.
The :term:`hash function` then either hashes to some bucket, or
else it hashes to a :term:`home slot` in the normal way, but
this home slot is part of some bucket.
:term:`Collision resolution ` is handled
first by attempting to find a free position within the same
bucket as the home slot.
If the bucket if full, then the record is placed in an
:term:`overflow bucket`.
bucket sort
A variation on the :term:`Binsort`, where each bin is associated
with a range of :term:`key` values.
This will require some method of
sorting the records placed into each bin.
buddy method
In a :term:`memory manager`, an alternative to using a
:term:`free block list` and a :term:`sequential fit` method to
seach for a suitable free block to service a
:term:`memory request`.
Instead, the memory pool is broken down as needed into smaller
chunks by splitting it in half repeatedly until the smallest
power of 2 that is as big or bigger than the size of the memory
request is reached.
The name comes from the fact that the binary representation for
the start of the block positions only differ by one bit for
adjacent blocks of the same size.
These are referred to as "buddies" and will be merged together
if both are free.
buffer
A block of memory, most often in :term:`primary storage`.
The size of a buffer is typically one or a multiple of the basic
unit of I/O that is read or written on each access to
:term:`secondary storage` such as a :term:`disk drive`.
buffer passing
An approach to implementing the :term:`ADT` for a
:term:`buffer pool`, where a pointer to a :term:`buffer` is
passed between the client and the buffer pool.
This is in contrast to a :term:`message passing` approach,
it is most likely to be used for long messages or when the
message size is always the same as the buffer size, such as when
implementing a :term:`B-tree`.
buffer pool
A collection of one or more :term:`buffers `.
The buffer pool is an example of a :term:`cache `.
It is stored in :term:`primary storage`, and holds data that is
expected to be used in the near future.
When a data value is requested, the buffer pool is searched
first.
If the value is found in the buffer pool, then
:term:`secondary storage` need not be accessed.
If the value is not found in the buffer pool, then it must be
fetched from secondary storage.
A number of traditional :term:`heuristics `
have been developed for deciding which data to :term:`flush`
from the buffer pool when new data must be stored,
such as :term:`least recently used`.
buffering
A synonym for :term:`caching`.
More specifically, it refers to an arrangement where all
accesses to data (such as on a
:term:`peripheral storage` device) must
be done in multiples of some minimum unit of storage.
On a :term:`disk drive`, this basic or smallest unit of I/O is a
:term:`sector`.
It is called "buffering" because the block of data returned by
such an access is stored in a :term:`buffer`.
caching
The concept of keeping selected data in :term:`main memory`.
The goal is to have in main memory the data values that are
most likely to be used in the near future.
An example of a caching technique is the use of a
:term:`buffer pool`.
call stack
Known also as execution stack. A stack that stores the function
call sequence and the return address for each function.
Cartesian product
For sets, this is another name for the :term:`set product`.
ceiling
Written :math:`\lceil x \rceil`, for real value :math:`x` the
ceiling is the least integer :math:`\geq x`.
child
In a tree, the set of :term:`nodes ` directly pointed to
by a node :math:`R` are the :term:`children ` of :math:`R`.
circular first fit
In a :term:`memory manager`, circular first fit is a
:term:`heuristic` for deciding which :term:`free block` to use
when allocating memory from a :term:`memory pool`.
Circular first fit is a minor modification on :term:`first fit`
memory allocation, where the last free block allocated from is
remembered, and search for the next suitable free block picks up
from there.
Like first fit, it has the advantage that it is typically not
necessary to look at all free blocks on the free block list to
find a suitable free block.
And it has the advantage over first fit that it spreads out
memory allocations evenly across the :term:`free block list`.
This might help to minimize :term:`external fragmentation`.
circular list
A :term:`list` ADT implementation variant where the last element
of the list provides access to the first element of the list.
class
In the :term:`object-oriented programming paradigm`
an ADT and its implementation together make up a class.
An instantiation of a class within a program is termed an
:term:`object`.
class hierarchy
In :term:`object-oriented programming `,
a set of classes and their interrelationships.
One of the classes is the :term:`base class`, and the others are
:term:`subclasses ` that :term:`inherit` either
directly or indirectly from the base class.
clause
In a :term:`Boolean expression`, a clause is one or more
:term:`literals ` OR'ed together.
client
The user of a service.
For example, the object or part of the program that calls a
:term:`memory manager` class is the client of that memory
manager.
Likewise the class or code that calls a :term:`buffer pool`.
clique
In :term:`graph` terminology, a clique is a :term:`subgraph`,
defined as any :term:`subset` :math:`U` of the graph's
:term:`vertices ` such that every vertex in :math:`U`
has an :term:`edge` to every other vertex in :math:`U`.
The size of the clique is the number of vertices in the clique.
closed
A set is closed over a (binary) operation if,
whenever the operation is applied to two members of the set, the
result is a member of the set.
closed-form solution
An algebraic equation with the same value as a :term:`summation`
or :term:`recurrence relation`.
The process of replacing the summation or
recurrence with its closed-form solution is known as solving the
summation or recurrence.
closed hash system
A :term:`hash system` where all records are stored in slots of
the :term:`hash table`.
This is in contrast to an :term:`open hash system`.
cluster
In :term:`file processing`, a collection of physically adjacent
:term:`sectors ` that define the smallest allowed
allocation unit of space to a disk file.
The idea of requiring space to be allocated in multiples of
sectors is that this will reduce the number of
:term:`extents ` required to store the file, which
reduces the expected number of :term:`seek` operations reuquired
to process a series of :term:`disk accesses ` to
the file.
The disadvantage of large cluster size is that it increases
:term:`internal fragmentation` since any space not actually
used by the file in the last cluster is wasted.
code generation
A phase in a :term:`compiler` that transforms
:term:`intermediate code` into the final executable form of the
code.
More generally, this can refer to the process of turning a parse
tree (that determines the correctness of the structure of the
program) into actual instructions that the computer can execute.
code optimization
A phase in a :term:`compiler` that makes changes in the code
(typically :term:`assembly code`) with the goal of replacing
it with a version of the code that will run faster while
performing the same computation.
cohesion
In :term:`object-oriented programming `,
a term that refers to the degree to which a class has a single
well-defined role or responsibility.
Collatz sequence
For a given integer value :math:`n`, the sequence of numbers
that derives from performing the following computatin on :math:`n`::
while (n > 1)
if (ODD(n))
n = 3 * n + 1;
else
n = n / 2;
This is famous because, while it terminates for any value of
:math:`n` that you try, it has never been proven to be a fact
that this always terminates.
collision
In a :term:`hash system`, this refers to the case where two
search :term:`keys ` are mapped by the
:term:`hash function` to the same
slot in the :term:`hash table`.
This can happen on insertion or search when another record has
already been hashed to that slot.
In this case, a :term:`closed hash system` will require a
process known as :term:`collision resolution` to find the
location of the desired record.
collision resolution
The outcome of a :term:`collision resolution policy`.
collision resolution policy
In :term:`hashing`, the process of resolving a
:term:`collision`.
Specifically in a :term:`closed hash system`, this is the
process of finding the proper position in a :term:`hash table`
that contains the
desired record if the :term:`hash function` did not return the
correct position for that record due to a :term:`collision` with
another record.
comparable
The concept that two objects can be compared to determine if they
are equal or not, or to determine which one is greater than the
other.
In set notation, elements :math:`x` and :math:`y` of a set are
comparable under a given relation :math:`R` if either
:math:`xRy` or :math:`yRx`.
To be reliably compared for a greater/lesser relationship,
the values being compared must belong to a :term:`total order`.
In programming, the property of a data type such that two
elements of the type can be compared to determine if they the
same (a weaker version), or which of the two is larger (a
stronger version).
``Comparable`` is also the name of an :term:`interface` in Java that
asserts a comparable relationship between objects with a class,
and ``.compareTo()`` is the ``Comparable`` interface method that
implements the actual comparison between two objects of the class.
comparator
A function given as a parameter to a method of a library
(or alternatively, a parameter for a C++ template or a Java
generic).
The comparator function concept provides a generic way
encapulates the process of performing a comparison between two
objects of a specific type.
For example, if we want to write a generic sorting routine, that
can handle any record type, we can require that the user of the
sorting routine pass in a comparator function
to define how records in the collection are to be compared.
comparison
The act of comparing two :term:`keys ` or
:term:`records `.
For many :term:`data types `, a comparison has
constant time cost.
The number of comparisons required is often used as a
:term:`measure of cost` for sorting and searching algorithms.
compiler
A computer program that reads computer programs and converts
them into a form that can be directly excecuted by some form of
computer.
The major phases in a compiler include :term:`lexical analysis`,
:term:`syntax analysis`, :term:`intermediate code generation`,
:term:`code optimization`, and :term:`code generation`.
More broadly, a compiler can be viewed as :term:`parsing
` the program to verify that it is syntactically
correct, and then doing :term:`code generation` to convert the
hig-level program into something that the computer can execute.
compile-time polymorphism
A form of :term:`polymorphism` known as Overloading.
Overloaded methods have the same names, but different signatures
as a method available elsewhere in the class.
Compare to :term:`run-time polymorphism`.
complete binary tree
A binary tree where the nodes are filled in row by row, with the
bottom row filled in left to right.
Due to this requirement, there is only one tree of :math:`n`
nodes for any value of :math:`n`.
Since storing the records in an :term:`array` in row order leads to a
simple mapping from a node's position in the array to its
:term:`parent`, :term:`siblings `, and
:term:`children `, the array representation is most
commonly used to implement the complete binary tree.
The :term:`heap` data structure is a complete binary tree with
partial ordering constraints on the node values.
complete graph
A :term:`graph` where every :term:`vertex` connects to every
other vertex.
complex number
In mathematics, an imaginary number, that is, a number with a
real component and an imaginary component.
Composite design pattern
Given a class hierarchy representing a set of objects, and a
container for a collection of objects, the composite
:term:`design pattern` addresses the relationship between the
object hierarchy and a bunch of behaviors on the objects.
In the composite design, each object is required to implement
the collection of behaviors.
This is in contrast to the procedural approach where a behavior
(such as a tree :term:`traversal`) is implemented as a
method on the object collection (such as a :term:`tree`).
Procedural tree traversal requires that the tree have a method
that understands what to do when it encounters any of the object
types (:term:`internal ` or
:term:`leaf nodes `) that the tree might contain.
The composite approach would have the tree call the "traversal"
method on its root node, which then knows how to perform the
"traversal" behavior.
This might in turn require invoking the traversal method of
other objects (in this case, the children of the root).
composite type
A type whose :term:`members ` have subparts.
For example, a typical database record.
Another term for this is :term:`aggregate type`.
composition
Relationships between classes based on usage rather than
:term:`inheritance `, i.e. a **HAS-A** relationship.
For example, some code in class 'A' has a :term:`reference` to some
other class 'B'.
computability
A branch of computer science that deals with the theory of
solving problems through computation.
More specificially, it deals with the limits to what problems
(functions) are computable.
An example of a famous problem that cannot in principle be
solved by a computer is the :term:`halting problem`.
computation
In a :term:`finite automata`, a computation is a sequence of
:term:`configurations ` for some
length :math:`n \geq 0`.
In general, it is a series of operations that the machine
performs.
computational complexity theory
A branch of the theory of computation in theoretical computer
science and mathematics that focuses on classifying
computational problems according to their inherent difficulty,
and relating those classes to each other.
An example is the study of :term:`NP-Complete` problems.
configuration
For a :term:`finite automata`, a complete specification for the
current condition of the machine on some input string.
This includes the current :term:`state` that the machine is in,
and the current condition of the string, including which
character is about to be processed.
Conjunctive Normal Form
CNF
A :term:`Boolean expression` written as a series of
:term:`clauses ` that are AND'ed together.
connected component
In an :term:`undirected graph`, a :term:`subset` of the
:term:`nodes ` such that each node in the subset can be
reached from any other node in that subset.
connected graph
An :term:`undirected graph` is a connected graph if there is at
least one path from any :term:`vertex` to any other.
constant running time
The cost of a function whose running time is not related to its
input size.
In Theta notation, this is traditionally written as
:math:`\Theta(1)`.
constructive induction
A process for finding the
:term:`closed form ` for a
:term:`recurrence relation`,
that involves substituting in a guess for the closed form to
replace the recursive part(s) of the recurrence.
Depending on the goal (typically either to show that the
hypothesized growth rate is right, or to find the precise
constants), one then manipulates the resulting non-recursive
equation.
container
container class
A :term:`data structure` that stores a collection of
:term:`records `.
Typical examples are :term:`arrays `,
:term:`search trees `, and
:term:`hash tables `.
context-free grammar
A :term:`grammar` comprised only of productions of the form
:math:`A \rightarrow x` where :math:`A` is a
:term:`non-terminal` and :math:`x` is a series of one or more
:term:`terminals ` and non-terminals.
That is, the given non-terminal :math:`A` can be replaced at any
time.
context-free language
The set of :term:`languages ` that can be defined by
:term:`context-sensitive grammars `.
context-sensitive grammar
A :term:`grammar` comprised only of productions of the form
:math:`xAy \rightarrow xvy` where :math:`A` is a
:term:`non-terminal` and :math:`x` and :math:`y` are each a
series of one or more
:term:`terminals ` and non-terminals.
That is, the given non-terminal :math:`A` can be replaced only
when it is within the proper context.
countably infinite
countable
A :term:`set` is countably infinite if it contains a finite
number of elements, or (for a set with an infinite number of
elements) if there exists a one-to-one mapping from
the set to the set of integers.
cost
The amount of resources that the solution consumes.
cost model
In :term:`algorithm analysis`, a definition for the cost of each
:term:`basic operation` performed by the algorithm,
along with a definition for the size of the input.
Having these definitions allows us to calculate the :term:`cost`
to run the algorithm on a given input, and from there determine
the :term:`growth rate` of the algorithm.
A cost model would be considered "good" if it yields predictions
that conform to our understanding of reality.
CPU
Acronym for Central Processing Unit, the primary processing
device for a computer.
current position
A property of some list ADTs, where there is maintained a
"current position" state that can be referred to later.
cycle
In :term:`graph` terminology,
a :term:`cycle` is a :term:`path` of length three or more that
connects some :term:`vertex` :math:`v_1` to itself.
cylinder
A :term:`disk drive` normally consists of a stack of
:term:`platters `.
While this might not be so true today, traditionally all of the
:term:`I/O heads ` moved together during a
:term:`seek` operation.
Thus, when a given I/O head is positioned over a particular
:term:`track` on a platter, the other I/O heads are also
positioned over the corresponding track on their platters.
That collection of tracks is called a cylinder.
A given cylinder represents all of the data that can be read
from all of the platters without doing another seek operation.
cylinder index
In the :term:`ISAM` system, a simple :term:`linear index` that
stores the lowest key value stored in each :term:`cylinder`.
cylinder overflow
In the :term:`ISAM` system, this is space reserved for storing
any records that can not fit in their respective
:term:`cylinder`.
DAG
Abbreviation for :term:`directed acyclic graph`.
data field
In :term:`object-oriented programming `,
a synonym for :term:`data member`.
data item
A piece of information or a record whose value is drawn from a type.
data member
The variables that together define the space required by a data
item are referred to as data members.
Some of the commonly used synonyms include :term:`data field`,
:term:`attribute`, and :term:`instance variable`.
data structure
The implementation for an :term:`ADT`.
data type
A type together with a collection of operations to manipulate
the type.
deallocated
deallocation
Free the memory allocated to an unused object.
decision tree
A theoretical construct for modeling the behavior of algorithms.
Each point at which the algorithm makes a decision (such as an
if statement) is modeled by a branch in the tree that represents
the algorithms behavior.
Decision trees can be used in
:term:`lower bounds proofs `,
such as the proof that sorting requires
:math:`\Omega(n \log n)` comparisons in the :term:`worst case`.
decision problem
A problem whose output is either "YES" or "NO".
deep copy
Copying the actual content of a :term:`pointee`.
degree
In :term:`graph` terminology, the degree for a :term:`vertex` is
its number of :term:`neighbors `.
In a :term:`directed graph`, the :term:`in degree` is the number
of edges directed into the vertex, and the :term:`out degree` is
the number of edges directed out of the vertex.
In :term:`tree` terminology, the degree for a :term:`node` is
its number of :term:`children `.
delegation mental model for recursion
A way of thinking about the process of :term:`recursion`.
The recursive function "delegates" most of the work when it
makes the recursive call.
The advantage of the delegation mental model for recursion is
that you don't need to think about how the delegated task is
performed.
It just gets done.
dense graph
A :term:`graph` where the actual number of :term:`edges `
is a large fraction of the possible number of edges.
Generally, this is interpreted to mean that the :term:`degree`
for any :term:`vertex` in the graph is relatively high.
depth
The depth of a node :math:`M` in a tree is the length
of the path from the root of the tree to :math:`M`.
depth-first search
A :term:`graph` :term:`traversal` algorithm.
Whenever a :math:`v` is :term:`visited ` during the
traversal, DFS will :term:`recursively ` visit all of
:math:`v` 's :term:`unvisited` :term:`neighbors `.
depth-first search tree
A :term:`tree` that can be defined by the operation of a
:term:`depth-first search` (DFS) on a :term:`graph`.
This tree would consist of the :term:`nodes ` of the graph
and a subset of the :term:`edges ` of the graph that was
followed during the DFS.
dequeue
A specialized term used to indicate removing an element from a queue.
dereference
Accessing the value of the :term:`pointee` for some
:term:`reference` variable.
Commonly, this happens in a language like Java when using the
"dot" operator to access some field of an object.
derivation
In formal languages, the process of executing a series of
:term:`production rules ` from a :term:`grammar`.
A typical example of a derivation would be the series of
productions executed to go from the :term:`start symbol` to a
given string.
descendant
In a tree, the set of all nodes that have a node :math:`A` as an
:term:`ancestor` are the descendants of :math:`A`.
In other words, all of the nodes that can be reached from
:math:`A` by progressing downwards in tree.
Another way to say it is: The
:term:`children ` of :math:`A`, their children, and so
on.
deserialization
The process of returning a :term:`serialized `
representation for a data structure back to its original
in-memory form.
design pattern
An abstraction for describing the design of programs,
that is, the interactions of objects and classes.
Experienced software designers learn and reuse patterns
for combining software components, and design patterns allow
this design knowledge to be passed on to new programmers more quickly.
deterministic
Any :term:`finite automata` in which, for every pair of
:term:`state` and symbol, there is only a single transition.
This means that whenever the machine is in a given state and
sees a given symbol, only a single thing can happen.
This is in contrast to a :term:`non-deterministic` finite
automata, which has at least one state with multiple transitions
on at least one symbol.
deterministic algorithm
An algorithm that does not involve any element of randomness,
and so its behavior on a given input will always be the same.
This is in contrast to a :term:`randomized algorithm`.
Deterministic Finite Automata
Deterministic Finite Acceptor
DFA
An :term:`automata` or abstract machine that can process an
input string (shown on a tape) from left to right.
There is a control unit (with :term:`states `),
behavior defined for what to do when in a given state and with a
given symbol on the current square of the tape.
All that we can "do" is change state before going to the next
letter to the right.
DFS
Abbreviation for :term:`depth-first search`.
diagonalization argument
A proof technique for proving that a set is
:term:`uncountably infinite`.
The approach is to show that, no matter what order the elements
of the set are put in, a new element of the set can be
constructed that is not in that ordering.
This is done by changing the :math:`i` th value or position of
the element to be different from that of the :math:`i` th
element in the proposed ordering.
dictionary
An abstract data type or :term:`interface` for a data structure or
software subsystem that supports insertion, search, and deletion
of records.
dictionary search
A close relative of an :term:`interpolation search`.
In a classical (paper) dictionary of words in a natural
language, there are markings for where in the dictionary the
words with a given letter start.
So in typical usage of such a dictionary, words are found by
opening the dictionary to some appropriate place within the
pages that contain words starting with that letter.
digraph
Abbreviation for :term:`directed graph`.
Dijkstra's algorithm
An algorithm to solve the
:term:`single-source shortest paths problem` in a :term:`graph`.
This is a :term:`greedy algorithm`.
It is nearly identical to :term:`Prim's algorithm` for finding a
:term:`minimal-cost spanning tree`, with the only difference
being the calculation done to update the best-known distance.
diminishing increment sort
Another name for :term:`Shellsort`.
direct access
A storage device, such as a disk drive, that has some ability to
move to a desired data location more-or-less directly.
This is in contrast to a :term:`sequential access` storage
device such as a tape drive.
direct proof
In general, a direct proof is just a "logical explanation".
A direct proof is sometimes referred to as an argument by deduction.
This is simply an argument in terms of logic.
Often written in English with words such as "if ... then",
it could also be written with logic notation such as
:math:`P \Rightarrow Q`.
directed acyclic graph
A :term:`graph` with no cycles.
Abbreviated as :term:`DAG`.
Note that a DAG is not necessarily a :term:`tree` since a given
:term:`node` might have multiple :term:`parents `.
directed edge
An :term:`edge` that goes from :term:`vertex` to another.
In contrast, an :term:`undirected edge` simply links to vertices
without a direction.
directed graph
A :term:`graph` whose :term:`edges ` each are directed
from one of its defining :term:`vertices ` to the
other.
dirty bit
Within a :term:`buffer pool`, a piece of information associated
with each :term:`buffer` that indicates whether the contents of
the buffer have changed since being read in from
:term:`backing storage`.
When the buffer is :term:`flushed ` from the buffer pool,
the buffer's contents must be written to the backing storage if
the dirty bit is set (that is, if the contents have changed).
This means that a relatively expensive write operation is
required.
In contrast, if the dirty bit is not set, then it is unnecessary
to write the contents to backing storage, thus saving time over
not keeping track of whether the contents have changed or not.
Discrete Fourier Transform
DFT
Let :math:`a = [a_0, a_1, ..., a_{n-1}]^T` be a vector that
stores the coefficients for a polynomial being evaluated.
We can then do the calculations to evaluate the polynomial at
the :math:`n` th :math:`roots of unity `
by multiplying the :math:`A_{z}`
matrix by the coefficient vector.
The resulting vector :math:`F_{z}` is called the
Discrete Fourier Transform (or DFT) for the polynomial.
discriminator
A part of a :term:`multi-dimensional search key`.
Certain tree data structures such as the :term:`bintree` and the
:term:`kd tree` operate by making branching decisions at nodes
of the tree based on a single attribute of the multi-dimensional
key, with the attribute determined by the level of the node in
the tree.
For example, in 2 dimensions, nodes at the odd levels in the
tree might branch based on the :math:`x` value of a coordinate,
while at the even levels the tree would branch based on the
:math:`y` value of the coordinate.
Thus, the :math:`x` coordinate is the discriminator for the odd
levels, while the :math:`y` coordinate is the discriminator for
the even levels.
disjoint
Two parts of a :term:`data structure` or two
collections with no objects in common are disjoint.
This term is often used in conjunction with a data structure
that has :term:`nodes ` (such as a :term:`tree`).
Also used in the context of :term:`sets `, where two
:term:`subsets ` are disjoint if they share no elements.
disjoint sets
A collection of :term:`sets `, any pair of which share no
elements in common.
A collection of disjoint sets partitions some objects
such that every object is in exactly one of the disjoint sets.
disk-based space/time tradeoff
In contrast to the standard :term:`space/time tradeoff`, this
principle states that the smaller you can make your disk storage
requirements, the faster your program will run.
This is because the time to read information from disk is
enormous compared to computation time, so almost any amount of
additional computation needed to unpack the data is going to be
less than the disk-reading time saved by reducing the storage
requirements.
disk controller
The control mechanism for a :term:`disk drive`.
Responsible for the action of reading or writing a :term:`sector`
of data.
disk drive
An example of :term:`peripheral storage` or
:term:`secondary storage`.
Data access times are typically measured in thousandths of a
second (milliseconds), which
is roughly a million times slower than access times for
:term:`RAM`, which is an example of a :term:`primary storage`
device.
Reads from and writes to a disk drive are always done in terms
of some minimum size, which is typically called a
:term:`block`.
The block size is 512 bytes on most disk drives.
Disk drives and RAM are typical parts of a computer's
:term:`memory hierarchy`.
disk access
The act of reading data from a disk drive (or other form of
:term:`peripheral storage`).
The number of times data must be read from (or written to) a
disk is often a good measure of cost for an algorithm that
involves disk I/O, since this is usually the dominant cost.
disk I/O
Refers to the act of reading data from or writing data to a
:term:`disk drive`.
All disk reads and writes are done in units of a :term:`sector`
or :term:`block`.
distance
In :term:`graph` representations, a synonym for :term:`weight`.
divide and conquer
A technique for designing algorithms where a solution is found
by breaking the problem into smaller (similar) subproblems,
solving the subproblems, then combining the subproblem solutions
to form the solution to the original problem.
This process is often implemented using :term:`recursion`.
divide-and-conquer recurrences
A common form of :term:`recurrence relation`
that have the form
.. math::
{\bf T}(n) = a{\bf T}(n/b) + cn^k; \quad {\bf T}(1) = c
where :math:`a`, :math:`b`, :math:`c`, and :math:`k` are constants.
In general, this recurrence describes a problem of size :math:`n`
divided into :math:`a` subproblems of size :math:`n/b`,
while :math:`cn^k` is the amount of work necessary to combine the
partial solutions.
divide-and-guess
A technique for finding a :term:`closed-form solution` to a
:term:`summation` or :term:`recurrence relation`.
domain
The set of possible inputs to a function.
double buffering
The idea of using multiple :term:`buffers ` to allow the
:term:`CPU` to operate in parallel with a
:term:`peripheral storage` device.
Once the first buffer's worth of data has been read in, the CPU
can process this while the next block of data is being
read from the peripheral storage.
For this idea to work, the next block of data to be processed
must be known or predicted with reasonable accuracy.
double hashing
A :term:`collision resolution` method. A second hash
function is used to generate a value :math:`c` on the key.
That value is then used by this key as the step size in
:term:`linear probing by steps`.
Since different keys use different step sizes (as generated by
the second hash function), this process avoids the clustering
caused by standard linear probing by steps.
double rotation
A type of :term:`rebalancing operation` used by the
:term:`Splay Tree` and :term:`AVL Tree`.
doubly linked list
A :term:`linked list` implementation variant where each list
node contains access pointers to both the previous element and
the next element on the list.
DSA
Abbreviation for Data Structures and Algorithms.
dynamic
Something that is changes (in contrast to :term:`static`).
In computer programming, dynamic normally refers to something
that happens at run time.
For example, run-time analysis is analysis of the program's
behavior, as opposed to its (static) text or structure
Dynamic binding or dynamic memory allocation occurs at run time.
dynamic allocation
The act of creating an object from :term:`free store`.
In C++, Java, and JavaScript, this is done using the ``new``
operator.
dynamic array
Arrays, once allocated, are of fixed size. A dynamic array puts
an :term:`interface` around the array so as to appear to allow the array
to grow and shrink in size as necessary. Typically this is done
by allocating a new copy, copying the contents of the old array,
and then returning the old array to :term:`free store`.
If done correctly, the :term:`amortized cost` for dynamically
resizing the array can be made constant.
In some programming languages such as Java, the term
:term:`vector` is used as a synonym for dynamic array.
dynamic memory allocation
A programming technique where linked objects in a data structure
are created from :term:`free store` as needed. When no longer
needed, the object is either returned to :term:`free store` or
left as :term:`garbage`, depending on the programming language.
dynamic programming
An approach to designing algorithms that works by storing a table
of results for subproblems.
A typical cause for excessive cost in
:term:`recursive `
algorithms is that different branches of the recursion might
solve the same subproblem.
Dynamic programming uses a table to store information about
which subproblems have already been solved, and uses the stored
information to immediately give the answer for any repeated
attempts to solve that subproblem.
edge
The connection that links two :term:`nodes ` in a
:term:`tree`, :term:`linked list`, or :term:`graph`.
edit distance
Given strings :math:`S` and :math:`T`, the edit distance is
a measure for the number of editing steps required to convert
:math:`S` into :math:`T`.
efficient
A solution is said to be efficient
if it solves the problem within the required
:term:`resource constraints`.
A solution is sometimes said to be
efficient if it requires fewer resources than known
alternatives, regardless of whether it meets any particular
requirements.
element
One value or member in a set.
empirical comparison
An approach to comparing to things by actually seeing how they
perform.
Most typically, we are referring to the comparison of two
programs by running each on a suite of test data and measuring
the actual running times.
Empirical comparison is subject to many possible complications,
including unfair selection of test data, and inaccuracies in the
time measurements due to variations in the computing environment
between various executions of the programs.
empty
For a :term:`container` class, the state of containing no
:term:`elements `.
encapsulation
In programming, the concept of hiding implementation details
from the user of an ADT, and protecting
:term:`data members ` of an
object from outside access.
enqueue
A specialized term used to indicate inserting an element onto a queue.
entry-sequenced file
A file that stores records in the order that they were added to
the file.
enumeration
The process by which a :term:`traversal` lists every object in
the :term:`container` exactly once.
Thus, a traversal that prints the :term:`nodes ` is said
to enumerate the nodes.
An enumeration can also refer to the actual listing that is
produced by the traversal
(as well as the process that created that listing).
equidistribution property
In random number theory, this means that a given series of
random numbers cannot be described more briefly than simply
listing it out.
equivalence class
An :term:`equivalence relation` can be used to partition a set
into equivalence classes.
equivalence relation
Relation :math:`R` is an equivalence relation on set
:math:`\mathbf{S}` if it is :term:`reflexive`,
:term:`symmetric`, and :term:`transitive`.
estimation
As a technical skill, this is the process of generating a rough
estimate in order to evaluate the feasibility of a proposed
solution.
This is sometimes known as "back of the napkin" or
"back of the envelope" calculation.
The estimation process can be formalized as (1) determine the
major parameters that affect the problem, (2) derive an equation
that relates the parameters to the problem, then (3) select
values for the parameters and apply the equation to yield an
estimated solution.
evaluation
The act of finding the value for a polynomial at a given point.
exact-match query
Records are accessed by unique identifier.
exceptions
Exceptions are techniques used to predict possible runtime
errors and handle them properly.
exchange
A swap of adjacent records in an :term:`array`.
exchange sort
A sort that relies solely on exchanges (swaps of adjacent
records) to reorder the list.
:term:`Insertion Sort ` and
:term:`Bubble Sort` are examples of exchange sorts.
All exchange sorts require
:math:`\Theta(n^2)` time in the :term:`worst case`.
expanding the recurrence
A technique for solving a :term:`recurrence relation`.
The idea is to replace the recursive part of the recurrence with
a copy of recurrence.
exponential growth rate
A :term:`growth rate` function where :math:`n` (the input size)
appears in the exponent. For example, :math:`2^n`.
expression tree
A :term:`tree` structure meant to represent a mathematical expression.
:term:`Internal nodes ` of the expression tree
are operators in the expression, with the subtrees being the
sub-expressions that are its operand.
All :term:`leaf nodes ` are operands.
extent
A physically contiguous block of :term:`sectors ` on a
:term:`disk drive` that are all part of a given disk file.
The fewer extents needed to store the data for a disk file,
generally the fewer :term:`seek` operations that will be
required to process a series of :term:`disk access` operations
on that file.
external fragmentation
A condition that arises when a series of
:term:`memory requests `
result in lots of small :term:`free blocks `, no one
of which is useful for servicing typical requests.
external sort
A sorting algorithm that is applied to data stored in
:term:`peripheral storage` such as on a :term:`disk drive`.
This is in contrast to an :term:`internal sort` that works on
data stored in :term:`main memory`.
factorial
The factorial function is defined as :math:`f(n) = n f(n-1)` for
:math:`n > 0`.
failure policy
In a :term:`memory manager`, a failure policy is the response
that takes place when there is no way to satisfy a
:term:`memory request` from the current
:term:`free blocks ` in the :term:`memory pool`.
Possibilities include rejecting the request, expanding the
memory pool, collecting :term:`garbage`, and reorganizing the
memory pool (to collect together free space).
family of languages
Given some class or type of :term:`finite automata`
(for example, the :term:`deterministic finite automata`),
the set of languages accepted by that class of finite automata
is called a family.
For example, the :term:`regular languages ` is
a family defined by the DFAs.
file allocation table
A legacy file system architecture orginially developed for DOS
and then used in Windows.
It is still in use in many small-scale peripheral devices such
as USB memory sticks and digital camera memory.
file manager
A part of the :term:`operating system`
responsible for taking requests for data from a
:term:`logical file` and mapping those requests to the
physical location of the data on disk.
file processing
The domain with Computer Science that deals with processing data
stored on a :term:`disk drive` (in a file), or more broadly,
dealing with data stored on any :term:`peripheral storage`
device.
Two fundamental properties make dealing with data on a
peripheral device different from dealing with data in main
memory:
(1) Reading/writing data on a peripheral storage device is far
slower than reading/writing data to main memory (for example, a
typical disk drive is about a million times slower than
:term:`RAM`).
(2) All I/O to a peripheral device is typically in terms of a
:term:`block` of data (for example, nearly all disk drives do
all I/O in terms of blocks of 512 bytes).
file structure
The organization of data on :term:`peripheral storage`,
such as a :term:`disk drive` or DVD drive.
FIFO
Abbreviation for "first-in, first-out".
This is the access paradigm for a :term:`queue`,
and an old terminolgy for the queue is "FIFO list".
final state
A required element of any :term:`acceptor `.
When computation on a string ends in a final state, then the
machine accepts the string.
Otherwise the machine rejects the string.
FIND
One half of the :term:`UNION/FIND` algorithm for managing
:term:`disjoint sets`.
It is the process of moving upwards in a
tree to find the tree's root.
Finite State Machine
FSM
Finite State Automata
FSA
Finite Automata
Any abstract state machine, generally represented as a graph
where the nodes are the :term:`states `, and the edges
represent transitions between nodes that take place when the
machine is in that node (state) and sees an appropriate input.
See, as an example, :term:`Deterministic Finite Automata`.
Finite State Acceptor
A simple type of :term:`finite state automata`, an acceptor's
only ability is to accept or reject a string.
So, a finite state acceptor does not have the ability to modify
the input tape.
If computation on the string ends in a :term:`final state`,
then the the string is accepted, otherwise it is rejected.
first fit
In a :term:`memory manager`, first fit is a :term:`heuristic`
for deciding which :term:`free block` to use when allocating
memory from a :term:`memory pool`.
First fit will always allocate the first :term:`free block` on
the :term:`free block list` that is large enough to service the
memory request.
The advantage of this approach is that it is typically not
necessary to look at all free blocks on the free block list to
find a suitable free block.
The disadvantage is that it is not "intelligently" selecting
what might be a better choice of free block.
fixed-length coding
Given a collection of objects, a fixed-length coding scheme
assigns a code to each object in the collection using codes that
are all of the same length.
Standard ASCII and Unicode representations for characters are
both examples of fixed-length coding schemes.
This is in contrast to :term:`variable-length coding`.
floor
Written :math:`\lfloor x \rfloor`, for real value :math:`x` the
floor is the greatest integer :math:`\leq x`.
flush
The act of removing data from a :term:`cache `, most
typically because other data considered of higher future value
must replace it in the cache.
If the data being flushed has been modified since it was first
read in from :term:`secondary storage` (and the changes are
meant to be saved), then it must be written back to that
secondary storage.
Floyd's algorithm
An algorithm to solve the
:term:`all-pairs shortest paths problem`.
It uses the :term:`dynamic programming` algorithmic technique,
and runs in :math:`\Theta(n^3)` time.
As with any :term:`dynamic programming` algorithm,
the key issue is to avoid duplicating work by using proper
bookkeeping on the algorithm's progress through the solution space.
The basic idea is to first find all the direct edge costs, then
improving those costs by allowing paths through :term:`vertex`
0, then the cheapest paths involving paths going through
vertices 0 and 1, and so on.
flush
The the context of a :term:`buffer pool`, the process of
removing the contents stored in a :term:`buffer`
when that buffer is required in order to store new data.
If the buffer's contents have been changed since having been
read in from :term:`backing storage` (this fact would
normally be tracked by using a :term:`dirty bit`),
then they must be copied back to the backing storage before the
buffer can be reused.
flyweight
A :term:`design pattern` that is meant to solve the following
problem:
You have an application with many objects.
Some of these objects are identical in the information that
they contain, and the role that they play.
But they must be reached from various places, and conceptually they
really are distinct objects.
Because there is so much duplication of the same information,
we want to reduce memory cost by sharing that space.
For example, in document layout,
the letter "C" might be represented by an object that
describes that character's strokes and bounding box.
However, we do not want to create a separate "C" object everywhere
in the document that a "C" appears.
The solution is to allocate a single copy of the shared representation
for "C" objects.
Then, every place in the document that needs a "C" in a given font,
size, and typeface will reference this single copy.
The various instances of :term:`references ` to a
specific form of "C" are called flyweights.
Flyweights can also be used to implement the empty leaf nodes
of the :term:`bintree` and :term:`PR quadtree`.
folding method
In :term:`hashing`, an approach to implementing a
:term:`hash function`.
Most typically used when the key is a string, the folding method
breaks the string into pieces (perhaps each letter is a piece,
or a small series of letters is a piece), converts the letter(s)
to an integer value (typically by using its underlying encoding
value), and summing up the pieces.
Ford and Johnson sort
A sorting algorithm that is close to the theoretical minimum
number of key comparisons necessary to sort.
Generally not considered practical in practice due to the fact
that it is not efficient in terms of the number of records that
need to be moved.
It consists of first sorting pairs of nodes into winners and
losers (of the pairs comparisons), then (recursively)
sorting the winners of the pairs, and then finally carefully
selecting the order in which the losers are added to the chain
of sorted items.
forest
A collection of one or more :term:`trees `.
free block
A block of unused space in a :term:`memory pool`.
free block list
In a :term:`memory manager`, the list that stores the necessary
information about the current :term:`free blocks `.
Generally, this is done with some sort of :term:`linked list`,
where each node of the linked list indicates the start position
and length of the free block in the :term:`memory pool`.
free store
Space available to a program during runtime to be used for
:term:`dynamic allocation` of objects.
The free store is distinct from the :term:`runtime stack`.
The free store is sometimes referred to as the :term:`heap`,
which can be confusing because :term:`heap` more often refers to
a specific data structure. Most programming languages provide
functions to allocate (and maybe to deallocate) objects from the
free store, such as ``new`` in C++ and Java.
freelist
A simple and faster alternative to using :term:`free store` when
the objects being dynamically allocated are all of the same size
(and thus are interchangeable).
Typically implemented as a :term:`linked stack`, released
objects are put on the front of the freelist.
When a request is made to allocate an object, the freelist is
checked first and it provides the object if possible.
If the freelist is empty, then a new object is allocated from
:term:`free store`.
free tree
A connected, :term:`undirected graph` with no simple cycles.
An equivalent definition is that a free tree is connected and
has :math:`|\mathbf{V}| - 1` edges.
frequency count
A :term:`heuristic` used to maintain a
:term:`self-organizing list`.
Under this heuristic, a count is maintained for every record.
When a record access is made, its count is increased.
If this makes its count greater than that of another record in
the list, it moves up toward the front of the list accordingly
so as to keep the list sorted by frequency.
Analogous to the :term:`least frequently used` heuristic for
maintaining a :term:`buffer pool`.
full binary tree theorem
This theorem states that
the number of leaves in a non-empty full binary tree is one
more than the number of internal nodes.
Equivalently, then number of null pointers in a standard
:term:`pointer-based implementation for binary tree nodes`
is one more than the number of nodes in the binary tree.
full tree
A :term:`binary tree` is full if every :term:`node` is either a
:term:`leaf node` or else it is an :term:`internal node` with
two non-empty :term:`children `.
function
In mathematics, a matching between inputs (the :term:`domain`)
and outputs (the :term:`range`).
In programming, a subroutine that takes input parameters and
uses them to compute and return a value.
In this case, it is usually considered bad practice for a
function to change any global variables
(doing so is called a side effect).
garbage
In :term:`memory management `,
any memory that was previously (dynamically)
allocated by the program during runtime, but which is no longer
accessible since all pointers to the memory have been deleted or
overwritten.
In some languages, garbage can be recovered by
:term:`garbage collection`.
In languages such as C and C++ that do not support garbage
collection, so creating garbage is considered a
:term:`memory leak`.
garbage collection
Languages with garbage collection such
Java, JavaScript, Lisp, and Scheme will periodically reclaim
:term:`garbage` and return it to :term:`free store`.
general tree
A tree in which any given node can have any number of
:term:`children `.
This is in contrast to, for example, a :term:`binary tree` where
each node has a fixed number of children (some of which might be
``null``).
General tree nodes tend to be harder to implement for this reason.
grammar
A formal definition for what strings make up a :term:`language`,
in terms of a set of :term:`production rules `.
graph
A :term:`graph` :math:`\mathbf{G} = (\mathbf{V}, \mathbf{E})`
consists of a set of :term:`vertices `
:math:`\mathbf{V}` and a set of :term:`edges `
:math:`\mathbf{E}`, such that each edge in :math:`\mathbf{E}` is
a connection between a pair of vertices in :math:`\mathbf{V}`.
greedy algorithm
An algorithm that makes locally optimal choices at each step.
growth rate
In :term:`algorithm analysis`, the rate at which the cost
of the :term:`algorithm` grows as the size of its input grows.
guess-and-test
A technique used when trying to determine the
:term:`closed-form solution` for a
:term:`summation` or :term:`recurrence relation`.
Given a hypothesis for the closed-form solution,
if it is correct, then it is often relatively easy to prove that
using :term:`induction `.
guided traversal
A :term:`tree traversal` that does not need to visit every node
in the tree.
An example would be a :term:`range query` in a :term:`BST`.
halt state
In a :term:`finite automata`, a designated :term:`state` which
causes the machine to immediately halt when it is entered.
halted configuration
A halted configuration occurs in a :term:`Turing machine` when
the machine transitions into the :term:`halt state`.
halting problem
The halting problem is to answer this question:
Given a computer program :math:`P` and an
input :math:`I`, will program :math:`P` halt when executed on
input :math:`I`?
This problem has been proved impossible to solve in the general
case.
Thus, it is an example of an :term:`unsolveable problem`.
handle
When using a :term:`memory manager` to store data, the
:term:`client` will pass data to be stored
(the :term:`message`) to the memory manager, and the memory
manager will return to the client a handle.
The handle encodes the necessary information that the memory
manager can later use to recover and return the message to the
client.
This is typically the location and length of the message within
the :term:`memory pool`.
hanging configuration
A hanging configuration occurs in a :term:`Turing machine` when
the I/O head moves to the left from the left-most square of the
tape, or when the machine goes into an infinite loop.
hard algorithm
"Hard" is traditionally defined in relation to running time, and
a "hard" algorithm is defined to be an algorithm with exponential
running time.
hard problem
"Hard" is traditionally defined in relation to running time, and
a "hard" problem is defined to be one whose best known algorithm
requires exponential running time.
harmonic series
The sum of reciprocals from 1 to :math:`n` is called the
Harmonic Series, and is written :math:`{\cal H}_n`.
This sum has a value between :math:`\log_e n` and
:math:`\log_e n + 1`.
hash function
In a :term:`hash system`, the function that converts a
:term:`key` value to a position in the :term:`hash table`.
The hope is that this position in the hash table contains the
record that matches the key value.
hash system
The implementation for search based on hash lookup in a
:term:`hash table`.
The :term:`search key` is processed by a
:term:`hash function`, which returns a position in a
:term:`hash table`, which hopefully is the correct position in
which to find the record corresponding to the search key.
hash table
The data structure (usually an :term:`array`) that stores data
records for lookup using :term:`hashing`.
hashing
A search method that uses a :term:`hash function` to convert a
:term:`search key` value into a position within a
:term:`hash table`.
In a properly implemented :term:`hash system`, that position in
the table will have high probability of containing the record
that matches the key value.
Sometimes, the hash function will return an position that does
not store the desired key, due to a process called
:term:`collision`.
In that case, the desired record is found through a process
known as :term:`collision resolution`.
head
The beginning of a :term:`list`.
header node
Commonly used in implementations for a :term:`linked list` or
related structure, this :term:`node` preceeds the first element
of the list.
Its purpose is to simplify the code implementation by
reducing the number of special cases that must be programmed
for.
heap
This term has two different meanings.
Uncommonly, it is a synonym for :term:`free store`.
Most often it is used to refer to a particular data structure.
This data structure is a :term:`complete binary tree` with the
requirement that every :term:`node` has a value greater than its
:term:`children ` (called a :term:`max heap`), or else
the requirement that every node has a value less than its
children (called a :term:`min heap`).
Since it is a complete binary tree, a heap is nearly always
implemented using an :term:`array` rather than an explicit tree
structure.
To add a new value to a heap, or to remove the extreme value
(the max value in a max-heap or min value in a min-heap) and
update the heap,
takes :math:`\Theta(\log n)` time in the :term:`worst case`.
However, if given all of the values in an unordered array,
the values can be re-arranged to form a heap in only
:math:`\Theta(n)` time.
Due to its space and time efficiency, the heap is a
popular choice for implementing a :term:`priority queue`.
Heapsort
A sorting algorithm that costs :math:`\Theta(n \log n)` time in
the :term`best `, :term:`average `,
and :term:`worst ` cases.
It tends to be slower than :term:`Mergesort` and
:term:`Quicksort`.
It works by building a :term:`max heap`, and
then repeatedly removing the item with maximum :term:`key` value
(moving it to the end of the heap) until all elements have been
removed (and replaced at their proper location in the array).
height
The height of a tree is one more than the :term:`depth` of the
deepest :term:`node` in the tree.
height balanced
The condition the :term:`depths ` of each :term:`subtree`
in a tree are roughly the same.
heuristic
A way to solve a problem that is not guarenteed to be optimal.
While it might not be guarenteed to be optimal, it is generally
expected (by the agent employing the heuristic) to provide a
reasonably efficient solution.
heuristic algorithm
A type of :term:`approximation algorithm`, that uses a
:term:`heuristic` to find a good, but not necessarily cheapest,
solution to an :term:`optimization problem`.
home position
In :term:`hashing`, a synonym for :term:`home slot`.
home slot
In :term:`hashing`, this is the :term:`slot` in the
:term:`hash table` determined for a given key by the
:term:`hash function`.
homogeneity
In a :term:`container` class, this is the property that all
objects stored in the ncontainer are of the same class.
For example, if you have a list intended to store Payroll
records, is it possible for the programmer to insert an integer
onto the list instead?
Huffman coding tree
A Huffman coding tree is a :term:`full binary tree `
that is used to represent letters (or other symbols)
efficiently.
Each letter is associated with a node in the tree, and is then
given a :term:`Huffman code ` based on the
position of the associated node.
A Huffman coding tree is an example of a binary :term:`trie`.
Huffman codes
The codes given to a collection of letters (or other symbols)
through the process of Huffman coding.
Huffman coding uses a :term:`Huffman coding tree` to generate
the codes.
The codes can be of variable length, such that the letters which
are expected to appear most frequently are shorter.
Huffman coding is optimal whenever the true frequencies are
known, and the frequency of a letter is independent of the
context of that letter in the message.
Huffman tree
Shorter form of the term :term:`Huffman coding tree`.
in degree
In :term:`graph` terminology, the in degree for a :term:`vertex` is
the number of edges directed into the vertex.
information theoretic lower bound
A :term:`lower bound` on the amount of resources needed to solve
a :term:`problem` based on the number of bits of information
needed to uniquely specify the answer.
Sometimes referred to as a "Shannon theoretic lower bound" due
to Shannon's work on information theory and entropy.
An example is that sorting has a lower bound of
:math:`\Omega(\log_2 n!)` because there are :math:`n!` possible
orderings for :math:`n` values.
This observation alone does not make the lower bound tight,
because it is possible that no algorithm could actually reach
the information theory lower limit.
inode
Short for "index node".
In UNIX-style file systems, specific disk :term:`sectors `
that hold indexing information to define the layout of the file
system.
image-space decomposition
A from of :term:`key-space decomposition` where the
:term:`key space` splitting points is predetermined (typically
by splitting in half).
For example, a :term:`Huffman coding tree` splits the letters
being coded into those with codes that start with 0 on the left
side, and those with codes that start with 1 on the right side.
This regular decomposition of the key space is the basis for a
:term:`trie` data structure.
An image-space decomposition is in opposition to an
:term:`object-space decomposition`.
incident
In :term:`graph` terminology,
an edge connecting two vertices is said to be incident with
those vertices.
The two vertices are said to be :term:`adjacent`.
index file
A file whose records consist of
:term:`key-value pairs ` where the
pointers are referencing the complete records stored in another
file.
indexing
The process of associating a :term:`search key` with the
location of a corresponding data record.
The two defining points to the concept of an index is the
association of a key with a record, and the fact that the index
does not actually store the record itself but rather it stores a
:term:`reference` to the record.
In this way, a collection of records can be supported by
multiple indices, typically a separate index for each key field
in the record.
induction hypothesis
The key assumption used in a :term:`proof by induction`,
that the theorem to be proved holds for smaller instances of the
theorem.
The induction hypothesis is equivalent to the
:term:`recursive `
call in a recursive function.
induction step
Part of a :term:`proof by induction`.
In its simplest form, this is a proof of the implication that if
the theorem holds for $n-1$, then it holds for $n$.
As an alternative, see :term:`strong induction`.
induction variable
The variable used to parameterize the theorem being proved by
induction.
For example, if we seek to prove that the sum of the integers
from 1 to $n$ is $n(n+1)/2$, then $n$ is the induction
variable.
An induction variable must be an integer.
inherit
In :term:`object-oriented programming `,
the process by which a :term:`subclass` gains
:term:`data members ` and :term:`methods `
from a :term:`base class`.
initial state
A synonym for :term:`start state`.
inorder traversal
In a :term:`binary tree`, a :term:`traversal` that first
:term:`recursively ` :term:`visits ` the left
:term:`child`, then visits the :term:`root`,
an then recursively visits the right child.
In a :term:`binary search tree`, this traversal will
:term:`enumerate ` the nodes in sorted order.
Insertion Sort
A sorting algorithm with :math:`\Theta(n^2)`
:term`average ` and :term:`worst case` cost,
and :math:`Theta(n)` :term:`best case` cost.
This best case cost makes it useful when we have reason to
expect the input to be nearly sorted.
instance variable
In :term:`object-oriented programming `,
a synonym for :term:`data member`.
integer function
Any function whose input is an integer and whose output is an
integer. It can be proved by
:term:`diagonalization ` that the
set of integer functions is :term:`uncountably infinite`.
inter-sector gap
On a disk drive, a physical gap in the data that occurs between
the :term:`sectors `.
This allows the :term:`I/O head` detect the end of the sector.
interface
An interface is a class-like structure that only contains method
signatures and fields. An interface does not contain an implementation
of the methods or any :term:`data members `.
intermediate code
A step in a typical :term:`compiler` is to transform the
original high-level language into a form on which it is easier
to do other stages of the process.
For example, some compilers will transform the original
high-level source code into :term:`assembly code` on which it
can do code optimization, before translating it into its final
executable form.
intermediate code generation
A phase in a :term:`compiler`, that walks through a
:term:`parse tree` to produce simple :term:`assembly code`.
internal fragmentation
A condition that occurs when more than :math:`m` bytes
are allocated to service a :term:`memory request` for :math:`m`
bytes, wasting free storage.
This is often done to simplify
:term:`memory management `.
internal node
In a tree, any node that has at least one non-empty
:term:`child` is an internal node.
internal sort
A sorting algorithm that is applied to data stored in
:term:`main memory`.
This is in contrast to an :term:`external sort` that is meant to
work on data stored in
:term:`peripheral storage` such as on a :term:`disk drive`.
interpolation
The act of finding the coefficients of a polynomial, given the
values at some points.
A polynomal of degree :math:`n-1` requires :math:`n` points to
interpolate the coefficients.
interpolation search
Given a sorted array, and knowing the first and last :term:`key`
values stored in some subarray known to contain
:term:`search key` :math:`K`, interpolation search will compute
the expected location of :math:`K` in the subarray as a fraction
of the distance between the known key values.
So it will next check that computed location, thus narrowing the
search for the next iteration.
Given reasonable key value distribution, the :term:`average
case` for interpolation search will be
:math:`\Theta(\log \log n)`, or better than the expected cost of
:term:`binary search`.
Nonetheless, binary search is expected to be faster in nearly
all practical situations due to the small difference between the
two costs, combined with the higher constant factors required to
implement interpolation search as compared to binary search.
interpreter
In contrast to a :term:`compiler` that translates a high-level
program into something that can be repeatedly executed to
perform a computation, an interpreter directly performs
computation on the high-level langauge.
This tends to make the computation much slower than if it were
performed on the directly executable version produced by a
compiler.
inversion
A measure of how disordered a series of values is. For each
element :math:`X` in the series, count one inversion for each
element to left of :math:`X` that is greater than the value of
:math:`X` (and so must ultimately be moved to the right of
:math:`X` during a sorting process).
inverted list
An :term:`index ` which links
:term:`secondary keys ` to either the associated
:term:`primary key` or the actual record in the database.
inverted file
Synonym for :term:`inverted list` when the inverted list is
stored in a disk file.
I/O head
On a :term:`disk drive` (or similar device), the part of the
machinery that actually reads data from the disk.
irreflexive
In set notation, binary relation :math:`R` on set :math:`S` is
irreflexive if :math:`aRa` is never in the relation for
any :math:`a \in \mathbf{S}`.
ISAM
Indexed Sequential Access Method: an obsolete method for
indexing data for (at the time) fast retrieval. More generally,
the term is used also to generically refer to an
:term:`index ` that supports both sequential and
:term:`keyed ` access to data records.
Today, that would nearly always be implemented using a
:term:`B-Tree`.
iterator
In a :term:`container` such as a List, a separate class that
indicates position within the container, with support for
:term:`traversing ` through all
:term:`elements ` in the container.
job
Common name for processes or tasks to be run by an operating
system.
They typically need to be processed in order of
importance, and so are kept organized by a
:term:`priority queue`.
Another common use for this term is for a collection of tasks to
be ordered by a :term:`topological sort`.
jump search
An algorithm for searching a sorted list, that falls between
:term:`sequential search` and :term:`binary search` in both
computational cost and conceptual complexity.
The idea is to keep jumping by some fixed number of positions
until a value is found that is bigger than :term:`search key`
:math:`K`, then do a sequential search over the subarray that is
now known to contain the search key.
The optimal number of steps to jump will be :math:`\sqrt{n}` for
an array of size :math:`n`, and the :term:`worst case` cost will
be :math:`\Theta(\sqrt{n})`.
K-ary tree
A type of :term:`full tree` where every internal node has
exactly :math:`K` :term:`children `.
k-path
In :term:`Floyd's algorithm`, a k-path is a path between two
vertices :math:`i` and :math:`j` that can only go through
vertices with an index value less than or equal to :math:`k`.
kd tree
A :term:`spatial data structure` that uses a binary tree to
store a collection of data records based on their (point)
location in space.
It uses the concept of a :term:`discriminator` at each level to
decide which single component of the
:term:`multi-dimensional search key` to branch on at that level.
It uses a :term:`key-space decomposition`, meaning that all data
records in the left subtree of a node have a value on the
corresponding discriminator that is less than that of the node,
while all data records in the right subtree have a greater
value.
The :term:`bintree` is the :term:`image-space decomposition`
analog of the kd tree.
key
A field or part of a larger record used to represent that record
for the purpose of searching or comparing.
Another term for :term:`search key`.
key sort
Any sorting operation applied to a collection of
:term:`key-value pairs ` where the value in this
case is a :term:`reference` to a complete record (that is, a
pointer to the record in memory or a position for a record on
disk).
This is in contrast to a sorting operation that works directly
on a collection of records.
The intention is that the collection of key-value pairs is far
smaller than the collection of records themselves.
As such, this might allow for an :term:`internal sort` when
sorting the records directly would require an :term:`external
sort`.
The collection of key-value pairs can also act as an
:term:`index `.
key space
The range of values that a :term:`key` value may take on.
key-space decomposition
The idea that the range for a :term:`search key` will be split
into pieces.
There are two general approaches to this:
:term:`object-space decomposition` and
:term:`image-space decomposition`.
key-value pair
A standard solution for solving the problem of how to relate a
:term:`key` value to a record (or how to find the key for a
given record) within the context of a particular
:term:`index `.
The idea is to simply store as records in the index pairs of
keys and records.
Specifically, the index will typically store a copy of the key
along with a :term:`reference` to the record.
The other standard solution to this problem is to pass a
:term:`comparator` function to the index.
knapsack problem
While there are many variations of this problem, here is a
typical version: Given knapsack of a fixed size, and a
collection of objects of various sizes, is there a subset of the
objects that exactly fits into the knapsack?
This problem is known to be :term:`NP-complete`, but can be
solved for problem instances in practical time relatively
quickly using :term:`dynamic programming`.
Thus, it is considered to have
:term:`pseudo-polynomial ` cost.
An :term:`optimization problem` version is to find the subset
that can fit with the greatest amount of items, either in terms of
their total size, or in terms of the sum of values associated
with each item.
Kruskal's algorithm
An algorithm for computing the :term:`MCST` of a
:term:`graph`.
During processing, it makes use of the :term:`UNION/FIND`
process to efficiently determine of two vertices are within the
same :term:`subgraph`.
LFU
Abbreviation for :term:`least frequently used`.
LIFO
Abbreviation for "Last-In, First-Out".
This is the access paradigm for a :term:`stack`,
and an old terminolgy for the stack is "LIFO list".
LRU
Abbreviation for :term:`least recently used`.
labeled graph
A :term:`graph` with labels associated with the
:term:`nodes `.
language
A set of strings.
Las Vegas algorithms
A form of :term:`randomized algorithm`.
We always find the maximum value, and "usually" we find it fast.
Such algorithms have a guaranteed result, but do not guarantee fast
running time.
leaf node
In a :term:`binary tree`, leaf node is any node that has two
empty :term:`children `.
(Note that a binary tree is defined so that every
node has two children, and that is why the leaf node has to have
two empty children, rather than no children.)
In a general tree, any node is a leaf node if it has no children.
least frequently used
Abbreviated :term:`LFU`, it is a :term:`heuristic` that can be
used to decide which :term:`buffer` in a :term:`buffer pool`
to :term:`flush` when data in the buffer pool must be
replaced by new data being read into a
:term:`cache `.
However, :term:`least recently used` is more popular than LFU.
Analogous to the :term:`frequency count` heuristic for
maintaining a :term:`self-organizing list`.
least recently used
Abbreviated :term:`LRU`, it is a popular :term:`heuristic` to
use for deciding which :term:`buffer` in a :term:`buffer pool`
to :term:`flush` when data in the buffer pool must be
replaced by new data being read into a :term:`cache
`.
Analogous to the :term:`move-to-front` heuristic for
maintaining a :term:`self-organizing list`.
left recursive
In automata theory, a :term:`production` is left recursive
if it is of the form :math:`A \rightarrow Ax`,
:math:`A \in V, x \in (V \cup T)^*` where :math:`V` is the set
of :term:`non-terminals ` and :math:`T` is the set
of :term:`terminals ` in the :term:`grammar`.
length
In a :term:`list`, the number of elements. In a string, the
number of characters.
level
In a tree, all nodes of :term:`depth` :math:`d` are at
level :math:`d` in the tree.
The root is the only node at level 0, and its depth is 0.
lexical analysis
A phase of a :term:`compiler` or :term:`interpreter` responsible
for reading in characters of the program or language and grouping
them into :term:`tokens `.
lexical scoping
Within programming languages, the convention of allowing access
to a variable only within the block of code in which the
variable is defined.
A synonym for :term:`static scoping`.
lifetime
For a variable, lifetime is the amount of time it will exist
before it is destroyed.
linear congruential method
In random number theory, a process for computing the next number
in a :term:`pseudo-random ` sequence.
Starting from a :term:`seed`, the next term :math:`r(i)` in the
series is calculated from term :math:`r(i-1)` by the equation
.. math::
r(i) = (r(i-1)\times b) \bmod t
where :math:`b` and :math:`t` are constants.
These constants must be well chosen for the resulting series of
numbers to have desireable properties as a random number sequence.
linear growth rate
For input size :math:`n`, a growth rate of :math:`cn` (for
:math:`c` any positive constant).
In other words, the cost of
the associated function is linear on the input size.
linear index
A form of :term:`indexing` that stores
:term:`key-value pairs ` in a sorted array.
Typically this is used for an index to a large collection of
records stored on disk, where the linear index itself might be
on disk or in :term:`main memory`.
It allows for efficient search (including for
:term:`range queries `), but it is not good for
inserting and deleting entries in the array.
Therefore, it is an ideal indexing structure when the system
needs to do range queries but the collection of records never
changes once the linear index has been created.
linear order
Another term for :term:`total order`.
linear probing
In :term:`hashing`, this is the simplest
:term:`collision resolution` method.
Term :math:`i` of the :term:`probe sequence` is simply
:math:`i`, meaning that collision resolution works by moving
sequentially through the hash table from the :term:`home slot`.
While simple, it is also inefficient, since it quickly leads to
certain free :term:`slots ` in the hash table having
higher probability of being selected during insertion or
search.
linear probing by steps
In :term:`hashing`, this :term:`collision resolution` method is
a variation on simple :term:`linear probing`.
Some constant :math:`c` is defined such that
term :math:`i` of the :term:`probe sequence` is
:math:`ci`.
This means that collision resolution works by moving
sequentially through the hash table from the :term:`home slot`
in steps of size :math:`c`.
While not much improvement on linear probing, it forms the basis
of another collision resolution method called
:term:`double hashing`, where each key uses a value for
:math:`c` defined by a second :term:`hash function`.
linear search
Another name for :term:`sequential search`.
link node
A widely used supporting object that forms the basic
building block for a :term:`linked list` and similar
:term:`data structures `.
A link node contains one or more fields that store data, and a
:term:`pointer` or :term:`reference` to another link node.
linked list
An implementation for the list ADT that uses
:term:`dynamic allocation`
of :term:`link nodes ` to store the list elements.
Common variants are the :term:`singly linked list`,
:term:`doubly linked list` and :term:`circular list`.
The :term:`overhead` required is the pointers in each link node.
linked stack
Analogous to a :term:`linked list`, this uses
:term:`dynamic allocation` of nodes to
store the elements when implementing the stack ADT.
list
A finite, ordered sequence of :term:`data items ` known as
:term:`elements `.
This is close to the mathematical concept of a :term:`sequence`.
Note that "ordered" in this definition means that the list
elements have position.
It does not refer to the relationship
between :term:`key` values for the list elements (that is,
"ordered" does not mean "sorted").
literal
In a :term:`Boolean expression`, a :term:`literal` is a
:term:`Boolean variable` or its negation.
In the context of compilers, it is any constant value.
Similar to a :term:`terminal`.
load factor
In :term:`hashing` this is the fraction of the :term:`hash
table` :term:`slots ` that contain a record.
Hash systems usually try to keep the load factor below 50%.
local variable
local variables
A variable declared within a function or method.
It exists only from the time when the function is called to when
the function exits.
When a function is suspended (due to calling another function),
the function's local variables are stored in an
:term:`activation record` on the :term:`runtime stack`.
locality of reference
The concept that accesses within a collection of records is not
evenly distributed.
This can express itself as some small fraction of the records
receiving the bulk of the accesses (:term:`80/20 rule`).
Alternatively, it can express itself as an increased probability
that the next or future accesses will come close to the most
recent access.
This is the fundamental property for success of :term:`caching`.
local storage
local storage.
logarithm
The `logarithm` of base :math:`b` for value :math:`y` is the power
to which :math:`b` is raised to get :math:`y`.
logical file
In :term:`file processing`, the programmer's view of a
:term:`random access` file stored on :term:`disk `
as a contiguous series of bytes, with those bytes possibly
combining to form data records.
This is in contrast to the :term:`physical file`.
logical form
The definition for a data type in terms of an ADT. Contrast to
the :term:`physical form` for the data type.
lookup table
A table of pre-calculated values, used to speed up processing
time when the values are going to be viewed many times. The
costs to this approach are the space required for the table and
the time required to compute the table. This is an example of a
:term:`space/time tradeoff`.
lower bound
In :term:`algorithm analysis`, a :term:`growth rate` that is
always less than or equal to the that of the
:term:`algorithm` in question.
In practice, this is the fastest-growing function that we know
grows no faster than all but a constant number of inputs.
It could be a gross under-estimate of the truth.
Since the lower bound for the algorithm can be very different
for different situations (such as the :term:`best case` or
:term:`worst case`), we typically have to specify which
situation we are referring to.
lower bounds proof
A proof regarding the lower bound, with this term most typically
referring to the lower bound for any possible algorithm to solve
a given :term:`problem`.
Many problems have a simple lower bound based on the concept
that the minimum amount of processing is related to looking at
all of the problem's input.
However, some problems have a higher lower bound than that.
For example, the lower bound for the problem of sorting
(:math:`\Omega(n \log n)`) is greater than the input size to
sorting (:math:`n`).
Proving such "non-trivial" lower bounds for problems is
notoriously difficult.
main memory
A synonym for :term:`primary storage`.
In a computer, typically this will be :term:`RAM`.
map
A :term:`data structure` that relates a :term:`key` to a
:term:`record`.
mapping
A :term:`function` that maps every element of a given
:term:`set` to a unique element of another set; a
correspondence.
mark array
It is typical in :term:`graph` algorithms that there is a need
to track which nodes have been visited at some point in the
algorithm.
An :term:`array` of bits or values called the :term:`mark array` is
often maintained for this purpose.
mark/sweep algorithm
An algorithm for :term:`garbage collection`.
All accessible variables, and any space that is reachable by a
chain of pointers from any accessible variable, is "marked".
Then a sequential sweep of all memory in the pool is made.
Any unmarked memory locations are assumed to not be needed by
the program and can be considered as free to be reused.
master theorem
A theorem that makes it easy to solve
:term:`divide-and-conquer recurrences`.
matching
In graph theory, a pairing (or match) of various nodes in a graph.
matching problem
Any problem that involves finding a :term:`matching` in a graph
with some desired property.
For example, a well-known :term:`NP-complete` problem is to find
a :term:`maximum match` for an undirected graph.
max heap
A :term:`heap` where every :term:`node` has a :term:`key` value
greater than its :term:`children `.
As a consequence, the node with maximum key value is
at the :term:`root`.
maximal match
In a graph, any :term:`matching` that leaves no pair of
unmatched vertices that are connected.
A maximal matching is not necessarily a
:term:`maximum match`.
In other words, there might be a larger matching than the
maximal matching that was found.
maximum lower bound
The :term:`lower bound` for the :term:`problem` of finding the
maximum value in an unsorted list is :math:`\Omega(n)`.
maximum match
In a graph, the largest possible :term:`matching`.
measure of cost
When comparing two things, such as two algorithms, some event or
unit must be used as the basic unit of comparison.
It might be number of milliseconds needed or machine instructions
expended by a program, but it is usually desirable to have a way
to do comparison between two algorithms without writing a
program.
Thus, some other measure of cost might be used as a basis for
comparison between the algorithms.
For example, when comparing two sorting algorthms it is
traditional to use as a measure of cost the number of
:term:`comparisons ` made between the key values of
record pairs.
Mergesort
A sorting algorithm that requires :math:`\Theta(n \log n)` in
the :term:`best `, :term:`average `,
and :term:`worst ` cases.
Conceptually it is simple:
Split the list in half, sort the halves, then merge them
together.
It is a bit complicated to implement efficiently on an :term:`array`.
member
In set notation, this is a synonym for :term:`element`.
In abstract design, a :term:`data item` is a member of a :term:`type`.
In an object-oriented language,
:term:`data members ` are data fields in an
object.
member function
Each operation associated with the ADT is implemented by a
member function or :term:`method`.
memory allocation
In a :term:`memory manager`, the act of honoring a request for
memory.
memory deallocation
In a :term:`memory manager`, the act of freeing a block of
memory, which should create or add to a :term:`free block`.
memory hierarchy
The concept that a computer system stores data in a range of
storage types that range from fast but expensive
(:term:`primary storage`) to slow but cheap
(:term:`secondary storage`).
When there is too much data to store in :term:`primary storage`,
the goal is to have the data that is needed soon or
most often in the primary storage as much as possible,
by using :term:`caching` techniques.
memory leak
In programming, the act of creating :term:`garbage`.
In languages such as C and C++ that do not support
:term:`garbage collection`, repeated memory leaks will evenually
cause the program to terminate.
memory manager
Functionality for managing a :term:`memory pool`.
Typically, the memory pool is viewed as an :term:`array` of bytes by the
memory manager.
The :term:`client` of the memory manager will request a
collection of (adjacent) bytes of some size, and release the
bytes for reuse when the space is no longer needed.
The memory manager should not know anything about the
interpretation of the data that is being stored by the client
into the memory pool.
Depending on the precise implementation, the client might pass
in the data to be stored, in which case the memory manager will
deal with the actual copy of the data into the memory pool.
The memory manager will return to the client a :term:`handle`
that can later be used by the client to retrieve the data.
memory pool
Memory (usually in :term:`RAM` but possibly on disk or
:term:`peripheral storage` device) that is logically viewed as
an array of memory positions.
A memory pool is usually managed by a :term:`memory manager`.
memory request
In a :term:`memory manager`, a request from some :term:`client`
to the memory manager to reserve a block of memory and store
some bytes there.
merge insert sort
A synonym for the :term:`Ford and Johnson sort`.
message
In a :term:`memory manager` implementation
(particularly a memory manager implemented with a
:term:`message passing` style of
:term:`interface`), the message is the data that the :term:`client` of
the memory manager wishes to have stored in the
:term:`memory pool`.
The memory manager will reply to the client by returning a
:term:`handle` that defines the location and size of the message
as stored in the memory pool.
The client can later recover the message by passing the handle
back to the memory manager.
message passing
A common approach to implementing the :term:`ADT` for a
:term:`memory manager` or :term:`buffer pool`, where the
contents of a :term:`message` to be stored is explicitly
passed between the client and the memory manager.
This is in contrast to a :term:`buffer passing` approach.
metaphor
Humans deal with complexity by assigning a label to an assembly of
objects or concepts and then manipulating the label in place of the
assembly. Cognitive psychologists call such a label a
metaphor.
method
In the :term:`object-oriented programming paradigm`,
a method is an operation on a :term:`class`.
A synonym for :term:`member function`.
MCST
MST
Abbreviation for :term:`minimal-cost spanning tree`.
mid-square method
In :term:`hashing`, an approach to implementing a
:term:`hash function`.
The key value is squared, and some number of bits from the
middle of the resulting value are extracted as the hash code.
Some care must be taken to extract bits that tend to actually be
in the middle of the resulting value, which requires some
understanding of the typical key values.
When done correctly, this has the advantage of having the hash
code be affected by all bits of the key
min heap
A :term:`heap` where every :term:`node` has a :term:`key` value
less than its :term:`children `.
As a consequence, the node with minimum key value is
at the :term:`root`.
minimal-cost spanning tree
Abbreviated as MCST, or sometimes as MST.
Derived from a :term:`weighted graph`, the MCST is the
:term:`subset` of the graph's :term:`edges ` that
maintains the connectivitiy of the graph while having lowest
total cost (as defined by the sum of the
:term:`weights ` of the edges in the MCST).
The result is referred to as a :term:`tree` because it would
never have a :term:`cycle` (since an edge could be removed from
the cycle and still preserve connectivity).
Two algorithms to solve this problem are
:term:`Prim's algorithm` and :term:`Kruskal's algorithm`.
minimum external path weight
Given a collection of objects, each associated with a
:term:`leaf node` in a tree, the binary tree with minimum
external path weight is the one with the minimum sum of
:term:`weighted path lengths ` for the
given set of leaves.
This concept is used to create a :term:`Huffman coding tree`,
where a letter with high weight should have low depth, so that
it will count the least against the total path length.
As a result, another letter might be pushed deeper in the tree
if it has less weight.
mod
Abbreviation for the :term:`modulus` function.
model
A simplification of reality that preserves only the essential
elements.
With a model, we can more easily focus on and reason about these
essentials.
In :term:`algorithm analysis`, we are especially concerned with
the :term:`cost model` for measuring the cost of an algorithm.
modulus
The modulus function returns the
remainder of an integer division.
Sometimes written :math:`n \bmod m` in mathematical expressions,
the syntax in many programming languages is ``n % m``.
Monte Carlo algorithms
A form of :term:`randomized algorithm`.
We find the maximum value fast, or we don't get an answer at all
(but fast).
While such algorithms have good running time, their result is not
guaranteed.
move-to-front
A :term:`heuristic` used to maintain a
:term:`self-organizing list`.
Under this heuristic, whenever a record is accessed it is moved
to the front of the list.
Analogous to the :term:`least recently used` heuristic for
maintaining a :term:`buffer pool`.
multi-dimensional search key
A search key containing multiple parts, that works in
conjunction with a :term:`multi-dimensional search structure`.
Most typically, a :term:`spatial` search key representing a
position in multi-dimensional (2 or 3 dimensions) space.
But a multi-dimensional key could be used to organize data within
non-spatial dimensions, such as temperature and time.
multi-dimensional search structure
A data structure used to support efficient search on a
:term:`multi-dimensional search key`.
The main concept here is that a multi-dimensional search
structure works more efficiently by considering the multiple
parts of the search key as a whole, rather than making
independent searches on each one-dimensional component of the
key.
A primary example is a :term:`spatial data structure` that can
efficiently represent and search for records in
multi-dimensional space.
multilist
A list that may contain sublists.
This term is sometimes used as a synonym to the term
:term:`bag`.
natural numbers
Zero and the positive integers.
necessary fallacy
A common mistake in a
:term:`lower bounds proof` for a problem, where the proof makes
an inappropriate assumption that any algorithm must operate in
some manner (typically in the way that some known algorithm
behaves).
neighbor
In a :term:`graph`, a :term:`node` :math:`w` is said to be a
neighbor of :term:`node` :math:`v` if there is an :term:`edge`
from :math:`v` to :math:`w`.
non-deterministic
In a :term:`finite automata`, at least one :term:`state` has
multiple transitions on at least one symbol.
This means that it is not :term:`deterministic` about what
transition to take in that situation.
A non-deterministic machine is said to :term:`accept` a string
if it completes execution on the string in an
:term:`accepting state` under at least one choice of
non-deterministic transitions.
Generally, non-determinism can be simulated with a deterministic
machine by alternating between the execution that would take
place under each of the branching choices.
non-terminal
In contrast to a :term:`terminal`, a non-terminal is an abstract
state in a :term:`production rule`. Begining with the
:term:`start symbol`, all non-terminals must be converted into
terminals in order to complete a :term`derivation`.
node
The objects that make up a linked structure such as a linked
list or binary tree.
Typically, nodes are allocated using
:term:`dynamic memory allocation`.
In :term:`graph` terminology, the nodes are more commonly called
:term:`vertices `.
non-strict partial order
In set notation, a relation that is :term:`reflexive`,
:term:`antisymmetric`, and :term:`transitive`.
NP
An abbreviation for
:term:`non-deterministic polynomial `.
NP-Complete
A class of problems that are related to each other in this way:
If ever one such problem is proved to be solvable in
polynomial time, or proved to require exponential time,
then all other NP-Complete problems will cost likewise.
Since so many real-world problems have been proved to be
NP-Complete, it would be extremely useful to determine if they
have polynomial or exponential cost. But so far, nobody has
been able to determine the truth of the situation.
A more technical definition is that a problem is NP-Complete if
it is in NP and is NP-hard.
NP-Completeness proof
A type of :term:`reduction` used to demonstrate that a
particular :term:`problem` is :term:`NP-complete`.
Specifically, an NP-Completeness proof must first show that the
problem is in class :term:`NP`, and then show (by using a
reduction to another NP-Complete problem) that the problem is
:term:`NP-hard`.
NP-hard
A problem that is "as hard as" any other problem in :term:`NP`.
That is, Problem X is NP-hard if any algorithm in NP can be
:term:`reduced ` to X in polynomial time.
non-deterministic algorithm
An algorithm that may operate using a
:term:`non-deterministic choice` operation.
non-deterministic choice
An operation that captures the concept of nondeterminism.
A nondeterministic choice can be viewed as either
"correctly guessing" between a set of choices, or implementing
each of the choices in parallel.
In the parallel view, the nondeterminism was successful if at
least one of the choices leads to a correct answer.
non-deterministic polynomial time algorithm
An algorithm that runs in polynomial time, and which may
(or might not) use :term:`non-deterministic choice`.
nth roots of unity
All of the points along the unit circle in the complex plane
that represent multiples of the
:term:`primitive nth root of unity`.
object
An instance of a :term:`class`, that is, something that is created and
takes up storage during the execution of a computer program.
In the :term:`object-oriented programming paradigm`, objects
are the basic units of operation.
Objects have state in the form of :term:`data members `,
and they know how to perform certain actions
(:term:`methods `).
object-oriented programming paradigm
An approach to problem-solving where all computations are
carried out using :term:`objects