# 29.1. Glossary¶

- 2-3 tree
- A specialized form of the
*B-tree*where each internal node has either 2 children or 3 children. Key values are ordered to maintain the*binary search tree property*. The 2-3 tree is always height balanced, and its insert, search, and remove operations all have \(\Theta(\log n)\) cost. - 80/20 rule
- Given a typical application where there is a collection of
records and a series of search operations for records,
the 80/20 rule is an empirical observation that
80% of the record accessess typically go to 20% of the records.
The exact values varies between data collections, and is related
to the concept of
*locality of reference*. - abstract data type
- Abbreviated
*ADT*. The specification of a*data type*within some language, independent of an implementation. The*interface*for the ADT is defined in terms of a*type*and a set of operations on that type. The behavior of each operation is determined by its inputs and outputs. An ADT does not specify*how*the data type is implemented. These implementation details are hidden from the user of the ADT and protected from outside access, a concept referred to as*encapsulation*. - accept
- When a
*finite automata*executes on a string and terminates in an*accepting state*, it is said to accept the string. The finite automata is said to accept the language that consists of all strings for which the finite automata completes execution in an accepting state. - accepting state
- Part of the definition of a
*finite automata*is to designate some*states*as accepting states. If the finite automata executes on an input string and completes the computation in an accepting state, then the machine is said to*accept*the string. - activation record
- The entity that is stored on the
*runtime stack*during program execution. It stores any active*local variable*and the return address from which a new subroutine is being called, so that this information can be recovered when the subroutine terminates. - acyclic graph
- In
*graph*terminology, a graph that contains no*cycles*. - address
- A location in memory.
- adjacency list
- An implementation for a
*graph*that uses an (array-based)*list*to represent the*vertices*of the graph, and each vertex is in turn represented by a (linked) list of the vertices that are*neighbors*. - adjacency matrix
- An implementation for a
*graph*that uses a 2-dimensional*array*where each row and each column corresponds to a*vertex*in the*graph*. A given row and column in the matrix corresponds to an edge from the*vertex*corresponding to the row to the vertex corresponding to the column. - adjacent
- Two
*nodes*of a*tree*or two*vertices*of a*graph*are said to be adjacent if they have an*edge*connecting them. If the edge is directed from \(a\) to \(b\), then we say that \(a\) is adjacent to \(b\), and \(b\) is adjacent from \(a\). - ADT
- Abbreviation for
*abstract data type*. - adversary
- A fictional construct introduced for use in an
*adversary argument*. - adversary argument
- A type of
*lower bounds proof*for a problem where a (fictional) "adversary" is assumed to control access to an algorithm's input, and which yields information about that input in such a way that will drive the cost for any proposed algorithm to solve the problem as high as possible. So long as the adversary never gives an answer that conflicts with any previous answer, it is permitted to do whatever necessary to make the algorithm require as much cost as possible. - aggregate type
- A
*data type*whose*members*have subparts. For example, a typical database record. Another term for this is*composite type*. - algorithm
- A method or a process followed to solve a
*problem*. - algorithm analysis
- A less formal version of the term
*asymptotic algorithm analysis*, generally used as a synonym for*asymptotic analysis*. - alias
- Another name for something. In programming, this usually refers
to two
*references*that refer to the same object. - all-pairs shortest paths problem
- Given a
*graph*with*weights*or distances on the*edges*, find the shortest paths between every pair of vertices in the graph. One approach to solving this problem is*Floyd's algorithm*, which uses the*dynamic programming*algorithmic technique. - allocated

allocation - Reserving memory for an object in the Heap memory.
- alphabet
- The characters or symbols that strings in a given language may be composed of.
- alphabet trie
- A
*trie*data structure for storing variable-length strings. Level \(i\) of the tree corresponds to the letter in position \(i\) of the string. The root will have potential branches on each intial letter of string. Thus, all strings starting with "a" will be stored in the "a" branch of the tree. At the second level, such strings will be separated by branching on the second letter. - amortized analysis
- An
*algorithm analysis*techique that looks at the total cost for a series of operations and amortizes this total cost over the full series. This is as opposed to considering every individual operation to independently have the*worst case*cost, which might lead to an overestimate for the total cost of the series. - amortized cost
- The total cost for a series of operations to be used in an
*amortized analysis*. - ancestor
- In a tree, for a given node \(A\), any node on a
*path*from \(A\) up to the root is an ancestor of \(A\). - antisymmetric
- In set notation, relation \(R\) is antisymmetric if whenever \(aRb\) and \(bRa\), then \(a = b\), for all \(a, b \in \mathbf{S}\).
- approximation algorithm
- An algorthm for an
*optimization problem*that finds a good, but not necessarily cheapest, solution. - arm
- In the context of an
*I/O head*, this attaches the sensor on the I/O head to the*boom*. - array
- A
*data type*that is used to store elements in consecutive memory locations and refers to them by an index. - array-based list
- An implementation for the
*list*ADT that uses an*array*to store the list elements. Typical implementations fix the array size at creation of the list, and the*overhead*is the number of array positions that are presently unused. - array-based queue
- Analogous to an
*array-based list*, this uses an*array*to store the elements when implementing the*queue*ADT. - array-based stack
- Analogous to an
*array-based list*, this uses an*array*to store the elements when implementing the*stack*ADT. - ASCII character coding
- American Standard Code for Information Interchange. A commonly used method for encoding characters using a binary code. Standard ASCII uses an 8-bit code to represent upper and lower case letters, digits, some punctuation, and some number of non-printing characters (such as carrage return). Now largely replaced by UTF-8 encoding.
- assembly code
- A form of
*intermediate code*created by a*compiler*that is easy to convert into the final form that the computer can execute. An assembly language is typically a direct mapping of one or a few instructions that the CPU can execute into a mnemonic form that is relatively easy for a human to read. - asymptotic algorithm analysis
- A more formal term for
*asymptotic analysis*. - asymptotic analysis
- A method for estimating the efficiency of an algorithm or
computer program by identifying its
*growth rate*. Asymptotic analysis also gives a way to define the inherent difficulty of a*problem*. We frequently use the term*algorithm analysis*to mean the same thing. - attribute
- In
*object-oriented programming*, a synonym for*data members*. - automata
- Synonym for
*finite state machine*. - automatic variable
- A synonym for
*local variable*. When program flow enters and leaves the variable's scope, automatic variables will be allocated and de-allocated automatically. - average case
- In
*algorithm analysis*, the average of the costs for all*problem instances*of a given input size \(n\). If not all problem instances have equal probability of occurring, then average case must be calculated using a weighted average. - average seek time
- Expected (average) time to perform a
*seek*operation on a*disk drive*, assuming that the seek is between two randomly selected tracks. This is one of two metrics commonly provided by disk drive vendors for disk drive performance, with the other being*track-to-track seek time*. - AVL Tree
- A variant implementation for the
*BST*, which differs from the standard BST in that it uses modified insert and remove methods in order to keep the tree*balanced*. Similar to a*Splay Tree*in that it uses the concept of*rotations*in the insert and remove operations. - B$^*$-tree
- A variant on the
*B$^+$-tree*. The \(\mathrm{B}^*\) tree is identical to the \(\mathrm{B}^+\) tree, except for the rules used to split and merge nodes. Instead of splitting a node in half when it overflows, the \(\mathrm{B}^*\) tree gives some records to its neighboring sibling, if possible. If the sibling is also full, then these two nodes split into three. Similarly, when a node underflows, it is combined with its two siblings, and the total reduced to two nodes. Thus, the nodes are always at least two thirds full. - B$^+$-tree
- The most commonly implemented form of
*B-tree*. A B$^+$-tree does not store data at the*internal nodes*, but instead only stores*search key*values as direction finders for the purpose of searching through the tree. Only the*leaf nodes*store a*reference*to the actual data records. - B-tree
- A method for
*indexing*a large collection of records. A B-tree is a*balanced tree*that typically has high branching factor (commonly as much as 100*children*per*internal node*), causing the tree to be very shallow. When stored on disk, the node size is selected to be same as the desired unit of I/O (so some multiple of the disk*sector*size). This makes it easy to gain access to the record associated with a given*search key*stored in the tree with few*disk accesses*. The most commonly implemented variant of the B-tree is the*B$^+$-tree*. - backing storage
- In the context of a
*caching*system or*buffer pool*, backing storage is the relatively large but slower source of data that needs to be cached. For example, in a*virtual memory*, the disk drive would be the backing storage. In the context of a web browser, the Internet might be considered the backing storage. - backtracking
- A
*heuristic*for brute-force search of a solution space. It is essentially a*depth-first search*of the solution space. This can be improved using a*branch-and-bounds algorithm*. - bad reference
- A reference is referred to as a bad reference if it is allocated but not initialized.
- bag
- In set notation, a bag is a collection of elements with no order (like a set), but which allows for duplicate-valued elements (unlike a set).
- balanced tree
- A
*tree*where the*subtrees*meet some criteria for being balanced. Two possibilities are that the tree is*height balanced*, or that the tree has a roughly equal number of*nodes*in each subtree. - base
- Synonym for
*radix*. - base case
- In
*recursion*or*proof by induction*, the base case is the termination condition. This is a simple input or value that can be solved (or proved in the case of induction) without resorting to a recursive call (or the*induction hypothesis*). - base class
- In
*object-oriented programming*, a class from which another class*inherits*. The class that inherits is called a*subclass*. - base type
- The
*data type*for the elements in a set. For example, the set might consist of the integer values 3, 5, and 7. In this example, the base type is integers. - basic operation
- Examples of basic operations include inserting a data
item into the data structure, deleting a
*data item*from the data structure, and finding a specified*data item*. - best case
- In algorithm analysis, the
*problem instance*from among all problem instances for a given input size \(n\) that has least cost. Note that the best case is**not**when \(n\) is small, since we are referring to the best from a class of inputs (i.e, we want the best of those inputs of size \(n\)). - best fit
- In a
*memory manager*, best fit is a*heuristic*for deciding which*free block*to use when allocating memory from a*memory pool*. Best fit will always allocate from the smallest*free block*that is large enough to service the memory request. The rationale is that this will be the method that best preserves large blocks needed for unusually large requests. The disadvantage is that it tends to cause*external fragmentation*in the form of small, unuseable memory blocks. - BFS
- Abbreviation for
*breadth-first search*. - big-Oh notation
- In
*algorithm analysis*, a shorthand notation for describing the*upper bound*for an*algorithm*or*problem*. - binary insert sort
- A variation on
*insertion sort*where the position of the value being inserted is located by binary search, and then put into place. In normal usage this is not an improvement on standard insertion sort because of the expense of moving many items in the*array*. But it is directly useful if the cost of comparison is high compared to that of moving an element, or is theoretically useful if we only care to count the cost of comparisons. - binary search
- A standard
*recursive*algorithm for finding the*record*with a given*search key*value within a sorted list. It runs in \(O(\log n)\) time. At each step, look at the middle of the current sublist, and throw away the half of the records whose keys are either too small or too large. - binary search tree
- A binary tree that imposes the following constraint on its node
values: The
*search key*value for any node \(A\) must be greater than the (key) values for all nodes in the left*subtree*of \(A\), and less than the key values for all nodes in the right subtree of \(A\). Some convention must be adopted if multiple nodes with the same key value are permitted, typically these are required to be in the right subtree. - binary search tree property
- The defining relationship between the
*key*values for*nodes*in a*BST*. All nodes stored in the left subtree of a node whose key value is \(K\) have key values less than or equal to \(K\). All nodes stored in the right subtree of a node whose key value is \(K\) have key values greater than \(K\). - binary tree
- A finite set of nodes which is either empty, or else has a root
node together two binary trees, called the left and right
*subtrees*, which are*disjoint*from each other and from the*root*. - binary trie
- A
*binary tree*whose structure is that of a*trie*. Generally this is an implementation for a*search tree*. This means that the*search key*values are thought of a binary digits, with the digit in the position corresponding to this a node's*level*in the tree indicating a left branch if it is "0", or a right branch if it is "1". Examples include the*Huffman coding tree*and the*Bintree*. - binning
- In
*hashing*, binning is a type of*hash function*. Say we are given keys in the range 0 to 999, and have a hash table of size 10. In this case, a possible hash function might simply divide the key value by 100. Thus, all keys in the range 0 to 99 would hash to slot 0, keys 100 to 199 would hash to slot 1, and so on. In other words, this hash function "bins" the first 100 keys to the first slot, the next 100 keys to the second slot, and so on. This approach tends to make the hash function dependent on the distribution of the high-order bits of the keys. - Binsort
- A sort that works by taking each record and placing it into a
bin based on its value. The bins are then gathered up in order
to sort the list. It is generally not practical in this form,
but it is the conceptual underpinning of the
*radix sort*. - bintree
- A
*spatial data structure*in the form of binary*trie*, typically used to store point data in two or more dimensions. Similar to a*PR quadtree*except that at each level, it splits one dimension in half. Since many leaf nodes of the PR quadtree will contain no data points, implementation often makes use of the*flyweight**design pattern*. - bitmap

bit vector - An
*array*that stores a single bit at each position. Typically these bits represent*Boolean variables*associated with a collection of objects, such that the \(i\) th bit is the Boolean value for the \(i\) th object. - block
- A unit of storage, usually referring to storage on a
*disk drive*or other*peripheral storage*device. A block is the basic unit of I/O for that device. - Boolean expression
- A Boolean expression is comprised of
*Boolean variables*combined using the operators AND (\(\cdot\)), OR (\(+\)), and NOT (to negate Boolean variable \(x\) we write \(\overline{x}\)). - Boolean variable
- A variable that takes on one of the two values
`True`

and`False`

. - boom
- In the context of an
*I/O head*, is the central structure to which all of the I/O heads are attached. Thus, the all move together during a*seek*operation. - bounding box
- A box (usually aligned to the coordinate axes of the reference system) that contains a (potentially complex) object. In graphics and computational geometry, complex objects might be associated with a bounding box for use by algorithms that search for objects in a particular location. The idea is that if the bounding box is not within the area of interest, then neither is the object. Checking the bounding box is cheaper than checking the object, but it does require some time. So if enough objects are not outside the area of interest, this approach will not save time. But if most objects are outside of the area of interest, then checking bounding boxes first can save a lot of time.
- branch-and-bounds algorithm
- A variation on
*backtracking*that applies to*optimization problems*. We traverse the*solution tree*as with backtracking. Proceeding deeper in the solution tree generally requires additional cost. We remember the best-cost solution found so far. If the cost of the current branch in the tree exceeds the best tour cost found so far, then we know to stop pursuing this branch of the tree. At this point we can immediately back up and take another branch. - breadth-first search
- A
*graph**traversal*algorithm. As the name implies, all immediate*neighbors*for a*node*are*visited*before any more-distant nodes are visited. BFS is driven by a*queue*. A start vertex is placed on the queue. Then, until the queue is empty, a node is taken off the queue, visited, and and then any*unvisited*neighbors are placed onto the queue. - break-even point
- The point at which two costs become even when measured as the
function of some variable.
In particular, used to compare the space requirements of two
implementations.
For example, when comparing the space requirements of an
*array-based list*implementation versus a*linked list*implementation, the key issue is how full the list is compared to its capacity limit (for the array-based list). The point where the two representations would have the same space cost is the break-even point. As the list becomes more full beyond this point, the array-based list implementation becomes more space efficent, while as the list becomes less full below this point, the linked list implementation becomes more space efficient. - BST
- Abbreviation for
*binary search tree*. - bubble sort
- A simple sort that requires \(Theta(n^2)\) time in
*best*,*average*, and*worst*cases. Even an optimized version will normally run slower than*insertion sort*, so it has little to recommend it. - bucket
- In
*bucket hashing*, a bucket is a sequence of*slots*in the*hash table*that are grouped together. - bucket hashing
- A method of
*hashing*where multiple*slots*of the*hash table*are grouped together to form a*bucket*. The*hash function*then either hashes to some bucket, or else it hashes to a*home slot*in the normal way, but this home slot is part of some bucket.*Collision resolution*is handled first by attempting to find a free position within the same bucket as the home slot. If the bucket if full, then the record is placed in an*overflow bucket*. - bucket sort
- A variation on the
*Binsort*, where each bin is associated with a range of*key*values. This will require some method of sorting the records placed into each bin. - buddy method
- In a
*memory manager*, an alternative to using a*free block list*and a*sequential fit*method to seach for a suitable free block to service a*memory request*. Instead, the memory pool is broken down as needed into smaller chunks by splitting it in half repeatedly until the smallest power of 2 that is as big or bigger than the size of the memory request is reached. The name comes from the fact that the binary representation for the start of the block positions only differ by one bit for adjacent blocks of the same size. These are referred to as "buddies" and will be merged together if both are free. - buffer
- A block of memory, most often in
*primary storage*. The size of a buffer is typically one or a multiple of the basic unit of I/O that is read or written on each access to*secondary storage*such as a*disk drive*. - buffer passing
- An approach to implementing the
*ADT*for a*buffer pool*, where a pointer to a*buffer*is passed between the client and the buffer pool. This is in contrast to a*message passing*approach, it is most likely to be used for long messages or when the message size is always the same as the buffer size, such as when implementing a*B-tree*. - buffer pool
- A collection of one or more
*buffers*. The buffer pool is an example of a*cache*. It is stored in*primary storage*, and holds data that is expected to be used in the near future. When a data value is requested, the buffer pool is searched first. If the value is found in the buffer pool, then*secondary storage*need not be accessed. If the value is not found in the buffer pool, then it must be fetched from secondary storage. A number of traditional*heuristics*have been developed for deciding which data to*flush*from the buffer pool when new data must be stored, such as*least recently used*. - buffering
- A synonym for
*caching*. More specifically, it refers to an arrangement where all accesses to data (such as on a*peripheral storage*device) must be done in multiples of some minimum unit of storage. On a*disk drive*, this basic or smallest unit of I/O is a*sector*. It is called "buffering" because the block of data returned by such an access is stored in a*buffer*. - caching
- The concept of keeping selected data in
*main memory*. The goal is to have in main memory the data values that are most likely to be used in the near future. An example of a caching technique is the use of a*buffer pool*. - call stack
- Known also as execution stack. A stack that stores the function call sequence and the return address for each function.
- Cartesian product
- For sets, this is another name for the
*set product*. - ceiling
- Written \(\lceil x \rceil\), for real value \(x\) the ceiling is the least integer \(\geq x\).
- child
- In a tree, the set of
*nodes*directly pointed to by a node \(R\) are the*children*of \(R\). - circular first fit
- In a
*memory manager*, circular first fit is a*heuristic*for deciding which*free block*to use when allocating memory from a*memory pool*. Circular first fit is a minor modification on*first fit*memory allocation, where the last free block allocated from is remembered, and search for the next suitable free block picks up from there. Like first fit, it has the advantage that it is typically not necessary to look at all free blocks on the free block list to find a suitable free block. And it has the advantage over first fit that it spreads out memory allocations evenly across the*free block list*. This might help to minimize*external fragmentation*. - circular list
- A
*list*ADT implementation variant where the last element of the list provides access to the first element of the list. - class
- In the
*object-oriented programming paradigm*an ADT and its implementation together make up a class. An instantiation of a class within a program is termed an*object*. - class hierarchy
- In
*object-oriented programming*, a set of classes and their interrelationships. One of the classes is the*base class*, and the others are*subclasses*that*inherit*either directly or indirectly from the base class. - clause
- In a
*Boolean expression*, a clause is one or more*literals*OR'ed together. - client
- The user of a service.
For example, the object or part of the program that calls a
*memory manager*class is the client of that memory manager. Likewise the class or code that calls a*buffer pool*. - clique
- In
*graph*terminology, a clique is a*subgraph*, defined as any*subset*\(U\) of the graph's*vertices*such that every vertex in \(U\) has an*edge*to every other vertex in \(U\). The size of the clique is the number of vertices in the clique. - closed
- A set is closed over a (binary) operation if, whenever the operation is applied to two members of the set, the result is a member of the set.
- closed hash system
- A
*hash system*where all records are stored in slots of the*hash table*. This is in contrast to an*open hash system*. - closed-form solution
- An algebraic equation with the same value as a
*summation*or*recurrence relation*. The process of replacing the summation or recurrence with its closed-form solution is known as solving the summation or recurrence. - cluster
- In
*file processing*, a collection of physically adjacent*sectors*that define the smallest allowed allocation unit of space to a disk file. The idea of requiring space to be allocated in multiples of sectors is that this will reduce the number of*extents*required to store the file, which reduces the expected number of*seek*operations reuquired to process a series of*disk accesses*to the file. The disadvantage of large cluster size is that it increases*internal fragmentation*since any space not actually used by the file in the last cluster is wasted. - code generation
- A phase in a
*compiler*that transforms*intermediate code*into the final executable form of the code. More generally, this can refer to the process of turning a parse tree (that determines the correctness of the structure of the program) into actual instructions that the computer can execute. - code optimization
- A phase in a
*compiler*that makes changes in the code (typically*assembly code*) with the goal of replacing it with a version of the code that will run faster while performing the same computation. - cohesion
- In
*object-oriented programming*, a term that refers to the degree to which a class has a single well-defined role or responsibility. - Collatz sequence
For a given integer value \(n\), the sequence of numbers that derives from performing the following computatin on \(n\):

while (n > 1) if (ODD(n)) n = 3 * n + 1; else n = n / 2;

This is famous because, while it terminates for any value of \(n\) that you try, it has never been proven to be a fact that this always terminates.

- collision
- In a
*hash system*, this refers to the case where two search*keys*are mapped by the*hash function*to the same slot in the*hash table*. This can happen on insertion or search when another record has already been hashed to that slot. In this case, a*closed hash system*will require a process known as*collision resolution*to find the location of the desired record. - collision resolution
- The outcome of a
*collision resolution policy*. - collision resolution policy
- In
*hashing*, the process of resolving a*collision*. Specifically in a*closed hash system*, this is the process of finding the proper position in a*hash table*that contains the desired record if the*hash function*did not return the correct position for that record due to a*collision*with another record. - comparable
- The concept that two objects can be compared to determine if they
are equal or not, or to determine which one is greater than the
other.
In set notation, elements \(x\) and \(y\) of a set are
comparable under a given relation \(R\) if either
\(xRy\) or \(yRx\).
To be reliably compared for a greater/lesser relationship,
the values being compared must belong to a
*total order*. In programming, the property of a data type such that two elements of the type can be compared to determine if they the same (a weaker version), or which of the two is larger (a stronger version).`Comparable`

is also the name of an*interface*in Java that asserts a comparable relationship between objects with a class, and`.compareTo()`

is the`Comparable`

interface method that implements the actual comparison between two objects of the class. - comparator
- A function given as a parameter to a method of a library (or alternatively, a parameter for a C++ template or a Java generic). The comparator function concept provides a generic way encapulates the process of performing a comparison between two objects of a specific type. For example, if we want to write a generic sorting routine, that can handle any record type, we can require that the user of the sorting routine pass in a comparator function to define how records in the collection are to be compared.
- comparison
- The act of comparing two
*keys*or*records*. For many*data types*, a comparison has constant time cost. The number of comparisons required is often used as a*measure of cost*for sorting and searching algorithms. - compile-time polymorphism
- A form of
*polymorphism*known as Overloading. Overloaded methods have the same names, but different signatures as a method available elsewhere in the class. Compare to*run-time polymorphism*. - compiler
- A computer program that reads computer programs and converts
them into a form that can be directly excecuted by some form of
computer.
The major phases in a compiler include
*lexical analysis*,*syntax analysis*,*intermediate code generation*,*code optimization*, and*code generation*. More broadly, a compiler can be viewed as*parsing*the program to verify that it is syntactically correct, and then doing*code generation*to convert the hig-level program into something that the computer can execute. - complete binary tree
- A binary tree where the nodes are filled in row by row, with the
bottom row filled in left to right.
Due to this requirement, there is only one tree of \(n\)
nodes for any value of \(n\).
Since storing the records in an
*array*in row order leads to a simple mapping from a node's position in the array to its*parent*,*siblings*, and*children*, the array representation is most commonly used to implement the complete binary tree. The*heap*data structure is a complete binary tree with partial ordering constraints on the node values. - complete graph
- A
*graph*where every*vertex*connects to every other vertex. - complex number
- In mathematics, an imaginary number, that is, a number with a real component and an imaginary component.
- Composite design pattern
- Given a class hierarchy representing a set of objects, and a
container for a collection of objects, the composite
*design pattern*addresses the relationship between the object hierarchy and a bunch of behaviors on the objects. In the composite design, each object is required to implement the collection of behaviors. This is in contrast to the procedural approach where a behavior (such as a tree*traversal*) is implemented as a method on the object collection (such as a*tree*). Procedural tree traversal requires that the tree have a method that understands what to do when it encounters any of the object types (*internal*or*leaf nodes*) that the tree might contain. The composite approach would have the tree call the "traversal" method on its root node, which then knows how to perform the "traversal" behavior. This might in turn require invoking the traversal method of other objects (in this case, the children of the root). - composite type
- A type whose
*members*have subparts. For example, a typical database record. Another term for this is*aggregate type*. - composition
- Relationships between classes based on usage rather than
*inheritance*, i.e. a**HAS-A**relationship. For example, some code in class 'A' has a*reference*to some other class 'B'. - computability
- A branch of computer science that deals with the theory of
solving problems through computation.
More specificially, it deals with the limits to what problems
(functions) are computable.
An example of a famous problem that cannot in principle be
solved by a computer is the
*halting problem*. - computation
- In a
*finite automata*, a computation is a sequence of*configurations*for some length \(n \geq 0\). In general, it is a series of operations that the machine performs. - computational complexity theory
- A branch of the theory of computation in theoretical computer
science and mathematics that focuses on classifying
computational problems according to their inherent difficulty,
and relating those classes to each other.
An example is the study of
*NP-Complete*problems. - configuration
- For a
*finite automata*, a complete specification for the current condition of the machine on some input string. This includes the current*state*that the machine is in, and the current condition of the string, including which character is about to be processed. - Conjunctive Normal Form

CNF - A
*Boolean expression*written as a series of*clauses*that are AND'ed together. - connected component
- In an
*undirected graph*, a*subset*of the*nodes*such that each node in the subset can be reached from any other node in that subset. - connected graph
- An
*undirected graph*is a connected graph if there is at least one path from any*vertex*to any other. - constant running time
- The cost of a function whose running time is not related to its input size. In Theta notation, this is traditionally written as \(\Theta(1)\).
- constructive induction
- A process for finding the
*closed form*for a*recurrence relation*, that involves substituting in a guess for the closed form to replace the recursive part(s) of the recurrence. Depending on the goal (typically either to show that the hypothesized growth rate is right, or to find the precise constants), one then manipulates the resulting non-recursive equation. - container

container class - A
*data structure*that stores a collection of*records*. Typical examples are*arrays*,*search trees*, and*hash tables*. - context-free grammar
- A
*grammar*comprised only of productions of the form \(A \rightarrow x\) where \(A\) is a*non-terminal*and \(x\) is a series of one or more*terminals*and non-terminals. That is, the given non-terminal \(A\) can be replaced at any time. - context-free language
- The set of
*languages*that can be defined by*context-sensitive grammars*. - context-sensitive grammar
- A
*grammar*comprised only of productions of the form \(xAy \rightarrow xvy\) where \(A\) is a*non-terminal*and \(x\) and \(y\) are each a series of one or more*terminals*and non-terminals. That is, the given non-terminal \(A\) can be replaced only when it is within the proper context. - cost
- The amount of resources that the solution consumes.
- cost model
- In
*algorithm analysis*, a definition for the cost of each*basic operation*performed by the algorithm, along with a definition for the size of the input. Having these definitions allows us to calculate the*cost*to run the algorithm on a given input, and from there determine the*growth rate*of the algorithm. A cost model would be considered "good" if it yields predictions that conform to our understanding of reality. - countably infinite

countable - A
*set*is countably infinite if it contains a finite number of elements, or (for a set with an infinite number of elements) if there exists a one-to-one mapping from the set to the set of integers. - CPU
- Acronym for Central Processing Unit, the primary processing device for a computer.
- current position
- A property of some list ADTs, where there is maintained a "current position" state that can be referred to later.
- cycle
- In
*graph*terminology, a*cycle*is a*path*of length three or more that connects some*vertex*\(v_1\) to itself. - cylinder
- A
*disk drive*normally consists of a stack of*platters*. While this might not be so true today, traditionally all of the*I/O heads*moved together during a*seek*operation. Thus, when a given I/O head is positioned over a particular*track*on a platter, the other I/O heads are also positioned over the corresponding track on their platters. That collection of tracks is called a cylinder. A given cylinder represents all of the data that can be read from all of the platters without doing another seek operation. - cylinder index
- In the
*ISAM*system, a simple*linear index*that stores the lowest key value stored in each*cylinder*. - cylinder overflow
- In the
*ISAM*system, this is space reserved for storing any records that can not fit in their respective*cylinder*. - DAG
- Abbreviation for
*directed acyclic graph*. - data field
- In
*object-oriented programming*, a synonym for*data member*. - data item
- A piece of information or a record whose value is drawn from a type.
- data member
- The variables that together define the space required by a data
item are referred to as data members.
Some of the commonly used synonyms include
*data field*,*attribute*, and*instance variable*. - data structure
- The implementation for an
*ADT*. - data type
- A type together with a collection of operations to manipulate the type.
- deallocated

deallocation - Free the memory allocated to an unused object.
- decision problem
- A problem whose output is either "YES" or "NO".
- decision tree
- A theoretical construct for modeling the behavior of algorithms.
Each point at which the algorithm makes a decision (such as an
if statement) is modeled by a branch in the tree that represents
the algorithms behavior.
Decision trees can be used in
*lower bounds proofs*, such as the proof that sorting requires \(\Omega(n \log n)\) comparisons in the*worst case*. - deep copy
- Copying the actual content of a
*pointee*. - degree
- In
*graph*terminology, the degree for a*vertex*is its number of*neighbors*. In a*directed graph*, the*in degree*is the number of edges directed into the vertex, and the*out degree*is the number of edges directed out of the vertex. In*tree*terminology, the degree for a*node*is its number of*children*. - delegation mental model for recursion
- A way of thinking about the process of
*recursion*. The recursive function "delegates" most of the work when it makes the recursive call. The advantage of the delegation mental model for recursion is that you don't need to think about how the delegated task is performed. It just gets done. - dense graph
- A
*graph*where the actual number of*edges*is a large fraction of the possible number of edges. Generally, this is interpreted to mean that the*degree*for any*vertex*in the graph is relatively high. - depth
- The depth of a node \(M\) in a tree is the length of the path from the root of the tree to \(M\).
- depth-first search
- A
*graph**traversal*algorithm. Whenever a \(v\) is*visited*during the traversal, DFS will*recursively*visit all of \(v\) 's*unvisited**neighbors*. - depth-first search tree
- A
*tree*that can be defined by the operation of a*depth-first search*(DFS) on a*graph*. This tree would consist of the*nodes*of the graph and a subset of the*edges*of the graph that was followed during the DFS. - dequeue
- A specialized term used to indicate removing an element from a queue.
- dereference
- Accessing the value of the
*pointee*for some*reference*variable. Commonly, this happens in a language like Java when using the "dot" operator to access some field of an object. - derivation
- In formal languages, the process of executing a series of
*production rules*from a*grammar*. A typical example of a derivation would be the series of productions executed to go from the*start symbol*to a given string. - descendant
- In a tree, the set of all nodes that have a node \(A\) as an
*ancestor*are the descendants of \(A\). In other words, all of the nodes that can be reached from \(A\) by progressing downwards in tree. Another way to say it is: The*children*of \(A\), their children, and so on. - deserialization
- The process of returning a
*serialized*representation for a data structure back to its original in-memory form. - design pattern
- An abstraction for describing the design of programs, that is, the interactions of objects and classes. Experienced software designers learn and reuse patterns for combining software components, and design patterns allow this design knowledge to be passed on to new programmers more quickly.
- deterministic
- Any
*finite automata*in which, for every pair of*state*and symbol, there is only a single transition. This means that whenever the machine is in a given state and sees a given symbol, only a single thing can happen. This is in contrast to a*non-deterministic*finite automata, which has at least one state with multiple transitions on at least one symbol. - deterministic algorithm
- An algorithm that does not involve any element of randomness,
and so its behavior on a given input will always be the same.
This is in contrast to a
*randomized algorithm*. - Deterministic Finite Automata

Deterministic Finite Acceptor

DFA - An
*automata*or abstract machine that can process an input string (shown on a tape) from left to right. There is a control unit (with*states*), behavior defined for what to do when in a given state and with a given symbol on the current square of the tape. All that we can "do" is change state before going to the next letter to the right. - DFS
- Abbreviation for
*depth-first search*. - diagonalization argument
- A proof technique for proving that a set is
*uncountably infinite*. The approach is to show that, no matter what order the elements of the set are put in, a new element of the set can be constructed that is not in that ordering. This is done by changing the \(i\) th value or position of the element to be different from that of the \(i\) th element in the proposed ordering. - dictionary
- An abstract data type or
*interface*for a data structure or software subsystem that supports insertion, search, and deletion of records. - dictionary search
- A close relative of an
*interpolation search*. In a classical (paper) dictionary of words in a natural language, there are markings for where in the dictionary the words with a given letter start. So in typical usage of such a dictionary, words are found by opening the dictionary to some appropriate place within the pages that contain words starting with that letter. - digraph
- Abbreviation for
*directed graph*. - Dijkstra's algorithm
- An algorithm to solve the
*single-source shortest paths problem*in a*graph*. This is a*greedy algorithm*. It is nearly identical to*Prim's algorithm*for finding a*minimal-cost spanning tree*, with the only difference being the calculation done to update the best-known distance. - diminishing increment sort
- Another name for
*Shellsort*. - direct access
- A storage device, such as a disk drive, that has some ability to
move to a desired data location more-or-less directly.
This is in contrast to a
*sequential access*storage device such as a tape drive. - direct proof
- In general, a direct proof is just a "logical explanation". A direct proof is sometimes referred to as an argument by deduction. This is simply an argument in terms of logic. Often written in English with words such as "if ... then", it could also be written with logic notation such as \(P \Rightarrow Q\).
- directed acyclic graph
- A
*graph*with no cycles. Abbreviated as*DAG*. Note that a DAG is not necessarily a*tree*since a given*node*might have multiple*parents*. - directed edge
- An
*edge*that goes from*vertex*to another. In contrast, an*undirected edge*simply links to vertices without a direction. - directed graph
- A
*graph*whose*edges*each are directed from one of its defining*vertices*to the other. - dirty bit
- Within a
*buffer pool*, a piece of information associated with each*buffer*that indicates whether the contents of the buffer have changed since being read in from*backing storage*. When the buffer is*flushed*from the buffer pool, the buffer's contents must be written to the backing storage if the dirty bit is set (that is, if the contents have changed). This means that a relatively expensive write operation is required. In contrast, if the dirty bit is not set, then it is unnecessary to write the contents to backing storage, thus saving time over not keeping track of whether the contents have changed or not. - Discrete Fourier Transform

DFT - Let \(a = [a_0, a_1, ..., a_{n-1}]^T\) be a vector that stores the coefficients for a polynomial being evaluated. We can then do the calculations to evaluate the polynomial at the \(n\) th \(roots of unity <nth roots of unit>\) by multiplying the \(A_{z}\) matrix by the coefficient vector. The resulting vector \(F_{z}\) is called the Discrete Fourier Transform (or DFT) for the polynomial.
- discriminator
- A part of a
*multi-dimensional search key*. Certain tree data structures such as the*bintree*and the*kd tree*operate by making branching decisions at nodes of the tree based on a single attribute of the multi-dimensional key, with the attribute determined by the level of the node in the tree. For example, in 2 dimensions, nodes at the odd levels in the tree might branch based on the \(x\) value of a coordinate, while at the even levels the tree would branch based on the \(y\) value of the coordinate. Thus, the \(x\) coordinate is the discriminator for the odd levels, while the \(y\) coordinate is the discriminator for the even levels. - disjoint
- Two parts of a
*data structure*or two collections with no objects in common are disjoint. This term is often used in conjunction with a data structure that has*nodes*(such as a*tree*). Also used in the context of*sets*, where two*subsets*are disjoint if they share no elements. - disjoint sets
- A collection of
*sets*, any pair of which share no elements in common. A collection of disjoint sets partitions some objects such that every object is in exactly one of the disjoint sets. - disk access
- The act of reading data from a disk drive (or other form of
*peripheral storage*). The number of times data must be read from (or written to) a disk is often a good measure of cost for an algorithm that involves disk I/O, since this is usually the dominant cost. - disk controller
- The control mechanism for a
*disk drive*. Responsible for the action of reading or writing a*sector*of data. - disk drive
- An example of
*peripheral storage*or*secondary storage*. Data access times are typically measured in thousandths of a second (milliseconds), which is roughly a million times slower than access times for*RAM*, which is an example of a*primary storage*device. Reads from and writes to a disk drive are always done in terms of some minimum size, which is typically called a*block*. The block size is 512 bytes on most disk drives. Disk drives and RAM are typical parts of a computer's*memory hierarchy*. - disk I/O
- Refers to the act of reading data from or writing data to a
*disk drive*. All disk reads and writes are done in units of a*sector*or*block*. - disk-based space/time tradeoff
- In contrast to the standard
*space/time tradeoff*, this principle states that the smaller you can make your disk storage requirements, the faster your program will run. This is because the time to read information from disk is enormous compared to computation time, so almost any amount of additional computation needed to unpack the data is going to be less than the disk-reading time saved by reducing the storage requirements. - distance
- In
*graph*representations, a synonym for*weight*. - divide and conquer
- A technique for designing algorithms where a solution is found
by breaking the problem into smaller (similar) subproblems,
solving the subproblems, then combining the subproblem solutions
to form the solution to the original problem.
This process is often implemented using
*recursion*. - divide-and-conquer recurrences
A common form of

*recurrence relation*that have the form\[{\bf T}(n) = a{\bf T}(n/b) + cn^k; \quad {\bf T}(1) = c\]where \(a\), \(b\), \(c\), and \(k\) are constants. In general, this recurrence describes a problem of size \(n\) divided into \(a\) subproblems of size \(n/b\), while \(cn^k\) is the amount of work necessary to combine the partial solutions.

- divide-and-guess
- A technique for finding a
*closed-form solution*to a*summation*or*recurrence relation*. - domain
- The set of possible inputs to a function.
- double buffering
- The idea of using multiple
*buffers*to allow the*CPU*to operate in parallel with a*peripheral storage*device. Once the first buffer's worth of data has been read in, the CPU can process this while the next block of data is being read from the peripheral storage. For this idea to work, the next block of data to be processed must be known or predicted with reasonable accuracy. - double hashing
- A
*collision resolution*method. A second hash function is used to generate a value \(c\) on the key. That value is then used by this key as the step size in*linear probing by steps*. Since different keys use different step sizes (as generated by the second hash function), this process avoids the clustering caused by standard linear probing by steps. - double rotation
- A type of
*rebalancing operation*used by the*Splay Tree*and*AVL Tree*. - doubly linked list
- A
*linked list*implementation variant where each list node contains access pointers to both the previous element and the next element on the list. - DSA
- Abbreviation for Data Structures and Algorithms.
- dynamic
- Something that is changes (in contrast to
*static*). In computer programming, dynamic normally refers to something that happens at run time. For example, run-time analysis is analysis of the program's behavior, as opposed to its (static) text or structure Dynamic binding or dynamic memory allocation occurs at run time. - dynamic allocation
- The act of creating an object from
*free store*. In C++, Java, and JavaScript, this is done using the`new`

operator. - dynamic array
- Arrays, once allocated, are of fixed size. A dynamic array puts
an
*interface*around the array so as to appear to allow the array to grow and shrink in size as necessary. Typically this is done by allocating a new copy, copying the contents of the old array, and then returning the old array to*free store*. If done correctly, the*amortized cost*for dynamically resizing the array can be made constant. In some programming languages such as Java, the term*vector*is used as a synonym for dynamic array. - dynamic memory allocation
- A programming technique where linked objects in a data structure
are created from
*free store*as needed. When no longer needed, the object is either returned to*free store*or left as*garbage*, depending on the programming language. - dynamic programming
- An approach to designing algorithms that works by storing a table
of results for subproblems.
A typical cause for excessive cost in
*recursive*algorithms is that different branches of the recursion might solve the same subproblem. Dynamic programming uses a table to store information about which subproblems have already been solved, and uses the stored information to immediately give the answer for any repeated attempts to solve that subproblem. - edge
- The connection that links two
*nodes*in a*tree*,*linked list*, or*graph*. - edit distance
- Given strings \(S\) and \(T\), the edit distance is a measure for the number of editing steps required to convert \(S\) into \(T\).
- efficient
- A solution is said to be efficient
if it solves the problem within the required
*resource constraints*. A solution is sometimes said to be efficient if it requires fewer resources than known alternatives, regardless of whether it meets any particular requirements. - element
- One value or member in a set.
- empirical comparison
- An approach to comparing to things by actually seeing how they perform. Most typically, we are referring to the comparison of two programs by running each on a suite of test data and measuring the actual running times. Empirical comparison is subject to many possible complications, including unfair selection of test data, and inaccuracies in the time measurements due to variations in the computing environment between various executions of the programs.
- empty
- For a
*container*class, the state of containing no*elements*. - encapsulation
- In programming, the concept of hiding implementation details
from the user of an ADT, and protecting
*data members*of an object from outside access. - enqueue
- A specialized term used to indicate inserting an element onto a queue.
- entry-sequenced file
- A file that stores records in the order that they were added to the file.
- enumeration
- The process by which a
*traversal*lists every object in the*container*exactly once. Thus, a traversal that prints the*nodes*is said to enumerate the nodes. An enumeration can also refer to the actual listing that is produced by the traversal (as well as the process that created that listing). - equidistribution property
- In random number theory, this means that a given series of random numbers cannot be described more briefly than simply listing it out.
- equivalence class
- An
*equivalence relation*can be used to partition a set into equivalence classes. - equivalence relation
- Relation \(R\) is an equivalence relation on set
\(\mathbf{S}\) if it is
*reflexive*,*symmetric*, and*transitive*. - estimation
- As a technical skill, this is the process of generating a rough estimate in order to evaluate the feasibility of a proposed solution. This is sometimes known as "back of the napkin" or "back of the envelope" calculation. The estimation process can be formalized as (1) determine the major parameters that affect the problem, (2) derive an equation that relates the parameters to the problem, then (3) select values for the parameters and apply the equation to yield an estimated solution.
- evaluation
- The act of finding the value for a polynomial at a given point.
- exact-match query
- Records are accessed by unique identifier.
- exceptions
- Exceptions are techniques used to predict possible runtime errors and handle them properly.
- exchange
- A swap of adjacent records in an
*array*. - exchange sort
- A sort that relies solely on exchanges (swaps of adjacent
records) to reorder the list.
*Insertion Sort*and*Bubble Sort*are examples of exchange sorts. All exchange sorts require \(\Theta(n^2)\) time in the*worst case*. - expanding the recurrence
- A technique for solving a
*recurrence relation*. The idea is to replace the recursive part of the recurrence with a copy of recurrence. - exponential growth rate
- A
*growth rate*function where \(n\) (the input size) appears in the exponent. For example, \(2^n\). - expression tree
- A
*tree*structure meant to represent a mathematical expression.*Internal nodes*of the expression tree are operators in the expression, with the subtrees being the sub-expressions that are its operand. All*leaf nodes*are operands. - extent
- A physically contiguous block of
*sectors*on a*disk drive*that are all part of a given disk file. The fewer extents needed to store the data for a disk file, generally the fewer*seek*operations that will be required to process a series of*disk access*operations on that file. - external fragmentation
- A condition that arises when a series of
*memory requests*result in lots of small*free blocks*, no one of which is useful for servicing typical requests. - external sort
- A sorting algorithm that is applied to data stored in
*peripheral storage*such as on a*disk drive*. This is in contrast to an*internal sort*that works on data stored in*main memory*. - factorial
- The factorial function is defined as \(f(n) = n f(n-1)\) for \(n > 0\).
- failure policy
- In a
*memory manager*, a failure policy is the response that takes place when there is no way to satisfy a*memory request*from the current*free blocks*in the*memory pool*. Possibilities include rejecting the request, expanding the memory pool, collecting*garbage*, and reorganizing the memory pool (to collect together free space). - family of languages
- Given some class or type of
*finite automata*(for example, the*deterministic finite automata*), the set of languages accepted by that class of finite automata is called a family. For example, the*regular languages*is a family defined by the DFAs. - FIFO
- Abbreviation for "first-in, first-out".
This is the access paradigm for a
*queue*, and an old terminology for the queue is "FIFO list". - file allocation table
- A legacy file system architecture orginially developed for DOS and then used in Windows. It is still in use in many small-scale peripheral devices such as USB memory sticks and digital camera memory.
- file manager
- A part of the
*operating system*responsible for taking requests for data from a*logical file*and mapping those requests to the physical location of the data on disk. - file processing
- The domain with Computer Science that deals with processing data
stored on a
*disk drive*(in a file), or more broadly, dealing with data stored on any*peripheral storage*device. Two fundamental properties make dealing with data on a peripheral device different from dealing with data in main memory: (1) Reading/writing data on a peripheral storage device is far slower than reading/writing data to main memory (for example, a typical disk drive is about a million times slower than*RAM*). (2) All I/O to a peripheral device is typically in terms of a*block*of data (for example, nearly all disk drives do all I/O in terms of blocks of 512 bytes). - file structure
- The organization of data on
*peripheral storage*, such as a*disk drive*or DVD drive. - final state
- A required element of any
*acceptor*. When computation on a string ends in a final state, then the machine accepts the string. Otherwise the machine rejects the string. - FIND
- One half of the
*UNION/FIND*algorithm for managing*disjoint sets*. It is the process of moving upwards in a tree to find the tree's root. - Finite State Acceptor
- A simple type of
*finite state automata*, an acceptor's only ability is to accept or reject a string. So, a finite state acceptor does not have the ability to modify the input tape. If computation on the string ends in a*final state*, then the the string is accepted, otherwise it is rejected. - Finite State Machine

FSM

Finite State Automata

FSA

Finite Automata - Any abstract state machine, generally represented as a graph
where the nodes are the
*states*, and the edges represent transitions between nodes that take place when the machine is in that node (state) and sees an appropriate input. See, as an example,*Deterministic Finite Automata*. - first fit
- In a
*memory manager*, first fit is a*heuristic*for deciding which*free block*to use when allocating memory from a*memory pool*. First fit will always allocate the first*free block*on the*free block list*that is large enough to service the memory request. The advantage of this approach is that it is typically not necessary to look at all free blocks on the free block list to find a suitable free block. The disadvantage is that it is not "intelligently" selecting what might be a better choice of free block. - fixed-length coding
- Given a collection of objects, a fixed-length coding scheme
assigns a code to each object in the collection using codes that
are all of the same length.
Standard ASCII and Unicode representations for characters are
both examples of fixed-length coding schemes.
This is in contrast to
*variable-length coding*. - floor
- Written \(\lfloor x \rfloor\), for real value \(x\) the floor is the greatest integer \(\leq x\).
- Floyd's algorithm
- An algorithm to solve the
*all-pairs shortest paths problem*. It uses the*dynamic programming*algorithmic technique, and runs in \(\Theta(n^3)\) time. As with any*dynamic programming*algorithm, the key issue is to avoid duplicating work by using proper bookkeeping on the algorithm's progress through the solution space. The basic idea is to first find all the direct edge costs, then improving those costs by allowing paths through*vertex*0, then the cheapest paths involving paths going through vertices 0 and 1, and so on. - flush
- The act of removing data from a
*cache*, most typically because other data considered of higher future value must replace it in the cache. If the data being flushed has been modified since it was first read in from*secondary storage*(and the changes are meant to be saved), then it must be written back to that secondary storage. - flush
- The the context of a
*buffer pool*, the process of removing the contents stored in a*buffer*when that buffer is required in order to store new data. If the buffer's contents have been changed since having been read in from*backing storage*(this fact would normally be tracked by using a*dirty bit*), then they must be copied back to the backing storage before the buffer can be reused. - flyweight
- A
*design pattern*that is meant to solve the following problem: You have an application with many objects. Some of these objects are identical in the information that they contain, and the role that they play. But they must be reached from various places, and conceptually they really are distinct objects. Because there is so much duplication of the same information, we want to reduce memory cost by sharing that space. For example, in document layout, the letter "C" might be represented by an object that describes that character's strokes and bounding box. However, we do not want to create a separate "C" object everywhere in the document that a "C" appears. The solution is to allocate a single copy of the shared representation for "C" objects. Then, every place in the document that needs a "C" in a given font, size, and typeface will reference this single copy. The various instances of*references*to a specific form of "C" are called flyweights. Flyweights can also be used to implement the empty leaf nodes of the*bintree*and*PR quadtree*. - folding method
- In
*hashing*, an approach to implementing a*hash function*. Most typically used when the key is a string, the folding method breaks the string into pieces (perhaps each letter is a piece, or a small series of letters is a piece), converts the letter(s) to an integer value (typically by using its underlying encoding value), and summing up the pieces. - Ford and Johnson sort
- A sorting algorithm that is close to the theoretical minimum number of key comparisons necessary to sort. Generally not considered practical in practice due to the fact that it is not efficient in terms of the number of records that need to be moved. It consists of first sorting pairs of nodes into winners and losers (of the pairs comparisons), then (recursively) sorting the winners of the pairs, and then finally carefully selecting the order in which the losers are added to the chain of sorted items.
- forest
- A collection of one or more
*trees*. - free block
- A block of unused space in a
*memory pool*. - free block list
- In a
*memory manager*, the list that stores the necessary information about the current*free blocks*. Generally, this is done with some sort of*linked list*, where each node of the linked list indicates the start position and length of the free block in the*memory pool*. - free store
- Space available to a program during runtime to be used for
*dynamic allocation*of objects. The free store is distinct from the*runtime stack*. The free store is sometimes referred to as the*heap*, which can be confusing because*heap*more often refers to a specific data structure. Most programming languages provide functions to allocate (and maybe to deallocate) objects from the free store, such as`new`

in C++ and Java. - free tree
- A connected,
*undirected graph*with no simple cycles. An equivalent definition is that a free tree is connected and has \(|\mathbf{V}| - 1\) edges. - freelist
- A simple and faster alternative to using
*free store*when the objects being dynamically allocated are all of the same size (and thus are interchangeable). Typically implemented as a*linked stack*, released objects are put on the front of the freelist. When a request is made to allocate an object, the freelist is checked first and it provides the object if possible. If the freelist is empty, then a new object is allocated from*free store*. - frequency count
- A
*heuristic*used to maintain a*self-organizing list*. Under this heuristic, a count is maintained for every record. When a record access is made, its count is increased. If this makes its count greater than that of another record in the list, it moves up toward the front of the list accordingly so as to keep the list sorted by frequency. Analogous to the*least frequently used*heuristic for maintaining a*buffer pool*. - full binary tree theorem
- This theorem states that
the number of leaves in a non-empty full binary tree is one
more than the number of internal nodes.
Equivalently, then number of null pointers in a standard
*pointer-based implementation for binary tree nodes*is one more than the number of nodes in the binary tree. - full tree
- A
*binary tree*is full if every*node*is either a*leaf node*or else it is an*internal node*with two non-empty*children*. - function
- In mathematics, a matching between inputs (the
*domain*) and outputs (the*range*). In programming, a subroutine that takes input parameters and uses them to compute and return a value. In this case, it is usually considered bad practice for a function to change any global variables (doing so is called a side effect). - garbage
- In
*memory management*, any memory that was previously (dynamically) allocated by the program during runtime, but which is no longer accessible since all pointers to the memory have been deleted or overwritten. In some languages, garbage can be recovered by*garbage collection*. In languages such as C and C++ that do not support garbage collection, so creating garbage is considered a*memory leak*. - garbage collection
- Languages with garbage collection such
Java, JavaScript, Lisp, and Scheme will periodically reclaim
*garbage*and return it to*free store*. - general tree
- A tree in which any given node can have any number of
*children*. This is in contrast to, for example, a*binary tree*where each node has a fixed number of children (some of which might be`null`

). General tree nodes tend to be harder to implement for this reason. - grammar
- A formal definition for what strings make up a
*language*, in terms of a set of*production rules*. - graph
- A
*graph*\(\mathbf{G} = (\mathbf{V}, \mathbf{E})\) consists of a set of*vertices*\(\mathbf{V}\) and a set of*edges*\(\mathbf{E}\), such that each edge in \(\mathbf{E}\) is a connection between a pair of vertices in \(\mathbf{V}\). - greedy algorithm
- An algorithm that makes locally optimal choices at each step.
- growth rate
- In
*algorithm analysis*, the rate at which the cost of the*algorithm*grows as the size of its input grows. - guess-and-test
- A technique used when trying to determine the
*closed-form solution*for a*summation*or*recurrence relation*. Given a hypothesis for the closed-form solution, if it is correct, then it is often relatively easy to prove that using*induction*. - guided traversal
- A
*tree traversal*that does not need to visit every node in the tree. An example would be a*range query*in a*BST*. - halt state
- In a
*finite automata*, a designated*state*which causes the machine to immediately halt when it is entered. - halted configuration
- A halted configuration occurs in a
*Turing machine*when the machine transitions into the*halt state*. - halting problem
- The halting problem is to answer this question:
Given a computer program \(P\) and an
input \(I\), will program \(P\) halt when executed on
input \(I\)?
This problem has been proved impossible to solve in the general
case.
Thus, it is an example of an
*unsolveable problem*. - handle
- When using a
*memory manager*to store data, the*client*will pass data to be stored (the*message*) to the memory manager, and the memory manager will return to the client a handle. The handle encodes the necessary information that the memory manager can later use to recover and return the message to the client. This is typically the location and length of the message within the*memory pool*. - hanging configuration
- A hanging configuration occurs in a
*Turing machine*when the I/O head moves to the left from the left-most square of the tape, or when the machine goes into an infinite loop. - hard algorithm
- "Hard" is traditionally defined in relation to running time, and a "hard" algorithm is defined to be an algorithm with exponential running time.
- hard problem
- "Hard" is traditionally defined in relation to running time, and a "hard" problem is defined to be one whose best known algorithm requires exponential running time.
- harmonic series
- The sum of reciprocals from 1 to \(n\) is called the Harmonic Series, and is written \({\cal H}_n\). This sum has a value between \(\log_e n\) and \(\log_e n + 1\).
- hash function
- In a
*hash system*, the function that converts a*key*value to a position in the*hash table*. The hope is that this position in the hash table contains the record that matches the key value. - hash system
- The implementation for search based on hash lookup in a
*hash table*. The*search key*is processed by a*hash function*, which returns a position in a*hash table*, which hopefully is the correct position in which to find the record corresponding to the search key. - hash table
- The data structure (usually an
*array*) that stores data records for lookup using*hashing*. - hashing
- A search method that uses a
*hash function*to convert a*search key*value into a position within a*hash table*. In a properly implemented*hash system*, that position in the table will have high probability of containing the record that matches the key value. Sometimes, the hash function will return an position that does not store the desired key, due to a process called*collision*. In that case, the desired record is found through a process known as*collision resolution*. - head
- The beginning of a
*list*. - header node
- Commonly used in implementations for a
*linked list*or related structure, this*node*preceeds the first element of the list. Its purpose is to simplify the code implementation by reducing the number of special cases that must be programmed for. - heap
- This term has two different meanings.
Uncommonly, it is a synonym for
*free store*. Most often it is used to refer to a particular data structure. This data structure is a*complete binary tree*with the requirement that every*node*has a value greater than its*children*(called a*max heap*), or else the requirement that every node has a value less than its children (called a*min heap*). Since it is a complete binary tree, a heap is nearly always implemented using an*array*rather than an explicit tree structure. To add a new value to a heap, or to remove the extreme value (the max value in a max-heap or min value in a min-heap) and update the heap, takes \(\Theta(\log n)\) time in the*worst case*. However, if given all of the values in an unordered array, the values can be re-arranged to form a heap in only \(\Theta(n)\) time. Due to its space and time efficiency, the heap is a popular choice for implementing a*priority queue*. - heapsort
- A sorting algorithm that costs \(\Theta(n \log n)\) time in
the
*best*,*average*, and*worst*cases. It tends to be slower than*Mergesort*and*Quicksort*. It works by building a*max heap*, and then repeatedly removing the item with maximum*key*value (moving it to the end of the heap) until all elements have been removed (and replaced at their proper location in the array). - height
- The height of a tree is one more than the
*depth*of the deepest*node*in the tree. - height balanced
- The condition the
*depths*of each*subtree*in a tree are roughly the same. - heuristic
- A way to solve a problem that is not guarenteed to be optimal. While it might not be guarenteed to be optimal, it is generally expected (by the agent employing the heuristic) to provide a reasonably efficient solution.
- heuristic algorithm
- A type of
*approximation algorithm*, that uses a*heuristic*to find a good, but not necessarily cheapest, solution to an*optimization problem*. - home position
- In
*hashing*, a synonym for*home slot*. - home slot
- In
*hashing*, this is the*slot*in the*hash table*determined for a given key by the*hash function*. - homogeneity
- In a
*container*class, this is the property that all objects stored in the ncontainer are of the same class. For example, if you have a list intended to store Payroll records, is it possible for the programmer to insert an integer onto the list instead? - Huffman codes
- The codes given to a collection of letters (or other symbols)
through the process of Huffman coding.
Huffman coding uses a
*Huffman coding tree*to generate the codes. The codes can be of variable length, such that the letters which are expected to appear most frequently are shorter. Huffman coding is optimal whenever the true frequencies are known, and the frequency of a letter is independent of the context of that letter in the message. - Huffman coding tree
- A Huffman coding tree is a
*full binary tree*that is used to represent letters (or other symbols) efficiently. Each letter is associated with a node in the tree, and is then given a*Huffman code*based on the position of the associated node. A Huffman coding tree is an example of a binary*trie*. - Huffman tree
- Shorter form of the term
*Huffman coding tree*. - I/O head
- On a
*disk drive*(or similar device), the part of the machinery that actually reads data from the disk. - image-space decomposition
- A from of
*key-space decomposition*where the*key space*splitting points is predetermined (typically by splitting in half). For example, a*Huffman coding tree*splits the letters being coded into those with codes that start with 0 on the left side, and those with codes that start with 1 on the right side. This regular decomposition of the key space is the basis for a*trie*data structure. An image-space decomposition is in opposition to an*object-space decomposition*. - in degree
- In
*graph*terminology, the in degree for a*vertex*is the number of edges directed into the vertex. - incident
- In
*graph*terminology, an edge connecting two vertices is said to be incident with those vertices. The two vertices are said to be*adjacent*. - index file
- A file whose records consist of
*key-value pairs*where the pointers are referencing the complete records stored in another file. - indexing
- The process of associating a
*search key*with the location of a corresponding data record. The two defining points to the concept of an index is the association of a key with a record, and the fact that the index does not actually store the record itself but rather it stores a*reference*to the record. In this way, a collection of records can be supported by multiple indices, typically a separate index for each key field in the record. - induction hypothesis
- The key assumption used in a
*proof by induction*, that the theorem to be proved holds for smaller instances of the theorem. The induction hypothesis is equivalent to the*recursive*call in a recursive function. - induction step
- Part of a
*proof by induction*. In its simplest form, this is a proof of the implication that if the theorem holds for $n-1$, then it holds for $n$. As an alternative, see*strong induction*. - induction variable
- The variable used to parameterize the theorem being proved by induction. For example, if we seek to prove that the sum of the integers from 1 to $n$ is $n(n+1)/2$, then $n$ is the induction variable. An induction variable must be an integer.
- information theoretic lower bound
- A
*lower bound*on the amount of resources needed to solve a*problem*based on the number of bits of information needed to uniquely specify the answer. Sometimes referred to as a "Shannon theoretic lower bound" due to Shannon's work on information theory and entropy. An example is that sorting has a lower bound of \(\Omega(\log_2 n!)\) because there are \(n!\) possible orderings for \(n\) values. This observation alone does not make the lower bound tight, because it is possible that no algorithm could actually reach the information theory lower limit. - inherit
- In
*object-oriented programming*, the process by which a*subclass*gains*data members*and*methods*from a*base class*. - initial state
- A synonym for
*start state*. - inode
- Short for "index node".
In UNIX-style file systems, specific disk
*sectors*that hold indexing information to define the layout of the file system. - inorder traversal
- In a
*binary tree*, a*traversal*that first*recursively**visits*the left*child*, then visits the*root*, and then recursively visits the right child. In a*binary search tree*, this traversal will*enumerate*the nodes in sorted order. - Insertion Sort
- A sorting algorithm with \(\Theta(n^2)\)
*average*and*worst case*cost, and \(Theta(n)\)*best case*cost. This best case cost makes it useful when we have reason to expect the input to be nearly sorted. - instance variable
- In
*object-oriented programming*, a synonym for*data member*. - integer function
- Any function whose input is an integer and whose output is an
integer. It can be proved by
*diagonalization*that the set of integer functions is*uncountably infinite*. - inter-sector gap
- On a disk drive, a physical gap in the data that occurs between
the
*sectors*. This allows the*I/O head*detect the end of the sector. - interface
- An interface is a class-like structure that only contains method
signatures and fields. An interface does not contain an implementation
of the methods or any
*data members*. - intermediate code
- A step in a typical
*compiler*is to transform the original high-level language into a form on which it is easier to do other stages of the process. For example, some compilers will transform the original high-level source code into*assembly code*on which it can do*code optimization*, before translating it into its final executable form. - intermediate code generation
- A phase in a
*compiler*, that walks through a*parse tree*to produce simple*assembly code*. - internal fragmentation
- A condition that occurs when more than \(m\) bytes
are allocated to service a
*memory request*for \(m\) bytes, wasting free storage. This is often done to simplify*memory management*. - internal node
- In a tree, any node that has at least one non-empty
*child*is an internal node. - internal sort
- A sorting algorithm that is applied to data stored in
*main memory*. This is in contrast to an*external sort*that is meant to work on data stored in*peripheral storage*such as on a*disk drive*. - interpolation
- The act of finding the coefficients of a polynomial, given the values at some points. A polynomal of degree \(n-1\) requires \(n\) points to interpolate the coefficients.
- interpolation search
- Given a sorted array, and knowing the first and last
*key*values stored in some subarray known to contain*search key*\(K\), interpolation search will compute the expected location of \(K\) in the subarray as a fraction of the distance between the known key values. So it will next check that computed location, thus narrowing the search for the next iteration. Given reasonable key value distribution, the*average case*for interpolation search will be \(\Theta(\log \log n)\), or better than the expected cost of*binary search*. Nonetheless, binary search is expected to be faster in nearly all practical situations due to the small difference between the two costs, combined with the higher constant factors required to implement interpolation search as compared to binary search. - interpreter
- In contrast to a
*compiler*that translates a high-level program into something that can be repeatedly executed to perform a computation, an interpreter directly performs computation on the high-level langauge. This tends to make the computation much slower than if it were performed on the directly executable version produced by a compiler. - inversion
- A measure of how disordered a series of values is. For each element \(X\) in the series, count one inversion for each element to left of \(X\) that is greater than the value of \(X\) (and so must ultimately be moved to the right of \(X\) during a sorting process).
- inverted file
- Synonym for
*inverted list*when the inverted list is stored in a disk file. - inverted list
- An
*index*which links*secondary keys*to either the associated*primary key*or the actual record in the database. - irreflexive
- In set notation, binary relation \(R\) on set \(S\) is irreflexive if \(aRa\) is never in the relation for any \(a \in \mathbf{S}\).
- ISAM
- Indexed Sequential Access Method: an obsolete method for
indexing data for (at the time) fast retrieval. More generally,
the term is used also to generically refer to an
*index*that supports both sequential and*keyed*access to data records. Today, that would nearly always be implemented using a*B-Tree*. - iterator
- In a
*container*such as a List, a separate class that indicates position within the container, with support for*traversing*through all*elements*in the container. - job
- Common name for processes or tasks to be run by an operating
system.
They typically need to be processed in order of
importance, and so are kept organized by a
*priority queue*. Another common use for this term is for a collection of tasks to be ordered by a*topological sort*. - jump search
- An algorithm for searching a sorted list, that falls between
*sequential search*and*binary search*in both computational cost and conceptual complexity. The idea is to keep jumping by some fixed number of positions until a value is found that is bigger than*search key*\(K\), then do a sequential search over the subarray that is now known to contain the search key. The optimal number of steps to jump will be \(\sqrt{n}\) for an array of size \(n\), and the*worst case*cost will be \(\Theta(\sqrt{n})\). - K-ary tree
- A type of
*full tree*where every internal node has exactly \(K\)*children*. - k-path
- In
*Floyd's algorithm*, a k-path is a path between two vertices \(i\) and \(j\) that can only go through vertices with an index value less than or equal to \(k\). - kd tree
- A
*spatial data structure*that uses a binary tree to store a collection of data records based on their (point) location in space. It uses the concept of a*discriminator*at each level to decide which single component of the*multi-dimensional search key*to branch on at that level. It uses a*key-space decomposition*, meaning that all data records in the left subtree of a node have a value on the corresponding discriminator that is less than that of the node, while all data records in the right subtree have a greater value. The*bintree*is the*image-space decomposition*analog of the kd tree. - key
- A field or part of a larger record used to represent that record
for the purpose of searching or comparing.
Another term for
*search key*. - key sort
- Any sorting operation applied to a collection of
*key-value pairs*where the value in this case is a*reference*to a complete record (that is, a pointer to the record in memory or a position for a record on disk). This is in contrast to a sorting operation that works directly on a collection of records. The intention is that the collection of key-value pairs is far smaller than the collection of records themselves. As such, this might allow for an*internal sort*when sorting the records directly would require an*external sort*. The collection of key-value pairs can also act as an*index*. - key space
- The range of values that a
*key*value may take on. - key-space decomposition
- The idea that the range for a
*search key*will be split into pieces. There are two general approaches to this:*object-space decomposition*and*image-space decomposition*. - key-value pair
- A standard solution for solving the problem of how to relate a
*key*value to a record (or how to find the key for a given record) within the context of a particular*index*. The idea is to simply store as records in the index pairs of keys and records. Specifically, the index will typically store a copy of the key along with a*reference*to the record. The other standard solution to this problem is to pass a*comparator*function to the index. - knapsack problem
- While there are many variations of this problem, here is a
typical version: Given knapsack of a fixed size, and a
collection of objects of various sizes, is there a subset of the
objects that exactly fits into the knapsack?
This problem is known to be
*NP-complete*, but can be solved for problem instances in practical time relatively quickly using*dynamic programming*. Thus, it is considered to have*pseudo-polynomial*cost. An*optimization problem*version is to find the subset that can fit with the greatest amount of items, either in terms of their total size, or in terms of the sum of values associated with each item. - Kruskal's algorithm
- An algorithm for computing the
*MCST*of a*graph*. During processing, it makes use of the*UNION/FIND*process to efficiently determine of two vertices are within the same*subgraph*. - labeled graph
- A
*graph*with labels associated with the*nodes*. - language
- A set of strings.
- Las Vegas algorithms
- A form of
*randomized algorithm*. We always find the maximum value, and "usually" we find it fast. Such algorithms have a guaranteed result, but do not guarantee fast running time. - leaf node
- In a
*binary tree*, leaf node is any node that has two empty*children*. (Note that a binary tree is defined so that every node has two children, and that is why the leaf node has to have two empty children, rather than no children.) In a general tree, any node is a leaf node if it has no children. - least frequently used
- Abbreviated
*LFU*, it is a*heuristic*that can be used to decide which*buffer*in a*buffer pool*to*flush*when data in the buffer pool must be replaced by new data being read into a*cache*. However,*least recently used*is more popular than LFU. Analogous to the*frequency count*heuristic for maintaining a*self-organizing list*. - least recently used
- Abbreviated
*LRU*, it is a popular*heuristic*to use for deciding which*buffer*in a*buffer pool*to*flush*when data in the buffer pool must be replaced by new data being read into a*cache*. Analogous to the*move-to-front*heuristic for maintaining a*self-organizing list*. - left recursive
- In automata theory, a
*production*is left recursive if it is of the form \(A \rightarrow Ax\), \(A \in V, x \in (V \cup T)^*\) where \(V\) is the set of*non-terminals*and \(T\) is the set of*terminals*in the*grammar*. - length
- In a
*list*, the number of elements. In a string, the number of characters. - level
- In a tree, all nodes of
*depth*\(d\) are at level \(d\) in the tree. The root is the only node at level 0, and its depth is 0. - lexical analysis
- A phase of a
*compiler*or*interpreter*responsible for reading in characters of the program or language and grouping them into*tokens*. - lexical scoping
- Within programming languages, the convention of allowing access
to a variable only within the block of code in which the
variable is defined.
A synonym for
*static scoping*. - LFU
- Abbreviation for
*least frequently used*. - lifetime
- For a variable, lifetime is the amount of time it will exist before it is destroyed.
- LIFO
- Abbreviation for "Last-In, First-Out".
This is the access paradigm for a
*stack*, and an old terminolgy for the stack is "LIFO list". - linear congruential method
In random number theory, a process for computing the next number in a

*pseudo-random*sequence. Starting from a*seed*, the next term \(r(i)\) in the series is calculated from term \(r(i-1)\) by the equation\[r(i) = (r(i-1)\times b) \bmod t\]where \(b\) and \(t\) are constants. These constants must be well chosen for the resulting series of numbers to have desireable properties as a random number sequence.

- linear growth rate
- For input size \(n\), a growth rate of \(cn\) (for \(c\) any positive constant). In other words, the cost of the associated function is linear on the input size.
- linear index
- A form of
*indexing*that stores*key-value pairs*in a sorted array. Typically this is used for an index to a large collection of records stored on disk, where the linear index itself might be on disk or in*main memory*. It allows for efficient search (including for*range queries*), but it is not good for inserting and deleting entries in the array. Therefore, it is an ideal indexing structure when the system needs to do range queries but the collection of records never changes once the linear index has been created. - linear order
- Another term for
*total order*. - linear probing
- In
*hashing*, this is the simplest*collision resolution*method. Term \(i\) of the*probe sequence*is simply \(i\), meaning that collision resolution works by moving sequentially through the hash table from the*home slot*. While simple, it is also inefficient, since it quickly leads to certain free*slots*in the hash table having higher probability of being selected during insertion or search. - linear probing by steps
- In
*hashing*, this*collision resolution*method is a variation on simple*linear probing*. Some constant \(c\) is defined such that term \(i\) of the*probe sequence*is \(ci\). This means that collision resolution works by moving sequentially through the hash table from the*home slot*in steps of size \(c\). While not much improvement on linear probing, it forms the basis of another collision resolution method called*double hashing*, where each key uses a value for \(c\) defined by a second*hash function*. - linear search
- Another name for
*sequential search*. - link node
- A widely used supporting object that forms the basic
building block for a
*linked list*and similar*data structures*. A link node contains one or more fields that store data, and a*pointer*or*reference*to another link node. - linked list
- An implementation for the list ADT that uses
*dynamic allocation*of*link nodes*to store the list elements. Common variants are the*singly linked list*,*doubly linked list*and*circular list*. The*overhead*required is the pointers in each link node. - linked stack
- Analogous to a
*linked list*, this uses*dynamic allocation*of nodes to store the elements when implementing the stack ADT. - list
- A finite, ordered sequence of
*data items*known as*elements*. This is close to the mathematical concept of a*sequence*. Note that "ordered" in this definition means that the list elements have position. It does not refer to the relationship between*key*values for the list elements (that is, "ordered" does not mean "sorted"). - literal
- In a
*Boolean expression*, a*literal*is a*Boolean variable*or its negation. In the context of compilers, it is any constant value. Similar to a*terminal*. - load factor
- In
*hashing*this is the fraction of the*hash table**slots*that contain a record. Hash systems usually try to keep the load factor below 50%. - local storage
- local storage.
- local variable
- A variable declared within a function or method.
It exists only from the time when the function is called to when
the function exits.
When a function is suspended (due to calling another function),
the function's local variables are stored in an
*activation record*on the*runtime stack*. - locality of reference
- The concept that accesses within a collection of records is not
evenly distributed.
This can express itself as some small fraction of the records
receiving the bulk of the accesses (
*80/20 rule*). Alternatively, it can express itself as an increased probability that the next or future accesses will come close to the most recent access. This is the fundamental property for success of*caching*. - logarithm
- The logarithm of base \(b\) for value \(y\) is the power to which \(b\) is raised to get \(y\).
- logical file
- In
*file processing*, the programmer's view of a*random access*file stored on*disk*as a contiguous series of bytes, with those bytes possibly combining to form data records. This is in contrast to the*physical file*. - logical form
- The definition for a data type in terms of an ADT. Contrast to
the
*physical form*for the data type. - lookup table
- A table of pre-calculated values, used to speed up processing
time when the values are going to be viewed many times. The
costs to this approach are the space required for the table and
the time required to compute the table. This is an example of a
*space/time tradeoff*. - lower bound
- In
*algorithm analysis*, a*growth rate*that is always less than or equal to the that of the*algorithm*in question. In practice, this is the fastest-growing function that we know grows no faster than all but a constant number of inputs. It could be a gross under-estimate of the truth. Since the lower bound for the algorithm can be very different for different situations (such as the*best case*or*worst case*), we typically have to specify which situation we are referring to. - lower bounds proof
- A proof regarding the lower bound, with this term most typically
referring to the lower bound for any possible algorithm to solve
a given
*problem*. Many problems have a simple lower bound based on the concept that the minimum amount of processing is related to looking at all of the problem's input. However, some problems have a higher lower bound than that. For example, the lower bound for the problem of sorting (\(\Omega(n \log n)\)) is greater than the input size to sorting (\(n\)). Proving such "non-trivial" lower bounds for problems is notoriously difficult. - LRU
- Abbreviation for
*least recently used*. - main memory
- A synonym for
*primary storage*. In a computer, typically this will be*RAM*. - map
- A
*data structure*that relates a*key*to a*record*. - mapping
- A
*function*that maps every element of a given*set*to a unique element of another set; a correspondence. - mark array
- It is typical in
*graph*algorithms that there is a need to track which nodes have been visited at some point in the algorithm. An*array*of bits or values called the*mark array*is often maintained for this purpose. - mark/sweep algorithm
- An algorithm for
*garbage collection*. All accessible variables, and any space that is reachable by a chain of pointers from any accessible variable, is "marked". Then a sequential sweep of all memory in the pool is made. Any unmarked memory locations are assumed to not be needed by the program and can be considered as free to be reused. - master theorem
- A theorem that makes it easy to solve
*divide-and-conquer recurrences*. - matching
- In graph theory, a pairing (or match) of various nodes in a graph.
- matching problem
- Any problem that involves finding a
*matching*in a graph with some desired property. For example, a well-known*NP-complete*problem is to find a*maximum match*for an undirected graph. - max heap
- A
*heap*where every*node*has a*key*value greater than its*children*. As a consequence, the node with maximum key value is at the*root*. - maximal match
- In a graph, any
*matching*that leaves no pair of unmatched vertices that are connected. A maximal matching is not necessarily a*maximum match*. In other words, there might be a larger matching than the maximal matching that was found. - maximum lower bound
- The
*lower bound*for the*problem*of finding the maximum value in an unsorted list is \(\Omega(n)\). - maximum match
- In a graph, the largest possible
*matching*. - MCST

MST - Abbreviation for
*minimal-cost spanning tree*. - measure of cost
- When comparing two things, such as two algorithms, some event or
unit must be used as the basic unit of comparison.
It might be number of milliseconds needed or machine instructions
expended by a program, but it is usually desirable to have a way
to do comparison between two algorithms without writing a
program.
Thus, some other measure of cost might be used as a basis for
comparison between the algorithms.
For example, when comparing two sorting algorthms it is
traditional to use as a measure of cost the number of
*comparisons*made between the key values of record pairs. - member
- In set notation, this is a synonym for
*element*. In abstract design, a*data item*is a member of a*type*. In an object-oriented language,*data members*are data fields in an object. - member function
- Each operation associated with the ADT is implemented by a
member function or
*method*. - memory allocation
- In a
*memory manager*, the act of honoring a request for memory. - memory deallocation
- In a
*memory manager*, the act of freeing a block of memory, which should create or add to a*free block*. - memory hierarchy
- The concept that a computer system stores data in a range of
storage types that range from fast but expensive
(
*primary storage*) to slow but cheap (*secondary storage*). When there is too much data to store in*primary storage*, the goal is to have the data that is needed soon or most often in the primary storage as much as possible, by using*caching*techniques. - memory leak
- In programming, the act of creating
*garbage*. In languages such as C and C++ that do not support*garbage collection*, repeated memory leaks will evenually cause the program to terminate. - memory manager
- Functionality for managing a
*memory pool*. Typically, the memory pool is viewed as an*array*of bytes by the memory manager. The*client*of the memory manager will request a collection of (adjacent) bytes of some size, and release the bytes for reuse when the space is no longer needed. The memory manager should not know anything about the interpretation of the data that is being stored by the client into the memory pool. Depending on the precise implementation, the client might pass in the data to be stored, in which case the memory manager will deal with the actual copy of the data into the memory pool. The memory manager will return to the client a*handle*that can later be used by the client to retrieve the data. - memory pool
- Memory (usually in
*RAM*but possibly on disk or*peripheral storage*device) that is logically viewed as an array of memory positions. A memory pool is usually managed by a*memory manager*. - memory request
- In a
*memory manager*, a request from some*client*to the memory manager to reserve a block of memory and store some bytes there. - merge insert sort
- A synonym for the
*Ford and Johnson sort*. - Mergesort
- A sorting algorithm that requires \(\Theta(n \log n)\) in
the
*best*,*average*, and*worst*cases. Conceptually it is simple: Split the list in half, sort the halves, then merge them together. It is a bit complicated to implement efficiently on an*array*. - message
- In a
*memory manager*implementation (particularly a memory manager implemented with a*message passing*style of*interface*), the message is the data that the*client*of the memory manager wishes to have stored in the*memory pool*. The memory manager will reply to the client by returning a*handle*that defines the location and size of the message as stored in the memory pool. The client can later recover the message by passing the handle back to the memory manager. - message passing
- A common approach to implementing the
*ADT*for a*memory manager*or*buffer pool*, where the contents of a*message*to be stored is explicitly passed between the client and the memory manager. This is in contrast to a*buffer passing*approach. - metaphor
- Humans deal with complexity by assigning a label to an assembly of objects or concepts and then manipulating the label in place of the assembly. Cognitive psychologists call such a label a metaphor.
- method
- In the
*object-oriented programming paradigm*, a method is an operation on a*class*. A synonym for*member function*. - mid-square method
- In
*hashing*, an approach to implementing a*hash function*. The key value is squared, and some number of bits from the middle of the resulting value are extracted as the hash code. Some care must be taken to extract bits that tend to actually be in the middle of the resulting value, which requires some understanding of the typical key values. When done correctly, this has the advantage of having the hash code be affected by all bits of the key - min heap
- A
*heap*where every*node*has a*key*value less than its*children*. As a consequence, the node with minimum key value is at the*root*. - minimal-cost spanning tree
- Abbreviated as MCST, or sometimes as MST.
Derived from a
*weighted graph*, the MCST is the*subset*of the graph's*edges*that maintains the connectivitiy of the graph while having lowest total cost (as defined by the sum of the*weights*of the edges in the MCST). The result is referred to as a*tree*because it would never have a*cycle*(since an edge could be removed from the cycle and still preserve connectivity). Two algorithms to solve this problem are*Prim's algorithm*and*Kruskal's algorithm*. - minimum external path weight
- Given a collection of objects, each associated with a
*leaf node*in a tree, the binary tree with minimum external path weight is the one with the minimum sum of*weighted path lengths*for the given set of leaves. This concept is used to create a*Huffman coding tree*, where a letter with high weight should have low depth, so that it will count the least against the total path length. As a result, another letter might be pushed deeper in the tree if it has less weight. - mod
- Abbreviation for the
*modulus*function. - model
- A simplification of reality that preserves only the essential
elements.
With a model, we can more easily focus on and reason about these
essentials.
In
*algorithm analysis*, we are especially concerned with the*cost model*for measuring the cost of an algorithm. - modulus
- The modulus function returns the
remainder of an integer division.
Sometimes written \(n \bmod m\) in mathematical expressions,
the syntax in many programming languages is
`n % m`

. - Monte Carlo algorithms
- A form of
*randomized algorithm*. We find the maximum value fast, or we don't get an answer at all (but fast). While such algorithms have good running time, their result is not guaranteed. - move-to-front
- A
*heuristic*used to maintain a*self-organizing list*. Under this heuristic, whenever a record is accessed it is moved to the front of the list. Analogous to the*least recently used*heuristic for maintaining a*buffer pool*. - multi-dimensional search key
- A search key containing multiple parts, that works in
conjunction with a
*multi-dimensional search structure*. Most typically, a*spatial*search key representing a position in multi-dimensional (2 or 3 dimensions) space. But a multi-dimensional key could be used to organize data within non-spatial dimensions, such as temperature and time. - multi-dimensional search structure
- A data structure used to support efficient search on a
*multi-dimensional search key*. The main concept here is that a multi-dimensional search structure works more efficiently by considering the multiple parts of the search key as a whole, rather than making independent searches on each one-dimensional component of the key. A primary example is a*spatial data structure*that can efficiently represent and search for records in multi-dimensional space. - multilist
- A list that may contain sublists.
This term is sometimes used as a synonym to the term
*bag*. - natural numbers
- Zero and the positive integers.
- necessary fallacy
- A common mistake in a
*lower bounds proof*for a problem, where the proof makes an inappropriate assumption that any algorithm must operate in some manner (typically in the way that some known algorithm behaves). - neighbor
- In a
*graph*, a*node*\(w\) is said to be a neighbor of*node*\(v\) if there is an*edge*from \(v\) to \(w\). - node
- The objects that make up a linked structure such as a linked
list or binary tree.
Typically, nodes are allocated using
*dynamic memory allocation*. In*graph*terminology, the nodes are more commonly called*vertices*. - non-deterministic
- In a
*finite automata*, at least one*state*has multiple transitions on at least one symbol. This means that it is not*deterministic*about what transition to take in that situation. A non-deterministic machine is said to*accept*a string if it completes execution on the string in an*accepting state*under at least one choice of non-deterministic transitions. Generally, non-determinism can be simulated with a deterministic machine by alternating between the execution that would take place under each of the branching choices. - non-deterministic algorithm
- An algorithm that may operate using a
*non-deterministic choice*operation. - non-deterministic choice
- An operation that captures the concept of nondeterminism. A nondeterministic choice can be viewed as either "correctly guessing" between a set of choices, or implementing each of the choices in parallel. In the parallel view, the nondeterminism was successful if at least one of the choices leads to a correct answer.
- non-deterministic polynomial time algorithm
- An algorithm that runs in polynomial time, and which may
(or might not) use
*non-deterministic choice*. - non-strict partial order
- In set notation, a relation that is
*reflexive*,*antisymmetric*, and*transitive*. - non-terminal
- In contrast to a
*terminal*, a non-terminal is an abstract state in a*production rule*. Begining with the*start symbol*, all non-terminals must be converted into terminals in order to complete a*derivation*. - NP
- An abbreviation for
*non-deterministic polynomial*. - NP-Complete
- A class of problems that are related to each other in this way: If ever one such problem is proved to be solvable in polynomial time, or proved to require exponential time, then all other NP-Complete problems will cost likewise. Since so many real-world problems have been proved to be NP-Complete, it would be extremely useful to determine if they have polynomial or exponential cost. But so far, nobody has been able to determine the truth of the situation. A more technical definition is that a problem is NP-Complete if it is in NP and is NP-hard.
- NP-Completeness proof
- A type of
*reduction*used to demonstrate that a particular*problem*is*NP-complete*. Specifically, an NP-Completeness proof must first show that the problem is in class*NP*, and then show (by using a reduction to another NP-Complete problem) that the problem is*NP-hard*. - NP-hard
- A problem that is "as hard as" any other problem in
*NP*. That is, Problem X is NP-hard if any algorithm in NP can be*reduced*to X in polynomial time. - nth roots of unity
- All of the points along the unit circle in the complex plane
that represent multiples of the
*primitive nth root of unity*. - object
- An instance of a
*class*, that is, something that is created and takes up storage during the execution of a computer program. In the*object-oriented programming paradigm*, objects are the basic units of operation. Objects have state in the form of*data members*, and they know how to perform certain actions (*methods*). - object-oriented programming paradigm
- An approach to problem-solving where all computations are
carried out using
*objects*. - object-space decomposition
- A from of
*key-space decomposition*where the*key space*is determined by the actual values of keys that are found. For example, a*BST*stores a key value in its root, and all other values in the tree with lesser value are in the left*subtree*. Thus, the root value has split (or decomposed) the key space for that key based on its value into left and right parts. An object-space decomposition is in opposition to an*image-space decomposition*. - octree
- The three-dimensional equivalent of the
*quadtree*would be a tree with \(2^3\) or eight branches. - Omega notation
- In
*algorithm analysis*, \(\Omega\) notation is used to describe a*lower bound*. Roughly (but not completely) analogous to*big-Oh notation*used to define an*upper bound*. - one-way list
- A synonym for a
*singly linked list*. - open addressing
- A synonym for
*closed hashing*. - open hash system
- A
*hash system*where multiple records might be associated with the same slot of a*hash table*. Typically this is done using a linked list to store the records. This is in contrast to a*closed hash system*. - operating system
- The control program for a computer. Its purpose is to control hardware, manage resources, and present a standard interface to these to other software components.
- optimal static ordering
- A theoretical construct defining the best static (non-changing)
order in which to place a collection of records so as to
minimize the number of records
*visited*by a series of sequential searches. It is a useful concept for the purpose of defining a theoretical optimum against which to compare the performance for a*self-organizing list heuristic*. - optimization problem
- Any problem where there are a (typically large) collection of potential solutions, and the goal is to find the best solution. An example is the Traveling Salesman Problem, where visiting \(n\) cities in some order has a cost, and the goal is to visit in the cheapest order.
- out degree
- In
*graph*terminology, the out degree for a*vertex*is the number of edges directed out of the vertex. - overflow
- The condition where the amount of data stored in an entity has
exceeded its capacity.
For example, a node in a
*B-tree*can store a certain number of records. If a record is attempted to be inserted into a node that is full, then something has to be done to handle this case. - overflow bucket
- In
*bucket hashing*, this is the*bucket*into which a record is placed if the bucket containing the record's*home slot*is full. The overflow bucket is logically considered to have infinite capacity, though in practice search and insert will become relatively expensive if many records are stored in the overflow bucket. - overhead
- All information stored by a data structure aside from the actual
data.
For example, the pointer fields in a
*linked list*or*BST*, or the unused positions in an*array-based list*. - page
- A term often used to refer to the contents of a single
*buffer*within a*buffer pool*or other*virtual memory*. This corresponds to a single*block*or*sector*of data from*backing storage*, which is the fundamental unit of I/O. - parameter
- The values making up an input to a
*function*. - parent
- In a tree, the
*node*\(P\) that directly links to a node \(A\) is the parent of \(A\). \(A\) is the*child*of \(P\). - parent pointer representation
- For
*trees*, a*node*implementation where each node stores only a pointer to its*parent*, rather than to its*children*. This makes it easy to go up the tree toward the*root*, but not down the tree toward the*leaves*. This is most appropriate for solving the*UNION/FIND*problem. - parity
- The concept of matching even-ness or odd-ness, the basic idea
behind using a
*parity bit*for error detection. - parity bit
- A common method for checking if transmission of a
sequence of bits has been performed correctly.
The idea is to count the number of 1 bits in the sequence, and
set the parity bit to 1 if this number is odd, and 0 if it is
even.
Then, the transmitted sequence of bits can be checked to see if
its parity matches the value of the parity bit.
This will catch certain types of errors, in particular if the
value for a single bit has been reversed.
This was used, for example, in early versions of
*ASCII character coding*. - parse tree
- A tree that represents the syntactic structure of an input
string, making it easy to compare against a
*grammar*to see if it is syntactically correct. - parser
- A part of a
*compiler*that takes as input the program text (or more typically, the tokens from the*scanner*), and verifies that the program is syntactically correct. Typically it will build a*parse tree*as part of the process. - partial order
- In set notation, a binary relation is called a partial order if
it is
*antisymmetric*and*transitive*. If the relation is also*reflexive*, then it is a*non-strict partial order*. Alternatively, if the relation is also*irreflexive*, then it is a*strict partial order*. - partially ordered set
- The set on which a
*partial order*is defined is called a partially ordered set. - partition
- In
*Quicksort*, the process of splitting a list into two sublists, such that one sublist has values less than the*pivot*value, and the other with values greater than the pivot. This process takes \(\Theta(i)\) time on a sublist of length \(i\). - pass by reference
- A
*reference*to the variable is passed to the called function. So, any modifications will affect the original variable. - pass by value
- A copy of a variable is passed to the called function. So, any modifications will not affect the original variable.
- path
- In
*tree*or*graph*terminology, a sequence of*vertices*\(v_1, v_2, ..., v_n\) forms a path of length \(n-1\) if there exist edges from \(v_i\) to \(v_{i+1}\) for \(1 \leq i < n\). - path compression
- When implementing the
*UNION/FIND*algorithm, path compression is a local optimization step that can be performed during the FIND step. Once the root of the tree for the current object has been found, the path to the root can be traced a second time, with all objects in the tree made to point directly to the root. This reduces the depth of the tree from typically \(\Theta(\log n)\) to nearly constant. - peripheral storage
- Any storage device that is not part of the core processing
of the computer (that is,
*RAM*). A typical example is a*disk drive*. - permutation
- A permutation of a sequence \(\mathbf{S}\)
is the
*elements*of \(\mathbf{S}\) arranged in some order. - persistent
- In the context of computer memory, this refers to a memory that does not lose its stored information when the power is turned off.
- physical file
- The collection of sectors that comprise a file on a
*disk drive*. This is in contrast to the*logical file*. - physical form
- The implementation of a data type as a data structure.
Contrast to the
*physical form*for the data type. - Pigeonhole Principle
- A commonly used lemma in Mathematics. A typical variant states: When \(n+1\) objects are stored in \(n\) locations, at least one of the locations must store two or more of the objects.
- pivot
- In
*Quicksort*, the value that is used to split the list into sublists, one with lesser values than the pivot, the other with greater values than the pivot. - platter
- In a
*disk drive*, one of a series of flat disks that comprise the storage space for the drive. Typically, each surface (top and bottom) of each platter stores data, and each surface has its own*I/O head*. - point quadtree
- A
*spatial data structure*for storing point data. It is similar to a*PR quadtree*in that it (in two dimensions) splits the world into four parts. However, it splits using an*object-space decomposition*. That is, quadrant containing the point is split into four parts at the point. It is similar to the*kd tree*which splits alternately in each dimension, except that it splits in all dimensions at once. - point-region quadtree
- Formal name for what is commonly referred to as a
*PR quadtree*. - pointee
- The term pointee refers to anything that is pointed to by a
*pointer*or*reference*. - pointer
- A variable whose value is the
*address*of another variable; a link. - pointer-based implementation for binary tree nodes
- A common way to implement
*binary tree**nodes*. Each node stores a data value (or a*reference*to a data value), and pointers to the left and right children. If either or both of the children does not exist, then a null pointer is stored. - polymorphism
- An
*object-oriented programming*term meaning*one name, many forms*. It describes the ability of software to change its behavior dynamically. Two basic forms exist:*run-time polymorphism*and*compile-time polymorphism*. - pop
- A specialized term used to indicate removing an
*element*from a*stack*. - poset
- Another name for a
*partially ordered set*. - position
- The defining property of the list ADT, this is the concept that list elements are in a position. Many list ADTs support access by position.
- postorder traversal
- In a
*binary tree*, a*traversal*that first*recursively**visits*the left*child*, then recursively visits the right child, and then visits the*root*. - potential
- A concept related to
*amortized analysis*. Potential is the total or currently available amount of work that can be done. - powerset
- For a
*set*\(\mathbf{S}\), the power set is the set of all possible*subsets*for \(\mathbf{S}\). - PR quadtree
- A type of
*quadtree*that stores point data in two dimensions. The root of the PR quadtree represents some square region of 2d space. If that space stores more than one data point, then the region is decomposed into four equal subquadrants, each represented*recursively*by a subtree of the PR quadtree. Since many leaf nodes of the PR quadtree will contain no data points, implementation often makes use of the*Flyweight**design pattern*. Related to the*bintree*. - prefix property
- Given a collection of strings, the collection has the prefix
property if no string in the collection is a prefix for another
string in the collection.
The significance is that, given a long string composed of
members of the collection, it can be uniquely decomposed into
the constituent members.
An example of such a collection of strings with the prefix
property is a set of
*Huffman codes*. - preorder traversal
- In a
*binary tree*, a*traversal*that first*visits*the*root*, then*recursively*visits the left*child*, then recursively visits the right child. - Prim's algorithm
- A
*greedy algorithm*for computing the*MCST*of a*graph*. It is nearly identical to*Dijkstra's algorithm*for solving the*single-source shortest paths problem*, with the only difference being the calculation done to update the best-known distance. - primary clustering
- In
*hashing*, the tendency in certain*collision resolution*methods to create clustering in sections of the hash table. The classic example is*linear probing*. This tends to happen when a group of keys follow the same*probe sequence*during collision resolution. - primary index
- Synonym for
*primary key index*. - primary key
- A unique identifier for a
*record*. - primary key index
- Relates each
*primary key*value with a pointer to the actual record on disk. - primary storage
- The faster but more expensive memory in a computer, most often
*RAM*in modern computers. This is in contrast to*secondary storage*, which together with primary storage devices make up the computer's*memory hierarchy*. - primitive data type
- In Java, one of a particular group of
*simple types*that are not implemented as objects. An example is an`int`

. - primitive element
- In set notation, this is a single element that is a member of the base type for the set. This is as opposed to an element of the set being another set.
- primitive nth root of unity
- The \(n\) th root of 1. Normally a
*complex number*. An intuitive way to view this is one \(n\) th of the unit circle in the complex plain. - priority
- A quantity assigned to each of a collection of
*jobs*or tasks that indicate importance for order of processing. For example, in an operating system, there could be a collection of processes (jobs) ready to run. The operating system must select the next task to execute, based on their priorities. - priority queue
- An ADT whose primary operations of insert of records, and
deletion of the greatest (or, in an alternative implementation,
the least) valued record.
Most often implemented using the
*heap*data structure. The name comes from a common application where the records being stored represent tasks, with the ordering values based on the*priorities*of the tasks. - probabilistic algorithm
- A form of
*randomized algorithm*that might yield an incorrect result, or that might fail to produce a result. - probabilistic data structure
- Any data structure that uses
*probabilistic algorithms*to perform its operations. A good example is the*skip list*. - probe function
- In
*hashing*, the function used by a*collision resolution*method to calculate where to look next in the*hash table*. - probe sequence
- In
*hashing*, the series of*slots*visited by the*probe function*during*collision resolution*. - problem
- A task to be performed.
It is best thought of as a
*function*or a mapping of inputs to outputs. - problem instance
- A specific selection of values for the parameters to a problem.
In other words, a specific set of inputs to a problem.
A given problem instance has a size under some
*cost model*. - problem lower bound
- In
*algorithm analysis*, the tightest*lower bound*that we can prove over all*algorithms*for that*problem*. This is often much harder to determine than the*problem upper bound*. Since the lower bound for the algorithm can be very different for different situations (such as the*best case*or*worst case*), we typically have to specify which situation we are referring to. - problem upper bound
- In
*algorithm analysis*, the*upper bound*for the best*algorithm*that we know for the*problem*. Since the upper bound for the algorithm can be very different for different situations (such as the*best case*or*worst case*), we typically have to specify which situation we are referring to. - procedural
- Typically referring to the
*procedural programming paradigm*, in contrast to the*object-oriented programming paradigm*. - procedural programming paradigm
- Procedural programming uses a list of instructions (and
procedure calls) that define a series of computational steps to
be carried out.
This is in contrast to the
*object-oriented programming paradigm*. - production

production rule - A
*grammar*is comprised of production rules. The production rules consist of*terminals*and*non-terminals*, with one of the non-terminals being the*start symbol*. Each production rule replaces one or more non-terminals (perhaps with associated terminals) with one or more terminals and non-terminals. Depending on the restrictions placed on the form of the rules, there are classes of languages that can be represented by specific types of grammars. A*derivation*is a series of productions that results in a string (that is, all non-terminals), and this derivation can be represented as a*parse tree*. - program
- An instance, or concrete representation, of an algorithm in some programming language.
- promotion
- In the context of certain
*balanced tree*structures such as the*2-3 tree*, a promotion takes place when an insertion causes the node to*overflow*. In the case of the 2-3 tree, the*key*with the middlemost value is sent to be stored in the parent. - proof
- The establishment of the truth of anything, a demonstration.
- proof by contradiction
- A mathematical proof technique that proves a theorem by first assuming that the theorem is false, and then uses a chain of reasoning to reach a logical contradiction. Since when the theorem is false a logical contradiction arises, the conclusion is that the theorem must be true.
- proof by induction
- A mathematical proof technique similar to
*recursion*. It is used to prove a parameterized theorem $S(n)$, that is, a theorem where there is a*induction variable*involved (such as the sum of the numbers from 1 to $n$). One first proves that the theorem holds true for a*base case*, then one proves the implication that whenever $S(n)$ is true then $S(n+1)$ is also true. Another variation is*strong induction*. - proving the contrapositive
- We can prove that \(P \Rightarrow Q\) by proving \((\mathrm{not}\ Q) \Rightarrow (\mathrm{not}\ P)\).
- pseudo polynomial
- In complexity analysis, refers to the time requirements of an
algorithm for an
*NP-Complete*problem that still runs acceptably fast for practical application. An example is the standard*dynamic programming*algorithm for the*knapsack problem*. - pseudo random
- In random number theory this means that, given all past terms in the series, no future term of the series can be accurately predicted in polynomial time.
- pseudo-random probing
- In
*hashing*, this is a*collision resolution*method that stores a random permutation of the values 1 through the size of the*hash table*. Term \(i\) of the*probe sequence*is simply the value of position \(i\) in the permuation. - push
- A specialized term used to indicate inserting an
*element*onto a*stack*. - pushdown automata

PDA - A type of
*Finite State Automata*that adds a stack memory to the basic*Deterministic Finite Automata*machine. This extends the set of languages that can be recognize to the*context-free languages*. - quadratic growth rate
- A growth rate function of the form \(cn^2\) where \(n\) is the input size and \(c\) is a constant.
- quadratic probing
- In
*hashing*, this is a*collision resolution*method that computes term \(i\) of the*probe sequence*using some quadratic equation \(ai^2 _ bi + c\) for suitable constants \(a, b, c\). The simplest form is simply to use \(i^2\) as term \(i\) of the probe sequence. - quadtree
- A
*full tree*where each internal node has four children. Most typically used to store two dimensional*spatial data*. Related to the*bintree*. The difference is that the quadtree splits all dimensions simultaneously, while the bintree splits one dimension at each level. Thus, to extend the quadtree concept to more dimensions requires a rapid increase in the number of splits (for example, 8 in three dimensions). - queue
- A list-like structure in which elements are inserted only at one end, and removed only from the other one end.
- Quicksort
- A sort that is \(\Theta(n \log n)\) in the
*best*and*average*cases, though \(\Theta(n^2)\) in the*worst case*. However, a reasonable implmentation will make the worst case occur under exceedingly rare circumstances. Due to its tight inner loop, it tends to run better than any other known sort in general cases. Thus, it is a popular sort to use in code libraries. It works by divide and conquer, by selecting a*pivot*value, splitting the list into parts that are either less than or greater than the pivot, and then sorting the two parts. - radix
- Synonym for
*base*. The number of digits in a number representation. For example, we typically represent numbers in base (or radix) 10. Hexidecimal is base (or radix) 16. - radix sort
- A sorting algorithm that works by processing records with
\(k\) digit keys in \(k\) passes, where each pass sorts
the records according to the current digit. At the end of the
process, the records will be sorted. This can be efficient if
the number of digits is small compared to the number of
records. However, if the \(n\) records all have unique key
values, than at least \(\Omega(\log n)\) digits are required,
leading to an \(\Omega(n \log n)\) sorting algorithm that
tends to be much slower than other sorting algorithms like
*Quicksort*or*mergesort*. - RAM
- Abbreviation for
*Random Access Memory*. - random access
- In
*file processing*terminology, a*disk access*to a random position within the file. More generally, the ability to access an arbitrary record in the file. - random access memory
- Abbreviated
*RAM*, this is the principle example of*primary storage*in a modern computer. Data access times are typically measured in billionths of a second (microseconds), which is roughly a million times faster than data access from a disk drive. RAM is where data are held for immediate processing, since access times are so much faster than for*secondary storage*. RAM is a typical part of a computer's*memory hierarchy*. - random permutation
- One of the \(n!\) possible permutations for a set of \(n\) element is selected in such a way that each permutation has equal probability of being selected.
- randomized algorithm
- An algorithm that involves some form of randomness to control
its behavior.
The ultimate goal of a randomized algorithm is to improve
performance over a deterministic algorithm to solve the same
problem.
There are a number of variations on this theme.
A "Las Vegas algorithm" returns a correct result,
but the amount of time required might or might not improve over
a
*deterministic algorithm*. A "Monte Carlo algorithm" is a form of*probabilistic algorithm*that is not guarenteed to return a correct result, but will return a result relatively quickly. - range
- The set of possible outputs for a function.
- range query
- Records are returned if their relevant key value falls with a specified range.
- read/write head
- Synonym for
*I/O head*. - rebalancing operation
- An operation performed on balanced search trees, such as the
*AVL Tree*or*Splay Tree*, for the purpose of keeping the tree*height balanced*. - record
- A collection of information, typically implemented as an
*object*in an*object-oriented programming language*. Many data structures are organized containers for a collection of records. - recurrence relation
- A
*recurrence relation*(or less formally, recurrence) defines a function by means of an expression that includes one or more (smaller) instances of itself. A classic example is the*recursive*definition for the factorial function, \(F(n) = n*F(n-1)\). - recurrence with full history
- A special form of
*recurrence relation*that includes a summation with a copy of the recurrence inside. The recurrence that represents the average case cost for*Quicksort*is an example. This internal summation can typically be removed with simple techniques to simplify solving the recurrence. - recursion
- The process of using recursive calls.
An algorithm is recursive if it calls itself to do part of
its work.
See
*recursion*. - recursive call
- Within a
*recursive function*, it is a call that the function makes to itself. - recursive data structure
- A data structure that is partially
composed of smaller or simpler instances of the same data structure.
For example,
*linked lists*and*binary trees*can be viewed as recursive data structures. - recursive function
- A function that includes a
*recursive call*. - recursively enumerable
- A language \(L\) is recursively enumerable
if there exists a
*Turing machine*\(M\) such that \(L = L(M)\). - Red-Black Tree
- A balanced variation on a
*BST*. - reduction
- In
*algorithm analysis*, the process of deriving*asymptotic bounds*for one*problem*from the asymptotic bounds of another. In particular, if problem A can be used to solve problem B, and problem A is proved to be in \(O(f(n))\), then problem B must also be in \(O(f(n))\). Reductions are often used to show that certain problems are at least as expensive as sorting, or that certain problems are*NP-Complete*. - reference
- A value that enables a program to directly access some
particular
*data item*. An example might be a byte position within a file where the record is stored, or a pointer to a record in memory. (Note that Java makes a distinction between a reference and the concept of a pointer, since it does not define a reference to necessarily be a byte position in memory.) - reference count algorithm
- An algorithm for
*garbage collection*. Whenever a reference is made from a variable to some memory location, a counter associated with that memory location is incremented. Whenever the reference is changed or deleted, the reference count is decremented. If this count goes to zero, then the memory is considered free for reuse. This approach can fail if there is a cycle in the chain of references. - reference parameter
- A
*parameter*that has been*passed by reference*. Such a parameter can be modified inside the function or method. - reflexive
- In set notation, binary relation \(R\) on set \(S\) is reflexive if \(aRa\) for all \(a \in \mathbf{S}\).
- Region Quadtree
- A
*spatial data structure*for storing 2D pixel data. The idea is that the root of the tree represents the entire image, and it is recursively divided into four equal subquadrants if not all pixels associated with the current node have the same value. This is structurally equivalent to a*PR quadtree*, only the decomposition rule is changed. - regular expression
- A way to specify a set of strings that define a language using
the operators of union, contatenation, and star-closure.
A regular expression defines some
*regular language*. - regular language
- A language \(L\) is a regular language if and only if
there exists a
*Deterministic Finite Automata*\(M\) such that \(L = L(M)\). - relation
- In set notation, a relation \(R\) over set \(\mathbf{S}\) is a set of ordered pairs from \(\mathbf{S}\).
- replacement selection
- A variant of
*heapsort*most often used as one phase of an*external sort*. Given a collection of records stored in an*array*, and a stream of additional records too large to fit into*working memory*, replacement selection will unload the*heap*by sending records to an output stream, and seek to bring new records into the heap from the input stream in preference to shrinking the heap size whenever possible. - reserved block
- In a
*memory manager*, this refers to space in the*memory pool*that has been allocated to store data received from the*client*. This is in contrast to the*free blocks*that represent space in the memory pool that is not allocated to storing client data. - resource constraints
- Examples of resource constraints include the total space available to store the data (possibly divided into separate main memory and disk space constraints) and the time allowed to perform each subtask.
- root
- In a
*tree*, the topmost*node*of the tree. All other nodes in the tree are*descendants*of the root. - rotation
- In the
*AVL Tree*and*Splay Tree*, a rotation is a local operation performed on a node, its children, and its grandchildren that can result in reordering their relationship. The goal of performing a rotation is to make the tree more*balanced*. - rotational delay
- When processing a
*disk access*, the time that it takes for the first byte of the desired data to move to under the*I/O head*. On average, this will take one half of a disk rotation, and so constitutes a substantial portion of the time required for the disk access. - rotational latency
- A synonym for
*rotational delay*. - run
- A series of sorted records.
Most often this refers to a (sorted) subset of records that are
being sorted by means of an
*external sort*. - run file
- A temporary file that is created during the operation of an
*external sort*, the run file contains a collection of*runs*. A common structure for an external sort is to first create a series of runs (stored in a run file), followed by merging the runs together. - run-time polymorphism
- A form of
*polymorphism*known as Overriding. Overridden methods are those which implement a new method with the same signature as a method inherited from its*base class*. Compare to*compile-time polymorphism*. - runtime environment
- The environment in which a program (of a particular programming
language) executes.
The runtime environment handles such activities as managing the
*runtime stack*, the*free store*, and the*garbage collector*, and it conducts the execution of the program. - runtime stack
- The place where an
*activation record*is stored when a subroutine is called during a program's runtime. - scanner
- The part of a
*compiler*that is responsible for doing*lexical analysis*. - scope
- The parts of a program that can see and access a variable.
- search key
- A field or part of a record that is used to represent the record when searching. For example, in a database of customer records, we might want to search by name. In this case the name field is used as the search key.
- search lower bound
- The problem of searching in an
*array*has provable lower bounds for specific variations of the problem. For an unsorted array, it is \(\Omega(n)\)*comparisons*in the*worst case*, typically proved using an*adversary argument*. For a sorted array, it is \(\Omega(\log n)\) in the worst case, typically proved using an argument similar to the*sorting lower bound*proof. However, it is possible to search a sorted array in the average case in \(O(\log \log n)\) time. - search problem
- Given a particular key value \(K\), the search problem is to
locate a
*record*\((k_j, I_j)\) in some collection of records**L**such that \(k_j = K\) (if one exists).*Searching*is a systematic method for locating the record (or records) with key value \(k_j = K\). - search tree
- A
*tree*data structure that makes search by*key*value more efficient. A type of*container*, it is common to implement an*index*using a search tree. A good search tree implementation will guarentee that insertion, deletion, and search operations are all \(\Theta(\log n)\). - search trie
- Any
*search tree*that is a*trie*. - searching
- Given a
*search key*\(K\) and some collection of records**L**, searching is a systematic method for locating the record (or records) in**L**with key value \(k_j = K\). - secondary clustering
- In
*hashing*, the tendency in certain*collision resolution*methods to create clustering in sections of the hash table. In*primary clustering*, this is caused by a cluster of keys that don't necessarily hash to the same slot but which following significant portions of the same*probe sequence*during collision resolution. Secondary clustering results from the keys hashing to the same slot of the table (and so a collision resolution method that is not affected by the key value must use the same probe sequence for all such keys). This problem can be resolved by*double hashing*since its probe sequence is determined in part by a second hash function. - secondary index
- Synonym for
*secondary key index*. - secondary key
- A key field in a record such as salary, where a particular key
value might be duplicated in multiple records. A secondary key
is more likely to be used by a user as a search key than is the
record's
*primary key*. - secondary key index
- Associates a
*secondary key*value with the*primary key*of each record having that secondary key value. - secondary storage
- Refers to slower but cheaper means of storing data.
Typical examples include a
*disk drive*, a USB memory stick, or a solid state drive. - sector
- A unit of space on a
*disk drive*that is the amount of data that will be read or written at one time by the disk drive hardware. This is typically 512 bytes. - sector header
- On a disk drive, a piece of information at the start of a
*sector*that allows the*I/O head*to recognize the identity (or equivalently, the address) of the current sector. - seed
- In random number theory, the starting value for a random number
series.
Typically used with any
*linear congruential method*. - seek
- On a
*disk drive*, the act of moving the*I/O head*from one*track*to another. This is usually considered the most expensive step during a*disk access*. - selection sort
- While this sort requires \(\Theta(n^2)\) time in the
*best*,*average*, and*worst*cases, it requires only \(\Theta(n)\) swap operations. Thus, it does relatively well in applications where swaps are expensive. It can be viewed as an optimization on*bubble sort*, where a swap is deferred until the end of each iteration. - self-organizing list
- A list that, over a series of search operations, will make
use of some
*heuristic*to re-order its elements in an effort to improve search times. Generally speaking, search is done sequentially from the beginning, but the self-organizing heuristic will attempt to put the records that are most likely to be searched for at or near the front of the list. While typically not as efficient as*binary search*on a sorted list, self-organizing lists do not require that the list be sorted (and so do not pay the cost of doing the sorting operation). - self-organizing list heuristic
- A
*heuristic*to use for the purpose of maintaining a*self-organizing list*. Commonly used heuristics include*move-to-front*and*transpose*. - separate chaining
- In
*hashing*, a synonym for*open hashing* - sequence
- In set notation, a collection of elements with an order, and
which may contain duplicate-valued elements.
A sequence is also sometimes called a
*tuple*or a*vector*. - sequential access
- In
*file processing*terminology, the requirement that all records in a file are accessed in sequential order. Alternatively, a storage device that can only access data sequentially, such as a tape drive. - sequential fit
- In a
*memory manager*, the process of searching the*memory pool*for a*free block*large enough to service a*memory request*, possibly reserving the remaining space as a free block. Examples are*first fit*,*circular first fit*,*best fit*, and*worst fit*. - sequential search
- The simplest search algorithm: In an
*array*, simply look at the array elements in the order that they appear. - sequential tree representation
- A representation that stores a series of node values with the
minimum information needed to reconstruct the tree structure.
This is a technique for
*serializing*a tree. - serialization
- The process of taking a data structure in memory and
representing it as a sequence of bytes.
This is sometimes done in order to transmit the data structure
across a network or store the data structure in a
*stream*, such as on disk.*Deserialization*reconstructs the original data structure from the serialized representation. - set
- A collection of distinguishable
*members*or*elements*. - set product
- Written \(\mathbf{Q} \times \mathbf{P}\), the set product is a set of ordered pairs such that ordered pair \((a, b)\) is in the product whenever \(a \in \mathbf{P}\) and \(b \in \mathbf{Q}\). For example, when \(\mathbf{P} = \{2, 3, 5\}\) and \(\mathbf{Q} = \{5, 10\}\), \(\mathbf{Q} \times \mathbf{P} = \{(2, 5),\ (2, 10),\ (3, 5),\ (3, 10),\ (5, 5),\ (5, 10)\}\).
- shallow copy
- Copying the
*reference*or*pointer*value without copying the actual content. - Shellsort
- A sort that relies on the best-case cost of
*insertion sort*to improve over \(\Theta(n^2)\)*worst case*cost. - shifting method
- A technique for finding a
*closed-form solution*to a*summation*or*recurrence relation*. - shortest path
- Given a
*graph*with distances or*weights*on the*edges*, the shortest path between two nodes is the path with least total distance or weight. Examples of the shortest paths problems are the*single-source shortest paths problem*and the*all-pairs shortest paths problem*. - sibling
- In a
*tree*, a sibling of*node*\(A\) is any other node with the same*parent*as \(A\). - signature
- In a programming language, the signature for a function is its return type and its list of parameters and their types.
- signature file
- In document processing, a signature file is a type of
*bitmap*used to indicate which documents in a collection contain a given keyword, such that there is a*bitmap*for each keyword. - simple cycle
- In
*graph*terminology, a*cycle*is simple if its corresponding*path*is simple, except that the first and last*vertices*of the cycle are the same. - simple path
- In
*graph*terminology, a*path*is simple if all vertices on the path are distinct. - simple type
- A
*data type*whose values contain no subparts. An example is the integers. - simulating recursion
- If a programming language does not support
*recursion*, or if you want to implement the effects of recursion more efficiently, you can use a*stack*to maintain the collection of subproblems that would be waiting for completion during the recursive process. Using a loop, whenever a recursive call would have been made, simply add the necessary program state to the stack. When a return would have been made from the recursive call, pop the previous program state off of the stack. - single rotation
- A type of
*rebalancing operation*used by the*Splay Tree*and*AVL Tree*. - single-source shortest paths problem
- Given a
*graph*with*weights*or distances on the*edges*, and a designated start*vertex*\(s\), find the shortest path from \(s\) to every other vertex in the graph. One algorithm to solve this problem is*Dijkstra's algorithm*. - singly linked list
- A
*linked list*implementation variant where each list node contains access an pointer only to the next element in the list. - skip list
- A form of
*linked list*that adds additional links to improve the cost of fundamental operations like insert, delete, and search. It is a*probabilistic data structure*since it adds the additional links using a*probabilistic algorithm*. It can implement a*dictionary*more efficiently than a*BST*, and is roughly as difficult to implement. - slot
- In
*hashing*, a position in a*hash table*. - snowplow argument
- An analogy used to give intuition for why
*replacement selection*will generate*runs*that are on average twice the size of working memory. Records coming from the input stream have key values that might be of any size, whose size is related to the position of a falling snowflake. The replacement selection process is analogous to a snowplow that moves around a circular track picking up snow. In steady state, given a certain amount of snow equivalent to*working memory*size \(M\), an amount of snow (incoming records from the input stream) is expected to fall ahead of the plow as the size of the working memory during one cycle of the plow (analogously, one run of the replacement selection algorithm). Thus, the snowplow is expected in one pass (one run of replacement selection) to pick up \(2M\) snow. - software engineering
- The study and application of engineering to the design, development, and maintenance of software.
- software reuse
- In
*software engineering*, the concept of reusing a piece of software. In particular, using an existing piece of software (such as a function or library) when creating new software. - solution space
- The possible solutions to a problem.
This typically refers to an
*optimization problem*, where some solutions are more desireable than others. - solution tree
- An ordering imposed on the set of solutions within a
*solution space*in the form of a tree, typically derived from the order that some algorithm would visit the solutions. - sorted list
- A
*list*where the records stored in the list are arranged so that their*key*values are in ascending order. If the list uses an*array-based list*implementation, then it can use*binary search*for a cost of \(\Theta(\log n)\). But both insertion and deletion will be require \(\Theta(n)\) time. - sorting lower bound
- The lower bound for the
*problem*of*sorting*is \(\Omega(n \log n)\). This is traditionally proved using a*decision tree*model for sorting algorithms, and recognizing that the minimum depth of the decision tree for any sorting algorithm is \(\Omega(n \log n)\) since there are \(n!\) permutations of the \(n\) input records to distinguish between during the sorting process. - sorting problem
- Given a set of records \(r_1\), \(r_2\), ..., \(r_n\)
with
*key*values \(k_1\), \(k_2\), ..., \(k_n\), the sorting problem is to arrange the records into any order \(s\) such that records \(r_{s_1}\), \(r_{s_2}\), ..., \(r_{s_n}\) have keys obeying the property \(k_{s_1} \leq k_{s_2} \leq ... \leq k_{s_n}\). In other words, the sorting problem is to arrange a set of records so that the values of their key fields are in non-decreasing order. - space/time tradeoff
- Many programs can be designed to either speed processing at the cost of additional storage, or reduce storage at the cost of additional processing time.
- sparse graph
- A
*graph*where the actual number of*edges*is much less than the possible number of edges. Generally, this is interpreted to mean that the*degree*for any*vertex*in the graph is relatively low. - sparse matrix
- A matrix whose values are mostly zero. There are a number of data structures that have been developed to store sparse matrices, with the goal of reducing the amount of space required to represent it as compared to simply using a regular matrix representation that stores a value for every matrix position.
- spatial
- Referring to a position in space.
- spatial application
- An application what has spatial aspects. In particular, an application that stores records that need to be searched by location.
- spatial attribute
- An attribute of a record that has a position in space, such as the coordinate. This is typically in two or more dimensions.
- spatial data
- Any object or record that has a position (in space).
- spatial data structure
- A
*data structure*designed to support efficient processing when a*spatial attribute*is used as the key. In particular, a data structure that supports efficient search by location, or finds all records within a given region in two or more dimensions. Examples of spatial data structures to store point data include the*bintree*, the*PR quadtree*and the*kd tree*. - spindle
- The center of a
*disk drive*that holds the*platters*in place. - Splay Tree
- A variant implementation for the
*BST*, which differs from the standard BST in that it uses modified insert and remove methods in order to keep the tree*balanced*. Similar to an*AVL Tree*in that it uses the concept of*rotations*in the insert and remove operations. While a Splay Tree does not guarentee that the tree is balanced, it does guarentee that a series of \(n\) operations on the tree will have a total cost of \(\Theta(n \log n)\) cost, meaning that any given operation can be viewed as having*amortized cost*of \(\Theta(\log n)\). - splaying
- The act of performing an
*rebalancing operation*on a*Splay Tree*. - stable
- A sorting algorithm is said to be stable if it does not
change the relative ordering of records with identical
*key*values. - stack
- A list-like structure in which elements may be inserted or removed from only one end.
- stack frame
- Frame of data that pushed into and poped from call stack
- stack variable
- Another name for a
*local variable*. - stale pointer
- Within the context of a
*buffer pool*or*memory manager*, this means a*reference*to a*buffer*or memory location that is no longer valid. For example, a program might make a memory request to a buffer pool, and be given a reference to the buffer holding the requested data. Over time, due to inactivity, the contents of this buffer might be flushed. If the program holding the buffer reference then tries to access the contents of that buffer again, then the data contents will have changed. The possibility for this to occur depends on the design of the interface to the buffer pool system. Some designs make this impossible to occur. Other designs make it possible in an attempt to deliver greater performance. - start state
- In a
*finite automata*, the designated state in which the machine will always begin a computation. - start symbol
- In a
*grammar*, the designated*non-terminal*that is the intial point for*deriving*a string in the langauge. - state
- The condition that something is in at some point in time.
In computing, this typically means the collective values of any
existing variables at some point in time.
In an
*automata*, a state is an abstract condition, possibly with associated information, that is primarily defined in terms of the conditions that the automata may transition from its present state to another state. - State Machine
- Synonym for
*finite automata*. - static
- Something that is not changing (in contrast to
*dynamic*). In computer programming, static normally refers to something that happens at compile time. For example, static analysis is analysis of the program's text or structure, as opposed to its run-time behavior. Static binding or static memory allocation occurs at compile time. - static scoping
- A synonym for
*lexical scoping*. - Strassen's algorithm
- A
*recursive*algorithm for matrix multiplication. When multiplying two \(n \times n\) matrices, this algorithm runs faster than the \(\Theta(n^3)\) time required by the standard matrix multiplication algorithm. Specifically, Strassen's algorithm requires time \(Theta(n^{\log_2 7})\) time. This is achieved by refactoring the sub-matrix multiplication and addition operations so as to need only 7 sub-matrix multiplications instead of 8, at a cost of additional sub-matrix addition operations. Thus, while the asymptotic cost is lower, the constant factor in the growth rate equation is higher. This makes Strassen's algorithm inefficient in practice unless the arrays being multiplied are rather large. Variations on Strassen's algorithm exist that reduce the number of sub-matrix multiplications even futher at a cost of even more sub-matrix additions. - strategy
- An approach to accomplish a task, often encapsulated as an
algorithm.
Also the name for a
*design pattern*that separates the algorithm for performing a task from the control for applying that task to each member of a collection. A good example is a generic sorting function that takes a collection of records (such as an*array*) and a "strategy" in the form of an algorithm that knows how to extract the key from a record in the array. Only subtly different from the*visitor*design pattern, where the difference is primarily one of intent rather than syntax. The strategy design pattern is focused on encapsulating an activity that is part of a larger process, so that different ways of performing that activity can be substituted. The visitor design pattern is focused on encapsulating an activity that will be performed on all members of a collection so that completely different activities can be substituted within a generic method that accesses all of the collection members. - stream
- The process of delivering content in a
*serialized*form. - strict partial order
- In set notation, a relation that is
*irreflexive*,*antisymmetric*, and*transitive*. - strong induction
- An alternative formulation for the
*induction step*in a*proof by induction*. The induction step for strong induction is: If**Thrm**holds for all \(k, c \leq k < n\), then**Thrm**holds for \(n\). - subclass
- In
*object-oriented programming*, any class within a*class hierarchy*that*inherits*from some other class. - subgraph
- A subgraph \(\mathbf{S}\) is formed from
*graph*\(\mathbf{G}\) by selecting a*subset*\(\mathbf{V}_s\) of \(\mathbf{G}\)'s*vertices*and a subset \(\mathbf{E}_s\) of \(\mathbf{G}\)'s*edges*such that for every edge \(e \in \mathbf{E}_s\), both vertices of \(e\) are in \(\mathbf{V}_s\). - subset
- In set theory, a set \(A\) is a subset of a set
\(B\), or equivalently \(B\) is a
*superset*of \(A\), if all elements of \(A\) are also elements of \(B\). - subtract-and-guess
- A technique for finding a
*closed-form solution*to a*summation*or*recurrence relation*. - subtree
- A subtree is a
*subset*of the nodes of a binary tree that includes some node \(R\) of the tree as the subtree*root*along with all the*descendants*of \(R\). - successful search
- When searching for a
*key*value in a collection of records, we might find it. If so, we call this a successful search. The alternative is an*unsuccessful search*. - summation
- The sum of costs for some
*function*applied to a range of parameter values. Often written using Sigma notation. For example, the sum of the integers from 1 to \(n\) can be written as \(\sum_{i=1}^{n} i\). - superset
- In set theory, a set \(A\) is a
*subset*of a*set*\(B\), or equivalently \(B\) is a*superset*of \(A\), if all*elements*of \(A\) are also elements of \(B\). - symbol table
- As part of a
*compiler*, the symbol table stores all of the identifiers in the program, along with any necessary information needed about the identifier to allow the compiler to do its job. - symmetric
- In set notation, relation \(R\) is symmetric if whenever \(aRb\), then \(bRa\), for all \(a, b \in \mathbf{S}\).
- symmetric matrix
- A square matrix that is equal to its
*transpose*. Equivalently, for a \(n \times n\) matrix \(A\), for all \(i,j < n\), \(A[i, j] = A[j, i]\). - syntax analysis
- A phase of
*compilation*that accepts*tokens*, checks if program is syntactically correct, and then generates a*parse tree*. - tail
- The end of a
*list*. - terminal
- A specific character or string that appears in a
*production rule*. In contrast to a*non-terminal*, which represents an abstract state in the production. Similar to a*literal*, but this is the term more typically used in the context of a*compiler*. - Theta notation
- In
*algorithm analysis*, \(\Theta\) notation is used to indicate that the*upper bound*and*lower bound*for an*algorithm*or*problem*match. - token
- The basic logical units of a program, as deterimined by
*lexical analysis*. These are things like arithmetic operators, language keywords, variable or function names, or numbers. - tombstone
- In
*hashing*, a tombstone is used to mark a*slot*in the*hash table*where a record has been deleted. Its purpose is to allow the*collision resolution*process to probe through that slot (so that records further down the*probe sequence*are not unreachable after deleting the record), while also allowing the slot to be reused by a future insert operation. - topological sort
- The process of laying out the
*vertices*of a*DAG*in a*linear order*such that no vertex \(A\) in the order is preceded by a vertex that can be reached by a (directed)*path*from \(A\). Usually the (directed) edges in the graph define a prerequisite system, and the goal of the topological sort is to list the vertices in an order such that no prerequisites are violated. - total order
- A binary relation on a set where every pair of distinct elements
in the set are
*comparable*(that is, one can determine which of the pair is greater than the other). - total path length
- In a
*tree*, the sum of the*levels*for each*node*. - Towers of Hanoi problem
- A standard example of a recursive algorithm. The problem starts with a stack of disks (each with unique size) stacked decreasing order on the left pole, and two additional poles. The problem is to move the disks to the right pole, with the constraints that only one disk can be moved at a time and a disk may never be on top of a smaller disk. For \(n\) disks, this problem requires \(\Theta(2^n)\) moves. The standard solution is to move \(n-1\) disks to the middle pole, move the bottom disk to the right pole, and then move the \(n-1\) disks on the middle pole to the right pole.
- track
- On a
*disk drive*, a concentric circle representing all of the*sectors*that can be viewed by the*I/O head*as the disk rotates. The significance is that, for a given placement of the I/O head, the sectors on the track can be read without performing a (relatively expensive)*seek*operation. - track-to-track seek time
- Expected (average) time to perform a
*seek*operation from a random*track*to an adjacent track. Thus, this can be viewed as the minimum possible seek time for the*disk drive*. This is one of two metrics commonly provided by disk drive vendors for disk drive performance, with the other being*average seek time*. - trailer node
- Commonly used in implementations for a
*linked list*or related structure, this*node*follows the last element of the list. Its purpose is to simplify the code implementation by reducing the number of special cases that must be programmed for. - transducer
- A machine that takes an input and creates an output.
A
*Turing Machine*is an example of a transducer. - transitive
- In set notation, relation \(R\) is transitive if whenever \(aRb\) and \(bRc\), then \(aRc\), for all \(a, b, c \in \mathbf{S}\).
- transpose
- In the context of linear algebra,
the transpose of a matrix \(A\) is
another matrix \(A^T\) created by writing the rows of
\(A\) as the columns of \(A^T\).
In the context of a
*self-organizing list*, transpose is a*heuristic*used to maintain the list. Under this heuristic, whenever a record is accessed it is moved one position closer to the front of the list. - trap state
- In a
*FSA*, any state that has all transitions cycle back to itself. Such a state might be*final*. - traversal
- Any process for visiting all of the objects in a collection
(such as a
*tree*or*graph*) in some order. - tree
- A tree \(\mathbf{T}\) is a finite set of one or more
*nodes*such that there is one designated node \(R\), called the*root*of \(\mathbf{T}\). If the set \((\mathbf{T} -\{R\})\) is not empty, these nodes are partitioned into \(n > 0\)*disjoint sets*\(\mathbf{T}_0\), \(\mathbf{T}_1\), ..., \(\mathbf{T}_{n-1}\), each of which is a tree, and whose*roots*\(R_1, R_2, ..., R_n\), respectively, are*children*of \(R\). - tree traversal
- A
*traversal*performed on a tree. Traditional tree traversals include*preorder*and*postorder*traversals for both*binary*and*general*trees, and*inorder traversal*that is most appropriate for a*BST*. - trie
- A form of
*search tree*where an internal node represents a split in the*key space*at a predetermined location, rather than split based on the actual*key*values seen. For example, a simple binary search trie for key values in the range 0 to 1023 would store all records with key values less than 512 on the left side of the tree, and all records with key values equal to or greater than 512 on the right side of the tree. A trie is always a*full tree*. Folklore has it that the term comes from "retrieval", and should be pronounced as "try" (in contrast to "tree", to distinguish the differences in the space decomposition method of a search tree versus a search trie). The term "trie" is also sometimes used as a synonym for the*alphabet trie*. - truth table
- In symbolic logic, a table that contains as rows all possible combinations of the boolean variables, with a column that shows the outcome (true or false) for the expression when given that row's truth assignment for the boolean variables.
- tuple
- In set notation, another term for a
*sequence*. - Turing machine
- A type of
*finite automata*that, while simple to define completely, is capable of performing any computation that can be performed by any known computer. - Turing-acceptable
- A language is \(Turing-acceptable\) if there is some
*Turing machine*that*accepts*it. That is, the machine will halt in an accepting configuration if the string is in the language, and go into a*hanging configuration*if the string is not in the language. - Turing-computable function
- Any function for which there exists a Turing machine that can perform the necessary work to compute the function.
- Turing-decidable
- A language is Turing-decideable if there exists a Turing machine
that can clearly indicate for every string whether that string
is in the language or not.
Every Turing-decidable language is also Turing-acceptable,
because the Turing machine that can decide if the string is in
the language can be modified to go into a
*hanging configuration*if the string is not in the language. - two-coloring
- An assignment from two colors to regions in an image such that no two regions sharing a side have the same color.
- type
- A collection of values.
- unary notation
- A way to represent
*natural numbers*, where the value of zero is represented by the empty string, and the value \(n\) is represented by a series of \(n\) marks. - uncountably infinite

uncountable - An infinite set is uncountably infinite if there does not exist
any mapping from it to the set of integers.
This is often proved using a
*diagonalization argument*. The real numbers is an example of an uncountably infinite set. - underflow
- The condition where the amount of data stored in an entity has
dropped below some minimum threshold.
For example, a node in a
*B-tree*is required to be at least half full. If a record deletion causes the node to be less than half full, then it is in a condition of underflow, and something has to be done to correct this. - undirected edge
- An
*edge*that connects two*vertices*with no direction between them. Many graph representations will represent such an edge with two*directed edges*. - undirected graph
- A
*graph*whose*edges*do not have a direction. - uninitialized
- Uninitialized variable means it has no initial value.
- UNION
- One half of the
*UNION/FIND*algorithm for managing*disjoint sets*. It is the process of merging two trees that are represented using the*parent pointer representation*by making the root for one of the trees set its parent pointer to the root of the other tree. - UNION/FIND
- A process for mainining a collection of disjoint sets.
The
*FIND*operation determines which disjoint set a given object resides in, and the*UNION*operation combines two disjoint sets when it is determined that they are members of the same*equivalence class*under some*equivalence relation*. - unit production
- A unit production is a
*production*in a*grammar*of the form \(A \rightarrow B\), where \(A, B \in\) the set of*non-terminals*for the grammar. Any grammar with unit productions can be rewritten to remove them. - unsolveable problem
- A problem that can proved impossible to solve on a computer.
The classic example is the
*halting problem*. - unsorted list
- A
*list*where the records stored in the list can appear in any order (as opposed to a*sorted list*). An unsorted list can support efficient (\(\Theta(1)\)) insertion time (since you can put the record anywhere convenient), but requires \(\Theta(n)\) time for both search and and deletion. - unsuccessful search
- When searching for a
*key*value in a collection of records, we might not find it. If so, we call this an unsuccessful search. Usually we require that this means that no record in the collection actually has that key value (though a*probabilistic algorithm*for search might not require this to be true). The alternative to an unsuccessful search is a*successful search*. - unvisited
- In
*graph*algorithms, this refers to a node that has not been processed at the current point in the algorithm. This information is typically maintained by using a*mark array*. - upper bound
- In
*algorithm analysis*, a*growth rate*that is always greater than or equal to the that of the*algorithm*in question. In practice, this is the slowest-growing function that we know grows at least as fast as all but a constant number of inputs. It could be a gross over-estimate of the truth. Since the upper bound for the algorithm can be very different for different situations (such as the*best case*or*worst case*), we typically have to specify which situation we are referring to. - value parameter
- A
*parameter*that has been*passed by value*. Changing such a parameter inside the function or method will not affect the value of the calling parameter. - variable-length coding
- Given a collection of objects, a variable-length coding scheme
assigns a code to each object in the collection using codes that
can be of different lengths.
Typically this is done in a way such that the objects that are
most likely to be used have the shortest codes, with the goal of
minimizing the total space needed to represent a sequence of
objects, such as when representing the characters in a document.
*Huffman coding*is an example of a variable-length coding scheme. This is in contrast to*fixed-length coding*. - vector
- In set notation, another term for a
*sequence*. As a data structure, the term vector usually used as a snyonym for a*dynamic array*. - vertex
- Another name for a
*node*in a*graph*. - virtual memory
- A memory management technique for making relatively fast but
small memory appear larger to the program.
The large "virtual" data space is actually stored on a
relatively slow but large
*backing storage*device, and portions of the data are copied into the smaller, faster memory as needed by use of a*buffer pool*. A common example is to use*RAM*to manage access to a large virtual space that is actually stored on a*disk drive*. The programmer can implement a program as though the entire data content were stored in RAM, even if that is larger than the physical RAM available making it easier to implement. - visit
- During the process of a
*traversal*on a*graph*or*tree*the action that takes place on each*node*. - visited
- In
*graph*algorithms, this refers to a node that has previously been processed at the current point in the algorithm. This information is typically maintained by using a*mark array*. - visitor
- A
*design pattern*where a*traversal*process is given a function (known as the visitor) that is applied to every object in the collection being traversed. For example, a generic tree or graph traversal might be designed such that it takes a function parameter, where that function is applied to each node. - volatile
- In the context of computer memory, this refers to a memory that loses all stored information when the power is turned off.
- weight
- A cost or distance most often associated with an
*edge*in a*graph*. - weighted graph
- A
*graph*whose*edges*each have an associated*weight*or cost. - weighted path length
- Given a tree, and given a
*weight*for each leaf in the tree, the weighted path length for a leaf is its weight times its*depth*. - weighted union rule
- When merging two disjoint sets using the
*UNION/FIND*algorithm, the weighted union rule is used to determine which subtree's root points to the other. The root of the subtree with fewer nodes will be set to point to the root of the subtree with more nodes. In this way, the average depth of nodes in the resulting tree will be less than if the assignment had been made in the other direction. - working memory
- The portion of
*main memory*available to an algorithm for its use. Typically refers to main memory made available to an algorithm that is operating on large amounts of data stored in*peripheral storage*, the working memory represents space that can hold some subset of the total data being processed. - worst case
- In algorithm analysis, the
*problem instance*from among all problem instances for a given input size \(n\) that has the greatest cost. Note that the worst case is**not**when \(n\) is big, since we are referring to the wrost from a class of inputs (i.e, we want the worst of those inputs of size \(n\)). - worst fit
- In a
*memory manager*, worst fit is a*heuristic*for deciding which*free block*to use when allocating memory from a*memory pool*. Worst fit will always allocate from the largest free block. The rationale is that this will be the method least likely to cause*external fragmentation*in the form of small, unuseable memory blocks. The disadvantage is that it tends to eliminate the availability of large freeblocks needed for unusually large requests. - zigzig
- A type of
*rebalancing operation*used by*splay trees*. - Zipf distribution
- A data distribution that follows Zipf's law, an emprical
observation that many types of data studied in the physical and
social sciences follow a power law probability distribution.
That is, the frequency of any record in the data collection is
inversely proportional to its rank when the collection is sorted
by frequency.
Thus, the most frequently appearing record has a frequency much
higher than the next most frequently appearing record, which in
turn has a frequency much higher than the third (but with ratio
slightly lower than that for the first two records) and so on.
The
*80/20 rule*is a casual characterization of a Zipf distribution. Adherence to a Zipf distribution is important to the successful operation of a*cache*or*self-organizing list*. - zone
- In
*memory management*, the concept that different parts of the*memory pool*are handled in different ways. For example, some of the memory might be handled by a simple*freelist*, while other portions of the memory pool might be handled by a*sequential fit*memory manager. On a*disk drive*the concept of a zone relates to the fact that there are limits to the maximum data density, combined with the fact that the need for the same angular distance to be used for a sector in each track means that tracks further from the center of the disk will become progressively less dense. A zone in this case is a series of adjacent tracks whose data density is set by the maximum density of the innermost track of that zone. The next zone can then reset the data density for its innermost track, thereby gaining more total storage space while preserving angular distance for each sector.