.. _StringSearchRabinKarp:
.. raw:: html
.. |--| unicode:: U+2013 .. en dash
.. |---| unicode:: U+2014 .. em dash, trimming surrounding whitespace
:trim:
.. This file is part of the OpenDSA eTextbook project. See
.. http://algoviz.org/OpenDSA for more details.
.. Copyright (c) 2012-13 by the OpenDSA Project Contributors, and
.. distributed under an MIT open source license.
.. avmetadata::
:author: Tom Naps and Sam Micka
Rabin-Karp String Search Algorithm [Draft]
===========================================
Rabin-Karp String Search Algorithm [Draft]
------------------------------------------
..
..
The Rabin-Karp algorithm is based on what could be imagined as a
"perfect hash function for strings". We will assume that our
strings are drawn from an alphabet with :math:`C` possible
characters. Denote the characters in string :math:`S` by :math:`s_0,
s_1, \ldots s_{n-1}`. Suppose that we have a mapping :math:`c
\rightarrow \hat{c}` that associates with each character :math:`c` an
integer :math:`\hat{c}` in the range :math:`0 \ldots c - 1`. Then a
"perfect hash function for strings" is:
.. math::
\widehat{s_0} \times C^{n-1} + \widehat{s_1} \times C^{n-2} + \ldots + \widehat{s_{n-2}} \times C + \widehat{s_{n-1}} \times C^0
Suppose that we call this a string's "magic number". In effect it
associates each string with a unique number in the base :math:`C`
number system. However, nothing is perfect -- these magic numbers for
strings get very big very quickly. Hence the following sub-algorithm
of Rabin-Karp to compute a string's magic number (which is itself
known as Horner's polynomial evaluation algorithm) takes this into
account by using the :math:`mod` operator to avoid an overflow
condition.
Slideshow for Horner's Method algorithm for computing Rabin-Karp "magic number" for a string
.. avembed:: AV/Development/StringMatch/Rabin_Karp_Horner_Slideshow.html ss
:module: StringSearchRabinKarp
:points: 0.0
:required: False
:threshold: 1.0
:id: 178200
:exer_opts: JXOP-debug=true&JOP-lang=en&JXOP-code=java
To check your understanding of this "magic number" computation try the
following exercise in using Horner's Method to compute a string's
"magic number" in a simple case
.. avembed:: Exercises/Development/StringMatch/Rabin_Karp_Horners_Exercise.html ka
:module: StringSearchRabinKarp
:points: 1.0
:required: True
:threshold: 5.0
:id: 178201
:exer_opts: JXOP-debug=true&JOP-lang=en&JXOP-code=java
Because Horner's Method cannot truly compute a magic number that is
unique for every string, the Rabin-Karp algorithm must allow for two
different strings having the same magic number. In effect, such a
situation represents a "false positive" in which Rabin-Karp thinks it
has found a match only to be disappointed. Watch Rabin-Karp in action
in the following slideshow.
.. avembed:: AV/Development/StringMatch/Rabin_Karp_Algorithm_Slideshow.html ss
:module: StringSearchRabinKarp
:points: 0.0
:required: False
:threshold: 1.0
:id: 178202
:exer_opts: JXOP-debug=true&JOP-lang=en&JXOP-code=java
Finally try this exercise in tracing one step of the Rabin-Karp
algorithm using the modified Horner's algorithm to compute the "magic
number" of a string.
.. avembed:: Exercises/Development/StringMatch/Rabin_Karp_Next_Step.html ka
:module: StringSearchRabinKarp
:points: 1.0
:required: True
:threshold: 5.0
:id: 178203
:exer_opts: JXOP-debug=true&JOP-lang=en&JXOP-code=java