3.Review
How do we prove that a language is regular?
We have a number of approaches in our toolbox.
How do we know if a language is non-regular?
Given so many tools for creating a regular language, are there
languages that are not regular?
(The very fact that we are concerned with this question is a hint that
this can happen.)
If a language L is finite, is L regular?
If L is infinite, is L regular?
Cycles
Consider a DFA (or NFA) M for a language L (that
is, L(M)=L).
If L is finite, can M not have a cycle?
If L is finite, can M have a cycle?
If L is infinite, can M not have a cycle?
If L is infinite, can M have a cycle?
Something to Think About, Revisited
L2={anbn∣n>0}
Is language L2 regular?
Can you draw a DFA, regular expression, or Regular grammar for this
language?
Our First Non-regular Language (1)
Prove that L2={anbn|n>0} is not regular.
Proof 1 (by contradiction)
If L2 is regular then ∃ DFA M
that recognizes L2.
M has a finite number of states, say some number less
than m states.
Consider a long string ambm∈L2.
Since there are less than m states and m a’s,
some state in M must be reached more than once when
following the path of am.
In that case, there is a loop with one or more a’s
(say t a’s for some t>1) along the path.
Our First Non-regular Language (2)
Proof 1 (continued)
Suppose we start at the initial state, traverse the same path for
ambm, but we traverse the loop of a ‘s one
additional time.
We will end up in the same final state that ambm did,
but our actual number of a’s is m+t.
Therefore, the string am+tbm is accepted by M,
but this string is not in L2. Contradiction!
Thus, L2 is not regular.
Pigeonhole Principle
This is an example of the Pigeonhole Principle.
The Pigeonhole Principle states that, given n pigeonholes
and n+1 pigeons, when all of the pigeons go into the holes
we can be sure that at least one hole contains more than one pigeon.
In our case, the number of a ‘s are the pigeons,
and the states in the DFA are the pigeonholes.
We can’t distinguish the various possibilities for the number of
a ‘s, so we can’t verify that they properly match the number
of b ‘s.
Pumping Concept (1)
We introduce the concept of “pumping” the string as
we go around the cycle.
Cycles are how we get infinite languages.
They are also how we lose count or otherwise lose the ability to
distinguish various properties of the string being processed.
Another view: the memory of a DFA is embodied
explicitly in its set of states.
Since the number of states is finite, the memory is finite.
A program in a traditional programming language might
have one integer (that conceptually at least stores an infinite number
of values, even if that is not literally true), a DFA has no such
thing.
Pumping Concept (2)
In general, consider DFAs with or without cycles.
If there is no cycle, the language accepted is finite.
If there is one or more cycle, then the language is infinite.
Let’s assume that there is exactly one cycle.
The cycle might be skipped, executed once, or executed more than once.
We can consider any string accepted by this DFA when the cycle is
executed exactly once to be of the form
w=w1vw2 where the v represents the part of the
string captured by the cycle.
If the cycle is skipped, then we get w=w1w2
If its executed twice we get w=w1v2w2.
In general, the DFA accepts all strings like w=w1v∗w2.
Pumping Lemma
Let L be an infinite regular language.
Then there exists a constant m>0 such that any
w∈L with |w|≥m can be decomposed into three
parts as w=xyz with:
|xy|≤m
|y|≥1
xyiz∈L for all i≥0
Meaning: Every sufficiently long string in L
(the constant m corresponds to the finite number of states in
M)
can be partitioned into three parts such that the middle
part can be “pumped”, resulting in strings that must be in L.
P.L. Proof Template
Proof by Contradiction.
Assume L is regular.
Therefore L satisfies the pumping lemma.
Choose a long string w∈L, |w|≥m.
(The choice of the string is crucial.
We must pick a string that will yield a contradiction.
Some strings make the proof easier or harder.)
Show that there is no division of w into xyz
(must consider all possible divisions) such that
|xy|≤m, |y|≥1 and xyiz∈L
for all i≥0.
The pumping lemma does not hold. Contradiction!
⇒L is not regular.
Example Proof
Prove that L={anbn|n≥0} is not regular.
Assume L is regular, therefore the pumping lemma holds.
Choose w=ambm
where m is the constant in the pumping lemma.
(Note that w must be chosen such that |w|≥m.)
Now show there is no partition of w into xyz
such that |xy|≤m, |y|≥1,
and xyiz∈L for all i≥0.
If xy is am, then
x=am−k|y=ak|z=bm
where k>0.
It should be true that xyiz∈L for all i≥0.
But clearly this is not true, no matter what k>0 we
pick. Contradiction!
For any other partion with xy of length less than
m, we have a contradiction because in all such cases
y is some a’s, which cannot be pumped.
Harder version (1)
Prove that L={anbn|n≥0} is not regular.
Assume L is regular, therefore the pumping lemma holds.
Choose w=am/2bm/2
where m is the constant in the pumping lemma.
(Note that w must be chosen such that |w|≥m.)
Now show there is no partition of w into xyz
such that |xy|≤m, |y|≥1,
and xyiz∈L for all i≥0.
Harder version (2)
In the last version of the proof, all partitions with
|xy|≤m required that y be all a’s.
This version is harder because the dividing line can be
anywhere within w=am/2bm/2 (the requirement is
just that |xy|≤m). We have to show that none of
them work.
In some partitions, y is all a’s.
In some, it is all b’s. In some, it is a’s followed by b`s.
But in every possible partition, y cannot be pumped.
So we still have a contradition, and so the language cannot be
regular.
Observations
Unfortunately, the pumping lemma is one-way:
For (some) languages we can use the pumping lemma to prove that they
are not regular.
But we cannot use the pumping lemma to help us prove that a language
is regular.
And the pumping lemma is not a universal solution for determining that
a language is non-regular.
Its just a tool in the toolbox.
More Observations
The pumping lemma says that there
is some way to define the language that meets the criteria.
It is not enough to pick your favorite value of m for which
the language would/would not be regular.
You have to show that no satisfactory m can exist.
But, the string you pick can be related to m (such as
always choosing ambm in our last proof).
Once you pick the string, all possible decompositions
must be unpumpable.
If you do a good job at picking the string as a function of
m, then there might not be many different ways to
subdivide into xyz.
Can View as Adversary Argument
Your want to establish a contradiction (to prove the language is
not regular)
Meanwhile, the opponent tries to stop the proof.
The moves in the game are:
1. The opponent picks m.
2. We pick string w in L of length equal or greater
than m.
We are free to chose any w, so long as w∈L and
|w|≥m.
3. The opponent chooses the decomposition xyz, such that
|xy|≤m,|y|≥1.
The opponent will make the choice that is a winner if they can.
4. We try to pick i so that the pumped string
wi=xyiz is not in L.
If we can do this, we win (L is not regular).
Example
Prove L={wwR:w∈Σ∗} is not regular.
For any value m, we pick the string
ambmbmam.
Since |xy|≤m, y must consist entirely of
a ‘s.
If we pick i=0, then the resulting string has fewer
a ‘s on the left than on the right and so cannot be of
the form wwR.
Therefore, L is not regular.
Example: Failure (as expected)
If the language is indeed regular, you should find it impossible to
use the pumping lemma to prove it non-regular!
Prove L={ambn∣n+m is odd } is not regular.
If the opponent picks m=1, then we can pick
w=abb.
Whatever the adversary picks for
xyz, we end up with y such that we can pump
strings not in the language.
SO… does this mean that L is non-regular?
NO!! The adversary will not pick a
bad choice for m if they don’t have to!
Example: Failure (as expected)
Prove L={ambn∣n+m is odd } is not regular.
Say that the opponent picks m=3.
We can choose this string that is in the language:
w=aaabb so as to constrain the opponent to picking
values for y with all a ‘s.
But unfortunately, the opponent picks decomposition
a(aa)ibb.
We can’t pick i that is not in the language.
The point is that we cannot find a string, for all values
of m, such that the opponent cannot also pick workable
values for x,y,z.
Adversary Argument Explained (1)
Consider the Pumping Lemma definition again:
Let L be an infinite regular language.
There exists a constant m>0 such that any
w∈L with |w|≥m can be decomposed into three
parts as w=xyz with:
|xy|≤m
|y|≥1
xyiz∈L for all i≥0
1. The opponent picks m.
2. We pick string w.
3. The opponent chooses the decomposition xyz.
4. We try to pick i.
Adversary Argument Explained (2)
WE seek to prove the language non-regular.
The adversary seeks to stop us.
- There exists a constant m>0
[= Adversary picks a value for m.]
- … such that any w∈L with |w|≥m
[= WE pick our choice for w.]
- … can be decomposed into three parts as w=xyz
[= Adversary picks xyz]
(that meets the length criteria on xy and y)
- … such that xyiz∈L for all i≥0
[= WE pick a value for i.]
Example
Prove that L={ancbn|n>0} is not regular.
Assume L is regular, therefore the pumping lemma holds.
Choose w=amcbm
where m is the constant in the pumping lemma.
(Note that w must be choosen such that |w|≥m.)
The only way to partition w into three parts,
w=xyz, is such that x contains 0 or more a’s,
y contains 1 or more a’s, and z contains 0 or
more a’s concatenated with cbm.
This is because of the restrictions |xy|≤m and
|y|>0.
So the partition is: x=ak|y=aj|z=am−k−jcbm
where k≥0, j>0, and k+j≤m
for some constants k and j.
It should be true that xyiz∈L for all i≥0.
xy0z=am−jcbm∉L. Contradiction!
Another Example
Prove that L={a3bncn−3|n>3} is not regular.
We can do this with the Pumping Lemma, but it is more
complicated!
The reason is that we can’t pick w to make all
decompositions look pretty much the same.
So we need to reason through all the cases. See the example in
the OpenDSA modules.